MinIO just passed the one billion Docker download milestone. No storage company has ever come close to this achievement. This milestone speaks to the emergence of object storage as primary storage and to the changing nature of how applications are built, deployed and scaled. It also speaks to MinIO’s particular capabilities, but let’s focus on the trends.
The traditional knock on object storage was that it was highly scalable but slow. Amazon changed the game with its Simple Storage Service (S3) in 2006. Using RESTful APIs and delivering performance and value, it became the standard for the cloud.
AWS EMR, Redshift, Databricks, Snowflake, Snapchat, Netflix and BigQuery all run on object storage. Object is superior to SAN/NAS when it comes to APIs, complexity, performance at scale, cost (capex and opex), reliability, resilience and security. The attributes of object storage make it ideal for the cloud operating model, and that model now exists everywhere: other public clouds, private clouds, on-prem and even at the edge.
As a result, high-performance, cloud-native object storage has made that leap too. It is everywhere, running every type of workload—notably the workloads that were previously the domain of SAN/NAS, like databases.
The move of databases to object storage is driven by two forces: the performance characteristics of modern object storage and the explosive growth of data and metadata. Because of these two forces, almost every major database vendor now supports S3-compatible endpoints. For most workloads, this becomes the default architecture, whether in the cloud or on-prem.
Let’s explore the concepts briefly.
The performance requirements associated with databases have recently inverted. Databases previously demanded high IOPS to make many small changes across the network. This was well-suited to SAN and NAS architectures and thus databases became their bread and butter, but IOPS is not economically scalable.
Databases no longer mutate the data across the network in 4KB chunks. Instead, they stream the objects (specifically table segments) in MB-sized extents to client-side memory and mutate them locally. Local memory IOPS combined with 100GbE networking make this a throughput problem, not an IOPS problem.
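To put rough numbers on that shift, here is a back-of-the-envelope sketch. It assumes a 100GbE link with no protocol overhead and compares 4 KiB blocks with 1 MiB extents; the function names are ours, and the figures are illustrative rather than benchmarks.

```python
# Back-of-the-envelope: requests per second needed to saturate a link
# at different transfer sizes. All figures are illustrative assumptions.

LINK_GBITS_PER_SEC = 100  # 100GbE, as in the text


def link_bytes_per_sec(gbits_per_sec: float) -> float:
    """Convert a line rate in Gbit/s to bytes/s (ignores protocol overhead)."""
    return gbits_per_sec * 1e9 / 8


def requests_to_saturate(transfer_bytes: int,
                         gbits_per_sec: float = LINK_GBITS_PER_SEC) -> float:
    """Requests per second needed to fill the link at a given transfer size."""
    return link_bytes_per_sec(gbits_per_sec) / transfer_bytes


if __name__ == "__main__":
    for label, size in [("4 KiB block", 4 * 1024), ("1 MiB extent", 1024 * 1024)]:
        rate = requests_to_saturate(size)
        print(f"{label}: {rate:,.0f} requests/s to fill the link")
```

Under these assumptions, filling the link with 4 KiB operations takes roughly 3 million requests per second, while 1 MiB extents need about 12 thousand. That is a 256x difference in request rate for the same bandwidth, which is why the bottleneck moves from IOPS to throughput.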
Object storage is throughput driven, not IOPS or latency driven. The new database model is ideal for object storage because the extents are immutable. Since every change is automatically versioned, object storage can provide continuous data protection without needing snapshots.
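The idea that immutable, versioned extents remove the need for snapshots can be illustrated with a toy model. The sketch below is an in-memory stand-in, not how MinIO or any S3 implementation stores versions; the class name `VersionedStore` and the key names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class VersionedStore:
    """Toy in-memory store: every put appends a new immutable version."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[bytes]] = defaultdict(list)

    def put(self, key: str, data: bytes) -> int:
        """Store a new version and return its version number (starting at 1)."""
        self._versions[key].append(bytes(data))
        return len(self._versions[key])

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific earlier one."""
        versions = self._versions.get(key)
        if not versions:
            raise KeyError(key)
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{key} has no version {version}")
        return versions[version - 1]


if __name__ == "__main__":
    store = VersionedStore()
    store.put("table/segment-0001", b"rows as of Monday")
    store.put("table/segment-0001", b"rows as of Tuesday")
    print(store.get("table/segment-0001"))             # latest write
    print(store.get("table/segment-0001", version=1))  # earlier state still readable
```

Because a write never overwrites an earlier version, any prior state of a segment stays readable without a separate snapshot step.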
MinIO is not just the world’s fastest object store in terms of throughput. It also leads in small-object performance, which matters here because table segments range from 256KB to 2MB. The reason is that MinIO does not use a metadata database; it relies on deterministic hashing to look up objects.
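To see why deterministic hashing removes the need for a metadata lookup, consider this minimal sketch. It is not MinIO's actual placement code: the hash function (SHA-256), the function name `pick_set` and the notion of numbered "storage sets" are assumptions chosen for illustration.

```python
import hashlib


def pick_set(object_key: str, num_sets: int) -> int:
    """Map an object key to one of num_sets storage sets, deterministically.

    SHA-256 is a stand-in here; the source does not say which hash MinIO uses.
    """
    if num_sets <= 0:
        raise ValueError("num_sets must be positive")
    digest = hashlib.sha256(object_key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_sets


if __name__ == "__main__":
    key = "warehouse/orders/segment-0001"  # hypothetical object name
    # The same key always maps to the same set, so a reader can compute
    # the location directly instead of asking a metadata database.
    print(pick_set(key, num_sets=16))
    print(pick_set(key, num_sets=16))
```

The point of the design is that location is a pure function of the object name, so a lookup costs one hash computation rather than a round trip to a separate metadata service.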
For an object store to host a database, it needs to deliver exceptional throughput alongside acceptable latency. If an object store can deliver on these, it can run 80% or more of the database’s requirements. If it can’t, it will only run 20-30% of those workloads—effectively backups.
Database vendors have disaggregated storage and compute, claiming the compute side for themselves and offloading the storage to high-performance object storage. They focus on distributed, high-performance query processing, emphasizing features and functions while leaving the heavy lifting of storage to companies like MinIO—allowing them to lay claim to ever-larger swaths of data because of object’s scalability.
Most file and block systems were not designed for scale. When you push past 100TB, issues arise, and 1PB is rarefied air for these systems.
Object storage, on the other hand, just starts hitting its sweet spot at 1PB. This makes object storage the ideal complement for databases delivering against giant application workloads that cover large components of an organization’s data.
The model is simple, given the throughput capabilities of fast object storage. A small portion of data is kept “in memory” for ultra-fast processing while the vast majority of it sits in a (very) warm tier, available using the standard S3 calls that define the modern application ecosystem.
This is even more potent when S3 Select support is available. Few vendors support this predicate pushdown functionality, and none deliver high performance. MinIO does, and as a result you can pull just the data you need from PBs of data.
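As a rough illustration of predicate pushdown, the sketch below wraps the S3 Select call exposed by boto3 as `select_object_content`. The helper name `select_rows`, the bucket, key, column names and endpoint are all hypothetical, and the object is assumed to be a CSV file with a header row.

```python
from typing import Any


def select_rows(s3_client: Any, bucket: str, key: str, sql: str) -> str:
    """Run an S3 Select query and return the matching records as text.

    The filter runs on the storage side, so only matching rows cross the network.
    """
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=sql,
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    # The response payload is a stream of events; only "Records" events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Example usage (assumes boto3 is installed and an S3-compatible endpoint
# is reachable; the endpoint, bucket, key and columns are placeholders):
#
#   import boto3
#   client = boto3.client("s3", endpoint_url="http://localhost:9000")
#   print(select_rows(
#       client, "sales", "orders.csv",
#       "SELECT s.order_id FROM S3Object s WHERE s.region = 'EMEA'",
#   ))
```

The client receives only the rows that satisfy the `WHERE` clause, rather than downloading the whole object and filtering locally.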
This object storage trend won’t stop, for the simple reason that the cloud-native movement is exponentially larger than the POSIX movement. The cloud operating model wins, and it prefers object storage.
This is why MinIO has days with 1.3+ million Docker pulls while averaging 1M per day. It is also how MinIO accumulated 35.5K GitHub stars and 20,000 members of our public Slack channel. It is because we build an AWS S3-compatible object store that runs anywhere, from inside of AWS itself to the edge. The milestone is certainly cool, but it is just that: a milestone on a journey that is just beginning. See for yourself at https://min.io/download.
Image licensed by pixabay.com