Database Sharding - The Story of How Instagram Scaled Past 2 Billion Users

When One Database Isn't Enough

In October 2010, Instagram launched with 25,000 signups on day one. The entire backend ran on a single Amazon EC2 machine. The database was one PostgreSQL instance. The engineering team was two people: Kevin Systrom and Mike Krieger.

Within a week, they had 100,000 users. Within two months, a million. The single PostgreSQL database was handling every like, every comment, every follow, every photo upload. And it was starting to sweat.

At first, the fixes were simple. Add an index here, optimize a query there, throw more RAM at the database server. This is the phase every startup goes through - what engineers call vertical scaling or "scaling up." You buy a bigger box. You get a faster SSD. You increase the connection pool. And for a while, it works.

But PostgreSQL on a single machine has hard limits. The largest Amazon EC2 instances available in 2012 topped out around 68 GB of RAM and could handle maybe 20,000 transactions per second on a good day. Instagram was doubling its user base every few months. Simple math told them they were heading for a wall.

The question was not whether they needed to distribute their data across multiple databases. The question was how - and that decision, made in 2012 with around 30 million users, would shape Instagram's infrastructure for the next decade and beyond.

They chose sharding. And the way they did it is one of the cleanest examples of database sharding in the real world - which is exactly why interviewers love asking about it.

This is the story of that journey: from one database to a globally distributed sharded infrastructure serving over 2 billion monthly active users. Along the way, you will learn every major sharding concept that appears in system design interviews.

Vertical vs Horizontal Scaling: Pick Your Pain

Before we get into the details of sharding, let us make sure the foundation is solid. Every scaling discussion starts with a fundamental choice.

Vertical Scaling (Scaling Up)

This means making your existing machine more powerful. More CPU cores, more RAM, faster disks, better network cards. It is the simplest approach because your application code does not change. Your one database just runs on beefier hardware.

Advantages: No code changes. No distributed systems complexity. Transactions, joins, and constraints work exactly as before.

Limits: There is a ceiling. The biggest cloud instances top out around 24 TB of RAM and 448 cores. Costs scale non-linearly - doubling capacity often more than doubles the price. And you have a single point of failure.

Instagram hit this wall around 2012. Their primary PostgreSQL instance was on the largest available EC2 instance, and it was running hot. Read replicas helped absorb read traffic, but writes were bottlenecked on the single primary.

Horizontal Scaling (Scaling Out)

This means adding more machines and distributing the work across them. Instead of one database handling everything, you have 10, 100, or 1,000 databases, each handling a portion of the data.

Advantages: No hard ceiling - just add more machines. Cost scales roughly linearly. Each machine is commodity hardware. If one fails, only a fraction of data is affected.

Costs: Your application needs shard-aware routing. Cross-machine queries become expensive. Transactions spanning machines require distributed coordination.

Sharding is the most common form of horizontal scaling: splitting your data into shards - distinct subsets, each living on a different database server. The critical decision is how you split.

The Real-World Progression

Do not present this as binary. Most systems use a combination: single database, then vertical scaling, then read replicas, then caching, then sharding as a last resort when writes become the bottleneck. Instagram followed exactly this progression. That disciplined approach is what interviewers want to hear.

Shard Keys: The Decision That Haunts You Forever

The most important decision in any sharding strategy is the shard key - the field you use to determine which shard a piece of data lives on. Choose well, and your system scales beautifully for years. Choose poorly, and you will be rewriting your data layer in 18 months.

Instagram's data model (simplified) has a few core entities:

  • Users: profiles, settings, follower counts
  • Posts: photos, captions, timestamps
  • Likes: who liked which post
  • Comments: who said what on which post
  • Follows: who follows whom
  • Stories: ephemeral content with view tracking

Option 1: Shard by User ID

Put all data for User #12345 - their profile, their posts, their likes, their comments - on the same shard. This is what Instagram chose, and here is why it makes sense:

Most queries are user-centric. "Show me my feed," "show me my profile," "show me my notifications" - these all start with a user ID. If all of a user's data is on one shard, these queries hit a single database. No cross-shard joins needed for the most common operations.

Write patterns are naturally distributed. Each user generates roughly the same amount of write traffic (posts, likes, comments), so sharding by user ID distributes writes fairly evenly. There are outliers (celebrity accounts), but the distribution is good enough.

The downside: Some queries span multiple users and therefore multiple shards. "Show me all likes on this post" requires checking the shards of every user who liked the post. "Show me my timeline" requires fetching posts from every user I follow, potentially across dozens of shards.

Option 2: Shard by Post ID

Put all data related to Post #67890 - the post itself, its likes, its comments - on the same shard.

Advantage: "Show all comments on this post" hits a single shard.

Disadvantage: "Show me User X's posts" spans multiple shards. And a viral post generates millions of likes hitting the same shard - a hotspot.

Option 3: Shard by Geography

Put users in North America on shards in US data centers, European users on European shards, and so on.

Advantage: Lower latency for users (their data is geographically close). Compliance with data residency regulations (GDPR, for example, may require European user data to stay in Europe).

Disadvantage: Users interact across geographies all the time. A user in Tokyo liking a post from a user in Brazil creates a cross-shard, cross-geography operation. Follow relationships span the globe. And user distribution is extremely uneven - a few countries have the majority of Instagram's users, creating hotspots on those shards.

Instagram's Choice

Instagram chose user ID-based sharding with a specific implementation. They created a logical shard scheme where each user ID embeds the shard information directly. Their ID generation system (inspired by Twitter's Snowflake) produces 64-bit IDs with the following structure:

  • 41 bits: millisecond timestamp (gives ~69 years of IDs)
  • 13 bits: logical shard ID (8,192 possible logical shards)
  • 10 bits: auto-incrementing sequence within that shard/millisecond

This means you can extract the shard from any ID just by bit-shifting - no lookup table, no hash function, no external service. The shard information is baked into the ID itself.

The 8,192 logical shards are then mapped to a smaller number of physical database servers. Multiple logical shards live on each physical server. When they need to rebalance (move data around), they move entire logical shards - not individual rows - which makes migrations much cleaner.
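Instagram implemented this ID generator inside PostgreSQL as a stored function, but the bit layout is easy to sketch in Python. The field widths below come from the list above; the custom epoch value and the function names are illustrative, not Instagram's actual code.

    import time

    EPOCH_MS = 1_293_840_000_000   # hypothetical custom epoch (Jan 1, 2011, in milliseconds)
    SHARD_BITS = 13                # 8,192 logical shards
    SEQUENCE_BITS = 10             # 1,024 IDs per shard per millisecond

    def make_id(logical_shard, sequence):
        """Pack timestamp, logical shard, and per-millisecond sequence into a 64-bit ID."""
        elapsed_ms = int(time.time() * 1000) - EPOCH_MS
        return (elapsed_ms << (SHARD_BITS + SEQUENCE_BITS)) \
               | (logical_shard << SEQUENCE_BITS) \
               | (sequence % (1 << SEQUENCE_BITS))

    def shard_for(id_):
        """Recover the logical shard with a bit shift - no lookup table, no network call."""
        return (id_ >> SEQUENCE_BITS) & ((1 << SHARD_BITS) - 1)

    post_id = make_id(logical_shard=2047, sequence=1)
    assert shard_for(post_id) == 2047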

The Hotspot Problem

No matter how clever your shard key, some shards will be hotter than others. Justin Bieber's shard handles dramatically more traffic than yours or mine. Instagram mitigates this through:

  • Caching: Celebrity profiles and popular posts are cached aggressively. The shard only handles cache misses.
  • Read replicas per shard: Each shard has its own read replicas to absorb read traffic.
  • Logical-to-physical flexibility: If a physical server is overloaded, they can remap its logical shards to different physical servers without changing the application.

In an interview, acknowledge hotspots explicitly. It shows you understand that theoretical uniform distribution rarely matches reality.

Consistent Hashing - Explained with a Pizza Analogy

Once you have decided on a shard key (user ID), you need a function that maps each user to a specific shard. The naive approach is modular hashing: shard = user_id % number_of_shards. Simple. Fast. And terrible when you need to change the number of shards.

Why Modular Hashing Breaks

Suppose you have 4 shards. User #17 goes to shard 17 % 4 = 1. User #23 goes to shard 23 % 4 = 3. Everything is fine.

Now your system is overloaded and you need to add a 5th shard. With modular hashing, 17 % 5 = 2 and 23 % 5 = 3. User #17 just moved from shard 1 to shard 2. In fact, when you go from 4 to 5 shards with modular hashing, roughly 80% of your keys get reassigned to different shards. That means migrating 80% of your data. For a database with billions of rows, this is a catastrophe.
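You can verify that number with a few lines of Python - a throwaway sketch, not anything Instagram ran:

    def moved_fraction(num_keys, old_shards, new_shards):
        """Fraction of keys whose shard assignment changes under modular hashing."""
        moved = sum(1 for key in range(num_keys)
                    if key % old_shards != key % new_shards)
        return moved / num_keys

    print(moved_fraction(1_000_000, 4, 5))   # 0.8 - four out of five keys have to migrate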

The Pizza Analogy

Imagine you and three friends are sharing a pizza. You cut it into 4 equal slices - one per person. Everyone eats their slice. Life is good.

Now a fifth friend shows up. With modular hashing, you would have to re-cut the entire pizza from scratch into 5 slices. Everyone has to give back their partially eaten slice and get a new, differently-shaped piece. Chaos.

Consistent hashing works differently. Imagine the pizza is round (it is). Instead of cutting it into equal slices, each person puts a marker at a random point on the edge of the pizza. You go clockwise from any given point to find the nearest marker - that person "owns" that section.

When the fifth friend arrives, they place their marker on the edge. Only the stretch of crust between the previous marker (going counterclockwise) and the new one changes hands - those points now reach the new marker first. One person gives up a portion of their section. Everyone else keeps exactly what they had.

That is consistent hashing. When you add a new shard, only a small fraction of keys need to move - approximately 1/n where n is the new number of shards.

Hash Rings and Virtual Nodes

Formally, consistent hashing works by placing both shards and keys on a circular hash space (the "hash ring"):

  1. Hash each shard's identifier to a position on the ring.
  2. Hash each key (user ID) to a position on the ring.
  3. Each key is assigned to the first shard encountered when walking clockwise from the key's position.

The problem with basic consistent hashing is that shards may be unevenly spaced on the ring, leading to uneven load distribution. The fix is virtual nodes: instead of placing each shard at one point on the ring, you place it at 100 or 200 points (virtual nodes). This gives you a much more even distribution, because with enough virtual nodes, the law of large numbers ensures each shard owns approximately the same portion of the ring.

Hash Ring (simplified, unrolled into a straight line):

    A-1          B-1          A-2          B-2
     |            |            |            |
     v            v            v            v
  ───●────────────●────────────●────────────●───

Shard A's virtual nodes: A-1, A-2, A-3, ...
Shard B's virtual nodes: B-1, B-2, B-3, ...
...and so on - each shard appears at many points around the ring, so no single shard owns one long contiguous stretch.
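Here is a compact sketch of a hash ring with virtual nodes, assuming MD5 as the hash function and bisect for the clockwise walk (the virtual-node count of 150 is an arbitrary choice):

    import bisect
    import hashlib

    def ring_position(value: str) -> int:
        """Map any string to a position on the ring (a 64-bit integer)."""
        return int(hashlib.md5(value.encode()).hexdigest()[:16], 16)

    class HashRing:
        def __init__(self, shards, vnodes=150):
            # Place each shard at `vnodes` points on the ring.
            self.ring = sorted(
                (ring_position(f"{shard}#{i}"), shard)
                for shard in shards
                for i in range(vnodes)
            )
            self.positions = [pos for pos, _ in self.ring]

        def shard_for(self, key) -> str:
            """Walk clockwise from the key's position to the first virtual node."""
            idx = bisect.bisect(self.positions, ring_position(str(key))) % len(self.ring)
            return self.ring[idx][1]

    ring = HashRing(["shard-a", "shard-b", "shard-c", "shard-d"])
    print(ring.shard_for(12345))
    # Rebuild the ring with a fifth shard and most keys still map to the same
    # shard - only roughly 1/5 of them move.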

Instagram's Approach

Instagram actually uses a mapping table rather than pure consistent hashing - their 8,192 logical shards map to physical servers through an updatable configuration, giving more explicit control over data placement.

In an interview, present consistent hashing as the general-purpose solution, then mention that production systems often use explicit mapping tables for more control (regulatory compliance, hardware-aware placement, gradual migration).
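A mapping-table router can be as small as the sketch below - the dictionary stands in for whatever configuration store holds the real mapping, and the host names are made up:

    # 8,192 logical shards spread across 16 hypothetical physical hosts.
    LOGICAL_TO_PHYSICAL = {shard: f"pg-host-{shard % 16}" for shard in range(8192)}

    def physical_host_for(id_):
        logical_shard = (id_ >> 10) & 0x1FFF        # the 13-bit shard embedded in the ID
        return LOGICAL_TO_PHYSICAL[logical_shard]

    # Rebalancing is a configuration change, not a data rewrite:
    LOGICAL_TO_PHYSICAL[2047] = "pg-host-new"       # remap one logical shard to a new server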

Cross-Shard Queries: The Hidden Tax

Here is the uncomfortable truth about sharding that candidates often gloss over: the moment you shard your database, some queries become dramatically more expensive.

The Problem

On a single database, you can join any table with any other table. "Show me all posts by users I follow, ordered by time, with like counts" is one SQL query with a few joins. Fast, elegant, handled entirely by the database engine's query optimizer.

On a sharded database, the users you follow are spread across dozens of shards. Their posts are on those shards too. To build your feed, you need to:

  1. Look up your follow list (on your shard).
  2. For each user you follow, determine which shard they are on.
  3. Query each of those shards for recent posts.
  4. Aggregate the results and sort by time.
  5. Return the top N posts.

This is called a scatter-gather query. You scatter the query across multiple shards, then gather and merge the results. It works, but it is slower than a single-shard query, uses more network bandwidth, and puts load on multiple database servers simultaneously.
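In code, a scatter-gather feed query looks roughly like the sketch below. The per-shard recent_posts() call and the shard objects are assumptions about the data-access layer, not Instagram's actual API:

    from concurrent.futures import ThreadPoolExecutor

    def build_feed(followee_ids, shard_for, shards, limit=50):
        """Scatter the query across shards in parallel, then gather, merge, and sort."""
        by_shard = {}
        for followee in followee_ids:                        # step 2: group followees by shard
            by_shard.setdefault(shard_for(followee), []).append(followee)

        def fetch(item):
            shard_id, authors = item
            return shards[shard_id].recent_posts(authors)    # step 3: one query per shard

        with ThreadPoolExecutor() as pool:                   # scatter
            chunks = list(pool.map(fetch, by_shard.items()))

        posts = [post for chunk in chunks for post in chunk]
        posts.sort(key=lambda p: p["created_at"], reverse=True)   # step 4: merge and sort
        return posts[:limit]                                      # step 5: top N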

Instagram's Solutions

Instagram uses several strategies to mitigate cross-shard query costs:

Denormalization: Instead of joining across shards at query time, they precompute and store the results. For example, a user's feed might be materialized as a separate data structure - when someone you follow posts a photo, the post ID is written into your feed table (which lives on your shard). This trades write amplification (every post triggers N writes to followers' feed tables) for fast reads (your feed is a single-shard query on your own shard).

Caching layers: Cross-shard query results are aggressively cached. Your feed, once computed, is cached in Memcached. Subsequent requests serve from cache, and the cache is incrementally updated when new posts arrive rather than recomputed from scratch.

Asynchronous fan-out: When a popular user with millions of followers posts, writing to all followers' feed tables synchronously would take forever. Instead, the fan-out happens asynchronously through a message queue. Your feed might not include the very latest post from a celebrity for a few seconds - but for a social media feed, this is an acceptable trade-off.
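A toy version of that fan-out, using Python's in-process queue as a stand-in for a real message broker (the follower lookup and the per-follower feed access are assumed helpers supplied by the caller):

    import queue

    fanout_queue = queue.Queue()     # stand-in for a real message broker

    def on_new_post(author_id, post_id):
        """The write path only enqueues - it never touches followers' shards directly."""
        fanout_queue.put((author_id, post_id))

    def fanout_worker(get_followers, feed_for):
        """Runs in background workers: push the post ID into each follower's feed,
        which lives on that follower's own shard."""
        while True:
            author_id, post_id = fanout_queue.get()
            for follower_id in get_followers(author_id):
                feed_for(follower_id).append(post_id)
            fanout_queue.task_done()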

Secondary indexes: For some query patterns, Instagram maintains separate index tables that span shards. For example, a "posts by hashtag" index might be sharded by hashtag rather than by user ID, allowing efficient hashtag searches without scatter-gather across all user shards.

The Denormalization Trade-off

Denormalization is the most common answer to cross-shard queries, and interviewers will push back on it. Be ready to discuss the trade-offs:

Aspect       | Normalized (join at read)     | Denormalized (precompute)
Read speed   | Slow (cross-shard joins)      | Fast (single-shard read)
Write speed  | Fast (write once)             | Slow (write to many places)
Storage      | Efficient (no duplication)    | Expensive (data duplicated)
Consistency  | Strong (one source of truth)  | Eventual (copies may lag)
Complexity   | Complex reads                 | Complex writes

For Instagram's use case - read-heavy, latency-sensitive, with feeds that tolerate a few seconds of staleness - denormalization is the right call. But for a banking system where every read must be perfectly consistent, you might make a different choice.

Rebalancing Without Downtime

Your system is sharded and running. Then one shard starts getting too hot (too much data, too much traffic). You need to move some data to a new shard. Without taking the system offline. While millions of users are actively reading and writing.

This is the rebalancing problem, and it is one of the hardest operational challenges in a sharded system.

Strategy 1: Double-Write

The safest approach for migrating data between shards:

  1. Start dual-writing: All new writes go to both the old shard and the new shard. The old shard remains the source of truth for reads.
  2. Backfill historical data: Copy existing data from the old shard to the new shard. This can happen in the background using a batch process.
  3. Verify consistency: Run a comparison process that checks the new shard against the old shard to ensure all data was copied correctly.
  4. Switch reads: Once verified, switch reads to the new shard. The new shard is now the source of truth.
  5. Stop dual-writing to old shard: Remove writes to the old shard.
  6. Clean up: Delete the migrated data from the old shard.

This process can take hours or days, depending on the data volume, but the system stays online throughout. The dual-write period ensures no data is lost during migration.
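Step 1 of that process, sketched with dict-like shards standing in for real database connections:

    import logging

    log = logging.getLogger("migration")

    def save_post(post, old_shard, new_shard, dual_write=True):
        """Dual-write phase: the old shard stays the source of truth, but every
        new write also lands on the new shard so nothing is lost mid-migration."""
        old_shard[post["id"]] = post
        if dual_write:
            try:
                new_shard[post["id"]] = post
            except Exception:
                # A failed write to the new shard must not fail the user's request;
                # the backfill and verification passes will repair the gap.
                log.warning("dual-write failed for post %s", post["id"])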

Strategy 2: Shadow Reads

Before fully switching reads to the new shard, you can run shadow reads - sending read queries to both the old and new shards, comparing the results, but only serving the old shard's results to users. This catches inconsistencies before they affect users.

Shadow reads add load (every read becomes two reads), so you typically run them on a sample of traffic rather than 100%.
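Shadow reads on a sample of traffic might look like this sketch, again with dict-like shards as stand-ins:

    import logging
    import random

    log = logging.getLogger("shadow_reads")

    def read_post(post_id, old_shard, new_shard, sample_rate=0.05):
        """Serve from the old shard; on a sample of requests, also read the new
        shard and log any mismatch so problems surface before the cutover."""
        result = old_shard.get(post_id)
        if random.random() < sample_rate:            # limit the extra read load
            shadow = new_shard.get(post_id)
            if shadow != result:
                log.warning("mismatch for post %s: old=%r new=%r", post_id, result, shadow)
        return result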

Strategy 3: Logical Shards with Remapping

This is Instagram's approach and the most elegant solution. Because their 8,192 logical shards are mapped to physical servers through a configuration, rebalancing is just remapping a logical shard to a different physical server.

The process:

  1. Set up a new physical server (or identify an underloaded one).
  2. Replicate the logical shard's data to the new server (using PostgreSQL replication).
  3. When the replica is caught up, update the mapping to point the logical shard at the new server.
  4. The switch is nearly instantaneous - just a configuration change.

This works because the logical shard is a complete, self-contained unit. All the data that was on logical shard #2047 stays together - it just moves to a different physical home.

The Key Insight for Interviews

The reason Instagram's approach works so well is the separation of logical and physical sharding. Logical shards are a permanent, immutable part of the data model (embedded in every ID). Physical servers are operational infrastructure that can change. This decoupling is a powerful design pattern. Mention it in your interview - it shows you understand that the sharding scheme should not be tightly coupled to the hardware topology.

When NOT to Shard

This might be the most important section of this entire post. In interviews, the candidates who stand out are not the ones who jump to sharding - they are the ones who know when sharding is unnecessary.

Premature Sharding Is Real

Sharding adds enormous complexity: shard-aware routing, painful cross-shard queries, distributed transactions, coordinated schema migrations, and harder debugging. If you shard at 10,000 users, you are paying the complexity tax of a billion-user system with zero benefit.

Try These First

Before sharding, exhaust these alternatives:

Read replicas: If your workload is read-heavy (most web applications are), adding read replicas can handle 5-10x more read traffic. Writes still go to the primary, reads are distributed across replicas. Instagram used this to great effect before sharding.

Connection pooling: Tools like PgBouncer can dramatically improve throughput by reusing connections - many performance issues are actually connection management issues.

Caching: A Redis or Memcached layer can reduce database load by 90%+. If most requests hit a small percentage of data (true for social feeds, catalogs, profiles), caching is transformative.

Table partitioning: Modern databases support native partitioning - splitting a table into physical storage units by a partition key. You get smaller indexes and partition pruning without application complexity.

Vertical scaling: Cloud instances in 2026 can have 24 TB of RAM. A bigger machine might buy you another year - and that year lets you design a proper sharding strategy.

Archiving old data: Moving rarely-accessed data to cold storage can shrink your hot dataset and improve performance dramatically.

The Decision Framework

Shard when:

  • Your write volume exceeds what a single machine can handle (read replicas do not help with writes).
  • Your data volume exceeds what a single machine can store (even after archiving old data).
  • You need geographic distribution for latency or compliance reasons.
  • You have a clear, natural shard key that aligns with your primary access patterns.

Do not shard when:

  • You have not tried read replicas, caching, and query optimization first.
  • Your data fits comfortably on one machine.
  • Your primary bottleneck is read traffic (use read replicas and caching instead).
  • You do not have a clear shard key (sharding with a bad key is worse than not sharding).
  • Your team does not have the operational capacity to manage a sharded system.

What Instagram Teaches Us

Instagram's story is powerful in an interview because it demonstrates disciplined scaling. They did not shard on day one. They started with one PostgreSQL instance. They scaled up. They added read replicas. They added aggressive caching. They sharded only when writes became the bottleneck and they had a clear shard key (user ID) that aligned with their access patterns.

When you present a system design, follow this same progression. Start simple. Identify the bottleneck. Apply the least complex solution that addresses it. Shard as a last resort, not a first instinct.

The Complete Picture

Let us bring together everything we have covered by looking at Instagram's evolved architecture:

User Request
    ↓
Load Balancer
    ↓
Application Server
    ↓
┌─────────────────────────────────┐
│         Shard Router            │
│  (extracts shard from user ID)  │
└────────────┬────────────────────┘
             ↓
    Logical Shard #2047
             ↓
    ┌─────────────────┐
    │ Physical Server │──→ Read Replica 1
    │   (PostgreSQL)  │──→ Read Replica 2
    └─────────────────┘
             ↑
    Cached in Memcached
    (most reads never hit DB)

Each user ID contains the logical shard number. The shard router extracts it with a bit-shift operation - no network call, no hash function, no lookup table. The logical shard maps to a physical server through a configuration. Each physical server has read replicas for read scaling. And a caching layer sits in front of everything, handling the vast majority of reads.
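Put together, the read path is only a few steps. This sketch assumes a dict-like cache and a per-host client object with a fetch_profile() method - both stand-ins for Memcached and the real data-access layer:

    def get_profile(user_id, cache, logical_to_physical, clients):
        """Cache first; on a miss, bit-shift route to the right shard and read it."""
        key = ("profile", user_id)
        cached = cache.get(key)
        if cached is not None:
            return cached                            # most reads stop here

        shard_id = (user_id >> 10) & 0x1FFF          # extract the 13-bit logical shard
        host = logical_to_physical[shard_id]         # config lookup, updated during rebalancing
        profile = clients[host].fetch_profile(user_id)

        cache[key] = profile
        return profile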

Simple? In concept, yes. But getting here required years of incremental evolution, careful decision-making, and the discipline to avoid premature complexity. That journey - and the reasoning behind each decision - is exactly what interviewers want to hear.


FAQ

What's the difference between sharding and partitioning?

The terms are often used interchangeably in casual conversation, but there is a meaningful distinction. Partitioning refers to splitting a table's data into smaller pieces, and it can be either vertical (splitting columns across tables) or horizontal (splitting rows across tables). Sharding is specifically horizontal partitioning where the partitions live on different database servers. You can think of sharding as "partitioning across machines." Native database partitioning (like PostgreSQL's table partitioning) splits data within a single database instance - you get benefits like smaller indexes and partition pruning, but you are still limited by one machine's resources. Sharding distributes partitions across multiple machines, giving you theoretically unlimited horizontal scale at the cost of significantly more application complexity. In an interview, use "partitioning" when the data stays on one machine and "sharding" when it spans multiple machines. This precision signals that you understand the operational difference.

How do you choose a shard key?

Choosing a shard key comes down to three criteria, in order of priority. First, query alignment: the shard key should match your most common query pattern so that the majority of queries hit a single shard. If 80% of your queries start with a user ID, shard by user ID. If they start with a merchant ID, shard by merchant ID. Second, write distribution: the shard key should distribute writes roughly evenly across shards. Keys with high cardinality (like user IDs) naturally distribute well; keys with low cardinality (like country codes) create hotspots. Third, data growth: consider how data grows. If you shard by date, older shards become cold while the current shard gets all the traffic. If you shard by user ID, growth is distributed as new users are spread across shards. A practical test is to look at your top 10 database queries, identify which field appears in the WHERE clause of most of them, and check whether that field has high cardinality and even distribution. That field is likely your shard key.

Can you shard a relational database?

Yes, and Instagram is proof - they shard PostgreSQL, a fully relational database. However, sharding a relational database means giving up some of the features that make relational databases powerful. Specifically, you lose the ability to do cross-shard joins efficiently, foreign key constraints across shards cannot be enforced by the database, and distributed transactions require external coordination (two-phase commit, which is slow, or saga patterns, which add application complexity). You retain relational features within a single shard - joins, constraints, ACID transactions all work normally for data on the same shard. This is why shard key selection is so critical: if your shard key is chosen well, 95% of your queries stay within one shard and you barely notice the loss of cross-shard features. Many teams choose relational databases for sharding precisely because they want ACID guarantees within each shard. The alternative - using a distributed NoSQL database like Cassandra or DynamoDB - gives you built-in sharding but trades away joins, constraints, and multi-row transactions entirely.

What is consistent hashing and why does it matter?

Consistent hashing is an algorithm that distributes data across a changing set of servers while minimizing the amount of data that needs to move when servers are added or removed. In traditional modular hashing (key % N), changing N (the number of servers) reassigns the majority of keys - roughly (N-1)/N of all data moves. Consistent hashing reduces this to approximately 1/N. It works by placing both servers and data keys on a circular hash space (a "hash ring"). Each key is assigned to the nearest server clockwise on the ring. When a server is added, it takes responsibility for keys between itself and the previous server on the ring - only those keys move. When a server is removed, its keys move to the next server clockwise. Virtual nodes improve the scheme by placing each physical server at multiple points on the ring (100-200 is common), which ensures more even distribution. Consistent hashing matters because in real systems, servers are added and removed constantly - scaling events, hardware failures, maintenance windows. A sharding scheme that requires migrating 80% of data every time the server count changes is operationally unworkable. Consistent hashing makes these changes incremental and manageable. It is used in DynamoDB, Cassandra, Memcached, and many content delivery networks.


Ready to practice explaining sharding strategies with real-time feedback? Try Levelop's System Design Canvas - draw architecture diagrams, get AI-powered evaluation, and build the muscle memory for your next interview.

Continue your system design journey: learn how Netflix handles streaming at scale, or trace a WhatsApp message from send to delivery.
