Sharding
ELI5 — The Vibe Check
Sharding splits your database across multiple servers based on some rule — like user IDs 1-1M on server 1, 1M-2M on server 2. It is how you scale a database beyond what one machine can handle. Also massively increases complexity. Use only when you truly need it.
Real Talk
Sharding is a horizontal scaling strategy that partitions data across multiple database instances (shards). Each shard holds a subset of the data determined by a shard key. It allows virtually unlimited write scale but complicates queries that span shards, transactions, and schema changes. Many large-scale systems (Stripe, Discord) use sharding.
When You'll Hear This
"We shard by user_id to distribute load across database nodes." / "Sharding makes cross-shard queries expensive and complex."
Related Terms
Database
A database is like a super-organized filing cabinet for your app's data.
DynamoDB
DynamoDB is Amazon's NoSQL database that scales to literally any size without you doing anything.
Partitioning
Partitioning divides a huge table into smaller physical chunks while still appearing as one table to your queries.
Replication
Replication means automatically copying your database to one or more other servers in real time. If the main server dies, a replica takes over.