Four caching strategies diagram showing cache-aside, read-through, write-through, and write-behind patterns

System Design

Cache Invalidation Strategies: How to Keep a Cache Fresh

May 25, 2026 7 min read Avinash Tyagi

cache invalidation strategies cache invalidation ttl cache expiration write-through invalidation delete-on-write cache stampede distributed cache redis caching system design interview cache invalidation redis

There is an old joke that the two hardest problems in computer science are naming things, cache invalidation, and off-by-one errors. The joke lands because cache invalidation really is hard. Adding a cache is easy. Deciding when to throw away or update what is in it, without serving stale data or melting your database, is where systems get complicated.

This post breaks down the cache invalidation strategies you actually need to know: what each one does, when it fits, and how it fails. If you want the broader view of caching read and write patterns first, start with our guide to caching strategies for system design. Here the focus is narrower and more important: keeping cached data correct.

What Cache Invalidation Actually Means

Cache invalidation is the process of removing or updating cached entries when the underlying source of truth changes. A cache is fast because it is a copy, and every copy can go stale the moment the original changes. Invalidation is the mechanism that decides how long a stale copy is allowed to live and who is responsible for refreshing it.

Two numbers frame every cache invalidation strategy. The staleness window is how long the cache may serve outdated data after the source changes. The cache hit ratio is how often requests are served from data stored in the cache instead of the database. A good invalidation strategy keeps the staleness window inside what your product can tolerate without destroying the hit ratio. Get it wrong in one direction and users see wrong data. Get it wrong in the other and your cache stops earning its keep.

The Core Cache Invalidation Strategies

TTL and Time-Based Expiration

The simplest cache invalidation strategy is TTL cache expiration, also called time to live. You attach an expiry to each entry stored in the cache, and when the TTL elapses the entry is evicted and the next read repopulates it. You never explicitly invalidate anything; time to live does it for you.

ttl_expiration.pypython

def cache_user(user_id, user):
    ttl = 3600 + random.randint(0, 300)  # base + jitter
    redis.setex(f"user:{user_id}", ttl, json.dumps(user))

TTL cache expiration works when bounded staleness is acceptable: product catalogs, config that changes slowly, feed data. The danger is that many keys expiring at the same instant cause a stampede, which is why the jitter above matters. Time to live alone never guarantees freshness, only an upper bound on how stale data can get.

Write-Through Invalidation

With write-through invalidation, every write updates the cache and the database together, synchronously. The cache is refreshed at the exact moment the source changes, so there is effectively no staleness window. This is the most consistent cache invalidation strategy and the reason write-through is the conservative choice for user-facing data.

write_through.pypython

def update_user(user_id, data):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    redis.setex(f"user:{user_id}", 3600, json.dumps(data))

The cost is doubled write latency, since every write must succeed in two places. For low to moderate write volume that is fine. For write-heavy systems it becomes a bottleneck.

Delete-on-Write (Cache-Aside Invalidation)

The most common cache invalidation strategy in practice is delete-on-write. On a write, you update the database and then delete the cached key. The next read misses, loads fresh data, and repopulates the cache lazily.

delete_on_write.pypython

def update_user(user_id, data):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    redis.delete(f"user:{user_id}")

Deleting rather than updating avoids writing values nobody will read. The subtle risk is a race: between the database write and the delete, a concurrent read can repopulate the cache with the old value and leave it there. Mitigations include deleting before and after the write, or using a short TTL cache expiration as a backstop so any leaked stale entry self-heals.

Event-Based and Pub/Sub Invalidation

In a distributed cache with many nodes, a single delete is not enough because each node holds its own copy. Event-based invalidation publishes a change event, often driven by change data capture on the database, and every cache node subscribes and invalidates the affected key.

pubsub_invalidation.pypython

db.execute("UPDATE users SET ... WHERE id = %s", user_id)
redis.publish("cache-invalidate", f"user:{user_id}")

This is how you keep caches consistent across services and regions. It adds moving parts: a message bus, ordering guarantees, and the chance that an event is missed, so production systems pair it with a TTL so a missed event still heals eventually.

The Failure Modes of Cache Invalidation

Every cache invalidation strategy has a way it breaks. Naming the failure mode in a system design interview is what separates a shallow answer from a strong one.

Cache stampede, also called thundering herd, happens when a popular entry is invalidated and hundreds of concurrent requests all miss at once and hit the database for the same row. Prevent it with cache locking so only one request repopulates while others wait, staggered TTLs with jitter, and background refresh that reloads an entry just before it expires.

The stale data window is the gap between the source changing and the cache reflecting it. Delete-on-write and TTL cache expiration both leave a window; write-through closes it at the cost of latency. Decide explicitly how large a window your product tolerates.

Distributed inconsistency is the failure mode of multi-node caches. Without event-based invalidation, different nodes serve different versions of the same key. Without ordering guarantees, an out-of-order event can reinstate a stale value, which is why robust systems attach versions or timestamps.

Choosing a Cache Invalidation Strategy

The right cache invalidation strategy falls out of three questions. What is the read-to-write ratio? Read-heavy workloads favor delete-on-write or TTL cache expiration; write-heavy ones make synchronous write-through expensive. How much staleness can you tolerate? Near-zero pushes you to write-through, seconds or minutes allow TTL. Is the cache single-node or distributed? Distributed caches need event-based invalidation plus a TTL as a backstop.

In practice you combine them. A common production pattern is delete-on-write for correctness on each change, plus TTL cache expiration as insurance against missed deletes, plus pub/sub for cross-node consistency. The AWS database caching guidance and the Redis documentation both describe these combinations in depth.

Cache Invalidation in System Design Interviews

If an interviewer asks how you would cache a system, do not stop at putting Redis in front of the database. Say which cache invalidation strategy you would use and why, for example delete-on-write with a short TTL backstop because you can tolerate a few seconds of staleness and writes are infrequent. Then name the failure mode you are guarding against, like cache stampede, and how you would prevent it. For the full set of read and write caching patterns that sit alongside invalidation, see our caching strategies for system design guide, and explore more breakdowns on the Levelop blog.

Frequently asked questions

What is cache invalidation in simple terms?

Cache invalidation is removing or refreshing data stored in the cache once the original data changes, so the cache does not keep serving an outdated copy. It is the mechanism that controls how stale your cache is allowed to get.

What are the main cache invalidation strategies?

The core strategies are TTL cache expiration, write-through invalidation that refreshes the cache on every write, delete-on-write invalidation that removes the key and repopulates lazily, and event-based or pub/sub invalidation for distributed caches. Most production systems combine several.

Why is cache invalidation considered hard?

Because a cache is a copy, and keeping a copy in sync with a changing source across concurrent reads, writes, and multiple cache nodes creates race conditions, stale windows, and consistency problems that are easy to get subtly wrong.

How do you prevent a cache stampede during invalidation?

Use cache locking so only one request repopulates an expired key while others wait, add random jitter to TTLs so keys do not all expire together, and use background refresh to reload hot entries just before they expire.

Should I delete or update the cache on a write?

Deleting with delete-on-write is usually safer because it avoids caching values nobody reads and sidesteps some races, with the next read repopulating fresh data. Updating in place with write-through gives stronger consistency but doubles write latency, so it fits low-write, consistency-critical data.