Table of contents
- When do you use caching in a system design?
- Cache
- Terms of Caching
- Client-Side Caching
- Handle Database pressure
- Cache Invalidation
- Write-through cache
- Write-back cache
- Cache Aside
- Read-Through Cache
- Cache eviction policies
- Conclusion
When do you use caching in a system design?
System design is one of the most important concepts in software engineering. One of the main difficulties in designing a system is that the terminology used in system design resources is hard to grasp at first; besides, you need to know which tool or technique solves which problem. Familiarizing yourself with the basic concepts and terminology of system design greatly helps when designing a system.
This article explores an essential system design topic: caching. In almost all systems, we might need to use caching, as it is one of the main techniques for improving a system's performance.
Cache:
A cache is a piece of hardware or software that stores data so it can be retrieved faster than from other data sources. Caches are generally used to keep frequent responses to user requests; they can also store the results of long-running computations. Caching means storing data in a location different from the main data source such that accessing it is faster. Like load balancers, caches can be used in various places in the system.
Caching is done to avoid redoing the same complex computation again and again. It is also used to improve the running time of algorithms; for example, in dynamic programming we use the memoization technique to reduce time complexity. Caching in system design is similar in spirit: it is used to speed up a system, improve its latency, and reduce the number of network requests.
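To make the memoization analogy concrete, here is a minimal Python sketch (an illustration, not code from any particular system) that caches the results of an expensive recursive computation so each value is computed only once:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # keep every computed result in an in-memory cache
def fib(n: int) -> int:
    """Naive recursive Fibonacci; without the cache this takes exponential time."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(80))  # fast, because intermediate results are served from the cache
```

The same idea scales up from a single function to a whole system: compute or fetch a result once, then serve repeated requests from the faster copy.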
Generally, caches contain the most recently accessed data, because recently requested data is likely to be requested again. So, in such scenarios, we use caching to minimize data retrieval operations against the database. There are different ways to select which data stays in the cache; we will discuss them in a later part of the article.
Terms of Caching:
When requested data is found in the cache, it is called a Cache Hit. When the requested information is not found in the cache, it is called a Cache Miss, and it negatively affects the system; a high miss rate is a sign of a poorly designed cache. To improve performance, we need to increase the hit rate and decrease the miss rate.
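As a rough illustration (a toy sketch, not any production cache API), a cache wrapper can count hits and misses so we can monitor the hit ratio:

```python
class CountingCache:
    """A toy in-memory cache that tracks hits and misses."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1   # cache hit: served from memory
            return self.store[key]
        self.misses += 1     # cache miss: caller must fetch from the primary source
        return None

    def put(self, key, value):
        self.store[key] = value

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

cache = CountingCache()
cache.get("a")            # miss
cache.put("a", 1)
cache.get("a")            # hit
print(cache.hit_ratio())  # 0.5
```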
Data can become stale if the primary source of data gets updated and the cache doesn't. If stale data is not a problem for a system, caching can improve performance significantly. Let's say we are designing the watch-count feature for YouTube videos. It does not matter much if different users see slightly different values for the watch count, so staleness is not a problem in such cases.
Client-Side Caching:
Client-level caching can be done so that the client does not need to send a request to the server every time. Similarly, the server may also use a cache, so that it does not always need to hit the database to fetch data. We can also place a cache between any two components.
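A minimal sketch of client-side caching follows; the `fetch_from_server` function is a hypothetical stand-in for a real network call:

```python
import time

def fetch_from_server(resource_id: str) -> str:
    """Hypothetical slow network call standing in for a real HTTP request."""
    time.sleep(0.5)  # simulate network latency
    return f"payload for {resource_id}"

_client_cache: dict = {}

def get_resource(resource_id: str) -> str:
    """Check the local (client-side) cache before going to the server."""
    if resource_id in _client_cache:
        return _client_cache[resource_id]   # no network call needed
    value = fetch_from_server(resource_id)  # cache miss: hit the server
    _client_cache[resource_id] = value
    return value

get_resource("article-42")  # slow: goes over the network
get_resource("article-42")  # fast: served from the client-side cache
```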
Figure: A client-side cache helps reduce network calls
Handle Database pressure:
Another example of using a cache is access to a popular celebrity's Facebook profile. Say a celebrity publishes a new post with photos on their Facebook profile, and a lot of people are checking that profile for updates. If all the users are requesting the same update and the database has to retrieve the same data every time, it will put huge pressure on the database; in the worst case, it might crash the database. In such cases, we can cache the popular profile's data and serve requests from the cache instead of risking overloading the primary data source.
Figure: A lot of users are requesting the same data, which is putting huge pressure on the database
Let's imagine we are writing an article for Medium. The browser is the client of the system, and Medium is the server. A user submits an article to the server, which then stores it in a database.
Now we may also store the article in the server's cache, so we have two data sources for the same article. The question then becomes: when an article is edited, when do we write to the database and when do we write to the cache? This is why we need to know about cache invalidation techniques; otherwise, we will serve stale data to client requests.
Cache Invalidation
We usually use the cache as a faster data source that keeps a copy of data from the database. If the data is modified in the DB but the cache still contains the previous value, that cached data is called stale.
So, we need a technique for invalidating cache data; otherwise, the application will behave inconsistently. Also, as the cache has limited memory, we need to keep the data stored in it up to date. This process is known as Cache Invalidation.
It is not enough to invalidate stale cache entries; we also have to update the cache with the latest data. Otherwise, the system will look in the cache, not find the data, and go to the database again, which hurts latency. Several techniques are used for cache invalidation; we discuss them below:
Figure: Caching Techniques
Write-through cache:
In this technique, data is written to both the cache and the DB: the cache is updated first, and then the data is written to the database.
This gives us two advantages: cached data provides fast retrieval, so read performance is high, and because the same data is stored in the database, consistency between the cache and the database remains intact. Since the data is also kept in the database, we have a copy in case of a cache failure, so data will not be lost.
But in system design no technique is perfect; there is always a downside to consider. Although this technique minimizes the risk of data loss, we need to write to two data sources before returning a success notification to the client, so every write or update operation has higher latency.
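A minimal sketch of the write-through flow (the dictionaries below are stand-ins for a real cache such as Redis and a real database):

```python
cache = {}      # stands in for an in-memory cache (e.g. Redis)
database = {}   # stands in for the primary data store

def write_through(key, value):
    """Write-through: update the cache, persist to the database,
    and only then acknowledge the write to the client."""
    cache[key] = value       # step 1: update the cache
    database[key] = value    # step 2: write to the DB (this is the extra latency)
    return "OK"              # step 3: acknowledge only after both writes succeed

def read(key):
    # Reads are fast and consistent because the cache mirrors the database.
    return cache.get(key, database.get(key))

write_through("user:1", {"name": "Alice"})
print(read("user:1"))
```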
Write-back cache:
This one is a bit different: in the other approaches the data is also written to the database, but here data is written only to the cache at first. As soon as the data is written to the cache, a completion notification is sent to the client; the write to the database happens later, after a time interval. This technique is useful when the application is write-heavy, because it provides low write latency and reduces the frequency of database writes.
But, as you can already guess, this performance improvement comes with the risk of losing data if the cache crashes. Because the cache holds the only copy of the newly written data, we need to be careful: if the cache fails before the DB is updated, the data might be lost.
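A minimal write-back sketch (again using plain dictionaries as stand-ins; in a real system the flush would run on a timer or background worker):

```python
cache = {}          # fast store; receives every write immediately
database = {}       # durable store; updated later, in batches
dirty_keys = set()  # keys written to the cache but not yet persisted

def write_back(key, value):
    """Write-back: acknowledge as soon as the cache is updated."""
    cache[key] = value
    dirty_keys.add(key)  # remember that the DB is now behind the cache
    return "OK"          # low latency: no database write on the hot path

def flush():
    """Persist dirty entries to the database. If the cache crashes before
    flush() runs, the un-flushed writes are lost."""
    for key in list(dirty_keys):
        database[key] = cache[key]
    dirty_keys.clear()

write_back("post:7", "draft text")
flush()  # in a real system this runs periodically in the background
```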
Cache Aside:
In this strategy, the cache works alongside the database, trying to reduce hits on the DB as much as possible. When the user sends a request, the system first looks for the data in the cache. If the data is found, it is simply returned to the user and the database does not need to be involved. If the data is not found in the cache, it is retrieved from the database, the cache is updated with this data, and then the data is returned to the user. So the next time anybody requests the same data, it is available in the cache.
This approach works best for read-heavy systems where the data is not updated frequently. For instance, user profile data on Medium (user name, email, user id, etc.) normally does not need to be updated often.
The problem with this approach is that the data in the cache and the database can become inconsistent. To limit this, each cache entry is given a TTL (Time To Live); after that interval, the data is invalidated from the cache.
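A minimal cache-aside sketch with a TTL (the `database` dictionary is a stand-in for the real primary store):

```python
import time

database = {"user:1": {"name": "Alice"}}  # stands in for the primary DB
cache = {}                                # key -> (value, expiry timestamp)
TTL_SECONDS = 60

def get(key):
    """Cache-aside: the application checks the cache first and, on a miss,
    reads the database and populates the cache with a TTL."""
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.time() < expires_at:
            return value          # cache hit, entry is still fresh
        del cache[key]            # expired entry: treat it as a miss
    value = database.get(key)     # cache miss: go to the database
    if value is not None:
        cache[key] = (value, time.time() + TTL_SECONDS)
    return value

print(get("user:1"))  # first call: reads the DB and fills the cache
print(get("user:1"))  # second call: served from the cache
```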
Read-Through Cache
This is similar to the Cache Aside strategy; the difference is that the cache always stays consistent with the database, and the cache library (or service) takes responsibility for loading the data and maintaining that consistency, rather than the application.
A problem with this approach is that the very first request for a piece of information is always a cache miss, and the system has to load the data into the cache before returning the response. We can pre-load (warm) the cache with the information that is most likely to be requested by users.
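A minimal read-through sketch (illustrative only: the cache is given a `loader` callable so it can fetch from the database itself, and the application talks only to the cache):

```python
database = {"profile:9": {"name": "Bob"}}  # stands in for the primary DB

class ReadThroughCache:
    """The application talks only to the cache; on a miss the cache itself
    loads the value from the backing store via the `loader` callable."""

    def __init__(self, loader):
        self.loader = loader
        self.store = {}

    def get(self, key):
        if key not in self.store:               # first request: cache miss
            self.store[key] = self.loader(key)  # the cache loads from the DB
        return self.store[key]

    def warm(self, keys):
        """Pre-load entries that are likely to be requested (cache warming)."""
        for key in keys:
            self.get(key)

profile_cache = ReadThroughCache(loader=lambda k: database.get(k))
profile_cache.warm(["profile:9"])      # avoid the first-request miss
print(profile_cache.get("profile:9"))  # served from the cache
```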
Cache eviction policies:
The cache does not have a vast amount of space like a database. Also, stale data may need to be removed from the cache. So, cache eviction policies are an important factor to consider while designing a cache. Below are some of the most common cache eviction policies:
First In First Out (FIFO): In this policy, the cache behaves like a queue. It evicts the entry that was added first, without considering how often or how many times it was accessed.
Last In First Out (LIFO): This one is the opposite of FIFO: the cache removes the data that was most recently added, again without considering how many times it was accessed.
Least Recently Used (LRU): In this policy, the cache discards the least recently used data. If an entry has not been used recently, we assume there is less chance of it being requested again, so evicting it makes room for more recently used data (see the sketch after this list).
Least Frequently Used (LFU): Here, we count how often each cache item is accessed, and the items with the lowest access frequency are discarded first. The idea is that rarely used data is wasting cache space, so we remove it and fill the cache with fresher data.
Random Selection: Here, the system randomly selects a data item from the cache and removes it when space is needed. It is simple to implement, but a randomly chosen victim might be an item that is needed the most.
Which eviction policy to use depends on the system you are designing; according to the requirements, we select an eviction policy for the system.
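LRU is probably the most widely used of these policies; here is a minimal sketch of it (an illustration built on Python's `OrderedDict`, not any particular library's implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is reached."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.store = OrderedDict()   # keeps keys ordered by last use

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)  # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used key

lru = LRUCache(capacity=2)
lru.put("a", 1)
lru.put("b", 2)
lru.get("a")         # "a" becomes the most recently used entry
lru.put("c", 3)      # cache is full, so "b" (least recently used) is evicted
print(lru.get("b"))  # None
```

Conclusion:
Caching is a key component of the performance of any system; it ensures low latency and high throughput. A cache keeps a copy of part of the primary data store and has a limited amount of space, but retrieving data from it is faster than going to the original data source, such as a database. Caches can be kept at any level of the system, but keeping a cache near the front end lets us return the requested data quickly. Caching is beneficial to performance, but it has some pitfalls: while designing a system, we need to be careful about the staleness of cached data.
If the data is mostly static, caching is an easy way to improve performance. For data that is edited often, the cache is trickier to implement. Used well, caching enables the system to make better use of its resources.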