Rails Cache. Introduction to optimizing Apps
As long as your application is small and has a limited number of visitors, everything runs smoothly and fast. At this stage you don’t have to worry (well, you should, but let’s assume that time to market was the key factor) about making it more performant by focusing on website optimization. But as your website grows and gains popularity and visitors, loading time starts to increase...
And as performance goes down, users get more and more annoyed waiting for the website to load. At this point it’s about time to boost the speed of your website. But before you go to Heroku to add more workers, wasn’t there something you forgot earlier? Let’s take a step back and add some cache - in the end it might save the day and make your application capable of handling much more traffic. Learn more about Ruby on Rails cache.
What is cache and where to use it?
Web pages are largely built from parts that rarely change. Normally an application has to process everything that is defined in a view, for example listing objects in a loop or reading attributes from an object, each and every time a browser requests the page. That is a huge waste of time and resources that could be used for something else. A cache lets us store the results of these repetitive parts of view logic and reuse them every time the page is requested again.
How does Rails cache work?
You can cache a whole page and almost completely skip any database queries by sending a pre-saved rendering result to your visitors - once implemented, this method would probably save the most time and resources. But imagine even a simple news site with posts and comments. People add new content all the time, so comment counters change. If we implemented a full page cache, visitors to the main page would keep seeing a stale count. Of course, we could “invalidate” (delete) the cache whenever someone adds a new comment, but that would throw away the whole page for a single changed number. So what about caching just parts of a page, i.e. every post block separately?
In Rails this very specific kind of cache is called fragment cache, and in this article we are going to focus mainly on it.
Here is an example of a cache definition in our code:
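A minimal, runnable sketch of such a block. It assumes an Article struct in place of an ActiveRecord model, and the ViewSketch class with its hand-written cache helper is hypothetical: it only mimics the Rails view helper of the same name so that the template can be rendered with plain ERB outside Rails. In a real view only the template part would be needed.

```ruby
require "erb"

# Stand-in for an ActiveRecord model (assumption: only these fields matter here).
Article = Struct.new(:id, :title, :updated_at)

# Hypothetical mini view context; in Rails the `cache` helper is provided for you.
class ViewSketch
  TEMPLATE = <<~ERB
    <% cache article do %>
      <h2><%= render_title(article) %></h2>
    <% end %>
  ERB

  attr_reader :render_count

  def initialize(store)
    @store = store          # any Hash acts as the cache store in this sketch
    @render_count = 0
  end

  def render(article)
    ERB.new(TEMPLATE, eoutvar: "@out").result(binding)
  end

  private

  # Simplified helper: the key is built from the record id and updated_at.
  def cache(record)
    key = "articles/#{record.id}-#{record.updated_at.to_i}"
    if @store.key?(key)
      @out << @store[key]             # hit: reuse the stored HTML, skip the block
    else
      start = @out.length
      yield                           # miss: run the block once
      @store[key] = @out[start..-1]   # and remember what it produced
    end
    nil
  end

  # The "expensive" work we do not want to repeat on every request.
  def render_title(article)
    @render_count += 1
    article.title.upcase
  end
end

store   = {}
view    = ViewSketch.new(store)
article = Article.new(2, "Hello cache", Time.utc(2016, 10, 18, 19, 8, 55))

first  = view.render(article)
second = view.render(article)

puts first == second      # the same HTML is served both times
puts view.render_count    # the block body ran only on the first request
```

The second call never reaches render_title, which is the whole point of wrapping that part of the view in a cache block.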
As we can see, we use the cache method to wrap the code where the data is processed in some way. On the first request the cache object (in memory, in a file on disk, etc.) won’t be found, so the code inside the block is executed and a new cache object is created. From then on, whenever the same cache block is reached, Rails looks up the cache object for the object passed as the first argument and, if its key still matches the one generated on the first request, reuses the stored result rather than executing the code again. So the question now is how the app recognizes that a given ID represents this particular cache block and not a different one. After the word cache we pass the argument from which a key unique to this cache block is built; the app generates the key from that argument and searches the store for a matching entry. For a block like the one in the example above, the cache key might look like this:
views/articles/2-20161018190855530909000/g5ebea385672ogt530mjkirh046102w4
This is the default key generated when we pass an object as the argument. The first part, views/articles/2, is the views prefix followed by the model name and the record id; the long number after the dash is the object’s updated_at attribute; and the last part is a digest generated from the view template. The exact layout of the key varies between Rails versions.
The most important role of the key is to keep us from using outdated cache objects that no longer reflect the current state of the data presented on the page. As stated before, the generated key includes the rendered object’s updated_at attribute, so every time something changes in that object, updated_at changes its value as well. From that moment the generated key differs from the one stored previously, so no cache object is found and a new one is generated from the current state of the data.
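The way the key expires a fragment can be sketched in plain Ruby. The fragment_key helper below is hypothetical and only imitates the shape of the key shown above; in particular, Rails derives the digest from the template and its dependencies, whereas this sketch simply hashes a template string.

```ruby
require "digest"

# Stand-in for an ActiveRecord model.
Article = Struct.new(:id, :title, :updated_at)

# Hypothetical helper: prefix, record part (id + updated_at), template digest.
def fragment_key(article, template_source)
  stamp  = article.updated_at.getutc.strftime("%Y%m%d%H%M%S%N")
  digest = Digest::MD5.hexdigest(template_source)
  "views/articles/#{article.id}-#{stamp}/#{digest}"
end

template = "<h2><%= article.title %></h2>"
article  = Article.new(2, "Hello", Time.utc(2016, 10, 18, 19, 8, 55))

old_key = fragment_key(article, template)

# Editing the record bumps updated_at (ActiveRecord does this on save).
article.title      = "Hello again"
article.updated_at = Time.utc(2016, 10, 19, 9, 0, 0)

new_key = fragment_key(article, template)

puts old_key
puts new_key
puts old_key == new_key   # false: the old entry is never looked up again
```

Because the template digest is part of the key as well, editing the view itself also produces a new key, so stale markup is not served after a deploy.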
Russian doll caching
During my adventure with caching this method turned out to be very useful. It means using cache blocks inside another cache block. It is mainly used with loops, just like in the example:
In controller:
In view:
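Both pieces can be sketched in one runnable script. The assumptions: Article is a plain struct instead of an ActiveRecord model, the hypothetical ArticlesView class plays the controller (its index method) and the view (its template), and its cache helper is a simplified stand-in for the Rails helper. In a Rails app the controller would typically assign @articles and the template would sit in the view file.

```ruby
require "erb"

Article = Struct.new(:id, :title, :updated_at)

# Hypothetical stand-in for a Rails controller + view pair.
class ArticlesView
  # The view: one cache block around the loop, one around each article.
  TEMPLATE = <<~ERB
    <% cache @articles do %>
      <% @articles.each do |article| %>
        <% cache article do %>
          <li><%= render_title(article) %></li>
        <% end %>
      <% end %>
    <% end %>
  ERB

  attr_reader :rendered_ids

  def initialize(store)
    @store = store
    @rendered_ids = []
  end

  # The controller action: assigns @articles, then renders the view.
  def index(articles)
    @articles = articles
    @rendered_ids = []
    ERB.new(TEMPLATE, eoutvar: "@out").result(binding)
  end

  private

  def cache(subject)
    key = key_for(subject)
    if @store.key?(key)
      @out << @store[key]             # hit: reuse stored HTML
    else
      start = @out.length
      yield                           # miss: render the block
      @store[key] = @out[start..-1]   # store it, nested fragments included
    end
    nil
  end

  # Collection key: newest updated_at. Record key: id + updated_at.
  def key_for(subject)
    if subject.is_a?(Array)
      "articles/all-#{subject.map(&:updated_at).max.to_i}"
    else
      "articles/#{subject.id}-#{subject.updated_at.to_i}"
    end
  end

  def render_title(article)
    @rendered_ids << article.id       # record which articles were really rendered
    article.title
  end
end

store = {}
view  = ArticlesView.new(store)
t     = Time.utc(2016, 10, 18)
articles = [Article.new(1, "First", t), Article.new(2, "Second", t)]

view.index(articles)
p view.rendered_ids   # => [1, 2]  cold cache, everything is rendered

view.index(articles)
p view.rendered_ids   # => []      the outer fragment is reused as a whole

articles[1].title = "Second, edited"
articles[1].updated_at = t + 60
view.index(articles)
p view.rendered_ids   # => [2]     only the edited article is rendered again
```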
We have N cache objects, one for every article object, plus one cache object for the entire loop. When nothing changes in the upper key, its cache object is found and rendered, and the content of the cache block is not executed. When something changes, the upper key no longer matches, so the loop runs and every lower key is checked: invalid lower cache objects are regenerated and valid ones are reused.
As you can see, we are passing an ActiveRecord::Relation object as the upper key. In this case the method adds to the key the updated_at attribute of the most recently updated object in the passed relation.
From an efficiency perspective, if no element has changed, we generate and look up a cache key only once; otherwise we need to iterate through the elements and check one key per object. Caching the entire loop as well pays off because, after a change is detected in the upper cache block, we can still reuse all the lower cache objects that remain valid.
The difficulty of a good cache key
The examples above were pretty simple and mostly contained a single object. But what if we are caching not information from one object, but entire partials or larger fragments of code containing data from associated objects, just like in the example below:
The code in this example only seems to work properly. If you change the article name, the cached object changes, so in theory everything is fine. But take a closer look - the key contains only the updated_at attribute from the article object, while we are caching information that belongs not only to the article but also to the article’s user (probably its author). If the username changed, the change would not be reflected in the view. Why? It’s simple - the cache key uses updated_at from article, not from user, so the key stays the same and the old data is displayed. To make it work we need to pass an array containing all the objects that are rendered in the cache block. Passing the objects in an array creates a key containing the updated_at attribute of each of them. It should look like this:
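In a view this amounts to passing the array to the helper, as in cache [article, article.user]. The effect on the key can be sketched in plain Ruby; the structs and the record_key and cache_key_for helpers below are hypothetical simplifications (no template digest, second-level timestamps), not the Rails implementation.

```ruby
User    = Struct.new(:id, :name, :updated_at)
Article = Struct.new(:id, :title, :user, :updated_at)

# Simplified record key: pluralized model name, id and updated_at.
def record_key(record)
  model = record.class.name.downcase + "s"
  "#{model}/#{record.id}-#{record.updated_at.to_i}"
end

# Every object passed in contributes its own part to the key.
def cache_key_for(*records)
  "views/" + records.map { |r| record_key(r) }.join("/")
end

t       = Time.utc(2016, 10, 18)
author  = User.new(5, "alice", t)
article = Article.new(2, "Hello", author, t)

article_only = cache_key_for(article)
with_author  = cache_key_for(article, article.user)

# The author changes their name; the article row itself is untouched.
author.name       = "alice_b"
author.updated_at = t + 3600

puts cache_key_for(article) == article_only               # true: the stale fragment still matches
puts cache_key_for(article, article.user) == with_author  # false: the fragment is regenerated
```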
When it comes to associations there is also a different, simpler approach to updating cache objects. In the model of the dependent object, we define the belongs_to association in this way (note the touch: true option passed to the belongs_to call):
And thanks to it we only need:
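With touch: true, every time the object that declares belongs_to is saved or destroyed, the updated_at attribute of the object it belongs to is updated as well. So when the user is the dependent object, a change to the user also bumps updated_at on the article, and the article alone is enough as the cache key. Keep the direction in mind: the touch travels from the model that declares belongs_to to the record it points at, never the other way round. If in your schema the article is the one that belongs to the user, this option will not refresh the article when the user changes, and both objects still have to go into the key.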
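The effect of touch: true can be sketched in plain Ruby, assuming the user is the dependent object that declares the association. The User class and its save method are hypothetical stand-ins: save only imitates what ActiveRecord does for a belongs_to declared with touch: true, and article_key is a simplified key without a template digest.

```ruby
# Parent record: the one whose fragment is cached.
Article = Struct.new(:id, :title, :updated_at)

# Dependent record; `save` imitates `belongs_to :article, touch: true`.
class User
  attr_reader :article, :updated_at
  attr_accessor :name

  def initialize(name, article, now)
    @name = name
    @article = article
    @updated_at = now
  end

  def save(now)
    @updated_at = now
    article.updated_at = now   # the "touch": the parent's timestamp follows the child
  end
end

# Key built from the article only, as in a cache block that receives just the article.
def article_key(article)
  "articles/#{article.id}-#{article.updated_at.to_i}"
end

t       = Time.utc(2016, 10, 18)
article = Article.new(2, "Hello", t)
user    = User.new("alice", article, t)

before = article_key(article)

user.name = "alice_b"
user.save(t + 3600)

puts article_key(article) == before   # false: the article's fragment expires too
```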
Keep in mind that whether a cache works properly and is refreshed when it is supposed to be depends on your diligence. We need to pay attention to what the partial contains and what can change in there. It may be an entire object, just some attributes, or some data processed earlier. The best thing to do is to analyze the fragment very carefully and note down every kind of object that may change in it. Then you should pass these objects in an array.
For example:
In the controller:
And in the view:
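A plain Ruby sketch of this setup. Only the name articles_from_category comes from the text; the structs, the order of the array elements and the key helpers are assumptions made for illustration. In Rails the controller would build an ActiveRecord::Relation and the view would call cache with an array such as [category, articles_from_category]; here an Array and a simplified key stand in for both.

```ruby
Category = Struct.new(:id, :name, :updated_at)
Article  = Struct.new(:id, :title, :category_id, :updated_at)

# Plays the controller: select the articles of one category.
def articles_from_category(articles, category)
  articles.select { |a| a.category_id == category.id }
end

# Simplified key part for a collection: only the newest updated_at counts.
def collection_key(records)
  "articles/all-#{records.map(&:updated_at).max.to_i}"
end

def category_key(category)
  "categories/#{category.id}-#{category.updated_at.to_i}"
end

# Plays the cache call in the view: one key part per array element.
def upper_key(category, records)
  ["views", category_key(category), collection_key(records)].join("/")
end

t     = Time.utc(2016, 10, 18)
ruby  = Category.new(1, "Ruby", t)
other = Category.new(2, "Other", t)
all   = [
  Article.new(1, "Caching", ruby.id, t),
  Article.new(2, "Routing", ruby.id, t),
  Article.new(3, "Unrelated", other.id, t)
]

key = upper_key(ruby, articles_from_category(all, ruby))

all[2].updated_at = t + 60   # an article from another category changes
puts upper_key(ruby, articles_from_category(all, ruby)) == key   # true

all[0].updated_at = t + 120  # an article from this category changes
puts upper_key(ruby, articles_from_category(all, ruby)) == key   # false
```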
Just like in the example above, you can pass an ActiveRecord::Relation (articles_from_category) inside an array. This creates an upper key containing the updated_at value of the most recently updated article associated with the category, along with the category’s updated_at attribute.
So, to sum up and clear things up: if we pass an ActiveRecord::Relation object, the cache method takes the updated_at value of only the most recently updated element of the relation (on Rails 5 and later the relation’s key also encodes the query and the number of records, so adding or removing records changes it too). If we pass an array of objects, it takes the updated_at attribute of each of them and puts them all into the key.
Rails cache configuration
If you want to start playing around with caching and test how it works, go to config/environments/development.rb in your project folder and add this line:
config.action_controller.perform_caching = true
Remember to restart your server if your app is still running. From now on, all your cache blocks are going to be saved. You can find them in the tmp/cache folder.
Rails 5 introduced a new approach to enabling caching in development mode.
You just need to run rails dev:cache in your console, which creates a caching-dev.txt file in the tmp directory.
The presence of this file switches on the cache configuration in your config/environments/development.rb file, which by default looks like this:
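As a sketch, this is roughly what that section contains in an app generated with Rails 5.0. Treat the exact values as an assumption and compare them with your own file, since later Rails versions changed some details. The fragment sits inside the Rails.application.configure block.

```ruby
# config/environments/development.rb (excerpt)

# Enable/disable caching. By default caching is disabled.
if Rails.root.join('tmp/caching-dev.txt').exist?
  config.action_controller.perform_caching = true

  config.cache_store = :memory_store
  config.public_file_server.headers = {
    'Cache-Control' => 'public, max-age=172800'
  }
else
  config.action_controller.perform_caching = false

  config.cache_store = :null_store
end
```

Note that with :memory_store the fragments live in the server process rather than in the tmp/cache folder.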
Of course you can modify those settings according to your needs.
Summary: Rails cache
Caching is without doubt very useful, but it needs to be used with caution. Every time we create a new cache entry it takes extra time. So if your page gets little traffic, it may happen that the cache is regenerated on almost every visit and never reused by anyone else. In that case caching is useless and makes performance worse for the small group of people who want to use your app. The line is thin here: overusing the cache may slow down your page and exhaust all your free memory. But when it is done with care, it will boost your app and make life easier both for you and your users.
Congratulations! Now your website is successfully optimized and ready to serve a larger number of visitors. So why not go with the flow and gain those visitors by internationalizing your app? Read about Rails internationalization.
Photo from: Sanwal Deen.