When your Ruby on Rails website gets famous, you're going to wish you had implemented proper caching. Are you worried? Maybe just a little?
This tutorial is going to show everything you need to know to use Caching in your Rails applications, so when you get digg'd or slashdot'd you won't be left begging your hosting provider for more CPU processing power.
Since there are so many different types of caching, I'm going to split this up into several blog entries. Each one will build on the previous, talking about more complex types of caching and how to implement them. We'll even discuss some advanced caching plugins people have written for customized caching.
Today we're going to dive into the FASTEST rails caching mechanism, page caching!
Table of Contents
- Why for art thou caching?
- Configuration
- Page Caching
- Page caching with pagination
- Cleaning up your cache
- Sweeping up your mess
- Playing with Apache/Lighttpd
- Moving your cache
- Clearing out your whole/partial cache
- Advanced page caching techniques
- Testing your page caching
- Conclusion
Why for art thou caching?
(Feel free to skip this if you're a l33t hax0r.) Ruby is what we call an "interpreted programming language" (as you probably already know). What this means is that your code does not get translated into machine code (the language your computer talks) until someone actually runs it.
If you're a PHP developer, you're probably saying "No duh!" about now. PHP is also an interpreted language. Java code, on the other hand, needs to be compiled before it can be executed.
Unfortunately this means that every time someone surfs onto your Ruby on Rails website, your code gets read and processed that instant. As you can probably imagine, handling more than 100 requests a second can take a great deal of processor power. So how can we speed things up?
Caching!
Caching, in the web application world, is the art of taking a processed web page (or part of a webpage), and storing it in a temporary location. If another user requests this same webpage, then we can serve up the cached version.
Loading up a cached webpage can not only save us from having to do ANY database queries, it can even allow us to serve up websites without touching our Ruby on Rails Server. Sounds kinda magical doesn't it? Keep on reading for the good stuff.
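If the idea still feels abstract, here's a tiny plain-Ruby sketch of it. This is not how Rails does it internally; TinyPageCache and its methods are names I made up purely for illustration. It just shows the "render once, serve the stored copy afterwards" idea:

```ruby
# Toy page cache: render a page once, then serve the stored copy.
# TinyPageCache is a made-up class for this sketch, not a Rails API.
class TinyPageCache
  attr_reader :render_count

  def initialize
    @store = {}        # path => rendered HTML
    @render_count = 0  # how many times we did the expensive work
  end

  # Returns the cached HTML for +path+, running the block to render it
  # only when no cached copy exists yet.
  def fetch(path)
    return @store[path] if @store.key?(path)
    @render_count += 1
    @store[path] = yield
  end
end

cache = TinyPageCache.new
first  = cache.fetch("/blog/list") { "<ul><li>Hello</li></ul>" }
second = cache.fetch("/blog/list") { "<ul><li>never rendered</li></ul>" }
```

The second call never runs its block, so both variables hold the HTML from the first render.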
Before we get our feet wet, there's one small configuration step you need to take.
Configuration
There's only one thing you'll need to do to start playing with caching, and this is only needed if you're in development mode. Look for the following line and change it to true in your /config/environments/development.rb:

    config.action_controller.perform_caching = true

Normally you probably don't want to bother with caching in development mode, but we want to try it out right away!
Page Caching
Page caching is the FASTEST Rails caching mechanism, so you should do it if at all possible. Where should you use page caching?
- If your page is the same for all users.
- If your page is available to the public, with no authentication needed.
If your app contains pages that meet these requirements, keep on reading. If it doesn't, you probably should know how to use it anyways, so keep reading!
Say we have a blog page (Imagine that!) that doesn't change very often. The controller code for our front page might look like this:

    class BlogController < ApplicationController
      def list
        @posts = Post.find(:all, :order => "created_on desc", :limit => 10)
      end
      ...

As you can see, our List action queries the latest 10 blog posts, which we can then display on our webpage. If we wanted to use page caching to speed things up, we could go into our blog controller and do:

    class BlogController < ApplicationController
      caches_page :list

      def list
        @posts = Post.find(:all, :order => "created_on desc", :limit => 10)
      end
      ...

The "caches_page" directive tells our application to take the HTML produced the next time the "list" action is requested and store it in a cached file.
If you ran this code using Mongrel, the first time the page is viewed your /log/development.log would look like this:

    Processing BlogController#list (for 127.0.0.1 at 2007-02-23 00:58:56) [GET]
      Parameters: {"action"=>"list", "controller"=>"blog"}
    SELECT * FROM posts ORDER BY created_on LIMIT 10
    Rendering blog/list
    Cached page: /blog/list.html (0.00000)
    Completed in 0.18700 (5 reqs/sec) | Rendering: 0.10900 (58%) | DB: 0.00000 (0%) | 200 OK [http://localhost/blog/list]

See the line where it says "Cached page: /blog/list.html". This is telling you that the page was loaded, and the resulting html was stored in a file located at /public/blog/list.html. If you looked in this file you'd find plain html with no ruby code at all.
Subsequent requests to the same URL will now hit this HTML file rather than reloading the page. As you can imagine, serving a static HTML page is much faster than loading and processing an interpreted programming language. Like 100 times faster!
However, it is very important to note that loading page-cached .html files does not invoke Rails at all! What this means is that if any content on the page is dynamic from user to user, or the page is secure in some fashion, then you can't use page caching. Rather, you'd probably want to use action or fragment caching, which I will cover in part 2 of this tutorial.
What if we then say in our controller:

    caches_page :show

Where do you think the cached page would get stored when we visited "/blog/show/5" to show a specific blog post?
The answer is /public/blog/show/5.html
Here are a few more examples of where page caches are stored:

    http://localhost:3000/blog/list        => /public/blog/list.html
    http://localhost:3000/blog/edit/5      => /public/blog/edit/5.html
    http://localhost:3000/blog             => /public/blog.html
    http://localhost:3000/                 => /public/index.html
    http://localhost:3000/blog/list?page=2 => /public/blog/list.html

Hey, wait a minute, notice how the first item above maps to the same file as the last item? Yup, page caching is going to ignore all additional parameters on your URL.
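To make the mapping rule easier to see, here's a rough sketch in plain Ruby. The page_cache_file helper is hypothetical (it is not Rails' own code) and it only assumes the behaviour described above: the query string is dropped, and the site root becomes index.html.

```ruby
require "uri"

# Hypothetical helper, not part of Rails: maps a request URL to the file
# that page caching would write, based on the behaviour described above.
def page_cache_file(url, public_dir = "/public")
  path = URI.parse(url).path       # drops "?page=2" and anything after it
  path = path.chomp("/")           # treat "/blog/" like "/blog"
  path = "/index" if path.empty?   # the site root is stored as index.html
  "#{public_dir}#{path}.html"
end

page_cache_file("http://localhost:3000/blog/list")         # "/public/blog/list.html"
page_cache_file("http://localhost:3000/blog/list?page=2")  # "/public/blog/list.html"
page_cache_file("http://localhost:3000/")                  # "/public/index.html"
```

Both list URLs land on the same file, which is exactly why a "?page=2" link can never get its own cached page.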
But what if I want to cache my pagination pages?
Very interesting question, and a more interesting answer. In order to cache your different pages, you just have to create a differently formed url. So instead of linking "/blog/list?page=2", which wouldn't work because caching ignores additional parameters, we would want to link using "/blog/list/2", but instead of 2 being stored in params[:id], we want that 2 on the end to be params[:page].
We can make this configuration change in our /config/routes.rb

    map.connect 'blog/list/:page',
      :controller => 'blog',
      :action => 'list',
      :requirements => { :page => /\d+/ },
      :page => nil

With this new route defined, we can now do:

    <%= link_to "Next Page", :controller => 'blog', :action => 'list', :page => 2 %>

The resulting URL will be "/blog/list/2". When we click this link two great things will happen:
- Rather than storing the 2 in params[:id], which is the default, the application will store the 2 as params[:page],
- The page will be cached as /public/blog/list/2.html
The moral of the story: if you're going to use page caching, make sure all the parameters you require are part of the URL path, not after the question mark! Many thanks to Charlie Bowman for inspiration.
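If you're curious what a route like 'blog/list/:page' with a digits-only requirement boils down to, here's a rough stand-in written as a plain regular expression. Rails' real router is far more general; LIST_ROUTE and recognize_list_path are names invented for this sketch.

```ruby
# Rough stand-in for a 'blog/list/:page' route with a /\d+/ requirement
# and an optional page segment. Illustrative only, not Rails' router.
LIST_ROUTE = %r{\A/blog/list(?:/(\d+))?\z}

def recognize_list_path(path)
  match = LIST_ROUTE.match(path)
  return nil unless match
  # The captured digits end up as :page, not :id.
  { :controller => "blog", :action => "list", :page => match[1] }
end

recognize_list_path("/blog/list/2")    # :page is "2"
recognize_list_path("/blog/list")      # :page is nil (the default)
recognize_list_path("/blog/list/abc")  # nil, the requirement rejects it
```

Because the page number now lives in the path, each page gets its own cache file.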
Cleaning up the cache
You must be wondering, "What happens if I add another blog post and then refresh /blog/list at this point?"
Nothing. Well, not quite nothing. We would see the cached /blog/list.html file that was generated a minute ago, but it won't contain our newest blog entry.
To remove this cached file so a new one can be generated we'll need to expire the page. To expire the two pages we listed above, we would simply run:

    # This will remove /blog/list.html
    expire_page(:controller => 'blog', :action => 'list')

    # This will remove /blog/show/5.html
    expire_page(:controller => 'blog', :action => 'show', :id => 5)

We could obviously go and add this to every place where we add/edit/remove a post, and paste in a bunch of expires, but there is a better way!
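Before we get to that better way, here's a minimal plain-Ruby sketch of what caching and expiring a page amounts to on disk. The write_page and expire_page_file helpers are hypothetical, and a temp directory stands in for /public. It assumes expiring is simply deleting the cached file.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helpers showing the file-level effect of caching and
# expiring a page. They are not Rails' implementation.
def write_page(public_dir, path, html)
  file = File.join(public_dir, "#{path}.html")
  FileUtils.mkdir_p(File.dirname(file))
  File.write(file, html)
  file
end

def expire_page_file(public_dir, path)
  file = File.join(public_dir, "#{path}.html")
  File.delete(file) if File.exist?(file)
end

public_dir = Dir.mktmpdir                    # stand-in for /public
write_page(public_dir, "blog/list", "<ul>old posts</ul>")
expire_page_file(public_dir, "blog/list")    # the next request regenerates it
```

Once the file is gone, the front-end server finds nothing static to serve, so the request falls through to Rails and a fresh copy gets written.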
Sweepers
Sweepers are pieces of code that automatically delete old caches when the data on the cached page gets old. To do this, sweepers observe one or more of your models. When a model is added/updated/removed, the sweeper gets notified and then runs those expire lines I listed above.
Sweepers can be created in your controllers directory, but I think they should be separated, which you can do by adding this line to your /config/environment.rb.

    Rails::Initializer.run do |config|
      # ...
      config.load_paths += %W( #{RAILS_ROOT}/app/sweepers )
      # ...
    end

(don't forget to restart your server after you do this)
With this code, we can create an /app/sweepers directory and start creating sweepers. So, let's jump right into it. /app/sweepers/blog_sweeper.rb might look like this:

    class BlogSweeper < ActionController::Caching::Sweeper
      observe Post # This sweeper is going to keep an eye on the Post model

      # If our sweeper detects that a Post was created call this
      def after_create(post)
        expire_cache_for(post)
      end

      # If our sweeper detects that a Post was updated call this
      def after_update(post)
        expire_cache_for(post)
      end

      # If our sweeper detects that a Post was deleted call this
      def after_destroy(post)
        expire_cache_for(post)
      end

      private
      def expire_cache_for(record)
        # Expire the list page now that we posted a new blog entry
        expire_page(:controller => 'blog', :action => 'list')

        # Also expire the show page, in case we just edited a blog entry
        expire_page(:controller => 'blog', :action => 'show', :id => record.id)
      end
    end

NOTE: We can call "after_save", instead of "after_create" and "after_update" above, to dry out our code.
We then need to tell our controller when to invoke this sweeper, so in /app/controllers/blog_controller.rb:

    class BlogController < ApplicationController
      caches_page :list, :show
      cache_sweeper :blog_sweeper, :only => [:create, :update, :destroy]
      ...

If we then try creating a new post we would see the following in our /log/development.log:

    Expired page: /blog/list.html (0.00000)
    Expired page: /blog/show/3.html (0.00000)

That's our sweeper at work!
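If the observe/callback flow feels like magic, here's a stripped-down imitation in plain Ruby. ToyPost and ToySweeper are invented for this sketch; real sweepers are wired in by Rails, and this only mimics the "model changes, sweeper gets notified, cache paths get expired" sequence.

```ruby
# Plain-Ruby imitation of the observe/callback flow. ToyPost and
# ToySweeper are invented names, not Rails classes.
class ToyPost
  @observers = []
  class << self
    attr_reader :observers
  end

  attr_reader :id

  def initialize(id)
    @id = id
  end

  # Saving notifies every registered observer, as a model callback would.
  def save
    self.class.observers.each { |observer| observer.after_save(self) }
    true
  end
end

class ToySweeper
  attr_reader :expired

  def initialize
    @expired = []
  end

  # Record which cached pages would be expired for this post.
  def after_save(post)
    @expired << "/blog/list.html" << "/blog/show/#{post.id}.html"
  end
end

sweeper = ToySweeper.new
ToyPost.observers << sweeper
ToyPost.new(3).save
sweeper.expired  # ["/blog/list.html", "/blog/show/3.html"]
```

The controller code never mentions the cache here, which is the whole appeal of sweepers.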
Playing nice with Apache/Lighttpd
When deploying to production, many Rails applications still use Apache as a front end, and dynamic Ruby on Rails requests get forwarded to a Rails server (Mongrel or Lighttpd). However, since page caching writes out pure HTML files, we can tell Apache to check whether the page being requested exists in static .html form. If it does, Apache can serve the requested page without even touching our Ruby on Rails server!
Our httpd.conf might look like this:

    <VirtualHost *:80>
      ...
      # Configure mongrel_cluster
      <Proxy balancer://blog_cluster>
        BalancerMember http://127.0.0.1:8030
      </Proxy>

      RewriteEngine On
      # Rewrite index to check for static
      RewriteRule ^/$ /index.html [QSA]

      # Rewrite to check for Rails cached page
      RewriteRule ^([^.]+)$ $1.html [QSA]

      # Redirect all non-static requests to cluster
      RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
      RewriteRule ^/(.*)$ balancer://blog_cluster%{REQUEST_URI} [P,QSA,L]
      ...
    </VirtualHost>

In lighttpd you might have:

    server.modules = ( "mod_rewrite", ... )
    url.rewrite += ( "^/$" => "/index.html" )
    url.rewrite += ( "^([^.]+)$" => "$1.html" )

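Rewrite rules can be hard to read, so here's the decision they encode, written out as plain Ruby for readability. The serve_from function is a made-up name and the web server does all of this itself; treat it as a sketch of the logic, not a replacement for the config.

```ruby
require "fileutils"
require "tmpdir"

# Sketch of the "static file first, Rails second" decision.
# serve_from is a hypothetical name used only for this illustration.
def serve_from(public_dir, request_path)
  candidate =
    if request_path == "/"
      "/index.html"              # the root maps to index.html
    elsif request_path.include?(".")
      request_path               # already names a file, e.g. /images/logo.png
    else
      "#{request_path}.html"     # extensionless paths get .html appended
    end

  if File.file?(File.join(public_dir, candidate))
    [:static, candidate]         # served without touching Rails
  else
    [:rails, request_path]       # forwarded to the Rails server
  end
end

public_dir = Dir.mktmpdir        # stand-in for the app's /public
FileUtils.mkdir_p(File.join(public_dir, "blog"))
File.write(File.join(public_dir, "blog", "list.html"), "<ul>cached</ul>")

hit  = serve_from(public_dir, "/blog/list")    # cached, so served statically
miss = serve_from(public_dir, "/blog/show/5")  # not cached, so Rails handles it
```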
The front-end web server will then look for cached files in your /public directory. However, you may want to change the caching directory to keep things more separated. You'll see why shortly.
Moving your Page Cache
First you'd want to add the following to your /config/environment.rb:

    config.action_controller.page_cache_directory = RAILS_ROOT + "/public/cache/"

This tells Rails to publish all your cached files in the /public/cache directory. You would then want to change your Rewrite rules in your httpd.conf to be:

    # Rewrite index to check for static
    RewriteRule ^/$ cache/index.html [QSA]

    # Rewrite to check for Rails cached page
    RewriteRule ^([^.]+)$ cache/$1.html [QSA]

Clearing out a partial/whole cache
When you start implementing page caching, you may find that when you add/edit/remove one model, almost all of your cached pages need to be expired. This could be the case if, for instance, all of your website pages had a list which showed the 10 most recent blog posts.
One alternative would be to just delete all your cached files. In order to do this you'll first need to move your cache directory (as shown above). Then you might create a sweeper like this:

    class BlogSweeper < ActionController::Caching::Sweeper
      observe Post

      def after_save(record)
        self.class::sweep
      end

      def after_destroy(record)
        self.class::sweep
      end

      def self.sweep
        cache_dir = ActionController::Base.page_cache_directory
        unless cache_dir == RAILS_ROOT+"/public"
          FileUtils.rm_r(Dir.glob(cache_dir+"/*")) rescue Errno::ENOENT
          RAILS_DEFAULT_LOGGER.info("Cache directory '#{cache_dir}' fully sweeped.")
        end
      end
    end

That FileUtils.rm_r simply deletes all the files in the cache, which is really all the expire_page line does anyways. You could also do a partial cache purge by only deleting a cache subdirectory. If I just wanted to remove all the caches under /public/blog I could do:

    cache_dir = ActionController::Base.page_cache_directory
    FileUtils.rm_r(Dir.glob(cache_dir+"/blog/*")) rescue Errno::ENOENT

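If you want to try the whole-versus-partial purge outside of Rails, here's a small self-contained sketch. The purge_cache helper is hypothetical and a temp directory stands in for the cache directory. Note that I've swapped in FileUtils.rm_rf, which quietly ignores missing files, in place of the rm_r plus rescue combination above.

```ruby
require "fileutils"
require "tmpdir"

# Deletes everything under cache_dir/subdir, or the whole cache when
# subdir is nil. purge_cache is a hypothetical helper, not part of Rails.
def purge_cache(cache_dir, subdir = nil)
  target = subdir ? File.join(cache_dir, subdir) : cache_dir
  FileUtils.rm_rf(Dir.glob(File.join(target, "*")))
end

cache_dir = Dir.mktmpdir                     # stand-in for /public/cache
FileUtils.mkdir_p(File.join(cache_dir, "blog"))
File.write(File.join(cache_dir, "blog", "list.html"), "cached list")
File.write(File.join(cache_dir, "about.html"), "cached about")

purge_cache(cache_dir, "blog")   # removes blog/list.html, keeps about.html
purge_cache(cache_dir)           # removes everything that is left
```

Either way, point this at a dedicated cache directory and never at /public itself, which is the reason for moving the cache in the first place.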
If calling these File Utilities feels too hackerish for you, Charlie Bowman wrote up the broomstick plugin which allows you to "expire_each_page" of a controller or action, with one simple call.
Needing something more advanced?
Page caching can get very complex with large websites. Here are a few notable advanced solutions:
Rick Olson (aka Technoweenie) wrote up a Referenced Page Caching Plugin which uses a database table to keep track of cached pages. Check out the Readme for examples.
Max Dunn wrote a great article on Advanced Page Caching where he shows you how he dealt with wiki pages using cookies to dynamically change cached pages based on user roles.
Lastly, there doesn't seem to be any good way to page cache xml files, as far as I've seen. Mike Zornek wrote about his problems and figured out one way to do it. Manoel Lemos figured out a way to do it using action caching. We'll cover action caching in the next tutorial.
How do I test my page caching?
There is no built-in way to do this in Rails. Luckily Damien Merenne created a swank plugin for page cache testing. Check it out!
Conclusions
Page caching should be used if at all possible in your project, because of the awesome speeds it can provide. However, if you have a website with a member system where authentication is needed throughout, then you might not be able to do much with it outside of a login and new member form.
Ready to learn about the other Rails caching methods? Continue to Part 2 of the tutorial.
This article is the second part of my series about Ruby on Rails Caching. If you have not yet read Part 1 you may be left clueless, so please do so.
In the last article we went over Page Caching, which involves taking the entire content of a page, and storing it in a static html file that gets read outside of the rails application itself. Page Caching is a great fit for the first page of your site or for your new member application form, but what if you need to ensure someone is authenticated or only cache part of a webpage?
This tutorial is going to address Action Caching, Fragment Caching, and even ActiveRecord Caching (only available in Edge Rails) which will complete our tutorial of caching in Rails.
Table of Contents
- Why teach caching in this order?
- Action Caching
- Cleaning the Action Cache
- Fragment Caching
- Cleaning the Fragment Cache
- Paginating with the Fragment Cache
- Advanced Naming with the Fragment Cache
- Fragment Cache Naming Examples
- Sweeping Multiple Fragments at the Same Time
- Where else can I store my Action/Fragment Cache?
- ActiveRecord Query Caching
- Advanced Caching Alternatives
Why teach caching in this order?
Well, when you go to implement caching you're going to want to use the fastest one possible, and cache the most information. So basically:
- Page Caching - Fastest
- Action Caching - Next Fastest
- Fragment Caching - Least Fast
So, the higher you can get your pages up on the list, the more that gets cached, and the faster your page runs! Let's get on with it:
Action Caching
Action caching is VERY similar to page caching. The only difference is that the request for the page will always hit your Rails server, so your filters will always run. To set up action caching, our controller might look like this:

    class BlogController < ApplicationController
      layout 'base'
      before_filter :authenticate  # <--- Check out my authentication
      caches_action :list, :show

As you can see here, the user must be authenticated in order to view the list action. When we initially request the list action, we might see the following in our /log/development.log:

    Processing BlogController#list (for 127.0.0.1 at 2007-03-04 12:51:24) [GET]
      Parameters: {"action"=>"list", "controller"=>"blog"}
    Checking Authentication
    Post Load (0.000000) SELECT * FROM posts ORDER BY created_on LIMIT 10
    Rendering blog/list
    Cached fragment: localhost:3000/blog/list (0.00000)
    Completed in 0.07800 (12 reqs/sec) | Rendering: 0.01600 (20%) | DB: 0.00000 (0%) | 200 OK [http://localhost/blog/list]

See the line "Cached fragment: localhost:3000/blog/list"? What that means is that a file was generated, and you can find it here:
/tmp/cache/localhost:3000/blog/list.cache
By default, action caching will cache your files at /tmp/cache/, and instead of writing out .html files as in page caching, it will write out .cache files. The path also includes the host and port (localhost:3000) just in case you're hosting more than one subdomain on a single application. In this case each subdomain can have a different cache location.
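As a quick illustration of why the host ends up in the path, here's a tiny sketch. The action_cache_file helper and the admin.example.com host are assumptions made up for this example; the only thing it mirrors is the path layout described above.

```ruby
# Hypothetical helper: builds an action cache path from the host and
# request path, following the layout described above.
def action_cache_file(host_with_port, path, cache_root = "/tmp/cache")
  File.join(cache_root, host_with_port, "#{path}.cache")
end

action_cache_file("localhost:3000", "/blog/list")     # "/tmp/cache/localhost:3000/blog/list.cache"
action_cache_file("admin.example.com", "/blog/list")  # "/tmp/cache/admin.example.com/blog/list.cache"
```

Same action, two hosts, two separate cache files.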
If you opened this "list.cache" file, you'd find the complete static html of the page, just like you would if we were page caching. What's the difference then?
If we go and request the page again (after we got it cached above), here is what we'd see in /log/development.log:

    Processing BlogController#list (for 127.0.0.1 at 2007-03-04 13:01:31) [GET]
      Parameters: {"action"=>"list", "controller"=>"blog"}
    Checking Authentication
    Fragment read: localhost:3000/blog/list (0.00000)
    Completed in 0.00010 (10000 reqs/sec) | DB: 0.00000 (0%) | 200 OK [http://localhost/blog/list]

As you can see, our Authentication before_filter was run, then the cached action was read, and outputted to the screen. So in this case we can have a completely cached html page (FAST) and our application can still run our before_filters such as authentication.
It's important to note here that your before_filters must be listed before your caches_action code, at the top of your controller, otherwise you may end up caching the wrong html.
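Here's a toy model of that request flow in plain Ruby, in case it helps to see the ordering spelled out. ToyBlogController is invented for this sketch and has nothing to do with ActionController; only the order of steps mirrors the logs above.

```ruby
# Toy model of a request with action caching: the filter always runs,
# the rendering only runs on a cache miss. Not a Rails class.
class ToyBlogController
  attr_reader :log

  def initialize
    @action_cache = {}
    @log = []
  end

  def request(host, path, logged_in)
    @log << "Checking Authentication"     # the filter runs on every request
    return "401 Unauthorized" unless logged_in

    key = "#{host}#{path}"                # host and port are part of the key
    if @action_cache.key?(key)
      @log << "Fragment read: #{key}"
      return @action_cache[key]
    end

    html = render_list
    @log << "Cached fragment: #{key}"
    @action_cache[key] = html
  end

  private

  def render_list
    @log << "Rendering blog/list"
    "<ul><li>Post</li></ul>"
  end
end

controller = ToyBlogController.new
controller.request("localhost:3000", "/blog/list", true)   # renders and caches
controller.request("localhost:3000", "/blog/list", true)   # served from the cache
controller.request("localhost:3000", "/blog/list", false)  # still rejected
```

The third request is the interesting one: the page is sitting in the cache, but the filter still gets its say first.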
How do I clean up the Action Cache?
As we learned in the previous tutorial, to regenerate the cache (if our data changes) we need to expire it. We can do the same thing with our action caching using sweepers. We'd just want to change "expire_page" to "expire_action" in our /app/sweepers/blog_sweeper.rb:

    # Expire the list page now that we posted a new blog entry
    expire_action(:controller => 'blog', :action => 'list')

    # Also expire the show page, in case we just edited a blog entry
    expire_action(:controller => 'blog', :action => 'show', :id => record.id)

Another way to clear out both our action cache and our fragment cache is to run the following rake task:

    rake tmp:cache:clear

This will clear out all of our .cache files.
Fragment Caching
So far we've been dealing with caching entire pages of information. Obviously this can't always be done when you have dynamic webpages, and this is where fragment caching comes in. Fragment caching allows you to cache a portion of one of your views.
To fragment cache a list of blog entries we could edit the /app/views/blog/list.rhtml:

    <strong>My Blog Posts</strong>
    <% cache do %>
      <ul>
        <% for post in @posts %>
          <li><%= link_to post.title, :controller => 'blog', :action => 'show', :id => post %></li>
        <% end %>
      </ul>
    <% end %>

The "cache do" will create a fragment cache at /tmp/cache/localhost:3000/blog/list.cache, using the current controller and action to name it. The next time the "cache do" statement is hit, our previously cached data will be loaded. Let's take a look at what our /log/development.log looks like for the first and second requests of this page. Pay attention now:

    Processing BlogController#list (for 127.0.0.1 at 2007-03-17 22:02:16) [GET]
    Authenticating User
    Post Load (0.000230) SELECT * FROM posts
    Rendering blog/list
    Cached fragment: localhost:3000/blog/list (0.00267)
    Completed in 0.02353 (42 reqs/sec) | Rendering: 0.01286 (54%) | DB: 0.00248 (10%) | 200 OK [http://localhost/blog/list]

    Processing BlogController#list (for 127.0.0.1 at 2007-03-17 22:02:17) [GET]
    Authenticating User
    Post Load (0.000219) SELECT * FROM posts
    Rendering blog/list
    Fragment read: localhost:3000/blog/list (0.00024)
    Completed in 0.01530 (65 reqs/sec) | Rendering: 0.00545 (35%) | DB: 0.00360 (23%) | 200 OK [http://localhost/blog/list]

Can you see what might look a little fishy?
The fragment was cached properly on the first hit, and the second hit read the fragment as it should, however, you should notice that the SQL statement to get the posts was run both times! But if we're loading a cached page, we shouldn't need to run the SQL a second time, right?
Only the content between the "cache do" and the "end" is cached with fragment caching, while everything else gets run each time. So one of the many ways we could fix this is to add a condition inside our /app/controllers/blog_controller.rb:

    def list
      unless read_fragment({})
        @posts = Post.find(:all, :order => 'created_on desc', :limit => 10)
      end
    end

Now our query should only be run if our page hasn't been cached. We could have also moved our Post.find call inside of our "cache do" statement, but I kinda believe that your models (your business logic) should never be referenced inside your views. It's an MVC thing.
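To see why the guard saves a query, here's a minimal plain-Ruby sketch. ToyBlog and load_posts are stand-ins I made up for the controller, the fragment store, and Post.find; the point is only that the expensive lookup is skipped once the fragment exists.

```ruby
# Toy controller plus fragment store. ToyBlog is illustrative only;
# load_posts stands in for the Post.find query.
class ToyBlog
  attr_reader :queries

  def initialize
    @fragments = {}
    @queries = 0
  end

  def list
    # Controller guard: only hit the "database" when nothing is cached.
    posts = load_posts unless @fragments.key?("blog/list")
    # View fragment: reuse the stored HTML, or build and store it.
    @fragments["blog/list"] ||= posts.map { |title| "<li>#{title}</li>" }.join
  end

  private

  def load_posts
    @queries += 1
    ["First post", "Second post"]
  end
end

blog = ToyBlog.new
blog.list     # runs the query and caches the fragment
blog.list     # reads the fragment, no query
blog.queries  # 1
```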
Wanna take a guess at how we can expire fragment caching?

    # Expire the blog list fragment
    expire_fragment(:controller => 'blog', :action => 'list')

Paginating with the Fragment Cache
What if I wanted to make my list paginate, and cache each and every page?
As you probably realized by now, by default the name of the cache is determined by looking at the current controller name, and the action name. So if my controller is "blog" and my action is "list" it will write this cache out to "/localhost:3000/blog/list.cache".
In the case of pagination I want to add to the cache name, in which case my blog_controller.rb might look like this:

    def list
      unless read_fragment({:page => params[:page] || 1}) # Add the page param to the cache naming
        @post_pages, @posts = paginate :posts, :per_page => 10
      end
    end

I could have written "read_fragment({:controller => 'blog', :action => 'list', :page => params[:page] || 1})", but by default Rails assumes the current controller and action, so I don't need to write those.
My /views/blog/list.rhtml might look like:

    <% cache({:page => params[:page] || 1}) do %>
      ... All of the html to display the posts ...
    <% end %>

This new caching code will create /localhost:3000/blog/list.page=1.cache when I initially go to /blog/list, then when I go to page two it will create /localhost:3000/blog/list.page=2.cache. Pretty cool!
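Here's my rough guess at the naming scheme, written as a plain-Ruby helper so you can play with it. This is an illustration only: Rails derives the real name from the URL options, and fragment_file is a made-up function that just reproduces the file names shown above.

```ruby
# Hypothetical helper that reproduces the fragment file names shown above.
# Extra options such as :page are appended as ".key=value".
def fragment_file(host, controller, action, extra = {})
  name = "#{host}/#{controller}/#{action}"
  extra.sort_by { |key, _| key.to_s }.each do |key, value|
    name << ".#{key}=#{value}"
  end
  "/#{name}.cache"
end

fragment_file("localhost:3000", "blog", "list", :page => 1)  # "/localhost:3000/blog/list.page=1.cache"
fragment_file("localhost:3000", "blog", "list", :page => 2)  # "/localhost:3000/blog/list.page=2.cache"
```

Each page number produces a different name, so each page of the list gets its own fragment.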
Advanced Naming with the Fragment Cache
Most of the time you're going to want to name your caches. Here's another good example:
Let's say my web design included a navigation menu (nav menu) on each page that was customized for each user on the site. Let's say it's a list of project tasks our user must complete:

    <div id="nav-bar">
      <strong>Your Tasks</strong>
      <ul>
        <% for task in Task.find_all_by_member_id(session[:user_id]) %>
          <li><%= task.name %></li>
        <% en