sfThumbnailCachePlugin - Cache thumbnails?
Recently many people in the symfony community have been wondering whether sfThumbnailPlugin and my sfThumbnailCachePlugin should be combined, and the merits of thumbnail caching have been questioned, so I have posted my response.
To be honest, I haven’t run tests to see exactly what the performance gains are for my plugin, but I think we can agree that generating a thumbnail is CPU intensive, even if only for half a second. To test, I generated a thumbnail on my machine using ImageMagick, which my plugin also uses:
$ time convert -thumbnail 160x160 1429.jpg test.jpg
real 0m0.430s
user 0m0.037s
sys 0m0.021s
So that’s roughly 430ms of wall-clock time for a thumbnail. This is the classic CPU vs. space trade-off: if we store the thumbnails in a cache, we use more disk space but less CPU, and can effectively handle more users per server, provided we have the disk space.
For a real-world example, I developed a rather large real estate site that has 75GB of full-size property photos. On a property list page we show up to 48 homes, each with a thumbnail of the full-size photo. We also list photos of the company’s agents in alphabetical order, with up to 50 photos per page. In both cases I use my sfThumbnailCache to create or retrieve the thumbnail:
$thumb = sfThumbnailCache::getNewInstance($file_master,$w,$h,true,true,null,$file_thumb);
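The create-or-retrieve idea behind that call can be sketched in a few lines of plain PHP. This is only an illustration, not the plugin's actual code: the function name cached_thumbnail and the $generate callback are hypothetical, and the staleness check (thumbnail older than the master file) is an assumption.

```php
<?php
// Hypothetical helper, not the plugin's API: return the path of the cached
// thumbnail, generating it only when it is missing or older than the master.
function cached_thumbnail($master, $thumb, $generate)
{
    $stale = !is_file($thumb) || filemtime($thumb) < filemtime($master);
    if ($stale) {
        // $generate is any callable that writes a thumbnail of $master to
        // $thumb, for example a wrapper around ImageMagick's convert.
        call_user_func($generate, $master, $thumb);
    }
    return $thumb;
}
```

With something like this in place, only the first request for a given thumbnail pays the generation cost; later requests are a couple of stat calls.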
If we assume that it takes 430ms per thumbnail, and that they don’t run in parallel because we’re working with PHP, we’re still looking at 48 × 0.43s ≈ 20.6 seconds just to generate the thumbnails for one property list page! What’s more, if Apache expire times are set up properly, after all that work on the server you may find that the user already had the thumbnails cached and didn’t even need them!
I definitely think caching is the way to go for thumbnails. Even if you’re tight on space, you could simply run a cron script every night that prunes your cache based on atime to keep only the most popular images in the cache. (I used to do this before disk got really cheap.)
To take it a step further, I actually use Amazon S3 to store my photos now, which means that the full versions of the photos aren’t on my server at all, and thus it takes even longer to generate a thumbnail. I use my sfAmazonFilePlugin with my sfThumbnailCachePlugin to get thumbnails and cache them locally on the web server. This makes the thumbnail system scalable, as you could simply run it on multiple servers with S3 acting as the photo repository for all of them.
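For anyone curious about the nightly pruning mentioned above, here is a rough sketch of what such a script might look like. It is not the script I used; the function name prune_thumbnail_cache and its parameters are made up for illustration, and it assumes the filesystem actually records access times (a volume mounted with noatime will not).

```php
<?php
// Hypothetical pruning helper: delete cached files whose last access time is
// older than $maxAgeDays. $now is injectable so the cutoff can be tested.
function prune_thumbnail_cache($cacheDir, $maxAgeDays, $now = null)
{
    $now = ($now === null) ? time() : $now;
    $cutoff = $now - $maxAgeDays * 86400;

    // Collect the paths first so nothing is deleted while iterating.
    $expired = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($cacheDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getATime() < $cutoff) {
            $expired[] = $file->getPathname();
        }
    }

    foreach ($expired as $path) {
        unlink($path);
    }
    return count($expired); // number of files removed
}
```

A cron entry could run a small PHP script that calls this once a night with whatever age limit suits your disk budget.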
Tags: sfThumbnailCachePlugin, Symfony
Jonathan, thanks for the detailed use-case. I believe that plugins should do one thing and do it well. In this case, I don’t think that thumbnails and caching should be so tightly coupled.
For thumbnails, we want to generate the thumbnail ONCE and store it — in other words DRY. That’s a given. It’s a waste of resources to regenerate the same thumbnail over and over. If a site is generating thumbnails on-the-fly, it probably shouldn’t be.
I propose the following for situations like this:
1) generate a thumbnail once (sfThumbnailPlugin)
2) store thumbnail in a way that scales vertically (sfAmazonS3Plugin/sfMogileFSPlugin)
3) add an additional caching layer to prevent repetitive disk cycles (sfMemcachedPlugin or any of the caching classes Fabien has been working on in the trunk recently)
There was also some recent talk about adding HTTP support to the thumbnail class, but, as in this case, we already have a good plugin that handles HTTP requests quite well (sfWebBrowser). So, one should use sfWebBrowser to fetch the image and then load it into sfThumbnail, as in this snippet:
download images with sfWebBrowserPlugin http://www.symfony-project.com/snippets/snippet/225
s/vertically/horizontally
I agree completely that the two plugins should not be coupled. sfThumbnail should focus on being very good at creating thumbnails, and sfThumbnailCache should be very good at managing a cache of thumbnails and calling sfThumbnail when it needs a new one created. That’s what we have, and I think it works fine for what it’s intended to do.
That said, and in reference to our scalability discussion, what I really think symfony needs at a higher level is an sfCDNPlugin. This would implement a Content Delivery Network using S3 or MogileFS as a backbone. More specifically, you could have a set of media and make parameterized requests to it, with sizes for example. You could then do something like:
<?php echo sfCDN::image_tag('image.jpg', array('w' => 200, 'h' => 100)); ?>
which would create the 200×100 thumbnail of image.jpg and cache it behind the scenes.
This function would create code like:
which would allow you to set the expire time for JPEGs in Apache far in the future, so that people’s local caches are used for the file. sfCDN would then detect changes to image.jpg and change the generated file’s name accordingly: “image_0709024567_200_100.jpg”. With the proper rewrites in Apache this will be extremely fast.
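To make the renaming idea concrete, here is a small sketch of how such a name could be built. It is not the ad hoc implementation mentioned below; the function versioned_thumbnail_name is hypothetical, and using the master file's modification time as the version stamp is just one assumption about how changes might be detected.

```php
<?php
// Hypothetical naming helper: build "<name>_<stamp>_<w>_<h>.<ext>" so that a
// changed master file yields a new URL and far-future expire headers stay safe.
// $stamp is whatever version marker you choose, e.g. filemtime($masterPath).
function versioned_thumbnail_name($masterPath, $stamp, $w, $h)
{
    $info = pathinfo($masterPath);
    $ext = isset($info['extension']) ? '.' . $info['extension'] : '';
    return sprintf('%s_%d_%d_%d%s', $info['filename'], $stamp, $w, $h, $ext);
}
```

Because the stamp changes whenever the master does, browsers holding the old name simply never ask for it again, and the stale file can be pruned from the cache later.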
I have an ad hoc implementation of this now but I suppose I could spend some time to put it into a nice plugin. What do you think about this?