Useless requests
There is a fundamental problem with static assets on web pages, such as images, CSS files and, most importantly, JavaScript files: they are requested from the server over and over again even if they have not been modified between page requests. This slows down page rendering, quite dramatically in the case of JavaScript, because by default a script download blocks the downloads of the remaining assets.
Infinite Expiration
Setting infinite Expires headers is a great solution to the problem: this way everything downloaded from the server is simply kept in the browser’s cache. It is very effective, and it is ranked 3rd on the best practices list provided by the Yahoo! Exceptional Performance team and even first on Google PageSpeed’s list.
You can clearly see the difference on the graph below:
Scripts and CSS in the header load from cache, eliminating the rendering delay, and images load from cache as well, so the document load event fires much sooner.
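To make “infinite” concrete, here is a minimal sketch of the response headers involved. It is written in Python purely for illustration; the function name and the one-year lifetime are my assumptions, not something taken from either best practices list.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumed lifetime: "infinite" is approximated here by one year.
ONE_YEAR = timedelta(days=365)


def far_future_headers(now=None):
    """Build headers that let a browser keep an asset in cache for a year."""
    now = now or datetime.now(timezone.utc)
    expires = now + ONE_YEAR
    return {
        # HTTP dates are expressed in GMT; usegmt=True emits the "GMT" suffix.
        "Expires": format_datetime(expires, usegmt=True),
        # max-age is the same lifetime expressed in seconds.
        "Cache-Control": f"public, max-age={int(ONE_YEAR.total_seconds())}",
    }
```

How these headers get attached to responses depends on the web server or framework in use, so that part is left out of the sketch.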
Caching problem
But there is a problem on the opposite side: when assets change during normal development (for example, an image gets updated, or JavaScript or CSS is modified to fix a bug or add a new feature), the new version must be pushed to the user’s browser, but the browser no longer requests it and loads the copy it saved to cache instead. Developers are used to pressing Shift+Reload or Shift+F5 to force the browser to refresh the page, but users don’t do that. That is why many teams simply don’t use the infinite caching technique: they prefer peaceful development without the “caching problem” and pay the price in degraded user experience.
URL fingerprinting
To solve this problem, a simple technique, sometimes called “fingerprinting” or “cache busting” (the latter term usually implies that the technique is being used for the wrong reasons), can replace Entity Tags and Conditional GET, both of which require a request to the server and therefore do not work with infinite expiration.
The idea is simple: the URL of an asset should be unique for each version of that asset, effectively changing every time you update your script or image. This way the browser will not find the asset in its cache and will request the new one from the server, storing the result in another cache entry and eventually pushing the old one out.
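A minimal sketch of that idea in Python follows. It derives the fingerprint from a hash of the file contents and puts it in a `v` query parameter; both choices, and the function name, are assumptions made for illustration only.

```python
import hashlib


def fingerprint_url(path, content):
    """Return the asset URL with a short content hash appended.

    The hash changes whenever the bytes change, so each version of the
    asset gets its own URL and therefore its own browser cache entry.
    """
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?v={digest}"


# Two versions of the same script end up with two different URLs.
old_url = fingerprint_url("js/app.js", b"alert('v1');")
new_url = fingerprint_url("js/app.js", b"alert('v2');")
```

Any value that changes with every version of the file would work as the fingerprint; a content hash is just the easiest one to show without extra tooling.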
Perceived as complex
The difficulty in implementing this lies in the current status quo: web servers serve files from a file system and have no idea about file versions. To be fair, the one-to-one URL<->file paradigm is plain simple to understand, which helped the web grow as fast as it did, but it shapes how all developers think, and almost no systems are built with versioning in mind.
My experience with many teams shows that this technique is perceived as overly complex and people avoid it until the last moment, even though this quite obvious solution should be common functionality in web servers.
Multi-layer problem
Another problem, I believe, is that this solution affects many “layers of the pie”: from the asset publishing process and HTML modifications made by designers and front-end developers, to software modifications made by backend developers, to web server configuration done by system administrators. This means many groups are involved, and the more people are involved, the harder it gets to push through, so nobody wants to bring it up.
Solution: SVN Assets
Building HowDoable mostly alone, I don’t have the “multi-layer” problem; moreover, being a performance geek, I’m probably more concerned with performance than is needed at this stage of development. So I spent some time early in the project to make sure my builds and upgrades don’t suffer from the “caching problem” and still perform as well as they can with cached content.
What I did was simple: I used the most basic and obvious source of versioning info there is, source control software (namely Subversion), and made sure that my code does not insert a single asset URL into HTML without appending its Subversion revision to it.
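The approach can be sketched roughly like this. This is a Python illustration, not the actual PHP code: the revision numbers, the `r` query parameter and the `asset_url` helper are all hypothetical, and in practice the revision table would be generated from Subversion rather than typed by hand.

```python
# Hypothetical revision table; in practice it would be generated from
# Subversion as part of the build rather than maintained by hand.
ASSET_REVISIONS = {
    "images/my.jpg": 128,
    "css/site.css": 131,
}


def asset_url(path, revisions=ASSET_REVISIONS, base="/"):
    """Return the URL for an asset with its revision number appended.

    Unknown assets raise an error instead of silently getting an
    unversioned URL, so no asset URL reaches the HTML without a revision.
    """
    revision = revisions.get(path)
    if revision is None:
        raise KeyError(f"no revision recorded for asset: {path}")
    return f"{base}{path}?r={revision}"


image_url = asset_url("images/my.jpg")
```

Failing loudly on a missing asset is a deliberate choice in this sketch: a build that breaks is easier to notice than a stale image in somebody’s cache.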
Since many projects use Subversion and PHP for their content management, I thought I’d share it with the community; being a strong believer in open source, I try to release as much infrastructure for free as possible.
So, welcome SVN Assets, a set of tools to make your assets cacheable using Subversion and a simple build process.
I’ve built it for PHP, but I will be happy to see it used by developers on other platforms.
Usage is described in the README file: just generate a data file from Subversion and use assetURL('images/my.jpg') to insert your assets in the code. There is also a command-line script that updates all CSS files to point to the proper versions of the files, so you don’t have to maintain them by hand and can still serve them directly from the file system.
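To give a feel for what such a CSS update step involves, here is a rough Python sketch. It is not the script shipped with SVN Assets: the regular expression, the `r` query parameter and the assumption that the revision table is keyed by the paths exactly as they are written in the stylesheet are all mine.

```python
import re

# Matches url(...) references with optional single or double quotes.
CSS_URL = re.compile(r"""url\(\s*(['"]?)([^'")\s]+)\1\s*\)""")


def version_css_urls(css, revisions):
    """Append a revision number to every url(...) found in the revision table."""

    def replace(match):
        quote, path = match.group(1), match.group(2)
        revision = revisions.get(path)
        if revision is None:
            # Leave external or unknown URLs exactly as they were.
            return match.group(0)
        return f"url({quote}{path}?r={revision}{quote})"

    return CSS_URL.sub(replace, css)


css = 'body { background: url("images/bg.png"); } a { background: url(http://example.com/x.png); }'
versioned = version_css_urls(css, {"images/bg.png": 42})
```

Only references found in the revision table are rewritten, so external URLs pass through untouched.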
Please go check it out and let me know what you think about it!
You can also subscribe to the mailing list if you’d like to ask questions and discuss possible improvements:
http://groups.google.com/group/svn-assets