At LogNormal/SOASTA, we love running performance experiments, so when Steve approached us a while ago to test browser caching, we jumped at the chance. Since this was an experimental feature, we decided to build it as a standalone boomerang plugin that was only installed on a few sites rather than roll it out to all our customers.
The purpose of the experiment is to find out how long far-future expiring content stays in a browser’s cache. The item may be evicted either by the user clearing their cache, or by the regular cache eviction algorithm used by browsers.
That’s when things got interesting…
This wasn’t the intent of the experiment, but when you start to study one characteristic, you often uncover others. It’s also striking that even today we still find places where the HTTP and HTML specs are open to interpretation, and different vendors interpret them in different ways.
The post doesn’t mention it, but the results hold true even if you call location.reload(true) in the page.
Reloading our cacheable content
At LogNormal, we serve boomerang with a far-future expires header. We also use a fixed URL, one without version numbers, and that doesn’t change when we push out new releases. This is necessary since we cannot expect our customers to update the URLs they reference every time we push out a new version.
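A minimal sketch of how response headers like these could be generated. The function name is hypothetical, and the seven-day lifetime is an assumption taken from the max-age mentioned later in this post; the real server configuration isn't shown here.

```javascript
// Sketch only: farFutureHeaders is a hypothetical helper, not part of boomerang.
// The 7-day default is an assumption based on the max-age mentioned in this post.
const SEVEN_DAYS_IN_SECONDS = 7 * 24 * 60 * 60;

function farFutureHeaders(maxAgeSeconds = SEVEN_DAYS_IN_SECONDS, now = new Date()) {
  // Expires is an absolute HTTP date; Cache-Control max-age is relative, in seconds.
  const expires = new Date(now.getTime() + maxAgeSeconds * 1000);
  return {
    "Cache-Control": "public, max-age=" + maxAgeSeconds,
    "Expires": expires.toUTCString(),
  };
}
```

Because the URL never changes, these headers are the only thing telling the browser how long it may reuse its copy, which is why getting a fresh copy out later takes extra work.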
Our script is also loaded dynamically through an iframe to avoid blocking the onload event (have I mentioned we really care about performance?).
All of this combined with Steve’s findings suggests that we might never be able to get all users to refresh their cached versions of boomerang.
Adoption rate of new versions
Whenever I push out a new version of boomerang, I monitor how long it takes for users to start using it, and at first, the numbers startled me. Despite the max-age of 7 days, within 10 minutes, 7% of all beacons use the new version of boomerang. Within an hour that number reaches 35%, and 7 hours later (ok, I fell asleep waiting for data to come in) it’s over 80%. It typically takes a few more days to reach 95%, and then it tapers off. There are always a few users who never update their cache.
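The percentages above are simply the share of beacons reporting the new version. A small sketch of that calculation, assuming each beacon is an object with a `version` field (the beacon shape and function name are illustrative, not the real beacon format):

```javascript
// Sketch only: the { version } beacon shape is an assumption for illustration.
function adoptionRate(beacons, newVersion) {
  if (beacons.length === 0) {
    return 0; // no data yet, so report 0% rather than dividing by zero
  }
  const onNewVersion = beacons.filter((b) => b.version === newVersion).length;
  return (100 * onNewVersion) / beacons.length;
}
```

Running this over the beacons received in successive time windows after a release would give an adoption curve like the one described.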
Forcing a cache reload
To get all users to update their cache, we use a cache-reloading iframe and tell the browser to reload it from the origin. We do this using location.reload(true)… except, remember that Steve’s results also apply to pages reloaded using location.reload(true). We get around this by making sure the assets that need to be refreshed are referenced statically from within the page. There are no dynamic script nodes, no iframe hacks, just a plain and simple Web 1.0 script tag.
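Here is a rough sketch of the logic such a reloader page might run. It assumes the parent page passes the required version in the iframe URL's query string and that the loaded script exposes its version; the function names, the URL, and the query-string convention are all assumptions for illustration, not the code we actually ship.

```javascript
// Sketch only. The page loaded inside the hidden iframe would look roughly
// like this (URL and version global are hypothetical):
//
//   <script src="/boomerang.js"></script>
//   <script>
//     window.onload = function () { reloadIfStale(window, BOOMR.version); };
//   </script>

// Compare dotted version strings numerically, so "0.9.2" is older than "0.9.10".
function isOlder(loaded, required) {
  const a = String(loaded).split(".").map(Number);
  const b = String(required).split(".").map(Number);
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const x = a[i] || 0; // missing or non-numeric parts count as 0
    const y = b[i] || 0;
    if (x !== y) {
      return x < y;
    }
  }
  return false; // equal versions are not older
}

// Force a reload only when the cached script is older than the version
// requested in the query string, e.g. reloader.html?1.2.0
function reloadIfStale(win, loadedVersion) {
  const required = win.location.search.slice(1);
  if (required && isOlder(loadedVersion, required)) {
    // The boolean argument is not part of the current HTML standard and
    // may be ignored by some browsers.
    win.location.reload(true);
    return true;
  }
  return false;
}
```

The version check doubles as a loop guard: once the reload brings in the new script, the loaded version is no longer older than the required one, so the page does not reload again.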
I like this method because it means you never have to change your asset URLs, and you can load them in a non-blocking fashion without worrying about having to refresh caches later. The actual code we use has evolved a bit since I wrote that post, but its essence is the same. I’ll write about the new code in a later post.
If you’re interested in more about how we use iframes to improve 3rd party script performance, I’ll be speaking about them at Velocity in Santa Clara this June.
Interpreting the specs
Lastly, I’d like to propose something to anyone writing an implementation of the specs, whether you are a browser author, a server author or a web developer: if you see something that’s inconsistent or open to interpretation, please open a bug on the spec. Use it to describe what’s not clear, how you interpret it, and what your implementation does. Solicit feedback from other interested people and try to make the specs clearer.