While defining the set of page load time metrics that we think are most important for benchmarking, Mike Belshe, James Simonsen and I went through a seemingly simple exercise: enumerate the ways in which Chrome caches data. The resulting list was interesting enough to me that I thought it worthwhile to share.
When most people think of "the browser's cache" they envision a single map of HTTP requests to HTTP responses on disk (and perhaps partially in memory). This cache may arguably have the most impact on page load times, but to get to a truly stable benchmark, we identified 10 caches that need to be considered. An understanding of the various caches is also useful to web page optimization experts who seek to maximize cache hits.
- HTTP disk cache
Stores HTTP responses on disk as long as their headers permit caching. Lookups are usually significantly cheaper than fetching over the network, but they are not free: a single disk seek might take 10-15 ms, and that does not include the time to read the data from disk.
The maximum size of the cache is calculated as a percentage of available disk space. The contents can be viewed at chrome://net-internals/#httpCache. It can be cleared manually at chrome://settings/advanced or programmatically by calling chrome.benchmarking.clearCache() when Chrome is run with the --enable-benchmarking flag set. Note that for incognito windows this cache actually resides in memory. source
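To make the "cheaper, but not free" point concrete, here is a rough back-of-the-envelope sketch. Only the 10-15 ms seek range comes from the text above; the disk throughput, round trip time and bandwidth defaults are assumptions picked for illustration, and the function names are hypothetical.

```python
def disk_lookup_ms(size_bytes, seek_ms=12.5, disk_mb_per_s=50.0):
    """Estimated cost of a disk cache hit: one seek plus the time to read the body.

    seek_ms is the midpoint of the 10-15 ms range quoted above;
    disk_mb_per_s is an assumed sequential read rate.
    """
    read_ms = size_bytes / (disk_mb_per_s * 1_000_000) * 1000
    return seek_ms + read_ms


def network_fetch_ms(size_bytes, rtt_ms=100.0, bandwidth_mbit_per_s=5.0):
    """Estimated cost of a fetch over an already open connection:
    one round trip plus transfer time. Both defaults are assumptions."""
    transfer_ms = size_bytes * 8 / (bandwidth_mbit_per_s * 1_000_000) * 1000
    return rtt_ms + transfer_ms


if __name__ == "__main__":
    for size in (10_000, 500_000):
        print(size, round(disk_lookup_ms(size), 1), round(network_fetch_ms(size), 1))
```

Under these assumed numbers the disk hit wins comfortably, but a page with many small resources still pays one seek per lookup.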
- HTTP memory cache
Similar to the HTTP disk cache, but entirely unrelated in code. Lookups in this cache are fast enough that they may be thought of as "free."
This cache is limited to 32 megabytes. However, when the system is not under memory pressure, the effective limit may be higher due to its use of purgeable memory. Conversely, when multiple tabs are open, the limit may be divided among the tabs. It is cleared in the same manner as the HTTP disk cache. source
- DNS host cache
Caches up to 100 DNS resolutions for up to 1 minute each. It is somewhat unfortunate that this cache needs to exist in Chrome, but OS level caching cannot be trusted across platforms.
It can be viewed and manually cleared at chrome://net-internals/#dns. source
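The behavior described above (at most 100 entries, each valid for at most 1 minute) can be sketched as a small bounded cache with per-entry expiry. This is an illustration only, not Chrome's code: the class name `HostCache` is hypothetical, and evicting the oldest entry when full is an assumed policy.

```python
import time
from collections import OrderedDict


class HostCache:
    """Toy bounded cache with per-entry expiry (not Chrome's implementation)."""

    def __init__(self, max_entries=100, ttl_seconds=60.0, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._entries = OrderedDict()  # hostname -> (addresses, expiry time)

    def get(self, hostname):
        """Return cached addresses, or None if missing or expired."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        addresses, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[hostname]  # stale: caller must resolve again
            return None
        return addresses

    def put(self, hostname, addresses):
        """Store a resolution, evicting the oldest entry if the cache is full."""
        self._entries.pop(hostname, None)
        if len(self._entries) >= self._max_entries:
            self._entries.popitem(last=False)  # assumed policy: drop oldest insertion
        self._entries[hostname] = (list(addresses), self._clock() + self._ttl)

    def clear(self):
        """Drop everything, as a benchmark would before a cold run."""
        self._entries.clear()
```

For a stable benchmark, the point is that a run started within a minute of the previous one may skip DNS entirely unless the cache is cleared first.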
- Preconnect domain cache
A unique and important optimization in Chrome is the ability to remember the set of domains used by all subresources referenced by a page. Upon the next visit to the page, Chrome can preemptively perform DNS resolution and even establish a TCP connection to these domains.
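As a rough sketch of the idea, the bookkeeping amounts to a map from page URL to the set of hostnames its subresources used. The class and method names below are hypothetical, and the real feature would go on to resolve and connect to the returned hosts, which this sketch leaves out.

```python
from collections import defaultdict
from urllib.parse import urlsplit


class PreconnectHints:
    """Remembers which hosts a page's subresources came from (illustrative only)."""

    def __init__(self):
        self._hosts_by_page = defaultdict(set)

    def record_subresource(self, page_url, subresource_url):
        """Call once per subresource request observed while loading page_url."""
        host = urlsplit(subresource_url).hostname
        if host:
            self._hosts_by_page[page_url].add(host)

    def hosts_to_warm(self, page_url):
        """Hosts worth resolving and connecting to on the next visit."""
        return sorted(self._hosts_by_page.get(page_url, ()))


if __name__ == "__main__":
    hints = PreconnectHints()
    page = "https://www.example.com/"
    hints.record_subresource(page, "https://static.example.com/app.js")
    hints.record_subresource(page, "https://img.example.net/logo.png")
    print(hints.hosts_to_warm(page))
```

For benchmarking, this means the second visit to a page can be faster than the first even when every HTTP cache has been cleared.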
- V8 compilation cache
- SSL session cache
Caches SSL sessions to disk. This saves several round trips of negotiation when connecting to HTTPS pages by allowing the connection to skip directly to the encrypted stream. Implementation and limits vary by platform. For example, when OpenSSL is used, the limit is 1,024 sessions. source
- TCP connections
Establishing a TCP connection takes about one round trip time. Newer connections also have a smaller window so they have a lower effective bandwidth. For this reason Chrome keeps connections open for a period in hopes that they can be reused. This can be thought of as an in-memory cache.
Connections may be viewed at chrome://net-internals/#sockets and cleared programmatically by calling chrome.benchmarking.closeConnections() when Chrome is run with the --enable-benchmarking flag set. source
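Treating kept-open connections as an in-memory cache can be sketched as a pool keyed by host and port. This is a minimal illustration, not Chrome's socket pool: the class name is hypothetical, the 30 second idle timeout is an assumed value, and connections are stood in for by opaque objects.

```python
import time


class IdleConnectionPool:
    """Keeps idle connections around for reuse (illustrative only)."""

    def __init__(self, idle_timeout_seconds=30.0, clock=time.monotonic):
        self._timeout = idle_timeout_seconds  # assumed value, not Chrome's
        self._clock = clock
        self._idle = {}  # (host, port) -> list of (connection, time it became idle)

    def release(self, host, port, connection):
        """Return a finished connection to the pool instead of closing it."""
        self._idle.setdefault((host, port), []).append((connection, self._clock()))

    def acquire(self, host, port):
        """Return a reusable connection, or None if a new one must be opened."""
        candidates = self._idle.get((host, port), [])
        while candidates:
            connection, idle_since = candidates.pop()  # most recently used first
            if self._clock() - idle_since < self._timeout:
                return connection
            # Otherwise it idled too long; a real pool would close it here.
        return None

    def close_all(self):
        """Forget every idle connection, as a benchmark would before a cold run."""
        self._idle.clear()
```

A hit here saves the round trip for the handshake and reuses a connection whose window has already grown, which is why a benchmark needs to close connections between runs.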
- Cookies
While not usually thought of as a cache, this is web page state which is persisted to disk. The presence of cookies can have a large impact on performance. They can bloat requests and change how the client and server behave in limitless ways.
They can be cleared manually at chrome://settings/advanced. We are planning to add a method to chrome.benchmarking for the same. source
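To get a feel for how cookies bloat requests, the sketch below estimates the size of the Cookie request header and multiplies it by the number of requests that carry it. It assumes the plain-text HTTP/1.1 header form and ignores any header compression; the function names and the example cookies are made up for illustration.

```python
def cookie_header_bytes(cookies):
    """Size in bytes of the Cookie request header line for the given name/value pairs."""
    if not cookies:
        return 0
    value = "; ".join(f"{name}={val}" for name, val in cookies.items())
    return len(f"Cookie: {value}\r\n".encode("utf-8"))


def cookie_overhead_bytes(cookies, request_count):
    """Extra upload bytes when every one of request_count requests carries the cookies."""
    return cookie_header_bytes(cookies) * request_count


if __name__ == "__main__":
    cookies = {"sid": "abc123", "theme": "dark"}
    print(cookie_header_bytes(cookies), cookie_overhead_bytes(cookies, 40))
```

The byte count is only part of the story: a run with cookies present may also receive a different page from the server than a run without them.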
- HTML5 caches
HTML5 introduces three major new ways for web pages to persist state to disk: Application Cache, Indexed Database and Web Storage. For a particular page, these stores may be viewed under the "Resources" panel of the Inspector. The entire Application Cache may also be viewed and manually cleared at chrome://appcache-internals/.
- SDCH dictionary cache
While currently only used by Google Search, the SDCH protocol requires a shared dictionary to be downloaded periodically. A performance hit is taken infrequently to download the dictionary which makes future requests much faster. source
I hope you found this as interesting as I did. Please let me know if I left anything out.
Edit: Will Chan points out additional caches for proxies, authentication, glyphs and backing stores. Proxy and authentication caches were intentionally omitted because they aren't relevant to our benchmark. Glyphs and backing stores, however, are two additional things we need to consider. Thanks!