Wednesday, June 13, 2012

A better timer for JavaScript

Browser performance guru Nat Duca has introduced high-resolution timing to JavaScript. Ready to try in Chrome 20, the experimental window.performance.webkitNow() interface provides microsecond resolution and is guaranteed never to run backward (it is monotonically nondecreasing).

Microseconds matter

Until now, creating a new Date object and querying its time was the only way to measure elapsed time on the web. Depending on the browser and OS, Date's resolution can be as coarse as 15 milliseconds, and is one millisecond at best. As John Resig noted, that isn't sufficient for benchmarking.
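
You can see this for yourself with a quick probe (a sketch; the function name is mine): spin until Date's reported time changes, and the size of the jump is the clock's tick.

```javascript
// Spin until Date's value changes, then return the size of the jump
// in whole milliseconds -- the smallest interval Date can observe.
function dateResolution() {
  var start = new Date().getTime();
  var next = start;
  while (next === start) {
    next = new Date().getTime();
  }
  return next - start;
}
```

On most systems this reports 1; on some older Windows configurations it can report 15 or 16.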

Mono-what-ic?

Perhaps less often considered is that Date, being based on system time, isn't ideal for real user monitoring either. Most systems run a daemon which regularly synchronizes the time. It is common for the clock to be tweaked a few milliseconds every 15-20 minutes. At that rate, about 1% of measured 10-second intervals would be inaccurate.

Try it out

Like Date, now returns a number in milliseconds. Unlike Date, now's value is:

  • a double with microseconds in the fractional part
  • relative to the navigationStart of the page rather than to the UNIX epoch
  • not skewed when the system time changes

If you're using Date for any performance measurement, add this shim to upgrade to performance.now() with a fallback to Date for older browsers. Note that in the fallback case the value is relative to the UNIX epoch rather than to navigationStart, so only differences between two readings are meaningful.

window.performance = window.performance || {};
performance.now = (function() {
  // Prefer the native method, then the vendor-prefixed versions, and
  // finally fall back to Date, which has only millisecond resolution.
  return performance.now       ||
         performance.mozNow    ||
         performance.msNow     ||
         performance.oNow      ||
         performance.webkitNow ||
         function() { return new Date().getTime(); };
})();
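
With the shim (or native support) in place, elapsed time falls out of subtracting two readings. A minimal helper, as a sketch (the name timeIt is mine):

```javascript
// Measure how long fn takes, in (possibly fractional) milliseconds.
function timeIt(fn) {
  var t0 = performance.now();
  fn();
  return performance.now() - t0;
}

var elapsed = timeIt(function() {
  for (var i = 0, sum = 0; i < 1e6; i++) sum += i;
});
```

Because only the difference between two readings is used, this works equally well with the navigationStart-relative native clock and the epoch-relative Date fallback.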

Monday, April 23, 2012

Optimizing with the timeline panel

"I know I should look at Chrome's Timeline panel, but what do I do with it?" was roughly the opening line in two recent web performance meetings. In both we found nice speedups using the same mantra: "Don't ask how to make an operation faster, ask why it is done at all."

To ward off a third meeting, I decided to write about it. This is best demonstrated with a real world example, and it happens that there's a perfect one on Wikipedia.

Record a trace

To follow along at home, head over to your favorite lengthy Wikipedia article—mine is List of common misconceptions. Open the Inspector's Timeline panel, click the black circle record button at the bottom, refresh the page and finally stop recording after it loads. You now have an initially daunting trace of all the major steps of the page load.

Orient yourself

The top timescale marks the document's DOMContentLoaded event with a blue line and the window's load event with red. In the screenshot above, I dragged the right-hand slider in to the red line to zoom in on only the events which happened during the page load.

Finally notice the event coloring scheme: loading in blue, scripting in yellow and rendering in purple. In the remainder of the screenshots I've unchecked the blue box to hide loading events—perhaps a topic for another day.

Is JavaScript slow?

In the above screenshot there are three JavaScript executions which look long enough to investigate. Hover on one to see the details.

This one took 405ms ("Duration"). The three together add up to 1.25 seconds, about a quarter of this 4.8-second page load. Alright, JavaScript looks slow. Should we hunt for loops to unroll? Shame! What's happening here is much more typical: only 29ms ("Self Time") is spent actually executing this JavaScript, while the other 372ms (purple "Aggregated Time") is spent on rendering operations triggered by the JavaScript.

Examine rendering operations

Expand the drop-down to see the rendering operations: predominantly four style recalculations, each about 90ms. The other two scripts have very similar breakdowns (not shown).

Now it may seem the way to speed this up is to optimize style recalculation time. Should we scour for descendant selectors? Not just yet. For both layouts and style recalculations, first ask whether the operation is necessary at all. It is usually possible to write a given script such that it triggers a maximum of one layout and one style recalculation, so this many is a red flag. To figure out what caused these recalculations, hover to see the responsible JavaScript call stack.
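
To make the "at most one layout" point concrete, here's a sketch (the functions and the width-halving are hypothetical) of the classic interleaved read/write anti-pattern next to the batched version:

```javascript
// Anti-pattern: each iteration reads offsetWidth right after the previous
// iteration's style write, forcing the browser to run a fresh layout
// on every pass through the loop.
function halveWidthsThrashing(elements) {
  elements.forEach(function(el) {
    el.style.width = (el.offsetWidth / 2) + 'px';
  });
}

// Batched: perform every read first, then every write, so at most one
// layout is needed when styles are next flushed.
function halveWidthsBatched(elements) {
  var widths = elements.map(function(el) { return el.offsetWidth; });
  elements.forEach(function(el, i) {
    el.style.width = (widths[i] / 2) + 'px';
  });
}
```

Both produce the same final styles; only the number of forced layouts differs.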

It turns out that each of the three major script executions is dominated by style recalculations which all have the addInlineCSS method on their call stack. Click the links to see the source. In the source view, the button with the two curly brackets formats the minified JavaScript for human readability.

The fix

Within the execute method the culprit steps into the light: a loop that appends style elements to the document one by one. The browser does a full recalculation of all styles on the page each time this happens.

Fixing this is straightforward. Before appending to the DOM, either concatenate the CSS strings into a single style element or coalesce the style elements into a DocumentFragment. This trace suggests the fix could chop almost a second off the page load.
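
Both fixes can be sketched like so (the function names are mine, and the document is passed in rather than assumed global):

```javascript
// Fix 1: join the CSS text and append a single <style> element,
// triggering at most one style recalculation.
function appendCombinedStyles(doc, cssChunks) {
  var style = doc.createElement('style');
  style.textContent = cssChunks.join('\n');
  doc.head.appendChild(style);
}

// Fix 2: build the <style> elements off-document in a DocumentFragment
// and append the fragment, so the live document is touched only once.
function appendStyleFragment(doc, cssChunks) {
  var frag = doc.createDocumentFragment();
  cssChunks.forEach(function(css) {
    var style = doc.createElement('style');
    style.textContent = css;
    frag.appendChild(style);
  });
  doc.head.appendChild(frag);
}
```

Either way, the per-element appends (and the full-page recalculation each one triggers) collapse into a single DOM mutation.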

Conclusion

I'd love to see this example fixed (hopefully in WebKit), but that's not the point of this post. There are various well known anti-patterns which can trigger wasteful rendering operations. They are easy to write and often not slow enough to worry about. Hopefully you've taken away the courage to open the timeline and diagnose the ones that do need to be optimized. Happy tracing!

Thursday, February 16, 2012

Thou shalt not kill

The recent beta release of Chrome on Android marks a great time to look at what happens when web applications consume too much memory: something gets killed.

Previously I've blogged about the importance for web developers to mind their memory usage. On powerful desktop machines enjoyed by many developers, this can seem like an academic concern. But on mobile devices the limitation is all too real. Case in point: I compile with 24G of RAM and deploy to a Nexus S with 512M, and of that, only two-thirds (about 342M) is available for the system to use.

Here's what happens as a web app approaches that 342M limit.

1. Android apps are killed

The Android system has a wonderful design in which the user doesn't need to be concerned with which apps are open at any given time. The system has the prerogative to pause and resume background apps based on available resources. So, the first thing that happens as a web app's memory usage grows is that background applications are gracefully paused and killed. This is normal operation that happens constantly; however, it does mean that users wait longer when switching to paused apps, as those apps have to be restarted and resumed.

2. Chrome tabs are killed

Once the system has given Chrome as much memory as it can afford, it sends notifications as usage approaches the limit. When Chrome notices it is using too much memory or receives one of these notifications, it has no choice but to kill background tabs. This is a little less transparent to users: when the user switches back to a killed tab, instead of the usual near-instantaneous switch, the page must be reloaded from cache or network.

3. Caches are cleared

Up until this point, memory pressure has caused some performance problems for the user, but the foreground content hasn't suffered much. However, after all background tabs have been killed, Chrome's only remaining recourse is to clear memory caches and perform additional V8 garbage collections. This begins to cut into the application's interactive performance (scrolling, animations, JavaScript responsiveness, etc). It is often a short-lived fix as these caches can quickly refill. A high-performing web application should never allow memory usage to get this high.

4. Suicide

Finally, when the system can no longer allocate memory, it kills Chrome's renderer process, in which the web page lives. Many mobile browsers would crash entirely, but due to Chrome's unique multiprocess architecture, the browser keeps running and displays the "Aw, Snap!" page. The user, left frustrated, is free to quickly move on to a better performing web page.

Living a long life

Possibly my favorite feature of Chrome on Android is that the entire developer tools suite works through remote debugging. This makes it easy to keep an eye on memory usage and diagnose memory leaks when they do occur.

Wednesday, February 8, 2012

Chrome Fast for Android

The Chrome Beta for Android is solid Chrome. That means it inherits my favorite speed features from its older desktop siblings. Some of them are novel to mobile browsers. For all of them, the absolute performance benefit is usually greater than on more powerful desktops.

In no particular order, here are ten I'm most excited to see together in the mobile world.

  1. Remote debugging Probably the most important thing a browser can do to make the web fast is to provide developer tools which make it easy for web developers to build fast sites. While there have been some cutting edge projects to bring parts of DevTools to mobile, the full suite hasn't been available until now. Use the Timeline, Profiles and Network panels to find the trickiest bottlenecks.
  2. SPDY Also supported in the stock Android Browser on Honeycomb systems and newer, SPDY significantly reduces the number of costly RTTs that are incurred during a web page load. Keep in mind it only works with servers that speak SPDY. Apache users may be interested in following the progress of mod_spdy.
  3. Hardware accelerated graphics Getting this right was one of the largest efforts in bringing Chrome to Android. As a result, scrolling is buttery smooth on most reasonably efficient web sites. If your site isn't as smooth as you want it to be, take a profile in the Timeline panel to see where time is being spent. Often there are patterns for avoiding layouts or style recalculations to get back into the fast zone. A notable exception is CSS animations: while Chrome's frame rate is similar to other mobile browsers', CSS animations aren't nearly as smooth as they need to be. Improving them is an area of focus at the moment.
  4. V8 Crankshaft Incorporating the largest performance improvement in V8's history allows a mobile browser to hold its own against even JavaScript heavy web sites designed for the desktop.
  5. Navigation Timing Also newly available in the stock Android Browser on Ice Cream Sandwich, support for the Navigation Timing API allows developers to understand real users' page load times. This further increases the coverage of the API which is already supported in the latest versions of desktop Chrome, Firefox and IE.
  6. Large persistent cache As Guy Podjarny has noted, most mobile browsers have very tiny disk caches. Chrome's logic is based on the amount of disk space available, which means devices with plenty of free space are able to have significantly larger caches.
  7. requestAnimationFrame Support for this API is critical for building efficient JavaScript animations. UPDATE: Henri Sivonen points out that this was unclear. In Chrome, this API is vendor prefixed as webkitRequestAnimationFrame while it is being standardized.
  8. Preloading Searches in the omnibox trigger the preloading of high confidence results. In many cases this results in a near instant page load. By default this is enabled only when on WiFi, and that is customizable.
  9. SSL FalseStart This relatively recent improvement in the Chrome network stack avoids a costly RTT during the establishment of SSL connections. The commonly high RTT on mobile networks and increasing use of HTTPS on the web make this a nice win.
  10. HTML5 APIs Support for newer HTML5 APIs like Web Sockets, Web Workers and Indexed Database provides the building blocks for building high performance web apps.
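
Since the API in item 7 is still vendor prefixed, animation code typically resolves whichever flavor is available. A small feature-detecting helper, as a sketch (the name getRaf and the ~60fps timeout fallback are my assumptions):

```javascript
// Resolve whichever requestAnimationFrame the given window provides,
// falling back to a setTimeout loop at roughly 60 frames per second.
function getRaf(win) {
  var raf = win.requestAnimationFrame ||
            win.webkitRequestAnimationFrame ||
            win.mozRequestAnimationFrame ||
            function(cb) { return win.setTimeout(cb, 1000 / 60); };
  return raf.bind(win); // native implementations must be called on window
}

// Usage sketch:
//   var raf = getRaf(window);
//   raf(function step() { /* update the animation */ raf(step); });
```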

Getting Chrome running on Android is only the first step. Now the really fun stuff can begin.