Sunday, May 29, 2011

How a web page loads

The major web browsers load web pages in basically the same way. This process is known as parsing and is described by the HTML5 specification. A high-level understanding of this process is critical to writing web pages that load efficiently.

Parsing overview

As chunks of the HTML source become available from the network (or cache, filesystem, etc.), they are streamed to the HTML parser. Next, in a process known as tokenization, the parser iterates through the source, generating a token for (most notably) each start tag, end tag and character outside of a tag.

For example, the input source <b>hello</b> yields 7 tokens:

start-tag { name: b }
character { data: h }
character { data: e }
character { data: l }
character { data: l }
character { data: o }
end-tag { name: b }
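
To make the tokenization step concrete, here is a rough sketch in TypeScript. It is not the HTML5 tokenizer state machine: it only understands bare start tags, end tags and characters, with no attributes, comments, entities or error recovery. The Token type and the tokenize function are names I made up for illustration.

```typescript
// Simplified token shapes, modeled on the three token kinds listed above.
type Token =
  | { type: "start-tag"; name: string }
  | { type: "end-tag"; name: string }
  | { type: "character"; data: string };

// Hypothetical helper: walks the source once and emits one token per
// start tag, end tag, or character outside of a tag.
function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  let i = 0;
  while (i < source.length) {
    // Only look for a closing ">" when the current character opens a tag.
    const close = source.charAt(i) === "<" ? source.indexOf(">", i) : -1;
    if (close === -1) {
      // Not a complete tag: emit the current character as a character token.
      tokens.push({ type: "character", data: source.charAt(i) });
      i += 1;
      continue;
    }
    const inner = source.slice(i + 1, close);
    if (inner.charAt(0) === "/") {
      tokens.push({ type: "end-tag", name: inner.slice(1).toLowerCase() });
    } else {
      tokens.push({ type: "start-tag", name: inner.toLowerCase() });
    }
    i = close + 1;
  }
  return tokens;
}
```

Calling tokenize("<b>hello</b>") with this sketch returns the 7 tokens listed above, in the same order.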

After each token is generated it is serially passed to the next major subsystem: the tree builder. The tree builder dynamically modifies the Document's DOM tree to reflect the new token.

The 7 input tokens above yield the following DOM tree:

<html>
  <head>
  <body>
    <b>
      "hello"
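
The tree-building step can be sketched the same way. This is a heavily simplified illustration, not the algorithm from the specification: it assumes the html, head and body elements are always implied, keeps a stack of open elements, and merges adjacent character tokens into a single text node. All of the names here (Token, buildTree, render and so on) are hypothetical.

```typescript
type Token =
  | { type: "start-tag"; name: string }
  | { type: "end-tag"; name: string }
  | { type: "character"; data: string };

interface ElementNode { kind: "element"; name: string; children: TreeNode[] }
interface TextNode { kind: "text"; data: string }
type TreeNode = ElementNode | TextNode;

function element(name: string): ElementNode {
  return { kind: "element", name, children: [] };
}

// Hypothetical tree builder: consumes tokens one at a time and mutates the
// tree as it goes, which mirrors the serial hand-off described above.
function buildTree(tokens: Token[]): ElementNode {
  const html = element("html");
  const body = element("body");
  html.children.push(element("head"), body);

  // Stack of open elements; new nodes are appended to the top entry.
  const open: ElementNode[] = [body];
  for (const token of tokens) {
    const current = open[open.length - 1] ?? body;
    if (token.type === "start-tag") {
      const node = element(token.name);
      current.children.push(node);
      open.push(node);
    } else if (token.type === "end-tag") {
      // Close the current element if it matches; never pop <body> itself.
      if (open.length > 1 && current.name === token.name) open.pop();
    } else {
      const last = current.children[current.children.length - 1];
      if (last !== undefined && last.kind === "text") {
        last.data += token.data; // merge adjacent characters into one text node
      } else {
        current.children.push({ kind: "text", data: token.data });
      }
    }
  }
  return html;
}

// Renders the tree in the same indented form used above.
function render(node: TreeNode, depth: number = 0): string {
  const pad = new Array(depth + 1).join("  ");
  if (node.kind === "text") return pad + '"' + node.data + '"\n';
  const kids = node.children.map((child) => render(child, depth + 1)).join("");
  return pad + "<" + node.name + ">\n" + kids;
}
```

Feeding the 7 tokens for <b>hello</b> to buildTree and passing the result to render reproduces the tree shown above.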

Fetching subresources

A frequent operation performed by the tree builder is creating a new HTML element and inserting it into the Document. It is at the point of insertion that HTML elements which load subresources begin fetching the subresource.

Running scripts

This parsing algorithm seems to translate HTML source into a DOM tree as efficiently as possible. That is, except for one wrinkle: scripts. When the tree builder encounters an end-tag token for a script, it must serially execute the script before parsing can continue (unless the associated script start-tag has the defer or async attribute).

There are two significant preconditions which must be fulfilled before a script can execute:

  1. If the script is external its source must be fully downloaded.
  2. For any script, all stylesheets that precede it in the document must be fully downloaded.

This means the parser must often wait idly while scripts and stylesheets are downloaded.
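
The two preconditions can be expressed as a small calculation. The sketch below is only a model of the rule, not browser code: the PendingScript shape, the function name and all of the timings are invented for illustration, and script execution time is ignored.

```typescript
interface PendingScript {
  // When the parser reaches the script's end tag (ms since navigation start).
  reachedAtMs: number;
  // When an external script's source finishes downloading; null for inline scripts.
  sourceReadyAtMs: number | null;
}

// Hypothetical helper: the script can run only once the parser has reached it,
// its own source is available, and every pending stylesheet has arrived.
function earliestExecutionMs(script: PendingScript, stylesheetReadyAtMs: number[]): number {
  const sourceReady = script.sourceReadyAtMs ?? script.reachedAtMs;
  return Math.max(script.reachedAtMs, sourceReady, ...stylesheetReadyAtMs);
}

// Time the parser spends blocked on this script before it can execute.
function parserIdleMs(script: PendingScript, stylesheetReadyAtMs: number[]): number {
  return earliestExecutionMs(script, stylesheetReadyAtMs) - script.reachedAtMs;
}
```

With made-up numbers, an external script reached at 50 ms whose source arrives at 120 ms, on a page with a stylesheet that arrives at 300 ms, cannot run until 300 ms. The parser sits idle for 250 ms even though the script itself was ready long before.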

Why must parsing halt?

Well, a script may document.write something which affects further parsing, or it may query something about the DOM which would yield incorrect results if parsing had continued (for instance, the number of image elements in the DOM).

Why wait for stylesheets?

A script may expect to access the CSSOM directly or it may query an attribute of a DOM node which depends on the stylesheet (for example, how wide is a certain <table>).

Is it inefficient to block parsing?

Yes. Subresource download times often include a large fixed cost that is bounded below by round trip time. This means it is faster to download two resources in parallel than to download the same two serially. More obviously, the browser is also free to do CPU work while waiting on the network. For these reasons it is critical to efficient loading of a web page that subresource fetches are initiated as soon as possible. When parsing is blocked, the tree builder is not able to insert subsequent elements into the DOM, and thus subsequent subresource downloads are not initiated even if the HTML source which includes them is already available to the parser.
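
A back-of-the-envelope model shows why the round trip matters. The sketch below assumes each request costs one round trip plus transfer time, and that parallel downloads share the same bandwidth. Real networks are messier (connection setup, congestion control, connection limits), and the names and numbers are mine, not measurements.

```typescript
interface Resource { bytes: number }
interface Link { rttMs: number; bytesPerMs: number }

// Serial: every resource pays its own round trip before its bytes arrive.
function serialMs(resources: Resource[], link: Link): number {
  let total = 0;
  for (const r of resources) total += link.rttMs + r.bytes / link.bytesPerMs;
  return total;
}

// Parallel: the round trips overlap, but the bytes still share one pipe.
function parallelMs(resources: Resource[], link: Link): number {
  if (resources.length === 0) return 0;
  let totalBytes = 0;
  for (const r of resources) totalBytes += r.bytes;
  return link.rttMs + totalBytes / link.bytesPerMs;
}
```

For two 10,000-byte resources on a link with a 100 ms round trip and 100 bytes per millisecond of bandwidth, this model gives 400 ms serially and 300 ms in parallel. The saving is exactly one round trip, and it grows with every additional resource that would otherwise have waited behind a blocked parser.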

Mitigating blocking

As I've blogged previously, when the parser becomes blocked WebKit will run a lightweight parser known as the preload scanner. It mitigates the blocking problem by scanning ahead and fetching certain subresources that may be required. Other browsers employ similar techniques.

It is important to note that even with preload scanning, parsing is still blocked. Nodes cannot be added to the DOM tree. Although I haven't covered how a DOM tree becomes a render tree, layout or painting, it should be obvious that before a node is in the DOM tree it cannot be painted to the screen.

Finishing parsing

After the entire source has been parsed, first all deferred scripts will be executed (waiting for their source and all pending stylesheets to download). Once they have completed, the DOMContentLoaded event is fired. Next, the parser will wait for any pending async scripts to finish loading and executing. Finally, once all subresources have finished downloading, the window's load event will be fired and parsing is complete.
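
The ordering above can be turned into a rough timing model. This sketch follows the sequence as I have described it and nothing more: it ignores script execution time, and the PageTimeline shape and every field name are hypothetical.

```typescript
interface PageTimeline {
  parseEndMs: number;               // when the last byte of HTML has been parsed
  deferredScriptReadyMs: number[];  // download completion of each deferred script
  stylesheetReadyMs: number[];      // download completion of each stylesheet
  asyncScriptDoneMs: number[];      // load-and-execute completion of each async script
  otherSubresourceDoneMs: number[]; // images and any other subresources
}

// Hypothetical helper returning when each event could fire under this model.
function finishTimes(t: PageTimeline): { domContentLoadedMs: number; loadMs: number } {
  // Deferred scripts wait for their own source and for pending stylesheets.
  // With no deferred scripts there is nothing to wait for after parsing ends.
  const domContentLoadedMs =
    t.deferredScriptReadyMs.length > 0
      ? Math.max(t.parseEndMs, ...t.deferredScriptReadyMs, ...t.stylesheetReadyMs)
      : t.parseEndMs;

  // The load event waits for async scripts and every remaining subresource.
  const loadMs = Math.max(
    domContentLoadedMs,
    ...t.asyncScriptDoneMs,
    ...t.stylesheetReadyMs,
    ...t.otherSubresourceDoneMs
  );
  return { domContentLoadedMs, loadMs };
}
```

For a page that finishes parsing at 400 ms, with deferred scripts ready at 350 ms and 600 ms, a stylesheet at 500 ms, an async script done at 700 ms and an image at 900 ms, the model puts DOMContentLoaded at 600 ms and load at 900 ms.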

Takeaway

With this understanding, it becomes clear how important it is to carefully consider where and how stylesheets and scripts are included in the document. Those decisions can have a significant impact on the efficiency of the page load.

22 comments:

Anonymous said...

What are the implications here for the common tactic of putting scripts just before the closing body tag?

Baskar said...

Awesome Tony!

Mupinc said...

As far as I know, the window.onload event may fire before all images are loaded, and only in IE browsers does the onload event fire after all images are downloaded.

Kozie said...

Unless image tags are given their respective source's width and height in attributes, window.onload will be fired only after those images are loaded in other browsers too.

Correct me if I'm wrong.

Aariel said...

This is very informative and helpful. For the past couple of months a colleague of mine and I have been slowly reverse engineering the parse / script execution process in various browsers in an attempt to build some automatic optimizations for scripts on arbitrary web pages. Our current solution implements this process:

1) Proxied HTML documents are scanned for script tags, which are then rewritten so that they do not download or execute.
2) A script representing a client library is placed in the proxied document. This downloads early and asynchronously.
3) The client library scans for all external scripts and sends a single request to a proxy server.
4) The proxy server sends a multipart response and flushes scripts asynchronously through the response as they are made available.
5) The client library separates the received scripts and executes them in their intended sequence with inline scripts. The source for individual scripts is cached with some metadata in the browser's local storage.

We have had to do some tricky things, such as overriding / emulating document.write functionality, but overall it has been a fun and effective experiment.

If you are interested in trying it out, the feature is currently live and freely available on http://www.cloudflare.com (full disclosure: I work there).

Steve Souders said...

Great post, Tony. In addition to this article on parsing, you've had great browser implementation posts on tags that trigger downloads, preload scanning, how browsers have multiple caches, and triggering reflows. You're compiling a great set of articles on how browsers work that are worthwhile for all web developers to read.

The Nerdbirder said...

@id - Putting scripts at the bottom is a good idea for the reasons described in the article.

@Mupinc/Kozie - A browser which conforms to the HTML5 spec should not fire the onload event before all subresources (including images) have loaded. That is certainly the case in WebKit-based browsers. If you have a test case to the contrary, it is a bug and should be fixed.

@Aariel - Very interesting. Sounds like the type of thing that would be really tricky to get right in all cases. Have you been able to measure the impact on popular sites?

@Steve Souders - Thanks, glad they are useful.

Guy Ellis said...

I'd be interested to see you write something on SPDY:

http://www.chromium.org/spdy

I believe that one of the objectives of SPDY is to address the problem with having to parse to find other resources to load by sending that info down in the header.

Manish Deo said...

SuperLiked. Keep writing!

Dheeraj said...

Awesome and really helpful post

Raven Ng Shi Jie said...

Terrific post! This has given me much insight and made me reconsider where I should place my relevant tags.

cshandler said...

Hey Tony! Good post, I'm gonna tweet this.. :)

Dotnet Associates said...

Nice

Unknown said...

You might be interested in our Android App. It measures HTTP performance inside the Mobile Browser. You can also see all the real time web events happening. More at 3pmobile.com

Cheers,

Peter

Clã Celestial Blog said...

It's good.

James Bloom said...

Great article...

Unknown said...

App developers in Melbourne are also doing their best to provide proper programs suited to the latest upgrades to the HTML settings that webpages have nowadays.

Ankur Kumar Singh said...

I found a very good post on loading web pages here: http://www.techflirt.com/javascript-web-page-preloader/

Clinton Nguyen said...

Developers and programmers should find a way to make the page reload easier and the browsing a lot more convenient to the user. They should provide a more effective model of html settings that is more user-friendly.

Charli Lockie said...

Your example is simple and clear. I now have a better understanding regarding parsing in HTML 5.

bespoke website development said...

Well, that's an excellent article! I enjoy reading articles that have good information and ideas to share with each reader. It was really useful for me. Keep sharing such ideas in the future as well.