May 9, 2014
DOM Object Reflection: How does it work?

I started writing a bug comment and it turned out to be generally useful, so I turned it into this blog post.

Let’s start by defining some vocabulary:

DOM object - any object (not just nodes!) exposed to JS code running in a web page.  This includes things that are actually part of the Document Object Model, such as the document and nodes, as well as many other things such as XHR, IndexedDB, and the CSSOM.  When I use this term I mean all of the pieces required to make it work (the C++ implementation, the JS wrapper, etc).

wrapper - the JS representation of a DOM object

native object - the underlying C++ implementation of a DOM object

wrap - the process of taking a native object and retrieving or creating a wrapper for it to give to JS

IDL property - a property on a wrapper that is “built-in”. e.g. ‘nodeType’ on nodes, ‘responseXML’ on XHR, etc.  These properties are automatically defined on a wrapper by the browser.

expando property - a property on a wrapper that is not part of the set of “built-in” properties that are automatically reflected.  e.g. if I say “document.khueyIsAwesome = true” ‘khueyIsAwesome’ is now an expando property on ‘document’. (sadly khueyIsAwesome is not built into web browsers yet)

I’m going to ignore JS-implemented DOM objects here, but they work in much the same way: with an underlying C++ object that is automatically generated by the WebIDL code generator.

A DOM object consists of one or two pieces: the native object and potentially a wrapper that reflects it into JS.  Not all DOM objects have a wrapper.  Wrappers are created lazily in Gecko, so if a DOM object has not been accessed from JS it may not have a wrapper.  But the native object is always present: it is impossible to have a wrapper without a native object.

If the native object has a wrapper, the wrapper has a “strong” reference to the native.  That means that the wrapper exerts ownership over the native somehow.  If the native is reference counted then the wrapper holds a reference to it.  If the native is newed and deleted then the wrapper is responsible for deleting it.  This latter case corresponds to “nativeOwnership=’owned’” in Bindings.conf.  In both cases this means that as long as the wrapper is alive, the native will remain alive too.

For some DOM objects, the lifetimes of the wrapper and of the native are inextricably linked.  This is certainly true for all “nativeOwnership=’owned’” objects, where the destruction of the wrapper causes the deletion of the native.  It is also true for certain reference counted objects such as NodeIterator.  What these objects have in common is that they have to be created by JS (as opposed to, say, the HTML parser) and that there is no way to “get” an existing instance of the object from JS.  Things such as NodeIterator and TextDecoder fall into this category.

But many objects do not fit that pattern.  An HTMLImageElement can be created from JS, but it can also be created by the HTML parser, and it can be retrieved at some point later via getElementById.  XMLHttpRequest is only created from JS, but you can get an existing XHR via the event.target of events fired on it.

In these cases Gecko needs a way to create a wrapper for an existing native object, and it can’t rely on knowing the concrete type.  Unlike constructors, where the concrete type is obviously known, functions like getElementById and getters like event.target can’t be required to know the concrete type of the thing they return.

Gecko also needs to return wrappers that preserve object identity as seen from JS.  Calling getElementById twice with the same id should return two things that === each other.

We solve these problems with nsWrapperCache.  This is an interface, reachable via QueryInterface, that exposes the ability to create and retrieve wrappers even if the caller doesn’t know the concrete type of the DOM object.  Overriding the WrapObject function allows the derived class to create wrappers of the correct type.  Most implementations of WrapObject just call into a generated binding function that does all the real work.  The bindings layer calls WrapObject and/or GetWrapper when it receives a native object and needs to hand a wrapper back to a JS caller.
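To make that concrete, here is a minimal sketch (MyThing and its members are hypothetical, and the signatures approximate the bindings of this era rather than quoting them exactly):

    // Hypothetical wrapper-cached DOM class.  MyThingBinding::Wrap is
    // the function the WebIDL code generator would emit for it.
    class MyThing : public nsISupports,
                    public nsWrapperCache
    {
    public:
      // Called by the bindings layer when JS needs a wrapper and none
      // is cached.  Only the concrete class knows which generated Wrap
      // function to call, so getElementById and friends never need to
      // know the concrete type.
      virtual JSObject* WrapObject(JSContext* aCx) MOZ_OVERRIDE
      {
        return mozilla::dom::MyThingBinding::Wrap(aCx, this);
      }

      // Used by the bindings to pick the global the wrapper belongs to.
      nsISupports* GetParentObject() const { return mParent; }

    private:
      nsCOMPtr<nsISupports> mParent;
    };

    // Rough shape of what the bindings layer then does when it needs to
    // hand a wrapper back to JS (the real logic lives in the generated
    // bindings):
    JSObject* GetOrCreateWrapper(JSContext* aCx, nsWrapperCache* aCache)
    {
      JSObject* wrapper = aCache->GetWrapper();  // cached wrapper, if any
      if (!wrapper) {
        wrapper = aCache->WrapObject(aCx);       // create and cache one
      }
      return wrapper;
    }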

This solves the two problems mentioned above: the need to create wrappers for objects whose concrete type we don’t know, and the need to make object identity work for DOM objects.  Gecko actually takes the latter a step further though.  By default, nsWrapperCache merely caches the wrapper stored in it.  It still allows that wrapper to be GCd.  GCing wrappers can save large amounts of memory, so we want to do it whenever we can do so without breaking object identity.  If JS does not have a reference to the wrapper, then recreating it later after a GC does not break a === comparison, because there is nothing to compare it to.  The internal state of the object all lives in the C++ implementation, not in JS, so we don’t need to worry about the values of any IDL properties changing.

But we do need to be concerned about expando properties.  If a web page adds properties to a wrapper and we later GC it, we can’t recreate the wrapper exactly as it was before, and the difference will be visible to that page.  For that reason, setting an expando property on a wrapper triggers “wrapper preservation”.  This establishes a strong edge from the native object to the wrapper, ensuring that the wrapper cannot be GCd until the native object is garbage.  Because there is always an edge from the wrapper to the native object, the two now participate in a cycle that will ultimately be broken by the cycle collector.  Wrapper preservation is also handled in nsWrapperCache.

tl;dr

DOM objects consist of two pieces, native objects and JS wrappers.  JS wrappers are lazily created and potentially garbage collected in certain situations.  nsWrapperCache provides an interface to handle the three aspects of working with wrappers:

  1. Creating (or recreating) wrappers for native objects whose concrete type may not be known.
  2. Retrieving an existing wrapper to preserve object identity.
  3. Preserving a wrapper to prevent it from being GCd when that would be observable by the web page.

And certain types of DOM objects, such as those whose native objects are not reference counted, or those that can only be constructed and never reached through a getter or a function’s return value, do not need to be wrapper cached, because the wrapper cannot outlive the native object.

August 4, 2013
Cycle Collector on Worker Threads

Yesterday I landed Bug 845545 on mozilla-inbound.  This is the completion of the second step of our ongoing effort to make it possible to write a single implementation of DOM APIs that can be shared between the main thread and worker threads.  The first step was developing the WebIDL code generator, which eliminated the need to write manual JSAPI glue for every DOM object in workers.  The next step will be discarding the separate worker thread DOM event implementation in favor of the one used on the main thread.  Once complete, these changes will allow us to make DOM APIs such as WebGL, WebSockets, and many others work in web workers without writing a separate C++ implementation.

June 12, 2013
Cycle Collection

I gave a talk about the cycle collector at the Rendering Meetup in Taipei last month.  Chris Pearce had the foresight to record it, and now it is available on people.m.o for anyone who wants to watch it.

September 22, 2012
Refcounting thread-safety assertions are now fatal on mozilla-central

Gecko has long had assertions to verify that XPCOM objects are AddRefed/Released on the right thread.  Today I landed Bug 753659, which makes those assertions fatal (using MOZ_ASSERT).  This makes these assertions noticeable on test suites that do not check assertion counts (namely mochitest).  It also ensures that developers will notice these assertions when testing locally.  Remember that any time you see one of these assertions you are seeing a potential sg:crit (via a use-after-free on an object whose reference count is too low due to AddRef racing with another operation) and should file and fix it immediately.
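For illustration, the guts of a non-threadsafe AddRef amount to something like the following (a sketch, not the actual macro expansion from nsISupportsImpl.h; MyClass and mOwningThread are stand-ins):

    // mOwningThread is recorded on the thread that constructs the object.
    NS_IMETHODIMP_(nsrefcnt) MyClass::AddRef()
    {
      // This assertion is now fatal in debug builds: better to crash
      // immediately than to risk a racy refcount and a later
      // use-after-free.
      MOZ_ASSERT(PR_GetCurrentThread() == mOwningThread,
                 "MyClass not thread-safe: AddRef on the wrong thread");
      return ++mRefCnt;
    }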

July 19, 2012
Cycle Collection

We don’t really have a comprehensive and current overview of the cycle collector and how to use it anywhere, so I wrote this.  This is probably part 1 of a multipart series, as I’ve only covered the simple cases here.

What?

The cycle collector is sort of like a garbage collector for C++.  It solves the fundamental problem of reference counting: cycles.  In a naive reference counting system, if A owns B and B owns A, neither A nor B will ever be freed.  Some structures in Gecko are inherently cyclic (e.g. a node tree) or can very easily be made cyclic by code beyond our control (e.g. most DOM objects can form cycles with expando properties added by content script).
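As a minimal illustration (Node here is a hypothetical class, not a DOM node), this is the kind of leak a pure reference counting system can never reclaim:

    // A refcounted class that can hold a strong reference to another
    // object.
    class Node : public nsISupports
    {
    public:
      NS_DECL_ISUPPORTS
      nsCOMPtr<nsISupports> mOther;
    };

    NS_IMPL_ISUPPORTS0(Node)

    void Leak()
    {
      nsRefPtr<Node> a = new Node();
      nsRefPtr<Node> b = new Node();
      a->mOther = b;  // a owns b
      b->mOther = a;  // b owns a
      // When a and b go out of scope each object's refcount is still
      // 1, so neither destructor ever runs.  Without the cycle
      // collector, both objects (and everything they own) leak.
    }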

The cycle collector operates on C++ objects that opt in to cycle collection, and on all JS objects.  It runs a heavily modified version of Bacon and Rajan’s synchronous cycle collection algorithm.  C++ objects opt in by notifying the cycle collector when they may be garbage.  When the cycle collector wakes up it inspects those C++ objects (with help from the objects themselves) and builds a graph of the heap that participates in cycle collection.  It then finds the garbage cycles in this graph and breaks them, allowing the memory to be reclaimed.

Why?

The cycle collector makes developing Gecko much simpler at the cost of some runtime overhead to collect cycles.  Without a cycle collector, we would have to either a) manually break cycles when appropriate or b) use weak pointers to avoid ownership cycles.  Both approaches add significant complexity when modifying code and make avoiding memory leaks and use-after-free errors much harder.

When?

C++ objects need to participate in cycle collection whenever they can be part of a reference cycle that is not guaranteed to be broken through other means.  C++ objects also need to participate in cycle collection if they hold direct references to objects that are managed by the JavaScript garbage collector (a jsval, JS::Value, JSObject*, etc.).

In practice, this means most DOM objects need to be cycle collected.

  • Does the object inherit from nsWrapperCache (directly or indirectly)?  If so, it must be cycle collected.
  • Does the object have direct references to JavaScript values (jsval, JS::Value, JSObject*, etc)?  If so, it must be cycle collected.  Note that interface pointers to interfaces implemented by JavaScript (e.g. nsIDOMEventListener) do *not* count here.
  • Does the object hold no strong references at all?  That is, it has no member variables of type nsCOMPtr or nsRefPtr, no arrays of those (nsTArray<nsCOMPtr>, nsTArray<nsRefPtr>, or nsCOMArray), no hashtables of them (nsInterfaceHashtable, nsRefPtrHashtable), and does not directly own any object (via new/delete or nsAutoPtr) that has these.  If so, it does not need to be cycle collected.
  • Is the object threadsafe (e.g. an nsRunnable, or something that uses the threadsafe AddRef/Release macros)?  Threadsafe objects cannot participate in cycle collection and must break ownership cycles manually.
  • Is the object a service or other long lived object?  Long lived objects should break ownership cycles manually.  Adding cycle collection may prevent shutdown leaks, but it will just replace those with leaks that persist until shutdown, which is just as bad but doesn’t show up in our tools.
  • Does the object hold strong references to other things that are cycle collected?  If so, and the object does not have a well-defined lifetime (e.g. it can be accessed from JavaScript), it must be cycle collected.
  • Does the object have strong references only to other things that are not cycle collected (e.g. interfaces from XPCOM, Necko, etc)?  If so, it probably does not need to be cycle collected.
  • Can the object be accessed from JavaScript?  Then it probably needs to be cycle collected.

The last two are kind of vague on purpose.  Determining exactly when a class needs to participate in cycle collection is a bit tricky and involves some engineering judgement.  If you’re not sure, ask your reviewer or relevant peers/module owners.

How?

C++ objects participate in cycle collection by:

  1. Modifying their reference counting to use the cycle collector.
  2. Implementing a “cycle collection participant”, a set of functions that tell the cycle collector how to inspect the object.
  3. Modifying their QueryInterface implementation to return the participant when asked.

Like many things in Gecko, this involves lots of macros.

The reference counting is modified by replacing existing macros:

  • NS_DECL_ISUPPORTS becomes NS_DECL_CYCLE_COLLECTING_ISUPPORTS.
  • NS_IMPL_ADDREF becomes NS_IMPL_CYCLE_COLLECTING_ADDREF.
  • NS_IMPL_RELEASE becomes NS_IMPL_CYCLE_COLLECTING_RELEASE.

The cycle collection participant is a helper class that provides up to three functions:

  • A ‘Trace’ function is provided by participants that represent objects that use direct JavaScript object references.  It reports those JavaScript references to the cycle collector.
  • A ‘Traverse’ function is provided by all participants.  It reports strong C++ references to the cycle collector.
  • An ‘Unlink’ function is provided by (virtually) all participants.  It clears out both JavaScript and C++ references, breaking the cycle.

The cycle collection participant is implemented by placing one of the following macros in the header:

  • NS_DECL_CYCLE_COLLECTION_CLASS is the normal choice.  It is used for classes that only have C++ references to report.  This participant has Traverse and Unlink functions.
  • NS_DECL_CYCLE_COLLECTION_CLASS_AMBIGUOUS is a version of the previous macro for classes that multiply inherit from nsISupports.
  • NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS is used for classes that have JS references or a mix of JS and C++ references to report.  This participant has Trace, Traverse, and Unlink methods.
  • NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS_AMBIGUOUS is the ambiguous version of the previous macro.

And by doing one of the following in the cpp file:

  • For very simple classes that don’t have JS references and only hold nsCOMPtrs, you can use the NS_IMPL_CYCLE_COLLECTION_N macros, where N is the number of nsCOMPtrs the class has.
  • For classes that almost meet the above requirements, but inherit from nsWrapperCache, you can use the NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE_N macros, where N is the number of nsCOMPtrs the class has.
  • Otherwise, use the NS_IMPL_CYCLE_COLLECTION_CLASS macro, and build the Traverse, Unlink, and (if appropriate) Trace methods out of the NS_IMPL_CYCLE_COLLECTION_[TRAVERSE|UNLINK|TRACE]_* macros.  A sketch of this approach follows below.
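Putting the pieces together for a simple manual implementation (MyClass and mOther are hypothetical, and the exact spellings of the member macros have varied across Gecko revisions, so treat this as a sketch):

    // In the header:
    class MyClass : public nsISupports
    {
    public:
      NS_DECL_CYCLE_COLLECTING_ISUPPORTS
      NS_DECL_CYCLE_COLLECTION_CLASS(MyClass)

    private:
      nsCOMPtr<nsISupports> mOther;  // a strong reference to report
    };

    // In the cpp file:
    NS_IMPL_CYCLE_COLLECTION_CLASS(MyClass)

    // Traverse: report our strong references to the cycle collector.
    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_BEGIN(MyClass)
      NS_IMPL_CYCLE_COLLECTION_TRAVERSE(mOther)
    NS_IMPL_CYCLE_COLLECTION_TRAVERSE_END

    // Unlink: drop those references, breaking any cycle we are part of.
    NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(MyClass)
      NS_IMPL_CYCLE_COLLECTION_UNLINK(mOther)
    NS_IMPL_CYCLE_COLLECTION_UNLINK_END

    // Refcounting that notifies the cycle collector when the object
    // may be garbage.
    NS_IMPL_CYCLE_COLLECTING_ADDREF(MyClass)
    NS_IMPL_CYCLE_COLLECTING_RELEASE(MyClass)

    // QueryInterface that can hand back the cycle collection participant.
    NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION(MyClass)
      NS_INTERFACE_MAP_ENTRY(nsISupports)
    NS_INTERFACE_MAP_END

For a class this simple the first bullet applies, and NS_IMPL_CYCLE_COLLECTION_1(MyClass, mOther) would replace NS_IMPL_CYCLE_COLLECTION_CLASS and the Traverse and Unlink blocks above.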

April 26, 2012
Fixing the Memory Leak

The MemShrink effort that has been underway at Mozilla for the last several months has substantially decreased the memory usage of Firefox for most users.  There are still some remaining issues that lead to pathological memory use.  One of those issues is leaky addons, which Nick has identified as the single most important MemShrink issue.

In Firefox, the JavaScript heap is split into compartments.  Firefox’s UI code, which is written in JS, lives in the privileged “chrome” compartment.  Addon code also usually lives in the chrome compartment.  Websites live in different, unprivileged compartments.  Exactly how compartments are allocated to websites is beyond the scope of this article, but at the time of writing there is roughly one compartment per domain.  Code running in the chrome compartment can hold references to objects in the content compartments (much like how a page can hold references to objects in an iframe).

For an example of how this might look in practice, let’s imagine we have Firefox open to three tabs: GMail, Twitter, and Facebook, and we have some sort of social media addon installed.  Our compartments might look something like this:

[Diagram: the chrome compartment alongside the GMail, Twitter, and Facebook compartments, with references crossing into each.  The blue lines are the references the Firefox UI is holding and the red lines are the references the addon is holding.]

The problems start to arise if these references aren’t cleaned up properly when the tab is navigated or closed.  If the Facebook tab is closed, but not all of those references are cleaned up, some or all of the memory the Facebook tab was using is not released.  The result is popularly known as a zombie compartment, and is a big source of leaks in Firefox.

Chrome (privileged UI or other JS) code that leaks is particularly problematic because the leak usually persists for the lifetime of the browser.  When chrome code leaks, say, facebook.com, it leads to dozens of megabytes of memory being lost.  It turns out that writing chrome code that doesn’t leak can actually be quite difficult.  Even the Firefox front end code, which is worked on by a number of full time engineers and has extensive code review, has a number of leaks.  We can find and fix those, but addons are a much harder problem, and we can’t expect addon authors to be as diligent as we try to be in finding and fixing leaks.  The only defenses we have had are the AMO review team and our list of best practices.

That changed last night when I landed Bug 695480.  Firefox now attempts to clean up after leaky chrome code.  My approach takes advantage of the fact that chrome code lives in a separate compartment from web page code.  This means that every reference from chrome code to content code goes through a cross-compartment wrapper, which we maintain in a list.  When the page is navigated, or a tab is closed, we reach into the chrome compartment and grab this list.  We go through this list and “cut” all of the wrappers that point to objects in the page we’re getting rid of.  The garbage collector can then reclaim the memory used by the page that is now gone.

The result looks something like:

[Diagram: the same compartments after the Facebook tab is closed, with the chrome-to-Facebook references cut.]

Code that accidentally (or intentionally!) holds references to objects in pages that are gone will no longer leak.  If the code tries to touch the object after the wrapper has been “cut”, it will get an exception.  This may break certain code patterns.  A few examples:

  • Creating a DOM node from a content document and storing it in a global variable for indefinite use.  Once the page you created the node from is closed, your node will vanish.  Here’s an example of code in Firefox that used to do that.
  • Creating a closure over DOM objects can break if those objects can go away before the closure is invoked.  Here’s some code in Firefox that did that.  In one of the tests in our test suite, the tab closed itself before the timeout ran, resulting in an exception being thrown.

Addon authors probably don’t need to bother changing anything unless they see breakage.  Breakage should be pretty rare, and the huge upside of avoided leaks will be worth it.  It’s a little early to be sure what effects this will have, but the number of leaks we see on our test suite dropped by 80%.  I expect that this change will also fix a majority of the addon leaks we see, without any effort on the part of the addon authors.

February 23, 2012
Address Space Layout Randomization now mandatory for binary components

This evening I landed Bug 728429 on mozilla-central.  Firefox will now refuse to load XPCOM component DLLs that are not built with ASLR enabled.  ASLR is an important defense-in-depth mechanism that makes it more difficult to successfully exploit a security vulnerability.  Firefox has used ASLR on its core components for some time now, but many extensions that ship with binary components do not.

ASLR is on by default on modern versions of Visual Studio, so extension authors will only need to ensure that they haven’t flipped the switch to turn it off.  MSDN documentation on ASLR options is available here.  Further reading about the benefits of ASLR is available here.

If no unexpected problems arise, this change will ship in Firefox 13.

December 19, 2011
Pushing Compilers to the Limit (and Beyond)

At the end of the first week of December, Firefox exceeded the memory limits of the Microsoft linker we use to produce our highly optimized Windows builds.  After the problem was identified we took some emergency steps to ensure that people could continue to land changes to parts of Firefox not affected by this issue by disabling some new and experimental features.  Once that was complete we were able to make some other changes that reduced the memory used by the linker back below the limits.  We were then able to undo those emergency steps and turn those features back on.

This will have no lasting impact on what is or is not shipped in Firefox 11.  The issues described here only affected Firefox developers, and have nothing to do with the memory usage or other performance characteristics of the Firefox binaries shipped to users.

Technical Details

Recently we began seeing sporadic compilation failures in our optimized builds on Windows.  After some debugging we determined that the problem was that the linker was running out of virtual address space.  In essence, the linker couldn’t fit everything it needed into memory and crashed.

The build configuration that was failing is not our normal build configuration.  It uses Profile Guided Optimization, fancy words meaning that it runs some benchmarks that we give it and then uses that information to determine what to optimize for speed and what optimizations to use.  It also uses Link-Time Code Generation, which means that instead of the traditional compilation model, where the compiler generates code and the linker glues it all together, the linker does all of the code generation.  These two optimization techniques are quite powerful (they generally win 10-20% on various benchmarks that we have) but they require loading source code and profiling data for most of Firefox into RAM at the same time.

Once we identified the problem we took emergency steps by disabling SPDY support and the Graphite font subsystem, both new features that had been landed recently and were turned off by default (in other words, users had to use an about:config preference to turn them on).  This allowed us to reopen the tree for checkins that did not touch code that ends up in xul.dll (this allowed work to proceed on the Firefox UI, the Javascript engine, and a few other things).

We then disabled Skia (which is being used as an experimental <canvas> backend) and separated video codecs and parts of WebGL support into a separate shared library.  This work decreased the linker’s memory usage enough to resume normal development and turn SPDY back on.  The medium term solution is to start doing our 32 bit builds on 64 bit operating systems so that the linker can use 4 GB of memory instead of 3 GB of memory, and to separate pieces of code that aren’t on the critical startup path into other shared libraries.

Frequently Asked Questions:

  • Why don’t you just get machines with more RAM? - The problem is not that the linker was running out of physical memory, but that it was running out of virtual memory.  A 32 bit program can only address 2^32 bytes (4 GB) of memory, regardless of how much memory is in the machine.  Additionally, on 32 bit Windows, the last 1 GB is reserved for the kernel, so a program is really limited to 3 GB of memory.
  • Ok, so why don’t you just use a 64 bit linker? - Unfortunately there is no 64->32 bit cross compiler provided with the Microsoft toolchain, so you can’t generate binaries that run on 32 bit systems with a 64 bit compiler.
  • Sure you can, just use -MACHINE:X86 on the linker! - You can have the 64 bit linker link 32 bit binaries, but this is incompatible with Link-Time Code Generation.
  • Is Firefox bloated? - Firefox’s size and linker memory usage compares favorably with other browsers. These problems are not a reflection on which browsers are or are not bloated, but rather on how resource intensive it is to do whole program optimization across a large C++ codebase.

September 29, 2011
Using XHR.onload/etc in addons

I just landed https://bugzilla.mozilla.org/show_bug.cgi?id=687332 on mozilla-central, which makes some changes to how .onfoo event listeners are handled on some DOM objects (including XHR).  These changes mean it is no longer possible to use .onfoo event listeners from JS scopes where the global object is not a Window, or from C++.  The correct way to listen for events from these scopes is to use .addEventListener.

This will likely affect a number of addons (particularly for XHR).  Addons that use XHR in XPCOM components should check to see if they are affected.  We may consider implementing some sort of a compatibility hack for XHR if that number is large.

August 10, 2011
xpidlc is dead. Long live pyxpidl.

Today I landed Bug 458936, which switches XPCOM typelib generation from the binary xpidlc to new Python code.  With that, and other work by people including Ted Mielczarek, Mike Hommey, and Benjamin Smedberg, Firefox is now built without ever invoking the binary xpidl.

The remaining pieces of work here are:

  • Migrate comm-central to the new Python tools (interfaces in comm-central are still compiled with xpidlc).
  • Package the Python xpidl into the Gecko SDK.
  • Stop building the binary xpidl entirely and remove it from the tree.
  • Remove our build time dependencies on libIDL, etc.
