August 4, 2013
Cycle Collector on Worker Threads

Yesterday I landed Bug 845545 on mozilla-inbound.  This completes the second step of our ongoing effort to make it possible to write a single implementation of DOM APIs that can be shared between the main thread and worker threads.  The first step was developing the WebIDL code generator, which eliminated the need to write manual JSAPI glue for every DOM object in workers.  The next step will be discarding the separate worker thread DOM event implementation in favor of the one used on the main thread.  Once complete, these changes will allow us to make DOM APIs such as WebGL, WebSockets, and many others work in web workers without writing a separate C++ implementation.

Prior to the great worker rewrite of 2011, exposing DOM objects in workers was pretty easy.  The same XPIDL/XPConnect layer that was used for most of the DOM worked off the main thread.  As long as the underlying implementation of your DOM object was threadsafe and you added a way to create the object in a worker or transfer it in (via postMessage), things just worked.  In fact, you could even send arbitrary threadsafe XPCOM objects to ChromeWorkers.  But it turns out that XPConnect and the JS engine were not really threadsafe, and that we couldn’t share XPConnect and the JS engine state between threads.  So bent rewrote workers to use a separate JSRuntime.

XPConnect didn’t come along for the ride though.  Instead we took the opportunity to prototype the next iteration of quickstubs.  XPConnect does a lot of work in each function call that could instead be done at compile time, at the cost of larger code size.  As performance has become more and more important and code size has mattered less, we have been moving towards using more and more generated code in the “binding” layer and less dynamic conversion and dispatch via XPConnect.  So workers got the first iteration of what would become the WebIDL bindings: a handwritten JSAPI binding layer.

This binding layer ended up diverging from what we did for the main thread in a number of ways.  Most notably, memory management is handled quite differently.  One of the hardest problems in web browsers is how to handle cross-language cycles.  References from JS to the language the browser is implemented in (C++ in existing browsers) are ubiquitous.  But references from browser code to JS can exist too (e.g. event listeners).  Once references can exist in both directions, there’s the possibility of cycles.  There are three options for dealing with these cycles:

  1. Use a separate memory management system for C++, and have special purpose code that glues these two systems together.  This is what Gecko does on the main thread with the cycle collector.  C++ side objects are usually reference counted, and the cycle collector breaks both pure C++ and C++-JS cycles.  It also handles tracing the JS objects owned by C++.
  2. Use the JS garbage collector to manage the lifetimes of the C++ objects too.  This is what we chose to do on the worker threads.  In this setup there are never any C++->C++ edges; instead, a C++ object owns the JS reflection of the C++ object it wants to keep alive.
  3. Implement something hacky for event listeners and hope that nobody ever wants to implement an API that requires other sorts of C++->JS edges.  This is the WebKit approach.
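
To make option 1 a bit more concrete, here is a minimal sketch of the core idea behind a cycle collector, assuming a toy object graph.  This is not Gecko's implementation: Node, AddEdge, and FindGarbage are hypothetical names invented for illustration, and a Node stands in for either a reference counted C++ object or a JS object.  The collector counts how many of each object's references come from inside the graph it walked.  Any object with more references than that is held from outside and is alive, along with everything it reaches.  Whatever remains is a garbage cycle that reference counting alone would never free.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical toy object: a manual reference count plus the strong edges
// that a "traverse" hook would report to the collector.
struct Node {
    int refcnt = 0;            // total strong references to this node
    std::vector<Node*> edges;  // strong references this node holds
};

// Add a strong edge from -> to, bumping the target's reference count.
inline void AddEdge(Node& from, Node& to) {
    from.edges.push_back(&to);
    ++to.refcnt;
}

// Return the nodes reachable from `suspects` that are kept alive only by
// references originating inside that reachable subgraph.
inline std::unordered_set<Node*> FindGarbage(const std::vector<Node*>& suspects) {
    // Pass 1: walk the subgraph, counting references that come from inside it.
    std::unordered_map<Node*, int> internal;
    std::vector<Node*> stack;
    for (Node* n : suspects) {
        if (internal.emplace(n, 0).second) stack.push_back(n);
    }
    while (!stack.empty()) {
        Node* n = stack.back();
        stack.pop_back();
        for (Node* child : n->edges) {
            auto result = internal.emplace(child, 0);
            ++result.first->second;
            if (result.second) stack.push_back(child);  // first visit only
        }
    }

    // Pass 2: a node with more references than the graph accounts for is
    // held from outside, so it and everything it reaches stays alive.
    std::unordered_set<Node*> live;
    for (const auto& entry : internal) {
        assert(entry.second <= entry.first->refcnt);  // refcounts must be consistent
        if (entry.first->refcnt > entry.second && live.insert(entry.first).second) {
            stack.push_back(entry.first);
        }
    }
    while (!stack.empty()) {
        Node* n = stack.back();
        stack.pop_back();
        for (Node* child : n->edges) {
            if (live.insert(child).second) stack.push_back(child);
        }
    }

    // Everything visited but not marked live is an unreachable cycle.
    std::unordered_set<Node*> garbage;
    for (const auto& entry : internal) {
        if (live.count(entry.first) == 0) garbage.insert(entry.first);
    }
    return garbage;
}
```

For example, build two two-node cycles (think of a DOM object and its event listener pointing at each other) and give one of them an extra reference from outside the graph.  FindGarbage reports only the cycle with no outside reference; the other one survives because something still holds it.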

Option 3 is obviously a non-starter.  Option 2 is potentially what we would do on both threads if we were writing a browser from scratch, but it is a pretty severe impedance mismatch with the rest of Gecko.  Code that wanted to work on both threads would have to own C++ objects and participate in cycle collection on one thread, and own JS objects and trace them manually on the other.  This, combined with the lack of automated binding generation, has made adding new APIs to workers essentially impossible for people who aren’t already seasoned DOM hackers.

The WebIDL code generator solved the problem of writing the binding layer manually.  At the DOM meetup in London back in February we had a meeting where the rest of us overruled bent and decided to move workers to the cycle collector.  This involved some pretty serious refactoring work.  We had to separate a large amount of code from XPConnect so it could be used on both threads, while adding some customization hooks so that XPConnect could insert some of the strange behavior that only the main thread needs.  Then we were finally able to stand up a worker thread cycle collector implementation and port ImageData to use it.

Unfortunately this work was repeatedly delayed: first by travel, then by the length of bent’s review queue, later by some Firefox OS work, and then by the effort needed to unbitrot the patches after some other cycle collector changes.  All of this led to a Q2 goal slipping a third of the way into Q3.  But it is finally done, and future work should allow more opportunities for parallelism.  Some of the existing worker thread objects can be converted to use the cycle collector without waiting for events.  Other APIs can be ported to worker threads now if they don’t use events.  And Olli can work on events now that I am no longer blocking him.

Best of all, we have some plans in the works to port some very interesting stuff to workers.  Keep your eyes peeled for more.

khuey posted this