Fixing the Memory Leak
The MemShrink effort that has been underway at Mozilla for the last several months has substantially decreased the memory usage of Firefox for most users. Some issues remain, however, that lead to pathological memory use. One of those issues is leaky addons, which Nick has identified as the single most important MemShrink issue.
As an example of how this might look in practice, let’s imagine we have Firefox open to three tabs: GMail, Twitter, and Facebook, and we have some sort of social media addon installed. Our compartments might look something like this:
In this picture, the blue lines are the references the Firefox UI is holding and the red lines are the references the addon is holding.
The problems start to arise if these references aren’t cleaned up properly when the tab is navigated or closed. If the Facebook tab is closed, but not all of those references are cleaned up, some or all of the memory the Facebook tab was using is not released. The result is popularly known as a zombie compartment, and is a big source of leaks in Firefox.
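To make the failure mode concrete, here is a minimal sketch of the kind of code that produces such a leak. It is a toy model in plain JavaScript, not Firefox’s actual addon API: `openTab`, `closeTab`, and `seenDocuments` are hypothetical names, and a page is represented by an ordinary object.

```javascript
// Toy model only: plain objects stand in for tabs and pages.
// Hypothetical addon state: a long-lived cache of page documents.
const seenDocuments = new Map();

function openTab(url) {
  // Pretend this object is a page's document with a large payload.
  const doc = { url, payload: new Array(1000).fill(url) };
  // The addon remembers the document so it can use it later...
  seenDocuments.set(url, doc);
  return { url, doc };
}

function closeTab(tab) {
  // The browser drops its own reference to the page...
  tab.doc = null;
  // ...but the addon never removes its cache entry,
  // so the page stays reachable from the addon.
}

const tab = openTab("https://www.facebook.com/");
closeTab(tab);

// The tab is gone, yet the addon's cache still keeps the page alive.
const leaked = seenDocuments.has("https://www.facebook.com/");
console.log(leaked); // prints true
```

In this model the fix on the addon side would be a single `seenDocuments.delete(tab.url)` in `closeTab`. The point is that forgetting one such line is enough to keep the whole page alive.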
Chrome code (privileged UI or other JS) that leaks is particularly problematic because the leak usually persists for the lifetime of the browser. When chrome code leaks, say, facebook.com, dozens of megabytes of memory are lost. It turns out that writing chrome code that doesn’t leak can be quite difficult. Even the Firefox front-end code, which is worked on by a number of full-time engineers and has extensive code review, has a number of leaks. We can find and fix those, but addons are a much harder problem, and we can’t expect addon authors to be as diligent as we try to be in finding and fixing leaks. Until now, the only defenses we have had are the AMO review team and our list of best practices.
That changed last night when I landed Bug 695480. Firefox now attempts to clean up after leaky chrome code. My approach takes advantage of the fact that chrome code lives in a separate compartment from web page code. This means that every reference from chrome code to content code goes through a cross-compartment wrapper, and we maintain a list of these wrappers. When the page is navigated, or a tab is closed, we reach into the chrome compartment and grab this list. We go through the list and “cut” all of the wrappers that point to objects in the page we’re getting rid of. The garbage collector can then reclaim the memory used by the page that is now gone.
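The cutting step can be illustrated with a toy model. This is not the code from Bug 695480, which lives inside the engine. It is a sketch in plain JavaScript that uses the standard `Proxy.revocable` as a stand-in for a cross-compartment wrapper, and the names `wrapForChrome`, `wrappersByPage`, and `cutWrappersFor` are hypothetical.

```javascript
// Toy model: a page is identified by a string id, and wrappers are
// revocable proxies. wrappersByPage maps a page id to the revoke
// functions of every wrapper that points into that page.
const wrappersByPage = new Map();

function wrapForChrome(pageId, target) {
  // Every chrome -> content reference goes through a tracked wrapper.
  const { proxy, revoke } = Proxy.revocable(target, {});
  if (!wrappersByPage.has(pageId)) {
    wrappersByPage.set(pageId, []);
  }
  wrappersByPage.get(pageId).push(revoke);
  return proxy;
}

function cutWrappersFor(pageId) {
  // Called when the page is navigated away from or its tab is closed.
  const revokers = wrappersByPage.get(pageId) || [];
  for (const revoke of revokers) {
    revoke(); // the wrapper no longer references its target
  }
  wrappersByPage.delete(pageId);
  return revokers.length;
}

// Chrome code holds wrappers into two different pages.
const fbNode = wrapForChrome("facebook", { tag: "div" });
const twNode = wrapForChrome("twitter", { tag: "span" });

// The Facebook tab is closed, so only its wrappers are cut.
const cutCount = cutWrappersFor("facebook");

let threw = false;
try {
  fbNode.tag; // touching a cut wrapper
} catch (e) {
  threw = e instanceof TypeError;
}
```

After the cut, `fbNode` still exists as a value in chrome code, but it no longer keeps the page object alive, and any access through it throws. `twNode` is unaffected because its page is still open.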
The result looks something like:
Code that accidentally (or intentionally!) holds references to objects in pages that are gone will no longer leak. If the code tries to touch the object after the wrapper has been “cut”, it will get an exception. This may break certain code patterns. A few examples:
- Creating a DOM node from a content document and storing it in a global variable for indefinite use. Once the page you created the node from is closed, your node will vanish. Here’s an example of code in Firefox that used to do that.
- Creating a closure over DOM objects, which can break if those objects go away before the closure is invoked. Here’s some code in Firefox that did that. In one of the tests in our test suite, the tab closed itself before the timeout ran, resulting in an exception being thrown.
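The closure pattern, and one way to guard against it, can be sketched as follows. This is again a toy model rather than real Firefox code: `Proxy.revocable` stands in for a wrapper that gets cut when the tab closes, `makeLabelUpdater` is a hypothetical helper, and the callback is invoked directly instead of through a real timer.

```javascript
// Stand-in for a content DOM node seen through a wrapper.
// Calling closeTab() plays the role of the wrapper being cut.
const { proxy: node, revoke: closeTab } = Proxy.revocable({ label: "" }, {});

// A closure over the node, of the kind that might be passed to setTimeout.
function makeLabelUpdater(target) {
  return function update(text) {
    try {
      target.label = text;
      return true;
    } catch (e) {
      // The page is gone and the wrapper was cut: nothing left to update.
      return false;
    }
  };
}

const update = makeLabelUpdater(node);
const beforeClose = update("loading"); // page still open
closeTab();                            // tab closes before the "timeout" runs
const afterClose = update("done");     // callback fires too late
console.log(beforeClose, afterClose);  // prints true false
```

Catching every exception is a blunt instrument. Code that knows when its tab is closing is usually better off cancelling the pending callback at that point, so the guard here is only meant to show where the exception surfaces.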
Addon authors probably don’t need to bother changing anything unless they see breakage. Breakage should be pretty rare, and the huge upside of avoided leaks will be worth it. It’s a little early to be sure what effects this will have, but the number of leaks we see in our test suite dropped by 80%. I expect that this change will also fix a majority of the addon leaks we see, without any effort on the part of the addon authors.