In Firefox 70 we changed how pixels get to the screen on macOS. This allows us to do less work per frame when only small parts of the screen change. As a result, Firefox 70 drastically reduces the power usage during browsing.
In short, Firefox 70 reduces power usage by a factor of three or more for many use cases. The larger the Firefox window and the smaller the animation, the bigger the difference. Users have reported much longer battery life, cooler machines and less fan spinning.
I’m seeing a huge improvement over here too (2015 13″ MacBook Pro with scaled resolutions on internal display as well as external 4K display). Prior to this update I literally couldn’t use Firefox because it would spin my fans way up and slow down my whole computer. Thank you, I’m very happy to finally see Core Animation being implemented.
I usually try nightly builds every few weeks but end up going back to Edge Chromium or Chrome for speed and lack of heat. This makes my 2015 mbp without a dedicated dGPU become a power sipper compared to earlier builds.
The crucial limitation here is that flushBuffer gives you no way to indicate which parts of the OpenGL context have changed. This is a limitation which does not exist on Windows: On Windows, the corresponding API has full support for partial redraws.
Every Firefox window contains one OpenGL context, which covers the entire window. Firefox 69 was using the API described above. So we were always redrawing the whole window on every change, and the window manager was always copying our entire window to the screen on every change. This turned out to be a problem despite the fact that these draws were fully hardware accelerated.
Enter Core Animation
Core Animation is the name of an Apple framework which lets you create a tree of layers (CALayer). These layers usually contain textures with some pixel content. The layer tree defines the positions, sizes, and order of the layers within the window. Starting with macOS 10.14, all windows use Core Animation by default, as a way to share their rendering with the window manager.
So, does Core Animation have an API which lets us indicate which areas inside an OpenGL context have changed? No, unfortunately it does not. However, it provides a number of other useful capabilities, which are almost as good and in some cases even better.
First and foremost, Core Animation lets us share a GPU buffer with the window manager in a way that minimizes copies: We can create an IOSurface and render to it directly using OpenGL by treating it as an offscreen framebuffer, and we can assign that IOSurface to a CALayer. Then, when the window manager composites that CALayer onto the screen surface, it will read directly from our GPU buffer with no additional copies. (IOSurface is the macOS API which provides a handle to a GPU buffer that can be shared between processes. It’s worth noting that the ability to assign an IOSurface to the CALayer contents property is not properly documented. Nevertheless, all major browsers on macOS now make use of this API.)
Secondly, Core Animation lets us display OpenGL rendering in multiple places within the window at the same time and update it in a synchronized fashion. This was not possible with the old API we were using: Without Core Animation, we would have needed to create multiple NSViews, each with their own NSOpenGLContext, and then call flushBuffer on each context on every frame. There would have been no guarantee that the rendering from the different contexts would end up on the screen at the same time. But with Core Animation, we can just group updates from multiple layers into the same CATransaction, and the screen will be updated atomically.
Having multiple layers allows us to update just parts of the window: Whenever a layer is mutated in any way, the window manager will redraw an area that includes the bounds of that layer, rather than the bounds of the entire window. And we can mark individual layers as opaque or transparent. This cuts down the window manager’s work some more for areas of the window that only contain opaque layers. With the old API, if any part of our OpenGL context’s default framebuffer was transparent, we needed to make the entire OpenGL context transparent.
Lastly, Core Animation allows us to move rendered content around in the window cheaply. This is great for efficient scrolling. (Our current compositor does not yet make use of this capability, but future work in WebRender will take advantage of it.)
The Firefox Core Animation compositor
How do we make use of those capabilities in Firefox now?
The most important change is that Firefox is now in full control of its swap chain. In the past, we were asking for a double-buffered OpenGL context, and our rendering to the default framebuffer was relying on the built-in swap chain. So on every frame, we could guess that the existing framebuffer content was probably two frames old, but we could never know for sure. Because of this, we just ignored the framebuffer content and re-rendered the entire buffer. In the new world, Firefox renders to offscreen buffers of its own creation and it knows exactly which pixels of each buffer need to be updated and which pixels still contain valid content. This allows us to drastically reduce the compositing work (step 2 of the pipeline listed below): Our compositor can now finally do partial redraws. This change on its own is responsible for most of the power savings.
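To make the swap chain idea concrete, here is a minimal sketch in Python. It is a toy model, not Firefox's actual implementation: the `SwapChain` class, its method names, and the rectangle format are all hypothetical. It only illustrates why an application that owns its buffers can know exactly which regions of each buffer are stale.

```python
class SwapChain:
    """Toy model of an application-owned swap chain (hypothetical names).

    For each buffer it tracks the rectangles whose content is stale.
    Rectangles are (x, y, width, height) tuples.
    """

    def __init__(self, buffer_count, width, height):
        full = (0, 0, width, height)
        # A fresh buffer holds no valid content, so all of it is invalid.
        self.invalid = [[full] for _ in range(buffer_count)]
        self.next_index = 0

    def invalidate(self, rect):
        """Record a change: every buffer is now stale in that area."""
        for rects in self.invalid:
            rects.append(rect)

    def begin_frame(self):
        """Pick the next buffer; return (index, rects that must be redrawn)."""
        index = self.next_index
        self.next_index = (index + 1) % len(self.invalid)
        to_redraw = self.invalid[index]
        self.invalid[index] = []  # once redrawn, this buffer is fully valid
        return index, to_redraw


chain = SwapChain(buffer_count=2, width=1280, height=800)
chain.begin_frame()                    # buffer 0: first use, full redraw
chain.begin_frame()                    # buffer 1: first use, full redraw

chain.invalidate((10, 10, 16, 16))     # a small animation changed
frame_a = chain.begin_frame()
print(frame_a)                         # (0, [(10, 10, 16, 16)])

chain.invalidate((50, 50, 16, 16))     # a second small change
frame_b = chain.begin_frame()
print(frame_b)                         # (1, [(10, 10, 16, 16), (50, 50, 16, 16)])
```

Buffer 1 missed the first change while buffer 0 was being drawn, so it has to catch up on both rectangles. With the built-in swap chain, this per-buffer bookkeeping was not possible because the age of the buffer contents was unknown.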
And finally, Firefox windows are additionally split into transparent and opaque parts: Transparent CALayers cover the “vibrant” portions of the window, and opaque layers cover the rest of the window. This saves some more work in step 3. It also means that the window manager does not need to redraw the vibrancy blur effect unless something in the vibrant part of the window changes.
The rendering pipeline in Firefox on macOS now looks as follows:
Step 1: Firefox draws pixels into “Gecko layers”.
Step 2: For each square CALayer tile in the window, the Firefox compositor combines the relevant Gecko layers to redraw the changed parts of that CALayer.
Step 3: The operating system’s window manager assembles all updated windows and CALayers on the screen to produce the screen content.
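As a rough illustration of how tiling limits the work in steps 2 and 3, the following Python sketch computes which tiles a changed rectangle touches. The function name and the 512-pixel tile size are assumptions made for this example; the text above does not state the tile size Firefox uses.

```python
def dirty_tiles(window_size, tile_size, changed_rect):
    """Return (column, row) indices of the tiles that intersect changed_rect.

    window_size is (width, height); changed_rect is (x, y, width, height).
    """
    win_w, win_h = window_size
    x, y, w, h = changed_rect
    # Clamp the changed area to the window bounds.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, win_w), min(y + h, win_h)
    if x0 >= x1 or y0 >= y1:
        return []  # nothing visible changed
    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# Hypothetical numbers: a 1280x800 window split into 512x512 tiles (3 x 2 = 6 tiles).
tiles = dirty_tiles((1280, 800), 512, (500, 20, 32, 32))
print(tiles)  # [(0, 0), (1, 0)]
```

In this example a 32×32 pixel change straddles a tile boundary, so two of the six tiles need attention and the other four are left untouched.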
You can use the Quartz Debug app to visualize the improvements in step 3. Using the “Flash screen updates” setting, you can see that the window manager’s repaint area in Firefox 70 (on the right) is a lot smaller when a tab is loading in the background:
And in this screenshot with the “Show opaque regions” feature, you can see that Firefox now marks most of the window as opaque (green):
We are planning to build on this work to improve other browsing use cases: Scrolling and full screen video can be made even more efficient by using Core Animation in smarter ways. We are targeting WebRender for these further optimizations. This will allow us to ship WebRender on macOS without a power regression.
We implemented these changes with over 100 patches distributed among 28 Bugzilla bugs. Matt Woodrow reviewed the vast majority of these patches. I would like to thank everybody involved for their hard work. Thanks to Firefox contributor Mark, who identified the severity of this problem early on, provided sound evidence, and was very helpful with testing. And thanks to all the other testers who made sure this change didn’t introduce any bugs, and to everyone who followed along on Bugzilla.
During the research phase of this project, the Chrome source code and the public Chrome development notes turned out to be an invaluable resource. Chrome developers (mostly Chris Cameron) had already done the hard work of comparing the power usage of various rendering methods on macOS. Their findings accelerated our research and allowed us to implement the most efficient approach right from the start.
Questions and Answers
Are there similar problems on other platforms?
Firefox uses partial compositing on some platforms and GPU combinations, but not on all of them. Notably, partial compositing is enabled in Firefox on Windows for non-WebRender, non-Nvidia systems on reasonably recent versions of Windows, and on all systems where hardware acceleration is off. Firefox currently does not use partial compositing on Linux or Android.
OpenGL on macOS is deprecated. Would Metal have posed similar problems?
In some ways yes, in other ways no. Fundamentally, in order to get Metal content to the screen, you have to use Core Animation: you need a CAMetalLayer. However, there are no APIs for partial updates of CAMetalLayers either, so you’d need to implement a solution with smaller layers similarly to what was done here. As for Firefox, we are planning to add a Metal back-end to WebRender in the future, and stop using OpenGL on machines that support Metal.
Why was this only a problem now? Did power usage get worse in Firefox 57?
As far as we are aware, the power problem did not start with Firefox Quantum. The OpenGL compositor has always been drawing the entire window ever since Firefox 4, which was the first version of Firefox that came with hardware acceleration. We believe this problem became more serious over time simply because screen resolutions increased. The switch to retina resolutions in particular was a big jump in the number of pixels per window.
What do other browsers do?
Chrome’s compositor tries to use Core Animation as much as it can and has a fallback path for some rare unhandled cases. And Safari’s compositor is entirely Core Animation based; Safari basically skips step 2.
Why does hardware accelerated rendering have such a high power cost per pixel?
The huge degree to which these changes affected power usage surprised us. We have come up with some explanations, but this question probably deserves its own blog post. Here’s a summary: At a low level, the compositing work in step 2 and step 3 is just copying of memory using the GPU. Integrated GPUs share their L3 cache and main memory with the CPU, so they also share the memory bandwidth. Compositing is mostly memory bandwidth limited: The destination pixels have to be read, the source texture pixels have to be read, and then the destination pixel writes have to be pushed back into memory. A screen’s worth of pixels takes up around 28MB at the default scaled retina resolution (1680×1050@2x). This is usually too big for the L3 cache; for example, the L3 cache in my machine is 8MB. So each screenful of one layer of compositing takes up 3 * 28MB of memory bandwidth. My machine has a memory bandwidth of ~28GB/s, so each screenful of compositing takes about 3 milliseconds. We believe that the GPU runs at full frequency while it waits for memory. So you can estimate the power usage by checking how long the GPU runs each frame.
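The arithmetic behind that estimate can be reproduced with a few lines of Python. The resolution, the factor of three, and the ~28GB/s bandwidth come from the paragraph above; the figure of 4 bytes per pixel is an assumption (32-bit color), and the function names are made up for this sketch.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit pixels (e.g. BGRA)


def screen_bytes(logical_width, logical_height, scale):
    """Size in bytes of one full-screen buffer at the given backing scale."""
    return (logical_width * scale) * (logical_height * scale) * BYTES_PER_PIXEL


def composite_time_ms(buffer_bytes, bandwidth_bytes_per_s, passes=3):
    """Time to move `passes` buffers' worth of data through memory.

    The three passes are: destination read, source read, destination write.
    """
    return passes * buffer_bytes / bandwidth_bytes_per_s * 1000


size = screen_bytes(1680, 1050, 2)
time_ms = composite_time_ms(size, 28e9)
print(f"{size / 1e6:.1f} MB per screenful")               # 28.2 MB per screenful
print(f"{time_ms:.1f} ms per full-screen composite")       # 3.0 ms per full-screen composite
```

At 60 frames per second a frame lasts about 16.7 ms, so under these assumptions a single full-screen layer of compositing keeps the memory bus busy for a noticeable fraction of every frame.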
How does this affect WebRender’s architecture? Wasn’t the point of WebRender to redraw the entire window every frame?
These findings have informed substantial changes to WebRender’s architecture. WebRender is now adding support for native layers and caching, so that unnecessary redraws can be avoided. WebRender still aims to be able to redraw the entire window at full frame rate, but it now takes advantage of caching in order to reduce power usage. Being able to paint quickly allows more flexibility and fewer performance cliffs when making layerization decisions.
Details about the measurements in the chart:
These numbers were collected from test runs on a MacBook Pro (Retina, 15-inch, Early 2013) with an Intel HD Graphics 4000, on macOS 10.14.6, with the default Firefox window size of a new Firefox profile, at a resolution of 1680×1050@2x, at medium display brightness. The numbers come from the PKG measurement as displayed by the Intel Power Gadget app. The power usage in the idle state on this machine is 3.6W, so we subtracted the 3.6W baseline from the displayed values for the numbers in the chart in order to get a sense of Firefox’s contribution. Here are the numbers used in the chart (after the subtraction of the 3.6W idle baseline):
When I moved to Knoxville, Tennessee in 1997, it didn’t take long for me to be exposed to one of the best things about the South: Waffle House. It’s not that Waffle House food is particularly great, though there is a place in the world for eggs, hash browns, and toast, competently done, for a low price. It’s that it is the ultimate comfort institution, one that openly caters to working-class people of all races without any pretense while creating a space where people seem fairly comfortable being together with others who are different from them. That is very much not to say that recent racial incidents haven’t taken place in them, a sign of the resurgent racism across the country. That they are all institutionally clean doesn’t hurt. Anyway, this is a good photo essay about the views from various Waffle House joints and what it means to the South.
Why Waffle House? Why not McDonald’s or Hardee’s? Three reasons: consistency, personal relationship, and the chain’s iconic status.
I felt that I needed a constant from which to study our built environment, and the relative sameness of Waffle House restaurants allowed me that ability. Whether you like it or not, Waffle House is your neighborhood diner, replicated thousands of times over. The restaurants are relatively the same, architecturally speaking, as are the menu, the prices, and the experience. This replication of experience was the conceptual underpinning of this project, and that repetition is illustrated in the images.
While the interiors would vary depending on the age of the restaurant, the signifiers that these photographs are made from within Waffle Houses are consistent: the iconic globe lights, the red vinyl booths, the semi-opaque blinds and their beaded chains, the identical tabletop arrangements. Those elements were constant wherever I went, as was the service. I visited approximately 60 restaurants in nine different states and never once had a bad experience. Sometimes my bacon was undercooked or I was served white toast instead of wheat, but I was never treated poorly, nor was anyone else.
Second, nearly every Southerner feels a personal connection to Waffle House. How can a cookie-cutter restaurant chain win over the hearts of millions of people? My answer is inclusivity. In my experience, when you walk into a Waffle House, whether it is off a random exit in northern Alabama or downtown Charlotte, North Carolina, you are welcomed. At its best, Waffle House creates a sense of belonging unlike most other places.
Waffle House does not care how much you are worth, what you look like, where you are from, what your political beliefs are, or where you’ve been so long as you respect the unwritten rules of Waffle House: Be kind, be respectful, and don’t overstay when others are waiting for a table. Besides, everyone who has ever stepped foot in a Waffle House has a story to tell: Perhaps it involves a late-night study session in college or a joyous pit stop on the way home from a concert or sporting event. Maybe it was a bad breakup over waffles or an early morning breakfast with your bridal party before your wedding. For me, it is the first time my son tried a chocolate chip waffle. The look on his face when he realized that chocolate and syrup taste great together was one of pure delight and discovery. Or the first time my father took me to a Waffle House around the age of 12. I sat at the counter mesmerized, watching the cooks sling hash browns and respond to shouted orders in what seemed like a secret language. These photographs require that personal relationship. I don’t set the stage for where these photographs are made so much as I witness the greater context of the interaction. Instead, the circumstances are gleaned from the viewer’s past experiences and personal relationship with Waffle House.
I began this work because I have had a long-standing interest in witnessing, understanding, and pushing back against the economic and governmental structures that segregate us by race and class. That might seem extreme, but I have long argued we can see how we treat and value each other by merely looking out the window. The proof is in our tax codes, zoning laws, and businesses. That was the impetus for this project: What do I see and what does it mean? What is the architecture of poverty? I don’t mean the extreme poverty fetishized every few months in the mainstream media, but the more common poverty that hides in plain sight. The people who live paycheck to paycheck. The families who pay more for their childcare than their mortgage or rent. The commuters who ride public transit for two hours to work a job that is a mere 15 miles away. This country is full of people who work hard yet are forced to work multiple jobs because none pays a living wage, yet those in charge consistently suggest to us that we’re lucky to have such jobs. Those jobs are plentiful, but undesirable. My swing through Kentucky and western Tennessee in mid-December of 2018 was proof of this. It seemed every other fast food establishment in both states had a “Now Hiring” sign displayed. It isn’t a coincidence that most of the businesses captured in these photographs are fast food restaurants and low-end motels. This is not a critique of Waffle House as a corporation — good, inexpensive food that is prepared fast, 24 hours a day, is a desired commodity. But Waffle House’s business model overlaps with those of discount stores such as Dollar General, extended stay motels aimed at families without a steady place to live, and payday lenders.