Australis performance report

Mike Conley mconley at
Wed Sep 18 21:57:02 UTC 2013

Hey firefox-dev,

I just wanted to give an update on what's going on with the Australis 
project with respect to performance.

Here's where we are:

1) As of about a month ago, the tpaint regression that we were causing 
(bug 889758) has been neutralized.
2) The ts_paint regression (bug 880611) has dropped to about 1-3%, and 
that remaining 1-3% is possibly related to the tab animation regression.

Which brings us to our current focus:

# The Tab Animation Regression Test (TART)

Recently, through the hard work of avih and jmaher, the TART talos test 
landed. We're currently getting measurements for that test on each push, 
and that data is visible on Datazilla and Graph Server. This test has 
already been successful in detecting and preventing tab animation 
regressions on mozilla-central, and we're all pretty happy about that.

The TART test is actually composed of several smaller tests that 
exercise and measure various dimensions of tab performance. If anybody 
is interested in what exactly we're testing and measuring for the tab 
animations, just respond to this email indicating your interest, and 
I'll follow up with a more thorough explanation.

But enough talk - here are some graphs showing how UX is doing on TART. 
These graphs compare the mozilla-central PGO builds against UX PGO 
builds on each of those TART subtests. Assuming everybody gets the same 
colours, the green is m-c, and the purple is UX.

Win 8:
Win 7:
Win XP:
OS X 10.8:
OS X 10.7.2:
OS X 10.6.8:
Ubuntu 12.04:

So the first thing to note is that we're doing really well on Win 8 and 
Win 7 across the board on those subtests. The platform with the most 
users that shows a regression is Windows XP, so that's where we've been 
putting our focus.

You'll notice that XP's TART data goes back a bit further than the other 
platforms. This is because while hunting for the cause of the 
regression, we decided to backfill our TART data to sometime in 
mid-April. This infusion of historical data, in theory, should help us 
locate the points where regressions were introduced.

There's already one of those points visible - the first graph in the XP 
breakdown is icon-close-DPI1.all.TART. This is a measure of the average 
frame interval for the tab close animation when there is a favicon, at 
DPI level 1. In mid-July, there's a spike, where suddenly the UX data 
set overtakes the m-c data set. That regression is being tracked and 
investigated in bug 916946.
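
For anyone who wants to hunt for that kind of crossover point in their own copy of the data, here is a minimal sketch. It is not how Datazilla or Graph Server does it; the function name `first_overtake` and the sample numbers are made up purely for illustration, and it assumes both series are aligned push-for-push.

```python
from typing import Optional, Sequence


def first_overtake(baseline: Sequence[float],
                   candidate: Sequence[float]) -> Optional[int]:
    """Return the index of the first push where `candidate` rises above
    `baseline` after having been at or below it on the previous push.

    Returns None if no such crossover exists.
    """
    for i in range(1, min(len(baseline), len(candidate))):
        was_at_or_below = candidate[i - 1] <= baseline[i - 1]
        is_above = candidate[i] > baseline[i]
        if was_at_or_below and is_above:
            return i
    return None


# Hypothetical frame intervals (ms) per push; NOT real TART data.
mc = [2.0, 2.1, 2.0, 2.1, 2.0]
ux = [1.9, 2.0, 2.0, 2.6, 2.7]

print(first_overtake(mc, ux))  # 3
```

Real talos data is noisy, so in practice you'd probably want to smooth both series or require the gap to persist for a few pushes before calling it a regression.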

Things are never this easy, however. Through the heroics of our A-Team, 
we've successfully gotten that backfilled data into Datazilla. However, 
there's a problem with the calculation and plotting of the averages for 
some of those old values. Notice, for example, that changeset 
0b6c881ba74c in icon-close-DPI1.all.TART (dated July 19th) reports an 
average of 1.5, but a minimum of 1.55. That's simply not possible. The 
average is actually closer to 2.16. The numbers later in the graph are 
more trustworthy - it's the stuff earlier on (the estimate is sometime 
before August 12th) where things get a little shakier. Apparently, our 
backfilling is conflicting with some pre-existing data, and repairing 
that would involve dumping and rebuilding the Datazilla tables. Joel 
Maher can supply more details on all of this if there are further 
questions on the Datazilla front.
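
The sanity check behind "that's simply not possible" is just that an average can never fall below the minimum (or above the maximum) of the values it was computed from. A minimal sketch of that check, using only the numbers quoted above; the function name `average_is_plausible` is hypothetical and this is not part of Datazilla:

```python
from typing import Optional


def average_is_plausible(minimum: float,
                         average: float,
                         maximum: Optional[float] = None) -> bool:
    """Return True if `average` lies within [minimum, maximum].

    `maximum` is optional because the email above only quotes a
    minimum and an average for the suspicious data point.
    """
    if average < minimum:
        return False
    if maximum is not None and average > maximum:
        return False
    return True


# Values quoted for changeset 0b6c881ba74c.
print(average_is_plausible(minimum=1.55, average=1.5))   # False
print(average_is_plausible(minimum=1.55, average=2.16))  # True
```

Running something like this over the backfilled rows would be one way to flag which data points need to be recomputed.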

There are a number of smaller regressions also visible in the Datazilla 
graphs. They're nowhere near as dramatic as the tab close regressions, 
but still detectable. We'll be evaluating their significance over the 
next day or so.

Anyhow, thought I'd put the cards on the table, since we've been pretty 
heads down on all of this.



[1]: "Ask me anything" - feel free to fire questions about this stuff my 
way. If I don't know the answers, I'll find someone who does and sic 
them on the thread.
