Tuesday, April 19, 2011

Load from March 24th to April 19th

Last week I wrote a post about how high our load was that day, and to let people know that we are looking into mitigating the bad wait times we have been seeing.

We know that we need more slaves, but we also know that our masters are hitting edge cases and not behaving optimally. We now believe that bug 592244 is behind some chunk of the wasted CPU, because it causes some jobs to run twice. The problem is that we have several masters querying a scheduling master, and sometimes the same job ends up running on two different masters. catlee has done a great job chasing this down (it would have been hard for us to narrow down the issue without his help), and we hope that fixing it will significantly improve the wait times. If it does not help us enough to get by, we will have to go back and chase other edge cases in our masters. Meanwhile, IT and releng are still working on getting the next pool of test slaves.
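To illustrate the kind of race described above, here is a minimal sketch of how a job can be claimed atomically so that only one master runs it. This is not the actual fix for bug 592244, nor the real scheduler schema: the `requests` table, the `claimed_by` column, the master names and the `claim_request` helper are all made up for the example.

```python
import sqlite3


def claim_request(conn, request_id, master_name):
    """Try to claim a pending request; return True only for the master that wins.

    Hypothetical helper: the UPDATE only matches rows that nobody has claimed
    yet, so a second master asking for the same request changes zero rows.
    """
    cur = conn.execute(
        "UPDATE requests SET claimed_by = ? WHERE id = ? AND claimed_by IS NULL",
        (master_name, request_id),
    )
    conn.commit()
    return cur.rowcount == 1


# Stand-in for the scheduling database (assumed schema, in memory).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (id INTEGER PRIMARY KEY, claimed_by TEXT)")
conn.execute("INSERT INTO requests (id) VALUES (1)")
conn.commit()

# Two masters ask for the same request; only the first claim succeeds.
first = claim_request(conn, 1, "master-a")
second = claim_request(conn, 1, "master-b")
```

The point is that the check ("is it unclaimed?") and the write ("it is mine now") happen in a single statement, so two masters cannot both see the request as free and both start it.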

And now back to the load (link to page with raw data):
  • on the 11th we handled 138 pushes across all branches (the day before the aurora merge)
  • the try server accounted for 47.5%, mozilla-central for 16.9% and cedar for 11.2% (/me looks at ehsan) of the whole load
Conclusions:
  • even though we had the trip to Las Vegas, the all-hands and the platform work week, we have had a very high load since we shipped Firefox 4
I wonder what the distribution from April 18th to the end of the month will look like, since it should be more representative of normal development.

For the next post I should grab only the weekdays and overlay them to see how things look from week to week.
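A rough sketch of how that weekday-only view could be put together, assuming the raw data can be read as a mapping from date to number of pushes. The `weekdays_by_week` helper and the input shape are my own assumptions, and the weekend value in the sample is only a placeholder; the 138 pushes on April 11th is the one real number, taken from above.

```python
from collections import defaultdict
from datetime import date


def weekdays_by_week(pushes_per_day):
    """Group Monday-to-Friday push counts by ISO week.

    Returns {(iso_year, iso_week): [Mon, Tue, Wed, Thu, Fri]}, with None
    for weekdays that have no data. Saturdays and Sundays are dropped.
    """
    weeks = defaultdict(lambda: [None] * 5)
    for day, pushes in pushes_per_day.items():
        if day.weekday() < 5:  # 0 = Monday ... 4 = Friday
            iso_year, iso_week, _ = day.isocalendar()
            weeks[(iso_year, iso_week)][day.weekday()] = pushes
    return dict(weeks)


sample = {
    date(2011, 4, 9): 0,     # Saturday, placeholder value, gets dropped
    date(2011, 4, 11): 138,  # Monday, the figure quoted in this post
}
by_week = weekdays_by_week(sample)
```

Each week becomes a row of five values, so the rows can be plotted on top of each other to compare Monday with Monday, Tuesday with Tuesday, and so on.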


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
