The package importer is an important piece of Ubuntu Distributed Development. It mirrors source packages and Bazaar branches, and it relies heavily on Launchpad to do so.

The past

During Launchpad downtimes, many (>1000) imports failed and had to be re-queued semi-manually. The importer would have been better off making tea than queuing imports that were bound to fail.

The circuit breaker

An automatically operated electrical switch designed to protect an electrical circuit <…> a circuit breaker can be reset (either manually or automatically) to resume normal operation.

This looks like a good candidate to avoid import failures while Launchpad is down.

The automaton representing the behaviour of a circuit breaker has three states (closed, open and half open) and uses three events (remember that here closed == works ;)):

  • attempt: we try to use the circuit,
  • failure: an undesired event has occurred,
  • success: the circuit is working.

The main scenario here is:

closed --failure--> open --attempt--> half open --success--> closed
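To make the automaton concrete, here is a minimal Python sketch of it. This is not the importer's actual code: the CircuitBreaker class and its method names are hypothetical, and the transitions outside the main scenario (for instance a failure while half open re-opening the circuit) are my assumption of the usual circuit breaker behaviour.

```python
class CircuitBreaker:
    """Hypothetical three-state breaker: closed (works), open, half open."""

    def __init__(self):
        # A new circuit is assumed to work.
        self.state = 'closed'

    def attempt(self):
        # Trying to use an open circuit is a probe: move to half open.
        # Attempts in the other states leave the state unchanged.
        if self.state == 'open':
            self.state = 'half open'

    def failure(self):
        # Assumption: any failure opens the circuit, including a failed probe.
        self.state = 'open'

    def success(self):
        # A success means the circuit works: close it.
        self.state = 'closed'


# Replay the main scenario and record each state we go through.
breaker = CircuitBreaker()
trace = [breaker.state]
for event in (breaker.failure, breaker.attempt, breaker.success):
    event()
    trace.append(breaker.state)
print(trace)
```

In the importer's terms, a failure would be a Launchpad error during an import, and an attempt on an open circuit would be the single retry used to find out whether Launchpad is back.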

The reality test

A Launchpad rollout happened on Friday 30 September 2011 at 08:32. The importer log file said:

2011-09-30 08:32:02,308 - __main__ - INFO - Launchpad is down, re-trying jcifs

2011-09-30 08:34:09,337 - __main__ - INFO - Launchpad *is* back

The successful import took 27 seconds, so the importer saw Launchpad as down for 1 minute 40 seconds (back - down - duration(import)). I asked the Launchpad admins how long the outage lasted on their side, and their log said:
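For the curious, the arithmetic can be checked with a few lines of Python. This is only a sketch: the variable names are mine, the two timestamps are copied from the importer log above, and the import duration is the rounded 27 seconds quoted in the text, so the sub-second part of the result should not be taken too seriously.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the importer log lines quoted above.
FMT = '%Y-%m-%d %H:%M:%S,%f'

down = datetime.strptime('2011-09-30 08:32:02,308', FMT)
back = datetime.strptime('2011-09-30 08:34:09,337', FMT)
# Rounded value from the text, not an exact measurement.
import_duration = timedelta(seconds=27)

# back - down - duration(import)
observed = back - down - import_duration
print(observed)
```

The result is a timedelta of about 1 minute 40 seconds, to be compared with the duration reported by Launchpad itself.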

2011-09-30 08:33:41 INFO    Outage complete. 0:01:40.919527

Make tea… or not

Another interesting number here is that we retried 498 times during this downtime. This is probably excessive and can be fixed by reducing the importer concurrency while Launchpad is down. Before the circuit breaker, these 498 attempts would have been recorded as failures for 498 different packages.

In the end, not only did we avoid these 498 spurious failures, but the imports were suspended only for as long as Launchpad was down, to the second!

But that’s a bit short to make tea…