
Brian de Alwis

Extracting and processing SCM data with bzr-xmloutput

I was recently asked to estimate how long I’d been working on a particular project. Unfortunately I hadn’t been keeping track of my time in any organized way.

Fortunately I realized that, since I like to commit frequently (though nothing like Stephen Turnbull’s commit-on-save!), I could come up with an estimate based on my commit dates.

But I quickly realized that bzr log --line puts the author name before the commit date:

  $ bzr log --line -r -3..
  150: Max Bowsher 2011-02-12 [merge] Fix invalid version_info.
  149: Jelmer Vernooij 2010-12-20 [merge] Fixes most of the remaining test fai...
  148: Gary van der Merwe 2010-10-20 [merge] Ignore build folder created by se...
  147: Martin 2010-09-09 [merge] Import xml escaping function through local mo...

The spaces could make extracting the date a bit fragile.

Fortunately I remembered the bzr-xmloutput plugin, which makes processing this kind of information really easy. bzr-xmloutput adds an “--xml” option to many of the standard bzr commands that encodes the output as an XML document. Combined with XMLStarlet, a command-line XML tool that provides XSLT/XPath processing (amongst other things), I was able to cook up a recipe in a matter of minutes:

  $ bzr log --xml \
  | xml sel -t -m '/logs' -m '//log' \
    -v 'substring-before(substring-after(timestamp," ")," ")' -n \
  | sort -u \
  | wc -l

The nested substring-after() and substring-before() calls are required to pull out the date, as bzr-xmloutput prints timestamps as ‘Day YYYY-MM-DD HH:MM:SS offset’. awk would have worked just as well.
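The same count can be sketched in Python using only the standard library. The sample XML here is a made-up, cut-down stand-in for what bzr log --xml emits: only the logs/log/timestamp nesting and the ‘Day YYYY-MM-DD HH:MM:SS offset’ format are taken from the recipe above, and count_commit_days is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the logs/log/timestamp nesting and the
# 'Day YYYY-MM-DD HH:MM:SS offset' format come from the recipe above.
SAMPLE = """<logs>
  <log><timestamp>Sat 2011-02-12 10:15:00 +0000</timestamp></log>
  <log><timestamp>Sat 2011-02-12 18:40:12 +0000</timestamp></log>
  <log><timestamp>Mon 2010-12-20 09:01:33 +0100</timestamp></log>
</logs>"""


def count_commit_days(xml_text):
    """Count the distinct calendar dates that appear in log timestamps."""
    root = ET.fromstring(xml_text)
    days = set()
    # iter() finds <log> elements at any depth, much like '//log' in XPath.
    for log in root.iter("log"):
        timestamp = log.findtext("timestamp")
        if timestamp:
            # The second whitespace-separated field is the date.
            days.add(timestamp.split()[1])
    return len(days)


if __name__ == "__main__":
    print(count_commit_days(SAMPLE))
```

In practice the XML would be read from the output of bzr log --xml rather than from a string constant.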

Too easy, thanks to Guillermo and the other bzr-xmloutput contributors! Now I’m thinking of other questions that can be answered…

Jonathan Riddell

What I did on my Rotation

Canonical has a company scheme where after working there for a few years you can rotate to work at another part of the company for 6 months. Having worked on the desktop team for over five years I decided to do a rotation to Bazaar. My hopes for this were to build up my own programming skills by learning more Python and by experiencing different programming practices from the ones I’m used to in KDE.

I started off with some fixes to the developer documentation. This got me used to the process: you cannot commit directly to bzr’s trunk. Instead, all committers are required to make merge proposals on Launchpad, have those approved by a fellow developer, then send them to a programme called Patch Queue Manager, which integrates the patch and runs the test suite to check everything still works.

Next I started fixing a few easy command line UI bugs, improving error messages or stopping exception output and so forth. This got me into the world of writing test cases. Everything in bzr needs a test case; merge proposals will not be accepted without one. As with much of bzr, I find that the test cases lack API documentation and comments, but they turn out to be easy enough to read and similarly easy to write. There are both internal test cases, which run a small part of the code within bzr, and blackbox test cases, which run a bzr command.

Bazaar is the version control system used by top open source project hosting site Launchpad so I was surprised to come across a bug which prevented bzr from talking to Launchpad properly on errors. “This is really important to fix. We need error reporting.” said Jonathan Lange over 2 years before. Pleasingly I could fix it, very satisfying. I had to learn about the hooks mechanism in bzr which shows up some of the downside of Python, you have to guess the arguments to send the hook. But who needs API documentation when you can just read the code? 🙂

Bazaar’s main GUI is qbzr (which provides GUIs for individual commands) and Bazaar Explorer (which provides a complete GUI). I worked with Martin (gz) to make these two talk to the normal Ubuntu crash system, Apport, rather than showing a nasty crash backtrace to the user.

Then I noticed that Bazaar Explorer has a lot of “Refresh” toolbar buttons about the place, any time you make a change to the file you have to click one before the UI will update. Not very user friendly. So I added file watchers about the place to make it magically update. Nifty, except that after release it turns out this breaks horribly when doing some commands outside of Bazaar Explorer, oops. Quick fix and message to packagers, hang head in shame.

The first large feature I worked on was GPG signing of commits. The documentation for Bazaar promised that this was implemented and all you needed to do was set the various options in the config file. Alas, it lied. I fixed up the documentation and started looking into the GPG Python bindings, which turn out to be completely undocumented on the Python side and surprisingly badly documented on the C side. Security-critical code which is badly documented seems scary to me; mistakes could easily be made which go unnoticed until they appear on full-disclosure. But I managed to implement signing and add a GUI to Bazaar Explorer, being cautious as I went.

Bazaar has a scheme called patch pilot where we review patches submitted by the community and help them on their way to being integrated. I started out with this by following John Meinel who can write code faster than I can write English prose. We made small changes to some patches and integrated them, we gave feedback to newer patches that needed some work and we chased up contributors who had not responded. The barrier to entry in Bazaar is pleasingly small, if you don’t have the skills to write a perfect patch it’s encouraged to say so and someone else will finish it off.

Why, I wondered, is bzr (the command line UI to Bazaar) not translated? There were parts of gettext scattered around the code, and some code to extract strings, but it didn’t get used. It turns out this was a half-completed feature that had never been finished. I finished off translations by adding gettext() calls throughout the code, ensuring the tests still pass, fixing the installation of the .mo files and enabling the generation of the .pot file. This missed the 2.4 release so I’m still waiting to see how it works for 2.5. I suspect some strings will be missing the context needed to do a good translation, and of course the occasionally technical output of bzr might need some thought on how to translate, but it should make bzr easier to use for non-English speakers.
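For readers who have not met gettext, marking a string for translation looks roughly like this. It is a minimal sketch with Python’s standard gettext module, not bzr’s actual i18n code: the message text and the describe_missing_branch helper are made up for illustration, and NullTranslations stands in for a real compiled catalogue.

```python
import gettext

# With no compiled catalogue (.mo file) loaded, NullTranslations returns
# every string unchanged, so this sketch runs anywhere.
translations = gettext.NullTranslations()
_ = translations.gettext


def describe_missing_branch(path):
    """Hypothetical user-facing message; not taken from bzr's source."""
    # Wrapping the literal in _() marks it for extraction into the .pot
    # template. The path is substituted after translation so translators
    # see a single stable message.
    return _("Not a branch: %(path)s") % {"path": path}


if __name__ == "__main__":
    print(describe_missing_branch("/tmp/example"))
```

Named placeholders such as %(path)s also let a translator reorder the sentence, which is one way to ease the missing-context problem.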

Ubuntu Distributed Development is the project to put all of Ubuntu’s packages and history into Bazaar branches and change our packaging processes to use Bazaar. This makes a lot of sense: the Ubuntu archive is already a primitive revision control system (you upload for each new version, and often it’s useful to look at older versions). This project has been a long time coming and is one of the original reasons why Canonical started Bazaar back in the day. It suffers from a number of problems, notably the failure of quite a lot of packages to import into Bazaar, including currently the whole of KDE due to a patch into openSUSE’s bz2 package. Also, the quilt patch system we use tends to clash with being held within a revision control system, so you end up with diffs of diffs. I tend to think it would have been an easier win to import only the debian/ packaging into Bazaar branches.

I tidied up the new Ubuntu Packaging Guide which is a guide to packaging with UDD branches (named in the hope that UDD will soon become the definitive way to do packaging). I also added a new command bzr get-orig-source to make it easier to do packaging in the current directory rather than a separate directory as used by bzr builddeb. I also added a hook to set the bzr changelog from the debian/changelog entry which is the current behaviour with debcommit. I got mixed feedback on this so I added a config option to disable it too. I also tidied up some of the bzr-builddeb code by removing weird terms like “larstiq” and removing acronyms by default.

My Python programming has improved a lot and I’m a convert to the cause of unit tests. Python is a fun and productive language but the lack of culture for documenting APIs is disappointing and being dynamic it’s that much easier to make mistakes without realising it. My productivity is nothing like as high as others on the Bazaar team but it seems I’m better at improving (graphical and command) user interfaces than my colleagues who can memorise internal data structures trivially. My six months is now up, I’ve enjoyed them and now I’m looking forward to getting back into Kubuntu and KDE.

Vincent Ladeuil

The imports must go on!

The package importer is an important piece of Ubuntu Distributed Development. It mirrors source packages and Bazaar branches and relies heavily on Launchpad to achieve that.

The past

During Launchpad downtimes, many (>1000) imports failed and they had to be re-queued semi-manually. The importer would have been better off making tea instead of queuing imports that were bound to fail.

The circuit breaker

An automatically operated electrical switch designed to protect an electrical circuit <…> a circuit breaker can be reset (either manually or automatically) to resume normal operation.

This looks like a good candidate to avoid import failures while Launchpad is down.

In this automaton representing the behaviour of a circuit breaker, three events are used (remember that here closed == works ;)):

  • attempt: we try to use the circuit,
  • failure: an undesired event has occurred,
  • success: the circuit is working.

The main scenario here is:

closed -- failure --> open -- attempt --> half open -- success --> closed
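The main scenario can be sketched as a small state machine in Python. Only the three events and the closed / open / half open path come from the description above. How the remaining event and state combinations behave (for instance, a failure while half open re-opening the circuit) is an assumption of this sketch, as are the class and function names.

```python
class CircuitBreaker:
    """Three-state breaker; 'closed' means the circuit works."""

    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half open"

    def __init__(self):
        self.state = self.CLOSED

    def attempt(self):
        # An attempt on an open circuit is a probe: move to half open.
        # In the other states an attempt leaves the state unchanged.
        if self.state == self.OPEN:
            self.state = self.HALF_OPEN

    def failure(self):
        # Assumption: a failure in any state, including a failed probe
        # from half open, opens the circuit.
        self.state = self.OPEN

    def success(self):
        # Assumption: a success in any state closes the circuit.
        self.state = self.CLOSED


def run(events):
    """Feed event names to a fresh breaker and return every state seen."""
    breaker = CircuitBreaker()
    states = [breaker.state]
    for event in events:
        getattr(breaker, event)()
        states.append(breaker.state)
    return states


if __name__ == "__main__":
    print(run(["failure", "attempt", "success"]))
```

For the importer, “attempt” would correspond to trying an import and “failure” to noticing that Launchpad is unreachable.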

The reality test

A Launchpad rollout happened Friday 30 September 08:32. The importer log file said:

2011-09-30 08:32:02,308 - __main__ - INFO - Launchpad is down, re-trying jcifs

2011-09-30 08:34:09,337 - __main__ - INFO - Launchpad *is* back

The successful import took 27 seconds, so the importer knew Launchpad was down for 1 minute 40 seconds (back - down - duration(import)). I asked the Launchpad admins how long it took them and their log said:

2011-09-30 08:33:41 INFO    Outage complete. 0:01:40.919527
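The importer’s figure can be checked with a few lines of Python. The two timestamps are copied from the importer log above and the 27-second import duration from the text; the variable names are only illustrative.

```python
from datetime import datetime, timedelta

# asctime layout used in the importer log: date, time, then milliseconds.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

down = datetime.strptime("2011-09-30 08:32:02,308", LOG_FORMAT)
back = datetime.strptime("2011-09-30 08:34:09,337", LOG_FORMAT)
import_duration = timedelta(seconds=27)

# back - down - duration(import)
outage = back - down - import_duration

if __name__ == "__main__":
    print(outage)
```

The result is 1 minute 40.029 seconds, within a second of the 0:01:40.919527 that Launchpad reported.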

Make tea… or not

Another interesting number here is that we retried 498 times during this downtime. This is probably excessive and can be fixed by reducing the importer concurrency while Launchpad is down. These 498 attempts were previously seen as failures for 498 different packages.

In the end, not only did we avoid these 498 spurious failures, but the imports were only suspended for as long as Launchpad was down, to the second!

But that’s a bit short to make tea…

Brian de Alwis

tiplog: record and reference the history of a branch’s tip

It’s sometimes useful to be able to revert a branch to a previous known state. For example, I recently updated a bzr plugin to its latest and greatest, only to discover a severe regression. If I had had some foresight, I might have recorded the revision (or the “tip”) before the update to allow me to roll back to the previous stable version. But as I rarely have such foresight, and have more important uses for my little grey cells, I set out to create ‘tiplog’, a new bzr plugin for recording and referencing the history of a branch’s tip.

‘tiplog’ is inspired by git’s ‘reflog’, and records commits, uncommits, pushes (to the branch) and pulls (into the branch): basically any change that causes the tip to change. ‘tiplog’ only records pushes to local branches. The plugin can also be run within the smart server, although there it cannot distinguish the causes of a tip change.

In my updating scenario described above, for example, say the plugin was at revision 20. Running ‘bzr pull’ pulls in revision 40. ‘bzr tiplog’ will inform me that my plugin was previously at r20:

$ bzr tiplog
2011-09-23 tip:0 40 [pull] update version numbers
2011-08-25 tip:1 20 [pull] fix bug #12009
2011-06-23 tip:2 1 [pull] initial commit

Better yet, I can easily return to that previous stable revision using the new ‘tip:’ revspec, with ‘bzr pull -r tip:1’. ‘tip:0’ is the current tip.
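The idea behind the ‘tip:N’ numbering can be sketched in a few lines of Python. This is a hypothetical model of a newest-first tip history, not the plugin’s actual code or storage format; the TipLog class and its method names are invented for illustration, and the revision numbers replay the example above.

```python
class TipLog:
    """Newest-first history of a branch's tip (hypothetical sketch)."""

    def __init__(self):
        self._entries = []  # index 0 is always the current tip

    def record(self, revno, cause, summary):
        # Every change of tip is pushed onto the front of the list,
        # so older tips shift to higher indexes.
        self._entries.insert(0, (revno, cause, summary))

    def resolve(self, revspec):
        """Map a 'tip:N' revspec to the recorded revision number."""
        prefix, _, index = revspec.partition(":")
        if prefix != "tip" or not index.isdigit():
            raise ValueError("not a tip revspec: %r" % revspec)
        return self._entries[int(index)][0]


# Replay the three pulls from the example listing.
log = TipLog()
log.record(1, "pull", "initial commit")
log.record(20, "pull", "fix bug #12009")
log.record(40, "pull", "update version numbers")

if __name__ == "__main__":
    print(log.resolve("tip:1"))
```

Here tip:0 resolves to revision 40 and tip:1 to revision 20, matching the listing.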

I also find tiplog useful when developing with others, as I can quickly review the changes since I last pulled:

$ bzr log --line -r tip:1..
17705: 2011-09-27 fixes #1474
17704: 2011-09-27 fixing Job Cost
17703: 2011-09-25 fixes #1377

And in fact I have that bound to an alias.

To install tiplog, simply perform the following:

$ bzr branch lp:bzr-tiplog ~/.bazaar/plugins/tiplog

Please report any bugs or questions through the plugin’s Launchpad page.

Jonathan Riddell

qbzr with curves

Nice little visual change to qbzr: curves on the diff view.



Thanks to Iwata Hidetaka.

Being bored of the IRC poll on I made a new poll for revision control systems. I’m glad to see that after one vote Bazaar is at 100%.

Jonathan Riddell

bzr starts speaking your language

Bug 83941 “bzr doesn’t speak my tongue” has been closed: bzr core can now be translated. (The qbzr and bzr-explorer GUIs have been internationalized for a couple of years.) If you want to help bring bzr to those who prefer to work in non-English languages, please help translate at Launchpad.

The translation will involve quite a bit of specialist language (what is French for “colocated branch”?) and I expect there are still strings that need to be added to the translation file. I also need to look at translations for plugins. Please send issues to either the Bazaar mailing list or as bugs on bzr on Launchpad.

Philippe Lhoste wrote a while ago about the issues of translating DVCS terminology.

Martin Pool

fault-line: bzr plugin to guess relevant test modules

Aaron has announced a bzr plugin fault-line:

It works by looking at previous revisions where the file was changed,
and seeing what test files were changed at the same time. You can
specify the files, or it will autodetect them by looking at your working

For example:

$ bzr faultline


The bzr-beta-ppa Ubuntu PPA for Beta Users has moved

As the final step of consolidating all of the official Bazaar PPAs on Launchpad under one Launchpad team, the Bazaar Beta PPA formerly found at has moved to live under the main ~bzr team at If you are a user and tester of Bazaar beta releases via this PPA, you will need to update your APT sources.list lines; you can see the new sources.list lines under the “Technical details about this PPA” section at the above link.


bzr-2.4 faster large tree handling

The last of my patches is queued up to land, so I figured I’d post an update about the performance improvements I’ve been working on. I’m also just excited about how well it has all come together.

There were essentially 3 changes that mattered for performance on large trees.

  1. Fixing iter_entries_by_dir() to preload the data in Repository-optimal ordering rather than by-request ordering. In large trees this was causing us to thrash and become pathologically slow. In the 70,000-file test tree, the thrashing version took about 3 minutes; the preloading version takes about 15s. This affected a lot of our commands, though I guess the next two fixes would actually reduce the number of commands affected by this.
  2. Fixing several code paths to use the optimized iter_changes() rather than the generic iter_changes(). The generic path walks both inventories with iter_entries_by_dir() and compares them. Our 2a format Repository can do iter_changes without loading the whole tree. (It internally uses a hash_trie to store the inventory, and so nodes with matching sub-trees can be skipped for comparison.) This generally shows up as something that was taking 15s (to load the whole inventory) dropping to <2s for the improved comparison. (bzr revert and bzr pull were both directly impacted here.)
  3. Changing WT.set_parent_trees([one_tree]) to update itself using current_basis.iter_changes(one_tree), rather than setting the state from scratch. This basically adds another case where we can avoid reading the whole inventory state again, which is another 15s to <2s sort of change. This only showed up after fixing (2), because once the tree is loaded, the other actions are generally pretty quick. (bzr up, bzr pull)
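Why matching sub-trees can be skipped is easier to see in a toy version. This Python sketch is not bzr’s inventory code: the Node class, its hashing scheme and this stand-alone iter_changes function are all invented for illustration. Each node’s hash covers everything beneath it, so two nodes with equal hashes need no further comparison.

```python
import hashlib


class Node:
    """A tree node whose hash covers its value and all of its children."""

    def __init__(self, value=None, children=None):
        self.value = value              # file content marker for leaves
        self.children = children or {}  # name -> Node for directories
        digest = hashlib.sha1(repr(value).encode("utf-8"))
        for name in sorted(self.children):
            digest.update(name.encode("utf-8"))
            digest.update(self.children[name].hash.encode("ascii"))
        self.hash = digest.hexdigest()


def iter_changes(old, new, path="", visited=None):
    """Yield changed leaf paths, skipping sub-trees whose hashes match."""
    if visited is not None:
        visited.append(path)  # record which nodes were actually examined
    if old is not None and new is not None and old.hash == new.hash:
        return  # identical sub-tree: nothing below it is read
    old_children = old.children if old is not None else {}
    new_children = new.children if new is not None else {}
    names = sorted(set(old_children) | set(new_children))
    if not names:
        yield path  # a leaf was added, removed or modified
        return
    for name in names:
        yield from iter_changes(old_children.get(name),
                                new_children.get(name),
                                path + "/" + name, visited)


def build(a_version):
    """Small example tree; only src/a.py varies between versions."""
    return Node(children={
        "src": Node(children={"a.py": Node(a_version), "b.py": Node("v1")}),
        "doc": Node(children={"readme": Node("v1")}),
    })


if __name__ == "__main__":
    print(list(iter_changes(build("v1"), build("v2"))))
```

Passing a list as visited shows that nothing under doc is examined once the hashes of the two doc nodes are seen to match; in a 70,000-file tree that pruning is where the saving comes from.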

This is the chart I put together for “whats-new-in-2.4.txt”. bzr-2.3.2 will have fix (1), but not (2) or (3), to give a feel for how much of an impact different fixes have had.

    bzr-2.3.1 bzr-2.3.2 bzr-2.4  action
    3m39s         1m08s   1m03s  bzr co --lightweight
      38s            8s      2s  bzr revert (in a clean tree)
    4m47s         3m56s     15s  bzr merge
    4m45s           20s      3s  bzr pull
    4m58s         3m00s      2s  bzr up
    9m33s           21s     19s  bzr uncommit (including a merge)
    4m44s           17s      2s  bzr uncommit (simple commit)

So yes, some operations that were taking almost 5 minutes have now dropped down to taking <3s.

You won’t see that dramatic of an improvement for smaller trees, though most cases will have a pleasant improvement. Here is a short list for the ‘Launchpad’ tree (with ~8k items).

    bzr-2.3.1   bzr-2.4     action
    5.3s        5.2s        bzr co --lightweight
    0.9s        0.3s        bzr revert
    1.4s        0.4s        bzr pull
    3.9s        3.7s        bzr uncommit (with merge)
    0.9s        0.3s        bzr uncommit (without merge)

Anyway, I’m quite happy about how much better bzr-2.4 will be in large trees.

Update: Add graphs…

Martin Pool

new shelve/unshelve GUI

Iwata is working on a new GUI for interactively shelving and unshelving changes, which is a way in Bazaar for temporarily setting aside some changes from your working tree. (At the moment there is an interactive text interface for shelving.)