Using getdelta to reduce size of distfiles download

Yesterday, Stefan Schweizer (a developer who I work with) brought a very cool thing to my attention: getdelta.

getdelta is based on deltup, which is a way of storing and applying differences between files. It is kind of like diff and patch, but designed for binary files such as compressed tarballs.

deltup is very useful in the context of upgrading distfiles, because typically very little changes between foobar-0.1 and foobar-0.2, so a deltup diff file which could upgrade foobar-0.1.tar.bz2 to foobar-0.2.tar.bz2 would be a much smaller download than downloading the entire foobar-0.2.tar.bz2 file.
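
To make the idea concrete, here is a minimal Python sketch of delta encoding. It is not how deltup works internally: the function names and the copy/data instruction format are made up for illustration, and difflib would be far too slow for real tarballs. It only shows why a patch between two similar files can be much smaller than the new file itself.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus the literal bytes that changed."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged region: store only the offsets into the old file.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Changed or new region: these bytes have to be downloaded.
            delta.append(("data", new[j1:j2]))
        # "delete" needs no entry: those old bytes are simply never copied.
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new file from the old file and the delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(delta: list) -> int:
    """Count the bytes of new data carried inside the delta."""
    return sum(len(op[1]) for op in delta if op[0] == "data")


if __name__ == "__main__":
    # Hypothetical stand-ins for two releases of the same source tree.
    old = b"".join(b"line %d of foobar\n" % i for i in range(100))
    new = old.replace(b"line 42 of foobar", b"line 42 of foobar, fixed")
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(len(new), "bytes in the new file,",
          literal_bytes(delta), "literal bytes in the delta")
```

Everything that did not change is represented by a pair of offsets, so only the handful of changed bytes needs to travel over the network.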

The magic of getdelta is that it integrates into portage for downloading your distfiles. As an example, I have subversion-1.1.1.tar.bz2 present in /usr/portage/distfiles, but I now want to upgrade to version 1.1.3.

dsd ~ # emerge -f subversion
Calculating dependencies ...done!
>>> emerge (1 of 1) dev-util/subversion-1.1.3-r1 to /
>>> Downloading http://gentoo.blueyonder.co.uk/distfiles/subversion-1.1.3.tar.bz2
Searching for a previously downloaded file in /usr/portage/distfiles

We have following candidates to choose from
subversion-1.1.1.tar.bz2

The best of all is ... subversion-1.1.1.tar.bz2

Checking if this file is OK.

Trying to download subversion-1.1.1.tar.bz2-subversion-1.1.3.tar.bz2.dtu

[...snip the wget download verbosity...]

17:41:57 (490.22 KB/s) - `subversion-1.1.1.tar.bz2-subversion-1.1.3.tar.bz2.dtu' saved [828684]

GOT subversion-1.1.1.tar.bz2-subversion-1.1.3.tar.bz2.dtu

Successfully fetched the dtu-file - let's build subversion-1.1.3.tar.bz2...

subversion-1.1.1.tar.bz2 -> subversion-1.1.3.tar.bz2: OK
cleaning up
This dtu-file saved 5 MB (87%) download size.

>>> subversion-1.1.3.tar.bz2 size ;-)
>>> subversion-1.1.3.tar.bz2 MD5 ;-)
>>> md5 src_uri ;-) subversion-1.1.3.tar.bz2

The above process found that I already had an older subversion tarball downloaded, so it fetched only the deltup patch from the getdelta server and rebuilt the new tarball locally, which saved 87% of my download for subversion-1.1.3. You’ll notice that portage does its usual size and MD5 checking on the rebuilt file, independent of the deltup process, so I don’t see any possible problems relating to tarballs being hijacked.
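
As a quick sanity check on those numbers, here is a small Python sketch. The only figures taken from the log are the .dtu size wget reported (828684 bytes) and the 87% saving; the size of the full tarball is not shown in the log, so it is estimated here from the other two, and the function names are my own.

```python
def saved_fraction(full_size: float, delta_size: float) -> float:
    """Fraction of the download avoided by fetching the delta instead."""
    return 1.0 - delta_size / full_size


def estimated_full_size(delta_size: float, saved: float) -> float:
    """Invert saved_fraction: tarball size implied by a delta size and a saving."""
    return delta_size / (1.0 - saved)


if __name__ == "__main__":
    dtu_bytes = 828684  # size wget reported for the .dtu file
    saved = 0.87        # saving reported by getdelta (rounded, so this is approximate)
    full = estimated_full_size(dtu_bytes, saved)
    print(f"implied tarball size: {full / 2**20:.1f} MiB")
    print(f"bytes not downloaded: {(full - dtu_bytes) / 2**20:.1f} MiB")
```

This works out to a tarball of roughly 6 MiB and a little over 5 MiB not downloaded, which is in line with the "5 MB" figure getdelta printed.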

I did a lot of package upgrades yesterday and for every one where I had an older distfile present and I was watching, getdelta typically saved me 70-95% of the downloading, and only once refused to download the deltup patch because it was bigger than the tarball it was going to construct.

The only disadvantage of this is that constructing the new file can sometimes be time-consuming, especially for the big files. Here’s an example where I construct linux-2.6.11.2.tar.bz2 (~35 MB) from the 2.6.10 tarball:

linux-2.6.10.tar.bz2 -> linux-2.6.11.2.tar.bz2: OK
cleaning up

real 3m42.395s
user 2m2.787s
sys 0m4.462s

Is this any quicker than downloading the whole thing? Perhaps not while I’m staying at uni, where I can get 500 KB/s from UK mirrors, but at home, where I’m on standard/unreliable broadband, it definitely would be. If you are a dialup user, this is a godsend.
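
Here is a rough Python sketch of that trade-off. The tarball size (~35 MB), the rebuild time (the 3m42s "real" figure) and the 500 KB/s link come from the text above. The 80% saving is an assumption, since I did not note the size of the kernel patch, and the 5 KB/s dialup rate is a ballpark guess. The function names are mine, and the timing is treated as rebuild time only.

```python
def full_download_seconds(full_kb: float, kb_per_s: float) -> float:
    """Time to fetch the whole tarball."""
    return full_kb / kb_per_s


def delta_seconds(full_kb: float, saved: float, kb_per_s: float,
                  rebuild_s: float) -> float:
    """Time to fetch only the delta and then rebuild the tarball locally."""
    return full_kb * (1.0 - saved) / kb_per_s + rebuild_s


def break_even_kb_per_s(full_kb: float, saved: float, rebuild_s: float) -> float:
    """Link speed below which the delta route is the faster one."""
    return full_kb * saved / rebuild_s


if __name__ == "__main__":
    FULL_KB = 35 * 1024          # ~35 MB kernel tarball
    REBUILD_S = 3 * 60 + 42.395  # "real" time from the timing above
    ASSUMED_SAVED = 0.80         # assumption: patch size was not recorded

    for label, speed in (("uni, 500 KB/s", 500.0), ("dialup, ~5 KB/s (guess)", 5.0)):
        full_t = full_download_seconds(FULL_KB, speed)
        delta_t = delta_seconds(FULL_KB, ASSUMED_SAVED, speed, REBUILD_S)
        print(f"{label}: full {full_t:.0f} s, delta {delta_t:.0f} s")

    limit = break_even_kb_per_s(FULL_KB, ASSUMED_SAVED, REBUILD_S)
    print(f"delta wins below about {limit:.0f} KB/s")
```

Under these assumptions the break-even point sits a little under 130 KB/s: on a faster link the rebuild time dominates, on anything slower the smaller download wins. A faster CPU or a bigger saving moves that threshold up.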

It seems that there are plans to at least trial this as an official Gentoo service, and this is a very nice preview of what might be coming up. To get started, just emerge getdelta and follow the basic instructions to set a new FETCHCOMMAND in make.conf.

3 Responses to “Using getdelta to reduce size of distfiles download”

  1. Trevor Fancher Says:

    Wow!

    I would like to thank you for bringing this to my attention. I am on a 56k connection and this is indeed a godsend!

  2. Serge Says:

    Good job !!!

    But for me it’s a little too late. I’ve had my ADSL line for 2 months now.
    At 30 EUR/month, 4 Mb/s is far better than the 56k I had before.

    So with Gentoo, my old computer needs more time to compile than to download (mostly at 500 KB/s, very good mirrors!)

    Regards

    And thanks to all for this good Distro !

    Serge of France

  3. James Says:

    Sweet! Just had a chance to actually use it last night. I’m on 56k dialup and the emerge -u world that I did wanted to download what would have taken about 1.5+ hours and with getdelta it took 10-15 minutes. Awesome!!
