Monday, September 08, 2008

New location for Factor downloads

The Linode virtual server hosting http://factorcode.org only has 10 GB of disk space. Over the last 7 months, almost daily uploads of binary packages from 11 platforms have done a pretty good job of filling it up.

To rectify this, we have moved the download files over to downloads.factorcode.org, which is hosted on a DreamHost shared hosting account. This account has more than 300 GB available and we're not likely to exhaust it any time soon. The binaries page has not moved, but the links on it now point to the new location.

The Factor web site continues to run on a Linode; having root access is very useful, because for one, we run the Factor HTTP server on there. Other than a few glitches caused by the CGI support, the Factor HTTP server handles the load very well; it could handle serving all the downloads just fine too, if it weren't for the disk space issue.

The binaries page is an .fhtml script, meaning it is an HTML file with embedded Factor. Formerly, it would look at the download directories, find the latest package for each platform, and compute the table of links on every request. Now that downloads are not hosted on the same server as the web page, I had to replace this with a more complex setup. A Factor script runs as an hourly cron job under our DreamHost account; it generates this table and spits it out at http://downloads.factorcode.org/downloads.html. The binaries page then uses the http.client vocabulary to download this file and include it. This means that for every HTTP request to the binaries page, our Linode server makes an HTTP request to the DreamHost box; if this becomes a problem, I'll investigate some kind of caching, but I don't think it will.
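To make the tabulation step concrete, here is a minimal sketch in Python rather than Factor. It is not the script from the pastebin. The directory layout (one subdirectory per platform), the assumption that package names sort by date, and the function names `latest_packages` and `render_table` are all made up for illustration.

```python
import html
import os


def latest_packages(root):
    """Map each platform directory under root to its newest finished package."""
    latest = {}
    for platform in sorted(os.listdir(root)):
        directory = os.path.join(root, platform)
        if not os.path.isdir(directory):
            continue
        # Ignore uploads still in progress (names ending in .incomplete).
        names = [n for n in os.listdir(directory)
                 if not n.endswith(".incomplete")]
        if names:
            # Assumption: file names embed a date, so the largest name is newest.
            latest[platform] = max(names)
    return latest


def render_table(latest, base_url):
    """Render the platform-to-package mapping as an HTML table of links."""
    rows = []
    for platform, name in sorted(latest.items()):
        url = f"{base_url}/{platform}/{name}"
        rows.append(
            f'<tr><td>{html.escape(platform)}</td>'
            f'<td><a href="{html.escape(url)}">{html.escape(name)}</a></td></tr>'
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"
```

A cron job would call `render_table(latest_packages(root), base_url)` once an hour and write the result to downloads.html, so the page that includes it never has to look at the download directories itself.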

I put the code for this script up in our pastebin; it is not very pretty, but note the "shebang" line at the top of the file, making it executable from the shell.

This is all part of a pretty complicated distributed system written in Factor. There are 11 machines (some virtual) uploading binary packages; a Factor script on one machine tabulates them, and a Factor HTTP server on another machine receives the results of this tabulation and serves up the dynamically-generated downloads page where users can download the binaries. The developers just push patches, without worrying about making releases, updating the web site, or, now, disk space filling up.

There is one final improvement to the build infrastructure. When a binary package is uploaded, it is uploaded with a name ending with .incomplete; only once the upload succeeds is it renamed to its final name. This prevents the binaries page from being generated with links to incomplete packages. This was a problem before; downloading a binary package at the wrong time could give you a truncated tarball. This won't happen again.
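The upload-then-rename pattern can be sketched as follows. This is a local-filesystem illustration in Python, not the build infrastructure's actual upload code (which is written in Factor and copies to a remote host); the function name `publish` is hypothetical.

```python
import os


def publish(data, final_path):
    """Write data under a temporary .incomplete name, then rename it into place."""
    tmp_path = final_path + ".incomplete"
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # make sure the bytes are on disk before renaming
    # On POSIX, a rename within one filesystem is atomic: readers see either
    # no file at final_path or the complete one, never a partial write.
    os.replace(tmp_path, final_path)
```

Anything that lists the directory only needs to skip names ending in .incomplete to avoid linking to a half-uploaded package.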
