I am running buildbot 0.7.6 on FreeBSD 6.3 with Python 2.5.1.
buildslave often fails to notice promptly that a command it started has completed. In these cases, it takes about 20 minutes until buildslave continues:
2008/02/24 21:28 +0200 [Broker,client] SlaveBuilder.remote_print(bknr-fbsd-ccl-amd64): message from master: ping
2008/02/24 21:28 +0200 [Broker,client] SlaveBuilder.remote_ping(<SlaveBuilder 'bknr-fbsd-ccl-amd64' at 16127976>)
2008/02/24 21:28 +0200 [Broker,client] <SlaveBuilder 'bknr-fbsd-ccl-amd64' at 16127976>.startBuild
2008/02/24 21:28 +0200 [Broker,client] startCommand:svn [id 46]
2008/02/24 21:28 +0200 [Broker,client] ShellCommand._startCommand
2008/02/24 21:28 +0200 [Broker,client] /usr/local/bin/svn update --revision HEAD --non-interactive
2008/02/24 21:28 +0200 [Broker,client] in dir /home/buildslave/builds/bknr-fbsd-ccl-amd64/build (timeout 1200 secs)
2008/02/24 21:28 +0200 [Broker,client] watching logfiles {}
2008/02/24 21:28 +0200 [Broker,client] argv: ['/usr/local/bin/svn', 'update', '--revision', 'HEAD', '--non-interactive']
2008/02/24 21:28 +0200 [Broker,client] environment: {'USERNAME': 'buildslave', 'SUDO_COMMAND': '/usr/local/bin/buildbot start /home/buildslave/builds/', 'TERM': 'xterm', 'SHELL': '/bin/tcsh', 'MAIL': '/var/mail/hans', 'SUDO_UID': '1000', 'SUDO_GID': '1000', 'LOGNAME': 'buildslave', 'USER': 'buildslave', 'HOME': '/home/hans', 'PATH': '/home/hans/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/home/hans/bin', 'SUDO_USER': 'hans', 'DISPLAY': 'localhost:10.0', 'TMPDIR': '/tmp'}
2008/02/24 21:48 +0200 [-] command finished with signal None, exit code 0
In this log excerpt, the spawned process exits after about 30 seconds, yet the "command finished" entry is not logged until 20 minutes later, which is the same length as the 1200-second timeout shown for the command. In the meantime, the spawned process remains in zombie state until it is eventually collected.
The problem could be related to http://twistedmatrix.com/trac/ticket/791. If there is a workaround for buildslave, I'd happily use it.
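For readers unfamiliar with the zombie state mentioned above, here is a minimal Python sketch of the symptom. It is not buildbot or Twisted code, and it is not a tested workaround. The helper names `spawn_short_child` and `poll_until_reaped` are hypothetical. It only shows that a child which has already exited stays a zombie until its parent calls `os.waitpid`, and that a parent which polls with `os.WNOHANG` can collect it without relying on a SIGCHLD notification.

```python
import os
import time


def spawn_short_child(exit_code=0):
    """Fork a child that exits immediately; return its pid (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child process: exit right away without running the parent's cleanup.
        os._exit(exit_code)
    return pid


def poll_until_reaped(pid, interval=0.05, max_wait=5.0):
    """Poll with WNOHANG until the child is collected; return its exit code."""
    deadline = time.time() + max_wait
    while time.time() < deadline:
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            # Until this call succeeded, the exited child was a zombie.
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return None  # child was terminated by a signal
        time.sleep(interval)
    raise RuntimeError("child %d not reaped within %.1f s" % (pid, max_wait))
```

Calling `poll_until_reaped(spawn_short_child())` returns as soon as the child has exited and been collected. If the parent only learned about the exit when some unrelated timer fired, the child would sit in zombie state for that whole period, which is consistent with the delay seen in the log.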