Ticket #342 (closed defect: fixed)

Opened 19 months ago

Last modified 8 months ago

BuildBot crashes if a non-ASCII character is sent with 'force build'

Reported by: marcusl
Owned by: dustin
Priority: critical
Milestone: 0.7.11
Version: 0.7.10
Keywords:
Cc: Pike, dustin

Description

Entering a name with non-ASCII characters when doing a 'force build' makes the waterfall view crash horribly with 'exceptions.UnicodeDecodeError: 'ascii' codec can't decode byte 0xXX in position YYYY: ordinal not in range(128)'.

A fellow named Jonas Byström did this at our place. I had to delete the offending build-pickle files to make waterfall work again.

Attachments

waterfall_crash.zip (28.2 KB) - added by etienne 11 months ago.
Crash of the waterfall display with version 0.7.10p1

Change History

  Changed 19 months ago by marcusl

  • priority changed from critical to major

  Changed 18 months ago by tjensen

We are also seeing this error with the waterfall whenever someone uses locale-specific characters in a commit message (e.g. æøå).

  Changed 13 months ago by marcusl

  • priority changed from major to critical

That's not good. I'm upping this to critical again.

  Changed 13 months ago by dustin

  • milestone changed from undecided to 0.7.10

Looks like I need to do a thorough analysis of charset handling. This is always fun in the world of web services :(

  Changed 13 months ago by dustin

  • owner set to dustin
  • status changed from new to assigned

  Changed 13 months ago by Pike

  • cc Pike added

  Changed 13 months ago by dustin

  • status changed from assigned to closed
  • resolution set to worksforme
  • milestone changed from 0.7.10 to undecided

OK, I really can't replicate this. I'm going to close it for the moment, but please re-open with a full traceback if you see it again in 0.7.10.

  Changed 13 months ago by marcusl

I just tested, and no, it doesn't seem to reproduce on 0.7.9 either.

Good to know if I get more buildbot questions from my old company. :)

Changed 11 months ago by etienne

Crash of the waterfall display with version 0.7.10p1

  Changed 11 months ago by etienne

  • status changed from closed to reopened
  • resolution worksforme deleted

I had the same problem with version 0.7.10p1, and I can attach a full traceback. I hope it's not too big.

I have (taken from the about page):

  • Buildbot 0.7.10p1
  • Twisted 8.2.0
  • Python 2.4.3
  • Buildmaster platform: linux2

  Changed 11 months ago by dustin

OK, I don't have any clear idea how to fix this, but the problem is that non-ASCII characters are getting into the HTML output (data). When those are combined with the Unicode output from b.td(), Python tries to upconvert data into Unicode by *decoding* it from the "ascii" encoding, which doesn't work.
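To illustrate the failure described above, here is a minimal Python 3 sketch. Python 3 refuses to mix bytes and text, so the helper `implicit_upconvert` (a hypothetical name, not part of Buildbot) spells out the ASCII decode that Python 2 performed implicitly when a byte string was concatenated with a unicode string.

```python
# Python 3 sketch. Python 2 did the decode below implicitly whenever a
# byte string was concatenated with a unicode string.
data = "Jonas Byström".encode("utf-8")  # byte string holding non-ASCII bytes
cell = "<td>build</td>"                 # text (unicode) string


def implicit_upconvert(raw):
    """Mimic Python 2's implicit str -> unicode coercion (an ASCII decode)."""
    return raw.decode("ascii")


try:
    page = implicit_upconvert(data) + cell
    error = None
except UnicodeDecodeError as exc:
    # Same family of error as the one quoted in the ticket description.
    error = exc

print(error)
```

The exception carries the codec name and the offset of the first byte that could not be decoded, which matches the shape of the message in the description.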

So the problem is actually that non-ASCII bytes are getting injected on *input* (via the "stop" button, in this case) and stored in the pickles. It's always tricky to figure out which encoding input from a browser is using, so that's the part I don't know how to solve. I can try to solve the crashes, at least.

See if this patch helps?  http://github.com/djmitche/buildbot/commit/852871732326cf4d62322dd70040dddabd6edf42

  Changed 11 months ago by dustin

  • cc dustin added
  • version changed from 0.7.7 to 0.7.10
  • milestone changed from undecided to 0.7.11

  Changed 11 months ago by Pike

At least you found what crashed, that stack left me helpless. Not that I totally understand what's going on so far anyway.

The encoding of the data we receive is the encoding of the webpage we serve, btw, so it's not really all that hard to figure out, misconfigured servers notwithstanding.

I'd probably enforce utf-8, I would hope that at least the plain buildbot is set up to use that as encoding.
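A rough sketch of what enforcing utf-8 on input could look like, written for Python 3. The helper name `decode_form_value` is hypothetical and not a Buildbot or Twisted API; it assumes the page was served as UTF-8 so the browser submits UTF-8.

```python
def decode_form_value(raw, encoding="utf-8"):
    """Turn a raw form value (bytes) into text.

    Hypothetical helper: assumes the form page was served as UTF-8.
    Undecodable bytes become U+FFFD instead of raising, so a bad
    submission cannot crash rendering later on.
    """
    if isinstance(raw, str):
        return raw  # already text, nothing to do
    return raw.decode(encoding, errors="replace")


name = decode_form_value("Jonas Byström".encode("utf-8"))
broken = decode_form_value(b"caf\xe9")  # Latin-1 bytes sent to a UTF-8 page

print(name)
print(repr(broken))
```

Using the "replace" error handler is a design choice here: it loses the offending character but keeps the stored value decodable, which is the property the pickles in this ticket were missing.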

  Changed 11 months ago by dustin

Yeah, the fix here is clearly to get the encodings right on input and output, and assuming utf-8 on input is probably workable. However, the mash-strings-together school of HTML generation makes this kind of conversion *really* difficult.

If someone wants to rejigger the HTML generation to be all unicode, all the time, and fix the places where form input enters the database, I'd be happy to commit it.
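For illustration only, an "all unicode, all the time" rendering path might look like the Python 3 sketch below. The functions `render_row` and `to_response_body` are hypothetical names, not the waterfall code: fragments stay text while the page is assembled, and bytes are produced exactly once on output.

```python
from html import escape


def render_row(cells):
    """Build one table row from text cells, staying in text throughout."""
    parts = ["<tr>"]
    for cell in cells:
        # Escape user-supplied text so it cannot inject markup.
        parts.append("<td>%s</td>" % escape(cell))
    parts.append("</tr>")
    return "".join(parts)


def to_response_body(html_text, encoding="utf-8"):
    """Encode exactly once, at the point where bytes leave the process."""
    return html_text.encode(encoding)


row = render_row(["Jonas Byström", "æøå <fix>"])
body = to_response_body(row)

print(row)
```

With this shape there is no point in the middle of rendering where a byte string and a text string can meet, so no implicit conversion can occur.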

  Changed 11 months ago by marcusl

Vaguely related and to dig up an old corpse, I've seen some people use Twisted and Django together. I hope to get time to do some experiments for one-oh in that area.

We could then at least use Django for HTML generation, if nothing else.

  Changed 10 months ago by dustin

I committed the partial fix referenced above in [e3abd0739af2408d676817f6d5b091ee5a4d49dd]

  Changed 9 months ago by catlee

Replying to dustin:

I committed the partial fix referenced above in [e3abd0739af2408d676817f6d5b091ee5a4d49dd]

Please change how this is done. Decoding a unicode string is probably not what you want to do, and decoding an ever growing string is definitely not good. It's O(n^2) in the number of grid elements if I'm not mistaken.
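A back-of-the-envelope model of that cost, as a Python sketch. It is a simplification, not a measurement of the waterfall code: it assumes every cell has the same length and only counts how many characters each strategy has to walk. Both function names are made up for this example.

```python
def chars_walked_redecoding(rows, cells_per_row, cell_len):
    """Characters touched if the whole buffer is re-decoded once per row."""
    total = 0
    buffer_len = 0
    for _ in range(rows):
        total += buffer_len                      # decode() walks everything so far
        buffer_len += cells_per_row * cell_len   # then the row's cells are appended
    return total


def chars_walked_per_cell(rows, cells_per_row, cell_len):
    """Characters touched if each cell is converted once, on its own."""
    return rows * cells_per_row * cell_len


small = chars_walked_redecoding(rows=4, cells_per_row=1, cell_len=10)
large = chars_walked_redecoding(rows=8, cells_per_row=1, cell_len=10)

print(small, large)
print(chars_walked_per_cell(4, 1, 10), chars_walked_per_cell(8, 1, 10))
```

In this model, doubling the number of rows doubles the per-cell work but more than quadruples the re-decoding work, which is the quadratic growth being objected to.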

Maybe something like:

diff --git a/buildbot/status/web/waterfall.py b/buildbot/status/web/waterfall.py
index c3bd5c6..0e70f28 100644
--- a/buildbot/status/web/waterfall.py
+++ b/buildbot/status/web/waterfall.py
@@ -949,12 +949,11 @@ class WaterfallStatusResource(HtmlResource):
         # third pass: render the HTML table
         for i in range(gridlen):
             data += " <tr>\n";
-            # convert data to a unicode string, whacking any non-ASCII characters it might contain
-            data = data.decode("ascii", "replace")
             for strip in grid:
                 b = strip[i]
                 if b:
-                    data += b.td()
+                    # convert data to a unicode string, whacking any non-ASCII characters it might contain
+                    data += b.td().encode("utf-8", "replace")
                 else:
                     if noBubble:
                         data += td([])
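To make the difference between the two calls in that diff concrete, here is a small Python 3 sketch. It is not the Buildbot code, and the variable names are chosen only for this example: it contrasts an ASCII decode with "replace" against a per-cell UTF-8 encode.

```python
raw = "æøå".encode("utf-8")  # six bytes, two per character

# Approach being removed: decode as ASCII, replacing whatever does not fit.
whacked = raw.decode("ascii", "replace")

# Approach being proposed, applied per cell: encode text to UTF-8 bytes.
cell_text = "<td>æøå</td>"
cell_bytes = cell_text.encode("utf-8", "replace")

print(repr(whacked))
print(cell_bytes)
```

The ASCII decode turns every non-ASCII byte into U+FFFD, so the original characters are lost, while the UTF-8 encode is reversible and only touches one cell at a time.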

  Changed 8 months ago by dustin

  • status changed from reopened to closed
  • resolution set to fixed

Chris took care of this in [6b184718ee9576c0577c4fd9fb8dda7a2edf8754]. Thanks!
