Fixed a worthy bug

Some bugs turn out to be annoying eye-rollers, and some take an embarrassingly long time to find something embarrassingly stupid. Yet occasionally one seems respectably interesting.  Here’s a nerdy dive into one of the latter.

A status page displayed in a browser was used to monitor multi-hour simulations on a server. It had some JavaScript code that added rows to a displayed <table> when a run completed. The code substituted fields in a template. To keep the page design together, the template was right there in the page HTML, but hidden way down at the bottom. The code looked something like this:

let text =
    document.getElementById("completedRunTemplate1").outerHTML +
    document.getElementById("completedRunTemplate2").outerHTML;
text = text
    .replace(/%abc%/g, abc)
    .replace(/%def%/g, def);
    // ...and so on for the remaining fields
table.innerHTML += text;

That bit of code was called for each run “promoted” from running to complete.

The bug report was that the display was wrong when more than one run got promoted. For N promoted runs it just showed N copies of the first one to be promoted!

OK, was the problem server side or client side? Refreshing the page showed the correct display. That meant that the status in the database was correct, and the template rendering on the server side was correct. The culprit lay somewhere in the browser, with the incremental updating. However, the page refresh also wiped out the incorrect browser state.

Fortunately I could reproduce the situation in the browser. Unfortunately, each such iteration needed a fresh simulation run, because I had never written the surgery for the status database that would construct a phantom run and have it make phantom progress. At least the simulator had a special mode that took two minutes instead of two hours, though a two-second mode would have been better.

The browser’s F12 key brought up the Web Developer console. I copied and pasted in

document.getElementById("completedRunTemplate1").outerHTML

and it showed that the fields were already substituted with the first promoted run’s data. That explained why promotions after the first were wrong (the template they used no longer had “%abc%” but actual values), but how did that happen?

It’s “impossible”. The code extracted strings from the template nodes (which “shouldn’t” change the node contents). It concatenated the strings and did regex substitutions, which don’t even change the strings, let alone the node contents.
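
That reasoning about strings is easy to check on its own. Here is a minimal sketch in plain JavaScript, no browser needed; the template text and the value "run-17" are made up for illustration and are not the real page's content.

```javascript
// Hypothetical template text, standing in for what .outerHTML returned.
const templateText = '<tr id="completedRunTemplate1"><td>%abc%</td></tr>';

// replace() returns a new string; JavaScript strings are immutable.
const filledText = templateText.replace(/%abc%/g, "run-17");

console.log(templateText.includes("%abc%")); // true: the original is untouched
console.log(filledText.includes("run-17"));  // true: only the copy changed
```

So whatever rewrote the template, it was not the .replace chain.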

It’s super unlikely that I had found a Firefox bug, but just in case, I wrote a tiny page with a tiny script that should have reproduced the bug if there was one. It worked fine, no such bug, no surprise.

Back in the reproduced problem page, View Page Source showed the correct, unsubstituted template. Using Inspector to navigate down to the nodes of the hidden template showed that it too remained correctly unchanged. So the impossible problem had no cause.

Aha. This is where I wish I could identify how the cognitive leap of insight happened. Sorry, I can’t say.

The HTML standard says that node IDs must be unique: only one node can have a given ID. But nothing enforces this. When duplicates exist, document.getElementById simply returns the first match in document order, which may not be the node you meant.

The template node, the one that I copied from, had id="completedRunTemplate1" so I could find it. Its .outerHTML still had that ID. After field substitution it still had that ID. When I inserted the substituted version it still had that ID. Thus I broke the ID-uniqueness rule. The inserted rows sat higher up the page than the hidden template, so when document.getElementById searched the page for the ID, it found the substituted one that had been pasted in, and not the intended hidden template down at the bottom.
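
The mechanism can be mimicked without a browser. The sketch below is a toy model, not the real DOM: the page is just an array of objects in document order, and getElementByIdToy, promoteRun and the run names are hypothetical. It assumes only the lookup rule that the first element with a matching ID wins.

```javascript
// Toy stand-in for a page: elements listed in document order.
// It only mimics the rule that the first element with the ID is returned.
function getElementByIdToy(elements, id) {
  return elements.find((el) => el.id === id) || null;
}

// The hidden template sits at the bottom; table rows get inserted above it.
const page = [
  { id: "completedRunTemplate1", html: "<tr><td>%abc%</td></tr>" },
];

function promoteRun(abc) {
  const template = getElementByIdToy(page, "completedRunTemplate1");
  // The copy keeps the template's ID, which is the actual mistake.
  const row = { id: template.id, html: template.html.replace(/%abc%/g, abc) };
  // Insert the new row just before the template, i.e. earlier in the page.
  page.splice(page.length - 1, 0, row);
  return row.html;
}

const firstRow = promoteRun("run A");
const secondRow = promoteRun("run B");
console.log(firstRow);
console.log(secondRow);
```

The second call looks up the ID, gets the first promoted row instead of the template, finds no "%abc%" left to substitute, and so produces another copy of run A.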

The fix was simple: remove the id= attribute from the text that was retrieved from the template. It was simple to add one more substitution to the .replace chain:

text = text.replace(/id=".*?"/g, "")   // g flag: strip the id from both template pieces
    .replace(/%abc%/g, abc);
    // ...and the rest of the chain as before
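
For what it's worth, the same idea can be packaged as a small helper. This is only a sketch under assumptions: renderRow, the sample template and the field values are hypothetical, and it is not the code from the real page. It uses a slightly stricter pattern that requires whitespace before id, so an attribute such as data-id is not clipped by accident.

```javascript
// Hypothetical helper: strips id attributes, then fills %name% placeholders.
function renderRow(templateText, fields) {
  return templateText
    // \s before id keeps attributes like data-id="..." intact.
    .replace(/\sid="[^"]*"/g, "")
    // Placeholders with no matching field are left visible as-is.
    .replace(/%(\w+)%/g, (match, name) =>
      Object.prototype.hasOwnProperty.call(fields, name)
        ? String(fields[name])
        : match);
}

const renderedRow = renderRow(
  '<tr id="completedRunTemplate1" data-id="x"><td>%abc%</td><td>%def%</td></tr>',
  { abc: "run-17", def: "ok" }
);
console.log(renderedRow);
```

Because the copy no longer carries the ID, the only node that answers to it is the hidden template.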

If I ever write this bug again, next time I track it down it will be one of the stupid groaners. This time, however, it seemed worthy.

PS. I’ve never been able to remember whether the thing is called “node” or “element”. Here I mean the kind that has a tag and attributes.
