2017-09-07 11:00 fromthetrenches
Here's another nice story 'from the trenches'. Packaging stations use a barcode scanner to scan the barcode on the items that need packaging. We were able to buy a batch of really good barcode scanners, second hand, but newer and better than those that we had. A notice came in from an operator: "with this new scanner, we get the wrong packaging material proposal." The software we wrote for the packaging stations checks the database to find the most suitable packaging material for the item. It's a fairly complicated bit of logic that uses the order and product details, is linked to the warehouse stockkeeping, and knows about the several exceptions required by the postal services of the different destination countries.
So I checked the configuration of this station first. Always try to reproduce first: the problem could go away by itself, or exist 'between the user and the keyboard', or worse, only happen intermittently depending on something yet unknown... Sure enough, the '30cm wide' product would get a packaging proposal of the '40cm box', which is incorrect since it fits the '30cm box'. Strange. The station had the 'require operator packaging material choice confirmation' flag set to 1, so I checked with 0 and sure enough, it proposed the '30cm box' (with on-screen display, without operator confirm, just as the flag says)...
So into the code. Hauling the order data and product data from the live DB into the dev DB (I still thank the day I thought of this tool to reliably transport a single order between DBs). Opening the source code for the packaging software, starting the debugger while processing the order, and... nothing. Nicely proposing the '30cm box' every time, with any permutation of the different flags (and there are a few, so a lot of combinations...)
Strange. Very strange. Going over things again and again, checking with other orders and other products, nothing. I declared the issue 'non-reproducible' and flagged it 'need more feedback', not really knowing if any would come from anywhere.
A short while later, a new notice from the same station: "since your last intervention, scanned codes concatenate". What, huh? I probably forgot to switch the 'require operator packaging material choice confirmation' back to 1, but how could that cause codes to concatenate? I went to have a look. Normally, when a barcode is scanned (the device emulates keyboard signals for the digits and a press of 'Enter'), the input box does a select-all, so the number stays displayed and gets overwritten by the next input. On this station it didn't. The caret sat behind the numbers, and the next scan would indeed concatenate the next code into the input field.
Strange. Very strange. Into the code first: there's a SelectAll call, but what could be wrong with that? And how to reproduce? What I did was write a small tool that displayed the exact incoming data from the keyboard, since it's apparently all about this scanner. Sure enough, the input was: a series of digits (those from the barcode), an 'Enter', and an 'Arrow Down'. A-ha! These were second hand scanners, remember? God knows what these scanners were used for before, but if having the scanner send an extra 'Arrow Down' after each code is the kludge it takes to solve some mystery problem in software out of your control, then that is what a fellow support engineer has to do... Got to have some sympathy for that. And the '40cm box' was indeed just below the '30cm box' in the list, so the 'Arrow Down' would land in the packaging material selection dialog, causing the initial issue. On the station without operator confirmation there was no dialog to catch it, so the same keystroke landed in the input box and undid the select-all.
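For the curious, the idea behind such a tool fits in a few lines. This is not the original tool, just a minimal Python sketch of the same idea. It assumes the key events have already been captured as a list of names: the names 'Enter' and 'ArrowDown', the helpers describe and split_scan, and the sample barcode are all made up for illustration. A real tool would hook the keyboard events of whatever UI toolkit is in use.

```python
from typing import Iterable, List, Tuple

# Hypothetical key name; a real tool would map the platform's
# key codes (from its keyboard events) onto names like this one.
ENTER = "Enter"


def describe(keys: Iterable[str]) -> str:
    """Render every key event visibly, so control keys cannot hide.

    Printable characters are shown as-is, named keys in <brackets>.
    """
    return " ".join(k if len(k) == 1 else f"<{k}>" for k in keys)


def split_scan(keys: List[str]) -> Tuple[str, List[str]]:
    """Split one scan into (code, extra keys sent after the first Enter).

    If there is no Enter at all, everything counts as the code
    and the list of extras is empty.
    """
    if ENTER not in keys:
        return "".join(keys), []
    end = keys.index(ENTER)
    return "".join(keys[:end]), keys[end + 1:]


if __name__ == "__main__":
    # A made-up scan: the digits, the Enter, and the stray suffix.
    scan = list("5412345678908") + ["Enter", "ArrowDown"]
    print(describe(scan))
    code, extras = split_scan(scan)
    if extras:
        print(f"code {code} is followed by unexpected keys: {extras}")
```

Anything that shows up in the extras list is a keystroke the application never asked for, which is exactly what the input box and the selection dialog were reacting to.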
Download the manual for the scanners, scan the 'reset all suffixes to "CR"' configuration code, done.
(Update: got some nice comments on reddit)