EditBug:10079 ISBN display/validation issue

From ISFDB
Revision as of 22:17, 6 January 2007 by Marc Kupper (talk | contribs) (Single field was better for me)
  • EditBug:10079 ISBN display/validation issue OPEN This is both a display bug and an edit bug, but I'll file just this one report. There was a publication record with ISBN 0932322240, which is displayed as 0-932322-24-0. Someone edited it and changed other fields but not the ISBN. When they submitted the record, the ISBN validator did not recognize the value as an ISBN because the checksum is wrong (the check digit should be 7), and so the record was saved as 0-932322-24-0. I saw the pub-update in the queue, spotted the hyphenated ISBN, and decided to follow up. The display logic shows “0-932322-24-0” as “ISBN-13: 978-2-322-240-” and there’s no ISBN: line. I hit edit-pub again and the logic gave me “978-2-322-240-” in the edit field!
I suspect that either the code that formats ISBNs should not hyphenate ISBNs with a bad checksum, or the validator should treat anything that looks exactly like an ISBN except for the bad checksum as an ISBN and remove the hyphens.
As for the record with 0932322240, the correct ISBN is 0921322240. I’ve already fixed that in ISFDB, though if you want to play with this I have a test pub you can use. Per Google, ISFDB was the only place with 0932322240 (the bad ISBN), meaning the error came either from keying into ISFDB or from a very old dissembler finding that was later corrected. (It’s also listed in ISFDB:Bad_ISBN_List.) Marc Kupper 01:15, 6 Jan 2007 (CST)
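For anyone following along, the check-digit arithmetic behind this report can be sketched in a few lines of Python. This only illustrates the standard ISBN-10 rule (weights 10 down to 2 on the first nine digits, modulo 11); it is not ISFDB's actual validator, and the function names are made up for the example.

```python
import re


def isbn10_check_digit(first9: str) -> str:
    """Return the ISBN-10 check character for the first nine digits."""
    if not re.fullmatch(r"[0-9]{9}", first9):
        raise ValueError("expected exactly nine digits")
    # Weights run 10, 9, ..., 2 across the nine digits.
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def is_valid_isbn10(value: str) -> bool:
    """True if value (hyphens ignored) is a well-formed ISBN-10 with a correct check digit."""
    compact = value.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        return False
    return isbn10_check_digit(compact[:9]) == compact[9]


# 093232224 -> weighted sum 169, 169 % 11 == 4, so the check digit is 7, not 0.
print(isbn10_check_digit("093232224"))   # 7
print(is_valid_isbn10("0932322240"))     # False (the bad ISBN)
print(is_valid_isbn10("0921322240"))     # True  (the corrected ISBN)
```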
Wait. Wait. Wait. The software knows if the ISBN field is a valid ISBN or not, but that's it. How can it know that the user INTENDED to type a valid ISBN, but typed it wrong? That is, how does it differentiate between a bad ISBN and a catalog number? This is one of the reasons I like preceding non-ISBNs with the '#' sign - then the software can know. Here's what we currently do:
  • If the user types a valid ISBN without hyphens, it gets stored without hyphens, and gets displayed with hyphens.
  • If the user types a valid ISBN with hyphens, it gets stored without hyphens, and gets displayed with hyphens.
  • If the user types an invalid ISBN or a catalog ID with hyphens, it gets stored exactly as entered (as the software can't tell the difference between an invalid ISBN and a catalog ID).
  • If the user types an invalid ISBN without hyphens, it gets stored exactly as entered.
Are we saying that if the user entered a 10 digit ISBN, where the first 9 digits have values 0-9, and the last digit has a value of 0-9 or X, but the checksum is incorrect, that we SHOULD strip out any hyphens? Alvonruff 08:25, 6 Jan 2007 (CST)
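The four storage rules listed above, plus the "looks like an ISBN" test in the question, can be modelled roughly as follows. This is a sketch of the behaviour as described in this thread, not the real ISFDB code; `stored_value` and `looks_like_isbn10` are hypothetical names, and only ISBN-10 is considered.

```python
import re


def looks_like_isbn10(entered: str) -> bool:
    """Nine digits plus a final digit or X once hyphens are removed (checksum not checked)."""
    compact = entered.replace("-", "").upper()
    return re.fullmatch(r"[0-9]{9}[0-9X]", compact) is not None


def is_valid_isbn10(entered: str) -> bool:
    """Looks like an ISBN-10 and the weighted sum (weights 10..1) is divisible by 11."""
    if not looks_like_isbn10(entered):
        return False
    compact = entered.replace("-", "").upper()
    total = sum(w * (10 if c == "X" else int(c))
                for w, c in zip(range(10, 0, -1), compact))
    return total % 11 == 0


def stored_value(entered: str) -> str:
    """Valid ISBNs are stored without hyphens; everything else is stored exactly as entered."""
    if is_valid_isbn10(entered):
        return entered.replace("-", "")
    return entered


print(stored_value("0-921322-24-0"))  # 0921322240     (valid, hyphens stripped)
print(stored_value("0-932322-24-0"))  # 0-932322-24-0  (bad checksum, kept as typed)
print(stored_value("DAW UQ1043"))     # DAW UQ1043     (catalog ID, kept as typed)
```

The case under discussion is the value for which `looks_like_isbn10` is true but `is_valid_isbn10` is false: the software cannot tell from the text alone whether that is a mistyped ISBN or a catalog number.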
There are a couple of ways to deal with this, but I believe it’s important that the code be consistent in both the input and display logic. Right now it’s not consistent, and that’s causing the problem. The inconsistency is that on input the code strips hyphens only if the ISBN checksum is valid, but for display it inserts hyphens regardless of whether the checksum is valid. I believe the better fix is to insert hyphens for display (including when setting up the default value for the pub-edit field) only if the checksum is valid; then if someone sees an unhyphenated “ISBN” they will know its checksum is not valid.
That's a quick fix and in the long run I suspect the code should indicate more clearly that the value looks like an ISBN but its checksum is invalid. If it's not really an ISBN then editors can use the # prefix. Marc Kupper 16:54, 6 Jan 2007 (CST)
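A rough sketch of that quick fix, with the display side gated on the same checksum test as the input side. Real hyphenation depends on ISBN group and publisher range tables, which are outside the scope of this example; `toy_hyphenate` is a stand-in that only reproduces the 1-6-2-1 shape of the ISBN in this report, and every name here is hypothetical.

```python
import re


def is_valid_isbn10(value: str) -> bool:
    """Well-formed ISBN-10 (hyphens ignored) whose weighted sum is divisible by 11."""
    compact = value.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        return False
    total = sum(w * (10 if c == "X" else int(c))
                for w, c in zip(range(10, 0, -1), compact))
    return total % 11 == 0


def toy_hyphenate(compact: str) -> str:
    """Placeholder only: fixed 1-6-2-1 split, NOT real range-based hyphenation."""
    return f"{compact[0]}-{compact[1:7]}-{compact[7:9]}-{compact[9]}"


def display_value(stored: str, hyphenate=toy_hyphenate) -> str:
    """Insert hyphens only when the checksum is valid; otherwise show the stored text untouched."""
    if is_valid_isbn10(stored):
        return hyphenate(stored.replace("-", "").upper())
    return stored


print(display_value("0921322240"))  # 0-921322-24-0
print(display_value("0932322240"))  # 0932322240 (bad checksum, left unhyphenated)
```

The same function would supply the default value for the pub-edit field, so a bad value round-trips through an edit unchanged.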
In the long run I'd prefer to split catalog numbers out of the ISBN field into a separate field; I think a serial number or catalog number should go in as it is on the book, without a leading "#". Other than that, I agree with your suggestion that hyphens should only be displayed for valid ISBNs. Mike Christie (talk) 17:09, 6 Jan 2007 (CST)
In my own book database I tried a separate field for catalogue numbers vs. ISBNs, and it turned out to be a pain. These days it’s a single field, but I also put the publisher name in there, so you will see codes like Ace 27400, DAW UQ1043, Bantam S4482, and SBN 553-08162-095. That way they are clearly distinguished from ISBNs, which can just be entered as 0-345-25457-0.
I had to run right after my previous post, and something I realized on the way out the door is that if someone enters an invalid (bad checksum) ISBN with hyphens, it would be stored and then displayed as a hyphenated value. That means my idea that non-hyphenated codes would indicate bad checksums would not be reliable. I suspect the code should validate the checksum on data entry, and I would advise editors to use the # prefix if the publication states an invalid ISBN. Marc Kupper 20:17, 6 Jan 2007 (CST)
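One way entry-time validation along those lines might look is a three-way classification: a value with the # prefix is always taken as a non-ISBN, a value with a valid checksum is an ISBN, and a value that has the shape of an ISBN but fails the checksum is flagged so the editor can either correct it or add the # prefix. The labels and the function name below are assumptions made for illustration, not existing ISFDB behaviour.

```python
import re


def classify_entry(entered: str) -> str:
    """Return 'isbn', 'catalog', or 'bad-checksum' for a value typed into the ISBN field."""
    text = entered.strip()
    if text.startswith("#"):
        # The editor has explicitly marked this as a non-ISBN.
        return "catalog"
    compact = text.replace("-", "").upper()
    if not re.fullmatch(r"[0-9]{9}[0-9X]", compact):
        # Wrong shape for an ISBN-10, e.g. "Ace 27400".
        return "catalog"
    total = sum(w * (10 if c == "X" else int(c))
                for w, c in zip(range(10, 0, -1), compact))
    return "isbn" if total % 11 == 0 else "bad-checksum"


print(classify_entry("0-345-25457-0"))    # isbn
print(classify_entry("0-932322-24-0"))    # bad-checksum
print(classify_entry("#0-932322-24-0"))   # catalog
print(classify_entry("DAW UQ1043"))       # catalog
```

Only the "bad-checksum" case needs a prompt at submission time; the other two can be stored under the existing rules.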