ISFDB:Bad ISBN List
To Fix or to note?
We have a minor edit war over one of the entries that was fixed. Some people seem to want to put the correct ISBN in the Catalog ID field and the stated but incorrect ISBN in notes. Others want the stated ISBN in the Catalog ID field and the correct one in notes. I guess either will do, but I favour the correct ISBN being in the Catalog ID field, mainly so the links on the left to all the websites like Alibris, AbeBooks, the various Amazons, etc., work. But also because I don't want to have to code each such individual ISBN as an exclusion when we refresh this page. BLongley 14:09, 8 Jan 2008 (CST)
I'm open to opinions though, so I haven't rejected it, just held it: and will put it to a vote. (Here, as we rarely get a conclusive vote on Rules and Standards.) Just put your name (and reasons, if you like) against the option you prefer, or add your preferred option:
- Catalog ID should contain the working ISBN, where known: any others stated on the pub should go in notes
- This gets my vote, for the Linking reason. BLongley 14:09, 8 Jan 2008 (CST)
- Concur. rbh 19:09, 8 Jan 2008 (CST)
- I agree. --Chris J 03:34, 9 Jan 2008 (CST)
- Catalog ID should contain the ISBN as stated on the publication, the working ISBN should go in notes, if known
- Other: (e.g. "Where there are conflicting ISBNs all should go in notes and the Catalog ID field left blank")
- I would have gone for the second choice for the simple reason that we record "exactly" what the pub says. But because it creates an error message, I have to opt for this third choice. It's easy to create a "working" ISBN by just adding the correct checksum, but what if two of the numbers of the ISBN were transposed? The "working" ISBN could be for another pub by the same publisher. Example: correct the checksum for this pub and you get this pub. 'Nuf said. Mhhutchins 21:09, 8 Jan 2008 (CST)
- In this case, I used a tool that changes a single digit making a correct ISBN, then I researched each of the new ISBNs to see what they matched. In this case, on my 3rd or 4th try, I got a match on Amazon to the correct pub. I do not believe the checksum digit was the error on this particular pub and of the other pubs I checked, only once was it the checksum digit and in each case, going to Amazon with the "fixed" ISBN resulted in a pub that matched the one ISFDB entry. I think this is a pretty reliable way to fix ISBNs and ensure the corrected ISBN actually matches the desired pub. If you fix ISBN for the pub you list above to: 0877955778 from 0877955578, it is both a valid ISBN and matches the hc edition of that title. Thx, rbh 12:24, 9 Jan 2008 (CST)
- I do similar, using Marc Kupper's ISBN Linker (which will actually allow a wildcard for any digit, not just the last as its notes imply) and check against multiple sites (as Amazon data can be appalling at times). Even a quick Google for the changed ISBN can sometimes make it clear that only we (and sites using our data) have the wrong one. BLongley 12:44, 9 Jan 2008 (CST)
- If the ISFDB is one of the few sites that records the ISBN printed in the book, I take that as a compliment to our basic philosophy. Mhhutchins 13:54, 9 Jan 2008 (CST)
- I agree we want the actual ISBN printed in the book recorded, I just don't agree that it has to be in the Catalog ID field - although I'd be happy if it WAS there, but not triggering all the warnings. Putting a "#" in front of invalid ISBNs recorded there could be another option? (Although I don't want too many options or we'll get one each and it's stalemate.) BLongley 15:14, 9 Jan 2008 (CST)
- And Mike - can you think of a better message than "(Bad Checksum)"? I agree that it seems to suggest that's the digit that should be changed, whereas any or all could be wrong. BLongley 12:44, 9 Jan 2008 (CST)
- Finding a working ISBN doesn't necessarily mean that we've found the correct ISBN, and just because some internet dealers have also found a working ISBN doesn't mean it's the publisher's intended ISBN. For the sample cited above 0-87795-577-8 appears on only 7 pages of the WWW. How can we be sure that this is the ISBN that Arbor House assigned to Silverberg's collection? Mhhutchins 13:54, 9 Jan 2008 (CST)
- If "working" means it works (in the way WE want) on the Internet, I'm not too bothered so long as we have the Actual too. A lot of publishers got their (I)SBN ranges and continued existing catalogue numbers into that range - some books even got the check digit right before they claimed it was an (I)SBN - e.g. "#10901" was printed on a Sphere book, Sphere are known for having a 07221 prefix, and 0722110901 gets more useful results than #10901. Which is more useful - finding a map of "Suffern, NY 10901, USA" or "Hothouse" by Brian (W. possibly) Aldiss? BLongley 15:14, 9 Jan 2008 (CST)
- To answer Bill's question, a better error message would be "Invalid ISBN" which is exactly what it is, no more...no less. Mhhutchins 13:54, 9 Jan 2008 (CST)
- I agree - whatever the outcome of this discussion is, I think we should make a feature request to improve that message. BLongley 15:14, 9 Jan 2008 (CST)
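For anyone following the checksum arguments above, here is a minimal Python sketch of the standard ISBN-10 check (weights 10 down to 1, total divisible by 11) and of the single-digit search rbh describes. It is not the tool rbh used, nor Marc Kupper's ISBN Linker; the function names `isbn10_is_valid` and `single_digit_candidates` are invented for illustration.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Return True if the string is a well-formed ISBN-10 (hyphens and spaces ignored)."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10  # "X" stands for 10 and is only allowed as the check digit
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


def single_digit_candidates(isbn: str) -> list[str]:
    """List every valid ISBN-10 that differs from the input in exactly one position."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    candidates = []
    for position in range(len(chars)):
        symbols = "0123456789X" if position == 9 else "0123456789"
        for symbol in symbols:
            if symbol == chars[position]:
                continue
            trial = chars[:position] + symbol + chars[position + 1:]
            if isbn10_is_valid(trial):
                candidates.append(trial)
    return candidates


if __name__ == "__main__":
    for candidate in single_digit_candidates("0877955578"):
        print(candidate)
```

For 0877955578 the candidates include both 0877955573 (only the check digit changed) and 0877955778 (the one rbh matched to the hc edition), which is why each candidate still has to be checked against the actual pub rather than accepted because it validates.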
(unindent) Unfortunately, I have been sick for the last couple of days and may be missing some subtleties of the discussion, but perhaps we could list all pros and cons of the two proposed approaches here first? Things like "ease of scripting", "consistency with the overall ISFDB approach", "ease of use by the end users", etc? It may make it easier to decide what the ultimate balance is. Ahasuerus 21:41, 10 Jan 2008 (CST)
- I think there's THREE proposed approaches now, but I'm not clear what Mike's "Other" actually entails, unless he's voting for the example I gave. I was hoping both original editors would have added their views by now too, but we've only got one of them so far. Still, to address your example criteria: I don't think any are in conflict with "consistency with the overall ISFDB approach" if you mean "we record exactly what is on the pub" and "any useful extra information should be recorded in notes". Some of the rest is guessing intentions from Al's coding though, and we could say that the code should change rather than our practices. e.g.
Option 1:
- Pro: Links to other bibliographic sites work
- Pro: No Bibliographic Warnings
- Pro: Do not have to include each invalid ISBN in data-cleanup scripts
- Con: It's not what's on the pub, and people will have to read Notes to find that out. (Is this always true? In some (many?) cases could it have just been a typo by the editor making the entry? How would we know? rbh 08:38, 13 Jan 2008 (CST)) As with all un-noted, unverified (and some verified) pubs, we can never be sure. BLongley 13:22, 18 Jan 2008 (CST)
Option 2:
- Pro: It's exactly what's stated on the pub
- Con: Have to include each invalid ISBN in data-cleanup scripts, or make people recheck after every refresh
- Con: Links to other bibliographic sites don't work
- Con: Bibliographic Warnings about Invalid check-digit.
Option 3:
- Pro: Not misleading about real and printed ISBNs (Notes will explain)
- Con: Links to other bibliographic sites don't work
- Con: Bibliographic Warnings about missing ISBNs
- Con: Have to include each deliberate missing-ISBN pub in data-cleanup scripts, or make people recheck after every refresh
Options 4-N: (Get Al to change a lot of things - separate stated and working and maybe "official" ISBNs, relax or change bibliographic warnings, create "intelligent" links to other sites based on currency or publisher as well as ISBN, allow Mods/Editors to put pubs on data-cleanup-exclusion lists, etc)
- Pro: We have a definitive set of rules
- Con: We'll never agree on them all and this will never happen. (Some individual changes could though, e.g. the warning message currently given seems unpopular.)
- Maybe we should stop deleting "Fixed" entries for a bit and go see how people HAVE actually fixed them, to see if we agree with current practices? I only noticed this problem as I saw the same pub being "fixed" twice in different ways.
BLongley 14:02, 11 Jan 2008 (CST)
- One of the problems with the bad ISBNs is that we don't know if the original editor entered an incorrect ISBN (what was on the pub) or the pub has the correct ISBN and they mistyped the entry. Unless we know what's on the original pub it might be best to just change them to an ID with a # sign in front and a note that it should be fixed if someone has the book. If the ISBN is incorrect on the book then we should have one rule for how it's dealt with. We have two options: 1. Display what the correct ISBN should be if it matches searches on the web (to that pub), with a note about the bad ISBN, or 2. Put the bad ISBN in with the # sign and a mention in notes about what the valid ISBN should have been, again only if it matches the pub in question. Kraang 21:49, 11 Jan 2008 (CST)
- I like the idea of putting a # in front of a badly formed ISBN when an ISFDB editor has checked that this is the only "ISBN" that the book states (with an added explanation in the Notes). (I don't think just "fixing" the checksum necessarily ends up with the intended ISBN; I own books where the publisher has made a typo in the middle of the ISBN so the checksum seems wrong, deduced where the copyright page and the back cover have different numbers.)
- However, if the book has 2 versions of the ISBN & one is a correctly formed ISBN string, I'm in favour of putting the "correct" one as the catalog ID and putting the other in the Notes (even if this occasionally breaks links to other sites, 'cos they have used the invalid one).
- And there is the occasional case where the printed ISBN is correctly formed but the publisher has made a boo-boo and it is the duplicate of an earlier-published book by another author (I own one of those, though I can't remember off-hand which it is.)--j_clark 02:07, 18 Jan 2008 (CST)
- OK, is that a vote AGAINST using 'Working' ISBN if it's not a secondary one on the pub itself? As we seem perfectly capable of finding it in most cases, but as noted several times it's not as simple as changing the check-digit. I can usually spot a transposition of digits in the publisher's usual prefix which means I need only check a few possibilities, but when it's two or more incorrect digits it's probably best to give up rather than guess. Incidentally, a Bad ISBN doesn't seem to link anywhere useful from our pre-defined links so removing/#ing those shouldn't lose us any functionality. BLongley 13:22, 18 Jan 2008 (CST)
- Anyway, I think we're almost done with this lot - there's a few where I'm waiting for the verifier, but otherwise there's nothing more I'd do to the remaining titles except add notes, a '#' to stop the error messages, and swap Working and Bad ISBNs if that's the way we want to go. Which originally we had 3 quick supporting votes for, but now we have 3 other people not supporting it but not making it clear what they DO want. I think we need a decision before we can refresh this page and have a go at the NEXT lot: #ing bad numbers would make them undetectable, so this would be our one chance to do them as a project, rather than as and when we encounter them. Or maybe "Bad ISBN, UN-#ed", in the Catalog ID field is desired, in which case the scripter needs to know that we have decided that's fine and he can omit the checked ones from future project refreshes. BLongley 13:22, 18 Jan 2008 (CST)
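Transposed digits come up several times in this thread as the other common cause of a bad ISBN. A rough Python sketch of checking adjacent swaps follows; the function names are hypothetical, and the sample number is constructed by swapping two digits of 0877955778, not taken from a real misprinted pub.

```python
def isbn10_checksum_ok(digits: str) -> bool:
    """Standard ISBN-10 test: weights 10..1, total divisible by 11, 'X' only as last character."""
    if len(digits) != 10:
        return False
    total = 0
    for position, ch in enumerate(digits):
        if ch == "X" and position == 9:
            value = 10
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def adjacent_transposition_candidates(isbn: str) -> list[str]:
    """Swap each pair of neighbouring characters and keep the results that validate."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    candidates = []
    for i in range(len(chars) - 1):
        if chars[i] == chars[i + 1]:
            continue  # swapping identical digits changes nothing
        trial = chars[:i] + chars[i + 1] + chars[i] + chars[i + 2:]
        if isbn10_checksum_ok(trial):
            candidates.append(trial)
    return candidates


if __name__ == "__main__":
    # Constructed example: 0877955778 with its 4th and 5th digits swapped.
    print(adjacent_transposition_candidates("0879755778"))
```

Even this constructed case returns two valid candidates, so a swap that validates is only a lead to research, in line with the caution above about giving up rather than guessing.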
ISBNs that fail checksum validation
- Note 1: This record needs more investigation; it appears to be a "duplicate" of Dangerous Games (see here) & possibly might be a pre-publication record, considering all the other data in the 2 publication records. I have just verified the Dangerous Games version. --j_clark 23:40, 2 Jan 2008 (CST)
ISBN fields that don't start with '#', and aren't 10 digits long
Done, but should probably be done again to see what's been missed or is new.
- Are you sure? There's a LOT... :-/ Even taking out "None", "No ISBN", "N/A", 13-digit ISBNs, and ISSNs or things that Look like ISSNs, we have these: feel free to break them up into more manageable chunks. (And when fixing, please mark your edits as "Minor" so the rest of us can filter them out, please?) BLongley 13:26, 4 Jan 2008 (CST)
- Do you want me to mark the pub that I've fixed with the "Minor" edit or this list? Confused :-/ Kraang 19:18, 4 Jan 2008 (CST)
- Mark edits to this page as Minor so we can filter them out of "Recent Changes" - e.g. I've updated it 8 times so far today and marked them all minor - but you can stop those edits cluttering your "Recent Changes" page with one click. BLongley 09:53, 5 Jan 2008 (CST)
- I've wondered what the "m" in Recent Changes was for, now I know. Learning new things all the time. Thanks! Kraang 20:49, 5 Jan 2008 (CST)
- The original list found in this section was generated by Al a long time ago and accounted for only a small subset of our "problem records". I have a much more complete list, but it's huge and I wanted to split it into multiple pages, probably one letter per page, before posting it here. I'll need to rewrite the script to account for 13 digit ISBNs first, though. Ahasuerus 13:46, 4 Jan 2008 (CST)
- Broke this list up a bit so I don't have to scroll forever when editing. Dana Carson 22:32, 6 Jan 2008 (CST)
- Wise move - and if people DELETE stuff rather than "mark it fixed" it helps too. Nobody's counting who fixed what. And marking edits as "Minor" does help those few of us that still try to see EVERYTHING happening. BLongley 17:28, 7 Jan 2008 (CST)
- The Other Foot 0-532-00433
- In Deep 053200444
(Unindent) Well done all! I've left the last two to the verifier - who is also the person that will provide the next backup, after which we can check the NEXT batch! BLongley 17:14, 13 Jan 2008 (CST)
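As a rough illustration of the filter this section's heading and BLongley's exclusions describe, here is a Python sketch. It is not Al's or Ahasuerus's actual script. The name `needs_review`, the placeholder set, the ISSN-shaped pattern, and the decisions to strip hyphens and to skip blank fields are all assumptions.

```python
import re

# Placeholder values mentioned in the discussion; matched case-insensitively (assumption).
PLACEHOLDERS = {"none", "no isbn", "n/a"}
# Things that look like an ISSN: four digits, optional hyphen, three digits, digit or X.
ISSN_LIKE = re.compile(r"^\d{4}-?\d{3}[\dXx]$")


def needs_review(catalog_id: str) -> bool:
    """Return True if the field does not start with '#' and is not 10 characters of ISBN."""
    value = catalog_id.strip()
    if not value or value.startswith("#"):
        return False  # blank fields and '#'-prefixed catalog IDs are skipped (assumption)
    if value.lower() in PLACEHOLDERS:
        return False
    if ISSN_LIKE.match(value):
        return False
    compact = value.replace("-", "").replace(" ", "")
    if len(compact) == 13 and compact.isdigit():
        return False  # 13-digit ISBNs are excluded from this list
    return not (len(compact) == 10 and re.fullmatch(r"\d{9}[\dXx]", compact) is not None)


if __name__ == "__main__":
    for field in ["0-532-00433", "053200444", "#10901", "None", "0877955778"]:
        print(field, needs_review(field))
```

The two entries left on the list, 0-532-00433 and 053200444, are both nine digits long, so a filter of this shape flags them; it only checks length and form, not the checksum.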