To Fix or to note?

We have a minor edit war over one of the entries that was fixed. Some people seem to want to put the correct ISBN in the Catalog ID field and the stated but incorrect ISBN in notes. Others want the stated ISBN in the Catalog ID field and the correct one in notes. I guess either will do, but I favour the correct ISBN being in the Catalog ID field - mainly so the links on the left to all the websites like Alibris, AbeBooks, the various Amazons, etc, work. But also because I don't want to have to code each such individual ISBN as an exclusion when we refresh this page. BLongley 14:09, 8 Jan 2008 (CST)
I'm open to opinions though, so I haven't rejected it, just held it: and will put it to a vote. (Here, as we rarely get a conclusive vote on Rules and Standards.) Just put your name (and reasons, if you like) against the option you prefer, or add your preferred option:

  • Catalog ID should contain the working ISBN, where known: any others stated on the pub should go in notes
    1. This gets my vote, for the Linking reason. BLongley 14:09, 8 Jan 2008 (CST)
    2. Concur. rbh 19:09, 8 Jan 2008 (CST)
    3. I agree. --Chris J 03:34, 9 Jan 2008 (CST)
    4. I use this method though it bothers me a little. When I derive or correct an ISBN I always add publication notes explaining what I'm doing, and I also try to include in the note both the unhyphenated and hyphenated ISBN so that someone using Google can find the record. It does bother me that I'm not entering what's stated. For example, the book I just entered is coded 451-Y5754 and from that I derived 0-451-05754-6. When I derive an ISBN I always first check to see if it has a record or if there are frequent references to it on the Internet and will only enter the derived ISBN if it is well known. If it's not well known I still leave the note in place but add to it that a search found that the ISBN is not in use. In my personal book database I get around this by allowing semicolon as a separator and for the book I just mentioned I entered "451-Y5754-125; Derived ISBN Amazon at *0451057546." This allows me to later search for Y5754, 451-Y5754, and/or the ISBN. The parsing logic knows about * meaning my comment can be free text, such as "A later edition on Amazon at *". I had a recent publication with a typo in the ISBN. At first I entered it with a leading # plus there was a note about this but later changed my mind and corrected the ISBN. Marc Kupper (talk) 02:38, 21 Jan 2008 (CST)
    5. I agree, in order to make the links work, provided that the editor is expected to always make a note indicating the actual printed ISBN or other number, and that the editor is expected to use an ISBN search tool (or other method) to verify that the corrected ISBN is actually for the correct title, in case of transpositions or other errors. -DES Talk 19:08, 28 Jan 2008 (CST)
  • Catalog ID should contain the ISBN as stated on the publication, the working ISBN should go in notes, if known
  • Other: (e.g. "Where there are conflicting ISBNs all should go in notes and the Catalog ID field left blank")
    1. I would have gone for the second choice for the simple reason that we record "exactly" what the pub says. But because it creates an error message, I have to opt for this third choice. It's easy to create a "working" ISBN by just adding the correct checksum, but what if two of the numbers of the ISBN were transposed? The "working" ISBN could be for another pub by the same publisher. Example: correct the checksum for this pub and you get this pub. 'Nuf said. Mhhutchins 21:09, 8 Jan 2008 (CST)
In this case, I used a tool that changes a single digit to make a valid ISBN, then I researched each of the new ISBNs to see what they matched. On my 3rd or 4th try, I got a match on Amazon to the correct pub. I do not believe the checksum digit was the error on this particular pub. Of the other pubs I checked, only once was it the checksum digit, and in each case going to Amazon with the "fixed" ISBN resulted in a pub that matched the ISFDB entry. I think this is a pretty reliable way to fix ISBNs and ensure the corrected ISBN actually matches the desired pub. If you fix the ISBN for the pub you list above from 0877955578 to 0877955778, it is both a valid ISBN and matches the hc edition of that title. Thx, rbh 12:24, 9 Jan 2008 (CST)
I do similar, using Marc Kupper's ISBN Linker (which will actually allow a wildcard for any digit, not just the last as its notes imply) and check against multiple sites (as Amazon data can be appalling at times). Even a quick Google for the changed ISBN can sometimes make it clear that only we (and sites using our data) have the wrong one. BLongley 12:44, 9 Jan 2008 (CST)
If the ISFDB is one of the few sites that records the ISBN printed in the book, I take that as a compliment to our basic philosophy. Mhhutchins 13:54, 9 Jan 2008 (CST)
I agree we want the actual ISBN printed in the book recorded, I just don't agree that it has to be in the Catalog ID field - although I'd be happy if it WAS there, but not triggering all the warnings. Put a "#" in front of invalid ISBNs recorded there could be another option? (Although I don't want too many options or we'll get one each and it's stalemate.) BLongley 15:14, 9 Jan 2008 (CST)
And Mike - can you think of a better message than "(Bad Checksum)"? I agree that it seems to suggest that's the digit that should be changed, whereas any or all could be wrong. BLongley 12:44, 9 Jan 2008 (CST)
Finding a working ISBN doesn't necessarily mean that we've found the correct ISBN, and just because some internet dealers have also found a working ISBN doesn't mean it's the publisher's intended ISBN. For the sample cited above 0-87795-577-8 appears on only 7 pages of the WWW. How can we be sure that this is the ISBN that Arbor House assigned to Silverberg's collection? Mhhutchins 13:54, 9 Jan 2008 (CST)
If "working" means it works (in the way WE want) on the Internet, I'm not too bothered so long as we have the Actual too. A lot of publishers got their (I)SBN ranges and continued existing catalogue numbers into that range - some books even got the check digit right before they claimed it was an (I)SBN - e.g. "#10901" was printed on a Sphere book, Sphere are known for having a 07221 prefix, and 0722110901 gets more useful results than #10901. Which is more useful - finding a map of "Suffern, NY 10901, USA" or "Hothouse" by Brian (W. possibly) Aldiss? BLongley 15:14, 9 Jan 2008 (CST)
To answer Bill's question, a better error message would be "Invalid ISBN" which is exactly what it is, no less. Mhhutchins 13:54, 9 Jan 2008 (CST)
I agree - whatever the outcome of this discussion is, I think we should make a feature request to improve that message. BLongley 15:14, 9 Jan 2008 (CST)
I'll agree with that and changing it to read "Bad ISBN" or "Invalid ISBN" seems sufficient. I noticed my ISBN page's error says "ISBN Checksum is 3 but expected X" for example which could mislead someone into changing the checksum and thinking they have solved the problem. Marc Kupper (talk) 02:47, 21 Jan 2008 (CST)
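The single-digit search rbh describes above, and the transposition worry Mhhutchins raises, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented for this example, it is not the code behind Marc Kupper's ISBN Linker, and it handles 10-digit ISBNs only. Every candidate it produces still has to be researched against Amazon and other sites, as discussed above, because a valid check digit says nothing about which book the number belongs to.

```python
DIGITS = "0123456789"


def isbn10_is_valid(isbn: str) -> bool:
    """Return True if the string is a well-formed ISBN-10 with a correct check digit."""
    s = isbn.replace("-", "").upper()
    if len(s) != 10:
        return False
    if any(c not in DIGITS for c in s[:9]) or s[9] not in DIGITS + "X":
        return False
    values = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
    # Weights run 10, 9, ..., 1; the weighted sum must be divisible by 11.
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def single_digit_candidates(isbn: str) -> list[str]:
    """Valid ISBN-10s that differ from the input in exactly one position."""
    s = isbn.replace("-", "").upper()
    found = []
    for i in range(len(s)):
        # 'X' is only legal as the final (check) character.
        choices = DIGITS + "X" if i == 9 else DIGITS
        for c in choices:
            candidate = s[:i] + c + s[i + 1:]
            if c != s[i] and isbn10_is_valid(candidate):
                found.append(candidate)
    return found


def transposition_candidates(isbn: str) -> list[str]:
    """Valid ISBN-10s obtained by swapping one pair of adjacent characters."""
    s = isbn.replace("-", "").upper()
    found = []
    for i in range(len(s) - 1):
        if s[i] != s[i + 1]:
            candidate = s[:i] + s[i + 1] + s[i] + s[i + 2:]
            if isbn10_is_valid(candidate):
                found.append(candidate)
    return found
```

For the Arbor House example, `single_digit_candidates("0877955578")` includes 0877955778, the value rbh matched on Amazon, alongside other equally "valid" numbers that belong to something else or to nothing at all.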

(unindent) Unfortunately, I have been sick for the last couple of days and may be missing some subtleties of the discussion, but perhaps we could list all pros and cons of the two proposed approaches here first? Things like "ease of scripting", "consistency with the overall ISFDB approach", "ease of use by the end users", etc? It may make it easier to decide what the ultimate balance is. Ahasuerus 21:41, 10 Jan 2008 (CST)

I think there's THREE proposed approaches now, but I'm not clear what Mike's "Other" actually entails, unless he's voting for the example I gave. I was hoping both original editors would have added their views by now too, but we've only got one of them so far. Still, to address your example criteria: I don't think any are in conflict with "consistency with the overall ISFDB approach" if you mean "we record exactly what is on the pub" and "any useful extra information should be recorded in notes". Some of the rest is guessing intentions from Al's coding though, and we could say that the code should change rather than our practices. e.g.
Option 1: Pro: Links to other bibliographic sites work
          Pro: No Bibliographic Warnings
          Pro: Do not have to include each invalid ISBN in data-cleanup scripts
          Con: It's not what's on the pub, and people will have to read Notes to find that out. (Is this always true?  In some 
               (many?) cases could it have just been a typo by the editor making the entry?  How would we know?
               rbh 08:38, 13 Jan 2008 (CST)) As with all un-noted, unverified (and some verified) pubs, we can never be sure. BLongley 13:22, 18 Jan 2008 (CST)
Option 2: Pro: It's exactly what's stated on the pub
          Con: Have to include each invalid ISBN in data-cleanup scripts, or make people recheck after every refresh
          Con: Links to other bibliographic sites don't work
          Con: Bibliographic Warnings about Invalid check-digit.
Option 3: Pro: Not misleading about real and printed ISBNs (Notes will explain) 
          Con: Links to other bibliographic sites don't work
          Con: Bibliographic Warnings about missing ISBNs
          Con: Have to include each deliberate missing-ISBN pub in data-cleanup scripts, or make people recheck after every refresh
Options 4-N: (Get Al to change a lot of things - separate stated and working and maybe "official" ISBNs, 
              relax or change bibliographic warnings, 
              create "intelligent" links to other sites based on currency or publisher as well as ISBN, 
              allow Mods/Editors to put pubs on data-cleanup-exclusion lists, etc)
          Pro: We have a definitive set of rules
          Con: We'll never agree on them all and this will never happen. 
               (Some individual changes could though, e.g. the warning message currently given seems unpopular.)
Maybe we should stop deleting "Fixed" entries for a bit and go see how people HAVE actually fixed them, to see if we agree with current practices? I only noticed this problem as I saw the same pub being "fixed" twice in different ways.

BLongley 14:02, 11 Jan 2008 (CST)

One of the problems with the bad ISBNs is that we don't know whether the original editor entered an incorrect ISBN (what was on the pub) or the pub has the correct ISBN and they mistyped the entry. Unless we know what's on the original pub it might be best to just change them to an ID with a # sign in front and a note that it should be fixed if someone has the book. If the ISBN is incorrect on the book then we should have one rule for how it's dealt with. We have two options: 1. Display what the correct ISBN should be if it matches searches on the web (to that pub), with a note about the bad ISBN, or 2. Put the bad ISBN in with the # sign and a mention in notes about what the valid ISBN should have been, again only if it matches the pub in question. Kraang 21:49, 11 Jan 2008 (CST)
I like the idea of putting a # in front of a badly formed ISBN when an ISFDB editor has checked that this is the only "ISBN" that the book states (with an added explanation in the Notes). (I don't think just "fixing" the checksum necessarily ends up with the intended ISBN; I own books where the publisher has made a typo in the middle of the ISBN so the checksum seems wrong (... deduced where the copyright page and the back cover have different numbers).)
However, if the book has 2 versions of the ISBN & one is a correctly formed ISBN string, I'm in favour of putting the "correct" one as the catalog ID and putting the other in the Notes (even if this occasionally breaks links to other sites, 'cos they have used the invalid one).
And there is the occasional case where the printed ISBN is correctly formed but the publisher has made a boo-boo and it is the duplicate of an earlier-published book by another author (I own one of those, though I can't remember off-hand which it is.)--j_clark 02:07, 18 Jan 2008 (CST)
OK, is that a vote AGAINST using 'Working' ISBN if it's not a secondary one on the pub itself? As we seem perfectly capable of finding it in most cases, but as noted several times it's not as simple as changing the check-digit. I can usually spot a transposition of digits in the publisher's usual prefix which means I need only check a few possibilities, but when it's two or more incorrect digits it's probably best to give up rather than guess. Incidentally, a Bad ISBN doesn't seem to link anywhere useful from our pre-defined links so removing/#ing those shouldn't lose us any functionality. BLongley 13:22, 18 Jan 2008 (CST)
Anyway, I think we're almost done with this lot - there's a few where I'm waiting for the verifier, but otherwise there's nothing more I'd do to the remaining titles except add notes, a '#' to stop the error messages, and swap Working and Bad ISBNs if that's the way we want to go. Which originally we had 3 quick supporting votes for, but now we have 3 other people not supporting it but not making it clear what they DO want. I think we need a decision before we can refresh this page and have a go at the NEXT lot: #ing bad numbers would make them undetectable, so this would be our one chance to do them as a project, rather than as and when we encounter them. Or maybe "Bad ISBN, UN-#ed", in the Catalog ID field is desired, in which case the scripter needs to know that we have decided that's fine and he can omit the checked ones from future project refreshes. BLongley 13:22, 18 Jan 2008 (CST)


I lost track - was there a consensus on if the derived or corrected ISBN should go in the ISBN/Catalog # field or if we leave the original data there and include a publication note explaining either the correct or derived ISBN? Marc Kupper (talk) 18:18, 21 Jan 2008 (CST)

Me too! I can't detect anything approaching a consensus in the comments above and I suspect that one is not going to happen. Bill started this discussion, is he going to make the final call? Thx, Bob rbh 18:34, 28 Jan 2008 (CST)
I'm beginning to think there will NEVER be a final resolution on ANYTHING here. :-( The only votes I've seen that actually get anything resolved are for Moderator status, which seems to be final, and lead to fruitless Cat-herding attempts. I'll review this again tomorrow - well, "soon" at least - and update MY pubs at a minimum. My current feeling is that we have 4 votes FOR, none AGAINST, and although the others ("Don't really like either option so I'm going to post comments and leave it up to someone else") might outnumber the decisive, they aren't really helping much. So if anyone DOES have an opinion, or a clearly expressed "Other", post it sharpish please! BLongley 18:48, 28 Jan 2008 (CST)
Since I am resolutely and adamantly undecided on this issue (and no, you can't get me to change my mind!), I figured I had little useful to contribute. I suppose the ultimate solution that would make everybody happy would be to split the ISBN field into two, one for "ISBN as stated in the publication" and one for "Corrected ISBN to be used for search purposes" (to be populated only when the stated ISBN is bad), but it sounds like overkill for a relatively minor issue. Coding-wise, I can adjust my script either way once we have made a decision. Ahasuerus 20:55, 28 Jan 2008 (CST)
Keep in mind that votes don't resolve moderator status. Thx, rbh (Bob) 21:40, 28 Jan 2008 (CST)
I keep waffling on the ISBN issue. For example, with many books I can derive the ISBN from the publisher's code and find that Amazon has a record. A similar issue is a publication that has catalog #s like Y7239 on the cover/spine but an SBN or even an ISBN elsewhere such as the copyright page. On the other hand, I really would prefer to input the "as stated" value in the ISBN field and will use # prefixed ISBNs if it's invalid but add notes explaining what's stated and the correct ISBN.
Ahasuerus, I thought about two ISBN fields for my personal book DB, tried it briefly, and went with one field but also allow entry of multiple codes. I separate them with semicolons as I've never seen a publisher use that as part of their publication coding string. One of the things I support is "SBN #####" meaning that when a publication has an SBN I can enter it "as is" and the code knows about the various SBN formats (some end in the price, others have the ISBN check digit, etc.) and can construct an ISBN and resulting Amazon links from that. I have thought about expanding this to allow for "Signet W####", "DAW Ux####" etc. and putting in parsers that generate the ISBN. I'm already entering catalog #s in that format in my own DB and so it's a matter of adding the parse code. FWIW - I do have a second ISBN field called "bar code" that only contains the bar code contents. I've been recording this so that at some point I'll be able to establish when various publishers introduced bar codes. Marc Kupper (talk) 01:53, 6 Feb 2008 (CST)
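A rough Python sketch of the kind of handling Marc describes: several codes in one field separated by semicolons, a * marking an ISBN inside free text, and a check digit computed for a nine-digit stem. Marc's own code is not shown anywhere in this discussion, so the function names, the regular expression and the assumption that * is immediately followed by a 10-character ISBN are all illustrative guesses rather than a description of his database.

```python
import re


def isbn10_check_digit(stem: str) -> str:
    """Check character for the first nine digits of an ISBN-10."""
    if len(stem) != 9 or any(c not in "0123456789" for c in stem):
        raise ValueError("stem must be exactly nine digits")
    # Weights run 10 down to 2 across the nine digits.
    total = sum(w * int(c) for w, c in zip(range(10, 1, -1), stem))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def split_codes(field: str) -> list[str]:
    """Split a semicolon-separated field into its individual entries."""
    return [part.strip() for part in field.split(";") if part.strip()]


# Assumption: '*' is immediately followed by a 10-character ISBN.
STARRED_ISBN = re.compile(r"\*([0-9]{9}[0-9Xx])")


def starred_isbns(field: str) -> list[str]:
    """Return every ISBN that a '*' marker points at within the field."""
    return [m.upper() for m in STARRED_ISBN.findall(field)]
```

With the Signet example from the vote above, the stem 045105754 gives check digit 6, matching 0-451-05754-6. How a printed code such as 451-Y5754 maps to that stem is publisher-specific and is not attempted here.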

ISBNs that fail checksum validation

ISBN fields that don't start with '#', and aren't 10 digits long

Done, but should probably be done again to see what's been missed or is new.

Are you sure? There's a LOT... :-/ Even taking out "None", "No ISBN", "N/A", 13-digit ISBNs, and ISSNs or things that Look like ISSNs, we have these: feel free to break them up into more manageable chunks. (And when fixing, please mark your edits as "Minor" so the rest of us can filter them out, please?) BLongley 13:26, 4 Jan 2008 (CST)
Do you want me to mark the pubs that I've fixed with the "Minor" edit or this list? Confused :-/ Kraang 19:18, 4 Jan 2008 (CST)
Mark edits to this page as Minor so we can filter them out of "Recent Changes" - e.g. I've updated it 8 times so far today and marked them all minor - but you can stop those edits cluttering your "Recent Changes" page with one click. BLongley 09:53, 5 Jan 2008 (CST)
I've wondered what the "m" in Recent Changes was for, now I know. Learning new things all the time. Thanks! Kraang 20:49, 5 Jan 2008 (CST)
The original list found in this section was generated by Al a long time ago and accounted for only a small subset of our "problem records". I have a much more complete list, but it's huge and I wanted to split it into multiple pages, probably one letter per page, before posting it here. I'll need to rewrite the script to account for 13 digit ISBNs first, though. Ahasuerus 13:46, 4 Jan 2008 (CST)
Broke this list up a bit so I don't have to scroll forever when editing. Dana Carson 22:32, 6 Jan 2008 (CST)
Wise move - and if people DELETE stuff rather than "mark it fixed" it helps too. Nobody's counting who fixed what. And marking edits as "Minor" does help those few of us that still try to see EVERYTHING happening. BLongley 17:28, 7 Jan 2008 (CST)

(Unindent) Well done all! I've left the last two to the verifier - who is also the person that will provide the next backup, after which we can check the NEXT batch! BLongley 17:14, 13 Jan 2008 (CST)
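For the next refresh, the selection rules mentioned in this section (skip anything starting with '#', skip "None" / "No ISBN" / "N/A", accept 13-digit ISBNs, leave out things that look like ISSNs) might look roughly like this in Python. This is not Al's or Ahasuerus's script; the function name, the category labels and the ISSN pattern are assumptions made for illustration, and real field contents will probably need more cases.

```python
import re

# Placeholder values that mean "no ISBN" (compared after upper-casing).
NO_ISBN_MARKERS = {"", "NONE", "NO ISBN", "N/A"}
# Assumed shape of an ISSN: four digits, hyphen, three digits, digit or X.
ISSN_LIKE = re.compile(r"[0-9]{4}-[0-9]{3}[0-9X]")


def _isbn10_ok(s: str) -> bool:
    if not re.fullmatch(r"[0-9]{9}[0-9X]", s):
        return False
    values = [10 if c == "X" else int(c) for c in s]
    return sum(w * v for w, v in zip(range(10, 0, -1), values)) % 11 == 0


def _isbn13_ok(s: str) -> bool:
    if not re.fullmatch(r"[0-9]{13}", s):
        return False
    # Weights alternate 1, 3, 1, 3, ...; the sum must be divisible by 10.
    return sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(s)) % 10 == 0


def classify(field: str) -> str:
    """Sort a Catalog ID value into 'skip', 'ok', 'bad checksum' or 'wrong length'."""
    raw = field.strip().upper()
    if raw.startswith("#") or raw in NO_ISBN_MARKERS or ISSN_LIKE.fullmatch(raw):
        return "skip"
    digits = raw.replace("-", "").replace(" ", "")
    if len(digits) == 10:
        return "ok" if _isbn10_ok(digits) else "bad checksum"
    if len(digits) == 13:
        return "ok" if _isbn13_ok(digits) else "bad checksum"
    return "wrong length"
```

Note that once a bad number has been given a leading '#', a filter like this no longer reports it, which is the "undetectable" concern BLongley raises above.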