Stabilizing Bibliographic Data

From ISFDB

This page is a proposed guideline for how the data in the ISFDB database progresses towards a stable state. For a discussion of these guidelines, see the talk pages; updates to this page should be restricted to modifications and improvements to the guidelines.

Definition of terms. The following terms are used in these guidelines.

  • Editor. Any user who submits a modification to the ISFDB data.
  • Moderator. Any user with privilege to choose to accept or reject data submissions.

Overview

The data in the ISFDB is of two types. First, there are individual publications. “Publication” is the ISFDB term for a physical, published entity. It can be a book, a magazine, an eBook, a chapbook, or perhaps even a fanzine. In every case, it is an object that can be obtained and examined, and whose information can be verified.

Second, there are relations between publications. There are authors, who may have written many books; sometimes authors use different names on different works, while there may also be two authors with the same name. Works have titles, which can vary; they can be revised, or completely rewritten. Stories can be parts of series. Stories, novels, magazines and authors can win awards. All of this data is verifiable, but in a weaker sense than for publications, which can after all be physically examined. There are degrees of certainty for this data. Is “Masters of the Vortex” really part of the Lensman series? Is “Sunken Universe” the same story as “Surface Tension”?

These two types of data are what the ISFDB wants to record. The ISFDB is now open to editing by users, which means that all of the data can be modified. All modified data must be validated first by moderators, of whom there are currently four. However, it is clear that the goal of the ISFDB is not endless modification of the records, which is likely to be the case for Wikipedia, for example; instead the goal is that when a record is “correct”, it should be stable and should never change again, unless perhaps new attributes are added to the database. “Correct” here is clearly defined for “publications”, and will generally be clear for authors, titles, pseudonyms and so on; it may be a subject for debate on occasion. It is still true, however, that once a consensus is reached that data is correct, it should not change again.

This guideline defines how this is achieved.

Enhancements required to support these guidelines

Two enhancements to the ISFDB are required to support these guidelines: a "correctness flag" on the publication record, and a link on the edit and moderation screens to a Wiki page for each publication.

Correctness flag

A verification flag has been proposed for the ISFDB and is in the roadmap for June. This guideline assumes the existence of a similar flag, called the correctness flag, on the publication record. Setting the flag indicates that there is reason to believe the data is correct. This guideline addresses when the flag should be set, and how edits to a record should be handled when the flag is set and when it is not set.

It may be desirable at some stage to use the correctness flag in displaying ISFDB data, to help readers to assess reliability. Such uses are outside the scope of this guideline.

Publication wiki page

Each publication will have a wiki page. The title will be taken from the publication's tag. For example, the 1952 Avon edition of James Blish's Jack of Eagles has a tag of JCKOEG1952. The Wiki page for discussion of this publication would then be Publication:JCKOEG1952. This is not easy to interpret, but unfortunately text based on the title or author would not be sufficiently specific. The contents of the page will be notes on what sources editors have used for the data entered for this publication, and also notes on what verifications have been done against other bibliographic sources. It is unnecessary to record that, for example, a given publication matches the entry in Tuck. The goal of the ISFDB is not to perform and track a validation of other bibliographies. However, any discrepancies with major bibliographic sources should certainly be recorded here; this will be useful for bibliographic researchers and, more importantly for the ISFDB, it is information a moderator may need in order to resolve conflicting edits.

A link to the publication Wiki page should appear in the ISFDB in three places: when displaying the publication; in the edit screen for the publication; and in the moderate screen for the publication. While the structure of the Wiki page is not specified, the first line of text should be a reference to the publication, such as:

 Bibliographic comments for JCKOEG1952.

This will allow readers who enter the Wiki page from search engine results to find the associated database entry, and also serves as a permanent link if the publication tag is somehow erroneously altered.
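The naming convention above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical and are not part of the ISFDB code base; the only facts taken from this guideline are the "Publication:" prefix and the wording of the suggested first line.

```python
def publication_wiki_title(tag: str) -> str:
    """Return the wiki page title for a publication tag (hypothetical helper)."""
    return f"Publication:{tag}"


def publication_wiki_first_line(tag: str) -> str:
    """Return the suggested first line of the publication's wiki page."""
    return f"Bibliographic comments for {tag}."


# Example using the Jack of Eagles tag quoted in this guideline.
title = publication_wiki_title("JCKOEG1952")
first_line = publication_wiki_first_line("JCKOEG1952")
```

Deriving both strings from the tag alone keeps the page name and its first line in step, which is what lets the first line act as a fallback link if the tag is later altered.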

Creating and modifying publication records

When you create a new publication record, where possible you should use an actual copy of the publication. If you do, also check "Correct", as this is the best possible verification. When you have submitted the record, please also follow the displayed link to the publication's Wiki page and add a note there saying something like "Entered from actual copy", and sign your name using four tildes: ~~~~.

If you are creating a record by using a bibliographic resource, such as Currey, Tuck, or Reginald, you may also set the correctness flag. If you are using a resource that does not supply some of the bibliographic data, please do not check "Correct". For example, the Nicholls/Clute Encyclopedia, while very reliable, does not generally give publisher or price for edition information, and data entered from this source should not be marked correct.

Note that in some cases, the copy you have in front of you may not supply all the information that the ISFDB records for a publication. Sometimes the actual date of publication isn't apparent; this is quite often the case with reprints. It's also often the case that there is a cover picture but no way to tell who the artist is. In these cases, leave the date blank, and either leave the artist blank or put "Unknown" in the artist field. The "correct" flag in this case means that you have verified the data from the publication; it does not necessarily mean that the data is complete.

There are also some fairly rare cases where the data on the publication is actually wrong. At least one of Fritz Leiber's novels has his name printed as "Fritz Lieber", for example. In these cases, enter the data as it is shown on the publication -- but make a note on the associated publication wiki page. The only exception is for the date, which is intended to record the actual date of publication. If there is clear evidence that the printed date on the publication is incorrect (this is rare, but not unheard of), then correct the date and cite your sources on the publication's wiki page.

Generally, the rule is "Enter what you see, not what you know." What you know gets documented on the wiki page, if it differs. See Help:Screen:EditPub for more detailed information about what to enter.

It is preferable not to create a publication record at all if you do not have some bibliographic source to work from. If you know (from memory), for example, that there was a variant edition of In Viriconium by M. John Harrison published in the US with the title The Floating Gods, please don't create a publication of that name unless you can find some documentary reference to it. In a situation like this it would be better to edit the Author:M. John Harrison page and add a note that this title is missing. Even better would be to try to locate a description of the book on the web. Resources such as eBay and second-hand book aggregator listings can be used to locate book descriptions for publication data entry, so long as the resulting records are not marked correct.

If you do decide for whatever reason to add a publication without a source, please do not set the correctness flag.
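The rules in this section about when an editor may check "Correct" can be summarised as a small lookup. The following Python sketch is an assumption-laden illustration only: the SourceKind categories and the may_mark_correct function are hypothetical names invented here, and the grouping of sources into categories is one reading of the text above.

```python
from enum import Enum


class SourceKind(Enum):
    """Hypothetical categories for where a new record's data came from."""
    ACTUAL_COPY = "actual copy of the publication"
    FULL_BIBLIOGRAPHY = "bibliography supplying full data (e.g. Currey, Tuck, Reginald)"
    PARTIAL_REFERENCE = "reference lacking some data (e.g. Nicholls/Clute Encyclopedia)"
    ONLINE_LISTING = "web listing (e.g. eBay, second-hand book aggregators)"
    NO_SOURCE = "memory only, no documentary source"


def may_mark_correct(source: SourceKind) -> bool:
    """Return True if this guideline allows the correctness flag to be set."""
    # Only an actual copy or a bibliography that supplies the full
    # bibliographic data justifies the flag; everything else does not.
    return source in {SourceKind.ACTUAL_COPY, SourceKind.FULL_BIBLIOGRAPHY}


# Example: a record entered from an online listing stays unmarked.
listing_ok = may_mark_correct(SourceKind.ONLINE_LISTING)
```

Note that a True result only says the flag may be set; the editor is still expected to record the source on the publication's Wiki page.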

When moderating, if a record does not have the correctness flag set, then any reasonable-looking change should be permitted. A moderator is not obliged to verify every record; if they have a copy to hand or a bibliographic reference, they are free to check it, but because of the likely volume of records to be moderated it is not realistic to expect this to happen very often.

If a submission sets the correctness flag, the moderator should check the Wiki page for that publication. If there is no information there about why the flag is being set, then the moderator should consider rejecting the change. If the moderator wishes to check the data against bibliographic sources they have access to, they can do so; in that case they should update the Wiki page themselves.

Issues with editing existing records

If a submission edits a record that was already marked correct, then the moderator should again check the Wiki page to see if a justification is given. If the edit simply adds data that was omitted before -- such as the name of the artist -- then it is OK to accept the edit without any entry on the Wiki page. If the submission modifies data, and there is no justification given, it should be rejected unless the moderator has the ability to check the data. In that case a Wiki page entry should be made. If the submission is supported by information on the Wiki page indicating what the source was for the data, the moderator must determine whether to allow the change. Generally the change should be permitted if the reason is documented; subsequent Wiki discussion can address any issues with conflicting sources or variant editions.

If a submission turns off the correctness flag, the moderator should permit this if the record is obviously deficient in some way -- missing important data, such as publisher or price, for example. If there is no obvious problem, the change should be rejected unless there is supporting information on the Wiki page for the publication.
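The moderation rules, from the handling of unflagged records through to submissions that turn the flag off, can be read as a decision procedure. The Python sketch below is one possible reading, not a description of the ISFDB software: every name in it (Submission, Decision, moderate, and all field names) is hypothetical, and the final rule, rejecting a change that does not look reasonable, is an assumption that the guideline implies but does not state.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    CONSIDER_REJECTING = "consider rejecting"


@dataclass
class Submission:
    """Hypothetical summary of what a moderator knows about a submission."""
    record_marked_correct: bool = False      # flag state before the edit
    sets_flag: bool = False                  # edit turns the flag on
    clears_flag: bool = False                # edit turns the flag off
    only_adds_missing_data: bool = False     # e.g. adds an omitted artist
    wiki_has_justification: bool = False     # source noted on the Wiki page
    moderator_verified: bool = False         # moderator checked a source
    record_obviously_deficient: bool = False # e.g. no publisher or price
    looks_reasonable: bool = True


def moderate(s: Submission) -> Decision:
    # Turning the flag off: allowed for deficient or documented records.
    if s.record_marked_correct and s.clears_flag:
        if s.record_obviously_deficient or s.wiki_has_justification:
            return Decision.ACCEPT
        return Decision.REJECT
    # Editing a record already marked correct.
    if s.record_marked_correct:
        if s.only_adds_missing_data:
            return Decision.ACCEPT
        if s.wiki_has_justification or s.moderator_verified:
            return Decision.ACCEPT  # "generally", per the guideline
        return Decision.REJECT
    # Setting the flag on a record not yet marked correct.
    if s.sets_flag:
        if s.wiki_has_justification or s.moderator_verified:
            return Decision.ACCEPT
        return Decision.CONSIDER_REJECTING
    # Unflagged record: permit any reasonable-looking change (assumed
    # converse: reject one that does not look reasonable).
    return Decision.ACCEPT if s.looks_reasonable else Decision.REJECT
```

For example, moderate(Submission(sets_flag=True)) yields Decision.CONSIDER_REJECTING, because nothing on the Wiki page explains why the flag is being set. Whenever a decision rests on moderator_verified, the guideline also expects the moderator to add a Wiki page entry, which this sketch does not model.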

Other record types

The same principles apply to other record types. However, correctness flags cannot be set for these other record types. They may be introduced at some future stage, but the publication data, being both easy to verify and clearly primary data, is the only data for which this guideline is proposed.