Help:Screen:Moderator

From ISFDB
Revision as of 14:52, 11 February 2022 by Ahasuerus (talk | contribs) (→‎US publishers of Light Novels: fmt)
This page is a help or manual page for the ISFDB database. It describes standards or methods for entering or maintaining data in the ISFDB database, or otherwise working with the database. Other help pages may be found via the category below. To discuss what should go on this page, use the talk page.

If, after exploring the Help system, you still have a question, please visit the Help desk and let us know. We probably know the answer, but we need your help to know what we left out of the help pages.

If you are new to editing the ISFDB, please see Help:Getting Started.



Moderator help

ISFDB moderators have access to a "Moderator" link on the navbar that is not available to editors who are not moderators. Clicking on this link displays a list of all submissions that have not yet been approved or rejected. The list shows which editor submitted the record, and when, and gives some basic information about the submitted edit. Here is a sample snapshot of the moderator queue to give you an idea of what it looks like:

New Submissions

Help on moderating: Help:Screen:Moderator

Submission  State                    Type         Date/Time            Submitter  Subject
248711      ON HOLD (Marc Kupper)    PubDelete    2007-02-03 09:27:14  Rudam      Songs the Dead Men Sing
248731      ON HOLD (Marc Kupper)    PubDelete    2007-02-03 09:29:12  Rudam      Nightflyers
249491      ON HOLD (Mike Christie)  NewPub       2007-02-03 13:58:50  BLongley   Times Without Number
249501      ON HOLD (Mike Christie)  NewPub       2007-02-03 14:06:58  BLongley   The Evil That Men Do / The Purloined Pla
249511      N                        NewPub       2007-02-03 14:23:52  BLongley   More Things in Heaven
249521      N                        TitleUpdate  2007-02-03 14:25:41  BLongley   More Things in Heaven (rev 1973)
249531      N                        NewPub       2007-02-03 14:29:04  BLongley   Into the Slave Nebula
249541      N                        NewPub       2007-02-03 14:32:24  Chris J    The English Way of Death
249551      N                        TitleUpdate  2007-02-03 14:35:29  BLongley   Into the Slave Nebula (rev 1968)
249561      N                        PubUpdate    2007-02-03 14:36:30  Hall3730   Troll Fell
  • The Submission column shows the submission ID and links to pages that show the details for each type of submission. Some submissions are color-coded as follows:
    • Your own submissions are blue
    • Submissions created by other moderators are yellow
    • Submissions created by editors with fewer than 20 Wiki edits are green
  • The State column shows “ON HOLD” for submissions that a moderator is researching and “N” for new submissions.
  • The remaining columns provide an overview of the type of submission, when it was submitted, by whom, and the name or title of whatever is being submitted.

Moderators usually start with the first New entry and click on the submission ID to get to a more detailed display, showing exactly what information will be changed by the edit. On this screen the moderator can choose to "Approve," put on "Hold," or "Reject" the submission.

  • Approved submissions are applied to or "integrated" into the ISFDB database, meaning the editor's changes are now visible to the public.
  • Rejected submissions are flagged as “rejected” and can be viewed in the editor's rejected list. Moderators also have access to a list of all recently rejected submissions. As part of the reject, a moderator can enter a message explaining why the item was rejected, which is visible in both the editor's and the general moderator reject lists.
  • Submissions on hold remain visible in the moderator queue with a state of "ON HOLD". Usually the holding moderator then initiates a conversation with the editor who submitted the item to better understand what the editor was trying to do. Once things are sorted out, the submission is approved or rejected. Once a submission has been put on hold, only the holding moderator can approve or reject it. If the holding moderator decides not to work on a submission that they have on hold, they can "unhold" it and its status will change back to "New". If the holding moderator becomes unavailable and can't work on a held submission, an ISFDB bureaucrat can remove the hold.
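
The approve/hold/reject rules above can be summarized as a small state machine. The following Python sketch is only an illustration of those rules: the `Submission` class, the function names and the internal state strings "APPROVED" and "REJECTED" are hypothetical and do not reflect the actual ISFDB code or database schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Submission:
    """Hypothetical in-memory stand-in for a queue entry (not the ISFDB schema)."""
    sub_id: int
    state: str = "N"              # "N" (new), "ON HOLD", "APPROVED" or "REJECTED"
    holder: Optional[str] = None  # moderator currently holding the submission
    reject_reason: str = ""


def _check_can_close(sub, moderator):
    # Only unprocessed submissions can be approved or rejected.
    if sub.state not in ("N", "ON HOLD"):
        raise ValueError(f"submission {sub.sub_id} is already {sub.state}")
    # Once held, only the holding moderator may approve or reject.
    if sub.state == "ON HOLD" and sub.holder != moderator:
        raise PermissionError(f"submission {sub.sub_id} is held by {sub.holder}")


def hold(sub, moderator):
    if sub.state != "N":
        raise ValueError("only new submissions can be put on hold")
    sub.state, sub.holder = "ON HOLD", moderator


def unhold(sub, moderator, is_bureaucrat=False):
    if sub.state != "ON HOLD":
        raise ValueError("submission is not on hold")
    # The holder can release the hold; a bureaucrat can remove anyone's hold.
    if sub.holder != moderator and not is_bureaucrat:
        raise PermissionError(f"submission {sub.sub_id} is held by {sub.holder}")
    sub.state, sub.holder = "N", None


def approve(sub, moderator):
    _check_can_close(sub, moderator)
    sub.state = "APPROVED"


def reject(sub, moderator, reason=""):
    _check_can_close(sub, moderator)
    sub.state, sub.reject_reason = "REJECTED", reason
```

In this sketch, a second moderator calling `approve` on a held submission gets a `PermissionError` until the holder (or a bureaucrat) calls `unhold`.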

General guidelines

Generally, with all these submissions, a moderator's level of experience with a particular editor may change the level of verification they do. If you know that a given editor makes few mistakes, you may (for example) choose to let through a DeletePub of a publication listed under the wrong title without waiting for the corresponding publication to be added under the correct title. This is a judgement call for each moderator based on their experience.

If you see typographical errors in a submission, then you have several options.

  • If the errors are bad enough that they would seriously damage the effect of the data entry, you may choose to reject the submission. This is rarely the case, but is included here for completeness.
  • You can approve the submission and leave the editor a talk page note asking for corrections.
  • You can approve the submission without leaving the editor a note. This might be appropriate if the typos are in new data (such as a print history note in the notes field) and are comprehensible. As moderator, you don't have to leave everything completely clean and final; but you mustn't let through edits that make the data worse. Errors that anyone can correct when they see them can be left for other editors if you wish.
  • You can approve the submission and correct the errors yourself. If the correction is a multi-stage process, it is more polite to make the correction yourself, as it is tedious for an editor to have to wait for approvals.

If an editor makes an edit that is not ideal, but does not make the data any worse than it was, it is best to err on the side of approval. For example, suppose an editor adds a note to a publication saying that the title is "#3 in the Lords of Time series". Series information is handled by the series fields on the title records, so this edit is not the best way to record this information. However, no harm is done, and the information is being added for other editors to see. Another editor could use this note to help construct the series information the correct way. So an edit like this should be approved. If you have time, you should consider leaving a note for the editor letting them know about series, so they can correct it if they wish. You may also choose to fix it yourself.

Evaluating submissions

Here are some things to consider for the various types of submissions. The notes are organized by "submission type".

AuthorMerge

Note: At this time Author Merge submissions can only be created by ISFDB moderators. The discussion below is retained in case non-moderators are allowed to create Author Merge submissions again.

When two authors are merged, the "target" author is the one that is preserved, and the "source" is the one that will be merged into the target. An author merge will change every reference to the source author, anywhere in the database, to point to the target author instead. The source author will no longer exist in the ISFDB.

An AuthorMerge has the potential to do a great deal of irreversible damage to the ISFDB. If someone merged "Isaac Asimov" with "Robert A. Heinlein", hundreds of hours of work would be destroyed; it would be necessary to recover those records from a backup of the database.

To verify an AuthorMerge, check the source author record. The author_id number will appear on the merge review screen. You can use the Author Search Form on the advanced search screen to search for the source author name; the record id is displayed on the resulting screen so you can check you have the right one. Clicking on the author name will display the author's bibliography.

If you have reason to believe that every publication and title listed really should be listed under the target author, then the merge can go ahead. Reasons to reject an AuthorMerge include:

  • The editor has mistakenly merged authors when in fact the author published under different forms of their name. Algis Budrys published as both "Algis Budrys" and "A. J. Budrys"; these should not be merged. Similarly, "Robert Heinlein" is a different author record from "Robert A. Heinlein".
  • There is a mis-spelling in the source author's name, but that mis-spelling was in fact used on a book. For example, Fritz Leiber's "Night's Black Agents" was published as by "Fritz Lieber".
  • There are any publications verified under the source. Verification can be mistaken, but always requires some research before overwriting.

Other things to consider:

  • It's useful to look at the titles and publications listed under the source. Are there versions of those titles already under the target name? If a publication is duplicated under both names, one of them has to be in error.
  • Can you find records on the web of the alternate spelling? Amazon and used.addall.com are useful, but can't be relied upon. However, if multiple sources agree on an alternate spelling, it may be real.
  • How many publications are there under the source name? If there are two or more, it is wise to be cautious, as multiple entries are more likely to imply that the data is real. Sometimes there are none: this can happen, for example, if Awards data is recorded under the incorrect name.

If there are doubts, place the submission on hold and ask the submitter why they are doing the merge, and if they have seen the publications in question.

AuthorUpdate

The AuthorUpdate submission is currently the only one for which full history is preserved. That means that overwritten data can be quite easily recovered, which in turn makes it a little less risky to approve updates to the data.

If an AuthorUpdate simply adds data to the record, without overwriting anything, it should be approved unless there are obvious problems with the data. Updates should be looked at for common-sense errors, such as typos or format errors in web addresses, or incorrect forms of the Legal Name field. Where possible, the updated data should be quickly checked -- for example, try bringing up the web address given, or the Wikipedia page.

DeletePub

The DeletePub submission includes a reason field filled in by the submitter. Typical reasons include:

  • Duplicate publication
  • Under the wrong title; deleting in order to re-add under another title
  • Bogus publication -- never existed
  • Added by mistake (this will usually be one of the three reasons listed above).

Each of these is a valid reason, but can be invoked in error by editors not fully familiar with ISFDB rules. For example, editors may think two publications are duplicate when in fact they are different printings of the same edition of a publication.

For this reason, DeletePub should usually be researched to be sure the editor has not made a mistake. When a record is being moved out from under an incorrect title, via an Add of the new publication and a DeletePub of the old one, it's preferable for the editor to create the new version first, so that the moderator can view the new version. If they don't, you may choose to hold the DeletePub until the new one is entered, or approve it but keep an eye on the relevant titles and make sure the new publication is eventually added.

DeleteTitle

DeleteTitle is not a particularly dangerous kind of deletion, since it can only be done if there is little evidence in the ISFDB that it's a mistake to delete that title. This is because the ISFDB will not delete a title if any publications refer to it.

However, it can still be submitted in error, so research is in order. A reason should be given by the submitter. Typical reasons include:

  • It's a duplicate title. This is more often handled by a merge, but if it's truly a duplicate with no publications there is no harm in a delete.
  • It's a bogus title; a title that never existed and has been entered by mistake, perhaps as a result of a mistaken publication entry.

If the title to be deleted has any variant title references (either as a source or target), you should verify that these are either OK to destroy, or are preserved in the remaining titles. Similarly, if the title is referenced by Awards or Reviews, verify that these links will be preserved.

Make sure that there is a specific reason for deletion, not just "I've never heard of this story of Asimov's so I think it's bogus and am deleting it". Unless there is a clear reason for deletion that does not need research, such as "I just created this in error as a duplicate" then research should be done to see if references to that title exist. Contento, Locus1, and various print bibliographies are good resources for this kind of research.

If the title is a result of mistaken data entry, but a correct version of the title still needs to be entered, it may be possible to solve the problem by TitleUpdate instead. However, this is not a justification for a reject of the record; it's up to the editors how they organize their work. If you suspect the editor would like to know about a shortcut, by all means leave them a message on their talk page, however.

MakeVariant

A MakeVariant submission only updates a single field: the variant title pointer on the child record. These can generally be approved without research if they look correct. However, there are some authors for which the network of variant titles is extremely complex, and if the author involved is one of these, you may wish to do some extra checking.

The variant title tree is intended to have one parent, with all variants the child of that root title. If you suspect that the parent title in a MakeVariant is not the correct root title, check the data on the ISFDB to see what the canonical name should be.

NewPub

New publications create new data of three kinds: a new publication record, a new title record that pairs with the publication, and additional content title records. Cloning a publication will generate a NewPub submission as well; the difference is that a clone will note "Automerge" against some or all of the content records. The "Add Publication to this Title" tool will also create a NewPub record; in this case the title record for the publication as a whole will be automerged but the content records will not.

For each of these possibilities the editor can change the title and author on the publication record. An editor unfamiliar with the rules for variant titles might therefore submit an AddPub or ClonePub for a variant title or variant form of the author name. For these, you have two options:

  • Reject the record and request a corrected resubmission
  • Accept the record and use Unmerge and Make Variant to correct the situation yourself.

You should generally not both accept the edit and then ask the editor to fix it, unless you are essentially "mentoring" the editor through this process, as it is a multi-step process to correct the data.

Other than this kind of error, the main concern is accuracy of data entry. Some invalid data can be spotted quite quickly -- is the ISBN field prefixed with a "#" for something that's obviously not an ISBN? Does the format field hold one of the standard codes? Other problems may not be so easy to spot. An editor may mistakenly enter a late reprinting using the copyright date instead of the printing date. However, if they enter a price too, it is often quite easy to spot this situation. A 1965 paperback priced at $4.50 is definitely an error of some kind. In these cases, you should hold the record, and communicate with the editor on their talk page, to try to resolve the questions.
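
As an illustration of the ISBN check mentioned above, here is a minimal Python sketch that classifies the contents of the ISBN field. It assumes that values starting with "#" are non-ISBN identifiers and that anything else should pass the standard ISBN-10 or ISBN-13 check-digit test. The function names and the returned labels are hypothetical; this is not how the ISFDB software itself validates the field.

```python
import re


def is_valid_isbn(value):
    """Return True if value passes the ISBN-10 or ISBN-13 check-digit test."""
    s = value.replace("-", "").replace(" ", "").upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        # ISBN-10: weights 10 down to 1, "X" stands for 10, sum divisible by 11.
        digits = [int(c) for c in s[:9]] + [10 if s[9] == "X" else int(s[9])]
        return sum((10 - i) * d for i, d in enumerate(digits)) % 11 == 0
    if re.fullmatch(r"[0-9]{13}", s):
        # ISBN-13: alternating weights 1 and 3, sum divisible by 10.
        return sum((1 if i % 2 == 0 else 3) * int(c) for i, c in enumerate(s)) % 10 == 0
    return False


def classify_isbn_field(value):
    """Label the field as 'empty', 'non-ISBN identifier', 'ISBN' or 'suspect'."""
    value = value.strip()
    if not value:
        return "empty"
    if value.startswith("#"):
        return "non-ISBN identifier"
    return "ISBN" if is_valid_isbn(value) else "suspect"
```

A value labelled "suspect" is only a prompt for a closer look, for example a mistyped digit or a catalog number entered without the "#" prefix.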

Incomplete data, however, is not an issue. If an editor fails to enter everything that we would like to see captured, this is not a mistake on their part. Of course we would like to encourage editors to enter everything, but as they are volunteer labour, it is much preferable only to intervene for mistakes of commission, not of omission.

Once a new publication has been entered, if it was not a ClonePub or AddPub, title merging is likely to be necessary to reconnect titles to the pre-existing versions of those titles. There is no need to police this process. If one editor doesn't follow up and merge the titles, another one eventually will do so. Ultimately this problem is likely to be solved by improved editing interfaces that allow all merges to be identified at data entry time, but for now we accept the piecemeal nature of this data entry.

PubUpdate

The data validation comments about NewPub apply here too. However, PubUpdate has a pitfall that should be watched for: updates to title records done through a pub update. This is almost always a mistake. The titles in a publication may have multiple other publications connected to them. Changing a title via PubUpdate (or via TitleUpdate) will change that title in every publication it appears in. If the title appears in only this publication, then there is no harm in the update. Hence if you see a title update, either to the primary title for the publication, or to any of the content records, you must check to see what other publications exist for that title.
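
The check described above amounts to asking which other publications share the title record being changed. A minimal Python sketch, assuming a hypothetical mapping from publication IDs to the title IDs they contain (the real data lives in the ISFDB database and is normally inspected through the title's display page):

```python
def other_pubs_sharing_title(title_id, current_pub_id, pub_contents):
    """Return IDs of publications, other than the one being edited, that contain title_id.

    pub_contents is a hypothetical dict mapping pub_id -> set of title_ids.
    """
    return sorted(
        pub_id
        for pub_id, title_ids in pub_contents.items()
        if pub_id != current_pub_id and title_id in title_ids
    )


# Hypothetical IDs: title 500 appears in pubs 1 and 2, title 501 only in pub 1.
pub_contents = {1: {500, 501}, 2: {500}, 3: {777}}
print(other_pubs_sharing_title(500, 1, pub_contents))  # [2] -> the change would affect pub 2
print(other_pubs_sharing_title(501, 1, pub_contents))  # []  -> no other pub is affected
```

An empty result corresponds to the harmless case; a non-empty result means the title change would propagate to the listed publications.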

If you find an error of this kind, the record should usually be rejected. However, if the data entry involved was significant, it may be preferable to accept the change and perform an unmerge on the title records involved, editing the prior title back to the original state, and creating variant title linkages as needed. In any case, always leave an explanation on the editor's talk page, since this error is quite damaging if it is not caught, and education is part of the solution.

TitleRemove

The basic need for TitleRemove is when a title has been added to a publication in error. It is wise to research the publication's parent title's other versions to determine whether that content item exists in other publications. Sometimes it will be obvious that the TitleRemove is correct -- if the record is obviously inappropriate for the publication.

It may be necessary to query the editor about the submission, but in many cases (deletion of introductions, prefaces and so on), if the moderator doesn't have a copy of the book in front of them, this is a situation where the editor's word that the title is not there should suffice. Unless the removed title looks like something integral to the book, it's usual to accept these submissions. If you suspect a mistaken interpretation of the rules by the editor, a message on the talk page is the next step -- for example, interior art can legitimately be entered either as a single record or as multiple records for the same artist, each one identifying a page. Deleting the multiples is not technically wrong, since the single-record approach is correct too, but should not be done without good reason.

TitleRemove is often used after cloning a publication that is close to what is needed, but not identical. For example, a cloned publication done to create a later reprint may have a different introduction, written by another writer. In this case the old introduction must be removed and the new one added. If it's apparent that the TitleRemove requests are part of a clone/update process, it's generally OK to trust the editor unless obvious problems are seen; after all, the TitleRemove is being applied to a publication that that editor created.

TitleMerge

A TitleMerge that shows no differences between the two titles can generally be approved with no research by the moderator. The only situation in which this would be a mistake would be if there were two identical titles for the same author that were actually different in some way. This is extremely rare, and any such situation should be recorded in the title notes for one or both titles.

Any record for which the title, author, and title type all match is likely to be a valid merge. In these cases the moderator should check that the merge has been done to retain the correct version of any additional data, such as date, series information, notes, or variant title pointers. Some research may be necessary to check this.

Merges which differ in title, author or type require research before validation. If the merge is of multiple titles, only one of which is incorrect, it may be preferable to approve the merge, and then unmerge the incorrect merge, rather than reject the merge. In either case, leave the editor a message to explain the situation, either via the reject reason or on their talk page.
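
The three cases above can be expressed as a simple triage function. This Python sketch is only a reading aid: the field names and the returned labels are hypothetical, and the actual decision still rests on the moderator's judgement.

```python
CORE_FIELDS = ("title", "author", "type")


def triage_title_merge(first, second):
    """Classify a two-title merge; first and second are dicts of title fields."""
    # Any difference in title, author or type calls for research first.
    core_diffs = [f for f in CORE_FIELDS if first.get(f) != second.get(f)]
    if core_diffs:
        return ("research", core_diffs)
    # Core fields match: check which version of the remaining data is retained.
    other_fields = sorted((set(first) | set(second)) - set(CORE_FIELDS))
    other_diffs = [f for f in other_fields if first.get(f) != second.get(f)]
    if other_diffs:
        return ("check retained data", other_diffs)
    # No differences at all.
    return ("approve", [])
```

For example, two records that agree on title, author and type but carry different dates would come back as `("check retained data", ["date"])`.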

TitleUpdate

Updates to the title, author, and title type are subject to the same concerns listed above, in PubUpdate: the title record may apply to multiple publications, so a lot of care must be taken to ensure this edit is correct.

Updates to other fields are usually unproblematic; though obvious questions should be asked -- if data is being deleted, for example.


Rejecting Bad Submissions

If a submission is so badly malformed that it doesn't display correctly and can't be rejected using the regular Approve/Hold/Reject options, it can be rejected using an undocumented script. The script, "hardreject.cgi", is not available in the navbar, but if you give it a submission number, e.g. http://www.isfdb.org/cgi-bin/mod/hardreject.cgi?123456, it will force submission number 123456 into the rejected state.
Another undocumented script "dumpxml.cgi" allows closer inspection of the XML blob for the submission: this can occasionally show why a submission is so badly malformed, e.g. if somebody has put an HTML tag in the page number field. Again, it takes the submission number, e.g. http://www.isfdb.org/cgi-bin/mod/dumpxml.cgi?123456 .
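
Both scripts take the submission number as the bare query string. A small Python helper for building the URLs, using only the base address and script names given above (the helper itself is hypothetical and simply formats the string; it does not contact the server):

```python
MOD_BASE = "http://www.isfdb.org/cgi-bin/mod"
MOD_SCRIPTS = ("hardreject.cgi", "dumpxml.cgi")


def mod_script_url(script, submission_id):
    """Build the URL for one of the two moderator scripts described above."""
    if script not in MOD_SCRIPTS:
        raise ValueError(f"unknown script: {script}")
    number = int(submission_id)  # rejects non-numeric input
    if number <= 0:
        raise ValueError("submission number must be positive")
    return f"{MOD_BASE}/{script}?{number}"


print(mod_script_url("dumpxml.cgi", 123456))
# http://www.isfdb.org/cgi-bin/mod/dumpxml.cgi?123456
```

Since hardreject.cgi changes the submission's state, it is worth inspecting the submission with dumpxml.cgi first.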

Moderators and the ISFDB Wiki

When you become a moderator, you also automatically become a Wiki "sysop", which gives you access to certain features of the Wiki that are not available to regular editors. These features differ depending on the version of the Wiki software, but the core one is the ability to block malicious users. This ability is very useful for large collaborative projects with thousands of contributors, e.g. Wikipedia, but it is rarely used within ISFDB. The only time that you are likely to use this feature is when a malicious Web bot tries to post spam links to online casinos, virus sites and the like. When this happens, you can use the "Block" function to block the account, but it is best to bring the issue up on the Community Portal since there are some nuances that have to do with blocking by name vs. blocking by IP address. (Note that the current version of the Wiki software no longer auto-blocks innocent users when you block a spambot account.)

There is a special Blocking Policy (see ISFDB:Policy), but it's rarely invoked except when permanently banning bot accounts as described above. Other Wiki features that moderators have access to include an expanded set of tools accessible from the "Special Pages" page (see the "toolbox" area on the left), including the following abilities:

  • see the contents of deleted pages
  • delete pages and uploaded images
  • protect and unprotect pages
  • edit protected pages

Suggested update (tentative) to this help section:
! One change that happens on Wiki pages listed on your "Watch List" and on the "Recent Changes" (on the wiki pages navigation bar) is the appearance of a red ! exclamation mark in front of wiki postings made by non-moderators. There are two goals for this tagging: to increase the likelihood that a bot posting Bad Things (tm) is noticed by a moderator, and to help moderators notice wiki changes by junior editors that may need additional guidance on such postings. (To search for these tags on "Recent Changes", search for "!<blank space>", as that will skip most other exclamation marks.)

Moderating Automated Submissions

General Instructions

ISFDB uses a number of robots which create submissions automatically or with limited human input. The major robots are Fixer (run by Ahasuerus), Dissembler (run by Alvonruff -- currently inactive), and Data Thief (originally created by BLongley -- currently inactive). Submissions created by Fixer are typically automatically put on hold on behalf of the moderator who requested them. If they are not on hold, they can be approved by any moderator. Submissions created by Dissembler are reserved by the robot maintainer.

Note that Fixer has an internal queue system, which it uses to prioritize ISBNs. The current status of Fixer's queues is available here.

Since automatically generated submissions can take longer to process and may require additional research, it is advisable to put them on hold before beginning your research. This will help avoid collisions with other moderators.

See Help:How to work with Records Built by Robots for an overview of the challenges presented by robotic submissions.

Note that submissions created by Fixer include a big warning in Moderator Notes when Fixer suspects that it's an import.

Publication Type Issues

The submitted publication type (NOVEL, COLLECTION, etc) may be incorrect. In most cases, robots have no way of telling whether a book is a collection, novel, anthology or even non-fiction, so they usually default all publication types to "NOVEL". Note that Amazon's subject headings may include the word "anthology", but it's not reliable because Amazon calls many single author collections "anthologies".

At this time Fixer assigns publication types as follows:

  • AddPubs: uses the type of the reference title
  • NewPubs:
    • Page count 150+: NOVEL
    • Page count <150: CHAPBOOK
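
The assignment rule above can be written out as follows. This is a sketch of the rule as described on this page, not Fixer's actual code; the function and parameter names are hypothetical, and the behaviour when a NewPub has no page count is an assumption, since the page does not say what Fixer does in that case.

```python
def fixer_pub_type(submission_type, page_count=None, reference_title_type=None):
    """Return the publication type assigned under the rule described above."""
    if submission_type == "AddPub":
        # AddPubs inherit the type of the title they are being added to.
        return reference_title_type
    if submission_type == "NewPub":
        if page_count is None:
            # Assumption: this case is not covered by the help text.
            raise ValueError("page count is required for a NewPub")
        return "NOVEL" if page_count >= 150 else "CHAPBOOK"
    raise ValueError(f"unexpected submission type: {submission_type}")
```

Because the rule looks only at the page count, a 300-page collection or anthology submitted as a NewPub still arrives as "NOVEL" and has to be corrected by the moderator.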

Cover Image Issues

Sometimes the submitted cover image may contain a "Cover Not Final" blurb. Other times it will contain a computer-generated image with the words "Note: This is not the actual book cover" (or similar) displayed at the bottom. When you come across these types of images, make sure to remove the image URL after approving the submission.

Notes Issues

Fixer-generated submissions always indicate where the data originally came from. They also provide additional information about the book as listed by the original source. Fixer performs a certain amount of normalization, e.g. if Amazon's "Edition statement" field reads "2", then Fixer will add "Second edition" to the Notes field. Similarly, "Mti" becomes "Movie tie-in edition", "Leather" becomes "Leather binding", "Lrg" becomes "Large print edition" and so on.

It is estimated that Fixer normalizes the Notes field for about 60-70% of all submissions, but there are many cases when its filters can't cope with the provided data, in which case the raw data is added to the Notes field. For example, if the Amazon-provided value of the Edition field is "Del Rey spec. ed.", Fixer has no way of telling that it stands for "Del Rey special edition", so it uses the raw value. When this happens, the Notes field has to be adjusted manually after approving the submission.

Sometimes Amazon uses the code "Rei", which Fixer changes to "Reissue". Most of the time, this information can be safely deleted from the Notes field since ISFDB should already have the original edition on file. However, when Fixer creates a "Reissue" submission and there is nothing in the database that matches the new publication, it's a good indication that ISFDB may be missing the original publication of the book.
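
The normalization described above behaves like a lookup table with a raw-value fallback. The Python sketch below contains only the five mappings mentioned on this page; Fixer's real pattern list is larger and is not documented here, and the function name and the exact-match comparison are assumptions made for illustration.

```python
# Only the mappings named in this help page; Fixer's full list is not shown here.
EDITION_NOTES = {
    "2": "Second edition",
    "Mti": "Movie tie-in edition",
    "Leather": "Leather binding",
    "Lrg": "Large print edition",
    "Rei": "Reissue",
}


def normalize_edition(raw_value):
    """Return (note_text, was_normalized) for an Amazon edition value."""
    key = raw_value.strip()
    if key in EDITION_NOTES:
        return (EDITION_NOTES[key], True)
    # Unknown values are passed through unchanged for manual cleanup.
    return (key, False)


print(normalize_edition("Mti"))                # ('Movie tie-in edition', True)
print(normalize_edition("Del Rey spec. ed."))  # ('Del Rey spec. ed.', False)
```

A result with `was_normalized` set to False corresponds to the case where the Notes field has to be adjusted manually after approval.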

If you see a pattern that Fixer could profitably add to its list of patterns, please contact the robot maintainer (Ahasuerus).

Also, on rare occasions a statement in the Publication Notes field applies to the title rather than to the publication, in which case it should be moved to the Notes field in the associated title record.

AddPub Issues

Some robots, including Fixer, compare the titles of newly discovered pubs with the title records that ISFDB already has on file before creating a submission. If there is a match, then the submission type is changed from NewPub to AddPub. Here is how the matching logic works:

  • Check the last ISFDB backup and find all title records with the same author(s) and title. Note that this check disregards anything to the right of the first colon or the first left parenthesis in the title of the new pub.
  • If there are no matches, create a NewPub submission
  • If there is more than one matching record on file (e.g. if there is a NOVEL and a COLLECTION record with the same title), create a NewPub submission
  • Otherwise there is only one matching title record in ISFDB, so create an AddPub submission
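
The matching steps above can be sketched in Python as follows. This is an illustration of the logic as described on this page, not Fixer's code: the record layout, the function names and the exact (case-sensitive) comparison are assumptions, and the title IDs in the example are made up.

```python
import re


def match_key(new_pub_title):
    """Keep only the text before the first colon or left parenthesis."""
    return re.split(r"[:(]", new_pub_title, maxsplit=1)[0].strip()


def choose_submission(new_pub_title, new_pub_authors, backup_titles):
    """Return ("AddPub", title_id) for exactly one match, else ("NewPub", None).

    backup_titles is a hypothetical list of dicts with the keys
    "title_id", "title" and "authors", taken from the last backup.
    """
    key = match_key(new_pub_title)
    authors = frozenset(new_pub_authors)
    matches = [
        rec for rec in backup_titles
        if rec["title"] == key and frozenset(rec["authors"]) == authors
    ]
    if len(matches) == 1:
        return ("AddPub", matches[0]["title_id"])
    # Zero matches or more than one match: fall back to a NewPub.
    return ("NewPub", None)


backup = [
    {"title_id": 1, "title": "The House of Doors", "authors": ["Brian Lumley"]},
    {"title_id": 2, "title": "Example Title", "authors": ["A. Author"]},  # e.g. a NOVEL
    {"title_id": 3, "title": "Example Title", "authors": ["A. Author"]},  # e.g. a COLLECTION
]
print(choose_submission("The House of Doors: Second Visit", ["Brian Lumley"], backup))
# ('AddPub', 1)
print(choose_submission("Example Title", ["A. Author"], backup))
# ('NewPub', None)
```

The first call shows why a colon in the new title deserves a second look: the subtitle is discarded before matching, so a sequel can be attached to the first volume's title record.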

There are a few possible problems with this logic:

  • Some sources do not always use the co-authors that ISFDB has on file. For example, when a major author like James Patterson collaborates with a young author, Amazon may only list Patterson as the book's author while ISFDB will list both authors. When this happens, Fixer creates a NewPub submission and a manual Title Merge needs to be performed after adjusting the author data.
  • Since Fixer typically uses the latest backup rather than the current ISFDB data to find matches, it's possible that a matching title record has already been added to the database but Fixer is not aware of it. When this happens, you will need to perform a manual Title Merge operation after approving the submission.
  • The reason why Fixer disregards anything to the right of the first colon or the first left parenthesis in the new pub's title is that it tries to ignore subtitles and series information, which often appear after the main title, e.g. A Feast in Exile: A Novel of Saint-Germain or Climb the Wind: A Novel of Another America. On occasion, this assumption may prove incorrect, e.g. Fixer may try to add Brian Lumley's The House of Doors: Second Visit, volume 2 in the House of Doors series, to The House of Doors title, Volume 1 in the series, since everything to the left of the colon matches. Because of this potential flaw in the logic, you may want to double check that the new pub matches the title record whenever the title of the new pub contains a colon or a left parenthesis. If an AddPub submission added a new pub to the wrong title record, you will need to perform an Unmerge operation and possibly a subsequent Title Merge operation.
  • The "disregard everything after the first colon/left parenthesis" logic doesn't always work if the title contains both a colon and a left parenthesis. When this happens, a NewPub submission is created and you will need to perform a Title Merge operation on the newly approved pub.

Additional Data That Needs to be Entered

When an automated submission is approved and results in the creation of a new title record, it may be necessary to enter additional information about the new title. This primarily affects the Series and Series Number fields, but if you know what genre the title belongs to, you are encouraged to add one or more appropriate tags. The tags don't have to be very detailed; even "science fiction" or "fantasy" is better than nothing.

When an automated submission results in the creation of a new author record, you are encouraged to research the author and enter any author-specific data you may find. Similarly, you are encouraged to research any new series, publication series and publishers and update their records.

Light Novels

"Light novels" were originally Japanese novels written for older children and teenagers, although their target audience later expanded to include some segments of the adult market. Light novels tend to have more illustrations than most novels, but they are not manga or graphic novels. Note that many light novels start out as Web serials and are then picked up by traditional publishers for paper/electronic publication, which often leads to a partial rewrite of the Web serial text. Some light novels also have manga (i.e. graphic novel) versions of the same story, which can be confusing, especially if they are published by the same publisher and use the same title. Sometimes a light novel will include a short (4-16 page) manga section at the beginning of the book.

Light novels became increasingly popular during the 2000s and 2010s and many of them have appeared in translation. Some authors from other countries have begun self-publishing similar "light novels" of their own, sometimes using Japanese-sounding pseudonyms.

US publishers of Light Novels

Seven Seas used to publish both manga and light novels, but this changed in mid-2021; now Seven Seas is a manga-only imprint while Airship is their light novel imprint. Similarly, Yen Press used to publish both manga and light novels, but they launched a separate light novel imprint, Yen On, some years ago.

J-Novel Club never uses ISBNs for e-books. Yen On always prints the ISBN of the paperback edition *and* the ISBN of the e-book edition on the copyright page. Airship puts the ISBN of the paperback edition on the copyright page, but it's not clear whether it's supposed to be shared with the e-book edition.

Cross Infinite World presents a unique challenge because it tends to buy English translation rights to previously unpublished Japanese light novels. In most cases the original Japanese text appeared online as a Web serial, often on the Web site "Shōsetsuka ni Narō" (syosetu.com).

Cleanup Scripts

These scripts won't actually clean up anything, but they will help you find things that need cleaning up. Some of them are hopefully temporary while we clean up the remaining data problems after a fix; others will need to stay around. For an explanation of the "ignore functionality", see Help:Screen:IgnoreCleanupRecords.

Here are the script author's views after the first and second wave:

  1. "Editor Records not in a Series" - If Magazine and Fanzine Editor records are placed in a series, then we get the benefits of Series Grids and suchlike. If you find any significant series, then please do check if the Magazine has an entry on the Magazines or Fanzines wiki-pages.
  2. "Variant Editor Records in a Series" - only the Canonical author should be in a series. At time of writing, this script returns zero results. Future "Variants in Series" will likely be forthcoming.
  3. "Missing Editors" - Magazines and Fanzines should have an Editor Record in them. At time of writing, this script returns zero results.
  4. "Interviews of Pseudonyms" - these should be on the Canonical Author's page, not the pseudonym's. The way to do this is to make the Interviewee the Canonical Author and, if the interview was done with the pseudonym, to indicate that in the title, e.g. as in this interview (319801).
  5. "Authors with invalid Last Names" - these are authors that don't fit into the author Directory because their official LAST NAME doesn't start with an "A-Z" character. There is a known bug with authors whose surname has an apostrophe as its second character. We'll work on that.
  6. "Authors with invalid spaces" - this is intended to find authors where there are double spaces, or punctuation not spaced out before a letter, etc. There are many exceptions coded in - e.g. Doctors that end ", M.D." or ", Ph.D.". There could be a few more exceptions, but the author of the scripts really doesn't like companies or websites included in this field.
  7. "Authors that exist only due to reviews" - either the author isn't really an author in ISFDB terms (e.g. a Graphic Novel artist or suchlike) or we're missing a title/publication. Relevant Non-Fiction books can be added to support the link, or the Review should be converted to an Essay. There is currently a little question over whether Filk music reviews are "in".
  8. "Duplicate Publication Tags" - there was a bug (Bug 2795822) that made certain publications with very similar titles and dates use the same Publication tag. There is some work going on to make linking by Pub Tags redundant within ISFDB, replaced with Pub IDs, but many Images are still loaded under the Tag, and external links to the ISFDB may still rely on them. This script currently shows no problems, but ought to be rerun occasionally as it's not yet totally impossible to create a duplicate.
  9. "Empty Series" - Series are not auto-deleted when the last constituent title goes. They probably should be, but there is the issue of sub-series leading to the super-series being redundant. No current problems as of the time of writing, but until we fix the bug this should still be run occasionally.
  10. "Potential HTML problems in Publication Notes" - this is very basic and just matches the number of "<" and ">" characters in notes. It could be expanded to other types of notes, but also could be improved to make sure that it's valid HTML without Javascript injections, etc. BLongley 00:38, 20 April 2011 (UTC)
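
The check described in item 10 can be sketched in a few lines of Python. This is only an illustration of the idea of comparing the number of "<" and ">" characters, not the cleanup script itself; the function name has_unbalanced_angle_brackets and the sample notes are hypothetical.

```python
def has_unbalanced_angle_brackets(note: str) -> bool:
    """Flag a note whose counts of '<' and '>' characters differ.

    Hypothetical helper illustrating the basic check described above;
    it does not validate HTML.
    """
    return note.count("<") != note.count(">")


# Made-up sample notes, keyed by an arbitrary label.
sample_notes = {
    "closed tags": "First printing.<br>Cover art credited on the back cover.",
    "broken tag": "Data from <a href=\"http://example.com\">publisher</a",
    "stray sign": "Page count is < 200.",
}

for label, note in sample_notes.items():
    print(label, has_unbalanced_angle_brackets(note))
```

As the text says, the check is very basic: a note with a stray "<" used as a less-than sign is flagged even though no tag is broken, and a note such as "</b>text<b>" passes even though its tags are in the wrong order.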