Difference between revisions of "ISFDB:Data Consistency"
(→Authorless titles: 2008-02-10 data added)
(→Pseudonym consistency: Disabled in the software; cleanup report exists; deleting)
(72 intermediate revisions by 8 users not shown)
The '''Data Consistency''' project is a place to coordinate efforts to identify and repair data inconsistencies, including Stray Publications, malformed ISBNs, etc.

==Publication Records==
===Invalid characters in Publication titles===

* [[ISFDB:Invalid characters in Publication titles]] 14 as of 2007-11-27. [[User:Ahasuerus|Ahasuerus]] 19:36, 9 Dec 2007 (CST)

===Missing data===
*Need a script that finds records with missing data, e.g. missing page counts, missing pb/tp/hc data, 0000-00-00 dates, etc.
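
A minimal sketch of such a script is given below. It assumes publication records have already been loaded into memory; the `Publication` class and its field names are hypothetical and are not the actual ISFDB schema. A field counts as missing when it is empty, and a date counts as missing when it is `0000-00-00`.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class Publication:
    # Hypothetical record layout, for illustration only.
    pub_id: int
    title: str
    date: Optional[str]     # "YYYY-MM-DD"; "0000-00-00" means unknown
    pages: Optional[str]    # page count as entered
    binding: Optional[str]  # e.g. "pb", "tp", "hc"


def missing_fields(pub: Publication) -> List[str]:
    """Return the names of the fields that are empty or unknown."""
    problems = []
    if not pub.pages or not pub.pages.strip():
        problems.append("pages")
    if not pub.binding or not pub.binding.strip():
        problems.append("binding")
    if not pub.date or pub.date == "0000-00-00":
        problems.append("date")
    return problems


def find_missing_data(pubs: Iterable[Publication]) -> Dict[int, List[str]]:
    """Map each incomplete publication id to its list of missing fields."""
    report = {}
    for pub in pubs:
        problems = missing_fields(pub)
        if problems:
            report[pub.pub_id] = problems
    return report
```

The report only lists candidates; whether a missing value can actually be filled in still has to be checked against the book itself.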

==Titles==

===Safe to auto-merge identical titles?===
While working on the Gardner Dozois collection, I found that nearly all of the titles had two title records that were identical except that one was the parent of the other. If it seems safe, it would save some work to do a sweep for identical title records and auto-merge them. [[User:Marc Kupper|Marc Kupper]] 22:25, 22 Dec 2006 (CST)
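
The sweep described above could start as a report of merge candidates rather than an automatic merge. The sketch below assumes title records are available in memory; the `Title` class, its fields, and the choice of which fields must match are assumptions made for illustration, not the ISFDB schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Title:
    # Hypothetical record layout, for illustration only.
    title_id: int
    name: str
    author: str
    title_type: str
    date: str
    parent_id: Optional[int] = None


def merge_candidates(titles: Iterable[Title]) -> List[Tuple[int, int]]:
    """Return (parent_id, child_id) pairs whose records are otherwise identical."""
    groups = defaultdict(list)
    for t in titles:
        # Records are "identical" here if these four fields match exactly.
        groups[(t.name, t.author, t.title_type, t.date)].append(t)

    pairs = []
    for group in groups.values():
        ids = {t.title_id for t in group}
        for t in group:
            # Keep only children whose parent is in the same identical group.
            if t.parent_id in ids and t.parent_id != t.title_id:
                pairs.append((t.parent_id, t.title_id))
    return sorted(pairs)
```

Reporting pairs instead of merging them leaves room for a manual check, since a parent and child that look identical may still be a deliberate variant.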

=== Tags ===

It would be nice to standardize user-generated tags.

====Synonyms====

-----------------------------
* hard science fiction
* hard sf
-----------------------------
* history of sf
* history-of-sf
-----------------------------
* juvenile sf
* juvenile-sf
-----------------------------
* recursive sf
* meta sf
-----------------------------
* 'young-adult humorous sf' -> 'humorous sf' + 'young-adult sf'

etc.

Someone should probably come up with a standard, then write filters for new input and run the same filters on the existing tags.
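
One possible shape for such a filter is sketched below. The synonym pairs come from the list above, but which spelling in each pair is treated as canonical is an assumption made here, and the names `SYNONYMS`, `SPLITS`, `normalize_tag` and `normalize_tags` are hypothetical.

```python
from typing import Iterable, List

# Variant spelling -> assumed canonical spelling.
SYNONYMS = {
    "hard sf": "hard science fiction",
    "history-of-sf": "history of sf",
    "juvenile-sf": "juvenile sf",
    "meta sf": "recursive sf",
}

# Compound tags that should be split into several simpler tags.
SPLITS = {
    "young-adult humorous sf": ["humorous sf", "young-adult sf"],
}


def normalize_tag(tag: str) -> List[str]:
    """Return the canonical tag(s) for one user-entered tag."""
    # Lower-case and collapse runs of whitespace before looking the tag up.
    cleaned = " ".join(tag.strip().lower().split())
    if cleaned in SPLITS:
        return list(SPLITS[cleaned])
    return [SYNONYMS.get(cleaned, cleaned)]


def normalize_tags(tags: Iterable[str]) -> List[str]:
    """Normalize a list of tags, dropping duplicates but keeping order."""
    result = []
    for tag in tags:
        for canonical in normalize_tag(tag):
            if canonical not in result:
                result.append(canonical)
    return result
```

The same function can serve both purposes mentioned above: call it on each tag as it is entered, and run it once over the stored tags to clean up existing data.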

== Authors ==

==Title vs. Publication Type Consistency==

This section documents mismatches between Title types and associated Publication types:

*Collections:
**[[ISFDB:Data Consistency/Collection-Novel Mismatches|Collection-Novel mismatches]]
**[[ISFDB:Data Consistency/Collection-Anthology Mismatches|Collection-Anthology mismatches]]
**[[ISFDB:Data Consistency/Collection-Omnibus Mismatches|Collection-Omnibus mismatches]]
**[[ISFDB:Data Consistency/Collection-Other Mismatches|Collection-Other mismatches]]

*Novels:
**[[ISFDB:Data Consistency/Novel-Magazine Mismatches|Novel-Magazine mismatches]]
**[[ISFDB:Data Consistency/Novel-Anthology Mismatches|Novel-Anthology mismatches]]
**[[ISFDB:Data Consistency/Novel-Collection Mismatches|Novel-Collection mismatches]]
**[[ISFDB:Data Consistency/Novel-Other Mismatches|Novel-Other mismatches]]

*Short fiction:
**[[ISFDB:Data Consistency/Short Fiction-Non-fiction Mismatches|Short fiction-Non-fiction mismatches]]
**[[ISFDB:Data Consistency/Short Fiction-Novel Mismatches|Short fiction-Novel mismatches]]

*Non-fiction:
**[[ISFDB:Data Consistency/Non-fiction-Novel Mismatches|Non-fiction-Novel mismatches]]
**[[ISFDB:Data Consistency/Non-fiction-Other Mismatches|Non-fiction-Other mismatches]]

*Other:
**[[ISFDB:Data Consistency/Non-genre Mismatches|Non-genre mismatches]]
**[[ISFDB:Data Consistency/Serial Mismatches|Serial mismatches]]
**[[ISFDB:Data Consistency/Omnibus Mismatches|Omnibus mismatches]]
**[[ISFDB:Data Consistency/Anthology Mismatches|Anthology mismatches]]
**[[ISFDB:Data Consistency/Editor Mismatches|Editor mismatches]]
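
Lists like the ones above could be produced by grouping mismatched pairs by their two types, one group per subpage. The sketch below is only an illustration: it assumes each input row pairs a publication with the title record being checked against it, and it flags every pair whose types differ. Which combinations are genuine errors is not decided by the code and is left to the reviewer.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical row layout: (title_id, title_type, pub_id, pub_type)
Row = Tuple[int, str, int, str]


def group_mismatches(rows: Iterable[Row]) -> Dict[Tuple[str, str], List[Tuple[int, int]]]:
    """Group (title_id, pub_id) pairs by (title_type, pub_type) when the types differ."""
    groups = defaultdict(list)
    for title_id, title_type, pub_id, pub_type in rows:
        if title_type != pub_type:
            groups[(title_type, pub_type)].append((title_id, pub_id))
    return dict(groups)
```

For example, every pair collected under the key `("COLLECTION", "NOVEL")` would belong on the Collection-Novel mismatches page.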

==Pseudonyms in Collections==

The following page lists all known Collection Publications that include pseudonymous Titles as of the August 11, 2007 backup. Although this is not always indicative of an error, we estimate that a significant percentage of these occurrences need to be fixed.

*[[ISFDB:Data Consistency/Pseudonyms in Collections|Pseudonyms in Collections]] NEW as of 2008-05-06

==Serial Dates==

The following page lists all known Serial records whose Title dates do not match the dates of the Publications that they appeared in as of the 2008-02-10 backup.

*[[ISFDB:Data Consistency/Serial Dates|Serial Dates]]
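
The check behind this list can be sketched as a plain comparison of two date strings. The row layout below is hypothetical, and comparing the full `YYYY-MM-DD` string exactly is an assumption; a real report might prefer to compare only year and month.

```python
from typing import Iterable, List, Tuple

# Hypothetical row layout: (title_id, title_date, pub_id, pub_date)
SerialRow = Tuple[int, str, int, str]


def serial_date_mismatches(rows: Iterable[SerialRow]) -> List[SerialRow]:
    """Return the rows where the Serial's Title date differs from the Publication date."""
    return [row for row in rows if row[1] != row[3]]
```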

[[Category:Bibliographic Projects|Data Consistency]]
Latest revision as of 19:28, 19 February 2015