ISFDB talk:Dup Shortfiction

From ISFDB
Jump to navigation Jump to search

Another mini-project based on Ahasuerus' "In each magazine issue, find all fiction titles whose title_title is identical - will help identify suspected Interior art titles entered as fiction". BLongley 01:31, 30 April 2011 (UTC)

SQL for generating this page is below, but currently works too slowly to make this a Moderator Clean-Up Script. Improvement suggestions welcome. BLongley 19:04, 1 May 2011 (UTC)

select "[http://www.isfdb.org/cgi-bin/pl.cgi?", p.pub_id, " ", p.pub_title, "] " , t.title_title , " <br> " from pubs p, pub_content pc, titles t where pc.pub_id = p.pub_id and p.pub_ctype = 'MAGAZINE' and pc.title_id = t.title_id and t.title_ttype = 'SHORTFICTION' GROUP BY p.pub_id, p.pub_title, t.title_title HAVING count(*) > 1 LIMIT 100

Omni, November 1989

I don't understand the three listings for this issue. They appear to be correctly typed as interiorart. Could they have been corrected sometime between the last backup and the creation of this list? Mhhutchins 14:30, 1 May 2011 (UTC)

They must have been: here's how bad the publication looked a week ago. I think some of the dups are still lying around. OmniNov1989.jpg BLongley 15:05, 1 May 2011 (UTC)
You were correct. Most of the orphaned records were still in the database, but no longer. Which makes me think, can a script find orphaned title records, i.e. titles without pubs? Mhhutchins 19:34, 1 May 2011 (UTC)
Yes - that is how the "Stray Interviews and Reviews" Moderator Clean-Up Script works for instance. However, the more of that title type there are, the slower the script - I'll experiment and see if we can do each other type on its own. BLongley 20:25, 1 May 2011 (UTC)

Project closed, for now

Thanks to everyone that helped clear these. Unless someone can improve the performance of the SQL I posted above, I don't think we can put this into the Moderator Clean-Up Scripts yet. But if you think it may be a recurring problem, just ask me to rerun the script. BLongley 01:02, 2 May 2011 (UTC)