Difference between revisions of "ISFDB talk:Data Consistency/Disallowed URLs"
Jump to navigation
Jump to search
DESiegel60 (talk | contribs) |
Mhhutchins (talk | contribs) |
||
(12 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. [[User:BLongley|BLongley]] 17:36, 3 September 2010 (UTC) | It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. [[User:BLongley|BLongley]] 17:36, 3 September 2010 (UTC) | ||
− | This table is far out of date. The scan that created it should be re-run | + | : I've replaced the bestsf.net images with Galactic Central ones, and most of the SJGames ones with Amazon links. [[User:BLongley|BLongley]] 22:05, 13 October 2010 (UTC) |
+ | |||
+ | This table is far out of date. The scan that created it should be re-run when possible and then every few months if possible. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 16:44, 13 October 2010 (UTC) | ||
+ | |||
+ | : I'm not sure what the original script was. This is my closest, for the counts: | ||
+ | |||
+ | select | ||
+ | SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8) | ||
+ | , COUNT(*) | ||
+ | from pubs p | ||
+ | where p.pub_frontimage is not null | ||
+ | and LENGTH(p.pub_frontimage) > 8 | ||
+ | and p.pub_frontimage LIKE 'http%' | ||
+ | group by SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8) | ||
+ | |||
+ | : then plug the dodgy prefixes found into a query like | ||
+ | |||
+ | select pub_id from pubs where pub_frontimage LIKE '%prefix%' | ||
+ | |||
+ | : to find the actual pubs. There seem to be a lot more now. :-/ [[User:BLongley|BLongley]] 20:19, 13 October 2010 (UTC) | ||
+ | |||
+ | == Rerun of the script == | ||
+ | |||
+ | Ahasuerus ran the original script, and I've just left him a note to see if he could do it again when he gets the chance. [[User:Mhhutchins|Mhhutchins]] 20:33, 13 October 2010 (UTC) | ||
+ | |||
+ | == Looking for offending records == | ||
+ | |||
+ | How would someone look for records that link to sjgames, cox.net and photobucket? [[User:Mhhutchins|Mhhutchins]] 22:19, 13 October 2010 (UTC) | ||
+ | |||
+ | : I think I've fixed the sjgames and photobucket ones, but here are the cox.net Pub IDs: 3251, 34297, 35399, 40674, 45990, 169051, 196833. [[User:BLongley|BLongley]] 22:34, 13 October 2010 (UTC) | ||
+ | |||
+ | ::Thanks. I've changed all links to cover images on approved servers. [[User:Mhhutchins|Mhhutchins]] 00:35, 14 October 2010 (UTC) | ||
+ | |||
+ | :::I've removed (most) and replaced (about 75%) of the links from the offending records based on the latest posted list, but there are a couple of servers for which the links weren't provided. Can someone who has a local copy of the database check to see what records are linked to "albin-michel.fr" and "eclipse.co.uk", and the three that have "nohttp"? Thanks. [[User:Mhhutchins|Mhhutchins]] 18:53, 20 October 2010 (UTC) | ||
+ | :::I was able to find those for albin-michel by going through the publisher records. That leaves eclipse.co.uk. [[User:Mhhutchins|Mhhutchins]] 19:06, 20 October 2010 (UTC) | ||
+ | |||
+ | :::: If they are for [http://www.eclipse.co.uk/sweetdespise/ Sweet Despise], then we've got implicit permission from Unapersson. [[User:BLongley|BLongley]] 23:38, 20 October 2010 (UTC) | ||
+ | |||
+ | ::::: Is there a way to make sure that they're all on that URL? [[User:Mhhutchins|Mhhutchins]] 00:41, 21 October 2010 (UTC) | ||
+ | |||
+ | :::::: Yes, as of the last backup I loaded all 76 are for Sweet Despise. [[User:BLongley|BLongley]] 19:34, 21 October 2010 (UTC) | ||
+ | |||
+ | :::::::Great. Thanks for checking. [[User:Mhhutchins|Mhhutchins]] 19:44, 21 October 2010 (UTC) |
Latest revision as of 15:44, 21 October 2010
It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. BLongley 17:36, 3 September 2010 (UTC)
- I've replaced the bestsf.net images with Galactic Central ones, and most of the SJGames ones with Amazon links. BLongley 22:05, 13 October 2010 (UTC)
This table is far out of date. The scan that created it should be re-run when possible and then every few months if possible. -DES Talk 16:44, 13 October 2010 (UTC)
- I'm not sure what the original script was. This is my closest, for the counts:
select SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8) , COUNT(*) from pubs p where p.pub_frontimage is not null and LENGTH(p.pub_frontimage) > 8 and p.pub_frontimage LIKE 'http%' group by SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8)
- then plug the dodgy prefixes found into a query like
select pub_id from pubs where pub_frontimage LIKE '%prefix%'
- to find the actual pubs. There seem to be a lot more now. :-/ BLongley 20:19, 13 October 2010 (UTC)
Rerun of the script
Ahasuerus ran the original script, and I've just left him a note to see if he could do it again when he gets the chance. Mhhutchins 20:33, 13 October 2010 (UTC)
Looking for offending records
How would someone look for records that link to sjgames, cox.net and photobucket? Mhhutchins 22:19, 13 October 2010 (UTC)
- I think I've fixed the sjgames and photobucket ones, but here are the cox.net Pub IDs: 3251, 34297, 35399, 40674, 45990, 169051, 196833. BLongley 22:34, 13 October 2010 (UTC)
- Thanks. I've changed all links to cover images on approved servers. Mhhutchins 00:35, 14 October 2010 (UTC)
- I've removed (most) and replaced (about 75%) of the links from the offending records based on the latest posted list, but there are a couple of servers for which the links weren't provided. Can someone who has a local copy of the database check to see what records are linked to "albin-michel.fr" and "eclipse.co.uk", and the three that have "nohttp"? Thanks. Mhhutchins 18:53, 20 October 2010 (UTC)
- I was able to find those for albin-michel by going through the publisher records. That leaves eclipse.co.uk. Mhhutchins 19:06, 20 October 2010 (UTC)
- If they are for Sweet Despise, then we've got implicit permission from Unapersson. BLongley 23:38, 20 October 2010 (UTC)
- Is there a way to make sure that they're all on that URL? Mhhutchins 00:41, 21 October 2010 (UTC)
- Yes, as of the last backup I loaded all 76 are for Sweet Despise. BLongley 19:34, 21 October 2010 (UTC)
- Great. Thanks for checking. Mhhutchins 19:44, 21 October 2010 (UTC)