Difference between revisions of "ISFDB talk:Data Consistency/Disallowed URLs"

From ISFDB
Jump to navigation Jump to search
 
(12 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. [[User:BLongley|BLongley]] 17:36, 3 September 2010 (UTC)
 
It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. [[User:BLongley|BLongley]] 17:36, 3 September 2010 (UTC)
  
This table is far out of date. The scan that created it should be re-run whn possible and then every few months if possible. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 16:44, 13 October 2010 (UTC)
+
: I've replaced the bestsf.net images with Galactic Central ones, and most of the SJGames ones with Amazon links. [[User:BLongley|BLongley]] 22:05, 13 October 2010 (UTC)
 +
 
 +
This table is far out of date. The scan that created it should be re-run when possible and then every few months if possible. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 16:44, 13 October 2010 (UTC)
 +
 
 +
: I'm not sure what the original script was. This is my closest, for the counts:
 +
 
 +
select
 +
SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8)
 +
, COUNT(*)
 +
from pubs p
 +
where p.pub_frontimage is not null
 +
and LENGTH(p.pub_frontimage) > 8
 +
and p.pub_frontimage LIKE 'http%'
 +
group by SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8)
 +
 
 +
: then plug the dodgy prefixes found into a query like
 +
 
 +
select pub_id from pubs where pub_frontimage LIKE '%prefix%'
 +
 
 +
: to find the actual pubs. There seem to be a lot more now. :-/ [[User:BLongley|BLongley]] 20:19, 13 October 2010 (UTC)
 +
 
 +
== Rerun of the script ==
 +
 
 +
Ahasuerus ran the original script, and I've just left him a note to see if he could do it again when he gets the chance. [[User:Mhhutchins|Mhhutchins]] 20:33, 13 October 2010 (UTC)
 +
 
 +
== Looking for offending records ==
 +
 
 +
How would someone look for records that link to sjgames, cox.net and photobucket? [[User:Mhhutchins|Mhhutchins]] 22:19, 13 October 2010 (UTC)
 +
 
 +
: I think I've fixed the sjgames and photobucket ones, but here are the cox.net Pub IDs: 3251, 34297, 35399, 40674, 45990, 169051, 196833. [[User:BLongley|BLongley]] 22:34, 13 October 2010 (UTC)
 +
 
 +
::Thanks. I've changed all links to cover images on approved servers. [[User:Mhhutchins|Mhhutchins]] 00:35, 14 October 2010 (UTC)
 +
 
 +
:::I've removed (most) and replaced (about 75%) of the links from the offending records based on the latest posted list, but there are a couple of servers for which the links weren't provided.  Can someone who has a local copy of the database check to see what records are linked to "albin-michel.fr" and "eclipse.co.uk", and the three that have "nohttp"?  Thanks. [[User:Mhhutchins|Mhhutchins]] 18:53, 20 October 2010 (UTC)
 +
:::I was able to find those for albin-michel by going through the publisher records.  That leaves eclipse.co.uk. [[User:Mhhutchins|Mhhutchins]] 19:06, 20 October 2010 (UTC)
 +
 
 +
:::: If they are for [http://www.eclipse.co.uk/sweetdespise/ Sweet Despise], then we've got implicit permission from Unapersson. [[User:BLongley|BLongley]] 23:38, 20 October 2010 (UTC)
 +
 
 +
::::: Is there a way to make sure that they're all on that URL? [[User:Mhhutchins|Mhhutchins]] 00:41, 21 October 2010 (UTC)
 +
 
 +
:::::: Yes, as of the last backup I loaded all 76 are for Sweet Despise. [[User:BLongley|BLongley]] 19:34, 21 October 2010 (UTC)
 +
 
 +
:::::::Great.  Thanks for checking. [[User:Mhhutchins|Mhhutchins]] 19:44, 21 October 2010 (UTC)

Latest revision as of 15:44, 21 October 2010

It looks like bestsf.net has moved its images? I've found several broken ones for Interzone covers for instance. BLongley 17:36, 3 September 2010 (UTC)

I've replaced the bestsf.net images with Galactic Central ones, and most of the SJGames ones with Amazon links. BLongley 22:05, 13 October 2010 (UTC)

This table is far out of date. The scan that created it should be re-run when possible and then every few months if possible. -DES Talk 16:44, 13 October 2010 (UTC)

I'm not sure what the original script was. This is my closest, for the counts:
select 
SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8) 
, COUNT(*)
from pubs p
where p.pub_frontimage is not null
and LENGTH(p.pub_frontimage) > 8
and p.pub_frontimage LIKE 'http%'
group by SUBSTR(p.pub_frontimage,8,locate('/',p.pub_frontimage,8)-8) 
then plug the dodgy prefixes found into a query like
select pub_id from pubs where pub_frontimage LIKE '%prefix%'
to find the actual pubs. There seem to be a lot more now. :-/ BLongley 20:19, 13 October 2010 (UTC)

Rerun of the script

Ahasuerus ran the original script, and I've just left him a note to see if he could do it again when he gets the chance. Mhhutchins 20:33, 13 October 2010 (UTC)

Looking for offending records

How would someone look for records that link to sjgames, cox.net and photobucket? Mhhutchins 22:19, 13 October 2010 (UTC)

I think I've fixed the sjgames and photobucket ones, but here are the cox.net Pub IDs: 3251, 34297, 35399, 40674, 45990, 169051, 196833. BLongley 22:34, 13 October 2010 (UTC)
Thanks. I've changed all links to cover images on approved servers. Mhhutchins 00:35, 14 October 2010 (UTC)
I've removed (most) and replaced (about 75%) of the links from the offending records based on the latest posted list, but there are a couple of servers for which the links weren't provided. Can someone who has a local copy of the database check to see what records are linked to "albin-michel.fr" and "eclipse.co.uk", and the three that have "nohttp"? Thanks. Mhhutchins 18:53, 20 October 2010 (UTC)
I was able to find those for albin-michel by going through the publisher records. That leaves eclipse.co.uk. Mhhutchins 19:06, 20 October 2010 (UTC)
If they are for Sweet Despise, then we've got implicit permission from Unapersson. BLongley 23:38, 20 October 2010 (UTC)
Is there a way to make sure that they're all on that URL? Mhhutchins 00:41, 21 October 2010 (UTC)
Yes, as of the last backup I loaded all 76 are for Sweet Despise. BLongley 19:34, 21 October 2010 (UTC)
Great. Thanks for checking. Mhhutchins 19:44, 21 October 2010 (UTC)