ISFDB:Data Consistency/Disallowed URLs

Revision as of 17:02, 17 October 2010 by DESiegel60 (talk | contribs) (fixed ma.us)

The table below counts the URLs we point to, grouped by domain. We will want to check with the owners of the domains for which we have not yet secured linking permission and, if they decline to grant it, zap the URLs. (Whether the URLs are still valid is a separate question, one that I will explore in a later script.) See ISFDB:Image linking permissions#List of sites granting permission for the sites that have already granted permission.
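As a rough illustration of a per-domain tally like this one, here is a minimal Python sketch. It is not the script that produced the table. The function names, the short `TWO_LABEL_SUFFIXES` set, the treatment of URLs without an `http` prefix as a `nohttp` bucket, and the sample URLs in the usage note are all assumptions made for the example.

```python
from collections import Counter
from urllib.parse import urlparse

# Assumed two-label suffixes under which the third label is the site name.
# A real run would need a much fuller list than this.
TWO_LABEL_SUFFIXES = {"co.uk", "ma.us"}


def domain_of(url):
    """Return the grouping domain for a URL, or 'nohttp' (assumed bucket name)."""
    if not url.lower().startswith(("http://", "https://")):
        return "nohttp"
    host = (urlparse(url).hostname or "").lower()
    if not host:
        # Assumption: a URL with a scheme but no host goes in the same bucket.
        return "nohttp"
    labels = host.split(".")
    # Keep three labels for suffixes such as co.uk, otherwise two.
    keep = 3 if ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


def count_by_domain(urls):
    """Count URLs per grouping domain."""
    return Counter(domain_of(u) for u in urls)
```

Calling `count_by_domain` on a list such as `["http://images.amazon.com/a.jpg", "http://www.fantasticfiction.co.uk/b.jpg", "www.example.com/c.jpg"]` (hypothetical URLs) groups them under `amazon.com`, `fantasticfiction.co.uk`, and `nohttp` respectively.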

URLs by domain determined as of 2010-10-16:


Domain | Number of URLs | Allowed?
albin-michel.fr | 11 | No
amazon.ca | 17 | ?
amazon.com | 37198 | Yes
bookscans.com | 455 | Yes
collectorshowcase.fr | 200 | Yes
eclipse.co.uk | 76 | ?
fantascienza.com | 758 | Yes
fantasticfiction.co.uk | 1197 | Yes
fatcow.com | 43 | Yes (Bookscans)
googlepages.com | 3: 25002 47306 129951 | Yes? (Marc Kupper)
images-amazon.com | 37157 | Yes
isfdb.org | 26022 | D'oh!
meow.fr | 1: 262371 | No
mondourania.com | 381 | Yes
mottleshire.org | 1: 268421 | No
mpressbooks.co.uk | 4: 307693 307694 307695 307696 | No
mushroom-ebooks.com | 9: 291255 291256 291321 291331 291334 291335 291336 291337 291380 | No
ndhansen-hill.com | 3: 273541 273631 273632 | No
netonecom.net | 1: 301704 | No
nohttp | 3: | No
noosfere.org | 2: 271991 271992 | ? (Collectors Showcase)
obversebooks.co.uk | 3: 317336 317445 317446 | No
openlibrary.org | 1: 231837 | Yes
orcabook.com | 1: 256422 | No
over-blog.com | 3: 261304 261305 261383 | No
penguingroup.com | 1: 255307 | No
philsp.com | 2466 | Yes (Galactic Central)
pjfarmer.com | 1: 302971 | No
polluto.com | 1: 256138 | No
pspublishing.co.uk | 1: 272034 | No
quarante-deux.org | 1: 259017 | No
randomhouse.com | 1: 291014 | No
redrosepublishing.com | 1: 273544 | No
regalcrest.biz | 1: 257732 | No
rstuttle.com | 24 | No
sfcovers.net | 2263 | Yes (Visco)
sfsite.com | 4: 286446 286528 302185 304433 | No
shaunaroberts.com | 2: 321777 321778 | No
sjgames.com | 1: 123101 | No
skyrock.com | 1: 262237 | No
smashwords.com | 1: 316708 | No
smithwriter.com | 1: 275613 | No
thetrashcollector.com | 3 | Yes
tout-resumer.fr | 1: 262372 | No
uncw.edu | 108 | Yes (Ace Image Library)
unl.edu | 2: 279422 280314 | No
vanguardproductions.net | 1: 304553 | No
webscription.net | 1: 295917 | No
wildsidebooks.com | 1: 331033 | No
wordpress.com | 2: 257733 331418 | No
zarthani.net | 1: 274972 | No
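To turn the table into a follow-up list of domains whose owners still need to be contacted, something like the following Python sketch could be used. The bucket names and the rules in `classify` are assumptions for illustration only (in particular, treating "Yes?" and "?" as unconfirmed, and "No" as permission not yet secured). The sample rows are copied from the table above.

```python
def classify(status):
    """Map an 'Allowed?' cell to a coarse bucket (bucket names are illustrative)."""
    s = status.strip()
    if s.startswith("Yes?") or s.startswith("?"):
        return "unconfirmed"
    if s.startswith("Yes"):
        return "allowed"
    if s.startswith("No"):
        return "needs_permission"
    return "other"


# A few (domain, URL count, status) rows taken from the table above.
rows = [
    ("albin-michel.fr", 11, "No"),
    ("amazon.ca", 17, "?"),
    ("amazon.com", 37198, "Yes"),
    ("googlepages.com", 3, "Yes? (Marc Kupper)"),
    ("isfdb.org", 26022, "D'oh!"),
    ("rstuttle.com", 24, "No"),
]


def follow_up(rows):
    """Return rows still needing owner contact, largest URL count first."""
    pending = [r for r in rows
               if classify(r[2]) in ("needs_permission", "unconfirmed")]
    return sorted(pending, key=lambda r: r[1], reverse=True)


for domain, count, status in follow_up(rows):
    print(domain, count, status)
```

Sorting by URL count puts the domains with the most links at the top of the list, which is one plausible way to prioritize permission requests.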