Publisher Catalogs and Print Series

From ISFDB
Revision as of 04:16, 13 February 2008 by Marc Kupper (talk | contribs) (Wikify printseries.html, add Doubleday, add DAW)

ISFDB - Publisher Catalogs / Print Series

Wiki Conversion Notes

This is a Wiki conversion of printseries.html

Most links point to nonexistent wiki pages, but the content that belongs on them is available from the original printseries.html page linked above.

As I started converting some of these pages, I ran into two big issues:

  1. Should these pages really be in the wiki, or should they be powered by the DB? (At the moment, search by publisher doesn't seem to work.)
  2. Some of these publisher lists are too big for the wiki. When I tried converting "Orbit", the wiki warned me that many browsers would have difficulty editing a page that size and that I should break it up. When I tried converting DAW, my browser submitted all the data, and then the wiki churned for about 10 more minutes before the browser finally timed out.


That said...

Assuming you use the html2wiki converter (http://diberri.dyndns.org/html2wiki.html) to convert pages to wiki syntax, the following Perl script is handy for cleaning up the publisher listing pages. Gnome Press and Fantasy Press are good examples.


 #!/bin/perl
 #
 # use http://diberri.dyndns.org/html2wiki.html to convert pub pages,
 # then use this to clean them up
 # :TODO: should have one script that uses HTML::WikiConverter and does it all
 #
 use warnings;
 use strict;
 # slurp it in
 undef $/;
 my $w = <>;
 
 # convert bold years to sub-headings
 $w =~ s/ '''(\d+)'''/\n\n== $1 ==\n/g;
 # get rid of all the horiz rules
 $w =~ s/^\s*?----//mg;
 # any pub link is a bullet
 $w =~ s{(\[http://www.isfdb.org/cgi-bin/pl.cgi)}{\n* $1}mg;
 # some pubs don't have links, just a dash
 $w =~ s{ ?\- (\S)}{\n* $1}mg;
 # trim excess newlines
 $w =~ s/\n{2,}/\n\n/g;
 # kill any remaining single newline (followed by optional whitespace)
 $w =~ s{([^\n])\n ?([^\n])}{$1$2}g;
 # get rid of all the excess whitespace
 $w =~ s/ +/ /sg;
 # any line that still starts with whitespace is bad
 $w =~ s/^ +//mg;
 
 # spit it out
 print $w;
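
One possible way around the page-size issue described above is to break the cleaned-up text into several smaller pages at the year sub-headings the script generates. The following is only a sketch under assumptions: the helper name split_by_year, the sample text, and the size limit are all made up for illustration, and nothing here has been tried against the actual Orbit or DAW listings.

```perl
#!/usr/bin/perl
# Hypothetical helper, not part of the cleanup script above: split wikitext
# into smaller chunks, cutting only in front of "== YEAR ==" sub-headings.
use strict;
use warnings;

# Returns a list of chunks. A chunk is closed as soon as adding the next
# year section would push it past $max_chars; a single section larger than
# $max_chars is kept whole rather than cut in the middle of a year.
sub split_by_year {
    my ($text, $max_chars) = @_;
    # Zero-width split: every section keeps its own heading line.
    my @sections = split /^(?=== \d+ ==$)/m, $text;
    my @chunks;
    my $current = '';
    for my $section (@sections) {
        if (length($current)
            && length($current) + length($section) > $max_chars) {
            push @chunks, $current;
            $current = '';
        }
        $current .= $section;
    }
    push @chunks, $current if length $current;
    return @chunks;
}

# Tiny made-up sample; the limit of 40 is arbitrary and only chosen so that
# this short text gets split at all.
my $sample = "Intro line\n\n"
           . "== 1950 ==\n* Book A\n\n"
           . "== 1951 ==\n* Book B\n\n"
           . "== 1952 ==\n* Book C\n";
my @parts = split_by_year($sample, 40);
printf "part %d: %d characters\n", $_ + 1, length $parts[$_] for 0 .. $#parts;
```

For a real listing, the sample string would be replaced by the output of the cleanup script, and the limit would be set to whatever size the wiki accepts without warning. Each chunk could then be pasted into its own wiki page (for example, one page per run of years).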