Difference between revisions of "Publisher:Project Gutenberg"

(discussion to talk page, reword procedures without discussion, suggestions for documenting formats and link to PG ed)
Line 1: Line 1:
 
This page is for noting Bibliographic and other issues with works published by [http://www.gutenberg.org/catalog/ Project Gutenberg].
 
This page is for noting Bibliographic and other issues with works published by [http://www.gutenberg.org/catalog/ Project Gutenberg].
 +
 +
Please use [[Publisher talk:Project Gutenberg|the talk page]] to discuss procedures, while this page documents currently accepted or recommended procedures.
  
 
Project Gutenberg's [http://en.wikipedia.org/wiki/Project_gutenberg Wikipedia article].
 
Project Gutenberg's [http://en.wikipedia.org/wiki/Project_gutenberg Wikipedia article].
Line 7: Line 9:
  
 
==Etext number==
 
==Etext number==
All Project Gutenberg works are identified by an "etext number", which is a persistent identifier. Moreover, given the etext number, a canonical URL can be automatically generated (for etext nnnn it is "http://www.gutenberg.org/etext/nnnn"). I have been entering this in the "Catalog ID" field. I have entered a work with etext number 12345 as "#12345" (just an example) without including the label "etext". -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 10:42, 6 Feb 2008 (CST)
+
All Project Gutenberg works are identified by an "etext number", which is a persistent identifier. Moreover, given the etext number, a canonical URL can be automatically generated (for etext nnnn it is "http://www.gutenberg.org/etext/nnnn"). Please enter the etext number in the "Catalog ID" field. Enter a work with etext number 12345 as "#12345" (just an example) without including the label "etext". -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 10:42, 6 Feb 2008 (CST)
  
 
==Separate publication==
 
==Separate publication==
In many cases, particularly for SF, Project Gutenberg publishes an individual work of short fiction as a separate etext, often scanned from the original magazine version. I have been entering these publications as chapterbooks. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 10:42, 6 Feb 2008 (CST)
+
In many cases, particularly for SF, Project Gutenberg publishes an individual work of short fiction as a separate etext, often scanned from the original magazine version. Please enter these publications as chapterbooks. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 10:42, 6 Feb 2008 (CST)
  
 
==Price field==
 
==Price field==
<small>(Discussion copied and refactored from [[ISFDB:Community Portal#Project Gutenberg]])</small>
+
Please enter 0 for the price. Discussion is ongoing as to whether to use a currency symbol (as in "$0.00" or "L0.00") or not; see [[Publisher talk:Project Gutenberg#Price field|the talk page]] for the discussion.
I don't think we have documented a standard for the price field when entering free books yet. The last time a related issue came up, the consensus seemed to be that we should be leaving the field blank as opposed to entering "npp" for books with no printed price, but free books are different. Should we use "free" or "$0.00"? "$0.00" seems to be too US-centric since Gutenberg is accessible worldwide. [[User:Ahasuerus|Ahasuerus]] 12:08, 6 Feb 2008 (CST)
 
:I'll follow whatever the consensus is, of course. I think that leaving the field blank is a mistake, because that is what we do for items with '''unknown''' prices, and here the price is known -- also if anyone runs a query on price it would be better to be able to distinguish free books. I would prefer some form of 0 to free so that if we ever develop stats on average prices or the like these will work properly. I was using ($0.00) because we use a currency with all numeric prices, and while Project Gutenberg is accessible worldwide, they make a significant point of being a US-based project -- specifically they look '''only''' to US copyright law in determining what is in the public domain, and have posted works where someone has claimed that a non-US copyright is still in force. Note that there is a separate Project Gutenberg Australia (which carries a number of works that are not PD in the US, but are in Oz), and a separate Project Gutenberg EU, and I think that a separate Project Gutenberg Canada is in existence or being formed. I would mark works published by each of those as zero in their respective native currencies (euros for the EU PG). -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 13:46, 6 Feb 2008 (CST)
 
 
 
::Those last five words decided it for me - I prefer zero without a currency symbol. No way would I want hard-working British contributors to have their works revalued in foreign money. ;-) And zero converts to zero worldwide - well, for currency anyway, it's not like Centigrade to Fahrenheit to Kelvin. [[User:BLongley|BLongley]] 13:57, 6 Feb 2008 (CST)
 
:::Actually the PG-Europe site has been concentrating on works not in English, particularly works where full Unicode representation is desired to handle accented characters. British works are mostly being done by either PG-Aus or PG-US, because of the ways in which the copyright laws interact. But that is just a tendency, not an invariable rule (to misquote Prof Parkinson).-[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 14:15, 6 Feb 2008 (CST)
 
  
 
==Tags==
 
==Tags==
Oh, and on a related note swfritter has been using [http://www.isfdb.org/wiki/index.php/User:Swfritter#Project_Gutenberg_Science_Fiction Tags for Gutenberg titles], it might be good to merge the effort - I'd like to have ONE solution. Do Gutenberg have only one edition of a book, or do they have multiple versions of some titles? If the latter, multiple publications would be better than single tags. [[User:BLongley|BLongley]] 13:41, 6 Feb 2008 (CST)
+
As discussed in [http://www.isfdb.org/wiki/index.php/User:Swfritter#Project_Gutenberg_Science_Fiction Tags for Gutenberg titles], [[User:Swfritter]], [[User:DESiegel60]] and some others have been adding user tags to titles which exist in Project Gutenberg editions.
:On looking at [[User:Swfritter#Project Gutenberg Science Fiction ]], it appears that he is using only 26 distinct tags: all PG pubs by authors whose last name begins with A get the tag "pga", all PG pubs by authors whose last name begins with B get the tag "pgb", and so on. This is in effect a hack to provide a search by publisher/author for PG titles only. I have no objection to adding these tags, and they in no way conflict with what I have been doing with PG pubs. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 14:16, 6 Feb 2008 (CST)
+
This effort is using only 26 distinct tags: all PG pubs by authors whose last name begins with A get the tag "pga", all PG pubs by authors whose last name begins with B get the tag "pgb", and so on. This is in effect a hack to provide a search by publisher/author for PG titles only.
  
 
==Pages fields==
 
==Pages fields==
I have been leaving all page number fields and all fields for page count of the work blank. It has been suggested that for an ebook collection or anthology "placeholder" page numbers of 1, 2, 3... be entered to preserve the order of the contents, but there is not yet any consensus on this, as far as I know. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 14:36, 6 Feb 2008 (CST)
+
In most cases, the page number fields and all fields for page count of the work are left blank for Project Gutenberg publications.
:I have started entering such "placeholders", when, and only when, it seems to me that the order of the items in a work is significant to the overall effect of the work. Discussion on whether, and if so how, to make this a common practice is in progress. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 13:29, 13 Feb 2008 (CST)
 
  
==Binding field==
+
===Placeholders===
I have been entering the binding as "ebook" -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 14:52, 6 Feb 2008 (CST)
+
It has been suggested that for an ebook collection or anthology "placeholder" page numbers of 1, 2, 3... be entered to preserve the order of the contents, but there is not yet any consensus on this. Other users have suggested using "10, 20, 30..." instead, to allow for possible later insertions. Some users are now entering such "placeholders", when, and only when, it seems to them that the order of the items in a work is significant to the overall effect of the work.
  
== Separate editions? ==
+
Discussion on whether, and if so how, to make this a common practice is in progress.
I've got a question here.  I've been entering quite a few PG ebooks now, most scanned from magazines, some from books.  In general what I'm actually looking at is a HTML version; usually it includes the illustrations from the original magazine or the book's covers, sometimes a magazine cover when the story was the subject of the cover art. I've been entering the artwork in ways that have seemed most appropriate on a case-by-case basis.
+
 +
===Actual page numbers===
 +
In some cases, the HTML version of a Project Gutenberg text includes indications of page numbers, normally matching those in the source text fairly closely. (The practice seems to be becoming more common in recent PG editions.) In such cases, please treat these just as if they were physical page numbers in a printed volume.
  
I've just realized that these ebooks are in fact available in different forms.  In particular, if there's an HTML version, there's usually (maybe always) also a plain text form as well.  (There is also sometimes something called Plucker that I don't have any way to read, but which is supposedly generated from the HTML version if it exists; I'll ignore it for now.)  But, of course, the text version lacks the illustrations I've been entering.
+
==Binding field==
 
+
Please enter the binding as "ebook" -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 14:52, 6 Feb 2008 (CST)
I'm not sure whether we should be treating these as separate editions or printings, or what.  I guess for now I'll continue as I've been doing, but I'd appreciate others' thoughts on this.  (Maybe this should have been entered in the rules & standards discussion instead of here.  Dave, if you think so, feel free to move it.) -- [[User:Davecat|Dave (davecat)]] 12:28, 11 Apr 2008 (CDT)
 
  
:Ezines and ebooks from Fictionwise come in about fifteen different formats. I use the most universal format available - in the case of the Fictionwise releases that is PDF (they don't have HTML versions). I further clarify the binding as 'ebook: PDF'. Some of the other formats exclude artwork. In the case of ''Jim Baen's Universe'' I use HTML because I consider it more universal than PDF. Plucker is for use on the Palm PDA platform.--[[User:Swfritter|swfritter]] 12:43, 11 Apr 2008 (CDT)
 
  
:I would suggest making a notation in the notes stating the source as 'HTML' and perhaps even stating that it is also available in text and Plucker versions. You certainly have no obligation to enter all three formats.--[[User:Swfritter|swfritter]] 20:38, 16 April 2008 (UTC)
+
==Formats==
::I have started putting an entry in notes such as "This ebook is available in ASCII and HTML formats" together with a link to the page where all the formats are listed. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 23:24, 26 April 2008 (UTC)
+
Project Gutenberg etexts are always made available in a pure ASCII format. Frequently other formats, such as HTML, Plucker, and the like are also available for a given text. Please include an entry in the notes field documenting the formats available for a given etext. For example: "This ebook is available in ASCII and HTML formats". -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 15:18, 28 April 2008 (UTC)
  
== Page counts/numbers? ==
+
==Link==
What about cases where marginal page numbers are given in the text, corresponding to page numbers in the source publication?  So far I've put in a note but ignored these otherwise; however, particularly when there are multiple contents, it seems to me it might make sense to use them.  And if they are used to generate a page count, it would turn off that irritating warning in the biblio. Anyone? -- [[User:Davecat|Dave (davecat)]] 20:45, 26 April 2008 (UTC)
+
Please include a link to the actual Project Gutenberg edition in the notes field. For example:  
  
:Where these exist, I have been treating them exactly as if they were page numbers in a printed volume. Note that PG is doing this more often in recent works than it used to -- it is becoming routine. Note also that when this is done, the page numbers always match those in the volume/edition from which the transcription is made, pretty much exactly (sometimes plus or minus one line). -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 23:21, 26 April 2008 (UTC)
+
<code><nowiki>This ebook edition is available in HTML, ASCII, and iso-8859-1 formats as <a HREF="http://www.gutenberg.org/etext/20836">Ebook #20836</a>.</nowiki></code>
  
:: Good; that suits my instincts on this.  I also view the cross-reference to the original source pub as a big advantage; one possible exception is when there are illustrations.  Particularly an illustration that spans two pages, which just winds up split into two separate illustrations, since the pages are no longer side by side.<br> (And not so good that I'll eventually have to revisit some I've already done, I guess.) -- [[User:Davecat|Dave (davecat)]] 17:38, 27 April 2008 (UTC)
+
The link should, as in the example, go to the root page for the etext, rather than to any of the actual texts. The root page is always at an address like <nowiki>"http://www.gutenberg.org/etext/nnnn"</nowiki>, where "nnnn" is the etext number. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 15:18, 28 April 2008 (UTC)
:::I think you will find that large illustrations are not split in such cases -- the PG books are not precise facsimiles or sets of page images, after all. -[[User:DESiegel60|DES]] <sup>[[User talk:DESiegel60|Talk]]</sup> 13:15, 28 April 2008 (UTC)
 

Revision as of 11:18, 28 April 2008

This page is for noting Bibliographic and other issues with works published by Project Gutenberg.

Please use the talk page to discuss procedures, while this page documents currently accepted or recommended procedures.

Project Gutenberg's Wikipedia article.

Listing as publisher

When we enter a Project Gutenberg etext as a publication of a work, we list the publisher as "Project Gutenberg", since such an etext forms a new and separate edition of the work. -DES Talk 10:42, 6 Feb 2008 (CST)

Etext number

All Project Gutenberg works are identified by an "etext number", which is a persistent identifier. Moreover, given the etext number, a canonical URL can be automatically generated (for etext nnnn it is "http://www.gutenberg.org/etext/nnnn"). Please enter the etext number in the "Catalog ID" field. Enter a work with etext number 12345 as "#12345" (just an example) without including the label "etext". -DES Talk 10:42, 6 Feb 2008 (CST)
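
The mapping from an etext number to the Catalog ID value and the canonical URL is mechanical, so it can be sketched in a few lines of Python. This is only an illustration, not part of any ISFDB tool; the function names <code>catalog_id</code> and <code>etext_url</code> are hypothetical, and the URL pattern is the one quoted above.

```python
def catalog_id(etext_number: int) -> str:
    """Format an etext number for the Catalog ID field: 12345 -> '#12345'."""
    if etext_number <= 0:
        raise ValueError("etext number must be a positive integer")
    # No "etext" label, just a leading '#'.
    return f"#{etext_number}"


def etext_url(etext_number: int) -> str:
    """Build the root-page URL for an etext, using the pattern given above."""
    if etext_number <= 0:
        raise ValueError("etext number must be a positive integer")
    return f"http://www.gutenberg.org/etext/{etext_number}"


if __name__ == "__main__":
    print(catalog_id(12345))
    print(etext_url(12345))
```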

Separate publication

In many cases, particularly for SF, Project Gutenberg publishes an individual work of short fiction as a separate etext, often scanned from the original magazine version. Please enter these publications as chapterbooks. -DES Talk 10:42, 6 Feb 2008 (CST)

Price field

Please enter 0 for the price. Discussion is ongoing as to whether to use a currency symbol (as in "$0.00" or "L0.00") or not; see the talk page for the discussion.

Tags

As discussed in Tags for Gutenberg titles, User:Swfritter, User:DESiegel60 and some others have been adding user tags to titles which exist in Project Gutenberg editions. This effort is using only 26 distinct tags: all PG pubs by authors whose last name begins with A get the tag "pga", all PG pubs by authors whose last name begins with B get the tag "pgb", and so on. This is in effect a hack to provide a search by publisher/author for PG titles only.
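
The tag scheme described above amounts to "pg" plus the first letter of the author's last name. A minimal sketch follows, assuming the last name is already known and starts with an unaccented letter A-Z; the function name <code>pg_tag</code> is hypothetical, and how names beginning with other characters should be tagged is not documented here, so the sketch simply rejects them.

```python
def pg_tag(author_last_name: str) -> str:
    """Return the Project Gutenberg user tag for an author's last name.

    'Anderson' -> 'pga', 'Bradbury' -> 'pgb', and so on (26 tags in total).
    """
    first = author_last_name.strip()[:1].lower()
    # Assumption: only plain a-z initials map to one of the 26 tags.
    if not ("a" <= first <= "z"):
        raise ValueError(f"no tag defined for last name {author_last_name!r}")
    return "pg" + first


if __name__ == "__main__":
    for name in ("Anderson", "Bradbury", "Zelazny"):
        print(name, pg_tag(name))
```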

Pages fields

In most cases, the page number fields and all fields for page count of the work are left blank for Project Gutenberg publications.

Placeholders

It has been suggested that for an ebook collection or anthology "placeholder" page numbers of 1, 2, 3... be entered to preserve the order of the contents, but there is not yet any consensus on this. Other users have suggested using "10, 20, 30..." instead, to allow for possible later insertions. Some users are now entering such "placeholders", when, and only when, it seems to them that the order of the items in a work is significant to the overall effect of the work.

Discussion on whether, and if so how, to make this a common practice is in progress.
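
To make the two suggested numbering schemes concrete, here is a small sketch that assigns placeholder page numbers to contents in order. It is illustrative only; the function name <code>placeholder_pages</code> and the <code>step</code> parameter are hypothetical, and neither scheme has consensus yet.

```python
def placeholder_pages(titles: list[str], step: int = 10) -> list[tuple[str, str]]:
    """Pair each content title with a placeholder page number.

    step=1 gives 1, 2, 3...; step=10 gives 10, 20, 30..., which leaves
    gaps for items inserted later.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    return [(title, str(step * i)) for i, title in enumerate(titles, start=1)]


if __name__ == "__main__":
    contents = ["First Story", "Second Story", "Third Story"]
    print(placeholder_pages(contents))          # gapped scheme
    print(placeholder_pages(contents, step=1))  # consecutive scheme
```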

Actual page numbers

In some cases, the HTML version of a Project Gutenberg text includes indications of page numbers, normally matching those in the source text fairly closely. (The practice seems to be becoming more common in recent PG editions.) In such cases, please treat these just as if they were physical page numbers in a printed volume.

Binding field

Please enter the binding as "ebook" -DES Talk 14:52, 6 Feb 2008 (CST)


Formats

Project Gutenberg etexts are always made available in a pure ASCII format. Frequently other formats, such as HTML, Plucker, and the like are also available for a given text. Please include an entry in the notes field documenting the formats available for a given etext. For example: "This ebook is available in ASCII and HTML formats". -DES Talk 15:18, 28 April 2008 (UTC)

Link

Please include a link to the actual Project Gutenberg edition in the notes field. For example:

This ebook edition is available in HTML, ASCII, and iso-8859-1 formats as <a HREF="http://www.gutenberg.org/etext/20836">Ebook #20836</a>.

The link should, as in the example, go to the root page for the etext, rather than to any of the actual texts. The root page is always at an address like "http://www.gutenberg.org/etext/nnnn", where "nnnn" is the etext number. -DES Talk 15:18, 28 April 2008 (UTC)
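
For anyone entering many PG publications, the notes line in the example can be generated from the etext number and the list of formats. The following is a sketch under stated assumptions: the function name <code>notes_line</code> is hypothetical, the wording copies the example above, and the list of formats must be supplied by the editor after checking the etext's root page.

```python
def notes_line(etext_number: int, formats: list[str]) -> str:
    """Build a notes-field entry listing formats and linking to the root page."""
    if not formats:
        raise ValueError("at least one format is required")
    if len(formats) == 1:
        listed, noun = formats[0], "format"
    elif len(formats) == 2:
        listed, noun = " and ".join(formats), "formats"
    else:
        # Three or more: comma-separated with a final ", and".
        listed, noun = ", ".join(formats[:-1]) + ", and " + formats[-1], "formats"
    url = f"http://www.gutenberg.org/etext/{etext_number}"
    return (
        f"This ebook edition is available in {listed} {noun} "
        f'as <a HREF="{url}">Ebook #{etext_number}</a>.'
    )


if __name__ == "__main__":
    print(notes_line(20836, ["HTML", "ASCII", "iso-8859-1"]))
```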