LISTSERV - TWAIN-L Archives - LISTSERV.YORKU.CA

Mark Twain Forum

TWAIN-L@YORKU.CA

Subject:	Re: computer-based text analysis
From:	Sharon Goetz
Reply To:	Mark Twain Forum
Date:	Sun, 10 Sep 2006 19:28:05 -0700

It all depends on what sort of analysis you want to conduct, and why. If
you go with Project Gutenberg, you may want to check which editions
underlie individual releases of their texts. The proofreading done there
(now via http://www.pgdp.net/) is meant only to ensure that a text matches
the edition that was scanned. PG rarely has the luxury of caring which
out-of-copyright edition it picks up.

Naturally, if a PG text's header doesn't declare which print edition it
used, things will be more complicated, since it'll be harder to find the
introduction (if any) once attached to that text and learn thereby how the
text was established. If you're analyzing dialectal usage in dialogue,
whether an edition normalizes spelling will matter, for example. Depending
on when PG released the text you're interested in, you might also try to
find the PGDP forum that discussed its proofreading or to contact the
person who managed scanning/correcting of the PG version. Some PGDP
managers add project-specific instructions to the standard formatting and
proofreading guidelines, both of which are linked from
  http://www.pgdp.net/c/faq/faq_central.php
--and documentation of those instructions isn't visible in a text's final
release. I helped some time ago to proofread part of Holinshed's
_Chronicles_, and I see no enumeration in the final product of the minor
tweaks that Ingram, the manager, had us enact.
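
As a first pass at the header check described above, here is a minimal
sketch in Python. It assumes the PG plain-text file separates its header
from the body with a line beginning "*** START OF", which is common but not
guaranteed for every release. The function names, the keyword list, and the
200-line fallback are hypothetical choices for illustration, not anything PG
or PGDP prescribes.

```python
import re

# Assumption: the header ends at a line that begins "*** START OF".
START_MARKER = re.compile(r"^\*\*\*\s*START OF", re.IGNORECASE)

# Hypothetical keywords that may signal a statement about the print source.
EDITION_HINTS = re.compile(
    r"\b(edition|published by|publisher|reprint|transcribed from)\b",
    re.IGNORECASE,
)


def header_lines(text, max_lines=200):
    """Return the lines before the start marker.

    If no marker is found, fall back to the first max_lines lines so the
    body of the book is not mistaken for the header.
    """
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if START_MARKER.match(line.strip()):
            return lines[:index]
    return lines[:max_lines]


def edition_hints(text, max_lines=200):
    """Return header lines that mention an edition, publisher, or source."""
    return [
        line.strip()
        for line in header_lines(text, max_lines)
        if EDITION_HINTS.search(line)
    ]
```

An empty result from edition_hints() only means that no keyword matched. The
header may still describe its source in other words, so it is worth reading
by eye before concluding that the edition is undeclared.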

Since PG can't afford time-wise to attest the full provenance of its
texts, I'd be reluctant to call their output "academically acceptable,"
useful though it be for casual reading. At this time I don't know of any
reliable electronic textual resources to suggest, though.

Cheers,

Sharon Goetz
