I'd be very curious to know how the "control group" texts are selected for such tests. Seems like a massive potential for bias there.
Carl
-----Original Message-----
From: Mark Twain Forum <[log in to unmask]> On Behalf Of Matt Seybold
Sent: Wednesday, February 20, 2019 8:16 PM
To: [log in to unmask]
Subject: Re: Rediscovered Twain Sketch?
Thanks, Leslie, and all. I came across the phrase “of first water” in another context today. I had not recognized this was part of Twain’s repertoire, but it is, including in other contexts from the same season. Of course, other journalists used it too. “Steer” was published during the African diamond fever and “of first water” is a colloquialism from the gemstone industry. Twain and J. H. Riley were in the process of hatching their plan for a book about the diamond fever (never realized). Again, all circumstantial. - MS
> On Feb 18, 2019, at 3:30 PM, Mac Donnell Rare Books <[log in to unmask]> wrote:
>
> Another problem in stylometric studies is finding an accurate text of the piece being studied as well as accurate texts for the control group. Texts taken from 19th century books often reflect house styles imposed by editors. This doesn't make all texts look alike, of course, but it makes them look a wee bit less different.
>
> I always laugh when I think of the scholarly excitement over Herman Melville's "soiled fish" --until it was discovered that Melville's fish were merely coiled, like everybody else's fish.
>
> Kevin
> Mac Donnell Rare Books
> 9307 Glenlake Drive
> Austin TX 78730
> 512-345-4139
> Member: ABAA, ILAB, BSA
>
> You can browse our books at:
> www.macdonnellrarebooks.com
>
>
> ------ Original Message ------
> From: "Leslie MYRICK" <[log in to unmask]>
> To: [log in to unmask]
> Sent: 2/18/2019 1:46:43 PM
> Subject: Re: Rediscovered Twain Sketch?
>
>> Update: now that I've removed my gearhead cap and looked again at the
>> sources, the Wilkie piece adhering too closely to Dickens was the
>> first chapter of A House to Let, on which he and Dickens
>> collaborated. So, hurray, R Studio application.
>> But misattributions are entirely possible when you use algorithms to
>> read texts.
>>
>>
>>
>>> On Fri, Feb 15, 2019 at 1:12 PM Hal Bush <[log in to unmask]> wrote:
>>>
>>> BTW: even today there are pockets in America, often in the south
>>> and among more ardent groups of evangelical and/or fundamentalist
>>> Christians, who continue to pronounce the word "humble" as "umble."
>>> As in, "he's an umble man!"
>>>
>>>
>>> I even know a few.
>>>
>>>
>>> That fact sort of reminds me of all the ballyhoo & brouhaha when our
>>> great leader the President mentioned "2 Corinthians." Actually, it
>>> turns out that many pockets of church folks still call it 2
>>> Corinthians, or 2 Timothy, or whatever. Of course it is beyond the
>>> scope of this post to argue that our leader said it that way due to
>>> his sympathies for the blue collar believers of the flyover district.
>>>
>>>
>>> ballyhoo & brouhaha are 2 great words making a comeback in our umble
>>> Age of Trump...
>>>
>>>
>>>
>>> Dr. Hal Bush
>>>
>>> Dept. of English
>>>
>>> Saint Louis University
>>>
>>> [log in to unmask]
>>>
>>> 314-977-3616
>>>
>>> http://halbush.com
>>>
>>> author website: halbush.com
>>>
>>> ________________________________
>>> From: Mark Twain Forum <[log in to unmask]> on behalf of Leslie MYRICK
>>> <[log in to unmask]>
>>> Sent: Thursday, February 14, 2019 12:40:44 PM
>>> To: [log in to unmask]
>>> Subject: Re: Rediscovered Twain Sketch?
>>>
>>> It appears that "humble" was occasionally pronounced with a dropped
>>> H even in the US at the time, especially if the speaker was from a
>>> family that immigrated from the UK, or, like the Express's political
>>> editor, from Canada. (Bob Hirst would know whether Larned's
>>> editorializing was ever this
>>> *sustainedly* humorous).
>>>
>>> Or, as I think someone else has noted, "an humble" could have been
>>> in this case, if it *was* written by MT, a typesetter's mistake.
>>>
>>> I took a look at "an humble" in the NYS newspapers archive, and
>>> found an interesting case of "a humble" vs "an humble" in the
>>> transcription of a speech by an Illinois congressman on the effects
>>> of Republican tariffs on farmers. If you compare these two versions,
>>> whose links will hopefully preserve the highlighting, you'll see at
>>> least one case of humble treated with a silent H and a voiced H in
>>> two reprints, suggesting an intervention based on differences in dialect.
>>> Geneva Gazette, 10 Jun 70
>>>
>>> http://nyshistoricnewspapers.org/lccn/sn83031108/1870-06-10/ed-1/seq-4/#date1=01%2F01%2F1869&index=6&date2=01%2F31%2F1874&searchType=advanced&SearchType=phrase&sequence=0&words=humble&proxdistance=&to_year=1874&rows=20&ortext=&from_year=1869&proxtext=&phrasetext=an+humble&andtext=&dateFilterType=range&page=1
>>> Herkimer Democrat, 3 Aug 70
>>>
>>> http://nyshistoricnewspapers.org/lccn/sn83031101/1870-08-03/ed-1/seq-2/#date1=01%2F01%2F1869&index=2&date2=01%2F31%2F1874&searchType=advanced&SearchType=phrase&sequence=0&words=flannel+humble+shirt&proxdistance=&to_year=1874&rows=20&ortext=&from_year=1869&proxtext=&phrasetext=humble+flannel+shirt&andtext=&dateFilterType=range&page=1
>>>
>>> A survey of the same speech in newspapers.com shows 44 cases of "an
>>> humble," which is apparently how it was enunciated by Rep. Marshall,
>>> and faithfully transmitted, vs 9 cases of "a humble." The
>>> typesetter's or editor's intervention was apparently the dropping of the "n" in this case.
>>> But this sort of intervention could go both ways, depending on a
>>> person's dialect affinities.
>>>
>>> All to say, I suggest that "an humble" could just be a typo, and not
>>> necessarily a viable data point -- or what I used to call, before I
>>> retired from MTP, "a glitch."
>>>
>>> I say data point, because in at least one branch of stylometry,
>>> articles, conjunctions, and other words more unconsciously generated
>>> by a writer's brain appear to make the best case for identification.
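
A minimal sketch of the function-word idea described above, offered only as an illustration and not as the method used in any study mentioned in this thread. The word list, the function names (`function_word_profile`, `profile_distance`), and the distance measure (a plain mean absolute difference) are all assumptions made for the example; published stylometric work uses much longer word lists and more careful statistics.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list of function words.
# Real stylometric studies use far longer lists.
FUNCTION_WORDS = ["a", "an", "and", "but", "of", "the", "to"]


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return each function word's share of all word tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {w: (counts[w] / total if total else 0.0) for w in words}


def profile_distance(text_a, text_b, words=FUNCTION_WORDS):
    """Mean absolute difference between two profiles (smaller = more alike).

    This is an illustrative measure chosen for simplicity, not a
    standard stylometric statistic.
    """
    profile_a = function_word_profile(text_a, words)
    profile_b = function_word_profile(text_b, words)
    return sum(abs(profile_a[w] - profile_b[w]) for w in words) / len(words)
```

Under these assumptions, a disputed sketch would be compared against each candidate author's known writing, and a single "a humble" versus "an humble" changed by a typesetter would shift the "a" and "an" shares only slightly in a text of any length.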
>>>
>>> For Too Much Information on how stylometry works (yet you can cherry
>>> pick really useful information from it) see
>>>
>>> https://programminghistorian.org/en/lessons/introduction-to-stylometry-with-python
>>> Leslie
>>>
>>> On Wed, Feb 13, 2019 at 10:13 PM Clay Shannon <[log in to unmask]>
>>> wrote:
>>>
>>> > Thanks, Barb! I've added it to my amazon shopping list - will
>>> > purchase it later.
>>> > - B. Clay Shannon
>>> >
>>> > On Wednesday, February 13, 2019, 5:57:19 PM PST, Barbara
>>> > Schmidt <[log in to unmask]> wrote:
>>> >
>>> > Clay asked -- Has anybody compiled a list of Twain's
>>> > "vocabulary" -- Yes.
>>> >
>>> > A MARK TWAIN LEXICON by Robert Ramsay and Frances Emberson.
>>> > Published in 1963.
>>> >
>>> > Barb
>>> >
>>> >
>>>
>>