Having done a bit of research and played around with a few stylometric
tutorials this weekend, I have to report that the Texan Steer piece is
probably far too short for any meaningful analysis with those tools.
For a Delta analysis the minimum acceptable sample size is around
2000 words, while the comfortable range is closer to 5000 words. A rough
word count from bad OCR of this article puts it at around 500 words.
In one of the above-mentioned tutorials, five unidentified texts were
compared to two known corpora, marked C and D, and wouldn't you know it,
the tools found that one of the Wilkie Collins pieces was much closer to
Dickens than to Collins. So ... results may be surprising.
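
For anyone curious what a Delta comparison does under the hood, here is a
minimal sketch of Burrows's Delta in plain Python. It is not the code from
the tutorials, and the names (tokenize, relative_freqs, burrows_delta,
n_features) are my own placeholders. It assumes you have plain-text samples
for each candidate author plus the disputed piece. Each score is the mean
absolute difference between z-scored frequencies of the most common words,
and a lower score means a closer match. With only about 500 words those
frequencies are very noisy, which is the problem described above.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def relative_freqs(tokens, vocab):
    """Relative frequency of each vocabulary word in one token list."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    return {w: counts[w] / total for w in vocab}


def burrows_delta(candidates, disputed, n_features=30):
    """Return {author: Delta} for a disputed text (lower = closer).

    candidates: dict mapping an author label to that author's sample text.
    n_features: how many of the most frequent words to compare
                (an illustrative default, not a recommended value).
    """
    cand_tokens = {name: tokenize(t) for name, t in candidates.items()}

    # Feature set: the most frequent words in the pooled candidate samples.
    pooled = Counter()
    for toks in cand_tokens.values():
        pooled.update(toks)
    vocab = [w for w, _ in pooled.most_common(n_features)]

    freqs = {n: relative_freqs(t, vocab) for n, t in cand_tokens.items()}
    d_freqs = relative_freqs(tokenize(disputed), vocab)

    totals = {name: 0.0 for name in candidates}
    used = 0
    for w in vocab:
        vals = [freqs[name][w] for name in freqs]
        mu, sigma = mean(vals), pstdev(vals)
        if sigma == 0:
            continue  # word is used identically by everyone; no signal
        used += 1
        z_disputed = (d_freqs[w] - mu) / sigma
        for name in freqs:
            z_author = (freqs[name][w] - mu) / sigma
            totals[name] += abs(z_author - z_disputed)

    if used == 0:
        return totals
    return {name: total / used for name, total in totals.items()}
```

Calling burrows_delta({"C": collins_text, "D": dickens_text}, mystery_text)
would return one score per label; collins_text, dickens_text and
mystery_text are stand-ins for whatever strings you load yourself.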
Leslie
On Fri, Feb 15, 2019 at 1:12 PM Hal Bush <[log in to unmask]> wrote:
> BTW: even today there are pockets in America, often in the south and
> among more ardent groups of evangelical and/or fundamentalist Christians,
> who continue to pronounce the word "humble" as "umble." As in, "he's an
> umble man!"
>
>
> I even know a few.
>
>
> That fact sort of reminds me of all the ballyhoo & brouhaha when our great
> leader the President mentioned "2 Corinthians." Actually, it turns out
> that many pockets of church folks still call it 2 Corinthians, or 2
> Timothy, or whatever. Of course it is beyond the scope of this post to
> argue that our leader said it that way due to his sympathies for the blue
> collar believers of the flyover district.
>
>
> ballyhoo & brouhaha are 2 great words making a comeback in our umble Age
> of Trump...
>
>
>
> Dr. Hal Bush
>
> Dept. of English
>
> Saint Louis University
>
> [log in to unmask]
>
> 314-977-3616
>
> http://halbush.com
>
> author website: halbush.com
>
> ________________________________
> From: Mark Twain Forum <[log in to unmask]> on behalf of Leslie MYRICK <
> [log in to unmask]>
> Sent: Thursday, February 14, 2019 12:40:44 PM
> To: [log in to unmask]
> Subject: Re: Rediscovered Twain Sketch?
>
> It appears that "humble" was occasionally pronounced with a dropped H even
> in the US at the time, especially if the speaker was from a family that
> immigrated from the UK, or, like the Express's political editor, from
> Canada. (Bob Hirst would know whether Larned's editorializing was ever this
> *sustainedly* humorous).
>
> Or, as I think someone else has noted, "an humble" could in this case,
> if it *was* written by MT, have been a typesetter's mistake.
>
> I took a look at "an humble" in the NYS newspapers archive, and found an
> interesting case of "a humble" vs "an humble" in the transcription of a
> speech by an Illinois congressman on the effects of Republican tariffs on
> farmers. If you compare these two versions, whose links will hopefully
> preserve the highlighting, you'll see at least one case of humble treated
> with a silent H and a voiced H in two reprints, suggesting an intervention
> based on differences in dialect.
> Geneva Gazette, 10 Jun 70
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__nyshistoricnewspapers.org_lccn_sn83031108_1870-2D06-2D10_ed-2D1_seq-2D4_-23date1-3D01-252F01-252F1869-26index-3D6-26date2-3D01-252F31-252F1874-26searchType-3Dadvanced-26SearchType-3Dphrase-26sequence-3D0-26words-3Dhumble-26proxdistance-3D-26to-5Fyear-3D1874-26rows-3D20-26ortext-3D-26from-5Fyear-3D1869-26proxtext-3D-26phrasetext-3Dan-2Bhumble-26andtext-3D-26dateFilterType-3Drange-26page-3D1&d=DwIFaQ&c=Pk_HpaIpE_jAoEC9PLIWoQ&r=f7i-Uq4rMQU8-TBe45qVLg&m=BuDtlZHCJHyBlf3h10-HisntoNDpMHXAqbPsxUoX3pE&s=SIIkmP3l4lRXiZF_U176dVFoRkNSv3hV1OoCv-Ai9qQ&e=
> Herkimer Democrat, 3 Aug 70
>
> https://urldefense.proofpoint.com/v2/url?u=http-3A__nyshistoricnewspapers.org_lccn_sn83031101_1870-2D08-2D03_ed-2D1_seq-2D2_-23date1-3D01-252F01-252F1869-26index-3D2-26date2-3D01-252F31-252F1874-26searchType-3Dadvanced-26SearchType-3Dphrase-26sequence-3D0-26words-3Dflannel-2Bhumble-2Bshirt-26proxdistance-3D-26to-5Fyear-3D1874-26rows-3D20-26ortext-3D-26from-5Fyear-3D1869-26proxtext-3D-26phrasetext-3Dhumble-2Bflannel-2Bshirt-26andtext-3D-26dateFilterType-3Drange-26page-3D1&d=DwIFaQ&c=Pk_HpaIpE_jAoEC9PLIWoQ&r=f7i-Uq4rMQU8-TBe45qVLg&m=BuDtlZHCJHyBlf3h10-HisntoNDpMHXAqbPsxUoX3pE&s=LNyV6qFT08f4UGZP0RnTigIayuZ4EUcUQsQ5b4vV5Nc&e=
>
> A survey of the same speech in newspapers.com shows 44 cases of "an
> humble," which is apparently how it was enunciated by Rep. Marshall, and
> faithfully transmitted, vs 9 cases of "a humble." The typesetter's or
> editor's intervention was apparently the dropping of the "n" in this case.
> But this sort of intervention could go both ways, depending on a person's
> dialect affinities.
>
> All to say, I suggest that "an humble" could just be a typo, and not
> necessarily a viable data point -- or what I used to call, before I retired
> from MTP, "a glitch."
>
> I say data point, because in at least one branch of stylometry, articles,
> conjunctions, and other words more unconsciously generated by a writer's
> brain appear to make the best case for identification.
>
> For Too Much Information on how stylometry works (yet you can cherry pick
> really useful information from it) see
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__programminghistorian.org_en_lessons_introduction-2Dto-2Dstylometry-2Dwith-2Dpython&d=DwIFaQ&c=Pk_HpaIpE_jAoEC9PLIWoQ&r=f7i-Uq4rMQU8-TBe45qVLg&m=BuDtlZHCJHyBlf3h10-HisntoNDpMHXAqbPsxUoX3pE&s=vfp8dReeCNT4rgCh1AlUx3IEy30LApZzm5Ke3z2D9_w&e=
> Leslie
>
> On Wed, Feb 13, 2019 at 10:13 PM Clay Shannon <[log in to unmask]>
> wrote:
>
> > Thanks, Barb! I've added it to my amazon shopping list - will purchase it
> > later.
> > - B. Clay Shannon
> >
> > On Wednesday, February 13, 2019, 5:57:19 PM PST, Barbara Schmidt <
> > [log in to unmask]> wrote:
> >
> > Clay asked -- Has anybody compiled a list of Twain's "vocabulary" --
> > Yes.
> >
> > A MARK TWAIN LEXICON by Robert Ramsay and Frances Emberson. Published in
> > 1963.
> >
> > Barb
> >
> >
>