Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: (Long post) Metaphone Algorithm In AWK
Date: Wed, 21 Aug 2024 01:07:29 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 28
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <va3ekh$3it79$1@dont-email.me>
References: <v9qbgh$1u7qe$1@dont-email.me> <878qwts8bd.fsf@bsb.me.uk> <va1aht$3906i$1@dont-email.me> <87wmkapx0x.fsf@bsb.me.uk>
Injection-Date: Wed, 21 Aug 2024 03:07:30 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="c07617ccf20e561b171d3b81aed9e688";
	logging-data="3765481"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19W2HT0C08XHtrok+UbgTHW"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:9cWoYl0OmJHZl0SDDqAHcEO9NfA=
Bytes: 2001

Ben Bacarisse <ben@bsb.me.uk> wrote:

> I don't know what you are asking for, as this (your latest AWK) is not
> just an implementation of the metaphone algorithm.  With the extra
> Levenshtein test, "texas" matches only a few words.

Not seeking/asking for anything Ben, just enjoy the ride =)

As for my Metaphone take... In fact it is an implementation. Several
Metaphone variants pair the phonetic code with a Levenshtein check, and
an implementation can follow any of the three Metaphone versions, or
even mix them. Seems that's the way it is in the wild...
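
For anyone following along, here is a minimal sketch of the Levenshtein
half of that pairing. It is not the code posted earlier in this thread:
the shell wrapper `lev` and the helper `min3` are hypothetical names
chosen for illustration, and the AWK inside sticks to plain POSIX awk.

```sh
# lev A B: print the Levenshtein edit distance between strings A and B.
# Hypothetical helper for illustration; the awk program is plain POSIX awk.
lev() {
  awk -v s="$1" -v t="$2" '
    function min3(x, y, z,    m) {
      m = (x < y) ? x : y
      return (m < z) ? m : z
    }
    BEGIN {
      m = length(s); n = length(t)
      # d[i, j] = edits needed to turn the first i chars of s
      # into the first j chars of t
      for (i = 0; i <= m; i++) d[i, 0] = i
      for (j = 0; j <= n; j++) d[0, j] = j
      for (i = 1; i <= m; i++)
        for (j = 1; j <= n; j++) {
          cost = (substr(s, i, 1) == substr(t, j, 1)) ? 0 : 1
          d[i, j] = min3(d[i-1, j] + 1, d[i, j-1] + 1, d[i-1, j-1] + cost)
        }
      print d[m, n]
    }'
}
```

With this sketch, `lev texas taxes` prints 2 and `lev TKS TKSS` prints 1,
so a cutoff of <= 2 would count either pair as close.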
 
> However, if I remove the extra condition (that levenshtein($x, find) <=
> 2) your AWK code matches a different set of words to the C
> implementation.  Looking a bit deeper, your AWK code gives the code TKSS
> to the word "texas" but the C code assigns it "TKS".

Just differing metaphone variants, witness...

Texas = Tex[ess] (if phonetically pronounced - almost slurred sounding)

But hey: Many thanks for your input kind sir, I appreciate it.

-- 
:wq
Mike Sanders