Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: (Long post) Metaphone Algorithm In AWK
Date: Wed, 21 Aug 2024 01:07:29 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 28
Sender: Mike Sanders <busybox@sdf.org>
Message-ID: <va3ekh$3it79$1@dont-email.me>
References: <v9qbgh$1u7qe$1@dont-email.me> <878qwts8bd.fsf@bsb.me.uk> <va1aht$3906i$1@dont-email.me> <87wmkapx0x.fsf@bsb.me.uk>
Injection-Date: Wed, 21 Aug 2024 03:07:30 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="c07617ccf20e561b171d3b81aed9e688"; logging-data="3765481"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19W2HT0C08XHtrok+UbgTHW"
User-Agent: tin/2.6.2-20221225 ("Pittyvaich") (NetBSD/9.3 (amd64))
Cancel-Lock: sha1:9cWoYl0OmJHZl0SDDqAHcEO9NfA=
Bytes: 2001

Ben Bacarisse <ben@bsb.me.uk> wrote:

> I don't know what your are asking for as this (your latest AWK) is not
> just an implementation of the metaphone algorithm. With the extra
> Levenshtein test it "texas" matches only a few words.

Not seeking/asking for anything Ben, just enjoy the ride =)

As for my Metaphone take... In fact it is. Several Metaphone variants
use Levenshtein & can be any mixture of three types of Metaphone versions
further still, or even a mix. Seems that's the way it is in the wild...

> However, if I remove the extra condition (that levenshtein($x, find) <=
> 2) your AWK code matches a different set of words to the C
> implementation. Looking a bit deeper, your AWK code give the code TKSS
> to the word "texas" but the C code assigns is "TKS".

Just differing metaphone variants, witness...

Texas = Tex[ess] (if phonetically pronounced - almost slurred sounding)

But hey: Many thanks for your input kind sir, I appreciate it.

-- 
:wq
Mike Sanders