From: porkchop@invalid.foo (Mike Sanders)
Newsgroups: comp.lang.awk
Subject: Re: (Long post) Metaphone Algorithm In AWK
Date: Wed, 21 Aug 2024 02:50:07 -0000 (UTC)
References: <878qwts8bd.fsf@bsb.me.uk> <87wmkapx0x.fsf@bsb.me.uk>

Mike Sanders wrote:

> Ben Bacarisse wrote:
>
>> However, if I remove the extra condition (that levenshtein($x, find) <=
>> 2) your AWK code matches a different set of words to the C
>> implementation. Looking a bit deeper, your AWK code give the code TKSS
>> to the word "texas" but the C code assigns is "TKS".
>
> Just differing metaphone variants, witness...
>
> Texas = Tex[ess] (if phonetically pronounced - almost slurred sounding)

See also:

https://tilores.io/metaphone-phonetic-algorithm-online-tool?t1=texas&t2=taxi

-- 
:wq
Mike Sanders