From: Stefan Reuther
Newsgroups: comp.arch.embedded
Subject: Re: Static regex for embedded systems
Date: Wed, 22 Jan 2025 17:53:15 +0100

On 22.01.2025 at 01:38, George Neuner wrote:
> On Tue, 21 Jan 2025 18:03:48 +0100, pozz wrote:
>>> (Personally, I have no problem with handcrafted parsers.)
>
> So long as they are correct 8-)

Correctness has an inverse correlation with complexity, so optimize for
non-complexity.

I would implement a two-stage parser: first break the lines into a
buffer, then throw a bunch of statements like

    if (Parser p(str); p.matchString("+")
        && p.matchTextUntil(":", &prefix)
        && p.matchWhitespace() ...)

at this, with Parser being a small C++ class wrapping the individual
matching operations (strncmp, strspn, etc.)

Surely this is more complex than a regex/template, but still easy
enough to be "obviously correct".

> Lex and Flex create table driven lexers (and driver code for them).
> Under certain circumstances Flex can create far smaller tables than
> Lex, but likely either would be massive overkill for the scenario you
> described.

Maybe, maybe not. I find it hard to extrapolate to the complete task
from the two examples given. If there are hundreds of these templates
that need to be matched bit by bit, I have the impression that lex
would be a quick and easy way to pull them out of a byte stream.
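
To make the Parser idea a bit more concrete, here is a minimal sketch
of what such a class could look like. Only the three method names come
from the snippet further up; the use of std::string_view, the rest()
accessor, the parseLine() helper and the "+NAME: 42" input format are
assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch of the small Parser class. Everything except the
// three match* method names is an assumption.
class Parser {
public:
    explicit Parser(std::string_view text) : rest_(text) {}

    // Consume `lit` if the remaining input starts with it.
    bool matchString(std::string_view lit) {
        if (rest_.substr(0, lit.size()) != lit) return false;
        rest_.remove_prefix(lit.size());
        return true;
    }

    // Copy everything before the first `delim` into *out, then consume
    // that text and the delimiter. Fails if `delim` does not occur.
    bool matchTextUntil(std::string_view delim, std::string* out) {
        assert(out != nullptr);
        const std::size_t pos = rest_.find(delim);
        if (pos == std::string_view::npos) return false;
        out->assign(rest_.data(), pos);
        rest_.remove_prefix(pos + delim.size());
        return true;
    }

    // Skip spaces and tabs. Assumption: whitespace is optional, so this
    // always succeeds.
    bool matchWhitespace() {
        const std::size_t n = rest_.find_first_not_of(" \t");
        rest_.remove_prefix(n == std::string_view::npos ? rest_.size() : n);
        return true;
    }

    // Unconsumed remainder of the line.
    std::string_view rest() const { return rest_; }

private:
    std::string_view rest_;
};

// Hypothetical second-stage matcher for one made-up line format,
// "+<prefix>: <value>".
bool parseLine(std::string_view line, std::string& prefix, std::string& value) {
    if (Parser p(line); p.matchString("+")
        && p.matchTextUntil(":", &prefix)
        && p.matchWhitespace()) {
        value.assign(p.rest().data(), p.rest().size());
        return true;
    }
    return false;
}
```

Because every if statement constructs a fresh Parser over the same
line, a chain that fails halfway does not need to undo anything; the
next template simply starts again from the beginning of the line.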

Splitting the input into lines first, and then tackling each line on
its own (...using lex, maybe? Or any other tool. Or a parser class.)
might be a good option as well. For example, this lets you answer the
question of whether line endings are required to be \r\n, or whether a
single \n also suffices, in one central place. And if you decide that
you want to do a hard connection close whenever you see a \r or \n
outside a \r\n sequence (to prevent an attack such as SMTP smuggling),
that would be easy.

  Stefan
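
PS: A rough sketch of the line-splitting stage with the strict \r\n
policy described above. The class name LineSplitter, the Status enum
and the feed() interface are hypothetical, and std::string and
std::vector are used only for brevity; a small target would more
likely use a fixed-size buffer with a length limit.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical first stage: turns a byte stream into lines and enforces
// the line-ending policy in one place. All names are assumptions.
class LineSplitter {
public:
    enum class Status { Ok, ProtocolError };

    // Feed a chunk of received bytes. Complete lines (without the
    // trailing \r\n) are appended to `lines`. A \r or \n outside a
    // \r\n pair yields ProtocolError, and the splitter stays in that
    // state; the caller is expected to close the connection.
    Status feed(std::string_view chunk, std::vector<std::string>& lines) {
        for (char c : chunk) {
            if (failed_) break;
            if (sawCR_) {
                sawCR_ = false;
                if (c != '\n') {          // bare \r
                    failed_ = true;
                    break;
                }
                lines.push_back(std::move(current_));
                current_.clear();
            } else if (c == '\r') {
                sawCR_ = true;            // may complete in the next chunk
            } else if (c == '\n') {
                failed_ = true;           // bare \n
            } else {
                current_.push_back(c);
            }
        }
        return failed_ ? Status::ProtocolError : Status::Ok;
    }

private:
    std::string current_;   // bytes of the line being assembled
    bool sawCR_ = false;    // previous byte was \r
    bool failed_ = false;   // sticky error flag
};
```

Relaxing the policy to accept a single \n would then be a change to
this one function, and none of the per-line matchers would need to
know about it.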