From: Richard Owlett <rowlett@access.net>
Newsgroups: comp.editors
Subject: Re: Automating an atypical search & replace
Date: Sun, 14 Jul 2024 02:51:45 -0500
Organization: A noiseless patient Spider
Message-ID: <v7002j$2710$1@dont-email.me>
References: <v6u8qi$3lh0j$1@dont-email.me> <v6v372$3povk$8@dont-email.me>
 <MPG.40fd0c80e0a741bc99030e@news.individual.net>
In-Reply-To: <MPG.40fd0c80e0a741bc99030e@news.individual.net>

On 07/14/2024 01:13 AM, Stan Brown wrote:
> On Sat, 13 Jul 2024 23:39:14 -0000 (UTC), Lawrence D'Oliveiro wrote:
>> On Sat, 13 Jul 2024 11:08:48 -0500, Richard Owlett wrote:
>>
>>> These occurrences are consistently of the form
>>>     <span class='add'>arbitrary_text</span>
>>>
>>> I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
>>
>> This is beyond the abilities of regular expressions. This is the point
>> where you need to use an actual HTML/XML-parsing library.
>>
>
> In general I'd agree with you. But the OP made a big deal -- in a
> different thread, for some reason -- about wanting to use minimal
> HTML, so I doubt very much there will be nested <span> ... </span>
> sequences.

I'd compare using minimal HTML to learning to crawl before attempting to run a marathon ;}

>
> Also, the OP quite rightly wanted to confirm each change before it is
> made, so presumably if there are any nested sequences he will say no
> to that particular edit and fix it manually.
>
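
For the non-nested case described in the quoted text, a minimal sketch in Python might look like the following. Nobody in the thread posted this; the function name `unwrap_add_spans` and the `confirm` callback are hypothetical, and the pattern assumes the opening tag is always written exactly as `<span class='add'>`. A span that contains another span is deliberately left untouched so it can be fixed by hand.

```python
import re

# Assumption: the opening tag always appears exactly as <span class='add'>.
# The body may not contain another opening or closing span tag, so a span
# that wraps a nested span never matches and is left for manual editing.
ADD_SPAN = re.compile(
    r"<span class='add'>((?:(?!</?span\b).)*)</span>",
    re.DOTALL,
)


def unwrap_add_spans(html, confirm=lambda match: True):
    """Drop <span class='add'> wrappers and keep the text inside them.

    `confirm` is a hypothetical hook: it receives each regex match and
    returns True to apply that edit or False to leave the text as it is.
    """
    def replace(match):
        return match.group(1) if confirm(match) else match.group(0)

    return ADD_SPAN.sub(replace, html)


sample = "<p>Keep <span class='add'>this text</span> and <span class='other'>that</span>.</p>"
nested = "<span class='add'>a <span>b</span> c</span>"

print(unwrap_add_spans(sample))
print(unwrap_add_spans(nested))
```

To confirm each change interactively, `confirm` could be a function that prints `match.group(0)` and asks for a yes or no answer. For anything beyond this simple shape, an HTML parser is still the safer tool, as suggested earlier in the thread.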