Path: ...!3.eu.feeder.erje.net!feeder.erje.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Newsgroups: comp.editors
Subject: Re: Automating an atypical search & replace
Date: Sat, 13 Jul 2024 19:48:57 +0200
Organization: A noiseless patient Spider
Lines: 66
Message-ID: <v6uem9$3mj8r$1@dont-email.me>
References: <v6u8qi$3lh0j$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 13 Jul 2024 19:48:58 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1b7f34cd7e8ffb468887c458977b23b9";
	logging-data="3886363"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX18yogptqtOYV20EKZ7e4/Qf"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.8.0
Cancel-Lock: sha1:yy1xBw346hNs55kBkAp1ekLlA6U=
In-Reply-To: <v6u8qi$3lh0j$1@dont-email.me>
X-Enigmail-Draft-Status: N1110
Bytes: 3208

On 13.07.2024 18:08, Richard Owlett wrote:
> I'm reformatting some HTML files containing chapters of the KJV Bible.
> My source follows the practice of italicizing some words.
> I find italics distracting.
> 
> These occurrences are consistently of the form
>    <span class='add'>arbitrary_text</span>
> 
> I wish to delete "<span class='add'>" and *ASSOCIATED* "</span>".
> Obviously it would not be wise to fully automate the action.
> I wish to find all occurrences of <span
> class='add'>arbitrary_text</span> and manually confirm the edit.
> 
> In general, is it feasible?

Yes, sure.

Some remarks...
I would use Regular Expressions (RE) for that task.
If <span> sections can be nested in your HTML source then you
cannot do that with plain RE processors.
Since you want to inspect each <span> pattern individually it's
not clear what you mean by "automate" (which I'd interpret as
running a batch job to do the process).
Actually you seem to want a sequential find + replace-or-skip.
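
To illustrate the remark about nesting, here is a minimal Python sketch (not Vim). The sample strings and the class name 'x' are made up for the demonstration, and it assumes the tags are written exactly as in your example. A non-greedy RE handles the flat case, but on nested spans it pairs the opening tag with the wrong "</span>":

```python
import re

# Non-greedy pattern: opening tag, shortest possible text, first closing tag.
PATTERN = re.compile(r"<span class='add'>(.*?)</span>")

# Invented sample data, not taken from the actual HTML files.
flat = "some <span class='add'>added</span> words"
nested = "<span class='add'>a <span class='x'>b</span> c</span>"

# Replace each match by its captured text, i.e. drop only the two tags.
flat_result = PATTERN.sub(r"\1", flat)
nested_result = PATTERN.sub(r"\1", nested)

print(flat_result)    # tags removed, text kept
print(nested_result)  # the inner </span> was consumed instead of the outer one
```

In the nested case the inner span ends up wrapping "b c" instead of "b", which is why a real HTML parser would be needed there.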

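In Vim I'd search for the "<span ..." pattern and then delete
up to and including the next "</span>" pattern. (Assuming no
nested <span>.) Note that this removes the enclosed text as well;
if the text should stay, delete only the two tags.
Rinse and repeat.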
That could be (for example) the commands [case 1]

  /<span class='add'>
  d/<\/span>
  df>

If there's no other <...> inside the span-sections you could
simplify that to [case 2]

  /<span class='add'>
  d2f>

with the opportunity to repeat those search+delete commands
by simply typing  n.  for every match, like  n.n.n.n.  or if
you want to skip some like, e.g.,   n.nnnn.n.nnn.n

With  n  you get to the next span pattern and  .  repeats the
last command.

In [case 1] a plain  .  repeat isn't enough, since there are two
delete operations,  d/<\/span>  and  df>  , and  .  repeats only
the last one. Here you can record both into a macro (e.g.  qq  to
start recording into register q,  q  to stop) and replay it with
@q  after each  n  .
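
Outside of Vim, the same find + replace-or-skip loop could be sketched in Python as below. This is only a sketch under assumptions: the names strip_add_spans, confirm and ask are hypothetical, the tags are assumed to be written exactly as "<span class='add'>" with no nesting, and (unlike the Vim commands above) it keeps the enclosed text and drops only the tags.

```python
import re
from typing import Callable

# Assumes un-nested spans written exactly like this; DOTALL lets the
# enclosed text run across line breaks.
PATTERN = re.compile(r"<span class='add'>(.*?)</span>", re.DOTALL)


def strip_add_spans(html: str, confirm: Callable[[re.Match], bool]) -> str:
    """Remove the tags of every confirmed match, keeping the enclosed text."""

    def replace(match: re.Match) -> str:
        if confirm(match):
            return match.group(1)  # confirmed: tags dropped, text kept
        return match.group(0)      # skipped: match left unchanged

    return PATTERN.sub(replace, html)


def ask(match: re.Match) -> bool:
    """Hypothetical interactive confirmation on the terminal."""
    answer = input(f"Unwrap {match.group(0)!r}? [y/N] ")
    return answer.strip().lower() == "y"
```

Calling strip_add_spans(text, ask) would prompt once per match, much like  n  followed by either  .  or another  n  in Vim; passing a different confirm function makes it scriptable.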

Sounds complicated? - Maybe. - But if we know your exact data
format we can suggest the Vim command sequence that is
easiest to use.


> Can KDE's Kate do it?

Don't know.

Janis

> 
> TIA