<mailman.23.1728742245.4695.python-list@python.org>

Path: ...!news.mixmin.net!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: <avi.e.gross@gmail.com>
Newsgroups: comp.lang.python
Subject: RE: Correct syntax for pathological re.search()
Date: Sat, 12 Oct 2024 10:10:41 -0400
Lines: 60
Message-ID: <mailman.23.1728742245.4695.python-list@python.org>
References: <ve0o34$1nep4$1@dont-email.me>
 <MQaOO.3313338$EVn.2054758@fx04.ams4>
 <011301db1c22$5e7519c0$1b5f4d40$@gmail.com>
 <20241012105958.cbctekv7vustleha@hjp.at>
 <003201db1cb0$85ac8760$91059620$@gmail.com>

Peter,

Matthew understood what I was hinting at in one way and you in another.

The question was how many backslashes to add (some power of two), or what
other changes to make, so that the RE functionality sees what you want. The
goal is to see what happens when one or more intermediate evaluations change
the string.

So, a simple print may suffice as a parallel way to force the same
evaluations. 
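
A minimal sketch of that idea, assuming the pattern under discussion is the
one quoted below (a literal backslash followed by "sout{"). The variable names
and the sample text are made up for illustration. Printing the string shows
the characters left after Python has processed the string literal, which is
what the re module then receives:

```python
import re

# Hypothetical example: match a literal backslash followed by "sout{".
literal = "\\\\sout{"   # ordinary string literal: each "\\" becomes one backslash
raw = r"\\sout{"        # raw string literal: backslashes are kept as typed

# print() shows the text after the string-literal evaluation,
# i.e. the pattern text that the regex engine will parse.
print(literal)
print(raw)

# Both spellings produce the same seven-character pattern.
print(literal == raw, len(literal))

# The regex engine then does its own evaluation: "\\" in the pattern
# means one literal backslash in the text being searched.
match = re.search(literal, "text \\sout{struck}")
print(match.group(0))
```

If the printed pattern does not look like the regular expression you meant to
write, the problem is in the string literal rather than in the regex itself.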

Thomas made his point. And, I am starting to feel like I need to change my
name to something like Luke since this discussion must be gospel.

FYI, I was not planning on posting at all. Time to detach.


-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On
Behalf Of Peter J. Holzer via Python-list
Sent: Saturday, October 12, 2024 7:00 AM
To: python-list@python.org
Subject: Re: Correct syntax for pathological re.search()

On 2024-10-11 17:13:07 -0400, AVI GROSS via Python-list wrote:
> Is there some utility function out there that can be called to show what the
> regular expression you typed in will look like by the time it is ready to be
> used?

I assume that by "ready to be used" you mean the compiled form?

No, there doesn't seem to be a way to dump that. You can

    import re

    p = re.compile("\\\\sout{")
    print(p.pattern)

but that just prints the input string, which you could do without
compiling it first.

But - without having looked at the implementation - it's far from clear
that the compiled form would be useful to the user. It's probably some
kind of state machine, and a large table of state transitions isn't very
readable.
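
For what it is worth, the standard re module does accept an re.DEBUG flag,
which makes re.compile() print debug information about the parsed pattern to
standard output. Whether that counts as the "compiled form" is debatable, and
the exact layout of the dump is an implementation detail that differs between
Python versions, so treat the following as a sketch rather than a stable
interface. The name debug_dump is just for illustration:

```python
import contextlib
import io
import re

# Capture what re.DEBUG prints so it can be inspected as a string.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    # Same pattern as above, written as a raw string for readability.
    re.compile(r"\\sout{", re.DEBUG)

debug_dump = buffer.getvalue()

# The dump lists the parsed elements; a literal backslash shows up
# as the character code 92.
print(debug_dump)
```

The dump shows how the regex parser interpreted each character, which can help
confirm that a backslash or brace was taken literally.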

There are a number of websites which visualize regular expressions.
Those are probably better for debugging a regular expression than
anything the re module could reasonably produce (although with the
caveat that such a web site would use a different implementation and
therefore might produce different results).

        hp

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"