From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.python
Subject: Re: Code improvement question
Date: 19 Nov 2023 10:04:06 GMT
Organization: Stefan Ram
Message-ID: <regexp-20231119110325@ram.dialup.fu-berlin.de>
X-Copyright: (C) Copyright 2023 Stefan Ram. All rights reserved.
	Distribution through any means other than regular usenet
	channels is forbidden. It is forbidden to publish this
	article in the Web, to change URIs of this article into links,
        and to transfer the body without this notice, but quotations
        of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
	services to mirror the article in the web. But the article may
	be kept on a Usenet archive server with only NNTP access.

ram@zedat.fu-berlin.de (Stefan Ram) writes:
>Thomas Passin <list1@tompassin.net> writes:
>>>>>>>> re.findall(r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b', txt)
>Or,

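def repeat_preceding( min=None, max=None, count=None ):
    ''' Require that the preceding regexp is repeated a certain
    number of times. Use either count, or min and/or max; an
    omitted bound is left open, as in '{2,}'. '''
    if count is not None: # 'count=0' is valid, so do not test truthiness
        return '{' + str( count ) + '}'
    lower = '' if min is None else str( min )
    upper = '' if max is None else str( max )
    return '{' + lower + ',' + upper + '}'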

digit = '[0-9]' # match a decimal digit
word_boundary = r'\b' # match a word boundary
a_hyphen = '-' # match a literal hyphen character

def digits( **kwargs ):
    ''' A certain number of digits. See 'repeat_preceding' for
    the possible kwargs. '''
    return digit + repeat_preceding( **kwargs )

def word( regexp: str ):
    ''' something that starts and ends with a word boundary '''
    return word_boundary + regexp + word_boundary

my_regexp = \
word \
( digits( min=2, max=7 ) + a_hyphen + 
  digits( count=2 ) + a_hyphen +
  digits( count=2 ))
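
  As a quick check, the composed pattern can be compared with the
  hand-written one from the quoted post and then tried on some text.
  The following is only a sketch: it restates the helpers compactly
  so that it runs on its own, and the sample string "txt" is made up
  for illustration, it does not come from the thread.

```python
import re

# Compact restatement of the helpers, so this block runs on its own.
digit = '[0-9]'
word_boundary = r'\b'
a_hyphen = '-'

def repeat_preceding( min=None, max=None, count=None ):
    if count is not None:
        return '{' + str( count ) + '}'
    lower = '' if min is None else str( min )
    upper = '' if max is None else str( max )
    return '{' + lower + ',' + upper + '}'

def digits( **kwargs ):
    return digit + repeat_preceding( **kwargs )

def word( regexp: str ):
    return word_boundary + regexp + word_boundary

my_regexp = word( digits( min=2, max=7 ) + a_hyphen +
                  digits( count=2 ) + a_hyphen +
                  digits( count=2 ))

# The composed pattern should be identical to the hand-written one.
assert my_regexp == r'\b[0-9]{2,7}-[0-9]{2}-[0-9]{2}\b'

# Hypothetical sample text (an assumption, not from the thread).
txt = 'seen 2023-11-19, 64-17-05 and 1-23-45, but not 12345678-12-34'
matches = re.findall( my_regexp, txt )
print( matches )
```

  With this sample, only the first two candidates should be found:
  "1-23-45" starts with a single digit, and "12345678-12-34" has
  eight leading digits with no word boundary inside the run.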