Path: ...!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.python
Subject: Re: A technique from a chatbot
Date: 5 Apr 2024 18:29:22 GMT
Organization: Stefan Ram
Lines: 65
Expires: 1 Feb 2025 11:59:58 GMT
Message-ID: <benchmark-20240405190253@ram.dialup.fu-berlin.de>
References: <chatbot-20240402181409@ram.dialup.fu-berlin.de> <fvstdk-607.ln1@lazy.lzy> <7d38d9e2-78fb-43af-971f-e0d4afb8b039@tompassin.net> <mailman.58.1712110350.3468.python-list@python.org> <uumtii$qum4$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de 7doB6KUYNRHEWkoJEP9nnQ4FRkoahC2AZPx0UZiug/xmuM
Cancel-Lock: sha1:gLtysVdziOHXllnhweodJqn/D8Y= sha256:rA+G2fhGLLvIcGULoIoIbquqZPnA9Y1fPz9g0Bj/dSw=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
	Distribution through any means other than regular usenet
	channels is forbidden. It is forbidden to publish this
	article in the Web, to change URIs of this article into links,
        and to transfer the body without this notice, but quotations
        of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
	services to mirror the article in the web. But the article may
	be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Bytes: 4002

Mark Bourne <nntp.mbourne@spamgourmet.com> wrote or quoted:
>I don't think there's a tuple being created.  If you mean:
>     ( word for word in list_ if word[ 0 ]== 'e' )
>...that's not creating a tuple.  It's a generator expression, which 
>generates the next value each time it's called for.  If you only ever 
>ask for the first item, it only generates that one.

  Yes, that's also how I understand it!
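
  To make the laziness visible, here is a minimal sketch (the helper
  names "find_first_e" and "starts_with_e" are my own, chosen only for
  this illustration). It records every word the generator expression
  actually tests before next returns.

```python
def find_first_e(words):
    """Return the first word starting with 'e' and the words that were tested."""
    examined = []

    def starts_with_e(word):
        examined.append(word)  # record every word the generator actually tests
        return word[0] == 'e'

    # A generator expression, not a tuple: nothing is evaluated yet.
    gen = (word for word in words if starts_with_e(word))
    first = next(gen, '')      # pulls values only until the first match
    return first, examined

first, examined = find_first_e(['apple', 'egg', 'eel', 'fig'])
print(first)     # egg
print(examined)  # ['apple', 'egg']
```

  The words after the first match are never looked at, which is
  the behavior described in the quoted text.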

  In the meantime, I wrote code for a microbenchmark, shown below.

  On my computer, this code shows the next+generator approach
  to be a bit faster than the procedural break approach. But
  when the order of the two approaches is swapped in the loop,
  it is shown to be a bit slower. So let's say both take about
  the same time.

  However, I also tested code with an early return (not shown below),
  and it was faster than both the code using break and the code
  using next+generator by a factor of about 1.6, even though
  the code with return has the "function call overhead"!
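
  The early-return code is not part of this post. As a rough sketch
  only, and not necessarily the code that was actually timed, such a
  function might look like this (the name "first_word_starting_with_e"
  is hypothetical):

```python
def first_word_starting_with_e(words):
    """Return the first word starting with 'e', or '' if there is none."""
    for word in words:
        if word[0] == 'e':
            return word  # early return takes the place of the break/else pair
    return ''            # reached only when no word matched

print(first_word_starting_with_e(['apple', 'egg', 'eel']))  # egg
```

  Like the other two variants, it assumes that no word is empty,
  because word[0] would raise an IndexError for ''.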

  But please be aware that such results depend on the Python
  implementation and version used for the benchmark, and also
  on the details of how exactly the benchmark is written.

import random
import string
import timeit

print( 'The following loop may need a few seconds or minutes, '
'so please bear with me.' )

time_using_break = 0
time_using_next = 0

for repetition in range( 100 ):
    for i in range( 100 ): # Yes, this nesting is redundant!

        # Build a random list of 0 to 50 lowercase words of length 1 to 30.
        list_ = \
        [ ''.join \
          ( random.choices \
            ( string.ascii_lowercase, k=random.randint( 1, 30 )))
          for _ in range( random.randint( 0, 50 ))]

        start_time = timeit.default_timer()
        for word in list_:
            if word[ 0 ]== 'e':
                word_using_break = word
                break
        else:
            word_using_break = ''
        time_using_break += timeit.default_timer() - start_time

        start_time = timeit.default_timer()
        word_using_next = \
        next( ( word for word in list_ if word[ 0 ]== 'e' ), '' )
        time_using_next += timeit.default_timer() - start_time

        if word_using_next != word_using_break:
            raise Exception( 'word_using_next != word_using_break' )

print( f'{time_using_break = }' )
print( f'{time_using_next = }' )
print( f'{time_using_next / time_using_break = }' )
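
  Since the result above changed when the two approaches were swapped,
  a variant that is less sensitive to ordering might time each approach
  separately with timeit.repeat and keep the smallest value. What follows
  is only a sketch under my own assumptions: the function names, the
  fixed seed, and the values of number and repeat are illustrative, and
  wrapping both approaches in functions adds the same call overhead
  to each.

```python
import random
import string
import timeit

def using_break(words):
    for word in words:
        if word[0] == 'e':
            result = word
            break
    else:
        result = ''
    return result

def using_next(words):
    return next((word for word in words if word[0] == 'e'), '')

def best_time(func, words, number=1000, repeat=5):
    """Smallest total time over `repeat` runs of `number` calls each."""
    return min(timeit.repeat(lambda: func(words), number=number, repeat=repeat))

random.seed(0)  # fixed seed so that runs are reproducible
words = [''.join(random.choices(string.ascii_lowercase, k=random.randint(1, 30)))
         for _ in range(50)]

# Both approaches must agree before their timings are worth comparing.
assert using_break(words) == using_next(words)
print(f'{best_time(using_break, words) = }')
print(f'{best_time(using_next, words) = }')
```

  Taking the minimum of several runs follows the advice in the timeit
  documentation, because the larger values mostly reflect interference
  from other processes. The caveats about implementation and version
  apply here just as well.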