Path: ...!weretis.net!feeder9.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.text.tex
Subject: Re: TeX's line breaking in the grub sesh
Date: 28 Aug 2024 18:34:59 GMT
Organization: Stefan Ram
Lines: 127
Expires: 1 Jul 2025 11:59:58 GMT
Message-ID: <wrap-20240828192402@ram.dialup.fu-berlin.de>
References: <Lines-20240805135439@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de E02UTrkLVdeHx7OeR8NgcgPTVZ9bYbiEi1O+g2iw2KNtl7
Cancel-Lock: sha1:i9XhWF9F7MSHctvZvWZxQDRtL9g= sha256:+PEPxs/+SF0sAuRVlFXVsQvGfNhc1ckt/GQQ2qhos9w=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
  Distribution through any means other than regular usenet
  channels is forbidden. It is forbidden to publish this
  article in the Web, to change URIs of this article into links,
  and to transfer the body without this notice, but quotations
  of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
  services to mirror the article in the web. But the article may
  be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Bytes: 7026

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
>For sure there's some janky bugs in there, but peep this:

  Turns out there was still a glitch in the code!

  The latest build now breaks a paragraph with (fingers crossed)
  global optimization, taking parshape into account.

  I've finally squashed that bug, but discretionary items haven't
  been baked in yet. That's next on my to-do list though.

  Ironically, lines of the following Python 3.12 source code have
  NOT been wrapped to the 72 characters recommended for Usenet posts!
main.py

from dataclasses import dataclass
from typing import Optional, List, Iterator
import bisect

source_text = list(
    ' Lorem ipsum dolor sit amet, consectetur adipiscing elit, '
    'sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad '
    'minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea '
    'commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit '
    'esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat '
    'non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. '
)

parshape =[ 80, 80, 80, 40 ]

@dataclass
class ActiveEntry:
    previous: Optional[ 'ActiveEntry' ]= None
    position: int = 0 # position in the text
    branch: int = 0 # for future version
    sum_quality: int = 0 # the sum of all "merits" up to this point
    line_number: int = 0 # line started with this (first line = 0)

parshape_length = len( parshape )

def print_up_to( this ):
    '''prints the wrapped paragraph up to the point "this".
    It starts at the end point "this" and then goes backward
    through the text via the linked chain of points. Finally,
    to print the text in the normal order, it goes
    forward again.'''
    buff = []
    qual = []
    sum_quality = this.sum_quality
    while this.previous:
        previous = this.previous
        line = source_text[ previous.position: this.position ]
        # print( f'{line = }' )
        buff.append( line )
        qual.append( this.sum_quality )
        this = previous
    start_position = 0
    # we went backwards, but actually want to print in the normal direction
    first = 1
    for i,( line, qual ) in enumerate( zip( reversed( buff ), reversed( qual ))):
        text = ''.join( line[ start_position: ])
        target_length = parshape[ i ]if i < parshape_length else parshape[ -1 ]# dupe!
        output = text
        print( output[ first: ])
        first = 0
        start_position = 1 # skip an initial space or something
    print()
    print( 'Total merits:', sum_quality )
    print()

active0 = ActiveEntry()
active_list =[ active0 ]
current_position = 1
source_length = len( source_text )
while current_position < len( source_text ):
    ch = source_text[ current_position ]
    if ch == ' ': # possible breakpoint
        new_active_list = [] # next active list
        best_sum_quality = None # best quality summed across this and previous lines, not yet determined
        best_act = active_list[ 0 ] # preliminary choice
        for active in active_list:
            active_position = active.position
            line_number = active.line_number
            target_length = parshape[ line_number ]if line_number < parshape_length else parshape[ -1 ]# dupe!
            distance = current_position - active_position
            adjustment = target_length - distance
            if adjustment < 0:
                # "When an active breakpoint a is encountered for which
                # the line from a to b has an adjustment ratio less
                # than -1 (that is, when the line can't be shrunk to
                # fit the desired length), breakpoint a is removed
                # from the active list."
                pass # do not transfer into the new active list
            else:
                new_active_list.append( active )
                this_line_quality = -adjustment**2
                have_reached_final_space = current_position == source_length - 1 # final ' ' on end of last line
                if have_reached_final_space: this_line_quality = 0 # arbitrary whitespace at end is accepted
                this_sum_quality = active.sum_quality + this_line_quality
                if \
                best_sum_quality is None or \
                this_sum_quality > best_sum_quality:
                    best_sum_quality = this_sum_quality
                    best_predecessor = active
        if best_sum_quality is not None:
            # make a new active point from current position, linking it to the best active point found
            new_active_list.append( ActiveEntry( previous=best_predecessor, position=current_position, branch=0, sum_quality=best_sum_quality, line_number=best_predecessor.line_number+1 ))
        active_list = new_active_list
    current_position += 1

active = active_list[ -1 ] # the final space
print_up_to( active )

output

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident,
sunt in culpa qui officia deserunt
mollit anim id est laborum.

Total merits: -80
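
A quick cross-check of the total, as a minimal sketch: the helper `total_merits` and the list `wrapped` below are hypothetical and not part of main.py. The sketch assumes that the line breaks listed in `wrapped` are the ones the program chose, and it mirrors the program's bookkeeping as I read it: each line counts one extra position for the space at its breakpoint, each line except the last costs the negated square of its distance from the target length, and lines beyond the end of parshape reuse its last entry.

```python
# Hypothetical helper (not part of main.py): recompute "Total merits"
# from lines that have already been wrapped.

def total_merits(lines, parshape):
    """Sum -(target - distance)**2 over every line except the last."""
    total = 0
    for i, line in enumerate(lines[:-1]):  # the last line is free
        # Lines past the end of parshape reuse its last entry.
        target = parshape[i] if i < len(parshape) else parshape[-1]
        # main.py measures breakpoint-to-breakpoint distance, which is
        # the visible text plus one space.
        distance = len(line) + 1
        total -= (target - distance) ** 2
    return total

# Assumed line breaks for the Lorem ipsum paragraph.
wrapped = [
    'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor',
    'incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis',
    'nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.',
    'Duis aute irure dolor in reprehenderit',
    'in voluptate velit esse cillum dolore',
    'eu fugiat nulla pariatur. Excepteur',
    'sint occaecat cupidatat non proident,',
    'sunt in culpa qui officia deserunt',
    'mollit anim id est laborum.',
]

print('Total merits:', total_merits(wrapped, [80, 80, 80, 40]))
```

With these breaks the per-line adjustments are 1, 5, 2, 1, 2, 4, 2 and 5, so the squares add up to 80, which agrees with the total the program reports.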