Path: ...!news.mixmin.net!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: "Loris Bennett" <loris.bennett@fu-berlin.de>
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 11:57:37 +0100
Organization: FUB-IT, Freie =?utf-8?Q?Universit=C3=A4t?= Berlin
Lines: 110
Message-ID: <875xp3mfku.fsf@zedat.fu-berlin.de>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
	<ZyVMe3Jspc0fJrel@cskk.homeip.net>
	<mailman.69.1730497664.4695.python-list@python.org>
	<87ed3rmg7g.fsf@zedat.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de K3NFHovAT4eiybs6ZWzXlwQ2SAvDFsVuO+/zgfdq87F6Pw
Cancel-Lock: sha1:dCJHonEM2ycvtkvQ3gYP8nEzZa0= sha1:aWb9lRBT+flGz+gjOvbOP1bYa9M= sha256:tuwMZ8fJUZg5Fd3ngrcj0mDobU4VFHGrzypxDdxhooY=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Bytes: 4125

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> Cameron Simpson <cs@cskk.id.au> writes:
>
>> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>as expected.  The non-UTF-8 text occurs when I do
>>>
>>>  mail = EmailMessage()
>>>  mail.set_content(body, cte="quoted-printable")
>>>  ...
>>>
>>>  if args.verbose:
>>>      print(mail)
>>>
>>>which is presumably also correct.
>>>
>>>The question is: What conversion is necessary in order to print the
>>>EmailMessage object to the terminal, such that the quoted-printable
>>>parts are turned (back) into UTF-8?
>>
>> Do you still have access to `body` ? That would be the original
>> message text? Otherwise maybe:
>>
>>     print(mail.get_content())
>>
>> The objective is to obtain the message body Unicode text (i.e. a
>> regular Python string with the original text, unencoded). And to print
>> that.
>
> With the following:
>
> ######################################################################
>
> import email.message
>
> m = email.message.EmailMessage()
>
> m['Subject'] = 'Übung'
>
> m.set_content('Dies ist eine Übung')
> print('== cte: default == \n')
> print(m)
>
> print('-- full mail ---')
> print(m)
> print('-- just content--')
> print(m.get_content())
>
> m.set_content('Dies ist eine Übung', cte='quoted-printable')
> print('== cte: quoted-printable ==\n')
> print('-- full mail --')
> print(m)
> print('-- just content --')
> print(m.get_content())
>
> ######################################################################
>
> I get the following output:
>
> ######################################################################
>
> == cte: default == 
>
> Subject: Übung
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: base64
> MIME-Version: 1.0
>
> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>
> -- full mail ---
> Subject: Übung
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: base64
> MIME-Version: 1.0
>
> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>
> -- just content--
> Dies ist eine Übung
>
> == cte: quoted-printable ==
>
> -- full mail --
> Subject: Übung
> MIME-Version: 1.0
> Content-Type: text/plain; charset="utf-8"
> Content-Transfer-Encoding: quoted-printable
>
> Dies ist eine =C3=9Cbung
>
> -- just content --
> Dies ist eine Übung
>
> ######################################################################
>
> So in both cases the subject is fine, but it is unclear to me how to
> print the body.  Or rather, I know how to print the body OK, but I don't
> know how to print the headers separately - there seems to be nothing
> like 'get_headers()'.  I can use get('Subject') etc. and reconstruct the
> headers, but that seems a little clunky.  

Sorry, I am confusing the terminology here.  The 'body' seems to be the
headers plus the 'content'.  So I can print the *content* without the
headers OK, but I can't easily print all the headers separately.  If I
just print the body, i.e. headers plus content, the umlauts in the
content are not resolved.
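
One way around reconstructing the headers by hand might be to iterate
over 'm.items()', which yields a (name, value) pair for every header,
and to print the decoded text from 'get_content()' separately.  A
minimal sketch, reusing the test message from above; the name
'header_text' is only for illustration:

```python
import email.message

m = email.message.EmailMessage()
m['Subject'] = 'Übung'
m.set_content('Dies ist eine Übung', cte='quoted-printable')

# items() returns every header as a (name, value) pair; the values
# behave like ordinary strings, so the umlaut in the subject survives.
header_text = '\n'.join(f'{name}: {value}' for name, value in m.items())

# get_content() undoes the transfer encoding and returns a normal str.
content = m.get_content()

print(header_text)
print()
print(content)
```

This does not reproduce the exact wire format (long headers are not
folded, for instance), so it is only meant for display on the
terminal, not for sending.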

-- 
This signature is currently under constuction.