Article <871pzrmcky.fsf@zedat.fu-berlin.de>

Deutsch English Français Italiano
<871pzrmcky.fsf@zedat.fu-berlin.de>

View for Bookmarking (what is this?)
Look up another Usenet article
Path: ...!weretis.net!feeder8.news.weretis.net!fu-berlin.de!uni-berlin.de!not-for-mail
From: "Loris Bennett" <loris.bennett@fu-berlin.de>
Newsgroups: comp.lang.python
Subject: Re: Printing UTF-8 mail to terminal
Date: Mon, 04 Nov 2024 13:02:21 +0100
Organization: FUB-IT, Freie =?utf-8?Q?Universit=C3=A4t?= Berlin
Lines: 131
Message-ID: <871pzrmcky.fsf@zedat.fu-berlin.de>
References: <875xp7nwus.fsf@zedat.fu-berlin.de>
	<ZyVMe3Jspc0fJrel@cskk.homeip.net>
	<mailman.69.1730497664.4695.python-list@python.org>
	<87ed3rmg7g.fsf@zedat.fu-berlin.de>
	<875xp3mfku.fsf@zedat.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de qcYYdOG7c7e7KSYkx8/6twZ3kLenB6GSyOLOMeUpcfwmxa
Cancel-Lock: sha1:f5YK6mGo+QOXycGyerakkhD5MEM= sha1:4Y0kJ4CwD39m5hxgmNvhqaVY63s= sha256:eLDRXfJWKFlBr2B7A/UCvcnDDZNaUnkdNwbZGepKcyE=
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Bytes: 4819

"Loris Bennett" <loris.bennett@fu-berlin.de> writes:

> "Loris Bennett" <loris.bennett@fu-berlin.de> writes:
>
>> Cameron Simpson <cs@cskk.id.au> writes:
>>
>>> On 01Nov2024 10:10, Loris Bennett <loris.bennett@fu-berlin.de> wrote:
>>>>as expected.  The non-UTF-8 text occurs when I do
>>>>
>>>>  mail = EmailMessage()
>>>>  mail.set_content(body, cte="quoted-printable")
>>>>  ...
>>>>
>>>>  if args.verbose:
>>>>      print(mail)
>>>>
>>>>which is presumably also correct.
>>>>
>>>>The question is: What conversion is necessary in order to print the
>>>>EmailMessage object to the terminal, such that the quoted-printable
>>>>parts are turned (back) into UTF-8?
>>>
>>> Do you still have access to `body` ? That would be the original
>>> message text? Otherwise maybe:
>>>
>>>     print(mail.get_content())
>>>
>>> The objective is to obtain the message body Unicode text (i.e. a
>>> regular Python string with the original text, unencoded). And to print
>>> that.
>>
>> With the following:
>>
>> ######################################################################
>>
>> import email.message
>>
>> m = email.message.EmailMessage()
>>
>> m['Subject'] = 'Übung'
>>
>> m.set_content('Dies ist eine Übung')
>> print('== cte: default == \n')
>> print(m)
>>
>> print('-- full mail ---')
>> print(m)
>> print('-- just content--')
>> print(m.get_content())
>>
>> m.set_content('Dies ist eine Übung', cte='quoted-printable')
>> print('== cte: quoted-printable ==\n')
>> print('-- full mail --')
>> print(m)
>> print('-- just content --')
>> print(m.get_content())
>>
>> ######################################################################
>>
>> I get the following output:
>>
>> ######################################################################
>>
>> == cte: default == 
>>
>> Subject: Übung
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: base64
>> MIME-Version: 1.0
>>
>> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>>
>> -- full mail ---
>> Subject: Übung
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: base64
>> MIME-Version: 1.0
>>
>> RGllcyBpc3QgZWluZSDDnGJ1bmcK
>>
>> -- just content--
>> Dies ist eine Übung
>>
>> == cte: quoted-printable ==
>>
>> -- full mail --
>> Subject: Übung
>> MIME-Version: 1.0
>> Content-Type: text/plain; charset="utf-8"
>> Content-Transfer-Encoding: quoted-printable
>>
>> Dies ist eine =C3=9Cbung
>>
>> -- just content --
>> Dies ist eine Übung
>>
>> ######################################################################
>>
>> So in both cases the subject is fine, but it is unclear to me how to
>> print the body.  Or rather, I know how to print the body OK, but I don't
>> know how to print the headers separately - there seems to be nothing
>> like 'get_headers()'.  I can use 'get('Subject) etc. and reconstruct the
>> headers, but that seems a little clunky.  
>
> Sorry, I am confusing the terminology here.  The 'body' seems to be the
> headers plus the 'content'.  So I can print the *content* without the
> headers OK, but I can't easily print all the headers separately.  If
> just print the body, i.e. headers plus content, the umlauts in the
> content are not resolved.

OK, so I can do:

######################################################################
if args.verbose:
    for k in mail.keys():
        print(f"{k}: {mail.get(k)}")
    print('')
    print(mail.get_content())
######################################################################

prints what I want and is not wildly clunky, but I am a little surprised
that I can't get a string representation of the whole email in one go.

Cheers,

Loris


-- 
Dr. Loris Bennett (Herr/Mr)
FUB-IT, Freie Universität Berlin