Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Cost of handling misaligned access
Date: Fri, 14 Feb 2025 15:14:11 -0600
Organization: A noiseless patient Spider
Lines: 89
Message-ID: <voobnc$3l2dl$1@dont-email.me>
References: <5lNnP.1313925$2xE6.991023@fx18.iad> <vnosj6$t5o0$1@dont-email.me>
<2025Feb3.075550@mips.complang.tuwien.ac.at> <volg1m$31ca1$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 14 Feb 2025 22:14:21 +0100 (CET)
Injection-Info: dont-email.me; posting-host="9f075752c769f3b49544a44a921ac4b4";
logging-data="3836341"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/dQ2BNAr3Zxr3odv70Em25ildbY5v2rrE="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:KMcleDqsR9VFZNZuQUcU5bbnSds=
In-Reply-To: <volg1m$31ca1$1@dont-email.me>
Content-Language: en-US
Bytes: 4666
On 2/13/2025 1:09 PM, Marcus wrote:
> On 2025-02-03, Anton Ertl wrote:
>> BGB <cr88192@gmail.com> writes:
>>> On 2/2/2025 10:45 AM, EricP wrote:
>>>> Digging deeper with performance counters reveals executing each
>>>> unaligned
>>>> load instruction results in ~505 executed instructions. P550 almost
>>>> certainly doesn’t have hardware support for unaligned accesses.
>>>> Rather, it’s likely raising a fault and letting an operating system
>>>> handler emulate it in software."
>>>>
>>>
>>> An emulation fault, or something similarly nasty...
>>>
>>>
>>> At that point, even turning any potentially unaligned load or store into
>>> a runtime call is likely to be a lot cheaper.
>>
>> There are lots of potentially unaligned loads and stores. There are
>> very few actually unaligned loads and stores: On Linux-Alpha every
>> unaligned access is logged by default, and the number of
>> unaligned-access entries in the logs of our machines was relatively
>> small (on average a few per day). So trapping actual unaligned
>> accesses was faster than replacing potential unaligned accesses with
>> code sequences that synthesize the unaligned access from aligned
>> accesses.
>
> If you compile regular C/C++ code that does not intentionally do any
> nasty stuff, you will typically have zero unaligned loads or stores.
>
> My machine still does not support unaligned accesses in hardware (it's
> on the todo list), and it can run an awful lot of software without
> problems.
>
> The problem arises when the programmer *deliberately* does unaligned
> loads and stores in order to improve performance. Or rather, if the
> programmer knows that the hardware supports unaligned loads and stores,
> he/she can use that to write faster code in some special cases.
>
Pretty much.

This is partly why I am in favor of potentially adding explicit keywords
for some of these cases. To reiterate:
  __aligned:
    Inform the compiler that a pointer is aligned.
    May use a faster version if appropriate, i.e. if a faster
    aligned-only variant of an instruction exists on an otherwise
    unaligned-safe target.
  __unaligned:
    Inform the compiler that an access is unaligned.
    May use a runtime call or similar if necessary on an
    aligned-only target.
    May do nothing on an unaligned-safe target.
  None:
    Do whatever is the default.
    Presumably, assume aligned by default, unless the target is
    known to be unaligned-safe.

And/or an attribute, which seems to be the new style:
  __attribute__((unaligned))  //GCC-ism
  [[unaligned]]               //probably, if the C standard people did it...
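
For comparison, something close to both hints can already be spelled with existing GCC/Clang extensions. This is only a rough sketch of the idea, not the proposed keywords: the names u32_unaligned, load_u32_unaligned and sum_u32_aligned16 are made up for illustration, and it assumes a compiler that accepts the packed and may_alias attributes and __builtin_assume_aligned.

```c
#include <stdint.h>

/* Roughly "__unaligned": a packed wrapper has alignment 1, so the compiler
   must assume the value can sit at any byte address. may_alias keeps the
   cast from an arbitrary buffer within GCC's aliasing rules. */
struct u32_unaligned {
    uint32_t v;
} __attribute__((packed, may_alias));

static inline uint32_t load_u32_unaligned(const void *p)
{
    return ((const struct u32_unaligned *)p)->v;
}

/* Roughly "__aligned": promise the compiler that p is 16-byte aligned.
   The behavior is undefined if the promise is false. */
static inline uint32_t sum_u32_aligned16(const uint32_t *p, int n)
{
    const uint32_t *q = __builtin_assume_aligned(p, 16);
    uint32_t s = 0;
    for (int i = 0; i < n; i++)
        s += q[i];
    return s;
}
```

On an unaligned-safe target the first function would be expected to become a plain load; on an aligned-only target the compiler has to synthesize it from smaller accesses.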

Most pointers would remain unqualified, but most of them will never do
anything unaligned, so this is fine.

For cases where it is needed, a keyword could make sense (probably
alongside volatile and the usual mess of per-target ifdefs that this
sort of code usually also needs).

Meanwhile, function wrappers using manual byte-shifts or memcpy are a
particularly poor solution (they depend too much on compiler magic).
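
For reference, the wrapper style in question looks roughly like this. The function names are hypothetical. Both are portable C, but whether either one collapses into a single load instruction depends entirely on the compiler and target, which is the "compiler magic" part.

```c
#include <stdint.h>
#include <string.h>

/* memcpy wrapper: reads the value in host byte order. The compiler is
   relied upon to turn the call into a plain load where that is legal. */
static inline uint32_t get_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Byte-shift wrapper: assembles a little-endian 32-bit value one byte at
   a time. The compiler has to recognize the pattern to merge the loads. */
static inline uint32_t get_u32le_shift(const void *p)
{
    const unsigned char *b = p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```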

It would be nice if there were a "commonly accepted" or "standard"
option, so that one could just use it and not need a mess of ifdefs (or
"just do it with raw bytes" and accept a potentially significant
performance penalty).
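
As an illustration of the ifdef mess, here is roughly what a portable unaligned read tends to end up looking like today. This is a sketch under assumptions, not a recommendation: rd_u32 and rd_u32_packed are made-up names, the MSVC branch assumes the __unaligned qualifier on x64/ARM64 builds, and the GCC/Clang branch assumes the packed and may_alias attributes.

```c
#include <stdint.h>
#include <string.h>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
/* MSVC: a pointer qualifier spelled __unaligned. */
static inline uint32_t rd_u32(const void *p)
{
    return *(const uint32_t __unaligned *)p;
}
#elif defined(__GNUC__)
/* GCC/Clang: no such qualifier, so wrap the value in a packed struct. */
struct rd_u32_packed {
    uint32_t v;
} __attribute__((packed, may_alias));

static inline uint32_t rd_u32(const void *p)
{
    return ((const struct rd_u32_packed *)p)->v;
}
#else
/* Anything else: raw bytes via memcpy, hoping the compiler optimizes it. */
static inline uint32_t rd_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
#endif
```

Every branch does the same thing; only the spelling differs per compiler, which is the part a standard keyword or attribute would remove.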
>
>> Of course, if the cost of unaligned accesses is that high, you will
>> avoid them in cases like block copies where cheap unaligned accesses
>> would otherwise be beneficial.
>>
>> - anton
>