Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Cost of handling misaligned access
Date: Fri, 14 Feb 2025 15:14:11 -0600
Organization: A noiseless patient Spider
Lines: 89
Message-ID: <voobnc$3l2dl$1@dont-email.me>
References: <5lNnP.1313925$2xE6.991023@fx18.iad> <vnosj6$t5o0$1@dont-email.me>
 <2025Feb3.075550@mips.complang.tuwien.ac.at> <volg1m$31ca1$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 14 Feb 2025 22:14:21 +0100 (CET)
Injection-Info: dont-email.me; posting-host="9f075752c769f3b49544a44a921ac4b4";
	logging-data="3836341"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1/dQ2BNAr3Zxr3odv70Em25ildbY5v2rrE="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:KMcleDqsR9VFZNZuQUcU5bbnSds=
In-Reply-To: <volg1m$31ca1$1@dont-email.me>
Content-Language: en-US
Bytes: 4666

On 2/13/2025 1:09 PM, Marcus wrote:
> On 2025-02-03, Anton Ertl wrote:
>> BGB <cr88192@gmail.com> writes:
>>> On 2/2/2025 10:45 AM, EricP wrote:
>>>> Digging deeper with performance counters reveals executing each 
>>>> unaligned
>>>> load instruction results in ~505 executed instructions. P550 almost
>>>> certainly doesn’t have hardware support for unaligned accesses.
>>>> Rather, it’s likely raising a fault and letting an operating system
>>>> handler emulate it in software."
>>>>
>>>
>>> An emulation fault, or something similarly nasty...
>>>
>>>
>>> At that point, even turning any potentially unaligned load or store into
>>> a runtime call is likely to be a lot cheaper.
>>
>> There are lots of potentially unaligned loads and stores.  There are
>> very few actually unaligned loads and stores: On Linux-Alpha every
>> unaligned access is logged by default, and the number of
>> unaligned-access entries in the logs of our machines was relatively
>> small (on average a few per day).  So trapping actual unaligned
>> accesses was faster than replacing potential unaligned accesses with
>> code sequences that synthesize the unaligned access from aligned
>> accesses.
> 
> If you compile regular C/C++ code that does not intentionally do any
> nasty stuff, you will typically have zero unaligned loads or stores.
> 
> My machine still does not support unaligned accesses in hardware (it's
> on the todo list), and it can run an awful lot of software without
> problems.
> 
> The problem arises when the programmer *deliberately* does unaligned
> loads and stores in order to improve performance. Or rather, if the
> programmer knows that the hardware supports unaligned loads and stores,
> he/she can use that to write faster code in some special cases.
> 

Pretty much.


This is partly why I am in favor of potentially adding explicit keywords
for some of these cases. To reiterate:
   __aligned:
     Inform the compiler that a pointer is aligned.
     May use a faster version if appropriate, i.e. if a faster
       aligned-only variant of an instruction exists
       on an otherwise unaligned-safe target.
   __unaligned:
     Inform the compiler that an access is potentially unaligned.
     May use a runtime call or similar if necessary,
       on an aligned-only target.
     May do nothing on an unaligned-safe target.
   None:
     Do whatever is the default.
     Presumably, assume aligned by default,
       unless the target is known to be unaligned-safe.

And/or, an attribute, which seems to be the new style.
   __attribute__((unaligned))  //GCC-ism
   [[unaligned]]  //probably if the C standard people did it...
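
For comparison, here is a rough sketch of how the two hints can be approximated today with existing GCC/Clang extensions (a packed struct for the unaligned case, __builtin_assume_aligned for the aligned case). The helper and macro names (load_u32_unaligned, ASSUME_ALIGNED16, sum_u32_aligned16) are made up for illustration, and the 16-byte alignment is an arbitrary assumption; this is not the proposed keyword, just the closest thing I would expect to work on those compilers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper names; only the GCC/Clang extensions are real. */
#if defined(__GNUC__)
/* "unaligned": a packed struct makes the compiler assume alignment 1;
   may_alias keeps the pointer cast clear of strict-aliasing trouble. */
struct u32_una { uint32_t v; } __attribute__((packed, may_alias));

static inline uint32_t load_u32_unaligned(const void *p)
{
    return ((const struct u32_una *)p)->v;
}

/* "aligned": promise 16-byte alignment to the optimizer. */
#define ASSUME_ALIGNED16(p) __builtin_assume_aligned((p), 16)
#else
/* Fallback for other compilers: memcpy, and no alignment hint at all. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
#define ASSUME_ALIGNED16(p) (p)
#endif

static inline uint32_t sum_u32_aligned16(const uint32_t *p, size_t n)
{
    /* Debug check: a false alignment promise is undefined behavior. */
    assert(((uintptr_t)p & 15u) == 0);
    const uint32_t *q = ASSUME_ALIGNED16(p);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += q[i];
    return sum;
}
```

Note that this already needs an ifdef and a fallback path, which is more or less the complaint.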


Most pointers will remain unqualified, but most of them never do 
anything unaligned, so this is fine.

For cases where it is needed, a keyword could make sense (probably 
alongside volatile and the mess of per-target ifdefs that usually 
also needs to exist for this sort of code).

Meanwhile, function wrappers with manual byte-shifts or memcpy are a 
particularly poor solution (they depend too much on compiler magic).
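
To be concrete, the wrappers in question usually look something like the sketch below. The function names are hypothetical. Both are correct regardless of alignment; whether either one turns into a single load or store is entirely up to the compiler and the target, which is the "magic" part.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* memcpy wrapper: native byte order; relies on the compiler
   recognizing the fixed-size copy and folding it into one access. */
static inline uint32_t get_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

static inline void put_u32_memcpy(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}

/* Byte-shift wrapper: explicitly little-endian; relies on the
   compiler pattern-matching four byte loads back into one word load. */
static inline uint32_t get_u32le_bytes(const void *p)
{
    const unsigned char *b = p;
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Small usage example at deliberately odd offsets. */
void check_wrappers(void)
{
    unsigned char buf[8] = {0, 0x78, 0x56, 0x34, 0x12, 0, 0, 0};
    assert(get_u32le_bytes(buf + 1) == 0x12345678u);
    put_u32_memcpy(buf + 3, 0xAABBCCDDu);
    assert(get_u32_memcpy(buf + 3) == 0xAABBCCDDu);
}
```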


It would be nice if there were a "commonly accepted" or "standard" option, 
so that one could just use it instead of maintaining a mess of ifdefs (or 
"just doing it with raw bytes" and accepting a potentially significant 
performance penalty).



> 
>> Of course, if the cost of unaligned accesses is that high, you will
>> avoid them in cases like block copies where cheap unaligned accesses
>> would otherwise be beneficial.
>>
>> - anton
>