Article <6729ED5E.30500@grunge.pl>

Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!news.nk.ca!rocksolid2!i2pn2.org!.POSTED!not-for-mail
From: fir <fir@grunge.pl>
Newsgroups: comp.lang.c
Subject: Re: is double slower?
Date: Tue, 05 Nov 2024 11:03:10 +0100
Organization: i2pn2 (i2pn.org)
Message-ID: <6729ED5E.30500@grunge.pl>
References: <4d5973952030c993c48f93329fc25be7f236e2c5@i2pn2.org> <vgck4n$1e7dd$1@dont-email.me> <6729EA1E.60703@grunge.pl>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: i2pn2.org;
	logging-data="1040335"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
To: David Brown <david.brown@hesbynett.no>
X-Spam-Checker-Version: SpamAssassin 4.0.0
In-Reply-To: <6729EA1E.60703@grunge.pl>
Bytes: 6662
Lines: 169

fir wrote:
> David Brown wrote:
>> On 04/11/2024 08:53, fir wrote:
>>> float takes less space, and when you keep arrays of floats, float is
>>> surely better (less space and less memory bandwidth, so i guess floats
>>> can be up to twice as fast in some respects)
>>>
>>
>> Certainly if you have a lot of them, then the memory bandwidth and cache
>> hit rate can make floats faster than doubles.
>>
>>> but when you do calculations on local variables, not arrays of floats,
>>> is double slower?
>>
>> I assume that for the calculations in question, the accuracy and range
>> of float is enough - otherwise the answer is obviously use doubles.
>>
>>
>> This is going to depend on the cpu, the type of instructions, the source
>> code in question, the compiler and the options.  So there is no single
>> easy answer.
>>
>> You can, as Bonita suggested, look up instruction timing information at
>> agner.org for the cpu you are using (assuming it's an x86 device) to get
>> some idea of any fundamental differences in timings.  Usually for modern
>> "big" processors, basic operations such as addition and multiplication
>> are single cycle or faster (i.e., multiple instructions can be done in
>> parallel) for float and double.  But division, square root, and other
>> more complex operations can take a lot longer with doubles.
>>
>> Next, consider if you can be using vector or SIMD operations.  On some
>> devices, you can do that with floats but not doubles - and even if you
>> can use doubles, you can usually run floats at twice the rate.
>>
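A rough sketch of the "twice the rate" point, assuming 256-bit vector registers (AVX-style) and the usual 4-byte float / 8-byte double. VECTOR_BYTES, LANES, scale_f and scale_d are made-up names for illustration, and whether these loops really get vectorised depends on the compiler and flags.

```c
#include <stddef.h>

/* Assumption: 256-bit (32-byte) vector registers, as with AVX. */
enum { VECTOR_BYTES = 32 };

/* Number of elements of a given type that fit in one vector register. */
#define LANES(type) (VECTOR_BYTES / sizeof(type))

/* The same loop written for float and for double. A vectorising
   compiler can handle LANES(float) elements per instruction in the
   first loop but only LANES(double) in the second. */
void scale_f(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}

void scale_d(double *dst, const double *src, double k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * k;
}
```

Under those assumptions one register holds 8 floats or 4 doubles, which is where the factor of two comes from.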
>>
>> In the source code, remember it is very easy to accidentally promote to
>> double when writing in C.  If you want to stick to floats, make sure you
>> don't use double-precision constants - a missing "f" suffix can change a
>> whole expression into double calculations.  Remember that it takes time
>> to convert between float and double.
>>
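A minimal sketch of the missing "f" suffix problem. The function names half_promoted and half_float are made up for illustration; the size checks assume a C11 compiler where float and double have different sizes.

```c
/* 0.5 is a double constant, so x is converted to double, the multiply
   is done in double, and the result is converted back to float. */
float half_promoted(float x)
{
    return x * 0.5;
}

/* 0.5f is a float constant, so the whole expression stays float. */
float half_float(float x)
{
    return x * 0.5f;
}

/* The type of each expression shows up in its size. */
_Static_assert(sizeof(1.0f * 0.5)  == sizeof(double), "promoted to double");
_Static_assert(sizeof(1.0f * 0.5f) == sizeof(float),  "stays float");
```

Both functions return the same value for ordinary inputs; the difference is only in the conversions the compiler has to emit, which is the kind of thing "-Wdouble-promotion" is meant to flag.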
>>
>> Then look at your compiler flags - these can make a big difference to
>> the speed of floating point code.  I'm giving gcc flags, because those
>> are the ones I know - if you are using another compiler, look at the
>> details of its flags.
>>
>> Obviously you want optimisation enabled if speed is relevant - -O2 is a
>> good start.  Make sure you are optimising for the cpu(s) you are using -
>> "-march=native" is good for local programs, but you will want something
>> more specific if the binary needs to run on a variety of machines.  The
>> closer you are to the exact cpu model, the better the code scheduling
>> and instruction choice can be.
>>
>> Look closely at "-ffast-math" in the gcc manual.  If that is suitable
>> for your code (and it often is), it can make a huge difference to
>> floating point intensive code.  If it is unsuitable because you have
>> infinities, or need deterministic control of things like associativity,
>> it will make your results wrong.
>>
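A small sketch of why associativity matters here: floating point addition is not associative, so the two functions below can give different results for the same inputs. The names sum_left and sum_right are made up for illustration. With options that allow reassociation (as "-ffast-math" does) the compiler may treat the two as interchangeable.

```c
/* The same three values summed in two different orders. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;
}

float sum_right(float a, float b, float c)
{
    return a + (b + c);
}
```

With a = 1e20f, b = -1e20f, c = 1.0f and strict IEEE evaluation, sum_left gives 1.0f (the big values cancel first) while sum_right gives 0.0f (the 1.0f is lost when added to -1e20f).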
>> "-Wdouble-promotion" can be helpful to spot accidental use of doubles in
>> what you think is a float expression.  "-Wfloat-equal" is a good idea,
>> especially if you are mixing floats and doubles.  "-Wfloat-conversion"
>> will warn about implicit conversions from doubles to floats (or to
>> integers).
>>
>>
>>
> the code that seems to have sped up a bit when changing float to double is
>
>   union Color
>   {
>     unsigned u;
>     struct { unsigned char b,g,r,a;};
>    };
>
>
> inline float distance2d_(float x1, float y1, float x2, float y2)
>   {
>    return sqrt((x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
>   }
>
> inline unsigned GetPixelUnsafe_(int x, int y)
>   {
>     return frame_bitmap[y*frame_size_x+x];
>   }
> inline void SetPixelUnsafe_(int x, int y, unsigned color)
> {
>    frame_bitmap[y*frame_size_x+x]=color;
> }
>
> void DrawPoint(int i)
> {
>     // if(!point[i].enabled) return;
>
>      int xq = point[i].x;
>      int yq = point[i].y;
>
>      Color c;
>      Color bc;
>
>      if(d_toggler)
>      {
> //     DrawCircle(xq,yq,point[i].radius,0xffffff);
>       FillCircle(xq,yq,point[i].radius,point[i].c.u);
>
>       return;
>      }
>
>     float R = point[i].radius*5;
>
>     int y_start = max(0, yq-R);
>     int y_end = min(frame_size_y, yq+R);
>     int x_start = max(0, xq-R);
>     int x_end = min(frame_size_x, xq+R);
>
>    for(int y = y_start; y<y_end; y++)
>    {
>    for(int x = x_start; x<x_end; x++)
>     {
>       // here below was float ->
>      double p = (R - distance2d_(x,y,point[i].x,point[i].y));
>
>
>      if(!i_toggler)
>      {
>       if(p<0.4*R) continue;
>      }
>      else
>        if(p<0) continue;
>
>        p/=R;
>
>        bc.u = GetPixelUnsafe_(x,y);
>        int r = bc.r + (point[i].c.r)* p*p*p;
>        int g = bc.g + (point[i].c.g)* p*p*p;
>        int b = bc.b + (point[i].c.b)* p*p*p;
>
>        if(!r_toggler)
>        {
>        if(r>255) r = 255;
>        if(g>255) g = 255;
>        if(b>255) b = 255;
>        }
>
>        c.r = r;
>        c.g = g;
>        c.b = b;
>
>        SetPixelUnsafe_(x,y,c.u);
>
>     }
>     }
>
> }
>
> this just draws something like a little light that darkens as 1/(r*r*r)
> and is able to add n lights in place to mix colors and eventually
> "overlight" (so this is kind of blending)
>
> its very time consuming, like drawing 100 of them (when r is 9) was
> taking 35 ms on an old machine afair)

some can test it BTW

 
https://drive.google.com/file/d/1-Obb6F19h5yfCbCETP4-VFoV3XYGpRsN/view?usp=sharing

its for windows but works under wine afair, and on a linux virtual machine
on windows also (afair, i dont know as i got only windows)