<v4d4h5$1rc9e$1@dont-email.me>

Path: ...!news.nobody.at!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.lang.c
Subject: Re: "undefined behavior"?
Date: Wed, 12 Jun 2024 23:38:45 +0200
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <v4d4h5$1rc9e$1@dont-email.me>
References: <666a095a$0$952$882e4bbb@reader.netnews.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 12 Jun 2024 23:38:45 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="4318c523bc196dccfc54ab79a6d888db";
	logging-data="1945902"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+VG7yBBouMkro4q3/pGtXetIObhEiL4MY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:T8Hgiw+QNESPiPO9ZOKCNOPvgYA=
Content-Language: en-GB
In-Reply-To: <666a095a$0$952$882e4bbb@reader.netnews.com>
Bytes: 4379

On 12/06/2024 22:47, DFS wrote:
> Wrote a C program to mimic the stats shown on:
> 
> https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
> 
> My code compiles and works fine - every stat matches - except for one 
> anomaly: when using a dataset of consecutive numbers 1 to N, all values 
>  > 40 are flagged as outliers.  Up to 40, no problem.  Random numbers 
> dataset of any size: no problem.
> 
> And values 41+ definitely don't meet the conditions for outliers (using 
> the IQR * 1.5 rule).
> 
> Very strange.
> 
> Edit: I just noticed I didn't initialize a char:
> before: char outliers[100];
> after : char outliers[100] = "";
> 
> And the problem went away.  Reset it to before and problem came back.
> 
> Makes no sense.  What could cause the program to go FUBAR at data point 
> 41+ only when the dataset is consecutive numbers?
> 
> Also, why doesn't gcc just do you a solid and initialize to "" for you?
> 

It is /really/ difficult to know exactly what your problem is without 
seeing your C code!  There may be other problems that you haven't seen yet.

Non-static local variables without initialisers have "indeterminate" 
values.  Trying to use these "indeterminate" values is undefined 
behaviour - you have absolutely no control over what might happen.  Any 
particular behaviour you see is down to luck from the rest of the code 
and what happened to be in memory at the time.
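
Since I haven't seen your code, the following is only a guess at the 
kind of thing that goes wrong: building up a string piece by piece in a 
buffer that was never given a valid starting string.  The function name 
"list_outliers" and its parameters are made up for illustration.  The 
sketch sets the buffer to an empty string before appending anything:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not taken from the original program: writes a
   space-separated list of the values outside [lo, hi] into buf. */
void list_outliers(const double *data, size_t n, double lo, double hi,
                   char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);

    buf[0] = '\0';      /* start from a valid, empty string */
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (data[i] < lo || data[i] > hi) {
            int w = snprintf(buf + used, bufsize - used, "%g ", data[i]);
            if (w < 0 || (size_t)w >= bufsize - used) {
                buf[used] = '\0';   /* no room left: drop the partial entry */
                break;
            }
            used += (size_t)w;
        }
    }
}
```

Without the "buf[0] = '\0';" line (or an initialiser such as = "" on 
the caller's array), anything that searches for the end of the existing 
string, such as strcat(), would be reading indeterminate bytes.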

There is no automatic initialisation of non-static local variables, 
because that would often be inefficient.  The best way to avoid errors 
like yours, IMHO, is not to declare such variables until you have data 
to put in them - thus you always have a sensible initialiser of real 
data.  Occasionally that is not practical, but it works in most cases.
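
As a small sketch of that habit, using the IQR * 1.5 rule you mention: 
each local below is declared only at the point where its value is 
known.  The function name "is_outlier" is hypothetical, and the 
quartiles q1 and q3 are assumed to have been computed elsewhere:

```c
#include <assert.h>

/* Hypothetical example: returns 1 if x lies outside the usual
   1.5 * IQR fences, 0 otherwise. */
int is_outlier(double x, double q1, double q3)
{
    assert(q3 >= q1);

    /* Every variable gets real data in its initialiser. */
    const double iqr   = q3 - q1;
    const double lower = q1 - 1.5 * iqr;
    const double upper = q3 + 1.5 * iqr;

    return x < lower || x > upper;
}
```

There is no point in the function at which a variable exists but does 
not yet hold a meaningful value.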

For a data array, zero initialisation is common.  Typically you do this 
with:

	int xs[100] = { 0 };

That puts the explicit 0 in the first element of xs, and then the rest 
of the array is cleared with zeros.
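
A quick way to convince yourself, as a minimal sketch (the function 
names are mine, purely for illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if all n elements of xs are zero, 0 otherwise. */
int all_zero(const int *xs, size_t n)
{
    assert(xs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (xs[i] != 0)
            return 0;
    }
    return 1;
}

/* Declares a local array with a partial initialiser and checks it. */
int zero_init_demo(void)
{
    int xs[100] = { 0 };    /* element 0 explicit, elements 1..99 implicit */
    return all_zero(xs, 100);
}
```

The same rule is what makes char outliers[100] = ""; work: the string 
literal supplies the first byte and the remaining bytes are zeroed.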

I recommend never using "char" as a type unless you really mean a 
character, limited to 7-bit ASCII.  So if your "outliers" array really 
is an array of such characters, "char" is fine.  If it is intended to be 
numbers and for some reason you specifically want 8-bit values, use 
"uint8_t" or "int8_t", and initialise with { 0 }.

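If the array is really a set of small numbers, for example one flag per 
data point, a sketch might look like this.  The function name 
"flag_outliers" and its parameters are hypothetical, not something from 
your program:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sets flags[i] to 1 where data[i] lies outside
   [lo, hi] and to 0 elsewhere.  Returns the number of flagged values. */
size_t flag_outliers(const double *data, size_t n, double lo, double hi,
                     uint8_t *flags)
{
    assert(data != NULL && flags != NULL);

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        flags[i] = (data[i] < lo || data[i] > hi) ? 1 : 0;
        count += flags[i];
    }
    return count;
}
```

The caller would declare something like uint8_t flags[100] = { 0 }; so 
that the array holds defined values even before the function is called.
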
A major lesson here is to learn how to use your tools.  C is not a 
forgiving language.  Make use of all the help your tools can give you - 
enable warnings here.  "gcc -Wall" enables a range of common warnings 
with few false positives in normal well-written code, including ones 
that check for attempts to read uninitialised data.  "-Wextra" enables a 
slew of extra warnings.  Some of these will annoy people and trigger on 
code they find reasonable, while most are good choices for a lot of code 
- but personal preference varies significantly.  And remember to enable 
optimisation, since it makes the static checking more powerful.
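
As a rough illustration (the file and function names are made up), this 
is the sort of code where the warnings earn their keep.  The version 
below is correct; the comment marks the one initialiser that matters:

```c
/* Build with, for example:  gcc -Wall -Wextra -O2 -c sum.c */
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds up the n elements of xs. */
int sum_ints(const int *xs, size_t n)
{
    assert(xs != NULL || n == 0);

    /* Without the "= 0" here, the "+=" below would read an
       indeterminate value.  gcc with -Wall would normally be expected
       to report that as a use of an uninitialised variable. */
    int total = 0;

    for (size_t i = 0; i < n; i++)
        total += xs[i];

    return total;
}
```

Exactly which cases get diagnosed depends on the gcc version and the 
optimisation level, so treat a clean build as helpful evidence rather 
than proof.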

If you /really/ want gcc to zero out such local data automatically, use 
"-ftrivial-auto-var-init=zero".  But it is much better to use warnings 
and write correct code - options like that one are an addition to 
well-checked code for paranoid software in security-critical contexts.