Date: Wed, 12 Jun 2024 17:53:35 -0400
MIME-Version: 1.0
User-Agent: Betterbird (Windows)
Subject: Re: "undefined behavior"?
Newsgroups: comp.lang.c
References: <666a095a$0$952$882e4bbb@reader.netnews.com>
<8t3k6j5ikf5mvimvksv2t91gbt11ljdfgb@4ax.com>
Content-Language: en-US
From: DFS <nospam@dfs.com>
In-Reply-To: <8t3k6j5ikf5mvimvksv2t91gbt11ljdfgb@4ax.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 347
Message-ID: <666a18de$0$958$882e4bbb@reader.netnews.com>
On 6/12/2024 5:30 PM, Barry Schwarz wrote:
> On Wed, 12 Jun 2024 16:47:23 -0400, DFS <nospam@dfs.com> wrote:
>
>> Wrote a C program to mimic the stats shown on:
>>
>> https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
>>
>> My code compiles and works fine - every stat matches - except for one
>> anomaly: when using a dataset of consecutive numbers 1 to N, all values > 40
>> are flagged as outliers. Up to 40, no problem. Random numbers
>> dataset of any size: no problem.
>>
>> And values 41+ definitely don't meet the conditions for outliers (using
>> the IQR * 1.5 rule).
>>
>> Very strange.
>>
>> Edit: I just noticed I didn't initialize a char:
>> before: char outliers[100];
>> after : char outliers[100] = "";
>>
>> And the problem went away. Reset it to before and problem came back.
>>
>> Makes no sense. What could cause the program to go FUBAR at data point
>> 41+ only when the dataset is consecutive numbers?
>>
>> Also, why doesn't gcc just do you a solid and initialize to "" for you?
>
> Makes perfect sense. The first rule of undefined behavior is
> "Whatever happens is exactly correct." You are not entitled to any
> expectations and none of the behavior (or perhaps all of the behavior)
> can be called unexpected.

I HATE bogus answers like this.
Aren't you embarrassed to say things like that?

> Since we cannot see your code, I will guess that you use a non-zero
> value in outliers[i] to indicate that the corresponding value has been
> identified as an outlier.

No.

I compare the data point to the lower and upper bounds of a stat rule
commonly called the "IQR Rule":

lo = Q1 - (1.5 * IQR)
hi = Q3 + (1.5 * IQR)

If it falls outside the range lo to hi, I strcat the value onto a char array.

The outlier routine starts at line 170.
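
The IQR rule described above can be sketched as a small predicate. This is only an illustration, not the routine at line 170 of the posted program; the names Fences, iqr_fences, is_outlier and count_outliers are hypothetical, and the quartiles are taken as inputs because the exact quartile values depend on the method used to compute them.

```c
#include <stdbool.h>

/* Lower and upper fences of the 1.5 * IQR rule. */
typedef struct {
    double lo;
    double hi;
} Fences;

/* Compute the fences from the first and third quartiles. */
static Fences iqr_fences(double q1, double q3)
{
    double iqr = q3 - q1;
    Fences f = { q1 - 1.5 * iqr, q3 + 1.5 * iqr };
    return f;
}

/* A value is an outlier only if it lies strictly outside [lo, hi]. */
static bool is_outlier(double x, Fences f)
{
    return x < f.lo || x > f.hi;
}

/* Count the outliers in nums[0..n-1] (hypothetical helper). */
static int count_outliers(const int *nums, int n, Fences f)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (is_outlier(nums[i], f)) {
            count++;
        }
    }
    return count;
}
```

With assumed quartiles Q1 = 25.5 and Q3 = 75.5 for the values 1 to 100, iqr_fences gives lo = -49.5 and hi = 150.5, so the numeric test alone would flag none of the values 41 and up. That points at the string handling rather than the comparison.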
If you change
char outliers[200]="", temp[10]="";
to
char outliers[200], temp[10];
you might see what happens when you run the program for consecutive values:
$ ./prog 100 -c
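
Why the initializer matters: strcat first scans the destination for its terminating '\0' and starts writing there. With `char outliers[200];` the array contents are indeterminate, so that scan has no defined result, and the output can change with whatever happened to be in that memory. A minimal sketch of a bounded append that relies on the buffer being initialized follows; append_int is a hypothetical helper, not part of the posted program.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append the decimal form of value, followed by a space, to buf.
 * buf must already hold a NUL-terminated string. Returns 0 on success,
 * -1 if the text would not fit (buf is left unchanged in that case).
 * append_int is a hypothetical helper used only for illustration.
 */
static int append_int(char *buf, size_t bufsize, int value)
{
    size_t used = strlen(buf);   /* valid only because buf is terminated */
    char temp[16];
    int len = snprintf(temp, sizeof temp, "%d ", value);

    if (len < 0 || (size_t)len >= sizeof temp) {
        return -1;               /* formatting failed or was truncated */
    }
    if (used + (size_t)len + 1 > bufsize) {
        return -1;               /* no room for the text plus terminator */
    }
    memcpy(buf + used, temp, (size_t)len + 1);   /* copies the '\0' too */
    return 0;
}
```

Called as `char outliers[200] = ""; append_int(outliers, sizeof outliers, 1000);` the buffer starts as an empty string, so the first append lands at offset 0. Setting `outliers[0] = '\0';` before the first append would serve the same purpose.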
=========================================================================
//this code is hereby released to the public domain
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include <time.h>
/*
this program computes the descriptive statistics of a randomly
generated set of N integers
1.0 release Dec 2020
2.0 release Jun 2024
used the population skewness and Kurtosis formulas from:
https://www.calculatorsoup.com/calculators/statistics/descriptivestatistics.php
also test the results of this code against that site
compile: gcc -Wall prog.c -o prog -lm
usage : ./prog N -option (where N is 2 or higher, and option is -r or
-c or -o)
-r generates N random numbers
-c generates consecutive numbers 1 to N
-o generates random numbers with outliers
*/
//random ints
int randNbr(int low, int high) {
    return (low + rand() / (RAND_MAX / (high - low + 1) + 1));
}

//comparator function used with qsort
int compareint (const void * a, const void * b)
{
    if (*(int*)a > *(int*)b) return 1;
    else if (*(int*)a < *(int*)b) return -1;
    else return 0;
}

int main(int argc, char *argv[])
{
    if(argc < 3) {
        printf("Missing argument:\n");
        printf(" * enter a number greater than 2\n");
        printf(" * enter an option -r -c or -o\n");
        exit(0);
    }

    //vars
    int i=0, lastmode=0;
    int N = atoi(argv[1]);
    int nums[N];
    //int *nums = malloc(N * sizeof(int));
    double sumN=0.0, median=0.0, Q1=0.0, Q2=0.0, Q3=0.0, IQR=0.0;
    double stddev = 0.0, kurtosis = 0.0;
    double sqrdiffmean = 0.0, cubediffmean = 0.0, quaddiffmean = 0.0;
    double meanabsdev = 0.0, rootmeansqr = 0.0;
    char mode[100], tmp[12];

    //generate random dataset
    if(strcmp(argv[2],"-r") == 0) {
        srand(time(NULL));
        for(i=0;i<N;i++) { nums[i] = randNbr(1,N*3); }
        printf("%d Randoms:\n", N);
        printf("No commas : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
        printf("\nWith commas: "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
        qsort(nums,N,sizeof(int),compareint);
        printf("\nSorted : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
        printf("\nSorted : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
    }

    //generate random dataset with outliers
    if(strcmp(argv[2],"-o") == 0) {
        srand(time(NULL));
        nums[0] = 1; nums[1] = 3;
        for(i=2;i<N-2;i++) { nums[i] = randNbr(100,N*30); }
        nums[N-2] = 1000; nums[N-1] = 2000;
        printf("%d Randoms with outliers:\n", N);
        printf("No commas : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
        printf("\nWith commas: "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
        qsort(nums,N,sizeof(int),compareint);
        printf("\nSorted : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
        printf("\nSorted : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
    }

    //generate consecutive numbers 1 to N
    if(strcmp(argv[2],"-c") == 0) {
        for(i=0;i<N;i++) { nums[i] = i + 1; }
        printf("%d Consecutive:\n", N);
        printf("No commas : "); for(i=0;i<N;i++) { printf("%d ", nums[i]); }
        printf("\nWith commas : "); for(i=0;i<N;i++) { printf("%d,", nums[i]); }
    }

    //various
    for(i=0;i<N;i++) {sumN += nums[i];}
    double min = nums[0], max = nums[N-1];

    //calc descriptive stats
    double mean = sumN / (double)N;
========== REMAINDER OF ARTICLE TRUNCATED ==========