From: Don Y
Newsgroups: comp.arch.embedded
Subject: Re: Unit Testing: from theory to real case
Date: Fri, 30 Aug 2024 12:33:05 -0700

On 8/30/2024 9:00 AM, pozz wrote:
> On 30/08/2024 14:21, Don Y wrote:
>> On 8/30/2024 1:18 AM, pozz wrote:
>>> When you write, test for this, test for that, what happens if the
>>> client uses the module in a wrong way, what happens when the system
>>> clock changes a little or a lot, and when the task missed the exact
>>> timestamp of an event?
>>>
>>> I was trying to write tests for *all* of those situations, but it
>>> seemed to me a very, VERY, *VERY* big job. The implementation of the
>>> calendar module took me a couple of days; the tests seem an infinite
>>> job.
>>
>> Because there are lots of ways your code can fail.  You have to prove
>> that it doesn't fail in ANY of those ways.
>
> So you're confirming it's a very tedious and long job.

It is "tedious" because you consider it "overhead" instead of an
integral part of the job.

For a painter, cleaning his brushes is "tedious".  But, failing to
clean them means they are ruined by their use!  Putting fuel into a
vehicle is "tedious" -- but failing to do so means the vehicle stops
moving.  Cashing a paycheck is tedious; "why can't they pay me in
cash???"

As to length/duration, it *feels* long because it is "boring" and not
perceived as a productive aspect of the job.

I spend 40% of my time creating specifications, 20% writing the code
to meet those specifications and the remaining 40% TESTING my code to
prove that it meets those specifications.

If I don't know what I am supposed to write (for lack of a
specification), then how do I *know* what to write?  Do I just fumble
around and hope something gels from my efforts?  And, having written
"it", how do I know that it complies with ALL of the specifications?

I suspect you're just starting to write code on day one without any
plan/map as to where you'll be going...

>> Chances are, there is one place in your code that is aware of the
>> fact that the event is scheduled for a PAST time.  So, you only need
>> to create one test.  (actually, two -- one that proves one behavior
>> for time *almost* NOT past and another for time JUST past)
>
> I read that tests shouldn't be written for the specific
> implementation, but should be generic enough to work well even if the
> implementation changes.

The tests that you write before you write any code should cover the
operation of the module without regard to its actual implementation
(BECAUSE YOU HAVEN'T WRITTEN ANY CODE YET!)

*As* you settle on a particular implementation, your "under the hood"
examination of the code will reveal issues that could present bugs.
So, you should be ADDING test cases to deliberately tickle those
issues.  These are just "select, special cases" -- a sketch of such a
boundary pair follows below.
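Take the "time past" example: the two tests hug the single comparison
inside the implementation.  A minimal sketch -- the function name and
the "past at exactly the timestamp" convention are invented here, NOT
your actual calendar API:

    #include <assert.h>
    #include <stdbool.h>

    /* Hypothetical stand-in for the calendar module's one decision:
       an event whose timestamp is <= 'now' is considered past. */
    static bool event_is_past(long event_time, long now)
    {
        return event_time <= now;
    }

    int main(void)
    {
        assert(!event_is_past(1000, 999));   /* time *almost* NOT past */
        assert( event_is_past(1000, 1000));  /* time JUST past */
        return 0;
    }

Whichever way the comparison is (mis)written -- '<' vs '<=' -- one of
those two cases trips it.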
To an outside observer, such cases don't appear "special" as they ALSO
meet the specification of the module, in general.  But, to the
implementor with knowledge of the internals, they highlight special
conditions where the code (if incorrect) could fail.  They provide
reassurance that those "special conditions" *in* the implementation
have been handled correctly.

Symbolic execution algorithmically identifies these "edges" in the
code and tests them (but you likely can't afford that luxury in your
build environment).

>> Your goal (having already implemented the modules) is to exercise
>> each path through the code.
>>
>> whatever() {
>>     ...
>>     if (x > y) {
>>        // do something
>>     } else {
>>        // do something else
>>     }
>>     ...
>> }
>>
>> Here, there are only two different paths through the code:
>> - one for x > y
>> - one for !(x > y)
>> So, you need to create test cases that will exercise each path.
>
> Now I really know there are only two paths in the current
> implementation, but I'm not sure this will stay the same in the
> future.

Then you *add* MORE test cases to tickle the special cases in THAT
implementation.  Of course, the old test cases should *still* pass, so
there is no reason to remove those!  (this is the essence of
regression testing -- to ensure you haven't "slid backwards" with some
new change)

>> Note that test cases that are applied to version 1 of the code
>> should yield the same results in version 305, even if the
>> implementation changes dramatically.  Because the FUNCTIONALITY
>> shouldn't be changing.
>
> Ok, but if you create tests knowing how you will implement
> functionalities (execution paths), it's possible they will not be
> sufficient when the implementation changes at version 305.

Yes.  You've written NEW code so you have to add tests to cover any
potential vulnerabilities in your new implementation!

If generating test cases were "inexpensive", you would test every
possible combination of inputs to each function and verify the
outputs.  But, that would be a ridiculously large number of tests!
So, you pick *smarter* tests that tickle the aspects of the
implementation that are likely to be incorrectly designed/implemented.
This reduces the number of tests and still leaves you confident in the
implementation because you expect the code to be "well behaved"
*between* the special cases that you've chosen.

> Before implementing the function I can imagine the following test
> cases:
>
>   assert(square(0) == 0)
>   assert(square(1) == 1)
>   assert(square(2) == 4)
>   assert(square(15) == 225)
>
> Now the developer writes the function this way:
>
>   unsigned char square(unsigned char num) {
>     if (num == 0) return 0;
>     if (num == 1) return 1;
>     if (num == 2) return 4;
>     if (num == 3) return 9;
>     if (num == 4) return 16;
>     if (num == 5) return 35;
>     if (num == 6) return 36;
>     if (num == 7) return 49;
>     ...
>     if (num == 15) return 225;
>   }
>
> My tests pass, but the implementation is wrong.  To avoid this, when
> writing tests, I should add so many test cases that I get a headache.

If the SPECIFICATION for the module only defines how it is supposed to
behave over the domain {0, 1, 2, 15}, then your initial set of tests
is appropriate.

Note that it can't return a result outside of [0..255] due to the
return type that *you* have chosen -- a quick sketch:
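(Assume the obvious arithmetic implementation rather than your lookup
table; the harness is illustrative only, and the second assert is
DESIGNED to fail at runtime.)

    #include <assert.h>

    static unsigned char square(unsigned char num)
    {
        return num * num;   /* anything over 255 silently truncated */
    }

    int main(void)
    {
        assert(square(15) == 225);   /* passes: 225 fits in 8 bits */
        assert(square(16) == 256);   /* FAILS: 256 wraps around to 0 */
        return 0;
    }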
So, applying a test of 16 or larger will give you a FAILing result --
and cause you to wonder why the code doesn't work (ans: because you
constrained the range in your choice of return type).  This is how
testing can catch mistakes.
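Returning to the earlier whatever() sketch: the minimal path tests are
one case per branch, with the boundary case deliberately included
(names and return values invented for illustration):

    #include <assert.h>

    /* Hypothetical function with the two-path structure shown above */
    static int whatever(int x, int y)
    {
        if (x > y) {
            return 1;   /* "do something" */
        } else {
            return 0;   /* "do something else" */
        }
    }

    int main(void)
    {
        assert(whatever(5, 4) == 1);   /* exercises the x > y path */
        assert(whatever(4, 4) == 0);   /* exercises !(x > y); x == y
                                          sits ON the boundary */
        return 0;
    }

Come version 305, these two stay in the suite (regression) and you add
whatever new cases THAT implementation's internals suggest.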