From: Don Y
Newsgroups: comp.arch.embedded
Subject: Re: Unit Testing: from theory to real case
Date: Fri, 30 Aug 2024 12:33:05 -0700

On 8/30/2024 9:00 AM, pozz wrote:
> On 30/08/2024 14:21, Don Y wrote:
>> On 8/30/2024 1:18 AM, pozz wrote:
>>> When you write, test for this, test for that, what happens if the
>>> client uses the module in a wrong way, what happens when the system
>>> clock changes a little or a lot, and when the task missed the exact
>>> timestamp of an event?
>>>
>>> I was trying to write tests for *all* of those situations, but it
>>> seemed to me a very, VERY, *VERY* big job. The implementation of the
>>> calendar module took me a couple of days; the tests seem an infinite
>>> job.
>>
>> Because there are lots of ways your code can fail.  You have to prove
>> that it doesn't fail in ANY of those ways.
>
> So you're confirming it's a very tedious and long job.

It is "tedious" because you consider it "overhead" instead of an
integral part of the job.

For a painter, cleaning his brushes is "tedious".  But, failing to
clean them means they are ruined by their use!  Putting fuel into a
vehicle is "tedious" -- but failing to do so means the vehicle stops
moving.  Cashing a paycheck is tedious; "why can't they pay me in
cash???"

As to length/duration, it *feels* long because it is "boring" and not
perceived as a productive aspect of the job.

I spend 40% of my time creating specifications, 20% writing the code
to meet those specifications and the remaining 40% TESTING my code to
prove that it meets those specifications.

If I don't know what I am supposed to write (for lack of a
specification), then how do I *know* what to write?  Do I just fumble
around and hope something gels from my efforts?  And, having written
"it", how do I know that it complies with ALL of the specifications?

I suspect you're just starting to write code on day one without any
plan/map as to where you'll be going...

>> Chances are, there is one place in your code that is aware of the
>> fact that the event is scheduled for a PAST time.  So, you only need
>> to create one test.  (actually, two -- one that proves one behavior
>> for time *almost* NOT past and another for time JUST past)
>
> I read that tests shouldn't be written for the specific
> implementation, but should be generic enough to work well even if the
> implementation changes.

The tests that you write before you write any code should cover the
operation of the module without regard to its actual implementation
(BECAUSE YOU HAVEN'T WRITTEN ANY CODE YET!)

*As* you settle on a particular implementation, your "under the hood"
examination of the code will reveal issues that could present bugs.
So, you should be ADDING test cases to deliberately tickle those
issues.  These are just "select, special cases" -- a sketch of such a
boundary pair follows below.
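Take the "time past" example: the two tests hug the single comparison
inside the implementation.  A minimal sketch -- the function name and
the "past at exactly the timestamp" convention are invented here, NOT
your actual calendar API:

    #include <assert.h>
    #include <stdbool.h>

    /* Hypothetical stand-in for the calendar module's one decision:
       an event whose timestamp is <= 'now' is considered past. */
    static bool event_is_past(long event_time, long now)
    {
        return event_time <= now;
    }

    int main(void)
    {
        assert(!event_is_past(1000, 999));   /* time *almost* NOT past */
        assert( event_is_past(1000, 1000));  /* time JUST past */
        return 0;
    }

Whichever way the comparison is (mis)written -- '<' vs '<=' -- one of
those two cases trips it.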
To an outside observer, such cases don't appear "special" as they ALSO
meet the specification of the module, in general.  But, to the
implementor with knowledge of the internals, they highlight special
conditions where the code (if incorrect) could fail.  They provide
reassurance that those "special conditions" *in* the implementation
have been handled correctly.

Symbolic execution algorithmically identifies these "edges" in the
code and tests them (but you likely can't afford that luxury in your
build environment).

>> Your goal (having already implemented the modules) is to exercise
>> each path through the code.
>>
>> whatever() {
>>     ...
>>     if (x > y) {
>>        // do something
>>     } else {
>>        // do something else
>>     }
>>     ...
>> }
>>
>> Here, there are only two different paths through the code:
>> - one for x > y
>> - one for !(x > y)
>> So, you need to create test cases that will exercise each path.
>
> Now I really know there are only two paths in the current
> implementation, but I'm not sure this will stay the same in the
> future.

Then you *add* MORE test cases to tickle the special cases in THAT
implementation.  Of course, the old test cases should *still* pass, so
there is no reason to remove those!  (this is the essence of
regression testing -- to ensure you haven't "slid backwards" with some
new change)

>> Note that test cases that are applied to version 1 of the code
>> should yield the same results in version 305, even if the
>> implementation changes dramatically.  Because the FUNCTIONALITY
>> shouldn't be changing.
>
> Ok, but if you create tests knowing how you will implement
> functionalities (execution paths), it's possible they will not be
> sufficient when the implementation changes at version 305.

Yes.  You've written NEW code so you have to add tests to cover any
potential vulnerabilities in your new implementation!

If generating test cases were "inexpensive", you would test every
possible combination of inputs to each function and verify the
outputs.  But, that would be a ridiculously large number of tests!
So, you pick *smarter* tests that tickle the aspects of the
implementation that are likely to be incorrectly designed/implemented.
This reduces the number of tests and still leaves you confident in the
implementation because you expect the code to be "well behaved"
*between* the special cases that you've chosen.

> Before implementing the function I can imagine the following test
> cases:
>
>   assert(square(0) == 0)
>   assert(square(1) == 1)
>   assert(square(2) == 4)
>   assert(square(15) == 225)
>
> Now the developer writes the function this way:
>
>   unsigned char square(unsigned char num) {
>     if (num == 0) return 0;
>     if (num == 1) return 1;
>     if (num == 2) return 4;
>     if (num == 3) return 9;
>     if (num == 4) return 16;
>     if (num == 5) return 35;
>     if (num == 6) return 36;
>     if (num == 7) return 49;
>     ...
>     if (num == 15) return 225;
>   }
>
> My tests pass, but the implementation is wrong.  To avoid this, when
> writing tests, I should add so many test cases that I get a headache.

If the SPECIFICATION for the module only defines how it is supposed to
behave over the domain {0, 1, 2, 15}, then your initial set of tests
is appropriate.

Note that it can't return a result outside of [0..255] due to the
return type that *you* have chosen -- a quick sketch:
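(Assume the obvious arithmetic implementation rather than your lookup
table; the harness is illustrative only, and the second assert is
DESIGNED to fail at runtime.)

    #include <assert.h>

    static unsigned char square(unsigned char num)
    {
        return num * num;   /* anything over 255 silently truncated */
    }

    int main(void)
    {
        assert(square(15) == 225);   /* passes: 225 fits in 8 bits */
        assert(square(16) == 256);   /* FAILS: 256 wraps around to 0 */
        return 0;
    }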
So, applying a test of 16 or larger will give you a FAILing result --
and cause you to wonder why the code doesn't work (ans: because you
constrained the range in your choice of return type).  This is how
testing can catch mistakes.
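Returning to the earlier whatever() sketch: the minimal path tests are
one case per branch, with the boundary case deliberately included
(names and return values invented for illustration):

    #include <assert.h>

    /* Hypothetical function with the two-path structure shown above */
    static int whatever(int x, int y)
    {
        if (x > y) {
            return 1;   /* "do something" */
        } else {
            return 0;   /* "do something else" */
        }
    }

    int main(void)
    {
        assert(whatever(5, 4) == 1);   /* exercises the x > y path */
        assert(whatever(4, 4) == 0);   /* exercises !(x > y); x == y
                                          sits ON the boundary */
        return 0;
    }

Come version 305, these two stay in the suite (regression) and you add
whatever new cases THAT implementation's internals suggest.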