Path: ...!2.eu.feeder.erje.net!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: pozz <pozzugno@gmail.com>
Newsgroups: comp.arch.embedded
Subject: Re: Library for save an events log in Flash
Date: Fri, 19 Apr 2024 16:58:35 +0200
Organization: A noiseless patient Spider
Lines: 243
Message-ID: <uvu0qq$2vscg$1@dont-email.me>
References: <uvrb97$24nd4$1@dont-email.me> <uvrscc$2e2vi$1@dont-email.me>
 <uvs08l$2f2pb$1@dont-email.me> <uvt9vu$2tb3t$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 19 Apr 2024 16:58:34 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7a9d6136739a41fe83f8065a55972666";
	logging-data="3142032"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+QzxXXH/0aoVm6v1tk+PH8e0kDzcaLGdg="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:LYMZj1oWwk46i4sYJo/h48Oigso=
In-Reply-To: <uvt9vu$2tb3t$1@dont-email.me>
Content-Language: it
Bytes: 12050

On 19/04/2024 10:28, David Brown wrote:
> On 18/04/2024 22:36, pozz wrote:
>> On 18/04/2024 21:30, David Brown wrote:
>>> On 18/04/2024 16:38, pozz wrote:
>>>> The request is very common: when events of interest occur in the 
>>>> system, they must be saved in a log together with their timestamps. 
>>>> The log must be saved in an external SPI Flash connected to an MCU. 
>>>> The log has a maximum number of events. Once the log is completely 
>>>> filled, each new event overwrites the oldest one.
>>>>
>>>> I tried to implement such a library, but I eventually found it's not a 
>>>> simple task, especially if you want a reliable library that works even 
>>>> when errors occur (for example, when a write fails).
>>>>
>>>> I started by assigning 5 sectors of SPI Flash to the log. There are 
>>>> 256 events in a sector (the event is a fixed struct).
>>>> In this scenario, the guaranteed log size is 1024 events, because one 
>>>> of the 5 sectors may be erased, to make room for new events, when the 
>>>> write pointer reaches the end of a sector.
>>>>
>>>> The first challenge is how the system can determine, at startup, 
>>>> which event in the log is the newest. I solved this by saving a 
>>>> 16-bit counter ID with each event. The initialization routine reads 
>>>> all the IDs and takes the greatest as the last event.
>>>> However, the log is initially empty, so all the IDs are 0xFFFF, the 
>>>> maximum. One solution is to stop reading events when 0xFFFF is read 
>>>> and to wrap the ID around after 0xFFFE rather than 0xFFFF.
>>>
>>> Start at 0xfffe, and count down. 
>>
>>
>> And what should happen when the counter reaches zero? It can wrap 
>> around to 0xfffe (which is very similar to an increasing counter over 
>> 0x0000-0xFFFE).
>>
> 
> How big are your log entries?  How many entries are realistic in the 
> lifetime of the system?

Each entry is 16 bytes. It's difficult to reach 0xFFFF events in the log, 
but why limit our imagination? :-D

With 20 events per day, the 16-bit ID would wrap around after about 9 
years. That's a very long life, but I think I have solved the problem of 
detecting whether the ID has wrapped around.
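
A minimal sketch of the discontinuity check described in the quoted text further down, assuming the IDs of all slots have already been read into a RAM array, that 0xFFFF marks an erased slot, and that the counter wraps from 0xFFFE to 0. The names (find_newest, next_id, ID_ERASED, ID_MAX) are only illustrative, they are not from my library:

```c
#include <stdint.h>
#include <stddef.h>

#define ID_ERASED 0xFFFFu   /* assumed: ID read back from an erased slot */
#define ID_MAX    0xFFFEu   /* assumed: largest valid ID, wraps to 0 after it */

/* Successor of an ID, skipping the reserved "erased" value. */
static uint16_t next_id(uint16_t id)
{
    return (id == ID_MAX) ? 0u : (uint16_t)(id + 1u);
}

/*
 * Return the slot index of the newest event, or -1 if every slot is erased.
 * The newest event is the only used slot whose circular successor does not
 * carry the next consecutive ID (it is either erased or holds an older ID).
 */
static long find_newest(const uint16_t *ids, size_t nslots)
{
    for (size_t i = 0; i < nslots; i++) {
        if (ids[i] == ID_ERASED)
            continue;                       /* erased slot, nothing to compare */
        uint16_t following = ids[(i + 1u) % nslots];
        if (following != next_id(ids[i]))
            return (long)i;                 /* gap found: this is the newest */
    }
    return -1;                              /* empty log */
}
```

Because the comparison uses next_id() rather than "greatest ID", the wrap from 0xFFFE to 0 is not seen as a gap, so the counter can wrap any number of times.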


>>> Or xor with 0xffff for storage.  Or
>>> wrap at 0xfffe, as you suggest.  Or use 32-bit values.  Or have 
>>> another way to indicate that the log entry is valid. 
>>
>> I will add a CRC for each entry and that can be used to validate the 
>> event. An empty/erased slot filled with 0xFF will not pass CRC 
>> validation.
>>
> 
> That's usually fine.
> 
>>
>>> Or, since you have a timestamp, there's no need to track an ID - the 
>>> timestamp will be increasing monotonically.
>>
>> I don't want to use timestamps for two reasons:
>>
>> - the system wall clock can be changed (the system is isolated)
>> - the library I'm writing doesn't know the content of "events", for
>>    it the event is an opaque sequence of bytes.
>>
> 
> OK.
> 
>>
>>>> However, there's another problem. What happens after writing 65536 
>>>> events in the log? The ID restarts from 0, so the latest event 
>>>> no longer has the greatest ID.
>>>>
>>>> These are the saved IDs after 65536 events:
>>>>
>>>>      1^ SECT    2^ SECT    3^ SECT    4^ SECT    5^SECT---------->
>>>>      0xFB00 ... 0xFC00 ... 0xFD00 ... 0xFE00 ... 0xFF00 ... 0xFFFF
>>>>
>>>> The rule "newest event has greatest ID" still holds. Now a new 
>>>> event is written:
>>>>
>>>>      1^ SECT-------> 2^ SECT   3^ SECT   4^ SECT   5^SECT--------->
>>>>      0x0000 0xFB01.. 0xFC00 .. 0xFD00 .. 0xFE00 .. 0xFF00 .. 0xFFFF
>>>>
>>>> Now the rule doesn't work anymore. The solution I found is to detect 
>>>> the discontinuity. The IDs are consecutive, so the initialization 
>>>> routine keeps reading while ID(n+1) = ID(n)+1. When there's a 
>>>> gap, the init function stops: it has found the ID and position of 
>>>> the newest event.
>>>
>>> Make your counts from 0 to 256*5 - 1, then wrap.  Log entry "n" will 
>>> be at address n * sizeof(log entry), with up to 256 log entries 
>>> blank. Then you don't need to store a number at all.
>>
>> What do you mean by log entry "0"? Is it the oldest or the newest? I 
>> think the oldest, because that formula is imho correct in this case.
>>
>> However, it doesn't appear correct once the log has rotated, which 
>> happens after writing 5x256+1 events. In this case the newest entry 
>> ("n"=1280) is at address 0, not n*sizeof(entry).
>>
> 
> (I misread your "5 sectors of an SPI flash chip" as "5 SPI flash chips" 
> when first replying.  It makes no real difference to what I wrote, but I 
> might have used "chip" instead of "sector".)
> 
> You have 256 entries per flash sector, and 5 flash sectors.  For the log 
> entry number "n" - where "n" is an abstract count that never wraps, your 
> index "i" into the flash array is (n % (5*256)).  The sector number is 
> then (i / 256), and the index into the sector is (i % 256).  The 
> position in the log is determined directly by the entry number, and you 
> don't actually need to store it anywhere.
> 
> Think of this a different way - dispense with the log entry numbers 
> entirely.  When you start up, scan the flash to find the next free slot. 
>   You do this by looking at slot 0 first.  If that is not empty, keep 
> scanning until you find a free slot - that's the next free slot.  If 
> slot 0 is empty, scan until you have non-empty slots, then keep going 
> until you get a free one again, and that's the next free slot.  If you 
> never find a used slot, or fail to find a free slot after the non-free 
> slots, then your first free slot is slot 0.
> 
> Any new logs are then put in this slot, moving forward.  If you need to 
> read out old logs, move backwards.  When storing new logs, as you are 
> nearing the end of a flash sector (how near depends on the sector erase 
> time and how often events can occur), start the erase of the next sector 
> in line.
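
The startup scan quoted above could be sketched roughly as follows. This is only an illustration under my own assumptions: slot_free[] is a hypothetical RAM array standing in for whatever "is this slot erased?" test the real code performs on the Flash, and find_next_free is an invented name.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the index of the next free slot, following the quoted rules:
 *  - slot 0 used:  the first free slot after the leading used run;
 *  - slot 0 free:  skip the free run, then the used run, and take the
 *                  first free slot after it;
 *  - no used slot at all, or no free slot after the used run: slot 0.
 */
static size_t find_next_free(const bool *slot_free, size_t nslots)
{
    size_t i = 0;

    /* Skip the leading run of free slots (empty run if slot 0 is used). */
    while (i < nslots && slot_free[i])
        i++;
    if (i == nslots)
        return 0;               /* never found a used slot: log is empty */

    /* Skip the run of used slots. */
    while (i < nslots && !slot_free[i])
        i++;

    return (i < nslots) ? i : 0; /* ran off the end: wrap to slot 0 */
}
```

On a real device the scan would read only the first bytes of each slot (or use a binary search inside a sector), but the decision logic stays the same.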

Yes, that is what I already do. However, I disagree with the formula.

The higher-layer application requests log entry 0. What is it? The 
newest event. My eventlog library has to convert 0 to the slot index in 
the Flash, which maps directly to the Flash address (I don't really 
need the sector number here).

If the log holds a single event, event 0 for the application is slot 0 
for the eventlog library. However, if there are three events in the log, 
event 0 for the application is slot 2 for the eventlog library.

The application doesn't know anything about what I named the ID of the 
event. It's just a number used by the lower-layer eventlog module to 
find the first free slot at startup.
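
To make the mapping concrete, here is a small sketch of how the conversion could look, assuming the library keeps the next free slot and the number of stored events in RAM after the startup scan. The function and macro names are invented for this example; only the 5 sectors, 256 events per sector and 16-byte entries come from the discussion above.

```c
#include <stdint.h>
#include <stddef.h>

#define SLOTS_PER_SECTOR 256u
#define NUM_SECTORS      5u
#define TOTAL_SLOTS      (SLOTS_PER_SECTOR * NUM_SECTORS)
#define ENTRY_SIZE       16u   /* bytes per event */

/*
 * Map an application index (0 = newest event) to a slot index.
 * next_free is the slot the next event will be written to, count is the
 * number of events currently stored. Returns -1 if the request is invalid.
 */
static long app_index_to_slot(size_t app_index, size_t next_free, size_t count)
{
    if (next_free >= TOTAL_SLOTS || count > TOTAL_SLOTS || app_index >= count)
        return -1;
    /* The newest event sits just before next_free; older ones go backwards,
       wrapping from slot 0 to the last slot. */
    return (long)((next_free + TOTAL_SLOTS - 1u - app_index) % TOTAL_SLOTS);
}

/* Byte offset of a slot from the start of the log area in Flash. */
static uint32_t slot_to_offset(size_t slot)
{
    return (uint32_t)(slot * ENTRY_SIZE);
}
```

With three events stored (next_free = 3), index 0 maps to slot 2 and index 2 maps to slot 0. After the log has rotated and next_free is back at 0, index 0 maps to slot 1279.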


>>>> But the problems don't stop here. What happens if an event write 
>>>> fails? Should I verify the values actually written, by reading them 
>>>> back and comparing them with the original data? In a reliable 
>>>> system, yes, I should.
>>>>
>>>> I was thinking of protecting each event with a CRC, so that I can tell 
>>>> at any time whether an event is corrupted (badly written). However, 
>>>> the possibility of isolated corrupted events increases the 
>>>> complexity of the task a lot.
>>>
>>> An 8-bit or 16-bit CRC is peanuts to calculate and check. 
>>
>> I know. Here the increased complexity wasn't related to the CRC 
>> calculation, but to the possibility of having isolated corrupted slots 
>> in the buffer. Taking these corrupted slots into account isn't so 
>> simple for me.
>>
> 
> Think how such corruption could happen, and its consequences.  For most 
> event logs, it is simply not going to occur in the lifetime of working 
> products - and if it does, as an isolated error in an event log, it 
> doesn't matter significantly.  Errors in a sensibly designed SPI NOR 
> flash system would be an indication of serious hardware problems such as 
> erratic power supplies, and then the log is the least of your concerns.
> 
> The only thing to consider is a reset or power failure in the middle of 
> writing a log event.

Yes, I agree with you.
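
For reference, a per-entry check along these lines might look like the sketch below. The layout is my assumption for the example only: 14 opaque payload bytes followed by a 16-bit CRC (CRC-16/CCITT-FALSE, polynomial 0x1021, initial value 0xFFFF), stored big-endian. An all-0xFF slot is tested explicitly and reported as erased, so the result doesn't depend on what the CRC of an erased slot happens to be.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define ENTRY_SIZE   16u
#define PAYLOAD_SIZE (ENTRY_SIZE - 2u)   /* assumed: last 2 bytes hold the CRC */

typedef enum { SLOT_ERASED, SLOT_VALID, SLOT_CORRUPT } slot_state_t;

/* Bitwise CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Fill in the CRC field of an entry before it is written to Flash. */
static void entry_seal(uint8_t entry[ENTRY_SIZE])
{
    uint16_t crc = crc16_ccitt(entry, PAYLOAD_SIZE);
    entry[PAYLOAD_SIZE]      = (uint8_t)(crc >> 8);
    entry[PAYLOAD_SIZE + 1u] = (uint8_t)(crc & 0xFFu);
}

/* Classify a slot read back from Flash. */
static slot_state_t entry_classify(const uint8_t entry[ENTRY_SIZE])
{
    bool erased = true;
    for (size_t i = 0; i < ENTRY_SIZE; i++) {
        if (entry[i] != 0xFFu) {
            erased = false;
            break;
        }
    }
    if (erased)
        return SLOT_ERASED;

    uint16_t stored = (uint16_t)(((uint16_t)entry[PAYLOAD_SIZE] << 8)
                                 | entry[PAYLOAD_SIZE + 1u]);
    return (stored == crc16_ccitt(entry, PAYLOAD_SIZE)) ? SLOT_VALID
                                                        : SLOT_CORRUPT;
}
```

A slot whose write was interrupted by a reset would normally end up as SLOT_CORRUPT, and the startup scan can then decide whether to skip it or treat it as the end of the log.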


>>> Write the whole log entry except for a byte or two (whatever is the 
========== REMAINDER OF ARTICLE TRUNCATED ==========