From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.arch.embedded
Subject: Re: Library for save an events log in Flash
Date: Thu, 18 Apr 2024 21:30:19 +0200

On 18/04/2024 16:38, pozz wrote:
> The request is very common: when some interesting events occur in the 
> system, they, with the related timestamp, must be saved in a log. The 
> log must be saved in an external SPI Flash connected to an MCU. The log 
> has a maximum number of events. After the log is completely filled, each 
> new event overwrites the oldest event.
> 
> I tried to implement such library, but I eventually found it's not a 
> simple task, mostly if you want a reliable library that works even when 
> errors occur (for example, when one writing fails).
> 
> I started by assigning 5 sectors of SPI Flash to the log. There are 256 
> events in a sector (the event is a fixed struct).
> In this scenario, the maximum log size is 1024 events, because the 5th 
> sector can be erased when the pointer reaches the end of a sector.
> 
> The first challenge is how the system can understand what is the first 
> (newest) event in the log at startup. I solved this by saving a 16-bit 
> counter ID with each event. The initialization routine reads all the IDs 
> and takes the greatest as the last event.
> However initially the log is empty, so all the IDs are 0xFFFF, the 
> maximum. One solution is to stop reading events when 0xFFFF is read and 
> wrap-around ID at 0xFFFE and not 0xFFFF.

Start at 0xfffe, and count down.  Or xor with 0xffff for storage.  Or 
wrap at 0xfffe, as you suggest.  Or use 32-bit values.  Or have another 
way to indicate that the log entry is valid.  Or, since you have a 
timestamp, there's no need to track an ID - the timestamp will be 
increasing monotonically.
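
As a rough sketch of the xor option (the names id_encode, id_decode and find_newest are made up for illustration): if the ID is xor'ed with 0xffff before it is stored, an erased slot reads back as ID 0, so IDs starting from 1 can never be mistaken for blank flash.  This only covers the "blank looks like the maximum" problem, it does not deal with the counter wrapping.

```c
#include <stdint.h>
#include <stddef.h>

#define ID_BLANK 0u   /* decoded value of an erased (0xFFFF) slot */

/* Stored form is the ID xor 0xFFFF, so real IDs must start at 1. */
static inline uint16_t id_encode(uint16_t id)  { return (uint16_t)(id ^ 0xFFFFu); }
static inline uint16_t id_decode(uint16_t raw) { return (uint16_t)(raw ^ 0xFFFFu); }

/* Return the index of the slot with the greatest decoded ID,
 * or -1 if every slot is blank.  raw_ids holds the values as
 * read from flash. */
int find_newest(const uint16_t *raw_ids, size_t n)
{
    int best = -1;
    uint16_t best_id = ID_BLANK;

    for (size_t i = 0; i < n; i++) {
        uint16_t id = id_decode(raw_ids[i]);
        if (id != ID_BLANK && id > best_id) {
            best_id = id;
            best = (int)i;
        }
    }
    return best;
}
```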

> 
> However there's another problem. What happens after writing 65535 events 
> in the log? The ID restarts from 0, so the latest event hasn't the 
> greatest ID anymore.
> 
> These are the saved IDs after 65536 events:
> 
>      1^ SECT    2^ SECT    3^ SECT    4^ SECT    5^SECT---------->
>      0xFB00 ... 0xFC00 ... 0xFD00 ... 0xFE00 ... 0xFF00 ... 0xFFFF
> 
> The rule "newest event has greatest ID" still holds. Now a new event 
> is written:
> 
>      1^ SECT-------> 2^ SECT   3^ SECT   4^ SECT   5^SECT--------->
>      0x0000 0xFB01.. 0xFC00 .. 0xFD00 .. 0xFE00 .. 0xFF00 .. 0xFFFF
> 
> Now the rule doesn't work anymore. The solution I found is to detect 
> discontinuity. The IDs are consecutive, so the initialization routine 
> continues reading while ID(n+1)=ID(n)+1. When there's a gap, the 
> init function stops, having found the ID and position of the newest event.
> 

Make your counts from 0 to 256*5 - 1, then wrap.  Log entry "n" will be 
at address n * sizeof(log entry), with up to 256 log entries blank. 
Then you don't need to store a number at all.
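
A minimal sketch of that layout, under some assumptions that are not in the original post: a 16-byte entry, the log starting at flash address 0, a blank entry being all 0xff, and the next sector being erased early enough that at least one blank slot always exists.  The sim_flash array and flash_read() are hypothetical stand-ins for the real SPI driver so the code runs on a host.

```c
#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define EVENTS_PER_SECTOR 256u
#define NUM_SECTORS       5u
#define NUM_SLOTS         (EVENTS_PER_SECTOR * NUM_SECTORS)
#define ENTRY_SIZE        16u   /* hypothetical entry size */

/* RAM stand-in for the SPI flash (assumption, for host testing). */
static uint8_t sim_flash[NUM_SLOTS * ENTRY_SIZE];

static void flash_read(uint32_t addr, void *buf, uint32_t len)
{
    memcpy(buf, &sim_flash[addr], len);
}

/* Log entry n lives at n * sizeof(log entry). */
static uint32_t slot_address(uint32_t slot) { return slot * ENTRY_SIZE; }

static bool slot_is_blank(uint32_t slot)
{
    uint8_t entry[ENTRY_SIZE];
    flash_read(slot_address(slot), entry, ENTRY_SIZE);
    for (uint32_t i = 0; i < ENTRY_SIZE; i++) {
        if (entry[i] != 0xFFu) return false;
    }
    return true;
}

/* Startup scan: the newest entry is the written slot whose successor
 * (modulo NUM_SLOTS) is blank.  Returns the next slot to write and
 * stores the newest slot in *newest, or -1 if the log is empty. */
static uint32_t find_write_slot(int32_t *newest)
{
    for (uint32_t s = 0; s < NUM_SLOTS; s++) {
        uint32_t next = (s + 1u) % NUM_SLOTS;
        if (!slot_is_blank(s) && slot_is_blank(next)) {
            *newest = (int32_t)s;
            return next;
        }
    }
    *newest = -1;   /* all blank (or no blank slot, which is assumed away) */
    return 0;
}
```

The k-th newest entry is then just (newest + NUM_SLOTS - k) % NUM_SLOTS, with no stored counter involved.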

> But the problems don't stop here. What happens if an event write failed? 
> Should I verify the real written values, by reading back them and 
> comparing with original data? In a reliable system, yes, I should.
> 
> I was thinking of protecting each event with a CRC, so as to know at any 
> time if an event is corrupted (badly written). However the possibility of 
> having isolated corrupted events increases the complexity of the task a lot.
> 

An 8-bit or 16-bit CRC is peanuts to calculate and check.  Write the 
whole log entry except for the final byte or two (whatever the minimum 
write size of the flash is), and make sure that final part is something 
other than 0xff's.  Check the entry after writing if you really want, 
then write the final byte or two.  Then if there's a power-fail in the 
middle, it's obvious that the log entry is bad.
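
Something along these lines, as a sketch only.  The entry layout, the 0xA55A commit value, the choice of CRC-16/CCITT-FALSE and the RAM-backed flash_write()/flash_read() are all assumptions for illustration, not anything taken from a real driver or from the original post.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_SIZE 12u      /* hypothetical: timestamp + event data */
#define COMMIT_MARK  0xA55Au  /* anything other than 0xFFFF */

typedef struct {
    uint8_t  payload[PAYLOAD_SIZE];
    uint16_t crc;     /* CRC over payload */
    uint16_t commit;  /* written last; 0xFFFF means "not committed" */
} log_entry_t;        /* 12 + 2 + 2 = 16 bytes, no padding */

/* RAM stand-in for NOR flash: writes can only clear bits. */
static uint8_t sim_flash[4096];

static void flash_erase_all(void) { memset(sim_flash, 0xFF, sizeof sim_flash); }

static void flash_write(uint32_t addr, const void *src, uint32_t len)
{
    const uint8_t *p = src;
    for (uint32_t i = 0; i < len; i++) sim_flash[addr + i] &= p[i];
}

static void flash_read(uint32_t addr, void *dst, uint32_t len)
{
    memcpy(dst, &sim_flash[addr], len);
}

/* CRC-16/CCITT-FALSE: poly 0x1021, init 0xFFFF, no reflection. */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len)
{
    uint16_t crc = 0xFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int b = 0; b < 8; b++) {
            crc = (crc & 0x8000u) ? (uint16_t)((crc << 1) ^ 0x1021u)
                                  : (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Phase 1: payload + CRC.  Optional read-back check.  Phase 2: commit. */
static bool log_write_entry(uint32_t addr, const uint8_t payload[PAYLOAD_SIZE])
{
    log_entry_t e, check;
    const uint32_t body_len = (uint32_t)offsetof(log_entry_t, commit);

    memcpy(e.payload, payload, PAYLOAD_SIZE);
    e.crc = crc16_ccitt(e.payload, PAYLOAD_SIZE);
    e.commit = COMMIT_MARK;

    flash_write(addr, &e, body_len);
    flash_read(addr, &check, sizeof check);
    if (memcmp(&check, &e, body_len) != 0) return false;

    flash_write(addr + body_len, &e.commit, sizeof e.commit);
    return true;
}

/* An entry counts only if it was committed and its CRC matches. */
static bool log_entry_valid(uint32_t addr)
{
    log_entry_t e;
    flash_read(addr, &e, sizeof e);
    return e.commit == COMMIT_MARK &&
           e.crc == crc16_ccitt(e.payload, PAYLOAD_SIZE);
}
```

A reset between the two phases leaves the commit field at 0xffff, so the reader skips that entry without needing any other bookkeeping.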

For SPI NOR flash, the risk of a bad write - other than through 
unexpected resets or power fails - is negligible for most purposes. 
Don't over-complicate things worrying about something that will never 
happen.

> Suppose to write the 4th event with ID=3 at position 4 in the sector. 
> Now suppose the write failed. I can try to re-write the same event at 
> position 5. Should I use ID=4 or ID=5? At first, I think it's better to 
> use ID=5. The CRC should detect that at position 4 is stored a corrupted 
> event.
> After that two events are written as well, ID=6 and 7.
> 
> Now the higher application wants to read the 4th event of the log 
> (starting from the newest). We know that the newest is ID=7, so the 4th 
> event is ID=7-4+1=4. However the event with ID=4 is corrupted, so we 
> should check the previous ID=3... and so on.
> Now we can't have a mathematical function that returns the position of 
> an event starting from its ordinal number from the newest event.
> 
> 
> Eventually I think it's better to use a public domain library, if it 
> exists.
>