Path: ...!weretis.net!feeder6.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: mitchalsup@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Efficiency of in-order vs. OoO
Date: Wed, 13 Mar 2024 15:36:50 +0000
Organization: Rocksolid Light
Message-ID: <f62c31b943bb5f89ed14f4cc359762bb@www.novabbs.org>
References: <uigus7$1pteb$1@dont-email.me>
 <a292b7f4fd21a329c25a686bb16a2b4d@www.novabbs.org>
 <gkxtN.44654$SyNd.35025@fx33.iad>
 <urg470$215g3$3@dont-email.me>
 <zp2DN.467627$xHn7.106497@fx14.iad>
 <usgid7$20vho$1@dont-email.me>
 <81072affd0e9b23260975a11cc07e9a8@www.novabbs.org>
 <JS_GN.75294$LONb.61962@fx08.iad>
 <a7ff82db04a4bb9a5c06508967abd6f0@www.novabbs.org>
 <zIkHN.527686$7sbb.324715@fx16.iad>
 <V2mHN.527689$7sbb.171646@fx16.iad>
 <E2nHN.412941$Ama9.35835@fx12.iad>
 <Q4oHN.386275$vFZa.14544@fx13.iad>
 <uhqHN.366349$q3F7.102593@fx45.iad>
 <lXHHN.529056$7sbb.441166@fx16.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org; logging-data="1858606"; mail-complaints-to="usenet@i2pn2.org"; posting-account="PGd4t4cXnWwgUWG9VtTiCsm47oOWbHLcTr4rYoM0Edo";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$o8C6G0zJJIvTAsJ6RBGnl.ZA.l0SjJ2QDILVwoCBXhyBsWMBl8WiO
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
Bytes: 4278
Lines: 57

EricP wrote:

> Scott Lurndal wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>> Scott Lurndal wrote:
>>>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>>>> EricP wrote:
>>>>>>>> As most stores are posted, the data stored needs to be 'poisoned'
>>>>>>>> so that any subsequent use of the data (e.g. a load) will report
>>>>>>>> a fault.
>>>>>>> Storing the bad <arriving> ECC should take care of that.
>>>>>> I don't think that will always work. Assuming we are using a
>>>>>> 72-bit SECDED ECC and a cache line is read with a double error,
>>>>>> then if the ST overwrites an 8 byte aligned value it will generate
>>>>>> a new valid ECC and correct the error.
>>>>>>
>>>>>> However if the ST is less than 8 bytes or misaligned, it won't know which
>>>>>> of the 8 bytes was invalid so can't tell if the bad data was overwritten.
>>>>>> If it keeps the old ECC as an error indicator, that code might actually be
>>>>>> correct for the new data. If it generates a new valid ECC then it loses
>>>>>> track of the fact that the data MAY be invalid.
>>>>>>
>>>>>> In this second case of partial overwrite I think it has to generate a
>>>>>> new invalid ECC for the new 8 byte data indicating a double error.
>>>>>>
>>>>>> When the modified line is written back to DRAM it retains the
>>>>>> double error ECC.
>>>>> And if the page is out swapped and recycled we lose track of
>>>>> the error indicator on that 8-byte value.
>>>> If it was properly poisoned, the access by the DMA engine will
>>>> cause a RAS error to be signalled and the DMA aborted.
>>> And the OS does what with the page and its data?
>>> This could happen long after the owner process terminated,
>>> maybe part of a lazy file cache write back.
>>>
>>> The only option for the OS might be to log the error and just reset
>>> the ECC to valid for the current data so the IO can complete.
>>
>> No, the I/O must be aborted. RAS 101 - do not propagate
>> poisoned data.

Consider a page being written out and the last cache line in the page
has a bad ECC. What command does one send the disk to indicate "forget
all that data I just sent you" ??

> Perhaps but tossing a whole block from an IO expands the size of
> the problem by a factor of 1000's.
> If that was one byte wrong in a text file then I think most people
> would want it written, as opposed to tossing out their work.
> If that was one byte wrong in a file system meta data block then
> there is no good answer. Many of the meta data blocks are in linked lists
> or B+ trees so not writing the block could corrupt a whole file system,
> and writing the block could also cause corruption but hopefully less likely.
> So you are damned if you do fix the ECC and write the block,
> and damned if you don't. But do seems less damning.
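
The partial-overwrite rule argued over in the quoted text can be sketched as a toy model. This is only an illustration under stated assumptions, not how any real cache or memory controller is built. The names `EccWord`, `store`, `read` and `PoisonedDataError` are hypothetical, and a boolean `poisoned` flag stands in for "the stored check bits deliberately decode as a double error" rather than computing a real 72-bit SECDED code. Stores that straddle two 8-byte words are left out.

```python
from dataclasses import dataclass

WORD_BYTES = 8  # one 72-bit SECDED codeword protects 8 data bytes


class PoisonedDataError(Exception):
    """Raised when a consumer (a load or a DMA read) touches a poisoned word."""


@dataclass
class EccWord:
    data: bytearray          # the 8 data bytes
    poisoned: bool = False   # assumption: models an ECC that decodes as a double error


def store(word: EccWord, offset: int, value: bytes) -> None:
    """Write `value` at `offset` inside one 8-byte word and update the poison state."""
    if offset < 0 or offset + len(value) > WORD_BYTES:
        raise ValueError("store must fit inside one 8-byte word")
    full_overwrite = offset == 0 and len(value) == WORD_BYTES
    word.data[offset:offset + len(value)] = value
    if full_overwrite:
        # Every byte is new, so a fresh valid ECC can be generated.
        word.poisoned = False
    # Partial store: we cannot tell whether the bad bytes were replaced,
    # so the word stays poisoned (a new *invalid* ECC would be generated).


def read(word: EccWord) -> bytes:
    """Model a load or DMA read: poisoned data must not propagate."""
    if word.poisoned:
        raise PoisonedDataError("uncorrectable data in this 8-byte word")
    return bytes(word.data)
```

In this model a 2-byte store into a poisoned word leaves it poisoned, and only a full aligned 8-byte store clears the flag. The open question in the thread, what the OS should do when `read` fails during a write back of the page, is deliberately not answered by the sketch.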