Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Fri, 22 Mar 2024 09:31:03 -0700
Organization: None to speak of
Lines: 168
Message-ID: <87le6az0s8.fsf@nosuchdomain.example.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <20240320114218.151@kylheku.com> <uthirj$29aoc$1@dont-email.me> <20240321092738.111@kylheku.com> <87a5mr1ffp.fsf@nosuchdomain.example.com> <20240322083648.539@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="d7de0737ea53c76359917fb7cbce40ac"; logging-data="3189370"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+XxOadGbx+xfy/6XiEedhL"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:DVjokyNvL66piN0QHeWauNqzmqQ= sha1:HU95ZRBUWxmGryoU5mu8wM9X0EM=
Bytes: 8236

Kaz Kylheku <433-929-6894@kylheku.com> writes:
> On 2024-03-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> Kaz Kylheku <433-929-6894@kylheku.com> writes:
>>> On 2024-03-21, David Brown <david.brown@hesbynett.no> wrote:
>>>> On 20/03/2024 19:54, Kaz Kylheku wrote:
>>>>> On 2024-03-20, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
>>>>>> A "famous security bug":
>>>>>>
>>>>>> void f( void )
>>>>>> { char buffer[ MAX ];
>>>>>>   /* . . . */
>>>>>>   memset( buffer, 0, sizeof( buffer )); }
>>>>>>
>>>>>> . Can you see what the bug is?
>>>>>
>>>>> I don't know about "the bug", but conditions can be identified under
>>>>> which that would have a problem executing, like MAX being in excess
>>>>> of available automatic storage.
>>>>>
>>>>> If the /*...*/ comment represents the elision of some security sensitive
>>>>> code, where the memset is intended to obliterate secret information,
>>>>> of course, that obliteration is not required to work.
>>>>>
>>>>> After the memset, the buffer has no next use, so all the assignments
>>>>> performed by memset to the bytes of buffer are dead assignments that can
>>>>> be elided.
>>>>>
>>>>> To securely clear memory, you have to use a function for that purpose
>>>>> that is not susceptible to optimization.
>>>>>
>>>>> If you're not doing anything stupid, like link time optimization, an
>>>>> external function in another translation unit (a function that the
>>>>> compiler doesn't recognize as being an alias or wrapper for memset)
>>>>> ought to suffice.
>>>>
>>>> Using LTO is not "stupid". Relying on people /not/ using LTO, or not
>>>> using other valid optimisations, is "stupid".
>>>
>>> LTO is a nonconforming optimization. It destroys the concept that
>>> when a translation unit is translated, the semantic analysis is
>>> complete, such that the only remaining activity is resolution of
>>> external references (linkage), and that the semantic analysis of one
>>> translation unit does not use information about another translation
>>> unit.
>>>
>>> This has not yet changed in last April's N3096 draft, where
>>> translation phases 7 and 8 are:
>>>
>>>   7. White-space characters separating tokens are no longer significant.
>>>      Each preprocessing token is converted into a token. The resulting
>>>      tokens are syntactically and semantically analyzed and translated
>>>      as a translation unit.
>>>
>>>   8. All external object and function references are resolved. Library
>>>      components are linked to satisfy external references to functions
>>>      and objects not defined in the current translation. All such
>>>      translator output is collected into a program image which contains
>>>      information needed for execution in its execution environment.
>>>
>>> and before that, the Program Structure section says:
>>>
>>>   The separate translation units of a program communicate by (for
>>>   example) calls to functions whose identifiers have external linkage,
>>>   manipulation of objects whose identifiers have external linkage, or
>>>   manipulation of data files. Translation units may be separately
>>>   translated and then later linked to produce an executable program.
>>>
>>> LTO deviates from the model that translation units are separate,
>>> and the conceptual steps of phases 7 and 8.
>> [...]
>>
>> Link time optimization is as valid as cross-function optimization *as
>> long as* it doesn't change the defined behavior of the program.
>
> It always does; the interaction of a translation unit with another
> is an externally visible aspect of the C program. (That can be inferred
> from the rules which forbid semantic analysis across translation
> units, only linkage.)
>
> That's why we can have a real world security issue caused by zeroing
> being optimized away.
>
> The rules spelled out in ISO C allow us to unit test a translation
> unit by linking it to some harness, and be sure it has exactly the
> same behaviors when linked to the production program.
>
> If I have some translation unit in which there is a function foo, such
> that when I call foo, it then calls an external function bar, that's
> observable. I can link that unit to a program which supplies bar,
> containing a printf call, then call foo and verify that the printf call
> is executed.
>
> Since ISO C says that the semantic analysis has been done (that
> unit having gone through phase 7), we can take it for granted as a
> done-and-dusted property of that translation unit that it calls bar
> whenever its foo is invoked.

We can take it for granted that the output performed by the printf call
will be performed, because output is observable behavior. If the
external function bar is modified, the LTO step has to be redone.
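
For concreteness, the foo/bar harness scenario described above might be
sketched as follows. This is only an illustration under stated
assumptions: the names bar_calls and run_harness are hypothetical, and
the two "translation units" are collapsed into one file to keep the
sketch short, so a compiler sees both at once here. In the scenario
under discussion, foo would be compiled separately and would see only
a declaration of bar.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical harness state: counts how many times bar() has run. */
static int bar_calls = 0;

/* Harness-supplied bar(). In the scenario above this would live in its
   own translation unit, linked against the unit under test. */
void bar(void)
{
    ++bar_calls;
    printf("bar called\n");   /* output: observable behavior */
}

/* Unit under test. In the scenario above this would be a separate
   source file that sees only the declaration `void bar(void);`. */
void foo(void)
{
    bar();
}

/* Hypothetical harness driver: calls foo() once, checks that exactly
   one more bar() call was recorded, and returns the running total of
   bar() calls. */
int run_harness(void)
{
    int before = bar_calls;
    foo();
    assert(bar_calls == before + 1);
    return bar_calls;
}
```

Each run_harness() call prints one "bar called" line. That output, not
the presence of a call instruction, is what an implementation has to
preserve, with or without LTO.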
>> Say I have a call to foo in main, and the definition of foo is in
>> another translation unit. In the absence of LTO, the compiler will have
>> to generate a call to foo. If LTO is able to determine that foo doesn't
>> do anything, it can remove the code for the function call, and the
>> resulting behavior of the linked program is unchanged.
>
> There are always situations in which optimizations that have been
> forbidden don't cause a problem, and are even desirable.
>
> If you have LTO turned on, you might be programming in GNU C or Clang C
> or whatever, not standard C.
>
> Sometimes programs have the same interpretation in GNU C and standard
> C, or the same interpretation to someone who doesn't care about certain
> differences.

Are you claiming that a function call is observable behavior?

Consider:

main.c:
    #include "foo.h"
    int main(void) {
        foo();
    }

foo.h:
    #ifndef FOO_H
    #define FOO_H
    void foo(void);
    #endif

foo.c:
    void foo(void) {
        // do nothing
    }

Are you saying that the "call" instruction generated for the function
call is *observable behavior*? If an implementation doesn't generate
that "call" instruction because it's able to determine at link time that
the call does nothing, that optimization is forbidden?

I presume you'd agree that omitting the "call" instruction is allowed if
the call and the function definition are in the same translation unit.
What wording in the standard requires a "call" instruction to be
generated if they're in different translation units?

That's a trivial example, but other link time optimizations that don't
change a program's observable behavior (insert weasel words about
unspecified behavior) are also allowed.

In phase 8:

    All external object and function references are resolved. Library
    components are linked to satisfy external references to functions
    and objects not defined in the current translation.
    All such translator output is collected into a program image which
    contains information needed for execution in its execution
    environment.

I don't see anything about required CPU instructions.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */