Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Kaz Kylheku <433-929-6894@kylheku.com>
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Sun, 24 Mar 2024 01:23:00 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 88
Message-ID: <20240323173522.946@kylheku.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
 <20240320114218.151@kylheku.com> <uthirj$29aoc$1@dont-email.me>
 <20240321092738.111@kylheku.com> <87a5mr1ffp.fsf@nosuchdomain.example.com>
 <20240322083648.539@kylheku.com> <87le6az0s8.fsf@nosuchdomain.example.com>
 <20240322094449.555@kylheku.com> <87cyrmyvnv.fsf@nosuchdomain.example.com>
 <20240322123323.805@kylheku.com> <utmst2$3n7mv$2@dont-email.me>
 <20240323090700.848@kylheku.com> <utn57t$3pbh7$1@dont-email.me>
Injection-Date: Sun, 24 Mar 2024 01:23:00 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="866d3663afea439826bdeb05c35522a4";
 logging-data="4176815"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1/jQ7JGdupZntLas6HOvwdX2SsaREd9XMQ="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:tu+IITRRNrnVqlWc2KykEP+vpD8=
Bytes: 4908

On 2024-03-23, David Brown <david.brown@hesbynett.no> wrote:
> I believe we are at an impasse here, unless someone can think of a new
> point to make.

How about a completely different one about a related but separate
matter (small one).

It has occurred to me that the definition of "translation unit"
is lacking a little bit in regard to existing practice. Or that
at least it could use a footnote:

  "A source file together with all the headers and source files included
  via the preprocessing directive #include is known as a preprocessing
  translation unit."

But in fact, in actual compilers we can do something like this:

  gcc -DMAIN='int main(void) { puts("hello"); }'

and then in the source file we can have

  #include <stdio.h>
  MAIN

The point is that a translation unit's tokens can come from sources
other than a source file and its included header files.

Say we have:

  printf '#include <stdio.h>\nMAIN\n' | \
  gcc -DMAIN='int main(void) { puts("hello"); }' -x c -

The way we can subject this to a standard-based interpretation is to
identify the output of printf piped into gcc, as well as the -DMAIN
option, as being the "source file".

  "The text of the program is kept in units called source files, (or
  preprocessing files) in this document."

Thus the unit in which we are keeping the source in the above shell
script is identifiable as the content of the pipe, and the symbol MAIN.
It is understood that the MAIN symbol precedes the content of the pipe.
Those things together are the "source file".

This is all fine, but could benefit from a footnote like

  "A source file need not be a single data unit accessible by name
  in a file system. Implementations may allow situations such as source
  code being dynamically generated, or transmitted to the translator via
  an interprocess communication mechanism or network. Furthermore,
  implementations may allow some tokens of a translation unit to be
  injected via a configuration mechanism, such as command line
  arguments."

>
> One thing I would ask before leaving this - could you take a look at the
> latest draft for the next C standard after C23?
>
><https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf>

Thanks, I'm now using that, discontinuing most use of n3096.

> Look at the definitions of the "reproducible" and "unsequenced" function
> type attributes in 6.7.13.8. In particular, look at the leeway
> explicitly given to the compiler for re-arranging code in 6.7.13.8.3p6
> and similar examples. Consider how that fits (or fails to fit) with
> your interpretation of the translation phases in section 5.

These are interesting and useful attributes. They are orthogonal to the
translation unit issue though.

If we declare that a function in another translation unit is
reproducible, and we call it twice with the same arguments, then two
calls need not take place.

That is not anything like LTO: the function attribute which drives
those semantic possibilities comes from the same translation unit.

If a function is attributed as "reproducible" or "unsequenced" in
another translation unit, such that this is not visible to our current
translation unit (the header file declaration for the function omits
the attributes), then it looks like an ordinary function. If we call it
twice, it gets called twice.

There is no conflict between the semantics of these advanced attributes
and the claim that LTO is nonconforming.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
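
A minimal sketch of the attribute point above, assuming a C23 translator.
The names scale and twice are hypothetical, not from the thread, and the
REPRODUCIBLE macro is an assumed guard that expands to nothing where
[[reproducible]] is not reported as supported. What matters is that the
attribute is part of the declaration the calling translation unit sees.

```c
/* Assumption: use [[reproducible]] only in C23 mode and only when the
   translator reports support; otherwise REPRODUCIBLE expands to nothing
   and scale() is an ordinary function as far as callers can tell. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  if defined(__has_c_attribute)
#    if __has_c_attribute(reproducible)
#      define REPRODUCIBLE [[reproducible]]
#    endif
#  endif
#endif
#ifndef REPRODUCIBLE
#  define REPRODUCIBLE
#endif

/* The declaration as the calling translation unit sees it, for example
   from a header. The attribute is visible here, in this unit. */
int scale(int x) REPRODUCIBLE;

int twice(int x)
{
    /* Same argument both times. With the attribute visible, the
       translator may evaluate scale(x) once and reuse the result.
       Without it, two calls are required. */
    return scale(x) + scale(x);
}

/* Hypothetical definition. It would normally live in another
   translation unit; it is kept here only so the sketch links alone. */
int scale(int x) REPRODUCIBLE
{
    return 3 * x;
}
```

Either way twice(7) yields 42; the attribute only affects how many calls
the translator is obliged to perform. If the header declaration dropped
REPRODUCIBLE, this unit would have no basis for merging the two calls.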