Path: ...!eternal-september.org!feeder2.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Bart <bc@freeuk.com>
Newsgroups: comp.lang.c
Subject: Re: else ladders practice
Date: Sat, 23 Nov 2024 14:17:36 +0000
Organization: A noiseless patient Spider
Lines: 77
Message-ID: <vhso61$1o2of$1@dont-email.me>
References: <3deb64c5b0ee344acd9fbaea1002baf7302c1e8f@i2pn2.org> <vgdt36$2r682$2@paganini.bofh.team> <vge8un$1o57r$3@dont-email.me> <vgpi5h$6s5t$1@paganini.bofh.team> <vgtsli$1690f$1@dont-email.me> <vhgr1v$2ovnd$1@paganini.bofh.team> <vhic66$1thk0$1@dont-email.me> <vhins8$1vuvp$1@dont-email.me> <vhj7nc$2svjh$1@paganini.bofh.team> <vhje8l$2412p$1@dont-email.me> <WGl%O.42744$LlWc.33050@fx42.iad> <vhkr9e$4bje$1@dont-email.me> <vhptmn$3mlgf$1@paganini.bofh.team> <vhq6b4$17hkq$1@dont-email.me> <vhqm3l$3ntp7$1@paganini.bofh.team>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 23 Nov 2024 15:17:38 +0100 (CET)
Injection-Info: dont-email.me; posting-host="a58c04c916580acacf6def0185ed8157"; logging-data="1837839"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/JJh32PojrULEbI3vGW56n"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:bn6SbHioDm1pE2onSNUF+3ZWgEs=
In-Reply-To: <vhqm3l$3ntp7$1@paganini.bofh.team>
Content-Language: en-GB
Bytes: 4487

On 22/11/2024 19:29, Waldek Hebisch wrote:
> Bart <bc@freeuk.com> wrote:
>    clang -O3 -march=native   126112us
>    clang -O3                 222136us
>    clang -O                  225855us
>    gcc -O3 -march=native      82809us
>    gcc -O3                   114365us
>    gcc -O                    287786us
>    tcc                       757347us

You've omitted -O0 for gcc and clang. That timing probably won't be too
far from tcc, but compilation time for larger programs will be
significantly longer (eg. 10 times or more).

The trade-off then is not worth it unless you are running gcc for other
reasons (eg. for deeper analysis, or to compile less portable code that
has only been tested on or written for gcc/clang; or just an irrational
hatred of simple tools).

>
> There is some irregularity in timings, but this shows that
> factor of order 9 is possible.

That's an extreme case, for one small program with one obvious
bottleneck where it spends 99% of its time, and with little use of
memory either.

For simply written programs, the difference is more like 2:1. For more
complicated C code that makes much use of macros that can expand to lots
of nested function calls, it might be 4:1, since it might rely on
optimisation to inline some of those calls. Again, that would be code
written to take advantage of specific compilers.

But that is still computationally intensive code working on small
amounts of memory.

I have a text editor written in my scripting language. I can translate
its interpreter to C and compile with both gcc-O3 and tcc.

Then, yes, you will notice twice as much latency with the tcc
interpreter compared with gcc-O3, when doing things like
deleting/inserting lines at the beginning of a 1000000-line text file.

But typically, the text files will be 1000 times smaller; you will
notice no difference at all.

I'm not saying no optimisation is needed, ever, I'm saying that the NEED
for optimisation is far smaller than most people seem to think.

Here are some timings for that interpreter, when used to run a script to
compute fib(38) the long way:

   Interp  Built with  Timing

   qc      tcc         9.0 secs  (qc is C transpiled version)
   qq      mm          5.0       (-fn; qq is original M version)
   qc      gcc-O3      4.0
   qq      mm          1.2       (-asm)

(My interpreter doesn't bother with faster switch-based or computed-goto
based dispatchers. The choice is between a slower function-table-based
one, and an accelerated threaded-code version using inline ASM. These
are selected with -fn/-asm options. The -asm version is not JIT; it is
still interpreting a bytecode at a time).

So the fastest version here doesn't use compiler optimisation, and it's 3 times the speed of gcc-O3. My unoptimised HLL code is also only 25% slower than gcc-O3. That is for this test, but that's also one that is popular for language benchmarks.