From: Brett <ggtgp@yahoo.com>
Newsgroups: comp.arch
Subject: Re: Computer architects leaving Intel...
Date: Tue, 10 Sep 2024 18:03:01 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <vbq1kl$33csc$1@dont-email.me>
References: <2024Aug30.195831@mips.complang.tuwien.ac.at>
 <vat5ap$jthk$2@dont-email.me> <vaunhb$vckc$1@dont-email.me>
 <vautmu$vr5r$1@dont-email.me> <2024Aug31.170347@mips.complang.tuwien.ac.at>
 <vavpnh$13tj0$2@dont-email.me> <vb00c2$150ia$1@dont-email.me>
 <505954890d8461c1f4082b1beecd453c@www.novabbs.org>
 <vb0kh2$12ukk$1@dont-email.me> <vb3smg$1ta6s$1@dont-email.me>
 <vb4q5o$12ukk$3@dont-email.me> <vb6a16$38aj5$1@dont-email.me>
 <vb7evj$12ukk$4@dont-email.me> <vb8587$3gq7e$1@dont-email.me>
 <vb91e7$3o797$1@dont-email.me> <vb9eeh$3q993$1@dont-email.me>
 <vb9l7k$3r2c6$2@dont-email.me> <vba26l$3te44$1@dont-email.me>
 <vbag2s$3vhih$1@dont-email.me> <vbbnf9$8j04$1@dont-email.me>
 <vbbsl4$9hdg$1@dont-email.me> <vbcbob$bd22$3@dont-email.me>
 <vbcob9$dvp4$1@dont-email.me> <vbd174$eulp$1@dont-email.me>
 <vbm67e$2apse$1@dont-email.me> <vbmkln$2cmfo$1@dont-email.me>
 <vbni3u$2h7pp$1@dont-email.me> <vbp3jl$2subi$1@dont-email.me>

Terje Mathisen <terje.mathisen@tmsw.no> wrote:
> Brett wrote:
>> David Brown <david.brown@hesbynett.no> wrote:
>>> Often you get the most efficient results by writing code clearly and
>>> simply so that the compiler can understand it better and generate good
>>> object code. This is particularly true if you want the same source to
>>> be used on different targets or different variants of a target - few
>>> people can track the instruction scheduling and timings on multiple
>>> processors better than a good compiler. (And the few people who /can/
>>> do that spend their time chatting in comp.arch instead of writing
>>> code...) When you do hand-made micro-optimisations, these can work
>>> against the compiler and give poorer results overall.
>>
>> I know of no example where hand-optimized code does worse on a newer CPU.
>> A newer CPU with bigger OoOe will effectively unroll your code and
>> schedule it even better.
>
> Not true:
>
> My favorite benchmark program for 20+ years was Word Count, I
> re-optimized that for every new x86 generation, and on the Pentium I got
> it to run at 1.5 clock cycles per character (40 MB/s on a 60 MHz Pentium).
>
> When the PentiumPro came out, it did a 10-20 cycle stall for every pair
> of characters, so about an order of magnitude slower in cycle count.
> (But only about 3X clock time due to being 200 instead of 60 MHz.)

But how big a slowdown did the unoptimized code get?

Are you describing a glass jaw in handling unpredictable branches on a CPU
with a much longer pipeline?

A shorter pipeline with better worst-case handling is going to do better,
even if it is older. Intel was going for high-clock benchmark speed, not
performance.

>> It’s older lesser CPUs where your hand-optimized code might fail hard, and
>> I know of few examples of that. None actually.
>>
>>> This is especially the case
>>> when code is moved around with inlining, constant propagation,
>>> unrolling, link-time optimisation, etc.
>>>
>>> Long ago, it was a different matter - then compilers needed more help to
>>> get good results. And compilers are far from perfect - there are still
>>> times when "smart" code or assembly-like C is needed (such as when
>>> taking advantage of some vector and SIMD facilities).
>>
> Right.
>
> Terje
>
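
The Word Count figures quoted above are easy to sanity-check. Below is a minimal sketch in Python. It rests on three assumptions of mine that are not stated in the thread: one character is one byte, "MB" means 10^6 bytes, and "about an order of magnitude" is read as a factor of 10. The function and variable names are illustrative only.

```python
def mb_per_second(clock_mhz, cycles_per_char):
    """Throughput in millions of characters per second.

    Assumption: 1 character == 1 byte, so this is also MB/s (10**6 bytes).
    """
    return clock_mhz / cycles_per_char


# Pentium: 1.5 cycles per character at 60 MHz.
pentium_mb_s = mb_per_second(60, 1.5)

# PentiumPro: assume the cycle count per character grew by a factor of 10
# ("about an order of magnitude"), while the clock went from 60 to 200 MHz.
cycle_slowdown = 10.0
ppro_mb_s = mb_per_second(200, 1.5 * cycle_slowdown)

# Wall-clock slowdown = cycle slowdown divided by the clock-speed gain.
wall_clock_slowdown = cycle_slowdown * 60 / 200

print(f"Pentium, 60 MHz:      {pentium_mb_s:.1f} MB/s")
print(f"PentiumPro, 200 MHz:  {ppro_mb_s:.1f} MB/s")
print(f"Wall-clock slowdown:  {wall_clock_slowdown:.1f}x")
```

On those assumptions the Pentium figure comes out at 40 MB/s and the PentiumPro wall-clock penalty at 3x, which agrees with the numbers in the quoted post. It says nothing about how the unoptimized code fared, which is the question raised in the reply.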