Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Thomas Koenig <tkoenig@netcologne.de>
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 20 Jun 2024 16:41:16 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <v51m3c$2lh9s$1@dont-email.me>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com> <v02eij$6d5b$1@dont-email.me>
 <152f8504112a37d8434c663e99cb36c5@www.novabbs.org> <v04tpb$pqus$1@dont-email.me>
 <v4f5de$2bfca$1@dont-email.me> <jwvzfrobxll.fsf-monnier+comp.arch@gnu.org>
 <v4f97o$2bu2l$1@dont-email.me> <613b9cb1a19b6439266f520e94e2046b@www.novabbs.org>
 <v4hsjk$2vk6n$1@dont-email.me> <6b5691e5e41d28d6cb48ff6257555cd4@www.novabbs.org>
 <v4tfu3$1ostn$1@dont-email.me> <96280554541a8a9b1a29a5cbd5b7c07b@www.novabbs.org>
 <v4v3ot$22rd9$1@dont-email.me> <99fe225bd3d326964ec86862fe38a437@www.novabbs.org>
Injection-Date: Thu, 20 Jun 2024 18:41:17 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="651413151ef30955dbe82fcdbc24a100";
 logging-data="2803004"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1/mPs+SDtC23/h51/C3o9ANl9JzHc74nxU="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:9sIy3q8yYpgvKFQehny8XSQ89Gw=
Bytes: 2444

MitchAlsup1 <mitchalsup@aol.com> wrote:

> Now, if &b[1] happens to = &a[0], then your construction fails while VVM
> succeeds--it just runs slower because there IS a dependency checked by
> HW and enforced. In those situations where the dependency is nonexistent,
> then the loop vectorizes--and the programmer remains blissfully unaware.

The performance loss can be significant, unfortunately, depending
on the ratio of the width of the data in question to the width
of the SIMD unit which actually performs the operation.

In the case of 8-bit data and 256-bit wide SIMD, this would be
a factor of 32, which could lead to a slowdown by a factor of...
25, maybe?  This would be enough to trigger bug reports, I can
tell you from experience :-)

One technique that could get around that would be loop reversal,
with a branch to the correct loop at runtime (or a predicate
choosing the right values for the loop constants).

An option to raise an exception when there is a slowdown due to
loops running in the wrong direction could be helpful in this context.
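
To make the loop-reversal idea concrete, here is a minimal sketch in C.
It is not taken from the thread: the function name add_const_overlap,
the 8-bit element type and the "add a constant" body are hypothetical,
chosen only for illustration. It also assumes that the intended result
is for every output element to be computed from the original input
value (as with memmove or a Fortran array assignment), because only
then is reversing the loop legal. The runtime branch picks the
direction in which no store can overwrite an input that has not been
read yet, so neither loop carries a true dependency and each one should
be a candidate for full-width vectorization.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): dst[i] = src[i] + k for
   i in [0, n).  dst and src may overlap; the intended result is that
   every dst[i] is computed from the ORIGINAL src[i]. */
void add_const_overlap(uint8_t *dst, const uint8_t *src, size_t n, uint8_t k)
{
    /* Compare addresses as integers; this assumes a flat address space. */
    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* dst is at or below src: going forward, each store lands at an
           address below every element that is still to be read. */
        for (size_t i = 0; i < n; i++)
            dst[i] = (uint8_t)(src[i] + k);
    } else {
        /* dst is above src (e.g. dst == src + 1): a forward loop would
           overwrite inputs before reading them, so run backwards. */
        for (size_t i = n; i-- > 0; )
            dst[i] = (uint8_t)(src[i] + k);
    }
}
```

With buf = {1, 2, 3, 4, 5, 0}, calling add_const_overlap(buf + 1, buf, 5, 10)
takes the backward branch and leaves buf = {1, 11, 12, 13, 14, 15}.
A predicated variant would instead keep one loop and select the start
index and the stride (+1 or -1) from the same address comparison.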