Path: ...!news.nobody.at!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Thomas Koenig <tkoenig@netcologne.de>
Newsgroups: comp.arch
Subject: Re: My 66000 and High word facility
Date: Sun, 11 Aug 2024 08:33:47 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <v99t1b$271h3$1@dont-email.me>
References: <v98asi$rulo$1@dont-email.me> <38055f09c5d32ab77b9e3f1c7b979fb4@www.novabbs.org> <v991kh$vu8g$1@dont-email.me>
Injection-Date: Sun, 11 Aug 2024 10:33:47 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="be1d890c76466cef743ef65763f1d3fc"; logging-data="2328099"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18WacWnmPAfYz+YxJ0ARlHC5lytnN6ilq4="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:lOkHyFzxxcUIb9pCsO1b+siWOAo=
Bytes: 2055

Brett <ggtgp@yahoo.com> wrote:

> Compilers love unrolling loops because it saves an instruction, which for a
> short loop could mean 10% faster. Point out your code has more unrolls and
> performance.

If you want to look at what the compiler for My 66000 does, it can
be found at https://github.com/bagel99/llvm-my66000 . Installation
is a bit cumbersome, but manageable.

Speaking as somebody who neither designed the ISA nor wrote the
compiler port: The Virtual Vector Method (VVM) makes unrolling
vectorized loops unprofitable; all you "gain" from unrolling those
is increased register pressure and code size.

Having constants in the instruction stream also reduces register
pressure. In the beginning, I had my doubts that 32 general
registers which are also used for floating point are enough, but
looking at generated code convinced me.

Unrolling in the presence of VVM is not that easy. Non-vectorizable
loops can still be profitable to unroll, as can outer loops. But
when working with an existing compiler which has assumptions about
currently available architectures baked in, this is quite difficult.
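
To make the trade-off concrete, here is a minimal C sketch of the same
reduction written rolled and unrolled by four. It is not taken from the
My 66000 compiler or its output; the function names sum_rolled and
sum_unrolled4 are made up for illustration. The rolled form is the kind
of simple loop a vectorizer would look at. The unrolled form pays the
loop overhead once per four elements, but keeps four accumulators live
and needs a remainder loop, which is where the extra register pressure
and code size come from.

```c
#include <assert.h>
#include <stddef.h>

/* Rolled loop: one add per element plus the loop overhead
   (increment, compare, branch), with a single live accumulator. */
long sum_rolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by four: the loop overhead is paid once per four elements,
   but four accumulators are live at the same time and a remainder
   loop is needed for the last n % 4 elements. */
long sum_unrolled4(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* remainder */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

Both functions return the same result for integer input; only the
instruction count per element and the number of live values differ.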