| Deutsch English Français Italiano |
|
<2024Aug1.174030@mips.complang.tuwien.ac.at> View for Bookmarking (what is this?) Look up another Usenet article |
Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail From: anton@mips.complang.tuwien.ac.at (Anton Ertl) Newsgroups: comp.arch Subject: Re: Array vs. Linked List Traversal Date: Thu, 01 Aug 2024 15:40:30 GMT Organization: Institut fuer Computersprachen, Technische Universitaet Wien Lines: 34 Message-ID: <2024Aug1.174030@mips.complang.tuwien.ac.at> References: <v8g7lr$27dq5$1@dont-email.me> Injection-Date: Thu, 01 Aug 2024 17:53:07 +0200 (CEST) Injection-Info: dont-email.me; posting-host="9c6cb8df3cbe72c15b04922cd88fdb90"; logging-data="2345580"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18biqGqodDKrQquLunoe1Ah" Cancel-Lock: sha1:r1igCagJ1gZecINMrGP6yYKwJOA= X-newsreader: xrn 10.11 Bytes: 2334 jseigh <jseigh_es00@xemaps.com> writes: >Array elements being in cache, assuming the element size is small >enough to fit multiple elements in a cache line? Yes. Even without, hardware branch predictors will see the stride and prefetch the cache lines appropriately. >No dependent load delay from having to load the linked list next >pointer? If that was the case, then moving the load of the next >pointer to the beginning of the loop should ameliorate that somewhat. Depends on how much work you have to do per element. If it's just a little work, like adding up one field of each element, or searching for a particular value of a field, the linked-list will mean that each iteration takes 4-5 cycles on a modern CPU even on a D-cache hit, while such loops may take 1 cycle/iteration or less if an array is used. If you have more than 5 cycles of work per element, then that's not an issue (as long as the elements are in the D-cache or the linked-list layout works for the hardware prefetchers), but modern CPUs can do a lot in 5 cycles. Rearranging the load of the next pointer will typically not make a big difference on OoO CPUs, unless you have a lot of branch mispredictions. >Something else? Please limit your lines to around 70 characters. I reformatted the quoted material. - anton -- 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.' Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>