Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.arch
Subject: Re: Segments
Date: Thu, 16 Jan 2025 22:23:38 +0100
Organization: A noiseless patient Spider
Lines: 84
Message-ID: <vmbtcq$3lp99$1@dont-email.me>
References: <vdlgl9$3kq50$2@dont-email.me> <vdtmv9$16lu8$1@dont-email.me>
 <2024Oct6.150415@mips.complang.tuwien.ac.at> <vl7m2b$6iat$1@paganini.bofh.team>
 <2025Jan3.093849@mips.complang.tuwien.ac.at> <vlcddh$j2gr$1@paganini.bofh.team>
 <2025Jan5.121028@mips.complang.tuwien.ac.at> <vleuou$rv85$1@paganini.bofh.team>
 <ndamnjpnt8pkllatkdgq9qn2turaao1f0a@4ax.com>
 <2025Jan6.092443@mips.complang.tuwien.ac.at> <vlgreu$1lsr9$1@dont-email.me>
 <vlhjtm$1qrs5$1@dont-email.me> <bdZeP.23664$Hfb1.16566@fx46.iad>
 <vlj1pg$25p0e$1@dont-email.me> <87cygo97dl.fsf@nosuchdomain.example.com>
 <vm7mvi$2rr87$1@dont-email.me> <vmaig9$3ehn7$1@dont-email.me>
 <vmat2e$3geg9$1@dont-email.me> <874j1y8nxy.fsf@nosuchdomain.example.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 16 Jan 2025 22:23:39 +0100 (CET)
Injection-Info: dont-email.me; posting-host="274f6314c3d31355a7f854f44f61af25";
 logging-data="3859753"; mail-complaints-to="abuse@eternal-september.org";
 posting-account="U2FsdGVkX1/1HnvIeY2U6HP+rJA8LCKBmH8td0JXG7A="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:WEOGJB2y+w4GkA1n99utZC5Nyhs=
Content-Language: en-GB
In-Reply-To: <874j1y8nxy.fsf@nosuchdomain.example.com>
Bytes: 6023

On 16/01/2025 22:10, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 16/01/2025 10:11, Terje Mathisen wrote:
>>> Thomas Koenig wrote:
>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>>>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>>> [...]
>>>>>> CHERI targets C, which on the one hand, I understand (there's a
>>>>>> ton of C code out there), but trying to retrofit a safe memory
>>>>>> model onto C seems a bit awkward - it might have been better to
>>>>>> target a language which has arrays in the first place, unlike C.
>>>>> [...]
>>>>>
>>>>> C does have arrays.
>>>>
>>>> Sort of - they decay into pointers at first sight.
>>>>
>>>> But what I should have written was "multi-dimensional arrays",
>>>> with a reasonable way of handling them.
>>>>
>>> Rust provides an interesting data point here:
>>> It has Vec<> which is always implemented as a dope vector, i.e. a
>>> header which contains the starting point and current length, along
>>> with allocated size. For multidimensional work, the natural mapping
>>> is Vec<Vec<>>, i.e. similar to classic C arrays of arrays, but with
>>> boundary checking.
>>> However, in my own testing I have found that it is often faster to
>>> flatten those multi-dim vectors, and instead use explicit
>>> multiplication to get the actual position:
>>> array[y][x] -> array[y*width + x]
>
> Note that this will inhibit bounds checking on the inner dimension.
> That might be part of the reason for the improvement in speed.
>
> For example, given int array[10][10], array[0][10] is out of bounds,
> even if it logically refers to the same location as array[1][0]. This
> results in undefined behavior in C, and perhaps some kind of exception
> in a language that requires bounds checking. If you do this manually by
> defining a 1d array, any checking applies only to the entire array.
>
>> That does not surprise me. Vec<> in Rust is very similar to
>> std::vector<> in C++, as far as I know (correct me if that's wrong).
>> So a vector of vectors of int is not contiguous or consistent - each
>> subvector can have a different current size and capacity.
>> Doing a bounds check for accessing xs[i][j] (or in C++ syntax,
>> xs.at(i).at(j) when you want bounds checking) means first reading the
>> current size member of the outer vector, and checking "i" against
>> that. Then xs[i] is found (by adding "i * sizeof(vector)" to the data
>> pointer stored in the outer vector). That is looked up to find the
>> current size of this inner vector for bounds checking, then the actual
>> data can be found.
>
> I'm not familiar with Rust's Vec<>, but C++'s std::vector<> guarantees
> that the elements are stored contiguously. But the std::vector<> object
> itself doesn't contain those elements; it's a fixed-size chunk of data
> (basically a struct in C terms) whose size doesn't change regardless of
> the number of elements (and typically regardless of the element type).
> So a std::vector<std::vector<int>> will result in the data for each row
> being stored contiguously, but the rows themselves will be allocated
> dynamically.
>

Yes, exactly.

Of course you could do as Terje did in Rust - make a single
std::vector<> of N x M elements (N rows of M elements each) and do the
"i * M + j" calculation manually. Now that C++23 has a multi-parameter
subscript operator, you can do that quite neatly in a little wrapper
class around a std::vector<> with a nice access operator.

But it's still more efficient to use a std::array<> if you know the
sizes at compile time.

>> This is /completely/ different from classic C multi-dimensional
>> arrays. It is more akin to a one-dimensional C array of pointers to
>> individually allocated one-dimensional C arrays - but even less
>> efficient due to an extra layer of indirection.
>>
>> If you know the size of the data at compile time, then in C++ you have
>> std::array<> where the information about size is carried in the type,
>> with no run-time cost.
>> A nested std::array<> is a perfectly good and efficient
>> multi-dimensional array with runtime bounds checking if you want to
>> use it, as well as having value semantics (no decay to pointer types
>> in expressions). I would guess there is something equivalent in Rust?
>
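
To make the wrapper idea concrete, here is a minimal sketch of what such a class might look like. The names Grid2D and Matrix are purely illustrative and appear nowhere in the thread; the layout is assumed to be row-major with "rows" rows of "cols" elements. The C++23 multi-parameter operator[] is guarded by the standard feature-test macro, so the same code should still compile under older standards, where only at() and operator() are available.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical wrapper (illustrative name): one contiguous std::vector<T>
// holding rows * cols elements in row-major order.
template <typename T>
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols, T init = T{})
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}

    // Checked access: each index is tested against its own dimension, so
    // at(0, cols) throws instead of silently aliasing element (1, 0).
    T& at(std::size_t i, std::size_t j) {
        if (i >= rows_ || j >= cols_) throw std::out_of_range("Grid2D::at");
        return data_[i * cols_ + j];
    }

    // Unchecked access via the call operator; available in any C++ standard.
    T& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }

#if defined(__cpp_multidimensional_subscript)
    // C++23 only: multi-parameter subscript, allowing g[i, j].
    T& operator[](std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
#endif

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};

// When the sizes are known at compile time, a nested std::array carries them
// in the type; m.at(i).at(j) is checked, m[i][j] is not.
template <typename T, std::size_t Rows, std::size_t Cols>
using Matrix = std::array<std::array<T, Cols>, Rows>;
```

With this sketch, Grid2D<int> g(3, 4); g.at(1, 2) = 42; writes one element of a single allocation, and g.at(0, 4) throws std::out_of_range because the inner index is checked separately, which a bare flattened vector would not do. Whether that per-dimension check is worth its cost depends on the workload; dropping it recovers the speed of the plain "i * cols + j" approach.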