From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Got Quake 2 running on my MRISC32 FPGA computer
Date: Mon, 16 Dec 2024 13:57:58 -0600
Organization: A noiseless patient Spider
Message-ID: <vjq0o9$19dn3$1@dont-email.me>
References: <vjnouf$po8u$1@dont-email.me>

On 12/15/2024 5:32 PM, Marcus wrote:
> Some progress...
>
> Earlier this year I spent some time porting Quake 2 to my MRISC32 based
> computer. It required some refactoring since Quake 2 used a modular
> rendering and game logic system based on dynamically loaded libraries
> (DLLs). My computer isn't that fancy, so I had to get everything
> statically linked into a single executable ELF32 binary (and the
> Quake 2 source code didn't support that at all).
>
> My patched source code: https://gitlab.com/mbitsnbites/mc1-quake2
>
> When I finally got a working build, it only worked in my simulator but
> not on my FPGA board, so I dropped the effort.
>
> Yesterday, however, I went and bumped my GNU toolchain to GCC 15.x and
> fixed a few bugs in my MRISC32 back end, and lo and behold, the binary
> actually started working on the FPGA (not sure if it was a compiler bug
> or if it's a CPU implementation bug that got hidden by the compiler
> update).
>
> Video: https://vimeo.com/1039476687
>
> It's not much (about 10 FPS at 320x180 resolution), but at least it's
> progress.
>

Yeah, I am still mostly limited to single-digit framerates in Quake 1,
and pretty much entirely unplayable framerates in Quake 3.

Not tried porting Quake 2 yet.

Though, if I did so, it may make sense to leave it as 256 color and then
convert to hi-color later. In my Quake 1 port, I had modified the
software renderer to internally operate directly on hi-color, which
potentially increases the amount of cache pressure and similar.

Note that Quake 3 had been delayed for a while in my case due to needing
virtual memory and DLL loading, but I have these now. But, now there is
the chaos of me trying to turn things into proper user-mode (so, ATM,
Quake 3 is broken again due to how I needed to link things to make it
work).

And, the partial irony that GLQuake is slightly faster than SW Quake,
though my GLQuake port had been modified to use static vertex lighting.
This information isn't stored in the BSP, so it needs to be regenerated
on map load, but it isn't super accurate (and not as good looking as the
vertex lighting mode in Quake 3).

There is a hardware rasterizer module, which helps "slightly", but in my
OpenGL implementation, the bigger limiting factor is the front-end
geometry transform. My HW rasterizer basically only does edge-walking,
so a lot of the heavy lifting (transform/projection and geometric
subdivision) needs to be done CPU side.

Generally, it seems the rasterizer module rasterizes things faster than
the CPU side code can feed requests into it.

...

Oh well, I am distracted some.

I am working on trying to build a specialized printer which might
(eventually) be used for printed electronics.

At present, I am building it by CNC converting an X/Y table, and using
syringe pumps for ink (got some 100ml syringes to use for this; will be
driven using NEMA-17 steppers, as these seem to be the cheapest option).
Current idea for the print-head is to use 4 22ga blunt-tip needles,
spaced roughly 0.25" apart (will be offset in software to re-align the
layers).

For the synthesis, had the idea that rather than trying to use large
tiles representing more complex logic faking an FPGA (such as LUT4s or
LUTRAMs), it may make more sense to first decompose these into logic
gates, and then do all the final "place and route" stuff mostly at the
level of logic gates.

Would likely have: AND, OR, NAND, NOR, BUF, NOT.

May skip the XOR gate, as it is larger than the others. To include an
XOR gate as a fundamental gate and try to lay out everything on a grid
would require making everything else bigger.

In terms of truth tables (outputs only, for inputs B,A = 00, 01, 10, 11):
  0 0 0 0: Zero, resistor to GND.
  0 0 0 1: AND
  0 0 1 0: AND (~A)
  0 0 1 1: Input B (BUF)
  0 1 0 0: AND (~B)
  0 1 0 1: Input A (BUF)
  0 1 1 0: XOR
  0 1 1 1: OR
  1 0 0 0: NOR
  1 0 0 1: XNOR
  1 0 1 0: NOT Input A (NOT)
  1 0 1 1: OR (~A)
  1 1 0 0: NOT Input B (NOT)
  1 1 0 1: OR (~B)
  1 1 1 0: NAND
  1 1 1 1: HI (resistor to Vcc)

Pretty much all of the more complex logic elements can be synthesized
from logic gates.


Also recently got around to implementing an experimental filesystem.

General:
  Superblock: Follows a similar pattern to NTFS;
  Inode table: Represents itself as an inode (0);
  Inodes are currently 256 or 512 bytes.
Block indexing:
  Normal files (32-bit block numbers):
    16 direct blocks
    8 indirect 1-level blocks
    4 indirect 2-level blocks
    2 indirect 3-level blocks
    1 indirect 4-level block
    1 indirect 5-level block
  Compressed / large volume, 256 byte inode:
    8 direct blocks
    4 indirect 1-level blocks
    1 indirect 2-level block
    1 indirect 3-level block
    1 indirect 4-level block
    1 indirect 5-level block
  Compressed / large volume, 512 byte inode:
    16 direct blocks
    8 indirect 1-level blocks
    4 indirect 2-level blocks
    1 indirect 3-level block
    1 indirect 4-level block
    1 indirect 5-level block
    1 indirect 6-level block

Blocks are allocated via a bitmap, and assigned into table indices
within the inodes.

The 32-bit index has 5 levels, though the 5th level is kind of overkill,
as with the currently valid block sizes it will not be used (4 levels
could address the entire range of 2^32 blocks, or 4TB with 1K blocks).
So, it mostly just serves to pad the structure to 128 bytes.

For compressed files, 64-bit block index numbers would be used. Similar
for large volumes with more than 2^32 blocks. At present, the largest
valid volume size (at 1K blocks) would be 256PB.

This would be larger than the current EXTn filesystems, which seem to
still be limited to 32-bit block numbers (~ 16TB with 4K blocks).
Though, I have yet to confirm whether EXTn actually has such a limit.

For compressed files, the high order bits of the block numbers would be
used for some additional metadata (such as the span of disk blocks for
larger compressed blocks; or the location of the packed-data within
packed blocks, *).

*: Block compression may use one of several strategies:
  Store: Stores the block raw / uncompressed as N disk blocks.
  Compressed: Compressed data stored as a smaller number of disk blocks.
  Packed: Compressed data is held within data in another inode.
    May be used for small or highly
  Skip: Block was all zeroes, so index entry is left as 0.
    No disk storage is used for compressed all-zero blocks.
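
As a quick check on the arithmetic for the 32-bit index layout above, here is a small Python sketch. It is not taken from the filesystem's actual code: the function names, the list-based layout description, and the assumption that each indirect block is completely filled with fixed-size block numbers are all illustrative. It only reproduces the figures stated in the post (128-byte index table, 4 levels covering 2^32 blocks at 1K block size).

```python
# Sketch only: names and structure here are illustrative assumptions,
# not the filesystem's actual code.

def index_fanout(block_size, entry_size):
    """Number of block-number entries that fit in one indirect block."""
    return block_size // entry_size

def addressable_blocks(slots_per_level, block_size, entry_size):
    """Total data blocks reachable from an inode's index table.

    slots_per_level[k] is the number of inode slots that use k levels
    of indirection (k = 0 means a direct block pointer).
    """
    fanout = index_fanout(block_size, entry_size)
    return sum(count * fanout ** level
               for level, count in enumerate(slots_per_level))

# Layout for normal files as listed in the post: 16 direct slots, then
# 8/4/2/1/1 slots with 1..5 levels of indirection, 4-byte entries.
LAYOUT_32BIT = [16, 8, 4, 2, 1, 1]

BLOCK_SIZE = 1024   # 1K blocks, as in the post's example
ENTRY_SIZE = 4      # 32-bit block numbers

# Size of the index table inside the inode.
table_bytes = sum(LAYOUT_32BIT) * ENTRY_SIZE

# Blocks reachable through the single 4-level slot alone.
four_level_slot = index_fanout(BLOCK_SIZE, ENTRY_SIZE) ** 4

# Blocks reachable when the 5-level slot is ignored entirely.
reach_without_level5 = addressable_blocks(
    LAYOUT_32BIT[:5], BLOCK_SIZE, ENTRY_SIZE)

print("index table size in inode:", table_bytes, "bytes")
print("blocks under the 4-level slot:", four_level_slot)
print("covers 2^32 blocks without level 5:",
      reach_without_level5 >= 2 ** 32)
print("bytes addressed by 2^32 1K blocks:", 2 ** 32 * BLOCK_SIZE)
```

With 1K blocks and 4-byte entries the fanout is 256, so the single 4-level slot already spans 256^4 = 2^32 blocks, which matches the remark that the 5th level mainly pads the table. The same helper can be called with an 8-byte entry size and the 64-bit layouts to explore those cases.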

I am debating whether to use/allow block-skipping for non-compressed

========== REMAINDER OF ARTICLE TRUNCATED ==========