From: Mild Shock <janburse@fastmail.fm>
Newsgroups: sci.math
Subject: Re: Memory Powering the AI Revolution
Date: Thu, 16 Jan 2025 11:07:51 +0100
Message-ID: <vmalpk$1pah$6@solani.org>
References: <vmalmh$1pah$2@solani.org>
In-Reply-To: <vmalmh$1pah$2@solani.org>

See also: The Special Memory Powering the AI Revolution
https://www.youtube.com/watch?v=yAw63F1W_Us

Mild Shock wrote:
> I currently believe that one of the fallacies
> around LLMs is the assumption that learning
> generates small, lightweight NNs (Neural Networks),
>
> which are then subject to blurred categories and
> approximate judgments. But I guess it is quite
> different: learning generates very large, massive NNs,
>
> which can afford to represent ontologies precisely
> and with breadth. But how is that done? One puzzle
> piece could be a new type of memory, so-called
>
> High-Bandwidth Memory (HBM), an architecture where
> DRAM dies are vertically stacked and connected using
> Through-Silicon Vias (TSVs). It is found, for example,
>
> in NVIDIA GPUs like the A100 and H100. Compare that to
> the DDR3 you might find in your laptop or PC. Could it
> give you a license to trash L1/L2 caches with your
> algorithms?
>
>                     HBM3                     DDR3
> Bandwidth           1.2 TB/s (per stack)     12.8 GB/s to 25.6 GB/s
> Latency             Low, optimized for       Higher latency
>                     real-time tasks
> Power efficiency    More efficient           Higher power consumption
>                     despite high speeds      than HBM3
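To make the cache question concrete, here is a minimal sketch of a
bandwidth-bound streaming copy in CUDA. The kernel name, launch
configuration and buffer size are illustrative choices of mine, not
from any vendor benchmark. Because a streaming copy touches each
element exactly once, there is no reuse for L1/L2 caches to exploit:
its throughput is set almost entirely by DRAM bandwidth, which is why
the same pattern should land near the HBM peak on an A100/H100 but
only near the table's figures on a DDR3-class memory system.

    // Minimal sketch, not a tuned benchmark.
    // Build with: nvcc -O2 stream_copy.cu
    #include <cstdio>
    #include <cuda_runtime.h>

    // Grid-stride streaming copy: every element is read and written
    // exactly once, so there is no cache reuse -- throughput is set
    // by DRAM bandwidth.
    __global__ void stream_copy(const float *src, float *dst, size_t n) {
        for (size_t i = blockIdx.x * blockDim.x + threadIdx.x;
             i < n;
             i += (size_t)gridDim.x * blockDim.x)
            dst[i] = src[i];
    }

    int main() {
        const size_t n = 1 << 26;      // 64 M floats, 256 MiB per buffer
        float *src, *dst;              // error checking omitted for brevity
        cudaMalloc(&src, n * sizeof(float));
        cudaMalloc(&dst, n * sizeof(float));

        stream_copy<<<1024, 256>>>(src, dst, n);   // warm-up launch

        cudaEvent_t t0, t1;
        cudaEventCreate(&t0);
        cudaEventCreate(&t1);
        cudaEventRecord(t0);
        stream_copy<<<1024, 256>>>(src, dst, n);
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);

        float ms = 0.0f;
        cudaEventElapsedTime(&ms, t0, t1);
        // Factor 2: n floats read plus n floats written.
        printf("effective bandwidth: %.1f GB/s\n",
               2.0 * n * sizeof(float) / (ms * 1e-3) / 1e9);

        cudaFree(src);
        cudaFree(dst);
        return 0;
    }

Run on an HBM part, the printed figure should be on the order of a
TB/s; an equivalent CPU loop over the same buffers on a DDR3-era box
would top out around the 12.8 to 25.6 GB/s shown in the table above.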