Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!nntp.TheWorld.com!usenet.csail.mit.edu!.POSTED.hergotha.csail.mit.edu!not-for-mail
From: wollman@hergotha.csail.mit.edu (Garrett Wollman)
Newsgroups: rec.arts.sf.written
Subject: Re: ongoing infrastructure changes with AI in the USA
Date: Fri, 15 Nov 2024 20:23:10 -0000 (UTC)
Organization: MIT Computer Science & Artificial Intelligence Lab
Message-ID: 
References: 
Injection-Date: Fri, 15 Nov 2024 20:23:10 -0000 (UTC)
Injection-Info: usenet.csail.mit.edu; posting-host="hergotha.csail.mit.edu:207.180.169.34"; logging-data="14485"; mail-complaints-to="security@csail.mit.edu"
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: wollman@hergotha.csail.mit.edu (Garrett Wollman)
Bytes: 3018
Lines: 46

In article , D wrote:
>On Thu, 14 Nov 2024, Scott Lurndal wrote:
>> Lynn McGuire writes:
>>> I am on the periphery of the ongoing blanketing of the USA with AI
>>> servers.  I have a few facts that might just blow you away.
>>>
>>> The expected number of AI servers in the USA alone is presently a
>>> million (SWAG).  The current cost for a single AI server is
>>> $500,000 US.  1,000,000 x $500,000 = $500 billion US of capital.
>>
>> First, what is your source for this data?  Be specific.
>>
>> Second, define precisely what an "AI server" is.
>
>My guess would be a server stuffed with GPUs.

As someone who is currently involved with such things: while I'm not
at liberty to comment on pricing, a typical "state of the art" compute
node for a machine-learning cluster might include:

- multiple CPUs with many cores each
- a lot of RAM
- a lot of very fast solid-state disk
- 8 H200 GPUs (also many cores each)
- 8 ports of 400G InfiniBand
- 2 ports of 100G or 200G Ethernet
- about 30 kW in power supplies
- enough cooling (fans and/or liquid cooling) to dissipate 30 kW of
  waste heat

Because most of the ML work ("AI" inference or training) happens on
the GPUs, there is typically only enough CPU to handle the I/O load.
The InfiniBand is used exclusively for low-latency GPU-to-GPU
communication across the cluster; regular ingress and egress happen
over Ethernet.

While I can't comment on specific costs, I will say that the retail
price is far higher than the bill-of-materials (BOM) cost, and most
of that margin stays in the pockets of Nvidia.

-GAWollman

-- 
Garrett A. Wollman    | "Act to avoid constraining the future; if you can,
wollman@bimajority.org| act to remove constraint from the future. This is
Opinions not shared by| a thing you can do, are able to do, to do together."
my employers.         | - Graydon Saunders, _A Succession of Bad Days_ (2015)
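
P.S. For anyone who wants to check the arithmetic quoted upthread,
here it is as a short Python sketch.  Every input is an assumption
taken from the thread, not vendor data: the server count and
per-server cost are Lynn's SWAGs, and the 30 kW figure is the
per-node power from the list above.

    # Napkin math from the numbers in this thread -- all of these
    # inputs are assumptions quoted upthread, not vendor data.
    servers = 1_000_000        # Lynn's SWAG: AI servers in the USA
    cost_per_server = 500_000  # USD; Lynn's SWAG per server
    node_power_kw = 30         # per-node power from the spec above

    capex_usd = servers * cost_per_server
    fleet_power_gw = servers * node_power_kw / 1_000_000  # kW -> GW

    print(f"capital:     ${capex_usd / 1e9:,.0f} billion")  # $500 billion
    print(f"fleet power: {fleet_power_gw:.0f} GW")          # 30 GW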