Path: ...!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail
From: Mild Shock
Newsgroups: comp.lang.prolog
Subject: Microsoft is plagiarizing my Invention [LLMs under the hood]
Date: Tue, 8 Oct 2024 16:00:56 +0200
Message-ID:
References: <1b7ce2bd-722b-4c2e-b853-12fc2232752bn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 8 Oct 2024 14:00:54 -0000 (UTC)
Injection-Info: solani.org; logging-data="183591"; mail-complaints-to="abuse@news.solani.org"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 SeaMonkey/2.53.19
Cancel-Lock: sha1:JUg5xLbV9H6DP3BQo+QmAI49FtI=
X-User-ID: eJwNyskBwDAIA7CVwmXIOIWa/Udo9VYYBJOOgMfGrnrTMojKaqEg04NXW1E2z/xL5yBKlVxhX6NH9pEXgH9CjxTm
In-Reply-To:
Bytes: 2013
Lines: 24

I will probably never get a Turing Award or anything of the sort for
what I did 23 years ago. So why is its reading count on ResearchGate
suddenly going up?

Knowledge, Planning and Language, November 2001

I guess it is because of this: the same topic is tackled by Microsoft's
recent GRIN model. Shit. I really should find an investor and pump up
a start-up!

"Mixture-of-Experts (MoE) models scale more effectively than dense
models due to sparse computation through expert routing, selectively
activating only a small subset of expert modules."

https://arxiv.org/pdf/2409.12136

But somehow I am happy with my dolce vita as it is now...
Or maybe I am deceiving myself?

P.S.: From the GRIN paper, here you see how the expert domain modules
relate to each other:

Figure 6 (b): MoE Routing distribution similarity across MMLU 57 tasks
for the control recipe.
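
P.P.S.: To make the quoted idea of "sparse computation through expert
routing" concrete, here is a minimal, hypothetical Python/NumPy sketch
(this is not the GRIN implementation; the names gate_w, experts and
top_k are made up for illustration). A learned gate scores the experts
for each token, only the top-k experts are actually evaluated, and
their outputs are mixed with the normalized gate weights:

# Minimal sketch of top-k MoE expert routing (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d_model, num_experts, top_k = 8, 16, 2

# One toy "expert" per index: a small linear map standing in for an FFN.
experts = [rng.standard_normal((d_model, d_model)) * 0.1
           for _ in range(num_experts)]

# Router: a linear gate that scores every expert for a given token.
gate_w = rng.standard_normal((d_model, num_experts)) * 0.1

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def moe_layer(token):
    """Route one token to its top-k experts; only those are computed."""
    scores = softmax(token @ gate_w)          # routing probabilities
    chosen = np.argsort(scores)[-top_k:]      # indices of the top-k experts
    weights = scores[chosen] / scores[chosen].sum()
    # Sparse computation: only the selected experts are evaluated.
    return sum(w * (token @ experts[i]) for i, w in zip(chosen, weights))

token = rng.standard_normal(d_model)
out = moe_layer(token)
print(out.shape)   # (8,)

This is the sense in which MoE is "sparse": the parameter count grows
with the number of experts, while the per-token compute only grows with
top_k. The tricky part, which GRIN addresses (with its SparseMixer-v2
gradient estimator, if I read the paper right), is training through the
discrete top-k selection; that is not modeled in this sketch.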