<v8ut7t$26n8q$1@dont-email.me>


Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Fereydoun Memarzanjany <thraetaona@ieee.org>
Newsgroups: comp.lang.vhdl,comp.lang.fpga
Subject: Innervator: Hardware Acceleration for Neural Networks
Date: Tue, 6 Aug 2024 22:29:50 -0600
Organization: A noiseless patient Spider
Lines: 35
Message-ID: <v8ut7t$26n8q$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 07 Aug 2024 06:29:50 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="311e58f894bccfb40be344c4f43061b5";
	logging-data="2317594"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+hOwF4QHHsHLZ2c1lHdfXXgPsBQLTeWic="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:o65WjjaD4TyHhER8aQTsemYXXS0=
Content-Language: en-US
Bytes: 3208

Pasted below is an overview/abstract, and you will find more information 
(including a paper, demo video, statistics, slides, and source code) at 
the following GitHub repository:

https://github.com/Thraetaona/Innervator

------------------------------------------------------------------------
Artificial intelligence ("AI") is deployed in various applications, from 
noise cancellation to image recognition, but AI-based products often 
come with high hardware and electricity costs; this makes them 
inaccessible for consumer devices and small-scale edge electronics. 
Inspired by biological brains, deep neural networks ("DNNs") are modeled 
using mathematical formulae, yet general-purpose processors treat 
otherwise-parallelizable AI algorithms as step-by-step sequential logic. 
In contrast, programmable logic devices ("PLDs") can be customized to 
the specific parameters of a trained DNN, thereby ensuring data-tailored 
computation and algorithmic parallelism at the register-transfer level. 
Furthermore, a subgroup of PLDs, field-programmable gate arrays 
("FPGAs"), are dynamically reconfigurable.  To improve AI runtime 
performance, I therefore designed and open-sourced a hardware compiler: 
Innervator.  Written entirely in VHDL-2008, Innervator takes any DNN's 
metadata and parameters (e.g., number of layers, neurons per layer, and 
their weights/biases), generating its synthesizable FPGA hardware 
description with the appropriate pipelining and batch processing. 
Innervator is entirely portable and vendor-independent.  As a proof of 
concept, I used Innervator to implement a sample 8x8-pixel handwritten 
digit-recognizing neural network in a low-cost AMD Xilinx Artix-7(TM) 
FPGA @ 100 MHz.  With 3 pipeline stages and 2 batches at about 67% LUT 
utilization, the network achieved ~7.12 GOP/s, predicting the output in 
630 ns while drawing under 0.25 W.  In comparison, an Intel(R) Core(TM) 
i7-12700H CPU @ 4.70 GHz would take 40,000-60,000 ns at 45 to 115 W. 
Ultimately, Innervator's hardware-accelerated approach bridges the 
inherent mismatch between current AI algorithms and the general-purpose 
digital hardware they run on.
------------------------------------------------------------------------
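
As a rough sanity check on the figures quoted above, the following Python
sketch works out the latency in clock cycles, the speedup over the CPU, and
the energy per prediction.  It uses only the numbers from the abstract.  Two
things are assumptions rather than statements from the post: the FPGA power
is taken at its 0.25 W upper bound, and the CPU's fastest time is paired with
its lowest power (and slowest with highest) to get an energy range.  The
variable and function names are illustrative only.

```python
# Figures quoted in the abstract; nothing here is measured independently.
FPGA_CLOCK_MHZ = 100
FPGA_LATENCY_NS = 630
FPGA_POWER_W = 0.25                 # assumption: upper bound ("under 0.25 W")
CPU_LATENCY_NS = (40_000, 60_000)   # quoted range, fastest to slowest
CPU_POWER_W = (45, 115)             # quoted range, lowest to highest


def energy_nj(latency_ns, power_w):
    """Energy per prediction in nanojoules (ns * W = nJ)."""
    return latency_ns * power_w


# Latency expressed in FPGA clock cycles: ns * MHz / 1000 = cycles.
fpga_cycles = FPGA_LATENCY_NS * FPGA_CLOCK_MHZ / 1000

# How many times longer the CPU takes, for both ends of the quoted range.
speedup = tuple(t / FPGA_LATENCY_NS for t in CPU_LATENCY_NS)

fpga_energy = energy_nj(FPGA_LATENCY_NS, FPGA_POWER_W)

# Assumption: fastest CPU time goes with lowest power, slowest with highest.
cpu_energy = tuple(energy_nj(t, p) for t, p in zip(CPU_LATENCY_NS, CPU_POWER_W))
energy_ratio = tuple(e / fpga_energy for e in cpu_energy)

print(f"FPGA latency: {fpga_cycles:.0f} cycles at {FPGA_CLOCK_MHZ} MHz")
print(f"Speedup over CPU: {speedup[0]:.1f}x to {speedup[1]:.1f}x")
print(f"FPGA energy per prediction: {fpga_energy:.1f} nJ")
print(f"CPU energy per prediction: {cpu_energy[0] / 1e6:.1f} to {cpu_energy[1] / 1e6:.1f} mJ")
print(f"Energy ratio: {energy_ratio[0]:.0f}x to {energy_ratio[1]:.0f}x")
```

On these figures, 630 ns at 100 MHz corresponds to 63 clock cycles, and the
CPU takes roughly 63 to 95 times longer per prediction.  The energy ratio
depends on how the latency and power ranges are paired, so it should be read
as an order-of-magnitude estimate only.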