Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BGB <cr88192@gmail.com>
Newsgroups: comp.arch
Subject: Re: Misc: Applications of small floating point formats.
Date: Thu, 1 Aug 2024 03:45:52 -0500
Organization: A noiseless patient Spider
Lines: 649
Message-ID: <v8fi05$2381g$1@dont-email.me>
References: <v8ehgr$1q8sr$1@dont-email.me> <61e1f6f5f04ad043966b326d99e38928@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 01 Aug 2024 10:45:58 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="333d6ae404bf8b1a0d3118c53a323626"; logging-data="2203696"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Oj40oCObAn+dyW4sEDOCzc7btrf5bXZo="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:NXtRmDcU1JxObcLcJvTsurq9Pkc=
In-Reply-To: <61e1f6f5f04ad043966b326d99e38928@www.novabbs.org>
Content-Language: en-US
Bytes: 25428

On 7/31/2024 7:31 PM, MitchAlsup1 wrote:
> On Wed, 31 Jul 2024 23:31:35 +0000, BGB wrote:
>
>> So, say, we have common formats:
>>    Binary64, S.E11.F52, Common Use
>>    Binary32, S.E8.F23, Common Use
>>    Binary16, S.E5.F10, Less Common Use
>>
>> But, things get funky below this:
>>    A-Law: S.E3.F4 (Bias=8)
>>    FP8: S.E4.F3 (Bias=7) (E4M3 in NVIDIA terms)
>>    FP8U: E4.F4 (Bias=7)
>>    FP8S: E4.F3.S (Bias=7)
>>
>>
>> Semi-absent in my case:
>>    BFloat16: S.E8.F7
>>      Can be faked in software in my case using Shuffle ops.
>>    NVIDIA E5M2 (S.E5.F2)
>>      Could be faked using RGBA32 pack/unpack ops.
>
> So, you have identified the problem:: 8-bits contains insufficient
> exponent and fraction widths to be considered standard format.
> Thus, in order to utilize 8-bit FP one needs several incarnations.
> This just points back at the problem:: FP needs at least 10 bits.
>

Though, 10 bits only gives 3 components per 32-bit word, or would need 5 bytes for 4 components, neither is ideal...

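
For reference, the S.E<n>.F<m> layouts quoted above can be decoded generically. What follows is only a minimal sketch, not a description of any particular hardware: the function name is made up, and it assumes the sign sits in the top bit, an IEEE-style hidden 1, subnormals at exponent 0, and no Inf/NaN encodings (so for Binary16 it deviates from IEEE 754 at the all-ones exponent). The FP8S layout, with the sign in the low bit, would need its bits rearranged first.

```python
def decode_minifloat(bits, exp_bits, frac_bits, bias, signed=True):
    """Decode a small float laid out as [S].E<exp_bits>.F<frac_bits>.

    Hypothetical helper for illustration; the sign (if any) is taken
    from the bit just above the exponent field.
    """
    frac = bits & ((1 << frac_bits) - 1)
    exp = (bits >> frac_bits) & ((1 << exp_bits) - 1)
    sign = ((bits >> (exp_bits + frac_bits)) & 1) if signed else 0
    if exp == 0:
        # Assumption: exponent 0 means subnormal (no hidden 1), as in IEEE 754.
        mag = (frac / (1 << frac_bits)) * 2.0 ** (1 - bias)
    else:
        # Assumption: every non-zero exponent is a normal number (no Inf/NaN).
        mag = (1.0 + frac / (1 << frac_bits)) * 2.0 ** (exp - bias)
    return -mag if sign else mag


print(decode_minifloat(0x3C00, 5, 10, 15))             # Binary16, S.E5.F10
print(decode_minifloat(0x38, 4, 3, 7))                 # FP8, S.E4.F3, Bias=7
print(decode_minifloat(0x70, 4, 4, 7, signed=False))   # FP8U, E4.F4, Bias=7
```

Under these assumptions all three calls decode to 1.0, since each has a zero fraction and an exponent field equal to the bias.
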
>>
>> No immediate plans to add these later cases as (usually) I have a need
>> for more precision than more exponent range. The main seeming merit of
>> these formats being that they are truncated forms of the wider formats.
>>
>>
>> No need to elaborate on the use-cases for Binary32 and Binary64, wide
>> and varied.
>
> There is a growing clamor for 128-bit FP, too.

Supported in my case, but currently software only (as "long double").

There were past plans for truncated Binary128 in hardware, but going much bigger than Binary64, the cost quickly gets out-of-hand.

>>
>>
>> Binary16 is useful for graphics
> probably,
>> and audio processing.
>
> Insufficient data width as high quality Audio has gone to 24-bits
> {120 DBa S/N).
>
> You can call MP3 and other "phone" formats Audio, but please restrict
> yourself from using the term High Quality when doing so.
>

Usually, the "gold standard" audio formats IME are 44100Hz and 48000Hz 16-bit stereo. Personally, I don't notice much difference between 44kHz and 48kHz.

Have noted that 8-bit PCM sounds poor at nearly every "reasonable" sample rate. Seemingly, somehow, 8-bit PCM adds a very obvious "hiss" to the audio which is distracting.

I personally consider 16kHz to be near the lower end of acceptable (at 8kHz or 11kHz there is a notable distortion, things like speech become highly muffled and nearly unintelligible).

This combination of hiss and muffling seems to be the normal situation on phones, making it very difficult to understand what people are saying.

Or, basically:
   8kHz: very bad
  11kHz: poor
  16kHz: OK
  22kHz: OK
  32kHz: Good
  44kHz: Ideal
  48kHz: Ideal
Past this: Overkill.

And, bit-depth:
   8-bit PCM  : Poor
   8-bit A-Law: OK
   8-bit u-Law: OK
  16-bit PCM  : Ideal
  16-bit FP   : Ideal
  32-bit FP   : Probably overkill

For my projects, I am mostly using 16kHz A-Law, because it sounds "pretty OK".

Major factor is how much memory one needs for the "loop buffer". Typically, the ideal size for the loop buffer is around 250ms.

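
The loop-buffer arithmetic can be sketched as follows. This is only a rough check, assuming the buffer holds raw interleaved samples and gets rounded up to the next power of 2; the function names are made up for illustration.

```python
def loop_buffer_bytes(sample_rate_hz, bits_per_sample, channels, duration_ms=250):
    """Raw size in bytes of a loop buffer holding duration_ms of audio."""
    bytes_per_frame = (bits_per_sample // 8) * channels
    return sample_rate_hz * bytes_per_frame * duration_ms // 1000


def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


# The two stereo configurations compared here: 16kHz 8-bit and 44.1kHz 16-bit.
for rate, bits in ((16000, 8), (44100, 16)):
    raw = loop_buffer_bytes(rate, bits, channels=2)
    print(f"{rate} Hz, {bits}-bit stereo: {raw} bytes raw, {next_pow2(raw)} rounded up")
```

With a 250ms buffer this gives 8000 bytes (8192 rounded up) for the first case and 44100 bytes (65536 rounded up) for the second.
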
For 16kHz 8-bit stereo, this is 8K; and 8K is reasonable. For 44kHz 16-bit stereo, one would likely need a 64K buffer (rounding up to the next power-of-2).

A 64K buffer for the PCM audio loop is a bit steep for an FPGA...

MP3 sounds good at around 128kbps. Have noted at 40kbps or 64kbps it sounds rather poor. It tends to develop artifacts that sound like a bunch of high-frequency distortions (whistling and other effects) and rattling broken glass in a can, which is very displeasing.

Even arguably worse strategies, like say driving the audio at 32kHz 1 bit/sample with a delta-sigma modulator, can sound better IMO. Not great, but doesn't sound quite so much like one is rapidly shaking a steel can full of broken glass either.

But, I can use a sharp 2kHz to 8kHz bandpass, and to me it sounds mostly the same, but my cats will respond as if some great evil has come forth from the speakers.

For me though, seems like this is a fairly important range:
  With just this range, the audio is intact;
  Without this range, it is basically just a muffle.

Then again, I have noted that my sense of hearing may be anomalous, so my experiences may not exactly match other people's.

Then again, I have noted that a lot of people also sit around playing MIDI music on floppy drives and stepper motors as a novelty, so maybe not that far off either.

But, results here are rather variable.

>> Seemingly IEEE
>> specifies it mostly for storage and not for computation, but for these
>> cases it is good enough for computation as well.
>>
>> Binary16 is mostly sufficient for 3D model geometry, and for small 3D
>> scenes, but not really for 3D computations or larger scenes (using it
>> for transform or projection matrices or matrix multiply does not give
>> acceptable results).
>>
>> Does work well for fast sin/cos lookup tables (if supported natively),
>> say, because the error of storing an angle as 1/256 of a circle is
>> larger than the error introduced by the 10 bit mantissa.
>>
>> I had also used it as the computational format in a lot of my neural-net
>> experiments.
>>
> I have seen NN used compressed FP formats where 0 uses 1-bit and
> 1.0 uses but 2-bits.
...

I had my experimental BITNN thing, where:
  Inputs are 1 or 2 bits (1b = +/-1; 2b = 0/1/0/-1);
  Weights are 3 bits (+/- 0/1/3/7).

It can be made fast and was fairly effective at things like OCR, but is naturally limited to things where inputs and outputs are 1-bit signals (like monochrome images).

Basically, it was able to evaluate a 16-input neuron in 1 clock-cycle, and could run a small OCR test fast enough to still be acceptably fast in the Verilator simulation.

========== REMAINDER OF ARTICLE TRUNCATED ==========