X-Received: by 2002:a05:622a:242:: with SMTP id c2mr779467qtx.230.1603435326727;
        Thu, 22 Oct 2020 23:42:06 -0700 (PDT)
X-Received: by 2002:aed:25a3:: with SMTP id x32mr710703qtc.286.1603435326359;
 Thu, 22 Oct 2020 23:42:06 -0700 (PDT)
Path: ...!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.forth
Date: Thu, 22 Oct 2020 23:42:06 -0700 (PDT)
In-Reply-To: <rmrjpt$o8q$1@dont-email.me>
Complaints-To: groups-abuse@google.com
Injection-Info: google-groups.googlegroups.com; posting-host=80.133.236.56; posting-account=59ThCgoAAAD2lLtkl4ewgOkDMBH0Fq8k
NNTP-Posting-Host: 80.133.236.56
References: <cd5fcc5c-a4f9-4ce2-bb50-8b7835482ac8n@googlegroups.com> <rmrjpt$o8q$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8006f8c6-c534-4e4a-adb3-e0967f42e4c3n@googlegroups.com>
Subject: Re: Colorforth /mod algorithm
From: Ulrich Hoffmann <ulrich.e.hoffmann@gmail.com>
Injection-Date: Fri, 23 Oct 2020 06:42:06 +0000
Content-Type: text/plain; charset="UTF-8"
Bytes: 7331
Lines: 146

Hello Dallin and all,

maybe I can shed some light into the dark:

> : /mod for begin 
>     over over . + 
>     -if  drop 2* [ swap ] next ; then
>     over or or - 2* - 
>   next ;

As Howerd already mentioned, there are some specialities in the c18
MachineForth/colorForth [1] that are worth mentioning:

1) colorForth .   is a noop. It's used to give + on the c18 time to 
_   propagate the carry.

2) colorForth -   is not subtraction but 1's complement: INVERT
_   c18 has no subtract, you'll have to add the 2's complement.

3) colorForth or  is not inclusive or but EXclusive or:  XOR

4) colorForth -if branches it TOS is negative put does not pop the stack:
_   DUP 0< IF 

5) colorForth u FOR NEXT counts down from u and ends with 0 so it has u+1 
_   loop iterations. Its loop index is the top of return stack.
_   You can replace  u FOR ... NEXT   0 u DO ... -1 +LOOP

6) Although colorForth suports ELSE the typical colorForth style is 
_   factoring and early exits. colorForth ; just compiles EXIT.

7) During compilation colorForth control structures put adresses on 
_   the stack very much like traditional Forth. [ swap ] in the /mod 
_   implementation actually moves the address of the BEGIN to top 
_   of stack, so that the following NEXT will jump back to BEGIN.
_   We see two NEXTs in the definition of /MOD, one in each case. Both
_   directly jump back to the same location marked by FOR BEGIN.
_   The same technique is available in Forth-94 with CS-ROLL.
_   If you use ELSE you can merge again at THEN and just have a single NEXT.

8) The c18 has no primitive SWAP instruction so using SWAP at runtime is
_   quite expensive. Likewise SWAP DROP. However the colorForth phrase
_   over or or  just performs SWAP DROP  ( a b -- a ) as  a^b^a = b
_   Note or is XOR. 

9) 2* is just 2*. In contrast to what Howerd said there is no difference
_   between 2* and an imaginary u2*. The signed/unsigend difference is only 
_   important with 2/ (arithmetig shift right) and u2/ (logical shift right)
_   assuming 2's complement arithmetic.


Given that knowledge we can give a Forth-94 definition of the colorForth /MOD:

> : cF-/mod ( denominator numerator #bits -- denominator rest|quotient )
>    0 SWAP DO
>      OVER OVER  +                  \ actually subtracts as denom is negated
>      DUP 0< IF
>        DROP                        \ ignore difference 
>        2*                          \ shift in 0
>      ELSE
>        SWAP DROP                  \ keep difference
>        INVERT 2* INVERT           \ shift in 1
>      THEN
>    -1 +LOOP ;

The algorithm itself is independent of the number of bits used to 
represent numbers (18 on c18) but relies on the cell with of the 
underlying Forth system as it tests the sign bit with -if resp. 0<

cf-/mod is actually only the core of the /mod algorithm not /MOD
as we know it. It can divide unsigned numbers that are at most
half the cell with (9 bits on c18). If we assume a 32 bit system, 
then it can divide 16 bit unsigned numbers. Let's assume that:

cf-/mod actually takes three parameters:

1. The denominator shifted 15 bits to the left and negated and sign extended
_  ddddddddddddddddd000000000000000
_  This is because c18 has no subtract and it's cheaper to
_  add a negated value. The denominator has to be prepared accordingly 
_  upfront.

2. The numerator
_  0000000000000000nnnnnnnnnnnnnnnn

3. The number of bits that should be in the result (minus one as FOR NEXT 
_   loops one additional time). So it's 15 in our case (16 iterations).


cf-/mod results in two stack items: The denominator as before and the
rest and quotient combined in a single cell 
_  rrrrrrrrrrrrrrrrqqqqqqqqqqqqqqqq

We can define a /MOD wrapper with usual argument order that prepares 
the arguments for cF-/mod, runs it and then extracts the result:

> : u/mod ( n d -- r q )
>    NEGATE 15 LSHIFT SWAP 
>    15 cF-/mod ( d r|q ) 
>    SWAP DROP
>    DUP 16 RSHIFT 65535 ( FFFF ) AND
>    SWAP 65535 ( FFFF ) AND ;


If we add some trace code and run a test case we get:

50 7 u/mod cr u. u.

15 1111111111111100|1000000000000000 0000000000000000|0000000000110010 in 0
14 1111111111111100|1000000000000000 0000000000000000|0000000001100100 in 0
13 1111111111111100|1000000000000000 0000000000000000|0000000011001000 in 0
12 1111111111111100|1000000000000000 0000000000000000|0000000110010000 in 0
11 1111111111111100|1000000000000000 0000000000000000|0000001100100000 in 0
10 1111111111111100|1000000000000000 0000000000000000|0000011001000000 in 0
09 1111111111111100|1000000000000000 0000000000000000|0000110010000000 in 0
08 1111111111111100|1000000000000000 0000000000000000|0001100100000000 in 0
07 1111111111111100|1000000000000000 0000000000000000|0011001000000000 in 0
06 1111111111111100|1000000000000000 0000000000000000|0110010000000000 in 0
05 1111111111111100|1000000000000000 0000000000000000|1100100000000000 in 0
04 1111111111111100|1000000000000000 0000000000000001|1001000000000000 in 0
03 1111111111111100|1000000000000000 0000000000000011|0010000000000000 in 0
02 1111111111111100|1000000000000000 0000000000000110|0100000000000000 in 1 
01 1111111111111100|1000000000000000 0000000000000101|1000000000000001 in 1 
00 1111111111111100|1000000000000000 0000000000000100|0000000000000011 in 1 
__ 1111111111111100|1000000000000000 0000000000000001|0000000000000111 
__ dddddddddddddddd d                rrrrrrrrrrrrrrrr qqqqqqqqqqqqqqqq

7 1 


The algorithm does a non performing restoring division [2]. It tries to 
subtract the (shifted) denominator from the nominator. If that overflows 
a 0 is shifted in and the difference is ignored. If it does not overflow 
the subtraction is kept and a 1 is shifted in. The quotient is constructed
bit by bit.

That is my current understanding. Any addtions are welcome.

Regards,
	Ulrich

P.S. Sorry for the leading underscores at some lines. Google news seems 
to eat leading spaces on submission of posts...

[1] https://colorforth.github.io/inst.htm
[2] https://en.wikipedia.org/wiki/Division_algorithm#Restoring_division