View Issue Details

IDProjectCategoryView StatusLast Update
0032103FPCRTLpublic2019-06-02 18:59
ReporterChristo CrauseAssigned ToJeppe Johansen 
PrioritynormalSeverityminorReproducibilityN/A
Status closedResolutionfixed 
PlatformEmbeddedOSOS Version
Product Version3.1.1Product Build 
Target VersionFixed in Version3.3.1 
Summary0032103: AVR - Assembler routines for 8, 16 & 32 bit unsigned div (code contribution)
DescriptionWhen including a div operation in a calculation the generated code size increases dramatically (in terms of memory generally available on an embedded target). It appears that the compiler uses generic software math routines from the rtl/inc folder (generic.inc and int64.inc). These routines includes an error check (HandleErrorAddrFrameInd) which pulls in a bunch of code (get_caller_frame, pushexceptionobject, dounhandledexception, raiseexception etc.).

I've adapted Atmel's application note AVR200 div routines to fpc, see attached unit.
Additional Informationhttp://forum.lazarus-ide.org/index.php/topic,37408.0.html
AVR200 application note: http://www.atmel.com/Images/doc0936.pdf
TagsAVR
Fixed in Revision42162
FPCOldBugId
FPCTarget3.2.0
Attached Files
  • integermath.pas.zip (971 bytes)
  • avr_div.patch (5,328 bytes)
    Index: rtl/avr/avr.inc
    ===================================================================
    --- rtl/avr/avr.inc	(revision 36643)
    +++ rtl/avr/avr.inc	(working copy)
    @@ -322,3 +322,5 @@
         end;
       end;
     
    +{include hand-optimized assembler code}
    +{$i math.inc}
    Index: rtl/avr/math.inc
    ===================================================================
    --- rtl/avr/math.inc	(revision 36643)
    +++ rtl/avr/math.inc	(working copy)
    @@ -13,3 +13,175 @@
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     
      **********************************************************************}
    +
    +// Based on restoring division algorithm
    +// Algorithm source document: Lecture notes by S. Galal and D. Pham, Division algorithms and hardware implementations.
    +// Link to documentation http://www.seas.ucla.edu/~ingrid/ee213a/lectures/division_presentV2.pdf
    +// Also refer to description on Wikipedia: https://en.wikipedia.org/wiki/Division_algorithm#Restoring_division
    +
    +// Note that the algorithm automatically yields the following results for special cases:
    +// n div 0 = MAX(type)
    +// 0 div 0 = MAX(type)
    +// 0 div z = 0
    +// Checks for n = 0; z = [0,1]; n = z and z > n could shortcut the algorithm for speed-ups
    +// but would add extra code
    +// Perhaps add the checks depending on optimization settings?
    +
    +// n (dividend) = q(quotient) x z(divisor) + p(remainder)
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_BYTE}
    +{$define FPC_SYSTEM_HAS_DIV_BYTE}
    +
    +// x in Ra, d in Rb, 0 in Rp
    +function fpc_div_byte(n, z: byte): byte; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_BYTE'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// n (A)   dividend    R24
    +// z (B)   divisor     R22
    +// p (P)   remainder   R20
    +// n	   counter     R18
    +
    +start:
    +  clr R20         // clear remainder
    +  ldi R18, 8      // iterate over 8 bits
    +
    +div1:
    +  lsl R24         // shift left A
    +  rol R20         // shift left P with carry from A shift
    +  sub R20, R22    // Subtract B from P, P <= P - B
    +  brlo div2
    +  ori R24, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R20, R22    // restore old value of P
    +
    +div3:
    +  dec R18
    +  brne div1
    +
    +finish:
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_byte(n, z: byte): byte; external name 'FPC_DIV_BYTE';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_BYTE}
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_WORD}
    +{$define FPC_SYSTEM_HAS_DIV_WORD}
    +
    +// n in Ra, z in Rb, 0 in Rp
    +function fpc_div_word(n, z: word): word; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_WORD'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// n (A)   dividend    R25, R24
    +// z (B)   divisor     R23, R22
    +// p (P)   remainder   R21, R20
    +// n	   counter     R18
    +
    +start:            // Start of division...
    +  clr R20         // clear remainder low
    +  clr R21         // clear remainder hi
    +  ldi R18, 16     // iterate over 16 bits
    +
    +div1:
    +  lsl R24         // shift left A_L
    +  rol R25
    +  rol R20         // shift left P with carry from A shift
    +  rol R21
    +  sub R20, R22    // Subtract B from P, P <= P - B
    +  sbc R21, R23
    +  brlo div2
    +  ori R24, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R20, R22    // restore old value of P
    +  adc R21, R23
    +
    +div3:
    +  dec R18
    +  brne div1
    +
    +finish:
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_word(n, z: word): word; external name 'FPC_DIV_WORD';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_WORD}
    +
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_DWORD}
    +{$define FPC_SYSTEM_HAS_DIV_DWORD}
    +
    +// n in Ra, z in Rb, 0 in Rp
    +function fpc_div_dword(n, z: dword): dword; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_DWORD'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// n (A)   dividend    R25, R24, R23, R22
    +// z (B)   divisor     R21, R20, R19, R18
    +// p (P)   remainder   R17, R16, R15, R14
    +// n	   counter     R26
    +
    +  push R17
    +  push R16
    +  push R15
    +  push R14
    +
    +start:            // Start of division...
    +  clr R14         // clear remainder
    +  clr R15         // clear remainder
    +  clr R16
    +  clr R17
    +  ldi R26, 32     // iterate over 32 bits
    +
    +div1:
    +  lsl R22         // shift left A_L
    +  rol R23
    +  rol R24
    +  rol R25
    +  rol R14         // shift left P with carry from A shift
    +  rol R15
    +  rol R16
    +  rol R17
    +  sub R14, R18    // Subtract B from P, P <= P - B
    +  sbc R15, R19
    +  sbc R16, R20
    +  sbc R17, R21
    +  brlo div2
    +  ori R22, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R14, R18    // restore old value of P
    +  adc R15, R19
    +  adc R16, R20
    +  adc R17, R21
    +
    +div3:
    +  dec R26
    +  brne div1
    +
    +finish:
    +  pop R14
    +  pop R15
    +  pop R16
    +  pop R17
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_dword(n, z: dword): dword; external name 'FPC_DIV_DWORD';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_DWORD}
    +
    
    avr_div.patch (5,328 bytes)
  • avr_div-2.patch (5,515 bytes)
    Index: rtl/avr/avr.inc
    ===================================================================
    --- rtl/avr/avr.inc	(revision 36734)
    +++ rtl/avr/avr.inc	(working copy)
    @@ -322,3 +322,5 @@
         end;
       end;
     
    +{include hand-optimized assembler code}
    +{$i math.inc}
    Index: rtl/avr/math.inc
    ===================================================================
    --- rtl/avr/math.inc	(revision 36734)
    +++ rtl/avr/math.inc	(working copy)
    @@ -13,3 +13,180 @@
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     
      **********************************************************************}
    +
    +// Based on restoring division algorithm
    +// Algorithm source document: Lecture notes by S. Galal and D. Pham, Division algorithms and hardware implementations.
    +// Link to documentation http://www.seas.ucla.edu/~ingrid/ee213a/lectures/division_presentV2.pdf
    +// Also refer to description on Wikipedia: https://en.wikipedia.org/wiki/Division_algorithm#Restoring_division
    +
    +// Note that the algorithm automatically yields the following results for special cases:
    +// z div 0 = MAX(type)
    +// 0 div 0 = MAX(type)
    +// 0 div n = 0
    +// Checks for z = 0; n = [0,1]; n = z and n > z could shortcut the algorithm for speed-ups
    +// but would add extra code
    +// Perhaps add the checks depending on optimization settings?
    +
    +// z (dividend) = q(quotient) x n(divisor) + p(remainder)
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_BYTE}
    +{$define FPC_SYSTEM_HAS_DIV_BYTE}
    +
    +// z in Ra, n in Rb, 0 in Rp
    +function fpc_div_byte(n, z: byte): byte; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_BYTE'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// z (A)   dividend    R22
    +// n (B)   divisor     R24
    +// p (P)   remainder   R20
    +// i	   counter     R18
    +
    +start:
    +  clr R20         // clear remainder
    +  ldi R18, 8      // iterate over 8 bits
    +
    +div1:
    +  lsl R22         // shift left A
    +  rol R20         // shift left P with carry from A shift
    +  sub R20, R24    // Subtract B from P, P <= P - B
    +  brlo div2
    +  ori R22, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R20, R24    // restore old value of P
    +
    +div3:
    +  dec R18
    +  brne div1
    +
    +finish:
    +  mov R24, R22    // Move result from R22 to R24
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_byte(n, z: byte): byte; external name 'FPC_DIV_BYTE';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_BYTE}
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_WORD}
    +{$define FPC_SYSTEM_HAS_DIV_WORD}
    +
    +// z in Ra, n in Rb, 0 in Rp
    +function fpc_div_word(n, z: word): word; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_WORD'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// z (A)   dividend    R23, R22
    +// n (B)   divisor     R25, R24
    +// p (P)   remainder   R21, R20
    +// i	   counter     R18
    +
    +start:            // Start of division...
    +  clr R20         // clear remainder low
    +  clr R21         // clear remainder hi
    +  ldi R18, 16     // iterate over 16 bits
    +
    +div1:
    +  lsl R22         // shift left A_L
    +  rol R23
    +  rol R20         // shift left P with carry from A shift
    +  rol R21
    +  sub R20, R24    // Subtract B from P, P <= P - B
    +  sbc R21, R25
    +  brlo div2
    +  ori R22, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R20, R24    // restore old value of P
    +  adc R21, R25
    +
    +div3:
    +  dec R18
    +  brne div1
    +
    +finish:
    +  movw R24, R22    // Move result from R22:R23 to R24:R25
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_word(n, z: word): word; external name 'FPC_DIV_WORD';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_WORD}
    +
    +
    +{$ifndef FPC_SYSTEM_HAS_DIV_DWORD}
    +{$define FPC_SYSTEM_HAS_DIV_DWORD}
    +
    +// z in Ra, n in Rb, 0 in Rp
    +function fpc_div_dword(n, z: dword): dword; assembler; nostackframe;
    +{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_DWORD'];{$endif}
    +label
    +  start, div1, div2, div3, finish;
    +asm
    +// Symbol  Name        Register(s)
    +// z (A)   dividend    R21, R20, R19, R18
    +// n (B)   divisor     R25, R24, R23, R22
    +// p (P)   remainder   R17, R16, R15, R14
    +// i	   counter     R26
    +
    +  push R17
    +  push R16
    +  push R15
    +  push R14
    +
    +start:            // Start of division...
    +  clr R14         // clear remainder
    +  clr R15         // clear remainder
    +  clr R16
    +  clr R17
    +  ldi R26, 32     // iterate over 32 bits
    +
    +div1:
    +  lsl R18         // shift left A_L
    +  rol R19
    +  rol R20
    +  rol R21
    +  rol R14         // shift left P with carry from A shift
    +  rol R15
    +  rol R16
    +  rol R17
    +  sub R14, R22    // Subtract B from P, P <= P - B
    +  sbc R15, R23
    +  sbc R16, R24
    +  sbc R17, R25
    +  brlo div2
    +  ori R18, 1      // Set A[0] = 1
    +  rjmp div3
    +div2:             // negative branch, A[0] = 0 (default after shift), restore P
    +  add R14, R22    // restore old value of P
    +  adc R15, R23
    +  adc R16, R24
    +  adc R17, R25
    +
    +div3:
    +  dec R26
    +  brne div1
    +
    +finish:
    +  movw R22, R18    // Move result from R18:R21 to R22:R25
    +  movw R24, R20
    +
    +  pop R14
    +  pop R15
    +  pop R16
    +  pop R17
    +end;
    +
    +{It is a compilerproc (systemh.inc), make an alias for internal use.}
    +{$ifdef FPC_IS_SYSTEM}
    +function fpc_div_dword(n, z: dword): dword; external name 'FPC_DIV_DWORD';
    +{$endif FPC_IS_SYSTEM}
    +{$endif FPC_SYSTEM_HAS_DIV_DWORD}
    +
    
    avr_div-2.patch (5,515 bytes)

Activities

Christo Crause

2017-07-03 21:42

reporter  

integermath.pas.zip (971 bytes)

Florian

2017-07-03 21:53

administrator   ~0101470

While I really appreciate your work, the problem might be the license. Do you know anything about the license of the ATMEL code?

Thaddy de Koning

2017-07-03 22:02

reporter   ~0101471

Last edited: 2017-07-03 22:04

View 2 revisions

As I understand it this is written by himself and (after I gave him the link during discussion) checked against ATMEL's suggested algorithm. Which is a description and not code. I can probably add the licencing info tomorrow.

Christo Crause

2017-07-04 13:16

reporter   ~0101487

Last edited: 2017-07-04 13:29

View 2 revisions

Mmm, I actually translated and adapted Atmel's code from http://www.atmel.com/Images/AVR200.zip. I did not think it was restricted since no restrictions are published either in the app note or the code. However looking at Atmel's catch-all legal notice (http://www.atmel.com/About/legal.aspx) paragraph LIMITED LICENSE TO ATMEL'S MATERIALS seems to limit the user rights.

Searching reveals several other sources for similar algorithms:
http://www.avr-asm-tutorial.net/avr_en/calc/DIV8E.html
https://www.virag.si/2010/02/simple-division-algorithm-for-arm-assembler/
http://6502org.wikidot.com/software-math-intdiv
University publication: http://sgate.emt.bme.hu/patai/publications/z80guide/part4.html
or C: https://stackoverflow.com/a/34458338
or just algorithm: https://courses.cs.vt.edu/~cs1104/BuildingBlocks/divide.030.html

None of the sites above cite any license, so when do we assume something like a well studied and published algorithm as software division is in the public domain?

Thaddy de Koning

2017-07-04 16:46

reporter   ~0101492

Well. That's basically what I found.
Contact ATMEL and explain. Anyway, to use the code you need to have a need for ATMEL hardware anyway...

Georg Hieber

2017-07-06 19:44

reporter   ~0101553

Christo, please send me a mail to georg at hieber-stgt dot de. Looks like we are working on similar things.

Christo Crause

2017-07-12 09:31

reporter   ~0101669

I contacted Microchip (they acquired Atmel) to ask for permission to use the code in the App Note. I got a reference number and a promise of a reply within 24 hours. That was a week ago.

I am now writing my own code based on a description of the restoring division algorithm. Got the 8 bit division version working, still need to adapt to 16 & 32 bit versions.

Christo Crause

2017-07-16 08:17

reporter  

avr_div.patch (5,328 bytes)
Index: rtl/avr/avr.inc
===================================================================
--- rtl/avr/avr.inc	(revision 36643)
+++ rtl/avr/avr.inc	(working copy)
@@ -322,3 +322,5 @@
     end;
   end;
 
+{include hand-optimized assembler code}
+{$i math.inc}
Index: rtl/avr/math.inc
===================================================================
--- rtl/avr/math.inc	(revision 36643)
+++ rtl/avr/math.inc	(working copy)
@@ -13,3 +13,175 @@
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
  **********************************************************************}
+
+// Based on restoring division algorithm
+// Algorithm source document: Lecture notes by S. Galal and D. Pham, Division algorithms and hardware implementations.
+// Link to documentation http://www.seas.ucla.edu/~ingrid/ee213a/lectures/division_presentV2.pdf
+// Also refer to description on Wikipedia: https://en.wikipedia.org/wiki/Division_algorithm#Restoring_division
+
+// Note that the algorithm automatically yields the following results for special cases:
+// n div 0 = MAX(type)
+// 0 div 0 = MAX(type)
+// 0 div z = 0
+// Checks for n = 0; z = [0,1]; n = z and z > n could shortcut the algorithm for speed-ups
+// but would add extra code
+// Perhaps add the checks depending on optimization settings?
+
+// n (dividend) = q(quotient) x z(divisor) + p(remainder)
+
+{$ifndef FPC_SYSTEM_HAS_DIV_BYTE}
+{$define FPC_SYSTEM_HAS_DIV_BYTE}
+
+// x in Ra, d in Rb, 0 in Rp
+function fpc_div_byte(n, z: byte): byte; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_BYTE'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// n (A)   dividend    R24
+// z (B)   divisor     R22
+// p (P)   remainder   R20
+// n	   counter     R18
+
+start:
+  clr R20         // clear remainder
+  ldi R18, 8      // iterate over 8 bits
+
+div1:
+  lsl R24         // shift left A
+  rol R20         // shift left P with carry from A shift
+  sub R20, R22    // Subtract B from P, P <= P - B
+  brlo div2
+  ori R24, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R20, R22    // restore old value of P
+
+div3:
+  dec R18
+  brne div1
+
+finish:
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_byte(n, z: byte): byte; external name 'FPC_DIV_BYTE';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_BYTE}
+
+{$ifndef FPC_SYSTEM_HAS_DIV_WORD}
+{$define FPC_SYSTEM_HAS_DIV_WORD}
+
+// n in Ra, z in Rb, 0 in Rp
+function fpc_div_word(n, z: word): word; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_WORD'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// n (A)   dividend    R25, R24
+// z (B)   divisor     R23, R22
+// p (P)   remainder   R21, R20
+// n	   counter     R18
+
+start:            // Start of division...
+  clr R20         // clear remainder low
+  clr R21         // clear remainder hi
+  ldi R18, 16     // iterate over 16 bits
+
+div1:
+  lsl R24         // shift left A_L
+  rol R25
+  rol R20         // shift left P with carry from A shift
+  rol R21
+  sub R20, R22    // Subtract B from P, P <= P - B
+  sbc R21, R23
+  brlo div2
+  ori R24, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R20, R22    // restore old value of P
+  adc R21, R23
+
+div3:
+  dec R18
+  brne div1
+
+finish:
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_word(n, z: word): word; external name 'FPC_DIV_WORD';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_WORD}
+
+
+{$ifndef FPC_SYSTEM_HAS_DIV_DWORD}
+{$define FPC_SYSTEM_HAS_DIV_DWORD}
+
+// n in Ra, z in Rb, 0 in Rp
+function fpc_div_dword(n, z: dword): dword; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_DWORD'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// n (A)   dividend    R25, R24, R23, R22
+// z (B)   divisor     R21, R20, R19, R18
+// p (P)   remainder   R17, R16, R15, R14
+// n	   counter     R26
+
+  push R17
+  push R16
+  push R15
+  push R14
+
+start:            // Start of division...
+  clr R14         // clear remainder
+  clr R15         // clear remainder
+  clr R16
+  clr R17
+  ldi R26, 32     // iterate over 32 bits
+
+div1:
+  lsl R22         // shift left A_L
+  rol R23
+  rol R24
+  rol R25
+  rol R14         // shift left P with carry from A shift
+  rol R15
+  rol R16
+  rol R17
+  sub R14, R18    // Subtract B from P, P <= P - B
+  sbc R15, R19
+  sbc R16, R20
+  sbc R17, R21
+  brlo div2
+  ori R22, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R14, R18    // restore old value of P
+  adc R15, R19
+  adc R16, R20
+  adc R17, R21
+
+div3:
+  dec R26
+  brne div1
+
+finish:
+  pop R14
+  pop R15
+  pop R16
+  pop R17
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_dword(n, z: dword): dword; external name 'FPC_DIV_DWORD';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_DWORD}
+
avr_div.patch (5,328 bytes)

Christo Crause

2017-07-16 08:20

reporter   ~0101724

Attached please find unsigned int division routines implemented from scratch by me. I've added them to the AVR RTL in the rtl/avr/math.inc file, hope that is appropriate.

Christo Crause

2017-07-16 12:33

reporter   ~0101725

Apologies, I noticed that I swapped around the parameters for div (numerator / denominator). The second patch (avr_div-2.patch) fixes the order of parameters.

Christo Crause

2017-07-16 12:34

reporter  

avr_div-2.patch (5,515 bytes)
Index: rtl/avr/avr.inc
===================================================================
--- rtl/avr/avr.inc	(revision 36734)
+++ rtl/avr/avr.inc	(working copy)
@@ -322,3 +322,5 @@
     end;
   end;
 
+{include hand-optimized assembler code}
+{$i math.inc}
Index: rtl/avr/math.inc
===================================================================
--- rtl/avr/math.inc	(revision 36734)
+++ rtl/avr/math.inc	(working copy)
@@ -13,3 +13,180 @@
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
  **********************************************************************}
+
+// Based on restoring division algorithm
+// Algorithm source document: Lecture notes by S. Galal and D. Pham, Division algorithms and hardware implementations.
+// Link to documentation http://www.seas.ucla.edu/~ingrid/ee213a/lectures/division_presentV2.pdf
+// Also refer to description on Wikipedia: https://en.wikipedia.org/wiki/Division_algorithm#Restoring_division
+
+// Note that the algorithm automatically yields the following results for special cases:
+// z div 0 = MAX(type)
+// 0 div 0 = MAX(type)
+// 0 div n = 0
+// Checks for z = 0; n = [0,1]; n = z and n > z could shortcut the algorithm for speed-ups
+// but would add extra code
+// Perhaps add the checks depending on optimization settings?
+
+// z (dividend) = q(quotient) x n(divisor) + p(remainder)
+
+{$ifndef FPC_SYSTEM_HAS_DIV_BYTE}
+{$define FPC_SYSTEM_HAS_DIV_BYTE}
+
+// z in Ra, n in Rb, 0 in Rp
+function fpc_div_byte(n, z: byte): byte; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_BYTE'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// z (A)   dividend    R22
+// n (B)   divisor     R24
+// p (P)   remainder   R20
+// i	   counter     R18
+
+start:
+  clr R20         // clear remainder
+  ldi R18, 8      // iterate over 8 bits
+
+div1:
+  lsl R22         // shift left A
+  rol R20         // shift left P with carry from A shift
+  sub R20, R24    // Subtract B from P, P <= P - B
+  brlo div2
+  ori R22, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R20, R24    // restore old value of P
+
+div3:
+  dec R18
+  brne div1
+
+finish:
+  mov R24, R22    // Move result from R22 to R24
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_byte(n, z: byte): byte; external name 'FPC_DIV_BYTE';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_BYTE}
+
+{$ifndef FPC_SYSTEM_HAS_DIV_WORD}
+{$define FPC_SYSTEM_HAS_DIV_WORD}
+
+// z in Ra, n in Rb, 0 in Rp
+function fpc_div_word(n, z: word): word; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_WORD'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// z (A)   dividend    R23, R22
+// n (B)   divisor     R25, R24
+// p (P)   remainder   R21, R20
+// i	   counter     R18
+
+start:            // Start of division...
+  clr R20         // clear remainder low
+  clr R21         // clear remainder hi
+  ldi R18, 16     // iterate over 16 bits
+
+div1:
+  lsl R22         // shift left A_L
+  rol R23
+  rol R20         // shift left P with carry from A shift
+  rol R21
+  sub R20, R24    // Subtract B from P, P <= P - B
+  sbc R21, R25
+  brlo div2
+  ori R22, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R20, R24    // restore old value of P
+  adc R21, R25
+
+div3:
+  dec R18
+  brne div1
+
+finish:
+  movw R24, R22    // Move result from R22:R23 to R24:R25
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_word(n, z: word): word; external name 'FPC_DIV_WORD';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_WORD}
+
+
+{$ifndef FPC_SYSTEM_HAS_DIV_DWORD}
+{$define FPC_SYSTEM_HAS_DIV_DWORD}
+
+// z in Ra, n in Rb, 0 in Rp
+function fpc_div_dword(n, z: dword): dword; assembler; nostackframe;
+{$ifdef FPC_IS_SYSTEM}[public,alias: 'FPC_DIV_DWORD'];{$endif}
+label
+  start, div1, div2, div3, finish;
+asm
+// Symbol  Name        Register(s)
+// z (A)   dividend    R21, R20, R19, R18
+// n (B)   divisor     R25, R24, R23, R22
+// p (P)   remainder   R17, R16, R15, R14
+// i	   counter     R26
+
+  push R17
+  push R16
+  push R15
+  push R14
+
+start:            // Start of division...
+  clr R14         // clear remainder
+  clr R15         // clear remainder
+  clr R16
+  clr R17
+  ldi R26, 32     // iterate over 32 bits
+
+div1:
+  lsl R18         // shift left A_L
+  rol R19
+  rol R20
+  rol R21
+  rol R14         // shift left P with carry from A shift
+  rol R15
+  rol R16
+  rol R17
+  sub R14, R22    // Subtract B from P, P <= P - B
+  sbc R15, R23
+  sbc R16, R24
+  sbc R17, R25
+  brlo div2
+  ori R18, 1      // Set A[0] = 1
+  rjmp div3
+div2:             // negative branch, A[0] = 0 (default after shift), restore P
+  add R14, R22    // restore old value of P
+  adc R15, R23
+  adc R16, R24
+  adc R17, R25
+
+div3:
+  dec R26
+  brne div1
+
+finish:
+  movw R22, R18    // Move result from R18:R21 to R22:R25
+  movw R24, R20
+
+  pop R14
+  pop R15
+  pop R16
+  pop R17
+end;
+
+{It is a compilerproc (systemh.inc), make an alias for internal use.}
+{$ifdef FPC_IS_SYSTEM}
+function fpc_div_dword(n, z: dword): dword; external name 'FPC_DIV_DWORD';
+{$endif FPC_IS_SYSTEM}
+{$endif FPC_SYSTEM_HAS_DIV_DWORD}
+
avr_div-2.patch (5,515 bytes)

Dimitrios Chr. Ioannidis

2019-04-18 11:49

reporter   ~0115642

Can I ask if this is going to be accepted ?

Dimitrios Chr. Ioannidis

2019-04-18 12:40

reporter   ~0115643

@florian Regarding the license for the code this is what I found at the Microchip web site :

:...
Microchip or its licensors retain all ownership and intellectual property rights in the Software, the underlying technology, and in all derivatives thereto (by whomever produced). You may use the Software, and any derivatives created by any person or entity by or on your behalf, exclusively with Microchip’s products. You agree that you are solely responsible for testing the Software and determining its suitability. Microchip has no obligation to modify, test, certify, provide maintenance or support the Software.
…"

and the link for the above https://www.microchip.com/about-us/legal-information/website-usage-and-limitation-of-liability

Christo Crause

2019-04-18 19:46

reporter   ~0115658

Please note that I rewrote the code from scratch based just on an algorithmic description. A simple check against the original code (now here: https://www.microchip.com/wwwAppNotes/AppNotes.aspx?appnote=en591172) should show that the sequence is somewhat different and that the number of operations are also not quite the same.

There is therefore no license issue with the latest patch in my opinion. Of course I would appreciate it if someone else can do an independent verification of this.

Jeppe Johansen

2019-06-02 15:17

developer   ~0116532

Applied. I added the check for division by zero though.

Christo Crause

2019-06-02 18:59

reporter   ~0116537

Thanks Jeppe

Issue History

Date Modified Username Field Change
2017-07-03 21:42 Christo Crause New Issue
2017-07-03 21:42 Christo Crause File Added: integermath.pas.zip
2017-07-03 21:43 Christo Crause Tag Attached: AVR
2017-07-03 21:53 Florian Note Added: 0101470
2017-07-03 22:02 Thaddy de Koning Note Added: 0101471
2017-07-03 22:04 Thaddy de Koning Note Edited: 0101471 View Revisions
2017-07-04 13:16 Christo Crause Note Added: 0101487
2017-07-04 13:29 Christo Crause Note Edited: 0101487 View Revisions
2017-07-04 16:46 Thaddy de Koning Note Added: 0101492
2017-07-06 19:44 Georg Hieber Note Added: 0101553
2017-07-12 09:31 Christo Crause Note Added: 0101669
2017-07-16 08:17 Christo Crause File Added: avr_div.patch
2017-07-16 08:20 Christo Crause Note Added: 0101724
2017-07-16 12:33 Christo Crause Note Added: 0101725
2017-07-16 12:34 Christo Crause File Added: avr_div-2.patch
2019-04-18 11:49 Dimitrios Chr. Ioannidis Note Added: 0115642
2019-04-18 12:40 Dimitrios Chr. Ioannidis Note Added: 0115643
2019-04-18 19:46 Christo Crause Note Added: 0115658
2019-06-02 15:17 Jeppe Johansen Assigned To => Jeppe Johansen
2019-06-02 15:17 Jeppe Johansen Status new => resolved
2019-06-02 15:17 Jeppe Johansen Resolution open => fixed
2019-06-02 15:17 Jeppe Johansen Fixed in Version => 3.3.1
2019-06-02 15:17 Jeppe Johansen Fixed in Revision => 42162
2019-06-02 15:17 Jeppe Johansen FPCTarget => 3.2.0
2019-06-02 15:17 Jeppe Johansen Note Added: 0116532
2019-06-02 18:59 Christo Crause Status resolved => closed
2019-06-02 18:59 Christo Crause Note Added: 0116537