AVR201: Using the AVR® Hardware Multiplier

Contents:


Features


Introduction

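The megaAVR is a series of new devices in the AVR RISC Microcontroller family that includes, among other enhancements, a hardware multiplier. The multiplier handles both signed and unsigned integer and fractional numbers without speed or code size penalty. To make the multiplier usable, six new instructions are added to the AVR instruction set. These are:

MUL, multiplication of unsigned integers.
MULS, multiplication of signed integers.
MULSU, multiplication of a signed integer with an unsigned integer.
FMUL, multiplication of unsigned fractional numbers.
FMULS, multiplication of signed fractional numbers.
FMULSU, multiplication of a signed fractional number with an unsigned fractional number.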


8-bit Multiplication

Example 1 – Basic Usage

	in 	r16,PINB 	; Read pin values
	ldi 	r17,5 		; Load 5 into r17
	mul 	r16,r17 	; r1:r0 = r17 * r16
	movw 	r17:r16,r1:r0 	; Move the result to the r17:r16 register pair


Example 2 – Special Cases

	lds 	r0,variableA 	; Load r0 with SRAM variable A
	lds 	r1,variableB 	; Load r1 with SRAM variable B
	mul 	r1,r0 		; r1:r0 = variable A * variable B
	lds 	r0,variableA 	; Load r0 with SRAM variable A
	mul 	r0,r0 		; r1:r0 = square(variable A)


Example 3 – Multiply-accumulate Operation

c(n) = a(n) x b + c(n-1)


; r17:r16 = r18 * r19 + r17:r16
	in 	r18,PINB 	; Get the current pin value on port B
	ldi 	r19,b 		; Load constant b into r19
	muls 	r19,r18 	; r1:r0 = r19 * r18 (signed)
	add 	r16,r0 		; r17:r16 += r1:r0
	adc 	r17,r1
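
The multiply-accumulate step above can be cross-checked on a host with a small C model. This is only a sketch: the function name `mac8x8_16` and the `<stdint.h>` types are assumptions made for illustration and are not part of the application note. It mirrors MULS followed by ADD/ADC, so the 16-bit accumulator wraps modulo 2^16 in the same way r17:r16 does.

```c
#include <stdint.h>

/* Hypothetical host-side model of Example 3 (not from the application note).
 * a, b : signed 8-bit factors, as MULS sees r18 and r19
 * acc  : 16-bit accumulator, as held in r17:r16
 * Returns acc + a*b, wrapped to 16 bits like the ADD/ADC pair. */
uint16_t mac8x8_16(uint16_t acc, int8_t a, int8_t b)
{
    int16_t product = (int16_t)((int16_t)a * (int16_t)b); /* MULS: r1:r0 = a * b */
    return (uint16_t)(acc + (uint16_t)product);           /* ADD/ADC: r17:r16 += r1:r0 */
}
```

Calling `mac8x8_16(100, -5, 7)` returns 65, since 100 + (-35) = 65. Nothing in this model detects accumulator overflow; it simply wraps.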


16-bit Multiplication

When both factors are negative, the two’s complement notation is used;

..


16-bit x 16-bit = 16-bit Operation

Source Code


16-bit x 16-bit = 24-bit Operation

Source Code


16-bit x 16-bit = 32-bit Operation

Example 4 – Basic Usage 16-bit x 16-bit = 32-bit Integer Multiply

	ldi 	R23,HIGH(672)
	ldi 	R22,LOW(672) 	; Load the number 672 into r23:r22
	ldi 	R21,HIGH(1844)
	ldi 	R20,LOW(1844) 	; Load the number 1844 into r21:r20
	call 	mul16x16_32 	; Call 16bits x 16bits = 32bits multiply routine
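
A 16-bit x 16-bit product has to be assembled from four 8-bit x 8-bit products, because the hardware multiplier only takes 8-bit operands. The C sketch below shows that decomposition at the value level. The function name `mul16x16_32_model` is an assumption chosen for illustration; it is a host-side reference for the result to expect from the assembly routine, not a replacement for it.

```c
#include <stdint.h>

/* Hypothetical host-side model (names are illustrative, not from the note).
 * Builds a 16 x 16 = 32-bit unsigned product from four 8 x 8 = 16-bit
 * products:
 *   a * b = (ah*bh << 16) + (ah*bl << 8) + (al*bh << 8) + al*bl */
uint32_t mul16x16_32_model(uint16_t a, uint16_t b)
{
    uint8_t al = (uint8_t)(a & 0xFF), ah = (uint8_t)(a >> 8);
    uint8_t bl = (uint8_t)(b & 0xFF), bh = (uint8_t)(b >> 8);

    uint32_t result = ((uint32_t)ah * bh) << 16;  /* high partial product      */
    result += (uint32_t)al * bl;                  /* low partial product       */
    result += ((uint32_t)ah * bl) << 8;           /* cross products are offset */
    result += ((uint32_t)al * bh) << 8;           /* by one byte               */
    return result;
}
```

With the operands from Example 4, `mul16x16_32_model(672, 1844)` returns 1239168.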

Source Code


16-bit Multiply-accumulate Operation

Source Code




Using Fractional Numbers


Example 5 – Basic Usage 8-bit x 8-bit = 16-bit Signed Fractional Multiply

	in 	r16,PINB 	; Read pin values
	ldi 	r17,$B0 	; Load -0.625 into r17
	fmuls 	r16,r17 	; r1:r0 = r17 * r16
	movw 	r17:r16,r1:r0 	; Move the result to the r17:r16 register pair
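
To make the fractional format concrete, the sketch below models the signed fractional multiply in C. It assumes the usual interpretation of the operands as signed 1.7 fractions (raw value divided by 128) and of the result as a signed 1.15 fraction (raw value divided by 32768). The name `fmuls_model` is hypothetical and used only for illustration.

```c
#include <stdint.h>

/* Hypothetical host-side model of FMULS (illustrative names).
 * Operands : signed 1.7 fractions,  value = raw / 128
 * Result   : signed 1.15 fraction,  value = raw / 32768
 * The 16-bit signed product is in 2.14 format; shifting it left by one
 * bit gives 1.15. The only case that overflows is (-1.0) * (-1.0), which
 * wraps to 0x8000 in this model. */
int16_t fmuls_model(int8_t a, int8_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;            /* 2.14 format      */
    return (int16_t)(uint16_t)((uint32_t)product << 1);   /* keep low 16 bits */
}
```

For instance, $B0 is -80 as a signed byte, that is -0.625. Multiplying it by 0.5 (raw 64) gives `fmuls_model(-80, 64) == -10240`, and -10240 / 32768 = -0.3125.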


Example 6 – Multiply-accumulate Operation

The example below uses data from the ADC. The ADC should be configured so that the format of the ADC result is compatible with the fractional two’s complement format. For the ATmega83/163, this means that the ADLAR bit in the ADMUX I/O register is set and a differential channel is used. (The ADC result is normalized to one.)

	ldi 	r23,$62 	; Load highbyte of fraction 0.771484375
	ldi 	r22,$C0 	; Load lowbyte of fraction 0.771484375
	in 	r20,ADCL 	; Get lowbyte of ADC conversion
	in 	r21,ADCH 	; Get highbyte of ADC conversion
	call 	fmac16x16_32 	; Call routine for signed fractional multiply accumulate

The registers R19:R18:R17:R16 will be incremented by the result of multiplying 0.771484375 with the ADC conversion result. In this example, the ADC result is treated as a signed fractional number. We could also treat it as a signed integer and call “mac16x16_32” instead of “fmac16x16_32”. In this case, the 0.771484375 should be replaced with an integer.
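
The accumulate step described above can be sketched in C as follows. The model follows the USAGE line of the routine, `r19:r18:r17:r16 += (r23:r22 * r21:r20) << 1`, treating the operands as signed 1.15 fractions and the accumulator as a 1.31 fraction. The name `fmac16x16_32_model` is hypothetical, and the sketch does not claim to reproduce the routine's intermediate carries exactly.

```c
#include <stdint.h>

/* Hypothetical host-side model of a signed fractional multiply-accumulate
 * (illustrative; mirrors "acc += (a * b) << 1" from the USAGE line).
 * a, b : signed 1.15 fractions (value = raw / 32768)
 * acc  : 1.31 accumulator held as raw 32 bits, wraps modulo 2^32 */
uint32_t fmac16x16_32_model(uint32_t acc, int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* 2.30 format          */
    return acc + ((uint32_t)product << 1);       /* 1.31, wraps silently */
}
```

The coefficient $62C0 is 25280 / 32768 = 0.771484375. Accumulating one sample of 0.5 (raw $4000) into an empty accumulator gives `fmac16x16_32_model(0, 0x62C0, 0x4000) == 0x31600000`, which is 0.3857421875 in 1.31 format.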


Comment on Implementations

All 16-bit x 16-bit = 32-bit functions implemented here start by clearing the R2 register, which is just used as a “dummy” register with the “add with carry” (ADC) and “subtract with carry” (SBC) operations. These operations do not alter the contents of the R2 register. If the R2 register is not used elsewhere in the code, it is not necessary to clear the R2 register each time these functions are called, but only once prior to the first call to one of the functions.
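
The role of R2 as a zero register can be illustrated with a small register-level model in C. The sketch below steps through the same instruction sequence as mul16x16_32, modelling only MUL, MOVW, ADD and ADC. The type `avr_t` and the helpers `op_mul` and `op_add` are assumptions invented for this illustration. Because R2 holds zero, `adc r19, r2` adds nothing but the carry from the previous byte.

```c
#include <stdint.h>

/* Hypothetical register-level model (illustrative only).
 * r[] stands for the register file, carry for the C flag. */
typedef struct { uint8_t r[32]; uint8_t carry; } avr_t;

/* MUL Rd,Rr : r1:r0 = Rd * Rr (unsigned), C = bit 15 of the product */
static void op_mul(avr_t *c, int d, int s)
{
    uint16_t p = (uint16_t)(c->r[d] * c->r[s]);
    c->r[0] = (uint8_t)p;
    c->r[1] = (uint8_t)(p >> 8);
    c->carry = (uint8_t)(p >> 15);
}

/* ADD (with_carry = 0) or ADC (with_carry = 1) : Rd = Rd + Rr [+ C] */
static void op_add(avr_t *c, int d, int s, int with_carry)
{
    uint16_t sum = (uint16_t)(c->r[d] + c->r[s] + (with_carry ? c->carry : 0));
    c->r[d] = (uint8_t)sum;
    c->carry = (uint8_t)(sum >> 8);
}

uint32_t mul16x16_32_regs(uint16_t a, uint16_t b)
{
    avr_t c = {{0}, 0};
    c.r[22] = (uint8_t)a; c.r[23] = (uint8_t)(a >> 8);  /* r23:r22 = a */
    c.r[20] = (uint8_t)b; c.r[21] = (uint8_t)(b >> 8);  /* r21:r20 = b */

    c.r[2] = 0;                                /* clr  r2  (zero register) */
    op_mul(&c, 23, 21);                        /* mul  r23, r21  ; ah * bh */
    c.r[18] = c.r[0]; c.r[19] = c.r[1];        /* movw r19:r18, r1:r0      */
    op_mul(&c, 22, 20);                        /* mul  r22, r20  ; al * bl */
    c.r[16] = c.r[0]; c.r[17] = c.r[1];        /* movw r17:r16, r1:r0      */
    op_mul(&c, 23, 20);                        /* mul  r23, r20  ; ah * bl */
    op_add(&c, 17, 0, 0);                      /* add  r17, r0             */
    op_add(&c, 18, 1, 1);                      /* adc  r18, r1             */
    op_add(&c, 19, 2, 1);                      /* adc  r19, r2 ; carry only */
    op_mul(&c, 21, 22);                        /* mul  r21, r22  ; bh * al */
    op_add(&c, 17, 0, 0);                      /* add  r17, r0             */
    op_add(&c, 18, 1, 1);                      /* adc  r18, r1             */
    op_add(&c, 19, 2, 1);                      /* adc  r19, r2 ; carry only */

    return ((uint32_t)c.r[19] << 24) | ((uint32_t)c.r[18] << 16) |
           ((uint32_t)c.r[17] << 8)  |  (uint32_t)c.r[16];
}
```

If R2 held anything other than zero on entry to the final ADC of each group, the top byte of the result would be off by that amount, which is why the routines clear it first.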


Complete AVR201.asm

;**** A P P L I C A T I O N   N O T E   A V R 2 0 1 ***************************
;*
;* Title		: 16bit multiply routines using hardware multiplier
;* Version		: V2.0
;* Last updated		: 10 Jun, 2002
;* Target		: Any AVR with HW multiplier
;*
;* Support email	: avr@atmel.com
;*
;* DESCRIPTION
;* 	This application note shows a number of examples of how to implement
;*	16bit multiplication using hardware multiplier. Refer to each of the
;*	function headers for details. The functions included in this file
;*	are :
;*
;*	mul16x16_16	- Multiply of two 16bits numbers with 16bits result.
;*	mul16x16_32	- Unsigned multiply of two 16bits numbers with 32bits
;*			  result.
;*	mul16x16_24	- Unsigned multiply of two 16bits numbers with 24bits
;*			  result.
;*	muls16x16_32	- Signed multiply of two 16bits numbers with 32bits
;*			  result.
;*	muls16x16_24	- Signed multiply of two 16bits numbers with 24bits
;*			  result.
;*	mac16x16_24	- Signed multiply accumulate of two 16bits numbers
;*			  with a 24bits result.
;*	mac16x16_32	- Signed multiply accumulate of two 16bits numbers
;*			  with a 32bits result.
;*	fmuls16x16_32	- Signed fractional multiply of two 16bits numbers
;*			  with 32bits result.
;*	fmac16x16_32	- Signed fractional multiply accumulate of two 16bits
;*			  numbers with a 32bits result.
;*
;******************************************************************************



;******************************************************************************
;*
;* FUNCTION
;*	mul16x16_16
;* DESCRIPTION
;*	Multiply of two 16bits numbers with 16bits result.
;* USAGE
;*	r17:r16 = r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	9 + ret
;*	Words :		6 + ret
;*	Register usage: r0, r1, r16, r17 and r20 to r23 (8 registers)
;* NOTE
;*	Full orthogonality i.e. any register pair can be used as long as
;*	the result and the two operands do not share register pairs.
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

mul16x16_16:
	mul	r22, r20		; al * bl
	movw	r17:r16, r1:r0
	mul	r23, r20		; ah * bl
	add	r17, r0
	mul	r21, r22		; bh * al
	add	r17, r0
	ret

;******************************************************************************
;*
;* FUNCTION
;*	mul16x16_32
;* DESCRIPTION
;*	Unsigned multiply of two 16bits numbers with 32bits result.
;* USAGE
;*	r19:r18:r17:r16 = r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	17 + ret
;*	Words :		13 + ret
;*	Register usage: r0 to r2 and r16 to r23 (11 registers)
;* NOTE
;*	Full orthogonality i.e. any register pair can be used as long as
;*	the 32bit result and the two operands do not share register pairs.
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

mul16x16_32:
	clr	r2
	mul	r23, r21		; ah * bh
	movw	r19:r18, r1:r0
	mul	r22, r20		; al * bl
	movw	r17:r16, r1:r0
	mul	r23, r20		; ah * bl
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	mul	r21, r22		; bh * al
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

;******************************************************************************
;*
;* FUNCTION
;*	mul16x16_24
;* DESCRIPTION
;*	Unsigned multiply of two 16bits numbers with 24bits result.
;* USAGE
;*	r18:r17:r16 = r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	14 + ret
;*	Words :		10 + ret
;*	Register usage: r0 to r1, r16 to r18 and r20 to r23 (9 registers)
;* NOTE
;*	Full orthogonality i.e. any register pair can be used as long as
;*	the 24bit result and the two operands do not share register pairs.
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

mul16x16_24:
	mul	r23, r21		; ah * bh
	mov	r18, r0
	mul	r22, r20		; al * bl
	movw	r17:r16, r1:r0
	mul	r23, r20		; ah * bl
	add	r17, r0
	adc	r18, r1
	mul	r21, r22		; bh * al
	add	r17, r0
	adc	r18, r1
	ret

;******************************************************************************
;*
;* FUNCTION
;*	muls16x16_32
;* DESCRIPTION
;*	Signed multiply of two 16bits numbers with 32bits result.
;* USAGE
;*	r19:r18:r17:r16 = r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	19 + ret
;*	Words :		15 + ret
;*	Register usage: r0 to r2 and r16 to r23 (11 registers)
;* NOTE
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

muls16x16_32:
	clr	r2
	muls	r23, r21		; (signed)ah * (signed)bh
	movw	r19:r18, r1:r0
	mul	r22, r20		; al * bl
	movw	r17:r16, r1:r0
	mulsu	r23, r20		; (signed)ah * bl
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	mulsu	r21, r22		; (signed)bh * al
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

;******************************************************************************
;*
;* FUNCTION
;*	muls16x16_24
;* DESCRIPTION
;*	Signed multiply of two 16bits numbers with 24bits result.
;* USAGE
;*	r18:r17:r16 = r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	14 + ret
;*	Words :		10 + ret
;*	Register usage: r0 to r1, r16 to r18 and r20 to r23 (9 registers)
;* NOTE
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

muls16x16_24:
	muls	r23, r21		; (signed)ah * (signed)bh
	mov	r18, r0
	mul	r22, r20		; al * bl
	movw	r17:r16, r1:r0
	mulsu	r23, r20		; (signed)ah * bl
	add	r17, r0
	adc	r18, r1
	mulsu	r21, r22		; (signed)bh * al
	add	r17, r0
	adc	r18, r1
	ret

;******************************************************************************
;*
;* FUNCTION
;*	mac16x16_24
;* DESCRIPTION
;*	Signed multiply accumulate of two 16bits numbers with
;*	a 24bits result.
;* USAGE
;*	r18:r17:r16 += r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	17 + ret
;*	Words :		13 + ret
;*	Register usage: r0 to r2, r16 to r18 and r20 to r23 (10 registers)
;*
;******************************************************************************

mac16x16_24:
	clr	r2			; r2 = 0, needed by "adc r18, r2" below
	muls	r23, r21		; (signed)ah * (signed)bh
	add	r18, r0
	mul	r22, r20		; al * bl
	add	r16, r0
	adc	r17, r1
	adc	r18, r2
	mulsu	r23, r20		; (signed)ah * bl
	add	r17, r0
	adc	r18, r1
	mulsu	r21, r22		; (signed)bh * al
	add	r17, r0
	adc	r18, r1
	ret

;******************************************************************************
;*
;* FUNCTION
;*	mac16x16_32
;* DESCRIPTION
;*	Signed multiply accumulate of two 16bits numbers with
;*	a 32bits result.
;* USAGE
;*	r19:r18:r17:r16 += r23:r22 * r21:r20
;* STATISTICS
;*	Cycles :	23 + ret
;*	Words :		19 + ret
;*	Register usage: r0 to r2 and r16 to r23 (11 registers)
;*
;******************************************************************************

mac16x16_32:
	clr	r2
	muls	r23, r21		; (signed)ah * (signed)bh
	add	r18, r0
	adc	r19, r1
	mul	r22, r20		; al * bl
	add	r16, r0
	adc	r17, r1
	adc	r18, r2
	adc	r19, r2
	mulsu	r23, r20		; (signed)ah * bl
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	mulsu	r21, r22		; (signed)bh * al
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

mac16x16_32_method_B:			; uses two temporary registers
					; (r4,r5), but reduces cycles/words
					; by 1
	clr	r2
	muls	r23, r21		; (signed)ah * (signed)bh
	movw	r5:r4, r1:r0
	mul	r22, r20		; al * bl
	add	r16, r0
	adc	r17, r1
	adc	r18, r4
	adc	r19, r5
	mulsu	r23, r20		; (signed)ah * bl
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	mulsu	r21, r22		; (signed)bh * al
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

;******************************************************************************
;*
;* FUNCTION
;*	fmuls16x16_32
;* DESCRIPTION
;*	Signed fractional multiply of two 16bits numbers with 32bits result.
;* USAGE
;*	r19:r18:r17:r16 = ( r23:r22 * r21:r20 ) << 1
;* STATISTICS
;*	Cycles :	20 + ret
;*	Words :		16 + ret
;*	Register usage: r0 to r2 and r16 to r23 (11 registers)
;* NOTE
;*	The routine is non-destructive to the operands.
;*
;******************************************************************************

fmuls16x16_32:
	clr	r2
	fmuls	r23, r21		; ( (signed)ah * (signed)bh ) << 1
	movw	r19:r18, r1:r0
	fmul	r22, r20		; ( al * bl ) << 1
	adc	r18, r2
	movw	r17:r16, r1:r0
	fmulsu	r23, r20		; ( (signed)ah * bl ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	fmulsu	r21, r22		; ( (signed)bh * al ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

;******************************************************************************
;*
;* FUNCTION
;*	fmac16x16_32
;* DESCRIPTION
;*	Signed fractional multiply accumulate of two 16bits numbers with
;*	a 32bits result.
;* USAGE
;*	r19:r18:r17:r16 += (r23:r22 * r21:r20) << 1
;* STATISTICS
;*	Cycles :	25 + ret
;*	Words :		21 + ret
;*	Register usage: r0 to r2 and r16 to r23 (11 registers)
;*
;******************************************************************************

fmac16x16_32:
	clr	r2
	fmuls	r23, r21		; ( (signed)ah * (signed)bh ) << 1
	add	r18, r0
	adc	r19, r1
	fmul	r22, r20		; ( al * bl ) << 1
	adc	r18, r2
	adc	r19, r2
	add	r16, r0
	adc	r17, r1
	adc	r18, r2
	adc	r19, r2
	fmulsu	r23, r20		; ( (signed)ah * bl ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	fmulsu	r21, r22		; ( (signed)bh * al ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

fmac16x16_32_method_B:			; uses two temporary registers
					; (r4,r5), but reduces cycles/words
					; by 2
	clr	r2
	fmuls	r23, r21		; ( (signed)ah * (signed)bh ) << 1
	movw	r5:r4, r1:r0
	fmul	r22, r20		; ( al * bl ) << 1
	adc	r4, r2
	add	r16, r0
	adc	r17, r1
	adc	r18, r4
	adc	r19, r5
	fmulsu	r23, r20		; ( (signed)ah * bl ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	fmulsu	r21, r22		; ( (signed)bh * al ) << 1
	sbc	r19, r2
	add	r17, r0
	adc	r18, r1
	adc	r19, r2
	ret

;**** End of File ****