KoblentsBlog Photography
Contact About
Ches
Bitwise Shifting Pitfalls
There are a few pitfalls in shifting. The issues appear when:
  • Programmer is unclear on logical versus arithmetic shifting
  • Programmer uses signed types
  • Programmer uses shifting to optimize divide/multiply by 2
The short version is this:
  • A left shift is a multiplication by 2n (with possible overflow.)
  • A right shift is division by 2n with caveats and pitfalls.
  • For left shifting, arithmetic and logical shifting is identical.
  • Arithmetic right shift will shift, then pad MSB with previous MSB.
  • Logical right shift will shift, then pad MSB with 0.
  • For unsigned values, arithmetic and logical right shift is identical.
  • For positive signed values, arithmetic and logical right shift is identical.
  • For negative values, arithmetic right shift is division by 2n rounding down.
  • For negative values, signed division by 2n rounds to 0.
  • Right shift for C is logical or arithmetic depending on the compiler behavior.
The full details are beyond the scope of this article. Sources:
http://en.wikipedia.org/wiki/Arithmetic_shift
http://en.wikipedia.org/wiki/Logical_shift
Here is a (hopefully) exhaustive example of the above cases. Let's first define some types:
12
typedef signed char int8;		// one signed byte
typedef unsigned char uint8;		// one unsigned byte
Showing logical shifting:
123456
uint8 x = 0b00001000;			// x = 0x08 = 8 base 10
printf("Shift left 1: %d\n", x << 1);	// x * 2^1 == 16 == 0b0001 0000
printf("Shift left 2: %d\n", x << 2);	// x * 2^2 == 32 == 0b0010 0000
printf("Shift right 1: %d\n", x >> 1);	// x / 2^1 == 4  == 0b0000 0100
printf("Shift right 2: %d\n", x >> 2);	// x / 2^2 == 2  == 0b0000 0010
printf("Shift right 4: %d\n", x >> 4);	// x / 2^4 == 0  == 0b0000 0000

As you can see, if you shift too far right, the data is lost: integer division of 8 by 16 is 0.
Showing arithmetic shifting:
123456
int8 x = 0b00001000;			// x = 0x08 = 8 base 10
printf("Shift left 1: %d\n", x << 1);	// x * 2^1 == 16 == 0b0001 0000
printf("Shift left 2: %d\n", x << 2);	// x * 2^2 == 32 == 0b0010 0000
printf("Shift right 1: %d\n", x >> 1);	// x / 2^1 == 4  == 0b0000 0100
printf("Shift right 2: %d\n", x >> 2);	// x / 2^2 == 2  == 0b0000 0010
printf("Shift right 4: %d\n", x >> 4);	// x / 2^4 == 0  == 0b0000 0000

So far so same, because left shifting is identical, and right shifting is identical for positive signed values. Let's try some negatives.
Let's have x be -6. Recall how to do two's complement:
	1. 6			0b0000 0110
	2. Invert		0b1111 1001
	3. Add one		0b1111 1010
Now, how do we arithmetic shift right signed numbers?
x >> n | ~(~0U >> n)

How does it work?
	1. ~0U = unsigned 0b111...1
	2. unsigned forces unsigned shift right = logical shift right
	3. ~0U >> n == 0b0..01111..1 with n leading 0s
	4. ~(~0U >> n) == 0b1..1000..0 with n leading 1s
	5. When ORed with x >> n, this means that the n leading bits will now be 1.
Example:
	let x			= 0b1111 1011 or -5 base 10
	shift 1 right:
	x >> 1			= 0b0111 1101
	~0U			= 0b1111 1111
	~0U >> 1		= 0b0111 1111
	~(~0U >> 1)		= 0b1000 0000
	x >> 1 | ~(~0U >> 1)	= 0b0111 1101 | 0b1000 0000
				= 0b1111 1101

And as we can see, the MSB becomes 1.
This is the meat and potatoes of this article. Here is an example illustrating an arithmentic shift right on a negative (signed) number:
1234567891011
int8 x = 0b11111010;			// x = -6 base 10
int8 temp;

printf("Shift left 1: %d\n", x << 1);	// x * 2^1 == -12 == 0b1111 0100
printf("Shift left 2: %d\n", x << 2);	// x * 2^2 == -24 == 0b1110 1000
temp = x >> 1 | ~(~0U >> 1);
printf("Shift right 1: %d\n", temp);	// x / 2^1 ==  -3 == 0b1111 1101
temp = x >> 2 | ~(~0U >> 2);
printf("Shift right 2: %d\n", temp);	// x / 2^2 ==  -1 == 0b1111 1111
temp = x >> 3 | ~(~0U >> 3);
printf("Shift right 3: %d\n", temp);	// x ^ 2^3 ==  -1 == 0b1111 1111
And here you expect 0, but instead get -1.
All of this has been fairly theoretical. You have some code, but you may not believe how it translates from C to x86 - and you're right to be skeptical, as this is all compiler-specific, though hopefully the code above forces most common compilers to behave the same. Regardless, let's see some real ASM, that you can compile and run right now, along with output. First, here's output:
12345678910111213
shl  8, 1 ==   16       Logical shifted 8 left 1, result: 16
shl  8, 2 ==   32       Logical shifted 8 left 2, result: 32
shl  8, 4 ==  128       Logical shifted 8 left 4, result: 128
sal -8, 1 ==  -16       Arithmetic shifted -8 left 1, result: -16
sal -8, 2 ==  -32       Arithmetic shifted -8 left 2, result: -32
sal -8, 4 == -128       Arithmetic shifted -8 left 4, result: -128
shr  8, 1 ==    4       Logical shifted 8 right 1, result: 4
shr  8, 2 ==    2       Logical shifted 8 right 2, result: 2
shr  8, 4 ==    0       Logical shifted 8 right 4, result: 0
sar -8, 1 ==   -4       Arithmetic shifted -8 right 1, result: -4
sar -8, 2 ==   -2       Arithmetic shifted -8 right 2, result: -2
sar -8, 4 ==   -1       Arithmetic shifted -8 right 4, result: -1
sar -8, 8 ==   -1       Arithmetic shifted -8 right 8, result: -1
As you can see, if we arithmetic shift right a negative (signed, obviously) number, we will eventually get -1 instead of 0. This is not the right result! To try this yourself, the makefile to run the code is trivial:
1234567891011121314
TARGET = shift
OBJECTS = shift.o

all: $(OBJECTS)
	ld $(OBJECTS) -o $(TARGET) -lc \
		--dynamic-linker /lib64/ld-linux-x86-64.so.2

%.o: %.asm
	nasm -f elf64 $< -o $@

clean:
	rm -rf *.o;
	rm -rf *.lst;
	rm -rf $(TARGET)
Finally, here's the code, saved as shift.asm:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143
section .text			; program test
	global	_start		; _start is default; export for linker
	extern	printf

_start:				; _start (default start point) sub/func
	mov	r14, 0x08	; start value: +8
	mov	cl, 1		; shift 1
	call	left_logic	; shift left, logical
	mov	cl, 2		; shift 2
	call	left_logic	; shift left, logical
	mov	cl, 4		; shift 4
	call	left_logic	; shift left, logical
	mov	r14, -0x08	; start value: -8
	mov	cl, 1		; shift 1
	call	left_arith	; shift left, arithmetic
	mov	cl, 2		; shift 2
	call	left_arith	; shift left, arithmetic
	mov	cl, 4		; shift 4
	call	left_arith	; shift left, arithmetic

	
	mov	r14, 0x08	; start value: +8
	mov	cl, 1		; shift 1
	call	right_logic	; shift right, logical
	mov	cl, 2		; shift 2
	call	right_logic	; shift right, logical
	mov	cl, 4		; shift 4
	call	right_logic	; shift right, logical
	mov	r14, -0x08	; start value: -8
	mov	cl, 1		; shift 1
	call	right_arith	; shift right, arithmetic
	mov	cl, 2		; shift 2
	call	right_arith	; shift right, arithmetic
	mov	cl, 4		; shift 4
	call	right_arith	; shift right, arithmetic
	mov	cl, 8		; shift 8
	call	right_arith	; shift right, arithmetic

				; next section is exit routine
	mov	ebx, 0		; exit code: 0 (success)
	mov	eax, 1		; system call: sys_exit
	int	0x80		; call kernel
	
left_logic:			; logical shift left
	mov	r10, r14	; store initial value
	shl	r10, cl		; shift
	xor	rdx, rdx	; clear rdx
	mov	dl, cl		; store shift amount to print later
	mov	rcx, r10	; store initial value to print
	mov	r15, leftsl	; string for logical shifting left
	call	print_int	; print
	mov	r15, leftl	; asm notation for logical shifting left
	call	print_int	; print
	
	ret

left_arith:			; arithmetic shift left
	mov	r10, r14	; store initial value
	sal	r10, cl		; shift
	xor	rdx, rdx	; clear rdx
	mov	dl, cl		; store shift amount to print later
	mov	rcx, r10	; store initial value to print
	mov	r15, leftsa	; string for arithmetic shifting left
	call	print_int	; print
	mov	r15, lefta	; asm notation for arithmetic shifting left
	call	print_int	; print
	
	ret
	
right_logic:			; logical shift right
	mov	r10, r14	; store initial value
	shr	r10, cl		; shift
	xor	rdx, rdx	; clear rdx
	mov	dl, cl		; store shift amount to print later
	mov	rcx, r10	; store initial value to print
	mov	r15, rightsl	; string for logical shifting right
	call	print_int	; print
	mov	r15, rightl	; asm notation for logical shifting right
	call	print_int	; print
	
	ret

right_arith:			; arithmetic shift right
	mov	r10, r14	; store initial value
	sar	r10, cl		; shift
	xor	rdx, rdx	; clear rdx
	mov	dl, cl		; store shift amount to print later
	mov	rcx, r10	; store initial value to print
	mov	r15, rightsa	; string for arithmetic shifting right
	call	print_int	; print
	mov	r15, righta	; asm notation for arithmetic shifting right
	call	print_int	; print
	
	ret


	
print_int:
	push	rax		; store all the registers (overkill? yes)
	push	rbx
	push	rcx
	push	rdx
	push	rsi
	push	rdi
	push	rbp
	push	rsp
	push	r8
	push	r9
	push	r10
	push	r11
	push	r12
	push	r13
	mov	rsi, r14	; value stored in r14
	mov	rdi, r15	; string stored in r15
	mov	eax, 0		; no non-int args
	call	printf		; call printf
	pop	r13		; pop all the registers
	pop	r12
	pop	r11
	pop	r10
	pop	r9
	pop	r8
	pop	rsp
	pop	rbp
	pop	rdi
	pop	rsi
	pop	rdx
	pop	rcx
	pop	rbx
	pop	rax
	ret
				

section .data			; program data

leftl	db	"Logical shifted %d left %d, result: %d", 10, 0
leftsl	db	"shl %2d, %d == %4d", 9, 0
rightl	db	"Logical shifted %d right %d, result: %d", 10, 0
rightsl	db	"shr %2d, %d == %4d", 9, 0
lefta	db	"Arithmetic shifted %d left %d, result: %d", 10, 0
leftsa	db	"sal %2d, %d == %4d", 9, 0
righta	db	"Arithmetic shifted %d right %d, result: %d", 10, 0
rightsa	db	"sar %2d, %d == %4d", 9, 0
This leads us into the next topic: When to use bitwise operators!
Ches Koblents
September 9, 2013
 
« Newer Older »
© Copyright Koblents.com, 2012-2024