5.4 Digression: Endianness and Instruction width

Hold on a second, the previous section left something unclear. When we write the instruction mov $23, label1, how do we know the width of the operands? How do we know which bytes are copied?

5.4.1 Operand width

The “width” of an operand is the number of bits it occupies. In 32-bit mode, it can be 8, 16 or 32. The instruction mov $23, label1 does not indicate the width at all. Note that the value 23 needs at least 5 bits to represent (0b10111), but it can also be represented with more bits by adding leading zeros.

In other words, is the source operand 0b00010111, 0b00000000 00010111, or 0b00000000 00000000 00000000 00010111?

The width is indicated by a suffix on the instruction name: b for 8 bits (movb), w for 16 bits (movw) or l for 32 bits (movl). In the absence of a width indicator, the default is l (32 bits).

As a general rule, always use a width indicator. Using the wrong width can cause interesting but obscure problems, which we will explore in other modules.
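To make the width indicators concrete, here is a minimal sketch of the same store written with each of the three suffixes. It assumes the GNU assembler, 32-bit x86 and Linux. The definition of label1 as a 4-byte variable and the exit sequence at the end are assumptions for illustration, not taken from the previous section.

```asm
# Sketch only: assumes GNU as, 32-bit x86, AT&T syntax, and Linux.
        .data
label1: .long 0              # assumed definition: 4 bytes, all zero

        .text
        .globl _start
_start:
        movb $23, label1     # b: stores 8 bits  (1 byte)  at label1
        movw $23, label1     # w: stores 16 bits (2 bytes) starting at label1
        movl $23, label1     # l: stores 32 bits (4 bytes) starting at label1

        movl $1, %eax        # Linux exit system call number (assumption)
        movl $0, %ebx        # exit status 0
        int  $0x80           # ask the kernel to end the program
```

On a 64-bit Linux host, this could be assembled with as --32 demo.s -o demo.o and linked with ld -m elf_i386 demo.o -o demo, where the file name demo.s is hypothetical.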

5.4.2 Endianness

Now that we know how to specify the width of operands, how will the bytes be organized? After all, the 16-bit value 0b00000000 00010111 occupies two bytes. Which byte has the lowest address?

Well, the processor industry cannot agree on this. Some manufacturers, such as Motorola, decided to put the most significant byte (MSB) first, at the lowest address. Others, such as Intel, decided to put the least significant byte (LSB) first, at the lowest address.

Little Endian refers to the Intel method, which is summarized as “least significant byte at the lowest address”. Big Endian, of course, means the other way around, “most significant byte at the lowest address”.

This can confuse people who are just getting into this, mostly because when we write a number (in any base), we put the most significant digits on the left-hand side. At the same time, when we list byte values using a debugger or the .byte directive, we list the bytes in order of increasing address. On a little-endian machine, the least significant byte therefore appears first in such a listing, the reverse of how the number itself is written.
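One way to see the byte order for yourself is to store a multi-byte value and then read back only the byte at its lowest address. The following is a minimal sketch under the same assumptions as before (GNU assembler, 32-bit x86, Linux). The label value and the constant 0x11223344 are hypothetical, and the program uses registers and a system call that may not have been introduced yet.

```asm
# Sketch only: assumes GNU as, 32-bit x86, AT&T syntax, and Linux.
        .data
value:  .long 0x11223344     # hypothetical 32-bit test value

        .text
        .globl _start
_start:
        movl $0, %ebx        # clear all 32 bits of ebx
        movb value, %bl      # copy only the byte at the lowest address
        movl $1, %eax        # Linux exit system call number (assumption)
        int  $0x80           # exit status is the low byte of ebx
```

Because x86 is little endian, the byte at the lowest address is the least significant one, 0x44, so running the program and then typing echo $? in the shell shows 68 (0x44 in decimal). A big-endian processor would hold 0x11 at that address instead.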

As a test, see if you can understand why the following directives define the same byte sequence: