# Integer Representations

在学校教科书告诉我们原码,反码,补码的规则来计算正负数在计算机的表示。

原码:将一个整数,转换成二进制,就是其原码。
如单字节的 5 的原码为:0000 0101;-5 的原码为 1000 0101。
反码:正数的反码就是其原码;负数的反码是将原码中,除符号位以外,每一位取反。
如单字节的 5 的反码为:0000 0101;-5 的反码为 1111 1010。
补码:正数的补码就是其原码;负数的反码 + 1 就是补码。
如单字节的 5 的补码为:0000 0101;-5 的补码为 1111 1011。
在计算机中,正数是直接用原码表示的,如单字节 5,在计算机中就表示为:0000 0101。
负数用补码表示,如单字节 - 5,在计算机中表示为 1111 1011。

规则不仅复杂抽象不说,也无助于我们理解无符号和有符号的转换,无符号与有符号的扩展。当我阅读 CSAPP Chapter 2 Representing and Manipulating Information,忽然如沐春风,豁然开朗。如下是部分笔记。

# Unsigned and Two’s-Complement Encodings

Assume we have an integer data type of ω\omega bits. We write a bit vector as either x\vec{x}, to denote the entire vector, or as [xw1,xw2,...,x0][x_{w-1}, x_{w-2}, ..., x_0 ] to denote the individual bits within the vector. Treating x\vec{x}as a number written in binary notation, we obtain the unsigned interpretation of x\vec{x}. We express this interpretation as a function B2UwB2U_w (for “binary to unsigned”, length ω\omega)

\begin{equation} B2U_w(\vec{x}) = \sum_{i = 0}^{w-1} x_i 2^i \end{equation}

We wish to represent negative values as well by two’s-complement form. This is defined by interpreting the most significant bit of the word to have negative weight. We express this interpretation as a function B2TwB2T_w(for “binary to two’s-complement” length ω\omega )

\begin{equation} B2T_w(\vec{x}) = -x_{w - 1} 2^{w-1} + \sum_{i = 0}^{w-2} x_i 2^i \end{equation}

Problem: find the B2Tw(5)B2T_w(-5)

\begin{align} \because {5 + -5 = 0} \\ 0000 \space 0101 + 1\space\underbrace{\vec{x_6, x_5, ..., x_0}}_\text{x} = 0 \\ \underbrace{\vec{x_6, x_5, ..., x_0}}_\text{x} = 1000 \space 000 - 0000 \space 0101 \\ \underbrace{\vec{x_6, x_5, ..., x_0}}_\text{x} = 0111 \space 1011\\ \therefore {B2T_w(-5) = 1111 \space 1011} \end{align}

# Conversions Between Singed and Unsigned

\begin{gather} U2B_w(x) = B2U_w^{-1}(x) \\ T2B_w(x) = B2T_w^{-1}(x) \end{gather}

These functions give the unsigned or two’s-complement bit patterns for a numeric value. Consider the function

\begin{gather} U2T_w(x) = B2T_w(U2B_w(x)) \end{gather}

which takes a number between 0 and 2w12^w - 1 and yields a number between [2w1,2w1][-2^{w - 1},\space 2^{w - 1}], where the two numbers have identical bit representations, except that the argument is unsigned, while the result has a two’s-complement representation. Conversely, the function

\begin{gather} T2U_w(x) = B2U_w(T2B_w(x)) \end{gather}

yields the unsigned number having the same bit representation as the two’s-complement value of x.

To get a better understanding of the relation between a signed number xx and its unsigned counterpart T2Uw(x)T2U_w(x), we can use the fact that they have identical bit representations to derive a numerical relationship.

\begin{align} B2U_w(\vec{x}) - B2T_w(\vec{x}) = x_{w-1}(2^{w-1} - - 2^{w-1}) = x_{w-1}2^w \\ \end{align}

This gives a relationship B2Uw(x)=xw12w+B2Tw(x)\textcolor{red}{B2U_w(\vec{x}) = x_{w-1}2^w + B2T_w(\vec{x})}. If we let x=B2Tw(x)x = B2T_w(\vec{x}), we then have

\begin{equation} B2U_w(T2B_w(x)) = T2U_w(x) = x_{w-1}2^w + x \end{equation}

This relationship is useful for proving relationships between unsigned and two’s-complement arithmetic. In the two’s-complement representation of xx, bit xw1x_{w-1} determines whether or not xx is negative, giving

\begin{equation} T2U_w(x) = \begin{cases} x + 2^w, & x \lt 0\\ x, & x \ge 0 \end{cases} \end{equation}


Going in the other direction, we wish to derive the relationship between an unsigned number xx and its signed counterpart U2Tw(x).U2T_w(x). If we let x=B2Ux(x)x = B2U_x(\vec{x}) , then x=U2B(x)\vec{x} = U2B(x) we have

\begin{equation} B2T_w(U2B_w(x)) = U2T_w(x) = -x_{w-1}2^w + x \end{equation}

In the unsigned representation of xx, bit xw1x_{w-1} determines whether or not xx is greater than or equal to 2w12^{w-1}, giving

\begin{equation} U2T_w(x) = \begin{cases} x, & x \lt 2^{w-1}\\ x - 2^w, & x \ge 2^{w-1} \end{cases} \end{equation}

# 2.2.5 Expanding the Bit Representation of a Number

zero extension: To convert an unsigned number to a larger data type, we simply add leading 0s to the representation.

sign extension: To convert an signed(two’s-complement) number to a larger data type, , add leading most significant bit to the representation.

\begin{align} B2T_{w+1}([x_{w-1}, x_{w-1}, x_{w-2},...x_0]) &= -x_{w-1}2^w + [\sum_0^{w-1}x_i2^i] \\ &= -x_{w-1}2^w + x_{w-1}2^{w-1} +[\sum_0^{w-2}x_i2^i] \\ & = -x_{w-1}2^{w-1} +[\sum_0^{w-2}x_i2^i] \\ & = B2T_w([x_{w-1},x_{w-2},...x_0] \end{align}

# Truncating Numbers

\begin{align} B2U_w([x_{w-1}, x_{w-2},...x_0]) \space mod \space 2^k &= [\sum_0^{w-1}x_i2^i]\space mod\space 2^k \\ &= [\sum_0^{k-1}x_i2^i] \space mod \space 2^k \\ & = [\sum_0^{k-1}x_i2^i] \\ & = B2U_w([x_{k-1},x_{k-2},...x_0] \end{align}

# 2.4.1 Fractional Binary Numbers

Fractional binary notation:

\begin{equation} b = b_mb_{m-1}...b_1b_0.b_{-1}...b_{-n} = \sum_{i=-n}^mi^i * b_i \end{equation}

This notation is not efficient for representing 521005 * 2^{100}, which would consist of the bit pattern 101 followed by 100 zeros. Instead, we would like to represent numbers in a form x2yx * 2^y by giving the values of xx and yy.

The IEEEE floating-point standard represents a number in a form v=(1)sM2Ev = (-1)^s * M * 2^E

  • The sign s determines whether the number is negative ( s = 1) or positive (s = 0).
  • The significant M is a fractional binary number that ranges either between 1 and 2ϵ2 - \epsilon or between 0 and 1ϵ1 - \epsilon
  • The exponent E weights the value by a (possibly negative ) power of 2.

The bit representation of a floating-point number is divided into three fields to encode these values:

  • The single sign bit s directly encodes the sign s.
  • The k-bit exponent field exp=3k1...e1e0exp = 3_{k-1}...e_1e_0 encodes the exponent E.
  • The n-bit fraction field frac=fn1...f1f0frac = f_{n - 1}...f_1f_0 encodes the significant M.

Normalized Values

The bit pattern of expexp is neither all 0s nor all 1s. The exponent value is E=eBiasE = e - Bias where ee is the unsigned number having bit representation e=ek1...e1e0e = e_{k-1}...e_1e_0, and Bias is a bias value bias=2k11bias = 2^{k - 1} - 1. M=1+0.fn1...f1f0M = 1 + 0.f_{n-1}...f_1f_0

Denormalized Values

When the exponent field is all 0s.

  • E=1BiasE = 1 - Bias
  • M=fM = f

Special Values

  • Infinity: When the exponent field is all 1s and the fraction field is all 0s.
  • NaN: When the exponent field is all 1s but the fraction field is nonzero.

Problem 2.33: k = 2, n = 2, bias=2211=1bias = 2^{2 - 1} -1 = 1

BitseEfMV
0 00 0000 ( 1 - bias)000
0 00 0100 ( 1 - bias)1/41/41/4
0 00 1000 ( 1 - bias)2/42/42/4
0 00 1100 ( 1 - bias)3/43/43/4
0 01 0010 ( e - bias)04/44/4
0 01 01101/45/45/4
0 01 10102/46/46/4
0 01 11103/47/47/4
0 10 002104/48/4
0 10 01211/45/410/4
0 10 10212/46/412/4
0 10 11213/47/414/4
0 11 00----++\infty
0 11 01----NaN