
To understand how they are stored, you must first understand what they are and what kind of values they are intended to handle.

Unlike integers, a floating-point value is intended to represent extremely small values as well as extremely large ones. For normal 32-bit floating-point values, this corresponds to values in the range from 1.175494351 * 10^-38 to 3.40282347 * 10^+38.

Clearly, using only 32 bits, it's not possible to store every digit in such numbers.

When it comes to the representation, you can see all normal floating-point numbers as a value in the range 1.0 to (almost) 2.0, scaled with a power of two. For example, -5.0 is -1.25 * 2^2.

So, what is needed to encode this as efficiently as possible? What do we really need? The sign, the exponent (the power of two), and the value in the range 1.0 to (almost) 2.0. The last of these is known as the "mantissa" or the significand.

This is encoded as follows, according to the IEEE-754 floating-point standard.

The sign is a single bit.

The exponent is stored as an unsigned integer; for 32-bit floating-point values, this field is 8 bits. 1 represents the smallest exponent and "all ones - 1" the largest. (0 and "all ones" are used to encode special values, see below.) A value in the middle (127, in the 32-bit case) represents zero; this is also known as the bias.

When looking at the mantissa (the value between 1.0 and (almost) 2.0), one sees that all possible values start with a "1" (both in the decimal and binary representation). This means that there's no point in storing it. The rest of the binary digits are stored in an integer field; in the 32-bit case this field is 23 bits.

In addition to the normal floating-point values, there are a number of special values:

Zero - encoded with both exponent and mantissa as zero. The sign bit is used to represent "plus zero" and "minus zero". A minus zero is useful when the result of an operation is extremely small, but it's still important to know which direction the operation came from.

Plus and minus infinity - represented using an "all ones" exponent and a zero mantissa field.

Not a Number (NaN) - represented using an "all ones" exponent and a non-zero mantissa.

Denormalized numbers - numbers smaller than the smallest normal number, represented using a zero exponent field and a non-zero mantissa. The special thing about these numbers is that the precision (i.e. the number of significant bits) drops as the value gets smaller, because the leading bits of the mantissa are then zeros.
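The 32-bit layout described here (1 sign bit, 8 exponent bits with a bias of 127, 23 stored mantissa bits) can be explored with a short Python sketch. The function names `decode_float32`, `classify` and `value_of` are made up for this illustration and are not part of any library; only the standard `struct` module is used. It is a minimal sketch, not a complete IEEE-754 implementation.

```python
import struct


def decode_float32(x):
    """Round x to a 32-bit float and split it into (sign, exponent, mantissa)."""
    # Pack as a big-endian 32-bit float, then reread the same 4 bytes
    # as an unsigned 32-bit integer to get at the raw bits.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    mantissa = bits & 0x7FFFFF       # 23 bits, leading "1" not stored
    return sign, exponent, mantissa


def classify(sign, exponent, mantissa):
    """Name the kind of value encoded by the three fields."""
    if exponent == 0:
        return "zero" if mantissa == 0 else "denormalized"
    if exponent == 0xFF:  # "all ones"
        return "infinity" if mantissa == 0 else "nan"
    return "normal"


def value_of(sign, exponent, mantissa):
    """Rebuild the value of a zero, normal or denormalized number."""
    if exponent == 0:
        # Zero and denormalized numbers: no implicit leading 1, and the
        # exponent is fixed at that of the smallest normal number.
        significand = mantissa / 2**23
        power = 1 - 127
    else:
        # Normal numbers: implicit leading 1, exponent minus the bias.
        significand = 1 + mantissa / 2**23
        power = exponent - 127
    return (-1) ** sign * significand * 2.0 ** power


if __name__ == "__main__":
    for x in (1.0, 2.0, -5.0, -0.0, float("inf"), 1e-45):
        fields = decode_float32(x)
        print(x, fields, classify(*fields))
```

For instance, `decode_float32(-5.0)` gives `(1, 129, 0x200000)`: the sign bit is set, the exponent is 2 + 127, and the mantissa field holds the 0.25 from 1.25.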
