Table of Contents
Introduction
This document provides a detailed description of pack algorithms used in the current project. Briefly, pack algorithms are used throughout the project to transform data divided into chunks with one parameters into the data divided into chunks with another parameters. The dependence of algorithms on these parameters is discussed further in the Section Algorithms.
Basic notions and assumptions
This section introduces notions we will use throughout the document.
We will assume further that the data to be transformed by pack algorithms is byte-aligned. In addition, the packed data is also considered to be byte-aligned.
We will also suppose that the data (both input and output) is divided into chunks which are the groups of bytes for which the type is language-defined (such as uint8_t
or uint32_t
) or user-defined. Currently, we restrict the chunk type to be integral.
By term endianness we mean the significance order of groups of bytes further called units combined with the significance order of bits inside of each unit. For example, bit_unit_little_bit endianness refers to the most significant unit first order, and each unit contains the bits in the least significant bit first order.
Generally, we have the four following types of endianness:
big_unit_big_bit
endianness refers to the most significant unit first order, and each unit contains the bits in themost significant bit first
order;little_unit_big_bit
endianness refers to the least significant unit first order, and each unit contains the bits in themost significant bit first
order;big_unit_little_bit
endianness refers to the most significant unit first order, and each unit contains the bits in theleast significant bit first
order;little_unit_little_bit
endianness refers to the least significant unit first order, and each unit contains the bits in theleast significant bit first
order.
Note that if the unit is byte, then the first two endiannesses coincide with well-known big-endian and little-endian byte orders. However, we prefer to use the above-introduced classification, since it takes into account the architectures with non-typical endiannesses and, hence, is wider than just the big/little-endian dichotomy.
All the notation not described in this section will be introduced on the fly.
Algorithms
Pack algorithms are intended to transform byte-aligned data divided into chunks of bit size denoted by InputValueBits
into byte-aligned data divided into chunks of bit size denoted by OutputValueBits
. Moreover, we suppose that all input and output data chunks consist of units ordered in accordance with the corresponding endiannesses. We will refer to these endiannesses as InputEndianness
and OutputEndianness
, respectively.
Pack algorithms are divided into the following categories depending on the relation between the chunk sizes:
InputValueBits
<OutputValueBits
. This algorithm combines several small chunks into big one, and is further referred to as imploder. It is also supposed thatOutputValueBits
is a multiple ofInputValueBits
.InputValueBits
>OutputValueBits
. This algorithm splits one big chunk into several small chunks, and is further referred to as exploder. It is also supposed thatInputValueBits
is a multiple ofOutputValueBits
.InputValueBits
=OutputValueBits
. This algorithm transforms data chunk-by-chunk in accordance with the corresponding endianness conversion.
It is important to note that the combining and splitting operations in imploder and exploder algorithms are also dependent on endianness conversion.
Endianness conversion
Consider first the case of little_unit_big_bit
-to-big_unit_big_bit
conversion.
input:
output:
In the example above we suppose that the unit of arrays is byte. It is easy to see, that the value written to the out array is obtained by combining each input chunk byte data in reverse byte order.
It may seem at first look that all same endianness conversions are simplicity itself, but that's not quite true. To dive deeper into the problem of endianness conversion, consider the inverse conversion.
input:
output:
In this example, input
array units are ordered in big_unit_big_bit
endianness and output
array units are ordered in little_unit_big_bit
endianness (supposing that the unit is byte). One can see that in addition to reverse byte order we have the reverse order of input chunks in the out array.
An interested reader may wonder why changing of endiannesses leads to such a strange effect. Well, the answer to this question lies in the following convention: all data divided into chunks with units ordered in big_unit_big_bit
endianness will stay unchanged when tranforming to data with chunk units ordered in big_unit_big_bit
endianness. Let us explain it with the following example.
input:
output:
Here it is easy to see that the data from input
was just concatenated into the output
data with no additional tranformations. Now, notice that the first and the second example described in this section implicitly rely on the above-described convention. In the first example the input data is concatenated in reverse byte order, and in the second example the byte order is reversed after the input data concatenation.
We haven't touched the case of endian conversion with bit reversals yet. Let us see at the following example:
input:
output:
In this example, input
array units are ordered in big_unit_little_bit
endianness and output
array units are ordered in big_unit_big_bit
endianness (supposing that the unit is byte). Writing the byte 0x12
in binary form gives us 00010010
, its reverse binary form is 01001000
, which gives us 0x48
in hex representation. The same transformations are applied to the remaining bytes.
To conclude, there are three types of reversals that we must deal with in pack algorithms:
- data chunk order reversal (as in
big_unit_big_bit
-to-little_unit_big_bit
conversion); - unit order reversal (as in
little_unit_big_bit
-to-big_unit_big_bit
conversion); - bit order reversal (as in
big_unit_little_bit
-to-big_unit_big_bit
conversion).
Data chunk order reversal
In this section we suppose that the chunk type of input and output data is integral.
Data chunk order reversal tranforms a group of consecutive input chunks with units in InputEndianness
order into an output chunk with units in OutputEndianness
order and can be described as follows.
- Check whether
InputEndianness
orOutputEndianness
islittle_bit
. This condition determines the data chunk order reversal presence or absence. (We have already seen how the order of chunks changed inbig_unit_big_bit
-to-little_unit_big_bit
conversion, so this is just the generalization.)
The choice of endianness depends on an algorithm where this step is carried out (imploder or exploder). - If the endianness on the previous step is
little_bit
, set shift equal toOutputBits
- (InputBits
+ already_processed_bits) in the case of imploder, and toInputBits
- (OutputBits
+ already_processed_bits) in the case of exploder. Otherwise, set shift equal to already_processed_bits.
By already_processed_bits we mean the number of already processed (i.e. combined or splitted) input chunks multiplied by the number bits in a byte. The shift is later used either to retrieve or to accumulate input chunks ( see `Imploder`).
Unit order reversal
In this section we consider that the unit is no less than byte.
Unit order reversal transforms the order of units in each input chunk. It can be described as follows:
- If
InputEndianness
andOutputEndianness
have different unit orders, reverse byte order in input chunk and go to the next step. Otherwise, do nothing and return. - Reverse byte order in each input chunk unit.
Note that if unit is byte, the second step can be omitted.
Bit order reversal
In this section we consider that the unit is no less than byte.
Bit order reversal transforms the order of bits in each input chunk unit. It can be described as follows:
- If
InputEndianness
andOutputEndianness
have different bit orders, reverse byte order in each input chunk unit and go to the next step. Otherwise, do nothing and return. - Reverse bit order in each byte of input chunk unit.
Note that if unit is byte, the first step can be omitted.
Imploder
Recall that imploder algorithm deals with the case InputValueBits
< OutputValueBits
and converts data from InputEndianness
to OutputEndianness
order.
There are three main parts of imploder algorithm:
- Calculation of the value that indicates the position of input chunk in the output and indicates data chunk order reversal, if present. This part is currently implemented via shift trait containing the value that depends on whether the output endianness is
little_unit
. - Unit order reversal algorithm. This part is implemented via partial struct specializations which deal with different specific cases.
- Bit order reversal algorithm. This part is implemented via partial struct specializations which deal with different specific cases.
The described process can be written in the following pseudocode:
Here OR
denotes logical OR operation and <<
denotes left shift operation.
Exploder
Exploder algorithm deals with the case InputValueBits
> OutputValueBits
, converts data from InputEndianness
to OutputEndianness
order and is the same as the imploder algorithm described in Section Imploder except for several points:
- the condition of shift choice is replaced with
InputEndianness
instead ofOutputEndianness
; - right shift operation instead of left shift operation is used;
- the output chunk is just the part of input chunk.
The pseudocode of exploder with the above-described changes is presented below.
Equal size case
This algorithm deals with the case InputValueBits
= OutputValueBits
and converts data from InputEndianness
to OutputEndianness
. It uses the aforementioned unit and bit order reversal algorithms, and its pseudocode can be presented as follows: