ALib C++ Library
Library Version: 2402 R1
Documentation generated by doxygen
No Matches
ArrayCompressor Class Reference


This class provides several algorithms to compress arrays of integral data and encode them in BitBuffer objects. Besides a standard Huffman compression , different simple approaches are "tested" and the best compression algorithm is then chosen. The general assumption of the approaches (besides the Huffman coding) is that the data contains "signal data", which is either

  • sparsely filled,
  • has incremental values, or
  • has just values of a certain smaller range. Also combinations of these attributes are matched. Such data is often found in real-world applications and may be compressed much better than the generic Huffman approach may achieve.

Definition at line 73 of file ac.hpp.

#include <ac.hpp>

Inner Type Index:

class  Array
struct  Statistics

Public Type Index:

enum class  Algorithm {
  NONE = 0 , ALL = (1<< QtyAlgorithms) - 1 , Uncompressed = 1 , MinMax = 2 ,
  Sparse = 4 , VerySparse = 8 , Incremental = 16 , Huffman = 32 ,
  END_OF_ENUM = 64

Public Static Field Index:

static constexpr int QtyAlgorithms = 6
 The number of algorithms implemented.

Public Static Method Index:

template<typename TValue >
static std::pair< size_t, AlgorithmCompress (BitWriter &bitWriter, Array< TValue > &data, Algorithm algorithmsToTry=Algorithm::ALL, Statistics *statistics=nullptr)
template<typename TValue >
static void Decompress (BitReader &bitReader, Array< TValue > &data, Algorithm algorithm=Algorithm::ALL, Statistics *statistics=nullptr)

Public Method Index:

 ArrayCompressor ()=delete

Enumeration Details:

◆ Algorithm

enum class Algorithm

This enumeration denotes the different algorithms provided for compression.

With the inclusion of ALib Enums in the ALib Distribution , this enum is defined to be bitwise.
With the inclusion of at least one "ALib Camp" in the ALib Distribution , this enum is furthermore resourced and values are appendable to class AString and logable with ALox .

No compression method selected.


All compression methods selected.


Stores the data as integer values, which includes a simple sort of possible compression as documented with BitWriter::Write(TIntegral) .


Stores the differences between the minimum and maximum value found.


Writes '1' if next value is equal to previous, '0' plus next value otherwise.


Writes the number of following equal or non equal values.


Only distances of the values are written.


Huffman encoding (byte based).

Definition at line 305 of file ac.hpp.

Field Details:

◆ QtyAlgorithms

constexpr int QtyAlgorithms = 6

The number of algorithms implemented.

Definition at line 76 of file ac.hpp.

Constructor(s) / Destructor Details::

◆ ArrayCompressor()

ArrayCompressor ( )

Deleted default constructor (this class can not be created)

Method Details:

◆ Compress()

template<typename TValue >
std::pair< size_t, ArrayCompressor::Algorithm > Compress ( BitWriter & bitWriter,
Array< TValue > & data,
Algorithm algorithmsToTry = Algorithm::ALL,
Statistics * statistics = nullptr )

Compresses the given array and writes the data into the given bit writer. Each algorithm included in parameter algorithmsToTry are executed and finally that one with the best compression result is chosen. Prior to the usage data, some bits that determine the chosen algorithm are written, to enable method Decompress to deserialize the data.

To gain efficiency, the number of probed algorithms can be narrowed by setting a corresponding mask in algorithmsToTry. However, in many use case scenarios, the execution time is a less critical design factor than the compression factor reached. The decompression speed is solely dependent on the algorithm finally chosen, not on the number of algorithms tested on compression.

If only one algorithm is specified in parameter algorithmsToTry, then no meta-information about the algorithm chosen is written. Consequently, when reading back the data using Decompress, the same single algorithm has to be provided.
Template Parameters
TValueThe integral type of array data to compress.
bitWriterA bit writer to compress the data to.
dataThe array to compress.
algorithmsToTryThe set of algorithms to be tried on compression for best efficiency.
Defaults to Algorithm::ALL.
statisticsPointer a struct to collect statistics for the efficiency of array compression related to given user data. If set, methods Compress and Decompress will measure execution performance and compression rates for each algorithm. With that, a software may collect information about which algorithm is most efficient for typical datasets found and a programmer may, based on such heuristics decide to exclude certain algorithms not efficient in a use case.
A pair of value containing the resulting size in bits and the algorithm chosen.

Definition at line 526 of file ac.hpp.

Here is the call graph for this function:

◆ Decompress()

template<typename TValue >
void Decompress ( BitReader & bitReader,
Array< TValue > & data,
Algorithm algorithm = Algorithm::ALL,
Statistics * statistics = nullptr )

Decompresses an integral array from the given bit reader, which previously was encoded with methods Compress. The integral data type has to be the same as with encoding.

If compression was performed with specifying only one algorithm in parameter algorithmsToTry, then the same algorithm has to be exclusively set on decompression, because in this case no meta information about the compression algorithm is stored in the bit stream.
Template Parameters
TValueThe integral type of array data to decompress.
bitReaderA bit reader to read the data from.
dataThe array to decompress data to.
algorithmThe algorithm to use for read back. Must only be given, in case that compression was performed using a single algorithm of choice. Defaults to Algorithm::ALL.
statisticsAn optional statistics record to store the measured de-compression time. See method Compress for more information.

Definition at line 673 of file ac.hpp.

Here is the call graph for this function:

The documentation for this class was generated from the following file: