ALib C++ Library
Library Version: 2412 R0
Documentation generated by doxygen
Loading...
Searching...
No Matches
ArrayCompressor Class Reference

Description:

This class provides several algorithms to compress arrays of integral data and encode them in BitBuffer objects. Besides a standard Huffman compression , different simple approaches are "tested" and the best compression algorithm is then chosen. The general assumption of the approaches (besides the Huffman coding) is that the data contains "signal data", which is either

  • sparsely filled,
  • has incremental values, or
  • has just values of a certain smaller range. Also, combinations of these attributes are matched. Such data is often found in real-world applications and may be compressed much better than the generic Huffman approach may achieve.

Definition at line 51 of file ac.hpp.

#include <ac.hpp>

Inner Type Index:

class  Array
 
struct  Statistics
 

Public Type Index:

enum class  Algorithm {
  NONE = 0 , ALL = (1<< NumberOfAlgorithms) - 1 , Uncompressed = 1 , MinMax = 2 ,
  Sparse = 4 , VerySparse = 8 , Incremental = 16 , Huffman = 32 ,
  END_OF_ENUM = 64
}
 

Public Static Field Index:

static constexpr int NumberOfAlgorithms = 6
 The number of algorithms implemented.
 

Public Static Method Index:

template<typename TValue >
static std::pair< size_t, AlgorithmCompress (BitWriter &bitWriter, Array< TValue > &data, Algorithm algorithmsToTry=Algorithm::ALL, Statistics *statistics=nullptr)
 
template<typename TValue >
static void Decompress (BitReader &bitReader, Array< TValue > &data, Algorithm algorithm=Algorithm::ALL, Statistics *statistics=nullptr)
 

Public Method Index:

 ArrayCompressor ()=delete
 Deleted default constructor (this class cannot be created)
 

Enumeration Details:

◆ Algorithm

enum class Algorithm
strong

This enumeration denotes the different algorithms provided for compression. This enum is defined to be bitwise.

Note
With the inclusion of at least one "ALib Camp" in the ALib Distribution, this enum is furthermore resourced and values are appendable to class AString and logable with ALox.
Enumerator
NONE 

No compression method selected.

ALL 

All compression methods selected.

Uncompressed 

Stores the data as integer values, which includes a simple sort of possible compression as documented with BitWriter::Write<TIntegral>(TIntegral).

MinMax 

Stores the differences between the minimum and maximum value found.

Sparse 

Writes '1' if next value is equal to previous, '0' plus next value otherwise.

VerySparse 

Writes the number of following equal or non equal values.

Incremental 

Only distances of the values are written.

Huffman 

Huffman encoding (byte based).

Definition at line 266 of file ac.hpp.

Field Details:

◆ NumberOfAlgorithms

int NumberOfAlgorithms = 6
staticconstexpr

The number of algorithms implemented.

Definition at line 54 of file ac.hpp.

Method Details:

◆ Compress()

template<typename TValue >
static std::pair< size_t, Algorithm > Compress ( BitWriter & bitWriter,
Array< TValue > & data,
Algorithm algorithmsToTry = Algorithm::ALL,
Statistics * statistics = nullptr )
static

Compresses the given array and writes the data into the given bit writer. Each algorithm included in parameter algorithmsToTry are executed and finally that one with the best compression result is chosen. Before the usage data, some bits that determine the chosen algorithm are written, to enable method Decompress to deserialize the data.

To gain efficiency, the number of probed algorithms can be narrowed by setting a corresponding mask in algorithmsToTry. However, in many use case scenarios, the execution time is a less critical design factor than the compression factor reached. The decompression speed is solely dependent on the algorithm finally chosen, not on the number of algorithms tested on compression.

Attention
If only one algorithm is specified in parameter algorithmsToTry, then no meta-information about the algorithm chosen is written. Consequently, when reading back the data using Decompress, the same single algorithm has to be provided.
Template Parameters
TValueThe integral type of array data to compress.
Parameters
bitWriterA bit writer to compress the data to.
dataThe array to compress.
algorithmsToTryThe set of algorithms to be tried on compression for best efficiency.
Defaults to Algorithm::ALL.
statisticsPointer a struct to collect statistics for the efficiency of array compression related to given user data. If set, methods Compress and Decompress will measure execution performance and compression rates for each algorithm. With that, a software may collect information about which algorithm is most efficient for typical datasets found and a programmer may, based on such heuristics decide to exclude certain algorithms not efficient in a use case.
Returns
A pair of value containing the resulting size in bits and the algorithm chosen.

◆ Decompress()

template<typename TValue >
static void Decompress ( BitReader & bitReader,
Array< TValue > & data,
Algorithm algorithm = Algorithm::ALL,
Statistics * statistics = nullptr )
static

Decompresses an integral array from the given bit reader, which previously was encoded with methods Compress. The integral data type has to be the same as with encoding.

Attention
If compression was performed with specifying only one algorithm in parameter algorithmsToTry, then the same algorithm has to be exclusively set on decompression, because in this case no meta-information about the compression algorithm is stored in the bit stream.
Template Parameters
TValueThe integral type of array data to decompress.
Parameters
bitReaderA bit reader to read the data from.
dataThe array to decompress data to.
algorithmThe algorithm to use for read back. Must only be given, in case that compression was performed using a single algorithm of choice. Defaults to Algorithm::ALL.
statisticsAn optional statistics record to store the measured de-compression time. See method Compress for more information.

The documentation for this class was generated from the following file: