ALib C++ Library
Library Version: 2402 R1
Documentation generated by doxygen
ALib Module Boxing - Programmer's Manual

1. Introduction

The original motivation to implement ALib Boxing was the need to allow functions to accept an arbitrary number of arguments of arbitrary type. While C++ has all mechanisms to implement this (using variadic template arguments), the limitation of the template approach is that everything needs to happen at compile-time. This limits the concept tremendously - for the sake of gaining the typical unrivalled C++ performance!

We were searching for a way to collect the arguments and pass them further for run-time interpretation. With other programming languages which provide a superclass Object and run-time type information this is a no-brainer. In C++ it needs some effort to achieve this. This library provides a very generalized, extensible approach that is not at all limited to variadic function arguments.
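
To make the idea of "collecting arguments for run-time interpretation" more tangible, here is a minimal sketch that uses only the standard library, not ALib. The names Collect and CountInts are hypothetical helpers invented for this illustration, and std::any merely stands in for a box:

```cpp
#include <any>
#include <cstddef>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical helper: gathers arbitrary arguments into a run-time container.
// The variadic template is resolved at compile-time, the result is not.
template <typename... Args>
std::vector<std::any> Collect(Args&&... args)
{
    std::vector<std::any> result;
    result.reserve(sizeof...(Args));
    (result.emplace_back(std::forward<Args>(args)), ...);   // C++17 fold expression
    return result;
}

// Run-time interpretation: counts how many collected values are of type int.
std::size_t CountInts(const std::vector<std::any>& values)
{
    std::size_t count = 0;
    for (const std::any& value : values)
        if (value.type() == typeid(int))
            ++count;
    return count;
}
```

A call like CountInts( Collect( 1, 2.0, std::string("x"), 3 ) ) inspects the arguments only when the program runs. The receiving function needs no template parameters at all.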

The prerequisites needed to reach the original goal were much more than we first expected, and in fact, only chapter 11. Variadic Function Arguments and Class Boxes presents the solution for this.

1.1 Abstract

This module, ALib Boxing, provides means to use C++ run-time type information in a very easy fashion.

For this, any C++ type, from fundamental "scalar" types to complex composite custom classes, can be assigned to an object of type Box . With the assignment, besides the object's value or a pointer to it, "run-time type information" is stored. The so called "Boxes", including their content, can be passed to functions as arguments, returned by functions or stored for later use. Finally, the contents can of-course be unboxed in a type-safe fashion.

The seamless way in which ALib boxes are usable is achieved using template meta programming (TMP). While a default behavior handles custom types properly, the two necessary conversions, which are called "boxing" and "unboxing", can be customized.

The concept of "boxing" is available in many programming languages and often even done in an inherent, hidden fashion (then sometimes called "auto-boxing").

Note
For example, in programming languages Java and C# (which in this case share exactly the same syntax), the following simple lines of code perform "inherent auto-boxing":
  int    i  = 5; // No boxing, as simple "value-type" int is used.
  Object box= 6; // Auto-boxing: Creation of a container-object that includes run-time type-information.

Starting with version C++ 17, the standard C++ library provides type std::any, which implements a similar concept. The differences between class Box introduced by this ALib Module and class std::any will be examined in detail in this Programmer's Manual.

As a quick summary and motivation, in short, the differences are:

  1. Non-Bijective Type Mapping
    Different types may be boxed to the same target type. Unboxing may be restricted to a subset of the originating types. The advantage of this approach lies in a tremendous reduction of type-checking when processing boxes.
  2. Automatic Pointer-Boxing
    Types that by their size do not "fit" into a box (which is true for most structs and classes) are by default boxed as pointers. In contrast to this, std::any allocates memory and copies such types.
  3. Box-Function Calls
    This library allows to define functions that can be invoked on boxes. Depending on the type a value is boxed to, a custom implementation of the function is chosen. This concept allows to avoid unboxing and to handle 3rd-party types which are not known to the code that processes boxed values.
  4. Array Boxing
    Class Box allows to box single-dimensional array types. While causing a small memory/performance penalty, this feature provides huge benefits, for example for the frequent use case of boxing character strings.
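
The second point of this list can be observed with the standard library alone. The following sketch does not use ALib. It counts copy-constructor calls of a hypothetical type Tracked, once when the object itself is stored in a std::any and once when only a pointer to it is stored, which is roughly the idea behind boxing larger types as pointers:

```cpp
#include <any>

// Hypothetical type that counts how often it is copied.
struct Tracked
{
    static inline int copies = 0;   // C++17 inline variable
    int payload[64] = {};           // makes the type noticeably larger than a pointer

    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
};

// Stores the object itself: std::any creates its own copy.
int CopiesWhenStoredByValue()
{
    Tracked::copies = 0;
    Tracked  original;
    std::any holder = original;
    return Tracked::copies;
}

// Stores only a pointer: no copy is made, the caller keeps ownership.
int CopiesWhenStoredByPointer()
{
    Tracked::copies = 0;
    Tracked  original;
    std::any holder = &original;
    return Tracked::copies;
}
```

The price of the pointer variant is that the stored pointer is only valid as long as the original object lives. This is a trade-off that any pointer-based storage shares.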

The performance penalty - if any - in respect to std::any is low. Class Box is very lightweight and usually its footprint is one third bigger than that of std::any. On many occasions, ALib Boxing is even faster, due to

  • the usually reduced number of destination types,
  • the availability of box-functions, and
  • the fact that no deep copy of boxed values is performed.

We furthermore think that the use of class Box is much easier than that of std::any.

1.2 Module Dependencies

This ALib Module is located at a quite low level of the module dependency graph of the library and hence can be extracted and compiled with a surprisingly small fraction of the overall library source. For the convenience of the authors, the samples in this manual rely on (and therefore probably compile only with) the full ALib Distribution .

However, several sections of this manual give detail on the optional module dependencies and the according features of ALib , which leverage this module.

1.3 How To Read This Documentation

This documentation mixes tutorial sections and such that provide in-depth information. The tutorial chapters use the word "tutorial" in their headline and are usually followed by in-depth information.

In addition, some detailed topics are explained in the reference documentation of the corresponding library types. If so, this manual will notify the reader and offer deep links into the reference guide.

We hope that with this structure, experienced C++ programmers will be able to quickly grasp what they need, while less experienced ones get all information needed to fully understand all pros and cons of (using) this library.

While this manual is very detailed and quite lengthy, the good news is that it only addresses programmers who include this module into their own code. If a software offers an API that accepts class Box as function arguments, the user of that interface does not need to know much about ALib Boxing. Only if she wishes to in turn implement box-functions for her types, or to start customizing boxing of those, is some deeper understanding necessary.

1.4 Comparison To Class std::any

This Programmer's Manual will frequently compare features and implementation details of central class Box, with C++ 17 class std::any. This is done for various reasons:

  • std::any is a type that C++ programmers usually know about. In general, humans are good at learning new things through comparison with existing knowledge.
  • std::any is a plain, lean and straightforward implementation of the core idea that ALib Boxing implements. By offering the comparison, the design decisions behind specifics of class Box can be nicely worked out.
  • Class Box is not meant to generally replace the use of std::any. Both approaches have good reason for existence. The comparison helps judging about which to choose in a specific use case.

By no means do the authors of the code or this manual want to give the impression that the comparison to std::any is about indicating a "superiority" of the ALib concept over that of the standard library. On the contrary, we want to clearly state that the standard library just follows different design goals: It is rightfully very abstract and provides an approach of completeness in a mathematical and procedural sense.

And while having less functionality and flexibility, class std::any likewise has a smaller footprint and also in some cases provides better execution performance than class Box.

2. The Basics: Boxing, Type Checks and Unboxing

2.1 Tutorial: Boxing Values

Let us now quickly jump into code and have a look at a "hello world" sample:

// Include boxing (this is all that is almost ever needed, apart from "enum.hpp" and "dbgboxing.hpp")
// Needed for ALib initialization
// Get support for writing boxes to std::cout
// Get support for enum element names to std::cout
using namespace std;
using namespace alib;
int main( int, char** )
{
    // Initialize ALib
    // Create a box containing a string
    Box myBox= "Hello World";
    // Write the contents of the box
    cout << "My box contains: " << myBox << endl;
    // Terminate ALib
    // alib::Shutdown(); <-- commented out, because this sample code is in fact run in the unit tests
    return 0;
}
Note
This manual will seldom show the inclusion of necessary header files and "bootstrapping" of ALib .
Manual chapter 12.5 Compilation, Header Inclusion And Bootstrapping will give details on what is needed.

Compiling and running this program, the output is:

My box contains: Hello World

The central type of this module is class Box , located in this module's namespace alib::boxing. As done with most ALib classes, it has an alias name defined in namespace alib, hence shortcut alib::Box can be used. Now, as the sample states

   using namespace alib;

just "Box" becomes sufficient.

The act of "emplacing a value in an instance of class Box" is called "boxing". The sample above shows how such "boxing" is performed: It is obviously done "inherently" with the simple C++ assignment operator. We can assign just anything to our "box" without getting compiler errors:

Box myBox= "Hello World";
cout << "My box contains a string: " << myBox << endl;
myBox= 42;
cout << "My box now contains an int: " << myBox << endl;
myBox= 3.1415;
cout << "My box now contains a double: " << myBox << endl;

Compiling and running this program, the output is:

My box contains a string:     Hello World
My box now contains an int:   42
My box now contains a double: 3.1415

For programmers who know C++ 17 type std::any already, this is not too surprising. The pure C++ language standards however do not suggest such code, because C++ is a strongly type-safe language!

Besides with assignments, this mechanism of "auto-boxing" works well with function calls. C++ allows exactly one implicit type conversion, if a function argument is defined as a constant reference type:

void TakeBox( const Box& box ) // parameter has to be a const reference to allow auto-boxing
{
    cout << "Boxed argument is: " << box << endl;
}

The function can be invoked with any argument. Therefore, the following invocations:

TakeBox( 1 );
TakeBox( 2.0 );
TakeBox( "three" );

produce this output:

Boxed argument is: 1
Boxed argument is: 2.0
Boxed argument is: three

The "opposite", namely returning boxes is comparably simple. A function with a return type of class Box (here a value type!), can return any C++ type:

Box GetBox()
{
    int random= rand();
    if( random < RAND_MAX / 2 ) return random;       // auto-boxing an integral value
    else                        return "Too high!";  // auto-boxing a C++ string literal.
}

The following sample and output combines the two functions. We repeat the nested call several times to get a random result:

TakeBox( GetBox() );
TakeBox( GetBox() );
TakeBox( GetBox() );
TakeBox( GetBox() );
TakeBox( GetBox() );
TakeBox( GetBox() );

Boxed argument is: Too high!
Boxed argument is: 846930886
Boxed argument is: Too high!
Boxed argument is: Too high!
Boxed argument is: Too high!
Boxed argument is: 424238335

2.2 Tutorial: Type Detection

In the samples of the previous sections, values have been boxed and the boxes then have been streamed into std::cout. The overloaded streaming operator <<, that accepts type Box, was provided with the inclusion of header alib/compatibility/std_strings_iostream.hpp .
This operator obviously is able to unbox values and print their contents to the stream.

Note
The full truth is that the operator code itself, does not unbox. Magically, the operator can perform its task without "knowing" how to unbox different types. Instead it defers unboxing to another instance. It will be explained in a later chapter, how this operator is implemented.

Before we start unboxing values from boxes, we first need to demonstrate how the type of a box can be detected. The reason for this is simple: Unboxing a wrong type is forbidden and considered a severe error!

We can not simply request a type from a box, because type information is nothing that C++ easily returns from a method. Instead, unfortunately, type detection is a game of guessing! For making a guess, templated method Box::IsType exists. This method has no arguments, but expects the type to "guess" as a template parameter. As the method's name suggests, the return value is boolean:

Box myBox= true;
cout << "Is the type boolean? " << lang::Bool( myBox.IsType<bool >() ) << endl;
cout << "Is the type double? " << lang::Bool( myBox.IsType<double>() ) << endl;
myBox = 5.5;
cout << "Is the type boolean? " << lang::Bool( myBox.IsType<bool >() ) << endl;
cout << "Is the type double? " << lang::Bool( myBox.IsType<double>() ) << endl;

The output is:

Is the type boolean? True
Is the type double? False
Is the type boolean? False
Is the type double? True

For the time being, this is all we need to know to proceed with unboxing.

2.3 Tutorial: Unboxing

Like method IsType, introduced in the previous chapter (and like the constructor of class Box!), method Box::Unbox, which is used for unboxing a value, is a templated method.

The template type determines the type of value to be unboxed:

double original= 1.2345;
Box boxed = original;
double unboxed = boxed.Unbox<double>();
cout << "Original: " << original << endl;
cout << " Unboxed: " << unboxed << endl;

The output of this code snippet is:

Original: 1.2345
Unboxed: 1.2345

This was rather simple! We boxed a double value and also unboxed one. So what happens if we unbox a different type? This code does this:

double original= 1.2345;
Box boxed = original;
// unboxing wrong type: runtime assertion, resp. undefined behavior
integer unboxed = boxed.Unbox<integer>();

The bad news is: this code compiles well! This means, the error in the code is not detected by the compiler. Unfortunately, the malformed code is detected only at run-time. In debug-compilations of ALib , an assertion would be raised, with a message similar to

    Can not unbox type <long> from boxed type <double>.

Even worse, in release compilations of ALib , running such code results in "undefined behavior", which is the nice wording for "this software sucks and will probably crash very soon!".

Note
This simple sample shows the biggest pitfall when using module ALib Boxing . The problem behind this is a very general topic of computer language theory: Computer languages may be rather type-safe or less type-safe. For example, many scripting languages are not much type-safe. Here, bugs in the code are exposed often only when actually running (testing) the code. With strongly type-safe languages like C++, many types of malformed code are detected already by the compiler. Both approaches have pros and cons and both have a good right for existence.
While we just named this behavior a "pitfall", on the other hand it could be said that this is exactly what module ALib Boxing is all about: transfer type checking from compile-time to run-time to enable type-agnostic coding.
With C++, generic programming is usually performed using templates. However, later in this manual we will see use cases, that show when boxing and run-time type detection is just superior in respect to code design, code size and even sometimes in respect to execution performance.
At the end of the day, these benefits are probably why type std::any was included in the standard library with C++ 17 and ALib Boxing provides a little more.
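
For comparison, std::any detects the same kind of mistake at run-time as well, but in a defined way: the value-returning form of std::any_cast throws std::bad_any_cast on a type mismatch. The following sketch uses only the standard library. The helper name CanExtract is made up for this illustration:

```cpp
#include <any>

// Hypothetical helper: true if the stored value is exactly of type T.
// A mismatch raises std::bad_any_cast, which is caught here.
template <typename T>
bool CanExtract(const std::any& value)
{
    try
    {
        (void) std::any_cast<T>(value);
        return true;
    }
    catch (const std::bad_any_cast&)
    {
        return false;
    }
}
```

With std::any boxed = 1.2345, the call CanExtract<double>( boxed ) succeeds, while CanExtract<long>( boxed ) reports a mismatch instead of yielding undefined behavior.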

The two recent code samples, one that rightfully unboxes a double and the other that asserts at run-time, do not make much sense. An obvious use case for ALib Boxing is given, when the acts of boxing and unboxing are decoupled. So let's look at how type-safe unboxing is performed in a function that accepts a Box.
Function ProcessBox tests the given box for "known" types, unboxes values and displays them. For unknown types, a warning is written and false is returned:

bool ProcessBox( const Box& box )
{
    // guessing integer?
    if( box.IsType<integer>() )
    {
        cout << "Processing given integer value: " << box.Unbox<integer>() << endl;
        return true;
    }
    // guessing double?
    if( box.IsType<double>() )
    {
        cout << "Processing given double value: " << box.Unbox<double>() << endl;
        return true;
    }
    // Unknown type
    cout << "Warning: unknown type given!" << endl;
    // With compilation symbol ALIB_DEBUG_BOXING set, we can use a helper class to display the
    // given type name in the warning.
#if ALIB_DEBUG_BOXING
    cout << " Type given: " << alib::DbgBoxing::TypeName( box.DbgGetVTable() ) << endl;
#endif
    return false;
}

These sample invocations:

ProcessBox( 42 );
ProcessBox( 3.14 );
ProcessBox( "Hello" );

produce the following output:

Processing given integer value: 42
Processing given double value: 3.14
Warning: unknown type given!
Type given: char[]

Using the "type guessing" method Box::IsType, introduced in the previous chapter, this code is back to being fully type-safe. Nothing can crash at run-time. Of course, code that invokes function ProcessBox needs to check the return value (again at run-time) and react properly if the box type was not "known" and false was returned.

There are two drawbacks, a minor one and a really major one. The minor one is that if many different known types are to be processed, the execution performance of ProcessBox may be weak. A first remedy would be to sort the guesses and put the more frequent types at the top. Using the much more performant switch statement is not possible, because type information is not constant data.

The eventually much worse drawback lies in the fixed set of types that a function can process if it is designed based on "guessing" like sampled here. While in a closed source unit, this might be not a problem, imagine that function ProcessBox resides in an external class library, where it can not be extended. In this case, the function can not be used for custom types that are not known to the library.

For both problems, module ALib Boxing provides a solution, which is introduced in a later chapter 8. Box-Function Calls.

Note
While what we have seen so far could be implemented with C++ 17 type std::any in a similar fashion, a solution for the two drawbacks named is not offered by std::any.
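
As an illustration of that remark, a std::any counterpart of ProcessBox might look as follows. This is only a sketch with the hypothetical name ProcessAny. It tests for int instead of alib::integer, because std::any knows nothing about ALib's integer type:

```cpp
#include <any>
#include <iostream>
#include <typeinfo>

// Hypothetical std::any counterpart of ProcessBox: the same chain of guesses.
bool ProcessAny(const std::any& value)
{
    // guessing int?
    if (value.type() == typeid(int))
    {
        std::cout << "Processing given int value: " << std::any_cast<int>(value) << std::endl;
        return true;
    }
    // guessing double?
    if (value.type() == typeid(double))
    {
        std::cout << "Processing given double value: " << std::any_cast<double>(value) << std::endl;
        return true;
    }
    // Unknown type
    std::cout << "Warning: unknown type given!" << std::endl;
    return false;
}
```

The function shares both drawbacks discussed above: the chain of guesses grows with every supported type, and a type unknown to the function, for example a long value, ends up in the warning branch.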

2.4 How The Basics Work

The previous tutorial sections showcased boxing, unboxing and type guessing. We will see that for all three aspects, a lot more has to be said and showcased. While this chapter for this reason can not go much into technical details, yet, some important facts can be named and explained already.

2.4.1 Templated Approach

Class Box provides templated methods Box::IsType and Box::Unbox to guess and unbox specific types of and from a box. The types in question are provided with the template parameter. Likewise, the constructor, which is also used by the copy-assign operator= of that class, uses templates. Otherwise, the straightforward assignment of any object to a box would not be possible.

Besides using templates for "generic programming", a programming paradigm called "C++ template meta programming" (aka TMP) exists. The distinction between both, or otherwise the moment when extensive generic programming transitions to being TMP, can only be determined vaguely. Usually, C++ code should be called TMP as soon as structs found in header <type_traits>, like std::enable_if, std::is_pointer or std::is_base_of, are used. (Or those found in similar libraries, like boost.)
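
For readers less familiar with these type traits, the following standalone sketch shows the three structs named above in action. It is unrelated to ALib's actual code. The function IsPointerArgument and the types Base and Derived are invented for this illustration:

```cpp
#include <type_traits>

struct Base {};
struct Derived : Base {};

// Overload that takes part in overload resolution only for pointer types.
template <typename T, std::enable_if_t<std::is_pointer<T>::value, int> = 0>
bool IsPointerArgument(T) { return true; }

// Overload for all other types.
template <typename T, std::enable_if_t<!std::is_pointer<T>::value, int> = 0>
bool IsPointerArgument(T) { return false; }

// Relationships between types can be checked entirely at compile-time.
static_assert( std::is_base_of<Base, Derived>::value, "Derived inherits from Base");
static_assert(!std::is_base_of<Derived, Base>::value, "Base does not inherit from Derived");
```

Which overload is called is decided by the compiler. At run-time, only a plain function call (or, after inlining, a constant) remains.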

ALib Boxing makes quite a lot of use of "type traits" and hence the whole module can be easily considered as based on "template meta programming". To understand the library code, a solid knowledge of this paradigm is therefore needed. However, for using the library, fortunately it is not.

2.4.2 Boxing And Unboxing

Class Box contains a data segment, aka an internal piece of memory, that can hold a certain amount of bytes to store values in. With each type given, one of a set of TMP constructors is activated, which copies the source object into this generic piece of computer memory.

With unboxing, according to the requested type the contrary operation is performed: the internal data stored in the box is re-interpreted back to the original type.

In most cases both actions result in a very simple (efficient) copy operation of a (probably) 64-bit value. While the code that is invoked may look longer and complicated and even function calls to other code entities may be made, TMP assures that the compiler generates a very short and efficient assembly code for both, boxing and unboxing without function calls.

Note
Readers that do not believe that, should debug into some methods of class std::vector and wonder what is going on there and how this class can be so fast while the debugger shows plenty of invocations for even the simplest action. Most of these invocations seen in a debugger are 100% optimized out by the compiler. This is the same for a lot of code found in this module.

2.4.3 Type Guessing

In addition to the boxed data, class Box stores type information. Otherwise, method Box::IsType could obviously not be implemented. In C++, type information is obtained with operator keyword typeid. Using standard function call syntax (parentheses), it takes a C++ type as an argument. Returned is a constant reference to struct std::type_info. The struct does not offer too much functionality, in fact the only useful thing that can be done with it is to compare it to another reference received with another use of keyword typeid. This way, it can be determined if two types are the same or not.
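
The comparison of two std::type_info references can be tried out in isolation. The helper SameType below is a made-up name for this sketch:

```cpp
#include <typeinfo>

// Hypothetical helper: true if typeid reports the same type for A and B.
template <typename A, typename B>
bool SameType()
{
    return typeid(A) == typeid(B);
}
```

SameType<int, int>() yields true and SameType<int, long>() yields false. Note that typeid ignores references and top-level const, so SameType<const int&, int>() yields true as well.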

With that, the type guessing can be performed: Consider a reference to struct type_info being stored with the TMP constructor of class Box along with the boxed value data. As mentioned, the set of TMP constructors are templated, so the type information is generated at compile-time.
Likewise templated method IsType compares the stored type with the type that its template parameter denotes at compile-time!

These mechanics explain why types can only be "guessed"!
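
The mechanics described in this chapter can be condensed into a deliberately tiny sketch. Class MiniBox below is not ALib's implementation. It is a hypothetical, much reduced model that assumes values are trivially copyable and fit into eight bytes:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <typeinfo>

// Hypothetical miniature box: a type_info pointer plus a small data segment.
class MiniBox
{
    const std::type_info* type_;     // run-time type information, set by the constructor
    unsigned char         data_[8];  // generic piece of memory holding the value

  public:
    // "Boxing": the template parameter is known at compile-time,
    // its type_info is remembered for later comparison.
    template <typename T>
    MiniBox(const T& value) : type_(&typeid(T)), data_{}
    {
        static_assert(sizeof(T) <= sizeof(data_), "value does not fit into the box");
        static_assert(std::is_trivially_copyable<T>::value, "sketch supports plain values only");
        std::memcpy(data_, &value, sizeof(T));
    }

    // "Type guessing": compares the stored type with the guessed one.
    template <typename T>
    bool IsType() const { return *type_ == typeid(T); }

    // "Unboxing": re-interprets the stored bytes as the requested type.
    template <typename T>
    T Unbox() const
    {
        assert(IsType<T>() && "unboxing a wrong type");
        T value;
        std::memcpy(&value, data_, sizeof(T));
        return value;
    }
};
```

After MiniBox box = 1.5; the call box.IsType<double>() returns true and box.Unbox<double>() returns 1.5. The box itself can only answer "is it this type?", because the type to compare with has to be named as a template parameter at each call site.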

3. Non-Bijective Type Relationships

3.1 Type Relationships

The term bijective is used for describing the relationship of elements of two sets. A bijective relation means that each element of set A corresponds to exactly one element of set B and vice versa.

The two sets we are looking at in this case are the set of boxable types and the set of resulting types found in boxes created from the boxable types. This manual calls the latter set "boxed types" or "mapped types". Both terms mean the same.

In the case of C++ 17 type std::any, the relationship between these two sets is bijective - just as a programmer should expect! It is a simple, straight-forward one to one relationship: The type you store in an std::any object, is exactly the type that you can get back from it.
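
This one to one relationship is easy to verify. The sketch below uses the pointer form of std::any_cast, which returns a null pointer on a type mismatch. The helper name Holds is an assumption made for this example:

```cpp
#include <any>
#include <cstdint>

// Hypothetical helper: true only if the stored type is exactly T.
// No widening or other conversion is attempted.
template <typename T>
bool Holds(const std::any& value)
{
    return std::any_cast<T>(&value) != nullptr;
}
```

For std::any value = std::int16_t(16), only Holds<std::int16_t>( value ) is true. Asking for std::int32_t or std::int64_t fails, although both could represent the value.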

To investigate into the type relationship of ALib Boxing , let us continue with an easy tutorial sample.

3.2 Tutorial: A Reduced Set Of Types To Test

In previous chapter 2.3 Tutorial: Unboxing, the following simple function ProcessBox was introduced:

bool ProcessBox( const Box& box )
{
    // guessing integer?
    if( box.IsType<integer>() )
    {
        cout << "Processing given integer value: " << box.Unbox<integer>() << endl;
        return true;
    }
    // guessing double?
    if( box.IsType<double>() )
    {
        cout << "Processing given double value: " << box.Unbox<double>() << endl;
        return true;
    }
    // Unknown type
    cout << "Warning: unknown type given!" << endl;
    // With compilation symbol ALIB_DEBUG_BOXING set, we can use a helper class to display the
    // given type name in the warning.
#if ALIB_DEBUG_BOXING
    cout << " Type given: " << alib::DbgBoxing::TypeName( box.DbgGetVTable() ) << endl;
#endif
    return false;
}

It was shown, that if invoked with a C++ string literal, a due warning about an unknown type was written.

Now, have a look at the following sample invocations:

int8_t int8 = 8; ProcessBox( int8 );
int16_t int16= 16; ProcessBox( int16 );
int32_t int32= 32; ProcessBox( int32 );
int64_t int64= 64; ProcessBox( int64 );
float f = 1.111f; ProcessBox( f );
double d = 2.222; ProcessBox( d );

You should be quite surprised about the following output:

Processing given integer value: 8
Processing given integer value: 16
Processing given integer value: 32
Processing given integer value: 64
Processing given double value: 1.111
Processing given double value: 2.222

While only two boxed (target) types are tested by function ProcessBox, namely alib::integer and double, a variety of six types can be passed to the function. Obviously, different signed integral types are all "mapped" to the same destination type and the two floating point types float and double are both mapped to type double.

Any programmer can easily see the benefit: with just two code blocks that perform "type guessing" all relevant boxable types can be processed. The term "relevant" can be very rightfully used: In the integral case even the C++ compiler itself would allow an automatic, inherent type conversion (cast) with assignments between the types in question. Not even with the toughest set of warning options would the compiler complain.
Ok, in the floating point case, the compiler would warn like this:

    implicit conversion increases floating-point precision: 'float' to 'double'

if no static_cast<double>() was applied to the float value. The warning exists because the resulting double value suggests a precision that the original float value never had.
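
What happens to a float value on its way into a double can be made visible with a few lines of standard C++. The helper Format is a hypothetical name, and the sketch does not involve ALib:

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: prints a double with the given number of significant digits.
std::string Format(double value, int digits)
{
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// True if a float survives the trip to double and back unchanged.
bool RoundTrips(float value)
{
    return static_cast<float>(static_cast<double>(value)) == value;
}
```

For the value 1.111f used above, RoundTrips returns true, and Format with four significant digits still prints 1.111. Yet the widened value is not equal to the double literal 1.111: printing it with 17 digits exposes digits that only reflect the limited precision of the original float.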

Note
Later in this manual two things will be discussed:
  • Why a precision loss is not a problem with the common use-cases of ALib Boxing .
  • How - in the case of an uncommon use-case - the float to double conversion can be suppressed.

Here, we briefly interrupt this tutorial and continue with a manual section.

3.3 Non-Injective Type Mapping

The relationship between C++ types and resulting mapped types is not injective. This means, two different C++ types may result in the same boxed type. For example, by default, all signed integral types (of different byte width) are boxed as the same type alib::integer, which is just an alias to the "biggest natural integral type" of the compilation platform. (In short, type alib::integer aliases std::ptrdiff_t).
Likewise, all unsigned integral types are boxed to type alib::uinteger, which is an alias to std::size_t.
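
A non-injective mapping of this kind can be sketched with a few type traits. The alias MappedType and the function Map below are hypothetical and much simpler than ALib's customization mechanism. In particular, bool and character types are not treated separately here:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical trait: the type that a value of type T is stored as.
// Simplification: bool and character types are not handled separately,
// and only float (not long double) is widened.
template <typename T>
using MappedType =
    std::conditional_t<std::is_same<T, float>::value,                            double,
    std::conditional_t<std::is_integral<T>::value && std::is_signed<T>::value,   std::ptrdiff_t,
    std::conditional_t<std::is_integral<T>::value && std::is_unsigned<T>::value, std::size_t,
                       T>>>;

// Converts a value to its mapped type.
template <typename T>
MappedType<T> Map(T value)
{
    return static_cast<MappedType<T>>(value);
}
```

With this trait, std::int8_t, std::int16_t, std::int32_t and std::int64_t all share the mapped type std::ptrdiff_t. Code that processes mapped values therefore needs a single branch for all of them.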

This relationship of boxing C++ fundamental types is the built-in default behavior. As such, it can be modified. This leads us to a general, important statement:

The process of boxing and unboxing can be manipulated per C++ type. For various fundamental and non-fundamental C++ types, such customization of boxing exists, which leads to non-bijective type mappings.
Built-in customizations can be disabled.

The details of how boxing can be customized for a type can only be explained in a later chapter, when other prerequisites are made.

We have learned that ALib Boxing is not injective. The next question is whether it is at least surjective. If it were, all types that can be boxed could also be unboxed.

As a sample, the question is: Can type int16_t be unboxed, regardless of the fact that it is possible to unbox type integer from a boxed int16_t?

Again, a tutorial section should investigate into this question.

3.4 Tutorial: Unboxing Non-Injective Types

We have seen so far, that

  • type integer can be unboxed from any boxable signed integral type,
  • type uinteger can be unboxed from any boxable unsigned integral type and
  • type double can be unboxed from boxable types float and double.

The benefit from this is that only a reduced set of types have to be "guessed" when processing boxes.

Let us still try to unbox the original type:

Box box = int16_t(16);
integer i = box.Unbox<integer>(); // OK
integer i16 = box.Unbox<int16_t>(); // Compiler error!

This code does not even compile! In the compiler's output, the following error is found, pointing to the third line of the snippet:

static_assert failed due to requirement 'CustomBoxingRule7'
    Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.

This seems surprising in two ways. Not only that this type can't be unboxed, but also that this is not a run-time assertion but caused by a C++ static_assert which is a compile-time message. As the message's text elaborates, it is just not possible to unbox type int16_t - no matter what was previously stored in the box. Furthermore we understand: This was explicitly forbidden, which means "voluntarily" in this case.

What we have here is a design decision of this ALib Module . Technically, it would be easy to allow unboxing that type. All that is needed is a static type cast, which by the way can be performed by the programmer easily herself if needed:

Box box = int16_t(16);
int16_t i16 = static_cast<int16_t>( box.Unbox<integer>() ); // OK

The point here is that with the standard use-cases of ALib Boxing, the width of an integral is seldom of any interest. It is just enough to know that an integral value was given, no matter what size it was. Now, to prevent accidentally guessing types that belong to a group of types that are "aggregated" to one destination type, the built-in customization of these types explicitly forbids that.

Why "accidentally"? Well, in respect to previous sample function ProcessBox, testing for all sorts of integrals would be just redundant code. For the same reason, performing type guessing on a not-unboxable type is already illegal. This code:

Box box = int16_t(16);
bool result= box.IsType<int16_t>(); // Compiler error!

produces the very same compilation error as the one above that tries to unbox the type.
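
How a library can turn such a request into a compile-time error is sketched below. Class IntBox is a hypothetical, stripped-down stand-in that accepts any signed integral type but allows unboxing of std::ptrdiff_t only. It does not reproduce ALib's actual rules or error messages:

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical box for signed integrals: every accepted type is stored as std::ptrdiff_t.
class IntBox
{
    std::ptrdiff_t value_;

  public:
    // Accepts all signed integral types (non-injective mapping).
    template <typename T,
              std::enable_if_t<std::is_integral<T>::value && std::is_signed<T>::value, int> = 0>
    IntBox(T value) : value_(static_cast<std::ptrdiff_t>(value)) {}

    // Only the mapped type may be requested. Everything else fails to compile.
    template <typename T>
    T Unbox() const
    {
        static_assert(std::is_same<T, std::ptrdiff_t>::value,
                      "Unboxing forbidden: values are stored as std::ptrdiff_t.");
        return value_;
    }
};

// IntBox box = std::int16_t(16);
// box.Unbox<std::ptrdiff_t>();   // OK
// box.Unbox<std::int16_t>();     // would not compile: static_assert fails
```

Because the check depends only on the template parameter, the compiler rejects the call regardless of what was stored in the box.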

Note
This design decision is only effective with the library defaults. Chapter 4. Boxing Fundamental Types will show how this behavior can be changed.

To conclude this tutorial section, another important observation has to be made. For this, let us look at the following code snippet:

std::string stdString = "Hello";
NString alibString = "World";
Box box;
// box a std::string_view
box = stdString;
assert( box.IsType<std::string_view>() );
box.Unbox <std::string_view>();
assert( box.IsType<alib::NString >() );
box.Unbox <alib::NString >();
// box an ALib string
box = alibString;
assert( box.IsType<std::string_view>() );
box.Unbox <std::string_view>();
assert( box.IsType<alib::NString >() );
box.Unbox <alib::NString >();

This sample shows that ALib string types can be unboxed from a box that previously got a std::string_view assigned and vice versa. Each original type can also be unboxed and type guessing for both types returns true.

The takeaway from this is: Just from the fact that a type B is unboxable from type A, it can not be concluded that the original type A is not unboxable. While this is true for types int16_t and integer, this is not true for types std::string and ALib strings.

Note
With 10. Boxing Character Strings, this manual later dedicates a whole chapter on using character strings with ALib Boxing .

3.5 Non-Surjective Type Mapping

In the first sections of this chapter, it was explained that type mapping is not injective. This means that different source types can result in the same boxed type.

Now, with the latest tutorial section, it was demonstrated that some boxable types can not be unboxed. For these types, this manual uses the term "not unboxable types" or "locked types".

Conceptually this means that ALib Boxing is also not surjective: Not all origin types are "found" in the destination type set.

A relation is bijective if it is both injective and surjective. Consequently it is not bijective if it is either not injective or not surjective. Unfortunately, no word exists for the condition "not injective and not surjective". Therefore, this manual uses "not bijective" and this is meant in the broadest sense.

3.6 Summary And Rationale

A quick summary of what was said in this chapter should be given in bullets:

  • While the type conversion of C++ 17 type std::any is bijective, a huge difference of ALib Boxing is that its type relationship is not bijective. More precisely, it is neither injective nor surjective.
  • With the reduction of possible target types, less type guessing has to be performed when processing boxes.
  • While usually any type can be boxed, unboxing of certain types may be forbidden by compile-time assertions. In this case, usually the target type of the boxing conversion can be unboxed. Besides that, also other types might be available for unboxing.
  • The exact same static assertions given with method Box::Unbox are applied with method Box::IsType . This means, if a type must not be unboxed, it must not even be "guessed".
  • Apart from one important exception (which is only explained later), by default a bijective type relationship is established, as is the case with class std::any.
  • Only if the boxing becomes customized (explained later), non-bijective boxing is activated. This is not supported by std::any.
  • Such customization is already built into the library with its default compilation settings, but it can be disabled (explained later).

Some rationale for why non-bijective type mapping is even the default in the library:
The approach taken with non-bijective type mapping of course also has obvious disadvantages. First of all, type information is lost: When detecting an integer type stored in a box, the processing code cannot perform different actions depending on the width of the original integral type. The information on the size is just lost. Even worse, in the case of floating point values, the inherent conversion of values of type float to those of type double even includes a loss of the precision of the value.

So why does ALib Boxing accept these restrictions by default? Why is the benefit of having to cope with a smaller set of target types weighted higher than the loss in precision?
This can be answered only by looking at the use cases of boxing. Remember that C++, until its version 17, did not even suggest doing something like boxing. Instead, the language is known for its type safety and its close binding to the underlying hardware, where the difference between int16_t and int32_t is considered a huge one.
So the answer is rather that boxing is not used in those areas of a software that contributed to the overall decision to use C++ as the source language. Instead, the use cases are found where more relaxed demands apply, and these can be parts of the same software. Take for example a software that calculates tomorrow's weather forecast: A C++ software would be able to process billions of calculations, or at least feed corresponding dedicated "number crunchers" with the input data and process the result. For this task, the data should never be boxed and transported in a generic way. This is absolutely no use case for boxing! However, the very same software would also write a log file or display some messages on the console. Here, even the most valuable final results, namely tomorrow's average temperature and wind speed, may be intermediately converted to a boxed value: When unboxed, the first fractional digits of the floating point value will still be intact and precise enough to be displayed to a homo sapiens.

Consequently, a rather "convenient" formatting function is needed, as known from printf (which is not type-safe and therefore a "no-go") or from the standard libraries of various different programming languages. It can be noted that neither the complex syntax options of format strings introduced by the Python language (using brackets "{}" as placeholders) nor those introduced by the Java language (using "%" as placeholder and extending the good old printf format) provide any means to distinguish 16- from 32-bit integrals. While the output can be altered in various ways, the originating type is just irrelevant.

This is important to understand: The use of ALib Boxing has to be justified. It is not just to be seen as a convenience library that enables easy, generic coding.

Sometimes, however, as we will see in later chapters and also in the appendix chapters, ALib Boxing solves a real problem that arises from the nature of the C++ language, which otherwise can be solved only with std::any or by using bare keyword typeid directly. But even in these cases, non-bijective boxing remains the default.

4. Boxing Fundamental Types

4.1 Definition Of Fundamental Types

So called "Fundamental C++ types" are specified by the C++ language .

In short, those are all types that can be defined using a valid combination of the type keywords bool, int, float, double, char, wchar_t, char16_t and char32_t as well as modifier keywords signed, unsigned, short and long. Aliases such as int16_t and int32_t are not keywords but typedefs of such types.

4.2 Default Boxing On 64-Bit Systems

The following defaults are set if ALib is compiled on a 64-bit compiler/platform (precisely one where std::size_t has a width of 64 bits).

The subset of fundamental types exceeding a size of 64 bits is always boxed in a bijective way, which means in a one to one relationship. Those are:

  • Integrals larger than 64-Bit (platform/compiler dependent, e.g. with GCC type __uint128_t is concerned).
  • long double (a floating point value usually larger than 64 bits)

Furthermore, type bool is always boxed bijective.

All remaining fundamentals by default are boxed in a non-injective way. By that, they can be grouped into four different sets:

  1. All signed integrals up to a maximum of 64 bits length, will be boxed to type alib::integer.
  2. All unsigned integrals up to a maximum of 64 bits length, will be boxed to type alib::uinteger.
  3. Types float and double will be boxed to type double.
  4. Character types char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.

Only the destination type of each group is allowed to be guessed and unboxed.
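
The grouping above can be illustrated with a few lines of standard C++. The following is only a sketch and not ALib's implementation: the aliases integer, uinteger and wchar merely stand in for the ALib types of the same name, their definitions are assumptions for a 64-bit platform, and the trait MappedType is hypothetical. Type bool is assumed to map to itself.

```cpp
#include <cstdint>
#include <type_traits>

// Assumed stand-ins for alib::integer, alib::uinteger and alib::wchar (64-bit platform).
using integer  = std::int64_t;
using uinteger = std::uint64_t;
using wchar    = wchar_t;

// True for the four character types named in the list above.
template<typename T>
constexpr bool IsCharacter =    std::is_same_v<T, char>     || std::is_same_v<T, wchar_t>
                             || std::is_same_v<T, char16_t> || std::is_same_v<T, char32_t>;

// Hypothetical trait: maps a fundamental source type to its boxed target type.
template<typename T>
using MappedType =
    std::conditional_t< IsCharacter<T>,                                        wchar,
    std::conditional_t< std::is_same_v<T, float> || std::is_same_v<T, double>, double,
    std::conditional_t< std::is_same_v<T, bool>,                               bool,  // assumption
    std::conditional_t< std::is_integral_v<T> && sizeof(T) <= 8,
                        std::conditional_t< std::is_signed_v<T>, integer, uinteger >,
                        T  /* e.g. long double: kept as is */ > > > >;

// Several source types share one target type: the mapping is not injective.
static_assert( std::is_same_v< MappedType<std::int16_t>, integer  > );
static_assert( std::is_same_v< MappedType<std::int32_t>, integer  > );
static_assert( std::is_same_v< MappedType<unsigned int>, uinteger > );
static_assert( std::is_same_v< MappedType<float>,        double   > );
static_assert( std::is_same_v< MappedType<char16_t>,     wchar    > );
```

Processing code that follows this scheme only needs to test for the four target types, which is exactly the reduction of "type guessing" described in chapter 3.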

4.3 Default Boxing On 32-Bit Systems

The following defaults are set if ALib is compiled on a 32-bit compiler/platform (precisely one where std::size_t has a width of 32 bits).

Integrals of a size of 64 bits are boxed in a bijective way, which means in a one to one relationship.

Attention
Integrals larger than 64 bits, as well as type long double (if even available on a 32-bit platform), by default can not be boxed at all!

All remaining fundamentals by default are boxed in a non-injective way. By that, they can be grouped into four different sets:

  1. All signed integrals up to a maximum of 32 bits length, will be boxed to type alib::integer.
  2. All unsigned integrals up to a maximum of 32 bits length, will be boxed to type alib::uinteger.
  3. Types float and double will be boxed to type double.
  4. Character types char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.

Only the destination type of each group is allowed to be guessed and unboxed.

4.4 Disabling The Default Customized Boxing

In the previous two sections, a fourth group of aggregated types was named: the character types. Note that the non-bijective boxing of character types was not shown in the tutorial. Destination type wchar is defined by dependency module ALib Characters , which sorts out a little of the "mess" a C++ programmer faces when dealing with characters. ALib Boxing leverages this module here for boxing plain character types. As we will see later, the benefits of module ALib Characters for boxing are even much greater.

Three ALib Compiler Symbols are available, which disable the custom boxing definitions.

The consequences of changing the defaults (enabling bijective behavior) should be obvious. For example, processing code may now have to guess the different integral types one by one, and it can and has to unbox and process each of them separately.

Code units that use different settings for one of the three compilation symbols must not be mixed. For example, a box created from type int16_t in a code unit that was compiled with bijective boxing enabled cannot be processed by a code unit that uses the default non-bijective boxing. Remember that the processing code unit would receive a compile-time assertion if it tried to unbox the value.

Often, the use of the ALib Compiler Symbols can be avoided, by using the set of methods:

Note
Any ALib Module that relies on ALib Boxing , for example modules ALib BaseCamp or ALib Expressions , use this symbol to compile and be compatible with any of the selected setting.
However, if the defaults are disabled and then furthermore a custom boxing for fundamental types is defined (which is explained in a later chapter), then these ALib Modules might become incompatible!
While we do not see a technical solution for this, we as well do not foresee good reasons for replacing the built-in non-bijective boxing of fundamental types with an own definition. In other words: Disabling the defaults might be justified in rare cases and is supported, but a replacement of the non-bijective boxing relationships by a custom one is not.
See also
Chapter 5. Building The Library of the Programmer's Manual of ALib for more information on compiling the library and using compiler symbols. For example, if using CMake , corresponding cached CMake variables ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS, ALIB_FEAT_BOXING_BIJECTIVE_CHARACTERS and ALIB_FEAT_BOXING_BIJECTIVE_FLOATS are available.

5. Boxing Arrays And Vectors

5.1 Support For 1-Dimensional Arrays

Module ALib Boxing has built-in support for boxing C++ arrays with one dimension. At this point of the manual, the rationale for this support cannot be easily justified and discussed. Only a bigger picture, looking at prominent use cases and the "side effects" that this feature enables, allows giving a complete answer.

Therefore, let us at this point rather describe what is available and provide more rationale in later sections.

Type guessing and unboxing for boxed array types slightly differs from those of scalar types. Method Box::IsType is not applicable to array types. The reason is simply, that the C++ language does not allow to specify template types to be arrays of arbitrary size. The template parameter TBoxable of the method IsType might be int[3] or double[25], but can not be just int[] or double[].

Therefore, alternative method Box::IsArrayOf is provided. For example IsArrayOf<int>() may be used to guess a boxed array of int values.
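
The language restriction mentioned above can be made tangible with standard C++ alone. The sketch below uses a hypothetical helper ArrayLength, which is not part of ALib. It shows that arrays of different length are distinct types and that the length has to be deduced as a separate template parameter.

```cpp
#include <cstddef>
#include <type_traits>

// Arrays of different length are unrelated types. A single type parameter
// therefore cannot stand for "array of int of any length".
static_assert( !std::is_same_v<int[3], int[25]> );

// Hypothetical helper: element type and length are deduced separately.
template<typename TElement, std::size_t N>
constexpr std::size_t ArrayLength( const TElement (&)[N] )
{
    return N;
}
```

A method like IsArrayOf<int>() avoids the problem by asking for the element type only and leaving the length to a run-time query.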

Note
While by default fundamental integral types are boxed as type integer , the element types of arrays are not (and cannot be) modified. For example, an array of type int16_t[10] will be boxed as an array of int16_t.

Further methods according to array boxing are:

  • Box::IsArray This is a non-templated method that returns true if the box holds an array of any type.
  • Box::UnboxLength Returns the length of a boxed array. The result of the method is well defined only for boxed arrays, hence a prior test for a boxed array type using method Box::IsArray is recommended.
    However, no compile-time or run-time assertion is raised if the method is invoked on non-array boxes. Also, no exception is thrown. In other words, the invocation on non-array types is harmless but the result is undefined.
  • Box::UnboxArray This templated method returns a pointer to the boxed array. The template type that has to be provided is the element type of the array (likewise with method Box::IsArrayOf ).
  • Box::UnboxElement This templated method returns the value of an array element. Besides the template parameter specifying element type, the method has one argument denoting the element's index.

With that information, we can do two tutorial sections:

5.2 Tutorial: Boxing And Processing Arrays

The following simple method prints the contents of boxed int and double arrays:

bool ProcessArray( const alib::Box& box )
{
    // not an array type?
    if ( !box.IsArray() )
    {
        cout << "Not an array" ALIB_DBG( ", but scalar type " << box.TypeID() ) << endl;
        return false;
    }

    // guess int[]
    if ( box.IsArrayOf<int>() )
    {
        cout << "int[" << box.UnboxLength() << "]= { ";
        for( alib::integer i= 0; i < box.UnboxLength(); ++i )
            cout << box.UnboxElement<int>( i ) << " ";
        cout << "}" << endl;
        return true;
    }

    // guess double[]
    if ( box.IsArrayOf<double>() )
    {
        // alternative to the loop above: unbox a pointer to the start of the array
        cout << "double[" << box.UnboxLength() << "]= { ";
        double* array= box.UnboxArray<double>();
        for( alib::integer i= 0; i != box.UnboxLength(); ++i )
            cout << array[ i ] << " ";
        cout << "}" << endl;
        return true;
    }

    // an array, but its element type is not covered
    cout << "Unknown array element type" ALIB_DBG( ": " << box.ElementTypeID() ) << endl;
    return false;
}

Some test invocations:

int intArray [3]= { 1 , 2 , 3 };
double doubleArray[2]= { 3.3, 4.4 };
long longArray [3]= { 5 , 6 , 7 };
ProcessArray( intArray );
ProcessArray( doubleArray );
ProcessArray( longArray );
ProcessArray( 42 );

lead to the following result:

int[3]= { 1 2 3 }
double[2]= { 3.3 4.4 }
Unknown array element type: long
Not an array, but scalar type long

Note the two different ways of implementing the array-loop. For type int, each element is unboxed one by one, which avoids unboxing and locally storing the pointer to the array. However, for type double this effort is made. The element loop itself then runs directly on the array instead of the box.
In release-compilations, both alternatives should result in the very same object code and thus share the same runtime performance. In debug-compilations however the first version performs a type check as well as a bounds check of the given argument in respect to the boxed array's size for each element. It is depending on the situation, which alternative is to be preferred. In this simple case, we would choose the second alternative, because neither type nor range checks are necessary in debug-compilations. Maybe a matter of taste.

Further note the use of method Box::TypeID . In its default implementation, obviously this method returns the boxed type in case of scalars and the element type in case of vectors. This decision can be altered, by explicitly providing an otherwise defaulted and therefore in this sample not visible, template parameter. Please see the method's documentation for further information.

Finally, in debug-compilations, the result of method TypeID can just be streamed to std::cout. This is very convenient and possible due to some tricks of other ALib modules, which include the use of type alib::lang::DbgTypeDemangler. For technical reasons, type DbgTypeDemangler is only available in debug-compilations. Method TypeID itself, however, is also available in release-compilations.

5.3 Tutorial: Multi-Dimensional C++ Arrays

As already mentioned, ALib Boxing does not provide a similar solution for multi-dimensional arrays. When multi-dimensional arrays are boxed and unboxed, the sizes of the higher dimensions need to be known. The following quick sample demonstrates this:

int mArray[2][3] = {{ 1,2,3 },{ 4,5,6 } };
alib::Box box( mArray );
std::cout << "Is int[][3]: " << box.IsArrayOf<int[3]>() << std::endl;
int (&arraySlice)[3]= box.UnboxElement<int[3]>(1);
std::cout << "array[1][2]= " << arraySlice[2] << std::endl;

Output:

Is int[][3]: 1
array[1][2]= 6

While the code above is feasible, multi-dimensional arrays are better boxed when wrapped in a custom type, e.g. one that stores the sizes explicitly and allows to restore them after unboxing. We do not consider this a huge drawback for this module in general, especially in respect to the use cases of ALib Boxing , which probably seldom include multi-dimensional arrays.
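
Such a wrapper type could look as follows. This is a minimal sketch in plain C++; the type Int2DView and its members are hypothetical and not provided by ALib.

```cpp
#include <cstddef>

// Hypothetical wrapper that keeps the extents of a two-dimensional int array
// next to the pointer to its first element.
struct Int2DView
{
    const int*  data;
    std::size_t rows;
    std::size_t columns;

    // Deduces both extents from a native two-dimensional array.
    template<std::size_t TRows, std::size_t TColumns>
    Int2DView( const int (&array)[TRows][TColumns] )
    : data( &array[0][0] ), rows( TRows ), columns( TColumns )
    {}

    // Element access that uses the stored extents instead of compile-time sizes.
    int At( std::size_t row, std::size_t column ) const
    {
        return data[ row * columns + column ];
    }
};
```

Code that receives such an object does not need to know the size of the higher dimension at compile-time, which is the information that gets lost when a native multi-dimensional array is boxed directly.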

Type std::any does not provide any more support, even the size of one-dimensional arrays is not stored there.

5.4 Boxing Vector Types

The term "Vector Types" here means collection type std::vector<T, std::allocator<T>> and similar custom (3rd-party) types that store their elements in a single chunk of memory.

ALib Boxing provides built-in support to customize the boxing of class std::vector.

Note
For built-in support of vector types of other (3rd-party) libraries, check out namespace alib::boxing::compatibility and its sub-namespaces.

With the customization, objects of type std::vector are boxed to C++ arrays of the templated element type. The customization requests the vector's allocated memory (method std::vector::data) and stores this pointer along with the vector's size.
The advantage of this approach is (as with any non-bijective type mapping) that the code that processes boxes needs to check for C++ arrays of a certain element type only. Separate checks for vector types are not needed.

As it has not been discussed yet how custom boxing is finally performed, for now all that needs to be known is that the non-injective boxing of objects of type std::vector to one-dimensional array types can be enabled per compilation unit by simply including header file alib/compatibility/std_boxing.hpp .

If so, we can feed method ProcessArray sampled above with objects of type vector:

int intArray[3] = { 1 , 2 , 3 };
std::vector<int> intVector = { 4 , 5 , 6 };
ProcessArray( intArray );
ProcessArray( intVector );

The output will be:

int[3]= { 1 2 3 }
int[3]= { 4 5 6 }

The built-in customization does not allow to unbox type std::vector from boxed C++ arrays. Again this is a design decision, technically this would be possible. The rationale for this is that unboxing to std::vector would impose a memory allocation and a deep copy of the data.

Therefore, such unboxing should be performed only with very explicit code. With the inclusion of the compatibility header named above, a templated, inline function for this task is already provided . This is its simple source code:

template<typename TElement>
inline void CopyToVector( ::std::vector<TElement>& target, const Box& box )
{
    target.reserve( target.size() + static_cast<size_t>( box.UnboxLength() ) );
    for( integer i= 0 ; i < box.UnboxLength() ; ++i )
        target.emplace_back( box.UnboxElement<TElement>( i ) );
}

With that, unboxing a std::vector from a boxed C++ array is done as sampled here:

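int intArray[3] = { 1 , 2 , 3 };
Box box= intArray;
std::vector<int> intVector;
CopyToVector( intVector, box );   // appends the boxed elements (namespace qualification omitted here)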

5.5 Rationale and Technical Background

Technically, C++ arrays are boxed by storing the pointer to the first element together with the array's length.

While we have not yet discussed the possibilities of customizing the boxing of a certain type, it can be said here that C++ array types constitute an exception: Their way of boxing is not customizable.

Note
Again, this is due to the fact that the language syntax does not allow a single template parameter to denote C++ array types of arbitrary length. As we have seen before, it is not even possible to unbox the C++ array type. Therefore, the interface of class Box allows to either unbox a pointer to the start of the array or a single element.

But what is possible, is to customize the boxing of other types (structs and classes) to result in the same boxed type as if a native C++ array was boxed. This was demonstrated in the previous tutorial section. Consequently such types might also well be unboxed from boxes created originally by C++ arrays.
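
To illustrate the storage scheme described in this section, here is a strongly simplified sketch in standard C++. Class MiniArrayBox and all of its members are hypothetical; the real class Box is implemented differently and offers far more. The sketch stores the element type, a pointer to the first element and the length, and it accepts std::vector as a second source that results in the same boxed array type.

```cpp
#include <cstddef>
#include <typeinfo>
#include <vector>

// Hypothetical miniature of an array-aware box. Not ALib's implementation.
class MiniArrayBox
{
    const std::type_info* elementType;  // run-time type information of the elements
    const void*           data;         // pointer to the first element
    std::size_t           length;       // number of elements

  public:
    // Boxes a native one-dimensional array.
    template<typename T, std::size_t N>
    MiniArrayBox( const T (&array)[N] )
    : elementType( &typeid(T) ), data( array ), length( N )
    {}

    // Boxes a vector as if it were a native array of the same element type.
    template<typename T>
    MiniArrayBox( const std::vector<T>& vector )
    : elementType( &typeid(T) ), data( vector.data() ), length( vector.size() )
    {}

    // Guesses the element type.
    template<typename T>
    bool IsArrayOf() const  { return *elementType == typeid(T); }

    std::size_t UnboxLength() const  { return length; }

    // Returns the start of the array. The caller has to check IsArrayOf<T>() first.
    template<typename T>
    const T* UnboxArray() const  { return static_cast<const T*>( data ); }
};
```

Only the pointer and the length are copied, never the elements. The boxed source therefore has to outlive the box, and the processing code cannot tell whether the elements originally lived in a native array or in a vector.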

The special treatment of one-dimensional arrays with ALib Boxing imposes advantages and disadvantages and hence is the result of a design decision.

The disadvantages are:

  • The array length needs to be stored with every Box object. The space for it exists in each and every instance of class Box, whether it is used or not.
    In fact, the footprint of type Box on a standard 64-bit machine is 24 bytes (three times the size of a pointer). In contrast, the size of type std::any is only two thirds of that, namely 16 bytes.
  • The complexity of the API, and thus of its use, increases. For example, extra interface methods had to be provided that allow checking for boxed arrays in general, guessing array types and unboxing one-dimensional arrays or single elements of those.

The advantages of the approach taken are:

  • One-dimensional arrays can be boxed along with their length in a type-safe way. Precisely spoken, ALib Boxing distinguishes between a simple pointer to a type and a one-dimensional array of that type.
  • The provided customizations of boxing type std::vector and optional customizations of boxing similar container types (of any arbitrary 3rd-party libraries!) unify the boxed type to be a simple C++ array type. With such, the processing code can uniquely handle boxes from arbitrary "vector-like" sources.
  • The additional storage capacity used to store the length of an array is available for other use in the case of boxing scalar types.

It will be discussed in a later chapter, that array-boxing is especially helpful in the domain of string types: Arbitrary string types can be boxed as nothing else but simple one-dimensional character arrays. This way, this messy bunch of types, coming from tons of 3rd-party libraries, can all be aggregated to the very same type!

5.6 Exception With Boxing The Length Of Character Arrays

For one-dimensional character array types char[], wchar_t[], char16_t[] and char32_t[] the TMP constructor of class Box shortens the stored array length by one.

The rationale for this is that in most cases, boxed character arrays are string literals. String literals are zero-terminated arrays, hence the following line compiles:

    char string[4]= "123";

If a length of three was given, compilation would fail.

With this exception in place, character strings are stored with the "right" size. The term "right" is justified if one takes the view that zero-termination is not a nice property of strings: The termination character is simply "forgotten" when boxing. The benefit is that the length stored in the box represents the true length of the given string!
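
The arithmetic behind this rule can be checked with standard C++. The helper BoxedStringLength below is hypothetical and only mimics the length calculation described above; it is not ALib code.

```cpp
#include <cstddef>

// A string literal of three characters occupies four array elements,
// because the compiler appends the terminating zero.
static_assert( sizeof("123") == 4 );

// Hypothetical helper: the length that would be stored for a character array
// of N elements, namely N minus the terminating zero.
template<typename TChar, std::size_t N>
constexpr std::size_t BoxedStringLength( const TChar (&)[N] )
{
    return N - 1;
}

static_assert( BoxedStringLength(  "123" ) == 3 );
static_assert( BoxedStringLength( L"123" ) == 3 );
```

Note that this calculation assumes that the last array element is indeed a terminator. A character array that is completely filled with payload characters would lose its last character under the same rule.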

This feature cannot be disabled. On the one hand, custom boxing is not available for C++ array types (for technical reasons, as already mentioned). On the other hand, no preprocessor symbol is provided to disable this behavior, as we cannot think of a use case where this behavior would not be acceptable. If it were disabled, too many dependent features of various ALib Modules would stop working and would have to be disabled as well.

6. Boxing Structs And Classes

So far in this manual we have only been boxing fundamental C++ types and C++ arrays. The only exception we saw was class std::vector<T> in the previous chapter. There, it was only explained that it uses a customized boxing, and nothing was said about how this works.
Still, customization is only explained in the next chapter, because for most custom composite types, ALib Boxing works very well "out of the box"!

The first thing we do is looking at a few samples.

6.1 Tutorial: Boxing and Unboxing Custom Classes

Let's have a simple custom class:

class SmallClass
{
    private:
        integer value;

    public:
        SmallClass( integer v )
        : value(v)
        {}

        integer Get() const
        {
            return value;
        }
};

With this in place, we can box, guess and unbox an object of that class:

SmallClass smallClass( 42 );
// boxing
Box box= smallClass;
// type checking
cout << "IsType<SmallClass>: " << box.IsType<SmallClass>() << endl;
// unboxing
cout << "Value within unboxed class: " << box.Unbox<SmallClass>().Get() << endl;

The output of this code sequence will simply be:

IsType<SmallClass>: 1
Value within unboxed class: 42

While this was very easy and straightforward, here comes the pitfall! We define a "bigger" class:

class BigClass
{
    private:
        integer value1;
        integer value2;
        integer value3;

    public:
        BigClass( integer v1, integer v2, integer v3 )
        : value1(v1)
        , value2(v2)
        , value3(v3)
        {}

        integer Get() const
        {
            return value1 + value2 + value3;
        }
};

With that, we use the same code as above:

BigClass bigClass( 1, 2, 3 );
// boxing
Box box= bigClass;
// type checking
cout << "IsType<BigClass>: " << box.IsType<BigClass>() << endl;
// unboxing
cout << "Sum of values within unboxed class: " << box.Unbox<BigClass>().Get() << endl;

Unfortunately, this code does not compile. The compiler complains twice, once with call IsType and also with Unbox . The error message is as follows:

static_assert failed due to requirement 'DefaultBoxingRule1'
    This type can not be unboxed by value: By default, values that do not fit into boxes
    are boxed as pointers.

This tells us that instead of type BigClass, type BigClass* was boxed! We need to fix the code above in three places: Twice for providing the pointer type as the template parameter to methods IsType and Unbox and also change operator. to operator->, when invoking method Get of the unboxed pointer:

BigClass bigClass( 1, 2, 3 );
// boxing
Box box= bigClass;
// type checking
cout << "IsType<BigClass*>: " << box.IsType<BigClass*>() << endl;
// unboxing
cout << "Sum of values within unboxed class: " << box.Unbox<BigClass*>()->Get() << endl;

Now the code compiles and runs fine. Its output is:

IsType<BigClass*>: 1
Sum of values within unboxed class: 6

"Easy!" you could say, because the static_assert helps to create this clear compiler message seen above. Just make it a pointer type, and here we go! Unfortunately, there is a next pitfall related with switching to pointers, but this is discussed in a next chapter.

Before we elaborate more theory, let's quickly finalize this tutorial part with a last, probably astonishing thing. From the compiler error message that said "...values that do not fit into boxes are boxed as pointers" an attentive reader might get suspicious and wonder: Does this mean that the opposite is also true? Are "fitting objects" just always boxed as values, even if a pointer to a fitting type is boxed?

This code gives the answer, as it compiles and runs well:

SmallClass smallClass( 1234 );
// boxing a pointer!
Box box= &smallClass;
// type checking for non-pointer
cout << "IsType<SmallClass>: " << box.IsType<SmallClass>() << endl;
// unboxing non-pointer
cout << "Value within unboxed class: " << box.Unbox<SmallClass>().Get() << endl;

Output:

IsType<SmallClass>: 1
Value within unboxed class: 1234

Consequently, trying to unbox a pointer to class SmallClass, leads to compiler error:

static_assert failed due to requirement 'DefaultBoxingRule3'
    This type can not be unboxed as pointer: Default boxing of types that fit into boxes
    and are copy constructible and trivially destructible, is performed by value.

6.2 Value vs. Pointer Boxing

By default, when a composite type (a struct or a class) is boxed, ALib Boxing checks whether a value of the given type "fits" into the data segment of class Box and from this decides if the type is boxed as value or as pointer. In both cases, the chosen type is used, no matter if a pointer to the type or a value is passed.

Note
In the case that value boxing is chosen and nullptr is boxed, the internal memory of the box (introduced in detail in the next chapter) is set to zero values.

A question now is: what types do fit in? The answer is quite simple. On a 64-bit platform, class Box is ready to store a pointer or any other 64-bit wide argument. In addition to that, due to the built-in ability of boxing one-dimensional C++ arrays, a second 64-bit value can be stored. With C++ array types, this member holds the array's length. With other types, it is available for free use.
As a result, two times 64 bits, hence 16 bytes, can be stored. If type BigClass from the previous tutorial section held only two values of type integer instead of three, it would fit in and would be boxed as value. Likewise, on a 32-bit platform, the usable value data of class Box is two times 32 bits, equalling 8 bytes.

A second constraint that defaults boxing of a type as pointers, is when a type is either not copy-constructible or not trivially destructible. Coincidentally, a good sample for such a type is one of the C++ standard library, that this module heavily uses: std::type_info. While on common platforms values of the type fit nicely into a box, the type is boxed as pointer type, because only references and pointers of it may exist. The TMP enabled constructors of class Box detect that and perform pointer boxing. Trying to unbox that type as value, leads to compiler error:

static_assert failed due to requirement 'DefaultBoxingRule2'
    This type can not be unboxed by value: By default, types that are not copy constructible,
    or not trivially destructible are boxed as pointers.
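
The two constraints can be combined into a single compile-time predicate. The following sketch uses only standard type traits; the name IsBoxedByValue and the sample types are hypothetical, and the size limit of two pointers is taken from the description above rather than from ALib's sources.

```cpp
#include <cstdint>
#include <type_traits>
#include <typeinfo>

// Hypothetical predicate: true if a type would be boxed by value under the
// default rules described above (fits into two pointer-sized slots, is copy
// constructible and trivially destructible).
template<typename T>
constexpr bool IsBoxedByValue =    sizeof(T) <= 2 * sizeof(void*)
                                && std::is_copy_constructible_v<T>
                                && std::is_trivially_destructible_v<T>;

// Sample types mirroring SmallClass and BigClass of the tutorial.
struct Small { std::intptr_t value; };
struct Big   { std::intptr_t value1, value2, value3; };

static_assert(  IsBoxedByValue<Small>          );  // one pointer-sized member fits
static_assert( !IsBoxedByValue<Big>            );  // three members exceed the limit
static_assert( !IsBoxedByValue<std::type_info> );  // not copy constructible
```

Because std::intptr_t has the size of a pointer, the assertions hold on 32-bit and 64-bit platforms alike.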

This design aspect of ALib Boxing might be surprising. In fact, it could be legitimately argued that this behavior is not along the design lines of C++. Consider that the following two lines of code:

    Box box1=  myValue;
    Box box2= &myValue;

create two boxes with the very same contents! And: without knowing the size of the type, a reader even can not tell if both times a pointer is boxed or if the objects are copied by value.

On the positive side of the two lines above is that a programmer does not need to care if she passes a value or a pointer, things will just be boxed to the right type. One of the answers why ALib Boxing is allowed here to trade "convenience" against pure C++ standards, is once more given from the limited set of scenarios where boxing should be used at all.

This and some other aspects should be discussed in the following few sections.

6.3 Non-Bijective Boxing - What Is Default And What Not?

Value and pointer boxing and its transparent treatment, constitutes a next aspect of non-bijective boxing, that we have already discussed in depth in chapter 3. Non-Bijective Type Relationships. By default, all pairs of type T and T* are boxed to either one of the two, just depending on the size of the type. This approach effectively reduces the number of types that need to be guessed when processing boxes by half.

In the next chapter we will see how boxing can be customized per type. This includes the option to redefine this automatic default treatment. Arbitrary combinations are possible:

  • The destination type of both T and T* can be "switched" from the one chosen by the default rules, to the complementary type.
  • Both types can be treated independently, including the definition of different target types (T → TX and T* → TY)
  • Unboxing can be forbidden for one of the types, or both.
  • Boxing can be forbidden for one of the types, or both.

Between the type mapping seen so far and this mapping of value and pointer types, two differences exist:

  1. The mapping of a type to a completely different type (like all signed integer types to type integer ) needs customizing. In contrast, the pointer/value mapping is a built-in default and needs customizing to be switched off.
  2. As we will see later, often the mapping of a type to a completely different type needs some custom "conversion" method. In contrast to this, the mapping between values and pointers of the same type can easily be performed by the TMP code autonomously. It is just a matter of applying either address operator '&' or indirection operator '*'.

The latter might be important to understand: The conversion with operators '&' and '*' is done as the very first step. It could be said, that in fact the complement type is boxed instead of boxing the given type itself.

6.4 Pointers To Fundamental Types

In previous chapter 4. Boxing Fundamental Types, nothing was said about boxing pointers to fundamental types. But this was only to avoid confusion at that point in time! Instead, it was explained that the non-bijective boxing groups all fundamental types into four sets:

  1. Signed integral types that get boxed to type integer
  2. Unsigned integral types that get boxed to type uinteger
  3. Types float and double that get boxed to type double.
  4. Character types char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.

Now, pointers to all fundamental types are boxed like their value counterpart. Likewise with structs and classes, the two boxes from the following sample:

    int i= 42;
    Box box1=  i;
    Box box2= &i;

receive the identical contents of type integer and value 42.

Note
The only exception are constant pointers to character types, for example const char* or const alib::character*. These are considered zero-terminated strings and are boxed to C++ array types. A rationale for, and all details on, this exception will be given in chapter 10. Boxing Character Strings.

6.5 Constant vs. Mutable Box Contents

A next non-bijective behavior of ALib Boxing is constituted by following boxing rules:

If T is a non-constant value type, then:

  1. If value boxing applies for T, then types T and const T are both boxed as T.
  2. If pointer boxing applies for T, then types T* and const T* are both boxed as const T*.

The same two rules can be phrased from the perspective of the boxed types as follows:

  1. Mapped value types are unboxed as non-constant values, regardless if a constant or non-constant pointer or value was boxed.
  2. Mapped pointer types are unboxed as constant pointers, regardless if a constant or non-constant pointer or value was boxed.

The rationales for this are:

  1. Value boxing copies the object and thus can always return a non-constant copy. This reduces the size of the set of mapped value types by half, as it is irrelevant whether a constant or mutable object was boxed.
  2. Pointer boxing copies the pointer. To reduce the size of the set of mapped pointer types by half, ALib Boxing volunteers to always treat pointers to boxed objects as constants, even if a mutable object was boxed.

This all means that the information about whether a type was constant or mutable is lost when boxing it. Only when a processing code is "sure" that a boxed pointer points to a mutable object, it might apply a const_cast on the result of method Unbox if it intends to perform modifications (a static_cast can not remove constness). Furthermore, for convenience, method Box::UnboxMutable is available, which just calls Unbox() and performs the cast to return a mutable result.

Finally, it is important to understand that although types that are boxed as pointers are always treated as constant pointers, this is never noted anywhere. For example, template parameters of methods Box::IsType and Box::Unbox expect a non-const type.
The rationale for this is: Because all pointer types are returned as constant pointers, a need to pass keyword const with pointer types would be redundant.
The following code snippet should make this clear:

auto small= myBox.Unbox<SmallClass >();
auto big = myBox.Unbox< BigClass*>();
static_assert( ATMP_EQ( decltype(small), SmallClass ), "Error" );
static_assert( ATMP_EQ( decltype(big ), const BigClass* ), "Error" );

While BigClass is unboxed as const BigClass*, the template parameter just says <BigClass*>.

A corresponding static assertion will fail, if keyword const is used with type specifications.
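The two rules can also be written down as a type transformation. The following sketch is not ALib code: the trait stored_as_t is hypothetical, and it simplifies by letting any pointer argument stand for "pointer boxing" and any other argument for "value boxing", ignoring the question of whether a value fits into the box.

```cpp
#include <type_traits>

// Hypothetical trait (not ALib API): the type under which an argument of type T is stored.
template<typename T>
using stored_as_t = std::conditional_t<
        std::is_pointer_v<T>,
        const std::remove_pointer_t<T>*,   // T* and const T*  ->  const T*
        std::remove_const_t<T> >;          // T  and const T   ->  T

struct BigClass {};

static_assert( std::is_same_v< stored_as_t<const int      >, int             > );
static_assert( std::is_same_v< stored_as_t<      BigClass*>, const BigClass* > );
```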

6.6 Boxing Volatile Types

For types that are boxed as values, type attribute volatile is removed from the copy.

Note
It is in the user's responsibility to decide if it is a good idea to copy a volatile object to a box.

Volatile objects of types that are boxed as pointers, are not allowed to be boxed. If tried, compile-time assertion:

DefaultBoxingRule4
    Types boxed as pointers can not be boxed if volatile.

will be given.

Methods Box::IsType and Box::Unbox will statically assert if type specifier volatile was given with template parameter TUnboxable .

6.6 Life-Cycle Constraints Of Boxes And Their Contents

In the case of value boxing, performed for fundamental types and such composite types that "fit into" a box, all necessary data is copied into the box. Therefore, the life-cycle of the box instance is independent from the source value.

This is different when pointers are boxed. Here, the exact same rules as using normal pointers apply: A pointer must be dereferenced only if the object it points to is still valid.

Now one could argue that this becomes a little "delicate" in the moment a programmer does not know if a type is boxed by pointer or by value. Maybe she would think that an object just fits and therefore delete the source object after boxing, which of-course leads to undefined behavior if the type didn't fit!

The simple solution to this is: When an object of a composite type (struct or class) is boxed, the box should always be considered to have a life-cycle bound to the object, regardless of whether by coincidence the value fits into the box and is thus copied. A programmer should simply accept that this caution occasionally turns out to be unnecessary.

However, some thinking has always to be given. For example, reconsider how class std::vector<T> is boxed to a C++ array, as demonstrated in 5.4 Boxing Vector Types. Well, while this is not pointer boxing, still a pointer to the first array element is stored. Now a user of the standard C++ library knows that class std::vector<T> allocates dynamic memory for storing the values. This memory is deleted with the destruction of the vector. Hence, the life-cycle of the box is bound to its source object.
But it is even worse: During the life-cycle of the box, the vector must also not be modified! Appending a new element might or might not lead to a re-allocation of the internal array. Consequently, a certain level of care has to be taken when passing boxes around to different code entities.

Once more, the good news about the pitfalls of life-cycle-management lies in the limitations of typical use cases of ALib Boxing. In most cases, boxes are not even actively created by a software. Instead they are implicitly created when generic functions accept arguments of type const Box&. In this most frequent case, after the function returns, the current thread's stack frame is unwound, and the boxed argument objects are disposed!
A sample for this is given in appendix chapter C1. of this manual.

Should the processing function want to store some data that it received from a box argument "for later use", then such function itself should be responsible for creating copies of such boxed data that might not be available after the function returns. The function can quite easily perform this, as it has knowledge anyhow about how to interpret different boxed types and their contents.
A sample for this is implemented with ALib Expressions , which is discussed in more detail in chapter C.4 Use Case: Module ALib Expressions of this manual.
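The principle of "the receiver copies what it needs to keep" can be sketched without ALib. In the following example, IntArrayView is a hypothetical stand-in for a boxed array (a pointer to the first element plus a length) and Recorder is an invented receiver; neither name exists in the library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a boxed array: pointer to the first element plus a length.
struct IntArrayView
{
    const int*  data;
    std::size_t length;
};

// A receiver that needs the data after the call returns, and therefore copies it.
class Recorder
{
    std::vector<int> stored;   // owned copy, independent of the caller's objects

  public:
    // The view is only valid during this call; everything needed later is copied now.
    void Accept( const IntArrayView& view )
    {
        stored.assign( view.data, view.data + view.length );
    }

    const std::vector<int>& Stored() const { return stored; }
};
```

After a call like recorder.Accept( IntArrayView{ source.data(), source.size() } ), the caller is free to modify or destroy source, because the recorder no longer refers to the caller's buffer.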

With ALib Expressions, class Box is also used as a return value of functions. While this is a rarer case, it is absolutely legitimate and necessary to do so in that module. The constraint applied here is that the functions that return a box are responsible for ensuring that the contents are valid during a certain "scope" of the execution of the software. This scope is individual per library and in the case of ALib Expressions it is well documented in the corresponding Programmer's Manual.

Finally, if the contents of boxes need to survive their originating object's deletion, then a next option to achieve this, is given in chapter 12.6 Life-Cycle Considerations.

6.7 Comparison To std::any

It was already pointed out in chapter 3. Non-Bijective Type Relationships that C++ 17 type std::any does not offer non-bijective boxing. Value type T is boxed as value and type T* is boxed as pointer. Consequently, a processing function implemented with std::any always has to check both types, if it wants to support both.
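A short standard C++ example shows what "checking both types" means in practice. The function name read_int is invented for this illustration; only std::any and std::any_cast from header <any> are used.

```cpp
#include <any>

// With std::any, 'int', 'int*' and 'const int*' are unrelated contents. A receiver
// that wants to accept all of them has to test for each one explicitly.
inline int read_int( const std::any& a )
{
    if ( const int*        v = std::any_cast<int       >( &a ) )  return  *v;
    if ( int* const*       p = std::any_cast<int*      >( &a ) )  return **p;
    if ( const int* const* p = std::any_cast<const int*>( &a ) )  return **p;
    throw std::bad_any_cast();   // none of the supported types was stored
}
```

Note that even the constness of the pointee creates a further distinct case here, which corresponds to what section 6.5 describes as being unified by ALib Boxing.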

In the case of storing pointers with std::any, the same care about life-cycle management is needed as with using ALib Boxing .

In the case of values, things can become quite inefficient. As type any does not "automatically" switch to a pointer type, the copy constructor of objects provided as value is invoked. For the storage of the copied object a heap allocation is performed. Note that many developers underestimate the execution costs of allocating dynamic memory.
Furthermore, the copy constructor of many types performs a "deep copy". For example, in the case of std::string this means string data is copied. Besides the effort for copying the string data itself, a second heap allocation usually has to be performed for the internal string buffer (unless the string is short enough for an implementation's small-string optimization).

From the other perspective: while std::any allows to store values of "any" size, class Box does not. Even when boxing is customized, the conversion from the source object to the boxed data must not perform heap allocations (and store a pointer to them). ALib Boxing simply does not perform any object destruction or deletion.

In this sample:

MyClass  myClass1;
MyClass  myClass2;
Box box= myClass1;
    box= myClass2;

the boxing performed with the second assignment in the last line simply overwrites what was previously boxed, regardless of what the previous contents were. The benefit of this is that boxing is extremely fast and efficient code. Often, the compiler optimizes the assignment to a box to just writing the three integer-sized words directly.
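Why overwriting can be this cheap may be seen with a toy model. The types MiniPlaceholder and MiniBox below are hypothetical and do not reproduce the actual layout of class Box; they only assume a box that consists of one word of type information and two words of payload, with no destructor logic at all.

```cpp
#include <cstdint>
#include <type_traits>
#include <typeinfo>

union MiniPlaceholder          // hypothetical: two words of raw storage
{
    std::intptr_t words[2];
    double        floating;
    const void*   pointer;
};

struct MiniBox                 // hypothetical: type information plus placeholder
{
    const std::type_info* type;
    MiniPlaceholder       data;
};

// No destructor and no user-defined copy: assignment is a plain copy of a few words,
// and whatever was stored before is simply overwritten.
static_assert( std::is_trivially_copyable_v    <MiniBox> );
static_assert( std::is_trivially_destructible_v<MiniBox> );

inline MiniBox make_box( std::intptr_t value )
{
    return MiniBox{ &typeid(std::intptr_t), { { value, 0 } } };
}
```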

Once more, the rationale behind this design is found in the use cases of ALib Boxing, which do not need anything else and heavily benefit from this behavior.

To conclude this section, let's imagine two functions, one accepting a variadic list of std::any objects, the other a variadic list of boxes. While to the latter, just any variable can be passed "as is" because of the automatic choice of the right type, with the std::any implementation, each parameter has to be checked by the programmer to apply the right one of operators '&' or '*', which leads to efficient code and to the wanted behavior: copy or not!

7. Customizing Boxing

In previous chapters it was mentioned already several times that ALib Boxing can be customized per source type. From this, a good indication of what is customizable was already given. At this point in the manual, it is a good time for explaining the customization in detail.

7.1 Customization Features

The following customizations can be performed for a type:

1. Type Mapping
Customization allows to map a source type (aka "boxable type") to a specific target type (aka "mapped type"). For example, the built-in customization (which can be deactivated) maps all common signed integral types to the same destination type integer, unless they are bigger than the latter.

2. Type Conversion Mechanics
Depending on the customization performed, specific code for type conversion for both, boxing and unboxing may be provided which replaces the built-in default mechanics.

3. Manipulation Of Automatic Value-/Pointer Boxing
By default, ALib Boxing does not distinguish between boxing a value type T or its counterpart T*. The joint (same) mapped type of both is either one of them, depending on a value's physical size and whether a type is copy-constructible and trivially destructible.

This default behavior can be manipulated in arbitrary ways.

4. Disallowing Unboxing
If a type is mapped to a different target type, it might still be unboxable from this target type. Sometimes, to forbid unboxing can be just a voluntary design decision. In other cases, unboxing the original type might technically just not be feasible.
(A sample for both options had been given in 3.4 Tutorial: Unboxing Non-Injective Types. A further sample was already explained in 5.4 Boxing Vector Types)

5. Disallowing Boxing
Finally, boxing may also be completely forbidden for a type. With that, any assignment to an object of type Box fails compilation. Forbidding boxing, by the same token, disallows unboxing.

7.2 When Customization Is Needed

The good news is, that the defaults of ALib Boxing work well with most types. The most frequent use case for customization, is to perform non-bijective type boxing, to reduce the effort of processing boxes or to generalize a type to a common mapped type to enable the processing of otherwise unknown (source) types.

With type mapping, two scenarios may occur:

  1. The full information of the source type remains available.
    The obvious sample is the built-in customization of types int8_t or int16_t to type integer . As the latter is larger than the source types, all information contained in the source remains, except for the original type information.
    In such a case it generally is a design decision, if unboxing the source type is still allowed. We had already seen that the library itself has samples of both decisions implemented.
  2. Only a part of the information stored in the source object is boxed.
    Also for this, a sample was given already with chapter 5.4 Boxing Vector Types which explained the built-in optional use of class std::vector<T>. This type holds a pointer to its buffer as well as the length of the stored array. In addition, also the length of the allocated buffer is stored. This is equal to or greater than the array length. With the optional built-in boxing, this information is not stored. Instead the type is boxed as a C++ array type, hence only the pointer to the buffer and its fill-length survive.
    While in this specific case, unboxing is still feasible (by creating a new instance and copying the data into it like done by tool function CopyToVector ), in more complex cases, unboxing might not be feasible at all. The recommendation is to disallow unboxing in general for partly stored types.

7.3 Type Traits Struct T_Boxer

As noted in chapter 2.4.1 Templated Approach, this ALib Module uses template meta programming (TMP) for boxing, type guessing and unboxing. With this paradigm, so called "type traits" are frequently used. Simply put, type traits enable the compiler to choose different code when compiling templated methods or functions.

Typically, type traits are implemented by a templated struct. The non-specialized definition of the struct sets the defaults, by adding default types, functions etc. Then, specializations of the struct for specific types can be given, from library internal or external code. The C++ language allows virtually arbitrary changes to the original struct when specialized, including even changing the type's inheritance relationship, changing the signature of methods, leaving out entities and adding new ones. However, with TMP, the documentation of traits structs tells programmers which properties the specialized struct needs to provide.

Note
This is one major criticism of C++ TMP. Due to not (or only sparsely) inventing new keywords and language syntax to support TMP, it is considered unsafe and hard to implement.
While the latter is absolutely true, it is not unsafe or otherwise dangerous. Some software companies even disallow the use of TMP in their own code, though of-course not in respect to using libraries. The C++ standard library itself makes very extensive use of TMP.
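For readers who are new to the pattern, here is a minimal, generic example of a type traits struct. It is unrelated to ALib: the names T_Describe and describe are invented for this illustration.

```cpp
#include <string_view>

// Non-specialized definition: sets the defaults for every type.
template<typename T>
struct T_Describe                       // hypothetical traits struct, not part of ALib
{
    static constexpr std::string_view Name     = "unknown";
    static constexpr bool             IsNumber = false;
};

// Specialization for a specific type; it could equally be given by external code.
template<>
struct T_Describe<int>
{
    static constexpr std::string_view Name     = "int";
    static constexpr bool             IsNumber = true;
};

// A templated function: the compiler chooses the code according to the traits of T.
template<typename T>
constexpr std::string_view describe( const T& )
{
    if constexpr ( T_Describe<T>::IsNumber )  return T_Describe<T>::Name;
    else                                      return "not a number";
}
```

Here, describe(42) yields "int", while describe(3.14) yields "not a number", because no specialization for double was given and the defaults apply.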

This design pattern of using type traits is also leveraged with ALib Boxing . The type traits struct that is to be specialized for customization is given with alib::boxing::T_Boxer.

Attention
Two code units must use the same custom boxing settings for all types that they share by passing boxes. Otherwise, this library has undefined behavior. The reason for this, was already explained in chapter 4.4 Disabling The Default Customized Boxing.
This means that specializations of type traits struct T_Boxer, need to be shared between all code units in question. In practice, specializations for this reason are made in header files and such are to be included by each code unit that boxes or unboxes shared types.

Type traits struct T_Boxer is well documented and should be referred to for all details. The specialization of the struct can optionally be performed using helper macro ALIB_BOXING_CUSTOMIZE and its siblings.

Instead of repeating what is said in the reference documentation of the struct and macro, this manual rather gives various real life samples along the lines of the important use cases.

7.4 Tutorial: Mapping Type 'int16_t' to Type 'integer'

The mapping of type int16_t to integer was already used as a sample in various parts of this documentation. Let's now look at how this is done with the built-in customization of type int16_t. This piece of code does the job:

namespace alib { namespace boxing {
template<> struct T_Boxer<int16_t>
{
    using Mapping= TMappedTo<integer>;

    static constexpr
    Placeholder Write( int16_t const & value )
    {
        return Placeholder( static_cast<integer>( value ) );
    }

    static void Read( const Placeholder& src); // no implementation given, never called
};
}}

This is what is done:

  • Struct T_Boxer is specialized for source-type int16_t. The type that a specialization is made for, always denotes the C++ source type that is supposed to be boxed differently.
  • Type definition Mapping, specifies the target type. But integer is not given as is, but "wrapped" as the template parameter of helper struct TMappedTo .
    The purpose of this helper is to enable array boxing: If we wanted the boxed (mapped) type to be an array of integer, then we had used TMappedToArrayOf .
  • Static method Write is defined, but with a different, alternative signature as documented with T_Boxer::Write . The common method of return type void, has to write a representation of the given object into the placeholder of the box given as an argument. Such writing has to be compatible with how the target type would write its value into the placeholder.
    This alternative version, creates and returns a placeholder value instead.
    Note
    More information on when and how this alternative version is to be used is found in later chapter 12.3 Optimizations With "constexpr"-Boxing.
    Whichever version is used, in this case a simple cast is all that has to be done. This cast ensures that the 16-bit value is stored in the same "physical format" as values of type integer themselves. With that, unboxing the value will work, no matter if the original value was of type int16_t or integer.
  • Finally, method Read is declared. However, it is declared to return void, instead of the source type int16_t. Declaring Read to return void disables unboxing! And well, as it is disabled, no implementation of the function needs to be given.

Note, that with non-bijective type mapping, all boxable types (source types) that are mapped to the same destination type, have to "agree" to write the data in the same format. It should be easy to understand that if doing otherwise, the result is undefined behavior. The format that default boxing, as well as built-in customized boxing use, is documented with union Placeholder . In a later chapter, more information on this class is given.
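The two key points, a shared storage format and a Read method that returns void, can be reproduced in a self-contained toy model. Everything in the following sketch is hypothetical: MiniPlaceholder and MiniBoxer merely mimic the roles of Placeholder and T_Boxer, integer is an assumed stand-in for ALib's type of the same name, and is_unboxable_v shows just one conceivable way of detecting a disabled Read method at compile-time.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

using integer = std::intptr_t;          // assumed stand-in for alib::integer

union MiniPlaceholder                   // hypothetical, much simpler than ALib's Placeholder
{
    integer i;
    double  d;
    constexpr explicit MiniPlaceholder( integer value ) : i( value ) {}
};

template<typename T> struct MiniBoxer;  // hypothetical counterpart of T_Boxer

template<> struct MiniBoxer<integer>
{
    using Mapping = integer;
    static constexpr MiniPlaceholder Write( const integer& value )       { return MiniPlaceholder( value ); }
    static constexpr integer         Read ( const MiniPlaceholder& src ) { return src.i; }
};

template<> struct MiniBoxer<std::int16_t>
{
    using Mapping = integer;            // same mapped type, hence the same storage format
    static constexpr MiniPlaceholder Write( const std::int16_t& value )
    {
        return MiniPlaceholder( static_cast<integer>( value ) );
    }
    static void Read( const MiniPlaceholder& src );   // returns void: unboxing disabled
};

// One conceivable way to detect a disabled Read method at compile-time.
template<typename T>
constexpr bool is_unboxable_v =
    !std::is_void_v< decltype( MiniBoxer<T>::Read( std::declval<const MiniPlaceholder&>() ) ) >;

static_assert(  is_unboxable_v<integer     > );
static_assert( !is_unboxable_v<std::int16_t> );
```

Because both specializations write member i, a value boxed from std::int16_t can be read back through MiniBoxer<integer>::Read without knowing the original type.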

Instead of providing all the code "manually", we could also pick and use one out of a set of provided macros:

ALIB_BOXING_CUSTOMIZE_NOT_UNBOXABLE_CONSTEXPR( int16_t, integer )

Technically, the differences are:

  • The namespace change and specialization syntax is done by the macro.
  • The mapped type does not need to (and must not!) be wrapped in TMappedTo<T>, this is done by the macro.
  • Method Write is defined as above, including a static_cast to the given destination type.
  • Method Read is declared with return type void.

The principal differences when using the macros, are:

  1. The code is less error prone.
  2. The code is better readable.
  3. Most important: Chances are high, that the code is compatible with future versions of the library.

7.5 Tutorial: Mapping Type 'std::vector<T>' to Type 'T[]'

In chapter 5.4 Boxing Vector Types, boxing of std::vector<T> was demonstrated. It was said that by including header file alib/compatibility/std_boxing.hpp a default customization was given. In comparison to the sample of the previous tutorial section, there is one small challenge here: The type is templated. The goal is now to define custom boxing for type std::vector<T> - of any element type T.

The C++ syntax supports templated specializations in a straightforward way. In chapter 10. Boxing Character Strings it will be shown that std::vector<T> is to be customized differently if T is a character type. Therefore, those have to be excluded from the templated specialization.

Here is the code for specializing the struct for all types but character types, taken from the header file named above:

template<typename TElement>
struct T_Boxer<std::vector<TElement>, ATMP_VOID_IF(!characters::TT_IsChar<TElement>::value) >
{
    /** Mapped type is \c TElement[]. */
    using Mapping= TMappedToArrayOf<TElement>;

    /**
     * Implementation of custom boxing for template class std::vector
     * @param target The placeholder of the target box.
     * @param value  The object to box.
     */
    static void Write( Placeholder& target, const std::vector<TElement>& value)
    {
        target.Write( value.data(), static_cast<integer>( value.size() ) );
    }

    /**
     * Forbid unboxing by declaring Read as void.
     * @param src Ignored.
     */
    static void Read( const Placeholder& src);
};

The - otherwise unused - second template parameter TEnableIf of T_Boxer is invalidated for character types, which will omit those in the specialization.

For all other types, this specialization uses helper type TMappedToArrayOf to wrap the destination type. This denotes that the type should be boxed to a C++ array type. Remember that C++ array types of arbitrary size can not be defined with a (non-templated) type definition. This is just not possible by the language.

The boxing method Write is so simple that its definition should not need any further explanation. Finally, like in the sample shown in 7.4 Tutorial: Mapping Type 'int16_t' to Type 'integer', method Read is declared to return void, which disables any unboxing of class std::vector<T>. If a code still tried to unbox one, the compiler would complain with something like this:

static_assert failed due to requirement 'CustomBoxingRule7'
    Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.

For templated specializations as shown here, no helper macro exists.
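The way a partial specialization can be "switched off" for certain element types may be tried out in isolation. The sketch below is a simplified model, not the library's code: MiniBoxer, ArrayView and is_char_v are hypothetical names, and std::enable_if_t is used in the position where the original uses macro ATMP_VOID_IF.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

template<typename T>
constexpr bool is_char_v =    std::is_same_v<T, char    > || std::is_same_v<T, wchar_t >
                           || std::is_same_v<T, char16_t> || std::is_same_v<T, char32_t>;

// Hypothetical result of "array boxing": pointer to the first element plus length.
template<typename TElement>
struct ArrayView
{
    const TElement* data;
    std::size_t     length;
};

// Primary template with an otherwise unused, defaulted second parameter.
template<typename T, typename TEnableIf = void>
struct MiniBoxer
{
    static constexpr bool IsCustomized = false;
};

// Partial specialization for std::vector<TElement>. For character element types the
// second argument is ill-formed, hence the specialization is silently not considered.
template<typename TElement>
struct MiniBoxer< std::vector<TElement>, std::enable_if_t< !is_char_v<TElement> > >
{
    static constexpr bool IsCustomized = true;

    // Only the buffer pointer and the fill-length are kept; the capacity is dropped.
    static ArrayView<TElement> Write( const std::vector<TElement>& value )
    {
        return { value.data(), value.size() };
    }
};

static_assert(  MiniBoxer< std::vector<int > >::IsCustomized );
static_assert( !MiniBoxer< std::vector<char> >::IsCustomized );
```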

7.6 Customizing Value And Pointer Boxing

In the previous sections, including the tutorial parts, we had only seen how value types T continued to be boxed as T, or sampled with class std::vector<T> how a type that would by default be boxed as T* is customized to always be boxed as a different value type, in this case a C++ array.

There are two further cases possible:

  1. A value type T which by default is boxed as value (because it just fits) should always be boxed as pointer type T*
  2. A type T should be boxed just differently, if given by value T or by pointer T*

Both variants are explained now.

Boxing both, T and T* as pointer T*:
Should - for whatever reason - a fitting (small) and copy-constructible and trivially destructible type be boxed as a pointer, a customization for the pointer type has to be given. The mapped type then is the same pointer type as the source type.
For example, if we wanted to have class SmallClass from a previous tutorial sample, to always be boxed as pointers, the customization would look like this:

ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass*, SmallClass* )

As the default boxing and unboxing mechanics work well with pointer types, we can simply use macro ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING for this.

Of-course, a different target type could be specified likewise. The important point here is, that if a specialization for T* is given and none for T, this customization is used for mapping both T and T* to T*.

Note
Within the ALib library as a whole (which makes quite some use of this basic ALib Module ) no fitting and copy-constructible and trivially destructible type exists with such a customization. We have not found a use case, yet.

Boxing types T and T* differently:
The last case revokes the non-bijective default behavior of boxing complement types T and T*. Instead, a one to one mapping is enabled.
All that is needed for this is to specify just both customizations:

ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass , SmallClass  )
ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass*, SmallClass* )

Again, different mapped type and custom Write/Read methods may be given, if other macros were used.

For this variant, valid use cases exist - although again, no ALib Module uses that internally. As a sample, let us stick to type std::vector<T>. We learned that with the inclusion of header alib/compatibility/std_boxing.hpp , values of and pointers to the type become boxed as C++ array type.
With this custom boxing, internal information of the vector object is lost (the capacity). A processing function, can only access the currently stored elements, but the vector can not be unboxed to be modified. If unboxing a pointer was allowed, the unboxed vector could be modified (what of-course would modify the original object).
This could rightfully be wanted behavior and looking at C++ 17 type std::any tells us, that with its lack of non-bijective type mapping, this is even the only possible behavior there.

Because for a templated specialization, none of the helper macros can be used, the following templated specialization of type-traits struct T_Boxer has to be given:

namespace alib::boxing {
template<typename TElem>
struct T_Boxer< std::vector<TElem>* >
{
    using Mapping= TMappedTo<std::vector<TElem>*>;

    static void Write( Placeholder& target, std::vector<TElem>* const & value )
    {
        target.Write( value );
    }

    static std::vector<TElem>* Read( const Placeholder& src)
    {
        return src.Read<std::vector<TElem>*>();
    }
};
}

Still header alib/compatibility/std_boxing.hpp is to be included, as we want to keep mapping of value types to C++ arrays intact.

Without the customization shown above, the following code would not compile:

std::vector<int> intVector = { 4 , 5 , 6 };
Box box;
box= intVector; cout << "Unboxing int array: " << box.UnboxArray< int >() [0] << endl;
box= &intVector; cout << "Unboxing vector<int>*:" << ( *box.Unbox < std::vector<int>* >() )[1] << endl;

The compiler would complain in line 4:

static_assert failed due to requirement 'CustomBoxingRule1'
    This pointer type T* can not be unboxed, because custom boxing is defined for value type T,
    while no custom boxing is defined for pointer type T*.

When patiently reading further, a next compiler error tells us:

static_assert failed due to requirement 'CustomBoxingRule9'
    Customized boxing forbids unboxing value type T ('T_Boxer<T>::Read' returns a different type),
    while no customization for this pointer type T* was given.

With the additional customization, the code compiles fine and the output is:

Unboxing int array: 4
Unboxing vector<int>*:5

7.7 Tutorial: Conditional Customization

This manual can not go into all the details of TMP. Therefore, this tutorial section only gives an example and an indication of what is possible.

We had seen, that specializing type traits struct T_Boxer for a single type has the following syntax:

template<>
struct T_Boxer<MyType>
{
    using Mapping=  TMappedTo<MyTargetType>;
    ...
};

To do the same in a templated fashion for a generic type, we used:

template<typename T>
struct T_Boxer< MyGeneric<T> >
{
    using Mapping= TMappedTo<MyTargetType>;
    ...
};

This maps a whole set of types to the same target type. But how about other sets of types? Sets that are not defined by generics? For example, an obvious question is: how can a type and all its derived types be customized at once?

All that is needed to achieve this is a little template meta programming. To prepare for that, type traits struct T_Boxer is equipped with a second template parameter of typename type. The reason why this parameter is easily overlooked (in most code samples it is not visible) is that it defaults to void. Its identifier name is TEnableIf. The type is referred to neither within the struct itself nor anywhere else in the code.

Note
Whenever within ALib a template argument of a type traits struct carries the name TEnableIf, this indicates that it is used for the purpose of conditional specialization.

A sample should demonstrate how this can be used. Consider the following two types:

struct MyBase
{
    integer value1;
    integer value2;

    MyBase( integer v1, integer v2 ) : value1(v1), value2(v2) {}
};

struct MyDerived : public MyBase
{
    integer ExtendedData;

    MyDerived( integer v1, integer v2, integer v3 ) : MyBase(v1, v2), ExtendedData(v3) {}
};

We do a TMP enabled customization for type MyBase and all derived types:

namespace alib::boxing {
template<typename TBaseOrDerived> struct T_Boxer< TBaseOrDerived,
    // The second, otherwise defaulted type is set to a 'legal' type (here void), in the case that given
    // template type TBaseOrDerived is a MyBase or derived of it.
    // Otherwise the type expression below does not exist and hence the whole specialization
    // becomes illegal - and thus is not used by the compiler!
    typename std::enable_if< std::is_base_of<MyBase, TBaseOrDerived>::value >::type
>
{
    // Type mapping is fixed to 'MyBase'.
    using Mapping= TMappedTo<MyBase>;

    // This simple sample class fits into the placeholder. Hence, we just cast down the derived
    // type and write it to the placeholder.
    // With more complex scenarios, different things could be done. For example, virtual methods
    // might be invoked to evaluate the data that is to be boxed in a type specific way.
    static void Write( Placeholder& target, const TBaseOrDerived& src)
    {
        target.Write( static_cast<MyBase>( src ) );
    }

    // Read returns 'MyBase'. This implies that only this type can be unboxed, all derived types
    // are not unboxable.
    // With more complex scenarios, the return type could also be 'TBaseOrDerived', which
    // would enable to unbox any derived type. Furthermore, it could be a conditionally evaluated
    // type, which would allow unboxing for some types of the set only!
    // (A sample for the latter will be seen later in this manual.)
    static MyBase Read( const Placeholder& src)
    {
        return src.Read<MyBase>();
    }
};
}

The following sample proves that we achieved what we wanted, because it successfully compiles and when running, it does not produce a run-time assertion about unboxing wrong types:

MyBase myBase ( 1, 2 );
MyDerived myDerived ( 3, 4, 5 );
Box box;
box= myBase; cout << "Unboxing MyBase:" << box.Unbox<MyBase>().value1 << endl;
box= &myBase; cout << "Unboxing MyBase:" << box.Unbox<MyBase>().value1 << endl;
box= myDerived; cout << "Unboxing MyBase:" << box.Unbox<MyBase>().value1 << endl;
box= &myDerived; cout << "Unboxing MyBase:" << box.Unbox<MyBase>().value1 << endl;

The output is:

Unboxing MyBase:1
Unboxing MyBase:1
Unboxing MyBase:3
Unboxing MyBase:3

Finally, if we tried to unbox the derived type:

box.Unbox<MyDerived>();

the following compiler error would be given:

static_assert failed due to requirement 'CustomBoxingRule7'
    Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.

7.8 Union Placeholder

We have seen in the previous chapters, that even when boxing is customized, such customization often can conveniently use the simple default implementations of methods Placeholder::Write and Placeholder::Read .

This is due to the fact that the methods are implemented by a set of overloaded and TMP enabled methods, that go along well with fundamental and fitting value types.

Besides using the interface methods documented with union type Placeholder , it is also possible to directly access its different members and this way write and read whatever is needed for a certain use case.

Attention
If custom value conversion is performed for types that are not mapped to arrays, it might happen that type traits struct T_SizeInPlaceholder has to be specialized for the mapped type to provide the right "placeholder fill level", which is used by default implementations of box-functions FHashcode or FEquals (introduced in the next chapter).
More rationale on this topic is given with the documentation of struct T_SizeInPlaceholder. Also note the documentation of union Placeholder itself for information about storing custom types.

7.9 Bypass Custom Boxing With Identity-Boxing

There might be situations where an exception to the non-bijective, simplifying nature of ALib Boxing is needed. For example, as was explained already, if a set of custom types are all boxed to the same more fundamental mapped type, then of-course information is lost. Because customization of boxing has to be the same throughout all compilation units, a decision for such "reduced" boxing of a type is a global decision.

But there is an easy way to "bypass" custom boxing: All that is needed is to "wrap" an object into another one and box the wrapper type. A convenient wrapper type that can well be used is found in the C++ standard library with std::reference_wrapper. This templated class is very simple and stores a reference to the object given in its constructor.
Of-course, a wrapped object of type T has to be guessed and unboxed as type std::reference_wrapper<T> and furthermore different life-cycle restrictions might apply in contrast to using the customized boxing (according to the standard C++ mechanics and rules).

The following sample demonstrates the technique with two types:

  • float
    For fundamental types, std::reference_wrapper can not be used. Therefore a quick custom struct WrappedFloat is given. By default, fundamental type float is converted to double when boxed.
  • AString
    This is a class found in module ALib Strings . While an own, dedicated chapter about string-boxing is given later, so much should be explained here: The class manages an own string buffer and does not fit into a box's placeholder. Customization defines it to be boxed as character array. Thus, an AString can not be retrieved back when unboxing.

The wrapper types consequently are:

// A wrapper for float values
struct WrappedFloat
{
    float value;
};

// A wrapper for AString objects
using WrappedAString= std::reference_wrapper<alib::AString>;

With that in place, custom boxing can either be used or bypassed:

process( 3.1415f ); // boxed as double
process( WrappedFloat { 3.1415f } ); // float value wrapped, will not be converted to double
alib::AString astring("Hello");
process( astring ); // boxed as character array
process( WrappedAString( astring ) ); // AString wrapped, the whole object "survives" boxing

With method process defined like this:

void process( const alib::Box & box )
{
    // 'normal' boxed types
    if      ( box.IsType<double>()         ) std::cout << "double value: " << box.Unbox<double>();
    else if ( box.IsArrayOf<char>()        ) std::cout << "string value: " << box.Unbox<std::string_view>();

    // wrapped types
    else if ( box.IsType<WrappedFloat>()   ) std::cout << "float value: "  << box.Unbox<WrappedFloat>().value;
    else if ( box.IsType<WrappedAString>() ) std::cout << "AString: "      << box.Unbox<WrappedAString>().get();

    std::cout << std::endl;
}

The output of the code above will be:

double value: 3.1415
float value: 3.1415
string value: Hello
AString: Hello
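
The life-cycle remark made above can be reproduced without ALib. The following is a minimal sketch that uses std::any as a stand-in for a box. This is an assumption made purely for illustration, as std::any performs no type mapping at all. The names WrappedString, describe and appendMark are hypothetical. The sketch shows that the wrapper type has to be guessed separately from the wrapped type, and that a stored std::reference_wrapper still refers to the original object, which therefore has to outlive the container.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>
#include <typeinfo>

// Hypothetical wrapper alias, analogous to WrappedAString above.
using WrappedString = std::reference_wrapper<std::string>;

// Type-guessing on a type-erased value: the wrapper type is distinct from
// the wrapped type and therefore is tested for separately.
std::string describe( const std::any& value )
{
    if ( value.type() == typeid(std::string) )
        return "copy: " + std::any_cast<const std::string&>( value );

    if ( value.type() == typeid(WrappedString) )
        return "reference: " + std::any_cast<WrappedString>( value ).get();

    return "unknown";
}

// Modifies the original object through the stored wrapper.
void appendMark( const std::any& value )
{
    assert( value.type() == typeid(WrappedString) );
    std::any_cast<WrappedString>( value ).get() += "!";
}
```

After appendMark is invoked on a std::any that holds WrappedString( s ), the original string s is changed, while a std::any that holds a copy of s is not affected.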

7.10 Summary And Rationals

In the previous sections of this chapter, most details of "custom boxing" were explained. Technically, custom boxing allows to modify the type mapping as well as the way object data is boxed and unboxed.

While this chapter was quite lengthy, and while template meta programming and the creation of specializations of type traits structs may be "dubious topics" to less experienced C++ programmers, a user of this ALib Module should not fear too many troubles in respect to custom boxing.
This is true, because:

  1. The defaults work well for most types, hence custom boxing is not often needed.
  2. The customization is quite simple: All that is needed to be given is:
    • The source and the target type
    • Two rather short static methods that perform a simple data conversion.

In consideration of this effort, the benefits are huge. The main goal that custom boxing achieves is to further shrink the set of mapped types. Using ALib without any modification, this set is already reduced dramatically:

  • Any value type T and its corresponding pointer type T* are boxed to the same type.
  • The built-in customizations reduce the set of fundamental types to only three main types.
  • Types std::vector<T> and similar 3rd party types, are boxed to corresponding C++ arrays.

Reducing the set of types does not only reduce the effort of guessing types when processing boxes. Often, a custom type is mapped to a common, "already known" type. In this case, a processing function does not even need to be changed. This is very helpful if a programmer cannot change such a function, for example because it resides in a library or because a co-worker is responsible for it.

We will see in later chapter 10. Boxing Character Strings that ALib Boxing by default maps arbitrary (3rd-party) string types to simple C++ character arrays. A function that processes boxed character arrays will this way generically be able to digest any 3rd-party string without the need for adaptation.

Of-course, there are limits in achieving generic processing of arbitrary boxed types by just mapping the types. While strings are a great sample, often it is not an option to just map a type to something else, maybe because in other places more of the original type's data is needed and boxing it as a pointer to the original is mandatory.

To overcome these limitations, let's quickly move on to the next chapter of this manual!

8. Box-Functions

You, the reader of this manual, probably know all details of C++ and virtual functions. The first section of this chapter provides a brief recap of some basic knowledge on that matter. You are free to skip it!

People in a hurry might also want to skip section 8.2 Function Declarations, Implementations and Registrations and instead jump right into code with 8.3 Tutorial: Implementing A 'ToString()' For Boxes.

8.1 Introduction: OO-Languages And Virtual Classes

In previous chapters, it was explained that mapping several C++ types to the very same boxed type, does not only reduce the efforts of processing boxes. It further allows to process boxes created from "unknown" types that are mapped to a known boxed type. While this is well feasible for some types, for others type mapping may not be an option, when too much information gets lost.

To resolve the general problem, object oriented languages offer "virtual functions": Instead of performing the task themselves, the processing code calls a type-specific, "virtual" function on a given argument. This way, the responsibility is passed back to the object that is processed.
But how is this technically solved? How does the processing function know the address of the function that is to be called, when it is a different function for each object type?

C++ uses run-time type information for that. While non-virtual class methods are statically bound at compile-time (respectively at link-time), the address of a virtual function call is only evaluated at run-time. As soon as a first virtual function is declared with a class (or one of its base classes), typical compilers create a virtual function table (aka "vtable") for that type and add a pointer to this table to each new instance of it. Such types are called "polymorphic types" or just "virtual types".

Adding this vtable pointer increases the footprint of instances of virtual C++ types by the size of one pointer. Together with the loss of run-time performance, this increase of object size is the general disadvantage of virtual classes. It is technically just not avoidable: If a processing function should be able to call variants of methods tailored to types that it does not "know" at compile-time, then the memory addresses of these methods have to be passed together with the argument object.

Virtual functions are just one of two purposes of the vtable in C++. Its second use is with C++ keyword dynamic_cast<T>. While a static_cast is resolved by the compiler, a dynamic_cast<T> is performed at run-time by special code inserted by the compiler. This code performs a type-check using the run-time type information that is reachable through the vtable. On failure, a dynamic_cast to a pointer type returns nullptr, while a dynamic_cast to a reference type throws std::bad_cast.
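
As a brief recap in code, the following stand-alone sketch shows both uses of run-time type information in standard C++. The types Shape, Circle and Square as well as the function names are made up for this illustration. Function process calls a virtual function without knowing the concrete type, and the two helper functions show the behavior of a failing dynamic_cast in its pointer form and in its reference form.

```cpp
#include <cassert>
#include <string>
#include <typeinfo>

// A polymorphic type: it declares virtual functions.
struct Shape
{
    virtual ~Shape() = default;
    virtual std::string Name() const { return "shape"; }
};

struct Circle : Shape { std::string Name() const override { return "circle"; } };
struct Square : Shape { std::string Name() const override { return "square"; } };

// The processing code does not know the concrete type.
// The function that is called is determined at run-time.
std::string process( const Shape& shape ) { return shape.Name(); }

// Pointer form: a failed dynamic_cast yields nullptr.
bool isCircle( const Shape* shape )
{
    return dynamic_cast<const Circle*>( shape ) != nullptr;
}

// Reference form: a failed dynamic_cast throws std::bad_cast.
bool isCircleRef( const Shape& shape )
{
    try
    {
        (void) dynamic_cast<const Circle&>( shape );
        return true;
    }
    catch ( const std::bad_cast& )
    {
        return false;
    }
}
```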

We learned in chapter 2.4 How The Basics Work that ALib Boxing stores run-time type-information along with the boxed data. You could rightfully say that the disadvantage of needing a vtable pointer with instances of virtual C++ classes is of the very same nature as the need to store type-information with boxes. While C++ uses this pointer for type-checks and virtual function calls, so far we have seen ALib Boxing using it for type checks only.

Well, here is the good news: Also ALib Boxing supports virtual function calls!

8.2 Function Declarations, Implementations and Registrations

In the sense of the C++ language, any function that is invokable on an instance of type Box is "virtual", because run-time type information is used to determine the right version of the function for a box containing a certain mapped type. However, from the perspective of ALib Boxing , there is nothing like a "static" or "link-time" function. As a consequence, this manual of module ALib Boxing does not talk about "virtual functions" but just "functions" or "box-functions".

This section explains the three steps to define box-functions.

8.2.1 Function Declarations

Type-safeness is a mandatory feature of any C++ software. ALib Boxing is a type-safe software, although - for technical reasons - some heavy use of keyword reinterpret_cast is made when boxing and unboxing values. The type-safeness that is lost in that moment at compile-time is regained at run-time with the use of templated interface methods. For example, if T in a call to Box::Unbox does not match the boxed type, a run-time assertion is raised. This can be prevented by testing with Box::IsType first, which never asserts.

With box-functions, the situation is similar: For technical reasons, the vtable of a box stores the address of invokable functions as a void*. However, when the function is invoked, a template parameter used with the invocation assures that the signature of the function stored matches the function parameters given.

We call the template parameter types used with function invocations "FunctionDescriptors". Such FunctionDescriptor is just a struct with a single type definition.
Here is a sample:

struct FMyDescriptor
{
    using Signature = bool (*) ( const Box& self, int arg1, double arg2 );
};

Besides the requirement that the type definition in the struct is named "Signature" and that it denotes a function pointer, only two further conditions need to be met:

  1. The first argument of the function has to be of type 'const Box&' or 'Box&':
    When a function is invoked on a box, a reference to the box is passed as the first argument. Most box-functions do not modify the box and use const Box&. We will learn in a later section about the difference of invoking constant and non-constant box-functions.
    The name of this parameter does not need to be specified. Internally, ALib Boxing nevertheless does so and always uses the name "self".

  2. The return type of box-functions must be default-constructible:
    When invoking a function on a box, the result of that invocation is returned. As it might happen that no function is defined for a specific mapped type, a fallback value is needed: in that case, a default-constructed value of the return type is created and returned.

    If a function should return a type which is not default-constructible, the approach is to declare the function void and to add an output parameter instead, for example a pointer to a pre-constructed object, or a pointer to a pointer if the object should be dynamically allocated by the function.
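
The two conditions can be expressed as a compile-time check. The following sketch is not part of ALib: helper SignatureTraits, function IsValidDescriptor, type Connection and the two descriptors are hypothetical, and Box is only forward-declared as a stand-in for alib::Box. It contrasts a descriptor whose return type is not default-constructible with the recommended variant that returns void and uses an output parameter.

```cpp
#include <type_traits>

struct Box;   // stand-in declaration; the real class is alib::Box

// Hypothetical helper that splits a Signature into return type and first parameter type.
template<typename T> struct SignatureTraits;

template<typename R, typename First, typename... Rest>
struct SignatureTraits<R (*)( First, Rest... )>
{
    using Return = R;
    using Self   = First;
};

// True if a descriptor satisfies both conditions named above.
template<typename TDescriptor>
constexpr bool IsValidDescriptor()
{
    using Traits = SignatureTraits<typename TDescriptor::Signature>;

    constexpr bool selfOk   =    std::is_same_v<typename Traits::Self, const Box&>
                              || std::is_same_v<typename Traits::Self,       Box&>;

    constexpr bool returnOk =    std::is_void_v<typename Traits::Return>
                              || std::is_default_constructible_v<typename Traits::Return>;
    return selfOk && returnOk;
}

// A type without a default constructor.
struct Connection
{
    explicit Connection( int h ) : handle( h ) {}
    int handle;
};

// Not usable: the return type cannot be default-constructed.
struct FOpenBad  { using Signature = Connection (*)( const Box& self ); };

// Usable: void return type, the result is passed through an output parameter.
struct FOpenGood { using Signature = void (*)( const Box& self, Connection** result ); };

static_assert( !IsValidDescriptor<FOpenBad >(), "return type is not default-constructible" );
static_assert(  IsValidDescriptor<FOpenGood>(), "void return type with output parameter"   );
```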

8.2.2 Function Definitions

The second ingredient needed are function implementations - one for each mapped type that is to be supported. Implementations can be defined globally or within a namespace. Furthermore, static member functions are likewise accepted.

However, it is always a good idea to place box-functions in an anonymous namespace of a compilation unit (aka non-header file). With that, they are hidden from the C++ linker and do not clutter a compilation unit's linker information.
It is possible to do so because the functions are not called using the linker or C++ virtual tables. Instead, ALib Boxing uses the C++ call operator() directly on their address stored in the vtable of the box.

8.2.3 Function Registration

The final step is to associate the function implementation with boxes of a specific mapped type. This is done with templated namespace function alib::boxing::BootstrapRegister.

Attention
Function registration and function invocation are not protected against race conditions of multi-threaded access. For this reason, it is mandatory to perform function registration exclusively while bootstrapping a software, when no threads are started, yet. Registrations can be made prior to bootstrapping ALib , respectively during or after phase BootstrapPhases::PrepareResources .
If for any reason registration is performed after ALib was bootstrapped, and module ALib Monomem is included in the ALib Distribution , then mutex GlobalAllocatorLock has to be acquired prior to invoking this function. This can be done with:
    ALIB_LOCK_WITH( alib::monomem::GlobalAllocatorLock )
Note that even when this lock is set, multi-threaded access to registration and/or box-function invocations is still not allowed.

The function uses two template parameters that have to be explicitly specified:

  1. The function descriptor type.
  2. The mapped type.

We had seen in chapter 7.3 Type Traits Struct T_Boxer how to denote a mapped type with field T_Boxer::Mapping : The C++ type has to be wrapped in either TMappedTo or TMappedToArrayOf . The same notation is used here.

Finally, the address of the box-function is to be passed to BootstrapRegister as a normal argument.

Note
Details on the internal implementation of boxes are given in later chapter 12.2.3 Technical Background On VTables.
An understanding of these details will make clear, why box-functions have to be registered at run-time.
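
To make the three steps more tangible, here is a toy model written in plain C++. It is explicitly not ALib's implementation: MiniBox, Registry, Register, Call, FDescribe and FDescribe_long are hypothetical names, and the lookup via std::map is chosen for brevity only. The sketch stores function addresses in a type-erased form and uses the descriptor's Signature to restore the type on invocation. It uses a generic function pointer type instead of void* for storage, because conversions between function pointer types are well-defined in standard C++.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Toy stand-in for a box: a type id plus a small value (not ALib's layout).
struct MiniBox
{
    std::type_index type;
    long            value;
};

// Generic function pointer type used for type-erased storage.
using AnyFunction = void (*)();

// Key of the registry: (function descriptor, mapped type).
using Key = std::pair<std::type_index, std::type_index>;

inline std::map<Key, AnyFunction>& Registry()
{
    static std::map<Key, AnyFunction> registry;
    return registry;
}

// Registration: both template parameters are given explicitly.
template<typename TDescriptor, typename TMapped>
void Register( typename TDescriptor::Signature function )
{
    Key key( std::type_index( typeid(TDescriptor) ), std::type_index( typeid(TMapped) ) );
    Registry()[ key ] = reinterpret_cast<AnyFunction>( function );
}

// Invocation: the descriptor restores the signature that was erased on registration.
template<typename TDescriptor, typename... TArgs>
auto Call( const MiniBox& box, TArgs&&... args )
{
    using Signature = typename TDescriptor::Signature;
    using Result    = decltype( std::declval<Signature>()( box, std::forward<TArgs>(args)... ) );

    Key key( std::type_index( typeid(TDescriptor) ), box.type );
    auto it = Registry().find( key );
    if ( it == Registry().end() )
        return Result();    // nothing registered: default-constructed value

    return reinterpret_cast<Signature>( it->second )( box, std::forward<TArgs>(args)... );
}

// Declaration: a descriptor with a single type definition.
struct FDescribe
{
    using Signature = std::string (*)( const MiniBox& self, const std::string& prefix );
};

// Implementation for mapped type 'long'.
inline std::string FDescribe_long( const MiniBox& self, const std::string& prefix )
{
    return prefix + std::to_string( self.value );
}
```

After a call to Register<FDescribe, long>( FDescribe_long ), an invocation of Call<FDescribe> on a MiniBox of type long yields the prefixed number, while the same invocation on a MiniBox of any other type yields an empty string.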

8.3 Tutorial: Implementing A 'ToString()' For Boxes

The previous section gave a detailed (rather lengthy) explanation about box-functions. This tutorial section now shows how simple their definition and use indeed is.

The goal of the sample we are looking at, is to enable boxes to write their contents to a string. In other programming languages, such function is often called ToString().

Here is the declaration of the function:

// Descriptor of box-function ToString.
// Implementations create a string representation of boxed values.
struct FToString
{
    // The function signature.
    //
    // @param self   The box that the function was invoked on.
    // @param buffer A string buffer used for string creation.
    using Signature = alib::String (*) ( const Box& self, alib::AString& buffer );
};

Besides the box itself, the function expects an AString defined with module ALib Strings . This is used as a buffer to write to. The return value is String , which is a lightweight string type, similar to C++ 17 type std::string_view.

Let's create three implementations for different types:

// anonymous namespace
namespace {
    // Implementation of FToString for boxed type 'integer'
    alib::String FToString_integer( const Box& self, alib::AString& buffer )
    {
        return buffer.Reset() << self.Unbox< integer>();
    }

    // Implementation of FToString for boxed type 'double'
    alib::String FToString_double ( const Box& self, alib::AString& buffer )
    {
        return buffer.Reset() << self.Unbox< double>();
    }

    // Templated implementation of FToString for array types
    template<typename T>
    alib::String FToString_array( const Box& self, alib::AString& buffer )
    {
        buffer.Reset() << "{";
        for( int i= 0 ; i < self.UnboxLength() ; ++i )
            buffer << ( i!=0 ? ", " : " " )
                   << self.UnboxElement<T>( i );
        return buffer << " }";
    }
}

First of all, it has to be noticed that unboxing from parameter self does not need type-guessing with Box::IsType . The reason is that each function is associated with boxes of a corresponding type and thus self always contains the right type.

The first two implementations simply unbox the right type and use AString::operator<< to convert the type.
The third function is templated. It is designed to be usable with different boxed array types. Unfortunately, a function template as such cannot be attached to boxes. Instead, an instantiation of the templated function has to be given for each boxed array type that we want to support. Such instantiation is implicitly performed by the compiler when passing the function to BootstrapRegister .

Let's register 4 functions that way:

void RegisterMyFunctions()
{
    // This lock is usually NOT NEEDED!
    // We do this, here because this sample code is run in the unit tests, when ALib is already
    // bootstrapped.
    // See note in reference documentation of function BootstrapRegister()
    ALIB_LOCK_WITH( alib::monomem::GlobalAllocatorLock )

    // registering FToString for type integer
    alib::boxing::BootstrapRegister<FToString, alib::boxing::TMappedTo       <integer> >( FToString_integer );

    // registering FToString for type double
    alib::boxing::BootstrapRegister<FToString, alib::boxing::TMappedTo       <double > >( FToString_double );

    // registering FToString for character arrays
    alib::boxing::BootstrapRegister<FToString, alib::boxing::TMappedToArrayOf<char   > >( FToString_array<char   > );

    // registering FToString for integer arrays
    alib::boxing::BootstrapRegister<FToString, alib::boxing::TMappedToArrayOf<integer> >( FToString_array<integer> );
}

A call to RegisterMyFunctions() needs to go to the bootstrap section of the process.

With all that in place, functions can be "called" with templated method Box::Call . It expects the function descriptor as a template type and the function arguments as its own arguments. Its return type is equivalent to the return type of the box-function!

The following code creates an array of boxes and calls the function on each of them in a loop:

// A sample array
integer intArray[4] {1,2,3,4} ;
// An array of 4 sample boxes
Box boxes[4];
boxes[0]= 5;
boxes[1]= 1.111;
boxes[2]= "Hello";
boxes[3]= intArray;
// the string buffer used with the function calls.
AString buffer;
// Generic loop over all 4 boxes
for( int i= 0 ; i < 4 ; ++i )
    cout << "box["<< i <<"].ToString(): \"" << boxes[i].Call<FToString>( buffer ) << '\"' << endl;

The output of the code above will be:

box[0].ToString(): "5"
box[1].ToString(): "1.111"
box[2].ToString(): "{ H, e, l, l, o }"
box[3].ToString(): "{ 1, 2, 3, 4 }"

We conclude this tutorial section with a test: What happens if we invoke the method on a box of a mapped type that no implementation is registered for? As we were lazy, for example uinteger is not covered:

AString buffer;
Box box= static_cast<uinteger>( 42 );
cout << "box.ToString(): \"" << box.Call<FToString>( buffer ) << '\"' << endl;

Running this does not assert! The output is:

box.ToString(): ""

Obviously an empty string was returned by Box::Call , without further complaints.

8.4 Default Functions

It is a design decision of ALib Boxing that calls to box-functions that are not registered for the actually boxed type do not assert. Method Box::Call just returns a default value of the designated return type, that's it. The rationale for this design is once more to favour convenience when handling boxes over other considerations. Processing code can use Box::GetFunction prior to invoking the function, if it wants to react to boxes that do not support a box-function.

Looking at virtual functions of OO-languages once more: There, virtual functions may or may not be specialized with each derived class. If a function is invoked on a derived class, the "best" implementation is chosen, by walking up the inheritance chain and choosing the first implementation found in a base class.

The type system of ALib Boxing is not hierarchical and does not know inheritance. But in theory there are at least two levels!

  1. A box with a mapped type, and
  2. A box of just any type.

And that is our little fallback: This library supports the definition of "default functions" that - if available - are used when no specific function is registered for the mapped type of a box.

Often, there is not much to do for them, because interpreting the Placeholder contents without knowing the type is not possible. Still, we will see in a later chapter that there are some good use cases for them.
Sometimes it is useful to implement and register a default function solely in debug-compilations of a software: These can then assert, write log file warnings or perform other appropriate actions.

Default functions are registered with namespace function BootstrapRegisterDefault . Compared to BootstrapRegister , the function omits the second template parameter specifying the mapped type.
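
The two-level lookup described in this section can be modeled in a few lines of plain C++. The following sketch is a toy, not ALib's implementation: MiniBox, FunctionTable and the two ToString functions are hypothetical, and the table is restricted to one fixed signature for brevity. Method Find loosely corresponds to querying a function before calling it, and method Call mirrors the tolerant behavior of returning a default-constructed value.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Toy stand-in for a box: a type id plus a small value (not ALib's layout).
struct MiniBox
{
    std::type_index type;
    long            value;
};

using ToStringFn = std::string (*)( const MiniBox& self );

struct FunctionTable
{
    std::map<std::type_index, ToStringFn> specific;            // level 1: per mapped type
    ToStringFn                            fallback = nullptr;  // level 2: any type

    // Returns the type-specific function if registered, otherwise the default
    // function, which may be nullptr.
    ToStringFn Find( const MiniBox& box ) const
    {
        auto it = specific.find( box.type );
        return it != specific.end() ? it->second : fallback;
    }

    // No function found at all: a default-constructed result is returned.
    std::string Call( const MiniBox& box ) const
    {
        ToStringFn function = Find( box );
        return function ? function( box ) : std::string();
    }
};

inline std::string ToString_long   ( const MiniBox& self ) { return std::to_string( self.value ); }
inline std::string ToString_default( const MiniBox&      ) { return "<unknown>"; }
```

With only ToString_long registered for type long, a call on a box of another type returns an empty string. Once ToString_default is set as the fallback, the same call returns "<unknown>", while boxes of type long still use their specific function.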

8.5 Tutorial: A Default ToString() Function

To continue the sample of section 8.3 Tutorial: Implementing A 'ToString()' For Boxes, a default implementation usable with any box of FToString should be developed. Here it is:

namespace {
    alib::String FToString_Default( const Box& self, alib::AString& buffer )
    {
        buffer.Reset();

        #if !ALIB_DEBUG
            if( !self.IsArray() )
                buffer << "Boxed <unknown>";
            else
                buffer << "Boxed <unknown" << '[' << self.UnboxLength() << "]>";
        #else
            if( !self.IsArray() )
                buffer << "Boxed <" << alib::lang::DbgTypeDemangler( self.TypeID() ).Get() << '>';
            else
                buffer << "Boxed <" << alib::lang::DbgTypeDemangler( self.ElementTypeID() ).Get()
                       << '[' << self.UnboxLength() << "]>";
            buffer << " (missing box-function FToString)";
        #endif

        return buffer;
    }
}

It is registered with BootstrapRegisterDefault :

// This lock is usually NOT NEEDED!
// We do this, here because this sample code is run in the unit tests, when ALib is already
// bootstrapped.
// See note in reference documentation of function BootstrapRegister()
ALIB_LOCK_WITH( alib::monomem::GlobalAllocatorLock )

// registering FToString default implementation
alib::boxing::BootstrapRegisterDefault<FToString>( FToString_Default );

We repeat the "failed" invocation we had with type uinteger and also test a call on a boxed array with an unknown element type. A third box repeats the call on a character array, which got a specialized implementation:

AString buffer;
double doubleArray[3] { 1.1, 2.2, 3.3 };
Box box1= static_cast<uinteger>( 42 );
Box box2= doubleArray;
Box box3= "Boxing rocks!";
cout << "box1.ToString(): \"" << box1.Call<FToString>( buffer ) << '\"' << endl;
cout << "box2.ToString(): \"" << box2.Call<FToString>( buffer ) << '\"' << endl;
cout << "box3.ToString(): \"" << box3.Call<FToString>( buffer ) << '\"' << endl;

The result is now:

box1.ToString(): "Boxed <unsigned long> (missing box-function FToString)"
box2.ToString(): "Boxed <double[3]> (missing box-function FToString)"
box3.ToString(): "{ B, o, x, i, n, g, , r, o, c, k, s, ! }"

8.6 Calls To Undefined Functions And Empty Boxes

It was already mentioned, that ALib Boxing is tolerant towards calling a function on a box whose mapped type is not associated with an implementation. The call is just not performed and instead, a default-constructed value of the according return type is returned by method Box::Call .
By the same token, a call of a function performed on a box that "does not contain a value" (see chapter 12.1 Void And Nulled Boxes) is likewise tolerated.

This design decision is once more justified with the common use cases for this module. The expectation of a programmer calling a box-function is: "Perform what is appropriate with the boxed type". And if there is no implementation, well, to do nothing is the appropriate action. Consequently, specific checks for the availability of function implementations can be omitted.

If code wants to take action on the fact that no type-specific implementation exists, or that neither a type-specific nor a default implementation exists, such availability can be queried using Box::GetFunction . The method's parameter searchScope controls which sorts of functions are searched. The method is likewise tolerant towards unset boxes.
If this is done, the returned function pointer already contains the function found, respectively is nullptr on failure. To avoid a repeated search for that same function with a subsequent Call, alternative method Box::CallDirect can be used, which omits the search and instead expects the function pointer as a first parameter.

Finally, to check whether a box does not contain a value prior to calling a box-function, type-guessing for type void is to be used with IsType() .

8.7 Function Calls On Mutable Boxes

It might happen that a box-function intends to change the contents of a box. In theory, such change could even include changing the mapped type, but changing the value only is probably a more common use-case.

Two things are needed to allow that:

  1. The type definition Signature of the function descriptor needs to specify mutable type Box& for first parameter self .
  2. The non-constant version of Box::Call has to be selectable by the compiler, hence the box that Call is invoked on has to be mutable.
    Method Box::CallDirect is likewise available in a non-constant version for cases that tested for the availability of a function upfront.
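
How the two requirements interact can be shown with a reduced model in plain C++. The sketch below is not ALib code: MiniBox, its method CallDirect, the descriptors FGet and FScale and their implementations are hypothetical. The constant overload can only pass a constant reference as self, hence a function with a mutable self parameter is only callable through the non-constant overload, which in turn requires a mutable box.

```cpp
#include <cassert>
#include <utility>

struct MiniBox
{
    long value;

    // Constant version: viable only if the function accepts 'const MiniBox&' as self.
    template<typename TDescriptor, typename... TArgs>
    auto CallDirect( typename TDescriptor::Signature function, TArgs&&... args ) const
        -> decltype( function( *this, std::forward<TArgs>(args)... ) )
    {
        return function( *this, std::forward<TArgs>(args)... );
    }

    // Non-constant version: also viable for functions that take 'MiniBox&' as self.
    template<typename TDescriptor, typename... TArgs>
    auto CallDirect( typename TDescriptor::Signature function, TArgs&&... args )
        -> decltype( function( *this, std::forward<TArgs>(args)... ) )
    {
        return function( *this, std::forward<TArgs>(args)... );
    }
};

// A descriptor of a function that does not modify the box.
struct FGet   { using Signature = long (*)( const MiniBox& self ); };

// A descriptor of a function that modifies the box: 'self' is mutable.
struct FScale { using Signature = void (*)( MiniBox& self, long factor ); };

inline long FGet_impl  ( const MiniBox& self )        { return self.value; }
inline void FScale_impl( MiniBox& self, long factor ) { self.value *= factor; }
```

In this model, invoking CallDirect<FScale> on a mutable MiniBox changes its value, whereas the same invocation on a constant MiniBox finds no viable overload and is rejected by the compiler. Invoking CallDirect<FGet> works with both.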

The next chapter introduces the built-in functions of ALib Boxing . With them, one quite useful sample of a mutable box-function is found.

8.8 Built-In Box-Functions

8.8.1 Equals, Hashcode, Clone...

In contrast to C++, many other object oriented programming languages declare any class to be derived from a built-in base type. For example, in Java, all classes inherit class Object. Such "mother of all objects" usually provides a set of methods that are available for any object in the language. In Java, these methods for example include equals(), hashCode(), clone() and toString().

Likewise, module ALib Boxing implements a set of built-in box-functions. Those are:

With the inclusion of module ALib BaseCamp , furthermore function FFormat becomes available.

The following implementations are given:

  • Default implementations are registered for all built-in function types.
  • If appropriate, implementations for C++ fundamental types are given.
  • Some of the functions provide templated implementations that can be generically registered with mapped types that meet certain conditions. If available, these templates are defined as static members of the corresponding function descriptor struct. This way, notes on such templated functions are included in the reference documentation of the function descriptors listed above.
  • With the call of certain bootstrap functions declared in header files of folder alib/compatibility, implementations for types of namespace std and 3rd-party types become available.

Attention
The registration of built-in box-function implementations requires proper bootstrapping of the library. See chapter 12.5 Compilation, Header Inclusion And Bootstrapping for more information.

This manual will not repeat a description of each function. Instead, please see the corresponding reference documentation, linked above with the enumeration of functions. Therefore, we conclude this section with just some quick facts:

As a final remark, some of the built-in function descriptors provide inner static functions, with some of them being templated. Those may be used to create custom specializations. Again, please consult the reference documentation for further details.

8.8.2 Overwriting A Built-In Function

Repeated registrations of default or type-specific functions using BootstrapRegisterDefault and BootstrapRegister , are allowed. Any formerly set function is simply replaced. It is also allowed to register nullptr, which disables a built-in function without providing a new one.

The built-in default and type-specialized functions are registered with namespace function Bootstrap . In most combinations of ALib Distribution , this function is automatically invoked with bootstrapping the library. Because each function can be disabled or replaced, no configuration option allows to otherwise manipulate the defaults.

Any function implementation that specializes the behavior for a mapped type, may call the default implementation internally, for example to take specific action if a certain state of the boxed value is given, otherwise use the default implementation and probably return its result. To achieve this, the pointer to the default function implementation has to be received, which is done with method detail::FunctionTable::Get that has to be invoked on singleton object detail::DEFAULT_FUNCTIONS .

While this already touches objects in namespace detail, calling a specialized version of a function that was replaced by another (like calling the implementation of a base class in C++) is not explicitly supported by the library, but possible. For this, the bootstrap code that registers a function has to receive and store the previously registered implementation, which then can be called and which in turn may call another one or the default.
The rationale why this is not otherwise offered by the library is that such complicated use of box-functions is out of the scope of the usual use cases for ALib Boxing .
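
The idea of storing a replaced implementation during bootstrap can be sketched in plain C++ as follows. Everything in this sketch is hypothetical and independent of ALib: a single registry slot stands in for the function table entry of one mapped type, and function Register is assumed to return the previously registered function, which is a simplification chosen for this illustration.

```cpp
#include <cassert>
#include <string>

using ToStringFn = std::string (*)( long value );

// Toy registry slot for one mapped type (not ALib's function table).
inline ToStringFn& Slot()
{
    static ToStringFn slot = nullptr;
    return slot;
}

// Registers a function and returns the one that was registered before.
inline ToStringFn Register( ToStringFn function )
{
    ToStringFn previous = Slot();
    Slot() = function;
    return previous;
}

// The implementation that is registered first.
inline std::string Plain( long value ) { return std::to_string( value ); }

// Storage for the replaced implementation, filled by the bootstrap code.
inline ToStringFn& PreviousOfSigned()
{
    static ToStringFn previous = nullptr;
    return previous;
}

// Takes specific action for negative values, otherwise delegates to the
// implementation that it replaced.
inline std::string Signed( long value )
{
    if ( value < 0 )
        return "minus " + std::to_string( -value );

    ToStringFn previous = PreviousOfSigned();
    return previous ? previous( value ) : std::string();
}

inline void Bootstrap()
{
    Register( Plain );
    PreviousOfSigned() = Register( Signed );   // receive and store what was replaced
}
```

After Bootstrap() has run, the slot holds Signed, which handles negative values itself and forwards all other values to Plain.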

9. Boxing Enumerations

9.1 Boxing Enumeration Elements

Scoped enumerations as well as traditional enumerations receive a special treatment with ALib Boxing . Unless their boxing is customized, they are boxed to their identical type, however the value stored in the Placeholder is cast from their underlying integral type to integer . When unboxed, the value is cast back from integer to the original underlying type.

While this speciality is not noticeable when boxing and unboxing enumerations, the advantage of this treatment is that the values of elements of different enum types become "generically usable" when read directly from Placeholder::Integrals . The rationale why this constitutes "an advantage" is given in the next section.
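
The conversion described above can be sketched in standard C++. The functions StoreEnum and LoadEnum and the two sample enumerations are hypothetical, and the alias integer is only a stand-in that assumes a signed integral type of pointer size. The point of the sketch is that enumerations with different underlying types end up in one common integral representation and can be restored from it.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for ALib's type 'integer' (assumed here: signed, size of a pointer).
using integer = std::ptrdiff_t;

// Casts an enum element via its underlying type to the common integral type.
template<typename TEnum>
constexpr integer StoreEnum( TEnum element )
{
    static_assert( std::is_enum_v<TEnum>, "enum type expected" );
    return static_cast<integer>( static_cast<std::underlying_type_t<TEnum>>( element ) );
}

// Casts the stored integral back via the underlying type to the enum type.
template<typename TEnum>
constexpr TEnum LoadEnum( integer stored )
{
    static_assert( std::is_enum_v<TEnum>, "enum type expected" );
    return static_cast<TEnum>( static_cast<std::underlying_type_t<TEnum>>( stored ) );
}

// Two enumerations with different underlying types.
enum class Color : unsigned char { Red = 1, Green = 2 };
enum       Level : short         { Low = -1, High = 7 };
```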

9.2 Class Enum

Class Enum is the only class derived from class Box that is found in the library.

The class is useful to store and pass around enum values of arbitrary C++ scoped enum types in a type-safe way. It is implemented to ease the use of scoped enums in situations where programmers otherwise tend to "fall back" to non-scoped (non type-safe) enum types. This is the case when enum elements of different types should be allowed as a function argument or otherwise be used as an "identifier". While C++ 11 introduced the syntax for enum class types (aka "scoped enums"), these are still very limited. In particular, they do not support inheritance. Thus, an API cannot define an interface method that accepts enums of "custom derived types". This is quite often a problem. Of course, using module ALib Boxing , an interface method may now accept a box, but then anything else apart from enumeration types would be accepted as well. Class Enum is a good tool to help here.

In the constructor, enum elements of arbitrary type are accepted. With the run-time type-information added, the processing function can now work with any of the enum types transparently.

A good example use case is given with type Exception of module ALib BaseCamp . Any exception is created with an enum element of arbitrary type. The exception handlers then can use nested if statements: The outer if is about the exception type, the inner about the concrete exception. This gives a nice two-level order scheme for exceptions with no need to define "error number ranges" for each code unit.
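
The two-level scheme can be illustrated with a small type-erased enum holder written in plain C++. The sketch is a toy and not the implementation of class Enum: class MiniEnum, its members, the two error enumerations and function Handle are hypothetical. The outer if statements test the enum type, the inner ones the concrete element.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

// Toy holder for an element of an arbitrary enum type.
class MiniEnum
{
    std::type_index type;      // run-time type information of the enum type
    std::ptrdiff_t  integral;  // the element's value in a common integral type

  public:
    // Accepts elements of any enum type, and nothing else.
    template<typename TEnum, typename = std::enable_if_t<std::is_enum_v<TEnum>>>
    MiniEnum( TEnum element )
    : type    ( typeid(TEnum) )
    , integral( static_cast<std::ptrdiff_t>( element ) )
    {}

    template<typename TEnum>
    bool IsEnumType() const { return type == std::type_index( typeid(TEnum) ); }

    template<typename TEnum>
    TEnum Get() const
    {
        assert( IsEnumType<TEnum>() );
        return static_cast<TEnum>( integral );
    }
};

enum class IOErrors     { FileNotFound,    AccessDenied  };
enum class ParserErrors { UnexpectedToken, UnexpectedEnd };

// Outer 'if': the enum type. Inner 'if': the concrete element.
inline std::string Handle( const MiniEnum& code )
{
    if ( code.IsEnumType<IOErrors>() )
    {
        if ( code.Get<IOErrors>() == IOErrors::FileNotFound )
            return "io: file not found";
        return "io: other";
    }

    if ( code.IsEnumType<ParserErrors>() )
    {
        if ( code.Get<ParserErrors>() == ParserErrors::UnexpectedEnd )
            return "parser: unexpected end";
        return "parser: other";
    }

    return "unknown error type";
}
```

Because the constructor is implicit, a call like Handle( IOErrors::FileNotFound ) works without further conversion, while passing an int or a string does not compile.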

10. Boxing Character Strings

10.1 Dependency Modules "Characters" And "Strings"

A lot was said already in this manual about non-bijective boxing and its advantages. When it comes to boxing string types, the way to go is obvious: Whatever string type is boxed (and there might be many of them found in a software that uses 3rd-party libraries) - everything is simply boxed to a C++ array of the corresponding character type. A processing function then does not need to care about the origin type, but by only handling character arrays, any sort of string is treated correctly.

To achieve this, this module leverages type definitions and type-traits found with module ALib Characters . This is explained in the next section.
The section after that, covers further options that are available when module ALib Strings is included in the ALib Distribution . Finally, some good use of ALib Boxing and ALib Strings is made by module ALib BaseCamp . While this is not a part of this manual, some overview on it is provided in appendix chapter C.1 Use Case: Module BaseCamp.

10.2 Character Arrays

Previous manual chapter 7. Customizing Boxing explained in detail how type-traits struct T_Boxer is used to provide information and static methods that allow to customize boxing of any type. The goal with boxing string types is to map any of them to a character array. This could be done in the straightforward way, for example by just specializing T_Boxer<std::string> for C++ standard type std::string.

But this is not what this library does! Instead it leverages module ALib Characters . An interested reader should read this module's Programmer's Manual now first, before continuing with this chapter of ALib Boxing . A short summary of what is provided by this module should be given in bullets:

  • Module ALib Characters defines a set of new character types with the aim to replace the C++ ones:
    • Three types that aim to make C++ character types wchar_t independent from platform and compiler: nchar , wchar and xchar .
    • Three types that make the use of narrow or wide character types transparent to a user's code: character , complementChar and strangeChar .
    • Each type in the two groups is equivalent to one type of the other group.
  • As a result of these definitions, the module emphasizes the use of type character whenever possible. Depending on the platform and the chosen library compile options, this type is an alias of either char, wchar_t, char16_t or char32_t.
  • Next, type-traits struct T_CharArray<T, TChar> is provided. Specializations can be given to denote that a type T is an array type of TChar. Furthermore, access methods to an object's array data and length are provided and optionally also a method that creates an object of type T from a given character array.
  • The library defines such type-traits for built-in C++ "string types", like const character*, string literals and character arrays. With the inclusion of module ALib Strings , type-traits for the five string-types found in that module are given. Finally, compatibility headers are provided that for example specialize T_CharArray for string and vector types of namespace std or those found in the QT Class Library .
Note
Module ALib Characters in addition provides type-traits to denote "bad old" zero-terminated character arrays. ALib Boxing does not make any use of this but instead treats any character array as if it was not zero-terminated.

With this in place, all that this module provides is a conditional specialization of type-traits struct T_Boxer for all types that T_CharArray is specialized for!

Note
A sample for a "conditional specialization" was given in tutorial chapter 7.7 Tutorial: Conditional Customization.

Precisely, two conditional specializations are given:

  1. If field T_CharArray::Access is specialized to equal AccessType::Implicit , and field T_CharArray::Construction is specialized to equal ConstructionType::Implicit , then T_Boxer is specialized to enable boxing and unboxing of the string type.
    The latter of both conditions, namely the "implicit construction", indicates to ALib Boxing that the string type is a "lightweight type" that can be unboxed with no effort.
    Samples for such types are C++ 17 type std::string_view, ALib type String or QStringView.
  2. If the second condition is not met (field T_CharArray::Construction does not equal ConstructionType::Implicit ), then the type is locked and can not be unboxed from character arrays.
    Character array types that are not implicitly constructible, usually are "heavy types" that for example allocate memory and copy given string data when constructed. If such types are to be created from boxed character arrays, this has to be done using an explicit constructor invocation and passing either an unboxed "lightweight type" or the result of methods Box::UnboxArray and Box::UnboxLength .
    Samples for locked types are std::string, ALib type AString or QString.
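The difference between "lightweight" and "heavy" string types can be sketched without ALib. The following is a minimal illustration only: struct CharArrayRef and both helper functions are hypothetical stand-ins for what a boxed character array carries (a pointer and a length) and are not part of the library's API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the contents of a boxed character array:
// a pointer to the first character plus a length (no zero-termination assumed).
struct CharArrayRef {
    const char* data;
    std::size_t length;
};

// "Lightweight" type: a view is created without copying any characters.
inline std::string_view AsView(const CharArrayRef& boxed) {
    return std::string_view(boxed.data, boxed.length);
}

// "Heavy" type: created by an explicit constructor invocation,
// which allocates memory and copies the character data.
inline std::string AsOwnedString(const CharArrayRef& boxed) {
    return std::string(boxed.data, boxed.length);
}
```

The view keeps pointing to the original buffer, while the std::string owns a copy. This is why, in this sketch, creating the heavy type is a deliberate and visible step.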

As a result, to customize boxing for a custom string type, it is recommended to specialize T_CharArray instead of T_Boxer.

While it is still possible to use T_Boxer for customization, the advantage of the recommended approach is obvious: generally announcing the custom type to be of character array type enables its use with module ALib Strings as well as with boxing. Also other modules and software built on ALib might directly benefit from such type-traits.

In the unlikely case that T_CharArray is specialized and still T_Boxer should be specialized (with the aim to provide a certain customization that is different from the one that this module automatically provides if T_CharArray is given), then, to avoid ambiguities, helper type-traits struct T_SuppressCharArrayBoxing may be specialized to inherit std::true_type. As its name says, a specialization of this type disables the automatic custom boxing and hence allows a specialization of T_CharArray and a parallel specialization of T_Boxer.

Attention
Header Inclusion Order:
The use of "underlying" module ALib Characters and its type-traits struct T_CharArray to specialize T_Boxer for a whole set of types at once, imposes the requirement of keeping the right header inclusion order: Any specialization of T_CharArray that is to be announced to a compilation unit, has to be made before the conditional customization of struct T_Boxer is given.
Because the latter is defined with the inclusion of header alib/boxing/boxing.hpp , this means that any headers that specialize T_CharArray have to be included prior to this!
See also chapter 12.5.2 Header Inclusion.

10.3 Box-Function FAppend

With the inclusion of module ALib Strings in the ALib Distribution , built-in box-function FAppend becomes available.

Class AString supports a TMP-based mechanism to append objects of arbitrary type, documented with chapter 5.1 Appending Custom Types of the Programmer's Manual of module ALib Strings .

Of-course, if an object of type Box is "appended", then TMP does not work, as the compile-time information about the boxed type is lost. Consequently, box-function FAppend is needed that performs the job. If a box is appended to an AString, simply this function is called.

For all types which already specialize functor T_Append , a templated implementation of this function can be used: This unboxes the template type and appends it. This template function is provided with static member FAppend::Appendable .

As a result, there are two ways of implementing interface FAppend for a custom boxable type:

  1. As with other boxing interfaces: Provide a custom implementation.
  2. Make the type appendable to class AString and then register FAppend::Appendable for the mapped type.

The second approach has the advantage, that the custom type is directly appendable to objects of class AString - independent from boxing. Therefore, this is the recommended option.

Note
With module ALib BaseCamp , a next string-related box-function FFormat becomes available, which allows to control the string conversion of boxed values by the use of a format string.
Information and a sample implementation about both, FAppend and FFormat, is provided in chapter 4.3. Formatting Custom Types of the Programmer's Manual of that module.

11. Variadic Function Arguments and Class Boxes

11.1 Variadic Function Arguments

With class Box in place, it becomes possible to define functions and methods that take an arbitrary value as an argument. The need for this is often combined with the need to allow an arbitrary number of such arbitrary arguments. C++ 11 introduced variadic template arguments for this.

Class Box might greatly simplify the use of this language feature and provide a type-safe and indexed way to access variadic arguments. (In fact, this was one of the original goals for creating module ALib Boxing !)

The following quick sample demonstrates this:

template <typename... T> void VariadicFunction( const T&... args )
{
    // fetch the arguments into an array of boxes
    alib::Box boxes[]= { args... };
    // do something
    for( size_t i= 0; i < sizeof...(T) ; ++i )
    {
        alib::Box& box= boxes[i];
        //...
    }
}

With this function definition, it can be called like this:

VariadicFunction( 7, "ALib", 3.14 );

It is only a single, simple line of code that fetches all function parameters and puts them into an array of boxes.

Of-course, the classical recursive approach to process template arguments using class Box may also be implemented but avoiding the recursion makes the code easier and more readable.
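To contrast the two approaches without depending on ALib, the following sketch counts the arguments of type int once recursively and once by collecting the arguments in an array. It uses std::any as a stand-in for class Box; the function names are hypothetical and chosen for this illustration only.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <typeinfo>

// Classical recursive approach: one template instantiation per argument.
inline std::size_t CountIntsRecursive() { return 0; }

template <typename Head, typename... Tail>
std::size_t CountIntsRecursive(const Head&, const Tail&... tail) {
    return (std::is_same_v<Head, int> ? 1u : 0u) + CountIntsRecursive(tail...);
}

// Array approach: a single line collects all arguments, then a plain
// run-time loop inspects them (std::any plays the role of a box here).
template <typename... T>
std::size_t CountIntsWithArray(const T&... args) {
    static_assert(sizeof...(T) > 0, "this sketch needs at least one argument");
    std::any values[] = { std::any(args)... };
    std::size_t count = 0;
    for (const std::any& value : values)
        if (value.type() == typeid(int))
            ++count;
    return count;
}
```

Both functions return the same result for the same arguments. The array version keeps the processing logic in one ordinary loop, which is the readability gain the text refers to.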

The sample above can be slightly modified to use C++ 11 Perfect Forwarding which in some situations is a little more efficient and produces smaller code. The following code snippet uses this technique and may be copied as a recipe on how to implement variadic template functions with ALib Boxing :

template <typename... T> void VariadicRecipe( T&&... args )
{
    // fetch the arguments into an array of boxes
    alib::Box boxes[]= { std::forward<T>( args )... };
    // ...
}

11.2 Class Boxes

In the previous chapter it was demonstrated how simple the use of variadic template arguments gets with ALib Boxing . The recipe given, uses a single line of code to let the compiler create an array of objects of class Box. This is sufficient in many cases, but obviously using container class std::vector<alib::Box> instead of a simple array would give more flexibility: It allows to add and remove boxes from the array and to pass the array to other (non-templated functions) without passing its size in an extra parameter.

For this and more purposes, class Boxes is provided. It publicly inherits from std::vector<alib::Box> and provides method Add, which accepts templated variadic arguments. This way, its use is as simple as this:

boxes.Add( 7, "ALib", 3.14 );
boxes.Add( 42, "Yipee-yeah" );

In this sample, five boxed objects are added to the container using method Boxes::Add .

We replace the simple C++ array of the recipe given in the previous section by an object of this type:

template <typename... T> void VariadicFunction( T&&... args )
{
    // fetch the arguments into a Boxes object
    alib::Boxes boxes;
    boxes.Add( std::forward<T>( args )... );
    // do something
    for( alib::Box& box : boxes )
    {
        if( box.IsType<alib::integer>() )
            std::cout << box.Unbox<alib::integer>() << " ";
        else
            std::cout << " Unknown Argument Type ";
    }
    std::cout << std::endl;
}

The advantage of the former version is that the array was created on the "stack". In contrast to this, class Boxes uses dynamic memory to store an arbitrary amount of boxes.
This is a disadvantage of using class Boxes that should not be underestimated. (Programmers generally tend to underestimate the performance impact of heap allocations, which in fact only becomes a problem in complex software that makes a lot of allocations or for example runs for a longer time). More on this issue is discussed in the next section.

11.3 Memory Management

The previous chapter introduced class Boxes. The class is derived from std::vector<Box> and in case that module ALib Monomem is included in the ALib Distribution , in addition an internal allocator type is given to this vector, which has a very similar implementation as StdContMAOptional .

Note
The only reason why an own, internal copy of type StdContMAOptional was used instead of the original, is that this allowed to omit the inclusion of any header file of module ALib Monomem , which largely simplifies header inclusion.

With that, two modes of memory allocation are available: Depending on whether an instance of MonoAllocator is given in the constructor (in the previous sample it was not), the std::vector either allocates memory from the heap or from the monotonic memory.

The use cases for monotonic allocation mode are described with module ALib Monomem and not repeated here, besides the following hint: Should the given MonoAllocator be reset (see MonoAllocator::Reset), and the Boxes instance not be destructed but continued to be used, then the instance has to be "reset" as well. This is done by performing a C++ placement-new, as described here.

Monotonic allocation does not need to be used with class Boxes, if the following "design pattern" is followed for types that accept variadic template arguments with the help of class Boxes:

  • Add an internal member of type Boxes to the type.
  • Add an interface method that accepts a reference to class Boxes and optionally let its implementation reside in a compilation unit (non-header).
  • A templated variadic overload of the method clears the internal member, adds the given arguments to it and passes them to the overload that accepts the boxes reference.
  • Provide a method that returns the internal member (as reference) to allow the user to collect the arguments in a more sophisticated, step by step way (instead of having to place them all into one function call).

Samples of this pattern are found in ALib itself, for example with types Formatter or Lox .
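The pattern can be sketched as follows. This is a minimal illustration under assumptions: std::vector<std::any> stands in for class Boxes, class ReporterSketch and all of its methods are hypothetical, and the non-templated method carries a distinct name (ReportArgs) rather than being an overload, which sidesteps overload-resolution subtleties between a forwarding template and a const-reference parameter.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using ArgList = std::vector<std::any>;   // stand-in for alib::Boxes

class ReporterSketch {                    // hypothetical type following the pattern
    ArgList     args_;                    // internal member, reused across calls
    std::size_t lastCount_ = 0;

public:
    // Non-templated interface method; its implementation could reside
    // in a compilation unit instead of the header.
    void ReportArgs(const ArgList& args) { lastCount_ = args.size(); }

    // Templated variadic method: clears the internal member (which keeps
    // its capacity), adds the arguments and passes the list on.
    template <typename... T>
    void Report(T&&... values) {
        args_.clear();
        (args_.emplace_back(std::forward<T>(values)), ...);
        ReportArgs(args_);
    }

    // Gives access to the internal list for step-by-step collection.
    ArgList& Args() { return args_; }

    std::size_t LastCount() const { return lastCount_; }
};
```

Because std::vector::clear leaves the capacity unchanged, repeated calls reuse the buffer that was allocated earlier, so heap allocations only occur when a call passes more arguments than any call before it.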

11.4 Advanced Usage of Class Boxes

Besides providing variadic template arguments, method Boxes::Add uses some template meta programming to "flatten" the array in the case that another instance of class Boxes is added. In other words, if an instance of class Boxes is passed to Boxes::Add, the boxes contained in this instance are copied into the destination vector! Due to this fact, when using sample method VariadicFunction from above, the invocation:

boxes.Add( 2, 3 );
VariadicFunction( 1, boxes, 4 );

produces the following output:

1 2 3 4 

The reason why this is implemented like this is that the user of a method gains a further benefit: He/she has the freedom of choice to either pass all parameters just inside the function call or to collect all objects prior to the call in an own instance of class Boxes and then just pass this instance as a single argument - even together with other, fixed arguments.

This makes the use of the function more flexible, without the need of providing an overloaded version that accepts and processes an object of Boxes directly.
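How such "flattening" can be achieved with compile-time type detection is sketched below. The sketch does not show the library's implementation: class AnyList is a hypothetical stand-in for class Boxes that stores std::any values, and it only covers the case of a list passed directly as an argument.

```cpp
#include <any>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for alib::Boxes, storing std::any instead of Box.
class AnyList : public std::vector<std::any> {
    template <typename T>
    void AddOne(T&& value) {
        if constexpr (std::is_same_v<std::decay_t<T>, AnyList>)
            // another list was passed: copy its elements ("flatten")
            insert(end(), value.begin(), value.end());
        else
            // any other type: store it as a single element
            emplace_back(std::forward<T>(value));
    }

public:
    // Accepts an arbitrary number of arguments of arbitrary type.
    template <typename... T>
    void Add(T&&... values) {
        (AddOne(std::forward<T>(values)), ...);
    }
};
```

The decision is made per argument at compile-time, so arguments that are not lists take the plain emplace_back path. Note that this sketch does not support adding a list to itself.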

Finally, besides detecting objects of class Boxes inside method Boxes::Add, it is also detected if an object of class Boxes is passed as a boxed object. Let us first look at a sample and its result:

boxes.Add( 2, 3 );
alib::Box box( boxes );
VariadicFunction( 1, box, 4 );

1 2 3 4 

Looking at this sample a reader might think "Wow, this is cool, but where is the use case for this?". Generally speaking, this is useful when a method has several overloaded versions with different parameters, and still should support to accept an arbitrary amount of any type of arguments. In such a case, it might get quite complicated (or impossible!) to define the methods properly in the sense that no ambiguities may occur when invoking them. A solution here is to declare the method to accept just exactly one const alib::Box& argument instead of a variadic list of arguments.

If inside the method this box is passed into a local instance of class Boxes, a user might invoke the method with just a single argument of arbitrary type (which gets boxed), or with an arbitrary amount of arguments, by collecting those in class Boxes. This might be done right in the invocation.
To demonstrate this, we use the method from above, but instead of accepting variadic template arguments, it accepts now just a single argument of type const Box&:

void HeavilyOverloadedFunction( const alib::Box& boxOrBoxes )
{
    // pass the single box into a Boxes object. This way, if another boxes object gets passed,
    // its elements are added to the list!
    alib::Boxes boxes;
    boxes.Add( std::forward<const alib::Box>( boxOrBoxes ) );
    // do something
    for( alib::Box& box : boxes )
    {
        if( box.IsType<alib::integer>() )
            std::cout << box.Unbox<alib::integer>() << " ";
        else
            std::cout << " Unknown Argument Type ";
    }
    std::cout << std::endl;
}

This can be invoked as follows:

HeavilyOverloadedFunction( 1 );
boxes.Add(1, 2, 3);
HeavilyOverloadedFunction( boxes );

...which produces:

1  
1  2  3  

A real world sample can be found in the logging library ALox which is built on ALib and makes a lot of use of ALib Boxing . While straightforward methods Lox::Info, Lox::Verbose, etc. accept variadic template arguments as objects to be logged, method Lox::Once is more complicated: Various overloaded versions exist that interpret the term "once" differently. Therefore, each overloaded version accepts only one object to log - which at first sight is only suitable to accept a simple log message string. But internally, a Boxes instance is created and this way, multiple objects can be passed just as with other interface functions.

As a final note, besides "flattening" a boxed instance of class Boxes, method Boxes::Add will do the same with a "boxed array of boxes". Hence the following code:

Box array[3]= { 1, 2, 3};
HeavilyOverloadedFunction( array );

produces:

1  2  3  

12. Further Topics And Details

12.1 Void And Nulled Boxes

12.1.1 Void Boxes

Default-constructed instances of class Box or those constructed passing keyword nullptr as an argument, do not contain a boxed value. Technically this means, that no VTable singleton is set, because VTables only exist for mapped types.

To test if a box "is void", aka does not contain a value, a test for type void is to be performed by invoking Box::IsType .
As soon as anything else but nullptr is boxed (with construction or assignment), the instance loses its void state. Vice versa, by assigning keyword nullptr, a box is "reset" to void state.

The following methods are allowed to be called on void boxes:

Forbidden methods that produce undefined behavior if invoked, are:

In debug-compilations these methods raise a run-time assertion when invoked on a void box. Most of the time an explicit test on whether a box is void is still not necessary, because unboxing is only allowed after successful type guessing.

The void state constitutes a piece of information that might be used in APIs.

12.1.2 Nulled Boxes

Very different from the attribute of a Box being void, is the attribute of being nulled. The latter applies only to non-void boxes. In theory, the nulled-state of a box is undefined if no value is boxed.

Whether a box is nulled is evaluated using built-in box-function FIsNotNull , which is invoked by method Box::IsNotNull and, with negated result, by method Box::IsNull .

Because ALib Boxing is tolerant in respect to calling box-functions on void boxes, calling FIsNotNull on a void box returns the default value of bool, which is false. This way, boxes that do not contain a value report to be nulled, which is appropriate behavior with most use cases.

Default implementations of FIsNotNull for fundamental types return true, as such types are not considered nullable. The default implementation returns false (nulled), for array types that have a length of 0 and for pointer types that have value nullptr. Otherwise the default implementation returns true (not nulled).

12.2 Optimizations With Static VTables

12.2.1 Introduction

Using class Box to pass data between code entities, causes a certain amount of "effort", which has an impact on the code size and the execution performance.

Before it is explained how to minimize this effort, the following important note is to be made:

Note
Disclaimer for this chapter:
  1. Even without the custom optimization that is described below, the built-in mechanics of ALib Boxing perform very well already. In most use cases, performing the optimization is not worth the effort. Generally, a programmer can skip reading this chapter and leave things as they are. In older versions of this library, the optimization was not even possible.
  2. For C++ fundamental types such optimization is already built-in.
  3. To keep this and the next manual section short, technical details are postponed to a third section of this chapter only.

The processes of boxing, type guessing and unboxing should be implemented in a fast and lean code. The three share two actions:

  1. Identifying the mapped type: this is done with boxing and type guessing.
  2. Copying the data: this is done with boxing and unboxing.

Point two is a matter of the implementation of struct T_Boxer . If the default for methods Write and Read can be used, this is implemented most efficiently and can not be optimized. What is left is point one. This in turn is split into three steps:

  1. Decide about the mapped type.
  2. Retrieve a corresponding singleton of struct VTable.
  3. Store the singleton in the box (used with boxing) or compare the singleton with the stored one (used with type guessing).

The good news is, that step one is performed at compile-time using TMP and this way has no run-time effects. Step three is a most efficient simple pointer assignment, respectively comparison.
As a result, the only point that leaves room for optimizations is with step two, retrieving the vtable singleton. If the optimization is performed, then retrieving the singleton is nothing else than a single direct memory access.

Together, for example boxing a value is compiled to nothing more than just filling (all or a part of) the 24 bytes (respectively 12 bytes on a 32-bit platform) with values that the CPU can simply load from other memory addresses!
As mentioned above, the impact of not performing the optimization for a mapped type, is described in the section 12.2.3 Technical Background On VTables.

12.2.2 Declaration, Definition and Registration Of Static VTables

The goal of the optimization is to provide a named singleton object for the vtable of a mapped type. To do so, three simple steps are involved. As optimizations for all fundamental types are already built into the library, the library's own code for types bool and char[] is used as a sample.

1. Declaration of the vtable

Named singletons of struct detail::VTable have to be declared in a header file. For this, macros ALIB_BOXING_VTABLE_DECLARE and ALIB_BOXING_VTABLE_DECLARE_ARRAYTYPE are to be used. For types bool and char[], the internal (always included) header file alib/boxing/customizations.inl states:

Besides the mapped type, a second parameter specifies a valid and unique C++ identifier name.

Attention
Likewise with boxing customization (the provision of specializations of type traits struct T_Boxer ) it is mandatory, that each compilation unit that boxes, guesses, or unboxes the mapped type, needs to "be aware" of that optimization. It is recommended to place the optimizations in the same header file as the customizations and make sure it is always included. Otherwise, the result is undefined behavior!
ALib Boxing will give a run-time assertion in debug-compilations if a code unit misses such inclusion.

2. Definition of the vtable

The singleton objects have to be defined in a compilation unit (e.g. cpp-file). Corresponding macros are used, for example ALIB_BOXING_VTABLE_DEFINE for type bool:

ALIB_BOXING_VTABLE_DEFINE( bool, vt_bool )

The macro parameters are the very same as for the declaration.

3. Registration of the vtable

This final step is needed only in debug-compilations. Consequently, the registration macro (which is used to register both, non-array-type and array-type vtables) is empty when compiling a release-version.

Similar to the registration of box-function implementations, the registration of static vtables has to be performed with the bootstrap code of a software. It is a good idea to place the macros to the same bootstrap section, where function registrations are done.

In our sample, this looks as follows:

The registration done in debug-compilations has two effects:

  1. It helps to detect if two or more code units use the mapped type in conflicting ways: some with the declaration of the static table, while others without it.
  2. It allows to enumerate the overall known vtables, when creating debug status information about ALib Boxing , as documented with chapter 12.7 Debug Helpers. In turn, the use of such debug options helps to identify vtable candidates that might benefit from optimization, as this also lists the overall usage of a type.
Note
Although registration is purely introduced and performed for debug-purposes, it is mandatory to be done in debug-compilations. Methods Box::Unbox , Box::UnboxArray and Box::UnboxElement will assert if a non-registered vtable is used with the current box.

This is all that needs to be done. With that, ALib Boxing is as fast as technically possible. The penalty of the use of boxes is marginalized in both respects: code size and execution performance.

12.2.3 Technical Background On VTables

Creating Strict Singleton VTables:

What the vtable is in C++ , is struct detail::VTable for ALib Boxing . Both are singletons, which means that two objects of the same mapped type share a pointer to the same vtable and that for each mapped type only one instance exists.

At compile-time, when an object is boxed, the right singleton has to be chosen and stored together with the object's data in the box. The small challenge now is to find a way of how to define a singleton for the endless amount of types that can be mapped. The solution is done with a simple trick: An otherwise empty template class detail::VTableTT is derived from VTable. In parallel this template class is also derived from ALib class Singleton . Two template type parameters are specified, TPlainOrArray and TMapped . These are exactly those types that are found in structs TMappedTo and TMappedToArrayOf . Either of them has to be used for the type definition Mapping of type traits struct T_Boxer to specify the mapped type.

If the vtable was not optimized (as shown in the previous section), then the static method Singleton::GetSingleton is invoked on type VTableTT:

VTableTT<typename TMapping::PlainOrArray, typename TMapping::Type>::GetSingleton()

Et voila! This gives the constructors of class Box the strict singleton object it needs to store.

Now, to allow optimizations, class Box does not perform the retrieval of the right singleton directly. Instead, it is done indirectly through a next specializable type traits struct. This is named detail::T_VTableFactory . Only its default implementation - used with non-optimized mapped types - acts like described above. Specialized versions directly return a static object with method T_VTableFactory::Get .

The macros ALIB_BOXING_VTABLE_DECLARE and ALIB_BOXING_VTABLE_DECLARE_ARRAYTYPE declare such singleton and by the same token specialize the factory for the given mapped type to return it.
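The principle of a specializable factory can be sketched in a few lines. All names below are hypothetical, and the default case uses a function-local static object as a simplification instead of ALib class Singleton; the sketch only illustrates why a specialization that returns a named static object reduces retrieval to taking an address.

```cpp
#include <cassert>

// Hypothetical stand-in for struct detail::VTable.
struct VTableSketch { const char* name; };

// Default factory: the singleton is created lazily on first use, which
// requires a "was it created already?" check with each call.
template <typename TMapped>
struct VTableFactorySketch {
    static VTableSketch* Get() {
        static VTableSketch instance{"dynamic"};
        return &instance;
    }
};

// "Static vtable": a named object (C++17 inline variable) ...
inline VTableSketch vt_bool_sketch{"static bool"};

// ... and a factory specialization that simply returns its address.
template <>
struct VTableFactorySketch<bool> {
    static constexpr VTableSketch* Get() { return &vt_bool_sketch; }
};

// Type guessing then is a plain pointer comparison.
template <typename T>
bool IsTypeSketch(const VTableSketch* stored) {
    return stored == VTableFactorySketch<T>::Get();
}
```

In this sketch, the specialized Get is constexpr, which is also what makes the approach a precondition for constexpr boxes.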

Impact of non-optimized vtables:

The final technical question is now: what negative impact does the use of method Singleton::GetSingleton have? As type Singleton has to be templated, its construction has to be performed inline. The same is obviously the case with struct VTableTT, which derives from both VTable and the singleton. The first thing that Singleton::GetSingleton does is to check whether the singleton was already created by an earlier call. If yes, it is instantly returned. If not, construction has to be performed. Although the latter is done only once, the whole (inlined) construction code has to be added at each place where a value is boxed. Therefore, the impact on code size is rather high, while the execution performance - from the second invocation on - suffers from only a marginal penalty.

On Windows OS, with the use of DLLs, things become even a little more complicated. This is the main reason for the existence of dedicated ALib helper class Singleton. Different DLLs and the main process that loads them, do not share one data segment. Because of this, before a singleton is created, a check has to be made whether the same singleton was created already in a different data segment. Of-course, such check needs to avoid race conditions and therefore uses a semaphore. Luckily, this code does not need to be inlined.
Note that this "DLL-problem" does not apply for the optimized, static vtable objects. Here, a definition can be used in a distinct compilation unit, that the process and the DLLs share.

More details on this topic are found with the Programmer's Manual of ALib Module ALib Singletons .

Management Of Boxed Functions:

With field VTable::Functions , each vtable embeds struct FunctionTable which is responsible to store and retrieve implementations of box-functions. Furthermore, one dedicated instance of this type is defined in the namespace to store the default implementations.

Methods FunctionTable::Set and FunctionTable::Get use TMP enabled overload mechanics by their template type TFDescr . For the built-in functions FClone, FEquals, etc, a direct access to a corresponding pointer member is performed.

For registered custom functions, a global hash table is used that maps the function table and the function type to the function's implementation. Besides the hash table access needed, in addition a mutex is acquired to protect the global hash table against concurrent access.
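The lookup scheme for custom functions can be sketched as follows. This is an illustration under assumptions and not the library's code: all type and function names are hypothetical, a std::unordered_map keyed by the pair of function table and function type stands in for the global hash table, and a std::mutex protects it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

struct FunctionTableSketch {};   // hypothetical; one instance per vtable
using ErasedFn = void (*)();     // generic function pointer used for storage
using Key      = std::pair<const FunctionTableSketch*, std::type_index>;

struct KeyHash {
    std::size_t operator()(const Key& k) const {
        return std::hash<const FunctionTableSketch*>()(k.first)
             ^ (std::hash<std::type_index>()(k.second) << 1);
    }
};

inline std::unordered_map<Key, ErasedFn, KeyHash>& Registry() {
    static std::unordered_map<Key, ErasedFn, KeyHash> registry;
    return registry;
}
inline std::mutex& RegistryMutex() { static std::mutex m; return m; }

// Registers an implementation of function type TFDescr for the given table.
template <typename TFDescr>
void SetFunction(const FunctionTableSketch& table, typename TFDescr::Signature fn) {
    std::lock_guard<std::mutex> lock(RegistryMutex());
    Registry()[Key(&table, typeid(TFDescr))] = reinterpret_cast<ErasedFn>(fn);
}

// Returns the registered implementation or nullptr if none was registered.
template <typename TFDescr>
typename TFDescr::Signature GetFunction(const FunctionTableSketch& table) {
    std::lock_guard<std::mutex> lock(RegistryMutex());
    auto it = Registry().find(Key(&table, typeid(TFDescr)));
    if (it == Registry().end())
        return nullptr;
    return reinterpret_cast<typename TFDescr::Signature>(it->second);
}

// Hypothetical function descriptor and one implementation of it.
struct FTwiceSketch { using Signature = int (*)(int); };
inline int Twice(int value) { return 2 * value; }
```

Compared to the direct pointer members that the text describes for built-in functions, each lookup in this sketch costs a hash computation plus a lock acquisition.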

12.3 Optimizations With "constexpr"-Boxing

Under certain conditions, instances of class Box are constexpr values. For example, the following code compiles without an error:

Box box1= "Hello world"; // Here you can step in with the debugger
constexpr Box Box2= "Constructed at compile-time!"; // Here, you can't!

While the typical use cases of ALib Boxing do not raise the requirement to be able to define constexpr Box variables by users of the library, still there is some advantage of constexpr boxes with the possibility for the compiler to optimize the object code. In addition, such box instances may be placed in the data segment of an executable, that is residing in read-only memory (e.g. embedded systems).

Note
The latter (storing constexpr boxes in read-only memory) imposes the only mandatory rationale for this type of optimization. For other purposes, it is very questionable if the result is worth the effort and a reader might skip the following explanations.

Requirements

The C++ rules for creating constexpr objects impose that the constructor of class Box that is chosen according to a given argument type TBoxable , is implemented constexpr. The constructor creates two field members, the vtable and the Placeholder . Consequently, the creation of both objects needs to be implemented constexpr.

Note
From the sample above it can be told that obviously both requirements are met with a) vtables of mapped character array type, and b) boxable type C++ String Literal.

1. Static VTable:
For the vtable to meet the requirement, the optimization discussed in previous chapter 12.2 Optimizations With Static VTables has to be performed. Thus, the first mandatory requirement to enable constexpr boxes is to implement what is described in that chapter for the mapped type in question.

2. Static Definition of T_Boxer::Write:
The second requirement of creating the Placeholder in a constexpr way, can not be achieved with the implementation of method T_Boxer::Write as it was presented in chapter 7. Customizing Boxing! The reason is that with this definition, one or more members of union Placeholder have to be set inside the function. Functions that do this are forbidden to be constexpr (even in C++ 17).
Besides the macros used for customization introduced in that manual chapter, two further ones exist, with postfix "_CONSTEXPR":

The difference of the "_CONSTEXPR"-versions of the macros is the definition of boxing method Write. Instead of receiving the target's Placeholder along with the value to box:

static            void         Write( Placeholder& target,  TSource const & value ) {...}

these macros define the method with only the value argument while returning a placeholder object:

static constexpr  Placeholder  Write( TSource const & value )                       {...}

If the TMP code of class Box detects this change, a different constructor - one that is defined constexpr - is chosen!

It was said, that modifying different members of a union is forbidden with the C++ rules. With the modified Write method, customization code has the chance to construct a new placeholder value and initialize one of the union fields. Unfortunately, also here, a strict rule applies: The constructor of a union is allowed to set only one of the union members.

Note
If a reader wonders why this rule matters when a union always has just one member set anyhow, remember that in the case of union Placeholder , some members are of array type. The tricky part therefore is, that different array elements of different union members may very well be set without overwriting each other!

The way out of this dilemma was to provide a bigger set of constexpr constructors to union Placeholder that in turn make use of corresponding sets of constructors of detail types StructArray , UnionIntegrals , UnionFloatingPoints and UnionPointers . Some of those allow to initialize one or more of the array or struct elements. Note that as stated in the reference documentation of union Placeholder , these constructors are not listed in that reference documentation. If needed for a custom T_Boxer::Write method, please consult the source code.

To summarize: The second requirement, creating the Placeholder in a constexpr way, can be achieved by using the alternative version of T_Boxer::Write as described.

Built-In Behavior

The following rules apply for different types:

  • Void, void* And Nulled Boxes:
    Void and nulled boxes are constexpr.
  • Fundamental Types:
    All C++ fundamental types can be created constexpr (built-in adoptions)
  • Array Types:
    If boxing is performed as an array type, the following applies:
    • For character arrays, a static vtable is defined. For other element types, a static vtable has to be defined.
    • If boxed from a C++ array (which is non-customizable boxing) boxing is performed constexpr.
    • With all built-in ALib types (like AString), boxing is performed constexpr.
    • Custom types need to provide a constexpr specialization of T_Boxer::Write.
  • Enum Types:
    Boxing of all enum types is performed constexpr. Consequently all that is needed is to define a static vtable for the mapped enum type.
    For all major enum types of ALib , such static vtable is defined.
  • Pointer Types:
    Types boxed as pointers are boxed in a constexpr fashion by default. Hence the only precondition is to define a static vtable for the type.

12.4 Global And Static Box Instances And Their Initialization

Instances of class Box may generally exist as global data or static members as long as they are not initialized with a boxed value.

If a default-initialization should be given, then the resulting mapped type's vtable has to be statically defined as described in chapter 12.2 Optimizations With Static VTables. The reason for this is, that dynamically created vtables are using the mechanics implemented with ALib type Singleton . To achieve the creation of process-wide "true" singleton objects, this class uses a globally defined hash-map that in case of a first creation within a compilation unit might be used to receive one already created in another compilation unit. The technical background for this is explained with module ALib Singletons . In short, the problematic platform here is WindowsOS, which allows a DLL to have an own global data segment.

Because the sequence order of initialization of global objects is not defined with the C++ language, it can not be assured that the hash-map is already initialized when the singleton vtable of an initialized global or static box is required.

As it was documented in chapter 12.2 Optimizations With Static VTables, for all fundamental types as well as for character arrays, a static vtable implementation is always in place. Therefore, global or static boxes may well be initialized with values of these types. If a custom type is to be used for initialization, a static vtable has to be given.

In debug-compilations, the use of dynamic vtables with global or static instances of class Box raises a run-time assertion.
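The difference between the two kinds of vtables with respect to global initialization can be sketched in plain C++. All names below (MiniVTable, MiniBox, DynamicRegistry, DynamicVTable) are hypothetical and only mimic the roles described above. Note one deliberate deviation: the registry here is a function-local static, which is constructed on first use. A registry defined as an ordinary global object, as described in the text, gives no such guarantee to the initializers of other global objects.

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-ins for a vtable and a box.
struct MiniVTable { const char* name; };

struct MiniBox {
    const MiniVTable* vtable;
    long              value;
    constexpr MiniBox(const MiniVTable* v, long x) : vtable(v), value(x) {}
};

// "Static" vtable: constant-initialized, no run-time code has to run first.
constexpr MiniVTable kLongVTable{"long"};

// A global box that uses the static vtable is itself constant-initialized
// and therefore independent of the initialization order of other globals.
constexpr MiniBox kGlobalAnswer(&kLongVTable, 42L);
static_assert(kGlobalAnswer.value == 42, "initialized at compile time");

// "Dynamic" vtables: created on demand and stored in a registry that has to
// exist before the first vtable is requested.
inline std::map<std::type_index, MiniVTable>& DynamicRegistry() {
    static std::map<std::type_index, MiniVTable> registry;
    return registry;
}

template <typename T>
const MiniVTable* DynamicVTable(const char* name) {
    // emplace() leaves an existing entry untouched, so repeated calls for the
    // same type return the same address.
    return &DynamicRegistry()
                .emplace(std::type_index(typeid(T)), MiniVTable{name})
                .first->second;
}
```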

12.5 Compilation, Header Inclusion And Bootstrapping

12.5.1 Compilation Options

The following ALib Compiler Symbols are provided by this ALib Module :

See also
Chapter 5. Building The Library of the ALib Programmer's Manual.

12.5.2 Header Inclusion

The sample code given in this manual only seldom shows the inclusion of necessary header files. The module provides just three headers:

  1. Header alib/boxing/boxing.hpp
    This recursively includes almost all features of ALib Boxing, i.e., it provides classes Box and Boxes, type-traits struct T_Boxer and union Placeholder, as well as the built-in box-functions.
  2. Header alib/boxing/enum.hpp
    Makes class Enum available.
  3. Header alib/boxing/dbgboxing.hpp
    In debug-compilations, declares static struct DbgBoxing .

With the use of other ALib Modules that rely on boxing, the inclusion of the header files is usually not necessary. For example, when including alib/lang/format/formatterpythonstyle.hpp , the inclusion of headers of ALib Boxing is inherently performed.

Some care has to be taken, with boxing string types. As explained in chapter 10.2 Character Arrays, all specializations of type-traits struct T_CharArray have to be "included" prior to the definition of type-traits T_Boxer . Therefore, a compilation unit has to include such specializations prior to including header boxing.hpp.

For example, if string types of the QT Class Library are to be used with formatter FormatterPythonStyle , then the corresponding compatibility header has to be included before any other header that includes boxing.hpp:

    #include "alib/compatibility/qt_characters.hpp"
    #include "alib/lang/format/formatterpythonstyle.hpp"

12.5.3 Bootstrapping

As with most ALib Modules, proper bootstrapping of ALib Boxing has to be performed. As documented in the general manual of ALib, this is usually performed automatically with bootstrapping the library.

In the case that ALib Boxing is used as an extracted module, namespace function alib::boxing::Bootstrap has to be called for bootstrapping. The function should be called as early as possible. In particular, it has to be called before custom code performs registration of custom box-function implementations and before the registration of custom static vtables.

12.6 Life-Cycle Considerations

With ALib Boxing , no mechanisms are in place that link the life-cycle of boxes with their boxed values. Class Box does not even have a destructor defined! This is a huge difference to C++ 17 class std::any.
It is completely left to the user of the library to make sure that any pointer or data that otherwise references values available during boxing, are still intact and available when unboxed and vice-versa, that allocated objects that become boxed are de-allocated after a box that refers to them is disposed.

In many use cases, this is absolutely no problem: Often, ALib Boxing is used to implement generic (and optionally variadic) function arguments. If those are then used inside the function only and not stored otherwise, the access to boxed data is safe. A prominent sample for this use case is given with appendix chapter C.1.

However, other use-cases might introduce the need to use boxed data out of the scope that boxed the data. A good sample for this is given with appendix chapter C.2 Use Case: ALib Exceptions. Objects of type Exception carry exception arguments while the function-call stack is "unwound". Hence, all locally defined objects are destructed and go out of scope.

In this and similar cases, a user of the library has to ensure that boxes of mapped types whose data might become corrupted, are either not unboxed or the data is copied prior to having the box leaving the scope. A nice way to perform such copying is provided with built-in box-function FClone . Its default implementation copies the data of boxed arrays.

Depending on the use-case, the concept of "cloning" does not need to be taken too literally, because function FClone might take other actions as well. Implementations are allowed to overwrite the given box's content, which includes changing the mapped type of the box! Often it is enough to create some sort of representation of an object, for example just an ID or another sort of key value. In the mentioned use case of exception handling, sometimes just a string representation of an object might be created, which is later used for assembling a human readable formatted log output.
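The life-cycle problem and the idea of cloning array data can be sketched as follows. This is not the interface of FClone: MiniArrayBox, Arena, CloneInto and BoxLocalMessage are hypothetical names, and the arena is simply a std::deque of strings chosen because appending to a deque does not invalidate references to existing elements.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical box for character arrays: it stores a pointer and a length,
// but does not own the characters.
struct MiniArrayBox {
    const char* data;
    std::size_t length;
};

// Hypothetical storage that outlives the scope which created the box.
using Arena = std::deque<std::string>;

// Copies the boxed array into the arena and redirects the box to the copy.
inline void CloneInto(MiniArrayBox& box, Arena& arena) {
    arena.emplace_back(box.data, box.length);
    box.data = arena.back().data();
}

// Boxes a local string. Without the clone, the returned box would point to
// memory of 'local', which is destructed when the function returns.
inline MiniArrayBox BoxLocalMessage(Arena& arena) {
    std::string local = "disk full";
    MiniArrayBox box{local.data(), local.size()};
    CloneInto(box, arena);
    return box;
}
```

The box returned by BoxLocalMessage stays valid for as long as the arena exists, which corresponds to an exception object keeping cloned argument data alive while the stack is unwound.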

12.7 Debug Helpers

12.7.1 Available Debug Objects And Fields

In debug-compilations, compiler symbol ALIB_DEBUG_BOXING may be set. In that case, the following entities become available:

Together, this discloses all information necessary to investigate into the built-in and default behavior of ALib Boxing . Please consult the reference manual of the named types, for further details.

12.7.2 Class DbgBoxing

Instead of using the methods and objects listed above, struct DbgBoxing provides a more handy alternative if module ALib BaseCamp is included in the ALib Distribution. The class then offers additional static interface methods that collect and format various sorts of information.

For details, please consult the type's reference documentation . In this Programmer's Manual, we just want to provide some sample invocations.

Note
A prerequisite for the following samples to compile is the inclusion of header file alib/boxing/dbgboxing.hpp .

Showing The Mapped Type:

If a programmer is unsure, which mapped type results from boxing, all that is needed to do is to pass a "sample box" to method TypeName :

cout << "The mapped type is: " << alib::DbgBoxing::TypeName( "char array" ) << endl;
The mapped type is: char[]

Detailed Info On Boxable And Corresponding Mapped Type:

A next, quite powerful method is TypeInfo. It provides all information on a boxable type TBoxable and its mapped type.
The boxable type needs to be provided as a template parameter TBoxable. If it is not default constructible, a corresponding sample box has to be provided as well. To stay with the sample above, to get information for mapped type char[], one possible TBoxable is alib::String. With that, the invocation looks like this:

cout << alib::DbgBoxing::TypeInfo<alib::String>();

It produces the following details:

Boxing Information For Boxable Type: strings::TString<char>
  Mapping:        Array
  Mapped Type:    char[]
  Customized T:   true
  Customized T*:  false
  Is Unboxable:   Yes (Custom unboxing from array type)
  VTable Type:    Static Singleton (Specialized T_VTableFactory)
  Usage Counter:  26805

  Associated Specialized Functions:
    dox_boxing_sample_functions::FToString  ( 2)
    FAppend<char16_t>                       ( 0)
    FAppend<char>                           (8904)
    FAppend<wchar_t>                        ( 0)
    FEquals                                 ( 8)
    FIsLess                                 ( 0)

For readers of this manual, all information should be easily understandable. Line "Usage Counter" provides the quantity of unboxing operations and function invocations that have been performed on boxes of the mapped type so far. The value also depends on when during a process's life-cycle the method was invoked. If this value indicates a high usage and line "VTable Type" denotes a dynamically created vtable type, it might make sense to define a static vtable for that mapped type.

Likewise, with each specialized box-function, the number of its invocations is given in brackets behind its name.

Listing All Known Box-Functions:

To get a list of all box-functions for which either a default implementation or one or more specialized implementations have been registered, the following code can be used:

This results in:

dox_boxing_sample_functions::FToString  ( 2)
expressions::FToLiteral                 (No default implementation)
FAppend<char16_t>                       ( 0)
FAppend<char>                           ( 0)
FAppend<wchar_t>                        ( 0)
FClone                                  ( 0)
FEquals                                 ( 3)
FHashcode                               ( 0)
FIsLess                                 ( 0)
FIsNotNull                              (121)
FIsTrue                                 ( 0)
lang::format::FFormat                   (No default implementation)

For those box-functions that have an associated default implementation, the number of invocations of that default is given in brackets.

Listing Mapped Types With Static/Dynamic VTables:

To list all types for which a dynamic vtable is created (and which therefore could be optimized), the following line of code can be used:

Here is the list:

Mapped types with dynamic VTables:
-----------------------------------------------------------------------------
(0)   bitbuffer::ac_v1::ArrayCompressor::Algorithm
(2)   double[]
(1)   dox_boxing_sample_classes1::BigClass*
(2)   dox_boxing_sample_classes1::SmallClass
(1)   dox_boxing_sample_customization_bypass::WrappedFloat
(0)   files::FInfo::Qualities
(0)   files::FInfo::Types
(1)   int [3][]
(13)  int[]
(0)   lang::format::ByteSizeIEC
(0)   lang::format::ByteSizeSI
(0)   lang::format::ByteSizeUnits
(5)   long[]
(4)   MyBase
(16)  std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >
(0)   std::reference_wrapper<std::__cxx11::basic_string<char16_t, std::char_traits<char16_t>, std::allocator<char16_t> > >
(0)   std::reference_wrapper<std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >
(1)   std::vector<int, std::allocator<int> >*
(18)  ut_reclog::AppendLog*

If true had been passed to DumpVTables, the types with static vtables would have been listed instead. A second boolean parameter, which has a default value, can be used to additionally list the specialized functions with each vtable.

Getting A Quick Overview:

To finish this chapter, method DbgBoxing::DumpAll is invoked, which aggregates much of the above.
The following shows the invocation and a possible corresponding output:

Mapped types with static VTables and their associated specialized functions:
-----------------------------------------------------------------------------
(0)   bool
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsNotNull                                   ( 0)

(1248)  Box[]

(39)  Boxes*

(0)   char16_t[]
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FIsLess                                      ( 0)

(0)   char32_t[]

(27788)  char[]
 dox_boxing_sample_functions::FToString       ( 2)
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                (9060)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 8)
 FIsLess                                      ( 0)

(0)   cli::Exceptions
 FAppend<char>                                ( 0)

(0)   config::Exceptions
 FAppend<char>                                ( 0)

(36)  config::Priorities
 FAppend<char>                                (12)

(15)  double
 dox_boxing_sample_functions::FToString       ( 1)
 FAppend<char>                                ( 2)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsLess                                      ( 0)
 FIsNotNull                                   ( 0)

(0)   expressions::detail::VirtualMachine::Command::OpCodes
 FAppend<char>                                ( 0)

(0)   expressions::Exceptions
 FAppend<char>                                ( 0)

(0)   lang::Exception*

(0)   lang::format::FMTExceptions
 FAppend<char>                                ( 0)

(10)  lang::Report::Types

(0)   lang::system::SystemErrors
 FAppend<char>                                ( 0)

(0)   lang::system::SystemExceptions
 FAppend<char>                                ( 0)

(430)  long
 dox_boxing_sample_functions::FToString       ( 1)
 FAppend<char>                                (65)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsLess                                      ( 0)
 FIsNotNull                                   ( 0)

(0)   long double
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsLess                                      ( 0)

(51)  lox::detail::Logger*
 FAppend<char>                                (17)

(734)  lox::Scope
 FAppend<char>                                (256)

(2545)  lox::Verbosity
 FAppend<char>                                (1268)

(156)  std::pair<lox::Verbosity, config::Priorities>
 FAppend<char>                                (52)

(0)   std::reference_wrapper<strings::TAString<char16_t> >
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(23)  std::reference_wrapper<strings::TAString<char> >
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 8)
 FAppend<wchar_t>                             ( 0)

(0)   std::reference_wrapper<strings::TAString<wchar_t> >
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(0)   std::type_info*
 FAppend<char>                                ( 0)

(0)   strings::util::Token*
 FAppend<char>                                ( 0)

(4)   time::DateTime
 expressions::FToLiteral                      ( 0)
 lang::format::FFormat                        ( 0)

(0)   time::Ticks

(0)   time::TimePointBase<std::chrono::_V2::steady_clock, time::Ticks>::Duration
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(0)   time::TimePointBase<std::chrono::_V2::system_clock, time::DateTime>::Duration
 expressions::FToLiteral                      ( 0)
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(283)  unsigned long
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsLess                                      ( 0)
 FIsNotNull                                   ( 0)

(0)   void*

(328)  wchar_t
 FAppend<char>                                (18)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FHashcode                                    ( 0)
 FIsLess                                      ( 0)
 FIsNotNull                                   ( 0)

(0)   wchar_t[]
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)
 FEquals                                      ( 0)
 FIsLess                                      ( 0)


Mapped types with dynamic VTables and their associated specialized functions:
-----------------------------------------------------------------------------
(0)   bitbuffer::ac_v1::ArrayCompressor::Algorithm
 FAppend<char>                                ( 0)

(2)   double[]

(1)   dox_boxing_sample_classes1::BigClass*

(2)   dox_boxing_sample_classes1::SmallClass

(1)   dox_boxing_sample_customization_bypass::WrappedFloat

(0)   files::FInfo::Qualities
 FAppend<char>                                ( 0)

(0)   files::FInfo::Types
 FAppend<char>                                ( 0)

(1)   int [3][]

(13)  int[]

(0)   lang::format::ByteSizeIEC
 FAppend<char>                                ( 0)

(0)   lang::format::ByteSizeSI
 FAppend<char>                                ( 0)

(0)   lang::format::ByteSizeUnits
 FAppend<char16_t>                            ( 0)
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(5)   long[]
 dox_boxing_sample_functions::FToString       ( 1)

(4)   MyBase

(16)  std::reference_wrapper<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >
 FAppend<char>                                ( 6)
 FAppend<wchar_t>                             ( 0)

(0)   std::reference_wrapper<std::__cxx11::basic_string<char16_t, std::char_traits<char16_t>, std::allocator<char16_t> > >
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(0)   std::reference_wrapper<std::__cxx11::basic_string<wchar_t, std::char_traits<wchar_t>, std::allocator<wchar_t> > >
 FAppend<char>                                ( 0)
 FAppend<wchar_t>                             ( 0)

(1)   std::vector<int, std::allocator<int> >*

(18)  ut_reclog::AppendLog*
 FAppend<char>                                ( 6)


Known Function Declarators And Usage Of Default Implementation:
-----------------------------------------------------------------------------
  dox_boxing_sample_functions::FToString       ( 2)
  expressions::FToLiteral                      (No default implementation)
  FAppend<char16_t>                            ( 0)
  FAppend<char>                                ( 0)
  FAppend<wchar_t>                             ( 0)
  FClone                                       ( 0)
  FEquals                                      ( 3)
  FHashcode                                    ( 0)
  FIsLess                                      ( 0)
  FIsNotNull                                   (121)
  FIsTrue                                      ( 0)
  lang::format::FFormat                        (No default implementation)

Note that the output includes types that we have used during this tutorial. This is due to the fact that the unit tests that produce this manual's output all run in one process and are run in the order of the chapters.

12.8 Performance Considerations

The reason why the effort of implementing this library is needed is the C++ language design principle to be as performant and close to the hardware as possible. Other programming languages are designed for other goals. For example in languages Java or C#, the principle "everything is an object" is (almost) implemented. In these languages, all instances (!) of class types have run-time type information attached. In C++, only virtual classes have that.

And what happens in Java and C# when a plain, fundamental type is passed to a method that expects an object? The corresponding compiler performs "auto-boxing" of the values to pre-defined class types like Char, Integer or Double!

ALib Boxing allows very similar things in C++. Therefore, a quick analysis of the memory and performance impact is appropriate. We do this in a rather loose order:

12.8.1 A General Note On C++ RTTI

Due to the C++ language history, there is some confusion and wrong information spread in consideration of run-time type information (RTTI), especially with programmers that have a long-term record of C++ experience (because they probably went through the painful discussions of older days, which freshmen did not).

Therefore, some quick facts:

  • All standard compilers nowadays support RTTI and such support is not switched-off by default.
  • Such support has no influence on programs that do not use the feature. In other words, switching RTTI off (which some compilers still support!) makes no sense.
  • If RTTI is used in a compilation unit, only the code-lines that use it are affected.
  • The performance impact of using RTTI in C++ is extremely marginal, probably more marginal than in almost any other programming language. For each type that keyword typeid is used on, the footprint of an executable increases by the size of the corresponding std::type_info struct that the linker has to place in the data segment for that type.
    The impact to get information on a type using keyword typeid is negligible. It is constant time, in Big O notation it is O(1). Keyword typeid just reads the pointer to a global struct residing in the data segment of an executable.
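For readers less familiar with keyword typeid, the following short sketch uses only standard C++. The class names Shape, Circle and Square and the helper functions are made up for illustration. Applied to a type, typeid is resolved statically. Applied to an object of a polymorphic class, it yields the dynamic type.

```cpp
#include <cassert>
#include <typeinfo>

// Illustrative polymorphic hierarchy.
struct Shape  { virtual ~Shape() = default; };
struct Circle : Shape {};
struct Square : Shape {};

// typeid applied to a type: yields a reference to the std::type_info object
// that the toolchain provides for that type.
template <typename T>
const std::type_info& StaticTypeOf() { return typeid(T); }

// typeid applied to an object of polymorphic type: yields the type_info of
// the dynamic type, which is then compared with that of Circle.
inline bool IsCircle(const Shape& shape) {
    return typeid(shape) == typeid(Circle);
}
```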

12.8.2 VTables

For each mapped type, a singleton of a type detail::VTable is created once.

This again is negligible, even if no static vtable is declared for a mapped type. If one is declared, then the impact of using a mapped type is comparable to the use of C++ vtables which are created by the compiler and included by the linker for each virtual C++ class used.

12.8.3 Footprint Of Class Box

Class Box contains two members: a pointer to the vtable singleton and the data union Placeholder, which consists of two "words". For example, on a standard 64-bit platform a pointer and a word are each 8 bytes wide, hence an instance of class Box on those platforms has a size of 24 bytes. In many use cases, boxes are created in "stack memory", which allocates and deallocates in zero time (yes, it's less than "O(1)", it is just nothing).

Once created, to pass them to another function or store them in a container like Boxes, these 24 bytes have to be copied.

While this is three times more than copying just a pointer, it might be much less effort in cases where composite types automatically become boxed as pointers. If those were passed, for example, as a variadic templated parameter, a deep-copy of the argument value would have to be performed. With ALib Boxing, it is always only the 24 bytes.
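The size arithmetic can be checked with a small layout sketch. MiniVTable, MiniPlaceholder and MiniBox are hypothetical stand-ins with the same shape as described above (one pointer plus two words). The assertions hold on common platforms where std::uintptr_t has the size of a pointer, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct MiniVTable;   // only used through a pointer in this sketch

// Hypothetical placeholder: two machine words, viewed as integers or pointers.
union MiniPlaceholder {
    std::uintptr_t words[2];
    const void*    pointers[2];
};

// Hypothetical box: vtable pointer plus placeholder.
struct MiniBox {
    const MiniVTable* vtable;
    MiniPlaceholder   data;
};

// One pointer plus two words: 3 * 8 = 24 bytes on a typical 64-bit platform.
static_assert(sizeof(MiniPlaceholder) == 2 * sizeof(void*), "two words");
static_assert(sizeof(MiniBox) == 3 * sizeof(void*), "pointer plus two words");

// Copying is a plain memory copy and destruction performs no work.
static_assert(std::is_trivially_copyable<MiniBox>::value, "plain copy");
static_assert(std::is_trivially_destructible<MiniBox>::value, "no destructor");

inline std::size_t BoxSizeInBytes() { return sizeof(MiniBox); }
```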

12.8.4 Construction And Destruction Of Class Box

When a value is boxed, hence an object of class Box is created, two things have to be done. First the right vtable is identified. This is done using (inlined) TMP code and "magically" this is reduced to the inlined retrieval of a singleton.

This rather tricky procedure is very fast after it was done once for a type, but still the code needed to be inlined might be rather huge. This overhead can be optimized using static vtables. With such optimization, the effort is reduced to single copy operation of a pointer to a data structure residing in the global data segment of an executable.

Secondly, the Placeholder found with member Box::data has to be set. Again, this is mostly inlined TMP code and, when compiled, should in most cases result in one or two simple copy operations of pointers or fundamental C++ values.

Because neither class Box nor the embedded union Placeholder or its members define a destructor, no work is performed when a box is destructed.

12.8.5 Type Guessing

Template method Box::IsType compares the internal pointer to the singleton vtable with the singleton that would be chosen if the given type (the template parameter) was boxed. Therefore, the impact is the same as boxing a value, minus the process of boxing data, plus a pointer comparison.

Again, if optimized vtables are used for the mapped type resulting from the guessed type, method IsType is compiled to one simple inlined pointer comparison.

Template methods Box::IsArray and Box::IsArrayOf have to perform an additional check for a void box, and then otherwise perform a similar pointer comparison.
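Type guessing by pointer comparison can be sketched as follows. The names MiniVTable, VTableOf, MiniBox and MakeBox are hypothetical, the boxed data is omitted, and the per-type singleton is a simple function-local static, which is an assumption of this sketch and not the mechanism ALib uses.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical vtable: one instance per mapped type.
struct MiniVTable { const std::type_info& type; };

// Returns the singleton vtable for type T (created on first use).
template <typename T>
const MiniVTable* VTableOf() {
    static const MiniVTable instance{typeid(T)};
    return &instance;
}

// Hypothetical box; the placeholder data is left out for brevity.
struct MiniBox {
    const MiniVTable* vtable;

    // Type guessing: compare the stored pointer with the singleton that
    // boxing a value of type T would have chosen.
    template <typename T>
    bool IsType() const { return vtable == VTableOf<T>(); }
};

// "Boxing" in this sketch only selects the vtable.
template <typename T>
MiniBox MakeBox(const T&) { return MiniBox{VTableOf<T>()}; }
```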

12.8.6 Methods Box::GetFunction And Box::Call

Template method Box::GetFunction performs a lookup of the function in struct detail::FunctionTable that is embedded in the vtable member of each box. This struct has simple pointer "slots" for each built-in function which are selected using template specializations of the corresponding access functions.

For custom box-functions, a global hash table is used to search the function implementation using a pair of a function table pointer and the function type as the key value.

As a result, a function lookup for a built-in function is performed in O(1), while one for a custom function is slower and is O(1) only in the average case.

If parameter searchScope of method Box::GetFunction equals Reach::Global, then in case of not finding a specific implementation, the search is repeated using namespace object detail::DEFAULT_FUNCTIONS .

Finally, template method Box::Call uses GetFunction and then just passes any given parameters to a C++ function call. Parameters are passed using C++ 11 "perfect forwarding". In the case that no interface method is found, a default value of the return type TReturn is created. Depending on the type, this might invoke a default constructor.
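The two lookup paths and the behavior of a call without an implementation can be sketched in plain C++. Everything below is hypothetical: MiniBox, MiniVTable, the declarators FIsEven and FDescribe, and the free functions Register, GetFunction and Call only mimic the roles described above and do not reproduce ALib's signatures. The search in default implementations is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

struct MiniVTable;
struct MiniBox { const MiniVTable* vtable; long value; };

// Hypothetical function declarators: the type identifies the function.
struct FIsEven   { using Signature = bool        (*)(const MiniBox&); };
struct FDescribe { using Signature = const char* (*)(const MiniBox&); };

// "Built-in" functions get a direct pointer slot in the vtable.
struct MiniVTable { FIsEven::Signature isEven; };

// Custom functions live in one hash table keyed by (vtable, declarator).
using Key       = std::pair<const MiniVTable*, std::type_index>;
using GenericFn = void (*)();
struct KeyHash {
    std::size_t operator()(const Key& key) const {
        return std::hash<const MiniVTable*>()(key.first)
             ^ (key.second.hash_code() << 1);
    }
};
inline std::unordered_map<Key, GenericFn, KeyHash>& CustomFunctions() {
    static std::unordered_map<Key, GenericFn, KeyHash> table;
    return table;
}

template <typename TDecl>
void Register(const MiniVTable* vtable, typename TDecl::Signature function) {
    CustomFunctions()[Key(vtable, std::type_index(typeid(TDecl)))] =
        reinterpret_cast<GenericFn>(function);
}

// General case: hash lookup, O(1) only on average.
template <typename TDecl>
typename TDecl::Signature GetFunction(const MiniBox& box) {
    auto it = CustomFunctions().find(Key(box.vtable, std::type_index(typeid(TDecl))));
    return it == CustomFunctions().end()
               ? nullptr
               : reinterpret_cast<typename TDecl::Signature>(it->second);
}

// Built-in case: a template specialization just reads the slot, O(1).
template <>
inline FIsEven::Signature GetFunction<FIsEven>(const MiniBox& box) {
    return box.vtable->isEven;
}

// Forwards the arguments; returns a default value if nothing is registered.
template <typename TDecl, typename TReturn, typename... TArgs>
TReturn Call(const MiniBox& box, TArgs&&... args) {
    auto function = GetFunction<TDecl>(box);
    if (function == nullptr) return TReturn();
    return function(box, std::forward<TArgs>(args)...);
}

// Sample implementations for a box that holds a long.
inline bool        LongIsEven  (const MiniBox& box) { return box.value % 2 == 0; }
inline const char* LongDescribe(const MiniBox&)     { return "a long"; }
```

A caller would create a MiniVTable with the slot set to &LongIsEven, register LongDescribe with Register<FDescribe>, and then invoke both through Call. For a box whose vtable has neither, Call returns false or a null pointer, respectively.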

12.8.7 Compile Times

Due to the use of type-traits and TMP selected methods with rather complicated type expressions that the compiler has to evaluate, the time to compile a code unit increases with the use of ALib Boxing .

Unfortunately, this increase can be considerable.

12.8.8 Conclusion And Comparison To std::any

We consider the implementation of ALib Boxing to be as performant as possible.

It is hard or impossible to compare the impact on code size and performance between using techniques like C++ variadic template arguments and the invocation of methods that do auto-boxing, probably using class Boxes to fetch variadic arguments.

In comparison to using C++ 17 type std::any, the most important advantage of ALib Boxing is that no heap memory allocations are performed, because class Box "switches" to pointer-boxing in the case a value does not fit to its placeholder. Reversely, when just fundamental types and small value classes are boxed, then std::any has an advantage in construction performance and memory footprint.
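The semantic difference behind this comparison can be observed with standard C++ 17 alone. In the sketch below, Big and MiniPointerBox are hypothetical types. The code only shows that std::any holds its own copy of a value while a pointer-based box refers to the original object. Whether std::any allocates heap memory for a given type is left to the standard library implementation and is not tested here.

```cpp
#include <any>
#include <cassert>

// A type that is clearly larger than two machine words.
struct Big { char payload[256]; };

// Hypothetical pointer-boxing: the box stores only the address.
struct MiniPointerBox { const void* pointer; };

// std::any stores its own copy of the value, so the contained object has a
// different address than the original.
inline bool AnyHoldsOwnCopy(const Big& original) {
    std::any holder = original;
    return std::any_cast<Big>(&holder) != &original;
}

// The pointer box copies nothing; it refers to the original object, which
// therefore has to outlive the box.
inline bool BoxRefersToOriginal(const Big& original) {
    MiniPointerBox box{&original};
    return box.pointer == &original;
}
```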

At the end of the day, the typical use cases of ALib Boxing anyhow do not impose high demands on performance. The main motivation for providing this manual chapter is for the sake of completeness and furthermore, that the authors of the manual think that the previous considerations help to profoundly understand how ALib Boxing is implemented and therefore is to be used.

Appendix A Quick Reference

While the namespace documentation provides an extensive reference index (generated with marvelous Doxygen ), the following quick lists should help finding the information you need:

A.1 Type Guessing

Method                    Description
Box::IsType               Tests for boxes that are default constructed or have a nullptr assigned, hence have no value boxed.
Box::IsType               Tests a box for containing a value of boxable type T.
Box::IsArray              Returns true, if a one-dimensional C++ array had been boxed.
Box::IsArrayOf            Returns true, if IsArray returns true and if the boxed array element type corresponds to given type T.
Box::IsPointer            Returns true, if the mapped type is of pointer type.
Box::IsEnum               Returns true, if the box contains an enumeration element.
Box::IsSameType           Non-template method that returns true if a box contains the same mapped type as a given one.
Box::IsCharacter          Aggregation function that tests for mapped character types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_CHARACTERS.
Box::IsSignedIntegral     Aggregation function that tests for mapped signed integral types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS.
Box::IsUnsignedIntegral   Aggregation function that tests for mapped unsigned integral types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS.
Box::IsFloatingPoint      Aggregation function that tests for mapped floating point types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_FLOATS.

A.2 Unboxing

Method                       Description
Box::Unbox                   Unboxes non-array type T.
Box::UnboxArray              Unboxes the pointer to an array of element type T.
Box::UnboxElement            Unboxes an array's element of type T.
Box::UnboxLength             Unboxes an array's length.
Box::UnboxCharacter          Aggregation function that unboxes a wchar, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_CHARACTERS.
Box::UnboxSignedIntegral     Aggregation function that unboxes an integer, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS.
Box::UnboxUnsignedIntegral   Aggregation function that unboxes a uinteger, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS.
Box::UnboxFloatingPoint      Aggregation function that unboxes a value of type double, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_FLOATS.
Box::Data                    Allows direct constant access to a box's placeholder.

A.3 Built-In Box-Functions

The following box-functions are predefined with the library:

Name Description/Notes
FEquals Logical comparison of the contents of two boxes. Specialization given for all fundamental and character array types
Templated implementations for comparable types are given with FEquals::ComparableTypes .
FIsLess Logical comparison of the contents of two boxes. Specialization given for all fundamental and character array types
Templated implementations for comparable types are given with FIsLess::ComparableTypes .
FIsNotNull See chapter 12.1.2 Nulled Boxes for more information.
FClone See chapter 12.6 Life-Cycle Considerations for more information.
FIsTrue Returns true if a boxed value is considered to represent value true; false otherwise.
The default implementation returns true for array types with non-zero length, and for non-array types if the used placeholder bytes do not all contain 0.
No type-specific implementations are given.
FHashcode Calculates a hash-code using the boxed type information as well as the boxed data.
A default implementation is given that takes all used placeholder bytes into account for types boxed as values or enums, the pointer address for types boxed as pointers, and the array contents for boxed arrays. Furthermore, specializations for all fundamental types are given by using static templated member FHashcode::UsePlaceholderBytes .
For pointer types, the provision of a specialization that collects type-specific hashable data from the pointer may lead to advanced hashing results.
FAppend Appends a string representation of the contents of the box to a given AString.
The default implementation writes the type name and a hexadecimal number in brackets behind for pointer types and similar information for other types. As type name information is available in debug-compilations only, in release code, the words "ValueType", "PointerType", "ArrayType" or "EnumType" are written instead. Hence, this indicates that a missing specialization is in fact an error and the default implementation is rather given for convenience and testing purposes.
Specializations are given for fundamental and character array types.
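The pattern shared by the functions in this table, a default implementation plus optional type-specific specializations, can be illustrated outside of ALib. The following is a minimal sketch only: it uses std::any in place of class Box, and the names AppendFn, AppendFunction, Register, Call and MakeAppendWithIntSupport are hypothetical and not part of the library.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <string>
#include <typeindex>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a box-function such as FAppend; not ALib's API.
using AppendFn = std::function<void(const std::any&, std::string&)>;

class AppendFunction {
    std::unordered_map<std::type_index, AppendFn> specific_;

public:
    // Registers a type-specific implementation for values of type T.
    template <typename T>
    void Register(AppendFn fn) {
        assert(fn != nullptr);
        specific_[std::type_index(typeid(T))] = std::move(fn);
    }

    // Invokes the specific implementation if one exists, else the default.
    void Call(const std::any& value, std::string& target) const {
        auto it = specific_.find(std::type_index(value.type()));
        if (it != specific_.end()) {
            it->second(value, target);
            return;
        }
        target += "<no specialization>";  // default implementation
    }
};

// Builds a function table with one specialization, for type int.
inline AppendFunction MakeAppendWithIntSupport() {
    AppendFunction f;
    f.Register<int>([](const std::any& v, std::string& t) {
        t += std::to_string(std::any_cast<int>(v));
    });
    return f;
}
```

A value of a type without a registered specialization falls through to the default, which mirrors the remark above that a missing specialization of FAppend usually indicates an error in the using code.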

A.4 Box-Function Invocation

Method Description
Box::Call Calls a box-function.
Box::CallDirect Calls box-function previously received with GetFunction.
Box::GetFunction Returns a box-function's implementation.
Box::Clone Implicitly calls box-function FClone .
Box::Hashcode Implicitly calls box-function FHashcode .
Box::IsNull Implicitly calls box-function FIsNotNull .
Box::IsNotNull Implicitly calls box-function FIsNotNull .
Box::operator bool() Implicitly calls box-function FIsTrue .
Box::operator== Implicitly calls box-function FEquals .
Box::operator!= Implicitly calls box-function FEquals .
Box::operator< Implicitly calls box-function FIsLess .
Box::operator<= Implicitly calls box-functions FIsLess and FEquals .
Box::operator> Implicitly calls box-functions FIsLess and FEquals .
Box::operator>= Implicitly calls box-function FIsLess .
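The table shows that all six comparison operators rest on just two box-functions. A minimal sketch of one way such a derivation can look is given below. Type Item and the free functions Less and Equals are hypothetical stand-ins for a boxed value, FIsLess and FEquals; whether ALib composes the operators in exactly this way is an assumption.

```cpp
#include <cassert>

// Hypothetical value type; Less() and Equals() stand in for FIsLess and FEquals.
struct Item {
    int key;
};

inline bool Less(const Item& a, const Item& b)   { return a.key < b.key; }
inline bool Equals(const Item& a, const Item& b) { return a.key == b.key; }

// Operators that need only one of the two primitives.
inline bool operator==(const Item& a, const Item& b) { return Equals(a, b); }
inline bool operator!=(const Item& a, const Item& b) { return !Equals(a, b); }
inline bool operator< (const Item& a, const Item& b) { return Less(a, b); }
inline bool operator>=(const Item& a, const Item& b) { return !Less(a, b); }

// Operators that need both primitives.
inline bool operator<=(const Item& a, const Item& b) { return Less(a, b) || Equals(a, b); }
inline bool operator> (const Item& a, const Item& b) { return !Less(a, b) && !Equals(a, b); }

// For a total order, exactly one of <, == and > holds for any pair of items.
inline void CheckTrichotomy(const Item& a, const Item& b) {
    assert((a < b) + (a == b) + (a > b) == 1);
}
```

The split matches the table: operator>= is the plain negation of "less", while operator<= and operator> have to consult the equality function as well.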

A.5 Further Methods Of Class Box:

Method Description
Box::TypeID Returns the typeid of a mapped type.
Box::ElementTypeID Returns the typeid of a boxed array's element type.
Box::GetPlaceholderUsageLength Returns the bytes used in the placeholder. Useful to write generic code, e.g. to implement default versions of box-functions.

A.6 Debug Methods And Entities:

Method Description
DbgBoxing Static tool class to create human readable information about the configuration of ALib Boxing .
Box::DbgGetVTable Returns the vtable singleton of a box.
VTable::Functions Has a set of fields whose names are prefixed "DbgCntInvocations" and provide the number of invocations of the corresponding built-in box-function. Likewise, method DbgBoxing::GetSpecificFunctionTypes returns the usage number with each registered custom box-function.
VTable::DbgProduction Denotes if a vtable singleton was dynamically created or is an optimized static object.
VTable::DbgCntUsage A usage counter for the mapped type. The counter is increased with the invocation of various unboxing methods and when a box-function invocation is performed.

A.7 Built-In Non-Bijective Boxing

With default compilations, the following non-bijective boxing rules apply:

Source Type | Mapped Type | Unboxing/Comments
References and values of composite types (structs and classes) that either do not fit into union Placeholder or that are not copy-constructible or not trivially destructible. | Pointers to corresponding composite types | Only the pointer type can be unboxed.
Pointers to objects of composite types (structs and classes) that do fit into union Placeholder and that are copy-constructible and trivially destructible. | Values of corresponding composite types | Only the value type can be unboxed.
Signed integral types of any size | integer | Only type integer can be unboxed.
Unsigned integral types of any size | uinteger | Only type uinteger can be unboxed.
float | double | Only type double can be unboxed.
char, wchar_t, char16_t and char32_t | wchar | Only type wchar can be unboxed.
const pointer to any of the three character types nchar , wchar or xchar; string literals; char[]; std::string; std::string_view; std::vector<char>; ALib string types | Array of corresponding character type | "Lightweight" string types like std::string_view or String can be unboxed, "heavyweight" string types like AString can not.
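The effect of the integral and floating point rows can be reproduced with standard C++ alone. The sketch below is an illustration under assumptions: std::variant replaces union Placeholder, std::intmax_t and std::uintmax_t stand in for integer and uinteger, the names Mapped, BoxValue and UnboxValue are hypothetical, and the character, string and composite type rows are not covered.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical mapped types, standing in for ALib's integer, uinteger and double.
using Mapped = std::variant<std::intmax_t, std::uintmax_t, double>;

// Boxes any arithmetic value; many source types collapse to one mapped type.
template <typename T>
Mapped BoxValue(T value) {
    static_assert(std::is_arithmetic_v<T>, "sketch covers arithmetic types only");
    if constexpr (std::is_floating_point_v<T>)
        return Mapped(std::in_place_type<double>, static_cast<double>(value));
    else if constexpr (std::is_signed_v<T>)
        return Mapped(std::in_place_type<std::intmax_t>, static_cast<std::intmax_t>(value));
    else
        return Mapped(std::in_place_type<std::uintmax_t>, static_cast<std::uintmax_t>(value));
}

// Unboxes the mapped type only; the original source type is not recoverable.
template <typename T>
T UnboxValue(const Mapped& box) {
    assert(std::holds_alternative<T>(box));
    return std::get<T>(box);
}
```

A short and a long end up in the same alternative, so code that processes the result needs one branch per mapped type instead of one per source type. This is the simplification that non-bijective boxing buys.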

Appendix B: 3rd Party Library Compatibility

In the source tree of the ALib C++ Library , folder alib/compatibility is found. Within that, a few header files are placed which are not included by other library headers, but instead may optionally be included by using code.
The directory aggregates headers imposed by different ALib Modules , targeting different 3rd-party libraries (in this respect we consider also the C++ standard library as such, as its use is optional).

The naming scheme of the header files is: "libname_modulename_something.hpp". For example, you will find header std_boxing_functional which applies to C++ standard library, this module and the area of "functionals".

There is no further documentation given in this user manual. However, in the reference documentation of this module, which is found with the documentation of namespace alib::boxing, sub-namespace alib::boxing::compatibility exists, which aggregates some of the customization content.

To achieve this, the documentation even sometimes "fakes" entities into this namespace, that technically must not be there - and in reality therefore are not there.
As a sample, take functor struct alib::boxing::compatibility::std::hash<alib::boxing::Box>. While the documentation claims it to be in that deep namespace, it is a specialization of struct std::hash and therefore "in reality" is and has to be made in namespace std. The reference documentation of all "faked", moved entities will individually hint to this fact.

As noted in chapter 12.5.2 Header Inclusion, compatibility headers provided for module ALib Characters , have to be included prior to those provided for ALib Boxing .

The headers found should give good guidance for implementing custom ones as needed. Please feel free to send us your implementations for inclusion in this library. But please do this only together with a due approval that those contributions are allowed to be published by us under the ALib License Terms.

Appendix C: Use Cases

Quite often in this Programmer's Manual, certain "design decisions" were discussed and said to be "justifiable" with the typical use-case scenarios of module ALib Boxing .

The following presentation of sample use cases is intended to give such justification. For example, it will be shown that:

  • Losing type information and data of the original type with non-bijective boxing is very seldom a problem, and if it is, it can be easily bypassed using identity boxing.
  • Losing information about whether a source type was constant or not, and making constant the default for pointer types, is seldom a problem, because processing functions usually perform read-only operations.
  • Not performing "deep copies" of values larger than 16 bytes (respectively 8 bytes on a 32-bit system), but instead just boxing a pointer to the original, only seldom imposes a life-cycle conflict.
    And if it does, box-function FClone helps out, even in a way that reduces dynamic memory allocations to a bare minimum.
  • Box-functions enable a library to offer support for 3rd-party types, without the need of touching those. In other words, the processing function may reside in a library as well as the boxed types may do. Even then, the external types can be made compatible with the also external processing function.
  • Finally, it will be shown that customization of boxing, as well as defining implementations of box-functions is seldom needed. And if it is, a custom code can provide nice macros for its own audience, to enable those programmers to use the custom library.

All use cases are taken from other ALib Modules , which depend on module ALib Boxing .

C.1 Use Case: Sub-namespace "format" of Module BaseCamp

The format types of module ALib BaseCamp are more than a use-case. In fact they were the whole reason and motivation of creating ALib Boxing !
That module implements the well known "printf paradigm", which is available in standard libraries of various programming languages. A printf-like function interface is used to create a string representation of an arbitrary amount of arguments of arbitrary type. To do so, a "format string" that contains one placeholder for each provided argument is passed along with the arbitrary arguments. The placeholders within the format string follow a certain syntax which allows various output modifications, like number formats, horizontal alignment, etc.

Module ALib BaseCamp provides abstract class Formatter which offers two overloaded versions of method Format : both accept a target AString as the first argument. The first accepts a reference to class Boxes , while the second accepts variadic template parameters besides the target string. How the latter invokes the first with a few lines of inlined code is explained in chapter 11. Variadic Function Arguments and Class Boxes.
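The relationship between the two overloads can be sketched with standard C++ only. In the following, Item and Items are hypothetical stand-ins for classes Box and Boxes, and functions AppendList and Append are illustrative names that do not exist in ALib. The two functions deliberately carry different names here, which sidesteps any overload-resolution questions.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-ins for Box and Boxes; not ALib's types.
using Item  = std::variant<long long, std::string>;
using Items = std::vector<Item>;

// Run-time worker: interprets the collected arguments one by one.
inline void AppendList(std::string& target, const Items& items) {
    for (const Item& item : items) {
        if (const auto* text = std::get_if<std::string>(&item))
            target += *text;
        else
            target += std::to_string(std::get<long long>(item));
    }
}

// Compile-time front end: collects the variadic arguments, then delegates.
template <typename... Ts>
void Append(std::string& target, Ts&&... args) {
    Items items;
    items.reserve(sizeof...(args));
    (items.emplace_back(std::forward<Ts>(args)), ...);
    assert(items.size() == sizeof...(args));
    AppendList(target, items);
}
```

Only the thin template is instantiated per argument combination. All interpretation of the arguments happens once, in the non-template worker.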

Most Flexible Invocation:

So, where is the format string then found in this interface? Well, here is a first idiosyncrasy of this implementation: The format string is not expected as a separated string type but just as a first of the arbitrary arguments. This approach has the following advantages:

  • Being of type Box, just as the arbitrary argument list, allows any type that boxes as a character string to be passed.
    If the first argument is not of string type, then its contents are just appended to the target string using box-function FAppend . Then the next argument is checked to be of string type. And so on.
  • The second advantage lies in the fact that consequently, if a format string was found and all of its placeholders were processed and still arguments remain in the argument list, the whole procedure starts from scratch. This way, a user of the method is free to perform more than one format operation in one invocation.
  • Out of this advantage, a next one results: the overloaded version of method Format that takes a reference to class Boxes becomes very interesting: A using code might collect various format operations during its course of execution in an object of that type and, when done, perform all formatting in one invocation. This means, depending on the branches that a code takes, different format strings and format arguments might be collected.
    This is especially helpful when module ALib BaseCamp is used in the context of debug- and release-logging, which is discussed in a later chapter.

All three advantages together make the format-interface given with module ALib BaseCamp unrivalled in respect to flexibility.
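The processing loop described by the three points above can be sketched as follows. This is a simplified illustration, not the formatter's implementation: type Arg and the functions ToText and FormatAll are hypothetical, and the bare placeholder "{}" is an assumed, reduced syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical argument type standing in for Box.
using Arg = std::variant<std::string, long long>;

// Converts a single argument to text (the role FAppend plays in the manual).
inline std::string ToText(const Arg& arg) {
    if (const auto* text = std::get_if<std::string>(&arg)) return *text;
    return std::to_string(std::get<long long>(arg));
}

inline std::string FormatAll(const std::vector<Arg>& args) {
    std::string out;
    std::size_t next = 0;
    while (next < args.size()) {
        const std::string* format = std::get_if<std::string>(&args[next]);
        if (format == nullptr) {          // not a string: just append it
            out += ToText(args[next++]);
            continue;
        }
        ++next;                           // a string: treat it as a format string
        std::size_t pos = 0;
        for (;;) {
            assert(pos <= format->size());
            const std::size_t hit = format->find("{}", pos);
            if (hit == std::string::npos) {
                out.append(*format, pos, std::string::npos);
                break;                    // remaining arguments start from scratch
            }
            out.append(*format, pos, hit - pos);
            if (next < args.size())
                out += ToText(args[next++]);
            pos = hit + 2;
        }
    }
    return out;
}
```

Because the outer loop restarts after each format string, a single list can carry several independent format operations, which is what makes collecting them across code branches practical.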

Efficiency Due To Implicit Pointer-Conversion:

A next aspect that this use-case nicely shows is the exclusive use of class Box as function arguments. With this, no concerns of life-cycle management of the boxed data have to be taken into account. (We refer to those discussed in chapter 12.6 Life-Cycle Considerations). When arguments are passed and boxes are created implicitly on the stack, their life-cycle ends exactly when the function returns. This greatly justifies the design decision to "automatically" box pointers to objects in the case that given values do not fit into union Placeholder . If C++17 class std::any was used instead, unless the library documentation would demand its users to explicitly pass pointers, deep copies of "bigger" objects would be created. And this would be completely unnecessary overhead, because the formatters treat each argument as a constant (read-only) object.
One could argue that it is typical and thus rightful C++ style to use address operator& when passing objects, while in contrast this boxing approach hides away the pointerization. Our counter-argument is: A concept as implemented with std::any hides away the deep copy operation if just no pointer is passed. This is a negative impact on the performance, while the implicit pointerization is not!
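The hidden copy made by std::any can be made visible with a small counter. The sketch below uses only the C++ standard library. Struct Big and both helper functions are hypothetical names introduced for this illustration.

```cpp
#include <any>
#include <array>
#include <cassert>

// Hypothetical "big" argument type that counts how often it is copied.
struct Big {
    std::array<char, 256> payload{};
    static inline int copies = 0;

    Big() = default;
    Big(const Big& other) : payload(other.payload) { ++copies; }
};

// Stores the object by value: std::any makes its own deep copy.
inline int CopiesWhenPassedByValue() {
    Big big;
    const int before = Big::copies;
    std::any byValue(big);
    assert(byValue.has_value());
    return Big::copies - before;
}

// Stores only the address: no copy, but the caller must keep the object alive.
inline int CopiesWhenPassedByPointer() {
    Big big;
    const int before = Big::copies;
    std::any byPointer(&big);
    assert(std::any_cast<Big*>(byPointer) == &big);
    return Big::copies - before;
}
```

With std::any, the cheap variant requires the caller to write the address operator explicitly. For function arguments that are only read, as discussed above, the pointer is the better default.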

Support for Custom Format Syntax:

Finally, the use case implemented with module ALib BaseCamp shows nicely how ALib Boxing enables a library to be extended to serve custom types in a most flexible way. This is shown with the provision of box-function FFormat by that module. This allows introducing new placeholder syntax (!) for custom types, of-course without touching the original source code of the module.
A sample of how a custom type can be featured with a custom placeholder syntax is given in the Programmer's Manual of that module with chapter 4.3. Formatting Custom Types.

C.2 Use Case: ALib Exceptions

Module ALib BaseCamp introduces class Exception , which is used in all ALib Modules as the throwable.

Use Arbitrary Scoped Enums:

Class Exception stores a list of Message objects that may extend the exception object with new information while the call stack is unwound. Each message entry has an identifier that is implemented with field Message::Type . This field is of type Enum and is a very good sample for using this type. With that it became possible that every ALib Module (and likewise a using custom software) defines its own scoped enum type that enumerates all exceptions that the module (respectively custom software) may throw. As a result, an exception entry's type can contain enum elements of custom enum types transparently. A two-level hierarchy results from that. A usual catch handler consists of nested if-statements: The outer uses Enum::IsEnumType to test for the general exception type. The inner then uses Enum::operator== to test for a specific element of that exception type.
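The two-level test can be illustrated without ALib. In the following sketch, class AnyEnum is a hypothetical, much reduced stand-in for class Enum. The enum types IoErrors and ParseErrors as well as function Describe are made up for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

// Hypothetical stand-in for a type-erased enum element; not ALib's class Enum.
class AnyEnum {
    std::type_index type_;
    long long       value_;

public:
    template <typename E, typename = std::enable_if_t<std::is_enum_v<E>>>
    AnyEnum(E element)
        : type_(typeid(E)), value_(static_cast<long long>(element)) {}

    // Outer test: is the stored element of enum type E?
    template <typename E>
    bool IsEnumType() const { return type_ == std::type_index(typeid(E)); }

    // Inner test: is the stored element exactly this element?
    template <typename E>
    bool operator==(E element) const {
        return IsEnumType<E>() && value_ == static_cast<long long>(element);
    }

    // Returns the element; the caller must have checked the type before.
    template <typename E>
    E Get() const {
        assert(IsEnumType<E>());
        return static_cast<E>(value_);
    }
};

enum class IoErrors    { FileNotFound, AccessDenied };
enum class ParseErrors { UnexpectedToken, UnexpectedEnd };

// Sketch of a catch handler's nested if-statements.
inline std::string Describe(const AnyEnum& type) {
    if (type.IsEnumType<IoErrors>()) {
        if (type == IoErrors::FileNotFound) return "io: file not found";
        return "io: other error";
    }
    if (type.IsEnumType<ParseErrors>()) {
        return type.Get<ParseErrors>() == ParseErrors::UnexpectedToken
                   ? "parse: unexpected token"
                   : "parse: other error";
    }
    return "unknown exception type";
}
```

Each module can add its own scoped enum without any central registry, because the element's C++ type itself serves as the first level of the hierarchy.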

Attaching Arbitrary Arguments:

Each Message of an exception may store an arbitrary amount of arbitrary objects that provide further information about the entry, hence about the cause of the exception or about state information of the code that threw the exception.
For this, class Message inherits type Boxes , which is a container storing elements of type Box . The information stored can (has to) be interpreted in a custom way by corresponding implementations of the exception handlers. A recommendation for users of this ALib Module is to prepend a format string as the first element of this list. Such a format string should contain a placeholder for every provided message argument, and together this provides the possibility for an exception handler to easily create a human readable text message from an exception entry, by just passing the Boxes object to a Formatter , as discussed in the previous use-case chapter.

Cloning Exception Arguments:

In contrast to the previous use case of text formatting, with Exception and its used Message object, the life-cycle management of the boxed message arguments is a quite critical issue. To resolve this, class Message offers method Message::CloneArguments , which simply invokes Boxes::CallAll on all contained boxes.

Code that throws an exception, or that appends a new message to an exception while handling one, has to assure that at least one of the following is true for each boxed argument attached:

  1. The argument is boxed as value type.
  2. The argument is boxed as pointer type and the object passed survives unwinding the call-stack.
  3. The argument is boxed to an array. (The default implementation of box-function FClone copies arrays.)
  4. The argument is boxed to a mapped type that is equipped with a proper specific implementation of FClone .
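Why cloning matters for array-like arguments (case 3) can be shown with a reduced model. The class below is a hypothetical sketch that only shares its name with ALib's Message. Its arguments are std::string_view objects, and the methods Add, CloneArguments and Arg as well as function MakeMessage are assumptions made for illustration. Unlike the library, which is described as using a MonoAllocator, this sketch performs one heap allocation per cloned argument.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical message holding non-owning string arguments; not ALib's Message.
class Message {
    std::vector<std::string_view>             args_;
    std::vector<std::unique_ptr<std::string>> owned_;

public:
    // Attaches an argument without copying; the caller still owns the characters.
    void Add(std::string_view argument) { args_.push_back(argument); }

    // Copies every argument's characters so that they survive stack unwinding.
    void CloneArguments() {
        for (std::string_view& argument : args_) {
            owned_.push_back(std::make_unique<std::string>(argument));
            argument = *owned_.back();
        }
    }

    std::string_view Arg(std::size_t index) const {
        assert(index < args_.size());
        return args_[index];
    }
};

// Builds a message from a local string that is destroyed on return.
inline Message MakeMessage() {
    std::string local = "disk full";
    Message message;
    message.Add(local);
    message.CloneArguments();  // without this call, Arg(0) would dangle
    return message;
}
```

The copies live on the heap behind unique pointers, so moving the message out of the function leaves the views valid.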

Class Exception provides - and is even allocated within (!) - an object of type MonoAllocator , which itself is allocated in its own first chunk of memory! If the first chunk of memory is sufficient, then only one single dynamic memory allocation is performed for the creation of the exception, including the copies of all message arguments!

C.3 Use Case: Module ALib ALox

We said in appendix C.1 that the format types were the original motivation for creating module ALib Boxing . The truth is, module ALox was, just as the whole library once started with the development of ALox .

Of-course, ALox uses the formatting features of ALib BaseCamp and thus all that was said for this use case applies to ALox.

Prefix Logables:

In the context of ALox , boxes are called "logables" because they are the input to the logger. Now, ALox has an option to define prefixes (objects that are prepended to each log entry) in various ways. They can appear globally, or only with log-messages that are placed in a certain scope. The scope can be a source code file, a function or method, or even a certain execution thread.

A particularly interesting thing is that if these prefixes are string objects (note that ALox also supports non-textual logging) these strings are copied when set as a prefix. The rationale for this is to allow the assembly of a local string object and pass this to ALox as a prefix logable. This is a pure convenience feature. However, in some rare cases a software might wish to set a mutable string object as prefix logable. In this case the string must not be copied, but rather stored as a pointer to the original string object that then might be modified by other code entities. To achieve this and bypass the string copy feature, the string object has to be wrapped in std::reference_wrapper.
Consequently, this is a sample use-case for what is explained in chapter 7.9 Bypass Custom Boxing With Identity-Boxing.

Overloaded Methods Using Variadic Arguments:

With C++, when overloading methods that use templated variadic arguments, quite quickly compile-time ambiguities occur: From a given set of arguments, the compiler can often not decide which of the overloaded versions to take, because two or more are matching the variadic portion. ALib Boxing solves this issue and allows ALox to offer a flexible API with many variants of overloaded methods that still accept variadic arguments. How this is done is explained in chapter 11.3 Advanced Usage of Class Boxes.

C.4 Use Case: Module ALib Expressions

Goal of Module ALib Expressions:

The aim of module ALib Expressions is to provide an easy, yet powerful C++ library that allows run-time compilation of expressions. Expression syntax mimics and covers the whole set of C++ operators and like C++ is deemed to be type-safe during compilation (here: expression compilation performed at run-time!), while allowing custom intermediate and result types, processed by custom expression identifiers and functions. Expression strings are compiled to a "program", which is executed by a virtual machine (a simple stack machine provided with the module) to evaluate an expression result. Together with the program, the virtual machine is fed with an "expression scope" that provides access to custom data used by the program's identifiers and functions.

The use of ALib Boxing with this module, probably provides the most uncommon - but thus even more exciting - use case of ALib Boxing .
The full truth is, when planning that module, its authors did not expect how compelling and helpful the use of class Box would be for the implementation. Only during the development it became clear that the use of ALib Boxing simplifies almost every aspect of that library. And this is not only true for the library development itself, but also from the perspective of an "end-user" that incorporates that module into his own software.
What during development first seemed like a "misuse" of class Box (and was deemed to be replaced later), turned out not only to ease the library's use, but also to boost performance and minimize code size.

Transport Of Type Information:

This at first considered "misuse" is documented with manual chapter 3.2 Type Definitions With "Sample Boxes". Note that that manual still talks of "lazy use" or even "mis-use". In fact this is not really true. The effect on code size and ease of use is tremendous and it was a thorough decision to keep this concept since the first released library version.

Let us try to generalize the use-case described in the manual section linked above: Class Box is used to transport type-information between a user's code and a library code. In contrast to exchanging this information using std::type_info references and C++ keyword typeid, along with the information about whether that type is an array type or not, references to simple "sample boxes" are exchanged. A user of module ALib Expressions this way is not bothered with things like non-bijective boxing, value or pointer boxing, array boxing and the rather uncommon C++ RTTI mechanics. All that a user needs to do is to assign a simple sample value to an object of type Box and pass a reference to this box around.

The reason why the other use cases presented above did not need such use is obvious: Only module ALib Expressions deals with run-time type information "officially". Its whole goal is to allow the compilation of expression strings defined by end-users at run-time. Expression strings that a user feeds into a compiled software might result in different types - at run-time! Now the code using the library has to tell the library for example which result type an expression is allowed to have. Another sample are user-defined expression functions that have a signature of arguments and the result value. Now, during the type-safe compilation of expression strings (at run-time), the compiler needs to be able to select the right overloaded functions. The signature of a custom expression function is defined by a simple list of sample boxes.
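The idea of describing a signature with sample values can be sketched with std::any taking the role of class Box. The alias Signature and the functions Matches and IntegerAndString are hypothetical names. The sketch compares exact C++ types and thus ignores the non-bijective type mapping that ALib Boxing performs.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <string>
#include <typeinfo>
#include <vector>

// Hypothetical: a signature is a list of sample values; only their types matter.
using Signature = std::vector<std::any>;

// Tests whether the given arguments have the types of the signature's samples.
inline bool Matches(const Signature& signature, const std::vector<std::any>& arguments) {
    if (signature.size() != arguments.size()) return false;
    for (std::size_t i = 0; i < signature.size(); ++i) {
        assert(signature[i].has_value());
        if (!(signature[i].type() == arguments[i].type())) return false;
    }
    return true;
}

// Signature "(integer, string)", expressed with two throwaway sample values.
inline Signature IntegerAndString() {
    return Signature{std::any(0LL), std::any(std::string())};
}
```

The values 0 and the empty string are never read. They exist only to carry their type, which is the "sample box" idea in a nutshell.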

Return a Value of Type Box:

The previous use-cases introduced in this appendix did not include a sample where a function or method returns a value of type Box. While the principle of doing so was presented in this manual very early already (see chapter 2.1 Tutorial: Boxing Values), it seems it is not too easy to find a good real-life use case for this. But module ALib Expressions has that.

The module allows to define custom expression functions. When the built-in virtual machine executes an expression program (aka evaluates an expression) such functions are called. As input arguments, the current boxes placed on the machine's stack are passed. Each function returns its result with a value, which replaces the input arguments on the stack. Consequently, all custom expression functions (which are also used to define custom unary or binary operators, auto-casts, etc) use boxes as input arguments and return a box.
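The calling convention described above can be sketched as follows. Type Value stands in for class Box, and the names Function, Add and CallFunction are hypothetical. The real virtual machine of module ALib Expressions is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical stand-in for Box.
using Value = std::variant<long long, std::string>;

// An expression function receives a range of arguments and returns one value.
using Function = Value (*)(const Value* begin, const Value* end);

// Sample expression function: sums up integral arguments.
inline Value Add(const Value* begin, const Value* end) {
    long long sum = 0;
    for (const Value* it = begin; it != end; ++it)
        sum += std::get<long long>(*it);
    return Value(sum);
}

// Calls a function with the topmost argc stack entries and replaces them
// with the function's result.
inline void CallFunction(std::vector<Value>& stack, std::size_t argc, Function function) {
    assert(argc <= stack.size());
    const Value* end = stack.data() + stack.size();
    Value result = function(end - argc, end);
    stack.erase(stack.end() - static_cast<std::ptrdiff_t>(argc), stack.end());
    stack.push_back(std::move(result));
}
```

Since arguments and result share one type, functions with any arity and any result type fit the same function pointer signature.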

Generating Custom Expression Literals:

Finally, another nice sample that module ALib Expressions demonstrates is in the area of box-functions. The module introduces the declaration FToLiteral . This is used by the expression compiler to generate an "optimized expression string". This may be wanted when a user passes an expression that can be optimized by the compiler to a shorter expression. While the optimization internally works and can be used, a software might want to present an expression string back to the user that - if compiled - directly resulted in the optimized expression program.
Details on that use-case are given in chapter 11.5 Optimizations of that module's Programmer's Manual, as well as in the reference documentation of the box-function declarator FToLiteral .