The original motivation to implement ALib Boxing was the need to allow functions to accept an arbitrary number of arguments of arbitrary type. While C++ has all mechanisms to implement this (using variadic template arguments), the limitation of the template approach is that everything needs to happen at compile-time. This limits the concept tremendously, for the sake of gaining the typical unrivalled C++ performance!
We were searching for a way to collect the arguments and pass them on for run-time interpretation. With other programming languages which provide a superclass Object and run-time type information this is a no-brainer. In C++ it needs some effort to achieve this. This library provides a very generalized, extensible approach that is not at all limited to variadic function arguments.
The prerequisites needed to reach the original goal were much more than we first expected, and in fact, only chapter 11. Variadic Function Arguments and Class TBoxes presents the solution for this.
This module, ALib Boxing, provides means to use C++ run-time type information in a most easy fashion.
For this, any C++ type, from fundamental "scalar" types to complex composite custom classes, can be assigned to an object of type Box. With the assignment, besides the object's value or a pointer to it, "run-time type information" is stored. The so called "Boxes", including their content, can be passed to functions as arguments, returned by functions or stored for later use. Finally, the contents can of course be unboxed in a type-safe fashion.
The seamless usability of ALib boxes is achieved using template meta programming (TMP). While a default behavior handles custom types properly, the two necessary conversions, which are called "boxing" and "unboxing", can be customized.
The concept of "boxing" is available in many programming languages and often even done in an inherent, hidden fashion (then sometimes called "auto-boxing").
int    i   = 5;  // No boxing, as simple "value-type" int is used.
Object box = 6;  // Auto-boxing: Creation of a container-object that includes run-time type-information.
Starting with C++ 17, the standard C++ library provides type std::any, which implements a similar concept. The differences between class Box introduced by this ALib Module and class std::any will be examined in detail in this Programmer's Manual.
In short, the differences are:
The performance penalty, if any, with respect to std::any is quite low. Class Box is very lightweight and usually its footprint is one third bigger than that of std::any. In many occasions, ALib Boxing becomes even faster, due to
We furthermore think that the use of class Box is much easier than that of std::any.
This ALib Module is located at a quite low level of the module dependency graph of the library and hence can be extracted and compiled with a surprisingly small fraction of the overall library source. For the convenience of the authors, the samples in this manual rely on (and therefore probably compile only with) the full ALib Distribution.
However, several sections of this manual give detail on the optional module dependencies and the according features of ALib, which leverage this module.
This documentation mixes tutorial sections and such that provide in-depth information. The tutorial chapters use the word "tutorial" in their headline and are usually followed by in-depth information.
In addition, some detailed topics are explained in the reference documentation of the corresponding library types. If so, this manual will notify the reader and offer deep links into the reference guide.
We hope that with this structure, experienced C++ programmers will be able to quickly grasp what they need, while less experienced ones get all information needed to fully understand all pros and cons of (using) this library.
While this manual is very detailed and quite lengthy, the good news is that it only addresses programmers who integrate this module into their own code. If a software offers an API that accepts class Box as function arguments, the user of that interface does not need to know much about ALib Boxing. Only if she wishes to in turn implement box-functions for her types, or to start customizing the boxing of those, is some deeper understanding necessary.
This Programmer's Manual will frequently compare features and implementation details of central class Box with C++ 17 class std::any. This is done for various reasons:
- std::any is a type that C++ programmers usually know about. In general, humans are good at learning new things through comparison with existing knowledge.
- std::any is a plain, lean and straightforward approach to the core idea that ALib Boxing implements. By offering the comparison, the design decisions behind specifics of class Box can be nicely carved out.
- Both approaches, std::any and class Box, have good reason for existence. The comparison helps in judging which one to choose in a specific use case.
By no means do the authors of the code or this manual want to give the impression that the comparison to std::any is about indicating a "superiority" of the ALib concept over that of the standard library. On the contrary, we want to clearly state that the standard library just follows different design goals: It is rightfully very abstract and provides an approach of completeness in a mathematical and procedural sense.
And while having less functionality and flexibility, class std::any likewise has a smaller footprint and also in some cases provides better execution performance than class Box.
Let us now quickly jump into code and have a look at a "hello world" sample:
Compiling and running this program, the output is:
My box contains: Hello World
The central type of this module is class Box, located in this module's namespace alib::boxing. As done with most ALib classes, it has an alias name defined in namespace alib, hence shortcut alib::Box can be used. Now, as the sample states
using namespace alib;
just "Box" becomes sufficient.
The act of "emplacing a value in an instance of class Box" is called "boxing". The sample above shows how such "boxing" is performed: It is obviously done "inherently" with the simple C++ assignment operator. We can assign just anything to our "box" without getting compiler errors:
Compiling and running this program, the output is:
My box contains a string: Hello World
My box now contains an int: 42
My box now contains a double: 3.1415
For programmers who already know C++ 17 type std::any, this is not too surprising. The pure C++ language standards do not suggest such code, because C++ is a strongly type-safe language!
Besides with assignments, this mechanism of "auto-boxing" works well with function calls. C++ allows exactly one implicit type conversion, if a function argument is defined as a constant reference type:
The function can be invoked with any argument. Therefore, the following invocations:
produce this output:
The "opposite", namely returning boxes is comparably simple. A function with a return type of class Box (here a value type!), can return any C++ type:
The following sample and output combines the two functions. We repeat the nested call several times to get a random result:
In the samples of the previous sections, values have been boxed and the boxes then have been streamed into std::cout. The overloaded streaming operator <<, which accepts type Box, was provided with the inclusion of header alib/compatibility/std_strings_iostream.hpp.
This operator obviously is able to unbox values and print their contents to the stream.
Before we start unboxing values from boxes, we first need to demonstrate how the type of a box can be detected. The reason for this is simple: Unboxing a wrong type is forbidden and considered a severe error!
We cannot simply request a type from a box, because type information is nothing that C++ easily returns from a method. Instead, unfortunately, type detection is a game of guessing! For making a guess, templated method Box::IsType exists. This method has no arguments, but expects the type to "guess" as a template parameter. As the method's name suggests, the return value is boolean:
The output is:
Is the type boolean? True
Is the type double? False
Is the type boolean? False
Is the type double? True
For the time being, this is all we need to know to proceed with unboxing.
Like method IsType, introduced in the previous chapter (and like the constructor of class Box!), method Box::Unbox, which is used for unboxing a value, is a templated method.
The template type determines the type of value to be unboxed:
The output of this code snippet is:
This was rather simple! We boxed a double value and also unboxed one. So what happens if we unbox a different type? This code does this:
The bad news is: this code compiles well! This means, the error in the code is not detected by the compiler. Unfortunately, the malformed code is detected only at run-time. In debug-compilations of ALib, an assertion would be raised, with a message similar to
Cannot unbox type <long> from boxed type <double>.
Even worse, in release compilations of ALib, running such code results in "undefined behavior", which is the nice wording for "this software sucks and will probably crash very soon!".
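For comparison, the standard library's std::any reports the same mistake in a different way. The following is a minimal sketch in plain C++ 17 that does not use ALib at all; the helper names tryGetLong and castToLongThrows are made up for this illustration. The value form of std::any_cast throws std::bad_any_cast on a type mismatch, while the pointer form returns nullptr:

```cpp
#include <any>
#include <optional>

// Pointer form of std::any_cast: yields nullptr instead of throwing when
// the stored type is not exactly 'long'.
std::optional<long> tryGetLong(const std::any& value) {
    if (const long* stored = std::any_cast<long>(&value))
        return *stored;
    return std::nullopt;
}

// Value form of std::any_cast: throws std::bad_any_cast on a mismatch.
// This helper only reports whether the exception occurred.
bool castToLongThrows(const std::any& value) {
    try {
        (void) std::any_cast<long>(value);
        return false;
    } catch (const std::bad_any_cast&) {
        return true;
    }
}
```

With a std::any that holds a double, castToLongThrows returns true and tryGetLong returns an empty optional. Both are defined behavior, in debug and in release builds.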
std::any was included in the standard library with C++ 17 and ALib Boxing provides a little more.
The two recent code samples, one that rightfully unboxes a double and the other that asserts at run-time, do not make much sense. An obvious use case for ALib Boxing is given when the acts of boxing and unboxing are decoupled. So let's look at how type-safe unboxing is performed in a function that accepts a Box.
Function ProcessBox tests the given box for "known" types, unboxes values and displays them. For unknown types, a warning is written and false is returned:
These sample invocations:
produce the following output:
Using the "type guessing" method Box::IsType, introduced in the previous chapter, this code is back to being fully type-safe. Nothing can crash at run-time. Of course, code that invokes function ProcessBox needs to check the return value (again at run-time) and react properly if the box type was not "known" and false was returned.
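The same guess-then-unbox pattern can be sketched with std::any, using only the standard library. Function processAny below is a hypothetical counterpart of ProcessBox written for this comparison, not code taken from ALib. Note that std::any stores exactly the type that was given, so the test for int matches int only:

```cpp
#include <any>
#include <iostream>
#include <typeinfo>

// Hypothetical std::any counterpart of the tutorial's ProcessBox: tests the
// argument against a fixed list of known types and prints the value.
// Returns false (after printing a warning) for any other type.
bool processAny(const std::any& value) {
    if (value.type() == typeid(int)) {
        std::cout << "int value: " << std::any_cast<int>(value) << '\n';
        return true;
    }
    if (value.type() == typeid(double)) {
        std::cout << "double value: " << std::any_cast<double>(value) << '\n';
        return true;
    }
    std::cout << "Warning: unknown type given!\n";
    return false;
}
```

Because each std::any_cast is guarded by a preceding type test, it cannot throw here. A call such as processAny("Hello") takes the warning branch, because a string literal is stored as const char*.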
There are two drawbacks, a minor one and a really major one. The minor one is that in the case that many different known types are to be processed, the execution performance of ProcessBox may be weak. A first help would be to sort the guesses and put the more frequent types at the top. Using the much more performant switch statement is not possible, because type information is no constant data.
The possibly much worse drawback lies in the fixed set of types that a function can process if it is designed based on "guessing" as sampled here. While in a closed source unit this might not be a problem, imagine that function ProcessBox resides in an external class library, where it cannot be extended. In this case, the function cannot be used for custom types that are not known to the library.
For both problems, module ALib Boxing provides a solution, which is introduced in a later chapter 8. Box-Function Calls.
With std::any, type guessing works in a similar fashion, but a solution for the two drawbacks named is not offered by std::any.
The previous tutorial sections showcased boxing, unboxing and type guessing. We will see that for all three aspects, a lot more has to be said and showcased. While this chapter for this reason cannot go much into technical details yet, some important facts can be named and explained already.
Class Box provides templated methods Box::IsType and Box::Unbox to guess and unbox specific types of and from a box. The types in question are provided with the template parameter. Likewise, the constructor, which is also used by the copy-assign operator= of that class, uses templates. Otherwise, the straightforward assignment of any object to a box would not be possible.
Besides using templates for "generic programming", a programming paradigm called "C++ template meta programming" (aka TMP) exists. The distinction between both, or in other words the moment when extensive generic programming transitions to being TMP, can only be determined vaguely. Usually, C++ code should be called TMP at the moment structs found in header <type_traits>, like std::enable_if, std::is_pointer or std::is_base_of, are used. (Or those found in similar libraries, like boost.)
ALib Boxing makes quite a lot of use of "type traits" and hence the whole module can be easily considered as based on "template meta programming". To understand the library code, a solid knowledge of this paradigm is therefore needed. However, for using the library, fortunately it is not.
Class Box contains a data segment, aka an internal piece of memory, that can hold a certain amount of bytes to store values in. With each type given, one of a set of TMP constructors is activated, which copies the source object into this generic piece of computer memory.
With unboxing, according to the requested type the contrary operation is performed: the internal data stored in the box is re-interpreted back to the original type.
In most cases both actions result in a very simple (efficient) copy operation of a (probably) 64-bit value. While the code that is invoked may look longer and complicated and even function calls to other code entities may be made, TMP ensures that the compiler generates a very short and efficient assembly code for both, boxing and unboxing without function calls.
Anyone who has stepped through the code of std::vector in a debugger may wonder what is going on there and how this class can be so fast while the debugger shows plenty of invocations for even the simplest action. Most of these invocations seen in a debugger are 100% optimized out by the compiler. This is the same for a lot of code found in this module.
In addition to the boxed data, class Box stores type information. Otherwise, method Box::IsType could obviously not be implemented. In C++, type information is received with operator keyword typeid. Using standard function call syntax (round brackets), it takes a C++ type as an argument. Returned is a constant reference to struct std::type_info. The struct does not offer too much functionality; in fact, the only useful thing that can be done with it is to compare it to another reference received with another use of keyword typeid. This way, it can be determined if two types are the same or not.
With that, the type guessing can be performed: Consider a reference to struct type_info being stored by the TMP constructor of class Box along with the boxed value data. As mentioned, the TMP constructors are templated, so the type information is generated at compile-time.
Likewise templated method IsType compares the stored type with the type that its template parameter denotes at compile-time!
These mechanics explain why types can only be "guessed"!
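The mechanics described in this chapter can be condensed into a small sketch. Class MiniBox below is a hypothetical miniature written for this manual section and by no means ALib's actual implementation of class Box: it assumes an 8-byte data segment, accepts only small trivially copyable values, and stores a pointer to the std::type_info of the type given to its templated constructor.

```cpp
#include <cstring>
#include <type_traits>
#include <typeinfo>

// Hypothetical miniature "box" (not ALib's class Box): a small data segment
// plus a pointer to the std::type_info of the stored type.
class MiniBox {
    const std::type_info* type_;        // recorded by the templated constructor
    alignas(8) unsigned char data_[8];  // generic storage for the value

public:
    // The constructor is a template, so typeid(T) is fixed at compile-time.
    template <typename T>
    MiniBox(const T& value) : type_(&typeid(T)), data_{} {
        static_assert(std::is_trivially_copyable<T>::value && sizeof(T) <= sizeof(data_),
                      "This sketch only stores small, trivially copyable values.");
        std::memcpy(data_, &value, sizeof(T));
    }

    // "Guessing": compares the stored type_info with that of the template parameter.
    template <typename T>
    bool isType() const { return *type_ == typeid(T); }

    // Copies the stored bytes back into a T.
    // The caller must have checked isType<T>() before.
    template <typename T>
    T unbox() const {
        T value;
        std::memcpy(&value, data_, sizeof(T));
        return value;
    }
};
```

The sketch shows why a box cannot simply "return" its type: the stored std::type_info can only be compared against a type that the calling code names as a template parameter.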
The term bijective is used for describing the relationship of elements of two sets. A bijective relation means that each element of set A corresponds to exactly one element of set B and vice versa.
The two sets we are looking at in this case are the set of boxable types and the set of resulting types found in boxes created from the boxable types. This manual calls the latter set "boxed types" or "mapped types". Both terms mean the same.
In the case of C++ 17 type std::any, the relationship between these two sets is bijective, just as a programmer should expect! It is a simple, straightforward one-to-one relationship: The type you store in an std::any object is exactly the type that you can get back from it.
To investigate the type relationship of ALib Boxing, let us continue with an easy tutorial sample.
In previous chapter 2.3 Tutorial: Unboxing, the following simple function ProcessBox was introduced:
It was shown, that if invoked with a C++ string literal, a due warning about an unknown type was written.
Now, have a look at the following sample invocations:
You should be quite surprised about the following output:
While only two boxed (target) types are tested by function ProcessBox, namely alib::integer and double, a variety of six types can be passed to the function. Obviously, different signed integral types are all "mapped" to the same destination type and the two floating point types float and double are both mapped to type double.
Any programmer can easily see the benefit: with just two code blocks that perform "type guessing" all relevant boxable types can be processed. The term "relevant" can be very rightfully used: In the integral case even the C++ compiler itself would allow an automatic, inherent type conversion (cast) with assignments between the types in question. Not even with the toughest set of warning options, the compiler would complain.
Ok, in the floating point case, the compiler would warn like this:
implicit conversion increases floating-point precision: 'float' to 'double'
if no static_cast<double>() was applied to the float value. This is because the resulting double carries only the precision of the original float.
The float to double conversion can be suppressed.
Here, we quickly interrupt this tutorial and continue with manual documentation.
The relationship between C++ types and resulting mapped types is not injective. This means that two different C++ types may result in the same boxed type. For example, by default, all signed integral types (of different byte width) are boxed as the same type alib::integer, which is just an alias to the "biggest natural integral type" of the compilation platform. (In short, type alib::integer aliases std::ptrdiff_t).
Likewise, all unsigned integral types are boxed to type alib::uinteger, which is an alias to std::size_t.
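How such a many-to-one mapping can be expressed at compile-time may be sketched with standard type traits. The alias MappedType and the two helper variables below are assumptions made for this illustration and are not ALib's implementation. The sketch maps signed integrals to std::ptrdiff_t, unsigned integrals to std::size_t and float to double, and leaves bool, character types and all other types untouched:

```cpp
#include <cstddef>
#include <type_traits>

// True for the four plain character types.
template <typename T>
inline constexpr bool isCharacter =
    std::is_same_v<T, char>     || std::is_same_v<T, wchar_t> ||
    std::is_same_v<T, char16_t> || std::is_same_v<T, char32_t>;

// True for integrals that are aggregated in this sketch: no bool, no character
// types, and not wider than the platform's natural integer.
template <typename T>
inline constexpr bool isAggregatedIntegral =
    std::is_integral_v<T> && !std::is_same_v<T, bool> && !isCharacter<T> &&
    sizeof(T) <= sizeof(std::ptrdiff_t);

// Hypothetical compile-time mapping from a source type to its "boxed" type.
template <typename T>
using MappedType =
    std::conditional_t<isAggregatedIntegral<T> && std::is_signed_v<T>,   std::ptrdiff_t,
    std::conditional_t<isAggregatedIntegral<T> && std::is_unsigned_v<T>, std::size_t,
    std::conditional_t<std::is_same_v<T, float>,                         double,
                       T>>>;
```

With this alias, MappedType<short> and MappedType<int> denote the same type, which is exactly what "not injective" means: code that processes the mapped type only has one integral case to handle.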
This relationship of boxing C++ fundamental types is the built-in default behavior. As such, it can be modified. This leads us to a general, important statement:
The details of how boxing can be customized for a type can only be explained in a later chapter, when other prerequisites are made.
We have learned that ALib Boxing is not injective. The next question is whether it is at least surjective. If it were, all types that can be boxed could also be unboxed.
As a sample, the question is: Can type int16_t be unboxed, regardless of the fact that it is possible to unbox type integer from a boxed int16_t?
Again, a tutorial section will investigate this question.
We have seen so far that
- signed integral types of different sizes are all boxed to type alib::integer, and
- floating point types float and double are both boxed to type double.
The benefit from this is that only a reduced set of types has to be "guessed" when processing boxes.
Let us still try to unbox the original type:
This code does not even compile! In the compiler's output, the following error is found, hinting to the third line of the snippet:
static_assert failed due to requirement 'CustomBoxingRule7' Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.
This seems surprising in two ways. Not only can this type not be unboxed, but this is also not a run-time assertion: it is caused by a C++ static_assert, which is a compile-time message. As the message's text elaborates, it is just not possible to unbox type int16_t, no matter what was previously stored in the box. Furthermore we understand: This was explicitly forbidden, which means "voluntarily" in this case.
What we have here is a design decision of this ALib Module. Technically, it would be easy to allow unboxing that type. All that is needed is a static type cast, which by the way can be performed by the programmer easily herself if needed:
The point here is that with the standard use-cases of ALib Boxing, the width of an integral is seldom of any interest. It is just enough to know that an integral value was given, no matter what size it was. Now, to prevent programmers from accidentally starting to guess types that belong to a group of types that are "aggregated" to one destination type, the built-in customizations of these types explicitly forbid that.
Why "accidentally"? Well, with respect to previous sample function ProcessBox, testing for all sorts of integrals would be just redundant code. For the same reason, performing type guessing on a not-unboxable type is already illegal. This code:
produces the very same compilation error as the one above that tries to unbox the type.
To conclude this tutorial section, a further important observation has to be made. For this, let us look at the following code snippet:
This sample shows that ALib string-types can be unboxed from a box that previously got a std::string_view assigned and vice versa. Each original type can also be unboxed and type guessing for both types returns true.
The takeaway from this is: Just from the fact that a type B is unboxable from type A, it cannot be concluded that the original type A is not unboxable. While this is true for types int16_t and integer, this is not true for types std::string and ALib strings.
In the first sections of this chapter, it was explained that type mapping is not injective. This means that different source types can result in the same boxed type.
Now, with the latest tutorial section, it was demonstrated that some boxable types cannot be unboxed. For these types, this manual uses the term "not unboxable types" or "locked types".
Conceptually this means that ALib Boxing is also not surjective: Not all origin types are "found" in the destination type set.
A relation is bijective if it is both injective and surjective. Consequently it is not bijective if it is either not injective or not surjective. Unfortunately, no word exists for the condition "not injective and not surjective". Therefore, this manual uses "not bijective" and this is meant in the broadest sense.
A quick summary of what was said in this chapter should be given in bullets:
- While std::any is bijective, a huge difference of ALib Boxing is that its type relationship is not bijective; precisely, it is neither injective nor surjective.
Some rationales why non-bijective type mapping is even defaulted in the library:
The approach taken with non-bijective type mapping of course also has obvious disadvantages. First of all, type information is just lost: When detecting an integer type stored in a box, the processing code cannot perform different actions depending on the width of the given original integral type. The information on the size is just lost. Even worse, in the case of floating point values, the implicit conversion of values of type float to those of type double hides the fact that the value carries only the precision of a float.
So why does ALib Boxing accept these restrictions by default? Why is the benefit of just having to cope with a shorter set of target types weighted higher than the loss in precision?
This can be answered only by looking at the use cases of boxing. Remember that C++, until its version 17, did not even suggest doing something like boxing. Instead, the language is known for its type safety and its close binding to the underlying hardware, where the difference between int16_t and int32_t is considered a very huge one.
So the answer is rather that boxing is not used in those areas of a software that contributed to the overall decision to use C++ as the source language. Instead, the use cases are rather found where more relaxed demands are applicable, and these can be parts of the same software. Take for example a software that calculates tomorrow's weather forecast: A C++ software would be able to process billions of calculations, or at least feed corresponding dedicated "number crunchers" with the input data and process the result. For this task, the data should never be boxed and transported in a generic way. This is absolutely no use case for boxing! However, the very same software would also write a log file or display some messages on the console. Here, even the most valuable final results, namely tomorrow's average temperature and wind speed, may be intermediately converted to a boxed value: When unboxed, the first fractional digits of the floating point value will still be intact and precise enough to be displayed to a homo sapiens.
Consequently, a rather "convenient" formatting function is needed, as known from printf (which is not type-safe and therefore a "no-go") or from the standard libraries of various other programming languages. It can be noted that neither the complex syntax options of format strings introduced by the Python language (using braces "{}" as placeholders) nor those introduced by the Java language (using "%" as placeholder and extending the good old printf format) provide any means to distinguish 16- from 32-bit integrals. While the output can be altered in various ways, the originating type is just irrelevant.
This is important to understand: The use of ALib Boxing has to be justified. It is not just to be seen as a convenience library that enables easy, generic coding.
Sometimes however, as we will see in later chapters and also in the appendix chapters, ALib Boxing solves a real problem that arises from the nature of the C++ language, which otherwise can be solved only with std::any or by using bare keyword typeid directly. But even in these cases, non-bijective boxing remains the default.
So-called "Fundamental C++ types" are specified by the C++ language.
In short, those are all types that can be defined using a valid combination of the type keywords bool, int, float, double, char, wchar_t, char16_t and char32_t as well as modifier keywords signed, unsigned, short and long. Fixed-width names such as int16_t and int32_t are aliases of such types.
The following defaults are set if ALib is compiled on a 64-bit compiler/platform (precisely one where std::size_t has a width of 64 bits).
The subset of fundamental types exceeding a size of 64 bits is always boxed in a bijective way, which means in a one-to-one relationship. Those are:
- Integral types wider than 64 bits (as far as, for example, __uint128_t is concerned).
- long double (a floating point value usually larger than 64 bits)
Furthermore, character types (char, wchar_t, char16_t and char32_t) are always boxed bijective.
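All remaining fundamentals by default are boxed in a non-injective way. By that, they can be grouped into four different sets:
- Signed integral types of up to 64 bits will be boxed to alib::integer.
- Unsigned integral types of up to 64 bits will be boxed to alib::uinteger.
- Types float and double will be boxed to type double.
- Character types char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.
Only the destination type of each group is allowed to be guessed and unboxed.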
The following defaults are set if ALib is compiled on a 32-bit compiler/platform (precisely one where std::size_t has a width of 32 bits).
Integrals of a size of 64 bits are boxed in a bijective way, which means in a one to one relationship.
long double (if even available on a 32-bit platform), by default cannot be boxed at all!
All remaining fundamentals by default are boxed in a non-injective way. By that, they can be grouped into four different sets:
- Signed integral types of up to 32 bits will be boxed to alib::integer.
- Unsigned integral types of up to 32 bits will be boxed to alib::uinteger.
- Types float and double will be boxed to type double.
- Character types char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.
Only the destination type of each group is allowed to be guessed and unboxed.
In the previous two sections, a fourth group of aggregated types was named with character types. Note that the non-bijective boxing of character types was not shown in the tutorial. Destination type wchar is defined with dependency module ALib Characters, which sorts out a little of the "mess" a C++ programmer faces when dealing with characters. ALib Boxing leverages this module here for boxing plain character types. As we will see later, the benefits of module ALib Characters for boxing are even much greater.
Three ALib Compiler Symbols are available, which disable the custom boxing definitions.
The consequences of changing the defaults (enabling bijective behavior) should be obvious. For example, a processing code may now have to guess different integral types and it can and it has to unbox and process them separately.
Different code units that use a different setting with respect to one of the three compilation symbols must not be mixed. For example, a box created from type int16_t in a code unit that enabled bijective boxing on compilation cannot be processed by a code unit that uses the default non-bijective boxing. Remember that the processing code unit would receive a compile-time assertion if it tried to unbox the value.
Often, the use of the ALib Compiler Symbols can be avoided, by using the set of methods:
Module ALib Boxing has built-in support for boxing C++ arrays with one dimension. At the current stage of this manual, the rationale for this support cannot be easily justified and discussed. Only a bigger picture, looking at prominent use cases and the "side effects" that this feature enables, allows giving a complete answer to that.
Therefore, let us at this point rather describe what is available and provide more rationale in later sections.
Type guessing and unboxing for boxed array types slightly differ from those of scalar types. Method Box::IsType is not applicable to array types. The reason is simply that the C++ language does not allow specifying template types to be arrays of arbitrary size. The template parameter TBoxable of the method IsType might be int[3] or double[25], but cannot be just int[] or double[].
Therefore, alternative method Box::IsArrayOf is provided. For example, IsArrayOf<int>() may be used to guess a boxed array of int values.
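Why a dedicated array test only needs the element type can be illustrated with a small sketch. Class ArrayBox below is hypothetical and written in plain C++ for this purpose; it is not ALib's class Box and its method names are assumptions. Its constructor template deduces the array length N from the argument and records only the element type, a pointer to the first element and the length:

```cpp
#include <cstddef>
#include <typeinfo>

// Hypothetical array-holding box (not ALib's class Box).
class ArrayBox {
    const std::type_info* elementType_;  // type of one element, without extent
    const void*           first_;        // pointer to the first element
    std::size_t           length_;       // number of elements

public:
    // N is deduced from the argument, so one constructor template covers
    // int[3], int[25], double[2], and so on.
    template <typename T, std::size_t N>
    ArrayBox(const T (&array)[N])
        : elementType_(&typeid(T)), first_(array), length_(N) {}

    // Only the element type has to be named when guessing.
    template <typename T>
    bool isArrayOf() const { return *elementType_ == typeid(T); }

    std::size_t length() const { return length_; }

    // The caller must have checked isArrayOf<T>() and the index range before.
    template <typename T>
    const T& element(std::size_t index) const {
        return static_cast<const T*>(first_)[index];
    }
};
```

Because the extent is stored as a run-time value instead of being part of the compared type, a guess like isArrayOf<int>() succeeds for int arrays of any length.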
An array of type int16_t[10] will be boxed as an array of int16_t.
Further methods related to array boxing are:
- A method that returns true if the box holds an array of any type.
With that information, we can do two tutorial sections:
The following simple method prints the contents of boxed int and double arrays:
Some test invocations:
lead to the following result:
int[3]= { 1 2 3 }
double[2]= { 3.3 4.4 }
Unknown array element type: long
Not an array, but scalar type long
Note the two different ways of implementing the array-loop. For type int, each element is unboxed one by one, which avoids unboxing and locally storing the pointer to the array. However, for type double this effort is made. The element loop itself then runs directly on the array instead of the box.
In release-compilations, both alternatives should result in the very same object code and thus share the same run-time performance. In debug-compilations, the first version performs a type check as well as a bounds check of the given argument with respect to the boxed array's size for each element. Which alternative is to be preferred depends on the situation. In this simple case, we would choose the second alternative, because neither type nor range checks are necessary in debug-compilations. Maybe a matter of taste.
Further note the use of method Box::TypeID. In its default implementation, obviously this method returns the boxed type in case of scalars and the element type in case of vectors. This decision can be altered, by explicitly providing an otherwise defaulted and therefore in this sample not visible, template parameter. Please see the method's documentation for further information.
Finally, in debug-compilations, the result of method TypeID can just be streamed to std::cout. This is very convenient and possible due to some tricks of other ALib modules, which include the use of type alib::lang::DbgTypeDemangler. For technical reasons, type DbgTypeDemangler is only available in debug-compilations. Method TypeID itself is available also in release-compilations.
As already mentioned, ALib Boxing does not provide a similar solution for multi-dimensional arrays. When multi-dimensional arrays are boxed and unboxed, the sizes of the higher dimensions need to be known. The following quick sample demonstrates this:
Output:
Is int[][3]: 1
array[1][2]= 6
While the code above is feasible, multi-dimensional arrays are better boxed when wrapped in a custom type, e.g., one that stores the sizes explicitly and allows restoring them after unboxing. We do not consider this a huge drawback for this module in general, especially in respect to the use cases of ALib Boxing, which probably seldom include multi-dimensional arrays.
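What such a wrapping custom type might look like is sketched below. Struct IntMatrixView is a made-up example type for this manual section, not something ALib provides. It keeps both dimensions next to a pointer to flat, row-major storage, so that code receiving the wrapper does not need to know the sizes at compile-time:

```cpp
#include <cstddef>

// Hypothetical wrapper that stores both dimensions explicitly, next to a
// pointer to flat, row-major element storage.
struct IntMatrixView {
    const int*  data;     // first element of the row-major storage
    std::size_t rows;     // size of the first dimension
    std::size_t columns;  // size of the second dimension

    // Element access using the stored column count.
    // The caller is responsible for passing indices within range.
    int at(std::size_t row, std::size_t column) const {
        return data[row * columns + column];
    }
};
```

A view over the six values 1 to 6, created as IntMatrixView{ storage, 2, 3 }, addresses the same elements as a two-dimensional array with 2 rows and 3 columns. Whether and how such a type is boxed is a separate question that depends on the customization options discussed in later chapters.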
Type std::any does not provide any more support; even the size of one-dimensional arrays is not stored there.
The term "Vector Types" here means collection type std::vector<T, std::allocator<T>> and similar custom (3rd-party) types that store their elements in a single memory buffer.
ALib Boxing provides built-in support to customize the boxing of class std::vector.
With the customization, objects of the type std::vector are boxed to C++ arrays of the templated element type. The customization requests a vector's allocated memory (method std::vector::data) and stores this pointer besides its size.
The advantage of this approach is (as with any non-bijective type mapping) that the code that processes boxes needs to check for C++ arrays of a certain element type only. Separate checks for other vector types are not needed.
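The idea of mapping a vector to the same representation as a C++ array can be sketched without ALib. The type `ArrayBoxSketch` and function `sumInts` below are hypothetical stand-ins, not the library's implementation; the sketch assumes that a boxed array is described by element type, pointer to the first element and element count.

```cpp
#include <cassert>
#include <cstddef>
#include <typeindex>
#include <typeinfo>
#include <vector>

// Hypothetical stand-in for a box holding a one-dimensional array.
struct ArrayBoxSketch {
    std::type_index elementType;
    const void*     data;
    std::size_t     length;

    // Built-in C++ arrays: the length is taken from the array type.
    template<typename T, std::size_t N>
    ArrayBoxSketch(const T (&array)[N])
        : elementType(typeid(T)), data(array), length(N) {}

    // std::vector<T>: same representation; the capacity is not stored.
    template<typename T>
    ArrayBoxSketch(const std::vector<T>& vec)
        : elementType(typeid(T)), data(vec.data()), length(vec.size()) {}

    template<typename T>
    bool isArrayOf() const { return elementType == std::type_index(typeid(T)); }

    template<typename T>
    const T& at(std::size_t index) const {
        assert(isArrayOf<T>() && index < length);   // debug-only checks
        return static_cast<const T*>(data)[index];
    }
};

// One processing function serves arrays and vectors alike.
inline long sumInts(const ArrayBoxSketch& box) {
    if (!box.isArrayOf<int>()) return -1;   // not an int array
    long sum = 0;
    for (std::size_t i = 0; i < box.length; ++i) sum += box.at<int>(i);
    return sum;
}
```

Because both constructors produce the same representation, `sumInts` needs a single type check for `int` elements, whether it is given an `int[3]` or a `std::vector<int>`.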
How custom boxing is finally performed has not been discussed yet. For now, all that needs to be known is that the injective boxing of objects of type std::vector to one-dimensional array types can be enabled per compilation unit, by simply including header file alib/compatibility/std_boxing.hpp.
If so, we can feed method ProcessArray sampled above with objects of type vector:
The output will be:
int[3]= { 1 2 3 }
int[3]= { 4 5 6 }
The built-in customization does not allow unboxing type std::vector from boxed C++ arrays. Again, this is a design decision; technically, this would be possible. The rationale for this is that unboxing to std::vector would impose a memory allocation and a deep copy of the data.
Therefore, such unboxing should be performed only with very explicit code. With the inclusion of the compatibility header named above, a templated, inline function for this task is already provided. This is its simple source code:
With that, unboxing a std::vector from a boxed C++ array is done as sampled here:
Technically, C++ arrays are boxed by storing the pointer to the first element together with the array's length.
While we have not discussed the possibilities of customization of boxing for a certain type, yet, it can be said here that C++ array types constitute an exception: Their way of boxing is not customizable.
But what is possible, is to customize the boxing of other types (structs and classes) to result in the same boxed type as if a native C++ array was boxed. This was demonstrated in the previous tutorial section. Consequently such types might also well be unboxed from boxes created originally by C++ arrays.
The special treatment of one-dimensional arrays with ALib Boxing imposes advantages and disadvantages and hence is the result of a design decision.
The disadvantages are:
std::any is only two thirds of that, namely 16 bytes.
The advantages of the approach taken are:
std::vector and optional customizations of boxing similar container types (of any arbitrary 3rd-party libraries!) unify the boxed type to be a simple C++ array type. With such, the processing code can uniquely handle boxes from arbitrary "vector-like" sources.
It will be discussed in a later chapter, that array-boxing is especially helpful in the domain of string types: Arbitrary string types can be boxed as nothing else but simple one-dimensional character arrays. This way, this messy bunch of types, coming from tons of 3rd-party libraries, can all be aggregated to the very same type!
For one-dimensional character array types char[], wchar_t[], char16_t[] and char32_t[] the TMP constructor of class Box shortens the stored array length by one.
The rationale for this is that in most cases, boxed character arrays are string literals. String literals are zero-terminated arrays, hence the following line compiles:
char string[4]= "123";
If a length of three was given, compilation would fail.
With this exception in place, character strings are stored with the "right" size. The term is justified as soon as a programmer believes that zero-terminated strings are not nice. The zero-termination is "forgotten" at that moment. However, the benefit is that the length of the box represents the true length of the string given!
This feature cannot be disabled. On the one hand, custom boxing is not available for C++ character types (for technical reasons, as already mentioned). On the other hand, there is no preprocessor symbol introduced to disable this behavior, as we cannot think of a use case where this behavior would not be acceptable. If it was disabled, too many dependent features of various ALib Modules would stop working and would have to be disabled as well.
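The length rule for character arrays can be sketched with two overloads in plain C++. The function name `boxedLength` is hypothetical and only models the rule described above; it is not how class Box is implemented, and it covers type char only.

```cpp
#include <cstddef>

// Generic arrays keep their full length.
template<typename T, std::size_t N>
constexpr std::size_t boxedLength(const T (&)[N]) { return N; }

// Character arrays are assumed to be zero-terminated:
// the terminator is not counted. (Overloads for wchar_t, char16_t and
// char32_t would follow the same pattern.)
template<std::size_t N>
constexpr std::size_t boxedLength(const char (&)[N]) { return N - 1; }

static_assert(boxedLength("123") == 3, "terminator is dropped");
```

The second overload is more specialized and is therefore preferred for char arrays. Note that the rule applies to every char array, also to one that is not zero-terminated, which matches the statement that the behavior cannot be switched off.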
So far in this manual we have only been boxing fundamental C++ types and C++ arrays. The only exception we saw was class std::vector<T> in the previous chapter. There, it was only explained that it uses a customized boxing and nothing was said about how this works.
Still, customization is only explained in the next chapter, because for most custom composite types, ALib Boxing works very well "out of the box"!
The first thing we do is looking at a few samples.
Let's have a simple custom class:
With this in place, we can box, guess and unbox an object of that class:
The output of this code sequence will simply be:
While this was very easy and straightforward, here comes the pitfall! We define a "bigger" class:
With that, we use the same code as above:
Unfortunately, this code does not compile. The compiler complains twice, once with call IsType<BigClass> and also with Unbox<BigClass>. The error message is as follows:
static_assert failed due to requirement 'DefaultBoxingRule1' This type cannot be unboxed by value: By default, values that do not fit into boxes are boxed as pointers.
This tells us that instead of type BigClass, type BigClass* was boxed! We need to fix the code above in three places: Twice for providing the pointer type as the template parameter to methods IsType and Unbox, and once for changing operator. to operator-> when invoking method Get of the unboxed pointer:
Now the code compiles and runs fine. Its output is:
"Easy!" you could say, because the static_assert helps to create the clear compiler message seen above. Just make it a pointer type, and here we go! Unfortunately, there is a further pitfall related to switching to pointers, but this is discussed in a later chapter.
Before we elaborate more theory, let's quickly finalize this tutorial part with a last, probably astonishing thing. From the compiler error message that said "...values that do not fit into boxes are boxed as pointers" an attentive reader might get suspicious and wonder: Does this mean that the opposite is also true? Are "fitting objects" just always boxed as values, even if a pointer to a fitting type is boxed?
This code gives the answer, as it compiles and runs well:
Consequently, trying to unbox a pointer to class SmallClass, leads to compiler error:
static_assert failed due to requirement 'DefaultBoxingRule3' This type cannot be unboxed as pointer: Default boxing of types that fit into boxes and are copy constructible and trivially destructible, is performed by value.
By default, when a composite type (a struct or a class) is boxed, ALib Boxing checks whether a value of the given type "fits" into the data segment of class Box and from this decides if the type is boxed as value or as pointer. In both cases, the chosen type is used, no matter if a pointer to the type or a value is passed.
nullptr is boxed, the internal memory of the box (introduced in detail in the next chapter) is set to zero values.
A question now is: what types do fit in? The answer is quite simple. On a 64-bit platform, class Box is ready to store a pointer or any other 64-bit wide argument. In addition to that, due to the built-in ability of boxing one-dimensional C++ arrays, a second 64-bit value can be stored. With C++ array types, this member holds the array's length. With other types, it is available for free use.
As a result, two times 64 bits, hence 16 bytes, can be stored. If type BigClass from the previous tutorial section held only two values of type integer instead of three, it would fit in and would be boxed as value. Likewise, on a 32-bit platform, the usable value data of class Box is two times 32 bits, equalling 8 bytes.
A second constraint that makes boxing of a type default to pointers is when a type is either not copy-constructible or not trivially destructible. Coincidentally, a good sample for such a type is one of the C++ standard library that this module heavily uses: std::type_info. While on common platforms values of the type fit nicely into a box, the type is boxed as pointer type, because only references and pointers of it may exist. The TMP enabled constructors of class Box detect that and perform pointer boxing. Trying to unbox that type as value leads to the compiler error:
static_assert failed due to requirement 'DefaultBoxingRule2' This type cannot be unboxed by value: By default, types that are not copy constructible, or not trivially destructible are boxed as pointers.
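The default decision described in this section can be approximated with standard type traits. The following is only a sketch under stated assumptions: the trait `FitsIntoBoxSketch` and the sample structs are hypothetical, and the payload is assumed to be two pointer-sized words as explained above. It is not the library's actual implementation.

```cpp
#include <cstdint>
#include <type_traits>
#include <typeinfo>

// Hypothetical trait: true if a type would be boxed by value under the
// default rule (fits into two words, copy-constructible, trivially destructible).
template<typename T>
struct FitsIntoBoxSketch : std::integral_constant<bool,
       sizeof(T) <= 2 * sizeof(void*)
    && std::is_copy_constructible<T>::value
    && std::is_trivially_destructible<T>::value> {};

struct TwoWords      { std::intptr_t a, b; };      // fits exactly
struct ThreeWords    { std::intptr_t a, b, c; };   // one word too large
struct HasDestructor { int v; ~HasDestructor() {} };

static_assert( FitsIntoBoxSketch<TwoWords>::value,       "boxed by value");
static_assert(!FitsIntoBoxSketch<ThreeWords>::value,     "boxed as pointer");
static_assert(!FitsIntoBoxSketch<HasDestructor>::value,  "boxed as pointer");
static_assert(!FitsIntoBoxSketch<std::type_info>::value, "not copy-constructible");
```

The last assertion mirrors the std::type_info case from the text: the size would be acceptable on common platforms, but the type cannot be copied.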
This design aspect of ALib Boxing might be surprising. In fact it could be legitimately argued that this behavior is not along the design lines of C++. Consider that the following two lines of code:
Box box1= myValue;
Box box2= &myValue;
create two boxes with the very same contents! And: without knowing the size of the type, a reader cannot even tell if both times a pointer is boxed or if the objects are copied by value.
On the positive side of the two lines above is that a programmer does not need to care if she passes a value or a pointer, things will just be boxed to the right type. One of the answers why ALib Boxing is allowed here to trade "convenience" against pure C++ standards, is once more given from the limited set of scenarios where boxing should be used at all.
This and some other aspects should be discussed in the following few sections.
Value and pointer boxing and its transparent treatment, constitutes a next aspect of non-bijective boxing, that we have already discussed in depth in chapter 3. Non-Bijective Type Relationships. By default, all pairs of type T and T* are boxed to either one of the two, just depending on the size of the type. This approach effectively reduces the number of types that need to be guessed when processing boxes by half.
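That a pair of types T and T* ends up in one mapped type can be expressed as a small compile-time mapping. The names `FitsSketch` and `MappedTypeSketch` below are hypothetical, and the "fits" rule is the assumption described in the previous sections; the sketch is not ALib's implementation.

```cpp
#include <type_traits>

// Hypothetical "fits" rule, modelled on the default behavior described above.
template<typename T>
struct FitsSketch : std::integral_constant<bool,
       sizeof(T) <= 2 * sizeof(void*)
    && std::is_copy_constructible<T>::value
    && std::is_trivially_destructible<T>::value> {};

// Step 1: strip pointer and cv-qualifiers from the given type.
// Step 2: choose either T or T* as the single mapped type for both.
template<typename TGiven>
struct MappedTypeSketch {
    using Plain = typename std::remove_cv<
                  typename std::remove_pointer<TGiven>::type>::type;
    using type  = typename std::conditional<FitsSketch<Plain>::value,
                                            Plain, Plain*>::type;
};

struct SmallClass { int value; };
struct BigClass   { long long a, b, c; };

static_assert(std::is_same<MappedTypeSketch<SmallClass >::type, SmallClass>::value, "value stays value");
static_assert(std::is_same<MappedTypeSketch<SmallClass*>::type, SmallClass>::value, "pointer becomes value");
static_assert(std::is_same<MappedTypeSketch<BigClass   >::type, BigClass*>::value,  "value becomes pointer");
static_assert(std::is_same<MappedTypeSketch<BigClass*  >::type, BigClass*>::value,  "pointer stays pointer");
```

Processing code then has to test for one type per pair only, which is the halving of type checks mentioned above.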
In the next chapter we will see how boxing can be customized per type. This includes the option to redefine this automatic default treatment. Arbitrary combinations are possible:
Between the type mapping seen so far and this mapping of value and pointer types, two differences exist:
'&', or indirection operator '*'.
The latter might be important to understand: The conversion with operators '&' and '*' is done as the very first step. It could be said that in fact the complement type is boxed instead of boxing the given type itself.
In previous chapter 4. Boxing Fundamental Types, nothing was said about boxing pointers to fundamental types. But this was only to avoid confusion at that point in time! Instead, it was explained that the non-bijective boxing groups all fundamental types into four sets:
double.
char, wchar_t, char16_t and char32_t will be boxed to alib::wchar.
Now, pointers to all fundamental types are boxed like their value counterpart. As with structs and classes, the two boxes from the following sample:
int i= 42;
Box box1= i;
Box box2= &i;
receive the identical contents of type integer and value 42.
const char* or const alib::character*. These are considered zero-terminated strings and are boxed to C++ array types. A rationale for, and all details on this exception will be given in chapter 10. Boxing Character Strings.
A next non-bijective behavior of ALib Boxing is constituted by following boxing rules:
If T is a non-constant value type, then:
The same two rules can be phrased from the perspective of the boxed types as follows:
The rationales for this are:
This all means that the information about whether a type was constant or mutable is lost with boxing it. Only when processing code is "sure" that a boxed pointer points to a mutable object may it apply a const_cast on the result of method Unbox if it intends to perform modifications. Furthermore, for convenience, method Box::UnboxMutable is available, which just calls Unbox() and performs the cast to return a mutable result.
Finally, it is important to understand that although types that are boxed as pointers are always treated as constant pointers, this is never noted anywhere. For example, template parameters of methods Box::IsType and Box::Unbox expect a non-const type.
The rationale for this is: Because all pointer types are returned as constant pointers, a need to pass keyword const with pointer types would be redundant.
The following code snippet should make this clear:
While BigClass is unboxed as const BigClass*, the template parameter just says <BigClass*>.
A corresponding static assertion will fail if keyword const is used with type specifications.
For types that are boxed as values, type attribute volatile is removed from the copy.
Volatile objects of types that are boxed as pointers are not allowed to be boxed. If tried, compile-time assertion:
DefaultBoxingRule4 Types boxed as pointers cannot be boxed if volatile.
will be given.
Methods Box::IsType and Box::Unbox will statically assert if type specifier volatile was given with template parameter TUnboxable.
In the case of value boxing, performed for fundamental types and such composite types that "fit into" a box, all necessary data is copied into the box. Therefore, the life-cycle of the box instance is independent of the source value.
This is different when pointers are boxed. Here, the exact same rules as with using normal pointers apply: A pointer must be dereferenced only if the object it points to is still valid.
Now one could argue that this becomes a little "delicate" at the moment a programmer does not know if a type is boxed by pointer or by value. Maybe she would think that an object just fits and therefore delete the source type after boxing, which of course leads to undefined behavior if the type didn't fit!
The simple solution to this is: When an object of a composite type (struct or class) is boxed, the box should always be considered to have a life-cycle bound to the object, regardless of whether the value happens to fit into the box and is thus copied. A programmer should simply accept the small chance that this caution turns out to be unnecessary.
However, some thought always has to be given. For example, reconsider how class std::vector<T> is boxed to a C++ array, as demonstrated in 5.4 Boxing Vector Types. Well, while this is not pointer boxing, still a pointer to the first array element is stored. Now a user of the standard C++ library knows that class std::vector<T> allocates dynamic memory for storing the values. This memory is deleted with the destruction of the vector. Hence, the life-cycle of the box is bound to its source object.
But it is even worse: During the life-cycle of the box, the vector must also not be modified! Appending a new element might or might not lead to a re-allocation of the internal array. Consequently, a certain level of care has to be taken when passing boxes around to different code entities.
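The vector pitfall can be reproduced with a plain pointer-plus-length pair, which is what a boxed array boils down to according to the description above. The struct `ArrayViewSketch` and the helper functions are hypothetical and only illustrate the life-cycle rule; they are not part of ALib.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a boxed array: pointer to first element plus length.
struct ArrayViewSketch {
    const int*  data;
    std::size_t length;
};

inline ArrayViewSketch viewOf(const std::vector<int>& vec) {
    return ArrayViewSketch{ vec.data(), vec.size() };
}

// True only while the view still describes the vector's current contents.
// The length is compared first, so a possibly stale pointer is not
// inspected when the sizes already differ.
inline bool stillDescribes(const ArrayViewSketch& view, const std::vector<int>& vec) {
    return view.length == vec.size() && view.data == vec.data();
}

inline int elementOf(const ArrayViewSketch& view, std::size_t index) {
    assert(index < view.length);
    return view.data[index];
}
```

After any `push_back` on the source vector, `stillDescribes` reports `false`: at least the length is outdated, and if a re-allocation took place, the stored pointer must not be dereferenced anymore.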
Once more, the good news about the pitfalls of life-cycle management lies in the limitations of typical use cases of ALib Boxing. In most cases, boxes are not even actively created by a software. Instead, they are implicitly created when generic functions accept arguments of type const Box&. In this most frequent case, after the function returns, the current thread's stack frame is unwound, and the boxed argument objects are disposed!
A sample for this is given in appendix chapter C1. of this manual.
Should the processing function want to store some data that it received from a box argument "for later use", then such function itself should be responsible for creating copies of such boxed data that might not be available after the function returns. The function can quite easily perform this, as it anyhow has knowledge about how to interpret the different boxed types and their contents.
A sample for this is implemented with ALib Expressions, which is discussed in more detail in chapter C.4 Use Case: Module ALib Expressions of this manual.
With ALib Expressions, class Box is also used as a return value of functions. While this is a rarer case, it is absolutely legitimate and necessary to do so in that module. The constraint applied here is that the functions that return a box are responsible for ensuring that the contents are valid during a certain "scope" of the execution of the software. This scope is individual per library and in case of ALib Expressions it is well documented in the according Programmer's Manual.
Finally, if the contents of boxes need to survive their originating object's deletion, then a next option to achieve this, is given in chapter 12.6 Life-Cycle Considerations.
It was already pointed out in chapter 3. Non-Bijective Type Relationships that C++ 17 type std::any does not offer non-bijective boxing. Value type T is boxed as value and type T* is boxed as pointer. Consequently, a processing function implemented with std::any always has to check both types, if it wants to support both.
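The need to check both alternatives with std::any can be shown with standard C++17 only. The function name `readInt` is hypothetical; the calls to `std::any_cast` and `std::any::has_value` are the regular standard library interface.

```cpp
#include <any>
#include <cassert>
#include <typeinfo>

// Reads an int from a std::any that holds either an int or an int*.
// Both alternatives have to be tested explicitly, because std::any keeps
// int and int* apart as unrelated types.
inline int readInt(const std::any& value) {
    assert(value.has_value());
    if (const int* byValue = std::any_cast<int>(&value))
        return *byValue;
    if (int* const* byPointer = std::any_cast<int*>(&value))
        return **byPointer;
    return 0;   // neither int nor int*
}
```

With `int i = 42;`, the objects `std::any(i)` and `std::any(&i)` report different types through `type()`, and `readInt` has to handle each of them in a separate branch.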
In the case of storing pointers with std::any, the same care about life-cycle management is needed as with using ALib Boxing.
In the case of values, things can become quite inefficient. As type any does not "automatically" switch to a pointer type, the copy constructor of objects provided as value is invoked. For the storage of the copied object a heap allocation is typically performed, unless the implementation applies a small-object optimization. Note that many developers underestimate the execution costs of allocating dynamic memory.
Furthermore, the copy constructors of many types perform a "deep copy". For example, in the case of std::string this means string data is copied. Besides the effort for copying the string data itself, a second heap allocation has to be performed for the internal string buffer.
From the other perspective: while std::any allows storing values of "any" size, class Box does not. Even when boxing is customized, the conversion from the source object to the boxed data must not perform heap allocations (and store a pointer to them). ALib Boxing simply does not perform any object destruction or deletion.
In this sample:
MyClass myClass1;
MyClass myClass2;
Box box= myClass1;
box= myClass2;
the boxing performed with the second assignment in the last line simply overwrites what was previously boxed, regardless of what the previous contents were. The benefit of this is that boxing is extremely fast and efficient. Often, the compiler optimizes the assignment to a box to just directly writing the three integer-sized words.
Once more, the rationale behind this design is found in the use cases of ALib Boxing, which do not need anything else and heavily benefit from this behavior.
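Why overwriting a box is so cheap can be sketched with a plain struct of three words. The struct `BoxSketch` and the helper functions are hypothetical; the layout (type information plus two data words) is an assumption derived from the description in this manual, not the definition of class Box.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <typeinfo>

// Hypothetical three-word box: type information plus two data words.
struct BoxSketch {
    const std::type_info* type;      // run-time type of the mapped type
    std::uintptr_t        data[2];   // value, or pointer plus array length
};

// No destructor and no copy logic: assignment is a plain copy of the words.
static_assert(std::is_trivially_copyable<BoxSketch>::value,     "plain copy");
static_assert(std::is_trivially_destructible<BoxSketch>::value, "nothing to release");

inline BoxSketch boxWord(std::uintptr_t value) {
    return BoxSketch{ &typeid(std::uintptr_t), { value, 0 } };
}

inline std::uintptr_t firstWord(const BoxSketch& box) {
    assert(box.type != nullptr);
    return box.data[0];
}
```

Since the struct is trivially copyable and trivially destructible, assigning a new value never has to release anything that was stored before.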
To conclude this section, let's imagine two functions, one accepting a variadic list of std::any objects, the other a variadic list of boxes. While to the latter just any variable can be passed "as is" because of the automatic choice of the right type, with the std::any implementation each parameter has to be checked by the programmer to apply the right one of operators '&' or '*', leading to efficient and wanted behavior: copy or not!
In previous chapters it was mentioned already several times that ALib Boxing can be customized per source type. From this, a good indication of what is customizable was already given. At this point in the manual, it is a good time for explaining the customization in detail.
The following customizations can be performed for a type:
1. Type Mapping
Customization allows mapping a source type (aka "boxable type") to a specific target type (aka "mapped type"). For example, the built-in customization (which can be deactivated) maps all common signed integral types to the same destination type integer, unless they are bigger than the latter.
2. Type Conversion Mechanics
Depending on the customization performed, specific code for type conversion for both, boxing and unboxing may be provided which replaces the built-in default mechanics.
3. Manipulation Of Automatic Value-/Vector Boxing
By default, ALib Boxing does not distinguish between boxing a value type T or its counterpart T*. The joint (same) mapped type of both is either one of them, depending on a value's physical size and whether a type is copy-constructible and trivially destructible.
This default behavior can be altered in arbitrary ways.
4. Disallowing Unboxing
If a type is mapped to a different target type, it might still be unboxable from this target type. Sometimes, to forbid unboxing can be just a voluntary design decision. In other cases, unboxing the original type might technically just not be feasible.
(A sample for both options had been given in 3.4 Tutorial: Unboxing Non-Injective Types. A further sample was already explained in 5.4 Boxing Vector Types)
5. Disallowing Boxing
Finally, boxing may also be completely forbidden for a type. With that, any assignment to an object of type Box fails compilation. Forbidding boxing, by the same token disallows unboxing.
The good news is, that the defaults of ALib Boxing work well with most types. The most frequent use case for customization, is to perform non-bijective type boxing, to reduce the effort of processing boxes or to generalize a type to a common mapped type to enable the processing of otherwise unknown (source) types.
With type mapping, two scenarios may occur:
int8_t or int16_t to type integer. As the latter is larger than the source types, all information contained in the source remains, except for the original type information.
std::vector<T>. This type holds a pointer to its buffer as well as the length of the stored array. In addition, also the length of the allocated buffer is stored. This is equal to or greater than the array length. With the optional built-in boxing, this information is not stored. Instead, the type is boxed as a C++ array type, hence only the pointer to the buffer and its fill-length survive.
As noticed in chapter 2.4.1 Templated Approach, this ALib Module uses template meta programming (TMP) for boxing, type guessing and unboxing. With this paradigm, so called "type traits" are frequently used. Simply put, type traits enable the compiler to choose different code when compiling templated methods or functions.
Typically, type traits are implemented by a templated struct. The non-specialized definition of the struct sets the defaults, by adding default types, functions etc. Then, specializations of the struct for specific types can be given, from library internal or external code. The C++ language allows virtually arbitrary changes to the original struct when specialized, including even changing the type's inheritance relationship, changing the signature of methods, leaving out entities and adding new ones. However, with TMP, the documentation of traits structs tells programmers which properties the specialized struct needs to provide.
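The general type-traits pattern described here can be sketched in a few lines. The struct `BoxerSketch` and its members `MappedType` and `Unboxable` are hypothetical and deliberately simplified; they are not the members that a real customization struct of ALib has to provide.

```cpp
#include <cstdint>
#include <type_traits>

// Non-specialized definition: sets the defaults.
// By default a type maps to itself and may be unboxed.
template<typename T>
struct BoxerSketch {
    using MappedType = T;
    static constexpr bool Unboxable = true;
};

// Specialization for one source type: maps std::int16_t to a wider
// integral type and forbids unboxing the original type.
template<>
struct BoxerSketch<std::int16_t> {
    using MappedType = long long;
    static constexpr bool Unboxable = false;
};

static_assert(std::is_same<BoxerSketch<int>::MappedType, int>::value,                "default");
static_assert(std::is_same<BoxerSketch<std::int16_t>::MappedType, long long>::value, "customized");
static_assert(!BoxerSketch<std::int16_t>::Unboxable,                                 "unboxing disabled");
```

Templated code that consults `BoxerSketch<T>` picks up the specialization automatically, without any change to the code that uses it.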
This design pattern of using type traits is also leveraged with ALib Boxing. The type traits struct that is to be specialized for customization is given with alib::boxing::T_Boxer.
Type traits struct T_Boxer is well documented and should be referred to for all details. The specialization of the struct can optionally be performed using helper macro ALIB_BOXING_CUSTOMIZE and its siblings.
Instead of repeating what is said in the reference documentation of the struct and macro, this manual rather gives various real life samples along the lines of the important use cases.
The mapping of type int16_t to integer was already used as a sample in various parts of this documentation. Let's now look at how this is done with the built-in customization of type int16_t. This piece of code does the job:
This is what is done:
int16_t. The type that a specialization is made for always denotes the C++ source type that is supposed to be boxed differently.
void, has to write a representation of the given object into the placeholder of the box given as an argument. Such writing has to be compatible with how the target type would write its value into the placeholder.
int16_t or integer.
void, instead of the source type int16_t. Declaring Read to return void disables unboxing! And well, as it is disabled, no implementation of the function needs to be given.
Note that with non-bijective type mapping, all boxable types (source types) that are mapped to the same destination type have to "agree" to write the data in the same format. It should be easy to understand that doing otherwise results in undefined behavior. The format that default boxing, as well as built-in customized boxing use, is documented with union Placeholder. In a later chapter, more information on this class is given.
Instead of providing all the code "manually", we could also pick and use one out of a set of provided macros:
ALIB_BOXING_CUSTOMIZE_NOT_UNBOXABLE_CONSTEXPR( int16_t, integer )
Technically, the differences are:
static_cast to the given destination type.
void.
The principal differences when using the macros are:
In chapter 5.4 Boxing Vector Types, boxing of std::vector<T> was demonstrated. It was said that by including header file alib/compatibility/std_boxing.hpp a default customization was given. In comparison to the sample of the previous tutorial section, there is one small challenge here: The type is templated. The goal is now to define custom boxing for type std::vector<T> - of any element type T.
The C++ syntax supports templated specializations in a straightforward way. In chapter 10. Boxing Character Strings it will be shown that std::vector<T> is to be customized differently if T is a character type. Therefore, those have to be excluded from the templated specialization.
Here is the code for specializing the struct for all types but character types, taken from the header file named above:
The - otherwise unused - second template parameter TEnableIf of T_Boxer is invalidated for character types, which will omit those in the specialization.
For all other types, this specialization uses helper-type TMappedToArrayOf to wrap the destination type. This denotes that the type should be boxed to a C++ array type. Remember that C++ array types of arbitrary size cannot be defined with a (non-templated) type definition. This is just not possible by the language.
The boxing method Write is so simple that its definition should not need any further explanation. Finally, like in the sample shown in 7.4 Tutorial: Mapping Type 'int16_t' to Type 'integer', method Read is declared to return void, which disables any unboxing of class std::vector<T>. If code still tried to unbox one, the compiler would complain with something like this:
static_assert failed due to requirement 'CustomBoxingRule7' Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.
For templated specializations as shown here, no helper macro exists.
In the previous sections, including the tutorial parts, we have only seen how value types T continued to be boxed as T, or, sampled with class std::vector<T>, how a type that would by default be boxed as T* is customized to always be boxed as a different value type, in this case a C++ array.
There are two further cases possible:
Both variants are explained now.
Boxing both, T and T* as pointer T*:
Should - for whatever reason - a fitting (small) and copy-constructible and trivially destructible type be boxed as a pointer, a customization for the pointer type has to be given. The mapped type then is the same pointer type as the source type.
For example, if we wanted to have class SmallClass from a previous tutorial sample, to always be boxed as pointers, the customization would look like this:
ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass*, SmallClass* )
As the default boxing and unboxing mechanics work well with pointer types, we can simply use macro ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING for this.
Of course, a different target type could be specified likewise. The important point here is, that if a specialization for T* is given and none for T, this customization is used for mapping both T and T* to T*.
Boxing types T and T* differently:
The last case revokes the non-bijective default behavior of boxing complement types T and T*. Instead, a one to one mapping is enabled.
All that is needed for this is to specify just both customizations:
ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass , SmallClass ) ALIB_BOXING_CUSTOMIZE_TYPE_MAPPING( SmallClass*, SmallClass* )
Again, different mapped type and custom Write/Read methods may be given, if other macros were used.
For this variant, valid use cases exist - although again, no ALib Module uses that internally. As a sample, let us stick to type std::vector<T>. We learned that with the inclusion of header alib/compatibility/std_boxing.hpp, values of and pointers to the type are boxed as C++ array type.
With this custom boxing, internal information of the vector object is lost (the capacity). A processing function can only access the currently stored elements, but the vector cannot be unboxed to be modified. If unboxing a pointer was allowed, the unboxed vector could be modified (which of course would modify the original object).
This could rightfully be wanted behavior, and looking at C++ 17 type std::any tells us that, with its lack of non-bijective type mapping, this is even the only possible behavior there.
Because for a templated specialization, none of the helper macros can be used, the following templated specialization of type-traits struct T_Boxer has to be given:
Still header alib/compatibility/std_boxing.hpp is to be included, as we want to keep mapping of value types to C++ arrays intact.
Without the customization shown above, the following code would not compile:
The compiler would complain in line 4:
static_assert failed due to requirement 'CustomBoxingRule1' This pointer type T* cannot be unboxed, because custom boxing is defined for value type T, while no custom boxing is defined for pointer type T*.
When patiently reading further, a next compiler error tells us:
static_assert failed due to requirement 'CustomBoxingRule9' Customized boxing forbids unboxing value type T ('T_Boxer<T>::Read' returns a different type), while no customization for this pointer type T* was given.
With the additional customization, the code compiles fine and the output is:
This manual cannot go into all the details of TMP. Therefore, this tutorial section only gives an example and an indication of what is possible.
We had seen, that specializing type traits struct T_Boxer for a single type has the following syntax:
template<>
struct T_Boxer<MyType>
{
    using Mapping= TMappedTo<MyTargetType>;
    ...
};
To do the same in a templated fashion for a generic type, we used:
template<typename T>
struct T_Boxer< MyGeneric<T> >
{
    using Mapping= TMappedTo<MyTargetType>;
    ...
};
This maps a whole set of types to the same target type. But how about other sets of types? Sets that are not defined by generics? For example, an obvious question is: how can a type and all its derived types be customized at once?
All that is needed to achieve this is a little template meta programming. To prepare that, type traits struct T_Boxer is equipped with a second template parameter of typename type. The reason why we have not noticed this parameter yet (also in the code samples it is not visible) is that it is defaulted to be "void". Its identifier name is TEnableIf. The type is not referred to, neither within the struct itself nor anywhere else in the code.
A sample should demonstrate how this can be used. Consider the following two types:
We do a TMP enabled customization for type MyBase and all derived types:
The following sample proves that we achieved what we wanted, because it successfully compiles and when running, it does not produce a run-time assertion about unboxing wrong types:
The output is:
Finally, if we tried to unbox the derived type:
the following compiler error would be given:
static_assert failed due to requirement 'CustomBoxingRule7' Customized boxing forbids unboxing this value type: 'T_Boxer<T>::Read' returns a different type.
We have seen in the previous chapters that, even when boxing is customized, such customization can often conveniently use the simple default implementations of methods Placeholder::Write and Placeholder::Read.
This is due to the fact that the methods are implemented by a set of overloaded and TMP enabled methods, that go along well with fundamental and fitting value types.
Besides using the interface methods documented with union type Placeholder, it is also possible to directly access its different members and this way write and read whatever is needed for a certain use case.
There might be situations, where an exception to the bijective, simplifying nature of ALib Boxing is needed. For example, as it was explained already, if a set of custom types are all boxed to the same more fundamental mapped type, then of course information is lost. Because customization of boxing has to be the same throughout all compilation units, a decision for such "reduced" boxing of a type is a global decision.
But there is an easy way to "bypass" custom boxing: All that is needed is to "wrap" an object into another one and box the wrapper type. A convenient wrapper type that can well be used is found in the C++ standard library with std::reference_wrapper. This templated class is very simple and stores a reference to the object given in its constructor.
Of course, a wrapped object of type T has to be guessed and unboxed as type std::reference_wrapper<T>, and furthermore different life-cycle restrictions might apply in contrast to using the customized boxing (according to the standard C++ mechanics and rules).
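The wrapping idea itself can be tried out with the standard library alone. In the following sketch, std::any merely stands in for a box (it is not ALib's Box), and type MyType as well as the helper functions are assumptions made for illustration.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <typeinfo>

struct MyType { int value; };

// std::any stands in for a box here. It is NOT ALib's Box.
inline bool HoldsWrapper(const std::any& box)
{
    return box.type() == typeid(std::reference_wrapper<MyType>);
}

inline int ReadThroughWrapper(const std::any& box)
{
    // The wrapper type has to be named explicitly when reading the value back.
    return std::any_cast<std::reference_wrapper<MyType>>(box).get().value;
}

inline void DemoWrapper()
{
    MyType original{42};
    std::any byValue = original;            // holds a copy of MyType
    std::any wrapped = std::ref(original);  // holds std::reference_wrapper<MyType>

    assert(!HoldsWrapper(byValue));
    assert( HoldsWrapper(wrapped));

    original.value = 43;                    // the wrapper refers to the original object
    assert(ReadThroughWrapper(wrapped) == 43);
    assert(std::any_cast<MyType>(byValue).value == 42);
}
```

The sketch also shows the life-cycle restriction: the wrapped container is only valid as long as the object named original is alive.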
The following sample demonstrates the technique with two types. One of them is fundamental type float, for which std::reference_wrapper cannot be used. Therefore a quick custom struct WrappedFloat is given. By default, fundamental type float is converted to double when boxed.
The wrapper types consequently are:
With that in place, custom boxing can either be used or bypassed:
With method process defined like this:
The output of the code above will be:
In the previous sections of this chapter, most details of "custom boxing" were explained. Technically, custom boxing allows modifying the type mapping as well as the way object data is boxed and unboxed.
While this chapter was quite lengthy, and while template meta programming and the creation of specializations of type traits structs may be "dubious topics" to less experienced C++ programmers, a user of this ALib Module should not fear too many troubles in respect to custom boxing.
This is true, because:
In consideration of this effort, the benefits are huge. The main goal that custom boxing achieves is to further shrink the set of mapped types. Using ALib without any modification, this set is already reduced dramatically:
std::vector<T> and similar 3rd-party types are boxed to corresponding C++ arrays.
Reducing the types does not only mean that the effort of guessing types when processing boxes is reduced. Often, a custom type is mapped to a common, "already known" type. In this case a processing function does not even need to be changed. This is very helpful if a programmer just can't change such a function, for example because it resides in a library or because a co-worker is responsible for it.
We will see later in chapter 10. Boxing Character Strings that ALib Boxing by default maps arbitrary (3rd-party) string types to simple C++ character arrays. This way, a function that processes boxed character arrays is generically able to digest any 3rd-party string without the need for adaptation.
Of course, there are limits in achieving generic processing of arbitrary boxed types by just mapping the types. While strings are a great sample, often it is not an option to just map a type to something else, maybe because in other places more of the original type's data is needed and boxing it as a pointer to the original is mandatory.
To overcome these limitations, let's quickly move on to the next chapter of this manual!
You, the reader of this manual, probably know all details of C++ and virtual functions. The first section of this chapter provides a brief recap of some basic knowledge on that matter. You are free to skip it!
People in a hurry might also want to skip section 8.2 Function Declarations, Implementations and Registrations and instead jump right into the code with 8.3 Tutorial: Implementing A 'ToString()' For Boxes.
In previous chapters, it was explained that mapping several C++ types to the very same boxed type, does not only reduce the efforts of processing boxes. It further allows processing boxes created from "unknown" types that are mapped to a known boxed type. While this is well feasible for some types, for others type mapping may not be an option, when too much information gets lost.
To resolve the general problem, object oriented languages offer "virtual functions": Instead of performing the task themselves, the processing code calls a type-specific, "virtual" function on a given argument. This way, the responsibility is passed back to the object that is processed.
But how is this technically solved? How does the processing function know the address of the function that is to be called, when it is a different function for each object type?
C++ uses run-time type information for that. While non-virtual class methods are statically linked at compile-time (respectively at link time), the address of a virtual function call is only evaluated at run-time. As soon as a first virtual function is declared with a class (or one of its base classes), the compiler adds a pointer to a virtual function table (aka "vtable") to each new instance of that type. Such types are called "polymorphic types" or just "virtual types".
Adding this vtable pointer increases the footprint of virtual C++ types by the size of one pointer. Together with the loss of run-time performance, this increase of object size is the general disadvantage of virtual classes. It is technically just not avoidable: If a processing function should be able to call variants of methods tailored to types that it does not "know" at compile-time, then the memory addresses of these methods have to be passed together with the argument object.
Virtual functions are just one out of two purposes for having a vtable in C++. Its second use is with C++ keyword dynamic_cast<T>. While a static_cast is performed by the compiler, a dynamic_cast<T> is performed at run-time by special code inserted by the compiler. This code performs a type-check using the vtable. On failure, a dynamic_cast to a pointer type returns nullptr.
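Both uses of the vtable can be observed with a few lines of standard C++. The types Shape, Triangle and Square below are made up for this recap and have no relation to ALib.

```cpp
#include <cassert>

struct Shape
{
    virtual ~Shape() = default;
    virtual int Corners() const { return 0; }
};
struct Triangle : Shape { int Corners() const override { return 3; } };
struct Square   : Shape { int Corners() const override { return 4; } };

// Use 1: the call is dispatched at run-time through the vtable of the
// object that 'shape' refers to.
inline int CountCorners(const Shape& shape) { return shape.Corners(); }

// Use 2: run-time type check. The result is nullptr unless 'shape'
// really points to a Square (or a type derived from Square).
inline const Square* AsSquare(const Shape* shape)
{
    return dynamic_cast<const Square*>(shape);
}

inline void DemoVirtual()
{
    Triangle triangle;
    Square   square;
    assert(CountCorners(triangle) == 3);
    assert(CountCorners(square)   == 4);
    assert(AsSquare(&triangle) == nullptr);
    assert(AsSquare(&square)   == &square);
}
```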
We learned in chapter 2.4 How The Basics Work that ALib Boxing stores run-time type-information along with the boxed data. You could rightfully say that the disadvantage of needing a vtable with instances of virtual C++ classes is of the very same nature as the need to store type-information with boxes. While C++ uses this pointer for type-checks and virtual function calls, so far we have seen ALib Boxing using it for type checks only.
Well, here is the good news: Also ALib Boxing supports virtual function calls!
In the sense of the C++ language, any function that is invokable on an instance of type Box is "virtual", because run-time type information is used to determine the right version of the function for a box containing a certain mapped type. However, from the perspective of ALib Boxing, there is nothing like a "static" or "link-time" function. As a consequence, this manual of module ALib Boxing does not talk about "virtual functions" but just "functions" or "box-functions".
This section explains the three steps to define box-functions.
Type-safeness is a mandatory feature of any C++ software. ALib Boxing is type-safe software, although, for technical reasons, some heavy use of keyword reinterpret_cast is made when boxing and unboxing values. While the type-safeness is lost at compile-time, it is regained at run-time with the use of the templated interface methods. For example, if T in a call to Box::Unbox<T> does not match the boxed type, a run-time assertion is raised. This can be prevented using Box::IsType<T>, which never asserts.
With box-functions, the situation is similar: For technical reasons, the vtable of a box stores the address of invokable functions as a void*. However, when the function is invoked, a template parameter used with the invocation ensures that the signature of the function stored matches the function parameters given.
We call the template parameter types used with function invocations "FunctionDescriptors". Such a FunctionDescriptor is just a struct with a single type definition.
Here is a sample:
Besides the requirement that the type definition in the struct is named "Signature" and that it denotes a function pointer, only two further conditions need to be met:
The first parameter has to be a reference to the box that the function is invoked on, for example const Box&. We will learn in a later section about the difference between invoking constant and non-constant box-functions.
The return type of box-functions must be default-constructible:
When invoking a function on a box, the result of that invocation is returned. As it might happen that a function is not defined for a specific mapped type, a default value is then created and returned instead.
In the case that a function should return a type which is not default-constructible, the approach is to declare the function void and instead add an output parameter, for example a pointer to a pre-constructed object, or a pointer to a pointer if the object should be dynamically allocated by the function.
The second ingredient needed are function implementations - one for each mapped type that is to be supported. Implementations can be defined globally or within a namespace. Furthermore, static member functions are likewise accepted.
However, it is always a good idea to place box-functions in an anonymous namespace of a compilation unit (aka non-header file). With that, they are hidden from the C++ linker and do not clutter a compilation unit's linker information.
It is possible to do so, because the functions are not called using the linker or C++ virtual tables. Instead, ALib Boxing uses the C++ call operator() directly on their address stored in the vtable of the box.
The final step is to associate the function implementation with boxes of a specific mapped type. This is done with templated namespace function alib::boxing::BootstrapRegister.
The function uses two template parameters that have to be explicitly specified:
We had seen in chapter 7.3 Type Traits Struct T_Boxer how to denote a mapped type with field T_Boxer::Mapping: The C++ type has to be wrapped in either TMappedTo<T> or TMappedToArrayOf<T>. The same notation is used here.
Finally, the address of the box-function is to be passed to BootstrapRegister as a normal argument.
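The interplay of the three steps can be imitated with standard C++ only. The following sketch is not ALib code and does not reproduce the signatures of BootstrapRegister or Box::Call: MiniBox (an alias of std::any), descriptor FDescribe, Registry, Register and Call are all hypothetical names chosen for this illustration.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

using MiniBox = std::any;   // stand-in for a box, NOT ALib's Box

// Step 1: a function descriptor is a struct with a single type definition.
struct FDescribe
{
    using Signature = std::string (*)(const MiniBox& self, const std::string& prefix);
};

// Helper that extracts the return type from a function pointer type.
template<typename T> struct ReturnOf;
template<typename R, typename... A> struct ReturnOf<R (*)(A...)> { using type = R; };

// Type-erased table: (descriptor type, boxed type) -> generic function pointer.
using Key         = std::pair<std::type_index, std::type_index>;
using AnyFunction = void (*)();

inline std::map<Key, AnyFunction>& Registry()
{
    static std::map<Key, AnyFunction> registry;
    return registry;
}

template<typename TDescriptor>
Key MakeKey(const std::type_info& boxed)
{
    return Key(std::type_index(typeid(TDescriptor)), std::type_index(boxed));
}

// Step 3: registration. The descriptor's Signature type-checks the pointer given.
template<typename TDescriptor, typename TBoxed>
void Register(typename TDescriptor::Signature function)
{
    Registry()[MakeKey<TDescriptor>(typeid(TBoxed))] = reinterpret_cast<AnyFunction>(function);
}

// Invocation: the pointer is cast back to the descriptor's Signature. If no
// function is registered, a default-constructed return value is returned.
template<typename TDescriptor, typename... TArgs>
typename ReturnOf<typename TDescriptor::Signature>::type Call(const MiniBox& box, TArgs&&... args)
{
    using TReturn = typename ReturnOf<typename TDescriptor::Signature>::type;
    auto it = Registry().find(MakeKey<TDescriptor>(box.type()));
    if (it == Registry().end())
        return TReturn();
    auto function = reinterpret_cast<typename TDescriptor::Signature>(it->second);
    return function(box, std::forward<TArgs>(args)...);
}

// Step 2: one implementation per supported boxed type, hidden from the linker.
namespace {
std::string DescribeInt(const MiniBox& self, const std::string& prefix)
{ return prefix + std::to_string(std::any_cast<int>(self)); }

std::string DescribeString(const MiniBox& self, const std::string& prefix)
{ return prefix + std::any_cast<const std::string&>(self); }
}

void RegisterMiniFunctions()
{
    Register<FDescribe, int        >(&DescribeInt);
    Register<FDescribe, std::string>(&DescribeString);
}
```

After RegisterMiniFunctions() has run, Call<FDescribe>(MiniBox(5), "x=") yields "x=5", while a call on a boxed double yields an empty string because no implementation is registered for that type.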
The previous chapter gave a detailed (rather lengthy) explanation about box-functions. This tutorial section now shows how simple their definition and use indeed is.
The goal of the sample we are looking at, is to enable boxes to write their contents to a string. In other programming languages, such function is often called ToString().
Here is the declaration of the function:
Besides the box itself, the function expects an AString defined with module ALib Strings. This is used as a buffer to write to. The return value is String, which is a lightweight string type, similar to C++ 17 type std::string_view.
Let's create three implementations for different types:
First of all, it has to be noticed that unboxing from parameter self does not need type-guessing with Box::IsType. The reason is that each function is associated with boxes of a corresponding type and thus self always contains the right type.
The first two implementations simply unbox the right type and use AString::operator<< to convert the type.
The third function is templated. It is designed to be usable with different boxed array types. Unfortunately, we cannot attach a templated method to just various boxes. Instead, an instantiation of the templated function has to be given for each boxed array type that we want to support. Such instantiation is implicitly performed by the compiler when passing the function to BootstrapRegister.
Let's register 4 functions that way:
A call to RegisterMyFunctions() needs to go to the bootstrap section of the process.
With all that in place, functions can be "called" with templated method Box::Call. It expects the function declaration as a template type and the function's arguments as its own arguments. Its return type is equivalent to the return type of the box-function!
The following code creates an array of boxes and calls their method in a loop:
The output of the code above will be:
We conclude this tutorial section with a test: What happens if we invoke the method on a box of a mapped type that no implementation is registered for? As we were lazy, for example uinteger is not covered:
Running this does not assert! The output is:
box.ToString(): ""
Obviously an empty string was returned by Box::Call, without further complaints.
It is a design decision of ALib Boxing, that calls to box-functions that are not registered for the actually boxed type, do not assert. Method Box::Call just returns a default value of the designated return type, that's it. The rationale for this design is once more to favor convenience when handling boxes over other considerations. A processing code could use Box::GetFunction before invoking the function, if it wanted to react on boxes that do not support a box-function.
Looking at virtual functions of OO-languages once more: There, virtual functions may or may not be specialized with each derived class. If a function is invoked on a derived class, the "best" implementation is chosen, by walking up the inheritance chain and choosing the first implementation found in a base class.
The type system of ALib Boxing is not hierarchical and does not know inheritance. But in theory there are at least two levels!
And that is our little fallback: This library supports the definition of "default functions" that, if available, are used when no specific function is registered for the mapped type.
Often, there is not much to do for them, because interpreting the Placeholder contents without knowing the type is not possible. Still, we will see in a following chapter that there are some good use cases for them.
Sometimes it is useful to implement and register a default function solely in debug-compilations of a software: These can then assert, write log file warnings or perform other appropriate actions.
Default functions are registered with namespace function BootstrapRegisterDefault. Compared to BootstrapRegister, the function omits the second template parameter specifying the mapped type.
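The two-level lookup described here (type-specific function first, default function second, default-constructed value last) can be sketched as follows. This is a stand-alone illustration and not ALib code: std::any stands in for a box, and struct ToTextTable with its members is a hypothetical construct.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

using MiniBox = std::any;                               // stand-in for a box
using FToText = std::string (*)(const MiniBox& self);   // signature of the function

struct ToTextTable
{
    std::map<std::type_index, FToText> specific;            // level 1: per boxed type
    FToText                            fallback = nullptr;  // level 2: default function

    std::string Call(const MiniBox& box) const
    {
        auto it = specific.find(std::type_index(box.type()));
        if (it != specific.end()) return it->second(box);   // type-specific function
        if (fallback != nullptr)  return fallback(box);     // default function
        return std::string();                               // nothing registered
    }
};

inline std::string IntToText(const MiniBox& self)
{
    return std::to_string(std::any_cast<int>(self));
}

// The default function cannot interpret the contents, as it does not know the type.
inline std::string DefaultToText(const MiniBox&)
{
    return "<unknown type>";
}

inline void DemoFallback()
{
    ToTextTable table;
    table.specific[std::type_index(typeid(int))] = &IntToText;
    assert(table.Call(MiniBox(7))   == "7");
    assert(table.Call(MiniBox(1.5)).empty());               // no default function yet

    table.fallback = &DefaultToText;
    assert(table.Call(MiniBox(1.5)) == "<unknown type>");   // default function is used
    assert(table.Call(MiniBox(7))   == "7");                // specific function still wins
}
```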
To continue the sample of section 8.3 Tutorial: Implementing A 'ToString()' For Boxes, a default implementation usable with any box of FToString should be developed. Here it is:
It is registered with BootstrapRegisterDefault:
We repeat the "failed" invocation we had with type uinteger and also test a call on a boxed array with an unknown element type. A third type repeats the call on a character array, that got a specialized implementation:
The result is now:
It was already mentioned, that ALib Boxing is tolerant towards calling a function on a box whose mapped type is not associated with an implementation. The call is just not performed and instead, a default-constructed value of the according return type is returned by method Box::Call.
By the same token, a call of a function performed on a box that "does not contain a value" (see chapter 12.1 Void And Nulled Boxes) is likewise tolerated.
This design decision is once more justified with the common use cases for this module. The expectation of a programmer calling a box-function is: "Perform what is appropriate with the boxed type". And if there is no implementation, well, to do nothing is the appropriate action. Consequently, specific checks for the availability of function implementations can be omitted.
If a code wanted to take action on the fact that no type-specific implementation exists or that neither a type specific, nor a default implementation exists, such availabilities can be queried using Box::GetFunction. The method's parameter searchScope controls which sorts of functions are searched. The method is likewise tolerant against unset boxes.
If this is done, the returned function pointer already contains the function found, respectively is nullptr on failure. To avoid a repeated search for that same function with a subsequent Call, alternative method Box::CallDirect can be used, which omits the search and instead expects the function pointer as a first parameter.
Finally, to check whether a box does not contain a value before calling a box-function, type-guessing for type void is to be used with IsType<void>().
It might happen that a box-function intends to change the contents of a box. In theory, such change could even include changing the mapped type, but changing the value only is probably a more common use-case.
Two things are needed to allow that:
The function signature has to use type Box& for first parameter self.
The next chapter introduces the built-in functions of ALib Boxing. With them, one quite useful sample of a mutable box-function is found.
In contrast to C++, many other object oriented programming languages declare any class to be derived from a built-in base type. For example, in Java, all classes inherit class Object. Such a "mother of all objects" usually provides a set of methods that are available for any object in the language. In Java, these methods include, for example, equals(), hashCode(), clone() and toString().
Likewise, module ALib Boxing implements a set of built-in box-functions. Those are:
With the inclusion of module ALib BaseCamp, furthermore function FFormat becomes available.
The following implementations are given:
Implementations for std and 3rd-party types become available as well.
This manual will not repeat a description of each function. Instead, please see the corresponding reference documentation, linked above with the enumeration of functions. Therefore, we conclude this section with just some quick facts:
Functions FEquals and FIsLess are called with global operators
For example, the following if statement:
translates to:
The first version results in shorter, but slightly slower code, because the operators' implementations are not inlined.
As a final remark, some of the built-in function declarators provide inner static functions, with some of them being templated. Those may be used to create custom specializations. Again, please consult the reference documentation for further details.
Repeated registrations of default or type-specific functions using BootstrapRegisterDefault and BootstrapRegister are allowed. Any formerly set function is simply replaced. It is also allowed to register nullptr, which disables a built-in function without providing a new one.
The built-in default and type-specialized functions are registered with namespace function Bootstrap. In most combinations of ALib Distribution, this function is automatically invoked with bootstrapping the library. Because each function can be disabled or replaced, no configuration option allows otherwise manipulating the defaults.
Any function implementation that specializes the behavior for a mapped type, may call the default implementation internally, for example to take specific action if a certain state of the boxed value is given, otherwise use the default implementation and probably return its result. To achieve this, the pointer to the default function implementation has to be received, which is done with method detail::FunctionTable::Get that has to be invoked on singleton object detail::DEFAULT_FUNCTIONS.
While this already touches objects in namespace detail, calling a specialized version of a function that was replaced by another (like calling the implementation of a base class in C++) is not explicitly supported by the library, but possible. For this, the bootstrap code that registers a function has to receive and store the previously registered implementation, which then can be called and which in turn may call another one or the default.
The rationale why this is not otherwise offered by the library is that such complicated use of box-function is out of the scope of the usual use cases for ALib Boxing.
Scoped enumerations as well as traditional enumerations receive a special treatment with ALib Boxing. Unless their boxing is customized, they are boxed to their identical type, while the value stored in the Placeholder is cast from their underlying integral type to integer. When unboxed, the value is cast back from integer to the original underlying type.
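The two casts involved can be written down in a few lines of standard C++. In this sketch, alias MiniInteger is an assumed stand-in for ALib's type integer, and StoreEnum and LoadEnum are hypothetical helpers, not library functions.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Assumed stand-in for ALib's signed type 'integer'.
using MiniInteger = std::intptr_t;

enum class Color : unsigned char { Red = 1, Green = 2, Blue = 200 };
enum       Level : int           { Low = -1, High = 10 };

// "Boxing": enum -> underlying integral type -> integer.
template<typename TEnum>
MiniInteger StoreEnum(TEnum element)
{
    static_assert(std::is_enum<TEnum>::value, "TEnum must be an enumeration type");
    using TUnderlying = typename std::underlying_type<TEnum>::type;
    return static_cast<MiniInteger>(static_cast<TUnderlying>(element));
}

// "Unboxing": integer -> underlying integral type -> enum.
template<typename TEnum>
TEnum LoadEnum(MiniInteger stored)
{
    static_assert(std::is_enum<TEnum>::value, "TEnum must be an enumeration type");
    using TUnderlying = typename std::underlying_type<TEnum>::type;
    return static_cast<TEnum>(static_cast<TUnderlying>(stored));
}

inline void DemoEnumCasts()
{
    assert(StoreEnum(Color::Blue) == 200);
    assert(StoreEnum(Low)         == -1);
    assert(LoadEnum<Color>(StoreEnum(Color::Blue)) == Color::Blue);
    assert(LoadEnum<Level>(StoreEnum(High))        == High);
}
```

Because every element ends up in the same integral representation, values of different enum types can be read and compared generically, as long as the type information is checked separately.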
While this speciality is not noticeable when boxing and unboxing enumerations, the advantage of this treatment is that the different enum element-values of different enum types become "generically usable" when read directly from Placeholder::Integrals. The rationale why this constitutes "an advantage", is given in the next section.
Class Enum is the only class derived from class Box found in the library.
The class is useful to store and pass around enum values of arbitrary C++ scoped enum types in a type-safe way. It is implemented to ease the use of scoped enums in situations where programmers otherwise tend to "fall back" to non-scoped (non type-safe) enum types. This is the case when enum elements of different types should be allowed as a function argument or otherwise used as an "identifier". While C++ 11 introduced the syntax for enum class types (aka "scoped enums"), these are still very limited. In particular, they do not support inheritance. Thus, an API cannot define an interface method that accepts enums of "custom derived types". This is quite often a problem. Of course, using module ALib Boxing, an interface method may now accept a box, but then anything else apart from enumeration types would be accepted as well. Class Enum is a good tool to help here.
In the constructor, enum elements of arbitrary type are accepted. With the run-time type-information added, the processing function can now work with any of the enum types transparently.
A good example use case is given with type Exception of module ALib BaseCamp. Any exception is created with an enum element of arbitrary type. The exception handlers then can use nested if statements: The outer if is about the exception type, the inner about the concrete exception. This gives a nice two-level order scheme for exceptions with no need to define "error number ranges" for each code unit.
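The two-level scheme can be illustrated with a small type-erased enum holder. The class AnyEnum below is a simplified, hypothetical stand-in and not ALib's class Enum. The enum types IoErrors and ParserErrors and function Describe are likewise invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

// Hypothetical, simplified holder of an enum element of arbitrary type.
class AnyEnum
{
    std::type_index type;    // run-time type information of the enum type
    std::intmax_t   value;   // the element's integral value

  public:
    // Accepts enum elements only. Other types do not compile.
    template<typename TEnum,
             typename = typename std::enable_if<std::is_enum<TEnum>::value>::type>
    AnyEnum(TEnum element)
    : type (typeid(TEnum))
    , value(static_cast<std::intmax_t>(element))
    {}

    template<typename TEnum> bool  IsEnumType() const { return type == std::type_index(typeid(TEnum)); }
    template<typename TEnum> TEnum Get()        const { return static_cast<TEnum>(value); }
};

enum class IoErrors     { FileNotFound, AccessDenied };
enum class ParserErrors { UnexpectedToken };

inline std::string Describe(const AnyEnum& code)
{
    if (code.IsEnumType<IoErrors>())                           // outer if: the enum type
    {
        if (code.Get<IoErrors>() == IoErrors::FileNotFound)    // inner if: the element
            return "io: file not found";
        return "io: other";
    }
    if (code.IsEnumType<ParserErrors>())
        return "parser error";
    return "unknown";
}
```

A call like Describe(IoErrors::FileNotFound) converts the element implicitly, so callers can pass elements of any enum type without casts.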
A lot was said already in this manual about non-bijective boxing and its advantages. When it comes to boxing string types, the way to go is obvious: Whatever string type is boxed (and there might be many of them found in a software that uses 3rd-party libraries) - everything is simply boxed to a C++ array of the corresponding character type. A processing function then does not need to care about the origin type, but by only handling character arrays, any sort of string is treated correctly.
To achieve this, this module leverages type definitions and type-traits found with module ALib Characters. This is explained in the next section.
The section after that, covers further options that are available when module ALib Strings is included in the ALib Distribution. Finally, some good use of ALib Boxing and ALib Strings is made by module ALib BaseCamp. While this is not a part of this manual, some overview on it is provided in appendix chapter C.1 Use Case: Module BaseCamp.
Previous manual chapter 7. Customizing Boxing explained in detail how type-traits struct T_Boxer is used to provide information and static methods that allow customizing the boxing of any type. The goal with boxing string types is to map any of them to a character array. This could be done in the straightforward way, for example by just specializing T_Boxer<std::string> for C++ standard type std::string.
But this is not what this library does! Instead, it leverages module ALib Characters. An interested reader should read this module's Programmer's Manual now first, before continuing with this chapter of ALib Boxing. A short summary of what is provided by this module should be given in bullets:
wchar_t independent of platform and compiler: nchar, wchar and xchar.
char, wchar_t, char16_t or char32_t.
const character*, string literals and character arrays. With the inclusion of module ALib Strings, type-traits for the five string-types found in that module are given. Finally, compatibility headers are provided that, for example, specialize T_CharArray for string and vector types of namespace std or those found in the QT Class Library.
With this in place, all that this module provides is a conditional specialization of type-traits struct T_Boxer for all types that T_CharArray is specialized for!
Precisely, two conditional specializations are given:
std::string_view, ALib type String or QStringView.
std::string, ALib type AString or QString.
As a result, to customize boxing for a custom string type, it is recommended to specialize T_CharArray instead of T_Boxer.
While it is still possible to use T_Boxer for customization, the advantage of the recommended approach is obvious: generally announcing the custom type to be of character array type enables its use with module ALib Strings as well as with boxing. Also other modules and software built on ALib might directly benefit from such type-traits.
In the unlikely case that T_CharArray is specialized and still T_Boxer should be specialized (with the aim to provide a certain customization that is different from the one that this module automatically provides if T_CharArray is given), then, to avoid ambiguities, helper-type-traits struct T_SuppressCharArrayBoxing may be specialized to inherit std::true_type. As its name says, a specialization of this type disables the automatic custom boxing and hence allows a specialization of T_CharArray and a parallel specialization of T_Boxer.
With the inclusion of module ALib Strings in the ALib Distribution, built-in box-function FAppend becomes available.
Class AString supports a TMP-based mechanism to append objects of arbitrary type, documented with chapter 5.1 Appending Custom Types of the Programmer's Manual of module ALib Strings.
Of course, if an object of type Box is "appended", then TMP does not work, as the compile-time information about the boxed type is lost. Consequently, box-function FAppend is needed that performs the job. If a box is appended to an AString, simply this function is called.
For all types which already specialize functor T_Append, a templated implementation of this function can be used: This unboxes the template type and appends it. This template function is provided with static member FAppend::Appendable.
As a result, there are two ways of implementing interface FAppend for a custom boxable type:
The second approach has the advantage, that the custom type is directly appendable to objects of class AString - independent of boxing. Therefore, this is the recommended option.
With class Box in place, it becomes possible to define functions and methods that take an arbitrary value as an argument. The need for this is often combined with the need to allow an arbitrary number of such arbitrary arguments. C++ 11 introduced variadic template arguments for this.
Class Box might greatly simplify the use of this language feature and provide a type-safe and indexed way to access variadic arguments. (In fact, this was one of the original goals for creating module ALib Boxing!)
The following quick sample demonstrates this:
With this function definition, it can be called like this:
It is only a single, simple line of code that fetches all function parameters and puts them into an array of boxes.
Of course, the classical recursive approach to process template arguments using class Box may also be implemented but avoiding the recursion makes the code easier and more readable.
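The idea of collecting all arguments with one line of code can be reproduced with the standard library, which may help to see that no recursion is involved. In this sketch std::any stands in for class Box, and function CountInts is a hypothetical example that is not part of ALib.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <string>
#include <typeinfo>

// std::any is used as a stand-in for a box (an assumption of this sketch).
// At least one argument is required, because C++ has no zero-length arrays.
template<typename... TArgs>
std::size_t CountInts(const TArgs&... args)
{
    // The single line that "boxes" all arguments into an indexable array.
    std::any boxes[] = { std::any(args)... };

    // From here on, the arguments are processed at run-time.
    std::size_t ints = 0;
    for (const std::any& box : boxes)
        if (box.type() == typeid(int))
            ++ints;
    return ints;
}

inline void DemoVariadic()
{
    assert(CountInts(1, "two", 3.0, 4)   == 2);
    assert(CountInts(std::string("x"))   == 0);
}
```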
The sample above can be slightly modified to use C++ 11 Perfect Forwarding which in some situations is a little more efficient and produces smaller code. The following code snippet uses this technique and may be copied as a recipe on how to implement variadic template functions with ALib Boxing:
In the previous chapter it was demonstrated how simple the use of variadic template arguments becomes with ALib Boxing. The recipe given uses a single line of code to let the compiler create an array of objects of class Box. This is sufficient in many cases, but obviously using container class std::vector<alib::Box> instead of a simple array would give more flexibility: It allows adding and removing boxes from the array and passing the array to other (non-templated) functions without passing its size in an extra parameter.
For this and more purposes, templated class TBoxes is provided. It publicly inherits from std::vector<alib::Box> and introduces method Add accepting templated variadic arguments. This way, its use is as simple as this:
In this sample, five boxed objects are added to the container using method TBoxes::Add.
We replace the simple C++ array of the recipe given in the previous section by an object of this type:
The advantage of the former version is that the array was created on the "stack". In contrast to this, class BoxesHA uses dynamic memory to store an arbitrary amount of boxes.
Even more efficient is the use of type BoxesMA, which performs only one single allocation (as long as the list of boxes is not exceeding around 40 boxes, then a second allocation would be performed). More on this topic of memory management is discussed in the next section.
The previous chapter introduced class TBoxes. It was said that the class is derived from std::vector<Box>. This is not exactly true. In fact, it is derived from std::vector<Box, lang::StdContainerAllocator<Box, TAllocator>>. Together with the class's template parameter TAllocator and the std allocator type StdContainerAllocator, allocation strategies other than just heap allocation can be implemented. Now, in case that module ALib Monomem is included in the ALib Distribution, the number of heap allocations can be reduced or even eliminated.
The use cases for monotonic allocation mode are described with module ALib Monomem and not repeated here, besides the following hint: Should the given MonoAllocator be reset, and the TBoxes instance not be destructed but continued to be used, then the instance has to be "reset" as well. This is done by performing a C++ placement-new, as described here.
Besides providing variadic template arguments, method TBoxes::Add uses some template meta programming to "flatten" the array in the case that another instance of class TBoxes is added. In other words, if an instance of class TBoxes is passed to TBoxes::Add, the boxes contained in this instance are copied into the destination vector! Due to this fact, when using sample method VariadicFunction from above, the invocation:
produces the following output:
1 2 3 4
The reason why this is implemented like this is that the user of a method gains a further benefit: the freedom of choice to either pass all parameters directly in the function call, or to collect all objects before the call in an own instance of class TBoxes and then just pass this instance as a single argument, even together with other, fixed arguments.
This makes the use of the function more flexible, without the need of providing an overloaded version that accepts and processes an object of TBoxes directly.
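How such "flattening" can be detected at compile-time is sketched below with standard C++17. The class MiniBoxes is hypothetical and is not ALib's TBoxes. It derives from std::vector<std::any>, with std::any standing in for a box.

```cpp
#include <any>
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical container (not ALib's TBoxes). std::any stands in for a box.
class MiniBoxes : public std::vector<std::any>
{
  public:
    template<typename... TArgs>
    MiniBoxes& Add(TArgs&&... args)
    {
        (AddOne(std::forward<TArgs>(args)), ...);   // C++17 fold expression
        return *this;
    }

  private:
    template<typename T>
    void AddOne(T&& arg)
    {
        if constexpr (std::is_same_v<std::decay_t<T>, MiniBoxes>)
            // Flatten: copy the contained boxes instead of nesting the container.
            // Adding a container to itself is not handled by this sketch.
            insert(end(), arg.begin(), arg.end());
        else
            emplace_back(std::forward<T>(arg));     // box a single value
    }
};

inline void DemoFlatten()
{
    MiniBoxes inner;
    inner.Add(2, 3);

    MiniBoxes outer;
    outer.Add(1, inner, 4);                         // 'inner' is flattened

    assert(outer.size() == 4);
    assert(std::any_cast<int>(outer[0]) == 1);
    assert(std::any_cast<int>(outer[1]) == 2);
    assert(std::any_cast<int>(outer[2]) == 3);
    assert(std::any_cast<int>(outer[3]) == 4);
}
```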
Finally, besides detecting objects of class TBoxes inside method TBoxes::Add, it is also detected if an object of class TBoxes is passed as a boxed object. Let us first look at a sample and its result:
1 2 3 4
Looking at this sample a reader might think "Wow, this is cool, but where is the use case for this?". Generally speaking, this is useful when a method has several overloaded versions with different parameters and should still accept an arbitrary number of arguments of any type. In such cases, it might get quite complicated (or impossible!) to define the methods so that no ambiguities occur when invoking them. A solution here is to declare the method to accept exactly one const alib::Box& argument instead of a variadic list of arguments.
If inside the method this box is passed into a local instance of class TBoxes, a user might invoke the method with just a single argument of arbitrary type (which gets boxed), or with an arbitrary amount of arguments, by collecting those in class TBoxes. This might be done right in the invocation.
To demonstrate this, we use the method from above, but instead of accepting variadic template arguments, it now accepts just a single argument of type const Box&:
This can be invoked as follows:
...which produces:
1 1 2 3
A real world sample can be found in the logging library ALox which is built on ALib and makes a lot of use of ALib Boxing. While straightforward methods Lox::Info, Lox::Verbose, etc. accept variadic template arguments as objects to be logged, method Lox::Once is more complicated: Various overloaded versions exist that interpret the term "once" differently. Therefore, each overloaded version accepts only one object to log, which at first sight is only suitable to accept a simple log message string. But internally, a TBoxes instance is created and this way, multiple objects can be passed just as with other interface functions.
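The single-argument pattern can be sketched in standard C++17 as well. Again, std::any stands in for class Box, and a pointer to a list stands in for a boxed TBoxes instance. The names AnyList, BoxList and CountLogables are hypothetical and chosen for this illustration only.

```cpp
#include <any>
#include <cstddef>
#include <vector>

using AnyList = std::vector<std::any>;  // hypothetical stand-in for TBoxes

// Wraps a list as a single type-erased argument (stored as 'const AnyList*').
inline std::any BoxList(const AnyList& list) { return &list; }

// Accepts exactly one type-erased argument. If it holds a pointer to a list,
// the list's elements are copied; otherwise the argument itself is stored.
inline std::size_t CountLogables(const std::any& arg) {
    AnyList local;
    if (const auto* list = std::any_cast<const AnyList*>(&arg)) {
        local.insert(local.end(), (*list)->begin(), (*list)->end());
    } else {
        local.push_back(arg);
    }
    return local.size();
}
```

With this, a caller either passes one value directly, as in CountLogables(42), or collects several values in an AnyList and passes BoxList(list). No overload taking a list is needed.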
As a final note, besides "flattening" a boxed instance of class TBoxes, method TBoxes::Add will do the same with a "boxed array of boxes". Hence the following code:
produces:
1 2 3
Default-constructed instances of class Box, or those constructed passing keyword nullptr as an argument, do not contain a boxed value. Technically, this means that no VTable singleton is set, because VTables only exist for mapped types.
To test if a box "is void", i.e., does not contain a value, a test for type void is to be performed by invoking Box::IsType<void>.
As soon as anything other than nullptr is boxed (with construction or assignment), the instance loses its void state. Vice versa, by assigning keyword nullptr, a box is "reset" to void state.
The following methods are allowed to be called on void boxes:
false, even if both boxes are in void state.
typeid(void) if a box is not initialized.
Forbidden methods that produce undefined behavior if invoked, are:
In debug-compilations these methods raise a run-time assertion when invoked on a void box. Most of the times an explicit test on whether a box is void is still not necessary, because unboxing is only allowed after successful type guessing.
The void state constitutes a piece of information that might be used in APIs.
Very different from the attribute of a Box being void, is the attribute of being nulled. The latter applies only to non-void boxes. In theory, the nulled-state of a box is undefined if no value is boxed.
Whether a box is nulled is evaluated using built-in box-function FIsNotNull, which is invoked by method Box::IsNotNull and, with the result negated, by method Box::IsNull.
Because ALib Boxing is tolerant in respect to calling box-functions on void boxes, calling FIsNotNull on a void box returns the default value of bool, which is false. This way, boxes that do not contain a value report to be nulled, which is appropriate behavior with most use cases.
Default implementations of FIsNotNull for fundamental types return true, as such types are not considered nullable. The default implementation returns false (nulled) for array types that have a length of 0 and for pointer types that have value nullptr. Otherwise, the default implementation returns true (not nulled).
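The difference between the two states can be sketched with std::any as a stand-in for class Box. An empty std::any reports typeid(void), which mirrors the void state. The functions IsVoid and IsNotNull below are hypothetical helpers written for this illustration, and only one pointer type (const char*) is handled to keep the sketch short.

```cpp
#include <any>
#include <typeinfo>

// "Void": nothing is boxed. An empty std::any reports typeid(void).
inline bool IsVoid(const std::any& box) { return box.type() == typeid(void); }

// "Nulled" is a property of the boxed value; void boxes report false here,
// which means they appear as nulled.
inline bool IsNotNull(const std::any& box) {
    if (!box.has_value()) return false;            // void box
    if (const auto* text = std::any_cast<const char*>(&box))
        return *text != nullptr;                   // pointer type: test for nullptr
    return true;                                   // other types: not nullable here
}
```

A box holding a const char* with value nullptr is therefore not void, but nulled, while a box holding the integral value 5 is neither.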
Using class Box to pass data between code entities, causes a certain amount of "effort", which has an impact on the code size and the execution performance.
Before it is explained how to minimize this effort, the following important note is to be made:
The processes of boxing, type guessing and unboxing should be implemented in fast and lean code. The three share two actions:
Point two is a matter of the implementation of struct T_Boxer. If the default for methods Write and Read can be used, this is implemented most efficiently and cannot be optimized. What is left is point one. This in turn is split into three steps:
The good news is, that step one is performed at compile-time using TMP and this way has no run-time effects. Step three is a most efficient simple pointer assignment, respectively comparison.
As a result, the only point that leaves room for optimizations is step two, retrieving the vtable singleton. If the optimization described in this chapter is performed, then retrieving the singleton is nothing more than a single direct memory access.
Together, for example boxing a value is compiled to nothing more than just filling (all or a part of) the 24 bytes (respectively 12 bytes on a 32-bit platform) with values that the CPU can simply load from other memory addresses!
As mentioned above, the impact of not performing the optimization for a mapped type, is described in the section 12.2.3 Technical Background On VTables.
The goal of the optimization is to provide a named singleton object for the vtable of a mapped type. To do so, three simple steps are involved. As optimizations for all fundamental types are already built into the library, the library code itself, as used for types bool and char[], serves as the sample.
Named singletons of struct detail::VTable have to be declared in a header file. For this, macros ALIB_BOXING_VTABLE_DECLARE and ALIB_BOXING_VTABLE_DECLARE_ARRAYTYPE are to be used. For types bool and char[], the internal (always included) header file alib/boxing/customizations.inl states:
Besides the mapped type, a second parameter specifies a valid and unique C++ identifier name.
The singleton objects have to be defined in a compilation unit (e.g., a cpp-file). Corresponding macros are used:
The macro parameters are the very same as for the declaration.
This final step is needed only in debug-compilations. Consequently, the registration macro (which is used to register both non-array-type and array-type vtables) is empty when compiling a release-version.
Similar to the registration of box-function implementations, the registration of static vtables has to be performed with the bootstrap code of a software. It is a good idea to place the macros to the same bootstrap section, where function registrations are done.
In our sample, this looks as follows:
The registration done in debug-compilations has two effects:
This is all that needs to be done. With that, ALib Boxing is as fast as technically possible. The penalty of the use of boxes is minimized in both respects: code size and execution performance.
What the vtable is to C++, struct detail::VTable is to ALib Boxing. Both are singletons, which means that two objects of the same mapped type share a pointer to the same vtable and that for each mapped type only one instance exists.
At compile-time, when an object is boxed, the right singleton has to be chosen and stored together with the object's data in the box. The small challenge now is to find a way to define a singleton for the endless number of types that can be mapped. The solution is a simple trick: an otherwise empty template class detail::VTableTT is derived from VTable. In parallel, this template class is also derived from ALib class Singleton. Two template type parameters are specified, TPlainOrArray and TMapped. These are exactly those types that are found in structs TMappedTo and TMappedToArrayOf. Either of them has to be used for the type definition Mapping of type traits struct T_Boxer<TBoxable> to specify the mapped type.
If the vtable was not optimized (as shown in the previous section), then the static method Singleton::GetSingleton is invoked on type VTableTT:
VTableTT<typename TMapping::PlainOrArray, typename TMapping::Type>::GetSingleton()
Et voila! This gives the constructors of class Box the strict singleton object it needs to store.
Now, to allow optimizations, class Box does not perform the retrieval of the right singleton directly. Instead, it is done indirectly through a further specializable type traits struct, named detail::T_VTableFactory. Only its default implementation, used with non-optimized mapped types, acts as described above. Specialized versions directly return a static object with method T_VTableFactory::Get.
The macros ALIB_BOXING_VTABLE_DECLARE and ALIB_BOXING_VTABLE_DECLARE_ARRAYTYPE declare such a singleton and by the same token specialize the factory for the given mapped type to return it.
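The factory idea can be reduced to a few lines of standard C++17. The sketch below is not the library's code: the types VTable and VTableFactory as well as the object vtableBool are simplified, hypothetical stand-ins, and a function-local static replaces ALib class Singleton.

```cpp
#include <typeinfo>

// Simplified stand-in for a vtable: only carries a type name.
struct VTable {
    const char* typeName;
};

// Default factory: one lazily created singleton per mapped type. The check
// "already created?" is part of every call to Get().
template <typename TMapped>
struct VTableFactory {
    static const VTable* Get() {
        static const VTable singleton{typeid(TMapped).name()};
        return &singleton;
    }
};

// "Optimized" mapped type: a named object with static storage duration ...
inline constexpr VTable vtableBool{"bool"};

// ... and a specialization of the factory that simply returns its address.
template <>
struct VTableFactory<bool> {
    static constexpr const VTable* Get() { return &vtableBool; }
};

// The specialized factory is usable in constant expressions.
static_assert(VTableFactory<bool>::Get() == &vtableBool, "static vtable expected");
```

In this sketch, the default factory always yields the same object for the same type, but only the specialization does so without any run-time creation check.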
The final technical question is now: what negative impact does the use of method Singleton::GetSingleton have? As type Singleton has to be templated, its construction has to be performed inline. The same is obviously the case with struct VTableTT, which derives from both VTable and the singleton. The first thing that Singleton::GetSingleton does is to check whether the singleton was already created by an earlier call. If yes, it is instantly returned. If not, construction has to be performed. Although the latter is done only once, the whole (inlined) code is added at each place where a value is boxed. Therefore, the impact on code size is rather high, while the execution performance, from the second invocation on, suffers only a marginal penalty.
On Windows OS, with the use of DLLs, things become even a little more complicated. This is the main reason for the existence of dedicated ALib helper-class Singleton. Different DLLs and the main process that loads them, do not share one data segment. Because of this, before a singleton is created, a check has to be made whether the same singleton was created already in a different data segment. Of course, such check needs to avoid race conditions and therefore uses a semaphore. Luckily, this code does not need to be inlined.
Note that this "DLL-problem" does not apply for the optimized, static vtable objects. Here, a definition can be used in a distinct compilation unit, that the process and the DLLs share.
More details on this topic are found with the Programmer's Manual of ALib Module ALib Singletons.
With field VTable::Functions, each vtable embeds struct FunctionTable which is responsible to store and retrieve implementations of box-functions. Furthermore, one dedicated instance of this type is defined in the namespace to store the default implementations.
Methods FunctionTable::Set and FunctionTable::Get use TMP enabled overload mechanics by their template type TFDescr. For the built-in functions FClone, FEquals, etc, a direct access to a corresponding pointer member is performed.
For registered custom functions, a global hash table is used that maps the function table and the function type to the function's implementation. Besides the hash table access needed, in addition a mutex is acquired to protect the global hash table against concurrent access.
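The two access paths can be sketched as follows. This is an illustration under simplifying assumptions, not the library's implementation: function implementations are stored as plain void(*)() pointers, and all names (FEqualsTag, FCustomTag, AnyFunction, namespace sketch, EqualsImpl, CustomImpl) are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <type_traits>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

struct FEqualsTag {};  // descriptor of a "built-in" function
struct FCustomTag {};  // descriptor of a "custom" function

using AnyFunction = void (*)();  // simplified: one signature for all functions

struct FunctionTable {
    AnyFunction fEquals = nullptr;  // built-in: dedicated pointer member

    template <typename TTag> void Set(AnyFunction implementation);
    template <typename TTag> AnyFunction Get() const;
};

namespace sketch {
using Key = std::pair<const FunctionTable*, std::type_index>;
struct KeyHash {
    std::size_t operator()(const Key& key) const {
        return std::hash<const FunctionTable*>{}(key.first) ^ (key.second.hash_code() << 1);
    }
};
// Global table for custom functions, protected by a mutex.
inline std::unordered_map<Key, AnyFunction, KeyHash> customFunctions;
inline std::mutex customFunctionsMutex;
}  // namespace sketch

template <typename TTag>
void FunctionTable::Set(AnyFunction implementation) {
    if constexpr (std::is_same_v<TTag, FEqualsTag>) {
        fEquals = implementation;  // direct member access
    } else {
        sketch::Key key{this, std::type_index(typeid(TTag))};
        std::lock_guard<std::mutex> lock(sketch::customFunctionsMutex);
        sketch::customFunctions[key] = implementation;
    }
}

template <typename TTag>
AnyFunction FunctionTable::Get() const {
    if constexpr (std::is_same_v<TTag, FEqualsTag>) {
        return fEquals;  // direct member access
    } else {
        sketch::Key key{this, std::type_index(typeid(TTag))};
        std::lock_guard<std::mutex> lock(sketch::customFunctionsMutex);
        auto it = sketch::customFunctions.find(key);
        return it != sketch::customFunctions.end() ? it->second : nullptr;
    }
}

inline void EqualsImpl() {}  // placeholder implementations for demonstration
inline void CustomImpl() {}
```

The choice between the two paths is made at compile-time with if constexpr, so that access to a built-in function costs no more than reading a member, while custom functions pay for hashing and locking.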
Under certain conditions, instances of class Box are constexpr values. For example, the following code compiles without an error:
While the typical use cases of ALib Boxing do not raise the requirement for users of the library to define constexpr Box variables, there is still some advantage of constexpr boxes in that the compiler gets the possibility to optimize the object code. In addition, such box instances may be placed in the data segment of an executable that resides in read-only memory (e.g., on embedded systems).
This use case (constexpr boxes in read-only memory) imposes the only mandatory rationale for this type of optimization. For other purposes, it is very questionable if the result is worth the effort, and a reader might skip the following explanations.
The C++ rules for creating constexpr objects require that the constructor of class Box that is chosen according to a given argument type TBoxable is implemented constexpr. The constructor creates two field members, the vtable and the Placeholder. Consequently, the creation of both objects needs to be implemented constexpr.
1. Static VTable:
For the vtable to meet the requirement, the optimization discussed in previous chapter 12.2 Optimizations With Static VTables has to be performed. Thus, the first mandatory requirement to enable constexpr boxes is to implement what is described in that chapter for the mapped type in question.
2. Static Definition of T_Boxer::Write:
The second requirement, creating the Placeholder in a constexpr way, cannot be achieved with the implementation of method T_Boxer::Write as it was presented in chapter 7. Customizing Boxing! The reason is that with this definition, one or more members of union Placeholder have to be set inside the function. Functions that do this are forbidden to be constexpr (even in C++ 17).
Besides the macros used for customization introduced in that manual chapter, two further ones exist, with postfix "_CONSTEXPR":
The difference of the "_CONSTEXPR"-versions of the macros is the definition of boxing method Write. Instead of receiving the target's Placeholder along with the value to box:
static void Write( Placeholder& target, TSource const & value ) {...}
these macros define the method with only the value argument while returning a placeholder object:
static constexpr Placeholder Write( TSource const & value ) {...}
If the TMP code of class Box detects this change, a different constructor, one that is defined constexpr, is chosen!
It was said, that modifying different members of a union is forbidden with the C++ rules. With the modified Write method, customization code has the chance to construct a new placeholder value and initialize one of the union fields. Unfortunately, also here, a strict rule applies: The constructor of a union is allowed to set only one of the union members.
The way out of this dilemma was to provide a bigger set of constexpr constructors to union Placeholder that in turn make use of corresponding sets of constructors of detail types StructArray, UnionIntegrals, UnionFloatingPoints and UnionPointers. Some of those allow initializing one or more of the array or struct elements. Note that, as stated in the reference documentation of union Placeholder, these constructors are not listed in that reference documentation. If needed for a custom T_Boxer::Write method, please consult the source code.
To summarize: The second requirement, creating the Placeholder in a constexpr way, can be achieved by using the alternative version of T_Boxer::Write as described.
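The difference between the two forms of Write can be shown with a small, self-contained C++17 sketch. The union and structs below are simplified, hypothetical stand-ins (the real union Placeholder has more members and constructors), and type Celsius is an invented custom type.

```cpp
// Simplified stand-in for a two-word placeholder with constexpr constructors.
// Each constructor initializes exactly one union member.
union Placeholder {
    long   integrals[2];
    double floats[2];

    constexpr Placeholder() : integrals{0, 0} {}
    constexpr explicit Placeholder(long value) : integrals{value, 0} {}
    constexpr explicit Placeholder(double value) : floats{value, 0.0} {}
};

struct Celsius {  // invented custom type to be "boxed"
    long degrees;
};

// Classic form: writes into an existing placeholder. Usable at run-time only.
struct BoxerClassic {
    static void Write(Placeholder& target, const Celsius& value) {
        target.integrals[0] = value.degrees;
    }
};

// Alternative form: constructs and returns the placeholder, which allows
// evaluation at compile-time.
struct BoxerConstexpr {
    static constexpr Placeholder Write(const Celsius& value) {
        return Placeholder(value.degrees);
    }
};

// The placeholder is created in a constant expression.
constexpr Placeholder boxedTemperature = BoxerConstexpr::Write(Celsius{21});
static_assert(boxedTemperature.integrals[0] == 21, "evaluated at compile-time");
```

The key point is that the constexpr variant never assigns to a union member after construction. It selects the member to initialize by choosing a constructor.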
The following rules apply for different types:
constexpr.
constexpr (built-in adoptions)
constexpr.
constexpr.
constexpr specialization of T_Boxer::Write.
constexpr. Consequently, all that is needed is to define a static vtable for the mapped enum type.
constexpr fashion by default. Hence, the only precondition is to define a static vtable for the type.
Instances of class Box may generally exist as global data or static members as long as they are not initialized with a boxed value.
If a default-initialization should be given, then the resulting mapped type's vtable has to be statically defined as described in chapter 12.2 Optimizations With Static VTables. The reason for this is that dynamically created vtables use the mechanics implemented with ALib type Singleton. To achieve the creation of process-wide "true" singleton objects, this class uses a globally defined hash-map that, in case of a first creation within a compilation unit, might be used to receive one already created in another compilation unit. The technical background for this is explained with module ALib Singletons. In short, the problematic platform here is Windows OS, which allows a DLL to have its own global data segment.
Because the sequence order of initialization of global objects is not defined with the C++ language, it cannot be ensured that the hash-map is already initialized when the singleton vtable of an initialized global or static box is required.
As it was documented in chapter 12.2 Optimizations With Static VTables, for all fundamental types as well as for character arrays, a static vtable implementation is always in place. Therefore, global or static boxes may well be initialized with values of these types. If a custom type is to be used for initialization, a static vtable has to be given.
In debug-compilations, the use of dynamic vtables with global or static instances of class Box raises a run-time assertion.
The following ALib Compiler Symbols are provided by this ALib Module:
The sample code given in this manual only seldom shows the inclusion of necessary header files. The module provides just three headers:
With the use of other ALib Modules that rely on boxing, the inclusion of the header files is usually not necessary. For example, when including alib/lang/format/formatterpythonstyle.hpp, the inclusion of headers of ALib Boxing is inherently performed.
Some care has to be taken, with boxing string types. As explained in chapter 10.2 Character Arrays, all specializations of type-traits struct T_CharArray have to be "included" before the definition of type-traits T_Boxer. Therefore, a compilation unit has to include such specializations before including header boxing.hpp.
For example, if string types of the QT Class Library are to be used with formatter FormatterPythonStyle, then the corresponding compatibility header has to be included before any other header that includes boxing.hpp:
#include "alib/compatibility/qt_characters.hpp"
#include "alib/lang/format/formatterpythonstyle.hpp"
Like with most ALib Modules, a due bootstrapping of ALib Boxing has to be performed. As documented in the general manual of ALib, this usually is performed automatically with bootstrapping the library.
In the case that ALib Boxing is used as an extracted module, for bootstrapping, namespace function alib::boxing::Bootstrap has to be called. The function should be called as early as possible, i.e., before custom code performs registration of custom box-function implementations and before the registration of custom static vtables.
With ALib Boxing, no mechanisms are in place that link the life-cycle of boxes with their boxed values. Class Box does not even have a destructor defined! This is a huge difference to C++ 17 class std::any.
It is completely left to the user of the library to make sure that any pointer or other data that references values available during boxing is still intact and available when unboxed and, vice versa, that allocated objects that become boxed are de-allocated after a box that refers to them is disposed.
In many use cases, this is absolutely no problem: Often, ALib Boxing is used to implement generic (and optionally variadic) function arguments. If those are then used inside the function only and not stored otherwise, the access to boxed data is safe. A prominent sample for this use case is given with appendix chapter C.1.
However, other use-cases might introduce the need to use boxed data out of the scope that boxed the data. A good sample for this is given with appendix chapter C.2 Use Case: ALib Exceptions. Objects of type Exception carry exception arguments while the function-call stack is "unwound". Hence, all locally defined objects are destructed and go out of scope.
In this and similar cases, a user of the library has to ensure that boxes of mapped types whose data might become corrupted, are either not unboxed or the data is copied before having the box leaving the scope. A nice way to perform such copying is provided with built-in box-function FClone. Its default implementation copies the data of boxed arrays.
Depending on the use-case, the concept of "cloning" does not need to be taken too literally, because function FClone might take other actions as well. Implementations are allowed to overwrite the given box's content, which includes changing the mapped type of the box! Often it is enough to create some sort of representation of an object, for example just an ID or another sort of key value. In the mentioned use case of exception handling, sometimes just a string representation of an object might be created, which is later used for assembling a human readable formatted log output.
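The idea of copying referenced array data before the boxing scope is left can be sketched as follows. This does not show box-function FClone itself. The types CharArrayRef and CloneStorage and the function MakeMessage are hypothetical and only illustrate the ownership problem and one way to solve it.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical "boxed character array": refers to memory it does not own.
struct CharArrayRef {
    const char* data;
    std::size_t length;
};

// Owns copies of array data, e.g., as a member of a long-living carrier
// object such as an exception.
class CloneStorage {
  public:
    // Copies the referenced characters and returns a reference to the copy.
    CharArrayRef Clone(const CharArrayRef& source) {
        auto copy = std::make_unique<char[]>(source.length);
        if (source.length > 0)
            std::memcpy(copy.get(), source.data, source.length);
        CharArrayRef result{copy.get(), source.length};
        buffers_.push_back(std::move(copy));  // heap block keeps its address
        return result;
    }

  private:
    std::vector<std::unique_ptr<char[]>> buffers_;
};

// The local string is destructed on return. Without the clone, the returned
// reference would dangle.
inline CharArrayRef MakeMessage(CloneStorage& storage) {
    std::string local = "file not found";
    CharArrayRef reference{local.data(), local.size()};
    return storage.Clone(reference);
}
```

The returned reference stays valid for as long as the CloneStorage instance lives, which corresponds to the requirement that copied data must outlive every box that refers to it.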
In debug-compilations, compiler symbol ALIB_DEBUG_BOXING may be set. With it, the following entities become available:
Together, this discloses all information necessary to investigate into the built-in and default behavior of ALib Boxing. Please consult the reference manual of the named types, for further details.
Instead of using the methods and objects listed above, struct DbgBoxing provides a more handy alternative at the moment that module ALib BaseCamp is included in the ALib Distribution. The class then offers additional static interface methods that collect and format various sorts of information.
For details, please consult the type's reference documentation. In this Programmer's Manual, we just want to provide some sample invocations.
If a programmer is unsure, which mapped type results from boxing, all that is needed to do is to pass a "sample box" to method TypeName:
The mapped type is: char[]
A next, quite powerful method is TypeInfo. It provides all information on a boxable type TBoxable and its mapped type.
The boxable type needs to be provided as template parameter TBoxable. If it is not default-constructible, a corresponding sample box has to be provided as well. To stay with the sample above: to get information for mapped type char[], one possible TBoxable is alib::String. With that, the invocation looks like this:
It produces the following details:
Boxing Information For Boxable Type: TString<char>
  Mapping:        Array
  Mapped Type:    char[]
  Customized T:   true
  Customized T*:  false
  Is Unboxable:   Yes (Custom unboxing from array type)
  VTable Type:    Static Singleton (Specialized T_VTableFactory)
  Usage Counter:  28942
  Associated Specialized Functions:
    FAppend<char, HeapAllocator> (9482)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 8)
    FIsLess ( 0)
    FToString ( 2)
For readers of this manual, all information should be easily understandable. Line "Usage Counter" provides the quantity of unboxing operations and function invocations that have been performed on boxes of the mapped type so far. The value also depends on when during a process's life-cycle the method was invoked. If this value indicates a high usage and line "VTable Type" denotes a dynamically created vtable type, it might make sense to define a static vtable for that mapped type.
Likewise, with each specialized box-function, the number of its invocations is given in brackets behind their names.
To get a list of all box-functions for which either a default implementation or one or more specialized implementations have been registered, the following code can be used:
This results in:
FAppend<char, HeapAllocator> ( 0)
FAppend<char16_t, HeapAllocator> ( 0)
FAppend<wchar_t, HeapAllocator> ( 0)
FClone ( 0)
FEquals ( 3)
FFormat (No default implementation)
FHashcode ( 0)
FIsLess ( 0)
FIsNotNull (121)
FIsTrue ( 0)
FToLiteral (No default implementation)
FToString ( 2)
For those box-functions that have an associated default implementation, the number of invocations of that default is given in brackets.
To list just all types that a dynamic vtable is created for (and therefore could be optimized), the following line of code can be used:
Here is the list:
Mapped types with dynamic VTables:
-----------------------------------------------------------------------------
(0) Algorithm
(18) AppendLog*
(1) BigClass*
(2) double[]
(0) id
(1) int [3][]
(13) int[]
(5) long[]
(4) MyBase
(0) PHTypes
(16) reference_wrapper<basic_string<char, char_traits<char>, allocator<char>> >
(0) reference_wrapper<basic_string<char16_t, char_traits<char16_t>, allocator<char16_t>> >
(0) reference_wrapper<basic_string<wchar_t, char_traits<wchar_t>, allocator<wchar_t>> >
(2) SmallClass
(0) State
(0) TBoxes<TPoolAllocator<TMonoAllocator<HeapAllocator>, 8ul>>*
(0) Thread*
(1) vector<int, allocator<int>>*
(1) WrappedFloat
If true was passed to DumpVTables, then those with static vtables would have been listed. A second boolean parameter, which has a default value, can be used to add the list of specialized functions to each vtable listed.
To finish this chapter, method DbgBoxing::DumpAll is invoked, which aggregates much of the above.
The following shows the invocation and a possible corresponding output:
Mapped types with static VTables and their associated specialized functions:
-----------------------------------------------------------------------------
(0) Alignment
    FAppend<char, HeapAllocator> ( 0)
(0) bool
    FAppend<char, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsNotNull ( 0)
(0) Bool
    FAppend<char, HeapAllocator> ( 0)
(1404) Box[]
(0) ByteSizeIEC
    FAppend<char, HeapAllocator> ( 0)
(0) ByteSizeSI
    FAppend<char, HeapAllocator> ( 0)
(0) ByteSizeUnits
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) Caching
    FAppend<char, HeapAllocator> ( 0)
(0) CallerInfo*
    FAppend<char, HeapAllocator> ( 0)
    FFormat ( 0)
(0) Case
    FAppend<char, HeapAllocator> ( 0)
(0) char16_t[]
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FIsLess ( 0)
(0) char32_t[]
(30237) char[]
    FAppend<char, HeapAllocator> (9688)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 8)
    FIsLess ( 0)
    FToString ( 2)
(0) ContainerOp
    FAppend<char, HeapAllocator> ( 0)
(0) CreateDefaults
    FAppend<char, HeapAllocator> ( 0)
(0) CreateIfNotExists
    FAppend<char, HeapAllocator> ( 0)
(0) CurrentData
    FAppend<char, HeapAllocator> ( 0)
(4) DateTime
    FFormat ( 0)
    FToLiteral ( 0)
(15) double
    FAppend<char, HeapAllocator> ( 2)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsLess ( 0)
    FIsNotNull ( 0)
    FToString ( 1)
(0) Exception*
(0) Exceptions
    FAppend<char, HeapAllocator> ( 0)
(0) Exceptions
    FAppend<char, HeapAllocator> ( 0)
(0) Exceptions
    FAppend<char, HeapAllocator> ( 0)
(0) File
    FAppend<char, HeapAllocator> ( 0)
    FFormat ( 0)
(0) FMTExceptions
    FAppend<char, HeapAllocator> ( 0)
(0) Inclusion
    FAppend<char, HeapAllocator> ( 0)
(0) Initialization
    FAppend<char, HeapAllocator> ( 0)
(39) Logger*
    FAppend<char, HeapAllocator> (13)
(662) long
    FAppend<char, HeapAllocator> (68)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsLess ( 0)
    FIsNotNull ( 0)
    FToString ( 1)
(0) long double
    FAppend<char, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsLess ( 0)
(0) OpCodes
    FAppend<char, HeapAllocator> ( 0)
(144) pair<Verbosity, Priority>
    FAppend<char, HeapAllocator> (48)
(0) Path*
    FAppend<char, HeapAllocator> ( 0)
(0) Permissions
(0) Phase
    FAppend<char, HeapAllocator> ( 0)
(36) Priority
    FAppend<char, HeapAllocator> (12)
(0) Propagation
    FAppend<char, HeapAllocator> ( 0)
(0) Qualities
    FAppend<char, HeapAllocator> ( 0)
(0) Qualities3Letters
    FAppend<char, HeapAllocator> ( 0)
(0) Reach
    FAppend<char, HeapAllocator> ( 0)
(23) reference_wrapper<TAString<char, HeapAllocator>>
    FAppend<char, HeapAllocator> ( 8)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) reference_wrapper<TAString<char16_t, HeapAllocator>>
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) reference_wrapper<TAString<wchar_t, HeapAllocator>>
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) Responsibility
    FAppend<char, HeapAllocator> ( 0)
(0) Safeness
    FAppend<char, HeapAllocator> ( 0)
(734) Scope
    FAppend<char, HeapAllocator> (256)
(0) Side
    FAppend<char, HeapAllocator> ( 0)
(0) SortOrder
    FAppend<char, HeapAllocator> ( 0)
(0) SourceData
    FAppend<char, HeapAllocator> ( 0)
(0) StringTree<TMonoAllocator<HeapAllocator>, Entry, ConfigNodeHandler, (Recycling)1>::TCursor<true>
(0) Switch
    FAppend<char, HeapAllocator> ( 0)
(0) SystemErrors
    FAppend<char, HeapAllocator> ( 0)
(0) SystemExceptions
    FAppend<char, HeapAllocator> ( 0)
(21) TBoxes<HeapAllocator>*
(128) TBoxes<TMonoAllocator<HeapAllocator>>*
(0) Ticks
(0) TimePointBase<steady_clock, Ticks>::Duration
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) TimePointBase<system_clock, DateTime>::Duration
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FToLiteral ( 0)
(0) Timezone
    FAppend<char, HeapAllocator> ( 0)
(0) Timing
    FAppend<char, HeapAllocator> ( 0)
(0) Token*
    FAppend<char, HeapAllocator> ( 0)
(0) type_info*
    FAppend<char, HeapAllocator> ( 0)
(0) TypeNames1Letter
    FAppend<char, HeapAllocator> ( 0)
(0) TypeNames2Letters
    FAppend<char, HeapAllocator> ( 0)
(0) TypeNames3Letters
    FAppend<char, HeapAllocator> ( 0)
(120) Types
(0) Types
    FAppend<char, HeapAllocator> ( 0)
(525) unsigned long
    FAppend<char, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsLess ( 0)
    FIsNotNull ( 0)
(0) ValueReference
    FAppend<char, HeapAllocator> ( 0)
(0) Variable
    FAppend<char, HeapAllocator> ( 0)
(2627) Verbosity
    FAppend<char, HeapAllocator> (1309)
(0) void*
(406) wchar_t
    FAppend<char, HeapAllocator> (18)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FHashcode ( 0)
    FIsLess ( 0)
    FIsNotNull ( 0)
(0) wchar_t[]
    FAppend<char, HeapAllocator> ( 0)
    FAppend<char16_t, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
    FEquals ( 0)
    FIsLess ( 0)
(0) Whitespaces
    FAppend<char, HeapAllocator> ( 0)

Mapped types with dynamic VTables and their associated specialized functions:
-----------------------------------------------------------------------------
(0) Algorithm
    FAppend<char, HeapAllocator> ( 0)
(18) AppendLog*
    FAppend<char, HeapAllocator> ( 6)
(1) BigClass*
(2) double[]
(0) id
    FAppend<char, HeapAllocator> ( 0)
(1) int [3][]
(13) int[]
(5) long[]
    FToString ( 1)
(4) MyBase
(0) PHTypes
    FAppend<char, HeapAllocator> ( 0)
(16) reference_wrapper<basic_string<char, char_traits<char>, allocator<char>> >
    FAppend<char, HeapAllocator> ( 6)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) reference_wrapper<basic_string<char16_t, char_traits<char16_t>, allocator<char16_t>> >
    FAppend<char, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(0) reference_wrapper<basic_string<wchar_t, char_traits<wchar_t>, allocator<wchar_t>> >
    FAppend<char, HeapAllocator> ( 0)
    FAppend<wchar_t, HeapAllocator> ( 0)
(2) SmallClass
(0) State
    FAppend<char, HeapAllocator> ( 0)
(0) TBoxes<TPoolAllocator<TMonoAllocator<HeapAllocator>, 8ul>>*
(0) Thread*
    FAppend<char, HeapAllocator> ( 0)
(1) vector<int, allocator<int>>*
(1) WrappedFloat

Known Function Declarators And Usage Of Default Implementation:
-----------------------------------------------------------------------------
FAppend<char, HeapAllocator> ( 0)
FAppend<char16_t, HeapAllocator> ( 0)
FAppend<wchar_t, HeapAllocator> ( 0)
FClone ( 0)
FEquals ( 3)
FFormat (No default implementation)
FHashcode ( 0)
FIsLess ( 0)
FIsNotNull (121)
FIsTrue ( 0)
FToLiteral (No default implementation)
FToString ( 2)
Note that the output includes types that we have used during this tutorial. This is due to the fact that the unit tests that produce this manual's output all run in one process and are run in the order of the chapters.
The reason why the effort of implementing this library is needed is the C++ language design principle to be as performant and close to the hardware as possible. Other programming languages are designed for other goals. For example in languages Java or C#, the principle "everything is an object" is (almost) implemented. In these languages, all instances (!) of class types have run-time type information attached. In C++, only virtual classes have that.
And what happens in Java and C# when a plain, fundamental type is passed to a method that expects an object? The corresponding compiler performs "auto-boxing" of the values to pre-defined class types like Char, Integer or Double!
ALib Boxing allows very similar things in C++. Therefore, a quick analysis of the memory and performance impact is indicated. We do this in a rather loose order:
Due to the C++ language history, there is some confusion and misinformation about run-time type information (RTTI), especially among programmers with long C++ experience (because they probably went through the painful discussions of older days, which freshmen did not).
Therefore quickly some facts:
- For each type that keyword typeid is used on, the footprint of an executable increases by the size of the corresponding std::type_info struct that the linker has to place in the data segment for that type.
- The run-time cost of keyword typeid is negligible. It is constant time; in Big O notation it is O(1). Keyword typeid just reads the pointer to a global struct residing in the data segment of an executable.

For each mapped type, a singleton of type detail::VTable is created once.
This again is negligible, even if no static vtable is declared for a mapped type. If it is, then the impact of using a mapped type is comparable to the use of C++ vtables which are created by the compiler and included by the linker for each virtual C++ class used.
Class Box contains two members: a pointer to the vtable singleton and the data union Placeholder, which consists of two "words". For example, on a standard 64-bit platform both a pointer and a word are 8 bytes wide, hence an instance of class Box on those platforms has a size of 24 bytes. In many use cases, boxes are created in "stack memory", which allocates and deallocates in zero time (yes, it's less than "O(1)", it is just nothing).
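The idea of "one singleton per mapped type, created once" can be sketched with standard C++ alone. The names MiniVTable and VTableOf below are hypothetical stand-ins chosen for this illustration; they are not ALib's detail::VTable and do not reflect how ALib creates its dynamic or static vtables.

```cpp
#include <typeinfo>

// Hypothetical stand-in for a vtable: one object per mapped type.
struct MiniVTable {
    const std::type_info& type;   // RTTI of the mapped type, obtained via typeid
};

// Returns the singleton for T. The function-local static is constructed on
// first use only; every later call just returns a reference to the same object.
template<typename T>
const MiniVTable& VTableOf() {
    static const MiniVTable instance{ typeid(T) };
    return instance;
}
```

Each instantiation of VTableOf owns exactly one object, so its memory cost is paid once per type, independent of how many values of that type are boxed.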
Once created, to pass them to another function or store them in a container like TBoxes, these 24 bytes have to be copied.
While this is three times more than copying just a pointer, it might be much less effort in cases where composite types automatically become boxed as pointers. If those were passed, for example, by value as variadic template parameters, a deep copy of the argument value would have to be performed. With ALib Boxing, it is always only the 24 bytes.
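The size arithmetic above can be checked with a small layout sketch. MiniBox and MiniPlaceholder are hypothetical simplifications written for this illustration (ALib's real union Placeholder has more members); the static assertions assume a mainstream platform where a pointer and std::uintptr_t have the same size.

```cpp
#include <cstdint>
#include <type_traits>

struct MiniVTable;                      // hypothetical; details are not needed here

// Two machine words that hold either a small value or a pointer (plus a length).
union MiniPlaceholder {
    std::uintptr_t words[2];
    const void*    pointer;
};

struct MiniBox {
    const MiniVTable* vtable;           // identifies the mapped type
    MiniPlaceholder   data;             // the boxed value or a pointer to it
};

// Three words in total: 24 bytes on a 64-bit platform, 12 bytes on a 32-bit one.
static_assert(sizeof(MiniPlaceholder) == 2 * sizeof(void*), "placeholder is two words");
static_assert(sizeof(MiniBox) == 3 * sizeof(void*), "box is three words");

// Copying a box is a plain memory copy and destruction is a no-op.
static_assert(std::is_trivially_copyable<MiniBox>::value, "box copies bitwise");
static_assert(std::is_trivially_destructible<MiniBox>::value, "box needs no destructor");
```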
When a value is boxed, hence an object of class Box is created, two things have to be done. First the right vtable is identified. This is done using (inlined) TMP code and "magically" this is reduced to the inlined retrieval of a singleton.
This rather tricky procedure is very fast after it was done once for a type, but still the code needed to be inlined might be rather huge. This overhead can be optimized using static vtables. With such optimization, the effort is reduced to a single copy operation of a pointer to a data structure residing in the global data segment of an executable.
Secondly, the Placeholder found with member Box::data has to be set. Again, this is mostly inlined TMP code which, when compiled, should in most cases result in one or two simple copy operations of pointers or fundamental C++ values.
Because class Box declares no destructor, and neither the embedded union Placeholder nor its members have one, no work is performed when boxes are destructed.
Template method Box::IsType compares the internal pointer to the singleton vtable with the singleton that would be chosen if the given type (the template parameter) was boxed. Therefore, the impact is the same as boxing a value, minus the process of boxing data, plus a pointer comparison.
Again, if optimized vtables are used for the mapped type resulting from the guessed type, method IsType is compiled to one simple inlined pointer comparison.
Template methods Box::IsArray and Box::IsArrayOf have to perform an additional check for a void box, and then otherwise perform a similar pointer comparison.
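The two boxing steps and the type test described above can be put together in a minimal sketch. Class MiniBox and function VTableOf are hypothetical names for this illustration; the sketch only supports small, trivially copyable values and omits pointer-boxing, arrays and customization, so it is an analogy and not ALib's implementation.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <typeinfo>

struct MiniVTable { const std::type_info& type; };

// Step 1 of boxing: retrieve the per-type singleton.
template<typename T>
const MiniVTable* VTableOf() {
    static const MiniVTable instance{ typeid(T) };
    return &instance;
}

class MiniBox {
    const MiniVTable* vtable  = nullptr;   // nullptr: nothing boxed
    std::uintptr_t    data[2] = {0, 0};    // two-word placeholder

  public:
    MiniBox() = default;

    // Step 2 of boxing: copy the value into the placeholder.
    template<typename T>
    MiniBox(const T& value) : vtable(VTableOf<T>()) {
        static_assert(sizeof(T) <= sizeof(data), "sketch supports small values only");
        static_assert(std::is_trivially_copyable<T>::value, "sketch supports trivial types only");
        std::memcpy(data, &value, sizeof(T));
    }

    // Type test: one pointer comparison against the singleton that T would use.
    template<typename T>
    bool IsType() const { return vtable == VTableOf<T>(); }

    // Unboxing without any check; the caller is expected to test IsType first.
    template<typename T>
    T Unbox() const { T value; std::memcpy(&value, data, sizeof(T)); return value; }
};
```

A caller would write `MiniBox box(42);` and then guard the unboxing with `box.IsType<int>()` before calling `box.Unbox<int>()`.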
Template method Box::GetFunction performs a lookup of the function in struct detail::FunctionTable that is embedded in the vtable member of each box. This struct has simple pointer "slots" for each built-in function which are selected using template specializations of the corresponding access functions.
For custom box-functions, a global hash table is used to search the function implementation using a pair of a function table pointer and the function type as the key value.
As a result, a lookup of a built-in function is performed in O(1), while one for a custom function is slower and is O(1) only in the average case.
If parameter searchScope of method Box::GetFunction equals Reach::Global, then in case of not finding a specific implementation, the search is repeated using namespace object detail::DEFAULT_FUNCTIONS.
Finally, template method Box::Call uses GetFunction and then just passes any given parameters to a C++ function call. Parameters are passed using C++ 11 "perfect forwarding". In the case that no interface method is found, a default value of the return type TReturn is created. Depending on the type, this might invoke a default constructor.
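The lookup of custom box-functions, the fallback to default implementations and the default-constructed return value can be sketched as follows. All names (Registry, FunctionTable, FDescribe, DescribeInt, DescribeDefault) are hypothetical and chosen for this illustration only; the sketch uses a std::unordered_map keyed by a pair of table pointer and function type, which mirrors the description above but is not ALib's code.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>
#include <utility>

struct FunctionTable {};                         // hypothetical per-type table (identity only)
using Key         = std::pair<const FunctionTable*, std::type_index>;
using AnyFunction = void (*)();                  // type-erased function pointer

struct KeyHash {
    std::size_t operator()(const Key& key) const {
        return std::hash<const FunctionTable*>()(key.first) ^ (key.second.hash_code() << 1);
    }
};

class Registry {
    std::unordered_map<Key, AnyFunction, KeyHash> custom;
    FunctionTable defaults;                      // stands in for the table of default functions

  public:
    template<typename TFDecl>
    void Register(const FunctionTable* table, typename TFDecl::Signature fn) {
        custom[Key(table, std::type_index(typeid(TFDecl)))] = reinterpret_cast<AnyFunction>(fn);
    }
    template<typename TFDecl>
    void RegisterDefault(typename TFDecl::Signature fn) { Register<TFDecl>(&defaults, fn); }

    // Average-case O(1) hash lookup; optionally repeated with the defaults table.
    template<typename TFDecl>
    typename TFDecl::Signature Get(const FunctionTable* table, bool searchDefaults) const {
        auto it = custom.find(Key(table, std::type_index(typeid(TFDecl))));
        if (it == custom.end() && searchDefaults)
            it = custom.find(Key(&defaults, std::type_index(typeid(TFDecl))));
        if (it == custom.end()) return nullptr;
        return reinterpret_cast<typename TFDecl::Signature>(it->second);
    }

    // Forwards the arguments; returns a default-constructed value if nothing is found.
    template<typename TFDecl, typename... TArgs>
    typename TFDecl::Return Call(const FunctionTable* table, TArgs&&... args) const {
        auto fn = Get<TFDecl>(table, true);
        if (fn == nullptr) return typename TFDecl::Return();
        return fn(std::forward<TArgs>(args)...);
    }
};

// A hypothetical function declarator and two implementations.
struct FDescribe { using Return = std::string; using Signature = Return (*)(int); };
inline std::string DescribeInt(int value)  { return "int:" + std::to_string(value); }
inline std::string DescribeDefault(int)    { return "default"; }
```

The declarator type serves only as a key and as a carrier of the signature, which is why the stored pointer can be cast back to its original type before it is called.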
Due to the use of type-traits and TMP selected methods with rather complicated type expressions that the compiler has to evaluate, the time to compile a code unit increases with the use of ALib Boxing.
Unfortunately, this increase can be considerable.
We consider the implementation of ALib Boxing to be as performant as possible.
It is hard or impossible to compare the impact on code size and performance between techniques like C++ variadic template arguments and the invocation of methods that do auto-boxing, probably using class TBoxes to fetch variadic arguments.
In comparison to using C++ 17 type std::any, the most important advantage of ALib Boxing is that no heap memory allocations are performed, because class Box "switches" to pointer-boxing in the case a value does not fit into its placeholder. Conversely, when just fundamental types and small value classes are boxed, then std::any has an advantage in construction performance and memory footprint.
At the end of the day, the typical use cases of ALib Boxing do not impose high demands on performance anyhow. This manual chapter is provided for the sake of completeness and, furthermore, because the authors of the manual think that the previous considerations help to profoundly understand how ALib Boxing is implemented and therefore how it is to be used.
While the namespace documentation provides an extensive reference index (generated with marvelous Doxygen ), the following quick lists should help finding the information you need:
Method | Description |
---|---|
Box::IsType<void> | Tests for boxes that are default constructed or have a nullptr assigned, hence have no value boxed. |
Box::IsType<T> | Tests a box for containing a value of boxable type T. |
Box::IsArray | Returns true if a one-dimensional C++ array was boxed. |
Box::IsArrayOf<T> | Returns true , if IsArray returns true and if the boxed array element type corresponds to given type T. |
Box::IsPointer | Returns true , if the mapped type is of pointer type. |
Box::IsEnum | Returns true , if the box contains an enumeration element. |
Box::IsSameType<Box> | Non-template method that returns true if a box contains the same mapped type as a given one. |
Box::IsCharacter | Aggregation function that tests for mapped character types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_CHARACTERS. |
Box::IsSignedIntegral | Aggregation function that tests for mapped signed integral types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS. |
Box::IsUnsignedIntegral | Aggregation function that tests for mapped unsigned integral types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS. |
Box::IsFloatingPoint | Aggregation function that tests for mapped floating point types, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_FLOATS. |
Method | Description |
---|---|
Box::Unbox<T> | Unboxes non-array type T. |
Box::UnboxArray<T> | Unboxes the pointer to an array of element type T. |
Box::UnboxElement<T> | Unboxes an array's element of type T. |
Box::UnboxLength | Unboxes an array's length. |
Box::UnboxCharacter | Aggregation function that unboxes a wchar, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_CHARACTERS. |
Box::UnboxSignedIntegral | Aggregation function that unboxes an integer, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS. |
Box::UnboxUnsignedIntegral | Aggregation function that unboxes a uinteger, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_INTEGRALS. |
Box::UnboxFloatingPoint | Aggregation function that unboxes a value of type double, respecting compiler symbol ALIB_FEAT_BOXING_BIJECTIVE_FLOATS. |
Box::Data | Allows direct constant access to a box's placeholder. |
The following box-functions are predefined with the library:
Name | Description/Notes |
---|---|
FEquals | Logical comparison of the contents of two boxes. Specializations are given for all fundamental and character array types. Templated implementations for comparable types are given with FEquals::ComparableTypes. |
FIsLess | Logical comparison of the contents of two boxes. Specializations are given for all fundamental and character array types. Templated implementations for comparable types are given with FIsLess::ComparableTypes. |
FIsNotNull | See chapter 12.1.2 Nulled Boxes for more information. |
FClone | See chapter 12.6 Life-Cycle Considerations for more information. |
FIsTrue | Returns true if a boxed value is considered to represent value true; false otherwise. The default implementation returns true for array types with non-zero length and, for non-array types, if the used placeholder bytes do not all contain 0. No type-specific implementations are given. |
FHashcode | Calculates a hash-code using the boxed type information as well as the boxed data. A default implementation is given that takes all used placeholder bytes into account for types boxed as values or enums; the pointer address for types boxed as pointers and the array contents for boxed arrays. Furthermore specializations for all fundamental types are given by using static templated member FHashcode::UsePlaceholderBytes. For pointer types, the provision of a specialization that collects type-specific hashable data from the pointer may lead to advanced hashing results. |
FAppend | Appends a string representation of the contents of the box to a given AString. The default implementation writes the type name and a hexadecimal number in brackets behind for pointer types and similar information for other types. As type name information is available in debug-compilations only, in release code, the words "ValueType", "PointerType", "ArrayType" or "EnumType" are written instead. Hence, this indicates that a missing specialization is in fact an error and the default implementation is rather given for convenience and testing purposes. Specializations are given for fundamental and character array types. |
Method | Description |
---|---|
Box::Call | Calls a box-function. |
Box::CallDirect | Calls box-function previously received with GetFunction. |
Box::GetFunction | Returns a box-function's implementation. |
Box::Clone | Implicitly calls box-function FClone. |
Box::Hashcode | Implicitly calls box-function FHashcode. |
Box::IsNull | Implicitly calls box-function FIsNotNull. |
Box::IsNotNull | Implicitly calls box-function FIsNotNull. |
Box::operator bool() | Implicitly calls box-function FIsTrue. |
Box::operator== | Implicitly calls box-function FEquals. |
Box::operator!= | Implicitly calls box-function FEquals. |
Box::operator< | Implicitly calls box-function FIsLess. |
Box::operator<= | Implicitly calls box-functions FIsLess and FEquals. |
Box::operator> | Implicitly calls box-functions FIsLess and FEquals. |
Box::operator>= | Implicitly calls box-function FIsLess. |
Method | Description |
---|---|
Box::TypeID | Returns the typeid of a mapped type. |
Box::ElementTypeID | Returns the typeid of a boxed array's element type. |
Box::GetPlaceholderUsageLength | Returns the number of bytes used in the placeholder. Useful to write generic code, e.g., to implement default versions of box-functions. |
Method | Description |
---|---|
DbgBoxing | Static tool class to create human-readable information about the configuration of ALib Boxing. |
Box::DbgGetVTable | Returns the vtable singleton of a box. |
VTable::Functions | Has a set of fields whose names are prefixed "DbgCntInvocations" and which provide the number of invocations of the corresponding built-in box-function. Likewise, method DbgBoxing::GetSpecificFunctionTypes returns the usage number with each registered custom box-function. |
VTable::DbgProduction | Denotes if a vtable singleton was dynamically created or is an optimized static object. |
VTable::DbgCntUsage | A usage counter for the mapped type. The counter is increased with the invocation of various unboxing methods and when a box-function invocation is performed. |
With default compilations, the following non-bijective boxing rules apply:
Source Type | Mapped Type | Unboxing/Comments |
---|---|---|
References and values of composite types (structs and classes) that either do not fit into union Placeholder or that are not copy-constructible or not trivially destructible. | Pointers to corresponding composite types | Only the pointer type can be unboxed. |
Pointers to objects of composite types (structs and classes) that do fit into union Placeholder and that are copy-constructible and trivially destructible. | Values of corresponding composite types | Only the value type can be unboxed. |
Signed integral types of any size | integer | Only type integer can be unboxed. |
Unsigned integral types of any size | uinteger | Only type uinteger can be unboxed. |
float | double | Only type double can be unboxed. |
char , wchar_t , char16_t and char32_t | wchar | Only type wchar can be unboxed. |
Const pointers to any of the three character types nchar, wchar or xchar; string literals; char[]; std::string; std::string_view; std::vector<char>; ALib string types | Array of corresponding character type | "Lightweight" string types like std::string_view or String can be unboxed, "heavyweight" string types like AString cannot. |
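The rules for fundamental types in the table above can be expressed as a compile-time type mapping. The following sketch is an illustration only: alias MappedType and constant IsCharacter are hypothetical names, and std::intptr_t and std::uintptr_t are used as assumed stand-ins for ALib's types integer and uinteger. Composite types, pointers and strings are not covered.

```cpp
#include <cstdint>
#include <type_traits>

// True for the four character types that the table maps to one common type.
template<typename T>
constexpr bool IsCharacter =    std::is_same_v<T, char>     || std::is_same_v<T, wchar_t>
                             || std::is_same_v<T, char16_t> || std::is_same_v<T, char32_t>;

// Non-bijective mapping: several source types collapse into one mapped type.
template<typename T>
using MappedType =
    std::conditional_t<IsCharacter<T>,                                wchar_t,
    std::conditional_t<std::is_same_v<T, bool>,                       bool,
    std::conditional_t<std::is_same_v<T, float>,                      double,
    std::conditional_t<std::is_integral_v<T> && std::is_signed_v<T>,  std::intptr_t,
    std::conditional_t<std::is_integral_v<T>,                         std::uintptr_t,
                       T>>>>>;

static_assert(std::is_same_v<MappedType<short>,          std::intptr_t>);
static_assert(std::is_same_v<MappedType<unsigned short>, std::uintptr_t>);
static_assert(std::is_same_v<MappedType<float>,          double>);
static_assert(std::is_same_v<MappedType<char16_t>,       wchar_t>);
```

Because the mapping is not reversible, a value boxed from short can only be unboxed as the mapped type, which is what the column "Unboxing/Comments" states.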
In the source tree of the ALib C++ Library, folder alib/compatibility is found. Within that, a few header files are placed which are not included by other library headers, but instead may optionally be included by using code.
The directory aggregates headers imposed by different ALib Modules, targeting different 3rd-party libraries (in this respect we consider also the C++ standard library as such, as its use is optional).
The naming scheme of the header files is: "libname_modulename_something.hpp". For example, you will find header std_boxing_functional which applies to C++ standard library, this module and the area of "functionals".
There is no further documentation given in this user manual. However, in the reference documentation of this module, which is found with the documentation of namespace alib::boxing, inner namespace alib::boxing::compatibility exists, which aggregates some of the customization content.
To achieve this, the documentation even sometimes "fakes" entities into this namespace, that technically must not be there - and in reality therefore are not there.
As a sample, take functor struct alib::boxing::compatibility::std::hash<alib::boxing::Box>. While the documentation claims it to be in that deep namespace, it is a specialization of struct std::hash and therefore "in reality" is and has to be made in namespace std. The reference documentation of all "faked", moved entities will individually hint to this fact.
As noted in chapter 12.5.2 Header Inclusion, compatibility headers provided for module ALib Characters, have to be included before those provided for ALib Boxing.
The headers found should give a good guidance for implementing custom ones as needed. Please feel free to send us your implementations for inclusion in this library. But please do this only together with a due approval that those contributions are allowed to be published by us under the ALib License Terms.
Quite often in this Programmer's Manual, certain "design decisions" were mentioned, along with the claim that those are "justifiable" with the typical use-case scenarios of module ALib Boxing.
The following presentation of sample use cases now intends to give such justification. For example, it will be shown that:
All use cases are taken from other ALib Modules, which depend on module ALib Boxing.
The format types of module ALib BaseCamp are more than a use-case. In fact they were the whole reason and motivation of creating ALib Boxing!
That module implements the well known "printf paradigm", which is available in standard libraries of various programming languages. A printf-like function interface is used to create a string representation of an arbitrary amount of arguments of arbitrary type. To do so, a "format string" that contains one placeholder for each provided argument is passed along with the arbitrary arguments. The placeholders within the format string follow a certain syntax which allows various output modifications, like number formats, horizontal alignment, etc.
Module ALib BaseCamp provides abstract class Formatter which offers two overloaded versions of method Format: both accept a target AString as the first argument. The first accepts a reference to class TBoxes, while the second accepts variadic template parameters besides the target string. How the latter invokes the first with a few lines of inlined code is explained in chapter 11. Variadic Function Arguments and Class TBoxes.
So, where is the format string then found in this interface? Well, here is a first idiosyncrasy of this implementation: The format string is not expected as a separate string parameter but simply as the first of the arbitrary arguments. This approach has the following advantages:
All three advantages together make the format-interface given with module ALib BaseCamp unrivalled in respect to flexibility.
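The interface shape described above, a variadic overload that collects its arguments and hands them to a non-template worker which takes the first argument as the format string, can be sketched with standard C++. Everything in this sketch is an assumption made for illustration: std::variant replaces class Box, a std::vector replaces class TBoxes, the placeholder syntax is reduced to "{}", and the names Arg, Args, ToArg, ToText, FormatArgs and Format are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>
#include <vector>

using Arg  = std::variant<long long, double, std::string>;   // stand-in for a box
using Args = std::vector<Arg>;                                // stand-in for a list of boxes

// "Boxes" a value: integrals, floating point values and strings are supported.
template<typename T>
Arg ToArg(const T& value) {
    if constexpr (std::is_integral_v<T>)
        return Arg(std::in_place_type<long long>, static_cast<long long>(value));
    else if constexpr (std::is_floating_point_v<T>)
        return Arg(std::in_place_type<double>, static_cast<double>(value));
    else
        return Arg(std::in_place_type<std::string>, value);
}

inline std::string ToText(const Arg& arg) {
    if (const auto* i = std::get_if<long long>(&arg)) return std::to_string(*i);
    if (const auto* d = std::get_if<double>(&arg))    return std::to_string(*d);
    return std::get<std::string>(arg);
}

// Non-template worker: the first element is interpreted as the format string and
// each "{}" in it is replaced by the next argument.
inline void FormatArgs(std::string& target, const Args& args) {
    if (args.empty()) return;
    const std::string format = ToText(args[0]);
    std::size_t next = 1;
    for (std::size_t pos = 0; pos < format.size(); ++pos) {
        if (format.compare(pos, 2, "{}") == 0 && next < args.size()) {
            target += ToText(args[next++]);
            ++pos;                                   // skip the closing brace
        } else {
            target += format[pos];
        }
    }
}

// Variadic front end: a few inlined lines that collect the arguments.
template<typename... TArgs>
void Format(std::string& target, const TArgs&... args) {
    FormatArgs(target, Args{ ToArg(args)... });
}
```

A call such as `Format(out, "x={} y={}", 5, std::string("hi"));` appends "x=5 y=hi" to out. The two functions carry different names here on purpose, which sidesteps overload ambiguities between a variadic template and its list-taking counterpart.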
A next aspect that this use-case nicely shows is the exclusive use of class Box as function arguments. With this, no concerns of life-cycle management of the boxed data have to be taken into account. (We refer to those discussed in chapter 12.6 Life-Cycle Considerations). When arguments are passed and boxes are created implicitly on the stack, their life-cycle ends exactly when the function returns. This greatly justifies the design decision to "automatically" box pointers to objects in the case that given values do not fit into union Placeholder. If C++ 17 class std::any was used instead, unless the library documentation would demand its users to explicitly pass pointers, deep copies of "bigger" objects would be created. And this would be completely unnecessary overhead, because the formatters treat each argument as a constant (read-only) object.
One could argue that it is typical and thus rightful C++ style to use address operator& when passing objects, while in contrast this boxing approach hides away the pointerization. Our counter-argument is: A concept as implemented with std::any hides away the deep copy operation if just no pointer is passed. This is a negative impact on the performance, while the implicit pointerization is not!
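The hidden deep copy can be made visible with a copy counter. The following sketch uses only the C++ 17 standard library; struct BigObject and the two helper functions are hypothetical names introduced for this demonstration.

```cpp
#include <any>
#include <array>
#include <typeinfo>

// An object that is clearly too big for a two-word placeholder.
struct BigObject {
    std::array<char, 256> payload{};
    static inline int copies = 0;                 // counts copy constructions

    BigObject() = default;
    BigObject(const BigObject& other) : payload(other.payload) { ++copies; }
};

// Interfaces that accept a type-erased, read-only argument.
inline bool HoldsBigObject(const std::any& arg) {
    return arg.type() == typeid(BigObject);
}
inline bool HoldsBigObjectPointer(const std::any& arg) {
    return arg.type() == typeid(BigObject*);
}
```

Calling `HoldsBigObject(big)` constructs a temporary std::any that copies all 256 bytes, although the callee only reads the argument. Calling `HoldsBigObjectPointer(&big)` stores just the address, which is what the caller of a std::any-based interface has to remember to do explicitly.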
Finally, the use case implemented with module ALib BaseCamp shows nicely how ALib Boxing enables to offer a library that can be extended to serve custom types in a most flexible way. This is shown with the provision of box-function FFormat by that module. This allows introducing new placeholder syntax (!) for custom types, of course without touching the original source code of the module.
A sample of how a custom type can be featured with a custom placeholder syntax is given in the Programmer's Manual of that module with chapter 4.3. Formatting Custom Types.
Module ALib BaseCamp introduces class Exception, which is used in all ALib Modules as the throwable.
Class Exception stores a list of Message objects that may extend the exception object with new information while the call stack is unwound. Each message entry has an identifier that is implemented with field Message::Type. This field is of type Enum and is a very good sample for using this type. With that it became possible that every ALib Module (and likewise a using custom software) defines its own scoped enum type that enumerates all exceptions that the module (respectively custom software) may throw. As a result, an exception entry's type can contain enum elements of custom enum types transparently. A two-level hierarchy results from that. A usual catch handler consists of nested if-statements: The outer uses Enum::IsEnumType to test for the general exception type. The inner then uses Enum::operator== to test for a specific element of that exception type.
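The two-level test performed by such a handler can be sketched with a small type-erased enum holder. Class MiniEnum, the two enumerations and function Describe are hypothetical names for this illustration; the sketch stores a std::type_index next to the integral value and is an analogy, not ALib's class Enum.

```cpp
#include <string>
#include <type_traits>
#include <typeindex>
#include <typeinfo>

// Holds an element of any enum type together with its run-time type.
class MiniEnum {
    std::type_index type;
    long long       value;

  public:
    template<typename TEnum, typename = std::enable_if_t<std::is_enum<TEnum>::value>>
    MiniEnum(TEnum element)
    : type(typeid(TEnum)), value(static_cast<long long>(element)) {}

    // Outer level: does the stored element belong to enum type TEnum?
    template<typename TEnum>
    bool IsEnumType() const { return type == std::type_index(typeid(TEnum)); }

    // Inner level: is it exactly this element of that type?
    template<typename TEnum, typename = std::enable_if_t<std::is_enum<TEnum>::value>>
    bool operator==(TEnum element) const {
        return IsEnumType<TEnum>() && value == static_cast<long long>(element);
    }
};

// Two independent modules define their own exception enumerations.
enum class FileErrors  { NotFound = 1, AccessDenied = 2 };
enum class ParseErrors { UnexpectedToken = 1 };

// The nested if-statements of a typical handler.
inline std::string Describe(const MiniEnum& id) {
    if (id.IsEnumType<FileErrors>()) {
        if (id == FileErrors::NotFound)     return "file not found";
        if (id == FileErrors::AccessDenied) return "access denied";
        return "other file error";
    }
    if (id.IsEnumType<ParseErrors>()) return "parse error";
    return "unknown";
}
```

Note that FileErrors::NotFound and ParseErrors::UnexpectedToken share the underlying value 1 and are still told apart, because the enum type is part of the comparison.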
Each Message of an exception may store an arbitrary amount of arbitrary objects that provide further information about the entry, hence about the cause of the exception or about state information of the code that threw the exception.
For this, class Message inherits type TBoxes which is a container storing elements of type Box. The information stored can (has to) be interpreted in a custom way by corresponding implementations of the exception handlers. A recommendation for users of this ALib Module is to prepend a format string as the first element of this list. Such a format string should contain a placeholder for every provided message argument. Together, this allows an exception handler to easily create a human-readable text message from an exception entry, by just passing the TBoxes object to a Formatter, as discussed in the previous use-case chapter.
In contrast to the previous use case of text formatting, with Exception and its used Message object, the life-cycle management of the boxed message arguments is a quite critical issue. To resolve this, method TBoxes::CloneAll is used, which simply invokes TBoxes::CallAll<FClone> and hence clones all relevant data of values that do not fit into a box, into the internal mono allocator.
Code that throws an exception or, while handling one, appends a new message to an exception, has to ensure that either of the following is true for each boxed argument attached:
Class Exception provides - and is even allocated within (!) - an object of type MonoAllocator, which itself is allocated in its own first buffer of memory! If the first buffer is sufficient, then only one single dynamic memory allocation is performed for the creation of the exception, including the copies of all message arguments!
We said in appendix C.1 that formatting was the original motivation for creating module ALib Boxing. The truth is, module ALox was it, just as the whole library once started with the development of ALox.
Of course, ALox uses the formatting features of ALib BaseCamp and thus all that was said for this use case applies to ALox.
In the context of ALox, boxes are called "logables" because they are the input to the logger. Now, ALox has an option to define prefixes (objects that are prepended to each log entry) in various ways. They can appear globally, or only with log-messages that are placed in a certain scope. The scope can be a source code file, a function or method, or even a certain execution thread.
A particularly interesting thing is that if these prefixes are string objects (note that ALox also supports non-textual logging) these strings are copied when set as a prefix. The rationale for this is to allow the assembly of a local string object and pass this to ALox as a prefix logable. This is a pure convenience feature. However, in some rare cases a software might wish to set a mutable string object as prefix logable. In this case the string must not be copied, but rather stored as a pointer to the original string object that then might be modified by other code entities. To achieve this and bypass the string copy feature, the string object has to be wrapped in std::reference_wrapper.
Consequently, this is a sample use-case for what is explained in chapter 7.9 Bypass Custom Boxing With Identity-Boxing.
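The difference between the copying default and the std::reference_wrapper bypass can be sketched with two overloads. Class PrefixStore is a hypothetical name for this illustration and is unrelated to the ALox API; it only shows the copy-versus-reference semantics.

```cpp
#include <functional>
#include <string>

class PrefixStore {
    std::string        copy;                  // used when the prefix was copied
    const std::string* external = nullptr;    // used when by-reference storage was requested

  public:
    // Default: the string is copied, so a local string may safely be passed.
    void Set(const std::string& prefix) { copy = prefix; external = nullptr; }

    // Bypass: only the address is stored; later changes to the original are visible.
    void Set(std::reference_wrapper<std::string> prefix) { external = &prefix.get(); }

    const std::string& Get() const { return external ? *external : copy; }
};
```

With `store.Set(text)` a later change of text does not affect the stored prefix, while with `store.Set(std::ref(text))` it does. In the second case the caller is responsible for keeping the string alive as long as the prefix is in use.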
With C++, when overloading methods that use templated variadic arguments, quite quickly compile-time ambiguities occur: From a given set of arguments, the compiler can often not decide which of the overloaded versions to take, because two or more are matching the variadic portion. ALib Boxing solves this issue and allows ALox to offer a flexible API with many variants of overloaded methods that still accept variadic arguments. How this is done is explained in chapter 11.3 Advanced Usage of Class TBoxes.
The aim of module ALib Expressions is to provide an easy, yet powerful C++ library that allows run-time compilation of expressions. Expression syntax mimics and covers the whole set of C++ operators and like C++ is deemed to be type-safe during compilation (here: expression compilation performed at run-time!), while allowing custom intermediate and result types, processed by custom expression identifiers and functions. Expression strings are compiled to a "program", which is executed by a virtual machine (a simple stack machine provided with the module) to evaluate an expression result. Together with the program, the virtual machine is fed with an "expression scope" that provides access to custom data used by the program's identifiers and functions.
The use of ALib Boxing with this module, probably provides the most uncommon - but thus even more exciting - use case of ALib Boxing.
The full truth is, when planning that module, its authors did not expect how compelling and helpful the use of class Box would be for the implementation. Only during the development it became clear that the use of ALib Boxing simplifies almost every aspect of that library. And this is not only true for the library development itself, but also from the perspective of an "end-user" that incorporates that module into his own software.
What during development first seemed like a "misuse" of class Box (and was deemed to be replaced later) turned out not only to ease the library's use, but also to boost performance and minimize code size.
This at first considered "misuse" is documented with manual chapter 3.2 Type Definitions With "Sample Boxes". Note that that manual still talks of "lazy use" or even "mis-use". In fact this is not really true. The effect on code size and ease of use is tremendous and it was a thorough decision to keep this concept since the first released library version.
Let us try to generalize the use-case described in the manual section linked above: Class Box is used to transport type-information between a user's code and a library code. In contrast to exchanging this information using std::type_info references and C++ keyword typeid, along with the information about whether that type is an array type or not, references to simple "sample boxes" are exchanged. A user of module ALib Expressions this way is not bothered with things like non-bijective boxing, value or pointer boxing, array boxing and the rather uncommon C++ RTTI mechanics. All that a user needs to do is to assign a simple sample value to an object of type Box and pass a reference to this box around.
The reason why the other use cases presented above did not need such use is obvious: Only module ALib Expressions deals with run-time type information "officially". Its whole goal is to allow the compilation of expression strings defined by end-users at run-time. Expression strings that a user feeds into a compiled software might result in different types - at run-time! Now the code using the library has to tell the library for example which result type an expression is allowed to have. Another sample are user-defined expression functions that have a signature of arguments and the result value. Now, during the type-safe compilation of expression strings (at run-time), the compiler needs to be able to select the right overloaded functions. The signature of a custom expression function is defined by a simple list of sample boxes.
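The idea of describing a signature by sample values instead of explicit type information can be sketched as follows. The names Signature and MakeSignature are hypothetical; the sketch records a std::type_index per sample and ignores boxing rules, so it is a simplified analogy to the sample boxes of module ALib Expressions.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <vector>

using Signature = std::vector<std::type_index>;

// Builds a signature from sample values; only the types of the samples matter.
template<typename... TSamples>
Signature MakeSignature(const TSamples&...) {
    return Signature{ std::type_index(typeid(TSamples))... };
}

// Overload selection at run-time reduces to comparing two type lists.
inline bool Matches(const Signature& declared, const Signature& given) {
    return declared == given;
}
```

A function declared with `MakeSignature(0, 0.0, std::string())` would match a call site whose arguments have the types int, double and std::string, whatever their values are.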
The previous use-cases introduced in this appendix did not include a sample where a function or method returns a value of type Box. While the principle of doing so was presented in this manual very early already (see chapter 2.1 Tutorial: Boxing Values), it seems it is not too easy to find a good real-life use case for this. But module ALib Expressions has that.
The module allows defining custom expression functions. When the built-in virtual machine executes an expression program (aka evaluates an expression) such functions are called. As input arguments, the current boxes placed on the machine's stack are passed. Each function returns its result with a value, which replaces the input arguments on the stack. Consequently, all custom expression functions (which are also used to define custom unary or binary operators, auto-casts, etc) use boxes as input arguments and return a box.
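The calling convention described above, where a function receives the topmost stack values and its result replaces them, can be sketched with a tiny stack machine. std::variant is used here as an assumed stand-in for class Box, and the names Value, Function, Add and CallOnStack are hypothetical; the real virtual machine of the module is considerably richer.

```cpp
#include <cstddef>
#include <utility>
#include <variant>
#include <vector>

using Value    = std::variant<long, double>;                       // stand-in for a box
using Function = Value (*)(const Value* args, std::size_t count);  // boxes in, box out

inline double AsDouble(const Value& value) {
    if (const long* l = std::get_if<long>(&value)) return static_cast<double>(*l);
    return std::get<double>(value);
}

// A sample expression function: the result type depends on the argument types.
inline Value Add(const Value* args, std::size_t count) {
    if (count != 2) return Value(0L);
    const long* a = std::get_if<long>(&args[0]);
    const long* b = std::get_if<long>(&args[1]);
    if (a && b) return Value(*a + *b);
    return Value(AsDouble(args[0]) + AsDouble(args[1]));
}

// Passes the topmost argc values to fn and replaces them with the returned value.
inline void CallOnStack(std::vector<Value>& stack, Function fn, std::size_t argc) {
    if (argc > stack.size()) return;                 // malformed program: do nothing
    Value result = fn(stack.data() + (stack.size() - argc), argc);
    stack.erase(stack.end() - static_cast<std::ptrdiff_t>(argc), stack.end());
    stack.push_back(std::move(result));
}
```

Because arguments and result share one type-erased value type, the machine needs no knowledge about the concrete types that custom functions work with.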
Finally, another nice sample that module ALib Expressions demonstrates is in the area of box-functions. The module introduces the declaration FToLiteral. This is used by the expression compiler to generate an "optimized expression string". This may be wanted when a user passes an expression that can be optimized by the compiler to a shorter expression. While the optimization internally works and can be used, a software might want to present an expression string back to the user that - if compiled - directly resulted in the optimized expression program.
Details on that use-case are given in chapter 11.5 Optimizations of that module's Programmer's Manual, as well as in the reference documentation of the box-function declarator FToLiteral.