This ALib Module provides string formatting facilities by implementing an approach that is common to many programming languages and libraries. This approach offers an interface that includes the use of a "format string" containing placeholders. Besides this format string, a list of data values can be given, used to fill the placeholders.
Probably one of the best-known examples of such an interface is the printf function of the C language. A variation of this interface is found in almost any high-level, general-purpose programming language.
Of course, this module leverages module ALib Strings for all general string functions needed. Equally important is the use of module ALib Boxing, which brings type-safe variadic argument lists. With its feature of having "virtual functions" on boxed arguments, it also allows custom formatting syntax for placeholders of custom argument types.
While it is possible to implement a formatter providing a custom placeholder syntax, two very prominent ones are built in with the formatters FormatterPythonStyle and FormatterJavaStyle, the latter resembling the printf format string style.
Further good news is that, in its very basics, Python style is similar to .NET formatting. This way, there is some "familiar basic syntax" available for everybody who has used formatting in one of the languages C, C++, C#, Java or Python, or in languages that mimic one of these styles.
By leveraging module ALib Boxing, which implies the use of variadic template arguments, the invocation of the final format method is as simple as possible. The following sample shows a simple format operation with each of the two built-in formatters:
This produces the following result:
Values of or pointers to any type that is "boxable" may be passed as an argument to method Formatter::Format. The specific implementation of the formatter will match the "placeholder type" with the given argument type and format the argument according to the placeholder attributes.
In the sample above, two different formatters are created and each is used "properly", namely with its own syntax.
To increase flexibility, the formatters of this ALib Module provide two features:
With that information, the following code can be written:
In this code, the second formatter is created on the heap (using keyword new) and then stored in field Next of the first formatter. This field is of type SPFormatter, which is an alias for std::shared_ptr<Formatter>, hence a C++ standard "smart pointer" that deletes its contained object automatically with the deletion of the last referrer. Later chapters explain why managing ALib formatter instances in shared pointers is the preferred method.
The short sample code correctly produces the following output:
However, the placeholder syntax must not be mixed within one format string. Let's try:
The output is:
This is obviously not what we wanted, but it also did not produce an exception, and it even included the second argument, "Python", in the output. Exceptions are discussed only in a later chapter, but the reason that no exception is thrown here is easily explained: the first formatter in the chain, which we defined as type FormatterJavaStyle, identified the format string by reading "%s". It then "consumes" this string along with as many subsequent arguments as placeholders are found in the format string. This number is just one, as the placeholder "{}" is not recognized by this formatter.
The intermediate result consequently is "---Java---{}---", while the argument "Python" remains unprocessed. The next section explains what happens with this remaining argument.
We just continue with the sample of the previous section: the unprocessed argument "Python" is not dropped, as it would have been with most implementations of similar format functions in other libraries and programming languages. Instead, with ALib Text, the formatting process starts all over again using the remaining argument as the format string.
Now, as it is not a format string (it does not contain placeholders in any known syntax) it is just appended to the target string "as is".
In fact, for this last operation, neither of the two formatters became active. The trick here is that the abstract base class Formatter already implements method Format. This implementation loops over all arguments. It checks if the current first argument is a string recognized as a format string by any of the chained formatters. If it is not, this argument is just appended to the target string and the loop continues with the next argument in the list.
If a format string is identified, control is passed to the corresponding formatter that consumes as many further arguments as placeholders are found in that format string, and then passes control back to the main loop.
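The loop described above can be sketched without ALib in a few lines of standard C++. The type SketchFormatter, its members and the functions formatAll and makeJavaThenPython are hypothetical names chosen for this illustration only. Real formatters parse a full placeholder syntax, whereas this sketch merely recognizes the bare placeholders "%s" and "{}" and accepts string arguments only.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for one formatter in a chain (not ALib's Formatter).
struct SketchFormatter {
    std::string placeholder;                // "%s" or "{}" in this sketch
    std::shared_ptr<SketchFormatter> next;  // next formatter in the chain

    // A string counts as a format string if it contains this placeholder.
    bool recognizes(const std::string& s) const {
        return s.find(placeholder) != std::string::npos;
    }

    // Replaces each placeholder with the next argument and advances `idx`
    // past every argument that was consumed.
    void format(std::string& target, const std::string& fmt,
                const std::vector<std::string>& args, std::size_t& idx) const {
        std::size_t pos = 0;
        for (;;) {
            const std::size_t hit = fmt.find(placeholder, pos);
            if (hit == std::string::npos || idx >= args.size()) break;
            target.append(fmt, pos, hit - pos);
            target += args[idx++];
            pos = hit + placeholder.size();
        }
        target.append(fmt, pos, std::string::npos);  // rest of the format string
    }
};

// Main loop: each argument is either a format string recognized by one of
// the chained formatters, or it is appended to the target "as is".
std::string formatAll(const SketchFormatter& first,
                      const std::vector<std::string>& args) {
    std::string target;
    std::size_t idx = 0;
    while (idx < args.size()) {
        const std::string& current = args[idx++];
        const SketchFormatter* f = &first;
        while (f != nullptr && !f->recognizes(current)) f = f->next.get();
        if (f != nullptr) f->format(target, current, args, idx);
        else              target += current;
    }
    return target;
}

// Builds a chain: "%s" formatter first, "{}" formatter attached as next.
SketchFormatter makeJavaThenPython() {
    return SketchFormatter{
        "%s", std::make_shared<SketchFormatter>(SketchFormatter{"{}", nullptr})};
}
```

Called with the chain from makeJavaThenPython() and the arguments "---%s---{}---", "Java", "Python", the sketch yields "---Java---{}---Python", which mirrors the behavior discussed above.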
Consequently, this approach allows invoking Format without even a format string:
which probably does not make any sense, because the same result could have been achieved much more efficiently by stating:
Even the following sample still might not make too much sense to a reader:
because the usual way to achieve the same result would be to have only one format string with two arguments, instead of two format strings with one argument each:
So, why is this loop implemented with its auto-detection and the option of having more than one format string? A sound rationale for the loop is given in the next section.
Method Format collects the variadic template arguments into an internally allocated container of type Boxes. This container is then passed to the internal format loop.
Alternatively, the collection of format arguments in a container object may be performed "manually" by the user code. In this case, the formatter is invoked with one of the two overloaded methods Formatter::FormatArgs. Both methods must be invoked only after an explicit call to Formatter::Acquire. One of the two methods does not accept an external container and instead operates on the internally allocated instance which method Acquire returns. The second method allows passing an arbitrary external container instance, for example one of derived type Message.
With this knowledge it becomes obvious that the collection of formatting arguments can be "decoupled" from the invocation of the formatter. Note that the argument list may include zero, one or even multiple format strings, each of which is followed by its corresponding placeholder values:
Readers may judge for themselves if and when this might become useful. It should be noted that the interface of logging module ALox builds on the same mechanism. The arguments there are called "logables" and might be format strings or anything else. Therefore, also with ALox the collection of log entry data can be decoupled from the final creation of the log entry. This is especially useful for complex log entries whose arguments are collected during the execution of an algorithm and, for example, are only logged in case of an exception or other unexpected conditions.
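The idea of collecting arguments first and formatting them later can be illustrated with a small standalone sketch. It does not use ALib: the names Arg, ArgList, processItems and render are hypothetical, the argument container is a plain std::vector of std::variant, and only the bare placeholder "{}" is supported.

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

using Arg     = std::variant<std::string, long>;
using ArgList = std::vector<Arg>;  // hypothetical stand-in for an argument container

std::string toText(const Arg& a) {
    if (const auto* s = std::get_if<std::string>(&a)) return *s;
    return std::to_string(std::get<long>(a));
}

// Collects diagnostic arguments while "working". Nothing is formatted here.
// Negative items are treated as a failure (an assumption of this sketch).
ArgList processItems(const std::vector<long>& items, bool& failed) {
    ArgList collected;
    failed = false;
    for (std::size_t i = 0; i < items.size(); ++i) {
        collected.emplace_back(std::string("item {}={}; "));  // a format string
        collected.emplace_back(static_cast<long>(i));         // first value
        collected.emplace_back(items[i]);                     // second value
        if (items[i] < 0) { failed = true; break; }
    }
    return collected;
}

// Formats a collected list: every argument is taken as text, and each "{}"
// found in it consumes one further argument, if any is left.
std::string render(const ArgList& args) {
    std::string target;
    std::size_t idx = 0;
    while (idx < args.size()) {
        const std::string text = toText(args[idx++]);
        std::size_t pos = 0;
        std::size_t hit = 0;
        while ((hit = text.find("{}", pos)) != std::string::npos
               && idx < args.size()) {
            target.append(text, pos, hit - pos);
            target += toText(args[idx++]);
            pos = hit + 2;
        }
        target.append(text, pos, std::string::npos);
    }
    return target;
}
```

A caller would invoke render only if processItems reports a failure, so the cost of formatting is paid only when the collected data is actually needed.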
In the previous samples, a local instance of a formatter (or two) has been created. For general purpose use, this module provides a global pair of (concatenated) formatters which can be retrieved with static methods Formatter::GetDefault and Formatter::AcquireDefault.
The formatter returned is embedded in "smart pointer type" SPFormatter. During bootstrapping of the library, a formatter of type FormatterPythonStyle is created with a concatenated object of FormatterJavaStyle.
One obvious rationale for the provision of these default formatters is of course to save memory and processing resources by reusing the formatter instances in different parts of an application. However, probably more important in most cases is the fact that this way, the same default configuration is used with all formatting operations. For example, if the decimal point character of floating point numbers should default to something other than the US/English standard '.', then such a setting can be made once during bootstrap of the library for all usages across a process.
Formatter implementations may or may not provide default settings, which for example influence format operations whose placeholders are minimal and omit optional formatting flags. The built-in formatters do have such default settings.
If a software unit wishes to change some settings, the advised approach is as follows:
With this procedure, any changes that an application applied to the default formatters (e.g. during bootstrap) will remain valid in the cloned formatters in addition to the "local changes", while the default formatters remain untouched.
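The effect of this procedure can be illustrated with a standalone sketch that does not use ALib. The types SketchNumberFormat and SketchFormatter, the method names clone and formatFixed, and the function defaultFormatter are all hypothetical and only model the idea of a shared default instance that is cloned before local settings are changed.

```cpp
#include <cstdio>
#include <memory>
#include <string>

// Hypothetical settings object; the real library offers far more attributes.
struct SketchNumberFormat {
    char decimalPoint   = '.';
    int  fractionDigits = 2;
};

struct SketchFormatter {
    SketchNumberFormat defaultNumberFormat;

    // Creates an independent copy that inherits all current settings.
    std::shared_ptr<SketchFormatter> clone() const {
        return std::make_shared<SketchFormatter>(*this);
    }

    // Writes a floating point value using the current default settings.
    std::string formatFixed(double value) const {
        char buf[64];
        std::snprintf(buf, sizeof buf, "%.*f",
                      defaultNumberFormat.fractionDigits, value);
        std::string text(buf);
        for (char& c : text)
            if (c == '.') c = defaultNumberFormat.decimalPoint;
        return text;
    }
};

// Process-wide default instance, shared by all parts of an application.
std::shared_ptr<SketchFormatter>& defaultFormatter() {
    static std::shared_ptr<SketchFormatter> instance =
        std::make_shared<SketchFormatter>();
    return instance;
}
```

If an application sets the decimal point of the default instance to ',' during its bootstrap, a clone taken later starts with that setting. Changing fractionDigits on the clone then affects only the clone, while the default instance keeps its own value.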
For example, the built-in formatters provide fields DefaultNumberFormat and AlternativeNumberFormat, which reflect some default behavior of their formatting syntax.
The attributes of these members might be modified to change those defaults. While this leads to a deviation from the formatting standard, it may be used instead of providing corresponding syntactic information within the placeholder field of each and every format string. Some modifications may not even be possible with the given format specification syntax.
The simple samples shown so far used correct format strings. In case of erroneous format strings, the built-in formatters will throw an ALib Exception defined with enumeration aworx::lib::text::Exceptions.
While in the case of "hard-coded" format strings such exceptions need not be caught, their evaluation (with debug builds) might be very helpful for identifying what is wrong. Of course, when format strings are not hard-coded but instead can be provided by the users of a software (for example in configuration files or command line parameters), a try/catch block around formatting invocations is mandatory, also in release compilations.
The following sample shows how an exception can be caught and its description may be written to the standard output:
The output of running this code is:
In most cases, the exception's description contains a detailed text message, including a copy of the format string and a "caret" symbol '^' that points to the parsing error in the string.
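The general pattern of guarding user-provided format strings, including a caret that marks the error position, is sketched below in standard C++. The sketch does not use ALib exceptions: it throws std::runtime_error, supports only a tiny "{}" syntax, and the function names checkBraces and describeFormatError as well as the message text are invented for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical checker for a tiny "{}"-only syntax.
// Throws on a brace that is neither part of "{}" nor doubled ("{{", "}}").
void checkBraces(const std::string& fmt) {
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        const char c = fmt[i];
        if (c != '{' && c != '}') continue;
        const bool hasNext = i + 1 < fmt.size();
        if (hasNext && fmt[i + 1] == c) { ++i; continue; }                // "{{" or "}}"
        if (c == '{' && hasNext && fmt[i + 1] == '}') { ++i; continue; }  // "{}"
        // Message: headline, a copy of the format string, and a caret line.
        throw std::runtime_error("Unmatched brace in format string:\n  " + fmt
                                 + "\n  " + std::string(i, ' ') + "^");
    }
}

// Returns an empty string if `userFormat` is fine, otherwise the description
// carried by the exception.
std::string describeFormatError(const std::string& userFormat) {
    try {
        checkBraces(userFormat);
        return "";
    } catch (const std::runtime_error& e) {
        return e.what();
    }
}
```

The try/catch block is what matters here: whenever the format string originates from a configuration file or the command line, the caller can report the description instead of terminating.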
Escape characters, like for example "\t", "\n" or "\\", might be given with either one or two backslashes. The formatters will convert them to the corresponding ASCII code, if the backslash itself is escaped.
Class FormatterPythonStyle recognizes double curly braces "{{" and "}}" and converts them to a single brace. Similar to this, class FormatterJavaStyle recognizes "%%" and converts it to a single percentage symbol.
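The conversion of doubled characters can be sketched as follows. This is a simplified standalone illustration with hypothetical function names. It handles only the doubled characters and ignores placeholders, whereas a real formatter resolves such escapes while parsing the complete format string.

```cpp
#include <cstddef>
#include <string>

// Collapses every doubled `marker` character into a single one,
// for example "%%" into "%".
std::string collapseDoubled(const std::string& text, char marker) {
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        out += text[i];
        // Skip the second character of a doubled marker.
        if (text[i] == marker && i + 1 < text.size() && text[i + 1] == marker)
            ++i;
    }
    return out;
}

// "{{" becomes "{" and "}}" becomes "}".
std::string unescapePythonStyle(const std::string& text) {
    return collapseDoubled(collapseDoubled(text, '{'), '}');
}

// "%%" becomes "%".
std::string unescapeJavaStyle(const std::string& text) {
    return collapseDoubled(text, '%');
}
```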
As we have seen, the use of module ALib Boxing allows the formatters of this module to accept any third-party data type as formatting arguments. The formatters are of course enabled to "convert" all C++ fundamental types to strings. But how about custom types?
The solution for custom conversion is given with the support of "box-functions", which implement a sort of "virtual function call" on boxed types.
There are two box-functions that the built-in formatters are using.
By default, the built-in formatters use the very simple box-function FAppend to convert arbitrary types to string values. This function is one of the built-in functions of module ALib Boxing and is therefore not specific to module ALib Text.
Usually this function's implementation just unboxes the corresponding type and appends the object to the target string.
Let us look at an example. The following struct stores a temperature in Kelvin:
If an object of this class is used with a formatter without any further preparation, the default implementation of function FAppend is invoked, which writes the memory address of the given object. In debug-compilations, this implementation in addition writes the boxed type's name (platform dependent and implemented with class DbgTypeDemangler). This is shown in the following code and output snippet:
The first step to implement function FAppend for sample type Kelvin is to specialize functor T_Append for the type:
With that in place, it is possible to apply an object of this type to an AString:
Now, we can easily implement box-function FAppend, because for types that are "appendable" already, this is done with just a simple macro that has to be placed in the bootstrap section of a software:
With that in place, it is possible to append a boxed object of this type to an AString:
Because the formatters use the same technique with the boxed arguments they receive, our sample class can now already be used with formatters:
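Independent of ALib, the mechanism behind such a box-function can be modeled with standard C++ facilities. The following is only a conceptual sketch: std::any stands in for a boxed argument, a registry keyed by std::type_index stands in for the registration of a box-function, and all names (appendRegistry, appendBoxed, registerKelvinAppend, the field value of Kelvin) are assumptions of this illustration. Unlike the default behavior described above, the fallback here writes a fixed marker text instead of a memory address.

```cpp
#include <any>
#include <cstdio>
#include <functional>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

struct Kelvin { double value; };  // sample type; the field name is assumed

using AppendFn = std::function<void(const std::any&, std::string&)>;

// Registry that maps a boxed type to its "append" function.
std::unordered_map<std::type_index, AppendFn>& appendRegistry() {
    static std::unordered_map<std::type_index, AppendFn> registry;
    return registry;
}

// Appends a boxed value to `target`, using the function registered for
// the boxed type or a fallback marker if nothing is registered.
void appendBoxed(const std::any& box, std::string& target) {
    const auto it = appendRegistry().find(std::type_index(box.type()));
    if (it != appendRegistry().end()) it->second(box, target);
    else                              target += "<unknown type>";
}

// To be called once, e.g. in the bootstrap section of a program:
// writes the temperature in Celsius with one decimal digit.
void registerKelvinAppend() {
    appendRegistry()[std::type_index(typeid(Kelvin))] =
        [](const std::any& box, std::string& target) {
            const Kelvin k = std::any_cast<Kelvin>(box);
            char buf[32];
            std::snprintf(buf, sizeof buf, "%.1f C", k.value - 273.15);
            target += buf;
        };
}
```

Before registerKelvinAppend is called, a boxed Kelvin value produces the fallback text. Afterwards, the same call produces the Celsius value, which is the kind of change in behavior this section describes.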
To summarize this section, some bullet points should be given:
The previous section demonstrated how a custom type can be made "compatible" with the ALib formatters found in this module ALib Text.
The given approach using box-function FAppend is limited in that no attributes can be given within a format string to determine how a custom type is formatted. With the sample temperature type "Kelvin", the output format was Celsius with one decimal digit. If we want to allow Fahrenheit as an alternative output, we need to implement box-function FFormat, which was created specifically for this purpose and is consequently defined in this module.
The function has three parameters: besides the box that it is invoked on and the target string to write to, it receives a string that provides type-specific information about how the contents are to be written. This format specification is fully type and implementation specific and has to be documented with the specific function's documentation.
We want to implement a format specification that starts with character 'C', 'F' or 'K' to specify Celsius, Fahrenheit or Kelvin, followed by an integral number that specifies the fractional digits of the output.
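The parsing and conversion logic for such a specification can be sketched in plain C++, independent of ALib Boxing. The function name formatKelvin, the field name value, the unit letter appended to the output, and the fallback to Celsius with one digit for an empty specification are assumptions of this sketch and not part of the library.

```cpp
#include <cstdio>
#include <stdexcept>
#include <string>

struct Kelvin { double value; };  // sample type; the field name is assumed

// Formats `temperature` according to `spec`: a unit character 'C', 'F' or
// 'K', optionally followed by the number of fractional digits.
// Assumption: an empty spec means Celsius with one fractional digit.
std::string formatKelvin(const Kelvin& temperature, const std::string& spec) {
    const char unit = spec.empty() ? 'C' : spec[0];
    int digits = 1;
    if (spec.size() > 1)
        digits = std::stoi(spec.substr(1));  // throws if not a number

    double converted = 0.0;
    switch (unit) {
        case 'C': converted = temperature.value - 273.15; break;
        case 'F': converted = (temperature.value - 273.15) * 9.0 / 5.0 + 32.0; break;
        case 'K': converted = temperature.value; break;
        default:
            throw std::invalid_argument("unknown temperature unit in format spec");
    }

    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*f %c", digits, converted, unit);
    return buf;
}
```

For a value of 293.15 Kelvin, the specification "C2" yields "20.00 C", "F1" yields "68.0 F" and "K0" yields "293 K".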
To do this, the following function declaration needs to go to a header file:
Then, the implementation of the function has to be placed in a compilation unit. This might look like this:
Within the bootstrap section of the process, the function has to be registered with ALib Boxing:
With that in place, we can use the custom format specification with our custom type:
The following output is produced:
As a second sample we want to look at the internal implementation of formatting date and time values. ALib class CalendarDateTime provides (native) method Format to write time and date values in a human readable and customizable way. This method also requires a format specification. Now, this helper class is used to implement FFormat for boxed arguments of type DateTime, which is given with class FFormat_DateTime. Due to the existence of the helper class, the implementation of the function is therefore rather simple:
To implement a custom formatter that uses a custom format string syntax, no detailed manual or step-by-step sample is given here. Instead, just some hints are given as bullet points:
Besides the formatter classes that have been discussed in this Programmer's Manual, module ALib Text provides some other types which make use of the formatters.
As of today, these types are
Please consult the classes' extensive reference documentation for more information about the features and use of these types.