ALib C++ Library
Library Version: 2511 R0
formatterpythonstyle.inl
1//==================================================================================================
2/// \file
3/// This header-file is part of module \alib_format of the \aliblong.
4///
5/// \emoji :copyright: 2013-2025 A-Worx GmbH, Germany.
6/// Published under \ref mainpage_license "Boost Software License".
7//==================================================================================================
9
10//==================================================================================================
11/// Implements a \alib{format;Formatter} according to the
12/// \https{formatting standards of the Python language,docs.python.org/3.5/library/string.html#format-string-syntax}.
13///
14/// \note
15/// Inherited, public fields of parent class \b FormatterStdImpl provide important possibilities
16/// for changing the formatting behavior of instances of this class. Therefore, do not forget
17/// to consult the \ref alib::format::FormatterStdImpl "parent class's documentation".
18///
19/// In general, the original \b Python specification is covered quite well. However, there are
20/// some differences: some things are not possible (considering that Python is a scripting
21/// language), while on the other hand some very helpful extensions to that standard are
22/// provided. Instead of repeating a complete documentation, please refer to the
23/// \https{Python Documentation,docs.python.org/3.5/library/string.html#format-string-syntax}
24/// as the foundation and then take note of the following list of differences, extensions and
25/// general hints:
26///
27/// - <b>General Notes:</b>
28/// \b Python defines a placeholder field as follows:
29///
30/// "{" [field_name] ["!" conversion] [":" format_spec] "}"
31///
32///
33/// - This formatter is <b>less strict</b> with respect to the order of the format symbols. E.g.,
34/// it allows <c>{:11.5,}</c> where Python allows only <c>{:11,.5}</c>.
35///
36/// - With this class being derived from
37/// \ref alib::format::FormatterStdImpl "FormatterStdImpl", features of the parent are
38/// available to this formatter as well. This is especially true and sometimes useful with respect to
39/// setting default values for number formatting. For example, this allows modifying all number output
40/// without explicitly repeating the settings in each placeholder of format strings. Other options,
41/// for example, the grouping characters used with hexadecimal numbers, cannot even be changed
42/// with the <b>Python Style</b> formatting options. The only way of doing so is modifying the
43/// properties of the formatter object before the format operation.
44///
45/// - Nested replacements in format specification fields are (by nature of this implementation
46/// language) \b not supported.
47///
48/// <p>
49/// - <b>Positional arguments and field name:</b>
50/// - By the nature of the implementation language (<em>C++, no introspection</em>) of this class,
51/// \b field_name can \b not be the name of an identifier, an attribute name or an array element
52/// index. It can only be a positional argument index, hence a number that chooses a different
53/// index in the provided argument list.<br>
54/// However, the use of field names is often a requirement in use cases that offer configurable
55/// format string setup to the "end user". Therefore, there are two alternatives to cope
56/// with the limitation:
57/// - In simple cases, it is possible to just add all optionally needed data in the argument list,
58/// document their index position and let the user use positional argument notation to choose
59/// the right value from the list.
60/// - More elegant however, is the use of class
61/// \ref alib::format::PropertyFormatter "PropertyFormatter"
62/// which extends the format specification by custom identifiers which control the placement
63/// of corresponding data in the format argument list. This class uses a translator table from
64/// identifier strings to custom callback functions. This way, much more than just simple
65/// field names are allowed.
66///
67/// - When using positional arguments in format string placeholders, the Python formatter
68/// implementation does not allow switching from <b>automatic field indexing</b> to explicit
69/// indexing. This \b %ALib implementation does allow it. The automatic index (used when no
70/// positional argument is given for a placeholder) always starts with index \c 0 and is incremented
71/// each time automatic indexing is used. Occurrences of explicit indexing have no influence
72/// on the automatic indexing.
73///
74///
75/// <p>
76/// - <b>Binary, Hexadecimal and Octal Numbers:</b>
77/// - Binary, hexadecimal and octal output is <b>cut in size</b> (!) when a field width is given that
78/// is smaller than the resulting number of digits of the number argument provided.
79/// \note This implies that a value written might not be equal to the value given.
80/// This is not a bug but a design decision. The rationale behind this is that with this
81/// behavior, there is no need to mask lower digits when passing the arguments to the
82/// format invocation. In other words, the formatter "assumes" that the given field width
83/// indicates that only a corresponding number of lower digits are of interest.
84///
85/// - If no width is given and the argument contains a boxed pointer, then the platform-dependent
86/// full output width of pointer types is used.
87///
88/// - The number <b>grouping option</b> (<c>','</c>) can also be used with binary, hexadecimal and octal
89/// output.
90/// The types support different grouping separators for nibbles, bytes, 16-bit and 32-bit words.
91/// Changing the separator symbols is not possible with the format fields of the format strings
92/// (if it were, this would become very incompatible with Python standards). Changes have to be made
93/// before the format operation by modifying field
94/// \alib{format;FormatterStdImpl::AlternativeNumberFormat;FormatterStdImpl::AlternativeNumberFormat}
95/// which is provided through parent class \b %FormatterStdImpl.
96///
97/// - Alternative form (\c '#') adds prefixes as specified in members
98/// - \alib{strings;TNumberFormat::BinLiteralPrefix;BinLiteralPrefix},
99/// - \alib{strings;TNumberFormat::HexLiteralPrefix;HexLiteralPrefix}, and
100/// - \alib{strings;TNumberFormat::OctLiteralPrefix;OctLiteralPrefix}.
101///
102/// For upper case formats, those are taken from field
103/// \alib{format;FormatterStdImpl::DefaultNumberFormat;FormatterStdImpl::DefaultNumberFormat},
104/// for lower case formats from
105/// \alib{format;FormatterStdImpl::AlternativeNumberFormat;FormatterStdImpl::AlternativeNumberFormat}.
106/// However, in alignment with the \b Python specification, \b both default to lower case
107/// literals \c "0b", \c "0o" and \c "0x". All defaults may be changed by the user.
108///
109///
110/// <p>
111/// - <b>Floating point values:</b>
112/// - If floating point values are provided without a type specification in the format string, then
113/// all values of
114/// \alib{format;FormatterStdImpl::DefaultNumberFormat;FormatterStdImpl::DefaultNumberFormat}
115/// are used to format the number.
116/// - For lower case floating point format types (\c 'f' and \c 'e'), the values specified in
117/// attributes \b %ExponentSeparator, \b %NANLiteral and \b %INFLiteral of object
118/// \alib{format;FormatterStdImpl::AlternativeNumberFormat;FormatterStdImpl::AlternativeNumberFormat}
119/// are used. For upper case types (\c 'F' and \c 'E') the corresponding attributes in
120/// \alib{format;FormatterStdImpl::DefaultNumberFormat;FormatterStdImpl::DefaultNumberFormat} apply.
121/// - Fixed point formats (\c 'f' and \c 'F' types) do not support an arbitrary length.
122/// See class \alib{strings;TNumberFormat;NumberFormat} for the limits.
123/// Also, very high values and values close to zero may be converted to scientific format.
124/// Finally, if flag \alib{strings;NumberFormatFlags;ForceScientific} is set in field
125/// \alib{strings::NumberFormat;Flags} of member #DefaultNumberFormat, types
126/// \c 'f' and \c 'F' behave like types \c 'e' and \c 'E'.
127/// - When both a \p{width} and a \p{precision} are given, the \p{precision} determines the
128/// fractional part, even if the type is \b 'g' or \b 'G'. This differs from the
129/// Python formatter, which uses \p{precision} as the overall width in case of types
130/// \b 'g' or \b 'G'.
131/// - The 'general format' type for floats, specified with \c 'g' or \c 'G', in the Python
132/// implementation limits the precision of the fractional part, even if \p{precision} is not
133/// further specified. This implementation limits the precision only if the type is \c 'f'
134/// or \c 'F'.
135///
136/// <p>
137/// - <b>%String Conversion:</b><br>
138/// If \e type \c 's' (or no \e type) is given in the \b format_spec of the replacement field,
139/// a string representation of the given argument is used.
140/// In \b Java and \b C# such representation is received by invoking <c>Object.[t|T]oString()</c>.
141/// Consequently, to support string representations of custom types, in these languages
142/// the corresponding <b>[t|T]oString()</b> methods of the type have to be implemented.
143///
144/// In C++ the arguments are "boxed" into objects of type
145/// \ref alib::boxing::Box "Box". For the string representation, the formatter invokes
146/// box-function \alib{boxing;FAppend}. A default implementation exists which
147/// for custom types appends the type name and the memory address of the object in hexadecimal
148/// format. To support custom string representations (for custom types), this box-function
149/// needs to be implemented for the type in question. Information and sample code on how to do this
150/// is found in the documentation of \alib_boxing , chapter
151/// \ref alib_boxing_strings_fappend "10.3 Box-Function FAppend".
152///
153/// - <b>Hash-Value Output:</b><br>
154/// In extension (and deviation) of the Python specification, format specification type \c 'h' and
155/// its upper case version \c 'H' are implemented. The hash-value of the argument object is
156/// written in hexadecimal format. Options of the type are identical to those of \c 'x',
157/// respectively \c 'X'.
158///
159/// In the C++ language implementation of \alib, instead of hash-values of objects, the pointer
160/// found in method \alib{boxing;Box::Data} is printed. In the case of boxed class types for which
161/// the default boxing mechanics are used, this will show the memory address of
162/// the given instance.
163///
164/// - <b>Boolean output:</b><br>
165/// In extension (and deviation) of the Python specification, format specification type \c 'B'
166/// is implemented. The word \b "true" is written if the given value represents a boolean \c true
167/// value, \b "false" otherwise.
168///
169/// In the C++ language implementation of \alib, the argument is evaluated to boolean by invoking
170/// box-function \alib{boxing;FIsTrue}.
171///
172/// <p>
173/// - <b>%Custom %Format Specifications:</b><br>
174/// With \c Python formatting syntax, placeholders have the following syntax:
175///
176/// "{" [field_name] ["!" conversion] [":" format_spec] "}"
177///
178/// The part that follows the colon is called \b format_spec. \b Python passes this portion of the
179/// placeholder to a built-in function \c format(). Now, each type may interpret this string in a
180/// type specific way. But most built-in \b Python types do it along what they call the
181/// \https{"Format Specification Mini Language",docs.python.org/3.5/library/string.html#format-specification-mini-language}.
182///
183/// With this implementation, the approach is very similar. The only difference is that the
184/// "Format Specification Mini Language" is implemented for standard types right within this class.
185/// But before processing \b format_spec, this class will check if the argument type assigned to
186/// the placeholder provides a custom implementation of box function \alib{format;FFormat}.
187/// If so, this function is invoked and string \b format_spec is passed for custom processing.
188///
189/// Information and sample code on how to adapt custom types to support this interface is
190/// found in the Programmer's Manual of this module, with chapter
191/// \ref alib_format_custom_types_fformat.
192///
193/// For example, \alib class \alib{time;DateTime} supports custom formatting with box-function
194/// \alib{format;FFormat_DateTime} which uses helper-class
195/// \alib{strings::util;CalendarDateTime} that provides a very common specific mini language
196/// for \alib{strings::util::CalendarDateTime;Format;formatting date and time values}.
197///
198/// <p>
199/// - <b>Conversions:</b><br>
200/// In the \b Python placeholder syntax specification:
201///
202/// "{" [field_name] ["!" conversion] [":" format_spec] "}"
203///
204/// symbol \c '!' if used before the colon <c>':'</c> defines
205/// what is called the <b>conversion</b>. With \b Python, three options are given:
206/// \c '!s' which calls \c str() on the value, \c '!r' which calls \c repr() and \c '!a' which
207/// calls \c ascii(). This is of course not applicable to this formatter. As a replacement,
208/// this class extends the original specification of that conversion using \c '!'.
209/// The following provides a list of conversions supported. The names given can be abbreviated
210/// at any point and ignore letter case, e.g., \c !Upper can be \c !UP or just \c !u.
211/// In addition, multiple conversions can be given by concatenating them, each repeating
212/// character \c '!'.<br>
213/// The conversions supported are:
214///
215/// - <b>!Upper</b><br>
216/// Converts the contents of the field to upper case.
217///
218/// - <b>!Lower</b><br>
219/// Converts the contents of the field to lower case.
220///
221/// - <b>!Quote[O[C]]</b><br>
222/// Puts quote characters around the field.
223/// Note that these characters do not count toward an optionally given field width but instead
224/// are added to it.
225/// An alias name for \!Quote is given with \b !Str. As the alias can be abbreviated to \b !s,
226/// this provides compatibility with the \b Python specification.
227///
228/// In extension to the Python syntax specification, one or two optional characters might be
229/// given after the (optionally abbreviated) terms "Quote" respectively "str".
230/// If one character is given, this is used as the open and closing character. If two are given,
231/// the first is used as the open character, the second as the closing one.
232/// For example, <b>{!Q'}</b> uses single quotes, or <b>{!Q[]}</b> uses square brackets.
233/// Bracket types <b>'{'</b> and <b>'}'</b> cannot be used with this conversion.
234/// To surround a placeholder's contents in this bracket type, add <b>{{</b> and <b>}}</b>
235/// around the placeholder - resulting in <b>{{{}}}</b>.
236///
237/// - <b>!ESC[<|>]</b><br>
238/// In its default behavior or if \c '<' is specified, certain characters are converted to escape
239/// sequences.
240/// If \c '>' is given, escape sequences are converted to their (ASCII) value.
241/// See \alib{strings;TEscape;Escape} for details about the conversion
242/// that is performed.<br>
243/// An alias name for \b !ESC< is given with \b !a which provides compatibility
244/// with the \b Python specification.
245/// \note If \b !ESC< is used in combination with \b !Quote, then \b !ESC< should be the first
246/// conversion specifier. Otherwise, the quotes inserted might be escaped as well.
247///
248/// - <b>!Fill[Cc]</b><br>
249/// Inserts as many characters as denoted by the integer type argument.
250/// By default the fill character is space <c>' '</c>. It can be changed with optional character
251/// 'C' plus the character wanted.
252///
253/// - <b>!Tab[Cc][NNN]</b><br>
254/// Inserts fill characters to extend the length of the string to be a multiple of a tab width.
255/// By default the fill character is space <c>' '</c>. It can be changed with optional character
256/// 'C' plus the character wanted. The tab width defaults to \c 8. It can be changed by adding
257/// an unsigned decimal number.
258///
259/// - <b>!ATab[[Cc][NNN]|Reset]</b><br>
260/// Inserts an "automatic tabulator stop". These are tabulator positions that are stored
261/// internally and are automatically extended at the moment the actual contents exceeds the
262/// currently stored tab-position. An arbitrary number of auto tab stop and field width
263/// (see <b>!AWidth</b> below) values is maintained by the formatter.
264///
265/// With each new invocation of \alib{format;Formatter},
266/// the first auto value is chosen and with each use of \c !ATab or \c !AWidth, the next value is
267/// used.<br>
268/// However, the stored values are cleared whenever \b %Format is invoked on a non-acquired
269/// formatter! This means, to preserve the auto-positions across multiple format invocations,
270/// a formatter has to be acquired explicitly before the format operations and released
271/// afterwards.
272///
273/// Alternatively, the positions currently stored with the formatter can be reset by
274/// providing argument \c Reset in the format string.
275///
276/// By default, the fill character is space <c>' '</c>. It can be changed with optional character
277/// 'C' plus the character wanted. The optional number provided gives the growth value by which
278/// the tab will grow if its position is exceeded. This value defaults to \c 3.
279///
280/// Both auto tab and auto width conversions may be used to increase readability of multiple
281/// output lines. Of course, output is completely tabular only if those values that result
282/// in the biggest sizes are formatted first. If a perfect tabular output is desired, the data
283/// to be formatted may be processed twice: once to a temporary buffer which is discarded and then
284/// a second time to the desired output \b %AString.
285///
286/// - <b>!AWidth[NNN|Reset]</b><br>
287/// Increases field width with repetitive invocations of format whenever a field value did not
288/// fit into the currently stored width. Optional decimal number \b NNN is added as a padding value.
289/// For more information, see <b>!ATab</b> above.
290///
291/// - <b>!Xtinguish</b><br>
292/// Does not print anything. This is useful if format strings are externalized, e.g., defined
293/// in \alib{camp::Camp;GetResourcePool;library resources}. Modifications of such resources
294/// might use this conversion to suppress the display of arguments (which usually are
295/// hard-coded).
296///
297/// - <b>!Replace<search><replace></b><br>
298/// Searches string \p{search} and replaces with \p{replace}. Both values have to be given
299/// enclosed by characters \c '<' and \c '>'. In the special case that \p{search} is empty
300/// (<c><></c>), string \p{replace} will be inserted if the field argument is an empty
301/// string.
302///
303///\I{##########################################################################################}
304/// # Reference Documentation #
305/// @throws <b>alib::format::FMTExceptions</b>
306/// - \alib{format::FMTExceptions;ArgumentIndexOutOfBounds}
307/// - \alib{format::FMTExceptions;IncompatibleTypeCode}
308/// - \alib{format::FMTExceptions;MissingClosingBracket}
309/// - \alib{format::FMTExceptions;MissingPrecisionValuePS}
310/// - \alib{format::FMTExceptions;DuplicateTypeCode}
311/// - \alib{format::FMTExceptions;UnknownTypeCode}
312/// - \alib{format::FMTExceptions;ExclamationMarkExpected}
313/// - \alib{format::FMTExceptions;UnknownConversionPS}
314/// - \alib{format::FMTExceptions;PrecisionSpecificationWithInteger}
315//==================================================================================================
317{
318 //################################################################################################
319 // Protected fields
320 //################################################################################################
321 protected:
322 /// Set of extended placeholder attributes, needed for this type of formatter in
323 /// addition to parent's \alib{format::FormatterStdImpl;PlaceholderAttributes}.
325 {
326 /// The portion of the replacement field that represents the conversion specification.
327 /// This specification is given at the beginning of the replacement field, starting with
328 /// \c '!'.
330
331 /// The position where the conversion was read. This is set to \c -1 in #resetPlaceholder.
332 int ConversionPos;
333
334
335 /// The value read from the precision field. This is set to \c -1 in #resetPlaceholder.
336 int Precision;
337
338 /// The position where the precision was read. This is set to \c -1 in #resetPlaceholder.
339 int PrecisionPos;
340
341 /// The default precision if not given.
342 /// This is set to \c 6 in #resetPlaceholder, but is changed when specific.
344 };
345
346 /// The extended placeholder attributes.
347 PlaceholderAttributesPS placeholderPS;
348
349 //################################################################################################
350 // Public fields
351 //################################################################################################
352 public:
353 /// Storage of sizes for auto-tabulator feature <b>{!ATab}</b> and auto field width feature
354 /// <b>{!AWidth}</b>
356
357 /// The default instance of field #Sizes. This might be replaced with an external object.
358 AutoSizes SizesDefaultInstance;
359
360 //################################################################################################
361 // Constructor/Destructor
362 //################################################################################################
363 public:
364 /// Constructs this formatter.
365 /// Inherited field #DefaultNumberFormat is initialized to meet the formatting defaults of
366 /// Python.
369
370 /// Clones and returns a copy of this formatter.
371 ///
372 /// If the formatter attached to field
373 /// \alib{format;Formatter::Next} is of type \b %FormatterStdImpl, then that
374 /// formatter is copied as well.
375 ///
376 /// @returns An object of type \b %FormatterPythonStyle with the same custom settings
377 /// as this.
378 ALIB_DLL virtual
379 SPFormatter Clone() override;
380
381 /// Resets #Sizes.
382 /// @return An internally allocated container of boxes that may be used to collect
383 /// formatter arguments.
384 virtual BoxesMA& Reset() override { Sizes->Reset(); return Formatter::Reset(); }
385
386
387 //################################################################################################
388 // Implementation of FormatterStdImpl interface
389 //################################################################################################
390 protected:
391 /// Sets the actual auto tab stop index to \c 0.
392 virtual void initializeFormat() override { Sizes->Restart(); }
393
394
395
396 /// Invokes parent implementation and then applies some changes to reflect what is defined as
397 /// default in the Python string format specification.
399 virtual void resetPlaceholder() override;
400
401 /// Searches for \c '{' which is not '{{'.
402 ///
403 /// @return The index found, -1 if not found.
405 virtual integer findPlaceholder() override;
406
407 /// Parses placeholder field in python notation. The portion \p{format_spec} is not
408 /// parsed but stored in member
409 /// \alib{format::FormatterStdImpl::PlaceholderAttributes;FormatSpec}.
410 ///
411 /// @return \c true on success, \c false on errors.
413 virtual bool parsePlaceholder() override;
414
415 /// Parses the format specification for standard types as specified in
416 /// \https{"Format Specification Mini Language",docs.python.org/3.5/library/string.html#format-specification-mini-language}.
417 ///
418 /// @return \c true on success, \c false on errors.
420 virtual bool parseStdFormatSpec() override;
421
422 /// Implementation of abstract method
423 /// \alib{format;FormatterStdImpl::writeStringPortion}.<br>
424 /// While writing, replaces \c "{{" with \c "{" and \c "}}" with \c "}" as well as
425 /// standard codes like \c "\\n", \c "\\r" or \c "\\t" with corresponding ASCII codes.
426 ///
427 /// @param length The number of characters to write.
429 virtual void writeStringPortion( integer length ) override;
430
431 /// Processes "conversions" which are specified with \c '!'.
432 ///
433 /// @param startIdx The index of the start of the field written in #targetString.
434 /// \c -1 indicates pre-phase.
435 /// @param target The target string, only if different from field #targetString, which
436 /// indicates intermediate phase.
437 /// @return \c false, if the placeholder should be skipped (nothing is written for it).
438 /// \c true otherwise.
440 virtual bool preAndPostProcess( integer startIdx,
441 AString* target ) override;
442
443
444 /// Makes some attribute adjustments and invokes the standard implementation.
445 /// @return \c true if OK, \c false if replacement should be aborted.
447 virtual bool checkStdFieldAgainstArgument() override;
448};
449} // namespace [alib::format]
450
451ALIB_EXPORT namespace alib {
452/// Type alias in namespace \b alib.
453using FormatterPythonStyle = format::FormatterPythonStyle;
454}
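
The class documentation above states that automatic indexing always starts at 0, is incremented only when a placeholder uses automatic indexing, and is not influenced by explicit positional indices. It also states that `{{` is not treated as a placeholder. The following is a minimal, self-contained sketch of that rule. It does not use ALib and is not the library's implementation: the function name `resolveArgumentIndices` is hypothetical, and it ignores conversions and format specifications, which it simply skips.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ALib): returns, for each placeholder found
// in `fmt`, the index of the argument that placeholder would select.
std::vector<int> resolveArgumentIndices(const std::string& fmt) {
    std::vector<int> indices;
    int nextAuto = 0;  // the automatic index always starts at 0

    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '{')
            continue;

        // "{{" is an escaped brace, not a placeholder.
        if (i + 1 < fmt.size() && fmt[i + 1] == '{') {
            ++i;
            continue;
        }

        // Read an optional explicit positional index.
        std::size_t j = i + 1;
        int explicitIdx = 0;
        bool hasExplicit = false;
        while (j < fmt.size() && std::isdigit(static_cast<unsigned char>(fmt[j]))) {
            explicitIdx = explicitIdx * 10 + (fmt[j] - '0');
            hasExplicit = true;
            ++j;
        }

        // Explicit indices do not touch the automatic counter.
        indices.push_back(hasExplicit ? explicitIdx : nextAuto++);

        // Skip the rest of the placeholder (conversion, format_spec).
        while (j < fmt.size() && fmt[j] != '}')
            ++j;
        i = j;
    }
    return indices;
}
```

Under these assumptions, the format string `"{} {2} {} {0}"` selects arguments 0, 2, 1 and 0 in that order: the second automatic placeholder continues at index 1 even though an explicit `{2}` appeared in between.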
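
The section on binary, hexadecimal and octal numbers above describes that output is cut in size when the given field width is smaller than the number of digits, and that only the corresponding number of lower digits is kept. The sketch below illustrates that design decision for hexadecimal output only. It is a standalone illustration, not ALib code: the function name `hexCutToWidth` is hypothetical, lower case digits are an arbitrary choice, and padding for widths larger than the digit count is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of ALib): writes `value` in hexadecimal and,
// if `width` is positive and smaller than the number of digits, keeps only
// the lowest `width` digits. A `width` of 0 means "no width given".
std::string hexCutToWidth(std::uint64_t value, int width) {
    static const char digits[] = "0123456789abcdef";

    // Build the digit string from the least significant nibble upwards.
    std::string result;
    do {
        result.insert(result.begin(), digits[value & 0xF]);
        value >>= 4;
    } while (value != 0);

    // Cut higher digits if the field is narrower than the number.
    if (width > 0 && result.size() > static_cast<std::size_t>(width))
        result.erase(0, result.size() - static_cast<std::size_t>(width));

    return result;
}
```

With this sketch, `hexCutToWidth(0x1234ABCD, 4)` yields `"abcd"`, which shows why the caller does not need to mask the value before passing it: the width already expresses how many lower digits are of interest.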