
C++ string handling

The C++ programming language has support for string handling, mostly implemented in its standard library. The language standard specifies several string types, some inherited from C, some designed to make use of the language's features, such as classes and RAII. The most-used of these is std::string. Since the initial versions of C++ had only the "low-level" C string handling functionality and conventions, multiple incompatible string handling classes have been developed over the years and are still used instead of std::string, and C++ programmers may need to handle multiple conventions in a single application.

The std::string type has been the main string datatype in standard C++ since 1998, but it was not always part of C++. From C, C++ inherited the convention of using null-terminated strings that are handled by a pointer to their first element, and a library of functions that manipulate such strings. In modern standard C++, a string literal such as "hello" still denotes a NUL-terminated array of characters.

Using C++ classes to implement a string type offers several benefits: automated memory management, a reduced risk of out-of-bounds accesses, and more intuitive syntax for string comparison and concatenation. The temptation to create such a class was therefore strong. Over the years, C++ application, library and framework developers produced their own, incompatible string representations, such as the one in AT&T's Standard Components library (the first such implementation, 1983) or the CString type in Microsoft's MFC.
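The contrast between the inherited C conventions and a class-based string can be sketched as follows. This is a minimal illustration, not code from any particular library: the helper names concat_c, concat_cpp, equal_c and equal_cpp are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// C-style concatenation: the caller owns the buffer and must track its size.
// Returns false if the result (plus the terminating NUL) would not fit.
bool concat_c(char* out, std::size_t out_size, const char* a, const char* b) {
    assert(out != nullptr && a != nullptr && b != nullptr);
    const std::size_t len_a = std::strlen(a);
    const std::size_t len_b = std::strlen(b);
    if (len_a + len_b + 1 > out_size) return false;
    std::memcpy(out, a, len_a);
    std::memcpy(out + len_a, b, len_b + 1);  // len_b + 1 also copies the NUL
    return true;
}

// Class-based concatenation: std::string owns and resizes its own storage.
std::string concat_cpp(const std::string& a, const std::string& b) {
    return a + b;
}

// Comparing C strings requires strcmp; applying == to two char pointers
// would compare addresses, not contents.
bool equal_c(const char* a, const char* b) { return std::strcmp(a, b) == 0; }

// std::string overloads ==, so the comparison reads like the intent.
bool equal_cpp(const std::string& a, const std::string& b) { return a == b; }

// A string literal is an array of const char that includes the NUL terminator.
static_assert(sizeof("hello") == 6, "five characters plus the terminating NUL");
```

With the C-style helper, a call such as concat_c(buf, sizeof buf, "hello", ", C") succeeds only if buf is large enough, and the caller has to check the result. The std::string version has no such failure mode short of memory exhaustion, which is the kind of benefit that motivated the many string classes described above.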
While std::string standardized strings, legacy applications still commonly contain such custom string types, and libraries may expect C-style strings. This makes it "virtually impossible" to avoid using multiple string types in C++ programs and requires programmers to decide on the desired string representation before starting a project. In a 1991 retrospective on the history of C++, its inventor Bjarne Stroustrup called the lack of a standard string type (and some other standard types) in C++ 1.0 the worst mistake he made in its development: "the absence of those led to everybody re-inventing the wheel and to an unnecessary diversity in the most fundamental classes".

The various vendors' string types have different implementation strategies and performance characteristics. In particular, some string types use a copy-on-write strategy, where an operation such as copy-constructing a string b from a string a (string b = a;) does not actually copy the content of a to b; instead, both strings share their contents and a reference count on the content is incremented. The actual copying is postponed until a mutating operation, such as appending a character to either string, makes the strings' contents differ. Copy-on-write can significantly change the performance of code using strings, making some operations much faster and some much slower. Though std::string no longer uses it, many (perhaps most) alternative string libraries still implement copy-on-write strings.

Some string implementations store 16-bit or 32-bit code points instead of bytes; this was intended to facilitate processing of Unicode text. However, it means that conversion to these types from std::string or from arrays of bytes is a slow and often lossy operation that depends on the "locale" and can throw exceptions. Any processing advantages of 16-bit code units vanished when the variable-width UTF-16 encoding was introduced (though there are still advantages if you must communicate with a 16-bit API such as Windows). Qt's QString is an example.
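The copy-on-write strategy described above can be sketched with a reference-counted buffer. This is a minimal, single-threaded illustration under stated assumptions: CowString and its members are hypothetical names invented for this example, std::shared_ptr stands in for a hand-written reference count, and the class does not reflect how any vendor's string type is actually implemented.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write string, for illustration only.
class CowString {
public:
    explicit CowString(std::string text = "")
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // The compiler-generated copy operations copy only the shared_ptr,
    // so copying a CowString increments a reference count and nothing more.

    const std::string& str() const { return *data_; }

    // Number of CowString objects currently sharing this buffer.
    long use_count() const { return data_.use_count(); }

    // True if both objects still point at the same underlying buffer.
    bool shares_with(const CowString& other) const {
        return data_ == other.data_;
    }

    // Mutating operation: make a private copy first if the buffer is shared.
    void push_back(char c) {
        detach();
        data_->push_back(c);
    }

private:
    // The deferred copy happens here, only when a shared buffer
    // is about to be modified.
    void detach() {
        assert(data_ != nullptr);
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
    }

    std::shared_ptr<std::string> data_;
};
```

In this sketch, copying a CowString a into b leaves both objects sharing one buffer with a use count of 2. Calling b.push_back('!') then triggers the deferred copy, after which a still reads "hello" and b reads "hello!". The cost of the copy moves from the assignment to the first mutation, which is why such designs make some operations faster and others slower.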
