The String class is a string container providing methods to store and manipulate strings.
LYRIC defines symbols (characters) to be stored in a string as unicode, which is a 32-bit
unsigned integer. Although the storage type is named unicode, the String class
doesn’t provide Unicode features. Here unicode only means that a string can store 32-bit
symbols.
The String class behaves slightly differently in debug and optimized mode, depending
on whether your program is linked with the debug or the release version of the library. In debug mode
all members build the ASCII representation of the string (whenever possible),
which slows down performance considerably, but keeps a readable copy of
the string's content available. In the optimized version of the library the ASCII representation is
built only when needed (when the char* operator is called), which increases
performance.
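The difference between the two modes can be pictured with a minimal sketch. LazyText, its members, and the choice of '?' for symbols outside 7-bit ASCII are hypothetical stand-ins and not LYRIC code; the sketch only contrasts rebuilding the ASCII form after every change with rebuilding it on request.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a string of 32-bit symbols; not the LYRIC String.
class LazyText {
public:
    // eager == true imitates the debug library, false the optimized one.
    explicit LazyText(bool eager) : eager_(eager), stale_(true), builds_(0) {}

    void append(char32_t symbol) {
        symbols_.push_back(symbol);
        stale_ = true;
        if (eager_) rebuild();  // "debug": refresh the ASCII form after each change
    }

    // "Optimized": the ASCII form is rebuilt only if it is out of date.
    const char* ascii() const {
        if (stale_) rebuild();
        return ascii_.c_str();
    }

    // Number of times the ASCII form was rebuilt so far.
    std::size_t builds() const { return builds_; }

private:
    void rebuild() const {
        ascii_.clear();
        for (char32_t s : symbols_) {
            // Assumption: symbols outside 7-bit ASCII are shown as '?'.
            ascii_.push_back(s < 128 ? static_cast<char>(s) : '?');
        }
        assert(ascii_.size() == symbols_.size());
        stale_ = false;
        ++builds_;
    }

    std::vector<char32_t> symbols_;
    bool eager_;
    mutable std::string ascii_;
    mutable bool stale_;
    mutable std::size_t builds_;
};
```

Appending three symbols to an eager instance rebuilds the ASCII form three times, while a lazy instance rebuilds it once, on the first call to ascii().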
You will probably run into compiler complaints about ambiguities when using the subscript
operator of the String class, especially with a constant number as
subscript argument, like [7]. The String
subscript operator takes a Size type argument, which is an unsigned int, while a
plain number is interpreted as a signed int (or simply int). The ambiguity arises because
String also defines the casting operator char*. Indeed, a C++ compiler will know the following subscript
operators after the String class definition:
    operator[](char*, int)
    operator[](String, unsigned int)
Now assume a String s. When we write:
    s[7]
a compiler will try to match s[7]
with one of the two subscript operators it knows. Both need a cast, either from
String to char*, or from int to unsigned int. This results in an
ambiguity.
The easiest way to remove the ambiguity is to make the argument an
unsigned int. This can be done by adding a u right after the number. The following line
has no ambiguity:
    s[7u]
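The interaction of the two operators can be reproduced with a small stand-in class. Text and eighthSymbol are hypothetical names, and the Size typedef is an assumption taken from the description above; this is a sketch of the situation, not the LYRIC implementation. Whether the commented-out line is actually rejected depends on the compiler and platform, so it is left disabled.

```cpp
#include <cassert>
#include <string>

typedef unsigned int Size;  // assumption: mirrors the Size type described in the text

// Hypothetical stand-in carrying only the two members that interact.
class Text {
public:
    explicit Text(const char* ascii) : text_(ascii) {}

    // Mirrors "operator char* () const" from the synopsis.
    operator char*() const { return const_cast<char*>(text_.c_str()); }

    // Mirrors "Symbol& operator [] (Size index)".
    char& operator[](Size index) {
        assert(index < text_.size());
        return text_[index];
    }

private:
    std::string text_;
};

char eighthSymbol(Text& s) {
    // return s[7];  // int argument: may be reported as ambiguous
    return s[7u];    // unsigned argument: exact match for operator[](Size)
}
```

With the u suffix the member operator needs no conversion at all, so it is preferred over the built-in pointer subscript reached through char*.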
Synopsis
#include <lyric/String.hpp>
class String : private List<char32>
{
public:
    typedef char32 Symbol;
    ~String ();
    String ();
    String (const String& string)
        throw (Exception::Memory::Alloc);
    String (const char* ascii)
        throw (Exception::Memory::Alloc);
    String (const wchar_t* wascii)
        throw (Exception::Memory::Alloc);
    String (char ch)
        throw (Exception::Memory::Alloc);
    operator char* () const
        throw (Exception::Memory::Alloc);
    Symbol& operator [] (Size index)
        throw (Exception::Memory::Range);
    const Symbol& operator [] (Size index) const
        throw (Exception::Memory::Range);
    String& operator = (const char* ascii)
        throw (Exception::Memory::Alloc);
    String& operator = (const String& string)
        throw (Exception::Memory::Alloc);
    String& operator << (const char* ascii)
        throw (Exception::Memory::Alloc);
    bool operator == (const String& string) const;
    bool operator == (const char* ascii) const;
    bool operator != (const String& string) const;
    bool operator != (const char* ascii) const;
    bool operator < (const String& string) const;
    bool operator > (const String& string) const;
    String& operator += (const String& string);
    String& operator += (Symbol ch);
    friend String operator + (const String& str1, const String& str2);
    friend ostream& operator << (ostream& os, const String& string);
    Size length () const;
    void clean ();
    void create (const char* ascii, Size leng);
    void append (const Symbol ch);
    void append (const String& string);
    void append (const char* ascii, Size leng);
    void insert (Size index, const Symbol& ch);
    void insert (Size index, const String& string);
    void remove (Size index);
    void remove (Size index, Size size);
    String sub (Size index, Size size) const;
    String sub (const SubId& subid) const;
    void capitalize ();
    void lowerize ();
    bool contains (const String& sub);
    List<Size> pos (const String& sub);
    List<String> tokens (const String& delimiter) const;
    List<String> split (const Regexp& rule) const
        throw (Exception::Memory::Alloc);
    void rmlsp ();
    void rmtsp ();
    void rmltsp ();
};
Description
-
~String () -
Destroys this string, releasing all used memory resources.
-
String () -
Constructs this string as an empty string.
-
String (const String& string) -
Constructs this string from the given string. All properties and data stored in string
are cloned into this string.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
string into this.
-
String (const char* ascii) -
Constructs this string from the given ascii string. Data stored in ascii is copied
into this string.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
the ascii string in this string.
-
String (const wchar_t* wascii) -
Constructs this string from the given wascii string. Data stored in wascii is copied
into this string.
-
String (char ch) -
Constructs this string from the given character. The resulting string has
length 1 and contains the given character.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
the character in this.
-
operator char* () const -
Returns this string as a single-byte ASCII string. This casting operator is used to
pass LYRIC strings to C functions taking char* arguments (like printf, fprintf,
open, etc.).
The returned pointer points into a memory area managed by this string. This area
is valid as long as this string exists and has not been modified, i.e. as long as
only const member functions and/or operators are called. This operator
is the exception to the rule: although it is const, it can change the memory area, either in location or
in content.
It is highly recommended not to rely on a long lifetime of the memory area
pointed to by the pointer this operator returns. Using it as an argument to
C functions is fine, but assigning it to a char* variable for later use is
dangerous, since this string can release the memory area without prior
notice.
Never pass the pointer returned by this operator to the C memory management
functions (free, realloc). Doing so will break this string’s internal
functionality and may produce memory errors in the most unexpected places in
your code.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
this.length() bytes.
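If the characters are needed beyond the current statement, one defensive pattern is to copy them immediately into storage the caller owns. The helper below is a minimal sketch; keepForLater is a hypothetical name and not part of LYRIC. It takes any const char*, so with a LYRIC string it would rely on the casting operator described above.

```cpp
#include <cassert>
#include <string>

// Copies the characters behind a short-lived pointer into a std::string that
// the caller owns, so later changes to the source cannot invalidate them.
std::string keepForLater(const char* transient) {
    assert(transient != nullptr);
    return std::string(transient);
}
```

The copy costs one allocation, but the result stays valid no matter what happens to the string the pointer came from.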
-
String::Symbol& operator [...] (Size index) -
Returns a reference to the symbol at index in this string. This operator can be used to
modify the content of a string at the given index.
-
Exception::Memory::Range - is thrown if the given index lies outside this
container’s size, but only if the debug version of LYRIC (-lyric-g) is linked.
-
const String::Symbol& operator [...] (Size index) const -
Returns a const reference to the symbol at index in this string, for use when this
string is itself const.
-
Exception::Memory::Range - is thrown if the given index lies outside this
container’s size, but only if the debug version of LYRIC (-lyric-g) is linked.
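The idea of a range check that exists only in one build flavour can be sketched as follows. This is an illustration under stated assumptions: SKETCH_RANGE_CHECK is a hypothetical compile-time switch (LYRIC selects the behaviour by the library version that is linked), symbolAt is a hypothetical free function, and std::out_of_range stands in for Exception::Memory::Range.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical switch standing in for "the debug library is linked".
#ifndef SKETCH_RANGE_CHECK
#define SKETCH_RANGE_CHECK 1
#endif

// Returns the symbol at index. The range is verified only when
// SKETCH_RANGE_CHECK is non-zero; std::out_of_range stands in for
// Exception::Memory::Range.
char32_t& symbolAt(std::vector<char32_t>& symbols, std::size_t index) {
#if SKETCH_RANGE_CHECK
    if (index >= symbols.size()) {
        throw std::out_of_range("symbol index out of range");
    }
#endif
    return symbols[index];
}

// Usage (meaningful with the check enabled): a valid index gives write
// access, an invalid one is reported through the exception.
bool rangeErrorIsReported() {
    std::vector<char32_t> symbols = {U'a', U'b'};
    symbolAt(symbols, 1) = U'z';
    assert(symbols[1] == U'z');
    try {
        symbolAt(symbols, 2);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With the check disabled, an out-of-range index is simply undefined behaviour, which is why code should not depend on the exception being thrown.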
-
String& operator = (const char* ascii) -
Assigns the given ascii string to this string. Data stored in ascii is copied into this
string.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
the ascii string in this string.
-
String& operator = (const String& string) -
Assigns the given string to this string. All properties and data stored in the given
string are copied into this string. A reference to this string is returned so that
assignments can be chained.
-
Exception::Memory::Alloc - is thrown if not enough memory is found to store
string into this.
-
bool operator == (const String& string) const -
Compares this with string and returns true if both contain the same information,
false if not.
-
bool operator != (const String& string) const -
Compares this with string and returns false if both contain the same information,
true if not.
-
void remove (Size index, Size size) -
Removes a sub-part from this string. The sub-part is given by its starting position
index into this, and its size.
-
String sub (Size index, Size size) const -
Returns a sub-string of this string. The sub-string is given by its starting position
index into this, and its size.
Note: Incomplete. No range checking right now.
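Since no range checking is done, a caller may want to validate index and size before extracting. The following is a minimal sketch of such a guard on a std::u32string, used here as a stand-in for a string of 32-bit symbols; clampedSub is a hypothetical helper, and clamping (rather than throwing) is just one possible policy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Clamps (index, size) to the bounds of text before extracting, so that an
// out-of-range request yields a shorter, possibly empty, result.
std::u32string clampedSub(const std::u32string& text,
                          std::size_t index, std::size_t size) {
    const std::size_t start = std::min(index, text.size());
    const std::size_t count = std::min(size, text.size() - start);
    assert(start + count <= text.size());
    return text.substr(start, count);
}
```

A request that starts inside the string but runs past its end is shortened; a request that starts past the end returns an empty string.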
-
String sub (const String::SubId& subid) const -
Returns a sub-string of this string. The sub-string is given by the sub-string
identifier subid.
Note: Incomplete. No range checking right now.
-
void capitalize () -
Forces the first alphabetical symbol in this string to uppercase. Leading
non-alphabetical symbols in this string are skipped during parsing. Only the
first letter is changed, and only if it is lowercase.
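The described behaviour can be sketched on a std::u32string, used here as a stand-in for a string of 32-bit symbols. capitalizeFirst is a hypothetical name, and the sketch assumes that only the ASCII letters a-z and A-Z count as alphabetical, which the text above does not specify.

```cpp
#include <cassert>
#include <string>

// Uppercases the first alphabetical symbol, skipping leading non-letters.
// Assumption: only ASCII letters a-z / A-Z count as alphabetical here.
void capitalizeFirst(std::u32string& text) {
    for (char32_t& symbol : text) {
        const bool lower = symbol >= U'a' && symbol <= U'z';
        const bool upper = symbol >= U'A' && symbol <= U'Z';
        if (lower) symbol = symbol - U'a' + U'A';  // shift into the uppercase range
        if (lower || upper) return;                // stop at the first letter either way
    }
}

// Usage: leading blanks and digits are skipped, later words are untouched.
void capitalizeFirstExample() {
    std::u32string text = U"  42 hello world";
    capitalizeFirst(text);
    assert(text == U"  42 Hello world");
}
```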
-
void lowerize () -
Changes all uppercase letters in this string to lowercase, wherever a lowercase form of the
letter is defined.
-
bool contains (const String& sub) const -
Returns true if this string contains the given sub-string, false if not.
-
List<Size> pos (const String& sub) const -
Returns the positions of the given sub-string within this string, as a list of
sizes.
Note: The robustness of the sub-string search is unknown. It probably won’t handle partly
overlapping sub-strings.
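The question raised in the note can be made concrete with a sketch on std::u32string, used as a stand-in for a string of 32-bit symbols. positions and its overlapping flag are hypothetical; the sketch does not claim to match what LYRIC's search actually does, it only shows how the two policies differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collects the start index of every occurrence of sub in text.
// With overlapping == false the search resumes after the end of each match,
// so partly overlapping occurrences are skipped.
std::vector<std::size_t> positions(const std::u32string& text,
                                   const std::u32string& sub,
                                   bool overlapping) {
    std::vector<std::size_t> found;
    if (sub.empty()) return found;  // assumption: an empty sub-string matches nowhere
    std::size_t at = text.find(sub);
    while (at != std::u32string::npos) {
        found.push_back(at);
        at = text.find(sub, at + (overlapping ? 1 : sub.size()));
    }
    return found;
}

// Usage: "aa" in "aaaa" starts at 0, 1, 2 with overlaps, at 0, 2 without.
void positionsExample() {
    const std::u32string text = U"aaaa";
    const std::u32string sub = U"aa";
    assert(positions(text, sub, true).size() == 3);
    assert(positions(text, sub, false).size() == 2);
}
```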
-
List<String> tokens (const String& delimiter) const -
Tokenises this string into parts -- returned in a list of strings -- using a given
delimiter. The delimiter can be a character or a string. The returned sub-strings are
the text parts bounded by the given delimiter. If the delimiter is not found in this
string, the only returned sub-string is this string. The delimiter itself is not included in
the sub-strings. This string is left unchanged.
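A minimal sketch of this kind of tokenising, written against std::u32string as a stand-in for a string of 32-bit symbols. tokenize is a hypothetical function; keeping empty parts between adjacent delimiters, and returning the whole text for an empty delimiter, are assumptions that the description above does not settle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits text at every occurrence of delimiter. The delimiter itself is not
// part of the result, and text is left unchanged.
// Assumption: empty parts between adjacent delimiters are kept.
std::vector<std::u32string> tokenize(const std::u32string& text,
                                     const std::u32string& delimiter) {
    std::vector<std::u32string> parts;
    if (delimiter.empty()) {  // assumption: nothing to split on
        parts.push_back(text);
        return parts;
    }
    std::size_t begin = 0;
    std::size_t at = text.find(delimiter);
    while (at != std::u32string::npos) {
        parts.push_back(text.substr(begin, at - begin));
        begin = at + delimiter.size();
        at = text.find(delimiter, begin);
    }
    parts.push_back(text.substr(begin));  // remainder, or the whole text if no match
    return parts;
}

// Usage: a single-symbol delimiter splits a path-like text into three parts.
void tokenizeExample() {
    const std::vector<std::u32string> parts = tokenize(U"usr/local/lib", U"/");
    assert(parts.size() == 3);
    assert(parts[0] == U"usr" && parts[2] == U"lib");
}
```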
-
List<String> split (const Regexp& rule) const -
Splits this string into parts -- returned in a list of strings -- according to a
given regular expression splitting rule. Each returned sub-string is either a
matched expression or unmatched text. This string is left unchanged.
A couple of examples (like the Path::expand implementation) would help to
understand the splitting.
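As one possible illustration, the sketch below splits a text into matched and unmatched pieces using std::regex from the C++ standard library instead of the LYRIC Regexp class. splitByRule and the $NAME rule are hypothetical, and returning the pieces in their original order is an assumption; the actual LYRIC behaviour may differ.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Returns the pieces of text in order: every match of rule, and every
// stretch of unmatched text between, before, or after the matches.
std::vector<std::string> splitByRule(const std::string& text,
                                     const std::regex& rule) {
    std::vector<std::string> parts;
    std::string::const_iterator cursor = text.begin();
    for (std::sregex_iterator it(text.begin(), text.end(), rule), end;
         it != end; ++it) {
        const std::smatch& match = *it;
        if (match.length(0) == 0) continue;  // ignore empty matches
        if (match[0].first != cursor) {
            parts.push_back(std::string(cursor, match[0].first));  // unmatched text
        }
        parts.push_back(match.str(0));  // matched expression
        cursor = match[0].second;
    }
    if (cursor != text.end()) {
        parts.push_back(std::string(cursor, text.end()));  // trailing unmatched text
    }
    return parts;
}

// Usage: a hypothetical rule that matches $NAME style variables.
void splitByRuleExample() {
    const std::regex variable("\\$[A-Za-z_]+");
    const std::vector<std::string> parts =
        splitByRule("$HOME/bin:$USER", variable);
    assert(parts.size() == 3);  // "$HOME", "/bin:", "$USER"
    assert(parts[1] == "/bin:");
}
```

A caller that expands variables could then replace the matched pieces and concatenate all pieces again.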
-
void rmlsp () -
Removes leading spaces/tabs in this string. Removes all spaces/tabs preceding the
first non space/tab character in this string. Doesn’t touch spaces/tabs within
text.
-
void rmtsp () -
Removes trailing spaces/tabs in this string. Removes all spaces/tabs following the last
non space/tab character in this string. Doesn’t touch spaces/tabs within
text.
-
void rmltsp () -
Removes leading and trailing spaces/tabs in this string. Removes all spaces/tabs
preceding the first non space/tab character and all spaces/tabs following the last
non space/tab character in this string. Doesn’t touch spaces/tabs within text.
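The three removal functions can be sketched on std::u32string, used as a stand-in for a string of 32-bit symbols. The function names are hypothetical, and the sketch assumes that "spaces/tabs" means exactly the space and the horizontal tab symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: "spaces/tabs" means exactly U' ' and U'\t'.

// Counterpart of rmlsp: drops everything before the first non-blank symbol.
void removeLeadingBlanks(std::u32string& text) {
    const std::size_t first = text.find_first_not_of(U" \t");
    text.erase(0, first);  // npos (all blanks) erases the whole string
}

// Counterpart of rmtsp: drops everything after the last non-blank symbol.
void removeTrailingBlanks(std::u32string& text) {
    const std::size_t last = text.find_last_not_of(U" \t");
    text.erase(last == std::u32string::npos ? 0 : last + 1);
}

// Counterpart of rmltsp: both ends, blanks inside the text are kept.
void removeOuterBlanks(std::u32string& text) {
    removeTrailingBlanks(text);
    removeLeadingBlanks(text);
}

// Usage: the space between "a" and "b" survives.
void removeOuterBlanksExample() {
    std::u32string text = U" \t a b\t ";
    removeOuterBlanks(text);
    assert(text == U"a b");
}
```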