Strims


Classes

class  ZStrimR
class  ZStrimU
class  ZStrimW

Detailed Description

See also:
ZUnicode

Unicode

The word strim was coined by Eric Cooper. It's a nice punchy designator for a string stream, an interface that is akin to a stream but geared towards reading and writing Unicode strings. When working with strims you should understand the terms code unit and code point, which are covered in the documentation sections Unicode and ZUnicode.

If you're working with ASCII or another single-byte text encoding, the ZStreamR and ZStreamW facilities are sufficient. If you need to handle arbitrary text encodings, consider using the ZStrimR and ZStrimW interfaces. Strings in any of the three Unicode serialization forms (UTF-8, UTF-16, UTF-32) can be passed to a ZStrimW, and a ZStrimR can return strings in any of those forms. Both can limit their operation by code unit and code point counts.
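To make the distinction between code unit counts and code point counts concrete, here is a minimal standalone sketch for UTF-8. The function names are hypothetical and are not part of ZooLib, and the code assumes its input is well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a ZooLib function: in UTF-8 a code unit is one
// byte, so the code unit count is simply the byte count.
std::size_t CountCodeUnits(const std::string& utf8) {
    return utf8.size();
}

// Hypothetical helper, not a ZooLib function: counts code points by
// counting every byte that is not a continuation byte (10xxxxxx).
// Assumes the input is well-formed UTF-8.
std::size_t CountCodePoints(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For a string containing only ASCII the two counts agree. They diverge as soon as a character needs more than one byte, which is why an interface that limits by both counts is useful.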

As ZStrimR and ZStrimW are abstract interfaces, you'll never instantiate them directly. Instead you'll use a concrete subclass that overrides the pure virtual methods that are ultimately responsible for taking a string and disposing of it somewhere, or for sourcing text and returning it.
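The general shape of that arrangement can be sketched as follows. TextSink and TextSink_String are hypothetical stand-ins invented for illustration; they are not ZooLib's actual class declarations, and the real interfaces have more entry points than this.

```cpp
#include <cstddef>
#include <string>

// Hypothetical abstract write interface, for illustration only.
class TextSink {
public:
    virtual ~TextSink() {}

    // Public entry point: forwards to the override supplied by a subclass.
    void Write(const std::string& utf8) {
        this->DoWrite(utf8.data(), utf8.size());
    }

protected:
    // The pure virtual method a concrete subclass must implement.
    virtual void DoWrite(const char* utf8, std::size_t count) = 0;
};

// Hypothetical concrete subclass that disposes of the text by
// appending it to an in-memory string.
class TextSink_String : public TextSink {
public:
    const std::string& Get() const { return fText; }

protected:
    virtual void DoWrite(const char* utf8, std::size_t count) {
        fText.append(utf8, count);
    }

private:
    std::string fText;
};
```

Callers hold a reference to the abstract type and call Write. Only the subclass knows where the text ends up.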

For example, if you have a ZStreamerRW that encapsulates a network connection, and you know that the other end is expecting little-endian UTF-16 text, you can do this:

        ZStrimW_StreamUTF16LE theStrim(theStreamerRW->GetStreamW());
        theStrim.Write("This is some UTF-8 text");
        theStrim.Write("but will be written to theStreamerRW as UTF-16LE.");

If it had been expecting UTF-8 text, you could do this:

        ZStrimW_StreamUTF8 theStrim(theStreamerRW->GetStreamW());
        theStrim.Write(L"This is some native endian UTF-16 text");
        theStrim.Write(L"but will be written to theStreamerRW as UTF-8.");

In fact, in the latter case the string is made up of wchar_t elements, and so on some platforms it will be a sequence of 32-bit code points. There the UTF32* entry point would be invoked rather than the UTF16* entry point. In either case the program at the other end of the network connection would see text that was encoded appropriately.
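To show what "encoded appropriately" means on the wire for the UTF-16LE case, here is a minimal sketch of serializing a single code point as little-endian UTF-16 bytes. AppendUTF16LE is a hypothetical helper, not ZooLib code, and it assumes the caller passes a valid Unicode scalar value.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, not a ZooLib function: appends one code point to
// 'out' as little-endian UTF-16. Assumes cp is a valid scalar value
// (at most 0x10FFFF and not itself a surrogate).
void AppendUTF16LE(std::uint32_t cp, std::vector<std::uint8_t>& out) {
    // Writes one 16-bit code unit, low byte first.
    auto put16 = [&out](std::uint16_t unit) {
        out.push_back(static_cast<std::uint8_t>(unit & 0xFF));
        out.push_back(static_cast<std::uint8_t>(unit >> 8));
    };

    if (cp < 0x10000) {
        // Basic Multilingual Plane: a single code unit.
        put16(static_cast<std::uint16_t>(cp));
    } else {
        // Supplementary planes: a surrogate pair, two code units.
        cp -= 0x10000;
        put16(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));
        put16(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
}
```

A code point above U+FFFF therefore occupies two code units (four bytes) in UTF-16 but a single code unit in UTF-32, which is the difference between the two entry points mentioned above.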

Note:
A short read is not necessarily indicative of an end-of-stream condition. If the buffer does not have room for all the code units that make up a complete code point, the read will return without reading any of that final code point's code units. For this reason, when reading UTF-8 you should always pass a buffer that is at least six code units in length, and when reading UTF-16 one that is at least two code units in length.
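The behaviour described in this note can be imitated with a small standalone sketch that copies only whole UTF-8 code points into a caller's buffer. Both functions are hypothetical and are not ZooLib's implementation. The sketch uses the four-byte maximum of current UTF-8, whereas the six-unit recommendation above also covers the longer sequences permitted by the original UTF-8 definition.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length in code units of the UTF-8 sequence that
// starts with lead byte 'c'. Returns 1 for unexpected bytes so that
// callers always make progress.
std::size_t SequenceLength(unsigned char c) {
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

// Hypothetical helper: copies whole code points from 'source', starting
// at 'offset', into 'buffer'. Stops before any code point that would not
// fit entirely. Returns the number of code units copied.
std::size_t ReadWholeCodePoints(const std::string& source, std::size_t offset,
                                char* buffer, std::size_t capacity) {
    std::size_t copied = 0;
    while (offset + copied < source.size()) {
        const unsigned char lead =
            static_cast<unsigned char>(source[offset + copied]);
        const std::size_t length = SequenceLength(lead);
        if (offset + copied + length > source.size()) {
            break;  // source ends partway through a sequence
        }
        if (copied + length > capacity) {
            break;  // copying would split a code point
        }
        for (std::size_t i = 0; i < length; ++i) {
            buffer[copied + i] = source[offset + copied + i];
        }
        copied += length;
    }
    return copied;
}
```

With a three-unit buffer and the text "a", the euro sign, "b", the first call returns one code unit even though more text remains. With a two-unit buffer positioned at the euro sign, the call returns zero and can never advance, which is why the buffer must be at least as long as the longest possible sequence.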

Generated on Thu Jul 26 11:21:59 2007 for ZooLib by  doxygen 1.4.7