wxEncodingConverter

This class converts strings between two 8-bit encodings/charsets. It can also convert from/to Unicode (but only if wxWidgets was compiled with wxUSE_WCHAR_T set to 1). Only a limited subset of encodings is supported by wxEncodingConverter: wxFONTENCODING_ISO8859_1..15, wxFONTENCODING_CP1250..1257 and wxFONTENCODING_KOI8.

Note

Please use the wxMBConv classes instead if possible. wxCSConv has much better support for various encodings than wxEncodingConverter. wxEncodingConverter is useful only if you rely on the wxCONVERT_SUBSTITUTE mode of operation (see Init).

Derived from

wxObject

Include files

<wx/encconv.h>

See also

wxFontMapper, wxMBConv, Writing non-English applications

Members

wxEncodingConverter::wxEncodingConverter
wxEncodingConverter::Init
wxEncodingConverter::CanConvert
wxEncodingConverter::Convert
wxEncodingConverter::GetPlatformEquivalents
wxEncodingConverter::GetAllEquivalents


wxEncodingConverter::wxEncodingConverter

wxEncodingConverter()

Constructor.


wxEncodingConverter::Init

bool Init(wxFontEncoding input_enc, wxFontEncoding output_enc, int method = wxCONVERT_STRICT)

Initializes the conversion. Either the input or the output encoding may be wxFONTENCODING_UNICODE, but only if wxUSE_ENCODING is set to 1. All subsequent calls to Convert() will interpret their argument as a string in the input_enc encoding and will produce a string in the output_enc encoding. You must call this method before calling Convert. You may call it more than once in order to switch to another conversion. The method parameter affects the behaviour of Convert() when an input character cannot be converted because it does not exist in the output encoding:

wxCONVERT_STRICT      Follows the behaviour of GNU Recode: unconvertible characters are copied to the output unchanged (their integer values stay the same).
wxCONVERT_SUBSTITUTE  Tries some (lossy) substitutions, e.g. replaces unconvertible Latin capitals with acute by ordinary capitals, replaces en-dash or em-dash by '-', etc.

Both modes guarantee that the output string has the same length as the input string.
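
The difference between the two modes can be illustrated with a small standalone model. This is only a sketch and does not use wxWidgets: the Method enum and the ConvertModel function are hypothetical stand-ins, the input bytes are read as CP1252, and the target is assumed (purely for illustration) to contain nothing but 7-bit ASCII.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for wxCONVERT_STRICT and wxCONVERT_SUBSTITUTE.
enum class Method { Strict, Substitute };

// Byte-for-byte model of a conversion from CP1252 to an ASCII-only target.
// Bytes >= 0x80 play the role of "unconvertible" characters.
std::string ConvertModel(const std::string& input, Method method)
{
    std::string output(input.size(), '\0');
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        char out = input[i];  // convertible, or strict mode: value is kept
        if (c >= 0x80 && method == Method::Substitute)
        {
            switch (c)
            {
                case 0xC1: out = 'A'; break;  // capital A with acute -> plain A
                case 0x96:                    // en-dash
                case 0x97: out = '-'; break;  // em-dash
                default: break;               // no substitute in this model: keep value
            }
        }
        output[i] = out;
    }
    // Both modes produce exactly one output character per input character.
    assert(output.size() == input.size());
    return output;
}
```

In this model, strict mode returns the input bytes untouched, while substitute mode turns the three listed bytes into their plain ASCII look-alikes. What the real class does with a character that has no substitute is not modelled here.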

Return value

false if the given conversion is impossible, true otherwise. The conversion may be impossible either because you are trying to convert to Unicode with a non-Unicode build of wxWidgets or because the input or output encoding is not supported.


wxEncodingConverter::CanConvert

static bool CanConvert(wxFontEncoding encIn, wxFontEncoding encOut)

Returns true if (any text in) the multibyte encoding encIn can be converted to another one (encOut) losslessly.

Do not call this method with wxFONTENCODING_UNICODE as either parameter; it doesn't make sense (the conversion always works in one direction and always depends on the text to convert in the other).


wxEncodingConverter::Convert

bool Convert(const char* input, char* output) const

bool Convert(const wchar_t* input, wchar_t* output) const

bool Convert(const char* input, wchar_t* output) const

bool Convert(const wchar_t* input, char* output) const

Converts the input string according to the settings passed to Init and writes the result to output.

bool Convert(char* str) const

bool Convert(wchar_t* str) const

Converts the input string in place according to the settings passed to Init, i.e. writes the result to the same memory area.

All of the versions above return true if the conversion was lossless and false if at least one of the characters couldn't be converted and was replaced with '?' in the output. Note that if wxCONVERT_SUBSTITUTE was passed to Init, substitution is considered a lossless operation.

wxString Convert(const wxString& input) const

Converts a wxString and returns a new wxString object.

Notes

You must call Init before using this method!

The wchar_t versions of the method are not available if wxWidgets was compiled with wxUSE_WCHAR_T set to 0.
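
The required call order (Init first, then Convert, checking both return values) can be sketched as follows. To keep the sketch compilable without wxWidgets it is written as a template over the converter type. ConvertInPlace is a hypothetical helper, not part of the library, and it assumes only the Init and in-place Convert(char*) signatures documented above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: runs one in-place conversion on 'text'.
// Converter is expected to offer the documented members
//   bool Init(Encoding in, Encoding out, int method)
//   bool Convert(char* str) const
// Returns false if Init rejects the conversion; 'text' is then left alone.
template <class Converter, class Encoding>
bool ConvertInPlace(Converter& conv, Encoding from, Encoding to, int method,
                    std::string& text, bool* lossless = nullptr)
{
    // Init must succeed before Convert may be called.
    if (!conv.Init(from, to, method))
        return false;

    // Convert(char*) needs a writable, NUL-terminated buffer.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    const bool ok = conv.Convert(buf.data());
    if (lossless)
        *lossless = ok;  // false: at least one character could not be converted

    text.assign(buf.data());
    return true;
}
```

With wxWidgets available, this might be called as ConvertInPlace(conv, wxFONTENCODING_ISO8859_2, wxFONTENCODING_CP1250, wxCONVERT_SUBSTITUTE, text), where conv is a wxEncodingConverter and text is a std::string holding ISO 8859-2 bytes.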


wxEncodingConverter::GetPlatformEquivalents

static wxFontEncodingArray GetPlatformEquivalents(wxFontEncoding enc, int platform = wxPLATFORM_CURRENT)

Returns the equivalents of the given encoding that are used on the given platform. Supported platforms:

wxPLATFORM_CURRENT means the platform this binary was compiled for.

Examples:

current platform   enc          returned value
----------------------------------------------
unix            CP1250             {ISO8859_2}
unix         ISO8859_2             {ISO8859_2}
windows      ISO8859_2                {CP1250}
unix            CP1252  {ISO8859_1,ISO8859_15}

Equivalence is defined in terms of convertibility: two encodings are equivalent if you can convert text between them without losing information. It may (and will) happen that you lose special characters such as quotation marks or em-dashes, but you shouldn't lose any diacritics or language-specific characters when converting between equivalent encodings.

Remember that this function does NOT check for the presence of fonts in the system. It only tells you which encodings are most suitable. (It usually returns only one encoding.)
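
Since the function does not check which fonts are installed, a caller may want to walk the returned array and pick the first encoding it can actually use. A minimal sketch of that selection follows. It uses std::vector<int> as a stand-in for wxFontEncodingArray so that it compiles without wxWidgets, and PickUsableEncoding and the isAvailable predicate are hypothetical names, not library functions.

```cpp
#include <functional>
#include <vector>

// Returns the first candidate accepted by isAvailable, or 'fallback' if
// no candidate in the array is usable.
int PickUsableEncoding(const std::vector<int>& equivalents,
                       const std::function<bool(int)>& isAvailable,
                       int fallback)
{
    for (int enc : equivalents)
    {
        if (isAvailable(enc))
            return enc;
    }
    return fallback;
}
```

How availability is decided (for example by asking the font system) is left to the predicate and is outside the scope of this class.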


wxEncodingConverter::GetAllEquivalents

static wxFontEncodingArray GetAllEquivalents(wxFontEncoding enc)

Similar to GetPlatformEquivalents, but this one returns ALL equivalent encodings, regardless of the platform, including enc itself.

The encodings used on the current platform come before the others in the array. If enc is in the array, it is always the very first item.