Encoding
XML (like Java) uses Unicode as its character set.
Unicode text can be stored in several encodings, such
as UTF-8, UTF-16, and UTF-32.  The most common one
used in the West is UTF-8.
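UTF-8 is a variable-length encoding.  Each character is
encoded in 1, 2, 3, or 4 bytes.
The first 128 characters in Unicode are ASCII, and
UTF-8 encodes each of them in a single byte.
The characters numbered 128 to 255 include some of the
more common characters used in western Europe, such as
ã, á, å, or ç.  In UTF-8 each of these takes two bytes.
Two-byte codes cover everything up to U+07FF, which
also includes Greek, Cyrillic, Hebrew, and Arabic.
Three-byte codes cover the rest of the Basic Multilingual
Plane (up to U+FFFF), including most Asian ideographs.
Four-byte codes handle everything above U+FFFF, such as
the rarer ideographs.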
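
The byte counts can be checked directly in Java.  The
following is a minimal sketch, not part of the original
notes: the class name Utf8Lengths and the helper
utf8Length are made up for illustration, and the sample
characters are arbitrary picks from each range.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class Utf8Lengths {
    // Number of bytes UTF-8 uses to encode the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Escapes are used so the encoding of this source file does not matter.
        String[] samples = {
            "A",            // U+0041, an ASCII letter
            "\u00E7",       // U+00E7, c with cedilla
            "\u4E2D",       // U+4E2D, a common CJK ideograph
            "\uD840\uDC00"  // U+20000, an ideograph outside the Basic Multilingual Plane
        };
        for (String s : samples) {
            System.out.printf("U+%04X -> %d byte(s)%n", s.codePointAt(0), utf8Length(s));
        }
    }
}
```

Running main should report 1, 2, 3, and 4 bytes for the
four samples, in that order.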
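Those working mostly with non-western text may want to
investigate other Unicode encodings, such as UTF-16,
which stores most Asian ideographs in two bytes where
UTF-8 needs three.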
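
One rough way to compare encodings is to measure the
same text in each.  The sketch below is an illustration
only: the class name EncodingSizes, the helper names,
and the two sample strings are assumptions, and real
documents (which mix markup and text) may behave
differently.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class EncodingSizes {
    // Size of the text in bytes when stored as UTF-8.
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the text in bytes when stored as UTF-16.
    // UTF_16BE is used so that no byte-order mark is added to the count.
    static int utf16Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String western = "encoding";                // eight ASCII letters
        String asian = "\u6587\u5B57\u7B26\u53F7";  // four CJK ideographs
        System.out.println("western: UTF-8 = " + utf8Bytes(western)
                + ", UTF-16 = " + utf16Bytes(western));
        System.out.println("asian:   UTF-8 = " + utf8Bytes(asian)
                + ", UTF-16 = " + utf16Bytes(asian));
    }
}
```

For these samples the ASCII string is smaller in UTF-8
(8 bytes against 16), while the ideographs are smaller
in UTF-16 (8 bytes against 12).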