JCL Help:MaximumUTF16

From Project JEDI Wiki


Summary

Maximum value for a UTF-16 encoded character.


Pascal

 MaximumUTF16: UCS4 = $0010FFFF;
 SurrogateHighStart = UCS4($D800);
 SurrogateHighEnd = UCS4($DBFF);
 SurrogateLowStart = UCS4($DC00);
 SurrogateLowEnd = UCS4($DFFF);


Description

This constant denotes the maximum value of a UTF-16 character. Valid UTF-16 characters lie in the range [#0..MaximumUTF16], excluding the two surrogate ranges [SurrogateHighStart..SurrogateHighEnd] and [SurrogateLowStart..SurrogateLowEnd]. The surrogate values are reserved for encoding and never denote characters on their own.
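The range rule above can be sketched as a small predicate. This is only an illustration: IsValidUTF16Character is a hypothetical helper, not a JclBase routine, and the UCS4 type and constants are redeclared locally (UCS4 is assumed here to be a 32-bit unsigned integer) so that the program compiles without the JCL. Real code would use JclBase instead.

```pascal
program ValidUTF16CharacterDemo;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // Local stand-in (assumption); JclBase declares its own UCS4 type.
  UCS4 = Cardinal;

const
  // Values copied from the declarations on this page.
  MaximumUTF16: UCS4 = $0010FFFF;
  SurrogateHighStart = UCS4($D800);
  SurrogateLowEnd = UCS4($DFFF);

// Hypothetical helper: True when Value is a valid UTF-16 character value.
function IsValidUTF16Character(Value: UCS4): Boolean;
begin
  // The high and low surrogate ranges are adjacent, so a single test
  // on $D800..$DFFF excludes both of them.
  Result := (Value <= MaximumUTF16) and
    not ((Value >= SurrogateHighStart) and (Value <= SurrogateLowEnd));
end;

begin
  WriteLn(IsValidUTF16Character($0041));   // ordinary character: valid
  WriteLn(IsValidUTF16Character($D800));   // high surrogate: not valid
  WriteLn(IsValidUTF16Character($10FFFF)); // MaximumUTF16 itself: valid
  WriteLn(IsValidUTF16Character($110000)); // above MaximumUTF16: not valid
end.
```

Because SurrogateHighEnd + 1 equals SurrogateLowStart, the sketch tests one combined range rather than two.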

In a UTF-16 stream, characters are read in 16-bit chunks and decoded as follows. A first 16-bit chunk is read. If its value lies in [SurrogateLowStart..SurrogateLowEnd], it is an error, because a low surrogate cannot start a character. If its value lies outside [SurrogateHighStart..SurrogateHighEnd], the character value is equal to this first chunk. Otherwise the first chunk is a high surrogate and a second chunk is read; its value must lie in [SurrogateLowStart..SurrogateLowEnd], and anything else is an error. Finally, the two chunks are combined into a single character value in the range [$10000..MaximumUTF16].
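The decoding steps above can be sketched as follows. DecodeUTF16 is a hypothetical helper written for this page, not the JCL's own decoding routine, and the UCS4 type and surrogate constants are redeclared locally (an assumption) to keep the program self-contained. The combining formula is the one from the UTF-16 definition: each surrogate carries 10 bits, and the resulting 20-bit number is offset by $10000.

```pascal
program DecodeUTF16Demo;

{$APPTYPE CONSOLE}
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}

type
  // Local stand-in (assumption); JclBase declares its own UCS4 type.
  UCS4 = Cardinal;

const
  SurrogateHighStart = UCS4($D800);
  SurrogateHighEnd = UCS4($DBFF);
  SurrogateLowStart = UCS4($DC00);
  SurrogateLowEnd = UCS4($DFFF);
  // Sample stream: 'A' followed by the surrogate pair for U+1F600.
  Sample: array[0..2] of Word = ($0041, $D83D, $DE00);

// Hypothetical helper: decodes the character starting at Chunks[Index].
// Returns False on an unpaired surrogate or a truncated pair. On success,
// Index is advanced past the one or two chunks that were consumed.
function DecodeUTF16(const Chunks: array of Word; var Index: Integer;
  out Value: UCS4): Boolean;
var
  First, Second: UCS4;
begin
  Result := False;
  Value := 0;
  if (Index < 0) or (Index > High(Chunks)) then
    Exit; // nothing left to read
  First := Chunks[Index];
  if (First >= SurrogateLowStart) and (First <= SurrogateLowEnd) then
    Exit; // a low surrogate cannot start a character
  if (First < SurrogateHighStart) or (First > SurrogateHighEnd) then
  begin
    Value := First; // the single chunk is the character value
    Inc(Index);
    Result := True;
    Exit;
  end;
  if Index + 1 > High(Chunks) then
    Exit; // high surrogate without a second chunk
  Second := Chunks[Index + 1];
  if (Second < SurrogateLowStart) or (Second > SurrogateLowEnd) then
    Exit; // second chunk is not a low surrogate
  // 10 bits from each surrogate, then the $10000 offset.
  Value := ((First - SurrogateHighStart) shl 10) +
    (Second - SurrogateLowStart) + $10000;
  Inc(Index, 2);
  Result := True;
end;

var
  Index: Integer;
  Value: UCS4;
begin
  Index := 0;
  // Walk the stream one character at a time and print each decoded value.
  while (Index <= High(Sample)) and DecodeUTF16(Sample, Index, Value) do
    WriteLn(Value);
end.
```

For the sample stream, the first call yields the value $41 from a single chunk, and the second call combines $D83D and $DE00 into $1F600. Returning a Boolean instead of raising an exception is a design choice of this sketch only.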


About

Unit

JclBase

