Unicode was extended beyond 16 bits, to codepoints as high as 0x10ffff, which Java stores in a 32-bit int. The corresponding UTF-16 encoding was also extended, with a clumsy system of surrogate pairs, to encode the codepoints above 0xffff.
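To make the surrogate scheme less mysterious, here is a rough sketch of the arithmetic UTF-16 uses to split a codepoint above 0xffff into a high and a low surrogate. The class and method names (SurrogateMath, encode, decode) are made up for illustration. In real code you would normally let the JDK do this for you.

```java
// Hypothetical helper class; the name SurrogateMath is ours, not part of the JDK.
class SurrogateMath {
    /** Encodes a codepoint in 0x10000..0x10FFFF as a UTF-16 surrogate pair by hand. */
    static char[] encode(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a codepoint above 0xffff");
        }
        int offset = codePoint - 0x10000;              // 20 significant bits remain
        char high = (char) (0xD800 + (offset >>> 10)); // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF)); // bottom 10 bits
        return new char[] { high, low };
    }

    /** Reverses encode: combines a high and a low surrogate into one codepoint. */
    static int decode(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char[] pair = encode(0x1D11E); // U+1D11E MUSICAL SYMBOL G CLEF
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // D834 DD1E
        System.out.printf("%X%n", decode(pair[0], pair[1]));            // 1D11E
    }
}
```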
The term codepoint in Java tends to be used to mean a slot in the 32-bit Unicode assignment, though I suspect the term is also valid to mean a slot in 16-bit Unicode or any other character set.
Java now straddles the 16-bit and 32-bit worlds. You might think Java would now have a 32-bit analog to Character, perhaps called CodePoint, and a 32-bit analog to String, perhaps called CodePoints, but it does not. Instead, Strings and char[] are permitted to contain surrogate pairs, each of which encodes a single codepoint above 0xffff.
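One practical consequence is that String.length() counts 16-bit chars, not codepoints. The sketch below, with a made-up class name CodePointCounting, shows the difference and one common way to walk a String a codepoint at a time.

```java
// Illustrative only; CodePointCounting is a made-up name for this sketch.
class CodePointCounting {
    /** Number of codepoints in s, counting each surrogate pair once. */
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "a", then U+1D11E (MUSICAL SYMBOL G CLEF) as a surrogate pair, then "b"
        String s = "a\uD834\uDD1Eb";
        System.out.println(s.length());         // 4, counts 16-bit chars
        System.out.println(countCodePoints(s)); // 3, counts codepoints

        // Walk the String one codepoint at a time, stepping 1 or 2 chars.
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            System.out.printf("U+%04X%n", cp);
            i += Character.charCount(cp);
        }
    }
}
```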
StringBuilder.appendCodePoint( int codepoint ) will accept 32-bit codepoints to append.
StringBuilder.append( int number ) just converts the number to its decimal String and appends that. Not what you want!
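The difference is easy to see side by side. This is a minimal sketch. The class AppendDemo and its two helper methods are invented for the example.

```java
// Illustrative only; AppendDemo and its helpers are made-up names.
class AppendDemo {
    /** Appends cp as a character (one or two chars). */
    static String viaAppendCodePoint(int cp) {
        return new StringBuilder().appendCodePoint(cp).toString();
    }

    /** Appends n as decimal digits, which is rarely what you meant for a codepoint. */
    static String viaAppend(int n) {
        return new StringBuilder().append(n).toString();
    }

    public static void main(String[] args) {
        System.out.println(viaAppendCodePoint(0x41));               // A
        System.out.println(viaAppend(0x41));                        // 65
        System.out.println(viaAppendCodePoint(0x1D11E).length());   // 2, a surrogate pair
    }
}
```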
FontMetrics.charWidth( int codepoint ) will tell you the advance width in pixels needed to render a given codepoint in that Font.
Character.isValidCodePoint( int codepoint ) will tell you only whether the number is in the legal range 0 to 0x10ffff. To find out if a character has actually been assigned to that codepoint, use Character.isDefined( int codepoint ). Even that is no guarantee your Font will render it though.
Character.codePointAt and Character.codePointBefore let you deal with 32-bit codepoints encoded as surrogate pairs in char arrays.
Most of the Character methods, such as toLowerCase, now have a version that accepts an int codepoint.
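Here is a small sketch exercising these Character methods. The class name CharacterChecks and the helper isAssigned are made up. The example uses 0x10ffff because it is in range yet permanently unassigned, so a range check and an assignment check give different answers.

```java
// Illustrative only; CharacterChecks and isAssigned are made-up names.
class CharacterChecks {
    /** True only if cp is in range AND has a character assigned to it. */
    static boolean isAssigned(int cp) {
        return Character.isValidCodePoint(cp) && Character.isDefined(cp);
    }

    public static void main(String[] args) {
        System.out.println(Character.isValidCodePoint(0x10FFFF)); // true, in range
        System.out.println(Character.isDefined(0x10FFFF));        // false, a noncharacter
        System.out.println(Character.isValidCodePoint(0x110000)); // false, out of range

        // U+1D11E stored as a surrogate pair, followed by an ordinary char.
        char[] chars = { '\uD834', '\uDD1E', 'x' };
        int forward = Character.codePointAt(chars, 0);      // pair starting at index 0
        int backward = Character.codePointBefore(chars, 2); // pair ending just before index 2
        System.out.printf("U+%X U+%X%n", forward, backward);

        // The int overload of toLowerCase works above 0xffff:
        // U+10400 DESERET CAPITAL LETTER LONG I -> U+10428 DESERET SMALL LETTER LONG I
        System.out.printf("U+%X%n", Character.toLowerCase(0x10400));
    }
}
```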
To convert from a codepoint to an array of chars containing the surrogate pair, use Character.toChars( int codepoint ).
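One way to do that conversion is Character.toChars. Here is a minimal sketch. The class ToCharsDemo and its describe method are invented for the example. A codepoint at or below 0xffff comes back as a single char, and one above it comes back as a two-char surrogate pair.

```java
// Illustrative only; ToCharsDemo and describe are made-up names.
class ToCharsDemo {
    /** Returns the UTF-16 chars for cp as space-separated four-digit hex. */
    static String describe(int cp) {
        char[] chars = Character.toChars(cp); // length 1 or 2
        StringBuilder sb = new StringBuilder();
        for (char c : chars) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(0x41));    // 0041
        System.out.println(describe(0x1D11E)); // D834 DD1E

        // The char[] can go straight into a String, and back again.
        String s = new String(Character.toChars(0x1D11E));
        System.out.printf("%X%n", s.codePointAt(0)); // 1D11E
    }
}
```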
This page is posted at http://mindprod.com/jgloss/codepoint.html
Please read the feedback from other visitors, or send your own feedback about the site. Contact Roedy. Please feel free to link to this page without explicit permission.
Canadian Mind Products