surrogate pair : Java Glossary


surrogate pair

Internally, Java uses 16-bit characters. Unicode has been extended to include characters that no longer fit in 16 bits. They are often called 32-bit characters, though only 21 bits are actually needed, since code points stop at 0x10FFFF. Instead of flipping to RAM-gobbling 32-bit characters, Sun decided to handle the new characters with a pair of 16-bit characters. They added support for them in a half-hearted way.

Java does not even have 32-bit String literals like the C-style code point escape, e.g. \U0001d504. Note the capital U, vs the usual \ud504. I wrote the SurrogatePair applet to convert C-style code points to arcane surrogate pairs, to let you use 32-bit Unicode glyphs in your programs.
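Here is a minimal sketch of the workaround, assuming you already know the surrogate pair for your character. The class name FrakturLiteral and the constant FRAKTUR_A are made up for illustration. Character.toChars and String.codePointAt are standard library methods (Java 1.5 and later). The sketch writes code point 0x1D504 as two ordinary 16-bit escapes, then builds the same String from the code point to show that the two forms agree.

```java
class FrakturLiteral {
    // Code point 0x1D504 spelled as a surrogate pair: two ordinary
    // 16-bit escapes, high surrogate first, then low surrogate.
    static final String FRAKTUR_A = "\uD835\uDD04";

    // Build a String from a code point without working out the pair by hand.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1D504);
        // Same two chars as the hand-written literal.
        System.out.println(s.equals(FRAKTUR_A));
        // length() counts 16-bit chars, not characters.
        System.out.println(s.length());
        // codePointCount() counts whole characters.
        System.out.println(s.codePointCount(0, s.length()));
        // codePointAt() reassembles the pair into one code point.
        System.out.println(Integer.toHexString(s.codePointAt(0)));
    }
}
```

Note that length() reports 2 for this one-glyph String, which is the sort of thing that bites code written before surrogate pairs existed.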

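To pull this off, Unicode reserves two bands of 16-bit characters for use in encoding the high characters: the high surrogates, 0xD800 to 0xDBFF, and the low surrogates, 0xDC00 to 0xDFFF. A pair is always one high surrogate followed by one low surrogate.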
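The arithmetic behind the two bands can be sketched as follows. This is an illustration of the standard UTF-16 encoding, not the source of the SurrogatePair applet. The class SurrogateMath and its method names are hypothetical. The idea: subtract 0x10000 from the code point to get a 20-bit offset, put the top 10 bits in the high band starting at 0xD800 and the bottom 10 bits in the low band starting at 0xDC00.

```java
class SurrogateMath {
    // Split a supplementary code point (0x10000..0x10FFFF) into a surrogate pair.
    static char[] toSurrogates(int codePoint) {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - 0x10000;               // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    // Reassemble the code point from a high/low surrogate pair.
    static int toCodePoint(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x1D504);
        System.out.println(Integer.toHexString(pair[0]));  // d835
        System.out.println(Integer.toHexString(pair[1]));  // dd04
        System.out.println(Integer.toHexString(toCodePoint(pair[0], pair[1])));  // 1d504
    }
}
```

In real code you would normally let the library do this: Character.toChars and Character.toCodePoint perform the same conversions.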


This page is posted
on the web at:

http://mindprod.com/jgloss/surrogatepair.html

Contact Roedy. Please feel free to link to this page without explicit permission.
