accents : Java Glossary


accents
English, French, German, Italian and Swedish use modified letters such as é (e acute), ê (e circumflex), è (e grave) and ç (c cedilla). These appear in the range 0x00c0 to 0x00ff in the Latin-1 Supplement part of Unicode.

Eastern European languages have additional accents such as š (s caron) in the range 0x0100 to 0x017f in the Latin Extended-A section of Unicode.
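To see which Unicode block a given character falls in, you can ask Java directly. The following is a minimal sketch, not part of the original page: the class name AccentBlocks and the method isInAccentBlocks are made up for illustration. It is only a rough test, since these blocks also contain a few letters that are not accented (for example µ) and the test ignores accented letters in other blocks.

```java
import java.lang.Character.UnicodeBlock;

// Hypothetical helper class, invented for this illustration.
final class AccentBlocks {

    /**
     * Rough test: true if c is a letter in the Latin-1 Supplement
     * or Latin Extended-A block. Symbols in those blocks, such as
     * the multiplication sign, are excluded by the isLetter check.
     */
    static boolean isInAccentBlocks(char c) {
        UnicodeBlock block = UnicodeBlock.of(c);
        return Character.isLetter(c)
                && (block == UnicodeBlock.LATIN_1_SUPPLEMENT
                    || block == UnicodeBlock.LATIN_EXTENDED_A);
    }

    public static void main(String[] args) {
        // plain e, e acute (U+00E9), s caron (U+0161)
        for (char c : new char[] {'e', '\u00e9', '\u0161'}) {
            System.out.printf("U+%04X %s %b%n",
                    (int) c, UnicodeBlock.of(c), isInAccentBlocks(c));
        }
    }
}
```

Unicode escapes are used in the source so that the example compiles the same way regardless of the file encoding.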

Esperanto has 6 accented letters: ĉ (c circumflex), ĝ (g circumflex), ĥ (h circumflex), ĵ (j circumflex), ŝ (s circumflex) and ŭ (u breve).

Detecting Accented Vowels

com.mindprod.common18.ST.isVowel will tell you if a given character is a vowel, including accented vowels. You can download the source as part of the COMMON18 distributable. This works in JDK (Java Development Kit) 1.+.
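If you do not want to pull in the library, one possible approach is sketched below. This is not the source of ST.isVowel, and it may behave differently. The class name VowelSketch is hypothetical. It assumes that a vowel is any character whose base letter, after canonical decomposition, is one of a, e, i, o or u. It relies on java.text.Normalizer, so it needs Java 1.6 or later.

```java
import java.text.Normalizer;

// Hypothetical class, invented for this illustration.
// This is NOT the implementation of com.mindprod.common18.ST.isVowel.
final class VowelSketch {

    /**
     * True if c is a, e, i, o or u, in either case, with or without
     * an accent that Unicode can split off as a combining mark.
     * Assumption: y is not counted as a vowel.
     */
    static boolean isVowel(char c) {
        // NFD turns e acute into plain e followed by a combining acute.
        String decomposed =
                Normalizer.normalize(String.valueOf(c), Normalizer.Form.NFD);
        char base = Character.toLowerCase(decomposed.charAt(0));
        return "aeiou".indexOf(base) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isVowel('\u00ea')); // e circumflex
        System.out.println(isVowel('\u0161')); // s caron
    }
}
```

Letters with no canonical decomposition, such as ø or æ, are not recognised by this sketch. A lookup table would be needed to handle them.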

Removing Accents

Here is how you can convert accented chars to unaccented ones, in Java version 1.6 or later.
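A minimal sketch of one common technique follows. It assumes the java.text.Normalizer approach, which is the class referenced under Learning More. The class name AccentStripper and the method removeAccents are made up for illustration. The idea is to decompose each accented character into its base letter plus combining marks (form NFD), then delete the combining marks with a regex.

```java
import java.text.Normalizer;

// Hypothetical class, invented for this illustration.
final class AccentStripper {

    /** Returns s with combining diacritical marks removed. */
    static String removeAccents(String s) {
        // Step 1: canonical decomposition, e.g. U+00E9 -> e + U+0301.
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // Step 2: drop every char in the Combining Diacritical Marks block.
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // e acute, e circumflex, e grave, c cedilla
        System.out.println(removeAccents("\u00e9\u00ea\u00e8\u00e7")); // prints eeec
    }
}
```

Characters that have no canonical decomposition, such as ø, æ or ß, pass through this method unchanged.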

Complications

Learning More

Oracle’s Javadoc on Normalizer class
Oracle’s Javadoc on Normalizer.Form

This page is posted on the web at: http://mindprod.com/jgloss/accents.html

Contact Roedy. Please feel free to link to this page without explicit permission.
