|UTF (Unicode Transformation Format) BOM (Byte Order Mark) Unicode-encoding Endian Indicators|
|BOM as it appears encoded (hex)||Encoding|
|ef bb bf||UTF-8. Endianness, strictly speaking, does not apply, though it uses a big-endian, most-significant-byte-first representation.|
|fe ff||UTF-16 for 16-bit internal UCS-2, big endian, Java network order|
|ff fe||UTF-16 for 16-bit internal UCS-2, little endian, Intel/Microsoft order. Note that you must examine the following two bytes to tell this apart from the little-endian UTF-32 BOM, ff fe 00 00, since both start ff fe.|
|00 00 fe ff||UTF-32 for 32-bit internal UCS-4, big-endian, Java network order|
|ff fe 00 00||UTF-32 for 32-bit internal UCS-4, little endian, Intel/Microsoft order.|
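There are also variants of these encodings, such as UTF-16BE, UTF-16LE, UTF-32BE and UTF-32LE, where the byte order is implied by the encoding name rather than announced by a BOM.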
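The table above can be turned into a small sniffing routine. What follows is only a sketch: the class name `BomSniffer` and the method `detect` are made up for illustration, and the returned strings are just labels for the rows of the table. The one subtle point is the order of the tests: the 4-byte UTF-32 marks are checked before the 2-byte UTF-16 marks, because ff fe 00 00 begins with ff fe.

```java
final class BomSniffer {
    /**
     * Returns a label for the encoding implied by a leading BOM,
     * or null if the data does not start with a recognised BOM.
     * (Hypothetical helper, not part of the JDK.)
     */
    static String detect(byte[] data) {
        // Longest marks first: ff fe 00 00 would otherwise match the UTF-16 LE test.
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to 0..255 because Java bytes are signed.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(detect(sample)); // prints UTF-16LE
    }
}
```

Be aware that the test is a heuristic: a little-endian UTF-16 file whose first character after the BOM is U+0000 also starts ff fe 00 00, so it would be reported as UTF-32LE.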
Unfortunately, applications, even javac.exe, often choke on these byte order marks. Java Readers don’t automatically filter them out. There is not much you can do but remove them manually.
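One way to remove a UTF-8 BOM manually is to peek at the first decoded character and discard it if it is U+FEFF. This is a minimal sketch, assuming the input is known to be UTF-8; the class `BomSkippingReader` and its methods are hypothetical names, not JDK classes. It relies only on the standard `PushbackReader` to put back a first character that turns out not to be a BOM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BomSkippingReader {
    /** Opens a UTF-8 Reader positioned just past a leading BOM, if there is one. */
    static Reader openUtf8(InputStream in) throws IOException {
        PushbackReader reader =
                new PushbackReader(new InputStreamReader(in, StandardCharsets.UTF_8), 1);
        int first = reader.read();
        // The UTF-8 decoder hands the bytes ef bb bf back as the single char U+FEFF.
        if (first != -1 && first != 0xFEFF) {
            reader.unread(first); // not a BOM: put the character back
        }
        return reader;
    }

    /** Convenience wrapper: decodes a UTF-8 byte array, dropping a leading BOM. */
    static String decodeUtf8(byte[] bytes) {
        try (Reader r = openUtf8(new ByteArrayInputStream(bytes))) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = r.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decodeUtf8(withBom)); // prints hi
    }
}
```

Only a BOM in the very first position is dropped; a U+FEFF later in the text is left alone.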
This program tests how Java handles BOMs. It discovers that Java never inserts a BOM and never removes one on its own. You have to bypass, insert and delete them explicitly.
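The following is not the test program referred to above, just a small sketch of the same idea restricted to UTF-8. The class `Utf8BomDemo` and the helper `encodeWithBom` are invented for illustration. It shows that encoding to UTF-8 adds no BOM, that decoding leaves a BOM in the string as the character U+FEFF, and that you can insert one explicitly by prepending that character. Other charsets are not exercised here and may behave differently.

```java
import java.nio.charset.StandardCharsets;

final class Utf8BomDemo {
    /** The BOM as a Java char; in UTF-8 it encodes to the bytes ef bb bf. */
    static final char BOM = '\uFEFF';

    /** Explicitly inserts a BOM by prepending U+FEFF before encoding. */
    static byte[] encodeWithBom(String text) {
        return (BOM + text).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Encoding: no BOM is added, so "A" is a single byte.
        byte[] plain = "A".getBytes(StandardCharsets.UTF_8);
        System.out.println(plain.length); // prints 1

        // Explicit insertion: three BOM bytes plus one byte for "A".
        byte[] marked = encodeWithBom("A");
        System.out.println(marked.length); // prints 4

        // Decoding: the BOM is not removed, it shows up as the first char.
        String decoded = new String(marked, StandardCharsets.UTF_8);
        System.out.println(decoded.length());         // prints 2
        System.out.println(decoded.charAt(0) == BOM); // prints true
    }
}
```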