binary formats : Java Glossary


binary formats
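DataOutputStream puts out Java data in big-endian internal binary format. In contrast PrintStream.println and PrintStream.print put out data as human-readable characters. These are encoded as bytes in the PrintStream's encoding, by default the platform encoding, which gives plain ASCII (American Standard Code for Information Interchange) for ordinary digits and letters.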
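To make the contrast concrete, here is a minimal sketch that sends the same int through both classes and dumps the resulting bytes in hex. The class name BinaryVersusText and its helper methods are made up for this illustration, and the US-ASCII encoding for the PrintStream is an assumption chosen so the result does not depend on the platform default.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of any library.
class BinaryVersusText {
    /** Bytes that DataOutputStream.writeInt produces for value. */
    static byte[] asBinary(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buffer.toByteArray();
    }

    /** Bytes that PrintStream.print produces for value, encoded as US-ASCII. */
    static byte[] asText(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, "US-ASCII")) {
            out.print(value);
        } catch (IOException e) { // UnsupportedEncodingException is an IOException
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Formats bytes as two-digit lowercase hex values separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("binary: " + hex(asBinary(1234))); // 00 00 04 d2
        System.out.println("text:   " + hex(asText(1234)));   // 31 32 33 34
    }
}
```

The binary form is always four bytes for an int. The text form is one byte per digit, here the ASCII codes for 1, 2, 3 and 4.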

Java DataOutputStream Binary File Formats

When you create a file via DataOutputStream, what does the binary file look like? It looks like the internal binary RAM (Random Access Memory) format in a big-endian CPU (Central Processing Unit). These are also the internal formats in the Java Virtual Machine.

Everything is stored big endian, most significant byte (MSB) first. (People who cut their teeth on Intel or the MOS 6502 are used to the little endian format, least significant byte (LSB) first.) Even on Intel hardware Java uses big endian file formats. This permits data interchange with other platforms.
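The byte order is easy to see with a value whose four bytes are all different. The sketch below is only an illustration: the class name EndianDemo and its helpers are invented here. It uses java.nio.ByteBuffer with ByteOrder.LITTLE_ENDIAN to show what the same int would look like in the little endian layout, which is one way to produce or consume files written by little endian programs.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of any library.
class EndianDemo {
    /** Big endian bytes, exactly as DataOutputStream.writeInt emits them. */
    static byte[] bigEndian(int value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buffer.toByteArray();
    }

    /** The same int laid out least significant byte first, via ByteBuffer. */
    static byte[] littleEndian(int value) {
        return ByteBuffer.allocate(Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }

    /** Formats bytes as two-digit lowercase hex values separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int value = 0x12345678;
        System.out.println("big endian:    " + hex(bigEndian(value)));    // 12 34 56 78
        System.out.println("little endian: " + hex(littleEndian(value))); // 78 56 34 12
    }
}
```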

There are no separators between fields. The files are in binary, not readable ASCII.
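Because nothing marks where one field ends and the next begins, the reader has to know the layout in advance and read the fields back in the same order and with the same sizes. The following sketch assumes a made-up three-field record (an int, two ASCII bytes and two 16-bit chars). The class name UndelimitedRecord and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class UndelimitedRecord {
    /** Writes three fields back to back with no separators. */
    static byte[] write() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(3);      // 4 bytes
            out.writeBytes("AB"); // 2 bytes, one per char
            out.writeChars("AB"); // 4 bytes, two per char
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for an in-memory stream
        }
        return buffer.toByteArray();
    }

    /** Reads the fields back; the sizes must be known, they are not in the data. */
    static String read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            byte[] ascii = new byte[2];
            in.readFully(ascii);
            char first = in.readChar();
            char second = in.readChar();
            return count + " " + new String(ascii, StandardCharsets.US_ASCII)
                    + " " + first + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes as two-digit lowercase hex values separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = write();
        System.out.println(hex(data));  // 00 00 00 03 41 42 00 41 00 42
        System.out.println(read(data)); // 3 AB AB
    }
}
```

If the lengths of the strings vary, a common approach is to write a count first with writeInt or writeShort, or to use writeUTF, which stores its own length.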

Method                   Type     Size              Description
writeBoolean(boolean v)  boolean  1 byte            8-bit. 0x00 = false, 0x01 = true.
writeByte(int v)         byte     1 byte            8-bit signed binary integer,
                                                    or 8-bit ASCII char.
writeBytes(String s)     bytes    1 byte per char   8-bit signed binary integers,
                                                    or string of ASCII chars.
                                                    Only the low-order byte of each char is written.
                                                    Not null terminated.
                                                    Not in quotes.
                                                    Not counted.
                                                    Not delimited in any way.
writeChar(int v)         char     2 bytes           16-bit unsigned binary integer,
                                                    or 16-bit Unicode char.
writeChars(String s)     chars    2 bytes per char  16-bit unsigned binary integers,
                                                    or string of 16-bit Unicode chars.
                                                    Not null terminated.
                                                    Not in quotes.
                                                    Not counted.
                                                    Not delimited in any way.
writeDouble(double v)    double   8 bytes           64-bit IEEE (Institute of Electrical & Electronics Engineers) binary:
                                                    1-bit sign,
                                                    11-bit base 2 exponent, biased +1023,
                                                    52-bit fraction, lead 1 implied.
                                                    e.g.
                                                     3.0 = 0x4008000000000000
                                                    -3.0 = 0xC008000000000000
writeFloat(float v)      float    4 bytes           32-bit IEEE binary:
                                                    1-bit sign,
                                                    8-bit base 2 exponent, biased +127,
                                                    23-bit fraction, lead 1 implied.
                                                    e.g.
                                                     3.0 = 0x40400000
                                                    -3.0 = 0xC0400000
writeInt(int v)          int      4 bytes           32-bit signed binary.
                                                    e.g.
                                                     3 = 0x00 0x00 0x00 0x03
                                                    -3 = 0xff 0xff 0xff 0xfd
writeLong(long v)        long     8 bytes           64-bit signed binary.
writeShort(int v)        short    2 bytes           16-bit signed binary.
writeUTF(String s)       utf      2 bytes + string  16-bit length count (the number of encoded bytes),
                                                    followed by the encoded string.
                                                    Chars 0x01 through 0x7F take one byte each,
                                                    so plain text looks like an ASCII-7 string.
                                                    Not null terminated.
                                                    ABC == 0x0003414243
                                                    Other chars use multibyte encodings
                                                    with the first byte having the high bit on.
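The bit patterns for the numeric types can be checked without writing a file. writeDouble and writeFloat are documented to write the bits returned by Double.doubleToLongBits and Float.floatToIntBits, so printing those in hex shows what lands in the file. The class name IeeeBits and its helpers are invented for this sketch. Note that a float is 32 bits, so its pattern has eight hex digits.

```java
// Hypothetical demo class, not part of any library.
class IeeeBits {
    /** The 64-bit pattern that writeDouble would write, as 16 hex digits. */
    static String doubleHex(double value) {
        return String.format("0x%016X", Double.doubleToLongBits(value));
    }

    /** The 32-bit pattern that writeFloat would write, as 8 hex digits. */
    static String floatHex(float value) {
        return String.format("0x%08X", Float.floatToIntBits(value));
    }

    /** The 32-bit two's complement pattern that writeInt would write. */
    static String intHex(int value) {
        return String.format("0x%08X", value);
    }

    /** Extracts the 11-bit exponent field of a double and removes the 1023 bias. */
    static int unbiasedExponent(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) ((bits >>> 52) & 0x7FF) - 1023;
    }

    public static void main(String[] args) {
        System.out.println(doubleHex(3.0));        // 0x4008000000000000
        System.out.println(doubleHex(-3.0));       // 0xC008000000000000
        System.out.println(floatHex(3.0f));        // 0x40400000
        System.out.println(floatHex(-3.0f));       // 0xC0400000
        System.out.println(intHex(-3));            // 0xFFFFFFFD
        System.out.println(unbiasedExponent(3.0)); // 1, since 3.0 = 1.5 * 2^1
    }
}
```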

UTF (Unicode Transformation Format)

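UTF is a compact form of Unicode that uses a mixture of 8, 16 and 24-bit codes. writeUTF stores a string as a 16-bit big-endian length count followed by the encoded characters. The count is the number of encoded bytes, not the number of chars, so an encoded string can be at most 65,535 bytes long. Chars 0x01 through 0x7F take one byte each, so plain text looks like a 7-bit ASCII string. Not null terminated. ABC == 0x0003414243. All other chars use multibyte encodings with the first byte having the high bit on. That includes the null char 0x0000, which Java's variant (called modified UTF-8 in the Javadoc) writes as the two bytes 0xC0 0x80. UTF is an external format. UTF strings are interconverted to ordinary Strings during I/O by readUTF and writeUTF. Unicode-2 supports even 32-bit characters. In a Java String such a character is a surrogate pair of two 16-bit chars, and writeUTF encodes each half as its own 3-byte sequence, whereas standard UTF-8 uses a single 4-byte sequence.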
Unicode            UTF                         bytes required to represent the character
00000000 0xxxxxxx  0xxxxxxx                    1
00000yyy yyxxxxxx  110yyyyy 10xxxxxx           2
zzzzyyyy yyxxxxxx  1110zzzz 10yyyyyy 10xxxxxx  3
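The encoding can be watched in action by running a few short strings through writeUTF and dumping the bytes. This is a sketch only. The class name UtfDemo and its helpers are hypothetical, and the sample characters (e with acute accent, U+00E9, and the euro sign, U+20AC) are picked to exercise the two-byte and three-byte rows.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of any library.
class UtfDemo {
    /** Bytes produced by writeUTF: 2-byte length count, then the encoded chars. */
    static byte[] encode(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // e.g. string too long for a 16-bit count
        }
        return buffer.toByteArray();
    }

    /** Converts writeUTF output back to a String with readUTF. */
    static String decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes as two-digit lowercase hex values separated by spaces. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encode("ABC")));    // 00 03 41 42 43
        System.out.println(hex(encode("\u00e9"))); // 00 02 c3 a9
        System.out.println(hex(encode("\u20ac"))); // 00 03 e2 82 ac
        System.out.println(decode(encode("ABC"))); // ABC
    }
}
```

In each case the first two bytes are the count of encoded bytes that follow, which is why a single euro sign has a count of 3.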

Learning More

Oracle’s Javadoc on DataOutputStream class
Oracle’s Javadoc on DataInputStream class

This page is posted on the web at: http://mindprod.com/jgloss/binaryformats.html

Canadian Mind Products
Contact Roedy. Please feel free to link to this page without explicit permission.
