Hermit Crab Variable Length Record Files
by Roedy Green ©1996-2008 Canadian Mind Products
This essay is about a suggested
student project in
Java programming. This essay gives a rough overview of how it might work. It
does not describe an actual complete program. I have
no source, object,
specifications, file layouts or anything else useful to implementing this
project. Everything I have to say to help you with this project is written below.
I am
not prepared to help you implement it; I have too many other
projects of my own.
I do contract work for a living, which could include writing a program such as
this. However, I don’t do people’s homework
for them. That just robs them of an education.
You have my full permission to implement this project any way you please.
Hermit crab files allow you to efficiently manage variable length records that
can grow and shrink. You can access records by an integer relative record number.
Hermit crab files do not provide access by key, though you might build an in-RAM
index or Hashtable by key fields that gives you the relative record number. They
are called hermit crab files because as records grow, they abandon space "shells
", which are then recycled for use by smaller records.
Here are some thoughts on how an efficient variable length record manager might
work. This is a virtual product. There is no such utility. I first proposed this
back in 1991 for DOS. I have dusted it off a little
for Java. It would be useful for that intermediate territory where flat files
are not quite smart enough and an SQL database is overkill.
Purpose
The Hermit Crab variable length record manager is a way to efficiently handle
variable length records. The strange name comes from the algorithm used. As
records (crabs) grow, they cast off their shells and find larger ones to
live in. Smaller records growing can then move into one of the cast-off shells.
The Interface
In DOS, Hermit would be a TSR you load before your application. In Java it would
be an ordinary class. It can handle files with records up to 65000 bytes long.
You need to design your application to use the Hermit API. Though you can Copy,
Move and Delete Hermit files with ordinary operating system commands, you cannot
edit them with ordinary word processors.
Hermit allows you to number your variable length records 0, 1, 2, .. 4,294,967,295.
You can leave some numbers out, but you should attempt to keep the highest
record number used as low as possible. Read the later Under the Hood to
find out why.
Advantages
The advantages of hermit crab files are:
- The code uses only a fraction of the RAM that an equivalent SQL engine would.
- HermitCrab is just another class, thus easy and compact to install.
- very fast lookup by int record number.
- allows variable length records, or variable length serialised objects. These
records can grow and shrink with little efficiency penalty.
Disadvantages
The limitations of hermit crab files are:
- they have no inherent searching ability. You have to search the records linearly
one by one.
- They have no indexes. You can build Hashtables from key to the prime int key.
These must be RAM resident. These are not part of the hermit crab file system
itself.
- You have to write the HermitCode java source yourself or hire someone (e.g. me,
hint hint) to do it for you.
- They do not have crash-recovery. SQL databases have automatic recovery to the
last transaction. You could copy the file periodically to backup to provide some
protection from crashes.
Interface Details
File Create
- You must specify the file name and the maximum possible record length.
- Optionally you may specify the maximum number of records that the file will hold,
and the average size in bytes of each record. These numbers are used to
preallocate space. They are not absolute limits on how big the file can grow.
- You may also specify the number of bytes of breathing room to use for each
record. This is padding space at the end of the record to allow it to grow
without having to move to a new shell.
File Open
- To open an existing file, simply provide the name.
Random Write
- You provide a reference to the record (array of bytes) you want to write and its
length. You also must provide the record number.
- Whether the record exists already or not is immaterial. It may grow or shrink
from its previous size.
- Writing a zero-length record DELETEs it. Its unused space will be reclaimed.
Random Read
- You provide provide the record number you want to read.
- Hermit allocates and returns a buffer.
Read Next
- Read next is similar to adding one to the record number, except that it
automatically jumps to the next record actually used.
- To read the whole file, start with a READ RANDOM on record 0, (or the low-water
mark derived from stats) then do READ-NEXTs.
- Note that there is so such thing as WRITE NEXT. It is safer for the programmer
to maintain the counter.
read Prev
- This is just like READ NEXT, except it works backwards. It is slower than READ
NEXT under Win95 because of the of the dippy way Microsoft handles the FAT
chains.
Get Stats
- Hermit returns the fully qualified name of the file, the count of how many
records exist in the file, the breathing room, the low water mark record number
used, the low water mark smallest record, the high water mark record number used,
the high water mark largest record, the total number of bytes in all data
records (not counting overhead), and average record size.
- Note that High Water marks include records that may now be deleted. To get
accurate stats, you must do a REORG.
Reorg
- REORG reorganizes a file for maximum efficiency, collecting small discarded "shells"
-- free space from deleted records, into one big pool of free space.
- optionally you may specify the maximum number of records that the reorganized
file will hold, and the average size in bytes of each record. These numbers are
used to preallocate space. They are not absolute limits on how big the file can
grow. Typically you would do a STATS and then add some growing room.
- You may also specify a new breathing room parameter to use for each record.
- REORG is a relatively slow process since each record must be copied to a new
file. When the copy is done, the old version is deleted, and the new one renamed
to the old. You need sufficient free disk space for two copies of the file.
Hermit will not even attempt the REORG unless there is sufficient room.
Under The Hood
- The key to the system is a large array of 32 byte pointers. You index this array
by record number, and get the offset on disk where that record currently lives.
The low 4 bits are always 0 since records are always aligned on even 16 byte
paragraph boundaries within the file. This gives every record an average of 8
bytes breathing room to grow and shrink without having to find a new shell.
- The array is the only thing Hermit keeps in RAM or RAM cache. If it has to grow,
it can build a new larger one on disk as needed. This is time consuming, so you
want to get good estimates of file size at CREATE or REORG time.
- The precise length of each record is also stored in the array of pointers to the
records. It takes 16 bits. We keep this information handy in the array rather
than with the record so that we can set up the physical read directly into the
user region in one single efficient i/o.
- The breathing room is not really what it appears. We arrange that shells come in
standard sizes. Let us say the breathing room were 8 bytes. If we aligned the
records always on a paragraph boundary — every 16 bytes, then on average
every record would have 8 bytes breathing room. Using this trick, we can save
the overhead of having to track both the shell size and the record size.
Whenever we see a record (crab-size) of say 100 bytes, we know the shell size
must be one size bigger 7*16 = 112 bytes.
- When records are created, we just tack them on the end and set the pointer to
them.
- When records are deleted, we add them to the free space chains. Since shells
come in only a limited number of sizes, we can build a separate chain through
the empty shells, one for each size. By building a LIFO queue of shells, all you
need to do is write a record into the deleted slot with the index of the
previous cast off shell of that size on it, — a single efficient disk i/o.
- To recycle the discarded shells, when we go to write a new record or expand a
new one, we first check if any free shells exist already of the desired size,
before creating one at the end of the file.
For very large files, you might split the logical file up into several smaller
physical files. This way they can span several physical hard disks with an read
head dedicated to each. Windows has notoriously slow random access on large
files because it chases linear FAT chains to find each cluster. Speed under
Windows can thus be improved by keeping physical random access files small.
Of course, if you have lots of RAM, you can keep the entire structure in memory,
and just pickle it between incarnations. In that case almost none of the Hermit
Crab structure is required. You rely on the garbage collection mechanism to find
your growing and shrinking records new recycled homes.
What is Wrong With This Scheme
- If a record shrinks, it must move to a smaller shell. Otherwise the relationship
between crab size and shell size would be broken.
- If no shells are available at the correct size, but they are available a little
too big, they cannot be used. Again the relationship between crab and shell must
be preserved.
- Eventually if say all records gradually grow, you will have all kinds of cast
off small shells that nobody wants. The only way to reclaim this space is to do
a Reorg. In theory it might be possible to do a mini Reorg to sort the array by
offset, then collapse pairs small shells into one big shell.
Indexing
You could build external hashing or Btree indexes to this file. They would
convert a key to a record number. The design of these indexes would be
simplified because they would not need to be updated when a record was moved,
only when a key value changed.
You might use a dual index — the base and the changes. On every reorg you
merge the changes back into the base. The base index can be a read-only
structure which makes it simpler to generate and more efficient to scan. The
changes index is small, so you can get away with less sophisticated indexing
techniques, perhaps even something as crude as a linear table or hashtable kept
in RAM.
With 100 MB RAMs becoming commonplace, you might simply implement all your
indexes as RAM-resident hashtables that cough up a record number to hand to
Hermit. A purely RAM-resident BTree is a much simpler animal to code that a disk
based one. You can use serialisation to load/store your RAM-based hashtables or
Btrees.
Compression
Records could be compressed and decompressed on write/read. Even with fixed
length data, compression creates varying length compressed results.
Quick & Dirty
Perhaps you would use a Hermit Crab file if somebody else coded it for you, but
you have not the patience to do it. Here is how you could whip up a quick and
dirty one. Use a Hashtable or in-RAM array that tells where each record starts
and its precise length in bytes. If a record changes value, but not length,
write it back to the same spot where it was before. If it grows or shrinks, even
so much as one byte, write a new record for it at the end of the file. When you
close the file, copy the live records to a new file (which reclaims space). Then
pickle the Hashtable or array in a companion file.
You might even use an external sort such as Opt-Tech Sort to rapidly reorder
your records into the proper physical order.