Serialized File Recovery

by Roedy Green ©1996-2008 Canadian Mind Products

This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. It does not describe an actual complete program. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; I have too many other projects of my own.

I do contract work for a living, which could include writing a program such as this. However, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project any way you please.

The Problem

Despite the best intentions, sometimes you can find yourself with serialized data but neither the class files nor the source code needed to reconstitute it. This can happen because:
  1. You were naive and did not realise the need to keep the corresponding Java source and class files for all your serialised files.
  2. Backup files were lost in a fire or were stolen.
  3. You were careless about accounting for all data files and keeping them all up to date, with the latest class definition. You have all the code in CVS, but you have no idea which of it to use.
  4. You were sloppy and simply did not bother to bring all your old serialised files up to date every time you changed the format. Before you knew it, you had lost track of which files used which format.
  5. You added serialVersionUIDs, but other than that, you did not change the object structure. Now you can’t read your old files. You don’t remember what the old default serialVersionUIDs were. If you have made absolutely no other changes, you might luck out if you create matching class files by simply removing the serialVersionUIDs from your new class sources. If that does not work, now what?
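
If you still have a candidate class file, you can ask the JVM which serialVersionUID it will use for it and compare that with what the old file expects. The following is only a minimal sketch: ShowSerialVersionUID, OldRecord and NewRecord are hypothetical names standing in for your own classes. Note that the default value is computed from the compiled class, so it can differ between compilers, which is part of why removing the serialVersionUIDs only might work.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class ShowSerialVersionUID {
    // Hypothetical class with no explicit serialVersionUID: the JVM computes a default.
    static class OldRecord implements Serializable {
        int id;
        String name;
    }

    // Hypothetical class with an explicit serialVersionUID.
    static class NewRecord implements Serializable {
        private static final long serialVersionUID = 42L;
        int id;
        String name;
    }

    // Returns the serialVersionUID the serialization machinery will use for the class.
    static long uidOf(Class<?> c) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(c);
        if (desc == null) {
            throw new IllegalArgumentException(c.getName() + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println("OldRecord (default)  : " + uidOf(OldRecord.class) + "L");
        System.out.println("NewRecord (declared) : " + uidOf(NewRecord.class) + "L");
    }
}
```

Once you know the number an old file expects, you can declare it explicitly as serialVersionUID in the class, provided the fields themselves have not changed.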

The Tools

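No matter why you can’t recover the file, what can you do now? I strongly suspect there is sufficient metadata embedded in a serialized file to reconstruct class files good enough to recover the data: each class descriptor in the stream records the class name, its serialVersionUID and the names and types of its serialised fields. So this project has several parts: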
  1. Study the spec for the serialised data format that Java ObjectOutputStream uses. To fully understand it, you will have to do some experiments.
    See Sun’s JDK Platform Guide to the serialisation protocol.
  2. Write a tool to create a set of Java source files for the classes used in an ObjectStream, based only on the clues it finds embedded in the stream. These classes would be bare bones: no methods, no transients, no initialisation code, no custom readObject/writeObject code. However, they might be enough of a skeleton to recreate the original classes, or at least enough to get the raw data back out.
  3. Write a tool to tell you summary facts about an unknown ObjectStream file such as the classes used and the serialVersionUIDs.
  4. Write a tool to convert an ObjectStream to something human-readable for analysis, such as XML.
  5. Write a scavenger tool that contains a class that behaves exactly like ObjectStreamClass except it does not complain if the serialVersionUIDs don’t match. You can then read ObjectStreams, in a rough and ready way, scavenging as much as possible from them.
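
As a starting point for the summary tool in item 3, here is a rough sketch that reads only the first class descriptor of a stream and reports the class name, serialVersionUID and field list. StreamPeek, ClassInfo and Sample are hypothetical names invented for this illustration. It assumes the stream begins with an ordinary object and deliberately ignores superclass descriptors, arrays, enums, proxies and everything after the field list, so a real tool would have to follow the full grammar in the spec.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class StreamPeek {
    /** Facts recovered from the first class descriptor in a stream. */
    static class ClassInfo {
        String name;
        long serialVersionUID;
        final List<String> fields = new ArrayList<>();
    }

    /** Hypothetical sample class, used only to produce a stream to inspect. */
    static class Sample implements Serializable {
        private static final long serialVersionUID = 7L;
        int id = 1;
        String label = "x";
    }

    static ClassInfo peek(byte[] stream) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream));
        if (in.readShort() != ObjectStreamConstants.STREAM_MAGIC) {
            throw new IOException("not a Java serialization stream");
        }
        in.readShort(); // stream version
        // Assumption: the first item is an object introduced by a new class descriptor.
        if (in.readByte() != ObjectStreamConstants.TC_OBJECT
                || in.readByte() != ObjectStreamConstants.TC_CLASSDESC) {
            throw new IOException("first item is not an object with a new class descriptor");
        }
        ClassInfo info = new ClassInfo();
        info.name = in.readUTF();              // class name, 2-byte length + modified UTF-8
        info.serialVersionUID = in.readLong();
        in.readByte();                         // descriptor flags, ignored in this sketch
        int fieldCount = in.readShort();
        for (int i = 0; i < fieldCount; i++) {
            char typeCode = (char) in.readByte();  // e.g. 'I' for int, 'L' for object
            String fieldName = in.readUTF();
            String type = String.valueOf(typeCode);
            if (typeCode == 'L' || typeCode == '[') {
                // Object and array fields carry their type signature as a string item.
                byte tag = in.readByte();
                if (tag == ObjectStreamConstants.TC_STRING) {
                    type = in.readUTF();
                } else if (tag == ObjectStreamConstants.TC_REFERENCE) {
                    type = "(back reference 0x" + Integer.toHexString(in.readInt()) + ")";
                } else {
                    throw new IOException("unexpected tag in field type: " + tag);
                }
            }
            info.fields.add(type + " " + fieldName);
        }
        return info;
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        ClassInfo info = peek(serialize(new Sample()));
        System.out.println(info.name + " serialVersionUID=" + info.serialVersionUID + "L");
        for (String f : info.fields) {
            System.out.println("  " + f);
        }
    }
}
```

The same information (class name, serialVersionUID, field names and type signatures) is what the source-generating tool in item 2 would need in order to emit a bare-bones class skeleton.
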
How would you go about this? You start by using ObjectOutputStream to write simple primitives and objects, studying the resulting file with a hex viewer, and reading the spec to learn the format. Then later you test your code on various unknown serial files given to you by others for recovery.
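
A first experiment of this kind might look like the sketch below, which writes a String and an int and dumps the bytes in hex so you can match them against the spec. HexExperiment and its method names are hypothetical; it writes to memory rather than to a file only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class HexExperiment {
    // Serialises one object with writeObject and returns the raw stream bytes.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Serialises one primitive int with writeInt and returns the raw stream bytes.
    static byte[] serializeInt(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(value);
        }
        return bytes.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex, like a simple hex viewer.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // AC ED = stream magic, 00 05 = version, 74 = string tag, 00 02 = length, 48 69 = "Hi"
        System.out.println(hex(serialize("Hi")));   // AC ED 00 05 74 00 02 48 69
        // 77 = block-data tag, 04 = block length, then the four bytes of the int
        System.out.println(hex(serializeInt(1)));   // AC ED 00 05 77 04 00 00 00 01
    }
}
```

From there you can move on to small classes of your own, changing one field at a time and watching how the dump changes.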

You might not even release the code, just go into the serial file recovery business.


You can get a fresh copy of this page from: http://mindprod.com/project/serializedrecovery.html