This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.
This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned, and the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.
Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or give you any additional materials. I have too many other projects of my own.
Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.
You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.
Please do not email me about this project without reading the disclaimer above.
You write a program that backs up rarely used files over the Internet, using an ADSL (Asymmetric Digital Subscriber Line) connection to a server. Whenever space gets low, it frees disk space by deleting local copies of files that have already been backed up. When the user goes to open those files, there is a pause while they are retrieved from the server. This scheme also acts as backup. All files can be backed up. To speed backup, files can be heavily compressed before they are sent to the server, and only the changes sent as deltas.
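As a rough illustration of sending only changes, here is a minimal sketch of a fixed-block delta. It assumes the client remembers one CRC-32 per block from the previous backup and resends only the blocks whose checksum differs. The class name `BlockDelta`, its methods and the block size are hypothetical, not part of any existing program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

/** Hypothetical sketch: find which fixed-size blocks changed since the last backup. */
final class BlockDelta {

    /** Computes one CRC-32 per block; the last block may be shorter than blockSize. */
    static long[] blockChecksums(byte[] data, int blockSize) {
        int blockCount = (data.length + blockSize - 1) / blockSize;
        long[] sums = new long[blockCount];
        for (int i = 0; i < blockCount; i++) {
            int from = i * blockSize;
            int len = Math.min(blockSize, data.length - from);
            CRC32 crc = new CRC32();
            crc.update(data, from, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    /** Returns indexes of blocks in current that are new or whose checksum differs from previous. */
    static List<Integer> changedBlocks(long[] previous, byte[] current, int blockSize) {
        long[] now = blockChecksums(current, blockSize);
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < now.length; i++) {
            if (i >= previous.length || previous[i] != now[i]) {
                changed.add(i);
            }
        }
        return changed;
    }
}
```

This simple scheme only pays off when bytes are overwritten in place. An insertion near the start of a file shifts every later block, so a real implementation would probably want a rolling-checksum approach in the style of rsync. The server would also need to be told the new file length to handle a file that shrank.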
The backup server can conserve disk space by noticing that, for example, two customers both have identical copies of MS Word For Windows DLLs (Dynamic Link Libraries) installed. The server only needs to keep one copy of each DLL (Dynamic Link Library). It has to be careful. Customers may have identically named files that are not identical.
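One way to keep a single copy of identical files while staying safe against identically named files with different contents is to index the stored bytes by a digest of the content rather than by name. The following is a minimal in-memory sketch under that assumption. The class `DedupStore` and its methods are hypothetical, and SHA-256 is my choice of digest, not something this outline prescribes.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a server store that keeps one copy per distinct content. */
final class DedupStore {
    /** Content digest (hex) mapped to the single stored copy of those bytes. */
    private final Map<String, byte[]> blobs = new HashMap<>();
    /** "customer/path" mapped to the digest of that file's content. */
    private final Map<String, String> catalog = new HashMap<>();

    /** Records a customer's file. Returns true only if the bytes had to be stored. */
    boolean put(String customer, String path, byte[] content) {
        String key = digest(content);
        boolean isNew = blobs.putIfAbsent(key, content.clone()) == null;
        catalog.put(customer + "/" + path, key);
        return isNew;
    }

    /** Returns the content recorded for this customer's file, or null if unknown. */
    byte[] get(String customer, String path) {
        String key = catalog.get(customer + "/" + path);
        return key == null ? null : blobs.get(key).clone();
    }

    /** Number of distinct contents actually held. */
    int storedCopies() {
        return blobs.size();
    }

    /** Hex SHA-256 of the content; the file name plays no part in the key. */
    static String digest(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }
}
```

Two customers who upload the same DLL end up sharing one stored copy, while a third customer with a same-named but different file gets a separate one. A cautious server might still compare the bytes on a digest match before discarding an upload.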
This is a very old idea. Univac 1106 mainframes and DEC (Digital Equipment Corporation) PDP-10s with far less than a megabyte of RAM (Random Access Memory) used to integrate file migration with backup to tape.
Before you can restore, you need to free up space. This is done by dropping infrequently used files already backed up. Happily, you don’t have to take time out to back them up.
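Choosing which files to drop could look something like the sketch below: take the files that are already backed up, sort them by last use, and pick the oldest until enough space is covered. The `Evictor` class and the `Entry` record of a file are hypothetical. How you would actually learn a file's last-use time depends on the operating system.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: pick least recently used, already backed-up files to drop. */
final class Evictor {

    /** Hypothetical description of one local file known to the migration program. */
    static final class Entry {
        final String path;
        final long sizeBytes;
        final long lastUsedMillis;
        final boolean backedUp;

        Entry(String path, long sizeBytes, long lastUsedMillis, boolean backedUp) {
            this.path = path;
            this.sizeBytes = sizeBytes;
            this.lastUsedMillis = lastUsedMillis;
            this.backedUp = backedUp;
        }
    }

    /** Returns victims, oldest first, until their sizes cover bytesNeeded (or candidates run out). */
    static List<Entry> chooseVictims(List<Entry> files, long bytesNeeded) {
        List<Entry> candidates = new ArrayList<>();
        for (Entry e : files) {
            if (e.backedUp) {
                candidates.add(e); // never drop a file with no copy on the server
            }
        }
        candidates.sort(Comparator.comparingLong((Entry e) -> e.lastUsedMillis));
        List<Entry> victims = new ArrayList<>();
        long freed = 0;
        for (Entry e : candidates) {
            if (freed >= bytesNeeded) {
                break;
            }
            victims.add(e);
            freed += e.sizeBytes;
        }
        return victims;
    }
}
```

Because only backed-up files are candidates, dropping them needs no upload first, which is why freeing space can be fast.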
You could also implement this with backup to a CD-ROM (Compact Disc Read Only Memory) burner. To consolidate backups, so you don’t have to shuffle a zillion discs, you may need to periodically re-backup files that have not changed (which could entail temporarily restoring them).
You would probably want to use 64-bit checksums for end-to-end assurance that files were backed up and restored correctly. These also act as cookies for almost uniquely identifying files.
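The standard Java library has no ready-made 64-bit checksum class that I know of, so the sketch below makes an assumption: it takes the first 8 bytes of a SHA-256 digest as the 64-bit value. The class `Fingerprint64` and its methods are hypothetical names, and any other well-mixed 64-bit function would serve the same purpose.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: a 64-bit cookie derived from file content. */
final class Fingerprint64 {

    /** Returns the first 8 bytes of the SHA-256 digest, read as a big-endian long. */
    static long of(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            return ByteBuffer.wrap(hash).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    /** End-to-end check: restored bytes must reproduce the cookie recorded at backup time. */
    static boolean verify(byte[] restored, long expectedCookie) {
        return of(restored) == expectedCookie;
    }
}
```

The client would compute the cookie before upload, store it with the file's catalogue entry, and call `verify` after a restore. The "almost" in "almost uniquely" matters: two different files can in principle share a 64-bit value, so the cookie is an identifier and an integrity check, not a proof of identity.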
The only tricky technical challenge is the hook into the operating system. It only has to intercept file open. You don’t have to intercept directory reads or close. You leave tiny stub files behind in the directory as proxies for the files. I would tackle NT first. The hooks are much more formal, with no ways around them. By the time you are ready, Windows 2000 will be out with, I hope, similarly bulletproof hooks. You could then leave Win95 and Win98 on the trash heap of history. I must admit I have not studied the open hooks available, but I have studied the defragger interface. It is quite bulletproof and stable. I would hope the open hooks would be done similarly. In contrast, in Win95/98 there are many ways around the hooks, since low-level sector I/O is not restricted. You can cannibalise the Filemon vxd.
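The stub files mentioned above need some agreed format so the open hook can recognise them and know what to fetch. Nothing here specifies one, so the layout in this sketch (a 4-byte marker, the 64-bit cookie, then the original file size) is purely an assumption for illustration, as is the class name `StubFile`.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch of a stub file's contents: marker, cookie, original size. */
final class StubFile {
    /** Hypothetical marker: the ASCII letters "STUB". */
    static final int MAGIC = 0x53545542;
    /** Marker (4 bytes) + cookie (8 bytes) + original size (8 bytes). */
    static final int LENGTH = 20;

    final long cookie;
    final long originalSize;

    StubFile(long cookie, long originalSize) {
        this.cookie = cookie;
        this.originalSize = originalSize;
    }

    /** Serialises the stub into the 20 bytes left behind on disk. */
    byte[] encode() {
        return ByteBuffer.allocate(LENGTH)
                .putInt(MAGIC)
                .putLong(cookie)
                .putLong(originalSize)
                .array();
    }

    /** Parses stub bytes; returns null if the bytes do not look like a stub. */
    static StubFile decode(byte[] raw) {
        if (raw.length != LENGTH) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(raw);
        if (buf.getInt() != MAGIC) {
            return null;
        }
        long cookie = buf.getLong();
        long originalSize = buf.getLong();
        return new StubFile(cookie, originalSize);
    }
}
```

An ordinary 20-byte file could by coincidence begin with the same marker, so a real design would want a more reliable way to flag stubs, for example a file attribute, if the operating system offers one.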
Backup should be imperceptible, a low-priority background process. It might even stop completely when the computer or Internet connection is busy. You might simply monitor throughput and, if it is slower than some threshold, temporarily shut down. Backup would have relatively low overhead on the CPU, disk and Internet connection. It soaks up the upload side of the Internet channel, which is not normally very busy. Unfortunately, with ADSL the upload side is typically not nearly as fast as the download side. However, that works to your advantage when it comes time to restore.
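The throughput check described above might be as simple as the following sketch: time each uploaded chunk and pause when the measured rate drops below a threshold. The class `BackupThrottle`, its method names and whatever threshold you pick are all hypothetical and would need tuning against a real connection.

```java
/** Hypothetical sketch: decide when the background backup should back off. */
final class BackupThrottle {
    private final double minBytesPerSecond;

    BackupThrottle(double minBytesPerSecond) {
        this.minBytesPerSecond = minBytesPerSecond;
    }

    /**
     * Returns true when the last chunk moved slower than the threshold,
     * which this sketch takes as a sign the machine or the link is busy.
     */
    boolean shouldPause(long bytesSent, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return false; // too quick to measure; assume the link is idle
        }
        double bytesPerSecond = bytesSent * 1000.0 / elapsedMillis;
        return bytesPerSecond < minBytesPerSecond;
    }

    /** Wraps the backup loop in a lowest-priority daemon thread so it yields to real work. */
    static Thread newBackgroundThread(Runnable backupLoop) {
        Thread t = new Thread(backupLoop, "backup");
        t.setPriority(Thread.MIN_PRIORITY);
        t.setDaemon(true);
        return t;
    }
}
```

The backup loop would call `shouldPause` after each chunk and sleep for a while when it returns true. Thread priority is only a hint to the scheduler, so the throughput check does most of the work of staying out of the user's way.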
You would do a manual recall on a directory to revive a cold project so that all files are ready to go rather than dribbling in as you open them. You could even do this on a different machine if you knew the password. See how this project evolves into a scheme where you can sit down at any machine in the world and your logical desktop is sitting there ready for you.
For a simplified initial version, you could handle migration but not backup. You keep at most one backup of each file on the server. The ISP (Internet Service Provider) then needs to track files only by cookie.
For the full backup version, the ISP needs to maintain a list of what backups exist on mass backup organised in a directory structure matching the customer’s, so the user can selectively restore individual files by date. Further, the user should be able to say, "Put this directory back the way it was as of March 26 1999".
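To restore a directory as of a given date, the catalogue needs to answer "which backup of this file was the newest on or before that date?" A sorted map per file does this directly. The sketch below assumes one backup per file per day and identifies each backup by its 64-bit cookie. The class `BackupCatalog` and its methods are hypothetical.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the server-side list of backups, queryable by date. */
final class BackupCatalog {
    /** File path mapped to its history: backup date mapped to the cookie of that version. */
    private final Map<String, TreeMap<LocalDate, Long>> versions = new HashMap<>();

    /** Notes that a backup of path with the given cookie was taken on the given date. */
    void record(String path, LocalDate date, long cookie) {
        versions.computeIfAbsent(path, p -> new TreeMap<>()).put(date, cookie);
    }

    /** Cookie of the newest backup of path taken on or before asOf, or null if none. */
    Long cookieAsOf(String path, LocalDate asOf) {
        TreeMap<LocalDate, Long> history = versions.get(path);
        if (history == null) {
            return null;
        }
        Map.Entry<LocalDate, Long> entry = history.floorEntry(asOf);
        return entry == null ? null : entry.getValue();
    }

    /** Every file under dirPrefix that existed as of the date, mapped to the cookie to restore. */
    Map<String, Long> directoryAsOf(String dirPrefix, LocalDate asOf) {
        Map<String, Long> snapshot = new TreeMap<>();
        for (String path : versions.keySet()) {
            if (path.startsWith(dirPrefix)) {
                Long cookie = cookieAsOf(path, asOf);
                if (cookie != null) {
                    snapshot.put(path, cookie);
                }
            }
        }
        return snapshot;
    }
}
```

Calling `directoryAsOf` with a directory prefix and `LocalDate.of(1999, 3, 26)` gives the set of versions to fetch for the example request above. This sketch does not record deletions, so a file deleted before the requested date would wrongly come back. A fuller version would store a tombstone entry for each deletion.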
This scheme is not going to work without a 24-hour ADSL or faster connection. Restore would be too slow. You want backups to be scheduled at any time at the convenience of the ISP. The ISP is typically backing up the least recently used files. If a user wanted to use one of these files, the backup would be automatically abandoned. You thus have very little interaction between the backup and the files the user is working on.
What about a purely local version where the user is responsible for making the backups? You can get about a gig per CD with compression. ZIP drives are too tiny to bother with. Tape drives are too slow to search to bring back files.
I think it wisest to tackle this first as an Internet service. That gets rid of many headaches.
Stavros Macrakis macrakis@alum.mit.edu hopes to find someone to implement this for under . See the Automatic File Update project for hints on implementation details.
This page is posted at http://mindprod.com/project/infinitedisk.html