LazyString  LazyString

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, when all the facts you need are included, nothing extraneous is mentioned, the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems where it is up to you to fully the define the end point, or a series of ever more difficult versions of this project, and research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; or give you any additional materials. I have too many other projects of my own.

Though I am a programmer, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

LazyString is like a cross between String and StringBuffer especially good for processing long String s such as in-RAM copies of files. It would be particularly efficient at doing a series of bulk search and replace operations.

Why lazy?

  1. It procrastinates the work of copying characters about. In the process, it saves considerable work copying and creating String and StringBuffer objects.
  2. LazyString does almost no work itself. It fobs the bullwork off on its older brothers String and StringBuffer .
How does it work? It maintains the String internally as a LinkedList of String fragments. Each fragment contains a reference to a String, a starting point in the String and the ending point in the String, i.e. the parameters you would pass to String.substring to access a piece of some bigger String..

Only when you are done and use the LazyString.toString method do you collect all the fragments together contiguously and convert it to a String. You implement that with a series of StringBuffer.append s and a StringBuffer.toString. I refer to this process as collapsing. LazyString is at liberty to at any time collapse the entire String or any part of it. It might do this when the ratio of number of fragments to the total size of the fragments is greater than some magic number where the optimal value of that number is found by experiment. (It would neat to have a general purpose HotSpot-style optimiser that would tweak such constants to suit the current data. It could then apply what it had learned from previous experiments to instantly set the tweaking constants for any given set of data.) You should make the magic number optionally configurable in the constructor because it depends on just which methods you use most. By setting the number to 0, you effectively make it act like a simple String. You might use -1 as a special code to make it behave like a simple StringBuffer.

The overhead of using LazyString does not pay back unless you are handling large String s, but it should be orders of magnitude faster then.

You want to implement as many of the methods of String and StringBuffer as you have energy for. Don’t make the LazyString class final unless you provide source. Others may want to add their own convenience methods. See the StringBuffer student project.

Comparison of LazyString String and StringBuffer
Method LazyString String StringBuffer Notes
append very fast very slow slow To append, i.e. concatenate, String allocates a new StringBuffer containing an array of characters and copies character by character to the tail end of the array. LazyString just adds a node to the end of the LinkedList .
substring fast very fast slow String does not copy any characters to implement substring. LazyString has to copy the list of nodes comprising the substring, but not the characters themselves.
indexOf(char) very slow slow slow LazyString fobs the work off on String.indexOf for each fragment.
indexOf(String) very slow slow slow LazyString fobs the work off on String.indexOf for each fragment. Dealing with finding strings that span fragments is difficult, but necessary. Don’t be tempted to implement this with String.toString then use String.indexOf or you defeat the point of LazyString. If you want to get clever, use Boyer Moore.
charAt(int) fast very fast very fast LazyString fobs the work off on String.charAt for each fragment. To find the right fragment, LazyString needs to construct an array of bands to track which part of the String is covered by each fragment and pass that to java.util.Arrays.binarySearch. You can procrastinate creating the array until it is needed. Alternatively you might use a TreeMap to track the fragments and find the one corresponding to a given offset.
concatenation
Rope class

CMP homejump to top You can get the freshest copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/project/lazystring.html J:\mindprod\project\lazystring.html
logo
Please email your , letters to the editor, errors, omissions, typos, formatting errors, ambiguities, unclear wording, broken/redirected link reports, suggestions to improve this page or comments to Roedy Green : feedback email. If you want your message, your name or email kept confidential, not considered for public posting, please explicitly specify that. Unless you state otherwise, I will treat your message as a letter to the editor that I may or may not publish in the feedback section. After that, it will be too late to retract it. If you disagree with something I said, please quote it and cite the web page where you found it, tell me why you think it is wrong, and, if possible, provide some supporting evidence. Threatening to kill me or spouting obscenities has yet to persuade me to change my mind.
mindprod.com IP:[65.110.21.43]
view BlogYour face IP:[38.107.179.210]
You are visitor number 7,310.