image provider

Case Range


Disclaimer

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, when all the facts you need are included, nothing extraneous is mentioned, the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems where it is up to you to fully the define the end point, or a series of ever more difficult versions of this project and research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; or give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

Consider the following highly efficient piece of code: It is a nightmare to write, proofread and debug. Here is the same logic expressed as if Java were able to optimise case range statements: See how much easier it is to proofread. The program is so simple and regular I actually generated the source for it with a little for loop.

Now your task is to take the second program and mechanically convert it to the first and, of course, any similar problems solved with case ranges. You want to generate an inline binary search on the ranges. Note that ranges must not overlap. They need not be in ascending order. Some ranges will not be specified; they will go to the default case. Some ranges are degenerate, with a single integer to both begin and end the range.

To make life easy for yourself, for a first cut, you may insist the input come in some structured form so you don’t have to formally parse it. To do it working with ordinary source, see parser in the Java & Internet Glossary.

For a fancier version, you want to insert stub comments about the range in the generated code so that the original range information is not lost. You want to be able to trust the complex if/else skeleton to deliver possibilities to the right code stub without having to stare at it.

You could in theory maintain the code by looking at the stub comments and just changing the stub code (not recommended). If you changed the ranges, you would have to take the original code (preserved as a single giant comment) and rerun it through your processor.

For a still fancier version, you might generate comments on every else not just the stubs, along the manner of Dirk’s.

One further complication. Even plain case labels without ranges are more efficiently implemented as nested ifs than as a bytecode dense jumptable (tableswitch), if there are three or fewer cases plus default. If your compiler is not smart enough to make that optimisation, you perhaps should generate the if form for short switches.

Implementation

How might you go about implementing this? First collect the low and high bounds of each range and sort them by low bound. See Sort in the Java & Internet Glossary for help on that. Then ensure there is no overlap. Create dummy ranges to fill in missing gaps that refer to the default code. Work recursively. Generate the big outside if else to split the table in two. Recursively generate the inner if/elses, each time splitting the section of the table in two further. You never use the high bound values and you can suppress a test for if (n >= Integer.MIN_VALUE) since it is always true. You stop the recursion when your remaining subtable has only one element. The efficiency of recursion is not an issue. The code you generate is not recursive.

If the ranges are of equal length, or if some of the ranges are of equal length, you can select the correct band with a division.

For a fancier version, look for runs of degenerate cases or very short ranges and handle them with conventional inner switches. That effectively collapses them down to a single range.

Dealing with defaults and dummy ranges is a little trickier than it first sounds. You could just repeat the default code for each dummy range. You can’t very well generate a GOTO, which is effectively how the Java compilers deals with various cases lumping together to use the same stub code.

The GOTO is not evil, if humans never use it directly. It should only be used by code generators.

You have the same problem with fall through. I think you should disallow fall-through except in cases where you are specifying multiple case labels for the same stub. To handle this efficiently, you really want this logic built into the compiler.

What you can do without GOTO or compiler support to deal with many duplicate stubs? Use your binary if tree with stubs of this form band=n; Then you generate a traditional switch(band) where you map band to code. That switch will have no duplicate band labels, though your tree may have several band=3; stubs.

You can try out your code to optimise programs such as WaveLength which categorises frequencies into bands of the visual spectrum and ISBN which categories ISBN (International Standard Book Number) numbers into formatting categories, Sound which categories sound samples into mu-law category bands, or The Credit Card Amanuensis that categories credit card number by vendor based on number range.

Strings as Case Labels

A similar problem is implementing a switch statement that allows String constants as case labels. In this case you might implement it as a cascaded if for simple switches, or as a HashMap for complex ones. You might use the hashCode as a case label and generate special tests when two strings collide on the same hashCode. You have and advantage over run time lookup in that you know all the Strings ahead of time. With a HashMap you lookup an int which you use as in a traditional switch.

Once you tackle that you might allow variables in your case ranges or Strings.

Once you tackle that, you might allow boolean expressions in your case labels.

A simple-minded implementation of this extended switch can be done in about 30 lines of FORTH. It makes code much more readable and maintainable.

For an efficient switch you want dense integers (i.e. 0..n) so that Java will compile the switch into the more effient tableswitch rather than the lookupswitch JVM (Java Virtual Machine) instruction.

It is possible, when you know the Strings ahead of time to come up with a hash function without collisions that produces only a narrow range of integers.

hashCode

This page is posted
on the web at:

http://mindprod.com/project/caserange.html

Optional Replicator mirror
of mindprod.com
on local hard disk J:

J:\mindprod\project\caserange.html
Canadian Mind Products
Please the feedback from other visitors, or your own feedback about the site.
Contact Roedy. Please feel free to link to this page without explicit permission.

IP:[65.110.21.43]
Your face IP:[18.220.13.15]
You are visitor number