Case Range  Case Range

go to home page Student Projects full screen, hide local find menu Google search web for more information on this topic jump to foot of page translate this page with Babelfish by Roedy Green ©1996-2009 Canadian Mind Products
This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; I have too many other projects of my own.

I do contract work for a living, which could include writing a program such as this. However, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavor.

Consider the following highly efficient piece of code:
It is a nightmare to write, proofread and debug. Here is the same logic expressed as if Java were able to optimise case range statements:
See how much easier it is to proofread. The program is so simple and regular I actually generated the source for it with a little for loop.

Now your task is to take the second program and mechanically convert it to the first, and of course any similar problems solved with case ranges. You want to generate an inline binary search on the ranges. Note that ranges must not overlap. They need not be in ascending order. Some ranges will not be specified; they will go to the default case. Some ranges are degenerate, with a single integer to both begin and end the range.

To make life easy for yourself, for a first cut, you may insist the input come in some structured form so you don’t have to formally parse it. To do it working with ordinary source, see parser in the Java & Internet Glossary.

For a fancier version, you want to insert stub comments about the range in the generated code so that the original range information is not lost. You want to be able to trust the complex if/else skeleton to deliver possibilities to the right code stub without having to stare at it.

You could in theory maintain the code by looking at the stub comments and just changing the stub code (not recommended). If you changed the ranges, you would have to take the original code (preserved as a single giant comment), and rerun it through your processor.

For a still fancier version, you might generate comments on every else not just the stubs, along the manner of Dirk’s.

One further complication. Even plain case labels without ranges are more efficiently implemented as nested ifs than as a bytecode dense jumptable (tableswitch), if there are three or fewer cases plus default. If your compiler is not smart enough to make that optimisation, you perhaps should generate the if form for short switches.

Implementation

How might you go about implementing this? First collect the low and high bounds of each range and sort them by low bound. See Sort in the Java & Internet Glossary for help on that. Then ensure there is no overlap. Create dummy ranges to fill in missing gaps that refer to the default code. Work recursively. Generate the big outside if else to split the table in two. Recursively generate the inner if/elses, each time splitting the section of the table in two further. You never use the high bound values, and you can suppress a test for if (n >= Integer.MIN_VALUE) since it is always true. You stop the recursion when your remaining subtable has only one element. The efficiency of recursion is not an issue. The code you generate is not recursive.

If the ranges are of equal length, or if some of the ranges are of equal length, you can select the correct band with a division.

For a fancier version, look for runs of degenerate cases or very short ranges and handle them with conventional inner switches. That effectively collapses them down to a single range.

Dealing with defaults and dummy ranges is a little trickier than it first sounds. You could just repeat the default code for each dummy range. You can’t very well generate a GOTO, which is effectively how the Java compilers deals with various cases lumping together to use the same stub code.

The GOTO is not evil, if humans never use it directly. It should only be used by code generators.

You have the same problem with fall through. I think you should disallow fall-through except in cases where you are specifying multiple case labels for the same stub. To handle this efficiently, you really want this logic built-in to the compiler.

What you can do without GOTO or compiler support to deal with many duplicate stubs? Use your binary if tree with stubs of this form band=n; Then you generate a traditional switch(band) where you map band to code. That switch will have no duplicate band labels, though your tree may have several band=3; stubs.

You can try out your code to optimise programs such as WaveLength which categorises frequencies into bands of the visual spectrum, and ISBN which categories ISBN numbers into formatting categories, Sound which categories sound samples into mu-law category bands, or The Credit Card Amanuensis that categories credit card number by vendor based on number range.

Strings as Case Labels

A similar problem is implementing a switch statement that allows String constants as case labels. In this case you might implement it as a cascaded if for simple switches, or as a HashMap for complex ones. You might use the hashCode as a case label, and generate special tests when two strings collide on the same hashCode. You have and advantage over run time lookup in that you know all the Strings ahead of time. With a HashMap you lookup an int which you use as in a traditional switch.

Once you tackle that you might allow variables in your case ranges or Strings.

Once you tackle that, you might allow boolean expressions in your case labels.

A simple-minded implementation of this extended switch can be done in about 30 lines of FORTH. It makes code much more readable and maintainable.

For an efficient switch you want dense integers (i.e. 0..n) so that Java will compile the switch into the more effient tableswitch rather than the lookupswitch JVM instruction.

It is possible, when you know the Strings ahead of time to come up with a hash function without collisions that produces only an narrow range of integers.

hashCode

CMP homejump to top
CMP logo
feedback Please email your feedback for publication, errors, omissions, broken/redirected link reports
and suggestions to improve this page to Roedy Green : feedback email
made with CSS
HTML Checked!
ICRA ratings logo
mindprod.com IP:[65.110.21.43]
Your face IP:[38.103.63.58]
You are visitor number 7,516.
You can get a fresh copy of this page from: or possibly from your local J: drive (Java virtual drive/mindprod.com website mirror)
http://mindprod.com/project/caserange.html J:\mindprod\project\caserange.html