SCID

Disclaimer

This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have prepared to help you is right here.

This project outline is not like the artificial, tidy little problems you are spoon-fed in school, when all the facts you need are included, nothing extraneous is mentioned, the answer is fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems where it is up to you to fully define the end point, or a series of ever more difficult versions of this project, and research the information yourself to solve them.

Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; or give you any additional materials. I have too many other projects of my own.

Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.

Please do not email me about this project without reading the disclaimer above.

Java Source Code SCID-style browser/editor

“If builders built buildings the way programmers write programs, then the first woodpecker that came along would destroy civilization.”
~ Weinberg’s Second Law (1933-10-27 age:84)

“An invasion of armies can be resisted, but not an idea whose time has come.”
Victor Hugo (1802-02-26 1885-05-22 age:83), born 1852, Histoire d’un Crime

I mean, source code in files; how quaint, how seventies!
~ Kent Beck (1961 age:56), evangelist for extreme programming.

SCID stands for Source Code In Database. This is one of many student projects. We have been teaching our customers to regard their data as a precious resource that should be milked and reused by finding many possible ways of summarising, viewing and updating it. However, we programmers have not yet learned to treat our source code as a similar structured data resource.

Programs are abstract structured data. They map better onto 3D visual structures than they do onto linear streams of characters. We have to gently break the stranglehold of the written language metaphor for programs before we can make any major progress. An IDE (Integrated Development Environment) gingerly decorates text with graphics. We need to evolve that to a SCID with graphics gingerly decorated with text, where the dynamic graphics tell nearly all the story.

This is an enormous project, but you could start small. The basic idea is you pre-parse your code and put it in a database. The problem is programs are getting huger and huger. We need tools to help you temporarily ignore most of them so you can concentrate on your immediate needs. We need tools to rapidly navigate programs. We need tools to help you get a mental forest picture before delving into the tree detail.
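To make the pre-parse idea a little more concrete, here is a minimal sketch of source held as rows of parsed nodes rather than as a text file, with a view that leaves out whatever kinds of node you are not interested in. Everything in it (the `ScidStore` class, the `Kind` values, the flat depth-based layout) is a hypothetical illustration, not a design the essay specifies. A real SCID would need a proper parser and a real database behind it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a program stored as parsed rows instead of a stream of characters. */
final class ScidStore {
    /** The node kinds this toy store knows about (an assumption for illustration). */
    enum Kind { CLASS, METHOD, LOOP, STATEMENT, COMMENT }

    /** One row in the "database": what the node is, how deeply it nests, how to display it. */
    static final class Node {
        final Kind kind;
        final int depth;
        final String text;

        Node(Kind kind, int depth, String text) {
            this.kind = kind;
            this.depth = depth;
            this.text = text;
        }
    }

    private final List<Node> rows = new ArrayList<>();

    void add(Kind kind, int depth, String text) {
        rows.add(new Node(kind, depth, text));
    }

    /** Renders a view of the program that omits every node of the hidden kinds. */
    String render(Kind... hidden) {
        StringBuilder sb = new StringBuilder();
        for (Node n : rows) {
            boolean skip = false;
            for (Kind h : hidden) {
                if (n.kind == h) {
                    skip = true;
                }
            }
            if (skip) {
                continue;
            }
            for (int i = 0; i < n.depth; i++) {
                sb.append("    ");
            }
            sb.append(n.text).append('\n');
        }
        return sb.toString();
    }

    /** Builds a tiny sample program for demonstration. */
    static ScidStore sample() {
        ScidStore s = new ScidStore();
        s.add(Kind.CLASS, 0, "class Kettle");
        s.add(Kind.COMMENT, 1, "// heats the water");
        s.add(Kind.METHOD, 1, "void boil()");
        s.add(Kind.LOOP, 2, "while (temperature < 100)");
        s.add(Kind.STATEMENT, 3, "heat();");
        return s;
    }
}
```

Calling `ScidStore.sample().render(ScidStore.Kind.COMMENT, ScidStore.Kind.STATEMENT)` gives a "just the loop structure" view of the sample. Note the simplification: hiding a node here does not hide its children, which a tree-shaped store would handle more gracefully.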

I have been talking up the SCID idea since the early 70s. Mostly people have just hooted with derisive laughter. However, SCID-think is gradually catching on. The RAD (Rapid Application Development) tools, such as Visual Café, IBM (International Business Machines) Visual Age and Inprise JBuilder, let you write code to control the properties of widgets on the screen by right clicking on visual elements to view the associated properties. You can tick off entries in pop-up listboxes and checkboxes or fill in the blanks. This is an important step away from thinking of programs strictly as linear streams of ASCII (American Standard Code for Information Interchange) characters. Java Studio lets you view and write Java code by playing plumber, visually connecting JavaBeans.

I think it is a case of the shoemaker’s children having no shoes. Programmers, in creating source code in linear text files, do the equivalent of keeping their accounting books using a CP/M (Control Program for Microcomputers) WordStar text editor. We would never dream of handing a customer such error prone tools for manipulating such complicated cross-linked data as source code. If a customer had such data, we would offer a GUI-based data entry system with all sorts of point and click features, extreme data validation and ability to reuse that data, view it in many ways and search it by any key.

Once you have your program pre-parsed, you can display the program in a variety of ways. Here are just a few examples:

Make the beginnings of methods more visually obvious. The plain Java syntax tends to camouflage them. Ditto for temporary variable declarations.
Hide comments.
Show just loop structure.
Show just code involved with class X.
Hide all code that involves calls to the java.awt package.
Highlight all code that does bit shifting or division.
Highlight all uses of the sin method, but only in class Moses, not Math.
Gray out all code that deals with error handling. All that is left is the normal case.
Colour code so that the darker the shade, the more frequently it is executed.
Show me a switch statement as if it had been handled with a set of subclasses. There is underlying deep structure here. I should be able to view the code as if it had been done with switch or as if it had been done with polymorphism. Sometimes you are interested in all the facts about Dalmatians. Sometimes you are interested in comparing all the different ways different breeds of dogs bury their bones. Why should you have to pre-decide on a representation that lets you see only one point of view?
Sometimes you want to see all the code concerning the save button. Other times you want to see only code involving hooking up listeners for all buttons/menu items. Other times you are interested only in code that affects layout. The SCID can hide non relevant code. It can also dynamically reorder code for the current purpose. You no longer have to decide whether to bundle all your layout code together or to bundle it with the corresponding button instantiation. You can have it both ways!
Show me a set of subclasses as if the code had been handled by a giant switch. This lets me compare the equivalent code in all subclasses. Similarly let me compare just two subclasses. This is like a DIFF utility that notices the differences between two methods and generates a single program that handles both, taking advantage of commonalities.
Show me the logic as if it had been written with a PET decision table. Here you have a list of conditions, then a list of actions. The SCID can ensure that all possible combinations are covered and you can easily proofread the logic originally coded in traditional nested if logic. You can write a decision table and have the SCID convert it to nested if logic or a giant switch indexed by concatenated binary condition representations. Here is a simple example:
```
CONDITIONS
kettleFull()    - Y Y Y N N N
makingTea()     N Y N Y Y N Y
makingCoffee()  N N Y Y N Y Y
ACTIONS
addWater()      - - - - X X X
boilWater()     - X X X X X X
addCoffee()     - - X X - X X
addTea()        - X - X X - X
```
The SCID might generate code straight from such a table, one case per column. Better logic still would merge test conditions and actions to reduce code size and to avoid computing a condition when it did not matter.
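As a rough illustration of the tea and coffee table, here is a sketch of both forms: a naive switch indexed by the three conditions packed into bits, and a merged version that tests each condition only where it matters. The class name `TeaTable` is hypothetical, and the actions are recorded as strings rather than called, purely so that the two forms can be compared. Treat it as one plausible translation, not the output of any real tool.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of code a SCID might generate from the tea/coffee decision table. */
final class TeaTable {
    /** Naive form: one case per table column, indexed by the packed Y/N answers. */
    static List<String> viaSwitch(boolean kettleFull, boolean makingTea, boolean makingCoffee) {
        List<String> actions = new ArrayList<>();
        // Bit 2 = kettleFull, bit 1 = makingTea, bit 0 = makingCoffee.
        int index = (kettleFull ? 4 : 0) | (makingTea ? 2 : 0) | (makingCoffee ? 1 : 0);
        switch (index) {
            case 0: // N N N
            case 4: // Y N N: first column, where kettleFull is "don't care"
                break;
            case 6: // Y Y N
                actions.add("boilWater");
                actions.add("addTea");
                break;
            case 5: // Y N Y
                actions.add("boilWater");
                actions.add("addCoffee");
                break;
            case 7: // Y Y Y
                actions.add("boilWater");
                actions.add("addCoffee");
                actions.add("addTea");
                break;
            case 2: // N Y N
                actions.add("addWater");
                actions.add("boilWater");
                actions.add("addTea");
                break;
            case 1: // N N Y
                actions.add("addWater");
                actions.add("boilWater");
                actions.add("addCoffee");
                break;
            case 3: // N Y Y
                actions.add("addWater");
                actions.add("boilWater");
                actions.add("addCoffee");
                actions.add("addTea");
                break;
            default:
                throw new AssertionError("unreachable index " + index);
        }
        return actions;
    }

    /** Merged form: conditions and actions folded together, so kettleFull is not consulted when idle. */
    static List<String> merged(boolean kettleFull, boolean makingTea, boolean makingCoffee) {
        List<String> actions = new ArrayList<>();
        if (makingTea || makingCoffee) {
            if (!kettleFull) {
                actions.add("addWater");
            }
            actions.add("boilWater");
            if (makingCoffee) {
                actions.add("addCoffee");
            }
            if (makingTea) {
                actions.add("addTea");
            }
        }
        return actions;
    }
}
```

Because the table covers all eight combinations, the two methods can be checked against each other exhaustively, which is the kind of completeness check the SCID could run for you.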
Let me switch rapidly back and forth between different representations of my code. I would like to see a high level CASE view, e.g. Warnier-Orr diagram or your favourite flavour of UML (Unified Modeling Language) or sequence diagrams and then zoom in on coding detail, something like TogetherJ offers. Let me see a flow chart of the program’s basic loop structure, then zoom in on part of it. When a project starts, typically all the energy is focused on the UML, specifications and the bird’s eye view. As the project progresses, the energy is focused on the detail and entropy gradually destroys the high level documentation. It is not kept in sync. The high level documentation becomes worse than useless to orient an incoming maintenance programmer. The spec, the UML, the high level stuff, the code, various levels of comment detail and the end user docs must be more closely integrated so that you can navigate at any level. All levels must be kept up to date and in sync. The navigation function provides motive to keep the high level docs accurate. On the other end of the spectrum, let me zoom down and examine the byte code or machine code.
We make the error of thinking computer programs are primarily for communicating with computers. On a project that requires more than one person, the source code is primarily for communicating between people. The SCID gives you a mechanism to record information only of interest to people and to help you manage that information overload.

With Java version 1.5 enums, you want an aligned grid so you can study the enum constructors either by row or by column so that you can compare enums for a certain property, or study the properties of one particular enum. You would like the grid to behave as a table in HTML (Hypertext Markup Language) or better still as a spreadsheet, with non scrolling headings, adjustable column widths so you can squeeze the most information on the screen without scrolling. It wraps within cells. It would look something like this:

/**
* constants for Application categories
*/
public enum AppCat {

Application category enums
Enum		shortName		description		aliases
APPLET	(	Applet	,	Java Applet	,	applet	),
APPLICATION	(	application	,	Java application	,	application	),
DOCUMENTATION	(	documentation	,	documentation	,	documentation	),
HYBRID	(	hybrid	,	Java Applet that can also be run as an application	,	hybrid	),
JWS	(	Java Web Start	,	Java Web Start	,	jws, weblet, webstart, jaws	),
LIBRARY	(	Class	,	Class library	,	class, classes, library	),
SERVLET	(	servlet	,	Java Servlet	,	servlet	),
UTILITY	(	Utility	,	non-Java Utility	,	utility	);

…
}

Show me the definition of this variable or method just by clicking it.
Tell me which classes and methods call this method and how many times. This XREF (Cross Reference) is always up to date because the source is in a database. It does not need to be scanned periodically to create a fresh XREF.
Optionally show me code with all class names fully qualified by package, or remove that qualification for all or some classes.
Tell me which classes and methods look at this variable and how many times. Similarly, tell me which ones change it.
Collapse/expand level of detail, e.g. collapse detail of CASE bodies, LOOP bodies, IF/ELSE bodies, parameter details leaving just the names of the methods being called, collapse purely arithmetic assignments.
By using specially coded comments you can hide/reveal various classes of them. You can hide code and just read comments, or perhaps just see the overview comments, or just the comments explaining what the various classes are for etc. The key is to show you just the level of detail in comments you need for the current task without being overwhelmed with irrelevant detail. You could configure which categories of comments you wanted to see fully expanded and which you wanted revealed only by hoverhelp and which you wanted totally suppressed. By using hoverhelp to display comments you free up screen real estate to see more code at once on screen. You could implement comment hiding/hoverhelp without a SCID using a smart traditional text editor using:
```
/* [1!] */
```
markers to tag comments with level of detail and importance (or severity as Chuck Sheehan, the technique’s inventor, calls it.)
```
/* 1=overview .. 9=detail
!=important */
```
Programmers very familiar with the code might be less likely to remove Javadoc or complain about it, if they could get it off their screens. The main drawback of doing this is out of sight, out of mind. Cowboy coders would be even less likely to keep comments in sync with the code.
When I write the code to call a method, show me the names, types and Javadoc for the parameters.
Show me the names of the parameters next to the parameters themselves in each method invocation so I can proof-read it, the way you can in Ada-95:
```
drawCircle( x => point.x, y => point.y, radius => 5 );
```
or Modula-3:
```
drawCircle( x := point.x, y := point.y, radius := 5 );
```
or as Java-style comments:
```
drawCircle( /* x */ point.x, /* y */ point.y, /* radius */ 5 );
```
As I become familiar with certain methods, turn this expansion off for those methods only. In a similar way, optionally expand/collapse calls with parameter type information as well.
When I type in an identifier the SCID has never heard of, use spell check logic to suggest what I likely meant. Eclipse now does this. I spend so much time correcting typos, variant abbreviations, errors in capitalisation and inconsistent capitalisation, e.g. Hashtable vs HashTable, when several variants are plausible.
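One plausible reading of "spell check logic" is edit distance against the identifiers already in the database. The sketch below uses the classic Levenshtein distance, compared case-insensitively so that Hashtable and HashTable count as the same word. The class name `IdentifierSuggester` and the `maxDistance` cutoff are assumptions for illustration. A real SCID would probably also weigh scope and type.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch: suggest the closest known identifier for a mistyped one. */
final class IdentifierSuggester {
    /** Classic Levenshtein edit distance, ignoring differences in capitalisation. */
    static int distance(String a, String b) {
        String s = a.toLowerCase(Locale.ROOT);
        String t = b.toLowerCase(Locale.ROOT);
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j; // cost of building the first j chars of t from nothing
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i; // cost of deleting the first i chars of s
            for (int j = 1; j <= t.length(); j++) {
                int substitute = prev[j - 1] + (s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1);
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                curr[j] = Math.min(substitute, Math.min(insert, delete));
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[t.length()];
    }

    /** Returns the nearest known identifier, or null if nothing is within maxDistance edits. */
    static String suggest(String typed, List<String> known, int maxDistance) {
        String best = null;
        int bestDistance = maxDistance + 1;
        for (String candidate : known) {
            int d = distance(typed, candidate);
            if (d < bestDistance) {
                best = candidate;
                bestDistance = d;
            }
        }
        return best;
    }
}
```

With the known names `Hashtable`, `HashMap` and `String`, typing `HashTable` or `Strng` would each find the intended identifier, while something unrelated returns null so the SCID can treat it as a genuinely new name.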
Warn me if I reuse a name locally that is already defined as an instance or static variable, except for the usual exceptions.
Show me my declarations aligned in columns, perhaps using compact glyphs to indicate static, instance, public etc. so that I can easily pick out parallels in names and types.
Let me see switch statements as if they had been coded in Eiffel as inspect statements. Let me see declarations, expressions, loops, if nests etc, in my favourite syntax, in any of the Algol family of languages such as: Eiffel, Ada-95, Java, Dylan, Scheme, Algol-68, beta, Pascal, Delphi, Oberon, Modula, NetRexx, Python, Sather, roll-your-own such as Abundance or even as flowcharts. You should be able to key code in any of these modes too. The language would still be Java underneath, with a surface veneer to simulate the coding conventions of these other languages.
Show me the Javadoc comment for that parameter on demand.
Global method renaming. Accurate, unambiguous method and variable naming is the most underrated technique for writing maintainable code. Whenever you add a new method, there is a strong possibility some existing similar method should be renamed so the distinction between the two is more clear. Scope name clashes can be resolved to avoid confusing programmers. Compilers have no trouble with accidentally duplicated names, but programmers are easily befuddled. Global renaming by hand is so error-prone that it is almost never done. With a SCID, it would be effortless and completed in an eye blink. You could also do generalisations of renaming, e.g. reordering parameters to some more consistent standard, or adding overloaded methods to handle common default parameters and having all code converted to use the new overloaded methods.
Show me the program with the Spanish strings inserted. Show it to me with the Spanish variable names where they are available, but use English ones where not. Let me read it as if it were written purely for Spanish with any internationalisation bubblegum housekeeping hidden.
Show me the program with all needless () levels removed that our newbie programmer put in. The () are not actually stored in the database. They are regenerated to suit the individual programmer preference.
Show me the program with extra parentheses () inserted because I can never remember the precedence distinctions between && and <.
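The two views above fall out naturally once the expression is a tree: the parentheses are not stored, only regenerated at display time. Here is a minimal sketch, assuming a toy `Expr` tree with a handful of left-associative binary operators. The class name, the tiny precedence table and the `everyLevel` flag are all hypothetical. Only the relative order of the precedence numbers matters.

```java
/** Hypothetical sketch: regenerate parentheses from an expression tree on demand. */
final class ExprPrinter {
    /** A leaf (variable name) or a binary operator with two operands. */
    static final class Expr {
        final String text; // leaf name, or operator symbol for an inner node
        final Expr left;
        final Expr right;

        Expr(String text, Expr left, Expr right) {
            this.text = text;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null;
        }
    }

    static Expr leaf(String name) {
        return new Expr(name, null, null);
    }

    static Expr op(Expr left, String symbol, Expr right) {
        return new Expr(symbol, left, right);
    }

    /** Relative precedence for the few operators in this sketch; higher binds tighter. */
    static int precedence(String symbol) {
        switch (symbol) {
            case "*": case "/": return 5;
            case "+": case "-": return 4;
            case "<": case ">": return 3;
            case "&&": return 2;
            case "||": return 1;
            default: throw new IllegalArgumentException("unknown operator " + symbol);
        }
    }

    /** everyLevel=false shows only the required parentheses; true wraps every nested operator. */
    static String render(Expr e, boolean everyLevel) {
        return render(e, 0, false, everyLevel);
    }

    private static String render(Expr e, int parentPrecedence, boolean isRightOperand, boolean everyLevel) {
        if (e.isLeaf()) {
            return e.text;
        }
        int p = precedence(e.text);
        String body = render(e.left, p, false, everyLevel) + " " + e.text + " "
                + render(e.right, p, true, everyLevel);
        boolean isRoot = parentPrecedence == 0;
        // All operators here are left-associative, so an equal-precedence right operand needs wrapping.
        boolean required = p < parentPrecedence || (p == parentPrecedence && isRightOperand);
        return (!isRoot && (everyLevel || required)) ? "(" + body + ")" : body;
    }
}
```

The same tree for a comparison joined by `&&` can then be shown bare for the programmer who knows the precedence rules, or fully parenthesised for the one who can never remember them.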
Show me the program with the Whazmotron custom glyph set so that I can easily pick out if begin end, loop begin end, class begin end, method begin end.
What classes are available to me at this point in the program? What local variables are in scope?
How should you display semicolons? Once you have the parse tree, they are purely for the convenience of the humans reading the code. They are not actually in the computer’s parse tree. You could display with any statement convention you wanted. Every programmer could flip between any display mode they wanted.
- You might leave them out.
- You might use Pascal separator, Java-like terminator, or Eiffel-like only-when-it-would-otherwise-be-confusing rules.
- You might use a pure indenting convention.
- You might draw boxes, or non-outlined boxes around each statement in a subtly different shade from the background colour.
- You might use a special fat glyph, perhaps a little red stop sign, that is very easy to tell apart from a colon.
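Since the statement boundaries live in the parse tree, the punctuation becomes a per-reader display setting. A minimal sketch of three of the conventions listed above follows. The `StatementRenderer` class and the `Style` names are hypothetical, and the statements are assumed to be stored without any trailing punctuation.

```java
import java.util.List;

/** Hypothetical sketch: the same statement list shown under different semicolon conventions. */
final class StatementRenderer {
    enum Style {
        TERMINATOR,  // Java-like: a semicolon after every statement
        SEPARATOR,   // Pascal-like: a semicolon between statements only
        INDENT_ONLY  // no semicolons at all
    }

    static String render(List<String> statements, Style style) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < statements.size(); i++) {
            boolean last = i == statements.size() - 1;
            sb.append(statements.get(i));
            if (style == Style.TERMINATOR || (style == Style.SEPARATOR && !last)) {
                sb.append(';');
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Two programmers looking at the same stored statements could each pick a different `Style` without either one changing what is in the database.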
You should be able to ask, what methods are available at this point in the program that produce a Zomblat object? What methods are available that take a Zomblat object as a parameter? What methods are available that take both a Zomblat and a Color object?
What methods are available anywhere that produce a Zomblat object? What methods are available that take a Zomblat object as a parameter?
Show/hide the Eiffelian pre/post assertions. You can fill in dialog boxes about each parameter, variable or return result. For ints you may specify the acceptable low-high bounds. For strings you would specify whether they may be null or empty ("") and whether they may have lead/trail blanks. You might specify that they must be all upper case, all numeric, all lower case, no accented letters etc. For enumerations you would specify the list of allowable values. For debugging, you can turn this code on to ensure all the conditions are being met.
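The dialog-box answers could be stored as small rule objects that the SCID weaves in as checks when debugging is turned on. This is a minimal sketch under that assumption. `FieldRules`, `IntRange` and `StringRule` are hypothetical names, and only a few of the constraints mentioned above are covered.

```java
/** Hypothetical sketch: declarative constraints that a SCID could turn into debug-time checks. */
final class FieldRules {
    /** Acceptable low-high bounds for an int, inclusive at both ends. */
    static final class IntRange {
        final int low;
        final int high;

        IntRange(int low, int high) {
            this.low = low;
            this.high = high;
        }

        boolean accepts(int value) {
            return low <= value && value <= high;
        }
    }

    /** Whether a string may be null, may be empty, and may carry lead/trail blanks. */
    static final class StringRule {
        final boolean allowNull;
        final boolean allowEmpty;
        final boolean allowPadding;

        StringRule(boolean allowNull, boolean allowEmpty, boolean allowPadding) {
            this.allowNull = allowNull;
            this.allowEmpty = allowEmpty;
            this.allowPadding = allowPadding;
        }

        boolean accepts(String value) {
            if (value == null) {
                return allowNull;
            }
            if (value.isEmpty()) {
                return allowEmpty;
            }
            return allowPadding || value.equals(value.trim());
        }
    }

    /** The kind of precondition the SCID might insert at the top of a method. */
    static int check(String name, int value, IntRange rule) {
        if (!rule.accepts(value)) {
            throw new IllegalArgumentException(name + " out of range: " + value);
        }
        return value;
    }
}
```

A month parameter, for example, might carry `new FieldRules.IntRange(1, 12)`, and the generated precondition would be a call such as `FieldRules.check("month", month, rule)`.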
When debugging, the SCID secretly captures information about where in the code each string in the output came from so that you can click anything in the console output and instantly jump to the System.out.println statement that generated it. It should be easy, when debugging, to temporarily assign colours to the console output from different classes or println statements to help classify them, similar to the way logging can be configured.
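Even without a SCID, plain Java can approximate the capture step: a `PrintStream` subclass can look at the call stack each time a line is printed and remember who printed it. The sketch below is a minimal version of that idea. The class name `TracingStream` is hypothetical, it only intercepts `println(String)`, and it keys origins by the text of the line, so two identical lines from different places would overwrite each other.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: remember which source location produced each console line. */
final class TracingStream extends PrintStream {
    private final Map<String, StackTraceElement> origins = new HashMap<>();

    TracingStream(OutputStream out) {
        super(out, true);
    }

    @Override
    public void println(String line) {
        // Frame 0 is this println method; frame 1 is the code that called it.
        StackTraceElement[] stack = new Throwable().getStackTrace();
        if (stack.length > 1) {
            origins.put(line, stack[1]);
        }
        super.println(line);
    }

    /** Returns where the given line was printed from, or null if it was never seen. */
    StackTraceElement originOf(String line) {
        return origins.get(line);
    }

    /** Small demonstration: prints one line and reports the location that printed it. */
    static StackTraceElement demo() {
        TracingStream out = new TracingStream(new ByteArrayOutputStream());
        out.println("kettle is boiling");
        return out.originOf("kettle is boiling");
    }
}
```

To try it on a real program you could install it with `System.setOut(new TracingStream(System.out))`. The returned `StackTraceElement` carries the class, method and line number that a click in the console would jump to.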
Capture additional information about fields useful for data entry, such as low-high bounds, blank if zero, left leading zero fill, commas, lists of legal values, justification, natural layout parameters, field name or display, prompts, field widths, validation routines,… Programs can access this data rather than specifying it inline in the code. This keeps everything about the variable in one place where it can be easily accessed and changed. It also facilitates searching for fields that share some property and bulk replacing it.
That was an ambiguous name for that method. Change it everywhere it is used to this clearer name, but don’t change it where that same name is used in another class. Computer, be clever; don’t pester me to figure out for you which ones should be changed. IBM’s Visual Age can do this already. With a database, a variable or method name string actually appears in only one place (everywhere else the name is represented by a pointer to that name), so it is trivial to make a global change.
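The "name appears in only one place" point can be shown in a few lines. In this minimal sketch each call site holds a reference to a `Symbol` object rather than a copy of its spelling, so a rename is a single assignment and a same-named method in another class is untouched. The names `SymbolDemo`, `Symbol` and `Call` are hypothetical, and a real database would use keys rather than Java object references.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: references point at a symbol, so renaming touches one field. */
final class SymbolDemo {
    /** The single place where a method's spelling is stored. */
    static final class Symbol {
        String name;

        Symbol(String name) {
            this.name = name;
        }
    }

    /** A call site stores a pointer to the symbol, never the text of its name. */
    static final class Call {
        final Symbol target;
        final String args;

        Call(Symbol target, String args) {
            this.target = target;
            this.args = args;
        }

        String render() {
            return target.name + "(" + args + ")";
        }
    }

    /** Renames one of two same-named methods and renders all the call sites afterwards. */
    static List<String> demo() {
        Symbol kettleFill = new Symbol("fill"); // imagine Kettle.fill
        Symbol formFill = new Symbol("fill");   // imagine Form.fill: same spelling, different symbol
        Call first = new Call(kettleFill, "water");
        Call second = new Call(formFill, "defaults");
        Call third = new Call(kettleFill, "milk");

        kettleFill.name = "addLiquid"; // the entire global rename

        List<String> rendered = new ArrayList<>();
        rendered.add(first.render());
        rendered.add(second.render());
        rendered.add(third.render());
        return rendered;
    }
}
```

Both calls to the kettle's method pick up the new name, while the form's `fill` keeps its old one, with no text searching and no questions asked of the programmer.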
Display the program using foreground/background color, font family, font size, font style (bold, italic), lines, glyphs and icons to pack as much additional information on the screen as possible. For example you might be able to tell a stack/temporary variable, from an instance variable from a static variable from a constant just by looking at the font, or some slight shade of foreground or background colour difference, e.g. dark brown, orangey/brown and light brown foreground. The clues may be almost subliminal. You could encode all kinds of information compactly such as: local, parameter, instance, static, my class, Sun class, type, package, class, definition, keyword, final… all in a way that did not get in your face. You could encode for whatever distinctions were important at the time.
Variable pitch fonts are possible without giving up alignment. They put more on the screen and are more readable than fixed pitch fonts.

Exactly how these abilities are used will change constantly depending on your current task. The idea is to encode information about symbols in their look.
You could use the full colour abilities of the modern screen to give subliminal clues, e.g. by automatically assigning a portion of the spectrum to each package/class using pastel shades as the backgrounds to any references to methods or variables of that class. You could bold face the definition of any identifier to make it stand out. You could make calls to Sun code look different from calls to your own code.
Chris Uppal suggested using colour coding by author. He noted that in every shop where he had worked, there were programmers he could trust and ones he did not. If he were attempting to understand or debug a chunk of code it would help if he knew which stretches could be trusted to do what they claimed to do and which stretches warranted more scrutiny.
You could encode the age of code by colour. Generally the newest code is most suspect if there is a problem. Sometimes old code, that was done before some specification change occurred, needs to be examined and ticked off as compatible with the new spec. You can use colour to help keep track of which code has been checked. A SCID would know the age of every token to the millisecond, much finer resolution than could be pulled off with CVS (Concurrent Versions System) deltas.
You could ask that all code be filtered out unless it had to do with instantiating objects (other than common ones like String). This skeleton view would give you a pretty good overview of how all the classes fit together.
You could ask to globally visit all references to a given method or variable and tick them off once each was dealt with.
You could do quite a bit of code writing by point and click. There is no need to type a variable or method name, just select it from a palette of likely variable or method names. You could type personal abbreviations for them and have them expanded. You could view code with your personal names or the official ones. For example to write a FOR loop there might be boxes you fill in for the various initializer, terminator and incrementor expressions. They would default to int i=0, i<n and i++. You could give the loop a name to be displayed when its body is collapsed. You could convert the loop to one that ran backwards by a single click, or to one that generated a WHILE or UNTIL loop similarly. If you ticked enumeration all you would need to type is the name of the Enumeration generator. The rest would be generated for you, accurately.
Alternate display with common functions displayed as if they were infix operators using special glyphs (simulated here with red). For example, instead of seeing:
```
if ( a.equals(b) )
```
You might see
```
if ( a == b )
```
Instead of
```
if ( a.compareTo(b) < 0 )
```
You might see
```
if ( a < b )
```
The SCID would act as a Java lint, displaying suspicious or unusual code in a special colour and perhaps ask you for confirmation when you inserted code of the form if (myString == abc) or if (myBool = a & b).
Show or hide explicit conversions.
Display declarations in a grid so that is easy to pick out the variable name, the type and the initialisation. They line up nicely in columns like a spreadsheet, possibly with each column separately scrollable so that you can see the big picture and home in on the detail when you need it.
Embed HTML comments in your code that render, complete with diagrams and images when you read the code. There could be links to references to where the algorithms were documented etc.
Show me the code in pseudo NetRexx, Bali or JPython, with obvious declarations removed so I can focus on the procedural logic, or vice versa. I would then see an enumeration iteration written tersely as for r in reminders instead of the usual Java verbosity.
Bali-style variable size parentheses. In Java a piece of code might be displayed like this:
```
int a = ((b+c)/(e+f))*(g(i)+h);
```
That same piece of code displayed in Bali might look like this:
int a = ((b + c)/(e + f)) * (g(i) + h);
The red is just to highlight the outsized (), though colour coding matching () and {} is not such a bad idea. It might even be optionally displayed like this:
```
         b + c
int a =  ————— * (g(i) + h);
         e + f
```
Show get/set method invocations as if they were direct access to an associated property variable, similar to Delphi or Eiffel. This simplifies the syntax. Instead of seeing:
```
setFudge( getFudge()+1 );
```
you would see:
```
fudge++;
```
Use colour to display literals to group digits by three for decimal and octal and by four for hex, emphasising the trailing indicator char in a different colour so it does not get confused with a digit. In some Asian countries, decimal digits are also grouped by four. The SCID would allow either preference, defaulting to the locale default, so different people would see the same code differently.
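Plain text cannot show the colour changes, so this minimal sketch marks the group boundaries with an underscore instead, which Java 7 and later happen to accept inside numeric literals. The class `LiteralGrouper` and its methods are hypothetical. It handles only integer literals with an optional `0x` prefix and `L` suffix, not floating point.

```java
/** Hypothetical sketch: group the digits of an integer literal for display. */
final class LiteralGrouper {
    /** Inserts the separator between groups of digits, counting groups from the right. */
    static String group(String digits, int groupSize, char separator) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < digits.length(); i++) {
            int remaining = digits.length() - i;
            if (i > 0 && remaining % groupSize == 0) {
                sb.append(separator);
            }
            sb.append(digits.charAt(i));
        }
        return sb.toString();
    }

    /** Groups hex literals by four and everything else by three, keeping any trailing L apart. */
    static String display(String literal) {
        String body = literal;
        String suffix = "";
        if (body.endsWith("L") || body.endsWith("l")) {
            suffix = body.substring(body.length() - 1);
            body = body.substring(0, body.length() - 1);
        }
        if (body.startsWith("0x") || body.startsWith("0X")) {
            return body.substring(0, 2) + group(body.substring(2), 4, '_') + suffix;
        }
        return group(body, 3, '_') + suffix;
    }
}
```

A reader who prefers decimal digits in fours could be served by calling `group` directly with a group size of 4. In a real SCID the separator would be a colour change and the suffix would get its own colour.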
Display using lines or slight shade variations in background colour to mark the bounds of ifs and loops. Programs would look more like flow charts, or more like text with highlights, as the programmer preferred for the current purpose. Vertical striation watermarks in the background would make it easier to see matching alignments. You might draw thin vari-shaded boxes around each nested block. You might bracket blocks with {} turned 90 degrees and made 10.16 cm (4 in) wide. CSD is one such representation.
Optionally apply Hungarian notation warts to variable names to indicate variable type or scope. Turn them on and off at will. They are always accurate, e.g. scope prefixes might work like this:
- local a (e.g. aPoint)
- param p (e.g. pPoint)
- member instance m (e.g. mPoint)
- static s (e.g. sPoint)
- exception X (e.g. XOutOfBounds)
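Because the SCID already knows each variable's scope, the wart is a pure display-time decoration and can never drift out of date. A minimal sketch using the prefixes listed above follows. The class `ScopeWarts` and the enum `Scope` are hypothetical names.

```java
/** Hypothetical sketch: apply a scope prefix to a name at display time only. */
final class ScopeWarts {
    enum Scope {
        LOCAL("a"), PARAM("p"), INSTANCE("m"), STATIC("s"), EXCEPTION("X");

        final String prefix;

        Scope(String prefix) {
            this.prefix = prefix;
        }
    }

    /** Returns the decorated form, e.g. point with LOCAL becomes aPoint. */
    static String decorate(String name, Scope scope) {
        if (name.isEmpty()) {
            return name;
        }
        return scope.prefix + Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }
}
```

The stored name stays `point` throughout. Turning warts off simply means not calling `decorate` when drawing the screen.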
Highlight all code involving floating point. What I am talking about here is not permanently highlighting floating point operators and operands, for example, but just for the next 10 minutes because floating point is the thing I am concentrating on at the moment.
The syntax colouring schemes I am familiar with are designed to be done once and left alone once you have them tweaked the way you like them.

For a SCID, you need not only ways to change the syntax highlighting, but to rapidly flip between presets to enhance the current interest and to suppress the current irrelevancies. You also need ways to rapidly set up new interest constellations.

I use the word highlight in a broader sense. With a GUI (Graphical User Interface) and SCID you may use combinations of colour, font, size, glyphs, background colour, hiding, folding, lines, geometric shapes, bold, italic, blink etc. etc. to make a certain constellation of currently interesting features leap out at you. Different interesting features would use different highlighting techniques to grab your attention simultaneously.
If you elected to view an IF as a flow chart, you could more easily compare the true and false branches line by line side by side. With the modern GUI’s ability to rapidly pan in 2D or even 3D, we should break the mindset that programs have to be a single linear column of text. We can pay more attention to what actually works for the eye, not what is easiest to code. I am pretty sure that long horizontal lines of text, stretching all the way across the screen, so popular now, will prove to be suboptimal. You might look for inspiration to website navigation aids or the Windows ME exploding menus.
You might create your own glyphs or icons to represent methods, classes, operators, variables, syntactic elements etc. That way you can pack more information onto the screen at once. You create a personal way of displaying the program to yourself that no one else in the universe need be able to make sense of. You share the underlying code syntax tree with your fellow programmers. The representation is personal and evanescent.
You want to see program flow under a certain set of circumstances. Code that would not be executed when those circumstances don’t apply is temporarily suppressed from the display. You are left with a simplified flow chart that shows execution flow. You can focus on the usual case, then later view various pathological cases, independently. You don’t have to deal with the full complexity of everything all the time the way you do in conventional coding. The whole point of the SCID is to temporarily suppress what is temporarily irrelevant.
Idiom expansion. There are many things in Java that take reams of code to express. You can’t abbreviate it by writing a method. Instead, you code an abbreviation, or fill in some blanks in a dialog box and it generates the bubblegum for you, error-free.
Idiom detection. Java is verbose, but tends to follow standard patterns called idioms, e.g. enumerating a set, hooking up buttons to listen for events. The SCID can detect the pattern and replace it with an abbreviation for display. If code that looks like an idiom refuses to abbreviate, you can be sure it is not quite the standard idiom. That may be an error, or it may be deliberate. Code is much easier to proofread this way. You don’t need your eyes to detect tiny variations from the standard idioms.
You want to be able to trace not only program flow but data flow. Consider a program rendered like a flow chart, with parts of it suppressed. Lines show how a particular datum flows through the system, how it gets operated on and modified. Code that is not germane to that flow is temporarily suppressed. I am hand waving frantically here. What the heck am I talking about? Consider a program where you entered a birthdate. There are parts of the program that would be totally unaffected by that birthdate. Those parts could be suppressed if you were concerned with how the birthdate affects the program. There are degrees of association. A test on birthdate is a little less associated than some code that transforms a birthdate ordinal into year, month and day for display.
Pale finals. I would like the SCID to mark all variables that are not redefined with a pale final to let me know I need not worry about subsequent redefinition of the value. Similarly, I would like the SCID to mark all methods and classes that are not overridden with a pale final to let me know there is no redefinition of the method to worry about. These would be generated dynamically, not part of the source code and could be turned on or off. They would look like regular finals, except would be displayed in a pale colour to indicate their ghostly nature. They would not prevent me from redefining the variable or the method. The pale final would simply disappear.
A SCID might colourise a final declaration in a distinctive way whether or not it is explicitly marked final.
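For variables, deciding where a pale final belongs comes down to asking which names are assigned exactly once. The sketch below does only that counting, over a flat list of assignment events that the parse database is assumed to supply. The class `PaleFinals` and that input format are hypothetical. A real implementation would need proper flow analysis, much as the Java compiler does for its own notion of effectively final variables.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: find variables that would earn a pale final. */
final class PaleFinals {
    /** Takes one entry per assignment, in source order, and returns the names assigned exactly once. */
    static List<String> neverReassigned(List<String> assignments) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String variable : assignments) {
            Integer seen = counts.get(variable);
            counts.put(variable, seen == null ? 1 : seen + 1);
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

The moment a second assignment to a variable is typed in, it drops out of the result and its pale final simply disappears from the display.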
Display complicated expressions in true mathematical form much as TeX or the Microsoft equation editor would display them, with variable sized parentheses and denominators under numerators. To help understand expressions, you could ask bits of them to collapse on screen. A simple version would adjust the amount of space around each operator to indicate relative precedence. Low precedence operators would be surrounded by more space. Java has 13 levels of precedence, but you would not normally find them all in one expression. The relative distinctions in spacing could be obvious. You would not have a fixed amount of space for each precedence level. You could also display complex expressions as a parse tree, with operators at the nodes. What you see need not have that much relation to what you type. For example, you could type GT for > since for some people it is easier to type.
Let you refactor code by breaking up methods into smaller ones. You just highlight the hunk of inline code you want made into a separate method. In theory it should even be possible to automatically determine the parameters and their types. The system could then go looking for code that does inline what your new methods do and replace that inline code with the new encapsulated calls.
Lets you select colours within the SCID using a ColorChooser. Colours and variables/constants representing colours in code can be represented in any combination of three ways:
1. By colour sample swatch.
2. By colour name.
3. By colour number (hex/decimal)
Exploit new high res 1 metre square LCD (Liquid Crystal Display) or gas plasma panels so you have room to see everything at once, visually navigating your way around the entire code space, rather than peeking at it through a toilet tube the way we do now.
With everything preparsed, writing your own custom code conversions would be a lot easier. For example, you might write a translator from AWT (Abstract Window Toolkit) to Swing code.
True visual editing. Your GUI program looks like the final screen output. You right click on any component, which brings up a dialog box from which you can change colour, font, border, initialisation, associated event handlers… mostly by ticking off boxes and making multiple choice selections. Your program always works to some degree since you can’t select anything syntactically invalid. Navigation is far easier, since you don’t even need to remember the names of things. You can, of course, find out the names of things by right clicking them. Code becomes far less procedural and more OO (Object Oriented).
A supermarket parking lot helps its customers find their cars by posting signs with animals on them in various parts of the parking lot. It is much easier for someone to remember they left their car near the salmon than in sector E6. Similarly you could embed landmark symbols in the code, perhaps with purely personal meaning and only visible to one programmer, just to help her find her way around. You could click on the landmark to get back to a section of code you were working on recently.
For importing ordinary Java source code, a parser such as JavaCC or ANTLR (Another Tool for Language Recognition) might be useful. See parser in the Java & Internet Glossary.
Ability to add shortcuts to the syntax such as Abundance-style moods and for-each loops. Instead of saying x.keyin(); y.keyin(); z.keyin(); you can declare keyin as a mood and say: keyin x, y, z; You could say things like: for ( MyArray) { MyArray.x += MyArray.y + 1; } to run through all the elements of the array and provide an implied default subscript inside the loop body. Collection iteration could be much terser as it is in most modern languages.
You can invent your own language shortcuts. No other programmer need view them. They would see perfectly standard Java. However, when you viewed their code, your shortcuts would be applied so you would not have to wade through their reams of dinosaurian repetition. You might for example add case ranges to the switch, implemented with a binary search. If your shortcut got in the way, you could drop it, and instantly see standard Java again. After all, this is software, right? It is supposed to be malleable and comfortable. With traditional coding techniques software becomes so rigid. It is harder to change than the supposed hardware.
I suspect that SCIDs (Source Code In Databases) will create a revolution in terseness of language design. It will come about gradually like this. A SCID will give you the ability to temporarily hide bookkeeping/busywork/plumbing/wiring (pick your analogy) details. Next will come the call for the SCID to automatically generate those details. Next will come the complete suppression of them from the application programmer’s awareness. They will be hidden completely behind the walls, no longer part of the day-to-day application programming language. Every time you can suppress 30% of the busywork, a whole new set of patterns emerges that were formerly obscured by all that fussy detail. Suddenly, you discover new ways to more tersely specify your desires to the computer. You see new levels of bookkeeping/busywork/plumbing/wiring that can be similarly hidden, revealing still more deep structure. This is not just speculation. I have seen this process in the evolution of my own Forth-based language, Abundance.
SCIDs will also have another influence on language design. Right now we are stuck in a mindset that a computer language is a linear stream of vanilla 7-bit ASCII characters. SCIDs will loosen that up. We are already seeing that with tools like the Symantec layout editor. A program can be a diagram in 2D. Font style can have semantic meaning. Noam Chomsky might put it this way, programs may have many temporary surface representations of a single deep structure. We will see multiple alias names for variables, multilingual variants of the Strings that can be flipped with the click of a menu item.
A SCID could potentially store a lot more information, (not normally visible) than a text file representation would. For example, you could fairly easily automatically record who changed each individual element in each line of code, why and when and as part of which job layer. See dynamic version control for what I mean by job layer. Global renamings would be labeled as such, not as a million separate little transactions. This information could also be used by the boss to track precisely what a telecommuting employee did during the week. You could add all the commentary you wanted without worrying about overwhelming the reader for whom it was irrelevant. The other information you could track at each node is who has access to look at or change that piece of code.
For some notes on how a SCID might be implemented so that many users could be updating the same code simultaneously from several globally dispersed sites, again see dynamic version control.
If ever Microsoft (the inventors of the dancing paperclip) gets a hold of the SCID concept, I suspect they will totally misunderstand it. A SCID will be a 3D simulation of a Disneyland ride where you passively watch transactions being processed by a McDonald’s hamburger machine to the endlessly repeating strains of It’s A Small Small World. To debug, you watch the individual bits shaped like French fries being cooked, salted then added, get it? It takes only 10 seconds for the animation to complete the addition of two numbers.
Have a look at Visual SlickEdit. It is not a SCID, but with every release it develops more and more SCID-like features.
I have written an essay on online books. I propose a SCID-like solution to handling the problem of information overload in technical documentation.

How Might You implement a SCID

Instead of traditional CVS or editor model where you have lines of ASCII text, you would have a tangled hairball of objects, one object for each token, e.g. IF, variable reference, method definition. The objects would have pointers to each other so you can rapidly find related information and rapidly navigate the program at any level of detail. References to a variable would not contain the name of the variable, just a pointer to its associated token object. The actual string name of a variable or method would appear in only one place. (This makes global rename and aliasing trivial.)
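
A minimal sketch of that arrangement, assuming nothing beyond what the paragraph above says. The class names TokenGraphSketch, VariableDecl and VariableRef are invented for illustration. The name string lives only in the declaration object, each reference holds a pointer to it, and a global rename becomes a single assignment.

```java
import java.util.ArrayList;
import java.util.List;

final class TokenGraphSketch {
    /** Declaration token: the one and only place the variable's name is stored. */
    static final class VariableDecl {
        String name;
        final String type;
        // Back pointers, so every use of the variable can be found without searching.
        final List<VariableRef> references = new ArrayList<>();

        VariableDecl(String type, String name) {
            this.type = type;
            this.name = name;
        }

        /** Create a new use of this variable somewhere in the program. */
        VariableRef newReference() {
            VariableRef ref = new VariableRef(this);
            references.add(ref);
            return ref;
        }
    }

    /** Reference token: holds a pointer to the declaration, never the name itself. */
    static final class VariableRef {
        final VariableDecl target;

        VariableRef(VariableDecl target) {
            this.target = target;
        }

        /** The text to display is looked up through the pointer at display time. */
        String display() {
            return target.name;
        }
    }
}
```

Setting decl.name = "total" on a declaration changes what every existing reference displays, without touching the references themselves.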

There are TreeMaps so you can find symbols by name or approximate name or by name/property combinations.
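
As a hedged illustration of the lookup, here is a sketch using java.util.TreeMap. The class SymbolIndex and its methods are hypothetical, "approximate name" is narrowed to mean "starts with this prefix", and scoping is ignored, so each name maps to a single description string.

```java
import java.util.SortedMap;
import java.util.TreeMap;

final class SymbolIndex {
    // Sorted by name, so all names sharing a prefix sit next to each other.
    private final TreeMap<String, String> byName = new TreeMap<>();

    void add(String name, String description) {
        byName.put(name, description);
    }

    /** Exact lookup by name; null if there is no such symbol. */
    String exact(String name) {
        return byName.get(name);
    }

    /** All symbols whose names start with the given prefix, in sorted order. */
    SortedMap<String, String> startingWith(String prefix) {
        // Every string that starts with the prefix sorts at or after the prefix
        // and before the prefix followed by the largest possible char.
        return byName.subMap(prefix, prefix + Character.MAX_VALUE);
    }
}
```

The sorted order is what makes the range query cheap. A name/property combination could be handled the same way with a composite key, though that is not shown here.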

There is no source code, just the parse tree. You are thus free to display it in many different possible formats, or to export traditional Java source. The parse tree always represents a syntactically valid Java program.

The parse tree contains much more data than the equivalent source code, e.g. history of change, who changed each token and why.

The parse tree is RAM-resident, or stored in a decent persistent object database that approaches RAM-resident performance, such as ObjectStore. Even for a purely RAM-resident implementation, the data must be persisted, that is, dumped to disk and restored as a lump with all the interconnections intact. Execution (but not startup) would be faster than using a POD (Persistent Object Database). You need to log transactions to disk, but everything else lives in a giant virtual RAM (Random Access Memory) space. Someday we will learn to snapshot entire virtual address spaces and pick up later exactly where we left off.
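
One way to dump and restore a lump with the interconnections intact is plain Java serialization, which preserves shared references and cycles within a single object graph. The sketch below is an assumption-laden illustration, not a recommendation: SnapshotSketch and Node are invented names, the lump is a byte array rather than a disk file, and default serialization recurses, so a very deep tree would need a different approach.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SnapshotSketch {
    /** Hypothetical parse tree node with pointers both down and up the tree. */
    static final class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node addChild(String childLabel) {
            Node child = new Node(childLabel);
            child.parent = this;
            children.add(child);
            return child;
        }
    }

    /** Dump the whole graph reachable from root as one lump of bytes. */
    static byte[] dump(Node root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Rebuild the graph; parent and child pointers refer to the restored objects. */
    static Node restore(byte[] lump) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(lump))) {
            return (Node) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip, a restored child's parent pointer refers to the restored root, not to the original and not to a duplicate.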

I repeat, the parse tree always represents a syntactically valid program. It might not necessarily do anything sensible, but it would compile. Changes to the parse tree are applied in the form of atomic transactions to ensure the integrity of the tree cannot be compromised.

Other sorts of auxiliary data may be stored in a conventional SQL (Structured Query Language) database where it would be accessible to user-written queries. However, the source code itself has too complex a structure to fit into the row-column SQL model.

There is a log of transactions that can be replayed in event of failure, or analysed to recreate the dynamic change history. You can play the log forward or back. The advantage of this log is that even in the event of catastrophic failure you would never lose more than a few seconds worth of keying.
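
Playing the log forward or back can be sketched if each entry records both the old and the new value. Everything here is hypothetical: the class ReplayLog, the single kind of transaction (set the text of a numbered token) and the in-memory list standing in for a disk log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ReplayLog {
    /** One logged transaction: set a token's text, remembering the old text. */
    static final class Change {
        final int tokenId;
        final String before; // null means the token did not exist
        final String after;

        Change(int tokenId, String before, String after) {
            this.tokenId = tokenId;
            this.before = before;
            this.after = after;
        }
    }

    private final Map<Integer, String> tokens = new HashMap<>();
    private final List<Change> log = new ArrayList<>();
    private int cursor = 0; // how many log entries are currently applied

    /** Log a new change and apply it. Only allowed when fully played forward. */
    void record(int tokenId, String newText) {
        if (cursor != log.size()) {
            throw new IllegalStateException("play the log forward before recording");
        }
        log.add(new Change(tokenId, tokens.get(tokenId), newText));
        forward();
    }

    /** Apply the next logged change; false if already at the end of the log. */
    boolean forward() {
        if (cursor == log.size()) {
            return false;
        }
        Change c = log.get(cursor++);
        put(c.tokenId, c.after);
        return true;
    }

    /** Undo the most recently applied change; false if at the start of the log. */
    boolean back() {
        if (cursor == 0) {
            return false;
        }
        Change c = log.get(--cursor);
        put(c.tokenId, c.before);
        return true;
    }

    String text(int tokenId) {
        return tokens.get(tokenId);
    }

    private void put(int tokenId, String text) {
        if (text == null) {
            tokens.remove(tokenId);
        } else {
            tokens.put(tokenId, text);
        }
    }
}
```

Recovery after a failure would amount to starting from the last snapshot and calling forward() until it returns false.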

When you get around to implementing dynamic version control, this transaction log must be sent to a central site and merged in real time with transactions of other people’s changes, then redistributed to all the redundant hot copies of the database. This implies a 24 hour Internet connection between all the programmer sites, or at least while any programmer is active at a site. The key is all copies of the database must process all the transactions in the exact same order. For speed you might process local transactions immediately then back them out if it turns out there were transactions from other sites that actually needed to be processed first. For more detail on how that might work see dynamic version control.
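
To see why the order matters and how backing out might work, here is a toy sketch. All of it is assumed for illustration: the class ReplicaSketch, a single integer standing in for the whole database, and a central site that hands every replica the same ordered stream and preserves each site's own order. Local changes show up immediately in view(), but the confirmed state only ever advances in the agreed order, so pending local changes are in effect backed out and reapplied on top whenever a remote transaction arrives first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

final class ReplicaSketch {
    private int confirmed; // state after every centrally ordered transaction so far
    private final List<IntUnaryOperator> pending = new ArrayList<>(); // local, not yet ordered

    ReplicaSketch(int initialState) {
        this.confirmed = initialState;
    }

    /** Apply a local change at once, before the central site has ordered it. */
    void applyLocal(IntUnaryOperator transaction) {
        pending.add(transaction);
    }

    /**
     * The central site announces the next transaction in the global order.
     * If it is one of ours, it is our oldest pending one and stops being pending.
     */
    void applyOrdered(IntUnaryOperator transaction, boolean wasOurs) {
        if (wasOurs) {
            pending.remove(0);
        }
        confirmed = transaction.applyAsInt(confirmed);
    }

    /** What the user sees: confirmed state with pending local changes replayed on top. */
    int view() {
        int state = confirmed;
        for (IntUnaryOperator transaction : pending) {
            state = transaction.applyAsInt(state);
        }
        return state;
    }
}
```

With two sites starting at 1, one adding 10 locally and the other doubling, the first briefly shows 11 and the second shows 2. If the central site orders the doubling first, both end at 12.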

Books

Recommended book: Doing Hard Time: Developing Real-Time Systems with UML, Objects, Frameworks and Patterns
author: Bruce Powel Douglass
paperback ISBN: 978-0-321-77493-4
hardcover ISBN: 978-0-201-49837-0
publisher: Addison-Wesley
published: 1999-05-21
On the ROPES software-development method built into Rhapsody.

Recommended book: Software Engineering Environments: Automated Support for Software Engineering
author: Alan W. Brown
paperback ISBN: 978-0-07-707432-6
publisher: McGraw-Hill
published: 1993-03

Recommended book: Object Oriented Databases: and Their Applications to Software Engineering
author: Alan W. Brown
paperback ISBN: 978-0-07-707247-6
publisher: McGraw-Hill
published: 1991-08
He describes, among other things, the ECMA Portable Common Tools Environment (which is a spec rather than a product) and how a few actual CASE tools match up to it.

Real World SCID Implementations

The usual reaction I get from programmers when I mention SCIDs is that they have tried them and they hate them. What they have tried are coding templates where you fill in the blanks. These stop you from coding in the old way, yet offer almost no payback. Granted, SCIDs will force you to rethink how you compose programs. Code must at all times be 100% syntactically correct. However, a good SCID will pay back a hundredfold for this inconvenience. If you try to import or paste code that is not correct, you will find much of it being turned into a special kind of comment:

// INVALID out.printLine();

Symade Semantic Oriented Programming
SCIDs are not a totally mythical beast. Smalltalk and Logo programmers have been using them for a long time. IBM’s Visual Age Java compiler uses a SCID, though they backed off somewhat with its successor, Eclipse. SCID users are very enthusiastic about them, even though I think the current crop of tools has just begun to exploit the possibilities. Jade stores its code as a preparsed tree. Mozart develops the idea of concept programming where you create application specific syntax.
Lisp has been treating programs as structured data for many years.
See Martin Fowler’s work on Refactoring. His ideas on automated source transformations require analysing the code as a parse tree.
Every version of SlickEdit comes out with a more and more SCID-like user interface.
The Xerox PARC people have been experimenting with a new way of organising Java programs called Aspect Oriented Programming as a way of specifying facts in only one place declaratively rather than by sprinkling them redundantly throughout the procedural code. Doing that makes code much easier to maintain. Declaratively specifying a huge amount of information that is traditionally handled procedurally is the key to my own computer language, Abundance, whose primary design goal was ease of maintenance. You can specify information declaratively and automatically generate the corresponding procedural Java bubblegum.
Microsoft had an Intentional Programming project. An intention is the core essence of a program once you strip out the housekeeping bubblegum that is necessary to explain picky implementation details the language/tool cannot handle on its own. Once the programmer has formed an executable thought, the programmer’s next question is not what do I have to say to get the computer to do this?, as was the case in traditional programming, but what do I insist on saying?. Intentions are the program spec plus sufficient detail to specify how you want the problem solved.
SCIDs are similar to Bell Labs’ SeeSoft, which generates bird’s eye graphical displays of the entire project, using colour coded pixels to indicate such things as code age or hot spots where a profiler determines the code spends most of its time executing. You can zoom in on interesting places to see the actual code. Other things you can colour code with pixels or coloured backgrounds include: code I have recently changed, code others have recently changed, code that was changed during some time period where a problem first showed up, code that is frequently changed, code that makes use of a certain class or method, and where the comments are densest. Basically, any metric you can compute from the parse tree representation can be expressed as colour.
Colouring for absolute frequency of execution points out areas that could benefit from optimisation. Colouring for relative frequency of execution helps you pick out the most common paths through the code, i.e. what happens in the usual case.
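Turning a metric into a colour can be as simple as a linear blend. The sketch below is one arbitrary choice among many: the class HeatColour and the blue-for-low, red-for-high scale are assumptions made for illustration, and the metric (execution count, age in days, change frequency) is whatever number you computed from the parse tree.

```java
final class HeatColour {
    /**
     * Map a metric value within [min, max] to a packed 0xRRGGBB colour,
     * pure blue for the lowest value and pure red for the highest.
     * Values outside the range are clamped to the nearest end.
     */
    static int rgb(double value, double min, double max) {
        double t = (max > min) ? (value - min) / (max - min) : 0.0;
        t = Math.max(0.0, Math.min(1.0, t));
        int red = (int) Math.round(255 * t);
        int blue = 255 - red;
        return (red << 16) | blue;
    }

    /** Format a packed colour the way it is usually written, e.g. #FF0000. */
    static String hex(int rgb) {
        return String.format("#%06X", rgb);
    }
}
```

For relative frequency you would pass the minimum and maximum of the method or file being viewed. For absolute frequency you would pass project-wide bounds.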
Jim Little’s Prism project seeks to find a representation for SCID data that can be shared by different programs. That way you could build your SCID system up out of pluggable components.
You might mine the i-Logix Rhapsody project for ideas on visual programming. It is a diagrammatic code generator for C++. It is based on UML (the high-level language for real-time, multitasking systems) and i-Logix Statemate.
The idea behind Rhapsody is to make the documentation executable. And the documentation is in the form of a number of diagrams you draw. i-Logix' Statemate uses enhanced bubble charts that, to paraphrase the Buick commercial, are not your father’s bubble charts. Briefly, they allow an action upon entry to a state, while in a state and upon exit. Further, exits from a state can branch conditionally and a sub-machine can remember its last state to pick up where it left off upon re-entry. There’s more, but suffice it to say that Statemate is very powerful.
CodeGuide was an IDE that took more of a plunge in the SCID direction than usual.
The i programming language is reputed to be SCID friendly.
OpenJava can also be regarded as a toolkit for constructing a Java preprocessor.
Jatha is a simple preprocessor for Java that is inspired by the power of Lisp macros. It is released under the GPL (GNU General Public License).
Juliet lets you ask SCID-like questions about your source code and rapidly navigate it. It is not an editor, just a browser.
Aubjex Alajava was a technology that transformed Java code into an especially efficient and complete database form, with generalized capabilities that do for Java source code what database query and manipulation products do for business data. Author Don Gilmore writes: "Aubjex is built on SCID. It can parse the entire Java version 1.4 java source package in 30 seconds, into a database that maintains all information. We have hundreds of XML (Extensible Markup Language) scripts that query and manipulate the database. There is a dataflow scripting tool for creating new scripts, although it is not yet documented."
There is a SCID discussion group hosted by Google Groups. To get on, send an email to brightone@o2.pl. You will need to create a Google Groups account. Then you can visit the scid group. The moderator is Polish and the host is google.pl, but go ahead and post in English.

The following people have expressed interest in writing a SCID. You might get together with them on a combined project. Email me to add your name to the list.

Unfortunately, the email addresses below are not clickable. Further, you cannot copy/paste them into your email program. You must manually re-type them. The email addresses are graphic *.png images created by Masker. I inconvenience you this way to discourage spammers from harvesting email addresses from the website with automated website spidering.

SCID Enthusiasts
email	name	notes
	Martin Fowler	language workbenches.
	Kyle Lahnakoski
	dIon Gillard
	Roedy Green	The author of this essay.
	Bill Kress
	Jim Little
	Lew Maestas
	Fabien Duminy
	Steve Lewis
	Graham Perkins
	Robert Bossanyi
	Marcos Diez
	Chris Tutty
	John Bäckstrand
	Maxim Friedental
	Carl Rosenberger	Doing a SCID project with Java, C# and Smalltalk. They plan to be able to generate code in different languages from a common deep structure.
	Richard Mullins
	Hugh Doar
	Kimberley Burchett
	S. Saravanan
	David Rosenstrauch	Has completed the initial portion of a SCID project for Java. The app currently parses Java code while the user types it and then stores it in a database-like format. Project is currently on hold due to lack of time and money. Considering release as open source in the future. His work can be downloaded from darose.net
	Rohan Pall
	Don Gilmore & Jonathan Colt
	Ian	Has written a PHP (PHP: Hypertext Preprocessor) SCID and is working on a rewrite.
	Kirill Osenkov	His thesis is dedicated to building an experimental structured editor for C#. He’s enthusiastic about SCID, intentional programming etc. He believes a structured editor would be a nice front-end for a SCID and is building a structured editor framework for that purpose. www.osenkov.com www.guilabs.net

Fortress
IDE
IntentSoft: Dr. Charles Simonyi’s Intentional Programming
JavaML

This page is posted on the web at: http://mindprod.com/project/scid.html