SCID
©1996-2017 Roedy Green of Canadian Mind Products
Disclaimer
This essay does not describe an existing computer program, just one that should exist. This essay is about a suggested student project in
Java programming. This essay gives a rough overview of how it might work. I have no source, object, specifications, file layouts or anything
else useful to implementing this project. Everything I have prepared to help you is right here.
This project outline is not like the artificial, tidy little problems you are spoon-fed in school, where all the facts you need are included, nothing extraneous is mentioned and the answer is
fully specified, along with hints to nudge you toward a single expected canonical solution. This project is much more like the real world of messy problems, where it is up to you to fully
define the end point, or a series of ever more difficult versions of this project, and to research the information yourself to solve them.
Everything I have to say to help you with this project is written below. I am not prepared to help you implement it or give you any additional materials. I have too many
other projects of my own.
Though I am a programmer by profession, I don’t do people’s homework for them. That just robs them of an education.
You have my full permission to implement this project in any way you please and to keep all the profits from your endeavour.
Please do not email me about this project without reading the disclaimer above.
Java Source Code SCID-style browser/editor
“If builders built buildings the way programmers write programs, then the first woodpecker that came along would destroy civilization.”
~ Weinberg’s Second Law (1933-10-27 age:84)
“An invasion of armies can be resisted, but not an idea whose time has come.”
~ Victor Hugo (1802-02-26 1885-05-22 age:83), written 1852, Histoire d’un Crime
“I mean, source code in files; how quaint, how seventies!”
~ Kent Beck (1961 age:56), evangelist for extreme programming.
SCID (Source Code In Database). This is one of many student projects. We have been teaching our customers to regard their
data as a precious resource that should be milked and reused by finding many possible
ways of summarising, viewing and updating it. However, we programmers have not yet
learned to treat our source code as a similar structured data resource.
Programs are abstract structured data. They map better onto 3D visual structures than
they do onto linear streams of characters. We have to gently break the stranglehold of
the written language metaphor for programs before we can make any major progress. An
IDE (Integrated Development Environment) gingerly decorates text with graphics. We need to
evolve that into a SCID with graphics gingerly decorated with text, where the
dynamic graphics tell nearly all the story.
This is an enormous project, but you could start small. The basic idea is that you
pre-parse your code and put it in a database. The problem is that programs are getting huger
and huger. We need tools to help you temporarily ignore most of a program so you can
concentrate on your immediate needs. We need tools to rapidly navigate programs. We need
tools to help you get a mental picture of the forest before delving into the detail of the trees.
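As a very small starting point, the pre-parse idea might be sketched like this. Everything here is an assumption made for illustration: the class name TokenStore, the five token kinds and the in-memory list standing in for a real database. The toy lexer ignores string literals, so a // inside a string would be mistaken for a comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy "source code in database": the program is stored as typed tokens, not as text. */
class TokenStore {
    enum Kind { COMMENT, WORD, NUMBER, SPACE, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // One alternative per token kind; capture group order matches the Kind enum.
    private static final Pattern LEXER = Pattern.compile(
        "(//[^\\n]*|/\\*.*?\\*/)"         // 1: line or block comment
        + "|([A-Za-z_$][A-Za-z0-9_$]*)"   // 2: identifier or keyword
        + "|([0-9]+)"                     // 3: integer literal
        + "|(\\s+)"                       // 4: whitespace
        + "|(.)",                         // 5: any other single character
        Pattern.DOTALL);

    /** The pre-parse step: split raw text into rows for the store. */
    static List<Token> load(String source) {
        List<Token> rows = new ArrayList<>();
        Matcher m = LEXER.matcher(source);
        while (m.find()) {
            for (Kind k : Kind.values()) {
                String text = m.group(k.ordinal() + 1);
                if (text != null) {
                    rows.add(new Token(k, text));
                    break;
                }
            }
        }
        return rows;
    }

    /** One possible view: regenerate the text with comments hidden. */
    static String renderWithoutComments(List<Token> rows) {
        StringBuilder sb = new StringBuilder();
        for (Token t : rows) {
            if (t.kind != Kind.COMMENT) sb.append(t.text);
        }
        return sb.toString();
    }

    /** Another view: how often a given identifier is mentioned outside comments. */
    static int countWord(List<Token> rows, String word) {
        int n = 0;
        for (Token t : rows) {
            if (t.kind == Kind.WORD && t.text.equals(word)) n++;
        }
        return n;
    }

    public static void main(String[] args) {
        List<Token> rows = load("int x = 1; // counter\nx = x + 2;");
        System.out.println(renderWithoutComments(rows));
        System.out.println(countWord(rows, "x")); // prints 3
    }
}
```

A real SCID would store a full parse tree rather than a flat token list, but even this much is enough to show that a display is a query over stored structure rather than the stored thing itself.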
I have been talking up the SCID idea since the early 70s. Mostly
people have just hooted with derisive laughter. However, SCID-think is gradually catching
on. The RAD (Rapid Application Development) tools, such as Visual Café, IBM (International Business Machines)
Visual Age and Inprise JBuilder, let you write code to control the properties of widgets
on the screen by right clicking on visual elements to view the associated properties. You
can tick off entries in pop-up listboxes and checkboxes or fill in the blanks. This is an
important step away from thinking of programs strictly as linear streams of
ASCII (American Standard Code for Information Interchange)
characters. Java Studio lets you view and write Java code by playing plumber,
visually connecting JavaBeans.
I think it is a case that the shoemaker’s children have no shoes. Programmers in
creating source code in linear text files do the equivalent of keeping their accounting
books using a CP/M WordStar text editor. We would never dream of handing a
customer such error prone tools for manipulating such complicated cross-linked data as
source code. If a customer had such data, we would offer a GUI-based data entry system
with all sorts of point and click features, extreme data validation and ability to reuse
that data, view it in many ways and search it by any key.
Once you have your program pre-parsed, you can display the program in a
variety of ways. Here are just a few examples:
- Make the beginnings of methods more visually obvious. The plain Java syntax tends
to camouflage them. Ditto for temporary variable declarations.
- Hide comments.
- Show just loop structure.
- Show just code involved with class X.
- Hide all code that involves calls to the java.awt package.
- Highlight all code that does bit shifting or division.
- Highlight all uses of the sin method, but only in class Moses, not Math.
- Gray out all code that deals with error handling. All that is left is the normal
case.
- Colour code so that the darker the shade, the more frequently it is executed.
- Show me a switch statement as if it had been handled with a set of subclasses. There
is underlying deep structure here. I should be able to view the code as
if it had been done with switch or as if it had been done with polymorphism. Sometimes you are
interested in all the facts about Dalmatians. Sometimes you are interested in
comparing all the different ways different breeds of dogs bury their bones. Why
should you have to pre-decide on a representation that lets you see only one point of
view?
- Sometimes you want to see all the code concerning the save button. Other times you
want to see only code involving hooking up listeners for all buttons/menu items. Other
times you are interested only in code that affects layout. The SCID can hide
irrelevant code. It can also dynamically reorder code for the current purpose.
You no longer have to decide whether to bundle all your layout code together or to
bundle it with the corresponding button instantiation. You can have it both ways!
- Show me a set of subclasses as if the code had been handled by a
giant switch. This lets me compare the equivalent code in all subclasses. Similarly let
me compare just two subclasses. This is like a DIFF utility that notices the
differences between two methods and generates a single program that handles both,
taking advantage of commonalities.
- Show me the logic as if it had been written with a PET
decision table. Here you have a list of conditions, then a list of actions. The
SCID can ensure that all possible combinations are covered, and you can easily proofread the logic
originally coded in traditional nested if logic. You can write a decision table and
have the SCID convert it to nested if logic or a giant switch
indexed by concatenated binary condition representations. Here is a simple example:
CONDITIONS
kettleFull()    -  Y  Y  Y  N  N  N
makingTea()     N  Y  N  Y  Y  N  Y
makingCoffee()  N  N  Y  Y  N  Y  Y
ACTIONS
addWater()      -  -  -  -  X  X  X
boilWater()     -  X  X  X  X  X  X
addCoffee()     -  -  X  X  -  X  X
addTea()        -  X  -  X  X  -  X
Might generate code like this:
Better logic still would merge test conditions and actions to reduce code size and to
avoid computing a condition when it did not matter.
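As a sketch of the kind of code the kettle table above might turn into, here is the giant switch form, indexed by the three conditions concatenated as bits. The class name KettleTable is made up, and the method returns the action names as strings instead of calling addWater() and friends, purely so the result is easy to inspect. This is one plausible translation, not the output of any existing tool.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical translation of the kettle decision table into a switch. */
class KettleTable {

    /** Returns the actions, in table row order, for one combination of conditions. */
    static List<String> actionsFor(boolean kettleFull, boolean makingTea, boolean makingCoffee) {
        // Concatenate the conditions into a 3-bit index: kettleFull, makingTea, makingCoffee.
        int index = (kettleFull ? 4 : 0) | (makingTea ? 2 : 0) | (makingCoffee ? 1 : 0);
        switch (index) {
            case 0b000: // column 1: kettleFull is "don't care", nothing is being made
            case 0b100:
                return Collections.emptyList();
            case 0b110: // column 2
                return Arrays.asList("boilWater", "addTea");
            case 0b101: // column 3
                return Arrays.asList("boilWater", "addCoffee");
            case 0b111: // column 4
                return Arrays.asList("boilWater", "addCoffee", "addTea");
            case 0b010: // column 5
                return Arrays.asList("addWater", "boilWater", "addTea");
            case 0b001: // column 6
                return Arrays.asList("addWater", "boilWater", "addCoffee");
            case 0b011: // column 7
                return Arrays.asList("addWater", "boilWater", "addCoffee", "addTea");
            default:
                throw new IllegalStateException("unreachable index " + index);
        }
    }

    public static void main(String[] args) {
        System.out.println(actionsFor(false, true, false)); // prints [addWater, boilWater, addTea]
    }
}
```

Because all eight index values appear as cases, a reader (or the SCID) can check at a glance that no combination of conditions has been forgotten.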
- Let me switch rapidly back and forth between different representations of my code.
I would like to see a high level CASE view, e.g. Warnier-Orr diagram or your favourite
flavour of UML (Unified Modeling Language) or sequence diagrams and then zoom in on coding detail,
something like TogetherJ offers. Let me see a flow chart of the program’s basic
loop structure, then zoom in on part of it. When a project starts, typically all the
energy is focused on the UML, specifications and the bird’s eye view. As the
project progresses, the energy is focused on the detail and entropy gradually destroys
the high level documentation. It is not kept in sync. The high level documentation
becomes worse than useless to orient an incoming maintenance programmer. The spec, the
UML, the
high level stuff, the code, various levels of comment detail and the end user docs
must be more closely integrated so that you can
navigate at any level. All levels must be kept up
to date and in sync. The navigation function provides motive to keep the high level
docs accurate. On the other end of the spectrum, let me zoom down and examine the
byte code or machine code.
We make the error of thinking computer programs are primarily for communicating
with computers. On a project that requires more than one person, the source code is
primarily for communicating between people. The SCID
gives you a mechanism to record information only of interest to people and to help
you manage that information overload.
- With Java version 1.5 enums, you want an
aligned grid so you can study the enum constructors either by row or by column so
that you can compare enums for a certain property, or study the properties of one
particular enum. You would like the grid to behave as a table in
HTML (Hypertext Markup Language) or better
still as a spreadsheet, with non scrolling headings, adjustable column widths so you
can squeeze the most information on the screen without scrolling. It wraps within
cells. It would look something like this:
/**
* constants for Application categories
*/
public enum AppCat {
Application category enums
| Enum | | shortName | | description | | aliases | |
|---|---|---|---|---|---|---|---|
| APPLET | ( | Applet | , | Java Applet | , | applet | ), |
| APPLICATION | ( | application | , | Java application | , | application | ), |
| DOCUMENTATION | ( | documentation | , | documentation | , | documentation | ), |
| HYBRID | ( | hybrid | , | Java Applet that can also be run as an application | , | hybrid | ), |
| JWS (Java Web Start) | ( | Java Web Start | , | Java Web Start | , | jws, weblet, webstart, jaws | ), |
| LIBRARY | ( | Class | , | Class library | , | class, classes, library | ), |
| SERVLET | ( | servlet | , | Java Servlet | , | servlet | ), |
| UTILITY | ( | Utility | , | non-Java Utility | , | utility | ); |
}
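For comparison, the same enum as conventional linear Java source might read roughly as follows. The values come from the grid above. The constructor shape, the field names and the choice to pass the aliases as varargs are assumptions made for this sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * constants for Application categories
 */
enum AppCat {
    APPLET("Applet", "Java Applet", "applet"),
    APPLICATION("application", "Java application", "application"),
    DOCUMENTATION("documentation", "documentation", "documentation"),
    HYBRID("hybrid", "Java Applet that can also be run as an application", "hybrid"),
    JWS("Java Web Start", "Java Web Start", "jws", "weblet", "webstart", "jaws"),
    LIBRARY("Class", "Class library", "class", "classes", "library"),
    SERVLET("servlet", "Java Servlet", "servlet"),
    UTILITY("Utility", "non-Java Utility", "utility");

    private final String shortName;
    private final String description;
    private final List<String> aliases;

    // Assumed constructor: one parameter per grid column, aliases as varargs.
    AppCat(String shortName, String description, String... aliases) {
        this.shortName = shortName;
        this.description = description;
        this.aliases = Collections.unmodifiableList(Arrays.asList(aliases));
    }

    String shortName() { return shortName; }
    String description() { return description; }
    List<String> aliases() { return aliases; }

    public static void main(String[] args) {
        // Reading "by column": every description, one per enum constant.
        for (AppCat c : values()) {
            System.out.println(c + ": " + c.description());
        }
    }
}
```

In this form the columns only line up if someone pads them by hand, which is exactly the chore the grid view would take over.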
- Show me the definition of this variable or method just by clicking it.
- Tell me which classes and methods call this method and how many times. This
XREF (Cross Reference) is always up to
date because the source is in a database. It does not need to be scanned periodically
to create a fresh XREF.
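One way to picture an always-current XREF is an index that is adjusted at the moment a call is added to or removed from the stored program, so no rescan is ever needed. The sketch below assumes methods are identified by plain strings such as "Editor.save"; the class and method names are invented for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical cross-reference index, updated on every edit rather than by rescanning. */
class CrossReference {
    // callee -> (caller -> number of call sites)
    private final Map<String, Map<String, Integer>> callers = new TreeMap<>();

    /** Called by the editor whenever a call site is inserted. */
    void recordCall(String caller, String callee) {
        callers.computeIfAbsent(callee, k -> new TreeMap<>()).merge(caller, 1, Integer::sum);
    }

    /** Called by the editor whenever a call site is deleted. */
    void removeCall(String caller, String callee) {
        Map<String, Integer> sites = callers.get(callee);
        if (sites == null) return;
        // Returning null from the remapping function drops the entry entirely.
        sites.computeIfPresent(caller, (k, n) -> n > 1 ? n - 1 : null);
    }

    /** Which methods call this one and how many times. */
    Map<String, Integer> callersOf(String callee) {
        return callers.getOrDefault(callee, Collections.<String, Integer>emptyMap());
    }

    public static void main(String[] args) {
        CrossReference xref = new CrossReference();
        xref.recordCall("Editor.save", "File.write");
        xref.recordCall("Editor.save", "File.write");
        xref.recordCall("Main.run", "File.write");
        System.out.println(xref.callersOf("File.write")); // prints {Editor.save=2, Main.run=1}
    }
}
```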
- Optionally show me code with all class names fully qualified by package, or remove
that qualification for all or some classes.
- Tell me which classes and methods look at this variable and how many times.
Similarly, tell me which ones change it.
- Collapse/expand level of detail, e. g. collapse detail of CASE bodies, LOOP bodies,
IF/ELSE bodies, parameter details leaving just the names of the methods being called,
collapse purely arithmetic assignments.
- By using specially coded comments you can hide/reveal various classes of them. You
can hide code and just read comments, or perhaps just see the overview comments, or
just the comments explaining what the various classes are for etc. The key is to show
you just the level of detail in comments you need for the current task without being
overwhelmed with irrelevant detail. You could configure which categories of comments
you wanted to see fully expanded and which you wanted revealed only by hoverhelp and
which you wanted totally suppressed. By using hoverhelp to display comments you free up
screen real estate to see more code at once on screen. You could implement comment
hiding/hoverhelp without a SCID, in a smart traditional text editor, by using
markers to tag comments with level of detail and importance (or severity, as
Chuck Sheehan, the technique’s inventor, calls it).
Programmers very familiar with the code might be less likely to remove Javadoc or
complain about it, if they could get it off their screens. The main drawback
of doing this is out of sight, out of mind. Cowboy coders would be even less likely
to keep comments in sync with the code.
- When I write the code to call a method, show me the names, types and Javadoc for
the parameters.
- Show me the names of the parameters next to the parameters themselves in each
method invocation so I can proof-read it, the way you can in Ada-95:
drawCircle( x => point.x, y => point.y, radius => 5 );
or Modula-3:
drawCircle( x := point.x, y := point.y, radius := 5 );
or as Java-style comments:
drawCircle( /* x */ point.x, /* y */ point.y, /* radius */ 5 );
As I become familiar with certain methods, turn this expansion off for those methods
only. In a similar way, optionally expand/collapse calls with parameter type
information as well.
- When I type in an identifier the SCID
has never heard of, use spell check logic to suggest what I likely meant. Eclipse now
does this. I spend so much time correcting typos, variant abbreviations and errors in
capitalisation, e.g. Hashtable vs HashTable, when several variants are plausible.
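The spell check logic for identifiers could be as simple as ranking the known names by edit distance from what was typed. The sketch below uses the classic Levenshtein distance; the class name, the suggest method and the maxDistance cut-off are assumptions, and a real tool would probably weight case-only differences more lightly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;

/** Hypothetical "did you mean" helper for unknown identifiers. */
class IdentifierSuggester {

    /** Levenshtein distance: minimum number of single-character inserts, deletes and substitutions. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    /** Known names within maxDistance of what was typed, closest first, ties in alphabetical order. */
    static List<String> suggest(String typed, Collection<String> known, int maxDistance) {
        List<String> hits = new ArrayList<>();
        for (String name : known) {
            if (distance(typed, name) <= maxDistance) hits.add(name);
        }
        hits.sort(Comparator.comparingInt((String n) -> distance(typed, n))
                .thenComparing(Comparator.<String>naturalOrder()));
        return hits;
    }

    public static void main(String[] args) {
        List<String> known = Arrays.asList("Hashtable", "HashMap", "Thread");
        System.out.println(suggest("HashTable", known, 2)); // prints [Hashtable]
    }
}
```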
- Warn me if I reuse a name locally that is already defined as an instance or static
variable, except for the usual exceptions.
- Show me my declarations aligned in columns, perhaps using compact glyphs to
indicate static, instance, public etc. so that I can easily pick out parallels in names
and types.
- Let me see switch statements as if they had been coded in Eiffel as inspect
statements. Let me see declarations, expressions, loops, if nests etc, in my favourite
syntax, in any of the Algol family of languages such as: Eiffel, Ada-95, Java, Dylan,
Scheme, Algol-68, beta, Pascal, Delphi, Oberon, Modula, NetRexx, Python, Sather,
roll-your-own such as Abundance or even as flowcharts. You should be able to key code
in any of these modes too. The language would still be Java underneath, with a surface
veneer to simulate the coding conventions of these other languages.
- Show me the Javadoc comment for that parameter on demand.
- Global method renaming. Accurate, unambiguous method and variable naming is the
most underrated technique for writing maintainable code. Whenever you add a new method,
there is a strong possibility some existing similar method should be renamed so the
distinction between the two is more clear. Scope name clashes can be resolved to avoid
confusing programmers. Compilers have no trouble with accidentally duplicated names,
but programmers are easily befuddled. Globally renaming manually is so error-prone that
it is almost never done manually. With a SCID,
it would be effortless and completed in an eye blink. You could also do
generalisations of renaming, e.g. reordering parameters to some more consistent
standard, or adding overloaded methods to handle common default parameters and having
all code converted to use the new overloaded methods.
- Show me the program with the Spanish strings inserted. Show it to me with the
Spanish variable names where they are available, but use English ones where not. Let me
read it as if it were written purely for Spanish with any internationalisation
bubblegum housekeeping hidden.
- Show me the program with all needless () levels removed that our newbie programmer
put in. The () are not actually stored in the database. They are regenerated to suit
the individual programmer preference.
- Show me the program with extra parentheses () inserted because I can never remember the precedence distinctions between && and <.
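The two parenthesis views fall out naturally once the expression is stored as a tree: the brackets are never stored, only regenerated. The following sketch assumes a tiny expression model with left-associative binary operators and a made-up precedence table covering just a handful of them.

```java
/** Hypothetical expression tree with two renderings: minimal and fully parenthesised. */
class ParenViews {
    interface Expr { }

    static final class Leaf implements Expr {
        final String name;
        Leaf(String name) { this.name = name; }
    }

    static final class Binary implements Expr {
        final String op;
        final Expr left, right;
        Binary(String op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    }

    /** Higher number binds tighter. Only a few operators, for illustration. */
    static int precedence(String op) {
        switch (op) {
            case "||": return 1;
            case "&&": return 2;
            case "<": case ">": return 3;
            case "+": case "-": return 4;
            case "*": case "/": return 5;
            default: throw new IllegalArgumentException("unknown operator " + op);
        }
    }

    /** Only the parentheses the grammar requires. */
    static String minimal(Expr e) { return render(e, false); }

    /** Parentheses around every nested operation, for those who distrust precedence. */
    static String full(Expr e) { return render(e, true); }

    private static String render(Expr e, boolean everywhere) {
        if (e instanceof Leaf) return ((Leaf) e).name;
        Binary b = (Binary) e;
        return operand(b.left, b, false, everywhere) + " " + b.op + " "
                + operand(b.right, b, true, everywhere);
    }

    private static String operand(Expr child, Binary parent, boolean isRight, boolean everywhere) {
        String text = render(child, everywhere);
        if (child instanceof Leaf) return text;
        int c = precedence(((Binary) child).op);
        int p = precedence(parent.op);
        // Needed when the child binds more loosely, or equally on the right of a left-associative operator.
        boolean needed = c < p || (isRight && c == p);
        return (needed || everywhere) ? "(" + text + ")" : text;
    }

    public static void main(String[] args) {
        Expr e = new Binary("&&",
                new Binary("<", new Leaf("a"), new Leaf("b")),
                new Binary("<", new Leaf("c"), new Leaf("d")));
        System.out.println(minimal(e)); // prints a < b && c < d
        System.out.println(full(e));    // prints (a < b) && (c < d)
    }
}
```

Each programmer picks a rendering. The stored tree, and therefore the meaning of the program, is the same for both.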
- Show me the program with the Whazmotron custom glyph set so that I can easily pick
out if begin end, loop begin end, class begin end, method begin end.
- What classes are available to me at this point in the program? What local variables
are in scope?
- How should you display semicolons? Once you have the parse tree, they are purely
for the convenience of the humans reading the code. They are not actually in the
computer’s parse tree. You could display with any statement convention
you wanted. Every programmer could flip between any display mode they wanted.
- You might leave them out.
- You might use Pascal separator, Java-like terminator, or Eiffel-like
only-when-it-would-otherwise-be-confusing rules.
- You might use a pure indenting convention.
- You might draw boxes, or non-outlined boxes around each statement in a subtly
different shade from the background colour.
- You might use a special fat glyph, perhaps a little red stop sign, that is very
easy to tell apart from a colon.
- You should be able to ask, what methods are available at this point in the program
that produce a Zomblat object? What methods are available that take a Zomblat object as
a parameter? What methods are available that take both a Zomblat and a Color
object?
- What methods are available anywhere that produce a Zomblat object? What methods are
available that take a Zomblat object as a parameter?
- Show/hide the Eiffelian pre/post assertions. You can fill in dialog boxes about
each parameter, variable or return results. For ints you may specify the acceptable
low-high bounds. For strings you would specify whether they may be null or empty
("") and whether they may have lead/trail blanks. You might specify that they
must be all upper case, all numeric, all lower case, no accented letters etc. For
enumerations you would specify the list of allowable values. For debugging, you can
turn this code on to ensure all the conditions are being met.
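The information collected in those dialog boxes might end up as small rule objects that the generated checking code consults. This is only a sketch: the class names IntRange and StringRule, their fields and the checksEnabled switch are all invented here, and only the int and string cases are covered.

```java
/** Hypothetical rule objects behind the pre/post assertion dialogs. */
class ParameterRules {
    /** Debugging switch: when false, check() does nothing. */
    static boolean checksEnabled = true;

    /** Acceptable low-high bounds for an int parameter, variable or result. */
    static final class IntRange {
        final int low, high;
        IntRange(int low, int high) { this.low = low; this.high = high; }

        void check(String name, int value) {
            if (checksEnabled && (value < low || value > high)) {
                throw new IllegalArgumentException(
                        name + "=" + value + " is outside " + low + ".." + high);
            }
        }
    }

    /** Whether a string may be null, empty, or carry lead/trail blanks. */
    static final class StringRule {
        final boolean allowNull, allowEmpty, allowOuterBlanks;
        StringRule(boolean allowNull, boolean allowEmpty, boolean allowOuterBlanks) {
            this.allowNull = allowNull;
            this.allowEmpty = allowEmpty;
            this.allowOuterBlanks = allowOuterBlanks;
        }

        boolean accepts(String s) {
            if (s == null) return allowNull;
            if (s.isEmpty()) return allowEmpty;
            return allowOuterBlanks || s.equals(s.trim());
        }
    }

    public static void main(String[] args) {
        new IntRange(1, 12).check("month", 7);               // passes silently
        StringRule strict = new StringRule(false, false, false);
        System.out.println(strict.accepts(" Zomblat"));      // prints false
    }
}
```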
- When debugging, the SCID secretly captures information about where in the code
each string in the output came from so that you can click anything in the console
output and instantly jump to the System.out.println statement that generated it. It
should be easy, when debugging, to temporarily assign colours to the console output
from different classes or println statements to help
classify them similar to the way logging can be configured.
- Capture additional information about fields useful for data entry, such as low-high
bounds, blank if zero, left leading zero fill, commas, lists of legal values,
justification, natural layout parameters, field name or display, prompts, field widths,
validation routines,… Programs can access this data rather than specifying it
inline in the code. This keeps everything about the variable in one place where it can
be easily accessed and changed. It also facilitates searching for fields that share
some property and bulk replacing it.
- That was an ambiguous name for that method. Change it everywhere it is used to this
clearer name, but don’t change it where that same name is used in another class.
Computer, be clever, don’t pester me to figure out for you which ones should
be changed. IBM’s Visual Age can do this already. With a
database, a variable or method name string actually appears in only one place,
(everywhere else the name is represented by a pointer to that name), so it is trivial
to make a global change.
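The "name stored in only one place" point can be shown in a few lines. In this sketch the stored program is a list whose items are either fixed text or a reference to a Symbol object. That representation, and the names Symbol and render, are assumptions chosen to keep the example short.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stored program in which identifiers are pointers to a shared Symbol. */
class SymbolRename {
    /** The one and only place the name string lives. */
    static final class Symbol {
        String name;
        Symbol(String name) { this.name = name; }
    }

    /** Regenerates text: Symbol references are replaced by their current name. */
    static String render(List<Object> program) {
        StringBuilder sb = new StringBuilder();
        for (Object item : program) {
            sb.append(item instanceof Symbol ? ((Symbol) item).name : item.toString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two different things that happen to share the spelling "size".
        Symbol listSize = new Symbol("size");
        Symbol fontSize = new Symbol("size");
        List<Object> program = Arrays.asList(
                "n = list.", listSize, "() + font.", fontSize, ";");

        System.out.println(render(program)); // prints n = list.size() + font.size;
        listSize.name = "elementCount";      // the whole "global rename"
        System.out.println(render(program)); // prints n = list.elementCount() + font.size;
    }
}
```

The other "size" is untouched without any guessing, because it was never the same symbol in the first place.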
- Display the program using foreground/background color, font family, font size, font
style (bold, italic), lines, glyphs and icons to pack as much additional information on
the screen as possible. For example you might be able to tell a stack/temporary
variable from an instance variable, from a static variable or from a constant just by
looking at the font, or some slight shade of foreground or background colour difference,
e.g. dark brown, orangey/brown and light brown foreground. The clues may be almost
subliminal. You could encode all kinds of information compactly such as: local,
parameter, instance, static, my class, Sun class, type, package, class, definition,
keyword, final… all in a way that did not get in your face. You could encode for
whatever distinctions were important at the time.
Variable pitch fonts are possible without giving up alignment. They put more on
the screen and are more readable than fixed pitch fonts.
Exactly how these abilities are used will change constantly depending on your
current task. The idea is to encode information about symbols in their look.
- You could use the full colour abilities of the modern screen to give subliminal
clues, e.g. by automatically assigning a portion of the spectrum to each package/class
using pastel shades as the backgrounds to any references to methods or variables of
that class. You could bold face the definition of any identifier to make it stand out.
You could make calls to Sun code look different from calls to your own code.
- Chris Uppal suggested using colour coding by author. He noted that in every shop
where he had worked, there were programmers he could trust and ones he did not. If he
were attempting to understand or debug a chunk of code it would help if he knew which
stretches could be trusted to do what they claimed to do and which stretches warranted
more scrutiny.
- You could encode the age of code by colour. Generally the newest code is most
suspect if there is a problem. Sometimes old code, that was done before some
specification change occurred, needs to be examined and ticked off as compatible with
the new spec. You can use colour to help keep track of which code has been checked. A
SCID would know the age of every token to the millisecond, much finer resolution than could be pulled
off with CVS (Concurrent Versions System) deltas.
- You could ask that all code be filtered out unless it had to do with instantiating
objects (other than common ones like String). This skeleton view would give you a
pretty good overview of how all the classes fit together.
- You could ask to globally visit all references to a given method or variable and
tick them off once each was dealt with.
- You could do quite a bit of code writing by point and click. There is no need to
type a variable or method name, just select it from a palette of likely variable or
method names. You could type personal abbreviations for them and have them expanded.
You could view code with your personal names or the official ones. For example to write
a FOR loop there might be boxes you fill in for the various initializer, terminator and
incrementor expressions. They would default to int i=0,
i<n and i++. You could give the
loop a name to be displayed when its body is collapsed. You could convert the loop to
one that ran backwards by a single click, or to one that generated a WHILE or UNTIL
loop similarly. If you ticked enumeration, all you would need to
type is the name of the Enumeration generator. The rest would be generated
for you, accurately.
- Alternate display with common functions displayed as if they were infix operators
using special glyphs (simulated here with red). For example, instead of seeing:
if ( a.equals(b) )
you might see:
if ( a == b )
Instead of:
if ( a.compareTo(b) < 0 )
you might see:
if ( a < b )
- The SCID would act as
a Java lint, displaying suspicious or unusual code in a special colour and perhaps ask
you for confirmation when you inserted code of the form if (myString == abc) or if
(myBool = a & b).
- Show or hide explicit conversions.
- Display declarations in a grid so that it is easy to pick out the variable name, the
type and the initialisation. They line up nicely in columns like a spreadsheet,
possibly with each column separately scrollable so that you can see the big picture and
home in on the detail when you need it.
- Embed HTML comments in your code that render, complete with
diagrams and images when you read the code. There could be links to references to where
the algorithms were documented etc.
- Show me the code in pseudo NetRexx, Bali or JPython, with obvious declarations removed so I can
focus on the procedural logic, or vice versa. I would then see an enumeration iteration
written tersely as for r in reminders instead of the usual Java verbosity.
- Bali-style variable size parentheses. In Java a piece of code
might be displayed like this:
int a = ((b+c)/(e+f))*(g(i)+h);
That same piece of code displayed in Bali might look like this:
int a = ((b + c)/(e + f)) * (g(i) + h);
The red is just to highlight the outsized (), though
colour coding matching () and {} is not such a bad idea. It might even be optionally
displayed like this:
        b + c
int a = ----- * (g(i) + h);
        e + f
- Show get/set method invocations as if they were direct access to an associated
property variable, similar to Delphi or Eiffel. This simplifies the syntax. Instead of
seeing:
setFudge( getFudge()+1 );
you would see:
fudge++;
- Use colour to display literals to group digits by three for decimal and octal and
by 4 for hex, emphasising the trailing indicator char in a different colour so it does
not get confused with a digit. A listing might look like this:
In some Asian countries, decimal digits are also grouped by four. The
SCID would allow
either preference, defaulting to the locale default, so different people would see the
same code differently.
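Plain text cannot show colour, so the sketch below stands in for it with punctuation: groups are separated by underscores (which Java has accepted inside numeric literals since version 7) and a trailing L indicator is set off by a space. The class name and both methods are hypothetical, only integer literals are handled, and the group size would be a per-reader preference in a real SCID.

```java
/** Hypothetical display-time grouping of the digits in integer literals. */
class LiteralGrouping {

    /** Splits digits into groups of the given size, counted from the right. */
    static String group(String digits, int size) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < digits.length(); i++) {
            if (i > 0 && (digits.length() - i) % size == 0) sb.append('_');
            sb.append(digits.charAt(i));
        }
        return sb.toString();
    }

    /** Groups decimal and octal literals by three, hex literals by four. */
    static String display(String literal) {
        String suffix = "";
        char last = literal.charAt(literal.length() - 1);
        if (last == 'L' || last == 'l') {
            // Keep the trailing indicator visually apart from the digits.
            suffix = " " + last;
            literal = literal.substring(0, literal.length() - 1);
        }
        if (literal.startsWith("0x") || literal.startsWith("0X")) {
            return literal.substring(0, 2) + group(literal.substring(2), 4) + suffix;
        }
        return group(literal, 3) + suffix;
    }

    public static void main(String[] args) {
        System.out.println(display("1234567"));     // prints 1_234_567
        System.out.println(display("0xCAFEBABE"));  // prints 0xCAFE_BABE
        System.out.println(display("9000000000L")); // prints 9_000_000_000 L
    }
}
```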
- Display using lines or slight shade variations in background colour to mark the
bounds of ifs and loops. Programs would look more like flow charts, or more like text
with highlights, as the programmer preferred for the current purpose. Vertical
striation watermarks in the background would make it easier to see matching alignments.
You might draw thin vari-shaded boxes around each nested block. You might bracket
blocks with {} turned 90 degrees and made 10.16 cm (4 in)
wide. CSD is one such representation.
- Optionally apply Hungarian notation warts to variable names to indicate variable
type or scope. Turn them on and off at will. They are always accurate, e. g. Scope
prefixes might work like this:
- local a (e.g. aPoint)
- param p (e.g. pPoint)
- member instance m (e.g. mPoint)
- static s (e.g. sPoint)
- exception X (e.g. XOutOfBounds)
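Because the SCID already knows each variable's scope, the warts can be computed at display time instead of being typed. A minimal sketch, using the prefixes listed above; the enum and method names are assumptions:

```java
/** Hypothetical display-time Hungarian prefixes derived from a variable's scope. */
class ScopeWarts {
    enum Scope {
        LOCAL("a"), PARAM("p"), MEMBER("m"), STATIC("s"), EXCEPTION("X");

        final String prefix;
        Scope(String prefix) { this.prefix = prefix; }
    }

    /** The name as shown on screen; the stored name never changes. */
    static String display(String name, Scope scope, boolean wartsOn) {
        if (!wartsOn) return name;
        return scope.prefix + Character.toUpperCase(name.charAt(0)) + name.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(display("point", Scope.PARAM, true));           // prints pPoint
        System.out.println(display("outOfBounds", Scope.EXCEPTION, true)); // prints XOutOfBounds
        System.out.println(display("point", Scope.PARAM, false));          // prints point
    }
}
```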
- Highlight all code involving floating point. What I am talking about here is not
permanently highlighting floating point operators and operands, for
example, but highlighting them just for the next 10 minutes because floating point is
the thing I am concentrating on at the moment.
The syntax colouring schemes I am familiar with are designed to be done once and
left alone once you have them tweaked the way you like them.
For a SCID, you need not only ways to change the syntax
highlighting, but to rapidly flip between presets to enhance the
current interest and to suppress the current irrelevancies. You
also need ways to rapidly set up new interest constellations.
I use the word highlight in a broader sense. With a
GUI (Graphical User Interface) and SCID
you may use combinations of colour, font, size, glyphs, background colour, hiding,
folding, lines, geometric shapes, bold, italic, blink etc. etc. to make a certain
constellation of currently interesting features leap out at you. Different
interesting features would use different highlighting techniques to grab your
attention simultaneously.
- If you elected to view an IF as a flow chart, you could more easily compare the
true and false branches line by line side by side. With the modern GUI’s
ability to rapidly pan in 2D or even 3D, we should break the mindset that programs
have to be a single linear column of text. We can pay more attention
to what actually works for the eye, not what is easiest to code. I am pretty sure that
long horizontal lines of text, stretching all the way across the screen, so popular
now, will prove to be suboptimal. You might look for inspiration to website navigation
aids or the Windows ME exploding menus.
- You might create your own glyphs or icons to represent methods, classes, operators,
variables, syntactic elements etc. That way you can pack more information onto the
screen at once. You create a personal way of displaying the program to
yourself that no one else in the universe need be able to make sense of. You share the
underlying code syntax tree with your fellow programmers. The representation is
personal and evanescent.
- You want to see program flow under a certain set of circumstances. Code that would
not be executed when those circumstances don’t apply is temporarily suppressed
from the display. You are left with a simplified flow chart that shows execution flow.
You can focus on the usual case, then later view various pathological cases,
independently. You don’t have to deal with the full complexity of everything all
the time the way you do in conventional coding. The whole point of the
SCID is to
temporarily suppress what is temporarily irrelevant.
- Idiom expansion. There are many things in Java
that take reams of code to express. You can’t abbreviate it by writing a method.
Instead, you code an abbreviation, or fill in some blanks in a dialog box and it
generates the bubblegum for you, error-free.
- Idiom detection. Java is verbose, but tends to follow standard patterns called
idioms, e.g. enumerating a set, hooking up buttons to listen for events. The
SCID can detect
the pattern and replace it with an abbreviation for display. If code that looks like an
idiom refuses to abbreviate, you can be sure it is not quite the standard
idiom. That may be an error, or it may be deliberate. Code is much easier to
proof read this way. You don’t need your eyes to detect tiny variations from the
standard idioms.
- You want to be able to trace not only program flow but data flow. Consider a
program rendered like a flow chart, with parts of it suppressed. Lines show how a
particular datum flows through the system, how it gets operated on and modified. Code
that is not germane to that flow is temporarily suppressed. I am hand waving
frantically here. What the heck am I talking about? Consider a program where you
entered a birthdate. There are parts of the program that would be totally unaffected by
that birthdate. Those parts could be suppressed if you were concerned with how the
birthdate affects the program. There are degrees of association. A test on birthdate is
a little less associated than some code that transforms a birthdate ordinal into year,
month and day for display.
- Pale finals. I would like the SCID
to mark all variables that are not redefined with a pale
final to let me know I need not worry about subsequent redefinition of the
value. Similarly, I would like the SCID to mark all
methods and classes that are not overridden with a pale final
to let me know there is no redefinition of the method to worry about. These would be
generated dynamically, not part of the source code and could be turned on or off. They
would look like regular finals, except would be displayed in a pale colour to indicate
their ghostly nature. They would not prevent me from redefining the variable or the
method. The pale final would simply disappear.
- A SCID might colourise a final declaration in a
distinctive way whether or not it is explicitly marked final.
- Display complicated expressions in true mathematical form, much as TeX or
the Microsoft equation editor would display them, with variable sized parentheses and denominators under numerators.
To help understand expressions, you could ask bits of them to collapse on screen. A
simple version would adjust the amount of space around each operator to indicate
relative precedence. Low
precedence operators would be surrounded by more space. Java has 13 levels of
precedence, but you would not normally find them all in one expression. The
relative distinctions in spacing could be obvious. You would
not have a fixed amount of space for each precedence level. You
could also display complex expressions as a parse tree, with operators at the nodes.
What you see need not have that much relation to what you type. For example, you
could type GT for > since for
some people it is easier to type.
- Let you refactor code by breaking up methods into smaller ones. You just highlight
the hunk of inline code you want made into a separate method. In theory it should even
be possible to automatically determine the parameters and their types. The system could
then go looking for code that does inline what your new methods do and replace that
inline code with the new encapsulated calls.
- Lets you select colours within the SCID
using a ColorChooser. Colours and variables/constants
representing colours in code can be represented in any combination of three ways:
- By colour sample swatch.
- By colour name.
- By colour number (hex/decimal).
- Exploit new high res 1 metre square LCD (Liquid Crystal Display)
or gas plasma panels so you have room to see everything at once, visually navigating
your way around the entire code space, rather than peeking at it through a toilet
tube the way we do now.
- With everything preparsed, writing your own custom code conversions would be a lot
easier. For example, you might write a translator from AWT (Advanced Windowing Toolkit)
to Swing code.
- True visual editing. Your GUI
program looks like the final screen output. You right click on any component which
brings up a dialog box from which you can change colour, font, border,
initialisation, associated event handlers… mostly by ticking off boxes and
making multiple choice selections. Your program always works to some degree since you
can’t select anything syntactically invalid. Navigation is far easier, since
you don’t even need to remember the names of things. You can, of course, find out
the names of things by right clicking them. Code becomes far less procedural and more
OO (Object Oriented).
- A supermarket parking lot helps its customers find their
cars by posting signs with animals on them in various parts of the parking lot. It is
much easier for someone to remember they left their car near the salmon than in sector
E6. Similarly you could embed landmark symbols in the code, perhaps with purely
personal meaning and only visible to one programmer, just to help her find her way
around. You could click on the landmark symbol
to get back to a section of code you were working on recently.
- For importing ordinary Java source code, a parser such as JavaCC or
ANTLR (Another Tool for Language Recognition) might be useful. See
parser in the Java & Internet Glossary.
- Ability to add shortcuts to the syntax such as Abundance-style moods and for-each
loops. Instead of saying x.keyin(); y.keyin(); z.keyin(); you
can declare keyin as a mood and say: keyin x, y, z; You could say things like: for (
MyArray) { MyArray.x += MyArray.y + 1; } to run through all the elements of the
array and provide an implied default subscript inside the loop body. Collection
iteration could be much terser as it is in most modern languages.
You can invent your own language shortcuts. No other programmer need view them.
They would see perfectly standard Java. However, when you viewed their code, your
shortcuts would be applied so you would not have to wade through their reams of
dinosaurian repetition. You might for example add case ranges to the switch,
implemented with a binary search. If your shortcut got in the way, you could drop it,
and instantly see standard Java again. After all, this is software,
right? It is supposed to be malleable and comfortable. With traditional coding
techniques software becomes so rigid. It is harder to change than the supposed
hardware.
- I suspect that SCIDs (Source Code In Databases)
will create a revolution in terseness of language design. It will come about gradually
like this. A SCID will give you the ability to temporarily hide
bookkeeping/busywork/plumbing/wiring (pick your analogy) details. Next will come the
call for the SCID to automatically generate those details. Next will come
the complete suppression of them from the application programmer’s awareness.
They will be hidden completely behind the walls, no longer part of the day-to-day
application programming language. Every time you can suppress 30% of the busywork, a whole new set of patterns emerges that were
formerly obscured by all that fussy detail. Suddenly, you discover new ways to more
tersely specify your desires to the computer. You see new levels of
bookkeeping/busywork/plumbing/wiring that can be similarly hidden, revealing still more
deep structure. This is not just speculation. I have seen this process in the evolution
of my own Forth-based language, Abundance.
- SCIDs
will also have another influence on language design. Right now we are stuck in a
mindset that a computer language is a linear stream of vanilla 7-bit
ASCII characters. SCIDs
will loosen that up. We are already seeing that with tools like the Symantec layout
editor. A program can be a diagram in 2D. Font style can have
semantic meaning. Noam Chomsky might put it this way: programs may have many temporary
surface representations of a single deep structure. We will see multiple alias names
for variables, multilingual variants of the Strings that can be flipped with the click
of a menu item.
- A SCID could
potentially store a lot more information (not normally visible) than a text file
representation would. For example, you could fairly easily automatically record who
changed each individual element in each line of code, why and when and as part of which
job layer. See dynamic
version control for what I mean by job layer. Global renamings would be labeled as
such, not as a million separate little transactions. This information could also be
used by the boss to track precisely what a telecommuting employee did during the week.
You could add all the commentary you wanted without worrying about overwhelming the
reader for whom it was irrelevant. The other information you could track at each node
is who has access to look at or change that piece of code.
- For some notes on how a SCID might be implemented so that many users could be
updating the same code simultaneously from several globally dispersed sites, again see
dynamic version
control.
- If ever Microsoft (the inventors of the dancing paperclip) gets hold of the
SCID concept, I
suspect they will totally misunderstand it. A SCID
will be a 3D simulation of a Disneyland ride where you passively watch transactions
being processed by a McDonald’s hamburger machine to the endlessly repeating
strains of It’s A Small Small World. To debug, you watch the
individual bits shaped like French fries being cooked, salted then added, get it? It takes only 10 seconds for the animation to complete
the addition of two numbers.
- Have a look at Visual
SlickEdit. It is not a SCID, but with every release it develops more and more
SCID-like features.
- I have written an essay on online books. I propose a SCID-like solution to
handling the problem of information overload in technical documentation.
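The precedence-based spacing idea from the list above could be sketched roughly as follows. This is only an illustration, not part of any existing SCID: the Expr, Leaf, BinOp and PrecedenceSpacing classes are hypothetical, the precedence table covers just a handful of binary operators, and the tree is assumed to be shaped so that no parentheses are needed. Spacing is ranked by the operators actually present in the expression rather than fixed per precedence level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical expression node: a leaf (name or literal) or a binary operator. */
abstract class Expr {
    /** Adds the precedence level of every operator in this subtree. */
    abstract void collectLevels(TreeSet<Integer> levels);

    /** Renders the subtree; ranked lists the levels present, tightest-binding first. */
    abstract String render(List<Integer> ranked);
}

class Leaf extends Expr {
    final String text;
    Leaf(String text) { this.text = text; }
    @Override void collectLevels(TreeSet<Integer> levels) { }
    @Override String render(List<Integer> ranked) { return text; }
}

class BinOp extends Expr {
    // Assumed subset of Java's binary operators; a higher number binds tighter.
    static final Map<String, Integer> PRECEDENCE = Map.of(
            "||", 1, "&&", 2, "==", 3, "<", 4, "+", 5, "-", 5, "*", 6, "/", 6);

    final String op;
    final Expr left;
    final Expr right;

    BinOp(String op, Expr left, Expr right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }

    @Override void collectLevels(TreeSet<Integer> levels) {
        levels.add(PRECEDENCE.get(op));
        left.collectLevels(levels);
        right.collectLevels(levels);
    }

    @Override String render(List<Integer> ranked) {
        // The tightest operator present gets 0 spaces, the next tightest 1, and so on.
        int pad = ranked.indexOf(PRECEDENCE.get(op));
        String gap = " ".repeat(pad);
        return left.render(ranked) + gap + op + gap + right.render(ranked);
    }
}

class PrecedenceSpacing {
    static String format(Expr e) {
        TreeSet<Integer> levels = new TreeSet<>();
        e.collectLevels(levels);
        // descendingSet() puts the tightest-binding level at index 0.
        return e.render(new ArrayList<>(levels.descendingSet()));
    }

    public static void main(String[] args) {
        Expr e = new BinOp("<",
                new BinOp("+", new BinOp("*", new Leaf("a"), new Leaf("b")), new Leaf("c")),
                new Leaf("d"));
        System.out.println(format(e)); // prints: a*b + c  <  d
    }
}
```

Because the ranking is relative, an expression containing only + would be shown with no padding at all, while the same + gains a space as soon as a tighter operator such as * appears alongside it.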
How Might You Implement a SCID
Instead of the traditional CVS
or editor model, where you have lines of ASCII
text, you would have a tangled hairball of objects, one object for each token, e.g. IF,
variable reference, method definition. The objects would have pointers to each other so
you can rapidly find related information and rapidly navigate the program at any level of
detail. References to a variable would not contain the name of the variable, just a
pointer to its associated token object. The actual string name of a variable or method
would appear in only one place. (This makes global rename and aliasing trivial.)
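A minimal sketch of that idea, assuming hypothetical Symbol and Token classes (nothing here comes from an actual SCID implementation): each reference token holds a pointer to its Symbol rather than a copy of the name, so a global rename is a single assignment.

```java
import java.util.List;

/** Hypothetical declaration node: the only place the name string is stored. */
class Symbol {
    private String name;
    Symbol(String name) { this.name = name; }
    String getName() { return name; }
    void rename(String newName) { this.name = newName; }
}

/** Hypothetical token node: either fixed text or a pointer to a Symbol. */
class Token {
    private final String fixedText; // keywords, operators, literals; null for references
    private final Symbol symbol;    // referenced variable or method; null for fixed text

    private Token(String fixedText, Symbol symbol) {
        this.fixedText = fixedText;
        this.symbol = symbol;
    }

    static Token literal(String text) { return new Token(text, null); }
    static Token reference(Symbol s) { return new Token(null, s); }

    /** The name is looked up at display time, never stored in the reference. */
    String display() { return symbol != null ? symbol.getName() : fixedText; }
}

class TokenDemo {
    static String render(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (Token t : tokens) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(t.display());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Symbol total = new Symbol("tot");
        List<Token> statement = List.of(
                Token.reference(total), Token.literal("="),
                Token.reference(total), Token.literal("+"),
                Token.literal("1"), Token.literal(";"));
        System.out.println(render(statement)); // prints: tot = tot + 1 ;
        total.rename("total");                 // one change, seen by every reference
        System.out.println(render(statement)); // prints: total = total + 1 ;
    }
}
```

Aliasing would follow the same pattern: a Symbol could carry several display names and the viewer would pick one.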
There are TreeMaps so you can find symbols by name or approximate name or by
name/property combinations.
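One way the name lookups might work with java.util.TreeMap is sketched below. The SymbolIndex class is hypothetical, declarations are represented by plain strings for brevity, and "approximate name" is interpreted here as prefix search or nearest-following name, which is only one of several possible meanings.

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical index from symbol name to a description of its declaration. */
class SymbolIndex {
    private final TreeMap<String, String> byName = new TreeMap<>();

    void add(String name, String declaration) {
        byName.put(name, declaration);
    }

    /** Exact lookup by name; null if the name is unknown. */
    String exact(String name) {
        return byName.get(name);
    }

    /** All symbols whose names start with the prefix, in sorted order. */
    SortedMap<String, String> startingWith(String prefix) {
        // Keys with this prefix sort between prefix (inclusive) and
        // prefix + '\uffff' (exclusive), ignoring names containing '\uffff' itself.
        return byName.subMap(prefix, prefix + Character.MAX_VALUE);
    }

    /** The first name at or after the given text in sorted order, or null if none. */
    String nearestAtOrAfter(String text) {
        return byName.ceilingKey(text);
    }

    public static void main(String[] args) {
        SymbolIndex index = new SymbolIndex();
        index.add("count", "int count");
        index.add("counter", "Counter counter");
        index.add("colour", "Color colour");
        index.add("total", "long total");
        System.out.println(index.startingWith("cou").keySet()); // prints: [count, counter]
        System.out.println(index.nearestAtOrAfter("coun"));     // prints: count
    }
}
```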
There is no source code, just the parse tree. You are thus free to display it in many
different possible formats, or to export traditional Java source. The parse tree
always represents a syntactically valid Java program.
The parse tree contains much more data than the equivalent source code, e.g. history
of change, who changed each token and why.
The parse tree is RAM-resident, or stored in a decent persistent object database that
approaches RAM-resident performance, such as Objectstore. Even for a purely RAM-resident
implementation, the data must be persisted, that is, dumped to disk and restored
as a lump with all the interconnections intact. Execution (but not startup) would be
faster than using a POD (Persistent Object Database). You need to log transactions to disk, but everything else
lives in a giant virtual RAM (Random Access Memory) space. Someday we will learn to snapshot entire virtual
address spaces and pick up later exactly where we left off.
I repeat, the parse tree always represents a syntactically valid program. It might not
necessarily do anything sensible, but it would compile.
Changes to the parse tree are applied in the form of atomic transactions to ensure the
integrity of the tree cannot be compromised.
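As a rough sketch of the all-or-nothing rule, the hypothetical GuardedTree class below applies each edit to a private copy and commits it only if the result passes a validity check. The list of token strings stands in for the real parse tree, and the balanced-parentheses rule is merely a toy substitute for a real syntax check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Hypothetical guarded tree: edits either commit completely or leave no trace. */
class GuardedTree {
    private List<String> tokens;                   // stand-in for the real parse tree
    private final Predicate<List<String>> isValid; // stand-in for a real syntax check

    GuardedTree(List<String> initial, Predicate<List<String>> isValid) {
        if (!isValid.test(initial)) {
            throw new IllegalArgumentException("initial tree is invalid");
        }
        this.tokens = List.copyOf(initial);
        this.isValid = isValid;
    }

    /** Runs the edit on a mutable copy and commits only if the result is valid. */
    synchronized boolean apply(UnaryOperator<List<String>> edit) {
        List<String> candidate = edit.apply(new ArrayList<>(tokens));
        if (!isValid.test(candidate)) {
            return false; // rejected: the committed tree is untouched
        }
        tokens = List.copyOf(candidate);
        return true;
    }

    /** The currently committed, immutable state. */
    synchronized List<String> snapshot() {
        return tokens;
    }

    /** Toy validity rule: parentheses must balance. */
    static boolean balanced(List<String> toks) {
        int depth = 0;
        for (String t : toks) {
            if (t.equals("(")) {
                depth++;
            } else if (t.equals(")") && --depth < 0) {
                return false;
            }
        }
        return depth == 0;
    }
}
```

For example, with a tree built from List.of("f", "(", "x", ")") and GuardedTree::balanced, an edit that removes only the closing parenthesis returns false and changes nothing, while an edit that adds a matching pair is committed.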
Other sorts of auxiliary data may be stored in a conventional
SQL (Structured Query Language) database where it would be accessible to user-written
queries. However, the source code itself has too complex a structure to fit into the
row-column SQL
model.
There is a log of transactions that can be replayed in event of failure, or analysed
to recreate the dynamic change history. You can play the log forward or back. The
advantage of this log is that even in the event of catastrophic failure you would never
lose more than a few seconds worth of keying.
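A log that can be played forward or back might look something like the sketch below. The LogEntry and ReplayableLog classes are hypothetical, a map from node id to text stands in for the parse tree, and writing the entries to disk is left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One logged change: a node's text went from before to after (null means absent). */
class LogEntry {
    final int nodeId;
    final String before;
    final String after;

    LogEntry(int nodeId, String before, String after) {
        this.nodeId = nodeId;
        this.before = before;
        this.after = after;
    }
}

/** Hypothetical in-memory log; a real one would also append each entry to disk. */
class ReplayableLog {
    private final List<LogEntry> entries = new ArrayList<>();
    private int cursor = 0; // entries before this index are currently applied
    private final Map<Integer, String> nodes = new HashMap<>(); // stand-in for the tree

    /** Records a new change at the end of the log and applies it. */
    void record(int nodeId, String newText) {
        while (forward()) { /* catch up to the end of the log first */ }
        entries.add(new LogEntry(nodeId, nodes.get(nodeId), newText));
        forward();
    }

    /** Applies the next entry; returns false when already at the end of the log. */
    boolean forward() {
        if (cursor == entries.size()) return false;
        LogEntry e = entries.get(cursor++);
        put(e.nodeId, e.after);
        return true;
    }

    /** Undoes the most recently applied entry; returns false at the start of the log. */
    boolean back() {
        if (cursor == 0) return false;
        LogEntry e = entries.get(--cursor);
        put(e.nodeId, e.before);
        return true;
    }

    private void put(int id, String text) {
        if (text == null) nodes.remove(id); else nodes.put(id, text);
    }

    String text(int nodeId) {
        return nodes.get(nodeId);
    }
}
```

Storing both the old and the new value in each entry is what makes backward play possible without recomputing the tree from the beginning.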
When you get around to implementing dynamic version control, this transaction log must
be sent to a central site and merged in real time with transactions of other
people’s changes, then redistributed to all the redundant hot copies of the
database. This implies a 24-hour Internet connection between all the programmer sites, or
at least while any programmer is active at a site. The key is that all copies of the database
must process all the transactions in the exact same order. For speed you might process
local transactions immediately then back them out if it turns out there were transactions
from other sites that actually needed to be processed first. For more detail on how that
might work see dynamic version
control.
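The "process immediately, then back out if something earlier turns up" scheme could be sketched like this. The Tx and Replica classes are hypothetical, a StringBuilder that transactions append to stands in for the database, and the sketch simply assumes every transaction already carries a globally agreed sequence number; how the central site assigns that number is not addressed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical transaction: appends text, ordered by an assumed global sequence number. */
class Tx {
    final long seq;
    final String text;

    Tx(long seq, String text) {
        this.seq = seq;
        this.text = text;
    }
}

/** Hypothetical replica that ends in the same state whatever the arrival order. */
class Replica {
    private final Deque<Tx> applied = new ArrayDeque<>();    // most recent on top
    private final StringBuilder state = new StringBuilder(); // stand-in for the database

    void receive(Tx tx) {
        Deque<Tx> backedOut = new ArrayDeque<>();
        // Back out everything that should have been processed after the new arrival.
        while (!applied.isEmpty() && applied.peek().seq > tx.seq) {
            Tx later = applied.pop();
            state.setLength(state.length() - later.text.length()); // undo its append
            backedOut.push(later);
        }
        apply(tx);
        // Reapply the backed-out transactions, earliest first.
        while (!backedOut.isEmpty()) {
            apply(backedOut.pop());
        }
    }

    private void apply(Tx tx) {
        state.append(tx.text);
        applied.push(tx);
    }

    String state() {
        return state.toString();
    }

    public static void main(String[] args) {
        Replica site = new Replica();
        site.receive(new Tx(1, "a"));
        site.receive(new Tx(3, "c")); // local change, applied optimistically
        site.receive(new Tx(2, "b")); // arrives late, so "c" is backed out and redone
        System.out.println(site.state()); // prints: abc
    }
}
```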
Books
- Doing Hard Time: Developing Real-Time Systems with UML, Objects, Frameworks and Patterns, by Bruce Powel Douglass. Addison-Wesley, published 1999-05-21. ISBN 978-0-321-77493-4 (paperback), 978-0-201-49837-0 (hardcover). On the ROPES software-development method built into Rhapsody.
- Software Engineering Environments: Automated Support for Software Engineering, by Alan W. Brown. McGraw-Hill, published 1993-03. ISBN 978-0-07-707432-6 (paperback).
- Object Oriented Databases: and Their Applications to Software Engineering, by Alan W. Brown. McGraw-Hill, published 1991-08. ISBN 978-0-07-707247-6 (paperback). He describes, among other things, the ECMA Portable Common Tools Environment (which is a spec rather than a product) and how a few actual CASE tools match up to it.
Real World SCID Implementations
The usual reaction I get from
programmers when I mention SCIDs
is that they have tried them and they hate them. What they have tried are coding
templates where you fill in the blanks. These stop you from coding in the old way,
yet offer almost no payback. Granted SCIDs
will force you to rethink how you compose programs. Code must at all
times be 100% syntactically correct. However, a good
SCID will pay back
100 fold for this inconvenience. If you try to import or paste code that is not correct,
you will find much of it being turned into a special kind of comment.
- Symade Semantic Oriented Programming
- SCIDs
are not a totally mythical beast. Smalltalk and Logo programmers have been using them
for a long time. IBM’s Visual Age Java compiler uses a
SCID, though
they backed off somewhat with its successor, Eclipse. SCID
users are very enthusiastic about them, even though I think the current crop of tools
have just begun to exploit the possibilities. Jade stores its code as a preparsed tree. Mozart
develops the idea of concept programming where you create application specific
syntax.
- Lisp has been treating programs as structured data for many years.
- See Martin Fowler’s work on Refactoring. His ideas on automated source
transformations require analysing the code as a parse tree.
- Every version of Slickedit
comes out with a more and more SCID-like user interface.
- The Xerox Parc people have been experimenting with a new way of organising Java
programs called Aspect Oriented
Programming as a way of specifying facts in only one place declaratively rather than
by sprinkling them redundantly throughout the procedural code. Doing that makes code
much easier to maintain. Declaratively specifying a huge amount of information that is
traditionally handled procedurally is the key to my own computer language, Abundance,
whose primary design goal was ease of maintenance. You can specify information
declaratively and automatically generate the corresponding procedural Java bubblegum.
- Microsoft had an Intentional Programming project. An intention is the core essence of a program once you strip out the
housekeeping bubblegum that is necessary to explain picky implementation details the
language/tool cannot handle on its own. Once the programmer has formed an executable
thought, the programmer’s next question is not “what do I
have to say to get the computer to do this?”, as was the case in traditional
programming, but “what do I insist on saying?”. Intentions
are the program spec plus sufficient detail to specify how you want the problem
solved.
- SCIDs
could be used, much like Bell Labs’ SeeSoft,
to generate bird’s eye graphical displays of the entire project that use colour
coded pixels to indicate such things as code age or hot spots where a profiler
determines code spends most of its time executing. You can zoom in on interesting
places to see the actual code. Other things you can colour code with pixels or
coloured background include: code I have recently changed, code others have recently
changed, code that was changed during some time period where a problem first showed
up, code that is frequently changed, code that makes use of a certain class or
method, and where the comments are densest. Basically any metric you can compute from the
parse tree representation can be expressed as colour.
Colouring for absolute frequency of execution points out areas that could benefit
from optimisation. Colouring for relative frequency of execution helps you pick out
the most common paths through the code, i.e. what happens in the usual case.
- Jim Little’s Prism project seeks to find a representation for
SCID data that
can be shared by different programs. That way you could build your
SCID system up
out of pluggable components.
- You might mine the i-Logix Rhapsody project for ideas
on visual programming. It is a diagrammatic code generator for C++. It is based on UML
(the high-level language for real-time, multitasking systems) and i-Logix Statemate.
The idea behind Rhapsody is to make the documentation executable. And the
documentation is in the form of a number of diagrams you draw. i-Logix’s Statemate
uses enhanced bubble charts that, to paraphrase the Buick commercial, are not your
father’s bubble charts. Briefly, they allow an action upon entry to a state,
while in a state and upon exit. Further, exits from a state can branch conditionally
and a sub-machine can remember its last state to pick up where it left off upon
re-entry. There’s more, but suffice it to say that Statemate is very
powerful.
- CodeGuide was an
IDE
that took more of a plunge in the SCID
direction than usual.
- The i programming language is reputed to be SCID
friendly.
- OpenJava can also be regarded as
a toolkit for constructing a Java preprocessor.
- Jatha is a simple
preprocessor for Java that is inspired by the power of Lisp macros. It is released
under the GPL (GNU General Public License).
- Juliet lets you ask SCID-like
questions about your source code and rapidly navigate it. It is not an editor, just a
browser.
- Aubjex Alajava was a technology that transforms Java
code into an especially efficient and complete database form, with generalized
capabilities that do for Java source code what database query and manipulation
products do for business data. Author Don Gilmore writes "Aubjex is built on
SCID. It can
parse the entire Java version 1.4 java source package in 30
seconds, into a database that maintains all information. We have hundreds of
XML (eXtensible Markup Language) scripts that query and manipulate the database. There
is a dataflow scripting tool for creating new scripts, although it is not yet
documented."
- There is a SCID discussion group hosted by Google Groups. To get on,
send an email to brightone@o2.pl. You will need to
create a Google Groups account. Then you could visit the scid group.
The moderator is Polish and the host is google.pl, but go ahead and
post in English.
- The following people have expressed interest in writing a
SCID. You might
get together with them on a combined project. Email me to add your
name to the list.
Unfortunately, the email addresses below are not clickable. Further, you cannot copy/paste them into your email program. You must manually re-type them. The email addresses are graphic *.png images created by Masker. I inconvenience you this way to discourage spammers from harvesting email addresses from the website with automated website spidering.
SCID Enthusiasts
| email | name | notes |
| | Martin Fowler | language workbenches. |
| | Kyle Lahnakoski | |
| | dIon Gillard | |
| | Roedy Green | The author of this essay. |
| | Bill Kress | |
| | Jim Little | |
| | Lew Maestas | |
| | Fabien Duminy | |
| | Steve Lewis | |
| | Graham Perkins | |
| | Robert Bossanyi | |
| | Marcos Diez | |
| | Chris Tutty | |
| | John Bäckstrand | |
| | Maxim Friedental | |
| | Carl Rosenberger | doing a SCID project with Java, C# and Smalltalk. They plan to be able to generate code in different languages from a common deep structure. |
| | Richard Mullins | |
| | Hugh Doar | |
| | Kimberley Burchett | |
| | S. Saravanan | |
| | David Rosenstrauch | Has completed the initial portion of a SCID project for Java. The app currently parses Java code while the user types it and then stores it in a database-like format. Project is currently on hold due to lack of time and money. Considering release as open source in the future. His work can be downloaded from darose.net |
| | Rohan Pall | |
| | Don Gilmore & Jonathan Colt | |
| | Ian | Has written a PHP (PHP: Hypertext Preprocessor) SCID and is working on a rewrite. |
| | Kirill Osenkov | His thesis is dedicated to building an experimental structured editor for C#. He’s enthusiastic about SCID, intentional programming etc. He believes a structured editor would be a nice front-end for a SCID and is building a structured editor framework for that purpose. www.osenkov.com www.guilabs.net |