SCID
by Roedy Green ©1996-2008 Canadian Mind Products
This essay is about a suggested student project in Java programming. This essay gives a rough overview of how it might work. It does not describe an actual complete program. I have no source, object, specifications, file layouts or anything else useful to implementing this project. Everything I have to say to help you with this project is written below. I am not prepared to help you implement it; I have too many other projects of my own.

I do contract work for a living, which could include writing a program such as this. However, I don’t do people’s homework for them. That just robs them of an education.

You have my full permission to implement this project any way you please.

“If engineers built buildings the way computer programmers designed buildings, the first woodpecker that came along would destroy civilisation.”
~ anonymous

Java Source Code SCID-style browser/editor
Where the Rubber Meets the Roed

“An invasion of armies can be resisted, but not an idea whose time has come.”
Victor Hugo — Histoire d’un Crime, 1852
I mean, source code in files; how quaint, how seventies!
~ Kent Beck
SCID means Source Code In Database. This is one of many student projects. We have been teaching our customers to regard their data as a precious resource that should be milked and reused by finding many possible ways of summarising, viewing and updating it. However, we programmers have not yet learned to treat our source code as a similar structured data resource.

This is an enormous project, but you could start small. The basic idea is that you pre-parse your code and put it in a database. The problem is that programs are getting huger and huger. We need tools to help you temporarily ignore most of a program so you can concentrate on your immediate needs. We need tools to rapidly navigate programs. We need tools to help you get a mental picture of the forest before delving into the detail of the trees.

I have been talking up the SCID idea since the early 70s. Mostly people have just hooted with derisive laughter. However, SCID-think is gradually catching on. The RADs, such as Visual Café, IBM Visual Age and Inprise JBuilder, let you write code to control the properties of widgets on the screen by right-clicking on visual elements to view the associated properties. You can tick off entries in pop-up listboxes and checkboxes or fill in the blanks. This is an important step away from thinking of programs strictly as linear streams of ASCII characters. Java Studio lets you view and write Java code by playing plumber, visually connecting JavaBeans.

I think it is a case of the shoemaker’s children having no shoes. Programmers who create source code in linear text files do the equivalent of keeping their accounting books with a CP/M WordStar text editor. We would never dream of handing a customer such error-prone tools for manipulating data as complicated and cross-linked as source code. If a customer had such data, we would offer a GUI-based data entry system with all sorts of point and click features, extreme data validation, and the ability to reuse that data, view it in many ways, and search it by any key.

Once you have your program pre-parsed, you can display the program in a variety of ways. Here are just a few examples:

How Might You Implement a SCID

Instead of the traditional CVS or editor model, where you have lines of ASCII text, you would have a tangled hairball of objects, one object for each token, e.g. IF, variable reference, method definition. The objects would have pointers to each other so you can rapidly find related information and rapidly navigate the program at any level of detail. References to a variable would not contain the name of the variable, just a pointer to its associated token object. The actual string name of a variable or method would appear in only one place. (This makes global rename and aliasing trivial.)
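
To make the "name in only one place" idea concrete, here is a minimal sketch. The class names Symbol, Reference and ScidRenameDemo are hypothetical, invented purely for illustration; a real SCID would have many more kinds of token object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical declaration token: owns the only copy of the name string. */
final class Symbol {
    private String name;

    Symbol(String name) { this.name = name; }

    String name() { return name; }

    /** Global rename: one assignment, no search and replace. */
    void rename(String newName) { this.name = newName; }
}

/** Hypothetical use of a variable: holds a pointer to the Symbol, never the name text. */
final class Reference {
    private final Symbol target;

    Reference(Symbol target) { this.target = target; }

    Symbol target() { return target; }

    /** The name is looked up at display time. */
    String render() { return target.name(); }
}

final class ScidRenameDemo {
    /** Renders the references as "a + b + ...". */
    static String renderSum(List<Reference> refs) {
        StringBuilder sb = new StringBuilder();
        for (Reference r : refs) {
            if (sb.length() > 0) {
                sb.append(" + ");
            }
            sb.append(r.render());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Symbol total = new Symbol("tot");
        Symbol tax = new Symbol("tax");
        List<Reference> refs = new ArrayList<>();
        refs.add(new Reference(total));
        refs.add(new Reference(tax));
        refs.add(new Reference(total));
        System.out.println(renderSum(refs)); // tot + tax + tot
        total.rename("subtotal");
        System.out.println(renderSum(refs)); // subtotal + tax + subtotal
    }
}
```

Because every Reference resolves the name when it is displayed, the rename touches exactly one object no matter how many uses exist.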

There are TreeMaps so you can find symbols by name or approximate name or by name/property combinations.
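
As a rough sketch of one such lookup, a java.util.TreeMap keyed on the lower-cased name can answer "which symbols start with this prefix, ignoring case" with a single range query. SymbolIndex is a hypothetical name, and prefix matching is only one possible reading of "approximate name". For brevity the index stores the declared spellings as strings, where a real SCID would store pointers to the token objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

/** Hypothetical symbol index supporting case-insensitive prefix search. */
final class SymbolIndex {
    // Key: lower-cased name. Value: the declared spellings that share that key.
    private final TreeMap<String, List<String>> byName = new TreeMap<>();

    void add(String name) {
        byName.computeIfAbsent(name.toLowerCase(Locale.ROOT), k -> new ArrayList<>()).add(name);
    }

    /** Returns all names starting with the prefix, in key order. */
    List<String> startingWith(String prefix) {
        String lo = prefix.toLowerCase(Locale.ROOT);
        String hi = lo + Character.MAX_VALUE; // upper bound just past every key with this prefix
        List<String> hits = new ArrayList<>();
        for (List<String> names : byName.subMap(lo, true, hi, false).values()) {
            hits.addAll(names);
        }
        return hits;
    }

    public static void main(String[] args) {
        SymbolIndex index = new SymbolIndex();
        index.add("customerName");
        index.add("CustomerId");
        index.add("count");
        index.add("total");
        System.out.println(index.startingWith("cust")); // [CustomerId, customerName]
    }
}
```

Name/property combinations could be handled the same way with a composite key, or with one TreeMap per property.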

There is no source code, just the parse tree. You are thus free to display it in many different possible formats, or to export traditional Java source. The parse tree always represents a syntactically valid Java program.
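
Here is a small sketch of what "many different possible formats" could mean in practice. IfNode and its three rendering methods are hypothetical. To keep the example short, the condition and statement are held as plain strings, where a real parse tree would hold further token objects.

```java
/** Hypothetical parse-tree node for an if statement with a single statement body. */
final class IfNode {
    private final String condition;
    private final String thenStatement;

    IfNode(String condition, String thenStatement) {
        this.condition = condition;
        this.thenStatement = thenStatement;
    }

    /** Traditional Java source, opening brace on the same line. */
    String toKAndR() {
        return "if (" + condition + ") {\n    " + thenStatement + ";\n}";
    }

    /** Traditional Java source, opening brace on its own line. */
    String toAllman() {
        return "if (" + condition + ")\n{\n    " + thenStatement + ";\n}";
    }

    /** One-line outline view for getting the big picture. */
    String toOutline() {
        return "IF " + condition + " -> " + thenStatement;
    }

    public static void main(String[] args) {
        IfNode node = new IfNode("x > 0", "count++");
        System.out.println(node.toKAndR());
        System.out.println(node.toAllman());
        System.out.println(node.toOutline());
    }
}
```

Brace style stops being something to argue about. It becomes a per-viewer display preference, since nothing about layout is stored in the tree.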

The parse tree contains much more data than the equivalent source code, e.g. history of change, who changed each token and why.

The parse tree is RAM-resident, or stored in a decent persistent object database that approaches RAM-resident performance, such as ObjectStore. Even for a purely RAM-resident implementation, the data must be persisted, that is, dumped to disk and restored as a lump with all the interconnections intact. Execution (but not startup) would be faster than using a POD (Persistent Object Database). You need to log transactions to disk, but everything else lives in a giant virtual RAM space. Someday we will learn to snapshot entire virtual address spaces and pick up later exactly where we left off.
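
For the purely RAM-resident variant, standard Java serialization is one way (my suggestion, not the only one) to dump and restore the hairball as a lump. Within a single stream, ObjectOutputStream writes each object once and restores shared pointers as shared pointers. The classes Decl, Use and SnapshotDemo below are hypothetical, and the sketch writes to a byte array where a real tool would write to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical declaration token. */
final class Decl implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;

    Decl(String name) { this.name = name; }
}

/** Hypothetical reference token pointing at its declaration. */
final class Use implements Serializable {
    private static final long serialVersionUID = 1L;
    final Decl target;

    Use(Decl target) { this.target = target; }
}

final class SnapshotDemo {
    /** Dumps the whole object graph reachable from root as one lump. */
    static byte[] dump(Serializable root) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Restores the graph, interconnections included. */
    static Object restore(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Decl total = new Decl("total");
        Use[] uses = { new Use(total), new Use(total) };
        Use[] copy = (Use[]) restore(dump(uses));
        // Both restored uses still point at one and the same restored declaration.
        System.out.println(copy[0].target == copy[1].target); // true
    }
}
```

Serializing a very deep graph this way can exhaust the stack, so a serious implementation would likely need a custom format or an object database.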

I repeat, the parse tree always represents a syntactically valid program. It might not necessarily do anything sensible, but it would "compile". Changes to the parse tree are applied in the form of atomic transactions to ensure the integrity of the tree cannot be compromised.
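
One simple way to get that all-or-nothing behaviour is to apply each transaction to a working copy and swap it in only if the result is still valid. In the sketch below, AtomicTree is a hypothetical name, the "tree" is flattened to a list of token strings, and balanced braces stand in for full syntactic validity. A real SCID would run a real grammar check and would not copy the whole tree for every edit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical store that only ever holds a valid token sequence. */
final class AtomicTree {
    private List<String> tokens = new ArrayList<>();

    /** Stand-in validity rule: braces balance and never close before opening. */
    static boolean isValid(List<String> candidate) {
        int depth = 0;
        for (String t : candidate) {
            if (t.equals("{")) {
                depth++;
            } else if (t.equals("}") && --depth < 0) {
                return false;
            }
        }
        return depth == 0;
    }

    /** Applies the edit to a copy; commits if valid, otherwise discards the copy. */
    synchronized boolean transact(Consumer<List<String>> edit) {
        List<String> working = new ArrayList<>(tokens);
        edit.accept(working);
        if (!isValid(working)) {
            return false; // rollback: the committed tokens were never touched
        }
        tokens = working;
        return true;
    }

    synchronized List<String> snapshot() {
        return new ArrayList<>(tokens);
    }

    public static void main(String[] args) {
        AtomicTree tree = new AtomicTree();
        System.out.println(tree.transact(t -> { t.add("{"); t.add("x"); t.add("}"); })); // true
        System.out.println(tree.transact(t -> t.add("{"))); // false, unbalanced
        System.out.println(tree.snapshot()); // [{, x, }]
    }
}
```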

Other sorts of auxiliary data may be stored in a conventional SQL database where it would be accessible to user-written queries. However, the source code itself has too complex a structure to fit into the row-column SQL model.

There is a log of transactions that can be replayed in the event of failure, or analysed to recreate the dynamic change history. You can play the log forward or back. The advantage of this log is that even in the event of catastrophic failure you would never lose more than a few seconds’ worth of keying.
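
Playing the log forward or back works if every logged change records both what was there before and what is there after. The sketch below is one possible shape for that, with hypothetical names (ReplayLog, Change) and the tree again flattened to a list of token strings. It keeps the log in memory only; a real implementation would also append each change to disk before applying it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory transaction log that can be played forward or back. */
final class ReplayLog {
    /** One logged change. A null before means insert; a null after means delete. */
    static final class Change {
        final int index;
        final String before;
        final String after;

        Change(int index, String before, String after) {
            this.index = index;
            this.before = before;
            this.after = after;
        }
    }

    private final List<String> tokens = new ArrayList<>();
    private final List<Change> log = new ArrayList<>();
    private int cursor = 0; // number of log entries currently applied

    void insert(int index, String text) { record(new Change(index, null, text)); }

    void delete(int index) { record(new Change(index, tokens.get(index), null)); }

    void replace(int index, String text) { record(new Change(index, tokens.get(index), text)); }

    private void record(Change change) {
        log.subList(cursor, log.size()).clear(); // a new edit discards any undone tail
        log.add(change);
        forward();
    }

    private void apply(int index, String from, String to) {
        if (from == null) {
            tokens.add(index, to);
        } else if (to == null) {
            tokens.remove(index);
        } else {
            tokens.set(index, to);
        }
    }

    /** Replays the next logged change; returns false at the end of the log. */
    boolean forward() {
        if (cursor == log.size()) {
            return false;
        }
        Change c = log.get(cursor++);
        apply(c.index, c.before, c.after);
        return true;
    }

    /** Backs out the most recently applied change; returns false at the start. */
    boolean back() {
        if (cursor == 0) {
            return false;
        }
        Change c = log.get(--cursor);
        apply(c.index, c.after, c.before); // the inverse is the same change with the ends swapped
        return true;
    }

    String text() { return String.join(" ", tokens); }

    public static void main(String[] args) {
        ReplayLog log = new ReplayLog();
        log.insert(0, "int");
        log.insert(1, "x");
        log.replace(1, "count");
        System.out.println(log.text()); // int count
        log.back();
        System.out.println(log.text()); // int x
        log.forward();
        System.out.println(log.text()); // int count
    }
}
```

Recovery after a crash is then just a matter of loading the last snapshot and calling forward() until it returns false.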

When you get around to implementing dynamic version control, this transaction log must be sent to a central site and merged in real time with transactions of other people’s changes, then redistributed to all the redundant hot copies of the database. This implies a 24 hour Internet connection between all the programmer sites, or at least while any programmer is active at a site. The key is all copies of the database must process all the transactions in the exact same order. For speed you might process local transactions immediately then back them out if it turns out there were transactions from other sites that actually needed to be processed first. For more detail on how that might work see dynamic version control.
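
The "process local transactions immediately, then back them out" idea can be sketched as follows. Everything here is an assumption made for illustration: the names OrderedTx and HotCopy, the notion that each transaction carries a unique number fixing the global order (I have not said above how that order would be decided), and the simplification that a transaction merely appends one token.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical transaction: appends one token; order fixes its global position. */
final class OrderedTx {
    final long order;
    final String token;

    OrderedTx(long order, String token) {
        this.order = order;
        this.token = token;
    }
}

/** Hypothetical hot copy of the database that applies transactions optimistically. */
final class HotCopy {
    private final List<OrderedTx> applied = new ArrayList<>(); // always sorted by order
    private final List<String> tokens = new ArrayList<>();
    private int backedOut = 0;

    /** Applies tx at once, first backing out any transaction that should follow it. */
    void receive(OrderedTx tx) {
        List<OrderedTx> redo = new ArrayList<>();
        while (!applied.isEmpty() && applied.get(applied.size() - 1).order > tx.order) {
            redo.add(0, applied.remove(applied.size() - 1)); // keep redo in ascending order
            tokens.remove(tokens.size() - 1);                 // undo the append
            backedOut++;
        }
        apply(tx);
        for (OrderedTx later : redo) {
            apply(later);
        }
    }

    private void apply(OrderedTx tx) {
        applied.add(tx);
        tokens.add(tx.token);
    }

    List<String> tokens() { return new ArrayList<>(tokens); }

    int backedOut() { return backedOut; }

    public static void main(String[] args) {
        HotCopy site = new HotCopy();
        site.receive(new OrderedTx(2, "x"));   // local keystroke, applied immediately
        site.receive(new OrderedTx(3, ";"));
        site.receive(new OrderedTx(1, "int")); // arrives late from another site
        System.out.println(site.tokens());     // [int, x, ;]
        System.out.println(site.backedOut());  // 2
    }
}
```

Whatever order the transactions arrive in, every copy ends up having applied them in the same order, which is the property that keeps the hot copies identical.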

Real World SCID Implementations

The usual reaction I get from programmers when I mention SCIDs is that they have tried them and they hate them. What they have tried are coding templates where you fill in the blanks. These stop you from coding in the old way, yet offer almost no payback. Granted, SCIDs will force you to rethink how you compose programs. Code must at all times be 100% syntactically correct. However, a good SCID will pay back a hundredfold for this inconvenience. If you try to import or paste code that is not correct, you will find much of it being turned into a special kind of comment:
// INVALID System.out.printLine();
Fortress
IDE
JavaML

You can get a fresh copy of this page from: http://mindprod.com/project/scid.html