If you look at my website, you will see that a large part of it is generated by various computer programs.

If I run the output through HTMLValidator, or eyeball it, or view it in a browser, it is generally pretty easy to find errors. The hard part is figuring out where that error was generated in the corresponding code.

I had an idea years ago for a tool to solve that problem, but no one was interested. It worked like this:

In debug mode there is a hook in the text-output functions that, whenever you write a line, makes a log entry in compact binary (something like a serialized object), but in essence it would look something like this:

at com.mindprod.prices.Prices.emitDetailsAndManual line:1059

The output is perfectly normal and you can do anything you would normally do with it. However, there is a magic viewer for the file. When you right-click on any character, it tells you the program and line number that wrote it. It looks up the filename and offset in the log file, which is massaged to make lookups fast and the log compact by avoiding duplication of long package, class, method and file names.
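The lookup side of such a viewer could be sketched roughly as follows. This is an illustration under assumptions, not the log format itself: the class OriginLog and its methods record and lookup are hypothetical names, and the log is held in memory rather than in a binary file. Each long method name is stored once and referred to by a small integer, and each entry marks the offset where a run of characters from one origin begins.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory log mapping output offsets to the code that wrote them. */
final class OriginLog {
    // Each distinct long name is stored once; entries refer to it by index.
    private final List<String> names = new ArrayList<>();
    private final Map<String, Integer> nameIds = new HashMap<>();
    // Start offset of a run of characters -> {name index, line number}.
    private final TreeMap<Long, int[]> ranges = new TreeMap<>();

    /** Notes that output starting at startOffset came from the given method and line. */
    void record(long startOffset, String method, int line) {
        Integer known = nameIds.get(method);
        int id;
        if (known == null) {
            id = names.size();
            names.add(method);
            nameIds.put(method, id);
        } else {
            id = known;
        }
        ranges.put(startOffset, new int[] {id, line});
    }

    /** Finds the entry whose run covers the offset, i.e. the nearest start at or before it. */
    String lookup(long offset) {
        Map.Entry<Long, int[]> entry = ranges.floorEntry(offset);
        if (entry == null) {
            return "unknown";
        }
        int[] value = entry.getValue();
        return "at " + names.get(value[0]) + " line:" + value[1];
    }
}
```

A sorted map keeps the lookup logarithmic in the number of entries, so the viewer can answer a right-click without scanning the whole log.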

I needed such a tool on 2014-04-23, but I had too many things on my plate and did not have the time to build it. Further, I did not know how to insert the hook without changing the names of the I/O methods.

So I cooked up something a little more primitive:
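A helper of that kind might look like the sketch below. It is not necessarily the original code: the names Where, here and hereAsComment are hypothetical, and the sketch assumes the caller is read from a stack trace, which is one way to get the class, method and line number without passing them in.

```java
/** Hypothetical helper that reports where in the code it was called from. */
final class Where {
    /** Debug switch: when false, both methods return an empty string. */
    static boolean DEBUG = true;

    /** Formats a stack frame as "at package.Class.method line:N". */
    private static String describe(StackTraceElement frame) {
        return "at " + frame.getClassName() + "." + frame.getMethodName()
                + " line:" + frame.getLineNumber();
    }

    /** Describes the method and line that called here(). */
    static String here() {
        if (!DEBUG) {
            return "";
        }
        // Frame 0 is here() itself; frame 1 is whoever called it.
        return describe(new Throwable().getStackTrace()[1]);
    }

    /** Same as here(), wrapped as an HTML comment for inline insertion. */
    static String hereAsComment() {
        if (!DEBUG) {
            return "";
        }
        // Frame 0 is hereAsComment() itself; frame 1 is whoever called it.
        return "<!-- " + describe(new Throwable().getStackTrace()[1]) + " -->";
    }
}
```

The line number is only available when the class was compiled with line-number debug information, which is the javac default.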

I then peppered my code with calls to this method. I inserted the String it gives back into the output stream inline, as an HTML (Hypertext Markup Language) comment.

<!-- at com.mindprod.prices.Prices.emitDetailsAndManual line:1059 -->

By turning a debug switch on and off I could control whether these comments were inserted.

There is a problem. You can’t just insert <!----> anywhere in HTML. So I had to manually back off and suppress the insertion in spots where the comment could confuse an HTML parser, e.g. inside comments, or inside macros, which are officially comments. If I had a more sophisticated tool, such as the one I described at the top, that would not be a problem. The generated stream would be intact no matter where I made my log entries.

So your task is to implement this properly. The tricky parts are:

It would not be that much more difficult to track on a per-character or per-String basis. You do not actually have to create a log entry for every character, only when the creating method changes. You need one log entry per range of characters all created by the same method (or by the same line number within that method). You might even give the programmer the option of how finely to track. The more finely you track, the more overhead.
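The idea of logging only when the origin changes can be sketched like this. The class TrackingOutput and its trackLines option are hypothetical names chosen for illustration, and the sketch assumes output is collected in memory and measured in characters rather than bytes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical output sink that logs an entry only when the writing code changes. */
final class TrackingOutput {
    private final StringBuilder out = new StringBuilder();
    private final List<String> entries = new ArrayList<>();
    private final boolean trackLines; // true: per line number, false: per method
    private String lastOrigin;

    TrackingOutput(boolean trackLines) {
        this.trackLines = trackLines;
    }

    /** Appends text, starting a new log entry if the caller differs from the last one. */
    void write(String text) {
        // Frame 0 is write() itself; frame 1 is the code that called it.
        StackTraceElement caller = new Throwable().getStackTrace()[1];
        String origin = caller.getClassName() + "." + caller.getMethodName();
        if (trackLines) {
            origin += " line:" + caller.getLineNumber();
        }
        if (!origin.equals(lastOrigin)) {
            // Record the offset where this run of same-origin characters begins.
            entries.add(out.length() + " at " + origin);
            lastOrigin = origin;
        }
        out.append(text);
    }

    String text() {
        return out.toString();
    }

    List<String> entries() {
        return entries;
    }
}
```

With trackLines off, consecutive writes from one method collapse into a single entry. Turning it on gives finer detail at the cost of more entries.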


Silvio noted:
Unless you would have some really crappy code most IO calls in an HTML generating application would be inside a very small general purpose utility class. You would need an entire stack trace at every output position to be able to actually pinpoint a bug.

I think, in practice, the depth of the stack you want to sample is constant. Perhaps if it is variable, the code would have to set it dynamically, yuch! Perhaps the snapshot taker could be given a list of methods not of interest. When it hits one, it snapshots one deeper in the stack. I think this would be conceptually easier for the programmer to understand and would not require modifying the code in any way.
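A snapshot taker with a list of methods not of interest might be sketched as below. The class OriginFinder and its methods pick and origin are hypothetical, and the sketch assumes methods are identified by "Class.method" strings, so overloads are not distinguished.

```java
import java.util.Arrays;
import java.util.Set;

/** Hypothetical snapshot taker that skips utility methods on a "not of interest" list. */
final class OriginFinder {
    // Entries look like "package.Class.method".
    private final Set<String> notOfInterest;

    OriginFinder(Set<String> notOfInterest) {
        this.notOfInterest = notOfInterest;
    }

    /** Returns the first frame, innermost first, that is not on the skip list, or null. */
    StackTraceElement pick(StackTraceElement[] trace) {
        for (StackTraceElement frame : trace) {
            String name = frame.getClassName() + "." + frame.getMethodName();
            if (!notOfInterest.contains(name)) {
                return frame;
            }
            // Otherwise look one frame deeper in the stack.
        }
        return null;
    }

    /** Applies pick() to the current call stack, ignoring origin() itself. */
    StackTraceElement origin() {
        StackTraceElement[] trace = new Throwable().getStackTrace();
        // Frame 0 is origin(); everything after it belongs to the callers.
        return pick(Arrays.copyOfRange(trace, 1, trace.length));
    }
}
```

The general-purpose output utilities would go on the list once, so the reported frame is the first caller above them, however deep the utility layer happens to be at that point.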


