XML (Extensible Markup Language)
The XML fad has created a bonanza of opportunities for obfuscation. The basic technique
is to pick a random hunk of code, then invent an obscure way of representing its logic
in XML. Then replace the piece of code with an XML properties file and an XML parser.
Make sure the XML representation you choose is so limited that almost nothing other
than the original logic can be expressed in it. Of course, you never document the XML
language extension or the parser. Nobody questions the simplicity of XML. Using this
technique, you should easily be able to balloon 10 lines of simple Java code up to
100 lines of perfectly opaque XML.
Obfuscated C
Follow the obfuscated C contests on the
Internet and sit at the lotus feet of the masters.
Find a Forth or APL (A Programming Language) Guru
In those worlds, the terser your code and the more bizarre the way it works,
the more you are revered.
I’ll Take a Dozen
Never use one housekeeping variable
when you could just as easily use two or three.
Jude the Obscure
Always look for the most obscure way to do
common tasks. For example, instead of using arrays to convert an integer to the
corresponding string, use code like this:
char *p;
switch (n)
{
case 1:
    p = "one";
    if (0)
case 2:
    p = "two";
    if (0)
case 3:
    p = "three";
    printf("%s", p);
    break;
}
Foolish Consistency Is the Hobgoblin of Little Minds
When you need a character constant, use many different formats: ' ', 32, 0x20, 040.
Make liberal use of the fact that 10 and 010 are not the same number in C or Java.
Casting
Pass all data as a void * and then typecast to
the appropriate structure. Using byte offsets into the data instead of structure
casting is fun too.
The Nested Switch
A switch within a switch is the
most difficult type of nesting for the human mind to unravel.
Exploit Implicit Conversion
Memorize all of
the subtle implicit conversion rules in the programming language. Take full advantage
of them. Never use a picture variable (in COBOL or PL/I) or a general conversion
routine (such as sprintf in C). Be sure to use floating-point variables as indexes
into arrays, characters as loop counters and perform string functions on numbers.
After all, all of these operations are well-defined and will only add to the
terseness of your source code. Any maintainer who tries to understand them will be
very grateful to you because they will have to read and learn the entire chapter on
implicit data type conversion; a chapter that they probably had completely overlooked
before working on your programs.
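A small Java sketch of the sort of conversions on offer. The class and method names are hypothetical, chosen only for illustration; the behaviour follows from Java's numeric promotion and left-to-right string concatenation rules.

```java
// Illustrative sketch: ImplicitConversions and its methods are hypothetical names.
final class ImplicitConversions {

    // A char as loop counter: each char is widened to int when it is added.
    static int sumOfLetters() {
        int total = 0;
        for (char c = 'a'; c <= 'c'; c++) {
            total += c; // 97 + 98 + 99
        }
        return total;
    }

    // Concatenation is evaluated left to right, so the position of the first
    // String operand decides where arithmetic stops and concatenation starts.
    static String numbersThenString() {
        return 1 + 2 + "3"; // (1 + 2) is computed first, then "3" is appended
    }

    static String stringThenNumbers() {
        return "1" + 2 + 3; // "1", then 2, then 3 are concatenated
    }

    // Integer division happens before the result is widened to double.
    static double half() {
        return 1 / 2;
    }

    public static void main(String[] args) {
        System.out.println(sumOfLetters());
        System.out.println(numbersThenString());
        System.out.println(stringThenNumbers());
        System.out.println(half());
    }
}
```

None of these lines draws a compiler complaint, which is the whole attraction.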
int literals
When using ComboBoxes, use a switch
statement with integer cases rather than named constants for the possible values.
If you have an array with 100 elements in it, hard code the literal 100 in as many places in the program as possible.
Never use a static final named constant for the 100, or refer to it as myArray.length. To make changing this constant even more difficult, use
the literal 50 instead of 100/2, or 99 instead of 100-1. You can further disguise the
100 by checking for a == 101 instead of a > 100 or a > 99 instead of
a >= 100.
Consider things like page sizes, where a page consists of x header, y body,
and z footer lines. You can apply the obfuscations independently to each of these
and to their partial or total sums.
These time-honoured techniques are especially effective in a program with two
unrelated arrays that just accidentally happen to both have 100 elements. If the
maintenance programmer has to change the length of one of them, he will have to
decipher every use of the literal 100 in the program to determine which array it
applies to. He is almost sure to make at least one error, hopefully one that
won’t show up until years later.
There are even more fiendish variants. To lull the maintenance programmer into a
false sense of security, dutifully create the named constant, but very occasionally
accidentally use the literal 100 value instead of the
named constant. Most fiendish of all, in place of the literal 100 or the correct
named constant, sporadically use some other unrelated named constant that just
accidentally happens to have the value 100, for now. It almost goes without saying
that you should avoid any consistent naming scheme that would associate an array name
with its size constant.
Semicolons!
Always use semicolons whenever they are
syntactically allowed. For example:
if ( a );
else;
{
int d;
d = c;
}
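The same fragment, wrapped in a hypothetical Java method so its effect can be observed. The names StraySemicolons and run are assumptions for the sketch. Because both branches of the if are empty statements, the block that follows is not governed by the condition at all.

```java
// Illustrative sketch: StraySemicolons and run are hypothetical names.
final class StraySemicolons {

    // Mirrors the fragment above. "if (a);" and "else;" each end in an
    // empty statement, so the braces open an unconditional block.
    static int run(boolean a, int c) {
        int result = 0;
        if (a);
        else;
        {
            int d;
            d = c;
            result = d;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(true, 7));
        System.out.println(run(false, 7));
    }
}
```

Both calls return the same value, whatever the maintainer believes the if is doing.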
Use Octal For Obscurity
Smuggle octal literals into a list
of decimal numbers like this:
array = new int[]
{
111,
120,
013,
121,
};
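A minimal Java sketch of what the smuggled literal does to the data. The class name OctalSmuggling and its helper methods are hypothetical; the only fact relied on is that an integer literal with a leading zero is read as octal in Java.

```java
// Illustrative sketch: OctalSmuggling and its methods are hypothetical names.
final class OctalSmuggling {

    // The same list as above; the third entry is octal 13, not decimal 13.
    static int[] values() {
        return new int[] {
            111,
            120,
            013, // leading zero: octal, i.e. decimal 11
            121,
        };
    }

    static int sum() {
        int total = 0;
        for (int v : values()) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(values()[2]);
        System.out.println(sum());
    }
}
```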
Convert Indirectly
Java offers great
opportunity for obfuscation whenever you have to convert. As a simple example, if you
have to convert a double to a String, go circuitously, via Double with
new Double(d).toString() rather than the more direct
Double.toString(d). You can, of course, be far more
circuitous than that! Avoid any conversion techniques recommended by the Conversion Amanuensis. You get bonus
points for every extra temporary object you leave littering the heap after your
conversion.
Nesting
Nest as deeply as you can. Good coders can get up
to 10 levels of ( ) on a single line and 20 { } in a single method. C++ coders have the additional powerful option of preprocessor
nesting totally independent of the nest structure of the underlying code. You earn
extra Brownie points whenever the beginning and end of a block appear on separate
pages in a printed listing. Wherever possible, convert nested ifs into nested [? : ]
ternaries. If they span several lines, so much the better.
C’s Eccentric View of Arrays
C compilers transform
myArray[i] into *(myArray + i),
which is equivalent to *(i + myArray) which is equivalent
to i[myArray]. Experts know to put this to good use. To
really disguise things, generate the index with a function:
int myfunc(int q, int p) { return p%q; }
…
myfunc(6291, 8)[Array];
Unfortunately, these techniques can only be used in native C classes, not
Java.
L o n g L i n e s
Try to pack as much as possible
into a single line. This saves the overhead of temporary variables and makes source
files shorter by eliminating new line characters and white space. Tip: remove all
white space around operators. Good programmers can often hit the 255 character line
length limit imposed by some editors. The bonus of long lines is that programmers who
cannot read 6 point type must scroll to view them.
Exceptions
I am going to let you in on a little-known
coding secret. Exceptions are a pain in the behind. Properly-written code never
fails, so exceptions are actually unnecessary. Don’t waste time on them.
Subclassing exceptions is for incompetents who know their code will fail. You can
greatly simplify your program by having only a single try/catch in the entire
application (in main) that calls System.exit(). Just stick a perfectly standard set
of throws on every method header whether they could actually throw any exceptions or
not.
When To Use Exceptions
Use exceptions for
non-exceptional conditions. Routinely terminate loops with an ArrayIndexOutOfBoundsException. Pass standard
results back from a method in an exception.
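Both habits can be sketched in a few lines of Java. Everything named here (ExceptionAbuse, Answer, square, callSquare) is hypothetical and exists only to illustrate the pattern.

```java
// Illustrative sketch: all names in this class are hypothetical.
final class ExceptionAbuse {

    // Sums an array with no bounds check. The loop ends only when the
    // index runs off the end and the JVM throws.
    static int sum(int[] data) {
        int total = 0;
        int i = 0;
        try {
            while (true) {
                total += data[i++];
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // This is the "normal" loop exit.
        }
        return total;
    }

    // A perfectly ordinary result, delivered by throwing it.
    static final class Answer extends RuntimeException {
        final int value;

        Answer(int value) {
            this.value = value;
        }
    }

    static void square(int x) {
        throw new Answer(x * x);
    }

    static int callSquare(int x) {
        try {
            square(x);
        } catch (Answer a) {
            return a.value;
        }
        return -1; // never reached, since square always throws
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
        System.out.println(callSquare(4));
    }
}
```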
Efficient Exceptions
Throwing an Exception has quite a high
overhead. The JVM (Java Virtual Machine) has to scan the stack looking for a ton of information
to potentially use in a stack trace. You can avoid this overhead by constructing an
Exception object once and throwing it many times. The
stack trace will be for the spot in the code where the Exception was constructed, not where it was thrown. This will really
keep them guessing where the bugs are.
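A minimal Java sketch of the effect, assuming a hypothetical class ReusedException with helper methods build, failHere and topFrame. It relies on the documented behaviour that a Throwable records its stack trace when it is constructed, not when it is thrown.

```java
// Illustrative sketch: ReusedException and its methods are hypothetical names.
final class ReusedException {

    // Constructed once; the stack trace is captured here, inside build().
    private static final RuntimeException SHARED = build();

    static RuntimeException build() {
        return new RuntimeException("reused");
    }

    // The real point of failure. It will not appear in the trace.
    static void failHere() {
        throw SHARED;
    }

    // Reports the method named in the top frame of the caught exception's trace.
    static String topFrame() {
        try {
            failHere();
        } catch (RuntimeException e) {
            return e.getStackTrace()[0].getMethodName();
        }
        return "none";
    }

    public static void main(String[] args) {
        System.out.println(topFrame());
    }
}
```

The top frame names build, the place where the object was made, so anyone reading the trace goes looking in the wrong method.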
Use Threads With Abandon
The title says it all.
Lawyer Code
Follow the language lawyer discussions in the
newsgroups about what various bits of tricky code should do, e.g. a=a++; or f(a++,a++); then sprinkle your code
liberally with the examples. In C, the effects of pre/post increment and decrement code such as
*++b ? (*++b + *(b-1)) : 0
are not defined by the language spec. Every compiler is free to evaluate in a
different order. This makes them doubly deadly. Similarly, take advantage of the
complex tokenizing rules of C and Java by removing all spaces.
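Java, by contrast with C, fully specifies both of the newsgroup favourites, because operands and arguments are evaluated left to right. That makes them no easier to read. The sketch below uses hypothetical names (LawyerCode, selfAssign, f, twoIncrements) to show what each one does.

```java
// Illustrative sketch: LawyerCode and its methods are hypothetical names.
final class LawyerCode {

    // a = a++ : the old value (5) is saved, a becomes 6, then the saved 5
    // is assigned back, so the increment is silently lost.
    static int selfAssign() {
        int a = 5;
        a = a++;
        return a;
    }

    // Packs both arguments into one number so their order is visible.
    static int f(int x, int y) {
        return x * 10 + y;
    }

    // Arguments are evaluated left to right, so this is f(1, 2).
    static int twoIncrements() {
        int a = 1;
        return f(a++, a++);
    }

    public static void main(String[] args) {
        System.out.println(selfAssign());
        System.out.println(twoIncrements());
    }
}
```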
Early Returns
Rigidly follow the guidelines about no
goto, no early returns and no labeled breaks especially when you can increase the
if/else nesting depth by at least 5 levels.
Avoid {}
Never put in any { } surrounding your if/else
blocks unless they are syntactically obligatory. If you have a deeply nested mixture
of if/else statements and blocks, especially with misleading indentation, you can
trip up even an expert maintenance programmer. For best results with this technique,
use Perl. You can pepper the code with additional ifs after the statements,
to amazing effect.
Tabs From Hell
Never underestimate how much havoc you can
create by indenting with tabs instead of spaces, especially when there is no
corporate standard on how much indenting a tab represents. Embed tabs inside string
literals, or use a tool to convert spaces to tabs that will do that for you.
Magic Matrix Locations
Use special values in certain
matrix locations as flags. A good choice is the [3][0] element in a transformation
matrix used with a homogeneous coordinate system.
Magic Array Slots revisited
If you need several
variables of a given type, just define an array of them, then access them by number.
Pick a numbering convention that only you know and don’t document it. And
don’t bother to define #define constants for the indexes. Everybody should just
know that the global variable widget[15] is the cancel button. This is just an
up-to-date variant on using absolute numerical addresses in assembler code.
Never Beautify
Never use an automated source code tidier
(beautifier) to keep your code aligned. Lobby to have them banned from your
company on the grounds they create false deltas in PVCS/CVS (version control
tracking) or that every programmer should have his own indenting style held forever
sacrosanct for any module he wrote. Insist that other programmers observe those
idiosyncratic conventions in his modules. Banning
beautifiers is quite easy, even though they save millions of keystrokes spent on
manual alignment and days wasted misinterpreting poorly aligned code. Just insist
that everyone use the same tidied format, not just for storing in
the common repository, but also while they are editing. This starts an
RWAR (Religious War) and the boss, to
keep the peace, will ban automated tidying. Without automated tidying, you are now
free to accidentally misalign the code to give the optical illusion that
bodies of loops and ifs are longer or shorter than they really are, or that else
clauses match a different if than they really do.
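A minimal Java sketch of the optical illusion, using the hypothetical names MisleadingIndent and classify. The else is indented to look as though it belongs to the outer if, but the language attaches it to the nearest unmatched if.

```java
// Illustrative sketch: MisleadingIndent and classify are hypothetical names.
final class MisleadingIndent {

    static String classify(boolean a, boolean b) {
        String x = "unset";
        if (a)
            if (b) x = "y";
        else x = "z"; // lined up with if (a), but it pairs with if (b)
        return x;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, false));
        System.out.println(classify(true, false));
        System.out.println(classify(true, true));
    }
}
```

A reader who trusts the indentation expects classify(false, false) to produce "z"; it leaves x untouched.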
The Macro Preprocessor
It offers great opportunities for
obfuscation. The key technique is to nest macro expansions several layers deep so
that you have to discover all the various parts in many different *.hpp files.
Placing executable code into macros then including those macros in every *.cpp file
(even those that never use those macros) will maximize the amount of recompilation
necessary if ever that code changes.
Exploit Schizophrenia
Java is schizophrenic about
array declarations. You can do them the old C way, String x[] (which uses mixed
pre-postfix notation), or the new way, String[] x, which uses pure prefix notation. If
you want to really confuse people, mix the notations:
byte[] rowvector, colvector, matrix[];
which is equivalent to:
byte[] rowvector;
byte[] colvector;
byte[][] matrix;
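The equivalence can be checked with a short Java sketch. The class name MixedArrayDeclarations and the method typeNames are hypothetical; the declaration itself is the one shown above.

```java
// Illustrative sketch: MixedArrayDeclarations and typeNames are hypothetical names.
final class MixedArrayDeclarations {

    // Declares three variables on one line and reports the type each one got.
    static String[] typeNames() {
        byte[] rowvector, colvector, matrix[];
        rowvector = new byte[3];
        colvector = new byte[3];
        matrix = new byte[3][3]; // compiles only because matrix is byte[][]
        return new String[] {
            rowvector.getClass().getSimpleName(),
            colvector.getClass().getSimpleName(),
            matrix.getClass().getSimpleName(),
        };
    }

    public static void main(String[] args) {
        for (String name : typeNames()) {
            System.out.println(name);
        }
    }
}
```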
Hide Error Recovery Code
Use nesting to put the
error recovery for a function call as far as possible away from the call. Even a
simple case can be elaborated to 10 or 12 levels of nesting.
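A minimal sketch of the pattern in Java, with only three levels. The class HiddenRecovery, the method process and its three steps are hypothetical, and null stands in for a failed call. The recovery for the first check ends up at the very bottom, as far from the check as the nesting allows.

```java
// Illustrative sketch: HiddenRecovery and process are hypothetical names.
final class HiddenRecovery {

    // Three hypothetical steps that can each fail; null means failure.
    static String process(String header, String body, String footer) {
        if (header != null) {
            if (body != null) {
                if (footer != null) {
                    return header + body + footer;
                } else {
                    return "footer failed";
                }
            } else {
                return "body failed";
            }
        } else {
            return "header failed"; // recovery for the very first check
        }
    }

    public static void main(String[] args) {
        System.out.println(process("h", "b", "f"));
        System.out.println(process(null, "b", "f"));
    }
}
```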
Pseudo C
The real reason for #define was to help programmers who are familiar with another programming
language to switch to C. Maybe you will find declarations like #define begin { or #define end } useful for writing more
interesting code.
Confounding Imports
Keep the maintenance programmer
guessing about what packages the methods you are using are in. Instead of:
import com.mindprod.mypackage.Read;
import com.mindprod.mypackage.Write;
use:
import com.mindprod.mypackage.*;
Never fully qualify any method or class no matter how obscure. Let the maintenance
programmer guess which of the packages/classes it belongs to. Of course,
inconsistency in when you fully qualify and how you do your imports helps most.
Toilet Tubing
Never under any circumstances allow the
code from more than one function or procedure to appear on the screen at once. To
achieve this with short routines, use the following handy tricks:
- Blank lines are generally used to separate logical blocks of code. Each line is
a logical block in and of itself. Put blank lines between each line.
- Never comment your code at the end of a line. Put it on the line above. If
you’re forced to comment at the end of the line, pick the longest line of
code in the entire file, add 10 spaces and left-align all end-of-line comments to
that column.
- Comments at the top of procedures should use templates that are at least 15
lines long and make liberal use of blank lines. Here’s a handy template:
The technique of putting so much redundant information in documentation almost
guarantees it will soon go out of date and will help befuddle maintenance
programmers foolish enough to trust it.
Encapsulate The Trivial
Create entire classes or
methods to encapsulate trivialities that could never possibly change, but which then
require complex invocation and careful unravelling to discover that the code does
almost nothing. Here is a classic
Loops
The humble canonical for loop, for (int i=0; i<n; i++), should never be used. Always
randomly disguise it, for example by:
- Redoing it as a while or do while loop.
- Reversing the names of the i and n variables, or making up fanciful names for either that have nothing
to do with their purpose as index and count.
- Changing the < to <=.
- Using i-- just for a change of pace.
It goes without saying you should never use the compact for:each loop. There are many ways to rearrange the parts of an
Iterator loop over a Collection so that every time the maintenance programmer looks at a
simple Iterator loop, it appears to be something novel.
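Some of the disguises listed above, sketched in Java. The class DisguisedLoops and its method names are hypothetical. All three methods compute the same sum; only the canonical one admits it at a glance.

```java
// Illustrative sketch: DisguisedLoops and its methods are hypothetical names.
final class DisguisedLoops {

    // The canonical form, shown only for comparison.
    static int canonical(int[] data) {
        int total = 0;
        for (int i = 0; i < data.length; i++) {
            total += data[i];
        }
        return total;
    }

    // Names reversed and counting down: here "i" is the count and "n" the index.
    static int countdown(int[] data) {
        int total = 0;
        int i = data.length;
        for (int n = i; n-- > 0; ) {
            total += data[n];
        }
        return total;
    }

    // <= instead of <, compensated by an off-by-one in the subscript.
    static int inclusive(int[] data) {
        int total = 0;
        int n = 1;
        while (n <= data.length) {
            total += data[n - 1];
            n++;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println(canonical(data));
        System.out.println(countdown(data));
        System.out.println(inclusive(data));
    }
}
```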