JavaCC : Java Glossary



Formerly known as Jack. JavaCC is a parser generator, like YACC (Yet Another Compiler Compiler), except that it is written in Java and generates Java code. It started out handled by Sun, then by Metamata, then bought out by Webgain, who went belly up. It is now an open source project. The current version is 5.0 (last revised/verified: 2011-02-06). It is still the most popular parser generator written in Java.

JavaCC is similar to PCCTS (Purdue Compiler Compiler Tool Set). JavaCC generates LL(k) (top down) parsers: they scan left to right, build a leftmost derivation and look ahead k tokens. YACC generates LALR(1) (Look-Ahead LR: left-to-right scan, rightmost derivation, single token look-ahead) (bottom up) parsers, so the grammars look quite different. The documentation is quite readable, unlike most of its competition. JavaCC has some tricks for the hard parts of a grammar, like explicit lookahead and custom code for the tricky bits.

JavaCC had stagnated, but it showed some signs of renewed life with a recent release. Most new parsers are written in ANTLR. JavaCC is more intuitive and a better fit for getting your feet wet with parsers.

It blew my mind when a little parser I wrote worked first time. It is much simpler than it first looks. The easiest way to learn is to study the *.jj example grammar descriptions and to Google for example JavaCC source code for grammars similar to what you want to tackle.

Download JavaCC 6.0 free (last revised/verified: 2014-04-28). The download zip includes documentation, JJTree and a number of simple *.jj example grammar descriptions. Just unzip everything with embedded folder names into a Program Files directory. It does not have an explicit installer. You will want to tweak javacc-6.0\bin\javacc.bat (the script that compiles your grammar descriptor into Java source code) and put

You write little methods that describe the various phrases of your grammar. They are a mixture of Java code and JavaCC grammatical BNF (Backus-Naur Form). JavaCC then merges them and converts the whole thing into pure Java source code, with methods that will recognise the various phrases. When you compile that, you can parse text. In addition there are commands to describe the tokens of your language: the basic units/words.
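To give a feel for what "methods that will recognise the various phrases" means, here is a minimal hand-written sketch in plain Java. It is not actual JavaCC output, and every name in it (SumParser, Kind, Token, tokenize, expect, sum) is made up for illustration. It assumes a toy grammar with one phrase, sum ::= NUMBER ( PLUS NUMBER )* EOF, and two kinds of token. Real generated code is bulkier, but the shape is similar: a tokenizer that produces the basic units, and one method per phrase that peeks at the next token to decide what to do.

```java
import java.util.ArrayList;
import java.util.List;

/** Hand-written illustration only; NOT code generated by JavaCC. */
public class SumParser {
    /** Token kinds, playing the role of a grammar's token definitions. */
    public enum Kind { NUMBER, PLUS, EOF }

    /** One token: its kind plus the text that matched. */
    public static final class Token {
        final Kind kind;
        final String image;
        Token(Kind kind, String image) { this.kind = kind; this.image = image; }
    }

    private final List<Token> tokens;
    private int pos = 0;

    public SumParser(String input) { this.tokens = tokenize(input); }

    /** Splits the input into tokens, skipping whitespace. */
    public static List<Token> tokenize(String input) {
        List<Token> out = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '+') {
                out.add(new Token(Kind.PLUS, "+"));
                i++;
            } else if (c >= '0' && c <= '9') {
                int start = i;
                while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                    i++;
                }
                out.add(new Token(Kind.NUMBER, input.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at offset " + i);
            }
        }
        out.add(new Token(Kind.EOF, ""));
        return out;
    }

    /** Consumes the next token, which must be of the expected kind. */
    private Token expect(Kind kind) {
        Token t = tokens.get(pos);
        if (t.kind != kind) {
            throw new IllegalArgumentException("Expected " + kind + " but found " + t.kind);
        }
        pos++;
        return t;
    }

    /** Phrase method for: sum ::= NUMBER ( PLUS NUMBER )* EOF */
    public int sum() {
        int total = Integer.parseInt(expect(Kind.NUMBER).image);
        // One token of lookahead decides whether another "+ NUMBER" follows.
        while (tokens.get(pos).kind == Kind.PLUS) {
            expect(Kind.PLUS);
            total += Integer.parseInt(expect(Kind.NUMBER).image);
        }
        expect(Kind.EOF);
        return total;
    }

    public static void main(String[] args) {
        System.out.println(new SumParser("1 + 22 + 3").sum()); // prints 26
    }
}
```

In a real *.jj file you would write only the token descriptions and the phrase rule with its embedded Java; the tokenizer and the plumbing around the phrase methods are what JavaCC generates for you.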


Dongwon Lee used to maintain a list of JavaCC grammars but abandoned the project. I wrote him, but he did not write back. I used the Wayback Machine to recover some of his list and snapshotted it here rather than let it be discarded. Please pass on new submissions and updates.

The email addresses displayed below have been munged with Masker to discourage spam harvesters. They are not copy/pastable or extractable from the HTML (Hypertext Markup Language), so you will have to retype them into your email program because they are *.png images, not text.

JavaCC Grammars
| Grammar | Download | Author | Released | Version | Description |
|---|---|---|---|---|---|
| Ada | | John R. Callahan | 1998-06-04 | | Rudimentary Ada9x grammar. |
| ASN.1 | | Helena Sarin | 1998-03-20 | | ASN.1 grammar ITU Recommendation X.208. |
| C++ | | Theodore Norvell | | | JavaCC parser. Coming soon. |
| C++ | download | Sreenivasa Viswanadha | 1997-03-20 | 1.1 | Part of JavaCC package |
| C++ to HTML | download | Theodore Norvell | 2001-01-01 | | Applet to convert C++ code to HTML code, with syntax highlighting |
| Curl | download | Doug South | 1997-03-21 | | Part of JavaCC package |
| DCL | | Pat Martin | 2001-07-24 | 2.0 | DCL - Dynamic Content Language: Welcome to the DCL homepage. DCL was a project born in frustration. Dealing with HTML pages within java programs can be quite tedious. Most of that which you are presenting doesn't change, only small regions of the content tend to need dynamic output. In addition, it’s difficult to come up with one solution to every problem without resorting to a full fledged language and so DCL came to be… |
| DTD | download | Theodore Norvell | 2011-02-09 | | A JavaCC parser for content models as found in XML’s DTD files. It doesn’t parse the whole DTD file, because SAX can do that, but SAX doesn’t parse the content models. It may be out of date. |
| DTD | download | John Gebbie | 2001-03-27 | | Almost-complete grammar file to parse XML DTD 1.0. |
| | | John D. Ramsdell | 1997-01-20 | | Part of JavaCC package. Parses the output from du (disk usage summarization program on UNIX) into nested parenthesized lists. The parser was developed for an old version of the Disk Usage Tree Map Viewer. The latest version does not run du as an inferior process, so the parser is no longer used. |
| ECMAScript | | Jean-Marc Lugrin | 1998-10-17 | 1.0 | FESI (pronounced fezz-y rhymes with fuzzy) is a full implementation of the ECMAScript language (defined in the standard ECMA 262 available at ecma-international/ (edition of 1997-06). ECMAScript is largely equivalent to the JavaScript language version 1.1 or to the core part of JScript, but without the navigator specific extensions. |
| Express | download | Jason A. Goodman | 1998-10-26 | | Grammar for the EXPRESS product representation language as defined in ISO 10303-1 which covers industrial automation systems and integration, product data representation and exchange. |
| GDMO | download | Dermot Dwyer | 1999-04-19 | | ISO/CCITT Guidelines for Definition of Managed Objects (GDMO) into UML. |
| HEL | | Fabien Azavant, Arnaud Sahuguet | | | Part of World Wide Web Wrapper Factory (W4F) package. HEL (HTML Extraction Language) relies on the Document Object Model from W3C. An HTML document is a tree with a hierarchy of tags, with html as the root. HEL permits to describe extraction rules in terms of path-expressions along this tree. In addition, HEL has some regular expression capabilities to capture finer details of the document (like dates, numerical format, etc.) |
| HTML | download | | | | Part of JavaCC package. Covers HTML 3.2 specification. |
| HTML | | Brian Goetz | 1999-11-03 | 1.0 | This is a JavaCC grammar for parsing HTML documents. It does not enforce the DTD, but instead builds a simple parse tree which can be used to validate, reformat, display, analyze, or edit the HTML document. The goal was to produce a parse tree which threw away very little information contained in the source file, so that by dumping the parse tree, an almost identical copy of the input document would result. The only source information discarded by the parser is whitespace inside of tags (i.e., the spaces or newlines between the attributes of a tag.) It is not confused by things that look like tags inside of quoted strings. |
| IDL | download | | | 0.1 | Part of JavaCC package. A grammar for the IDL (Interface Definition Language) of OMG CORBA 2.0 specification. |
| | | Michael McConnell | 2002-01-19 | 1.1 | InfoSapient Business rules engine. InfoSapient is a business rules engine used for the expression of policy or operation rules within a business. |
| Java | | Sriram Sankar | | | Part of JavaCC package. Sriram Sankar is the original author of JavaCC and continues to maintain it. |
| Java 1.4 | download | Andrea Gini | 2002-02-24 | | A grammar for parsing java 1.4 sources. It is a modified version of the grammar written by Sriram Sankar for Java 1.1 and modified by David Williams for Java 1.2. It has been modified to accept Java sources for Java 1.4. The grammar have been modified in four parts: 1) assert has been included to the keyword list 2) AssertStatement() production has been added 3) the production Statement() has been modified in order to support AssertStatement() 4) in the main the string for Java1.2 code has been changed with for Java1.4 code |
| Java 1.4 | | Theodore Norvell | | | JJTree parser. Coming soon. |
| Java 1.4 | download | Marco Savard | 2002-03-28 | | I used the latest version of grammar for Java 1.4 (written by Andrea Gini). It works well, but I found two constructs that were not supported by this grammar, although it is valid in Java (according Oracle’s documentation). I have fixed the problem and I send you the modified grammar so it can be useful for some users… |
| Java 1.4 | download | Andrea Gini | 2002-05-05 | | This is a bug fix of the grammar written by Sriram Sankar for Java 1.1, modified by David Williams for Java 1.2, by Andrea Gini for Java 1.4 and finally by Marco Savard to include a missing construct… |
| Java to HTML | download | Theodore Norvell | 2001-01-01 | | Applet to convert Java code to HTML code, with syntax highlighting |
| Java to HTML | download | Paul Cager | 2001-12-19 | 1.0 | JavaCC grammar to convert Java or JavaCC code to HTML. |
| JavaScript | download | Roland Paterson-Jones | 1997-04-01 | | rloand.javascript package. A .jj grammar for Javascript, from the specification document, plus an expression tree generator and (incomplete) interpreter. |
| | | James Power | 1998-11-17 | | A parser for the programming language Oberon-2. Oberon is the latest generation in the Wirth family of languages, an heir to the Pascal and Modula tradition. |
| ODL | download | Vladimir Rubanov | 1999-07-23 | 0.1 | ODMG Object Definition Language (ODL). The language is based on IDL and is specified in ODMG 2.0 standard. Production numeration corresponds to this standard. The grammar is tested and used at the Institute for System programming of Russian Academy of Science. |
| OGNL | | Drew Davidson | 2002-04-05 | | OGNL stands for Object-Graph Navigation Language; it is an expression language for getting and setting properties of Java objects. You use the same expression for both getting and setting the value of a property. |
| OQL | download | Koen Hendrickx | 1997-09-25 | | (Unfinished Version) Object Query Language (OQL) defined in ODMG 2.0 specification. |
| PGN | download | Martin Rademacher | 2001-08-28 | 6.5 | PGN is Portable Game Notation. For more information see the Wikipedia PGN entry. |
| PHP | download | Satyam | | 0.1 | Parses PHP 5.0 grammar. Tested with the PHP 5 test suite, except for exceptions noted in the TODO list contained in the source. |
| Python | download | Jim Hugunin | | 2.0 | Part of JPython package. Python is an interpreted, interactive, object-oriented, extensible programming language. |
| Quilt | download | Henry Chiu, Dongwon Lee | 2000-08-17 | 0.9 | QuiltParser is a parser for the Quilt XML query language written with JavaCC as a part of XPRESS project at UCLA / CSD. This small, implementable language has been recently proposed by Robie, Chamberlin and Florescu; it integrates the advantages of various languages while meeting the W3C’s XML Query Requirements. |
| Rational Rose | download | Markus Dahm | 2001-06-18 | 1.11 | A JavaCC grammar for models created by Rational Rose. The Homepage of the CrazyBeans project. |
| RMAIL | download | | | | Part of JavaCC package. Processes RMAIL files that are created by the GNU emacs editor. |
| RPC | download | Adelene W. Ng | 2001-06-26 | 0.1 | RPC (Remote Procedure Call) Specification grammar; tested on JavaCC 2.0. |
| RTF | download | David Rosenstrauch | 2001-10-11 | 1.0 | A grammar for RTF (Rich Text Format) documents (frequently used with the Microsoft Word word processor, as well as several others). |
| RTF | download | Eric Friedman | 2001-10-31 | | JavaCC grammar for parsing RTF files. This parser handles unicode RTF as well as double byte RTF files used to represent Asian character sets. Jar file includes the RTFParser.jj grammar, source for a parser delegate interface for applications to implement when they need to respond to RTF events and a copy of the LGPL license. |
| Simkin | download | Simon Whiteside | 2001-01-02 | 1.1 | Simkin is a high-level lightweight embeddable scripting language which works with Java or C++ and XML. |
| SpecC | download | H. Chen, Rukhsana Alam | 2001-06-15 | | The SpecC is a system-level design and specification language developed in University of California, Irvine. Since SpecC is a superset of C, we built this grammar file based on the C grammar file contributed by Mr.Doug South. |
| SPL | download | Ken Beesley | 2001-02-21 | | A modification of one of the example grammars that is part of the standard JavaCC download. The original example is /examples/Interpreter/ which implements a small language called SPL (stupid programming language). The original example parses, gets an AST tree and then interprets the tree by calling the interpret() method in the root node, which in turn calls interpret() methods in the daughter nodes. The hand-written interpreter code is therefore spread out through all the various AST class files. |
| SQL | download | Ramanathane | 1997-04-09 | 0.5 | Grammar for PL/SQL inside Oracle*Forms 4.5(i.e. PlSql 1.x). |
| SQL | download | Kevin | | | Parser for Oracle SQL. |
| STEP | download | Singva Ma | 1999-08-18 | | STEP Clear Text Encoding syntax is used along with the EXPRESS language to exchange neutral files between CAD systems. The description of the syntax can be found in ISO 10303-21. |
| StruQL | | Mary Fernandez, Dana Florescu, Alon Levy, Dan Suciu | | 0.4 | Strudel is a web-site management system developed at AT&T Labs - Research. Its query language, StruQL, specifies how a Web site is constructed from the source data modeled by a data graph. No longer supported. |
| VHDL | download | Christoph Grimm | 1998-12-06 | | The parser covers the complete IEEE 1076 - Standard including the extensions proposed as IEEE 1076.1. |
| Visual Basic | download | Paul Cager | 2002-03-13 | 3.0 | UPDATED: A JavaCC parser for Visual Basic, using JJTree to generate an AST. The grammar supports most VB constructs and could be used as a starting point for an ASP grammar. |
| VRML | download | Koen Hendrickx | | | Covers most of VRML 1.0 specification proposed as IEEE 1076.1. |
| VRML | | Satoshi Konno | | 1.1 beta | CyberVRML97 for Java is a development library of VRML97/2.0 applications. Using the library, you can easily read and write the VRML files, set and get the scenegraph information, draw the geometries, run the behaviors. Everyone can use the library free for commerce or an individual purpose. Now called CyberX3D and ported to Xerces. |
| XML | | Patrice Bonhomme | 1998-07-13 | 0.5 | XSilfide is a client/server based environment for distributing language resources. Part of the XSilfide components is SXP (Silfide XML Parser: a parser and a complete XML API in Java). More in detail, it supports XML 1.0 (REC 1998-02-10), XML NameSpace (WD 1998-03-27), Document Object Model Level 1 (DOM Core and XML, WD 1998-04-16), XLink (WD 1998-03-03) and XPointer (WD 1998-03-03). From their project home page, follow Technology and SXP to download the package. |
| XML | | Norbert H. Mikula | 1997-05-08 | 0.97 | NXP is a (validating) XML (eXtensible Markup Language) parser. (It appears that NXP is no longer public software; commercial product was available at, but this does not include JavaCC source code. The above link has the last-known public distribution version.) Norbert Mikula co-authored XML For Dummies. He died in 2009. |
| XML-QL | download | Alin Deutsch, Mary Fernandez, Dana Florescu, Alon Levy, Dan Suciu, Wang-Chiew Tan | | 0.9 | Data extraction, conversion, transformation and integration are all well-understood database problems. Their solutions rely on a query language, either relational (SQL) or object-oriented (OQL). Unlike relational or object-oriented data, XML is semistructured, i.e., it can have irregular and extensible structure and its attributes, or schema, are stored with the data. XML-QL is a query-language for XML and is suitable for performing the above tasks. No longer supported. |
| | | Ingo Macherius | 1999-08-19 | 1.1 | This distribution contains a JavaCC generated Parser and two frontends for testing. Look at the source code for some configurable options. To change options the frontends need to be recompiled. The grammar is straightforward from the drafts. It is not optimized for speed and had little to no checking for correctness beside a tiny regression test on all examples from the W3C drafts. The grammar implements XPath/XSLT patterns as of 1999-08-13. |
| XQuery | download | W3C XML Query Working Group | 2001-01-26 | | This specification describes a new query language called XQuery (successor of Quilt), which is designed to be broadly applicable across all types of XML data sources. See Appendix G. |

Learning More

Recommended book: Generating Parsers with JavaCC
by Tom Copeland, ISBN 978-0-9762214-3-2, paperback
publisher Centennial
published 2007
You can also order direct from the author. Also covers JJTree, JTB and JJDoc. It explains how to write Unicode parsers to handle a much richer character set than in traditional languages. It covers integration with Eclipse, but not IntelliJ. The documentation for JavaCC is scattered over the web. This book helps you find it all. The book treats the reader gently. There is plenty of repetition, full explanation, examples, comments about what is going on under the hood and pointing out of pitfalls. He even tells you what the various tools he is describing are useful for. You can tell he has been around the block and is talking from hard-won experience. It is like talking with a fellow programmer over coffee. He has a very kind, friendly way of writing. His only fault is a tendency to belabour the obvious. Not available in bookstores. Try the publisher Centennial books.

Contact Roedy. Please feel free to link to this page without explicit permission.
