Compilers and Interpreters

Compiler Construction Tools
Bison (parser generator)
Bison generates a parser when presented with an
LALR(1) context-free grammar that is yacc compatible.
The generated parser is in C. It includes extensions
to the yacc features that actually make it easier to
use if you want multiple parsers in your program.
Bison works on Win32, MSDOS, Linux and numerous
other operating systems. The link points to the
source code which should compile with many compilers
(especially GNU's gcc). Although the program itself
is under GPL, the generated parser (using the
bison.simple skeleton) can be distributed without
restriction.
Grammatica
Grammatica is a parser generator for C# and Java. It
uses LL(k) grammars with an unlimited number of
look-ahead tokens. It purportedly creates commented
and readable source code, has automatic error
recovery and detailed error messages. The generator
creates the parser at runtime thus also allowing you
to test and debug the parser before you even write
your source code. The program is released under the
GNU General Public License with an exception to
facilitate its use by commercial software.
Very Portable Optimizer (Vpo) (Code Generator)
Vpo is a global optimizer that is language and
compiler independent. It can be retargeted and
supports a number of architectures. It is useful if
you need a back end for the compiler you're
constructing that handles optimized code generation.
MLRISC Retargetable and Optimizing Compiler Back End
MLRISC is a customizable optimizing compiler backend
that can be retargeted to multiple architectures. It
is written in Standard ML, and requires that your
front end be written in ML.
Ulm's Modula-2 LALR(1) Parser Generator
Ulm's Modula-2 System comprises an LALR(1) parser
generator for Modula-2, a Modula-2 compiler,
Modula-2 beautifier, Modula-2 debugger, a Modula-2
tags utility (like ctags for C), a Modula-2-Prolog
interpreter, and a Pascal to Modula-2 translator. It
is distributed under the terms of the GNU GPL
(compiler and tools) and the GNU LGPL (library). It
supports SPARCv8/Solaris 2.x and MC68020/SunOS 4.1x.
sid (Parser Generator)
Sid is an LL(1) parser generator that produces C
code. It comes with (and is used to create) the
TenDRA C compiler (you have to download the compiler
to obtain this parser generator).
YaYacc (Generates Parsers)
YaYacc, or Yet Another Yacc, generates C++ parsers
using an LALR(1) algorithm. YaYacc itself runs on
FreeBSD, but the resulting parser is not tied to any
particular platform (it depends on your code, of
course).
Optimix Optimizer Generator
This optimizer generator allows you "to generate
program analysis and transformations". It may be
used in a CoSy compiler framework, with the Cocktail
tool, or with Java.
Elex Scanner Generator
Elex is a lexical scanner (lexer) generator that
supports multiple programming languages. It is
released under the GNU GPL, and has been tested on
Linux.
JACCIE (Java-based Compiler Compiler)
Jaccie includes a scanner generator and a variety of
parser generators that can handle LL(1), SLR(1) and
LALR(1) grammars. It has a debugging mode where you
can operate it non-deterministically.
GOLD Parser
The GOLD Parser is a parser generator
(compiler-compiler) that generates parsers that use
a Deterministic Finite Automaton (DFA) for the
tokenizer and an LALR(1) state machine for the
parser.
Unlike other parser generators, GOLD does not
require you to embed your grammar into your source
code. It saves the parse tables into a separate file
which is loaded by the parser engine when run.
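The idea of keeping the scanner tables as data, separate from the engine that interprets them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the JSON layout, the `char_class` helper and the `tokenize` function are invented for this sketch and are not GOLD's actual table file format or engine API.

```python
import json

# Hypothetical table layout, invented for this sketch (NOT GOLD's file format).
# In practice this text would live in a separate file read at start-up.
TABLES_JSON = """
{
  "start": 0,
  "accept": {"1": "NUMBER", "2": "IDENT"},
  "edges": {
    "0": [["digit", 1], ["alpha", 2]],
    "1": [["digit", 1]],
    "2": [["alpha", 2], ["digit", 2]]
  }
}
"""


def char_class(ch):
    """Map a character to the symbolic class used on the DFA edges."""
    if ch.isdigit():
        return "digit"
    if ch.isalpha() or ch == "_":
        return "alpha"
    return "other"


def tokenize(text, tables):
    """Generic engine: it knows nothing about the language, only the tables."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state = tables["start"]
        last_accept = None  # (kind, end) of the longest accepted prefix so far
        i = pos
        while i < len(text):
            cls = char_class(text[i])
            target = None
            for edge_cls, edge_target in tables["edges"].get(str(state), []):
                if edge_cls == cls:
                    target = edge_target
                    break
            if target is None:
                break  # no transition: stop and fall back to last_accept
            state = target
            i += 1
            kind = tables["accept"].get(str(state))
            if kind is not None:
                last_accept = (kind, i)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, end = last_accept
        tokens.append((kind, text[pos:end]))
        pos = end
    return tokens


tables = json.loads(TABLES_JSON)
print(tokenize("x1 42 foo_bar", tables))
```

Changing the language then means regenerating the table data only; the engine code stays the same.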
LEMON Parser Generator
This LALR(1) parser generator claims to generate
faster parsers than Yacc or Bison. The generated
parsers are also re-entrant and thread-safe. The
program is written in C, and only the source code is
provided, so you will need a C compiler to compile
LEMON before you can use it.
Accent Compiler Compiler
A compiler-compiler that avoids the problems of
LALR parsers (e.g., shift/reduce and reduce/reduce
conflicts) and of LL parsers (with their
restrictions on left-recursive rules). You specify
your input grammar in Extended Backus-Naur Form, in
which you can indicate repetition, choices and
optional parts. You can insert semantic actions
anywhere, and ambiguous grammars are allowed. All
these features make Accent grammars easier to write
than (e.g.) Yacc grammars. The website warns,
however, that the generated code requires
significantly more system resources than code
generated by Yacc. Accent is distributed under the
GNU GPL. I'm not sure what terms apply to the
generated C code.
PRECCX (Prettier Compiler-Compiler Extended)
PRECCX, or PREttier Compiler-Compiler eXtended, is
"an infinite-lookahead compiler-compiler for context
dependent grammars" which generates C code. You
specify an input grammar in an extended BNF notation
where inherited and synthetic attributes are
allowed. The parser is essentially LL(infinity) with
optimisations. You can get versions for MSDOS, Linux
and other Unices (including Sun, HP, etc). Source
code is available and you can apparently compile it
on other platforms with an ANSI C compiler if
needed.
Byacc/Java (Parser Generator)
This is a version of Berkeley yacc modified so that
it can generate Java source code. You simply supply
a "-j" option on the command line and it'll produce
the Java code instead of the usual C output. You can
either get the free source code and compile it
yourself, or download any of the precompiled
binaries for Solaris, SGI/IRIX, Windows 95/NT, and
Linux. Like the byacc original, your output is free
of any restrictions, and you can freely use it for
any purpose you wish.
COCO/R (Lexer and Parser Generators)
This tool generates recursive descent LL(1) parsers
and their associated lexical scanners from
attributed grammars. It comes with source code, and
there are versions that generate Oberon, Modula-2,
Pascal, C, C++ and Java. A version for Delphi is (at
the time of this writing) "on the way". Platforms
supported appear to vary (Unix systems, Apple
Macintosh, Atari, MSDOS, Oberon, etc) depending on
the language you want generated.
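For readers unfamiliar with the term, a recursive descent LL(1) parser has one function per grammar rule and chooses what to do next by looking at a single token. The hand-written Python sketch below shows the general shape of such a parser, with the computed value playing the role of an attribute. It is not Coco/R output, and the grammar, class and function names are assumptions made for this illustration.

```python
import re

# Grammar used in this sketch (EBNF):
#   Expr   = Term   { ("+" | "-") Term } .
#   Term   = Factor { "*" Factor } .
#   Factor = number | "(" Expr ")" .

TOKEN_RE = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(text):
    """Split text into (kind, value) pairs, ending with an 'eof' marker."""
    tokens = []
    for number, op in TOKEN_RE.findall(text):
        if number:
            tokens.append(("num", int(number)))
        elif op.strip():  # ignore stray whitespace matched by '.'
            tokens.append((op, None))
    tokens.append(("eof", None))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    @property
    def kind(self):
        """Kind of the single lookahead token (the '1' in LL(1))."""
        return self.tokens[self.pos][0]

    def expect(self, kind):
        tok_kind, value = self.tokens[self.pos]
        if tok_kind != kind:
            raise SyntaxError(f"expected {kind!r}, found {tok_kind!r}")
        self.pos += 1
        return value

    def expr(self):
        value = self.term()
        while self.kind in ("+", "-"):  # EBNF repetition becomes a loop
            op = self.kind
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.kind == "*":
            self.pos += 1
            value *= self.factor()
        return value

    def factor(self):
        if self.kind == "num":
            return self.expect("num")
        self.expect("(")
        value = self.expr()
        self.expect(")")
        return value


def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    parser.expect("eof")  # reject trailing input
    return value


print(evaluate("2 * (3 + 4) - 5"))
```

A generator automates exactly this translation from grammar rules to functions, along with the scanner that feeds it.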
Eli
A programming environment that allows you to generate
complete language implementations from
application-oriented specifications. The user
describes the problems that need to be solved and
Eli uses the tools and components required for that
problem. It handles structural analysis and the
analysis of names, types and values, stores
translation structures, and produces the target
text. It generates C code. The program is available
in source form and has been tested under Linux,
IRIX, HP-UX, OSF, and SunOS. Eli itself is
distributed under the GNU GPL but the generated code
is your property to do with as you please.
ALE
This freeware system is written in Prolog and
requires SICStus Prolog 3.7, SWI Prolog or Quintus
Prolog (no longer maintained?) to run. It handles
phrase structure parsing, semantic-head-driven
generation and constraint logic programming, and
includes a source-level debugger.
TP Lex/Yacc (Lexical Analyzer and Parser Generators)
This is a version of Lex and Yacc designed for
Borland Delphi, Borland Turbo Pascal and the Free
Pascal Compiler (you can find legally free versions
of all the above listed on our Free Delphi Compilers
and Pascal Compilers page). Like its lex and yacc
predecessors, this version generates lexers and
parsers, although in its case, the generated code is
in the Pascal language.
Gentle Compiler Construction System
This compiler construction tool purports to provide
a uniform framework for language recognition,
definition of abstract syntax trees, construction of
tree walkers based on pattern recognition, smart
traversal, simple unparsing for source to source
translation and optimal code selection for
microprocessors. Note however that if you use it to
create an application, the licensing terms require
that your applications be licensed under the GNU
GPL. This probably restricts your use of it in a
commercial program, unless you are prepared to pay
for a special license or you plan to make the
sources for your program available anyway.
Bison for Eiffel (Parser generator)
This version of Bison produces Eiffel source code.
Like Bison, it is released under the GNU GPL. I am
uncertain whether the generated parser can be
distributed freely without restrictions (the current
versions of Bison allow this if you do not modify
the output).
Cocktail (compiler construction kit)
This is a set of tools that generates programs for
almost every phase of a compiler. Rex generates
either a C or Modula-2 scanner. Lalr generates
table-driven C or Modula-2 LALR parsers from
grammars written in extended BNF notation, while Ell
does the same for LL(1) parsers. The parsers
generated have automatic error recovery, error
messages and error repair. Ast generates abstract
syntax trees, Ag generates an attribute evaluator,
and Puma is a transformation tool based on pattern
matching. The scanners and parsers generated are
supposed to be faster than those generated by lex
and yacc. The tools are publicly copyable and are
implemented in Modula-2.
Aflex and Ayacc
This combination of a lexer and parser generator was
written in Ada and generates Ada source code. It is
modelled after lex and yacc.
ANTLR (Recursive Descent Parser Generator)
ANTLR generates a recursive descent parser in C, C++
or Java from predicated-LL(k>1) grammars. It is able
to build ASTs automatically. If you are using C, you
may have to get the PCCTS 1.XX series (the precursor
to ANTLR), also available at the site. The latest
version may be used for C++ and Java.
Byacc (Berkeley YACC)
Berkeley YACC ("Yet Another Compiler Compiler") is a
public domain parser generator that is the precursor
of GNU Bison. The link above points to a
directory where you can download the sources to the
program (look for a file beginning with "byacc").
Although it appears to be no longer maintained, this
is one of the best yacc clones (plus it's in the
public domain). The link points to the source code
which should compile with many compilers (including
GNU's gcc).
BtYacc (generates parsers)
To quote from the documentation, BtYacc, or
BackTracking Yacc, "is a modified version of
Berkeley Yacc that supports automatic backtracking
and semantic disambiguation to parse ambiguous
grammars, as well as syntactic sugar for inherited
attributes". The program comes with sources which
are in the public domain. Although the author only
mentions compilation of the program on Unix and
Win32 systems, it is likely that the program can be
compiled and run on MSDOS systems using an MSDOS
port of the GNU C compiler like DJGPP, since the GNU
compiler was used on the other systems. For more
information about DJGPP, see the Free C and C++
compilers page.
Flex (Lex drop-in replacement)
Flex generates a lexical analyser in C or C++ given
an input specification. It is highly compatible with
the original Unix lex program, although it has
numerous features that make it more useful if you
are writing your own scanner. It is designed so that
it can be used together with yacc and its clones
(like byacc and bison, also listed on this page).
The URL given above is for the original source code.
The source code can also be obtained from the GNU
ftp site.
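As a rough picture of what a lex-style scanner does, the Python sketch below tries every rule at the current position, takes the longest match, and breaks ties in favour of the rule listed first. This mirrors the usual lex matching convention, but it is only a hand-written illustration: the rule table and the `scan` function are assumptions for this example, not Flex input syntax or Flex output.

```python
import re

# Rules in priority order: (token kind, pattern). Names are illustrative only.
RULES = [
    ("IF", re.compile(r"if")),
    ("IDENT", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("NUMBER", re.compile(r"[0-9]+")),
    ("SKIP", re.compile(r"[ \t\n]+")),
]


def scan(text):
    """Yield (kind, lexeme) pairs using longest-match, first-rule-wins."""
    pos = 0
    while pos < len(text):
        best_kind, best_len = None, 0
        for kind, pattern in RULES:
            match = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches tie in length.
            if match and match.end() - pos > best_len:
                best_kind, best_len = kind, match.end() - pos
        if best_kind is None:
            raise ValueError(f"no rule matches {text[pos]!r} at {pos}")
        if best_kind != "SKIP":
            yield best_kind, text[pos:pos + best_len]
        pos += best_len


print(list(scan("if iffy 42")))
```

Here `if` is reported as the keyword because both rules match two characters and the keyword rule comes first, while `iffy` is an identifier because the identifier rule matches more characters. A generated scanner gets the same behaviour from a single automaton instead of trying each pattern in turn.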
Java Compiler Compiler (JavaCC)
This Java parser generator is written in Java and
produces pure Java code. It even comes with grammars
for Java 1.0.2 and 1.1, as well as HTML. It generates
recursive descent parsers (top-down) and allows you
to specify both lexical and grammar specifications
in your input grammar. In terms of syntactic and
semantic lookahead, it generates an LL(1) parser
with specific portions LL(k) to resolve things like
shift-shift conflicts. The input grammar is in
extended BNF notation. It comes with JJTree, a tree
building preprocessor; a documentation generator;
support for Unicode (and hence
internationalization), and many examples. There are
numerous other features, including debugging
capabilities, error reporting, etc.
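To see why a mostly LL(1) parser sometimes needs more lookahead at one spot, consider a statement rule whose two alternatives both begin with an identifier. One token cannot tell them apart, so the parser peeks at a second token at that choice point only. The Python sketch below illustrates the idea; it is not JavaCC grammar syntax, and the toy grammar, token list format and function name are assumptions made for this example.

```python
# Toy grammar for this sketch:
#   Statement  = Assignment | Call .
#   Assignment = identifier "=" number .
#   Call       = identifier "(" ")" .


def parse_statement(tokens):
    """Parse one statement from a list of token strings."""

    def peek(k):
        return tokens[k] if k < len(tokens) else "<eof>"

    if not peek(0).isidentifier():
        raise SyntaxError(f"expected identifier, found {peek(0)!r}")

    # Both alternatives start with an identifier, so the first token cannot
    # decide between them. Look one token further, at this point only.
    if peek(1) == "=":
        if not peek(2).isdigit() or peek(3) != "<eof>":
            raise SyntaxError("malformed assignment")
        return ("assign", peek(0), int(peek(2)))
    if peek(1) == "(":
        if peek(2) != ")" or peek(3) != "<eof>":
            raise SyntaxError("malformed call")
        return ("call", peek(0))
    raise SyntaxError(f"expected '=' or '(', found {peek(1)!r}")


print(parse_statement(["x", "=", "42"]))
print(parse_statement(["f", "(", ")"]))
```

Everywhere else in such a parser a single token of lookahead is enough, which keeps the common path cheap.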
Programming Language Creator
According to the documentation, the Programming
Language Creator is designed to enable you "to
easily create new programming languages, or create
interpreted versions of any compiled language"
without the need for you to wrestle with yacc and
lex. If you want your application to have a
scripting language, you might want to look at this
to see if it meets your requirements. The binaries,
available free, are for Win32, and the source code
is available for a fee.
SableCC (generates lexers)
This is an object-oriented framework that generates
DFA based lexers, LALR(1) parsers, strictly typed
syntax trees, and tree walker classes from an
extended BNF grammar (in other words, it's a
compiler generator). The program was written in Java
itself, runs on any Java 1.1 (or later) system and
generates Java sources.
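The combination of typed syntax tree nodes and tree walker classes can be pictured with a small Python sketch: each node type is its own class, a generic walker traverses the tree depth first, and a subclass overrides only the hooks it cares about. The class and method names here are invented for illustration and are not the classes SableCC generates.

```python
from dataclasses import dataclass


# One class per node type, so a tree can only be built from known shapes.
@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Mul:
    left: object
    right: object


class Walker:
    """Depth-first walker: visit the children, then call out_<NodeName>."""

    def walk(self, node):
        for child in (getattr(node, "left", None), getattr(node, "right", None)):
            if child is not None:
                self.walk(child)
        handler = getattr(self, "out_" + type(node).__name__, None)
        if handler is not None:
            handler(node)


class Evaluator(Walker):
    """Overrides only the hooks it needs; traversal is inherited."""

    def __init__(self):
        self.stack = []

    def out_Num(self, node):
        self.stack.append(node.value)

    def out_Add(self, node):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left + right)

    def out_Mul(self, node):
        right = self.stack.pop()
        left = self.stack.pop()
        self.stack.append(left * right)


tree = Add(Num(1), Mul(Num(2), Num(3)))  # represents 1 + 2 * 3
evaluator = Evaluator()
evaluator.walk(tree)
print(evaluator.stack[0])
```

Keeping the traversal in the base class means each compiler pass is a small subclass, and the grammar file stays free of action code.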