6 Context-free languages

A context free language is a language that can be expressed by a set of productions, each satisfying the following constraints:

Note that this constraint is a subset of the constraints of regular expressions. It means that the set of context-free languages is a superset of the set of regular expression languages. This may not seem like a severe constraint, but it is. In other words, there are many languages that are not context-free.

Compared to regular expressions, a context-free language can be more flexible. For example, a general infix expression is context-free, but it is not a regular expression because the operand of a binary operator are both non-terminals. In other words, the production Sum ::= Product "+" Product does not satisfy the constraints of a regular expression, yet it is essential to express operator priority.

Note that even the prefix and postfix (also knowns RPN, reversed Polish notation) expressions are context-free. This is still because of binary operators. In order to express a well-formed prefix or postfix expression (so that there is no missing or excess operands), we still need to use multiple nonterminals in a single production, such as BinOp ::= Operand Operand Operator, in which Operator expands to a single terminal, but an Operand can be a BinOp itself.

One exception is the language accepted by a “dumb” calculator that does not handle operator priorities. Such a calculator accepts 23+4*6 as an input string and computes its result as (23+4)*6.

All programming languages are context-free. This is a necessity because context-free languages can be accepted by push-down automata (also known as stack machines).