Assay

Assay is a C library that provides functions to parse a configuration file in yet another variation on the widely used and underspecified INI format. The syntax of the file format implemented by Assay is specified by an LALR(1) grammar that was strongly influenced by my experience using configuration files in the Asterisk open source PBX. Portions of the C code, specifically its reentrant lexical scanner and reentrant shift-reduce parser, are generated at build time using the GNU Flex and Bison tools. Assay is built on top of the Diminuto library of GNU/Linux-based software tools and makes heavy use of its implementation of balanced Red-Black trees.
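
As a rough illustration of why a balanced tree suits this job, the sketch below stores properties keyed by section and keyword. It does not use Diminuto or any Assay function: it substitutes the POSIX tsearch() and tfind() calls from <search.h>, which glibc implements with red-black trees. The names sketch_property, sketch_set, and sketch_get are hypothetical, and using a single tree keyed on both strings is an assumption made to keep the example short.

```c
#include <assert.h>
#include <search.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for this sketch; not an Assay or Diminuto type. */
struct sketch_property {
    const char *section;
    const char *keyword;
    const char *value;
};

/* Order properties by section name first, then by keyword. */
static int sketch_compare(const void *a, const void *b)
{
    const struct sketch_property *pa = a;
    const struct sketch_property *pb = b;
    int rc = strcmp(pa->section, pb->section);

    return (rc != 0) ? rc : strcmp(pa->keyword, pb->keyword);
}

/* Insert a property, or replace its value if it already exists.
 * Returns 0 on success, -1 if memory could not be allocated. */
int sketch_set(void **rootp, const char *section, const char *keyword,
               const char *value)
{
    struct sketch_property *pp = malloc(sizeof(*pp));
    struct sketch_property **found;

    assert(rootp != NULL);
    if (pp == NULL) {
        return -1;
    }
    pp->section = section;
    pp->keyword = keyword;
    pp->value = value;

    found = tsearch(pp, rootp, sketch_compare);
    if (found == NULL) {
        free(pp);
        return -1;
    }
    if (*found != pp) {
        /* The key was already in the tree: update it in place. */
        (*found)->value = value;
        free(pp);
    }
    return 0;
}

/* Look up a value; returns NULL if the property is not in the tree. */
const char *sketch_get(void *const *rootp, const char *section,
                       const char *keyword)
{
    struct sketch_property key = { section, keyword, NULL };
    struct sketch_property **found = tfind(&key, rootp, sketch_compare);

    return (found != NULL) ? (*found)->value : NULL;
}
```

The sketch stores the caller's string pointers rather than copies, so the strings must outlive the tree. Starting from a root pointer set to NULL, a call such as sketch_set(&root, "Section1", "keyword11", "value11") adds a property, and sketch_get(&root, "Section1", "keyword11") retrieves it in logarithmic time.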

Here is an example of what an INI file might look like.

; This property goes into the default section named "general".
keyword01 = value01

[Section1]
keyword11=value11
keyword12: value1
keyword\ 13 : value 13

#include common.ini

[Section\ 2]
keyword3 = \ value\t3
keyword4=\
123\
456;

#exec generated.sh

[Section3] keyword5: value5
[Section3] keyword6: value6

See also the example files used by the included unit tests.
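
The escapes in the example above can be made concrete with a small sketch. The function below is not part of Assay and is not generated by Flex or Bison; sketch_decode_value is a hypothetical helper that decodes the raw text following an equal sign or colon. It assumes, as the \t in the example suggests, that C-style escape sequences are recognized, and it handles only \t and \n of those. It does not address trailing white space or white space at the start of a continuation line.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not an Assay function: decode the raw text that
 * follows the '=' or ':' of a property into out[], which holds size bytes.
 * Returns the decoded length, or (size_t)-1 if out[] is too small.
 */
size_t sketch_decode_value(const char *raw, char *out, size_t size)
{
    size_t n = 0;

    assert(raw != NULL && out != NULL && size > 0);

    /* Unescaped white space in front of the value is skipped. */
    while (*raw == ' ' || *raw == '\t') {
        raw++;
    }

    for (; *raw != '\0'; raw++) {
        char c = *raw;

        if (c == ';' || c == '\n') {
            break; /* An unescaped semicolon or newline ends the value. */
        }
        if (c == '\\' && raw[1] != '\0') {
            c = *++raw;
            if (c == '\n') {
                continue; /* An escaped newline is discarded. */
            } else if (c == 't') {
                c = '\t'; /* Assumed C-style escape. */
            } else if (c == 'n') {
                c = '\n'; /* Assumed C-style escape. */
            }
            /* Any other escaped character, such as ' ' or ';', is literal. */
        }
        if (n + 1 >= size) {
            return (size_t)-1;
        }
        out[n++] = c;
    }

    out[n] = '\0';
    return n;
}
```

Under these assumptions, the raw text after the equal sign in the keyword3 line decodes to a space, the word value, a tab, and the digit 3, while the three physical lines of keyword4 decode to 123456 with the trailing semicolon starting an empty comment.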

The syntax rules for the INI file format supported by Assay are pretty simple (but the grammar is the definitive source).

  1. The characters octothorpe, equal sign, colon, semicolon, left square bracket, and right square bracket are special.
  2. White space at the beginning of any line is ignored.
  3. A comment begins with a semicolon and can occur on a line by itself or on the same line after any other statement.
  4. The beginning of a section is declared within square brackets. Special characters or white space in a section name must be escaped; they then become part of the section name.
  5. Properties consist of a keyword, an equal sign or a colon, and a value. White space may occur on either side of the equal sign or colon.
  6. If a keyword contains special characters or white space, those characters must be escaped.
  7. The value starts at the first non-white space character following the equal sign or colon. A white space character that is the first character of a value must be escaped. The value continues until end of line or a comment. A semicolon in the value must be escaped.
  8. As a shortcut, a section can be declared followed by a property on the same line.
  9. Statements can be extended across multiple lines by escaping the newline at the end, which is discarded.
  10. An octothorpe as the first character of a statement signals an operation that interrupts the parsing of the current input stream. Every operation consists of an operator and an argument separated by white space. The two operators currently supported are include and exec.
  11. The include operator temporarily redirects parsing to the file identified by the path name in the argument. When end of file is reached, parsing of the stream containing the include statement resumes.
  12. The exec operator temporarily redirects parsing to the standard output of the shell command specified by the argument, which may include white space. When the shell command exits, parsing of the stream containing the exec statement resumes.
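
The way the include and exec operators suspend one input stream and later resume it can be sketched with a small stack of streams. This is an illustration under stated assumptions, not Assay's implementation: every name beginning with sketch_ is hypothetical, the nesting limit is arbitrary, and the operator is assumed to be separated from its argument by exactly one space. It relies only on the standard fopen() and the POSIX popen() and pclose() calls.

```c
#define _POSIX_C_SOURCE 200809L /* For popen() and pclose(). */

#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_DEPTH 8 /* Arbitrary nesting limit chosen for this sketch. */

/* One suspended or active input stream. */
struct sketch_source {
    FILE *fp;
    int piped; /* Nonzero if fp came from popen() and needs pclose(). */
};

struct sketch_stack {
    struct sketch_source source[SKETCH_DEPTH];
    int depth;
};

/* Make fp the active stream. Returns 0 on success, or -1 if fp is NULL
 * or the stack is full (in which case fp is closed). */
static int sketch_push(struct sketch_stack *sp, FILE *fp, int piped)
{
    assert(sp != NULL);
    if (fp == NULL) {
        return -1;
    }
    if (sp->depth >= SKETCH_DEPTH) {
        if (piped) {
            pclose(fp);
        } else {
            fclose(fp);
        }
        return -1;
    }
    sp->source[sp->depth].fp = fp;
    sp->source[sp->depth].piped = piped;
    sp->depth++;
    return 0;
}

/* Dispatch "#include <path>" or "#exec <command>". Returns 0 on success,
 * or -1 on failure or if the line is not an operation. */
int sketch_operate(struct sketch_stack *sp, char *line)
{
    line[strcspn(line, "\n")] = '\0'; /* Strip the trailing newline. */
    if (strncmp(line, "#include ", 9) == 0) {
        return sketch_push(sp, fopen(line + 9, "r"), 0);
    }
    if (strncmp(line, "#exec ", 6) == 0) {
        /* The whole remainder is the command, so it may contain spaces. */
        return sketch_push(sp, popen(line + 6, "r"), 1);
    }
    return -1;
}

/* Read the next line from the innermost stream. When a stream is
 * exhausted it is closed and popped, so reading resumes in the stream
 * that contained the operation. Returns NULL when no input remains. */
char *sketch_next_line(struct sketch_stack *sp, char *buf, int size)
{
    while (sp->depth > 0) {
        struct sketch_source *top = &sp->source[sp->depth - 1];

        if (fgets(buf, size, top->fp) != NULL) {
            return buf;
        }
        if (top->piped) {
            pclose(top->fp);
        } else {
            fclose(top->fp);
        }
        sp->depth--;
    }
    return NULL;
}
```

A caller would push the top-level file, then loop on sketch_next_line(), passing any line that begins with an octothorpe to sketch_operate() and every other line to the parser. Because an exhausted stream is simply popped, nested include and exec statements unwind in the reverse of the order in which they were encountered.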

I had a variety of reasons for tackling this project.

  • I've written many recursive descent parsers, and parsers implemented as push-down automata, over the years, but had never worked with Lex/Flex or Yacc/Bison.
  • I've worked with a lot of much simpler LL(1) grammars, but never an LALR(1) grammar, except as an academic exercise way back in graduate school.
  • I found the support for configuration files in Asterisk to be really useful.
  • I wanted to implement a non-trivial application using the Diminuto Red-Black balanced tree feature.

Here are some references I found useful.

Assay can be found on GitHub here.

Unless otherwise specified, all contents Copyright © 1995-2015 by the Digital Aggregates Corporation, Colorado, USA.
Such copyrighted content is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.