Flex and Bison (Lex and Yacc)

A majority of my time in software development is spent converting data from one form to another.  I think the same may be true for all developers, depending on how one views the process.  Keyboard input is translated into events, data, and commands, which are put into memory-based data structures, and then translated into visual feedback on the monitor, perhaps into network packets transmitted over the wire, and perhaps again into records in a database.

At the most primitive level, I am acting as a walking, talking compiler, compiling designs and specifications into a representation that the computer can manage.  It would be great, then, for the computer to better understand how I communicate, but it is quicker for me to learn to represent the information in a way the computer can understand.

The first formal language that I was trained in was Pascal, which I don’t hear too much of these days.  It’s a procedural language, and each function would begin with the keyword BEGIN and end with the keyword END.  I am not going to claim that it read like English, but it was certainly closer than many of the languages in use today.  There were researchers and scholars who spoke of Artificial Intelligence and natural language processing, but it did seem a long way off back in the 1970s.  Every now and again, academic types would gather and the idea of a learning, quasi-sentient intelligence created from silicon would gain momentum, and then the reality would fizzle like a wet sparkler on the Fourth of July.

The C language came along, and to me it was basically Pascal with a few bonuses (pointer arithmetic, for one) and a few shorthand items.  The BEGIN and END went away, replaced by the opening and closing curly braces, { and }.  Okay, I could live with that.  Other than that, it was Pascal-like for the things I needed to do.  Most importantly, it was not BASIC, the scourge of the serious programmer, plagued by spaghetti code and GOTOs.  It was compiled, not interpreted.

Now, the compiler and the interpreter, as I would later learn, had the chore of parsing: breaking the stream of text into chunks for functions (which have a beginning and an end!), statements, and expressions.  In college, in Compiler Design, we used Lex and Yacc (or Flex and Bison, their GNU counterparts) to help with the compiling.  The first step, lexing, depended on regular expressions to pick out the tokens, and the grammar then determined where the functions, the statements, and the expressions were.  Ironically, the compiler I built compiled from (a subset of) Pascal into Java bytecode.  My regular expressions had to look for the BEGIN and the END.
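To make that lexing step concrete, here is a minimal sketch in Python rather than Lex, so it runs without any generator tools.  The token names and the tiny Pascal-like subset are my own assumptions for illustration; this is not the grammar from that course project.  Each token is described by a regular expression, and BEGIN and END are just two more patterns.

```python
import re

# Hypothetical token set for a tiny Pascal-like subset (illustrative only).
# Order matters: KEYWORD is listed before IDENT so BEGIN/END win the match.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:BEGIN|END)\b"),
    ("NUMBER",  r"\d+"),
    ("IDENT",   r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ASSIGN",  r":="),
    ("OP",      r"[+\-*/]"),
    ("SEMI",    r";"),
    ("DOT",     r"\."),
    ("SKIP",    r"\s+"),
]

# One combined pattern with a named group per token kind.
# Pascal keywords are case-insensitive, hence re.IGNORECASE.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.IGNORECASE,
)


def tokenize(text):
    """Split source text into (kind, value) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def blocks_balanced(tokens):
    """Check that every BEGIN has a matching END (a job for the parser stage)."""
    depth = 0
    for kind, value in tokens:
        if kind == "KEYWORD":
            depth += 1 if value.upper() == "BEGIN" else -1
            if depth < 0:
                return False
    return depth == 0


if __name__ == "__main__":
    for token in tokenize("BEGIN x := 1 + 2; END."):
        print(token)
```

Note that the regular expressions only find the BEGIN and the END; matching them up into nested blocks takes a counter (or a grammar), which is the part that Yacc or Bison would handle.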

Now the curlies have landed in Java as well as C and C++, and are fairly common in scripting languages as well.  The bad news for this fan of natural-language English is that the curlies are being supplanted by a construct which is, to my eyes, even less readable:  XML tags.  The success of HTML (an XML-like structure) as a delivery format for internet commerce has driven the use (and overuse) of XML for things like database records, order information, and Excel documents, then for tag-based scripting languages, and finally for compiled languages.

In my career so far, then, I have watched computer languages transform from:

BEGIN…..END

to

{……………….}

to

<block>…..</block>

XML is the basis for Microsoft’s new Silverlight XAML, for the RDF format that supports the “semantic web” (ironic, isn’t it?), and for XSQL, a replacement for the fairly nice SQL language.  But programmers, too, may be moving away from the written word, more and more towards UML or visual development metaphors, so I suppose it does not represent a huge imposition.  Still, it would be nice if the trend for computer languages were toward something like a natural language, rather than away from it.

Howard
