One of the first things that many programmers who’ve dabbled in Forth do upon learning Tcl, or vice versa in my case, is to try to implement lexical features of one inside of the other.  This trend surely stems from the fact that neither language imposes many rules of syntax beyond streams of whitespace-separated words.  In mid-2011 I felt the urge to learn how to do some programming in a low-overhead version of Forth, more than a decade after I seriously got into the Tcl scripting language for similar reasons of wanting something flexible but floppy-bootable.

Forth: cross-platform compiled code (careful, can come cryptic!)

Forth is actually quite venerable among compiled languages, having come on the scene in the 1970s after a long incubation in the service of Charles Moore’s individual programming efforts since 1958.  Natively supporting only single-cell integers (16 bits wide in the classic implementations) on an internal calculation stack that supplements the more universal “return” stack, many Forth dictionary versions also include support for floating point values–as well as “double” integers–by special processing of multiple adjacent stack locations.  Forth only really does one of two things once it’s up and running as an interpreter/compiler:

  1. accept numeric constants onto the stack, or
  2. try to jump to a subroutine that corresponds to a sequence of non-numeric input characters taken to be a “word” of code

This behavior results in the same postfix syntax as an RPN calculator, and when reading through any Forth code it is essential to go slowly enough to be able to visualize the topmost stack contents at any given point. Values are continually sent to and consumed from the stack, so adding two numbers and printing the result is as straightforward as the ubiquitous Forth example ( 2 2 + . ) in which the addition function ( + ) replaces two inputs on the stack with its output ( 4 ), which is then in turn consumed by the stack-printing function ( . ).
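To make the two-case behavior concrete, here is a minimal sketch in Tcl of a loop that pushes numbers and dispatches everything else as a word. The proc name rpn and its tiny three-operator vocabulary are my own illustrative assumptions, not part of any Forth; it does no stack-underflow checking, and it collects what ( . ) would print instead of writing it out directly.

```tcl
# Hypothetical sketch (the name "rpn" is an assumption, not a Forth word):
# numbers are pushed, anything else is treated as a word to execute.
proc rpn {line} {
    set stack {}
    set printed {}
    foreach word [split [string trim $line]] {
        if {$word eq ""} continue          ;# skip runs of spaces
        if {[string is integer -strict $word]} {
            # Case 1: a numeric constant goes onto the stack.
            lappend stack $word
        } elseif {$word in {+ - *}} {
            # Case 2: a word consumes its inputs and pushes its output.
            set b [lindex $stack end]
            set a [lindex $stack end-1]
            set stack [lrange $stack 0 end-2]
            lappend stack [::tcl::mathop::$word $a $b]
        } elseif {$word eq "."} {
            # The printing word consumes the top of the stack.
            lappend printed [lindex $stack end]
            set stack [lrange $stack 0 end-1]
        } else {
            error "unknown word: $word"
        }
    }
    return [list $stack $printed]
}

puts [rpn "2 2 + ."]
```

The result is a two-element list: the leftover stack (empty here) followed by the printed values (just 4), so the line above prints {} 4.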

Tcl: omitting both vowels from the name, not just one

A Tcl interpreter similarly just plows through streams of characters, but goes a step further by not assigning particular significance to any elements that other languages might try to parse as expressions, including numerical constants!  You can write a procedure called “2”, name a variable “+”, or even have a procedure called “2+2” coexist with a variable by the same name.  The variable value would be accessed Bourne-shell style by preceding it with a dollar sign ($), whereas the procedure call would be the first word on a line or immediately following a function-calling left bracket ([) so there is never any ambiguity.
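A short sketch of those odd names in practice follows; the return strings are made up for illustration. One detail worth knowing is that a bare dollar sign only recognizes alphanumeric names, so variables such as “+” or “2+2” need the braced form ${...} when read back.

```tcl
# A procedure whose name is the character "2".
proc 2 {} { return "two, the procedure" }

# A variable whose name is "+"; the braces tell Tcl where the name ends.
set + plus
puts ${+}

# A procedure and a variable sharing the name "2+2".
proc 2+2 {} { return 4 }
set 2+2 "four, the variable"

puts [2]         ;# first word inside brackets: a procedure call
puts [2+2]       ;# likewise a procedure call
puts ${2+2}      ;# dollar sign: a variable read
```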

Tcl’s native format is the parameter list rather than an implied stack, but both being at their core a sequence of numbers there is an obvious natural correspondence between the two. If you actually want to perform a mathematical calculation in Tcl you put it in a list passed to a function that interprets its argument list as an expression, so that

expr 2 + 2

evaluates to “4”.  Once you think about it, this lack of any imposed interpretation on unquoted strings makes sense in today’s computing environment, now that we’ve moved away from just FORTRAN-like calculations and now trade all sorts of text over the Web that are encoded according to various contexts.

A brief aside: a procedure called left parenthesis

By the way, in both languages you can define most ASCII punctuation marks including a parenthesis to be procedure calls, so that whereas the Tcl definition

proc ( {arg args} {expr ( $arg $args}

would seem to contain unbalanced parentheses, it is actually defining “(” as a wrapper procedure. Like any other Tcl procedure it can take zero or more mandatory arguments (one in this case, arbitrarily called “arg”) followed by any number of optional arguments (always called “args”).  Now there are two equivalent forms of the simple arithmetic script above, with the first one taking just slightly longer to execute as it gets translated into the second:

  1. ( 2 + 2 )
  2. expr 2 + 2

but only if carefully minding the spaces since they play the significant role of list separators. The same syntax could be accomplished in Forth, also by creating a left-parenthesis function that reads ahead in the input stream.  At the end of this post, the left parenthesis will be redefined in its Forth context of beginning a comment string, akin to C’s /*.
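As a quick check of the spacing caveat, the sketch below reuses the wrapper definition from above and then tries the same sum without the space after the parenthesis. It relies on expr joining its arguments with spaces before evaluating them; the error text is whatever the interpreter reports for an unknown command.

```tcl
# The wrapper from the text: "(" is the proc name, the rest are arguments.
proc ( {arg args} {expr ( $arg $args}

puts [( 2 + 2 )]
puts [( 2 * ( 3 + 4 ) )]    ;# nested parentheses are just more list elements

# Without the space, Tcl looks for a command literally named "(2".
if {[catch {(2 + 2)} msg]} {
    puts "failed as expected: $msg"
}
```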

Finding application in embedded systems

For this kind of elegant extensibility while still maintaining robustness, compact versions of Tcl–just like Forth before it–are now found running the debug shell on a lot of embedded devices.  It gives developers a convenient way to query the state of a product, call underlying C functions in the base firmware, define manufacturing test routines etc., all without having to develop the full overhead of an interpreter themselves.  This is exactly the pretext under which Tcl (“tool command language” but pronounced “tickle”) was originally developed: it’s better to have professional language designers doing the shell writing and init-file processing than those whose primary focus is the overall application program, and for whom constructing command syntaxes and formatting init files are an often ill-executed distraction from the larger application they’re trying to create.

Forth is a very handy small-footprint language, and probably the only integrated development/operating environment presenting an immediate, text-mode interface relatively unchanged since its brief popularity on 8-bit systems in the 1980s.  One of the complaints about Forth is that there are so many different “standard” dictionaries that different Forth dialects often use the same word to mean slightly different things, taking different parameters or returning differently formatted results.  And while Forth’s low overhead becomes less relevant in newly affordable 32-bit microcontroller environments, there are still exceptional recent versions of the language such as FlashForth that cater to 8- and 16-bit microcontrollers.  Studying Mikael Nordman’s creation is a joy, whether programming in it or just reading through his source code, and it actually turns even the simplest PIC demo boards into instant on-board development platforms for even rather complex programs.

Tcl in Forth

Returning to the theme of replicating Forth or Tcl syntax inside of the other, I briefly took a stab at implementing Tcl’s shell-style (preceded by $) variable substitution in new words “set” and “puts” for Forth as well as supporting bracket-bounded function calls. So after much trial and error I was able to replace typically jumbled Forth statements of the form

variable x 3 x !
variable y x @ y !

with the somewhat more intuitive code (including implicit declaration of variables) borrowed from Tcl:

set x 3
set y $x

or its single-line equivalent,

set y [set x 3]

But this exercise proved not very interesting after I had tested a few such constructs, and again the inconsistent dictionaries among Blazin’ Forth on the Commodore, FlashForth on my Microstick II and gforth on my Linux box served as a reminder that when it comes to portability, Forth is no C.

Fortcl: Forth in Tcl

Lots of people have implemented Forth in Tcl.  I decided to take a crack at it for the principal reason that most implementations I was running across seemed to use global variables, whereas I prefer to stick with a select few list arguments to any function in order to enable better execution tracing and reuse.  I use a global variable in Tcl only to access what would be a global variable in Forth; none are used for the mechanics of the Forth interpreter itself.

Since Ficl (pronounced “fickle”) comes up on a web search as the name of an extant but seemingly unmaintained SourceForge project to develop a command language similar to Tcl in Forth, I’ve called my second exercise for this post Fortcl (pronounced “forticle,” as in a “fortified follicle”).  As with the other Forth-inside-Tcl implementations at the previous link, defining a word defines a Tcl proc of the same name.  Mine just accepts three (possibly null) arguments as input and similarly returns three (possibly null) outputs: a calculation stack called “stack,” a return stack called “rstack” and a list of words in “args” left to process on the current line.  A proc forth2 keeps the process going from one word to the next on a line; it is wrapped in a proc forth that just passes in null stacks and everything typed after it and returns only the stack (thus not returning the “return stack,” apologies for the roundabout nomenclature).
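The three-lists-in, three-lists-out convention can be sketched roughly as follows. This is not the actual Fortcl source: the names forth_demo and forth2_demo, the demo namespace, and the two sample words are assumptions chosen for illustration, and the words live in a namespace here only so the sketch does not clobber anything in the global scope.

```tcl
# Hypothetical sketch, not the author's code: every word takes
# (stack, rstack, remaining words) and returns the same three lists.
namespace eval demo {
    # "+" pops two values and pushes their sum.
    proc + {stack rstack rest} {
        set sum [expr {[lindex $stack end-1] + [lindex $stack end]}]
        return [list [linsert [lrange $stack 0 end-2] end $sum] $rstack $rest]
    }
    # "dup" copies the top of the stack.
    proc dup {stack rstack rest} {
        return [list [linsert $stack end [lindex $stack end]] $rstack $rest]
    }
}

# Inner loop: walk the line one word at a time, threading the three lists.
proc forth2_demo {stack rstack rest} {
    while {[llength $rest] > 0} {
        set word [lindex $rest 0]
        set rest [lrange $rest 1 end]
        if {[string is integer -strict $word]} {
            lappend stack $word
        } elseif {[namespace which -command ::demo::$word] ne ""} {
            # A word may read ahead in the line, so it receives the
            # remainder and hands back whatever it left unread.
            lassign [::demo::$word $stack $rstack $rest] stack rstack rest
        } else {
            error "unknown word: $word"
        }
    }
    return [list $stack $rstack $rest]
}

# Outer wrapper: start with empty stacks, return only the calculation stack.
proc forth_demo {args} {
    return [lindex [forth2_demo {} {} $args] 0]
}

puts [forth_demo 2 dup + 3 +]
```

Because no state lives outside the three lists, any intermediate step can be traced or replayed just by printing or re-supplying them, which is the appeal of avoiding globals for the interpreter mechanics.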

So besides replacing global variables with cascading Tcl lists, what can I say is special about Fortcl?  As in any implementation of Forth, colon definitions (basically a procedure defined by a sequence of characters between a lone colon and a lone semicolon) require a word name followed by one or more list elements, but the hierarchical list support in Tcl means that each list element itself can be a list.  I utilize this fact to allow the definition of a word in terms of either other Fortcl words or (if the first argument is a list rather than a word) in Tcl itself.  So either of the following is an acceptable way of defining the increment ( 1+ ) word once the addition ( + ) word has been defined:

: 1+ 1 + ; # defined in terms of another Forth word ( + )

or

: 1+ {list [concat [lrange $stack 0 end-1] [expr 1 + [lindex $stack end]]] $rstack $args} ; # defined as a Tcl script since the first (and only) word in the definition list is a list

Note that hash ( # ) after a semicolon starts a comment in Tcl and therefore in Fortcl.  The first is faster to type, the latter faster in execution time.

The left-parenthesis according to Forth

Another advantage of the latter definition of “1+” is that it supports the keyword “immediate” before the semicolon, which forces future definitions that use the word to execute its Tcl script during the definition (compile-time) rather than copying it into themselves for later execution (run-time).  It is good to do this for parenthesis-delimited Forth comments meant for human consumption only, so that they are stripped from any definitions and don’t slow down execution of the word:

: ( {concat [list $stack] [list $rstack] [list [lrange $args 0 [lsearch -exact $args )]-1]]} immediate ;

(_imm is now a defined word, and so as long as the “immediate” keyword is present comments will be removed from any definition.  There is no conflict with any other, non-immediate left-parenthesis proc that may exist, such as the wrapper for Tcl’s built-in proc expr at the top of this post.

Executing Forth words from Tcl

Regardless of which colon-definition mode (Forth or native, non-immediate Tcl) 1+ was implemented in, execution similarly can take place on a line of Forth after the Tcl interpreter has recognized a call to proc forth:

forth var x 3 x ! x @ 1+ x ! x @
--> 4

or a line of Tcl that doesn’t even call proc forth for stack management, instead using lindex to strip off the two null lists (return stack and args remaining) that follow the answer in the first list (i.e. the resulting stack after the increment):

set x 3 ; set x [lindex [1+ $x] 0]
--> 4

At last, the code

After source-ing the base Fortcl to pick up the forth wrapper and more importantly the definition word ( : ), I have placed all words I’ve implemented to date in a separate, much longer source-able file consisting of only colon definitions that call it. An important note on this latter file is that there are a few ubiquitous Forth words that clash with Tcl keywords and thus had to be renamed in the forth definition:

  1. if (renamed to iff as in “if-forth” or “if and only if,” so use iff…else…then rather than if…else…then)
  2. variable (renamed to var)

Closing thoughts

Tcl is a fond post-adolescent memory for me, by the way.  John Ousterhout, its creator, was the instructor for the first computer science course I took at Berkeley, right around the time that Tcl was taking off. By day he was assigning me C and MIPS assembly homework, but outside of class he was rolling out this flexible interpreter to the world, probably still supporting Magic to some extent and somehow also raising a newborn daughter who as of this new year would be on the verge of turning 22 herself.

Thanks, ouster!
