Eforth
eForth in C/C++ - multi-platform (Linux,MacOS,Windows,ESP32,WASM)
Forth - is it still relevant?
With all the advantages, it is unfortunate that Forth lost out to the C language over the years and has been reduced to a niche. Per ChatGPT, C's broader appeal, standardization, and support ecosystem likely contributed to its greater adoption and use in mainstream computing.
So, the question is how to encourage today's world of C programmers to take a look at Forth. How do we convince them that Forth can be 10 times more productive? Well, we do know that repeating how elegant Forth is, or even bashing how bad C can be, probably won't get us anywhere.
Bill Muench created eForth for simplicity and educational purposes. Dr. Ting, who ported it to many processors, described Forth in his well-written eForth genesis and overview. I like the idea and decided to pick it up.
eForth now - What did I change!
-
<b>100% C/C++ with multi-platform support</b>. Though the classic implementation, with primitives in assembly language and high-level words scripted in Forth, gave Forth its power, it also became the hurdle for newbies, because they have to learn both assembly and Forth syntax before peeking into the internal beauty of Forth.
-
<b>Dictionary is just an array</b>. It's remodeled from a linked list in linear memory to an array (or a vector in C++ terms) of words.
- To search for a word, simply scan the name strings of the dictionary entries. So, defining a new word at compile time is just appending the found word pointers to its parameter array one by one.
- To execute becomes just a walk of the word pointers in the array. This is our inner interpreter.
- Hashtables might go even faster but we'll try that later.
-
<b>Data and Return Stacks are also arrays</b>, with push, pop and [] methods to clarify intentions.
-
<b>Parameter fields are all arrays</b>. Why not!
-
<b>No vocabulary, or meta-compilation</b>. Except for CREATE..DOES> and POSTPONE, these black-belt skills of Forth greatness are dropped to keep the focus on core concepts.
-
<b>Multi-threading and message passing are available</b>. From v5.0 on, multi-core platforms can utilize Forth VMs running in parallel. See the multi-threading section below for details.
- A thread pool is built-in. Its size defaults to the number of cores.
- Message passing with send/recv, waiting on a pthread mutex.
- IO and memory update can be synchronized with lock/unlock.
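The "stacks are also arrays" point above can be sketched in a few lines of C++. This is a minimal illustration, not eForth's actual class: the name `Stack`, the `std::vector` backing store, and the negative-index convention (borrowed from the way this README writes `dict[-1]` for the last entry) are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical array-backed stack; the name and layout are illustrative
// assumptions, not eForth's actual implementation.
template <typename T>
struct Stack {
    std::vector<T> v;                        // cells, bottom of stack at v[0]

    void push(T x) { v.push_back(x); }       // grow by one cell

    T pop() {                                // remove and return the top cell
        assert(!v.empty());
        T x = v.back();
        v.pop_back();
        return x;
    }

    // s[0] is the bottom cell; a negative index counts back from the top,
    // so s[-1] is the top of stack. at() throws on an out-of-range index.
    T &operator[](int i) {
        int n = static_cast<int>(v.size());
        return v.at(static_cast<std::size_t>(i < 0 ? n + i : i));
    }

    std::size_t size() const { return v.size(); }
};
```

With named methods, a primitive such as `+` reads as "pop one cell, add it to the top" rather than as pointer arithmetic on a memory block.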
Rolling your own C-based Forth?
If you are fluent in C/C++ and in the process of building your own Forth, then, skipping the verbiage, the easiest path to understanding how things work together is to download release v4.2 and work from there.
The release contains a heavily commented ceforth.cpp, the companion ceforth.h, and a config.h. Altogether, about 800 lines. Check them out!
eForth Internals
The core of the current implementation of eForth is the dictionary, an array of Code objects, each representing a Forth word.
-
<b>Code</b> - the heart of eForth; depending on the constructor called, the following fields are populated accordingly
<pre>
+ name - a string that holds the primitive word's name, i.e. NFA in classic FORTH; can also hold the branching mnemonic for compound words, which classic FORTH keeps in parameter memory
+ xt - pointer to a lambda function for primitive words, i.e. XT in classic FORTH
+ pf, p1, p2 - parameter arrays of Code objects for compound words, i.e. PFA in classic FORTH
+ q - holds the literal value, which classic FORTH keeps in parameter memory
</pre>
-
<b>Lit, Var, Str, Bran, Tmp</b> - the polymorphic classes extended from the base class Code which serve the functionalities of primitive words of classic Forth.
<pre>
+ Lit - numeric literals
+ Var - variable or constant
+ Str - string for dostr or dotstr
+ Bran - branching opcode
+ Tmp - temp storage for branching word
</pre>
-
<b>Dictionary</b> - an array of Code objects
<pre>
+ built-in words - constructed by initializer_list at start up, before main is called; degenerated lambdas become function pointers stored in Code.xt
    dict[0].xt ------> lambda[0]         <== These function pointers can be converted
    dict[1].xt ------> lambda[1]             into indices to a jump table
    ...                                      which is exactly what WASM does
    dict[N-1].xt ----> lambda[N-1]       <== N is number of built-in words

+ colon (user defined) words - collection of word pointers during compile time
    dict[N].pf   = [ *Code, *Code, ... ] <== These are called the 'threads' in Forth's term
    dict[N+1].pf = [ *Code, *Code, ... ]     So, instead of subroutine threading
    ...                                      this is 'object' threading.
    dict[-1].pf  = [ *Code, *Code, ... ]     It can be further compacted into
                                             token (i.e. dict index) threading if desired
</pre>
-
<b>Inner Interpreter</b> - Code.exec() is self-explanatory
    if (xt) { xt(this); return; }         // run primitive word
    for (Code *w : pf) {                  // run colon word
        try { w->exec(); }                // execute recursively
        catch (...) { break; }            // handle exception if any
    }
i.e. either we call a built-in word's lambda function or walk the Code.pf array recursively like a depth-first tree search.
-
<b>Outer Interpreter</b> - forth_core() is self-explanatory
    Code *c = find(idiom);                // search dictionary
    if (c) {                              // word found?
        if (compile && !c->immd)          // are we compiling a new word?
            dict[-1]->add(c);             // then append found code to it
        else c->exec();                   // or, execute the code
        return;
    }
    DU n = parse_number(idiom);           // word not found, try as a number
    if (compile)                          // are we compiling a new word?
        dict[-1]->add(new Lit(n));        // append numeric literal to it
    else PUSH(n);                         // push onto data stack
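To see the dictionary, inner interpreter, and outer interpreter work together, here is a toy that can be compiled on its own. It is a sketch under assumptions, not the code in ceforth.cpp: `eval`, `ss`, and `compiling` are made-up names, `std::function` stands in for the function pointers the real `xt` holds, `:` and `;` are special-cased instead of being immediate words, a literal is a plain `Code` with a capturing lambda rather than a `Lit` subclass, and there is no error handling.

```cpp
#include <cassert>
#include <functional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

std::vector<int> ss;                          // data stack (hypothetical name)
int pop() { assert(!ss.empty()); int n = ss.back(); ss.pop_back(); return n; }

struct Code {
    std::string           name;
    std::function<void()> xt;                 // set for primitives and literals
    std::vector<Code*>    pf;                 // filled for colon words
    Code(std::string n, std::function<void()> f = nullptr)
        : name(std::move(n)), xt(std::move(f)) {}
    void exec() {                             // inner interpreter
        if (xt) { xt(); return; }             // run primitive word
        for (Code *w : pf) w->exec();         // walk the 'thread' recursively
    }
};

// Built-in words, constructed from an initializer list at start up.
// Entries are never freed, for brevity.
std::vector<Code*> dict = {
    new Code("+",   [] { int n = pop(); ss.back() += n; }),
    new Code("*",   [] { int n = pop(); ss.back() *= n; }),
    new Code("dup", [] { ss.push_back(ss.back()); }),
};
bool compiling = false;                       // are we inside a colon definition?

Code *find(const std::string &s) {            // linear scan, newest word first
    for (auto it = dict.rbegin(); it != dict.rend(); ++it)
        if ((*it)->name == s) return *it;
    return nullptr;
}

void eval(const std::string &line) {          // outer interpreter
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        if (tok == ":") {                     // start a new colon word
            in >> tok;
            dict.push_back(new Code(tok));
            compiling = true;
            continue;
        }
        if (tok == ";") { compiling = false; continue; }
        if (Code *c = find(tok)) {            // word found?
            if (compiling) dict.back()->pf.push_back(c);   // append pointer
            else           c->exec();                      // or execute it
            continue;
        }
        int n = std::stoi(tok);               // not a word, try as a number
        if (compiling)
            dict.back()->pf.push_back(new Code("lit", [n] { ss.push_back(n); }));
        else
            ss.push_back(n);
    }
}
```

Calling `eval("1 2 +")` leaves 3 on the stack; after `eval(": double dup + ;")`, `eval("double double")` turns that 3 into 12. Compiling `double` appended two `Code*` pointers to `dict.back()->pf` and nothing else.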
With the array implementation, the first difference is in array variable read/write.
> create narr 10 cells allot
> see narr
> : narr
0 0 0 0 0 0 0 0 0 0 ;
\ ^----------------- narr 2 cells +
While traditional Forths use <code>narr 2 cells +</code> to get the memory address of <code>narr[2]</code>, in eForth <code>narr</code> returns its index (or defining order) in the dictionary. So, <code>narr 2 cells +</code> will actually get you the index of the second word defined after <code>narr</code>. You'll be storing the value into that word's empty qf field. To access the nth element of <code>narr</code>, use <code>th</code> instead.
> : fill-arr
10 0 do
i 2* narr i th !
loop ;
> fill-arr
> see narr
> : narr
0 2 4 6 8 10 12 14 16 18 ;
With arrays, the doors are open: variables can expand dynamically and store objects instead of just integers. Parameter fields can be filled in at compile time or changed on the fly at runtime, i.e. self-morphing code. These can be the "scary" features for Forths to come.
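The index-versus-address distinction can be made concrete with a toy model in C++. Everything here is hypothetical: the `Word` and `Ref` structs, and the idea that `th` yields a (word, element) pair, are assumptions for illustration only. How eForth actually encodes the value that `th` leaves on the stack is not described in this README.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: every word owns its own parameter array.
struct Word {
    std::string      name;
    std::vector<int> q;           // stand-in for the word's parameter storage
};

// Hypothetical cell reference: which word, and which element inside it.
struct Ref {
    std::size_t word;
    std::size_t elem;
};

std::vector<Word> dict;

// Defining a word yields its dictionary index, which is what `narr` pushes.
std::size_t create(const std::string &name, std::size_t cells) {
    dict.push_back(Word{name, std::vector<int>(cells, 0)});
    return dict.size() - 1;
}

// Like `narr i th`: pair a word index with an element index.
Ref th(std::size_t w, std::size_t i) { return Ref{w, i}; }

// Like `!`: store into the referenced cell, growing the array on demand.
void store(Ref r, int v) {
    std::vector<int> &q = dict.at(r.word).q;
    if (r.elem >= q.size()) q.resize(r.elem + 1, 0);
    q[r.elem] = v;
}

// Like `@`: fetch the referenced cell; throws if it does not exist.
int fetch(Ref r) { return dict.at(r.word).q.at(r.elem); }
```

In this model, adding an offset to the word index (the analogue of `narr 2 cells +`) selects a different dictionary entry, while `th(narr, 2)` stays inside `narr`. Because each array belongs to its word, `store` can also grow it without disturbing its neighbors.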
ceForth - Where we came from
Most classic Forth systems are built with a few low-level primitives in assembly language and bootstrap the high-level words in Forth itself. Over the years, Dr. Ting implemented many Forth systems using the same model. See here for the detailed list. However, he eventually stated that it was silly trying to explain Forth in Forth to newcomers. There are just not many people who know Forth, period.
Utilizing modern OSes and tool chains, a new generation of Forths implemented in just a few hundred lines of C code can help someone who does not know Forth gain the core understanding much more quickly. He called the insight Forth without Forth.
On 2021-07-04, I got in touch with Dr. Ting, mentioning that he taught at the university when I attended. He, kind and generous as usual, included me in his last projects all the way until his passing. I am honored that he considered me one of the frogs living at the bottom of the deep well with him, looking up at the small opening of the sky together. With cross-platform portability as our guideline, we built ooeForth in Java, jeForth in Javascript, wineForth for Windows, and esp32forth for ESP micro-controllers using the same code base. With his last breath in the hospital, he attempted to build it onto an FPGA using Verilog. See ceForth_403 and eJsv32 for details.
We hope it can serve as a stepping stone for those learning Forth, or even building their own one day.
How To Build and Run
$ git clone https://github.com/chochain/eforth # to your local machine
$ cd eforth
There are currently two major versions of eForth: v4 is single-threaded only; v5 is single-threaded by default but also supports multi-threading.
Check out the version you are interested in.
$ git checkout v42 # for version 4.2 (latest), or
$ git checkout master # for version 5 and on
To enable multi-threading in v5, update the following in ~/src/config.h
#define DO_MULTITASK 1
#define E4_VM_POOL_SZ 8
Linux, MacOS, Cygwin, Raspberry Pi, or Android with Termux
$ make
$ ./tests/eforth # to bring up the Forth interpreter
> eForth v5.0, RAM 16.5% free (1300 / 7880 MB)
> words⏎ \ to see available Forth words
> 1 2 +⏎ \ see Forth in action
> bye⏎
