From 9828c78d95f2195cd8e1db04887072cd5f48005b Mon Sep 17 00:00:00 2001
From: Akshay
Date: Thu, 1 Aug 2024 20:24:57 +0100
Subject: new post: tablespoon

---
 docs/posts/index.html | 17 +++
 docs/posts/introducing_tablespoon/index.html | 211 +++++++++++++++++++++++++++
 2 files changed, 228 insertions(+)
 create mode 100644 docs/posts/introducing_tablespoon/index.html

(limited to 'docs/posts')

diff --git a/docs/posts/index.html b/docs/posts/index.html
index ca9a3ac..a8d321c 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -24,6 +24,23 @@
+ + + + +
+
+ 01/08 — 2024 +
+ + Introducing Tablespoon + +
+ + 4.5 + + min +
diff --git a/docs/posts/introducing_tablespoon/index.html b/docs/posts/introducing_tablespoon/index.html
new file mode 100644
index 0000000..a6f6ef2
--- /dev/null
+++ b/docs/posts/introducing_tablespoon/index.html
@@ -0,0 +1,211 @@
+ Introducing Tablespoon · peppe.rs
+
+ Home + / + Posts + / + Introducing Tablespoon + View Raw +
+
+ 01/08 — 2024 +
+ + 72.33 + + cm +   + + 4.5 + + min +
+
+

+ Introducing Tablespoon +

+
+

tbsp (tree-based source-processing language) is an awk-like language that operates on tree-sitter syntax trees. To motivate the need for such a program, we could begin by writing a markdown-to-html converter using tbsp and tree-sitter-md. We need some markdown to begin with:

+
# 1 heading
+
+content of first paragraph
+
+## 1.1 heading
+
+content of nested paragraph
+

For future reference, this markdown is parsed like so by tree-sitter-md (visualization generated by tree-viz):

+
document
+|  section
+|  |  atx_heading
+|  |  |  atx_h1_marker "#"
+|  |  |  heading_content inline "1 heading"
+|  |  paragraph
+|  |  |  inline "content of first paragraph"
+|  |  section
+|  |  |  atx_heading
+|  |  |  |  atx_h2_marker "##"
+|  |  |  |  heading_content inline "1.1 heading"
+|  |  |  paragraph
+|  |  |  |  inline "content of nested paragraph"
+

Onto the converter itself. Every tbsp program is written as a collection of stanzas. Typically, we start with a stanza like so:

+
BEGIN {
+    int depth = 0;
+
+    print("<html>\n");
+    print("<body>\n");
+}
+

The stanza begins with a “pattern”, in this case BEGIN, followed by a block of code. This particular block is executed right at the beginning, before the parse tree is traversed. In this stanza, we set a “depth” variable to keep track of the nesting of markdown headers, and begin our html document by printing the <html> and <body> tags.

+

We can follow this stanza with an END stanza, which is executed after the traversal:

+
END {
+    print("</body>\n");
+    print("</html>\n");
+}
+

In this stanza, we close off the tags we opened at the start of the document. We can move on to the interesting bits of the conversion now:

+
enter section {
+    depth += 1;
+}
+leave section {
+    depth -= 1;
+}
+

The above stanzas begin with enter and leave clauses, followed by the name of a tree-sitter node kind: section. The section identifier is visible in the tree visualization above; it encompasses a markdown section, and is created for every markdown header. Here is how tbsp executes the above stanzas:

+
document                                 ...  depth = 0 
+|  section <-------- enter section (1)   ...  depth = 1 
+|  |  atx_heading
+|  |  |  inline
+|  |  paragraph
+|  |  |  inline
+|  |  section <----- enter section (2)   ...  depth = 2 
+|  |  |  atx_heading
+|  |  |  | inline
+|  |  |  paragraph
+|  |  |  | inline
+|  |  | <----------- leave section (2)   ...  depth = 1 
+|  | <-------------- leave section (1)   ...  depth = 0 
+
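The traversal order above can be mimicked outside of tbsp. What follows is only a toy model in Python, not how the evaluator is implemented: `Node`, `walk` and `trace_depth` are made-up names, and the tree is typed in by hand to match the diagram. It fires an "enter" callback before a node's children and a "leave" callback after them, and records the depth at every section boundary.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a tree-sitter node: a kind and its children.
    kind: str
    children: list["Node"] = field(default_factory=list)


def walk(node, enter, leave):
    """Depth-first walk: call `enter` before the children, `leave` after them."""
    enter(node)
    for child in node.children:
        walk(child, enter, leave)
    leave(node)


def trace_depth(root):
    """Mirror the `enter section` / `leave section` stanzas and log the depth."""
    depth = 0
    events = []

    def enter(node):
        nonlocal depth
        if node.kind == "section":
            depth += 1
            events.append(("enter", depth))

    def leave(node):
        nonlocal depth
        if node.kind == "section":
            depth -= 1
            events.append(("leave", depth))

    walk(root, enter, leave)
    return events


# Hand-written copy of the tree from the diagram above.
tree = Node("document", [
    Node("section", [
        Node("atx_heading", [Node("inline")]),
        Node("paragraph", [Node("inline")]),
        Node("section", [
            Node("atx_heading", [Node("inline")]),
            Node("paragraph", [Node("inline")]),
        ]),
    ]),
])

print(trace_depth(tree))
# [('enter', 1), ('enter', 2), ('leave', 1), ('leave', 0)]
```

The four events line up with the four arrows in the diagram: two enters on the way down, two leaves on the way back up.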

The following stanzas should be self-explanatory now:

+
enter atx_heading {
+    print("<h");
+    print(depth);
+    print(">");
+}
+leave atx_heading {
+    print("</h");
+    print(depth);
+    print(">\n");
+}
+
+enter inline {
+    print(text(node));
+}
+

But an explanation is included nonetheless:

+
document                                 ...  depth = 0 
+|  section <-------- enter section (1)   ...  depth = 1 
+|  |  atx_heading <- enter atx_heading   ...  print "<h1>"
+|  |  |  inline <--- enter inline        ...  print ..
+|  |  | <----------- leave atx_heading   ...  print "</h1>"
+|  |  paragraph
+|  |  |  inline <--- enter inline        ...  print ..
+|  |  section <----- enter section (2)   ...  depth = 2 
+|  |  |  atx_heading enter atx_heading   ...  print "<h2>"
+|  |  |  | inline <- enter inline        ...  print ..
+|  |  |  | <-------- leave atx_heading   ...  print "</h2>"
+|  |  |  paragraph
+|  |  |  | inline <- enter inline        ...  print ..
+|  |  | <----------- leave section (2)   ...  depth = 1 
+|  | <-------------- leave section (1)   ...  depth = 0 
+
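To see what the stanzas written so far would print for the sample markdown, here is a rough Python imitation of them. It is a sketch under assumptions rather than the tbsp evaluator: `Node` and `convert` are hypothetical names, the tree is copied by hand from the tree-viz output, and a plain recursive walk stands in for tbsp's traversal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a tree-sitter node.
    kind: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def convert(root):
    """Imitate the BEGIN, END, section, atx_heading and inline stanzas."""
    out = ["<html>\n", "<body>\n"]  # BEGIN stanza
    depth = 0

    def visit(node):
        nonlocal depth
        # "enter" stanzas run before the children are visited.
        if node.kind == "section":
            depth += 1
        elif node.kind == "atx_heading":
            out.append(f"<h{depth}>")
        elif node.kind == "inline":
            out.append(node.text)

        for child in node.children:
            visit(child)

        # "leave" stanzas run after the children are visited.
        if node.kind == "section":
            depth -= 1
        elif node.kind == "atx_heading":
            out.append(f"</h{depth}>\n")

    visit(root)
    out += ["</body>\n", "</html>\n"]  # END stanza
    return "".join(out)


# Hand-written copy of the parse tree shown earlier in the post.
tree = Node("document", children=[
    Node("section", children=[
        Node("atx_heading", children=[
            Node("atx_h1_marker", "#"),
            Node("inline", "1 heading"),
        ]),
        Node("paragraph", children=[Node("inline", "content of first paragraph")]),
        Node("section", children=[
            Node("atx_heading", children=[
                Node("atx_h2_marker", "##"),
                Node("inline", "1.1 heading"),
            ]),
            Node("paragraph", children=[Node("inline", "content of nested paragraph")]),
        ]),
    ]),
])

print(convert(tree), end="")
```

Paragraph text comes out bare, with no surrounding tags or trailing newline, because none of the stanzas so far match `paragraph`.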

The examples directory contains a complete markdown-to-html converter, along with a few other motivating examples.

+

Usage

+

The tbsp evaluator is written in rust; use cargo to build and run:

+
cargo build --release
+./target/release/tbsp --help
+

tbsp requires three inputs:

+
  • a tbsp program, referred to as “program file”
  • a language
  • an input file or some input text at stdin
+

You can run the interpreter like so (this program prints an overview of a rust file):

+
$ ./target/release/tbsp \
+      -f./examples/code-overview/overview.tbsp \
+      -l rust \
+      src/main.rs
+module
+   └╴struct Cli
+   └╴trait Cli
+      └╴fn program
+      └╴fn language
+      └╴fn file
+   └╴fn try_consume_stdin
+   └╴fn main
+ +
+ +
+ Hi. + +

I'm Akshay, programmer and pixel-artist. I write open-source stuff. I also design fonts: scientifica, curie.

+

Reach out at oppili@irc.rizon.net.

+
+
+ + -- cgit v1.2.3