Diffstat (limited to 'posts')
-rw-r--r-- | posts/introducing_tablespoon.md | 181 |
1 files changed, 181 insertions, 0 deletions
diff --git a/posts/introducing_tablespoon.md b/posts/introducing_tablespoon.md
new file mode 100644
index 0000000..957b6d4
--- /dev/null
+++ b/posts/introducing_tablespoon.md
@@ -0,0 +1,181 @@ | |||
1 | [tbsp](https://git.peppe.rs/languages/tbsp) (tree-based | ||
2 | source-processing language) is an awk-like language that | ||
3 | operates on tree-sitter syntax trees. To motivate the need | ||
4 | for such a program, we could begin by writing a | ||
5 | markdown-to-html converter using `tbsp` and | ||
6 | [tree-sitter-md](https://github.com/tree-sitter-grammars/tree-sitter-markdown). | ||
7 | We need some markdown to begin with: | ||
8 | |||
9 | |||
10 | # 1 heading | ||
11 | |||
12 | content of first paragraph | ||
13 | |||
14 | ## 1.1 heading | ||
15 | |||
16 | content of nested paragraph | ||
17 | |||
18 | |||
19 | For future reference, this markdown is parsed like so by | ||
20 | tree-sitter-md (visualization generated by | ||
21 | [tree-viz](https://git.peppe.rs/cli/tree-viz)): | ||
22 | |||
23 | |||
24 | document | ||
25 | | section | ||
26 | | | atx_heading | ||
27 | | | | atx_h1_marker "#" | ||
28 | | | | heading_content inline "1 heading" | ||
29 | | | paragraph | ||
30 | | | | inline "content of first paragraph" | ||
31 | | | section | ||
32 | | | | atx_heading | ||
33 | | | | | atx_h2_marker "##" | ||
34 | | | | | heading_content inline "1.1 heading" | ||
35 | | | | paragraph | ||
36 | | | | | inline "content of nested paragraph" | ||
37 | |||
38 | |||
39 | Onto the converter itself. Every `tbsp` program is written as | ||
40 | a collection of stanzas. Typically, we start with a stanza | ||
41 | like so: | ||
42 | |||
43 | |||
44 | BEGIN { | ||
45 |     int depth = 0; | ||
46 | |||
47 |     print("<html>\n"); | ||
48 |     print("<body>\n"); | ||
49 | } | ||
50 | |||
51 | |||
52 | The stanza begins with a "pattern", in this case, `BEGIN`, | ||
53 | and is followed by a block of code. This particular block is | ||
54 | executed right at the beginning, before traversing the parse | ||
55 | tree. In this stanza, we set a "depth" variable to keep | ||
56 | track of nesting of markdown headers, and begin our html | ||
57 | document by printing the `<html>` and `<body>` tags. | ||
58 | |||
59 | We can follow this stanza with an `END` stanza, which is | ||
60 | executed after the traversal: | ||
61 | |||
62 | |||
63 | END { | ||
64 |     print("</body>\n"); | ||
65 |     print("</html>\n"); | ||
66 | } | ||
67 | |||
68 | |||
69 | In this stanza, we close off the tags we opened at the start | ||
70 | of the document. We can move onto the interesting bits of | ||
71 | the conversion now: | ||
72 | |||
73 | |||
74 | enter section { | ||
75 |     depth += 1; | ||
76 | } | ||
77 | leave section { | ||
78 |     depth -= 1; | ||
79 | } | ||
80 | |||
81 | |||
82 | The above stanzas begin with `enter` and `leave` clauses, | ||
83 | followed by the name of a tree-sitter node kind: `section`. | ||
84 | The `section` node is visible in the tree visualization | ||
85 | above; it encompasses a markdown section, and is created | ||
86 | for every markdown header. The trace below shows how | ||
87 | `tbsp` executes the above stanzas: | ||
88 | |||
89 | |||
90 | document ... depth = 0 | ||
91 | | section <-------- enter section (1) ... depth = 1 | ||
92 | | | atx_heading | ||
93 | | | | inline | ||
94 | | | paragraph | ||
95 | | | | inline | ||
96 | | | section <----- enter section (2) ... depth = 2 | ||
97 | | | | atx_heading | ||
98 | | | | | inline | ||
99 | | | | paragraph | ||
100 | | | | | inline | ||
101 | | | | <----------- leave section (2) ... depth = 1 | ||
102 | | | <-------------- leave section (1) ... depth = 0 | ||
103 | |||
104 | |||
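The traversal order above can be reproduced outside of `tbsp`. The following Python sketch is a model of the depth-first walk, not `tbsp`'s actual implementation: the `Node` class and the `walk`, `sample_tree` and `depth_trace` functions are hypothetical names introduced for illustration, and the tree is hand-built to mirror the simplified diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a tree-sitter node: a kind plus children."""
    kind: str
    children: list = field(default_factory=list)


def walk(node, state, trace):
    """Depth-first walk mimicking the `enter section` / `leave section` stanzas."""
    if node.kind == "section":
        state["depth"] += 1  # enter section
        trace.append(("enter", state["depth"]))
    for child in node.children:
        walk(child, state, trace)
    if node.kind == "section":
        state["depth"] -= 1  # leave section, runs after all children
        trace.append(("leave", state["depth"]))


def sample_tree():
    """Hand-built copy of the simplified tree in the diagram above."""
    def heading():
        return Node("atx_heading", [Node("inline")])

    def paragraph():
        return Node("paragraph", [Node("inline")])

    inner = Node("section", [heading(), paragraph()])
    outer = Node("section", [heading(), paragraph(), inner])
    return Node("document", [outer])


def depth_trace():
    state = {"depth": 0}  # plays the role of `int depth = 0` in BEGIN
    trace = []
    walk(sample_tree(), state, trace)
    return trace


if __name__ == "__main__":
    for event, depth in depth_trace():
        print(f"{event} section ... depth = {depth}")
```

The point of the model is that `leave` fires only after every child of a node has been visited, which is why the nested section sees `depth = 2` while its parent's `leave` restores `depth = 0`.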
105 | The following stanzas should be self-explanatory now: | ||
106 | |||
107 | |||
108 | enter atx_heading { | ||
109 |     print("<h"); | ||
110 |     print(depth); | ||
111 |     print(">"); | ||
112 | } | ||
113 | leave atx_heading { | ||
114 |     print("</h"); | ||
115 |     print(depth); | ||
116 |     print(">\n"); | ||
117 | } | ||
118 | |||
119 | enter inline { | ||
120 |     print(text(node)); | ||
121 | } | ||
122 | |||
123 | |||
124 | But an explanation is included nonetheless: | ||
125 | |||
126 | |||
127 | document ... depth = 0 | ||
128 | | section <-------- enter section (1) ... depth = 1 | ||
129 | | | atx_heading <- enter atx_heading ... print "<h1>" | ||
130 | | | | inline <--- enter inline ... print .. | ||
131 | | | | <----------- leave atx_heading ... print "</h1>" | ||
132 | | | paragraph | ||
133 | | | | inline <--- enter inline ... print .. | ||
134 | | | section <----- enter section (2) ... depth = 2 | ||
135 | | | | atx_heading <- enter atx_heading ... print "<h2>" | ||
136 | | | | | inline <- enter inline ... print .. | ||
137 | | | | | <-------- leave atx_heading ... print "</h2>" | ||
138 | | | | paragraph | ||
139 | | | | | inline <- enter inline ... print .. | ||
140 | | | | <----------- leave section (2) ... depth = 1 | ||
141 | | | <-------------- leave section (1) ... depth = 0 | ||
142 | |||
143 | |||
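To see what the stanzas shown so far print end to end, the sketch below re-creates them in Python over a hand-built copy of the parse tree. It models the semantics described in this post rather than `tbsp`'s internals: the tuple-based `TREE`, the `convert` function and its `enter`/`leave` helpers are assumptions made for illustration, and `print` is assumed to emit no implicit newline (consistent with the `<h1>` pieces being joined in the trace above).

```python
# Each node is (kind, text, children); text is only set on `inline` nodes.
TREE = (
    "document", None, [
        ("section", None, [
            ("atx_heading", None, [
                ("atx_h1_marker", None, []),
                ("inline", "1 heading", []),
            ]),
            ("paragraph", None, [
                ("inline", "content of first paragraph", []),
            ]),
            ("section", None, [
                ("atx_heading", None, [
                    ("atx_h2_marker", None, []),
                    ("inline", "1.1 heading", []),
                ]),
                ("paragraph", None, [
                    ("inline", "content of nested paragraph", []),
                ]),
            ]),
        ]),
    ],
)


def convert(tree):
    out = []   # collected output; stands in for print()
    depth = 0  # `int depth = 0` from the BEGIN stanza

    def enter(kind, text):
        nonlocal depth
        if kind == "section":        # enter section
            depth += 1
        elif kind == "atx_heading":  # enter atx_heading
            out.append(f"<h{depth}>")
        elif kind == "inline":       # enter inline
            out.append(text)

    def leave(kind):
        nonlocal depth
        if kind == "section":        # leave section
            depth -= 1
        elif kind == "atx_heading":  # leave atx_heading
            out.append(f"</h{depth}>\n")

    def walk(node):
        kind, text, children = node
        enter(kind, text)
        for child in children:
            walk(child)
        leave(kind)

    out.append("<html>\n<body>\n")    # BEGIN
    walk(tree)
    out.append("</body>\n</html>\n")  # END
    return "".join(out)


if __name__ == "__main__":
    print(convert(TREE), end="")
```

Because the excerpt has no stanza for `paragraph`, this model emits paragraph text without any wrapping tags or trailing newline.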
144 | The | ||
145 | [examples](https://git.peppe.rs/languages/tbsp/tree/examples) | ||
146 | directory contains a complete markdown-to-html converter, | ||
147 | along with a few other motivating examples. | ||
148 | |||
149 | ### Usage | ||
150 | |||
151 | The `tbsp` evaluator is written in Rust; use cargo to build | ||
152 | and run: | ||
153 | |||
154 | cargo build --release | ||
155 | ./target/release/tbsp --help | ||
156 | |||
157 | |||
158 | `tbsp` requires three inputs: | ||
159 | |||
160 | - a `tbsp` program, referred to as "program file" | ||
161 | - a language | ||
162 | - an input file or some input text at stdin | ||
163 | |||
164 | |||
165 | You can run the interpreter like so (this program prints an | ||
166 | overview of a Rust file): | ||
167 | |||
168 | $ ./target/release/tbsp \ | ||
169 | -f./examples/code-overview/overview.tbsp \ | ||
170 | -l rust \ | ||
171 | src/main.rs | ||
172 | module | ||
173 | └╴struct Cli | ||
174 | └╴trait Cli | ||
175 | └╴fn program | ||
176 | └╴fn language | ||
177 | └╴fn file | ||
178 | └╴fn try_consume_stdin | ||
179 | └╴fn main | ||
180 | |||
181 | |||