-rw-r--r-- docs/index.html 16
-rw-r--r-- docs/index.xml 229
-rw-r--r-- docs/posts/index.html 17
-rw-r--r-- docs/posts/lightweight_linting/index.html 298
-rw-r--r-- docs/style.css 6
-rw-r--r-- posts/lightweight_linting.md 427
6 files changed, 985 insertions, 8 deletions
diff --git a/docs/index.html b/docs/index.html
index 4993863..89b04d4 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -42,15 +42,15 @@
42 <tr> 42 <tr>
43 <td class=table-post> 43 <td class=table-post>
44 <div class="date"> 44 <div class="date">
45 05/10 — 2021 45 26/01 — 2022
46 </div> 46 </div>
47 <a href="/posts/novice_nix:_flake_templates" class="post-link"> 47 <a href="/posts/lightweight_linting" class="post-link">
48 <span class="post-link">Novice Nix: Flake Templates</span> 48 <span class="post-link">Lightweight Linting</span>
49 </a> 49 </a>
50 </td> 50 </td>
51 <td class=table-stats> 51 <td class=table-stats>
52 <span class="stats-number"> 52 <span class="stats-number">
53 5.5 53 8.5
54 </span> 54 </span>
55 <span class=stats-unit>min</span> 55 <span class=stats-unit>min</span>
56 </td> 56 </td>
@@ -59,15 +59,15 @@
59 <tr> 59 <tr>
60 <td class=table-post> 60 <td class=table-post>
61 <div class="date"> 61 <div class="date">
62 11/04 — 2021 62 05/10 — 2021
63 </div> 63 </div>
64 <a href="/posts/SDL2_devlog" class="post-link"> 64 <a href="/posts/novice_nix:_flake_templates" class="post-link">
65 <span class="post-link">SDL2 Devlog</span> 65 <span class="post-link">Novice Nix: Flake Templates</span>
66 </a> 66 </a>
67 </td> 67 </td>
68 <td class=table-stats> 68 <td class=table-stats>
69 <span class="stats-number"> 69 <span class="stats-number">
70 10.0 70 5.5
71 </span> 71 </span>
72 <span class=stats-unit>min</span> 72 <span class=stats-unit>min</span>
73 </td> 73 </td>
diff --git a/docs/index.xml b/docs/index.xml
index b0ca956..c2ba3ca 100644
--- a/docs/index.xml
+++ b/docs/index.xml
@@ -12,6 +12,235 @@
12 <language>en-us</language> 12 <language>en-us</language>
13 <copyright>Creative Commons BY-NC-SA 4.0</copyright> 13 <copyright>Creative Commons BY-NC-SA 4.0</copyright>
14 <item> 14 <item>
15<title>Lightweight Linting</title>
16<description>&lt;p&gt;&lt;a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries"&gt;Tree-sitter&lt;/a&gt; queries allow you to search for patterns in syntax trees, much like a regex would, in text. Combine that with some Rust glue to write simple, custom linters.&lt;/p&gt;
17&lt;h3 id="tree-sitter-syntax-trees"&gt;Tree-sitter syntax trees&lt;/h3&gt;
18&lt;p&gt;Here is a quick crash course on syntax trees generated by tree-sitter. Syntax trees produced by tree-sitter are represented by S-expressions. The generated S-expression for the following Rust code,&lt;/p&gt;
19&lt;div class="sourceCode" id="cb1"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb1-1"&gt;&lt;a href="#cb1-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;fn&lt;/span&gt; main() &lt;span class="op"&gt;{&lt;/span&gt;&lt;/span&gt;
20&lt;span id="cb1-2"&gt;&lt;a href="#cb1-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; x &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="dv"&gt;2&lt;/span&gt;&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
21&lt;span id="cb1-3"&gt;&lt;a href="#cb1-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
22&lt;p&gt;would be:&lt;/p&gt;
23&lt;div class="sourceCode" id="cb2"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb2-1"&gt;&lt;a href="#cb2-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(source_file&lt;/span&gt;
24&lt;span id="cb2-2"&gt;&lt;a href="#cb2-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (function_item&lt;/span&gt;
25&lt;span id="cb2-3"&gt;&lt;a href="#cb2-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; name: (identifier)&lt;/span&gt;
26&lt;span id="cb2-4"&gt;&lt;a href="#cb2-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; parameters: (parameters)&lt;/span&gt;
27&lt;span id="cb2-5"&gt;&lt;a href="#cb2-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; body: &lt;/span&gt;
28&lt;span id="cb2-6"&gt;&lt;a href="#cb2-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (block&lt;/span&gt;
29&lt;span id="cb2-7"&gt;&lt;a href="#cb2-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (let_declaration &lt;/span&gt;
30&lt;span id="cb2-8"&gt;&lt;a href="#cb2-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier)&lt;/span&gt;
31&lt;span id="cb2-9"&gt;&lt;a href="#cb2-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal)))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
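To make the S-expression notation concrete, here is a minimal sketch that builds the same tree by hand and prints it on one line. The `Node` type, its fields, and `example_tree` are hypothetical and exist only for illustration; they are not part of tree-sitter's API, which produces this output for you.

```rust
// Hypothetical toy types for illustration; these are NOT tree-sitter's API.
struct Node {
    field: Option<&'static str>, // field name such as "name" or "body", if any
    kind: &'static str,          // node kind such as "identifier"
    children: Vec<Node>,
}

impl Node {
    fn new(field: Option<&'static str>, kind: &'static str, children: Vec<Node>) -> Self {
        Node { field, kind, children }
    }

    // Render the node as `field: (kind child child ...)`.
    fn to_sexp(&self) -> String {
        let mut out = String::new();
        if let Some(field) = self.field {
            out.push_str(field);
            out.push_str(": ");
        }
        out.push('(');
        out.push_str(self.kind);
        for child in &self.children {
            out.push(' ');
            out.push_str(&child.to_sexp());
        }
        out.push(')');
        out
    }
}

// Hand-built tree for `fn main() { let x = 2; }`.
fn example_tree() -> Node {
    let let_decl = Node::new(None, "let_declaration", vec![
        Node::new(Some("pattern"), "identifier", vec![]),
        Node::new(Some("value"), "integer_literal", vec![]),
    ]);
    let function = Node::new(None, "function_item", vec![
        Node::new(Some("name"), "identifier", vec![]),
        Node::new(Some("parameters"), "parameters", vec![]),
        Node::new(Some("body"), "block", vec![let_decl]),
    ]);
    Node::new(None, "source_file", vec![function])
}

fn main() {
    println!("{}", example_tree().to_sexp());
}
```

Each pair of parentheses is one node, and a `field:` prefix names the role a child plays in its parent.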
32&lt;p&gt;Syntax trees generated by tree-sitter have another cool property: they are &lt;em&gt;lossless&lt;/em&gt; syntax trees. Given a lossless syntax tree, you can regenerate the original source code in its entirety. Consider the following addition to our example:&lt;/p&gt;
33&lt;div class="sourceCode" id="cb3"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb3-1"&gt;&lt;a href="#cb3-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;fn&lt;/span&gt; main() &lt;span class="op"&gt;{&lt;/span&gt;&lt;/span&gt;
34&lt;span id="cb3-2"&gt;&lt;a href="#cb3-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;+&lt;/span&gt; &lt;span class="co"&gt;// a comment goes here&lt;/span&gt;&lt;/span&gt;
35&lt;span id="cb3-3"&gt;&lt;a href="#cb3-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; x &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="dv"&gt;2&lt;/span&gt;&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
36&lt;span id="cb3-4"&gt;&lt;a href="#cb3-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
37&lt;p&gt;The tree-sitter syntax tree preserves the comment, while the typical abstract syntax tree wouldn’t:&lt;/p&gt;
38&lt;div class="sourceCode" id="cb4"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb4-1"&gt;&lt;a href="#cb4-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (source_file&lt;/span&gt;
39&lt;span id="cb4-2"&gt;&lt;a href="#cb4-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (function_item&lt;/span&gt;
40&lt;span id="cb4-3"&gt;&lt;a href="#cb4-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; name: (identifier)&lt;/span&gt;
41&lt;span id="cb4-4"&gt;&lt;a href="#cb4-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; parameters: (parameters)&lt;/span&gt;
42&lt;span id="cb4-5"&gt;&lt;a href="#cb4-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; body:&lt;/span&gt;
43&lt;span id="cb4-6"&gt;&lt;a href="#cb4-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (block&lt;/span&gt;
44&lt;span id="cb4-7"&gt;&lt;a href="#cb4-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;+&lt;/span&gt; (line_comment)&lt;/span&gt;
45&lt;span id="cb4-8"&gt;&lt;a href="#cb4-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (let_declaration&lt;/span&gt;
46&lt;span id="cb4-9"&gt;&lt;a href="#cb4-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier)&lt;/span&gt;
47&lt;span id="cb4-10"&gt;&lt;a href="#cb4-10" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal)))))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
48&lt;h3 id="tree-sitter-queries"&gt;Tree-sitter queries&lt;/h3&gt;
49&lt;p&gt;Tree-sitter provides a DSL to match over concrete syntax trees (CSTs), which is another name for the lossless trees above. These queries resemble our S-expression syntax trees; here is a query to match all line comments in a Rust CST:&lt;/p&gt;
50&lt;div class="sourceCode" id="cb5"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb5-1"&gt;&lt;a href="#cb5-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(line_comment)&lt;/span&gt;
51&lt;span id="cb5-2"&gt;&lt;a href="#cb5-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
52&lt;span id="cb5-3"&gt;&lt;a href="#cb5-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches the following rust code&lt;/span&gt;&lt;/span&gt;
53&lt;span id="cb5-4"&gt;&lt;a href="#cb5-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; // a comment goes here&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
54&lt;p&gt;Neat, eh? But don’t take my word for it, give it a go on the &lt;a href="https://tree-sitter.github.io/tree-sitter/playground"&gt;tree-sitter playground&lt;/a&gt;. Type in a query like so:&lt;/p&gt;
55&lt;div class="sourceCode" id="cb6"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb6-1"&gt;&lt;a href="#cb6-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; the web playground requires you to specify a &amp;quot;capture&amp;quot;&lt;/span&gt;&lt;/span&gt;
56&lt;span id="cb6-2"&gt;&lt;a href="#cb6-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; you will notice the capture and the nodes it captured&lt;/span&gt;&lt;/span&gt;
57&lt;span id="cb6-3"&gt;&lt;a href="#cb6-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; turn blue&lt;/span&gt;&lt;/span&gt;
58&lt;span id="cb6-4"&gt;&lt;a href="#cb6-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(line_comment) @capture&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
59&lt;p&gt;Here’s another to match &lt;code&gt;let&lt;/code&gt; declarations that bind an integer literal to an identifier:&lt;/p&gt;
60&lt;div class="sourceCode" id="cb7"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb7-1"&gt;&lt;a href="#cb7-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(let_declaration&lt;/span&gt;
61&lt;span id="cb7-2"&gt;&lt;a href="#cb7-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier)&lt;/span&gt;
62&lt;span id="cb7-3"&gt;&lt;a href="#cb7-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal))&lt;/span&gt;
63&lt;span id="cb7-4"&gt;&lt;a href="#cb7-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;/span&gt;
64&lt;span id="cb7-5"&gt;&lt;a href="#cb7-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches:&lt;/span&gt;&lt;/span&gt;
65&lt;span id="cb7-6"&gt;&lt;a href="#cb7-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = 2;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
66&lt;p&gt;We can &lt;em&gt;capture&lt;/em&gt; nodes into variables:&lt;/p&gt;
67&lt;div class="sourceCode" id="cb8"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb8-1"&gt;&lt;a href="#cb8-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(let_declaration &lt;/span&gt;
68&lt;span id="cb8-2"&gt;&lt;a href="#cb8-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier) @my-capture&lt;/span&gt;
69&lt;span id="cb8-3"&gt;&lt;a href="#cb8-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal))&lt;/span&gt;
70&lt;span id="cb8-4"&gt;&lt;a href="#cb8-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;/span&gt;
71&lt;span id="cb8-5"&gt;&lt;a href="#cb8-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches:&lt;/span&gt;&lt;/span&gt;
72&lt;span id="cb8-6"&gt;&lt;a href="#cb8-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = 2;&lt;/span&gt;&lt;/span&gt;
73&lt;span id="cb8-7"&gt;&lt;a href="#cb8-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
74&lt;span id="cb8-8"&gt;&lt;a href="#cb8-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; captures:&lt;/span&gt;&lt;/span&gt;
75&lt;span id="cb8-9"&gt;&lt;a href="#cb8-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; foo&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
76&lt;p&gt;And apply certain &lt;em&gt;predicates&lt;/em&gt; to captures:&lt;/p&gt;
77&lt;div class="sourceCode" id="cb9"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb9-1"&gt;&lt;a href="#cb9-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((let_declaration&lt;/span&gt;
78&lt;span id="cb9-2"&gt;&lt;a href="#cb9-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier) @my-capture&lt;/span&gt;
79&lt;span id="cb9-3"&gt;&lt;a href="#cb9-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal))&lt;/span&gt;
80&lt;span id="cb9-4"&gt;&lt;a href="#cb9-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (&lt;span class="sc"&gt;#e&lt;/span&gt;q? @my-capture &lt;span class="st"&gt;&amp;quot;foo&amp;quot;&lt;/span&gt;))&lt;/span&gt;
81&lt;span id="cb9-5"&gt;&lt;a href="#cb9-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;/span&gt;
82&lt;span id="cb9-6"&gt;&lt;a href="#cb9-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches:&lt;/span&gt;&lt;/span&gt;
83&lt;span id="cb9-7"&gt;&lt;a href="#cb9-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = 2;&lt;/span&gt;&lt;/span&gt;
84&lt;span id="cb9-8"&gt;&lt;a href="#cb9-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
85&lt;span id="cb9-9"&gt;&lt;a href="#cb9-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; and not:&lt;/span&gt;&lt;/span&gt;
86&lt;span id="cb9-10"&gt;&lt;a href="#cb9-10" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let bar = 2;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
87&lt;p&gt;The &lt;code&gt;#match?&lt;/code&gt; predicate checks if a capture matches a regex:&lt;/p&gt;
88&lt;div class="sourceCode" id="cb10"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb10-1"&gt;&lt;a href="#cb10-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((let_declaration&lt;/span&gt;
89&lt;span id="cb10-2"&gt;&lt;a href="#cb10-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier) @my-capture&lt;/span&gt;
90&lt;span id="cb10-3"&gt;&lt;a href="#cb10-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (integer_literal))&lt;/span&gt;
91&lt;span id="cb10-4"&gt;&lt;a href="#cb10-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (#match? @my-capture &lt;span class="st"&gt;&amp;quot;foo|bar&amp;quot;&lt;/span&gt;))&lt;/span&gt;
92&lt;span id="cb10-5"&gt;&lt;a href="#cb10-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;/span&gt;
93&lt;span id="cb10-6"&gt;&lt;a href="#cb10-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches both `foo` and `bar`:&lt;/span&gt;&lt;/span&gt;
94&lt;span id="cb10-7"&gt;&lt;a href="#cb10-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = 2;&lt;/span&gt;&lt;/span&gt;
95&lt;span id="cb10-8"&gt;&lt;a href="#cb10-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let bar = 2;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
96&lt;p&gt;Exhibit indifference, as a stoic programmer would, with the &lt;em&gt;wildcard&lt;/em&gt; pattern:&lt;/p&gt;
97&lt;div class="sourceCode" id="cb11"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb11-1"&gt;&lt;a href="#cb11-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(let_declaration&lt;/span&gt;
98&lt;span id="cb11-2"&gt;&lt;a href="#cb11-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; pattern: (identifier)&lt;/span&gt;
99&lt;span id="cb11-3"&gt;&lt;a href="#cb11-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; value: (&lt;span class="op"&gt;_&lt;/span&gt;))&lt;/span&gt;
100&lt;span id="cb11-4"&gt;&lt;a href="#cb11-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;/span&gt;
101&lt;span id="cb11-5"&gt;&lt;a href="#cb11-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; matches:&lt;/span&gt;&lt;/span&gt;
102&lt;span id="cb11-6"&gt;&lt;a href="#cb11-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = &amp;quot;foo&amp;quot;;&lt;/span&gt;&lt;/span&gt;
103&lt;span id="cb11-7"&gt;&lt;a href="#cb11-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = 42;&lt;/span&gt;&lt;/span&gt;
104&lt;span id="cb11-8"&gt;&lt;a href="#cb11-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;; let foo = bar;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
105&lt;p&gt;&lt;a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries"&gt;The documentation&lt;/a&gt; does the tree-sitter query DSL more justice, but we now know enough to write our first lint.&lt;/p&gt;
106&lt;h3 id="write-you-a-tree-sitter-lint"&gt;Write you a tree-sitter lint&lt;/h3&gt;
107&lt;p&gt;String literals passed to &lt;code&gt;std::env&lt;/code&gt; functions are error-prone:&lt;/p&gt;
108&lt;div class="sourceCode" id="cb12"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb12-1"&gt;&lt;a href="#cb12-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="pp"&gt;std::env::&lt;/span&gt;remove_var(&lt;span class="st"&gt;&amp;quot;RUST_BACKTACE&amp;quot;&lt;/span&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
109&lt;span id="cb12-2"&gt;&lt;a href="#cb12-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="co"&gt;// ^^^^ &amp;quot;TACE&amp;quot; instead of &amp;quot;TRACE&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
110&lt;p&gt;I prefer this instead:&lt;/p&gt;
111&lt;div class="sourceCode" id="cb13"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb13-1"&gt;&lt;a href="#cb13-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// somewhere in a module that is well spellchecked&lt;/span&gt;&lt;/span&gt;
112&lt;span id="cb13-2"&gt;&lt;a href="#cb13-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;static&lt;/span&gt; BACKTRACE&lt;span class="op"&gt;:&lt;/span&gt; &lt;span class="op"&gt;&amp;amp;&lt;/span&gt;&lt;span class="dt"&gt;str&lt;/span&gt; &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="st"&gt;&amp;quot;RUST_BACKTRACE&amp;quot;&lt;/span&gt;&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
113&lt;span id="cb13-3"&gt;&lt;a href="#cb13-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
114&lt;span id="cb13-4"&gt;&lt;a href="#cb13-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// rest of the codebase&lt;/span&gt;&lt;/span&gt;
115&lt;span id="cb13-5"&gt;&lt;a href="#cb13-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="pp"&gt;std::env::&lt;/span&gt;remove_var(BACKTRACE)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
116&lt;p&gt;Let’s write a lint to find &lt;code&gt;std::env&lt;/code&gt; functions that use strings. Put aside the effectiveness of this lint for the moment, and take a stab at writing a tree-sitter query. For reference, a function call like so:&lt;/p&gt;
117&lt;div class="sourceCode" id="cb14"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb14-1"&gt;&lt;a href="#cb14-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;remove_var(&lt;span class="st"&gt;&amp;quot;RUST_BACKTRACE&amp;quot;&lt;/span&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
118&lt;p&gt;Produces the following S-expression:&lt;/p&gt;
119&lt;div class="sourceCode" id="cb15"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb15-1"&gt;&lt;a href="#cb15-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(call_expression&lt;/span&gt;
120&lt;span id="cb15-2"&gt;&lt;a href="#cb15-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; function: (identifier)&lt;/span&gt;
121&lt;span id="cb15-3"&gt;&lt;a href="#cb15-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; arguments: (arguments (string_literal)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
122&lt;p&gt;We are definitely looking for a &lt;code&gt;call_expression&lt;/code&gt;:&lt;/p&gt;
123&lt;div class="sourceCode" id="cb16"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb16-1"&gt;&lt;a href="#cb16-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;(call_expression) @raise&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
124&lt;p&gt;Whose function name matches &lt;code&gt;std::env::var&lt;/code&gt; or &lt;code&gt;std::env::remove_var&lt;/code&gt; at the very least (I know, I know, this isn’t the most optimal regex):&lt;/p&gt;
125&lt;div class="sourceCode" id="cb17"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb17-1"&gt;&lt;a href="#cb17-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((call_expression&lt;/span&gt;
126&lt;span id="cb17-2"&gt;&lt;a href="#cb17-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; function: (&lt;span class="op"&gt;_&lt;/span&gt;) @fn-name) @raise&lt;/span&gt;
127&lt;span id="cb17-3"&gt;&lt;a href="#cb17-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (#match? @fn-name &lt;span class="st"&gt;&amp;quot;std::env::(var|remove_var)&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
128&lt;p&gt;Let’s make that &lt;code&gt;std::&lt;/code&gt; prefix optional:&lt;/p&gt;
129&lt;div class="sourceCode" id="cb18"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb18-1"&gt;&lt;a href="#cb18-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((call_expression&lt;/span&gt;
130&lt;span id="cb18-2"&gt;&lt;a href="#cb18-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; function: (&lt;span class="op"&gt;_&lt;/span&gt;) @fn-name) @raise&lt;/span&gt;
131&lt;span id="cb18-3"&gt;&lt;a href="#cb18-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (#match? @fn-name &lt;span class="st"&gt;&amp;quot;(std::|)env::(var|remove_var)&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
132&lt;p&gt;And ensure that &lt;code&gt;arguments&lt;/code&gt; contains a string literal:&lt;/p&gt;
133&lt;div class="sourceCode" id="cb19"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb19-1"&gt;&lt;a href="#cb19-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((call_expression&lt;/span&gt;
134&lt;span id="cb19-2"&gt;&lt;a href="#cb19-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; function: (&lt;span class="op"&gt;_&lt;/span&gt;) @fn-name&lt;/span&gt;
135&lt;span id="cb19-3"&gt;&lt;a href="#cb19-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; arguments: (arguments (string_literal))) @raise&lt;/span&gt;
136&lt;span id="cb19-4"&gt;&lt;a href="#cb19-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (#match? @fn-name &lt;span class="st"&gt;&amp;quot;(std::|)env::(var|remove_var)&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
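One thing worth checking before running the query is which function names the pattern actually accepts. The sketch below is a standard-library stand-in, not tree-sitter code: it assumes `#match?` behaves like an unanchored regex search (which is how the Rust bindings appear to evaluate it), in which case `(std::|)env::(var|remove_var)` matches exactly when the text contains `env::var` or `env::remove_var`. The helper name `pattern_matches` is hypothetical.

```rust
// Stand-in for the unanchored regex "(std::|)env::(var|remove_var)".
// Assumption: #match? performs an unanchored search. Under that assumption
// the optional `std::` prefix does not change the result: a match exists
// exactly when the text contains "env::var" or "env::remove_var".
fn pattern_matches(fn_name: &str) -> bool {
    fn_name.contains("env::var") || fn_name.contains("env::remove_var")
}

fn main() {
    let names = [
        "std::env::var",
        "env::remove_var",
        "std::env::var_os", // also accepted: "env::var" is a substring
        "my_env::var",      // also accepted: no anchor at the start
        "remove_var",       // rejected: no `env::` qualifier
    ];
    for name in names {
        println!("{}: {}", name, pattern_matches(name));
    }
}
```

If the looser matches are unwanted, anchoring the regex with `^` and `$` would tighten it; an unqualified `remove_var` brought in with `use` goes undetected either way.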
137&lt;h3 id="running-our-linter"&gt;Running our linter&lt;/h3&gt;
138&lt;p&gt;We could always plug our query into the web playground, but let’s go a step further:&lt;/p&gt;
139&lt;div class="sourceCode" id="cb20"&gt;&lt;pre class="sourceCode bash"&gt;&lt;code class="sourceCode bash"&gt;&lt;span id="cb20-1"&gt;&lt;a href="#cb20-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="ex"&gt;cargo&lt;/span&gt; new &lt;span class="at"&gt;--bin&lt;/span&gt; toy-lint&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
140&lt;p&gt;Add &lt;code&gt;tree-sitter&lt;/code&gt; and &lt;code&gt;tree-sitter-rust&lt;/code&gt; to your dependencies:&lt;/p&gt;
141&lt;div class="sourceCode" id="cb21"&gt;&lt;pre class="sourceCode toml"&gt;&lt;code class="sourceCode toml"&gt;&lt;span id="cb21-1"&gt;&lt;a href="#cb21-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;# within Cargo.toml&lt;/span&gt;&lt;/span&gt;
142&lt;span id="cb21-2"&gt;&lt;a href="#cb21-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;[&lt;/span&gt;&lt;span class="dt"&gt;dependencies&lt;/span&gt;&lt;span class="kw"&gt;]&lt;/span&gt;&lt;/span&gt;
143&lt;span id="cb21-3"&gt;&lt;a href="#cb21-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="dt"&gt;tree-sitter&lt;/span&gt; &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="st"&gt;&amp;quot;0.20&amp;quot;&lt;/span&gt;&lt;/span&gt;
144&lt;span id="cb21-4"&gt;&lt;a href="#cb21-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
145&lt;span id="cb21-5"&gt;&lt;a href="#cb21-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;[&lt;/span&gt;&lt;span class="dt"&gt;dependencies&lt;/span&gt;&lt;span class="kw"&gt;.&lt;/span&gt;&lt;span class="dt"&gt;tree-sitter-rust&lt;/span&gt;&lt;span class="kw"&gt;]&lt;/span&gt;&lt;/span&gt;
146&lt;span id="cb21-6"&gt;&lt;a href="#cb21-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="dt"&gt;git&lt;/span&gt; &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="st"&gt;&amp;quot;https://github.com/tree-sitter/tree-sitter-rust&amp;quot;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
147&lt;p&gt;Let’s load in some Rust code to work with. As &lt;a href="https://en.wikipedia.org/wiki/Self-reference"&gt;an ode to Gödel&lt;/a&gt; (G&lt;code&gt;ode&lt;/code&gt;l?), why not load in our linter itself:&lt;/p&gt;
148&lt;div class="sourceCode" id="cb22"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb22-1"&gt;&lt;a href="#cb22-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;fn&lt;/span&gt; main() &lt;span class="op"&gt;{&lt;/span&gt;&lt;/span&gt;
149&lt;span id="cb22-2"&gt;&lt;a href="#cb22-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; src &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="pp"&gt;include_str!&lt;/span&gt;(&lt;span class="st"&gt;&amp;quot;main.rs&amp;quot;&lt;/span&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
150&lt;span id="cb22-3"&gt;&lt;a href="#cb22-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
151&lt;p&gt;Most tree-sitter APIs require a &lt;code&gt;Language&lt;/code&gt; struct. We will be working with Rust, if you haven’t already guessed:&lt;/p&gt;
152&lt;div class="sourceCode" id="cb23"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb23-1"&gt;&lt;a href="#cb23-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;use&lt;/span&gt; &lt;span class="pp"&gt;tree_sitter::&lt;/span&gt;Language&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
153&lt;span id="cb23-2"&gt;&lt;a href="#cb23-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
154&lt;span id="cb23-3"&gt;&lt;a href="#cb23-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; rust_lang&lt;span class="op"&gt;:&lt;/span&gt; Language &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="pp"&gt;tree_sitter_rust::&lt;/span&gt;language()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
155&lt;p&gt;Enough scaffolding, let’s parse some Rust:&lt;/p&gt;
156&lt;div class="sourceCode" id="cb24"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb24-1"&gt;&lt;a href="#cb24-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;use&lt;/span&gt; &lt;span class="pp"&gt;tree_sitter::&lt;/span&gt;Parser&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
157&lt;span id="cb24-2"&gt;&lt;a href="#cb24-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
158&lt;span id="cb24-3"&gt;&lt;a href="#cb24-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; &lt;span class="kw"&gt;mut&lt;/span&gt; parser &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="pp"&gt;Parser::&lt;/span&gt;new()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
159&lt;span id="cb24-4"&gt;&lt;a href="#cb24-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;parser&lt;span class="op"&gt;.&lt;/span&gt;set_language(rust_lang)&lt;span class="op"&gt;.&lt;/span&gt;unwrap()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
160&lt;span id="cb24-5"&gt;&lt;a href="#cb24-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
161&lt;span id="cb24-6"&gt;&lt;a href="#cb24-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; parse_tree &lt;span class="op"&gt;=&lt;/span&gt; parser&lt;span class="op"&gt;.&lt;/span&gt;parse(&lt;span class="op"&gt;&amp;amp;&lt;/span&gt;src&lt;span class="op"&gt;,&lt;/span&gt; &lt;span class="cn"&gt;None&lt;/span&gt;)&lt;span class="op"&gt;.&lt;/span&gt;unwrap()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
162&lt;p&gt;The second argument to &lt;code&gt;Parser::parse&lt;/code&gt; may be of interest. Tree-sitter supports incremental parsing: once edits have been recorded on an existing parse tree, it can reparse quickly by reusing the parts that did not change. If you do happen to want to reparse a source file, you can pass in the old tree:&lt;/p&gt;
163&lt;div class="sourceCode" id="cb25"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb25-1"&gt;&lt;a href="#cb25-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// if you wish to reparse instead of parse&lt;/span&gt;&lt;/span&gt;
164&lt;span id="cb25-2"&gt;&lt;a href="#cb25-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;old_tree&lt;span class="op"&gt;.&lt;/span&gt;edit(&lt;span class="co"&gt;/* redacted */&lt;/span&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
165&lt;span id="cb25-3"&gt;&lt;a href="#cb25-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
166&lt;span id="cb25-4"&gt;&lt;a href="#cb25-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// generate shiny new reparsed tree&lt;/span&gt;&lt;/span&gt;
167&lt;span id="cb25-5"&gt;&lt;a href="#cb25-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; new_tree &lt;span class="op"&gt;=&lt;/span&gt; parser&lt;span class="op"&gt;.&lt;/span&gt;parse(&lt;span class="op"&gt;&amp;amp;&lt;/span&gt;src&lt;span class="op"&gt;,&lt;/span&gt; &lt;span class="cn"&gt;Some&lt;/span&gt;(&lt;span class="op"&gt;&amp;amp;&lt;/span&gt;old_tree))&lt;span class="op"&gt;.&lt;/span&gt;unwrap()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
168&lt;p&gt;Anyhow (&lt;a href="http://github.com/dtolnay/anyhow"&gt;hah!&lt;/a&gt;), now that we have a parse tree, we can inspect it:&lt;/p&gt;
169&lt;div class="sourceCode" id="cb26"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb26-1"&gt;&lt;a href="#cb26-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="pp"&gt;println!&lt;/span&gt;(&lt;span class="st"&gt;&amp;quot;{}&amp;quot;&lt;/span&gt;&lt;span class="op"&gt;,&lt;/span&gt; parse_tree&lt;span class="op"&gt;.&lt;/span&gt;root_node()&lt;span class="op"&gt;.&lt;/span&gt;to_sexp())&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
170&lt;p&gt;Or better yet, run a query on it:&lt;/p&gt;
171&lt;div class="sourceCode" id="cb27"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb27-1"&gt;&lt;a href="#cb27-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;use&lt;/span&gt; &lt;span class="pp"&gt;tree_sitter::&lt;/span&gt;Query&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
172&lt;span id="cb27-2"&gt;&lt;a href="#cb27-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
173&lt;span id="cb27-3"&gt;&lt;a href="#cb27-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; query &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="pp"&gt;Query::&lt;/span&gt;new(&lt;/span&gt;
174&lt;span id="cb27-4"&gt;&lt;a href="#cb27-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; rust_lang&lt;span class="op"&gt;,&lt;/span&gt;&lt;/span&gt;
175&lt;span id="cb27-5"&gt;&lt;a href="#cb27-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="st"&gt;r#&amp;quot;&lt;/span&gt;&lt;/span&gt;
176&lt;span id="cb27-6"&gt;&lt;a href="#cb27-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="st"&gt; ((call_expression&lt;/span&gt;&lt;/span&gt;
177&lt;span id="cb27-7"&gt;&lt;a href="#cb27-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="st"&gt; function: (_) @fn-name&lt;/span&gt;&lt;/span&gt;
178&lt;span id="cb27-8"&gt;&lt;a href="#cb27-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="st"&gt; arguments: (arguments (string_literal))) @raise&lt;/span&gt;&lt;/span&gt;
179&lt;span id="cb27-9"&gt;&lt;a href="#cb27-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="st"&gt; (#match? @fn-name &amp;quot;(std::|)env::(var|remove_var)&amp;quot;))&lt;/span&gt;&lt;/span&gt;
180&lt;span id="cb27-10"&gt;&lt;a href="#cb27-10" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="st"&gt; &amp;quot;#&lt;/span&gt;&lt;/span&gt;
181&lt;span id="cb27-11"&gt;&lt;a href="#cb27-11" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;)&lt;/span&gt;
182&lt;span id="cb27-12"&gt;&lt;a href="#cb27-12" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;.&lt;/span&gt;unwrap()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
183&lt;p&gt;A &lt;code&gt;QueryCursor&lt;/code&gt; is tree-sitter’s way of maintaining state as we iterate through the matches or captures produced by running a query on the parse tree. Observe:&lt;/p&gt;
184&lt;div class="sourceCode" id="cb28"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb28-1"&gt;&lt;a href="#cb28-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;use&lt;/span&gt; &lt;span class="pp"&gt;tree_sitter::&lt;/span&gt;QueryCursor&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
185&lt;span id="cb28-2"&gt;&lt;a href="#cb28-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
186&lt;span id="cb28-3"&gt;&lt;a href="#cb28-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; &lt;span class="kw"&gt;mut&lt;/span&gt; query_cursor &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="pp"&gt;QueryCursor::&lt;/span&gt;new()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
187&lt;span id="cb28-4"&gt;&lt;a href="#cb28-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; all_matches &lt;span class="op"&gt;=&lt;/span&gt; query_cursor&lt;span class="op"&gt;.&lt;/span&gt;matches(&lt;/span&gt;
188&lt;span id="cb28-5"&gt;&lt;a href="#cb28-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;&amp;amp;&lt;/span&gt;query&lt;span class="op"&gt;,&lt;/span&gt;&lt;/span&gt;
189&lt;span id="cb28-6"&gt;&lt;a href="#cb28-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; parse_tree&lt;span class="op"&gt;.&lt;/span&gt;root_node()&lt;span class="op"&gt;,&lt;/span&gt;&lt;/span&gt;
190&lt;span id="cb28-7"&gt;&lt;a href="#cb28-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; src&lt;span class="op"&gt;.&lt;/span&gt;as_bytes()&lt;span class="op"&gt;,&lt;/span&gt;&lt;/span&gt;
191&lt;span id="cb28-8"&gt;&lt;a href="#cb28-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
192&lt;p&gt;We begin by passing our query to the cursor, followed by the “root node”, which is another way of saying, “start from the top”, and lastly, the source itself. If you have already taken a look at the C API, you will notice that the last argument, the source (known as the &lt;code&gt;TextProvider&lt;/code&gt;), is not required there. The Rust bindings seem to require it in order to evaluate predicates such as &lt;code&gt;#match?&lt;/code&gt; and &lt;code&gt;#eq?&lt;/code&gt;.&lt;/p&gt;
193&lt;p&gt;Do something with the matches:&lt;/p&gt;
194&lt;div class="sourceCode" id="cb29"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb29-1"&gt;&lt;a href="#cb29-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// get the index of the capture named &amp;quot;raise&amp;quot;&lt;/span&gt;&lt;/span&gt;
195&lt;span id="cb29-2"&gt;&lt;a href="#cb29-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;let&lt;/span&gt; raise_idx &lt;span class="op"&gt;=&lt;/span&gt; query&lt;span class="op"&gt;.&lt;/span&gt;capture_index_for_name(&lt;span class="st"&gt;&amp;quot;raise&amp;quot;&lt;/span&gt;)&lt;span class="op"&gt;.&lt;/span&gt;unwrap()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
196&lt;span id="cb29-3"&gt;&lt;a href="#cb29-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;/span&gt;
197&lt;span id="cb29-4"&gt;&lt;a href="#cb29-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="kw"&gt;for&lt;/span&gt; each_match &lt;span class="kw"&gt;in&lt;/span&gt; all_matches &lt;span class="op"&gt;{&lt;/span&gt;&lt;/span&gt;
198&lt;span id="cb29-5"&gt;&lt;a href="#cb29-5" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="co"&gt;// iterate over all captures called &amp;quot;raise&amp;quot;&lt;/span&gt;&lt;/span&gt;
199&lt;span id="cb29-6"&gt;&lt;a href="#cb29-6" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="co"&gt;// ignore captures such as &amp;quot;fn-name&amp;quot;&lt;/span&gt;&lt;/span&gt;
200&lt;span id="cb29-7"&gt;&lt;a href="#cb29-7" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;for&lt;/span&gt; capture &lt;span class="kw"&gt;in&lt;/span&gt; each_match&lt;/span&gt;
201&lt;span id="cb29-8"&gt;&lt;a href="#cb29-8" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;.&lt;/span&gt;captures&lt;/span&gt;
202&lt;span id="cb29-9"&gt;&lt;a href="#cb29-9" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;.&lt;/span&gt;iter()&lt;/span&gt;
203&lt;span id="cb29-10"&gt;&lt;a href="#cb29-10" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;.&lt;/span&gt;filter(&lt;span class="op"&gt;|&lt;/span&gt;c&lt;span class="op"&gt;|&lt;/span&gt; c&lt;span class="op"&gt;.&lt;/span&gt;idx &lt;span class="op"&gt;==&lt;/span&gt; raise_idx)&lt;/span&gt;
204&lt;span id="cb29-11"&gt;&lt;a href="#cb29-11" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;{&lt;/span&gt;&lt;/span&gt;
205&lt;span id="cb29-12"&gt;&lt;a href="#cb29-12" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; range &lt;span class="op"&gt;=&lt;/span&gt; capture&lt;span class="op"&gt;.&lt;/span&gt;node&lt;span class="op"&gt;.&lt;/span&gt;range()&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
206&lt;span id="cb29-13"&gt;&lt;a href="#cb29-13" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; text &lt;span class="op"&gt;=&lt;/span&gt; &lt;span class="op"&gt;&amp;amp;&lt;/span&gt;src[range&lt;span class="op"&gt;.&lt;/span&gt;start_byte&lt;span class="op"&gt;..&lt;/span&gt;range&lt;span class="op"&gt;.&lt;/span&gt;end_byte]&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
207&lt;span id="cb29-14"&gt;&lt;a href="#cb29-14" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; line &lt;span class="op"&gt;=&lt;/span&gt; range&lt;span class="op"&gt;.&lt;/span&gt;start_point&lt;span class="op"&gt;.&lt;/span&gt;row&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
208&lt;span id="cb29-15"&gt;&lt;a href="#cb29-15" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="kw"&gt;let&lt;/span&gt; col &lt;span class="op"&gt;=&lt;/span&gt; range&lt;span class="op"&gt;.&lt;/span&gt;start_point&lt;span class="op"&gt;.&lt;/span&gt;column&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
209&lt;span id="cb29-16"&gt;&lt;a href="#cb29-16" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="pp"&gt;println!&lt;/span&gt;(&lt;/span&gt;
210&lt;span id="cb29-17"&gt;&lt;a href="#cb29-17" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="st"&gt;&amp;quot;[Line: {}, Col: {}] Offending source code: `{}`&amp;quot;&lt;/span&gt;&lt;span class="op"&gt;,&lt;/span&gt;&lt;/span&gt;
211&lt;span id="cb29-18"&gt;&lt;a href="#cb29-18" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; line&lt;span class="op"&gt;,&lt;/span&gt; col&lt;span class="op"&gt;,&lt;/span&gt; text&lt;/span&gt;
212&lt;span id="cb29-19"&gt;&lt;a href="#cb29-19" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; )&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;
213&lt;span id="cb29-20"&gt;&lt;a href="#cb29-20" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; &lt;span class="op"&gt;}&lt;/span&gt;&lt;/span&gt;
214&lt;span id="cb29-21"&gt;&lt;a href="#cb29-21" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="op"&gt;}&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
215&lt;p&gt;Lastly, add the following line to your source code, to get the linter to catch something:&lt;/p&gt;
216&lt;div class="sourceCode" id="cb30"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb30-1"&gt;&lt;a href="#cb30-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="pp"&gt;env::&lt;/span&gt;remove_var(&lt;span class="st"&gt;&amp;quot;RUST_BACKTRACE&amp;quot;&lt;/span&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
217&lt;p&gt;And &lt;code&gt;cargo run&lt;/code&gt;:&lt;/p&gt;
218&lt;pre class="shell"&gt;&lt;code&gt;λ cargo run
219 Compiling toy-lint v0.1.0 (/redacted/path/to/toy-lint)
220 Finished dev [unoptimized + debuginfo] target(s) in 0.74s
221 Running `target/debug/toy-lint`
222[Line: 40, Col: 4] Offending source code: `env::remove_var(&amp;quot;RUST_BACKTRACE&amp;quot;)`&lt;/code&gt;&lt;/pre&gt;
223&lt;p&gt;Thank you tree-sitter!&lt;/p&gt;
224&lt;h3 id="bonus"&gt;Bonus&lt;/h3&gt;
225&lt;p&gt;Keen readers will notice that I avoided &lt;code&gt;std::env::set_var&lt;/code&gt;. Unlike &lt;code&gt;env::var&lt;/code&gt; and &lt;code&gt;env::remove_var&lt;/code&gt;, &lt;code&gt;set_var&lt;/code&gt; is called with two arguments, a “key” and a “value”. As a result, it requires more juggling:&lt;/p&gt;
226&lt;div class="sourceCode" id="cb32"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb32-1"&gt;&lt;a href="#cb32-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((call_expression&lt;/span&gt;
227&lt;span id="cb32-2"&gt;&lt;a href="#cb32-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; function: (&lt;span class="op"&gt;_&lt;/span&gt;) @fn-name&lt;/span&gt;
228&lt;span id="cb32-3"&gt;&lt;a href="#cb32-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; arguments: (arguments &lt;span class="op"&gt;.&lt;/span&gt; (string_literal)&lt;span class="op"&gt;?&lt;/span&gt; &lt;span class="op"&gt;.&lt;/span&gt; (string_literal) &lt;span class="op"&gt;.&lt;/span&gt;)) @raise&lt;/span&gt;
229&lt;span id="cb32-4"&gt;&lt;a href="#cb32-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (#match? @fn-name &lt;span class="st"&gt;&amp;quot;(std::|)env::(var|remove_var|set_var)&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
230&lt;p&gt;The interesting part of this query is the humble &lt;code&gt;.&lt;/code&gt;, the &lt;em&gt;anchor&lt;/em&gt; operator. Anchors help constrain child nodes in certain ways. In this case, they ensure that we match either exactly two sibling &lt;code&gt;string_literal&lt;/code&gt;s or exactly one &lt;code&gt;string_literal&lt;/code&gt; with no siblings. Unfortunately, this query also matches the following invalid Rust code:&lt;/p&gt;
231&lt;div class="sourceCode" id="cb33"&gt;&lt;pre class="sourceCode rust"&gt;&lt;code class="sourceCode rust"&gt;&lt;span id="cb33-1"&gt;&lt;a href="#cb33-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="co"&gt;// remove_var accepts only 1 arg!&lt;/span&gt;&lt;/span&gt;
232&lt;span id="cb33-2"&gt;&lt;a href="#cb33-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;&lt;span class="pp"&gt;std::env::&lt;/span&gt;remove_var(&lt;span class="st"&gt;&amp;quot;RUST_BACKTRACE&amp;quot;&lt;/span&gt;&lt;span class="op"&gt;,&lt;/span&gt; &lt;span class="st"&gt;&amp;quot;1&amp;quot;&lt;/span&gt;)&lt;span class="op"&gt;;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
233&lt;h3 id="notes"&gt;Notes&lt;/h3&gt;
234&lt;p&gt;All in all, the query DSL does a great job of lowering the barrier to writing language tools. The knowledge gained from mastering the query DSL can be applied to other languages that have tree-sitter grammars too. This query detects &lt;code&gt;to_json&lt;/code&gt; methods that do not accept additional arguments, in Ruby:&lt;/p&gt;
235&lt;div class="sourceCode" id="cb34"&gt;&lt;pre class="sourceCode scheme"&gt;&lt;code class="sourceCode scheme"&gt;&lt;span id="cb34-1"&gt;&lt;a href="#cb34-1" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt;((method&lt;/span&gt;
236&lt;span id="cb34-2"&gt;&lt;a href="#cb34-2" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; name: (identifier) @fn&lt;/span&gt;
237&lt;span id="cb34-3"&gt;&lt;a href="#cb34-3" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; !parameters)&lt;/span&gt;
238&lt;span id="cb34-4"&gt;&lt;a href="#cb34-4" aria-hidden="true" tabindex="-1"&gt;&lt;/a&gt; (&lt;span class="sc"&gt;#i&lt;/span&gt;s? @fn &lt;span class="st"&gt;&amp;quot;to_json&amp;quot;&lt;/span&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description>
239<link>https://peppe.rs/posts/lightweight_linting/</link>
240<pubDate>Wed, 26 Jan 2022 12:52:00 +0000</pubDate>
241<guid>https://peppe.rs/posts/lightweight_linting/</guid>
242</item>
243<item>
15<title>Novice Nix: Flake Templates</title> 244<title>Novice Nix: Flake Templates</title>
16<description>&lt;p&gt;Flakes are very handy to setup entirely pure, project-specific dependencies (not just dependencies, but build steps, shell environments and more) in a declarative way. Writing Flake expressions can get repetitive though, oftentimes, you’d much rather start off with a skeleton. Luckily, &lt;code&gt;nix&lt;/code&gt; already supports templates!&lt;/p&gt; 245<description>&lt;p&gt;Flakes are very handy to setup entirely pure, project-specific dependencies (not just dependencies, but build steps, shell environments and more) in a declarative way. Writing Flake expressions can get repetitive though, oftentimes, you’d much rather start off with a skeleton. Luckily, &lt;code&gt;nix&lt;/code&gt; already supports templates!&lt;/p&gt;
17&lt;p&gt;You might already be familiar with &lt;code&gt;nix flake init&lt;/code&gt;, that drops a “default” flake expression into your current working directory. If you head over to the manpage:&lt;/p&gt; 246&lt;p&gt;You might already be familiar with &lt;code&gt;nix flake init&lt;/code&gt;, that drops a “default” flake expression into your current working directory. If you head over to the manpage:&lt;/p&gt;
diff --git a/docs/posts/index.html b/docs/posts/index.html
index d15342d..68d5d60 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -27,6 +27,23 @@
27 <tr> 27 <tr>
28 <td class=table-post> 28 <td class=table-post>
29 <div class="date"> 29 <div class="date">
30 26/01 — 2022
31 </div>
32 <a href="/posts/lightweight_linting" class="post-link">
33 <span class="post-link">Lightweight Linting</span>
34 </a>
35 </td>
36 <td class=table-stats>
37 <span class="stats-number">
38 8.5
39 </span>
40 <span class=stats-unit>min</span>
41 </td>
42 </tr>
43
44 <tr>
45 <td class=table-post>
46 <div class="date">
30 05/10 — 2021 47 05/10 — 2021
31 </div> 48 </div>
32 <a href="/posts/novice_nix:_flake_templates" class="post-link"> 49 <a href="/posts/novice_nix:_flake_templates" class="post-link">
diff --git a/docs/posts/lightweight_linting/index.html b/docs/posts/lightweight_linting/index.html
new file mode 100644
index 0000000..9bc84f2
--- /dev/null
+++ b/docs/posts/lightweight_linting/index.html
@@ -0,0 +1,298 @@
1<!DOCTYPE html>
2<html lang="en">
3 <head>
4 <link rel="stylesheet" href="/style.css">
5 <link rel="stylesheet" href="/syntax.css">
6 <meta charset="UTF-8">
7 <meta name="viewport" content="initial-scale=1">
8 <meta content="#ffffff" name="theme-color">
9 <meta name="HandheldFriendly" content="true">
10 <meta property="og:title" content="Lightweight Linting">
11 <meta property="og:type" content="website">
12 <meta property="og:description" content="a static site {for, by, about} me ">
13 <meta property="og:url" content="https://peppe.rs">
14 <link rel="icon" type="image/x-icon" href="/favicon.png">
15 <title>Lightweight Linting · peppe.rs</title>
16 <body>
17 <div class="posts">
18 <div class="post">
19 <a href="/" class="post-end-link">Home</a>
20 <span>/</span>
21 <a href="/posts" class="post-end-link">Posts</a>
22 <span>/</span>
23 <a class="post-end-link">Lightweight Linting</a>
24 <a class="stats post-end-link" href="https://git.peppe.rs/web/site/plain/posts/lightweight_linting.md
25">View Raw</a>
26 <div class="separator"></div>
27 <div class="date">
28 26/01 — 2022
29 <div class="stats">
30 <span class="stats-number">
31 170.62
32 </span>
33 <span class="stats-unit">cm</span>
34 &nbsp
35 <span class="stats-number">
36 8.5
37 </span>
38 <span class="stats-unit">min</span>
39 </div>
40 </div>
41 <h1>
42 Lightweight Linting
43 </h1>
44 <div class="post-text">
45 <p><a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries">Tree-sitter</a> queries allow you to search for patterns in syntax trees, much like a regex would, in text. Combine that with some Rust glue to write simple, custom linters.</p>
46<h3 id="tree-sitter-syntax-trees">Tree-sitter syntax trees</h3>
47<p>Here is a quick crash course on syntax trees generated by tree-sitter. Syntax trees produced by tree-sitter are represented by S-expressions. The generated S-expression for the following Rust code,</p>
48<div class="sourceCode" id="cb1"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">fn</span> main() <span class="op">{</span></span>
49<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> x <span class="op">=</span> <span class="dv">2</span><span class="op">;</span></span>
50<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
51<p>would be:</p>
52<div class="sourceCode" id="cb2"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a>(source_file</span>
53<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> (function_item</span>
54<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> name: (identifier)</span>
55<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> parameters: (parameters)</span>
56<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> body: </span>
57<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> (block</span>
58<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> (let_declaration </span>
59<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> pattern: (identifier)</span>
60<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> value: (integer_literal)))))</span></code></pre></div>
61<p>Syntax trees generated by tree-sitter have another cool property: they are <em>lossless</em>. Given a lossless syntax tree, you can regenerate the original source code in its entirety. Consider the following addition to our example:</p>
62<div class="sourceCode" id="cb3"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a> <span class="kw">fn</span> main() <span class="op">{</span></span>
63<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="op">+</span> <span class="co">// a comment goes here</span></span>
64<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> x <span class="op">=</span> <span class="dv">2</span><span class="op">;</span></span>
65<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span></code></pre></div>
66<p>The tree-sitter syntax tree preserves the comment, while the typical abstract syntax tree wouldn’t:</p>
67<div class="sourceCode" id="cb4"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a> (source_file</span>
68<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a> (function_item</span>
69<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a> name: (identifier)</span>
70<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a> parameters: (parameters)</span>
71<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a> body:</span>
72<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a> (block</span>
73<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a><span class="op">+</span> (line_comment)</span>
74<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a> (let_declaration</span>
75<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a> pattern: (identifier)</span>
76<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a> value: (integer_literal)))))</span></code></pre></div>
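<p>To make “lossless” concrete, here is a toy sketch in plain Rust. It does not use tree-sitter; <code>Piece</code>, <code>split_lossless</code> and <code>regenerate</code> are hypothetical names invented for this illustration. The idea is only that when every byte of the input, comments included, is kept in some piece, concatenating the pieces gives back the original source:</p>

```rust
/// A hypothetical, line-granular stand-in for syntax tree nodes.
/// Each piece borrows its exact text (line terminator included) from the source.
#[derive(Debug, PartialEq)]
enum Piece<'a> {
    Comment(&'a str),
    Code(&'a str),
}

/// Split the source into pieces without dropping a single byte.
fn split_lossless(src: &str) -> Vec<Piece<'_>> {
    src.split_inclusive('\n')
        .map(|line| {
            if line.trim_start().starts_with("//") {
                Piece::Comment(line)
            } else {
                Piece::Code(line)
            }
        })
        .collect()
}

/// Rebuild the source by concatenating the text of every piece, in order.
fn regenerate(pieces: &[Piece<'_>]) -> String {
    pieces
        .iter()
        .map(|piece| match piece {
            Piece::Comment(text) | Piece::Code(text) => *text,
        })
        .collect()
}
```

<p>Feeding the commented <code>main</code> function above through <code>split_lossless</code> and then <code>regenerate</code> should return the input unchanged, with the comment and indentation intact. A tree that discarded comments could not do that.</p>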
77<h3 id="tree-sitter-queries">Tree-sitter queries</h3>
78<p>Tree-sitter provides a DSL to match over concrete syntax trees (CSTs). Queries resemble our S-expression syntax trees. Here is a query that matches all line comments in a Rust CST:</p>
79<div class="sourceCode" id="cb5"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a>(line_comment)</span>
80<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
81<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a><span class="co">; matches the following rust code</span></span>
82<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a><span class="co">; // a comment goes here</span></span></code></pre></div>
83<p>Neat, eh? But don’t take my word for it, give it a go on the <a href="https://tree-sitter.github.io/tree-sitter/playground">tree-sitter playground</a>. Type in a query like so:</p>
84<div class="sourceCode" id="cb6"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="co">; the web playground requires you to specify a &quot;capture&quot;</span></span>
85<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="co">; you will notice the capture and the nodes it captured</span></span>
86<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="co">; turn blue</span></span>
87<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a>(line_comment) @capture</span></code></pre></div>
88<p>Here’s another to match <code>let</code> declarations that bind an integer to an identifier:</p>
89<div class="sourceCode" id="cb7"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a>(let_declaration</span>
90<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a> pattern: (identifier)</span>
91<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a> value: (integer_literal))</span>
92<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a> </span>
93<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="co">; matches:</span></span>
94<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = 2;</span></span></code></pre></div>
95<p>We can <em>capture</em> nodes into variables:</p>
96<div class="sourceCode" id="cb8"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>(let_declaration </span>
97<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a> pattern: (identifier) @my-capture</span>
98<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a> value: (integer_literal))</span>
99<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a> </span>
100<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a><span class="co">; matches:</span></span>
101<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = 2;</span></span>
102<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a></span>
103<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a><span class="co">; captures:</span></span>
104<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a><span class="co">; foo</span></span></code></pre></div>
105<p>And apply certain <em>predicates</em> to captures:</p>
106<div class="sourceCode" id="cb9"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a>((let_declaration</span>
107<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a> pattern: (identifier) @my-capture</span>
108<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a> value: (integer_literal))</span>
109<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a> (<span class="sc">#e</span>q? @my-capture <span class="st">&quot;foo&quot;</span>))</span>
110<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a> </span>
111<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a><span class="co">; matches:</span></span>
112<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = 2;</span></span>
113<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a></span>
114<span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a><span class="co">; and not:</span></span>
115<span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a><span class="co">; let bar = 2;</span></span></code></pre></div>
116<p>The <code>#match?</code> predicate checks if a capture matches a regex:</p>
117<div class="sourceCode" id="cb10"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a>((let_declaration</span>
118<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a> pattern: (identifier) @my-capture</span>
119<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a> value: (integer_literal))</span>
120<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a> (#match? @my-capture <span class="st">&quot;foo|bar&quot;</span>))</span>
121<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a> </span>
122<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a><span class="co">; matches both `foo` and `bar`:</span></span>
123<span id="cb10-7"><a href="#cb10-7" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = 2;</span></span>
124<span id="cb10-8"><a href="#cb10-8" aria-hidden="true" tabindex="-1"></a><span class="co">; let bar = 2;</span></span></code></pre></div>
125<p>Exhibit indifference, as a stoic programmer would, with the <em>wildcard</em> pattern:</p>
126<div class="sourceCode" id="cb11"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a>(let_declaration</span>
127<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a> pattern: (identifier)</span>
128<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a> value: (<span class="op">_</span>))</span>
129<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a> </span>
130<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a><span class="co">; matches:</span></span>
131<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = &quot;foo&quot;;</span></span>
132<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = 42;</span></span>
133<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a><span class="co">; let foo = bar;</span></span></code></pre></div>
134<p><a href="https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries">The documentation</a> does the tree-sitter query DSL more justice, but we now know enough to write our first lint.</p>
135<h3 id="write-you-a-tree-sitter-lint">Write you a tree-sitter lint</h3>
136<p>String literals passed to <code>std::env</code> functions are error-prone:</p>
137<div class="sourceCode" id="cb12"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="pp">std::env::</span>remove_var(<span class="st">&quot;RUST_BACKTACE&quot;</span>)<span class="op">;</span></span>
138<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a> <span class="co">// ^^^^ &quot;TACE&quot; instead of &quot;TRACE&quot;</span></span></code></pre></div>
139<p>I prefer this instead:</p>
140<div class="sourceCode" id="cb13"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="co">// somewhere in a module that is well spellchecked</span></span>
141<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a><span class="kw">static</span> BACKTRACE<span class="op">:</span> <span class="op">&amp;</span><span class="dt">str</span> <span class="op">=</span> <span class="st">&quot;RUST_BACKTRACE&quot;</span><span class="op">;</span></span>
142<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a></span>
143<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a><span class="co">// rest of the codebase</span></span>
144<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a><span class="pp">std::env::</span>remove_var(BACKTRACE)<span class="op">;</span></span></code></pre></div>
145<p>Let’s write a lint to find <code>std::env</code> functions that use strings. Put aside the effectiveness of this lint for the moment, and take a stab at writing a tree-sitter query. For reference, a function call like so:</p>
146<div class="sourceCode" id="cb14"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a>remove_var(<span class="st">&quot;RUST_BACKTRACE&quot;</span>)</span></code></pre></div>
147<p>Produces the following S-expression:</p>
148<div class="sourceCode" id="cb15"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a>(call_expression</span>
149<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a> function: (identifier)</span>
150<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a> arguments: (arguments (string_literal)))</span></code></pre></div>
151<p>We are definitely looking for a <code>call_expression</code>:</p>
152<div class="sourceCode" id="cb16"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a>(call_expression) @raise</span></code></pre></div>
153<p>Whose function name matches <code>std::env::var</code> or <code>std::env::remove_var</code> at the very least (I know, I know, this isn’t the most optimal regex):</p>
154<div class="sourceCode" id="cb17"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a>((call_expression</span>
155<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a> function: (<span class="op">_</span>) @fn-name) @raise</span>
156<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a> (#match? @fn-name <span class="st">&quot;std::env::(var|remove_var)&quot;</span>))</span></code></pre></div>
157<p>Let’s make that <code>std::</code> prefix optional:</p>
158<div class="sourceCode" id="cb18"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a>((call_expression</span>
159<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a> function: (<span class="op">_</span>) @fn-name) @raise</span>
160<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a> (#match? @fn-name <span class="st">&quot;(std::|)env::(var|remove_var)&quot;</span>))</span></code></pre></div>
161<p>And ensure that <code>arguments</code> is a string:</p>
162<div class="sourceCode" id="cb19"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a>((call_expression</span>
163<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a> function: (<span class="op">_</span>) @fn-name</span>
164<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a> arguments: (arguments (string_literal))) @raise</span>
165<span id="cb19-4"><a href="#cb19-4" aria-hidden="true" tabindex="-1"></a> (#match? @fn-name <span class="st">&quot;(std::|)env::(var|remove_var)&quot;</span>))</span></code></pre></div>
166<h3 id="running-our-linter">Running our linter</h3>
167<p>We could always plug our query into the web playground, but let’s go a step further:</p>
168<div class="sourceCode" id="cb20"><pre class="sourceCode bash"><code class="sourceCode bash"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a><span class="ex">cargo</span> new <span class="at">--bin</span> toy-lint</span></code></pre></div>
169<p>Add <code>tree-sitter</code> and <code>tree-sitter-rust</code> to your dependencies:</p>
170<div class="sourceCode" id="cb21"><pre class="sourceCode toml"><code class="sourceCode toml"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="co"># within Cargo.toml</span></span>
171<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a><span class="kw">[</span><span class="dt">dependencies</span><span class="kw">]</span></span>
172<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a><span class="dt">tree-sitter</span> <span class="op">=</span> <span class="st">&quot;0.20&quot;</span></span>
173<span id="cb21-4"><a href="#cb21-4" aria-hidden="true" tabindex="-1"></a></span>
174<span id="cb21-5"><a href="#cb21-5" aria-hidden="true" tabindex="-1"></a><span class="kw">[</span><span class="dt">dependencies</span><span class="kw">.</span><span class="dt">tree-sitter-rust</span><span class="kw">]</span></span>
175<span id="cb21-6"><a href="#cb21-6" aria-hidden="true" tabindex="-1"></a><span class="dt">git</span> <span class="op">=</span> <span class="st">&quot;https://github.com/tree-sitter/tree-sitter-rust&quot;</span></span></code></pre></div>
176<p>Let’s load in some Rust code to work with. As <a href="https://en.wikipedia.org/wiki/Self-reference">an ode to Gödel</a> (G<code>ode</code>l?), why not load in our linter itself:</p>
177<div class="sourceCode" id="cb22"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="kw">fn</span> main() <span class="op">{</span></span>
178<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> src <span class="op">=</span> <span class="pp">include_str!</span>(<span class="st">&quot;main.rs&quot;</span>)<span class="op">;</span></span>
179<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
180<p>Most tree-sitter APIs require a reference to a <code>Language</code> struct. We will be working with Rust, if you haven’t already guessed:</p>
181<div class="sourceCode" id="cb23"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">tree_sitter::</span>Language<span class="op">;</span></span>
182<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a></span>
183<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> rust_lang<span class="op">:</span> Language <span class="op">=</span> <span class="pp">tree_sitter_rust::</span>language()<span class="op">;</span></span></code></pre></div>
184<p>Enough scaffolding, let’s parse some Rust:</p>
185<div class="sourceCode" id="cb24"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb24-1"><a href="#cb24-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">tree_sitter::</span>Parser<span class="op">;</span></span>
186<span id="cb24-2"><a href="#cb24-2" aria-hidden="true" tabindex="-1"></a></span>
187<span id="cb24-3"><a href="#cb24-3" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> <span class="kw">mut</span> parser <span class="op">=</span> <span class="pp">Parser::</span>new()<span class="op">;</span></span>
188<span id="cb24-4"><a href="#cb24-4" aria-hidden="true" tabindex="-1"></a>parser<span class="op">.</span>set_language(rust_lang)<span class="op">.</span>unwrap()<span class="op">;</span></span>
189<span id="cb24-5"><a href="#cb24-5" aria-hidden="true" tabindex="-1"></a></span>
190<span id="cb24-6"><a href="#cb24-6" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> parse_tree <span class="op">=</span> parser<span class="op">.</span>parse(<span class="op">&amp;</span>src<span class="op">,</span> <span class="cn">None</span>)<span class="op">.</span>unwrap()<span class="op">;</span></span></code></pre></div>
191<p>The second argument to <code>Parser::parse</code> may be of interest. Tree-sitter has this cool feature that allows for quick reparsing of existing parse trees if they contain edits. If you do happen to want to reparse a source file, you can pass in the old tree:</p>
192<div class="sourceCode" id="cb25"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="co">// if you wish to reparse instead of parse</span></span>
193<span id="cb25-2"><a href="#cb25-2" aria-hidden="true" tabindex="-1"></a>old_tree<span class="op">.</span>edit(<span class="co">/* redacted */</span>)<span class="op">;</span></span>
194<span id="cb25-3"><a href="#cb25-3" aria-hidden="true" tabindex="-1"></a></span>
195<span id="cb25-4"><a href="#cb25-4" aria-hidden="true" tabindex="-1"></a><span class="co">// generate shiny new reparsed tree</span></span>
196<span id="cb25-5"><a href="#cb25-5" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> new_tree <span class="op">=</span> parser<span class="op">.</span>parse(<span class="op">&amp;</span>src<span class="op">,</span> <span class="cn">Some</span>(<span class="op">&amp;</span>old_tree))<span class="op">.</span>unwrap()<span class="op">;</span></span></code></pre></div>
197<p>Anyhow (<a href="http://github.com/dtolnay/anyhow">hah!</a>), now that we have a parse tree, we can inspect it:</p>
198<div class="sourceCode" id="cb26"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb26-1"><a href="#cb26-1" aria-hidden="true" tabindex="-1"></a><span class="pp">println!</span>(<span class="st">&quot;{}&quot;</span><span class="op">,</span> parse_tree<span class="op">.</span>root_node()<span class="op">.</span>to_sexp())<span class="op">;</span></span></code></pre></div>
199<p>Or better yet, run a query on it:</p>
200<div class="sourceCode" id="cb27"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb27-1"><a href="#cb27-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">tree_sitter::</span>Query<span class="op">;</span></span>
201<span id="cb27-2"><a href="#cb27-2" aria-hidden="true" tabindex="-1"></a></span>
202<span id="cb27-3"><a href="#cb27-3" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> query <span class="op">=</span> <span class="pp">Query::</span>new(</span>
203<span id="cb27-4"><a href="#cb27-4" aria-hidden="true" tabindex="-1"></a> rust_lang<span class="op">,</span></span>
204<span id="cb27-5"><a href="#cb27-5" aria-hidden="true" tabindex="-1"></a> <span class="st">r#&quot;</span></span>
205<span id="cb27-6"><a href="#cb27-6" aria-hidden="true" tabindex="-1"></a><span class="st"> ((call_expression</span></span>
206<span id="cb27-7"><a href="#cb27-7" aria-hidden="true" tabindex="-1"></a><span class="st"> function: (_) @fn-name</span></span>
207<span id="cb27-8"><a href="#cb27-8" aria-hidden="true" tabindex="-1"></a><span class="st"> arguments: (arguments (string_literal))) @raise</span></span>
208<span id="cb27-9"><a href="#cb27-9" aria-hidden="true" tabindex="-1"></a><span class="st"> (#match? @fn-name &quot;(std::|)env::(var|remove_var)&quot;))</span></span>
209<span id="cb27-10"><a href="#cb27-10" aria-hidden="true" tabindex="-1"></a><span class="st"> &quot;#</span></span>
210<span id="cb27-11"><a href="#cb27-11" aria-hidden="true" tabindex="-1"></a>)</span>
211<span id="cb27-12"><a href="#cb27-12" aria-hidden="true" tabindex="-1"></a><span class="op">.</span>unwrap()<span class="op">;</span></span></code></pre></div>
212<p>A <code>QueryCursor</code> is tree-sitter’s way of maintaining state as we iterate through the matches or captures produced by running a query on the parse tree. Observe:</p>
213<div class="sourceCode" id="cb28"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a><span class="kw">use</span> <span class="pp">tree_sitter::</span>QueryCursor<span class="op">;</span></span>
214<span id="cb28-2"><a href="#cb28-2" aria-hidden="true" tabindex="-1"></a></span>
215<span id="cb28-3"><a href="#cb28-3" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> <span class="kw">mut</span> query_cursor <span class="op">=</span> <span class="pp">QueryCursor::</span>new()<span class="op">;</span></span>
216<span id="cb28-4"><a href="#cb28-4" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> all_matches <span class="op">=</span> query_cursor<span class="op">.</span>matches(</span>
217<span id="cb28-5"><a href="#cb28-5" aria-hidden="true" tabindex="-1"></a> <span class="op">&amp;</span>query<span class="op">,</span></span>
218<span id="cb28-6"><a href="#cb28-6" aria-hidden="true" tabindex="-1"></a> parse_tree<span class="op">.</span>root_node()<span class="op">,</span></span>
219<span id="cb28-7"><a href="#cb28-7" aria-hidden="true" tabindex="-1"></a> src<span class="op">.</span>as_bytes()<span class="op">,</span></span>
220<span id="cb28-8"><a href="#cb28-8" aria-hidden="true" tabindex="-1"></a>)<span class="op">;</span></span></code></pre></div>
221<p>We begin by passing our query to the cursor, followed by the “root node”, which is another way of saying, “start from the top”, and lastly, the source itself. If you have already taken a look at the C API, you will notice that the last argument, the source (known as the <code>TextProvider</code>), is not required. The Rust bindings seem to require this argument to provide predicate functionality such as <code>#match?</code> and <code>#eq?</code>.</p>
222<p>Do something with the matches:</p>
223<div class="sourceCode" id="cb29"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb29-1"><a href="#cb29-1" aria-hidden="true" tabindex="-1"></a><span class="co">// get the index of the capture named &quot;raise&quot;</span></span>
224<span id="cb29-2"><a href="#cb29-2" aria-hidden="true" tabindex="-1"></a><span class="kw">let</span> raise_idx <span class="op">=</span> query<span class="op">.</span>capture_index_for_name(<span class="st">&quot;raise&quot;</span>)<span class="op">.</span>unwrap()<span class="op">;</span></span>
225<span id="cb29-3"><a href="#cb29-3" aria-hidden="true" tabindex="-1"></a></span>
226<span id="cb29-4"><a href="#cb29-4" aria-hidden="true" tabindex="-1"></a><span class="kw">for</span> each_match <span class="kw">in</span> all_matches <span class="op">{</span></span>
227<span id="cb29-5"><a href="#cb29-5" aria-hidden="true" tabindex="-1"></a> <span class="co">// iterate over all captures called &quot;raise&quot;</span></span>
228<span id="cb29-6"><a href="#cb29-6" aria-hidden="true" tabindex="-1"></a> <span class="co">// ignore captures such as &quot;fn-name&quot;</span></span>
229<span id="cb29-7"><a href="#cb29-7" aria-hidden="true" tabindex="-1"></a> <span class="kw">for</span> capture <span class="kw">in</span> each_match</span>
230<span id="cb29-8"><a href="#cb29-8" aria-hidden="true" tabindex="-1"></a> <span class="op">.</span>captures</span>
231<span id="cb29-9"><a href="#cb29-9" aria-hidden="true" tabindex="-1"></a> <span class="op">.</span>iter()</span>
232<span id="cb29-10"><a href="#cb29-10" aria-hidden="true" tabindex="-1"></a> <span class="op">.</span>filter(<span class="op">|</span>c<span class="op">|</span> c<span class="op">.</span>idx <span class="op">==</span> raise_idx)</span>
233<span id="cb29-11"><a href="#cb29-11" aria-hidden="true" tabindex="-1"></a> <span class="op">{</span></span>
234<span id="cb29-12"><a href="#cb29-12" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> range <span class="op">=</span> capture<span class="op">.</span>node<span class="op">.</span>range()<span class="op">;</span></span>
235<span id="cb29-13"><a href="#cb29-13" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> text <span class="op">=</span> <span class="op">&amp;</span>src[range<span class="op">.</span>start_byte<span class="op">..</span>range<span class="op">.</span>end_byte]<span class="op">;</span></span>
236<span id="cb29-14"><a href="#cb29-14" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> line <span class="op">=</span> range<span class="op">.</span>start_point<span class="op">.</span>row<span class="op">;</span></span>
237<span id="cb29-15"><a href="#cb29-15" aria-hidden="true" tabindex="-1"></a> <span class="kw">let</span> col <span class="op">=</span> range<span class="op">.</span>start_point<span class="op">.</span>column<span class="op">;</span></span>
238<span id="cb29-16"><a href="#cb29-16" aria-hidden="true" tabindex="-1"></a> <span class="pp">println!</span>(</span>
239<span id="cb29-17"><a href="#cb29-17" aria-hidden="true" tabindex="-1"></a> <span class="st">&quot;[Line: {}, Col: {}] Offending source code: `{}`&quot;</span><span class="op">,</span></span>
240<span id="cb29-18"><a href="#cb29-18" aria-hidden="true" tabindex="-1"></a> line<span class="op">,</span> col<span class="op">,</span> text</span>
241<span id="cb29-19"><a href="#cb29-19" aria-hidden="true" tabindex="-1"></a> )<span class="op">;</span></span>
242<span id="cb29-20"><a href="#cb29-20" aria-hidden="true" tabindex="-1"></a> <span class="op">}</span></span>
243<span id="cb29-21"><a href="#cb29-21" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
244<p>Lastly, add the following line to your source code, to get the linter to catch something:</p>
245<div class="sourceCode" id="cb30"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb30-1"><a href="#cb30-1" aria-hidden="true" tabindex="-1"></a><span class="pp">env::</span>remove_var(<span class="st">&quot;RUST_BACKTRACE&quot;</span>)<span class="op">;</span></span></code></pre></div>
246<p>And <code>cargo run</code>:</p>
247<pre class="shell"><code>λ cargo run
248 Compiling toy-lint v0.1.0 (/redacted/path/to/toy-lint)
249 Finished dev [unoptimized + debuginfo] target(s) in 0.74s
250 Running `target/debug/toy-lint`
251[Line: 40, Col: 4] Offending source code: `env::remove_var(&quot;RUST_BACKTRACE&quot;)`</code></pre>
252<p>Thank you tree-sitter!</p>
253<h3 id="bonus">Bonus</h3>
254<p>Keen readers will notice that I avoided <code>std::env::set_var</code>. Unlike <code>env::var</code> and <code>env::remove_var</code>, <code>set_var</code> is called with two arguments, a “key” and a “value”. As a result, it requires more juggling:</p>
255<div class="sourceCode" id="cb32"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb32-1"><a href="#cb32-1" aria-hidden="true" tabindex="-1"></a>((call_expression</span>
256<span id="cb32-2"><a href="#cb32-2" aria-hidden="true" tabindex="-1"></a> function: (<span class="op">_</span>) @fn-name</span>
257<span id="cb32-3"><a href="#cb32-3" aria-hidden="true" tabindex="-1"></a> arguments: (arguments <span class="op">.</span> (string_literal)<span class="op">?</span> <span class="op">.</span> (string_literal) <span class="op">.</span>)) @raise</span>
258<span id="cb32-4"><a href="#cb32-4" aria-hidden="true" tabindex="-1"></a> (#match? @fn-name <span class="st">&quot;(std::|)env::(var|remove_var|set_var)&quot;</span>))</span></code></pre></div>
259<p>The interesting part of this query is the humble <code>.</code>, the <em>anchor</em> operator. Anchors help constrain child nodes in certain ways. In this case, they ensure that we match exactly two <code>string_literal</code>s that are siblings or exactly one <code>string_literal</code> with no siblings. Unfortunately, this query also matches the following invalid Rust code:</p>
260<div class="sourceCode" id="cb33"><pre class="sourceCode rust"><code class="sourceCode rust"><span id="cb33-1"><a href="#cb33-1" aria-hidden="true" tabindex="-1"></a><span class="co">// remove_var accepts only 1 arg!</span></span>
261<span id="cb33-2"><a href="#cb33-2" aria-hidden="true" tabindex="-1"></a><span class="pp">std::env::</span>remove_var(<span class="st">&quot;RUST_BACKTRACE&quot;</span><span class="op">,</span> <span class="st">&quot;1&quot;</span>)<span class="op">;</span></span></code></pre></div>
262<h3 id="notes">Notes</h3>
263<p>All in all, the query DSL does a great job of lowering the bar to writing language tools. The knowledge gained from mastering the query DSL can be applied to other languages that have tree-sitter grammars too. This query detects <code>to_json</code> methods that do not accept additional arguments, in Ruby:</p>
264<div class="sourceCode" id="cb34"><pre class="sourceCode scheme"><code class="sourceCode scheme"><span id="cb34-1"><a href="#cb34-1" aria-hidden="true" tabindex="-1"></a>((method</span>
265<span id="cb34-2"><a href="#cb34-2" aria-hidden="true" tabindex="-1"></a> name: (identifier) @fn</span>
266<span id="cb34-3"><a href="#cb34-3" aria-hidden="true" tabindex="-1"></a> !parameters)</span>
267<span id="cb34-4"><a href="#cb34-4" aria-hidden="true" tabindex="-1"></a> (<span class="sc">#i</span>s? @fn <span class="st">&quot;to_json&quot;</span>))</span></code></pre></div>
268
269 </div>
270
271 <div class="intro">
272 Hi.
273 <div class="hot-links">
274 <a href="https://peppe.rs/index.xml" class="feed-button">Subscribe</a>
275 <a href="https://liberapay.com/nerdypepper/donate" class="donate-button">Donate</a>
276 </div>
277 <p>I'm Akshay, I go by nerd or nerdypepper on the internet.</p>
278 <p>
279 I am a compsci undergrad, Rust programmer and an enthusiastic Vimmer.
280 I write <a href="https://git.peppe.rs">open-source stuff</a> to pass time.
281 I also design fonts:
282 <a href="https://git.peppe.rs/fonts/scientifica">scientifica</a>,
283 <a href="https://git.peppe.rs/fonts/curie">curie</a>.
284 </p>
285 <p>Send me a mail at [email protected] or a message at [email protected].</p>
286 </div>
287
288 <a href="/" class="post-end-link">Home</a>
289 <span>/</span>
290 <a href="/posts" class="post-end-link">Posts</a>
291 <span>/</span>
292 <a class="post-end-link">Lightweight Linting</a>
293 <a class="stats post-end-link" href="https://git.peppe.rs/web/site/plain/posts/lightweight_linting.md
294">View Raw</a>
295 </div>
296 </div>
297 </body>
298</html>
diff --git a/docs/style.css b/docs/style.css
index 133b193..5767d9f 100644
--- a/docs/style.css
+++ b/docs/style.css
@@ -390,3 +390,9 @@ ul {
390 box-shadow: none; 390 box-shadow: none;
391} 391}
392 392
393blockquote {
394 margin: 0;
395 padding-left: 0.8rem;
396 border-left: 2px solid var(--dark-white);
397}
398
diff --git a/posts/lightweight_linting.md b/posts/lightweight_linting.md
new file mode 100644
index 0000000..2436f30
--- /dev/null
+++ b/posts/lightweight_linting.md
@@ -0,0 +1,427 @@
1[Tree-sitter](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries)
2queries allow you to search for patterns in syntax trees,
3much like a regex searches for patterns in text. Combine them
4with some Rust glue to write simple, custom linters.
5
6### Tree-sitter syntax trees
7
8Here is a quick crash course on syntax trees generated by
9tree-sitter. Syntax trees produced by tree-sitter are
10represented by S-expressions. The generated S-expression for
11the following Rust code,
12
13```rust
14fn main() {
15 let x = 2;
16}
17```
18
19would be:
20
21```scheme
22(source_file
23 (function_item
24 name: (identifier)
25 parameters: (parameters)
26 body:
27 (block
28 (let_declaration
29 pattern: (identifier)
30 value: (integer_literal)))))
31```
32
33Syntax trees generated by tree-sitter have a cool
34property: they are _lossless_ syntax trees. Given a
35lossless syntax tree, you can regenerate the original source
36code in its entirety. Consider the following addition to our
37example:
38
39```rust
40 fn main() {
41+ // a comment goes here
42 let x = 2;
43 }
44```
45
46The tree-sitter syntax tree preserves the comment, while the
47typical abstract syntax tree wouldn't:
48
49```scheme
50 (source_file
51 (function_item
52 name: (identifier)
53 parameters: (parameters)
54 body:
55 (block
56+ (line_comment)
57 (let_declaration
58 pattern: (identifier)
59 value: (integer_literal)))))
60```
61
62### Tree-sitter queries
63
64Tree-sitter provides a DSL to match over concrete syntax trees
65(CSTs). These queries resemble our S-expression syntax trees.
66Here is a query to match all line comments in a Rust CST:
67
68```scheme
69(line_comment)
70
71; matches the following rust code
72; // a comment goes here
73```
74
75Neat, eh? But don't take my word for it, give it a go on the
76[tree-sitter
77playground](https://tree-sitter.github.io/tree-sitter/playground).
78Type in a query like so:
79
80```scheme
81; the web playground requires you to specify a "capture"
82; you will notice the capture and the nodes it captured
83; turn blue
84(line_comment) @capture
85```
86
87Here's another to match `let` declarations that
88bind an integer to an identifier:
89
90```scheme
91(let_declaration
92 pattern: (identifier)
93 value: (integer_literal))
94
95; matches:
96; let foo = 2;
97```
98
99We can _capture_ nodes into variables:
100
101```scheme
102(let_declaration
103 pattern: (identifier) @my-capture
104 value: (integer_literal))
105
106; matches:
107; let foo = 2;
108
109; captures:
110; foo
111```
112
113And apply certain _predicates_ to captures:
114
115```scheme
116((let_declaration
117 pattern: (identifier) @my-capture
118 value: (integer_literal))
119 (#eq? @my-capture "foo"))
120
121; matches:
122; let foo = 2;
123
124; and not:
125; let bar = 2;
126```
127
128The `#match?` predicate checks if a capture matches a regex:
129
130```scheme
131((let_declaration
132 pattern: (identifier) @my-capture
133 value: (integer_literal))
134 (#match? @my-capture "foo|bar"))
135
136; matches both `foo` and `bar`:
137; let foo = 2;
138; let bar = 2;
139```
140
141Exhibit indifference, as a stoic programmer would, with the
142_wildcard_ pattern:
143
144```scheme
145(let_declaration
146 pattern: (identifier)
147 value: (_))
148
149; matches:
150; let foo = "foo";
151; let foo = 42;
152; let foo = bar;
153```
154
155[The
156documentation](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries)
157does the tree-sitter query DSL more justice, but we now know
158enough to write our first lint.
159
160### Write you a tree-sitter lint
161
162Strings in `std::env` functions are error prone:
163
164```rust
165std::env::remove_var("RUST_BACKTACE");
166 // ^^^^ "TACE" instead of "TRACE"
167```
168
169I prefer this instead:
170
171```rust
172// somewhere in a module that is well spellchecked
173static BACKTRACE: &str = "RUST_BACKTRACE";
174
175// rest of the codebase
176std::env::remove_var(BACKTRACE);
177```
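
One way to keep such names in a single, reviewable place is a small module of constants. The sketch below is only an illustration and is not part of the linter built in this post; the module name `env_keys` and the helpers `is_enabled` and `backtrace_enabled` are made up for this example.

```rust
/// Hypothetical module that owns every environment variable name,
/// so that each name is spelled (and spellchecked) exactly once.
mod env_keys {
    pub const BACKTRACE: &str = "RUST_BACKTRACE";
}

/// Decides whether a raw variable value counts as "enabled":
/// any value other than "0" does, an unset variable does not.
fn is_enabled(value: Option<&str>) -> bool {
    matches!(value, Some(v) if v != "0")
}

/// Reads the variable through the constant instead of a string literal.
fn backtrace_enabled() -> bool {
    is_enabled(std::env::var(env_keys::BACKTRACE).ok().as_deref())
}
```

With this layout, the rest of the codebase calls `backtrace_enabled()` or refers to `env_keys::BACKTRACE`, and a lint only has to flag string literals passed directly to `std::env` functions.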
178
179Let's write a lint to find `std::env` functions that use
180strings. Put aside the effectiveness of this lint for the
181moment, and take a stab at writing a tree-sitter query. For
182reference, a function call like so:
183
184```rust
185remove_var("RUST_BACKTRACE")
186```
187
188Produces the following S-expression:
189
190```scheme
191(call_expression
192 function: (identifier)
193 arguments: (arguments (string_literal)))
194```
195
196We are definitely looking for a `call_expression`:
197
198```scheme
199(call_expression) @raise
200```
201
202Whose function name matches `std::env::var` or
203`std::env::remove_var` at the very least (I know, I know,
204this isn't the most optimal regex):
205
206```scheme
207((call_expression
208 function: (_) @fn-name) @raise
209 (#match? @fn-name "std::env::(var|remove_var)"))
210```
211
212Let's make that `std::` prefix optional:
213
214```scheme
215((call_expression
216 function: (_) @fn-name) @raise
217 (#match? @fn-name "(std::|)env::(var|remove_var)"))
218```
219
220And ensure that `arguments` is a string:
221
222```scheme
223((call_expression
224 function: (_) @fn-name
225 arguments: (arguments (string_literal))) @raise
226 (#match? @fn-name "(std::|)env::(var|remove_var)"))
227```
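
One caveat with the regex: as far as I can tell, the Rust bindings evaluate `#match?` as an unanchored regex search, so the pattern would also accept paths such as `env::vars` or `dotenv::var`. Anchoring it, for instance as `^(std::)?env::(var|remove_var)$`, should tighten the match. The plain-Rust sketch below spells out which paths the anchored pattern is meant to accept; `is_env_string_fn` is a hypothetical helper written for illustration, not part of tree-sitter or of the linter.

```rust
/// Hypothetical stand-in for an anchored `#match?` predicate: accepts only
/// the exact paths `env::var` and `env::remove_var`, with an optional
/// leading `std::`.
fn is_env_string_fn(path: &str) -> bool {
    // Drop the optional `std::` prefix, then require an exact match.
    let rest = path.strip_prefix("std::").unwrap_or(path);
    matches!(rest, "env::var" | "env::remove_var")
}
```

Paths like `env::vars` or `dotenv::var` are rejected here, which is the behaviour the anchored regex is meant to give.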
228
229### Running our linter
230
231We could always plug our query into the web playground, but
232let's go a step further:
233
234```bash
235cargo new --bin toy-lint
236```
237
238Add `tree-sitter` and `tree-sitter-rust` to your
239dependencies:
240
241```toml
242# within Cargo.toml
243[dependencies]
244tree-sitter = "0.20"
245
246[dependencies.tree-sitter-rust]
247git = "https://github.com/tree-sitter/tree-sitter-rust"
248```
249
250Let's load in some Rust code to work with. As [an ode to
251Gödel](https://en.wikipedia.org/wiki/Self-reference)
252(G`ode`l?), why not load in our linter itself:
253
254```rust
255fn main() {
256 let src = include_str!("main.rs");
257}
258```
259
260Most tree-sitter APIs require a reference to a `Language`
261struct. We will be working with Rust, if you haven't
262already guessed:
263
264```rust
265use tree_sitter::Language;
266
267let rust_lang: Language = tree_sitter_rust::language();
268```
269
270Enough scaffolding, let's parse some Rust:
271
272```rust
273use tree_sitter::Parser;
274
275let mut parser = Parser::new();
276parser.set_language(rust_lang).unwrap();
277
278let parse_tree = parser.parse(&src, None).unwrap();
279```
280
281The second argument to `Parser::parse` may be of interest.
282Tree-sitter has this cool feature that allows for quick
283reparsing of existing parse trees if they contain edits. If
284you do happen to want to reparse a source file, you can pass
285in the old tree:
286
287```rust
288// if you wish to reparse instead of parse
289old_tree.edit(/* redacted */);
290
291// generate shiny new reparsed tree
292let new_tree = parser.parse(&src, Some(&old_tree)).unwrap();
293```
294
295Anyhow ([hah!](http://github.com/dtolnay/anyhow)), now that we have a parse tree, we can inspect it:
296
297```rust
298println!("{}", parse_tree.root_node().to_sexp());
299```
300
301Or better yet, run a query on it:
302
303```rust
304use tree_sitter::Query;
305
306let query = Query::new(
307 rust_lang,
308 r#"
309 ((call_expression
310 function: (_) @fn-name
311 arguments: (arguments (string_literal))) @raise
312 (#match? @fn-name "(std::|)env::(var|remove_var)"))
313 "#
314)
315.unwrap();
316```
317
318A `QueryCursor` is tree-sitter's way of maintaining state as
319we iterate through the matches or captures produced by
320running a query on the parse tree. Observe:
321
322```rust
323use tree_sitter::QueryCursor;
324
325let mut query_cursor = QueryCursor::new();
326let all_matches = query_cursor.matches(
327 &query,
328 parse_tree.root_node(),
329 src.as_bytes(),
330);
331```
332
333We begin by passing our query to the cursor, followed by the
334"root node", which is another way of saying, "start from the
335top", and lastly, the source itself. If you have already
336taken a look at the C API, you will notice that the last
337argument, the source (known as the `TextProvider`), is not
338required. The Rust bindings seem to require this argument to
339provide predicate functionality such as `#match?` and
340`#eq?`.
341
342Do something with the matches:
343
344```rust
345// get the index of the capture named "raise"
346let raise_idx = query.capture_index_for_name("raise").unwrap();
347
348for each_match in all_matches {
349 // iterate over all captures called "raise"
350 // ignore captures such as "fn-name"
351 for capture in each_match
352 .captures
353 .iter()
354 .filter(|c| c.idx == raise_idx)
355 {
356 let range = capture.node.range();
357 let text = &src[range.start_byte..range.end_byte];
358 let line = range.start_point.row;
359 let col = range.start_point.column;
360 println!(
361 "[Line: {}, Col: {}] Offending source code: `{}`",
362 line, col, text
363 );
364 }
365}
366```
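
Note that the `row` and `column` of a tree-sitter point are zero-based, so the numbers printed by this loop are one less than what most editors show. A small helper can convert them for display. This is a sketch; `display_position` is a name made up here and not a tree-sitter API.

```rust
/// Converts a zero-based (row, column) pair, such as the one read from
/// `range.start_point`, into the one-based "line:col" form that editors
/// and compilers usually print.
fn display_position(row: usize, column: usize) -> String {
    format!("{}:{}", row + 1, column + 1)
}
```

For example, a capture at row 40, column 4 would be displayed as `41:5`.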
367
368Lastly, add the following line to your source code, to get
369the linter to catch something:
370
371```rust
372env::remove_var("RUST_BACKTRACE");
373```
374
375And `cargo run`:
376
377```shell
378λ cargo run
379 Compiling toy-lint v0.1.0 (/redacted/path/to/toy-lint)
380 Finished dev [unoptimized + debuginfo] target(s) in 0.74s
381 Running `target/debug/toy-lint`
382[Line: 40, Col: 4] Offending source code: `env::remove_var("RUST_BACKTRACE")`
383```
384
385Thank you tree-sitter!
386
387### Bonus
388
389Keen readers will notice that I avoided `std::env::set_var`.
390Unlike `env::var` and `env::remove_var`, `set_var` is called
391with two arguments, a "key" and a "value". As a result, it
392requires more juggling:
393
394```scheme
395((call_expression
396 function: (_) @fn-name
397 arguments: (arguments . (string_literal)? . (string_literal) .)) @raise
398 (#match? @fn-name "(std::|)env::(var|remove_var|set_var)"))
399```
400
401The interesting part of this query is the humble `.`, the
402_anchor_ operator. Anchors help constrain child nodes in
403certain ways. In this case, they ensure that we match exactly
404two `string_literal`s that are siblings or exactly one
405`string_literal` with no siblings. Unfortunately, this query
406also matches the following invalid Rust code:
407
408```rust
409// remove_var accepts only 1 arg!
410std::env::remove_var("RUST_BACKTRACE", "1");
411```
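
One way to close this gap, sketched below under the assumption that the Rust glue knows the function path and the number of string arguments for each match, is to check the arity after the query has matched. `expected_arity` is a hypothetical helper, not something tree-sitter provides.

```rust
/// Returns how many arguments the named `std::env` function takes,
/// or `None` if the path does not end in a function we lint.
fn expected_arity(fn_path: &str) -> Option<usize> {
    // Look only at the last path segment, so that both `env::var`
    // and `std::env::var` are handled.
    match fn_path.rsplit("::").next()? {
        "var" | "remove_var" => Some(1),
        "set_var" => Some(2),
        _ => None,
    }
}
```

A match would then be reported only when the number of captured string arguments equals `expected_arity(path)`.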
412
413### Notes
414
415All in all, the query DSL does a great job of lowering the
416bar to writing language tools. The knowledge gained from
417mastering the query DSL can be applied to other languages
418that have tree-sitter grammars too. This query
419detects `to_json` methods that do not accept additional
420arguments, in Ruby:
421
422```scheme
423((method
424 name: (identifier) @fn
425 !parameters)
426 (#is? @fn "to_json"))
427```