aboutsummaryrefslogtreecommitdiff
path: root/docs/dev
diff options
context:
space:
mode:
Diffstat (limited to 'docs/dev')
-rw-r--r--docs/dev/architecture.md4
-rw-r--r--docs/dev/guide.md8
-rw-r--r--docs/dev/syntax.md42
3 files changed, 27 insertions, 27 deletions
diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md
index 3a337c574..cee916c09 100644
--- a/docs/dev/architecture.md
+++ b/docs/dev/architecture.md
@@ -46,7 +46,7 @@ can be quickly updated for small modifications.
46 46
47Some of the components of this repository are generated through automatic 47Some of the components of this repository are generated through automatic
48processes. `cargo xtask codegen` runs all generation tasks. Generated code is 48processes. `cargo xtask codegen` runs all generation tasks. Generated code is
49commited to the git repository. 49committed to the git repository.
50 50
51In particular, `cargo xtask codegen` generates: 51In particular, `cargo xtask codegen` generates:
52 52
@@ -114,7 +114,7 @@ is responsible for guessing a HIR for a particular source position.
114Underneath, HIR works on top of salsa, using a `HirDatabase` trait. 114Underneath, HIR works on top of salsa, using a `HirDatabase` trait.
115 115
116`ra_hir_xxx` crates have a strong ECS flavor, in that they work with raw ids and 116`ra_hir_xxx` crates have a strong ECS flavor, in that they work with raw ids and
117directly query the databse. 117directly query the database.
118 118
119The top-level `ra_hir` façade crate wraps ids into a more OO-flavored API. 119The top-level `ra_hir` façade crate wraps ids into a more OO-flavored API.
120 120
diff --git a/docs/dev/guide.md b/docs/dev/guide.md
index abbe4c154..c3252f1f6 100644
--- a/docs/dev/guide.md
+++ b/docs/dev/guide.md
@@ -26,7 +26,7 @@ properties hold:
26 26
27## IDE API 27## IDE API
28 28
29To see the bigger picture of how the IDE features works, let's take a look at the [`AnalysisHost`] and 29To see the bigger picture of how the IDE features work, let's take a look at the [`AnalysisHost`] and
30[`Analysis`] pair of types. `AnalysisHost` has three methods: 30[`Analysis`] pair of types. `AnalysisHost` has three methods:
31 31
32* `default()` for creating an empty analysis instance 32* `default()` for creating an empty analysis instance
@@ -131,7 +131,7 @@ mapping between `SourceRoot` IDs (which are assigned by the client) and actual
131analyzer. 131analyzer.
132 132
133Note that `mod`, `#[path]` and `include!()` can only reference files from the 133Note that `mod`, `#[path]` and `include!()` can only reference files from the
134same source root. It is of course is possible to explicitly add extra files to 134same source root. It is of course possible to explicitly add extra files to
135the source root, even `/dev/random`. 135the source root, even `/dev/random`.
136 136
137## Language Server Protocol 137## Language Server Protocol
@@ -192,7 +192,7 @@ task will be canceled as soon as the main loop calls `apply_change` on the
192[`schedule`]: https://github.com/rust-analyzer/rust-analyzer/blob/guide-2019-01/crates/ra_lsp_server/src/main_loop.rs#L426-L455 192[`schedule`]: https://github.com/rust-analyzer/rust-analyzer/blob/guide-2019-01/crates/ra_lsp_server/src/main_loop.rs#L426-L455
193[The task]: https://github.com/rust-analyzer/rust-analyzer/blob/guide-2019-01/crates/ra_lsp_server/src/main_loop/handlers.rs#L205-L223 193[The task]: https://github.com/rust-analyzer/rust-analyzer/blob/guide-2019-01/crates/ra_lsp_server/src/main_loop/handlers.rs#L205-L223
194 194
195This concludes the overview of the analyzer's programing *interface*. Next, lets 195This concludes the overview of the analyzer's programing *interface*. Next, let's
196dig into the implementation! 196dig into the implementation!
197 197
198## Salsa 198## Salsa
@@ -480,7 +480,7 @@ throughout the analyzer:
480## Source Map pattern 480## Source Map pattern
481 481
482Due to an obscure edge case in completion, IDE needs to know the syntax node of 482Due to an obscure edge case in completion, IDE needs to know the syntax node of
483an use statement which imported the given completion candidate. We can't just 483a use statement which imported the given completion candidate. We can't just
484store the syntax node as a part of name resolution: this will break 484store the syntax node as a part of name resolution: this will break
485incrementality, due to the fact that syntax changes after every file 485incrementality, due to the fact that syntax changes after every file
486modification. 486modification.
diff --git a/docs/dev/syntax.md b/docs/dev/syntax.md
index 33973ffec..c2864bbbc 100644
--- a/docs/dev/syntax.md
+++ b/docs/dev/syntax.md
@@ -64,7 +64,7 @@ struct Token {
64} 64}
65``` 65```
66 66
67All the difference bettwen the above sketch and the real implementation are strictly due to optimizations. 67All the difference between the above sketch and the real implementation are strictly due to optimizations.
68 68
69Points of note: 69Points of note:
70* The tree is untyped. Each node has a "type tag", `SyntaxKind`. 70* The tree is untyped. Each node has a "type tag", `SyntaxKind`.
@@ -72,7 +72,7 @@ Points of note:
72* Trivia and non-trivia tokens are not distinguished on the type level. 72* Trivia and non-trivia tokens are not distinguished on the type level.
73* Each token carries its full text. 73* Each token carries its full text.
74* The original text can be recovered by concatenating the texts of all tokens in order. 74* The original text can be recovered by concatenating the texts of all tokens in order.
75* Accessing a child of particular type (for example, parameter list of a function) generarly involves linerary traversing the children, looking for a specific `kind`. 75* Accessing a child of particular type (for example, parameter list of a function) generally involves linerary traversing the children, looking for a specific `kind`.
76* Modifying the tree is roughly `O(depth)`. 76* Modifying the tree is roughly `O(depth)`.
77 We don't make special efforts to guarantree that the depth is not liner, but, in practice, syntax trees are branchy and shallow. 77 We don't make special efforts to guarantree that the depth is not liner, but, in practice, syntax trees are branchy and shallow.
78* If mandatory (grammar wise) node is missing from the input, it's just missing from the tree. 78* If mandatory (grammar wise) node is missing from the input, it's just missing from the tree.
@@ -123,7 +123,7 @@ To more compactly store the children, we box *both* interior nodes and tokens, a
123`Either<Arc<Node>, Arc<Token>>` as a single pointer with a tag in the last bit. 123`Either<Arc<Node>, Arc<Token>>` as a single pointer with a tag in the last bit.
124 124
125To avoid allocating EVERY SINGLE TOKEN on the heap, syntax trees use interning. 125To avoid allocating EVERY SINGLE TOKEN on the heap, syntax trees use interning.
126Because the tree is fully imutable, it's valid to structuraly share subtrees. 126Because the tree is fully immutable, it's valid to structurally share subtrees.
127For example, in `1 + 1`, there will be a *single* token for `1` with ref count 2; the same goes for the ` ` whitespace token. 127For example, in `1 + 1`, there will be a *single* token for `1` with ref count 2; the same goes for the ` ` whitespace token.
128Interior nodes are shared as well (for example in `(1 + 1) * (1 + 1)`). 128Interior nodes are shared as well (for example in `(1 + 1) * (1 + 1)`).
129 129
@@ -134,8 +134,8 @@ Currently, the interner is created per-file, but it will be easy to use a per-th
134 134
135We use a `TextSize`, a newtyped `u32`, to store the length of the text. 135We use a `TextSize`, a newtyped `u32`, to store the length of the text.
136 136
137We currently use `SmolStr`, an small object optimized string to store text. 137We currently use `SmolStr`, a small object optimized string to store text.
138This was mostly relevant *before* we implmented tree interning, to avoid allocating common keywords and identifiers. We should switch to storing text data alongside the interned tokens. 138This was mostly relevant *before* we implemented tree interning, to avoid allocating common keywords and identifiers. We should switch to storing text data alongside the interned tokens.
139 139
140#### Alternative designs 140#### Alternative designs
141 141
@@ -162,12 +162,12 @@ Explicit trivia nodes, like in `rowan`, are used by IntelliJ.
162 162
163##### Accessing Children 163##### Accessing Children
164 164
165As noted before, accesing a specific child in the node requires a linear traversal of the children (though we can skip tokens, beacuse the tag is encoded in the pointer itself). 165As noted before, accessing a specific child in the node requires a linear traversal of the children (though we can skip tokens, because the tag is encoded in the pointer itself).
166It is possible to recover O(1) access with another representation. 166It is possible to recover O(1) access with another representation.
167We explicitly store optional and missing (required by the grammar, but not present) nodes. 167We explicitly store optional and missing (required by the grammar, but not present) nodes.
168That is, we use `Option<Node>` for children. 168That is, we use `Option<Node>` for children.
169We also remove trivia tokens from the tree. 169We also remove trivia tokens from the tree.
170This way, each child kind genrerally occupies a fixed position in a parent, and we can use index access to fetch it. 170This way, each child kind generally occupies a fixed position in a parent, and we can use index access to fetch it.
171The cost is that we now need to allocate space for all not-present optional nodes. 171The cost is that we now need to allocate space for all not-present optional nodes.
172So, `fn foo() {}` will have slots for visibility, unsafeness, attributes, abi and return type. 172So, `fn foo() {}` will have slots for visibility, unsafeness, attributes, abi and return type.
173 173
@@ -193,7 +193,7 @@ Modeling this with immutable trees is possible, but annoying.
193### Syntax Nodes 193### Syntax Nodes
194 194
195A function green tree is not super-convenient to use. 195A function green tree is not super-convenient to use.
196The biggest problem is acessing parents (there are no parent pointers!). 196The biggest problem is accessing parents (there are no parent pointers!).
197But there are also "identify" issues. 197But there are also "identify" issues.
198Let's say you want to write a code which builds a list of expressions in a file: `fn collect_exrepssions(file: GreenNode) -> HashSet<GreenNode>`. 198Let's say you want to write a code which builds a list of expressions in a file: `fn collect_exrepssions(file: GreenNode) -> HashSet<GreenNode>`.
199For the input like 199For the input like
@@ -207,7 +207,7 @@ fn main() {
207} 207}
208``` 208```
209 209
210both copies of the `x + 2` expression are representing by equal (and, with interning in mind, actualy the same) green nodes. 210both copies of the `x + 2` expression are representing by equal (and, with interning in mind, actually the same) green nodes.
211Green trees just can't differentiate between the two. 211Green trees just can't differentiate between the two.
212 212
213`SyntaxNode` adds parent pointers and identify semantics to green nodes. 213`SyntaxNode` adds parent pointers and identify semantics to green nodes.
@@ -285,9 +285,9 @@ They also point to the parent (and, consequently, to the root) with an owning `R
285In other words, one needs *one* arc bump when initiating a traversal. 285In other words, one needs *one* arc bump when initiating a traversal.
286 286
287To get rid of allocations, `rowan` takes advantage of `SyntaxNode: !Sync` and uses a thread-local free list of `SyntaxNode`s. 287To get rid of allocations, `rowan` takes advantage of `SyntaxNode: !Sync` and uses a thread-local free list of `SyntaxNode`s.
288In a typical traversal, you only directly hold a few `SyntaxNode`s at a time (and their ancesstors indirectly), so a free list proportional to the depth of the tree removes all allocations in a typical case. 288In a typical traversal, you only directly hold a few `SyntaxNode`s at a time (and their ancestors indirectly), so a free list proportional to the depth of the tree removes all allocations in a typical case.
289 289
290So, while traversal is not exactly incrementing a pointer, it's still prety cheep: tls + rc bump! 290So, while traversal is not exactly incrementing a pointer, it's still pretty cheap: TLS + rc bump!
291 291
292Traversal also yields (cheap) owned nodes, which improves ergonomics quite a bit. 292Traversal also yields (cheap) owned nodes, which improves ergonomics quite a bit.
293 293
@@ -308,15 +308,15 @@ struct SyntaxData {
308} 308}
309``` 309```
310 310
311This allows using true pointer equality for comparision of identities of `SyntaxNodes`. 311This allows using true pointer equality for comparison of identities of `SyntaxNodes`.
312rust-analyzer used to have this design as well, but since we've switch to cursors. 312rust-analyzer used to have this design as well, but we've since switched to cursors.
313The main problem with memoizing the red nodes is that it more than doubles the memory requirenments for fully realized syntax trees. 313The main problem with memoizing the red nodes is that it more than doubles the memory requirements for fully realized syntax trees.
314In contrast, cursors generally retain only a path to the root. 314In contrast, cursors generally retain only a path to the root.
315C# combats increased memory usage by using weak references. 315C# combats increased memory usage by using weak references.
316 316
317### AST 317### AST
318 318
319`GreenTree`s are untyped and homogeneous, because it makes accomodating error nodes, arbitrary whitespace and comments natural, and because it makes possible to write generic tree traversals. 319`GreenTree`s are untyped and homogeneous, because it makes accommodating error nodes, arbitrary whitespace and comments natural, and because it makes possible to write generic tree traversals.
320However, when working with a specific node, like a function definition, one would want a strongly typed API. 320However, when working with a specific node, like a function definition, one would want a strongly typed API.
321 321
322This is what is provided by the AST layer. AST nodes are transparent wrappers over untyped syntax nodes: 322This is what is provided by the AST layer. AST nodes are transparent wrappers over untyped syntax nodes:
@@ -397,7 +397,7 @@ impl HasVisbility for FnDef {
397Points of note: 397Points of note:
398 398
399* Like `SyntaxNode`s, AST nodes are cheap to clone pointer-sized owned values. 399* Like `SyntaxNode`s, AST nodes are cheap to clone pointer-sized owned values.
400* All "fields" are optional, to accomodate incomplete and/or erroneous source code. 400* All "fields" are optional, to accommodate incomplete and/or erroneous source code.
401* It's always possible to go from an ast node to an untyped `SyntaxNode`. 401* It's always possible to go from an ast node to an untyped `SyntaxNode`.
402* It's possible to go in the opposite direction with a checked cast. 402* It's possible to go in the opposite direction with a checked cast.
403* `enum`s allow modeling of arbitrary intersecting subsets of AST types. 403* `enum`s allow modeling of arbitrary intersecting subsets of AST types.
@@ -437,13 +437,13 @@ impl GreenNodeBuilder {
437} 437}
438``` 438```
439 439
440The parser, ultimatelly, needs to invoke the `GreenNodeBuilder`. 440The parser, ultimately, needs to invoke the `GreenNodeBuilder`.
441There are two principal sources of inputs for the parser: 441There are two principal sources of inputs for the parser:
442 * source text, which contains trivia tokens (whitespace and comments) 442 * source text, which contains trivia tokens (whitespace and comments)
443 * token trees from macros, which lack trivia 443 * token trees from macros, which lack trivia
444 444
445Additionaly, input tokens do not correspond 1-to-1 with output tokens. 445Additionally, input tokens do not correspond 1-to-1 with output tokens.
446For example, two consequtive `>` tokens might be glued, by the parser, into a single `>>`. 446For example, two consecutive `>` tokens might be glued, by the parser, into a single `>>`.
447 447
448For these reasons, the parser crate defines a callback interfaces for both input tokens and output trees. 448For these reasons, the parser crate defines a callback interfaces for both input tokens and output trees.
449The explicit glue layer then bridges various gaps. 449The explicit glue layer then bridges various gaps.
@@ -491,7 +491,7 @@ Syntax errors are not stored directly in the tree.
491The primary motivation for this is that syntax tree is not necessary produced by the parser, it may also be assembled manually from pieces (which happens all the time in refactorings). 491The primary motivation for this is that syntax tree is not necessary produced by the parser, it may also be assembled manually from pieces (which happens all the time in refactorings).
492Instead, parser reports errors to an error sink, which stores them in a `Vec`. 492Instead, parser reports errors to an error sink, which stores them in a `Vec`.
493If possible, errors are not reported during parsing and are postponed for a separate validation step. 493If possible, errors are not reported during parsing and are postponed for a separate validation step.
494For example, parser accepts visibility modifiers on trait methods, but then a separate tree traversal flags all such visibilites as erroneous. 494For example, parser accepts visibility modifiers on trait methods, but then a separate tree traversal flags all such visibilities as erroneous.
495 495
496### Macros 496### Macros
497 497
@@ -501,7 +501,7 @@ Specifically, `TreeSink` constructs the tree in lockstep with draining the origi
501In the process, it records which tokens of the tree correspond to which tokens of the input, by using text ranges to identify syntax tokens. 501In the process, it records which tokens of the tree correspond to which tokens of the input, by using text ranges to identify syntax tokens.
502The end result is that parsing an expanded code yields a syntax tree and a mapping of text-ranges of the tree to original tokens. 502The end result is that parsing an expanded code yields a syntax tree and a mapping of text-ranges of the tree to original tokens.
503 503
504To deal with precedence in cases like `$expr * 1`, we use special invisible parenthesis, which are explicitelly handled by the parser 504To deal with precedence in cases like `$expr * 1`, we use special invisible parenthesis, which are explicitly handled by the parser
505 505
506### Whitespace & Comments 506### Whitespace & Comments
507 507