Diffstat (limited to 'docs/dev/architecture.md')
-rw-r--r-- | docs/dev/architecture.md | 79 |
1 files changed, 41 insertions, 38 deletions
diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md index 629645757..9675ed0b6 100644 --- a/docs/dev/architecture.md +++ b/docs/dev/architecture.md | |||
@@ -12,6 +12,9 @@ analyzer: | |||
12 | 12 | ||
13 | https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE | 13 | https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE |
14 | 14 | ||
15 | Note that the guide and videos are pretty dated; this document should be | ||
16 | generally fresher. | ||
17 | |||
15 | ## The Big Picture | 18 | ## The Big Picture |
16 | 19 | ||
17 | ![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png) | 20 | ![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png) |
@@ -20,13 +23,12 @@ On the highest level, rust-analyzer is a thing which accepts input source code | |||
20 | from the client and produces a structured semantic model of the code. | 23 | from the client and produces a structured semantic model of the code. |
21 | 24 | ||
22 | More specifically, input data consists of a set of text files (`(PathBuf, | 25 | More specifically, input data consists of a set of text files (`(PathBuf, |
23 | String)` pairs) and information about project structure, captured in the so called | 26 | String)` pairs) and information about project structure, captured in the so |
24 | `CrateGraph`. The crate graph specifies which files are crate roots, which cfg | 27 | called `CrateGraph`. The crate graph specifies which files are crate roots, |
25 | flags are specified for each crate (TODO: actually implement this) and what | 28 | which cfg flags are specified for each crate and what dependencies exist between |
26 | dependencies exist between the crates. The analyzer keeps all this input data in | 29 | the crates. The analyzer keeps all this input data in memory and never does any |
27 | memory and never does any IO. Because the input data is source code, which | 30 | IO. Because the input data are source code, which typically measures in tens of |
28 | typically measures in tens of megabytes at most, keeping all input data in | 31 | megabytes at most, keeping everything in memory is OK. |
29 | memory is OK. | ||
30 | 32 | ||
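To make the description of the crate graph above more concrete, here is a minimal sketch of such a structure. All names in it (`CrateId`, `CrateData`, `add_crate`, `add_dep`, and so on) are assumptions made up for illustration; the real `CrateGraph` in rust-analyzer may look quite different.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical identifier of a crate inside the graph.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct CrateId(pub u32);

/// Hypothetical per-crate data: the root file, cfg flags and dependencies.
#[derive(Debug, Default)]
pub struct CrateData {
    pub root_file: PathBuf,
    pub cfg_flags: Vec<String>,
    pub dependencies: Vec<CrateId>,
}

/// Illustrative crate graph; it lives purely in memory and does no IO.
#[derive(Debug, Default)]
pub struct CrateGraph {
    crates: HashMap<CrateId, CrateData>,
}

impl CrateGraph {
    /// Registers a crate root together with its cfg flags and returns its id.
    pub fn add_crate(&mut self, root_file: PathBuf, cfg_flags: Vec<String>) -> CrateId {
        let id = CrateId(self.crates.len() as u32);
        let data = CrateData { root_file, cfg_flags, dependencies: Vec::new() };
        self.crates.insert(id, data);
        id
    }

    /// Records that crate `from` depends on crate `to`.
    /// Unknown `from` ids are silently ignored in this sketch.
    pub fn add_dep(&mut self, from: CrateId, to: CrateId) {
        if let Some(data) = self.crates.get_mut(&from) {
            data.dependencies.push(to);
        }
    }

    /// Returns the data stored for a crate, if the id is known.
    pub fn crate_data(&self, id: CrateId) -> Option<&CrateData> {
        self.crates.get(&id)
    }

    /// Returns the direct dependencies of a crate (empty for unknown ids).
    pub fn dependencies(&self, id: CrateId) -> &[CrateId] {
        self.crates
            .get(&id)
            .map(|data| data.dependencies.as_slice())
            .unwrap_or(&[])
    }
}
```

The point of the sketch is only that the graph is plain data: the client fills it in, and the analyzer reads it without touching the file system.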
31 | A "structured semantic model" is basically an object-oriented representation of | 33 | A "structured semantic model" is basically an object-oriented representation of |
32 | modules, functions and types which appear in the source code. This representation | 34 | modules, functions and types which appear in the source code. This representation |
@@ -43,37 +45,39 @@ can be quickly updated for small modifications. | |||
43 | ## Code generation | 45 | ## Code generation |
44 | 46 | ||
45 | Some of the components of this repository are generated through automatic | 47 | Some of the components of this repository are generated through automatic |
46 | processes. These are outlined below: | 48 | processes. `cargo xtask codegen` runs all generation tasks. Generated code is |
49 | committed to the git repository. | ||
50 | |||
51 | In particular, `cargo xtask codegen` generates: | ||
52 | |||
53 | 1. [`syntax_kind/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_parser/src/syntax_kind/generated.rs) | ||
54 | -- the set of terminals and non-terminals of rust grammar. | ||
47 | 55 | ||
48 | - `cargo xtask codegen`: The kinds of tokens that are reused in several places, so a generator | 56 | 2. [`ast/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/src/ast/generated.rs) |
49 | is used. We use `quote!` macro to generate the files listed below, based on | 57 | -- AST data structure. |
50 | the grammar described in [grammar.ron]: | ||
51 | - [ast/generated.rs][ast generated] | ||
52 | - [syntax_kind/generated.rs][syntax_kind generated] | ||
53 | 58 | ||
54 | [grammar.ron]: ../../crates/ra_syntax/src/grammar.ron | 59 | 3. [`doc_tests/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_assists/src/doc_tests/generated.rs), |
55 | [ast generated]: ../../crates/ra_syntax/src/ast/generated.rs | 60 | [`test_data/parser/inline`](https://github.com/rust-analyzer/rust-analyzer/tree/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/test_data/parser/inline) |
56 | [syntax_kind generated]: ../../crates/ra_parser/src/syntax_kind/generated.rs | 61 | -- tests for assists and the parser. |
62 | |||
63 | The source for 1 and 2 is in [`ast_src.rs`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/xtask/src/ast_src.rs). | ||
57 | 64 | ||
58 | ## Code Walk-Through | 65 | ## Code Walk-Through |
59 | 66 | ||
60 | ### `crates/ra_syntax`, `crates/ra_parser` | 67 | ### `crates/ra_syntax`, `crates/ra_parser` |
61 | 68 | ||
62 | Rust syntax tree structure and parser. See | 69 | Rust syntax tree structure and parser. See |
63 | [RFC](https://github.com/rust-lang/rfcs/pull/2256) for some design notes. | 70 | [RFC](https://github.com/rust-lang/rfcs/pull/2256) and [./syntax.md](./syntax.md) for some design notes. |
64 | 71 | ||
65 | - [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees. | 72 | - [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees. |
66 | - `grammar` module is the actual parser. It is a hand-written recursive descent parser, which | 73 | - `grammar` module is the actual parser. It is a hand-written recursive descent parser, which |
67 | produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), | 74 | produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), |
68 | which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs) | 75 | which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs) |
69 | is what we use for the definition of the Rust language. | 76 | is what we use for the definition of the Rust language. |
70 | - `parser_api/parser_impl` bridges the tree-agnostic parser from `grammar` with `rowan` trees. | 77 | - `TreeSink` and `TokenSource` traits bridge the tree-agnostic parser from `grammar` with `rowan` trees. |
71 | This is the thing that turns a flat list of events into a tree (see `EventProcessor`) | ||
72 | - `ast` provides a type safe API on top of the raw `rowan` tree. | 78 | - `ast` provides a type safe API on top of the raw `rowan` tree. |
73 | - `grammar.ron` RON description of the grammar, which is used to | 79 | - `ast_src` description of the grammar, which is used to generate `syntax_kinds` |
74 | generate `syntax_kinds` and `ast` modules, using `cargo xtask codegen` command. | 80 | and `ast` modules, using `cargo xtask codegen` command. |
75 | - `algo`: generic tree algorithms, including `walk` for O(1) stack | ||
76 | space tree traversal (this is cool). | ||
77 | 81 | ||
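The list above says the parser emits a flat sequence of events such as "start node X" and "finish node Y", which is then turned into a tree. A minimal sketch of that idea, using a stack, might look as follows. The `Event` and `Tree` types and the `build_tree` function are hypothetical and are not the actual `TreeSink`/`TokenSource` API or `rowan` types.

```rust
/// Hypothetical parser event; node kinds and token texts are plain strings here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    Start(&'static str),
    Token(&'static str),
    Finish,
}

/// Hypothetical tree produced from the events.
#[derive(Debug, PartialEq, Eq)]
pub enum Tree {
    Node { kind: &'static str, children: Vec<Tree> },
    Token(&'static str),
}

/// Turns a flat list of events into a tree.
/// Returns `None` if the events are unbalanced or there is not exactly one root.
pub fn build_tree(events: &[Event]) -> Option<Tree> {
    // Each stack entry is a node that has been started but not yet finished.
    let mut stack: Vec<(&'static str, Vec<Tree>)> = Vec::new();
    let mut root = None;

    for event in events {
        match event {
            Event::Start(kind) => stack.push((*kind, Vec::new())),
            // A token must always be attached to some open node.
            Event::Token(text) => stack.last_mut()?.1.push(Tree::Token(*text)),
            Event::Finish => {
                let (kind, children) = stack.pop()?;
                let node = Tree::Node { kind, children };
                match stack.last_mut() {
                    // The finished node becomes a child of its parent...
                    Some(parent) => parent.1.push(node),
                    // ...or the root, if no node is open any more.
                    None => {
                        if root.is_some() {
                            return None;
                        }
                        root = Some(node);
                    }
                }
            }
        }
    }

    if stack.is_empty() { root } else { None }
}
```

Keeping the parser limited to emitting events is what makes it tree-agnostic: the same event stream could, in principle, be fed to a different tree builder.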
78 | Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs` | 82 | Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs` |
79 | (test vectors) and `.txt` files with corresponding syntax trees. During testing, we check | 83 | (test vectors) and `.txt` files with corresponding syntax trees. During testing, we check |
@@ -81,6 +85,10 @@ Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirec | |||
81 | tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect | 85 | tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect |
82 | all `// test test_name` comments into files inside `test_data/parser/inline` directory. | 86 | all `// test test_name` comments into files inside `test_data/parser/inline` directory. |
83 | 87 | ||
88 | Note | ||
89 | [`api_walkthrough`](https://github.com/rust-analyzer/rust-analyzer/blob/2fb6af89eb794f775de60b82afe56b6f986c2a40/crates/ra_syntax/src/lib.rs#L190-L348) | ||
90 | in particular: it shows off various methods of working with the syntax tree. | ||
91 | |||
84 | See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which | 92 | See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which |
85 | fixes a bug in the grammar. | 93 | fixes a bug in the grammar. |
86 | 94 | ||
@@ -94,18 +102,22 @@ defines most of the "input" queries: facts supplied by the client of the | |||
94 | analyzer. Reading the docs of the `ra_db::input` module should be useful: | 102 | analyzer. Reading the docs of the `ra_db::input` module should be useful: |
95 | everything else is strictly derived from those inputs. | 103 | everything else is strictly derived from those inputs. |
96 | 104 | ||
97 | ### `crates/ra_hir` | 105 | ### `crates/ra_hir*` crates |
98 | 106 | ||
99 | HIR provides high-level "object oriented" access to Rust code. | 107 | HIR provides high-level "object oriented" access to Rust code. |
100 | 108 | ||
101 | The principal difference between HIR and syntax trees is that HIR is bound to a | 109 | The principal difference between HIR and syntax trees is that HIR is bound to a |
102 | particular crate instance. That is, it has cfg flags and features applied (in | 110 | particular crate instance. That is, it has cfg flags and features applied. So, |
103 | theory, in practice this is to be implemented). So, the relation between | 111 | the relation between syntax and HIR is many-to-one. The `source_binder` module |
104 | syntax and HIR is many-to-one. The `source_binder` module is responsible for | 112 | is responsible for guessing a HIR for a particular source position. |
105 | guessing a HIR for a particular source position. | ||
106 | 113 | ||
107 | Underneath, HIR works on top of salsa, using a `HirDatabase` trait. | 114 | Underneath, HIR works on top of salsa, using a `HirDatabase` trait. |
108 | 115 | ||
116 | `ra_hir_xxx` crates have a strong ECS flavor, in that they work with raw ids and | ||
117 | directly query the database. | ||
118 | |||
119 | The top-level `ra_hir` façade crate wraps ids into a more OO-flavored API. | ||
120 | |||
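The contrast between the id-based lower layers and the OO-flavored façade can be sketched roughly as below. Everything here (`Database`, `FunctionId`, `FunctionData`, `Function`) is a hypothetical stand-in and not the real `ra_hir` API; in particular, the real code goes through salsa queries rather than a plain `Vec`.

```rust
/// Hypothetical raw id, in the spirit of the lower-level `ra_hir_xxx` crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FunctionId(pub usize);

/// Hypothetical data stored per function.
#[derive(Debug)]
pub struct FunctionData {
    pub name: String,
    pub param_count: usize,
}

/// Stand-in for the database; a plain arena instead of salsa queries.
#[derive(Debug, Default)]
pub struct Database {
    functions: Vec<FunctionData>,
}

impl Database {
    /// Stores the data and hands back a raw id.
    pub fn alloc_function(&mut self, data: FunctionData) -> FunctionId {
        let id = FunctionId(self.functions.len());
        self.functions.push(data);
        id
    }

    /// ECS-style access: pass a raw id, get the data back.
    /// Panics on an id that was not allocated by this database.
    pub fn function_data(&self, id: FunctionId) -> &FunctionData {
        &self.functions[id.0]
    }
}

/// Facade-style wrapper: a small `Copy` handle with methods on it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Function {
    id: FunctionId,
}

impl From<FunctionId> for Function {
    fn from(id: FunctionId) -> Function {
        Function { id }
    }
}

impl Function {
    /// Looks the name up in the database on demand.
    pub fn name<'db>(self, db: &'db Database) -> &'db str {
        &db.function_data(self.id).name
    }

    /// Looks the parameter count up in the database on demand.
    pub fn param_count(self, db: &Database) -> usize {
        db.function_data(self.id).param_count
    }
}
```

In this sketch the wrapper holds no data of its own: every method takes the database and performs the lookup, so the handle stays cheap to copy and compare.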
109 | ### `crates/ra_ide` | 121 | ### `crates/ra_ide` |
110 | 122 | ||
111 | A stateful library for analyzing many Rust files as they change. `AnalysisHost` | 123 | A stateful library for analyzing many Rust files as they change. `AnalysisHost` |
@@ -135,18 +147,9 @@ different from data on disk. This is more or less the single really | |||
135 | platform-dependent component, so it lives in a separate repository and has an | 147 | platform-dependent component, so it lives in a separate repository and has an |
136 | extensive cross-platform CI testing. | 148 | extensive cross-platform CI testing. |
137 | 149 | ||
138 | ### `crates/gen_lsp_server` | ||
139 | |||
140 | A language server scaffold, exposing a synchronous crossbeam-channel based API. | ||
141 | This crate handles protocol handshaking and parsing messages, while you | ||
142 | control the message dispatch loop yourself. | ||
143 | |||
144 | Run with `RUST_LOG=sync_lsp_server=debug` to see all the messages. | ||
145 | |||
146 | ### `crates/ra_cli` | 150 | ### `crates/ra_cli` |
147 | 151 | ||
148 | A CLI interface to rust-analyzer. | 152 | A CLI interface to rust-analyzer, mainly for testing. |
149 | |||
150 | 153 | ||
151 | ## Testing Infrastructure | 154 | ## Testing Infrastructure |
152 | 155 | ||