path: root/docs/dev/architecture.md
Diffstat (limited to 'docs/dev/architecture.md')
-rw-r--r-- docs/dev/architecture.md 79
1 files changed, 41 insertions, 38 deletions
diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md
index 629645757..9675ed0b6 100644
--- a/docs/dev/architecture.md
+++ b/docs/dev/architecture.md
@@ -12,6 +12,9 @@ analyzer:
12 12
13https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE 13https://www.youtube.com/playlist?list=PL85XCvVPmGQho7MZkdW-wtPtuJcFpzycE
14 14
15Note that the guide and videos are pretty dated; this document should in
16general be fresher.
17
15## The Big Picture 18## The Big Picture
16 19
17![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png) 20![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png)
@@ -20,13 +23,12 @@ On the highest level, rust-analyzer is a thing which accepts input source code
20from the client and produces a structured semantic model of the code. 23from the client and produces a structured semantic model of the code.
21 24
22More specifically, input data consists of a set of text files (`(PathBuf, 25More specifically, input data consists of a set of text files (`(PathBuf,
23String)` pairs) and information about project structure, captured in the so called 26String)` pairs) and information about project structure, captured in the so
24`CrateGraph`. The crate graph specifies which files are crate roots, which cfg 27called `CrateGraph`. The crate graph specifies which files are crate roots,
25flags are specified for each crate (TODO: actually implement this) and what 28which cfg flags are specified for each crate and what dependencies exist between
26dependencies exist between the crates. The analyzer keeps all this input data in 29the crates. The analyzer keeps all this input data in memory and never does any
27memory and never does any IO. Because the input data is source code, which 30IO. Because the input data are source code, which typically measures in tens of
28typically measures in tens of megabytes at most, keeping all input data in 31megabytes at most, keeping everything in memory is OK.
29memory is OK.
30 32
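To make the shape of the input concrete, here is a minimal sketch of an in-memory input store, assuming nothing beyond what the paragraph above says. `Input`, `CrateData`, `FileId`, `CrateId` and all method names are hypothetical names chosen for illustration; they are not the analyzer's actual types.

```rust
use std::path::PathBuf;

/// Hypothetical ids used only in this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FileId(pub usize);
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CrateId(pub usize);

/// Per-crate facts: the root file, cfg flags, and dependencies.
#[derive(Debug)]
pub struct CrateData {
    pub root: FileId,
    pub cfg: Vec<String>,
    pub deps: Vec<CrateId>,
}

/// All input lives in memory; nothing here touches the file system.
#[derive(Debug, Default)]
pub struct Input {
    files: Vec<(PathBuf, String)>,
    crates: Vec<CrateData>,
}

impl Input {
    /// Stores a `(PathBuf, String)` pair and returns its id.
    pub fn add_file(&mut self, path: impl Into<PathBuf>, text: impl Into<String>) -> FileId {
        self.files.push((path.into(), text.into()));
        FileId(self.files.len() - 1)
    }

    /// Registers a crate whose root is `root`, with the given cfg flags.
    pub fn add_crate(&mut self, root: FileId, cfg: &[&str]) -> CrateId {
        let cfg = cfg.iter().map(|flag| flag.to_string()).collect();
        self.crates.push(CrateData { root, cfg, deps: Vec::new() });
        CrateId(self.crates.len() - 1)
    }

    /// Records that crate `from` depends on crate `to`.
    pub fn add_dep(&mut self, from: CrateId, to: CrateId) {
        self.crates[from.0].deps.push(to);
    }

    pub fn file_text(&self, file: FileId) -> &str {
        &self.files[file.0].1
    }

    pub fn deps(&self, krate: CrateId) -> &[CrateId] {
        &self.crates[krate.0].deps
    }

    pub fn is_crate_root(&self, file: FileId) -> bool {
        self.crates.iter().any(|krate| krate.root == file)
    }
}
```

A client would call `add_file` for every source file, `add_crate` for every crate root, and `add_dep` for every edge of the graph; everything else is then derived from this data.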
31A "structured semantic model" is basically an object-oriented representation of 33A "structured semantic model" is basically an object-oriented representation of
32modules, functions and types which appear in the source code. This representation 34modules, functions and types which appear in the source code. This representation
@@ -43,37 +45,39 @@ can be quickly updated for small modifications.
43## Code generation 45## Code generation
44 46
45Some of the components of this repository are generated through automatic 47Some of the components of this repository are generated through automatic
46processes. These are outlined below: 48processes. `cargo xtask codegen` runs all generation tasks. Generated code is
49committed to the git repository.
50
51In particular, `cargo xtask codegen` generates:
52
531. [`syntax_kind/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_parser/src/syntax_kind/generated.rs)
54 -- the set of terminals and non-terminals of rust grammar.
47 55
48- `cargo xtask codegen`: The kinds of tokens that are reused in several places, so a generator 562. [`ast/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/src/ast/generated.rs)
49 is used. We use `quote!` macro to generate the files listed below, based on 57 -- AST data structure.
50 the grammar described in [grammar.ron]:
51 - [ast/generated.rs][ast generated]
52 - [syntax_kind/generated.rs][syntax_kind generated]
53 58
54[grammar.ron]: ../../crates/ra_syntax/src/grammar.ron 59.3 [`doc_tests/generated`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_assists/src/doc_tests/generated.rs),
55[ast generated]: ../../crates/ra_syntax/src/ast/generated.rs 60 [`test_data/parser/inline`](https://github.com/rust-analyzer/rust-analyzer/tree/a0be39296d2925972cacd9fbf8b5fb258fad6947/crates/ra_syntax/test_data/parser/inline)
56[syntax_kind generated]: ../../crates/ra_parser/src/syntax_kind/generated.rs 61 -- tests for assists and the parser.
62
63The source for 1 and 2 is in [`ast_src.rs`](https://github.com/rust-analyzer/rust-analyzer/blob/a0be39296d2925972cacd9fbf8b5fb258fad6947/xtask/src/ast_src.rs).
57 64
58## Code Walk-Through 65## Code Walk-Through
59 66
60### `crates/ra_syntax`, `crates/ra_parser` 67### `crates/ra_syntax`, `crates/ra_parser`
61 68
62Rust syntax tree structure and parser. See 69Rust syntax tree structure and parser. See
63[RFC](https://github.com/rust-lang/rfcs/pull/2256) for some design notes. 70[RFC](https://github.com/rust-lang/rfcs/pull/2256) and [./syntax.md](./syntax.md) for some design notes.
64 71
65- [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees. 72- [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees.
66- `grammar` module is the actual parser. It is a hand-written recursive descent parser, which 73- `grammar` module is the actual parser. It is a hand-written recursive descent parser, which
67 produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java), 74 produces a sequence of events like "start node X", "finish node Y". It works similarly to [kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java),
68 which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs) 75 which is a good source of inspiration for dealing with syntax errors and incomplete input. Original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs)
69 is what we use for the definition of the Rust language. 76 is what we use for the definition of the Rust language.
70- `parser_api/parser_impl` bridges the tree-agnostic parser from `grammar` with `rowan` trees. 77- `TreeSink` and `TokenSource` traits bridge the tree-agnostic parser from `grammar` with `rowan` trees.
71 This is the thing that turns a flat list of events into a tree (see `EventProcessor`)
72- `ast` provides a type safe API on top of the raw `rowan` tree. 78- `ast` provides a type safe API on top of the raw `rowan` tree.
73- `grammar.ron` RON description of the grammar, which is used to 79- `ast_src` description of the grammar, which is used to generate `syntax_kinds`
74 generate `syntax_kinds` and `ast` modules, using `cargo xtask codegen` command. 80 and `ast` modules, using `cargo xtask codegen` command.
75- `algo`: generic tree algorithms, including `walk` for O(1) stack
76 space tree traversal (this is cool).
77 81
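The idea of turning a flat list of parser events into a tree can be sketched as follows. This is a simplified illustration, not the `TreeSink` API or `rowan` itself: `Event`, `Tree` and `build_tree` are hypothetical names, node kinds are plain strings, and the finish event does not carry a kind.

```rust
/// Flat events as a tree-agnostic parser might emit them (simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Start(&'static str),
    Token(&'static str),
    Finish,
}

/// A plain owned tree; a real implementation would build `rowan` nodes.
#[derive(Debug, PartialEq)]
pub enum Tree {
    Node { kind: &'static str, children: Vec<Tree> },
    Token(&'static str),
}

/// Replays events against a stack of unfinished nodes.
/// Returns `None` if the events are unbalanced or yield no single root.
pub fn build_tree(events: &[Event]) -> Option<Tree> {
    let mut stack: Vec<(&'static str, Vec<Tree>)> = Vec::new();
    let mut root = None;
    for event in events {
        match event {
            // Open a new node; its children are collected until `Finish`.
            Event::Start(kind) => stack.push((*kind, Vec::new())),
            // A token always belongs to the innermost open node.
            Event::Token(text) => stack.last_mut()?.1.push(Tree::Token(*text)),
            // Close the innermost node and attach it to its parent.
            Event::Finish => {
                let (kind, children) = stack.pop()?;
                let node = Tree::Node { kind, children };
                match stack.last_mut() {
                    Some(parent) => parent.1.push(node),
                    None if root.is_none() => root = Some(node),
                    None => return None, // a second root is not allowed
                }
            }
        }
    }
    if stack.is_empty() { root } else { None }
}
```

Keeping the parser's output as events means the same grammar code can feed different tree builders, which is the point of the tree-agnostic design described above.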
78Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs` 82Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirectories with a bunch of `.rs`
79(test vectors) and `.txt` files with corresponding syntax trees. During testing, we check 83(test vectors) and `.txt` files with corresponding syntax trees. During testing, we check
@@ -81,6 +85,10 @@ Tests for ra_syntax are mostly data-driven: `test_data/parser` contains subdirec
81tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect 85tests). Additionally, running `cargo xtask codegen` will walk the grammar module and collect
82all `// test test_name` comments into files inside `test_data/parser/inline` directory. 86all `// test test_name` comments into files inside `test_data/parser/inline` directory.
83 87
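As a rough illustration of collecting `// test test_name` comments, the sketch below scans source text and gathers each named test. It assumes, purely for illustration, that the test body is the run of `//` comment lines directly following the marker; `collect_inline_tests` is a hypothetical helper and not the code that `cargo xtask codegen` actually runs.

```rust
/// Returns `(test_name, test_body)` pairs found in `source`.
/// Assumption of this sketch: the body is the block of `//` comment
/// lines that immediately follows a `// test <name>` line.
pub fn collect_inline_tests(source: &str) -> Vec<(String, String)> {
    let mut tests = Vec::new();
    let mut lines = source.lines().map(str::trim).peekable();
    while let Some(line) = lines.next() {
        let Some(name) = line.strip_prefix("// test ") else { continue };
        let mut body = String::new();
        while let Some(&next) = lines.peek() {
            // A new marker starts the next test rather than extending this one.
            if next.starts_with("// test ") {
                break;
            }
            let Some(text) = next.strip_prefix("//") else { break };
            // Drop the single space that conventionally follows `//`.
            body.push_str(text.strip_prefix(' ').unwrap_or(text));
            body.push('\n');
            lines.next();
        }
        tests.push((name.trim().to_string(), body));
    }
    tests
}
```

Each returned pair would then be written to a file named after the test inside `test_data/parser/inline`.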
88Note
89[`api_walkthrough`](https://github.com/rust-analyzer/rust-analyzer/blob/2fb6af89eb794f775de60b82afe56b6f986c2a40/crates/ra_syntax/src/lib.rs#L190-L348)
90in particular: it shows off various methods of working with syntax tree.
91
84See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which 92See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which
85fixes a bug in the grammar. 93fixes a bug in the grammar.
86 94
@@ -94,18 +102,22 @@ defines most of the "input" queries: facts supplied by the client of the
94analyzer. Reading the docs of the `ra_db::input` module should be useful: 102analyzer. Reading the docs of the `ra_db::input` module should be useful:
95everything else is strictly derived from those inputs. 103everything else is strictly derived from those inputs.
96 104
97### `crates/ra_hir` 105### `crates/ra_hir*` crates
98 106
99HIR provides high-level "object oriented" access to Rust code. 107HIR provides high-level "object oriented" access to Rust code.
100 108
101The principal difference between HIR and syntax trees is that HIR is bound to a 109The principal difference between HIR and syntax trees is that HIR is bound to a
102particular crate instance. That is, it has cfg flags and features applied (in 110particular crate instance. That is, it has cfg flags and features applied. So,
103theory, in practice this is to be implemented). So, the relation between 111the relation between syntax and HIR is many-to-one. The `source_binder` module
104syntax and HIR is many-to-one. The `source_binder` module is responsible for 112is responsible for guessing a HIR for a particular source position.
105guessing a HIR for a particular source position.
106 113
107Underneath, HIR works on top of salsa, using a `HirDatabase` trait. 114Underneath, HIR works on top of salsa, using a `HirDatabase` trait.
108 115
116`ra_hir_xxx` crates have a strong ECS flavor, in that they work with raw ids and
117directly query the database.
118
119The top-level `ra_hir` façade crate wraps ids into a more OO-flavored API.
120
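The contrast between the id-based layer and the OO-flavored façade can be sketched like this. All names here (`Database`, `FunctionId`, `Function`) are hypothetical stand-ins for illustration; the real crates are built on salsa queries rather than plain vectors.

```rust
/// Raw id: cheap to copy, meaningless without a database.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FunctionId(pub usize);

/// Stand-in for the database; data is stored column-wise, keyed by id.
#[derive(Debug, Default)]
pub struct Database {
    names: Vec<String>,
    param_counts: Vec<usize>,
}

impl Database {
    pub fn alloc_function(&mut self, name: &str, params: usize) -> FunctionId {
        self.names.push(name.to_string());
        self.param_counts.push(params);
        FunctionId(self.names.len() - 1)
    }

    // ECS-flavored access: free-standing queries keyed by raw ids.
    pub fn function_name(&self, id: FunctionId) -> &str {
        &self.names[id.0]
    }

    pub fn function_param_count(&self, id: FunctionId) -> usize {
        self.param_counts[id.0]
    }
}

/// Façade type: wraps the id and exposes methods instead of queries.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Function {
    id: FunctionId,
}

impl Function {
    pub fn new(id: FunctionId) -> Function {
        Function { id }
    }

    pub fn name(self, db: &Database) -> &str {
        db.function_name(self.id)
    }

    pub fn param_count(self, db: &Database) -> usize {
        db.function_param_count(self.id)
    }
}
```

In this sketch the façade adds no state of its own: `Function` is still just an id, and every method forwards to the database, so both styles see the same data.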
109### `crates/ra_ide` 121### `crates/ra_ide`
110 122
111A stateful library for analyzing many Rust files as they change. `AnalysisHost` 123A stateful library for analyzing many Rust files as they change. `AnalysisHost`
@@ -135,18 +147,9 @@ different from data on disk. This is more or less the single really
135platform-dependent component, so it lives in a separate repository and has an 147platform-dependent component, so it lives in a separate repository and has an
136extensive cross-platform CI testing. 148extensive cross-platform CI testing.
137 149
138### `crates/gen_lsp_server`
139
140A language server scaffold, exposing a synchronous crossbeam-channel based API.
141This crate handles protocol handshaking and parsing messages, while you
142control the message dispatch loop yourself.
143
144Run with `RUST_LOG=sync_lsp_server=debug` to see all the messages.
145
146### `crates/ra_cli` 150### `crates/ra_cli`
147 151
148A CLI interface to rust-analyzer. 152A CLI interface to rust-analyzer, mainly for testing.
149
150 153
151## Testing Infrastructure 154## Testing Infrastructure
152 155