From 728990a5807882276e34f1f581f70a59f61ba991 Mon Sep 17 00:00:00 2001
From: Aleksey Kladov
Date: Wed, 20 Mar 2019 15:22:05 +0300
Subject: start dev readme

---
 docs/dev/ARCHITECTURE.md | 200 -----------------------------------------------
 docs/dev/README.md       |  37 +++
 docs/dev/arhictecture.md | 200 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 237 insertions(+), 200 deletions(-)
 delete mode 100644 docs/dev/ARCHITECTURE.md
 create mode 100644 docs/dev/README.md
 create mode 100644 docs/dev/arhictecture.md

diff --git a/docs/dev/ARCHITECTURE.md b/docs/dev/ARCHITECTURE.md
deleted file mode 100644
index 57f76ebae..000000000
--- a/docs/dev/ARCHITECTURE.md
+++ /dev/null
@@ -1,200 +0,0 @@
-# Architecture
-
-This document describes the high-level architecture of rust-analyzer.
-If you want to familiarize yourself with the code base, you are just
-in the right place!
-
-See also the [guide](./guide.md), which walks through a particular snapshot of
-rust-analyzer code base.
-
-For syntax trees specifically, there's a [video walk
-through](https://youtu.be/DGAuLWdCCAI) as well.
-
-## The Big Picture
-
-![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png)
-
-On the highest level, rust-analyzer is a thing which accepts input source code
-from the client and produces a structured semantic model of the code.
-
-More specifically, input data consists of a set of text files (`(PathBuf,
-String)` pairs) and information about project structure, captured in the so-called
-`CrateGraph`. The crate graph specifies which files are crate roots, which cfg
-flags are specified for each crate (TODO: actually implement this) and what
-dependencies exist between the crates. The analyzer keeps all this input data in
-memory and never does any IO. Because the input data is source code, which
-typically measures in tens of megabytes at most, keeping all input data in
-memory is OK.
-
-A "structured semantic model" is basically an object-oriented representation of
-modules, functions and types which appear in the source code. This representation
-is fully "resolved": all expressions have types, all references are bound to
-declarations, etc.
-
-The client can submit a small delta of input data (typically, a change to a
-single file) and get a fresh code model which accounts for changes.
-
-The underlying engine makes sure that the model is computed lazily (on-demand) and
-can be quickly updated for small modifications.
-
-
-## Code generation
-
-Some of the components of this repository are generated through automatic
-processes. These are outlined below:
-
-- `gen-syntax`: The kinds of tokens are reused in several places, so a generator
-  is used. We use [tera] templates to generate the files listed below, based on
-  the grammar described in [grammar.ron]:
-  - [ast/generated.rs][ast generated] in `ra_syntax` based on
-    [ast/generated.tera.rs][ast source]
-  - [syntax_kinds/generated.rs][syntax_kinds generated] in `ra_syntax` based on
-    [syntax_kinds/generated.tera.rs][syntax_kinds source]
-
-[tera]: https://tera.netlify.com/
-[grammar.ron]: ./crates/ra_syntax/src/grammar.ron
-[ast generated]: ./crates/ra_syntax/src/ast/generated.rs
-[ast source]: ./crates/ra_syntax/src/ast/generated.rs.tera
-[syntax_kinds generated]: ./crates/ra_syntax/src/syntax_kinds/generated.rs
-[syntax_kinds source]: ./crates/ra_syntax/src/syntax_kinds/generated.rs.tera
-
-
-## Code Walk-Through
-
-### `crates/ra_syntax`
-
-Rust syntax tree structure and parser. See the
-[RFC](https://github.com/rust-lang/rfcs/pull/2256) for some design notes.
-
-- [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees.
-- `grammar` module is the actual parser. It is a hand-written recursive descent parser, which
-  produces a sequence of events like "start node X", "finish node Y". It works similarly to [Kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java),
-  which is a good source of inspiration for dealing with syntax errors and incomplete input. The original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs)
-  is what we use for the definition of the Rust language.
-- `parser_api/parser_impl` bridges the tree-agnostic parser from `grammar` with `rowan` trees.
-  This is the thing that turns a flat list of events into a tree (see `EventProcessor`).
-- `ast` provides a type-safe API on top of the raw `rowan` tree.
-- `grammar.ron`: RON description of the grammar, which is used to
-  generate the `syntax_kinds` and `ast` modules, using the `cargo gen-syntax` command.
-- `algo`: generic tree algorithms, including `walk` for O(1) stack
-  space tree traversal (this is cool) and `visit` for type-driven
-  visiting of the nodes (this is double plus cool; if you understand how
-  `Visitor` works, you understand the design of syntax trees).
-
-Tests for ra_syntax are mostly data-driven: `tests/data/parser` contains a bunch of `.rs`
-(test vectors) and `.txt` files with the corresponding syntax trees. During testing, we check
-`.rs` against `.txt`. If the `.txt` file is missing, it is created (this is how you update
-tests). Additionally, running `cargo gen-tests` will walk the grammar module and collect
-all `//test test_name` comments into files inside the `tests/data` directory.
-
-See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which
-fixes a bug in the grammar.
-
-### `crates/ra_db`
-
-We use the [salsa](https://github.com/salsa-rs/salsa) crate for incremental and
-on-demand computation. Roughly, you can think of salsa as a key-value store, but
-it can also compute derived values using specified functions. The `ra_db` crate
-provides basic infrastructure for interacting with salsa. Crucially, it
-defines most of the "input" queries: facts supplied by the client of the
-analyzer. Reading the docs of the `ra_db::input` module should be useful:
-everything else is strictly derived from those inputs.
-
-### `crates/ra_hir`
-
-HIR provides high-level "object oriented" access to Rust code.
-
-The principal difference between HIR and syntax trees is that HIR is bound to a
-particular crate instance. That is, it has cfg flags and features applied (in
-theory; in practice this is yet to be implemented). So, the relation between
-syntax and HIR is many-to-one. The `source_binder` module is responsible for
-guessing a HIR for a particular source position.
-
-Underneath, HIR works on top of salsa, using a `HirDatabase` trait.
-
-### `crates/ra_ide_api`
-
-A stateful library for analyzing many Rust files as they change. `AnalysisHost`
-is a mutable entity (Clojure's atom) which holds the current state, incorporates
-changes and hands out `Analysis` --- an immutable and consistent snapshot of
-the world state at a point in time, which actually powers analysis.
-
-One interesting aspect of analysis is its support for cancellation. When a
-change is applied to `AnalysisHost`, first all currently active snapshots are
-canceled. Only after all snapshots are dropped does the change actually affect the
-database.
-
-APIs in this crate are IDE-centric: they take text offsets as input and produce
-offsets and strings as output. This works on top of the rich code model powered by
-`hir`.
-
-### `crates/ra_ide_api_light`
-
-All IDE features which can be implemented if you only have access to a single
-file. `ra_ide_api_light` could be used to enhance editing of Rust code without
-the need to fiddle with build systems, file synchronization and such.
-
-In a sense, `ra_ide_api_light` is just a bunch of pure functions which take a
-syntax tree as input.
-
-The tests for `ra_ide_api_light` are `#[cfg(test)] mod tests` unit tests spread
-throughout its modules.
-
-
-### `crates/ra_lsp_server`
-
-An LSP implementation which wraps `ra_ide_api` into a language server protocol.
-
-### `crates/ra_vfs`
-
-Although `hir` and `ra_ide_api` don't do any IO, we need to be able to read
-files from disk at the end of the day. This is what `ra_vfs` does. It also
-manages overlays: "dirty" files in the editor, whose "true" contents differ
-from the data on disk.
-
-### `crates/gen_lsp_server`
-
-A language server scaffold, exposing a synchronous crossbeam-channel based API.
-This crate handles protocol handshaking and parsing messages, while you
-control the message dispatch loop yourself.
-
-Run with `RUST_LOG=sync_lsp_server=debug` to see all the messages.
-
-### `crates/ra_cli`
-
-A CLI interface to rust-analyzer.
-
-### `crates/tools`
-
-Custom Cargo tasks used to develop rust-analyzer:
-
-- `cargo gen-syntax` -- generate `ast` and `syntax_kinds`
-- `cargo gen-tests` -- collect inline tests from grammar
-- `cargo install-code` -- build and install the VS Code extension and server
-
-### `editors/code`
-
-VS Code plugin
-
-
-## Common workflows
-
-To try out the VS Code extension, run `cargo install-code`. This installs both the
-`ra_lsp_server` binary and the VS Code extension. To install only the binary, use
-`cargo install-lsp` (shorthand for `cargo install --path crates/ra_lsp_server --force`).
-
-To see logs from the language server, set the `RUST_LOG=info` env variable. To see
-all communication between the server and the client, use
-`RUST_LOG=gen_lsp_server=debug` (this will print quite a bit of stuff).
-
-There's a `rust-analyzer: status` command which prints common high-level debug
-info. In particular, it prints info about memory usage of various data
-structures and, if compiled with jemalloc support (`cargo jinstall-lsp` or
-`cargo install --path crates/ra_lsp_server --force --features jemalloc`), includes
-statistics about the heap.
-
-To run tests, just `cargo test`.
-
-To work on the VS Code extension, launch code inside `editors/code` and use `F5` to
-launch/debug. To automatically apply formatter and linter suggestions, use `npm
-run fix`.
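The `crates/ra_syntax` section above describes the parser as a tree-agnostic component that emits a flat stream of "start node" / "finish node" / token events, which a separate step folds into a tree. The sketch below is a minimal illustration of that idea; `Event`, `Node`, and `build_tree` are made-up names for illustration and are not rust-analyzer's actual types.

```rust
// Illustrative only: a tree-agnostic parser emits flat events,
// and a separate pass assembles them into a nested tree.
#[derive(Debug)]
enum Event {
    Start(&'static str), // "start node X"
    Token(String),       // a leaf token
    Finish,              // "finish node X"
}

#[derive(Debug)]
enum Node {
    Branch { kind: &'static str, children: Vec<Node> },
    Leaf(String),
}

fn build_tree(events: Vec<Event>) -> Option<Node> {
    // Stack of partially built branches: `Start` pushes, `Finish` pops.
    let mut stack: Vec<(&'static str, Vec<Node>)> = Vec::new();
    let mut root = None;
    for event in events {
        match event {
            Event::Start(kind) => stack.push((kind, Vec::new())),
            Event::Token(text) => stack.last_mut()?.1.push(Node::Leaf(text)),
            Event::Finish => {
                let (kind, children) = stack.pop()?;
                let node = Node::Branch { kind, children };
                match stack.last_mut() {
                    Some(parent) => parent.1.push(node),
                    None => root = Some(node),
                }
            }
        }
    }
    root
}

fn main() {
    let events = vec![
        Event::Start("FN_DEF"),
        Event::Token("fn".to_string()),
        Event::Token("main".to_string()),
        Event::Finish,
    ];
    println!("{:?}", build_tree(events));
}
```

Keeping the parser tree-agnostic in this way is what allows the same grammar code to target different tree representations, such as `rowan`.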
diff --git a/docs/dev/README.md b/docs/dev/README.md
new file mode 100644
index 000000000..74bf86f68
--- /dev/null
+++ b/docs/dev/README.md
@@ -0,0 +1,37 @@
+# Contributing Quick Start
+
+Rust Analyzer is just an ordinary Rust project, which is organized as a Cargo
+workspace, builds on stable Rust and doesn't depend on C libraries. So, just
+
+```
+$ cargo test
+```
+
+should be enough to get you started!
+
+To learn more about how rust-analyzer works, see
+the [./architecture.md](./architecture.md) document.
+
+Various organizational and process issues are discussed here.
+
+# Getting in Touch
+
+Rust Analyzer is part of the [RLS-2.0 working
+group](https://github.com/rust-lang/compiler-team/tree/6a769c13656c0a6959ebc09e7b1f7c09b86fb9c0/working-groups/rls-2.0).
+Discussion happens in this Zulip stream:
+
+https://rust-lang.zulipchat.com/#narrow/stream/185405-t-compiler.2Fwg-rls-2.2E0
+
+# Issue Labels
+
+* [good-first-issue](https://github.com/rust-analyzer/rust-analyzer/labels/good%20first%20issue)
+  are good issues for getting into the project.
+* [E-mentor](https://github.com/rust-analyzer/rust-analyzer/issues?q=is%3Aopen+is%3Aissue+label%3AE-mentor)
+  issues have links to the code in question and tests.
+* [E-easy](https://github.com/rust-analyzer/rust-analyzer/issues?q=is%3Aopen+is%3Aissue+label%3AE-easy),
+  [E-medium](https://github.com/rust-analyzer/rust-analyzer/issues?q=is%3Aopen+is%3Aissue+label%3AE-medium),
+  [E-hard](https://github.com/rust-analyzer/rust-analyzer/issues?q=is%3Aopen+is%3Aissue+label%3AE-hard),
+  labels are *estimates* of how hard it would be to write a fix.
+* [E-fun](https://github.com/rust-analyzer/rust-analyzer/issues?q=is%3Aopen+is%3Aissue+label%3AE-fun)
+  is for cool, but probably hard stuff.
+
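The `crates/ra_db` section of the architecture document describes salsa as, roughly, a key-value store that can also compute derived values using specified functions. The sketch below illustrates only that idea with a hand-rolled memoizing store; it deliberately avoids salsa's real macros and API, and `Database`, `set_file_text`, and `line_count` are invented names, not rust-analyzer's actual queries.

```rust
use std::collections::HashMap;

/// Illustrative only: "input" values are set explicitly, "derived" values
/// are computed on demand and cached until a relevant input changes.
struct Database {
    // Input query: facts supplied by the client (file id -> text).
    file_text: HashMap<u32, String>,
    // Cache for a derived query (file id -> line count).
    line_count_cache: HashMap<u32, usize>,
}

impl Database {
    fn new() -> Self {
        Database { file_text: HashMap::new(), line_count_cache: HashMap::new() }
    }

    fn set_file_text(&mut self, file: u32, text: String) {
        self.file_text.insert(file, text);
        // A real incremental engine tracks dependencies between queries;
        // here we just drop the cached derived value for the changed input.
        self.line_count_cache.remove(&file);
    }

    fn line_count(&mut self, file: u32) -> usize {
        if let Some(&cached) = self.line_count_cache.get(&file) {
            return cached;
        }
        let computed = self.file_text.get(&file).map_or(0, |t| t.lines().count());
        self.line_count_cache.insert(file, computed);
        computed
    }
}

fn main() {
    let mut db = Database::new();
    db.set_file_text(1, "fn main() {}\n".to_string());
    assert_eq!(db.line_count(1), 1); // computed once
    assert_eq!(db.line_count(1), 1); // served from the cache
}
```

salsa generalizes this pattern by tracking dependencies between queries automatically, which is what makes fine-grained, on-demand recomputation possible.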
diff --git a/docs/dev/arhictecture.md b/docs/dev/arhictecture.md
new file mode 100644
index 000000000..57f76ebae
--- /dev/null
+++ b/docs/dev/arhictecture.md
@@ -0,0 +1,200 @@
+# Architecture
+
+This document describes the high-level architecture of rust-analyzer.
+If you want to familiarize yourself with the code base, you are just
+in the right place!
+
+See also the [guide](./guide.md), which walks through a particular snapshot of
+rust-analyzer code base.
+
+For syntax trees specifically, there's a [video walk
+through](https://youtu.be/DGAuLWdCCAI) as well.
+
+## The Big Picture
+
+![](https://user-images.githubusercontent.com/1711539/50114578-e8a34280-0255-11e9-902c-7cfc70747966.png)
+
+On the highest level, rust-analyzer is a thing which accepts input source code
+from the client and produces a structured semantic model of the code.
+
+More specifically, input data consists of a set of text files (`(PathBuf,
+String)` pairs) and information about project structure, captured in the so-called
+`CrateGraph`. The crate graph specifies which files are crate roots, which cfg
+flags are specified for each crate (TODO: actually implement this) and what
+dependencies exist between the crates. The analyzer keeps all this input data in
+memory and never does any IO. Because the input data is source code, which
+typically measures in tens of megabytes at most, keeping all input data in
+memory is OK.
+
+A "structured semantic model" is basically an object-oriented representation of
+modules, functions and types which appear in the source code. This representation
+is fully "resolved": all expressions have types, all references are bound to
+declarations, etc.
+
+The client can submit a small delta of input data (typically, a change to a
+single file) and get a fresh code model which accounts for changes.
+
+The underlying engine makes sure that the model is computed lazily (on-demand) and
+can be quickly updated for small modifications.
+
+
+## Code generation
+
+Some of the components of this repository are generated through automatic
+processes. These are outlined below:
+
+- `gen-syntax`: The kinds of tokens are reused in several places, so a generator
+  is used. We use [tera] templates to generate the files listed below, based on
+  the grammar described in [grammar.ron]:
+  - [ast/generated.rs][ast generated] in `ra_syntax` based on
+    [ast/generated.tera.rs][ast source]
+  - [syntax_kinds/generated.rs][syntax_kinds generated] in `ra_syntax` based on
+    [syntax_kinds/generated.tera.rs][syntax_kinds source]
+
+[tera]: https://tera.netlify.com/
+[grammar.ron]: ./crates/ra_syntax/src/grammar.ron
+[ast generated]: ./crates/ra_syntax/src/ast/generated.rs
+[ast source]: ./crates/ra_syntax/src/ast/generated.rs.tera
+[syntax_kinds generated]: ./crates/ra_syntax/src/syntax_kinds/generated.rs
+[syntax_kinds source]: ./crates/ra_syntax/src/syntax_kinds/generated.rs.tera
+
+
+## Code Walk-Through
+
+### `crates/ra_syntax`
+
+Rust syntax tree structure and parser. See the
+[RFC](https://github.com/rust-lang/rfcs/pull/2256) for some design notes.
+
+- [rowan](https://github.com/rust-analyzer/rowan) library is used for constructing syntax trees.
+- `grammar` module is the actual parser. It is a hand-written recursive descent parser, which
+  produces a sequence of events like "start node X", "finish node Y". It works similarly to [Kotlin's parser](https://github.com/JetBrains/kotlin/blob/4d951de616b20feca92f3e9cc9679b2de9e65195/compiler/frontend/src/org/jetbrains/kotlin/parsing/KotlinParsing.java),
+  which is a good source of inspiration for dealing with syntax errors and incomplete input. The original [libsyntax parser](https://github.com/rust-lang/rust/blob/6b99adeb11313197f409b4f7c4083c2ceca8a4fe/src/libsyntax/parse/parser.rs)
+  is what we use for the definition of the Rust language.
+- `parser_api/parser_impl` bridges the tree-agnostic parser from `grammar` with `rowan` trees.
+  This is the thing that turns a flat list of events into a tree (see `EventProcessor`).
+- `ast` provides a type-safe API on top of the raw `rowan` tree.
+- `grammar.ron`: RON description of the grammar, which is used to
+  generate the `syntax_kinds` and `ast` modules, using the `cargo gen-syntax` command.
+- `algo`: generic tree algorithms, including `walk` for O(1) stack
+  space tree traversal (this is cool) and `visit` for type-driven
+  visiting of the nodes (this is double plus cool; if you understand how
+  `Visitor` works, you understand the design of syntax trees).
+
+Tests for ra_syntax are mostly data-driven: `tests/data/parser` contains a bunch of `.rs`
+(test vectors) and `.txt` files with the corresponding syntax trees. During testing, we check
+`.rs` against `.txt`. If the `.txt` file is missing, it is created (this is how you update
+tests). Additionally, running `cargo gen-tests` will walk the grammar module and collect
+all `//test test_name` comments into files inside the `tests/data` directory.
+
+See [#93](https://github.com/rust-analyzer/rust-analyzer/pull/93) for an example PR which
+fixes a bug in the grammar.
+
+### `crates/ra_db`
+
+We use the [salsa](https://github.com/salsa-rs/salsa) crate for incremental and
+on-demand computation. Roughly, you can think of salsa as a key-value store, but
+it can also compute derived values using specified functions. The `ra_db` crate
+provides basic infrastructure for interacting with salsa. Crucially, it
+defines most of the "input" queries: facts supplied by the client of the
+analyzer. Reading the docs of the `ra_db::input` module should be useful:
+everything else is strictly derived from those inputs.
+
+### `crates/ra_hir`
+
+HIR provides high-level "object oriented" access to Rust code.
+
+The principal difference between HIR and syntax trees is that HIR is bound to a
+particular crate instance. That is, it has cfg flags and features applied (in
+theory; in practice this is yet to be implemented). So, the relation between
+syntax and HIR is many-to-one. The `source_binder` module is responsible for
+guessing a HIR for a particular source position.
+
+Underneath, HIR works on top of salsa, using a `HirDatabase` trait.
+
+### `crates/ra_ide_api`
+
+A stateful library for analyzing many Rust files as they change. `AnalysisHost`
+is a mutable entity (Clojure's atom) which holds the current state, incorporates
+changes and hands out `Analysis` --- an immutable and consistent snapshot of
+the world state at a point in time, which actually powers analysis.
+
+One interesting aspect of analysis is its support for cancellation. When a
+change is applied to `AnalysisHost`, first all currently active snapshots are
+canceled. Only after all snapshots are dropped does the change actually affect the
+database.
+
+APIs in this crate are IDE-centric: they take text offsets as input and produce
+offsets and strings as output. This works on top of the rich code model powered by
+`hir`.
+
+### `crates/ra_ide_api_light`
+
+All IDE features which can be implemented if you only have access to a single
+file. `ra_ide_api_light` could be used to enhance editing of Rust code without
+the need to fiddle with build systems, file synchronization and such.
+
+In a sense, `ra_ide_api_light` is just a bunch of pure functions which take a
+syntax tree as input.
+
+The tests for `ra_ide_api_light` are `#[cfg(test)] mod tests` unit tests spread
+throughout its modules.
+
+
+### `crates/ra_lsp_server`
+
+An LSP implementation which wraps `ra_ide_api` into a language server protocol.
+
+### `crates/ra_vfs`
+
+Although `hir` and `ra_ide_api` don't do any IO, we need to be able to read
+files from disk at the end of the day. This is what `ra_vfs` does. It also
+manages overlays: "dirty" files in the editor, whose "true" contents differ
+from the data on disk.
+
+### `crates/gen_lsp_server`
+
+A language server scaffold, exposing a synchronous crossbeam-channel based API.
+This crate handles protocol handshaking and parsing messages, while you
+control the message dispatch loop yourself.
+
+Run with `RUST_LOG=sync_lsp_server=debug` to see all the messages.
+
+### `crates/ra_cli`
+
+A CLI interface to rust-analyzer.
+
+### `crates/tools`
+
+Custom Cargo tasks used to develop rust-analyzer:
+
+- `cargo gen-syntax` -- generate `ast` and `syntax_kinds`
+- `cargo gen-tests` -- collect inline tests from grammar
+- `cargo install-code` -- build and install the VS Code extension and server
+
+### `editors/code`
+
+VS Code plugin
+
+
+## Common workflows
+
+To try out the VS Code extension, run `cargo install-code`. This installs both the
+`ra_lsp_server` binary and the VS Code extension. To install only the binary, use
+`cargo install-lsp` (shorthand for `cargo install --path crates/ra_lsp_server --force`).
+
+To see logs from the language server, set the `RUST_LOG=info` env variable. To see
+all communication between the server and the client, use
+`RUST_LOG=gen_lsp_server=debug` (this will print quite a bit of stuff).
+
+There's a `rust-analyzer: status` command which prints common high-level debug
+info. In particular, it prints info about memory usage of various data
+structures and, if compiled with jemalloc support (`cargo jinstall-lsp` or
+`cargo install --path crates/ra_lsp_server --force --features jemalloc`), includes
+statistics about the heap.
+
+To run tests, just `cargo test`.
+
+To work on the VS Code extension, launch code inside `editors/code` and use `F5` to
+launch/debug. To automatically apply formatter and linter suggestions, use `npm
+run fix`.
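The `crates/ra_ide_api` section above explains that applying a change to `AnalysisHost` first cancels all currently active snapshots, and only once they are dropped does the change reach the database. The sketch below shows one way such a cancellation handshake can look; it is a deliberate simplification (it does not wait for snapshots to be dropped), and `Host`/`Snapshot` are hypothetical names rather than rust-analyzer's actual types.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Illustrative only: a snapshot carries a cancellation flag that the
/// host flips before it mutates the shared state.
struct Snapshot {
    canceled: Arc<AtomicBool>,
    data: Arc<String>, // stand-in for the immutable world state
}

impl Snapshot {
    /// Long-running analysis should poll this and bail out early.
    fn is_canceled(&self) -> bool {
        self.canceled.load(Ordering::SeqCst)
    }
}

struct Host {
    data: Arc<String>,
    live_flags: Vec<Arc<AtomicBool>>,
}

impl Host {
    fn snapshot(&mut self) -> Snapshot {
        let flag = Arc::new(AtomicBool::new(false));
        self.live_flags.push(flag.clone());
        Snapshot { canceled: flag, data: self.data.clone() }
    }

    fn apply_change(&mut self, new_text: String) {
        // Step 1: ask every outstanding snapshot to stop working.
        for flag in self.live_flags.drain(..) {
            flag.store(true, Ordering::SeqCst);
        }
        // Step 2: only now swap the state; holders of old snapshots
        // still see the old `Arc`, so they stay internally consistent.
        self.data = Arc::new(new_text);
    }
}

fn main() {
    let mut host = Host { data: Arc::new("v1".to_string()), live_flags: Vec::new() };
    let snap = host.snapshot();
    host.apply_change("v2".to_string());
    assert!(snap.is_canceled());
    assert_eq!(*snap.data, "v1"); // the old snapshot remains consistent
}
```

rust-analyzer itself layers this behaviour on top of salsa rather than on hand-rolled flags, but the shape of the handshake is the same: cancel readers first, mutate second.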