author Aleksey Kladov <[email protected]> 2018-02-11 14:58:22 +0000
committer Aleksey Kladov <[email protected]> 2018-02-11 14:58:22 +0000
commit 59087840f515c809498f09ec535e59054a893525
tree 282e1f9606dbeeba6c19a2dcd6ff94da420d155a /docs
parent 9e2c0564783aa91f6440e7cadcc1a4dfda785de0
Document how the parsing works
Diffstat (limited to 'docs')
-rw-r--r-- docs/ARCHITECTURE.md 21
1 files changed, 9 insertions, 12 deletions
diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index a1fa246c2..6b4434396 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -33,19 +33,22 @@ The centerpiece of this whole endeavor is the syntax tree, in the
 
 The syntax tree is produced using a three-staged process.
 
-First, a raw text is split into tokens with a lexer. Lexer has a
-peculiar signature: it is an `Fn(&str) -> Token`, where token is a
-pair of `SyntaxKind` (you should have read the `tree` module and RFC
+First, a raw text is split into tokens with a lexer (the `lexer` module).
+Lexer has a peculiar signature: it is an `Fn(&str) -> Token`, where token
+is a pair of `SyntaxKind` (you should have read the `tree` module and RFC
 by this time! :)) and a len. That is, lexer chomps only the first
 token of the input. This forces the lexer to be stateless, and makes
 it possible to implement incremental relexing easily.
 
 Then, the bulk of work, the parser turns a stream of tokens into
-stream of events. Not that parser **does not** construct a tree right
-away. This is done for several reasons:
+stream of events (the `parser` module; of particular interest are
+the `parser/event` and `parser/parser` modules, which contain parsing
+API, and the `parser/grammar` module, which contains actual parsing code
+for various Rust syntactic constructs). Not that parser **does not**
+construct a tree right away. This is done for several reasons:
 
 * to decouple the actual tree data structure from the parser: you can
-  build any datastructre you want from the stream of events
+  build any data structure you want from the stream of events
 
 * to make parsing fast: you can produce a list of events without
   allocations
@@ -77,12 +80,6 @@ And at last, the TreeBuilder converts a flat stream of events into a
 tree structure. It also *should* be responsible for attaching comments
 and rebalancing the tree, but it does not do this yet :)
 
-
-## Error reporing
-
-TODO: describe how stuff like `skip_to_first` works
-
-
 ## Validator
 
 Parser and lexer accept a lot of *invalid* code intentionally. The
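
Note on the lexer signature documented in this patch: the text describes the lexer as an `Fn(&str) -> Token` that chomps only the first token, with a token being a `SyntaxKind` plus a length. The following is a minimal sketch of that shape, not the project's actual code. The `SyntaxKind` variants, the `next_token`, `prefix_len` and `tokenize` names, and the token rules are all hypothetical and chosen only for illustration.

```rust
// Hypothetical, simplified stand-ins for the real types (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyntaxKind {
    Whitespace,
    Ident,
    Number,
    Error,
}

// A token is just a kind and a length in bytes; it holds no text and no offset.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Token {
    kind: SyntaxKind,
    len: usize,
}

// Length in bytes of the longest prefix of `text` whose chars all satisfy `pred`.
fn prefix_len(text: &str, pred: impl Fn(char) -> bool) -> usize {
    text.find(|c: char| !pred(c)).unwrap_or(text.len())
}

// Lexes only the first token of `text`. It keeps no state between calls,
// so it can be restarted at any token boundary. Panics on empty input.
fn next_token(text: &str) -> Token {
    let first = text.chars().next().expect("next_token needs non-empty input");
    if first.is_whitespace() {
        let len = prefix_len(text, |c| c.is_whitespace());
        Token { kind: SyntaxKind::Whitespace, len }
    } else if first.is_alphabetic() || first == '_' {
        let len = prefix_len(text, |c| c.is_alphanumeric() || c == '_');
        Token { kind: SyntaxKind::Ident, len }
    } else if first.is_ascii_digit() {
        let len = prefix_len(text, |c| c.is_ascii_digit());
        Token { kind: SyntaxKind::Number, len }
    } else {
        // Unknown character: emit a one-char error token instead of failing.
        Token { kind: SyntaxKind::Error, len: first.len_utf8() }
    }
}

// A full token stream is obtained by calling `next_token` repeatedly
// on the remaining input.
fn tokenize(mut text: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    while !text.is_empty() {
        let token = next_token(text);
        tokens.push(token);
        text = &text[token.len..];
    }
    tokens
}
```

Because `next_token` depends only on the text it is handed, relexing after an edit can, under these assumptions, start from the nearest unaffected token boundary and reuse the tokens before it, rather than restoring any lexer state.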