author Akshay <[email protected]> 2022-08-02 15:20:46 +0100
committer Akshay <[email protected]> 2022-08-02 15:27:12 +0100
commit cfc70207996e202edbb577b2ad97a61ba9eb0eaa
tree 97a3f25c3016766d6456efb748d48cbc6c525a47
parent efd96e8df6805a45aaf5822141dee11c642b51ae
add textual comparison (HEAD, master)
structural comparison helps detect a vast majority of duplicates, but it has a few false positives when files contain only trivia. textual similarity can help detect and eliminate those false positives.
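
the false positive described above can be illustrated with a minimal sketch. it is not the tool's actual implementation: the similarity metric (Jaccard over trimmed, non-empty lines), the 0.9 threshold, and the names `line_set`, `textual_similarity` and `is_duplicate` are all assumptions made for illustration. the structural result is passed in as a plain boolean so that the example stays self-contained.

```rust
use std::collections::HashSet;

/// Collects the trimmed, non-empty lines of a text into a set.
fn line_set(text: &str) -> HashSet<&str> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .collect()
}

/// Jaccard similarity over the line sets of two texts, in [0.0, 1.0].
/// Assumption: the real tool may use a different metric entirely.
fn textual_similarity(a: &str, b: &str) -> f64 {
    let (lines_a, lines_b) = (line_set(a), line_set(b));
    if lines_a.is_empty() && lines_b.is_empty() {
        // Two texts without any content lines are treated as identical.
        return 1.0;
    }
    let shared = lines_a.intersection(&lines_b).count() as f64;
    let total = lines_a.union(&lines_b).count() as f64;
    shared / total
}

/// Reports a duplicate only when the (externally computed) structural
/// check passed AND the texts are similar enough.
fn is_duplicate(structurally_equal: bool, a: &str, b: &str, threshold: f64) -> bool {
    structurally_equal && textual_similarity(a, b) >= threshold
}

fn main() {
    // Both files consist of comments only, so a structural check that
    // skips trivia nodes would consider them equal.
    let file1 = "// first draft\n// TODO: write this\n";
    let file2 = "// unrelated notes\n// nothing to see here\n";

    println!("different comments: {}", is_duplicate(true, file1, file2, 0.9));
    println!("identical comments: {}", is_duplicate(true, file1, file1, 0.9));
}
```

with this metric, two comment-only files that share no lines score 0.0 and are no longer reported as duplicates, while identical files still score 1.0.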
Diffstat (limited to 'readme')
-rw-r--r-- readme 7
1 files changed, 3 insertions, 4 deletions
diff --git a/readme b/readme
index 869e43e..d92989f 100644
--- a/readme
+++ b/readme
@@ -18,15 +18,14 @@ Internals:
 
 The tool uses tree-sitter to produce ASTs for the given files. It then lazily
 traverses the trees of the two files to be compared and exits on encountering
-the first structural difference in the ASTs.
+the first structural difference in the ASTs. Additionally, it performs a
+textual similarity check to eliminate outliers such as files that consist
+entirely of trivia nodes.
 
 
 Known issues:
 ------------
 
-- A fully commented-out file is equivalent to every other fully commented-out
-  file and to empty files
-
 - Does not account for equivalence of unordered children:
 
     ==== file1.rs ====        ==== file2.rs ====