Accepts a list of files as arguments and prints pairs of structurally
duplicate files, one pair per line.

Usage:
-----

    cargo run --release --quiet -- [FILES]


Example:
-------

    cargo run --release --quiet -- js-files/*.js


Internals:
---------

The tool uses tree-sitter to produce ASTs for the given files. To compare two
files, it lazily traverses both trees and stops at the first structural
difference between the ASTs. Additionally, it performs a textual similarity
check to eliminate outliers such as files that consist entirely of trivia
nodes.
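
The lazy comparison described above can be sketched roughly as follows. This
is an illustration only, not the tool's actual code: `Node`, `Preorder` and
`same_structure` are hypothetical names, and a toy tree type stands in for
tree-sitter's syntax tree. The textual similarity check is not covered.

```rust
/// Toy stand-in for a syntax tree node (assumption: the real tool walks
/// tree-sitter nodes instead).
#[derive(Debug)]
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

impl Node {
    fn new(kind: &'static str, children: Vec<Node>) -> Self {
        Node { kind, children }
    }
}

/// Lazy pre-order walk that yields (node kind, number of children).
/// Nodes are only visited when the consumer asks for the next item.
struct Preorder<'a> {
    stack: Vec<&'a Node>,
}

impl<'a> Preorder<'a> {
    fn new(root: &'a Node) -> Self {
        Preorder { stack: vec![root] }
    }
}

impl<'a> Iterator for Preorder<'a> {
    type Item = (&'a str, usize);

    fn next(&mut self) -> Option<Self::Item> {
        let node = self.stack.pop()?;
        // Push children in reverse so the leftmost child is visited first.
        self.stack.extend(node.children.iter().rev());
        Some((node.kind, node.children.len()))
    }
}

/// True when both trees have the same shape and the same node kinds.
/// `Iterator::eq` stops at the first mismatch, so the rest of either
/// tree is never walked once a difference is found.
fn same_structure(a: &Node, b: &Node) -> bool {
    Preorder::new(a).eq(Preorder::new(b))
}
```

Comparing the kind together with the child count at each step is what makes
the pre-order sequence sufficient to identify the shape of the tree. Leaf text
(identifier names, literals) is deliberately ignored in this sketch.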


Known issues:
------------

- Does not account for equivalence of unordered children:

    ==== file1.rs ====            ==== file2.rs ====
    fn bar(x, y, z) {}            fn foo(a, b) {}
    fn foo(a, b) {}               fn bar(x, y, z) {}

  Here, `function` nodes are "unordered" children: the files are structurally
  equivalent because the order of `function` nodes is irrelevant. However, the
  tool considers these files to be distinct.
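
One possible way to treat such children as unordered is to compare a
canonical form in which the children of selected node kinds are sorted. The
sketch below is an assumption about how this could be approached, not
something the tool does: `Node`, `canonical`, `same_structure_unordered` and
the `unordered_parents` list are all hypothetical, and a toy tree type again
stands in for tree-sitter's.

```rust
/// Toy stand-in for a syntax tree node (assumption: not the tool's type).
#[derive(Debug)]
struct Node {
    kind: &'static str,
    children: Vec<Node>,
}

/// Builds a canonical string for a subtree. For nodes whose kind is listed
/// in `unordered_parents`, the children's canonical strings are sorted, so
/// the original order of those children no longer matters.
fn canonical(node: &Node, unordered_parents: &[&str]) -> String {
    let mut parts: Vec<String> = node
        .children
        .iter()
        .map(|child| canonical(child, unordered_parents))
        .collect();
    if unordered_parents.contains(&node.kind) {
        parts.sort();
    }
    format!("({} {})", node.kind, parts.join(" "))
}

/// Order-insensitive structural equality for the listed parent kinds.
fn same_structure_unordered(a: &Node, b: &Node, unordered_parents: &[&str]) -> bool {
    canonical(a, unordered_parents) == canonical(b, unordered_parents)
}
```

With `unordered_parents` set to `["source_file"]`, the two files above would
compare as equivalent, while an empty list reproduces the current
order-sensitive behaviour. The trade-off is that this variant walks both trees
in full, so it gives up the early exit of the lazy traversal.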