In this repository we present the code of Sloth, our solution for determining the largest overlap between two tables.
The code is provided in the "sloth.py" file, while the main files for preprocessing the datasets and for running the experiments are available in the "data_preparation" and "experiments" folders, respectively.
The "examples" folder contains representative pairs of tables from Wikipedia that illustrate typical cases where the detected largest overlap differs significantly from traditional set similarity measures, such as Jaccard similarity and overlap set similarity.
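
To give an intuition of why set similarity measures can be misleading for tables, the following minimal sketch computes both measures on the sets of cell values of two toy tables. It does not use "sloth.py": the tables, the helper names and the choice of treating overlap set similarity as the plain intersection size are assumptions made for illustration only.

```python
# Illustrative sketch only: helper names and toy tables are hypothetical
# and are not part of sloth.py.

def cell_values(table):
    """Flatten a table (a list of rows) into the set of its cell values."""
    return {cell for row in table for cell in row}

def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_set_similarity(a, b):
    """Assumed here to be the size of the intersection of the two sets."""
    return len(a & b)

# Two toy tables with the same cell values arranged differently.
table_r = [["Rome", "Italy"], ["Paris", "France"]]
table_s = [["Rome", "France"], ["Paris", "Italy"]]

values_r = cell_values(table_r)
values_s = cell_values(table_s)

jaccard = jaccard_similarity(values_r, values_s)
overlap = overlap_set_similarity(values_r, values_s)

# Rows that appear identically in both tables (none in this example).
shared_rows = {tuple(row) for row in table_r} & {tuple(row) for row in table_s}

print(jaccard, overlap, shared_rows)
```

Both set measures report a perfect match for these two tables, although no row of one table appears in the other. Set similarity ignores how the values are arranged in rows and columns, which is the kind of discrepancy the pairs in the "examples" folder are meant to show.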