forked from lucazecchini/sloth

Reference paper: "Determining the Largest Overlap between Tables" (Luca Zecchini, Tobias Bleifuß, Giovanni Simonini, Sonia Bergamaschi, Felix Naumann). Proceedings of the ACM on Management of Data (PACMMOD), 2024


dbmodena/sloth

 
 


Sloth

In this repository we present the code of Sloth, our solution for determining the largest overlap between two tables.

The code is provided in the "sloth.py" file, while the main files for preprocessing the datasets and for running the experiments are available in the "data_preparation" and "experiments" folders, respectively.

The "examples" folder contains representative pairs of tables from Wikipedia, illustrating typical cases where the detected largest overlap differs significantly from traditional set similarity measures, such as Jaccard similarity and overlap set similarity.
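To make that difference concrete, here is a minimal sketch on two toy tables. It does not use the API of "sloth.py" and is not the Sloth algorithm: all function and variable names below are hypothetical, and the largest overlap is found by exhaustive search, which is only feasible for tiny tables. The sketch assumes that the largest overlap is the common subtable of maximum area (rows times columns) after reordering rows and columns, and that overlap set similarity is the number of shared distinct values.

```python
from collections import Counter
from itertools import combinations, permutations


def cell_values(table):
    """Return the set of distinct cell values of a table (list of row tuples)."""
    return {value for row in table for value in row}


def jaccard_similarity(a, b):
    """Size of the intersection divided by the size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlap_set_similarity(a, b):
    """Number of distinct values shared by the two sets (assumed definition)."""
    return len(a & b)


def brute_force_largest_overlap_area(r, s):
    """Exhaustive search over column mappings; NOT the Sloth algorithm."""
    if not r or not s:
        return 0
    r_width, s_width = len(r[0]), len(s[0])
    best = 0
    for width in range(1, min(r_width, s_width) + 1):
        # Choose which columns of r take part in the overlap...
        for r_cols in combinations(range(r_width), width):
            r_rows = Counter(tuple(row[c] for c in r_cols) for row in r)
            # ...and map them, in order, onto distinct columns of s.
            for s_cols in permutations(range(s_width), width):
                s_rows = Counter(tuple(row[c] for c in s_cols) for row in s)
                # Multiset intersection counts rows shared by both projections.
                shared_rows = sum((r_rows & s_rows).values())
                best = max(best, width * shared_rows)
    return best


# Toy tables: the same six values, but paired differently in each row.
r_table = [("Rome", "Italy"), ("Paris", "France"), ("Berlin", "Germany")]
s_table = [("Rome", "France"), ("Paris", "Germany"), ("Berlin", "Italy")]

r_values, s_values = cell_values(r_table), cell_values(s_table)
print(jaccard_similarity(r_values, s_values))                 # 1.0
print(overlap_set_similarity(r_values, s_values))             # 6
print(brute_force_largest_overlap_area(r_table, s_table))     # 3
```

Both set measures treat the two tables as identical, because they contain the same six values. The largest common subtable, however, is a single column of three cells, since no full row of one table appears in the other.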



Languages

  • Python 100.0%