Golang metrics for calculating string similarity and other string utility functions
Updated Jul 31, 2024 - Go
A package to compute medical segmentation metrics.
This repository provides a Python script to cluster keywords based on the similarity of their associated URLs, calculated using the Jaccard similarity coefficient.
This project implements a recommendation system using the Polars DataFrame library. The system recommends products to reviewers based on the Jaccard similarity of their review histories.
Flora Genie is a personalized plant recommendation system designed to help amateur gardeners select the most suitable plants for their homes or gardens.
Calculate various string metrics efficiently in Haskell
Implements several spellcheck techniques, including cosine similarity, Jaccard similarity, and Levenshtein distance. Open to further contributions.
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
A movie recommender written in Go that suggests movies considering various factors within a particular dataset, encompassing users, movies, and movie ratings.
Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report.
A collection of string comparison algorithms
Addressed Entity Resolution challenges. Tasks include schema-agnostic blocking, pairwise comparisons, Meta-Blocking graph construction, and Jaccard similarity computation. Deliverables include source code, reports, and reproducibility guidelines in Python.
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
SARS-CoV-2 genome analysis using Big Data algorithms to find clusters of similar mutations that belong to different clades, mutate together, and generate the corresponding clade.
An application for fraud detection in medicine packages and tablets.
Asynchronous Distributed Actor-based Approach to Jaccard Similarity for Genome Comparisons
A Jupyter Notebook that illustrates and compares different approaches to sentence similarity scoring.
Descriptive and predictive analysis of taxability
Document comparison web application based on the Jaccard similarity index. Each uploaded file is compared to all previously uploaded ones. Built with Java/JSP.
Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.
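For readers new to the topic, the measure shared by the projects above can be sketched in a few lines of Python. This is a minimal illustration of the Jaccard similarity coefficient (size of the intersection divided by size of the union), not code from any listed repository. The function name `jaccard_similarity`, the sample URL sets, and the choice to treat two empty sets as identical are all assumptions made for this example.

```python
def jaccard_similarity(a, b):
    """Return len(A & B) / len(A | B) for two iterables, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Assumption: two empty sets count as identical (similarity 1.0).
        # Some implementations return 0.0 or raise an error instead.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical data: URLs associated with two keywords.
urls_for_keyword_a = {"example.com/a", "example.com/b", "example.com/c"}
urls_for_keyword_b = {"example.com/b", "example.com/c", "example.com/d"}

# Two shared URLs out of four distinct URLs overall.
score = jaccard_similarity(urls_for_keyword_a, urls_for_keyword_b)
print(score)  # 0.5
```

Exact computation like this compares every pair of sets directly; the MinHash and LSH projects in the list approximate the same quantity to avoid that cost on large collections.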