Identifying the boundaries of the main content of fiction and non-fiction works in the HathiTrust Extracted Features dataset.
scanned-documents
extracting-features
clustering-algorithm
digital-libraries
clustering-analysis
smoothing-methods
detecting-paratext-boundaries
Updated May 10, 2022 - Jupyter Notebook