Issues: adbar/trafilatura

Issues list

Remove deprecations (mostly CLI) [maintenance: Software compatibility and continuity]
#676 opened Aug 15, 2024 by adbar · 3 tasks · v2.0
Investigate spacing in element tails [question: Further information is requested]
#661 opened Jul 26, 2024 by adbar
Faulty extraction for very short documents [enhancement: New feature or request]
#660 opened Jul 26, 2024 by Psynbiotik
Add magic_html to benchmarks [evaluation: Evaluation scripts and data]
#650 opened Jul 18, 2024 by dantetemplar
Missing h1 heading if <header> outside of <article> [question: Further information is requested]
#642 opened Jul 11, 2024 by chrisgoddard
Links/URLs are not appearing using extract [feedback: Feedback from users requested]
#636 opened Jul 1, 2024 by alroythalus
Some extraction duplicated in XML [question: Further information is requested]
#634 opened Jun 27, 2024 by fortyfourforty
Deprecate Python 3.6 & 3.7 [maintenance: Software compatibility and continuity]
#630 opened Jun 26, 2024 by adbar · v2.0
Deprecate GUI in its current form (Gooey) [maintenance: Software compatibility and continuity]
#629 opened Jun 26, 2024 by adbar · v2.0
Sometimes, HTML tags remain in the string [bug: Something isn't working] [feedback: Feedback from users requested]
#627 opened Jun 23, 2024 by masylum
Parts of article block are sometimes not being extracted [feedback: Feedback from users requested]
#622 opened Jun 17, 2024 by naktinis
Image/Video caption and credits removal [documentation: Docs in need of update or extension] [question: Further information is requested]
#616 opened Jun 6, 2024 by hamsarajan
include_images=True is set, but there is no picture [bug: Something isn't working]
#610 opened May 31, 2024 by dark2star
Remove HTML doc pages from package and add instructions to build them [documentation: Docs in need of update or extension] [maintenance: Software compatibility and continuity]
#609 opened May 30, 2024 by adbar
New port of readability.js? [question: Further information is requested]
#604 opened May 23, 2024 by zirkelc
Add option to provide XPaths for content extraction [enhancement: New feature or request]
#596 opened May 16, 2024 by klvbdmh
Extracting content from a URL returns None [question: Further information is requested]
#586 opened May 5, 2024 by Fabiha15
Wrong links position in text from Telegram post [question: Further information is requested]
#585 opened May 4, 2024 by RedHotUnicorn
Extract text from buttons for semantic elements [question: Further information is requested]
#573 opened Apr 23, 2024 by zirkelc