Adding more extraction engines #71

Closed

codeperfectplus opened this issue Oct 31, 2022 · 0 comments

codeperfectplus commented Oct 31, 2022

Feature proposal (Draft)

Add more extraction engines. Some engines may be faster and others more accurate, so the user should be able to choose which engine to use.

Currently, there are two extraction engines (modules) for PDF.

PDFMiner-
PdfParser-

For Doc/Docx -

doc2text (Implemented)
docs - not yet
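
One way the user-selectable engine idea could look is a small registry keyed by file type and engine name. This is only a sketch: `register_engine`, `extract_text`, and the engine keys are hypothetical names rather than the project's actual API, and the extractors are placeholders that return a marker string instead of calling the real PDFMiner, PdfParser, or doc2text libraries.

```python
from typing import Callable, Dict

# Hypothetical type: an extractor takes a file path and returns extracted text.
ExtractFn = Callable[[str], str]

# Hypothetical registry: file type -> engine name -> extractor function.
_ENGINES: Dict[str, Dict[str, ExtractFn]] = {}


def register_engine(file_type: str, name: str, fn: ExtractFn) -> None:
    """Make an extractor available under a file type and engine name."""
    _ENGINES.setdefault(file_type, {})[name] = fn


def extract_text(path: str, file_type: str, engine: str) -> str:
    """Run the engine the user selected, or fail with the available choices."""
    engines = _ENGINES.get(file_type, {})
    if engine not in engines:
        available = ", ".join(sorted(engines)) or "none"
        raise ValueError(
            f"Unknown {file_type} engine {engine!r}; available: {available}"
        )
    return engines[engine](path)


def _placeholder(engine_name: str) -> ExtractFn:
    """Build a stand-in extractor; a real one would call the actual library."""
    def run(path: str) -> str:
        return f"[{engine_name} placeholder output for {path}]"
    return run


# Engines named in the issue, wired to placeholders (assumption, not real calls).
register_engine("pdf", "pdfminer", _placeholder("pdfminer"))
register_engine("pdf", "pdfparser", _placeholder("pdfparser"))
register_engine("docx", "doc2text", _placeholder("doc2text"))

if __name__ == "__main__":
    print(extract_text("sample.pdf", "pdf", "pdfminer"))
```

With this shape, adding a new engine would only need one more `register_engine` call, and an unsupported choice fails with a message listing the engines that are registered for that file type.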

codeperfectplus added the enhancement label

codeperfectplus closed this as completed
