Is your feature request related to a problem? Please describe.
I have a branded manual whose section titles are in a specific color. I would like to chunk the PDF into sections using color information.
Describe the solution you'd like
When using partition_pdf with the "fast" strategy, the color of the text should be stored in the element metadata (and the documentation should reflect it).
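A rough sketch of the chunking this would make possible, assuming each element exposed its text color. Everything here is hypothetical: `ColoredElement`, its `color` field, and `chunk_by_color` are stand-ins written for illustration and are not part of unstructured.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

RGB = Tuple[int, int, int]


@dataclass
class ColoredElement:
    """Hypothetical stand-in for a partitioned element that carries a text color."""
    text: str
    color: Optional[RGB] = None  # assumed metadata field, not an existing attribute


def chunk_by_color(
    elements: List[ColoredElement], title_color: RGB
) -> List[List[ColoredElement]]:
    """Start a new section every time an element in the title color appears."""
    sections: List[List[ColoredElement]] = []
    current: List[ColoredElement] = []
    for element in elements:
        # A title-colored element closes the section collected so far.
        if element.color == title_color and current:
            sections.append(current)
            current = []
        current.append(element)
    if current:
        sections.append(current)
    return sections


BRAND_RED: RGB = (200, 16, 46)  # example value; the real brand color would differ
BLACK: RGB = (0, 0, 0)

elements = [
    ColoredElement("Preface text", BLACK),
    ColoredElement("1. Installation", BRAND_RED),
    ColoredElement("Unpack the device.", BLACK),
    ColoredElement("Connect the cable.", BLACK),
    ColoredElement("2. Maintenance", BRAND_RED),
    ColoredElement("Clean the filter monthly.", BLACK),
]

sections = chunk_by_color(elements, BRAND_RED)
for section in sections:
    print(section[0].text, "->", len(section), "elements")
```

With this approach, text that appears before the first colored title ends up in its own leading section, and sections are never split by length, only by title color.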
Describe alternatives you've considered
I already tried the "by_title" chunking strategy, but some text.category values are wrong, or the section is chunked to approximately 500 characters despite setting max_partition to None.
Additional context
Using unstructured from the Docker image.