A multimodal, multilingual model for processing and understanding visually rich structured documents, introduced by Xu et al. (2021).
Basically, it's LayoutLMv2 initialized with InfoXLM. The model is pretrained in the same way as LayoutLMv2, but on multilingual documents.
The paper also introduces a human-annotated dataset of multilingual documents.
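
The "initialized with InfoXLM" idea can be illustrated with a toy warm-start routine: parameters that also exist in a pretrained text encoder are copied over, and everything else starts fresh. This is only a minimal sketch in plain Python. The parameter names, sizes, and the split between copied and freshly initialized parts are assumptions made for illustration, not the actual layout of either model.

```python
import random


def init_from_pretrained(target_sizes, pretrained, seed=0):
    """Build a parameter dict for a target model.

    A parameter is copied from `pretrained` when its name and size match;
    otherwise it is drawn from a small Gaussian.
    """
    rng = random.Random(seed)
    params, copied, fresh = {}, [], []
    for name, size in target_sizes.items():
        source = pretrained.get(name)
        if source is not None and len(source) == size:
            params[name] = list(source)  # reuse pretrained weights
            copied.append(name)
        else:
            params[name] = [rng.gauss(0.0, 0.02) for _ in range(size)]
            fresh.append(name)
    return params, copied, fresh


# Hypothetical parameter names and sizes, for illustration only.
text_encoder = {
    "text.embeddings": [0.1, 0.2, 0.3],
    "text.layer0": [0.4, 0.5],
}
document_model = {
    "text.embeddings": 3,
    "text.layer0": 2,
    "visual.backbone": 4,
    "layout.bbox_embeddings": 2,
}

params, copied, fresh = init_from_pretrained(document_model, text_encoder)
print("copied:", copied)
print("fresh:", fresh)
```

Under these assumptions, the text-side weights carry over the multilingual knowledge of the text encoder, while the document-specific parts still have to be learned during pretraining on multilingual documents.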