Replies: 21 comments 1 reply
-
You could modify the code here and remove the try: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/ppocr/data/pubtab_dataset.py#L96-L125
-
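With the try removed, if an image is problematic, will it not raise an error, and instead just skip the image that hangs and continue training with the next one?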
-
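Also, when I train on PubTabNet, WTW, and similar data, the accuracy sometimes drops abruptly from around 0.65 straight to 0.000. How can this be fixed? In addition, recognition seems very poor on complex tables with nested cells. How should I prepare data for those?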
-
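For recognizing complex nested cells, we are still working on improvements and cannot offer concrete optimization advice yet. We will publish any progress in PaddleOCR, so please keep an eye on our future work.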
-
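The try is there so that a read error on an individual image file does not make the training process exit. If you suspect that image loading is blocking, you will need to modify this code to investigate.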
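To illustrate the trade-off described above, here is a minimal sketch (not PaddleOCR's actual code) of a loader wrapper that keeps the try but logs every failure and caps the number of retries, so a bad sample neither kills training nor disappears silently. `get_with_logging`, `load_sample`, and `max_retries` are hypothetical names introduced for this example; `load_sample` stands in for whatever reads and transforms a single sample.

```python
import logging
import traceback

logger = logging.getLogger("dataset_debug")


def get_with_logging(load_sample, idx, num_samples, max_retries=3):
    """Load sample `idx`; on failure, log the reason and try the next index.

    `load_sample` is a hypothetical callable (index -> sample or None) that
    stands in for the dataset's real read-and-transform logic.
    """
    for attempt in range(max_retries + 1):
        current = (idx + attempt) % num_samples
        try:
            sample = load_sample(current)
        except Exception:
            # Keep training alive, but record which sample failed and why.
            logger.error("sample %d failed:\n%s", current, traceback.format_exc())
            continue
        if sample is not None:
            return sample
        # A None result is treated here as "rejected by the transforms".
        logger.warning("sample %d produced no output", current)
    # Bounded retries: fail loudly instead of looping forever on bad data.
    raise RuntimeError(
        f"no loadable sample after {max_retries + 1} attempts starting at index {idx}"
    )
```

The cap on retries is a design choice for debugging: if many consecutive samples are unusable, the run stops with a clear message rather than appearing to stall.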
-
I removed the try/except, and it no longer hangs for now.
-
That's strange. Was there no error either?
-
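No error either. It just hung occasionally, and after removing the try/except it stopped. The recognition quality still doesn't seem good, though.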
-
It seems it still hangs.
-
In that case, it looks like image reading may not be what is blocking.
-
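The current table recognition model does indeed perform poorly on cells that span multiple rows or columns. We are still working on improving it.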
-
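Then what could the problem be? At some point it simply stops making progress. It doesn't seem to hang when I leave out the images I generated myself.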
-
Can you pin down exactly which image causes the hang?
-
That's the problem: I don't know how to locate it.
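One possible way to locate it, as a rough sketch under assumptions: write a flushed log line before and after each image load, so that after a hang the path with a start line but no done line is the suspect. `traced_load`, `unfinished_loads`, and `load_fn` are hypothetical names for this example; `load_fn` stands in for whatever function actually reads the image.

```python
import sys
import time

START = "[load start] "
DONE = "[load done] "


def traced_load(load_fn, path, log=sys.stderr):
    """Call load_fn(path), writing a flushed log line before and after."""
    log.write(f"{START}{path}\n")
    log.flush()  # flush now so the line survives if the next call never returns
    t0 = time.perf_counter()
    result = load_fn(path)
    log.write(f"{DONE}{path} ({time.perf_counter() - t0:.3f}s)\n")
    log.flush()
    return result


def unfinished_loads(log_text):
    """Return the paths that have a start line but no matching done line."""
    pending = []
    for line in log_text.splitlines():
        if line.startswith(START):
            pending.append(line[len(START):])
        elif line.startswith(DONE):
            # Strip the trailing " (0.123s)" timing suffix to recover the path.
            path = line[len(DONE):].rsplit(" (", 1)[0]
            if path in pending:
                pending.remove(path)
    return pending
```

If the dataloader uses several worker processes, each worker may leave one pending path, so the result is a short list of candidates rather than a single file.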
-
Does the code here not print any message:
-
Nothing is printed. It just stops.
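When a process stops without printing anything, Python's standard `faulthandler` module can dump the stack of every thread after a timeout, which shows where the code is stuck. The sketch below is only an illustration: `start_hang_watchdog` and `stop_hang_watchdog` are hypothetical helper names, and the 600-second default is an arbitrary assumption. It covers only the process that calls it, so dataloader worker processes would need the same call.

```python
import faulthandler
import sys


def start_hang_watchdog(timeout_s=600.0, out=sys.stderr):
    """Dump all Python thread stacks to `out` every `timeout_s` seconds.

    `out` must be a real file with a file descriptor (for example sys.stderr
    or an open log file) and must stay open until the watchdog is stopped.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=out)


def stop_hang_watchdog():
    """Cancel the periodic dump, for example once training has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `start_hang_watchdog()` once at the start of the training script is enough; the dumps do not stop the program, they only report where each thread currently is.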
-
With 2000 images, training runs normally for 1-2 epochs, but on entering a new epoch it may get stuck somewhere.
-
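Then it is probably not the dataloader that is stuck, since one epoch of training has already iterated over all the training images. When it hangs, is GPU utilization normal?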
-
I didn't check the GPU. If the GPU is overloaded, how can that be resolved? Is the only option to shut it down and restart training?
-
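For now, the only option is to restart training. You can resume by loading the most recently saved checkpoint and continue from there. Also, the training failure may be caused by the environment, so you could try the official Paddle docker image; see the documentation: https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/install/docker/linux-docker.html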
-
Please provide the following information to quickly locate the problem
Please try not to include images in the issue.
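pubtab_dataset.py hangs while loading images for the model. I can't tell whether the cause is the labels or the images. How can I debug this, and how do I resolve the hang?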