Downloading crashes for questions whose titles contain a space, because of file-naming rules #6

Open
66my opened this issue May 2, 2024 · 1 comment

Comments


66my commented May 2, 2024

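When crawling sequentially, the whole program crashes as soon as it reaches certain questions.
Example URL 1: “https://www.zhihu.com/question/614902680/answer/3152426894 金融行业用 AI 做量化交易和高频交易靠谱吗?未来会如何发展 ?”
Example URL 2: “https://www.zhihu.com/question/622572713/answer/3221012170 如何看待某车企的内部规定,要求所有技术人员不能与供应商私自联系 ?”
The main difference between these answers and those under other questions is in “私自联系 ?” and “如何发展 ?”: there is an extra space before the final question mark. I suspect this is what prevents the save path from being resolved.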

The command I ran was python crawler.py --answer --MarkDown
The output below is from a run with the latest version of the code.

Traceback (most recent call last):
  File "D:\24Python\12_数据抓取\00知乎\240502\git-zhihu_spider_selenium-master\crawler.py", line 1142, in <module>
    zhihu()
  File "D:\24Python\12_数据抓取\00知乎\240502\git-zhihu_spider_selenium-master\crawler.py", line 1087, in zhihu
    crawl_answer_detail(driver)
  File "D:\24Python\12_数据抓取\00知乎\240502\git-zhihu_spider_selenium-master\crawler.py", line 954, in crawl_answer_detail
    with open(os.path.join(dircrea, nam + ".md"), 'w', encoding='utf-8') as obj:
FileNotFoundError: [Errno 2] No such file or directory: 'D:\\24Python\\12_数据抓取\\00知乎\\240502\\git-zhihu_spider_selenium-master\\answer\\金融行业用 AI 做量化交易和高频交易靠谱吗未来会如何发展 \\金融行业用 AI 做量化交易和.md'
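
One possible workaround, as a minimal sketch under stated assumptions: the directory name in the traceback above ends with a space, and Windows typically drops trailing spaces and dots when a directory is created, so a later path that still contains the space may not resolve. `safe_name` and `max_len` are hypothetical and not taken from crawler.py; only `dircrea` and `nam` are borrowed from the traceback, and the temporary directory merely stands in for the real `answer` folder.

```python
import os
import re
import tempfile

# Characters that Windows does not allow in file or directory names.
_INVALID_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_name(title: str, max_len: int = 100) -> str:
    """Hypothetical helper: turn a question title into a path-safe name."""
    # Remove invalid ASCII characters; also drop full-width question marks,
    # which the path in the traceback suggests are already being removed.
    cleaned = _INVALID_CHARS.sub("", title).replace("?", "")
    # Truncate first, then strip surrounding whitespace and any trailing
    # spaces or dots, so truncation cannot leave a trailing space behind.
    cleaned = cleaned[:max_len].strip().rstrip(" .")
    return cleaned or "untitled"


title = "金融行业用 AI 做量化交易和高频交易靠谱吗?未来会如何发展 ?"

with tempfile.TemporaryDirectory() as root:
    # Use the same sanitized name for creating the directory and for
    # building the file path, so both refer to the same location.
    dircrea = os.path.join(root, "answer", safe_name(title))
    os.makedirs(dircrea, exist_ok=True)
    nam = safe_name(title, max_len=16)  # illustrative length limit
    with open(os.path.join(dircrea, nam + ".md"), "w", encoding="utf-8") as obj:
        obj.write("# " + title + "\n")
```

Stripping after truncation matters here: cutting a title at a fixed length can itself end on a space, which would reintroduce the same problem.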

ZouJiu1 (Owner) commented May 2, 2024

This has already been fixed; see commit/08e826853b8f3868248c4d752c110a01fd5c2a4a
