why use byte not str while in path (Windows) #7

Open
lyksdu opened this issue May 2, 2020 · 0 comments

lyksdu commented May 2, 2020

def preprocess_and_write(params: Tuple[bytes, bytes, PrepConfig, str], bpe_data: Optional[BpeData] = None):

I am working with this repository on Windows.

I find that when a path contains Unicode characters, such as the Chinese in "./文档/", to_repr.py encodes the string to bytes, and this causes an exception.

The UTF-8 bytes b'\xe6\x96\x87\xe6\xa1\xa3.py' stand for "文档.py". On Windows this form is interpreted as a chain of nested folders, apparently because the backslashes in the escaped representation act as path separators, so Python's built-in os.path.basename does not recognize the file name. When the metadata is written to file, this raises a FileOrDirNotExist exception.

For now I have changed the path to str to avoid this exception, but I don't know whether that has any other side effects.
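
A minimal sketch of how this failure could arise, assuming the bytes path is at some point turned into text with str() rather than decoded (I have not confirmed this is what to_repr.py does). It uses the standard ntpath module so the Windows path rules can be reproduced on any platform. The variable names are only for illustration.

```python
import ntpath  # Windows path rules, importable on every platform

name = "文档.py"
raw = name.encode("utf-8")  # the UTF-8 bytes shown above

# Assumption: somewhere the bytes are converted with str(), which yields the
# escaped repr (b'\xe6...') instead of the original file name.
as_repr = str(raw)

# Under Windows rules every backslash in that repr is a path separator,
# so basename keeps only the text after the last one.
broken = ntpath.basename(as_repr)

# Decoding the bytes explicitly restores the real name before any path handling.
decoded = raw.decode("utf-8")
fixed = ntpath.basename(ntpath.join(".", "文档", decoded))

print(repr(broken))
print(fixed)
```

If this is indeed the cause, keeping paths as str (or decoding bytes explicitly before any path handling) sidesteps the problem, which matches the workaround described above.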
