This repository has been archived by the owner on Apr 27, 2021. It is now read-only.

Crawler for Cantonese pronunciation data on LSHK Jyutping Word List (香港語言學學會粵拼詞表)

sgalal/lshk-word-list-crawler

lshk-word-list-crawler

See sanitized.txt for the final result.

File structure

  • lshk.py: The crawler
  • result.txt: Raw result output by the crawler
  • sanitize.py: Sanitizer for the result
  • sanitized.txt: Final result output by the sanitizer
  • sanitize_log.txt: Log written by the sanitizer
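
The files above form a two-step pipeline: the crawler writes result.txt, and the sanitizer turns it into sanitized.txt plus a log. The sketch below illustrates what such a sanitizing step might look like. It is not the code in sanitize.py. The line format (a word, a tab, then space-separated Jyutping syllables), the function name `sanitize`, and the loose syllable check are all assumptions made for illustration.

```python
import re

# Assumption: a syllable is lowercase letters followed by a tone digit 1-6.
# This is a deliberately loose check, not the rule set used by sanitize.py.
SYLLABLE = re.compile(r"[a-z]+[1-6]")


def sanitize(lines):
    """Split raw lines into (kept, log).

    Assumed (hypothetical) input format: 'word<TAB>syllable syllable ...'.
    """
    kept, log = [], []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines silently
        parts = line.split("\t")
        if len(parts) != 2:
            log.append(f"line {number}: expected 2 tab-separated fields: {line!r}")
            continue
        word, reading = parts
        syllables = reading.split()
        if not syllables or not all(SYLLABLE.fullmatch(s) for s in syllables):
            log.append(f"line {number}: invalid Jyutping: {line!r}")
            continue
        # Normalize whitespace between syllables to single spaces.
        kept.append(f"{word}\t{' '.join(syllables)}")
    return kept, log
```

Under these assumptions, one would read result.txt into a list of lines, call `sanitize`, then write the first returned list to sanitized.txt and the second to sanitize_log.txt.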

License

According to the original terms, the dictionary data is distributed under CC BY 4.0.

The Python code in this repository is distributed under the MIT license.

Disclaimer

The link to the original word list is now broken. For a more up-to-date word list, see rime/rime-cantonese.