Data gathering #19
Comments
What about Unihan? With the hexadecimal codepoint we can get the glyph like this in Python:
A JS solution would be better, but since this is outside the scope of the project, we can do it any way we see fit. |
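The code from the comment above did not survive, so here is a minimal sketch of the idea, assuming the code point arrives as a hexadecimal string such as `897F` or `U+897F`. The function names `codepoint_to_glyph` and `glyph_to_codepoint` are hypothetical and only serve the illustration; `chr`, `int`, and `ord` are Python built-ins.

```python
def codepoint_to_glyph(hex_codepoint: str) -> str:
    """Return the character for a hexadecimal code point like '897F' or 'U+897F'."""
    cleaned = hex_codepoint.strip().upper()
    # Accept the Unihan-style "U+" prefix as well as a bare hex string.
    if cleaned.startswith("U+"):
        cleaned = cleaned[2:]
    return chr(int(cleaned, 16))


def glyph_to_codepoint(glyph: str) -> str:
    """Return the 'U+XXXX' notation for a single character (the reverse lookup)."""
    return f"U+{ord(glyph):04X}"


if __name__ == "__main__":
    print(codepoint_to_glyph("897F"))   # 西
    print(glyph_to_codepoint("西"))      # U+897F
```

The reverse helper is included because a dictionary keyed by glyph usually needs both directions when it is joined with Unihan data.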
Please check out:
|
Thanks for the link; cjk-unihan might be useful for other projects. I think it's better to limit the project to generating fonts and to outsource the data gathering/validation to another project. This way we stay focused and efficient. I'm closing this, as different users might have different needs and hence handcraft their own dictionaries. |
I reckon the JS solution is in the tobei/unihan code.
|
Did you gather the data? |
Not yet, could you work on a project to do so? |
Yup. See also peterolson/hanzi-tools#1 (comment) |
@hugolpz I think you have a typo in your comment, there is a ratio of 1:10 between node-pinyin and unihan characters/phonetic pairs. Can you confirm/correct this number? |
We can get the codepoint using |
We are currently looking for a database with entries like
{ "glyph": "西", "phonetic": "xī" }
(or xi1, or alternatives).
Possible sources, information to complete:
Moedict
Unicode :
CJKlib
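As a rough illustration of how the Unicode source could be turned into the `{ "glyph": ..., "phonetic": ... }` records described above, here is a minimal sketch. It assumes input lines in the tab-separated layout of the Unihan readings file (`U+XXXX<TAB>field<TAB>value`) and keeps only the `kMandarin` field. The function name `parse_unihan_readings` and the sample lines are hypothetical; this is not code from the project.

```python
import json


def parse_unihan_readings(lines, field_name="kMandarin"):
    """Build {"glyph", "phonetic"} records from Unihan-style tab-separated lines."""
    records = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            continue
        codepoint, field, value = fields
        if field != field_name or not codepoint.startswith("U+"):
            continue
        glyph = chr(int(codepoint[2:], 16))
        # A value may hold several space-separated readings; emit one record each.
        for phonetic in value.split():
            records.append({"glyph": glyph, "phonetic": phonetic})
    return records


if __name__ == "__main__":
    # Illustrative sample lines, not an excerpt checked against a specific Unihan release.
    sample = [
        "# comment line",
        "U+897F\tkMandarin\txī",
        "U+897F\tkCantonese\tsai1",
    ]
    print(json.dumps(parse_unihan_readings(sample), ensure_ascii=False))
```

Emitting one record per reading keeps the output flat, which makes it easier to compare pair counts between sources such as node-pinyin and Unihan.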