-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
src: introducing to_unicode #11
Conversation
include/ada/idna/to_unicode.h
Outdated
std::string to_unicode(const std::string_view& input); | ||
} // namespace ada::idna | ||
|
||
#endif // ADA_IDNA_TO_UNICODE_H |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will complain here that the file does end with an empty line. :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have a clang-format rule for this :)
@miguelteixeiraa Can you check the CI errors? I think your PR looks good. It might even be correct (minus the build errors). I am not 100% sure what to_unicode is supposed to do. Maybe you have checked the specification? I haven't. |
You have a number of files that do not end with an empty line. I do not care personally but that is a big deal to some. (Very old editors that nobody use anymore required that text files ended with an empty line. It seems to have since become a cult with the dogma that an empty final line is 'good'.) |
Yes, I've checked: Based on: https://www.rfc-editor.org/rfc/inline-errata/rfc3490.html So the NAMEPREP thing is missing (which involves things like normalization, and checks for the label validity (sizes...etc etc )) I'm doing 1, 3, 4, 5, 6, 7 (through the tests), 8
This is the algo to be applied to each label. It is basically the reverse operation that is being done for to_ascii. |
@lemire I have a question about our utf8_punycode_alternating.txt |
Anyways, I know how to fix it |
@miguelteixeiraa No punycode by itself does not start by xn--. See https://en.wikipedia.org/wiki/Punycode or https://datatracker.ietf.org/doc/html/rfc3492 The prefix |
src/punycode.cpp
Outdated
@@ -22,6 +23,40 @@ static constexpr char digit_to_char(int32_t digit) { | |||
return digit < 26 ? char(digit + 97) : char(digit + 22); | |||
} | |||
|
|||
static constexpr bool begins_with(std::string_view view, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have multiple definitions of begins_with and is_ascii. The final PR should clean that up and have just one definition per function.
I read more carefully your code and it looks pretty good. |
Make sure to add... #include <algorithm> at the top of |
146d689
to
fc378ca
Compare
I just mess up everything and I will reopen it. |
Ref: #4
Sorry for the delay folks! Still need more tests + add some comments, but how does it look like until now?