Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve update-locales script and fix locale processing bug #23240

Merged
merged 4 commits into from
Mar 2, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
45 changes: 40 additions & 5 deletions build/update-locales.sh
Original file line number Diff line number Diff line change
@@ -1,14 +1,49 @@
#!/bin/sh
#!/bin/bash

set -e

SED=sed

if [[ $OSTYPE == 'darwin'* ]]; then
# for macOS developers, use "brew install gnu-sed"
SED=gsed
fi
Copy link
Member

@silverwind silverwind Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could just do the same detection that Makefile does and adapt sed command syntax based on whether it is GNU or BSD sed. Better than to require the installation of additional software :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll check it out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. Actually I have thought about this problem, but I didn't choose the sed -i'' for BSD sed, because the script has been here for long time, nobody really needs it, I just want to avoid any inconsistency. And I think this PR is the last fix by modifying the script 😂 in the future we should decide whether use Go to rewrite or totally replace the ini package.

The sed -i'' might also work, just like it in Makefile, but I haven't tested it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more explanations about my "worry about inconsistency": this script is hard to debug, and not covered by any test (unlike the sed in Makefile, any error can be found in the first time and gets fixed quickly). If there is any inconsistency between GNU sed and BSD sed, the problem can not be found until next crowdin sync runs, and it's difficult for macOS users to debug the inconsistency if they are using a different version.

If there is no inconsistency, then using sed -i'' is fine for BSD sed.

Just share some of my thoughts, for your information, no objection nor block. 😁


if [ ! -f ./options/locale/locale_en-US.ini ]; then
echo "please run this script in the root directory of the project"
exit 1
fi

mv ./options/locale/locale_en-US.ini ./options/

# Make sure to only change lines that have the translation enclosed between quotes
sed -i -r -e '/^[a-zA-Z0-9_.-]+[ ]*=[ ]*".*"$/ {
s/^([a-zA-Z0-9_.-]+)[ ]*="/\1=/
s/\\"/"/g
# the "ini" library for locale has many quirks
# * `a="xx"` gets `xx` (no quote)
# * `a=x\"y` gets `x\"y` (no unescaping)
# * `a="x\"y"` gets `"x\"y"` (no unescaping, the quotes are still there)
# * `a='x\"y'` gets `x\"y` (no unescaping, no quote)
# * `a="foo` gets `"foo` (although the quote is not closed)
# * 'a=`foo`' works like single-quote
# crowdin needs the strings to be quoted correctly and doesn't like incomplete quotes
# crowdin always outputs quoted strings if there are quotes in the strings.

# this script helps to unquote the crowdin outputs for the quirky ini library
# * find all `key="...\"..."` lines
# * remove the leading quote
# * remove the trailing quote
# * unescape the quotes
# * eg: key="...\"..." => key=..."...
$SED -i -r -e '/^[-.A-Za-z0-9_]+[ ]*=[ ]*".*"$/ {
s/^([-.A-Za-z0-9_]+)[ ]*=[ ]*"/\1=/
s/"$//
s/\\"/"/g
}' ./options/locale/*.ini

# * if the escaped line is incomplete like `key="...` or `key=..."`, quote it with backticks
# * eg: key="... => key=`"...`
# * eg: key=..." => key=`..."`
$SED -i -r -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*(".*[^"])$/\1=`\2`/' ./options/locale/*.ini
$SED -i -r -e 's/^([-.A-Za-z0-9_]+)[ ]*=[ ]*([^"].*")$/\1=`\2`/' ./options/locale/*.ini

# Remove translation under 25% of en_us
baselines=$(wc -l "./options/locale_en-US.ini" | cut -d" " -f1)
baselines=$((baselines / 4))
Expand Down
Loading