allow Relaxed to match punycode TLDs
For example, it should match "test.xn--8y0a063a" just like it matches
"test.联通".

Instead of doubling the size of the regexp by adding the punycode
version of every known TLD, simply match any valid punycode string which
follows "xn--". It's highly unlikely that this would cause false
positives.

Fixes #27.
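
The punycode alternative described above can be tried in isolation. The sketch below anchors the commit's `xn--[a-z0-9-]+` pattern to a single TLD label so its behavior is easy to see; the anchoring and the helper name `isPunycodeTLD` are illustrative assumptions, not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// punycodeTLD anchors the pattern from this commit to one whole label.
// The (?i) flag mirrors the case-insensitive group used in relaxedExp.
var punycodeTLD = regexp.MustCompile(`^(?i)xn--[a-z0-9-]+$`)

// isPunycodeTLD is a hypothetical helper, not part of the xurls API.
// It checks only the shape of the label; it does not decode it.
func isPunycodeTLD(label string) bool {
	return punycodeTLD.MatchString(label)
}

func main() {
	labels := []string{"xn--8y0a063a", "XN--8Y0A063A", "xn-foo", "xn--", "xn--zzz"}
	for _, label := range labels {
		fmt.Printf("%q: %v\n", label, isPunycodeTLD(label))
	}
}
```

The pattern accepts any run of letters, digits, and hyphens after `xn--`, so a label such as `xn--zzz` passes even though it may not be a delegated TLD. That is the trade-off the message accepts in exchange for a much smaller expression.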
mvdan committed Jul 15, 2019
1 parent 32cda0c commit 776b0d8
Showing 2 changed files with 8 additions and 1 deletion.
4 changes: 3 additions & 1 deletion xurls.go
@@ -72,7 +72,9 @@ func strictExp() string {
 }

 func relaxedExp() string {
-	site := domain + `(?i)` + anyOf(append(TLDs, PseudoTLDs...)...) + `(?-i)`
+	punycode := `xn--[a-z0-9-]+`
+	knownTLDs := anyOf(append(TLDs, PseudoTLDs...)...)
+	site := domain + `(?i)(` + punycode + `|` + knownTLDs + `)(?-i)`
 	hostName := `(` + site + `|` + ipAddr + `)`
 	webURL := hostName + port + `(/|/` + pathCont + `?|\b|$)`
 	return strictExp() + `|` + webURL
5 changes: 5 additions & 0 deletions xurls_test.go
@@ -188,6 +188,11 @@ func TestRegexes(t *testing.T) {
 		{`foo.onion`, true},
 		{`中国.中国`, true},
 		{`中国.中国/foo中国`, true},
+		{`test.联通`, true},
+		{`test.xn--8y0a063a`, true},
+		{`test.xn--8y0a063a/foobar`, true},
+		{`test.xn-foo`, nil},
+		{`test.xn--`, nil},
 		{`foo.com/`, true},
 		{`1.1.1.1`, true},
 		{`10.50.23.250`, true},
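
The new cases can be exercised outside the package with a rough stand-in. The sketch below assumes that `true` means the whole input is expected to match and `nil` means no match is expected; that reading is inferred from the cases themselves. `relaxedSketch` is a heavily simplified substitute for the real expression (ASCII-only, three alternatives), and `check` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// relaxedSketch is a simplified stand-in for the Relaxed expression:
// a tiny TLD list plus bare-bones domain and path parts, for illustration only.
var relaxedSketch = regexp.MustCompile(
	`(?i)([a-z0-9-]+\.)+(xn--[a-z0-9-]+|com|onion)(/[a-z0-9]*)?`)

// testCase assumes want is either true (whole input matches) or nil (no match).
type testCase struct {
	in   string
	want interface{}
}

// check reports whether the first match found in c.in agrees with c.want.
func check(c testCase) bool {
	got := relaxedSketch.FindString(c.in)
	if b, ok := c.want.(bool); ok && b {
		return got == c.in
	}
	return got == ""
}

func main() {
	cases := []testCase{
		{`foo.onion`, true},
		{`test.xn--8y0a063a`, true},
		{`test.xn--8y0a063a/foobar`, true},
		{`test.xn-foo`, nil},
		{`test.xn--`, nil},
		{`foo.com/`, true},
	}
	for _, c := range cases {
		fmt.Printf("%-28s want=%v ok=%v\n", c.in, c.want, check(c))
	}
}
```

The two `nil` cases guard the edges of the pattern: a single hyphen after `xn` is not the punycode prefix, and `xn--` with nothing after it fails the `+` quantifier.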