cmd/xurls: don't use bufio.Scanner to scan the input
It was a convenient way to obtain the input in chunks, so that the tool
could print urls incrementally without having to read the entirety of
the input at once.

Unfortunately, we failed to notice that bufio.Scanner has a hard limit on
the size of each "token". In our particular case, it meant that any
sequence of many thousands of input bytes without any whitespace could
make the tool error out.

Instead, use bufio.Reader, which grows a buffer to fit the data being
read. Go back to reading one line at a time, since its read methods can
only stop at a single delimiter byte such as '\n', and not at any of
several bytes such as all whitespace characters.

Fixes #28.
mvdan committed Jul 18, 2019
1 parent 776b0d8 commit 9058190
Showing 2 changed files with 13 additions and 8 deletions.
19 changes: 12 additions & 7 deletions cmd/xurls/main.go
@@ -7,6 +7,7 @@ import (
 	"bufio"
 	"flag"
 	"fmt"
+	"io"
 	"os"
 	"regexp"
 
@@ -41,15 +42,19 @@ func scanPath(re *regexp.Regexp, path string) error {
 		defer f.Close()
 		r = f
 	}
-	scanner := bufio.NewScanner(r)
-	scanner.Split(bufio.ScanWords)
-	for scanner.Scan() {
-		word := scanner.Text()
-		for _, match := range re.FindAllString(word, -1) {
-			fmt.Println(match)
+	bufr := bufio.NewReader(r)
+	for {
+		line, err := bufr.ReadBytes('\n')
+		for _, match := range re.FindAll(line, -1) {
+			fmt.Printf("%s\n", match)
+		}
+		if err == io.EOF {
+			break
+		} else if err != nil {
+			return err
 		}
 	}
-	return scanner.Err()
+	return nil
 }
 
 func main() {
2 changes: 1 addition & 1 deletion generate/schemesgen/main.go
@@ -34,7 +34,7 @@ func schemeList() []string {
 	}
 	defer resp.Body.Close()
 	r := csv.NewReader(resp.Body)
-	r.Read() //ignore headers
+	r.Read() // ignore headers
 	schemes := make([]string, 0)
 	for {
 		record, err := r.Read()
