
Benchmarks with hyperfine #129

Closed
dbrgn opened this issue Aug 12, 2020 · 10 comments · Fixed by #163

Comments

@dbrgn
Owner

dbrgn commented Aug 12, 2020

https://github.com/sharkdp/hyperfine

As suggested in #38 (comment)

@dbrgn dbrgn mentioned this issue Aug 12, 2020
@CrispyBaguette

I ran a quick benchmark comparing tealdeer, the C client, and the npm client. I'll run a more exhaustive benchmark when (if?) I find some time.

hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' 'tldr tar' 'tldr-c-client/tldr tar' 'npm-tldr/node_modules/.bin/tldr tar'

Benchmark #1: tldr tar
  Time (mean ± σ):      67.4 ms ±  10.9 ms    [User: 4.5 ms, System: 6.2 ms]
  Range (min … max):    47.2 ms …  97.7 ms    24 runs
 
Benchmark #2: tldr-c-client/tldr tar
  Time (mean ± σ):      61.4 ms ±   6.4 ms    [User: 4.2 ms, System: 5.1 ms]
  Range (min … max):    49.4 ms …  75.4 ms    21 runs
 
Benchmark #3: npm-tldr/node_modules/.bin/tldr tar
  Time (mean ± σ):      1.131 s ±  0.083 s    [User: 682.0 ms, System: 102.4 ms]
  Range (min … max):    1.027 s …  1.315 s    10 runs
 
Summary
  'tldr-c-client/tldr tar' ran
    1.10 ± 0.21 times faster than 'tldr tar'
   18.42 ± 2.34 times faster than 'npm-tldr/node_modules/.bin/tldr tar'

@ve-nt

ve-nt commented May 6, 2021

Hi, I came here after reading #145. I have written a TLDR client called Outfieldr that, according to my own hyperfine benchmarks, runs around 30 times faster than Tealdeer. Here is the hyperfine log, which can also be found in the README. I chose the same candidates to compare against as Tealdeer does.

Benchmark #1: ./bin/bash_client ip
  Time (mean ± σ):      15.0 ms ±   1.6 ms    [User: 13.2 ms, System: 3.9 ms]
  Range (min … max):    13.4 ms …  19.5 ms    183 runs

Benchmark #2: ./bin/c_client ip
  Time (mean ± σ):       3.6 ms ±   0.5 ms    [User: 2.4 ms, System: 1.8 ms]
  Range (min … max):     3.2 ms …   6.0 ms    493 runs

Benchmark #3: ./bin/go_client ip
  Time (mean ± σ):      92.6 ms ±   1.2 ms    [User: 87.5 ms, System: 5.0 ms]
  Range (min … max):    91.8 ms …  96.9 ms    31 runs

Benchmark #4: ./bin/outfieldr ip
  Time (mean ± σ):       0.1 ms ±   0.0 ms    [User: 0.4 ms, System: 0.3 ms]
  Range (min … max):     0.0 ms …   0.6 ms    1094 runs

Benchmark #5: ./bin/tealdeer ip
  Time (mean ± σ):       3.2 ms ±   0.2 ms    [User: 2.1 ms, System: 1.6 ms]
  Range (min … max):     3.0 ms …   5.4 ms    522 runs

Benchmark #6: ./bin/node_modules/tldr/bin/tldr ip
  Time (mean ± σ):     471.0 ms ±   5.6 ms    [User: 480.1 ms, System: 49.3 ms]
  Range (min … max):   465.9 ms … 481.8 ms    10 runs

Summary
  './bin/outfieldr ip' ran
   34.06 ± 15.16 times faster than './bin/tealdeer ip'
   38.35 ± 17.62 times faster than './bin/c_client ip'
  160.36 ± 72.35 times faster than './bin/bash_client ip'
  987.72 ± 433.12 times faster than './bin/go_client ip'
 5021.80 ± 2201.98 times faster than './bin/node_modules/tldr/bin/tldr ip'

My best guess is that a large bottleneck for Tealdeer is its dynamic linking. I'm guessing this because I used to link against libcurl in my own client, and this added ~2 ms to its runtime. Here's what Tealdeer was linking against in its release build as of 0716808:

~/Code/repos/tealdeer $ ldd target/release/tldr 
	linux-vdso.so.1 (0x00007ffc5bdbe000)
	libssl.so.1.1 => /usr/lib/libssl.so.1.1 (0x00007f793acb7000)
	libcrypto.so.1.1 => /usr/lib/libcrypto.so.1.1 (0x00007f793a9d9000)
	libdl.so.2 => /usr/lib/libdl.so.2 (0x00007f793a9d2000)
	libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0x00007f793a9b8000)
	libpthread.so.0 => /usr/lib/libpthread.so.0 (0x00007f793a997000)
	libc.so.6 => /usr/lib/libc.so.6 (0x00007f793a7ca000)
	/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007f793b1c4000)

Replacing libcurl with a pure Zig implementation, so that I could build a static executable, removed this 2 ms and brought my average time back down to 0.1 ms. I'm sure that Tealdeer could benefit from this as well by using pure-Rust dependencies.
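For the Rust side, one possible direction is a sketch like the following, not a tested change: swap the OpenSSL-backed TLS for rustls and build against musl so the binary links statically. The exact crate and feature names depend on which HTTP client tealdeer actually uses; the `reqwest` features below are an assumption for illustration.

```toml
# Hypothetical Cargo.toml tweak (assumes an HTTP client like reqwest);
# "rustls-tls" replaces the default OpenSSL-backed TLS backend, removing
# the libssl/libcrypto entries from the ldd output above.
[dependencies]
reqwest = { version = "0.11", default-features = false, features = ["blocking", "rustls-tls"] }
```

With no OpenSSL dependency, `cargo build --release --target x86_64-unknown-linux-musl` should be able to produce a fully static binary, which `ldd` reports as "not a dynamic executable".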

Anyway, I'd appreciate it if you added my client to your benchmarks section. If you haven't used Zig before and need help building the source, feel free to reach out.

@sondr3
Contributor

sondr3 commented May 9, 2021

As a further comment on speed: I agree that dynamic linking is one factor, though I'd wager that the major difference is how Rust does printing compared to Zig and others; if you search for "Rust printing slow" you'll find many mentions of this. Just a quick change from `println!` to `writeln!` on a locked stdout improves the performance to

Benchmark #1: ./target/release/tldr ip
  Time (mean ± σ):       0.6 ms ±   0.2 ms    [User: 1.2 ms, System: 0.2 ms]
  Range (min … max):     0.4 ms …   3.0 ms    808 runs

And this was done on a busy laptop. There are a few other things that I think might be a problem (Docopt being one of them).

diff --git a/src/formatter.rs b/src/formatter.rs
index e2553c4..5095a8a 100644
--- a/src/formatter.rs
+++ b/src/formatter.rs
@@ -1,6 +1,6 @@
 //! Functions related to formatting and printing lines from a `Tokenizer`.
 
-use std::io::BufRead;
+use std::io::{BufRead, Write};
 
 use ansi_term::{ANSIString, ANSIStrings};
 use log::debug;
@@ -68,6 +68,8 @@ where
     R: BufRead,
 {
     let mut command = String::new();
+    let stdout = std::io::stdout();
+    let mut stdout = stdout.lock();
     while let Some(token) = tokenizer.next_token() {
         match token {
             LineType::Empty => {
@@ -83,13 +85,18 @@ where
                 command = title;
                 debug!("Detected command name: {}", &command);
             }
-            LineType::Description(text) => println!("  {}", config.style.description.paint(text)),
-            LineType::ExampleText(text) => println!("  {}", config.style.example_text.paint(text)),
+            LineType::Description(text) => {
+                writeln!(stdout, "  {}", config.style.description.paint(text)).unwrap()
+            }
+            LineType::ExampleText(text) => {
+                writeln!(stdout, "  {}", config.style.example_text.paint(text)).unwrap()
+            }
             LineType::ExampleCode(text) => {
-                println!("      {}", &format_code(&command, &text, &config))
+                writeln!(stdout, "      {}", &format_code(&command, &text, &config)).unwrap()
             }
             LineType::Other(text) => debug!("Unknown line type: {:?}", text),
         }
     }
-    println!();
+    writeln!(stdout).unwrap();
+    stdout.flush().unwrap();
 }
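For readers unfamiliar with the pattern in the diff above, here is a minimal, self-contained sketch: lock stdout once and write through a single buffered handle instead of calling `println!`, which re-locks stdout on every call. The names here are illustrative, not from the tealdeer codebase.

```rust
use std::io::{self, BufWriter, Write};

/// Write all lines through one writer. Taking a generic `W: Write` means
/// the same function works with locked stdout, a file, or a `Vec<u8>`.
fn print_lines<W: Write>(mut out: W, lines: &[&str]) -> io::Result<()> {
    for line in lines {
        writeln!(out, "  {}", line)?;
    }
    out.flush()
}

fn main() -> io::Result<()> {
    let stdout = io::stdout();
    // Lock once, buffer once: one syscall per buffer flush instead of
    // one lock acquisition (and write) per line.
    let handle = BufWriter::new(stdout.lock());
    print_lines(handle, &["first example line", "second example line"])
}
```

The same locking idea is what the diff applies inside tealdeer's formatter; the buffering is an extra step that can help further when many lines are printed.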

@niklasmohrin
Collaborator

@ve-nt Thank you for your input! The following PR looks very promising to me

The benchmarks in the README are out of date; we should maybe remove them 🤔 I have been fiddling around with benchmarking in Docker in another PR, but I am not happy with it yet. We will make sure to inspect your Zig client and include it if it is feasible.

@dbrgn
Owner Author

dbrgn commented May 9, 2021

Docopt being one of them

Docopt will definitely be removed in the future (see #108).

@niklasmohrin
Collaborator

@ve-nt You will have to help me out a bit if you want us to include Outfieldr. I failed to build the client with the latest stable release (0.7). I saw that you mention 0.8-dev in your README, so I tried with the master branch of the language (the base image I wanted to use doesn't have a 0.8 version tagged), but the build fails there as well. This is what the relevant part of the Dockerfile looks like:

FROM euantorano/zig:master AS zig-builder

WORKDIR /build-outputs
RUN apk add git \
        && git clone https://gitlab.com/ve-nt/outfieldr.git \
        && cd outfieldr \
        && git submodule init \
        && git submodule update \
        && zig build -Drelease-safe

@ve-nt

ve-nt commented Jun 3, 2021

@niklasmohrin One of the submodules needed to be updated after a recent change in the standard library. I've done this and pushed; it should compile on 0.8.0-dev.2729+87dae0ce9 now.

@ve-nt

ve-nt commented Jun 3, 2021

I don't know whether you want to build in release-safe or release-fast for the benchmarks. I benchmarked against release-fast; however, I expect that users of the program would want a release-safe build. The difference in performance between the two builds was pretty minor in my measurements.

@niklasmohrin
Collaborator

Yeah, I just used release-safe; see the PR if you want to review the benchmark.

@dbrgn
Copy link
Owner Author

dbrgn commented Jun 6, 2021

It makes sense to use the build variant that would be used by packagers as well. I assume it's release-safe.
