
Use fuzzing to drive property testing (#7)
* Initial work toward using a fuzzer as a QC tool

There are some problems with the available QC tools, at least for the
purposes of this project. Namely:

 * the Rust testing framework limits the amount of concurrency
   available, starving some tests of runtime
 * there is no distribution available
 * test-case generation can be slow

Considering we're randomly searching through a state space, the name of
the game is speed. To that end, I have in mind a notion of
QuickChecking which operates first in fuzzer mode--quickly throwing
random nonsense at your program to find crashes and track branch
conditions--and, once a crash has been found, switches into QC mode to
shrink the failing case. This commit does not implement that two-mode
operation; it only shuffles code around to run a property test under
AFL. There is no shrinking.
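
As a rough sketch (not code from this commit) of what running a
property test under AFL looks like, assuming the `afl` crate's `fuzz!`
macro and a throwaway property on HashMap:

    #[macro_use]
    extern crate afl;

    use std::collections::HashMap;

    fn main() {
        fuzz!(|data: &[u8]| {
            // Interpret each pair of fuzzer-provided bytes as an insert and
            // check a trivial property: a just-inserted key is retrievable.
            let mut map: HashMap<u8, u8> = HashMap::new();
            for pair in data.chunks(2) {
                if pair.len() == 2 {
                    map.insert(pair[0], pair[1]);
                    assert_eq!(map.get(&pair[0]), Some(&pair[1]));
                }
            }
        });
    }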

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* Adapt approach to honggfuzz, avoid allocation crashes

This commit adjusts the fuzzer from AFL to honggfuzz. Honggfuzz has the
advantage of being multi-threaded by default but, honestly, the
initial big draw was the easy hook-in to lldb. The AFL fuzzer from the
previous commit 'detected' HashMap panicking when we requested too much
capacity, but I found that an easy call to "cargo hfuzz run-debug
hash_map" made the failure much clearer.

The main change here is around the Reserve operation, now limited to a
smallish reservation bump. This avoids the case where HashMap panics on
a call to its `reserve`, being unable to signal that its allocation
failed. The model is also careful about requesting too much capacity,
as noted. If HashMap published its overhead per pair we could calculate
when the allocations would fail, but it, reasonably, does not publish
such information.
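
A minimal sketch of the guarded Reserve handling described above (the
same guard appears in the harness later in this diff), written as a
standalone helper for illustration:

    use std::collections::HashMap;

    // Reserve only when the requested bump cannot overflow the current
    // capacity; on stable, `reserve` has no way to report a failed
    // allocation and simply panics.
    fn guarded_reserve(map: &mut HashMap<u16, u16>, n: u16) {
        if map.capacity().checked_add(n as usize).is_some() {
            map.reserve(n as usize);
        }
        // Once `try_reserve` stabilizes, the failure case could be
        // asserted on instead of skipped.
    }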

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* Resolve test build failure

I neglected to remove quickcheck from `lib.rs`, though it is no longer
a project dependency. This caused the test build of the project to fail.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* Introduce FiniteBuffer

This commit introduces FiniteBuffer into the hash map
fuzz target. FiniteBuffer, unlike RingBuffer, does not produce an
infinite amount of data but limits production to the total number of
bytes available in `data`. This change means we have to guard each
Arbitrary call, and it's possible that, if the fuzzer is not configured
correctly, we'll only check a small number of operations in total. But
it is quite a bit faster, and fuzzers that do minimization -- like
AFL -- are able to eat into the shrink step needed for proper
quickchecking.
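
The guard pattern, sketched here using the FiniteBuffer and Arbitrary
API exactly as the harness in this diff uses it:

    use arbitrary::*;

    fn sketch(data: &[u8]) {
        if let Ok(mut ring) = FiniteBuffer::new(data, 4048) {
            // Every draw can fail once `data` is exhausted, so each value
            // is pulled behind an `if let Ok(..)` and we bail out early
            // when the input runs dry.
            let first: u8 = if let Ok(b) = Arbitrary::arbitrary(&mut ring) {
                b
            } else {
                return;
            };
            let _ = first;
        }
    }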

This is promising, I think.

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* Back off aggressive use of clippy

Clippy has changed its lint names to the scoped form. This doesn't play
well with CI pipelines using the stable release until the 'tool_lint'
feature lands. It seems this is targeted for the 2018 edition, so this
change will surely be reverted in the near term.
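
For reference, the unscoped spelling kept in lib.rs versus the scoped
spelling the names are moving to (a sketch; the `clippy::all` group
name is an assumption, not something this commit uses):

    // Unscoped form, as used in lib.rs and usable on stable for now:
    #[cfg_attr(feature = "cargo-clippy", deny(clippy))]
    pub mod stdlib;

    // Scoped "tool lint" form:
    // #[deny(clippy::all)]
    // pub mod stdlib;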

Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
blt committed Sep 9, 2018
1 parent debe3ba commit 2b6450d
Showing 5 changed files with 202 additions and 154 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,3 +1,4 @@
/target
**/*.rs.bk
Cargo.lock
resources/
9 changes: 7 additions & 2 deletions Cargo.toml
@@ -3,5 +3,10 @@ name = "bughunt-rust"
version = "0.1.0-pre"
authors = ["Brian L. Troutwine <brian@troutwine.us>"]

[dev-dependencies]
quickcheck = "0.6"
[dependencies]
honggfuzz = "0.5"
arbitrary = { git = "https://github.com/blt/rust_arbitrary.git" }

[[bin]]
path = "src/bin/stdlib/collections/hash_map.rs"
name = "hash_map"
117 changes: 117 additions & 0 deletions src/bin/stdlib/collections/hash_map.rs
@@ -0,0 +1,117 @@
#[macro_use]
extern crate honggfuzz;
extern crate arbitrary;
extern crate bughunt_rust;

use arbitrary::*;
use bughunt_rust::stdlib::collections::hash_map::*;
use std::collections::HashMap;

fn main() {
    loop {
        fuzz!(|data: &[u8]| {
            if let Ok(mut ring) = FiniteBuffer::new(data, 4048) {
                let hash_seed: u8 = if let Ok(hs) = Arbitrary::arbitrary(&mut ring) {
                    hs
                } else {
                    return;
                };
                // Why is capacity not usize? We're very likely to request a
                // capacity so large that the HashMap cannot allocate enough
                // slots to store them, resulting in a panic when we call
                // `with_capacity_and_hasher`. This is a crash, but an
                // uninteresting one.
                //
                // We also request a low-ish capacity, all but guaranteeing we'll
                // force reallocation during execution.
                //
                // See note on [`hash_map::Op::Reserve`] for details
                let capacity: u8 = if let Ok(cap) = Arbitrary::arbitrary(&mut ring) {
                    cap
                } else {
                    return;
                };

                let mut model: PropHashMap<u16, u16> = PropHashMap::new();
                let mut sut: HashMap<u16, u16, BuildTrulyAwfulHasher> =
                    HashMap::with_capacity_and_hasher(
                        capacity as usize,
                        BuildTrulyAwfulHasher::new(hash_seed),
                    );

                while let Ok(op) = Arbitrary::arbitrary(&mut ring) {
                    match op {
                        Op::Clear => {
                            // Clearing a HashMap removes all elements but keeps
                            // the memory around for reuse. That is, the length
                            // should drop to zero but the capacity will remain the
                            // same.
                            let prev_cap = sut.capacity();
                            sut.clear();
                            model.clear();
                            assert_eq!(0, sut.len());
                            assert_eq!(sut.len(), model.len());
                            assert_eq!(prev_cap, sut.capacity());
                        }
                        Op::ShrinkToFit => {
                            // NOTE There is no model behaviour here
                            //
                            // After a shrink the capacity may or may not shift from
                            // the passed arg `capacity`. But, the capacity of the
                            // HashMap should never grow after a shrink.
                            //
                            // Similarly, the length of the HashMap prior to a
                            // shrink should match the length after a shrink.
                            let prev_len = sut.len();
                            let prev_cap = sut.capacity();
                            sut.shrink_to_fit();
                            assert_eq!(prev_len, sut.len());
                            assert!(sut.capacity() <= prev_cap);
                        }
                        Op::Get { k } => {
                            let model_res = model.get(&k);
                            let sut_res = sut.get(&k);
                            assert_eq!(model_res, sut_res);
                        }
                        Op::Insert { k, v } => {
                            let model_res = model.insert(k, v);
                            let sut_res = sut.insert(k, v);
                            assert_eq!(model_res, sut_res);
                        }
                        Op::Remove { k } => {
                            let model_res = model.remove(&k);
                            let sut_res = sut.remove(&k);
                            assert_eq!(model_res, sut_res);
                        }
                        Op::Reserve { n } => {
                            // NOTE There is no model behaviour here
                            //
                            // When we reserve we carefully check that we're not
                            // reserving into overflow territory. When
                            // `#![feature(try_reserve)]` is available we can
                            // make use of `try_reserve` on the SUT
                            if sut.capacity().checked_add(n as usize).is_some() {
                                sut.reserve(n as usize);
                            } // else { assert!(sut.try_reserve(*n).is_err()); }
                        }
                    }
                    // Check invariants
                    //
                    // `HashMap<K, V>` defines the return of `capacity` as
                    // being "the number of elements the map can hold
                    // without reallocating", noting that the number is a
                    // "lower bound". This implies that:
                    //
                    // * the HashMap capacity must always be at least the
                    //   length of the model
                    assert!(sut.capacity() >= model.len());
                    // If the SUT is empty then the model must be.
                    assert_eq!(model.is_empty(), sut.is_empty());
                    // The length of the SUT must always be exactly the length of
                    // the model.
                    assert_eq!(model.len(), sut.len());
                }
            }
        })
    }
}
15 changes: 1 addition & 14 deletions src/lib.rs
@@ -1,6 +1,4 @@
#[cfg(test)]
extern crate quickcheck;

extern crate arbitrary;
#[deny(warnings)]
#[deny(bad_style)]
#[deny(missing_docs)]
@@ -9,15 +7,4 @@ extern crate quickcheck;
#[deny(rust_2018_compatibility)]
#[deny(rust_2018_idioms)]
#[deny(unused)]
#[cfg_attr(feature = "cargo-clippy", deny(clippy))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_pedantic))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_perf))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_style))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_complexity))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_correctness))]
#[cfg_attr(feature = "cargo-clippy", deny(clippy_cargo))]
// We allow 'stuttering' as the structure of the project will mimic that of
// stdlib. For instance, QC tests for HashMap will appear in a module called
// `hash_map`.
#[cfg_attr(feature = "cargo-clippy", allow(stutter))]
pub mod stdlib;
214 changes: 76 additions & 138 deletions src/stdlib/collections/hash_map.rs
@@ -1,4 +1,5 @@
//! Tests for `std::collections::HashMap`
use arbitrary::*;
use std::hash::{BuildHasher, Hash, Hasher};
use std::mem;

@@ -152,148 +153,85 @@ where
}
}

#[cfg(test)]
mod test {
use super::*;
use quickcheck::{Arbitrary, Gen, QuickCheck, TestResult};
use std::collections::HashMap;

// The `Op<K, V>` defines the set of operations that are available against
// `HashMap<K, V>` and `PropHashMap<K, V>`. Some map directly to functions
// available on the types, others require a more elaborate interpretation
// step. See `oprun` below for details.
#[derive(Clone, Debug)]
enum Op<K, V> {
ShrinkToFit,
Clear,
Reserve { n: usize },
Insert { k: K, v: V },
Remove { k: K },
Get { k: K },
}
/// The `Op<K, V>` defines the set of operations that are available against
/// `HashMap<K, V>` and `PropHashMap<K, V>`. Some map directly to functions
/// available on the types, others require a more elaborate interpretation
/// step.
#[derive(Clone, Debug)]
pub enum Op<K, V> {
/// This operation triggers `std::collections::HashMap::shrink_to_fit`
ShrinkToFit,
/// This operation triggers `std::collections::HashMap::clear`
Clear,
/// This operation triggers `std::collections::HashMap::reserve`
Reserve {
/// Reserve `n` capacity elements
///
/// Why is this not usize? It's possible we'll try to reserve
/// too much underlying capacity, forcing the `reserve` call to panic
/// when the underlying allocation fails. The HashMap does not document
/// its overhead per pair and `try_reserve` is still behind a feature
/// flag. So, we only reserve smallish capacity and hope for the best.
n: u16,
},
/// This operation triggers `std::collections::HashMap::insert`
Insert {
/// The key to be inserted
k: K,
/// The value to be inserted
v: V,
},
/// This operation triggers `std::collections::HashMap::remove`
Remove {
/// The key to be removed
k: K,
},
/// This operation triggers `std::collections::HashMap::get`
Get {
/// The key to look up
k: K,
},
}

impl<K: 'static, V: 'static> Arbitrary for Op<K, V>
impl<K, V> Arbitrary for Op<K, V>
where
K: Clone + Send + Arbitrary,
V: Clone + Send + Arbitrary,
{
fn arbitrary<U>(u: &mut U) -> Result<Self, U::Error>
where
K: Clone + Send + Arbitrary,
V: Clone + Send + Arbitrary,
U: Unstructured + ?Sized,
{
fn arbitrary<G>(g: &mut G) -> Op<K, V>
where
G: Gen,
{
// ================ WARNING ================
//
// `total_enum_fields` is a goofy annoyance but it should match
// _exactly_ the number of fields available in `Op<K, V>`. If it
// does not then we'll fail to generate `Op` variants for use in our
// QC tests.
let total_enum_fields = 6;
let variant = g.gen_range(0, total_enum_fields);
match variant {
0 => {
let k: K = Arbitrary::arbitrary(g);
let v: V = Arbitrary::arbitrary(g);
Op::Insert { k, v }
}
1 => {
let k: K = Arbitrary::arbitrary(g);
Op::Remove { k }
}
2 => {
let k: K = Arbitrary::arbitrary(g);
Op::Get { k }
}
3 => Op::ShrinkToFit,
4 => Op::Clear,
5 => {
let n: usize = Arbitrary::arbitrary(g);
Op::Reserve { n }
}
_ => unreachable!(),
// ================ WARNING ================
//
// `total_enum_fields` is a goofy annoyance but it should match
// _exactly_ the number of fields available in `Op<K, V>`. If it
// does not then we'll fail to generate `Op` variants for use in our
// QC tests.
let total_enum_fields = 6;
let variant: u8 = Arbitrary::arbitrary(u)?;
let op = match variant % total_enum_fields {
0 => {
let k: K = Arbitrary::arbitrary(u)?;
let v: V = Arbitrary::arbitrary(u)?;
Op::Insert { k, v }
}
}
}

#[test]
fn oprun() {
fn inner(hash_seed: u8, capacity: usize, ops: Vec<Op<u16, u16>>) -> TestResult {
let mut model: PropHashMap<u16, u16> = PropHashMap::new();
let mut sut: HashMap<u16, u16, BuildTrulyAwfulHasher> =
HashMap::with_capacity_and_hasher(capacity, BuildTrulyAwfulHasher::new(hash_seed));
for op in &ops {
match op {
Op::Clear => {
// Clearing a HashMap removes all elements but keeps
// the memory around for reuse. That is, the length
// should drop to zero but the capacity will remain the
// same.
let prev_cap = sut.capacity();
sut.clear();
model.clear();
assert_eq!(0, sut.len());
assert_eq!(sut.len(), model.len());
assert_eq!(prev_cap, sut.capacity());
}
Op::ShrinkToFit => {
// NOTE There is no model behaviour here
//
// After a shrink the capacity may or may not shift from
// the passed arg `capacity`. But, the capacity of the
// HashMap should never grow after a shrink.
//
// Similarly, the length of the HashMap prior to a
// shrink should match the length after a shrink.
let prev_len = sut.len();
let prev_cap = sut.capacity();
sut.shrink_to_fit();
assert_eq!(prev_len, sut.len());
assert!(sut.capacity() <= prev_cap);
}
Op::Get { k } => {
let model_res = model.get(k);
let sut_res = sut.get(&k);
assert_eq!(model_res, sut_res);
}
Op::Insert { k, v } => {
let model_res = model.insert(*k, *v);
let sut_res = sut.insert(*k, *v);
assert_eq!(model_res, sut_res);
}
Op::Remove { k } => {
let model_res = model.remove(k);
let sut_res = sut.remove(k);
assert_eq!(model_res, sut_res);
}
Op::Reserve { n } => {
// NOTE There is no model behaviour here
//
// When we reserve we carefully check that we're not
// reserving into overflow territory. When
// `#![feature(try_reserve)]` is available we can
// make use of `try_reserve` on the SUT
if sut.capacity().checked_add(*n).is_some() {
sut.reserve(*n);
} // else { assert!(sut.try_reserve(*n).is_err()); }
}
}
// Check invariants
//
// `HashMap<K, V>` defines the return of `capacity` as
// being "the number of elements the map can hold
// without reallocating", noting that the number is a
// "lower bound". This implies that:
//
// * the HashMap capacity must always be at least the
// length of the model
assert!(sut.capacity() >= model.len());
// If the SUT is empty then the model must be.
assert_eq!(model.is_empty(), sut.is_empty());
// The length of the SUT must always be exactly the length of
// the model.
assert_eq!(model.len(), sut.len());
1 => {
let k: K = Arbitrary::arbitrary(u)?;
Op::Remove { k }
}
TestResult::passed()
}
QuickCheck::new().quickcheck(inner as fn(u8, usize, Vec<Op<u16, u16>>) -> TestResult)
2 => {
let k: K = Arbitrary::arbitrary(u)?;
Op::Get { k }
}
3 => Op::ShrinkToFit,
4 => Op::Clear,
5 => {
let n: u16 = Arbitrary::arbitrary(u)?;
Op::Reserve { n }
}
_ => unreachable!(),
};
Ok(op)
}
}
