Skip to content

A Go cover tree implementation for nearest-neighbour search and clustering

License

Notifications You must be signed in to change notification settings

mandykoh/go-covertree

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

go-covertree

GoDoc Go Report Card Build Status

go-covertree is a cover tree implementation in Go for nearest-neighbour search and clustering. It uses an extensible backing store interface (suitable to adapting to key-value stores, RDBMSes, etc) to support very large data sets.

For further horizontal scaling, a partitioned store implementation supports sharding across multiple underlying stores using a partitioning function.

See the API documentation for more details.

This software is made available under an MIT license.

Thread safety

Tree instances are thread-safe for readonly access.

Insertions into the tree (using Insert) are purely append-only operations, and safe to make concurrently, allowing tree construction to be parallelised.

Searching the tree (using FindNearest) is purely a read-only operation and safe to do concurrently, including with insertions.

Removals from the tree (using Remove) are not thread-safe and should be externally synchronised if concurrent read-write access is required.

Store implementations should observe their own thread-safety considerations.

Example usage

Define a type to be stored in the tree:

type Point struct {
    X float64
    Y float64
}

Define a DistanceFunc to compute the distance between two instances of the type:

func distanceBetween(a, b interface{}) float64 {
    p1 := a.(*Point)
    p2 := b.(*Point)
	
    distX := p1.X - p2.X
    distY := p1.Y - p2.Y
	
    return math.Sqrt(distX * distX + distY * distY)
}

Create a Tree. A tree using a provided in-memory store can be conveniently created using NewInMemoryTree:

tree := covertree.NewInMemoryTree(basis, rootDistance, distanceBetween)

The basis specifies the logarithmic base for determining the ratio of coverage of nodes at adjacent levels of the tree. If unsure, values around 2.0 may be good starting points.

The rootDistance specifies the maximum expected distance between nodes (actually the minimum distance between root nodes) and determines when new root nodes are created. This should generally be set to the largest distance between nodes expected for your data set.

Custom Store implementations can also use the basic Tree constructor to create trees:

// Creates a tree that is backed by a specific store
tree, err := covertree.NewTreeWithStore(pointStore, basis, rootDistance, distanceBetween)       

Insert some things into the tree:

err := tree.Insert(&Point{1.5, 3.14})

Find the 5 nearest things in the store that are within 10.0 of a query point:

results, err := tree.FindNearest(&Point{0.0, 0.0}, 5, 10.0)

Remove things from the store:

removed, err := tree.Remove(&Point{1.5, 3.14})

About

A Go cover tree implementation for nearest-neighbour search and clustering

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages