Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce memory allocation in bulk op #56

Merged
merged 1 commit into from
Nov 3, 2017
Merged

Reduce memory allocation in bulk op #56

merged 1 commit into from
Nov 3, 2017

Conversation

feliixx
Copy link

@feliixx feliixx commented Nov 2, 2017

Use memory pooling to reuse bulkActions and avoid some allocations

Here are some benchmarks results:

benchmark                 old ns/op     new ns/op     delta
BenchmarkBulkInsert-2     4441204       4473256       +0.72%

benchmark                 old allocs     new allocs     delta
BenchmarkBulkInsert-2     1967           1945           -1.12%

benchmark                 old bytes     new bytes     delta
BenchmarkBulkInsert-2     89338         37986         -57.48%

and here is the benchmark code:

func BenchmarkBulkInsert(b *testing.B) {
	var documents [1000]bson.Raw
	for i := range documents {
		data, _ := bson.Marshal(bson.M{"n": i})
		documents[i] = bson.Raw{Kind: bson.ElementDocument, Data: data}
	}
	s, err := mgo.Dial("mongodb://localhost:27017")
	if err != nil {
		b.Fail()
	}
	c := s.DB("bench").C("test")
	b.ResetTimer()

	for n := 0; n < b.N; n++ {
		bulk := c.Bulk()
		bulk.Unordered()
		for _, doc := range documents {
			bulk.Insert(doc)
		}
		_, err := bulk.Run()
		if err != nil {
			b.Fail()
		}
	}
}

bulk.go Outdated
@@ -294,6 +306,9 @@ func (b *Bulk) Run() (*BulkResult, error) {
break
}
}
action.idxs = action.idxs[0:0]
action.docs = action.docs[0:0]
actionPool.Put(action)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be above the if !ok check - if an operation fails the actions leak.

@domodwyer
Copy link

Hey @feliixx,

Good idea and an easy win! Just need to return the action back to the pool before the check but quite happy to merge after.

Dom

szank
szank previously approved these changes Nov 2, 2017
  use memory pooling to reuse bulkActions and
  avoid some allocations
@feliixx
Copy link
Author

feliixx commented Nov 3, 2017

Hi @domodwyer

thanks or the review, it should be fine now

@domodwyer domodwyer merged commit 7cd0b89 into globalsign:development Nov 3, 2017
@domodwyer
Copy link

Fantastic, cheers @feliixx!

@domodwyer domodwyer changed the title reduce memory allocation in bulk op Reduce memory allocation in bulk op Nov 6, 2017
@domodwyer domodwyer mentioned this pull request Nov 6, 2017
@feliixx feliixx deleted the memoryPool branch November 23, 2017 09:44
@domodwyer domodwyer mentioned this pull request Jan 15, 2018
domodwyer added a commit that referenced this pull request Jan 15, 2018
Includes:

* Reduced memory in bulk operations (#56)
* Native x509 authentication (#55)
* Better connection recovery (#69)
* Example usage (#75 and #78)

Thanks to:

* @bachue
* @csucu 
* @feliixx


---
[Throughput overview](https://user-images.githubusercontent.com/9275968/34954403-3d3253dc-fa18-11e7-8eef-0f2b0f21edc3.png)

Select throughput has increased by ~600 requests/second with slightly increased variance:
```
x => r2017.11.06-select-zipfian-throughput.log 
y => 9acbd68-select-zipfian-throughput.log 

     n   min   max median  average   stddev      p99
x 3600 49246 71368  66542 66517.26 2327.675 70927.01
y 3600 53304 72005  67151 67145.36 2448.534 71630.00

          62000       64000       66000        68000       70000       72000    
 |----------+-----------+-----------+------------+-----------+-----------+-----|
                              +---------+--------+                              
1          -------------------|         |        |--------------------          
                              +---------+--------+                              
                                 +---------+---------+                          
2    ----------------------------|         |         |--------------------      
                                 +---------+---------+                          
Legend: 1=data$x, 2=data$y

At 95% probablitiy:
===> average is statistically significant     (p=0.000000, diff ~628.094444)
===> variance is statistically significant    (p=0.002398)

```

* [insert-latency.txt](https://github.com/globalsign/mgo/files/1632474/insert-latency.txt)
* [insert-throughput.txt](https://github.com/globalsign/mgo/files/1632475/insert-throughput.txt)
* [select-zipfian-latency.txt](https://github.com/globalsign/mgo/files/1632476/select-zipfian-latency.txt)
* [select-zipfian-throughput.txt](https://github.com/globalsign/mgo/files/1632477/select-zipfian-throughput.txt)
* [update-zipfian-latency.txt](https://github.com/globalsign/mgo/files/1632478/update-zipfian-latency.txt)
* [update-zipfian-throughput.txt](https://github.com/globalsign/mgo/files/1632479/update-zipfian-throughput.txt)

Note: latencies are approximations calculated from grouped data
libi pushed a commit to libi/mgo that referenced this pull request Dec 1, 2022
Includes:

* Reduced memory in bulk operations (globalsign#56)
* Native x509 authentication (globalsign#55)
* Better connection recovery (globalsign#69)
* Example usage (globalsign#75 and globalsign#78)

Thanks to:

* @bachue
* @csucu 
* @feliixx


---
[Throughput overview](https://user-images.githubusercontent.com/9275968/34954403-3d3253dc-fa18-11e7-8eef-0f2b0f21edc3.png)

Select throughput has increased by ~600 requests/second with slightly increased variance:
```
x => r2017.11.06-select-zipfian-throughput.log 
y => 9acbd68-select-zipfian-throughput.log 

     n   min   max median  average   stddev      p99
x 3600 49246 71368  66542 66517.26 2327.675 70927.01
y 3600 53304 72005  67151 67145.36 2448.534 71630.00

          62000       64000       66000        68000       70000       72000    
 |----------+-----------+-----------+------------+-----------+-----------+-----|
                              +---------+--------+                              
1          -------------------|         |        |--------------------          
                              +---------+--------+                              
                                 +---------+---------+                          
2    ----------------------------|         |         |--------------------      
                                 +---------+---------+                          
Legend: 1=data$x, 2=data$y

At 95% probablitiy:
===> average is statistically significant     (p=0.000000, diff ~628.094444)
===> variance is statistically significant    (p=0.002398)

```

* [insert-latency.txt](https://github.com/globalsign/mgo/files/1632474/insert-latency.txt)
* [insert-throughput.txt](https://github.com/globalsign/mgo/files/1632475/insert-throughput.txt)
* [select-zipfian-latency.txt](https://github.com/globalsign/mgo/files/1632476/select-zipfian-latency.txt)
* [select-zipfian-throughput.txt](https://github.com/globalsign/mgo/files/1632477/select-zipfian-throughput.txt)
* [update-zipfian-latency.txt](https://github.com/globalsign/mgo/files/1632478/update-zipfian-latency.txt)
* [update-zipfian-throughput.txt](https://github.com/globalsign/mgo/files/1632479/update-zipfian-throughput.txt)

Note: latencies are approximations calculated from grouped data
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants