Add task pool #37
base: master
Conversation
Currently I have a return slice and just map the values with a function. I will change this, however, to let users manage return data themselves.
I have done some benchmarks and will commit them later. As of right now, the task pool implementation is about 4x faster.
Thanks for your contribution @Azer0s!
I made a few comments.
Refactored everything according to the comments. Shall I add some docs to the README? @samber
Yes please! Can you also update the benchmark at the bottom of the README?
@samber Done. Could you look through the documentation I added?
…task_pool # Conflicts: # README.md
@samber I added another helper function called `With`. This is useful for the situations described in the README and for some situations with the task pool.
…task_pool # Conflicts: # README.md
please merge @samber
Good job! But instead of the method-based API, I think expandable optional parameters might be more suitable:

```go
func Map[T any, R any](collection []T, iteratee func(T, int) R, options ...*ParallelOption) []R {
}
```

```go
// Usage cases:

// set max concurrency count with an option
parallel.Map(collection, callback, parallel.Option().Concurrency(20))

// normal calling
parallel.Map(collection, callback)

// expandable to more options in the future
options := parallel.Option()
options.Concurrency(10)
options.Timeout(2 * time.Second)
options.Retries(3)
parallel.Map(collection, callback, options)
```

By the way, this design with optional parameters is well-practiced in other language communities:

Node.js (p-map):
```js
await pMap(array, callback, { concurrency: 20 })
```

Node.js (Bluebird):
```js
await Bluebird.map(array, callback, { concurrency: 20 })
```

Node.js (Prray):
```js
await Prray.from(array).mapAsync(callback, { concurrency: 20 })
```
@Bin-Huang Ok I found a solution that imo is a bit more elegant. Could you take a look? @Bin-Huang @samber |
```go
for i, item := range collection {
	go func(_item T, _i int) {
		res := iteratee(_item, _i)
```

```go
var DefaultPoolSize = runtime.NumCPU() / 2
```
If `runtime.NumCPU()` returns 1, then `DefaultPoolSize` will be zero, won't it?
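If that is the concern, a simple guard would keep the default at a minimum of one worker. This is a sketch of one possible fix, not the PR's actual code:

```go
package main

import (
	"fmt"
	"runtime"
)

// DefaultPoolSize is half the CPU count, clamped to a minimum of 1 so
// single-CPU machines still get a working pool.
var DefaultPoolSize = maxInt(runtime.NumCPU()/2, 1)

// maxInt returns the larger of a and b (Go predates a builtin max
// for ints until 1.21, so a helper is used here).
func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

func main() {
	fmt.Println(DefaultPoolSize >= 1) // true on any machine
}
```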
I know some use cases where this will cause confusing default behavior around the default pool size.

Say someone wants to request 10 URLs in parallel on a device with 2 CPUs. They will write code like this:

```go
responses := parallel.Map(URLs, request)
```

In this case, requesting a URL is an IO-intensive task, so for the best performance and efficiency all 10 URLs should be requested at the same time, which is what the code above looks like it does to the package user.

But `DefaultPoolSize` works internally, and it is 1 (half of the CPU count), so actually only one request task runs at a time. The user will then be confused about why the code is 10x slower.

The task pool can improve performance only if the developer knows what they are doing and adjusts the pool size thoughtfully for their specific scenario. A global default pool size can't improve performance in every scenario, and often makes things worse (if it could, the Go runtime's GMP scheduler would do it internally).

I recommend removing the default pool size and only using the task pool when the concurrency limit option is set.
I think this PR is very useful. Are we planning to merge it? 😂
Would love to see this feature. It's a bit of a blocker to have unlimited goroutines getting spun up when using the parallel functions.
These methods are very useful. When will they be merged?
@kcmvp @Bin-Huang is this still relevant? If so I can fix the conflicts and merge |
This is more of a prototype right now, but it is already a bit faster than the normal lop.Map implementation.
Benchmarks run on an M1 MacBook Pro (2021).