Let's say you're implementing a map function to run in parallel, how do you decide how many threads to spawn. There seems to be a pretty heavy cost to spawning a thread per cell when the list is very long, instead of having each thread process a few cells each. What's this topic called? What do I need to search for best practices?
@neauoire
IMHO this goes deep into algorithm engineering. And likely depends highly on your architecture and even the function you are applying.
E.g. for sorting algorithms, quicksort performs badly if the size of your data exceeds the CPU cache (and then again if it exceeds a memory page etc.).
This is because quicksort jumps around in memory a lot.
Merge sort on the other hand work mostly linear, so it performs better independent of the concrete numbers for cache and mem sizes.
That said: Depending on the kind of calculation you are doing, it might be easiest to just test the performance of the code with different numbers of threads.
But I am curious about other answers :blobcatgiggle:
@neauoire
IMHO this goes deep into algorithm engineering. And likely depends highly on your architecture and even the function you are applying.
E.g. for sorting algorithms, quicksort performs badly if the size of your data exceeds the CPU cache (and then again if it exceeds a memory page etc.).
This is because quicksort jumps around in memory a lot.