Data replication
- Only a few program parameters need to be replicated on every process. The
amplitude array, which holds the bulk of the data, does not. Therefore, data
replication is not a problem here.
Load balancing
- All points require equal work, so the points should be divided equally amongst the
processes.
- Cyclic? Block?
- Both types of decomposition split the workload evenly, to within one point.
- Cyclic and block do equally well for load balancing.
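
A minimal sketch of the load-balancing claim above, assuming the points are simply numbered 0 to npoints-1. The helper names block_counts and cyclic_counts, and the sizes used, are hypothetical and chosen only for illustration.

```python
def block_counts(npoints, nprocs):
    """Points per process when each process gets one contiguous block."""
    base, extra = divmod(npoints, nprocs)
    # The first `extra` processes take one additional point each.
    return [base + (1 if rank < extra else 0) for rank in range(nprocs)]


def cyclic_counts(npoints, nprocs):
    """Points per process when points are dealt out round-robin."""
    counts = [0] * nprocs
    for i in range(npoints):
        counts[i % nprocs] += 1
    return counts


npoints, nprocs = 10, 4  # illustrative sizes, not taken from the program
print(block_counts(npoints, nprocs))   # [3, 3, 2, 2]
print(cyclic_counts(npoints, nprocs))  # [3, 3, 2, 2]
```

In both cases the largest and smallest shares differ by at most one point, so neither decomposition has an advantage for load balancing.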
Communication
- Cyclic?
- Cyclic decomposition deals out the points so that neighboring points
always land on different processes, so every point needs communication to
reach its neighbors. This isn't what we want!
- Block?
- Block decomposition leaves contiguous data points on the same process.
Only the point at each end of the block requires communication, so the
larger the block size, the smaller the percentage of communication
overhead.
- Cyclic requires far more communication than block.
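
The comparison above can be sketched by counting how many pairs of neighboring points end up on different processes, since each such pair implies communication. This is only an illustration under assumed sizes; block_owner, cyclic_owner, and cross_process_pairs are hypothetical helper names.

```python
def block_owner(i, npoints, nprocs):
    """Rank that owns point i under a block decomposition."""
    base, extra = divmod(npoints, nprocs)
    # The first `extra` ranks hold (base + 1) points each.
    boundary = extra * (base + 1)
    if i < boundary:
        return i // (base + 1)
    return extra + (i - boundary) // base


def cyclic_owner(i, npoints, nprocs):
    """Rank that owns point i under a cyclic (round-robin) decomposition."""
    return i % nprocs


def cross_process_pairs(owner, npoints, nprocs):
    """Count neighbor pairs (i, i+1) whose two points live on different ranks."""
    return sum(
        owner(i, npoints, nprocs) != owner(i + 1, npoints, nprocs)
        for i in range(npoints - 1)
    )


npoints, nprocs = 1000, 4  # illustrative sizes, not taken from the program
print(cross_process_pairs(block_owner, npoints, nprocs))   # 3
print(cross_process_pairs(cyclic_owner, npoints, nprocs))  # 999
```

With block decomposition only the block boundaries cross processes (nprocs - 1 pairs), while with cyclic decomposition every neighboring pair does.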
Conclusion: domain decomposition should be block by position.
