Skip to main content

more options


Function or Data

Function: divide the computation into disjoint or unassociated operations

  • All the work in this problem is accomplished in one loop. It would be contrived (and less efficient!) to split it up.

Data: give each process a different subset of the domain

  • Dimensions: position (i), time (t)
  • Which data can be computed concurrently in the update of A(i,t+1)? - test for data independence:
      A(i,t+1) = 2.0*A(i,t) - A(i,t-1)
         + c * (A(i-1,t) - 2.0*A(i,t) + A(i+1,t))
  • Does (t+1) depend on (t), at (i)? - yes
  • Does (i) depend on (i-1), at (t+1)? - no
  • The test shows that t has a data dependence but i doesn't
  • Conclusion: parallelize by decomposition along i
  • What data will be required?
    • Neighboring points
    • Small amount of communication across domain boundaries

Function: divide the computation into disjoint or unassociated operations

  • All the work in this problem is accomplished in one loop. It would be contrived (and less efficient!) to split it up.

Data: give each process a different subset of the domain

  • The two dimensions for this problem are position (i) and time (t)
  • Which data can be computed concurrently in the update of A(i,t+1)?
    • TEST FOR DATA INDEPENDENCE:
      In a loop over an array index, updates to the array can be done concurrently IF the elements do not require the values of elements computed in a previous iteration of the index.
      Examples:
      F(I) = F(I-1) is NOT independent
      F(I) = F(I)*2 is independent
      F(I) = G(I-1) is independent
    • Let's take a look at our problem...
           A(i,t+1) = 2.0*A(i,t) - A(i,t-1)
              + c * (A(i-1,t) - 2.0*A(i,t) + A(i+1,t))
  • Time fails this test: A(i,t+1) requires A(i,t) and A(i,t-1).
    This is called a data dependence. (It seems pretty intuitive that a simulation can't be decomposed by time—this is just a more concrete test. But note that there are methods known as FDTD, or "finite difference time domain," that can be decomposed along the time axis, without violating cause-and-effect!)
  • Position passes this test:
    A(i,t+1) does not require any other values at (t+1).
  • Conclusion: parallelize by decomposition along i
  • What data are required?
    • Amplitude depends on values at neighboring points for the previous timestep.
      A(i,t+1) requires A(i-1,t) and A(i+1,t)
    • This will result in communication overhead to communicate across domain boundaries, plus idle time spent waiting for processes with slightly higher work loads to catch up.