First 5 minutes of hell

Scalability = the property of the parallel algorithm to keep the parallel optimality if both and grow or shrink.

Amdahl’s Law for saturation of parallelizability

    • at each sequential algorithm , there is an inherently sequential fraction (executed only on 1 thread - I/O operations, initialization etc.) and the the parallelizable part
    • the sequential part is not parallelizable, so it bounds the speedup (at some point, adding more processors does not increase the speedup anymore)
    • essentially saying that if we have fixed problem size , there is only a limited amount of parallelism to do (it cannot be scaled to the infinity)

Gustavson’s Law

  • Setup: split the parallel execution time into t_seq (sequential part) and t_par(n,p) (parallel part, executed on p processors)
    • let be the fraction of the parallel execution time spent on the sequential part:
    • then the speedup simplifies to:
  • limit (assuming grows with while stays fixed):
    • Gustavson assumes that as we get more processors, we scale the problem size (rather than keeping it fixed), under this assumption, linear speedup is achievable

Isoefficiency functions

  • when preserving the efficiency, we need to define the best ratio of processors and the problem size , and the isoefficiency functions define the upper and lower bound of this relationship
  • for a given constant efficiency :
    • the minimum n needed for a given p:
      • Gustavson essentially says that if the problem size grows according to the , the efficiency will be sufficiently good
    • the maximum p allowed for a given n:
      • = for a fixed size , how many processors can I use while still maintaining the efficiency at least
      • it is the largest function, so any more processors beyond this and the efficiency drops below (we will be wasting them)
        • this is what the Amdahl’s Law is saying, if I use more processors, I “saturate” the parallelism and adding more processors won’t help
        • the is the saturation point, where I am hitting the ceiling

Parallel scalability

  • definition: scalability is the property of a parallel algorithm to keep the parallel optimality if both and grow or shrink
    • the scalability expresses that larger problems can be solved in the same time as smaller problems if sufficient is utilized
    • it’s not always about having a maximum number of processing units available
      • it’s the scalability issue, we have to scale with to keep all cores busy all the time
    • T - time, E - efficiency, S - speedup
  • there are two types of scalability:
    • strong scalability: measures the capability of a parallel algorithm for fixed to achieve linear speedup with increasing p (Amdahl’s law puts strong limits to this)
      • alternatively: strong scalability is the measure of efficiency decrease if increases while is fixed
    • weak scalability: defines how the parallel time varies with for fixed
      • alternatively: weak scalability is the measure of growth of such that a fixed efficiency is preserved when increases

Amdahl’s Law

  • each sequential calculation has some part of it inherently sequential, call it
    • the remaining part of could be parallelized
    • we assume the perfect parallelization without any overhead
  • derivation: let be the sequential time, with
    • under perfect parallelization the parallel time is
    • dividing by yields the formula below
  • the formula ( is fixed, varies): - the law:
    • the parallel speedup will always be bounded by no matter how many threads are used
    • it measures the parallel algorithm efficiency decrease if is increasing while is fixed
      • that every added processor has more marginal benefit
      • after a certain value of , adding new processors does not make sense, there is not enough parallel work to do (see the above plots)
        • that is called the saturation limit
    • it basically says that a fixed size problem provides a limited amount of parallellism (there is a limit of a reasonable number of parallel threads to execute it)
  • examples:
    • if , the speedup will be for any
      • no matter how many processors in parallel I use, the speedup will be at most 10 times faster (because the non-parallelizable 10 % of sequential time will be a bottleneck)
    • if , the speedup for any

Gustafson’s Law

  • weak scalability: if we scale and linearly together, a constant efficiency can be maintained
    • and thus the linear speedup and optimal cost
    • meaning, if we add more computing units , we should also scale up the problem size
    • it measures the growth when we preserve the efficiency while increasing
  • the non-parallelizable sequential portion stays constant (I/O operations, initialization…), so as we add processors and proportionally grow the problem , the parallel portion scales linearly, hence overall speedup is linear in p
  • the formula (here is the sequential fraction of the parallel execution time on the scaled problem, not of the serial time):
  • reconciliation with Amdahl: the two laws are not contradictory, they answer different questions
    • Amdahl fixes and asks how speedup behaves with - strong scaling, pessimistic
    • Gustafson fixes time-per-processor and lets grow with - weak scaling, optimistic
    • the apparent disagreement comes from where is measured: Amdahl on the serial execution, Gustafson on the scaled parallel execution

Isoefficiency functions

  • when preserving the efficiency of the parallel algorithm, isoefficiency functions define the lowerbound and upperbound of the and relationship
  • we can define some fixed efficiency that we want to achieve, e.g. 0.2 (20 %):
    • the isoefficiency functions specify:
      • the minimum n needed for a given p:
      • the maximum p allowed for a given n:
    • to maintain the given efficiency guarantee
  • we ask: “what minimum efficiency do I want to guarantee?”
    • and with isoefficiency functions, I can calculate the correct scaling of n and p to maintain this efficiency (when I have to option to remove/add computing units and increase/decrease the problem size)
  • the isoefficiency is the property of the algorithm, not of the problem
    • different algorithms of the same problem can have differernt isoefficiency values

Potential exam questions

  1. Derive Amdahl’s law from assuming perfect parallelization of the parallelizable part.
  2. State Amdahl’s formula and compute the asymptotic speedup limit for .
  3. What is the saturation limit and why does adding more processors beyond it not help?
  4. State Gustafson’s formula. How is defined differently than in Amdahl’s law?
  5. Are Amdahl’s law and Gustafson’s law contradictory? Explain.
  6. Define strong scalability and weak scalability. Which law is associated with which?
  7. Define efficiency in terms of sequential work and parallel overhead , and explain how this leads to the notion of isoefficiency.
  8. What does it mean for an algorithm to have isoefficiency function ? Is this better or worse than ?
  9. Derive the isoefficiency function of parallel reduction on processors.
  10. Explain why isoefficiency is a property of the algorithm, not of the problem. Give an intuitive argument.