First 5 minutes of hell
Scalability = the property of the parallel algorithm to keep the parallel optimality if both and grow or shrink.
Amdahl’s Law for saturation of parallelizability
- at each sequential algorithm , there is an inherently sequential fraction (executed only on 1 thread - I/O operations, initialization etc.) and the the parallelizable part
- the sequential part is not parallelizable, so it bounds the speedup (at some point, adding more processors does not increase the speedup anymore)
- essentially saying that if we have fixed problem size , there is only a limited amount of parallelism to do (it cannot be scaled to the infinity)
Gustavson’s Law
- Setup: split the parallel execution time into t_seq (sequential part) and t_par(n,p) (parallel part, executed on p processors)
- let be the fraction of the parallel execution time spent on the sequential part:
- then the speedup simplifies to:
- limit (assuming grows with while stays fixed):
- Gustavson assumes that as we get more processors, we scale the problem size (rather than keeping it fixed), under this assumption, linear speedup is achievable
Isoefficiency functions
- when preserving the efficiency, we need to define the best ratio of processors and the problem size , and the isoefficiency functions define the upper and lower bound of this relationship
- for a given constant efficiency :
- the minimum n needed for a given p:
- Gustavson essentially says that if the problem size grows according to the , the efficiency will be sufficiently good
- the maximum p allowed for a given n:
- = for a fixed size , how many processors can I use while still maintaining the efficiency at least
- it is the largest function, so any more processors beyond this and the efficiency drops below (we will be wasting them)
- this is what the Amdahl’s Law is saying, if I use more processors, I “saturate” the parallelism and adding more processors won’t help
- the is the saturation point, where I am hitting the ceiling
Parallel scalability
- definition: scalability is the property of a parallel algorithm to keep the parallel optimality if both and grow or shrink
- the scalability expresses that larger problems can be solved in the same time as smaller problems if sufficient is utilized
- it’s not always about having a maximum number of processing units available
- it’s the scalability issue, we have to scale with to keep all cores busy all the time
- T - time, E - efficiency, S - speedup
- there are two types of scalability:
- strong scalability: measures the capability of a parallel algorithm for fixed to achieve linear speedup with increasing p (Amdahl’s law puts strong limits to this)
- alternatively: strong scalability is the measure of efficiency decrease if increases while is fixed
- weak scalability: defines how the parallel time varies with for fixed
- alternatively: weak scalability is the measure of growth of such that a fixed efficiency is preserved when increases
- strong scalability: measures the capability of a parallel algorithm for fixed to achieve linear speedup with increasing p (Amdahl’s law puts strong limits to this)
Amdahl’s Law
- each sequential calculation has some part of it inherently sequential, call it
- the remaining part of could be parallelized
- we assume the perfect parallelization without any overhead
- derivation: let be the sequential time, with
- under perfect parallelization the parallel time is
- dividing by yields the formula below
- the formula ( is fixed, varies): - the law:
- the parallel speedup will always be bounded by no matter how many threads are used
- it measures the parallel algorithm efficiency decrease if is increasing while is fixed
- that every added processor has more marginal benefit
- after a certain value of , adding new processors does not make sense, there is not enough parallel work to do (see the above plots)
- that is called the saturation limit
- it basically says that a fixed size problem provides a limited amount of parallellism (there is a limit of a reasonable number of parallel threads to execute it)
- examples:
- if , the speedup will be for any
- no matter how many processors in parallel I use, the speedup will be at most 10 times faster (because the non-parallelizable 10 % of sequential time will be a bottleneck)
- if , the speedup for any
- if , the speedup will be for any
Gustafson’s Law
- weak scalability: if we scale and linearly together, a constant efficiency can be maintained
- and thus the linear speedup and optimal cost
- meaning, if we add more computing units , we should also scale up the problem size
- it measures the growth when we preserve the efficiency while increasing
- the non-parallelizable sequential portion stays constant (I/O operations, initialization…), so as we add processors and proportionally grow the problem , the parallel portion scales linearly, hence overall speedup is linear in p
- the formula (here is the sequential fraction of the parallel execution time on the scaled problem, not of the serial time):
- reconciliation with Amdahl: the two laws are not contradictory, they answer different questions
- Amdahl fixes and asks how speedup behaves with - strong scaling, pessimistic
- Gustafson fixes time-per-processor and lets grow with - weak scaling, optimistic
- the apparent disagreement comes from where is measured: Amdahl on the serial execution, Gustafson on the scaled parallel execution
Isoefficiency functions
- when preserving the efficiency of the parallel algorithm, isoefficiency functions define the lowerbound and upperbound of the and relationship
- we can define some fixed efficiency that we want to achieve, e.g. 0.2 (20 %):
- the isoefficiency functions specify:
- the minimum n needed for a given p:
- the maximum p allowed for a given n:
- to maintain the given efficiency guarantee
- the isoefficiency functions specify:
- we ask: “what minimum efficiency do I want to guarantee?”
- and with isoefficiency functions, I can calculate the correct scaling of n and p to maintain this efficiency (when I have to option to remove/add computing units and increase/decrease the problem size)
- the isoefficiency is the property of the algorithm, not of the problem
- different algorithms of the same problem can have differernt isoefficiency values
Potential exam questions
- Derive Amdahl’s law from assuming perfect parallelization of the parallelizable part.
- State Amdahl’s formula and compute the asymptotic speedup limit for .
- What is the saturation limit and why does adding more processors beyond it not help?
- State Gustafson’s formula. How is defined differently than in Amdahl’s law?
- Are Amdahl’s law and Gustafson’s law contradictory? Explain.
- Define strong scalability and weak scalability. Which law is associated with which?
- Define efficiency in terms of sequential work and parallel overhead , and explain how this leads to the notion of isoefficiency.
- What does it mean for an algorithm to have isoefficiency function ? Is this better or worse than ?
- Derive the isoefficiency function of parallel reduction on processors.
- Explain why isoefficiency is a property of the algorithm, not of the problem. Give an intuitive argument.