First 5 minutes of hell
PRAM (Parallel RAM) is a abstraction computational model for a parallel computer. It abstracts:
- all RAM processors are synchronized with the same clock
- each accesses the shared memory in constant time (not taking synchronization and communication overhead into account)
- three types of operations:
- global read
- local computation
- global write
- two time models:
- unit time model (each operation takes 1 time unit)
- global time model (local computation takes 1 time unit, R/W take time unit, still constant)
- how are conflicts handled?
- EREW
- CREW
- CRCW (three options: priority, arbitrary, common)
- = Parallel Random Access Machine
- a different distinction:
- a RAM in PRAM means Random Access Machine (as in Von Neumann’s architecture)
- a “normal” RAM (as in e.g. DRAM) is Random Access Memory
- a PRAM is an abstraction model on how to access memory in a parallel computer (with multiple processors)
- the key abstraction: a processor can access any shared memory (SM) cell in time (unless conflict occurs)
- ignoring physical constraints (latency, technology etc.)
- there is a big difference between a local operation and an operation with remote access to the shared memory (it could be 100x slower)
- this is the most comfortable model to implement (we just care about splitting the work onto multiple processors and not caring about the memory locality) - so this PRAM model is often the starting point in designing our parallel algorithm
- the key abstraction: a processor can access any shared memory (SM) cell in time (unless conflict occurs)
3 types of operations
-
the idea is having a cycle of operations in a loop: all processors read (R), all compute locally (L) and all write back (W)
- simultaneously and synchronously
-
the notation
⟨RLW⟩^kdenoteskiterations of the sequence READ, LOCAL, WRITE -
operations
- Global read from a shared memory cell (R)
- Local computation (L) - on local registers/local memory
- Global write operation to a shared memory cell (W)
Timing models
- when designing the algorithm in PRAM, we can follow two timing models
- unit time model: each operation R, L, or W takes time 1
- global time model: L takes time 1 and R/W take constant time
d > 1- this is more realistic (because there is a big difference between local operation in the CPU (L) and remote shared memory access (R, W), so the constant
dapproximates the delays)
- this is more realistic (because there is a big difference between local operation in the CPU (L) and remote shared memory access (R, W), so the constant
Shared access memory conflict handling
-
EREW (Exclusive Read, Exclusive Write) - most restrictive model
- two processors cannot access a single memory cell at some time (not for R, not for W)
- it’s my responsibility is to ensure that the restrictions are respected
- it is the most restrictive, but surprisingly, quite good for many use-cases
- example use-cases:
- algorithms, where at every parallel step, each shared memory is touched by at most one CPU
- e.g. element-wise operations (each CPU reads own elements and writes own result)
-
CREW (Concurrent Read, Exclusive Write)
- concurrent reading is not forbidden (it is relaxed), but writing is exclusive
- use-case: matrix multiplication (same reading of rows, but writing is exlusive)
-
CRCW (Concurrent Read, Concurrent Write)
- multiple approaches to resolve the conflict:
- Priority CRCW: priority queue of the request (the CPU with higher priority is “stronger” and gets priority)
- Arbitrary CRCW: random processor can write, the code must handle this uncertainty (the code must produce correct results regardless of the writer) - again many use-cases perform well with this approach
- Common CRCW - it’s possible to write simultaneously, if the value to be written is the same (it’s my responsibility as a programmer is to ensure that)
- if not, the machine will have undefined behavior
- multiple approaches to resolve the conflict:
-
any algorithm designed for EREW runs correctly on CREW and CRCW (it gets more permissive)
- this doesn’t work the other way
Connection to real architectures
- PRAM model maps to shared-memory parallel systems:
- UMA - Memory organization
- one shared memory for all CPUs, each one of them has access to the whole address space
- SMP = symmetric multiprocessor (several CPUs sharing the same shared memory), usually servers
- UMA - Memory organization
Potential exam questions
- Define the PRAM model. What are its components, and what are the three types of operations processors can execute?
- Explain the difference between the unit time model and the global time model in PRAM. Why is the global time model more realistic?
- What does “RAM” stand for in PRAM, and why is this distinction important?
- Describe the three PRAM submodels (EREW, CREW, CRCW) and their restrictions on shared memory access.
- For each of the three CRCW variants (Priority, Arbitrary, Common), explain the write conflict resolution mechanism and state who bears responsibility for correctness (hardware vs. programmer).
- Give an example of a situation where EREW-PRAM is sufficient. Why might this restrictive model still be practical?
- In Common-CRCW-PRAM, what happens if two processors attempt to write different values to the same cell? Whose responsibility is it to prevent this?
- Rank the PRAM submodels by computational power. Can an algorithm designed for CREW run unmodified on an EREW machine? Why or why not?
- How does the PRAM model relate to real shared-memory architectures (UMA, CC-NUMA)?
- What is the key abstraction/idealization of PRAM, and what physical constraints does it ignore?