First 5 minutes of hell

Definition of the Embedding:

place the vertices of $G$ on vertices of $H$ and route each edge of $G$ along a path(!) in $H$ (one edge of $G$ may be mapped onto a path of several consecutive edges of $H$)

how to measure the quality of the embedding?

maximal load of the host node (how many $G$-vertices are mapped to a single $H$-vertex)

maximal dilation (how long the longest edge-route is)

maximal edge congestion (how many $G$-edge routes pass through a single $H$-edge)

$G$ and $H$ are quasiisometric if each can be embedded into the other with constant overhead (constant load, dilation, and congestion)

one simulates the other with a constant slowdown if one parallel step in one takes a constant number of parallel steps in the other

they are computationally equivalent if each can simulate the other with constant slowdown

lemma: if they are quasiisometric, then they are computationally equivalent (not the other way!)

Theorem: meshes and tori are quasiisometric and therefore computationally equivalent.

trivially, the mesh is a subgraph of the torus

load = 1, dilation = 1, congestion = 1

the other way: with Cartesian product decomposition, embedding the 1D torus onto the 1D mesh (where load = 1 and dilation = congestion = 2) and building it back again (using the orthogonality and Cartesian product)

the cost of “faking” the wrap-around edges must be constant

zig-zag method combining the even and odd edges

The ordinary butterfly and the wrapped butterfly are quasiisometric.

trivially, we can just merge the end-vertices of the ordinary butterfly (load = 2, dilation = 1) and get the cycles of the wrapped butterfly

the other way: a complicated constructive proof

Conceptual foundation

Quasiisometry and computational equivalence

Quasiisometry is a static, graph-theoretic equivalence between two interconnection networks based on the existence of mutual embeddings with bounded quality measures.

Definition 2 (Quasiisometric and computationally equivalent networks)

$G$ and $H$ are quasiisometric if $G \xrightarrow{\text{emb}} H$ and $H \xrightarrow{\text{emb}} G$ both exist with constant embedding measures (load, expansion, dilation, congestion all bounded by constants independent of graph size).
$H$ simulates $G$ with slowdown $h$ if one parallel step on $G$ can be simulated in $O(h)$ parallel steps on $H$.
$G$ and $H$ are computationally equivalent networks if each can simulate the other with constant slowdown.

Lemma 3

Quasiisometric $G$ and $H$ are computationally equivalent, but not vice versa.

Why quasiisometry matters

Static embedding measures cannot capture the dynamic behaviour of a parallel algorithm on the host network (e.g. a large dilation is not a problem if the corresponding route is used only rarely). Quasiisometry is, however, robust with respect to any dynamic behaviour: if $G$ and $H$ are quasiisometric, then the asymptotic behaviour of any parallel algorithm differs by at most a constant multiplicative factor between the two networks. This is the strongest static argument for treating two topologies as “the same” for the purpose of parallel computation.

Result 1: Meshes and tori are quasiisometric

Theorem 6

Let $M = M(z_{1}, \dots, z_{n})$ be the $n$-dimensional mesh and $K = K(z_{1}, \dots, z_{n})$ the $n$-dimensional torus of identical dimensions. Then $M$ and $K$ are quasiisometric and therefore computationally equivalent.

Proof - the easy direction

$M \subset K$ trivially (the torus has all the mesh edges plus the wraparound edges), so $K$ simulates $M$ with no slowdown. The identity embedding $M \xrightarrow{\text{emb}} K$ has $load = 1$, $dil = 1$, $ecng = 1$.

Proof - the hard direction: $K \xrightarrow{\text{emb}} M$ with $load = 1$, $dil = ecng = 2$

The proof uses Cartesian product decomposition, exploiting the orthogonality of meshes and tori:

Decompose $M = M(z_{1}) \times \dots \times M(z_{n})$ and $K = K(z_{1}) \times \dots \times K(z_{n})$.
Embed each 1-D factor $K(z_{i}) \xrightarrow{\text{emb}} M(z_{i})$ with $load = 1$ and $dil = ecng = 2$.
Apply the Cartesian product to combine the per-dimension embeddings.

The 1-D zig-zag embedding $K(z) \xrightarrow{\text{emb}} M(z)$

Number the $z$ vertices of both graphs $0, 1, \dots, z - 1$. Walk around the torus cycle and place its consecutive vertices on the mesh positions in the zig-zag order $0 \to 2 \to 4 \to \dots \to (z - 1 \text{ or } z - 2) \to \dots \to 5 \to 3 \to 1$, i.e. hop by 2 to the right along the even-indexed positions, then return along the odd-indexed ones. Every torus edge then maps to a mesh path of length at most 2, and every mesh link carries at most 2 torus-edge paths. The argument is identical for even and odd $z$.
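The zig-zag placement can be checked mechanically. The following is a minimal Python sketch, not taken from the lecture: the helper names `zigzag_order` and `measures` are made up for illustration, and it assumes $z \ge 3$ so that the 1-D torus is a simple cycle.

```python
def zigzag_order(z):
    """Mesh positions taken by torus vertices 0, 1, ..., z-1 (in cycle order)."""
    evens = list(range(0, z, 2))           # going right: 0, 2, 4, ...
    odds = list(range(1, z, 2))[::-1]      # coming back: ..., 5, 3, 1
    return evens + odds


def measures(z):
    """Return (load, dilation, congestion) of the zig-zag embedding K(z) -> M(z)."""
    pos = zigzag_order(z)                  # pos[i] = mesh position of torus vertex i
    load = max(pos.count(p) for p in range(z))
    link_use = [0] * (z - 1)               # link_use[p] = routes crossing mesh link (p, p+1)
    dilation = 0
    for i in range(z):                     # torus edge i -- (i+1) mod z
        a, b = sorted((pos[i], pos[(i + 1) % z]))
        dilation = max(dilation, b - a)
        for p in range(a, b):              # the route is the straight mesh segment a..b
            link_use[p] += 1
    return load, dilation, max(link_use)


for z in (3, 4, 7, 8):
    print(z, zigzag_order(z), measures(z))
```

For every $z$ tried, the sketch reports $load = 1$ and $dil = ecng = 2$, for even and odd $z$ alike.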

Why orthogonality is key

In orthogonal topologies, embedding moves along one dimension have no effect on the others. The Cartesian product preserves the per-dimension bounds: the overall embedding still has $load = 1$, $dil = 2$, $ecng = 2$, regardless of the number $n$ of dimensions.
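To see the product argument at work, the sketch below applies the 1-D zig-zag placement independently in every dimension and measures the resulting embedding by brute force. It is an illustration under stated assumptions, not the lecture's proof: the function names are hypothetical, every $z_{i} \ge 3$ is assumed, and each torus edge is routed along the single dimension in which its endpoints differ.

```python
from collections import Counter
from itertools import product


def zigzag_order(z):
    """Mesh positions of torus vertices 0..z-1 in one dimension (zig-zag placement)."""
    return list(range(0, z, 2)) + list(range(1, z, 2))[::-1]


def product_measures(dims):
    """(load, dilation, congestion) of K(dims) -> M(dims), built dimension by dimension."""
    pos = [zigzag_order(z) for z in dims]

    def place(v):
        # Each coordinate is mapped by its own 1-D embedding.
        return tuple(pos[d][c] for d, c in enumerate(v))

    vertices = list(product(*(range(z) for z in dims)))
    load = max(Counter(place(v) for v in vertices).values())
    link_use = Counter()                   # key: (dimension, lower endpoint of the mesh link)
    dilation = 0
    for v in vertices:
        for d, z in enumerate(dims):
            w = v[:d] + ((v[d] + 1) % z,) + v[d + 1:]   # torus neighbour along dimension d
            a, b = place(v), place(w)
            lo, hi = sorted((a[d], b[d]))
            dilation = max(dilation, hi - lo)
            for p in range(lo, hi):        # route stays inside dimension d
                link_use[(d, a[:d] + (p,) + a[d + 1:])] += 1
    return load, dilation, max(link_use.values())


print(product_measures((3, 4)))
print(product_measures((4, 5, 3)))
```

Both calls return the same triple as the 1-D case, which is the point of the orthogonality argument: adding dimensions does not change the bounds.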

Consequence for MPI

This is the theoretical justification for the MPI_Cart_create design decision that lets the programmer freely declare each dimension (via the periods argument) as either a 1-D mesh or a 1-D torus: computationally, the choice changes performance only by constant factors.

Result 2: Ordinary and wrapped butterflies are quasiisometric

Lemma 11

The ordinary butterfly $o B F_{n}$ and the wrapped butterfly $wB F_{n}$ are quasiisometric.

Proof - the easy direction: $oBF_{n} \xrightarrow{\text{emb}} wBF_{n}$

This direction is trivial: merge the terminal vertices of each row of $oBF_{n}$ to obtain the row-cycles of $wBF_{n}$. This yields $load = 2$ and $dil = 1$.
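The merge can be written down explicitly. The sketch below is an illustration that assumes the usual vertex labelling (column, row) with the cross edges between columns $i$ and $i + 1$ flipping bit $i$ of the row address; the notes do not spell this convention out, and the function names are hypothetical. It also assumes $n \ge 3$ so that $wBF_{n}$ has no parallel edges.

```python
from collections import Counter


def obf_edges(n):
    """Ordinary butterfly: columns 0..n, rows 0..2^n - 1."""
    edges = set()
    for i in range(n):
        for x in range(1 << n):
            edges.add(frozenset({(i, x), (i + 1, x)}))              # straight edge
            edges.add(frozenset({(i, x), (i + 1, x ^ (1 << i))}))   # cross edge, flips bit i
    return edges


def wbf_edges(n):
    """Wrapped butterfly: columns 0..n-1, column n-1 wraps around to column 0."""
    edges = set()
    for i in range(n):
        j = (i + 1) % n
        for x in range(1 << n):
            edges.add(frozenset({(i, x), (j, x)}))
            edges.add(frozenset({(i, x), (j, x ^ (1 << i))}))
    return edges


def merge(vertex, n):
    """Map column n of oBF_n onto column 0; all other vertices stay in place."""
    i, x = vertex
    return (i % n, x)


def easy_direction(n):
    """Return (load, every_edge_has_dilation_1) for the merge embedding."""
    host = wbf_edges(n)
    images = Counter(merge((i, x), n) for i in range(n + 1) for x in range(1 << n))
    dil_one = all(frozenset(merge(v, n) for v in e) in host for e in obf_edges(n))
    return max(images.values()), dil_one


print(easy_direction(3))
print(easy_direction(4))
```

Under these assumptions only the column-0 vertices of $wBF_{n}$ host two guests, and every edge of $oBF_{n}$ lands directly on an edge of $wBF_{n}$.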

Proof - the hard direction: $wBF_{n} \xrightarrow{\text{emb}} oBF_{n}$ with $load = 1$, $dil = 3$

This case is more involved than the mesh-torus case because the butterfly is not orthogonal: the column index of a butterfly fixes which hypercubic dimension is exercised at that stage, so permuting columns simultaneously permutes the hypercube dimensions. Each row of $wBF_{n}$ is a 1-D torus $K(n)$, and the cylinder structure means the first and last drawn columns are physically identical.

Step 1: Canonical path in $wB F_{n}$

By vertex symmetry of $wBF_{n}$, it does not matter which path we pick. Choose path $P$ from $u = (0, 11 \dots 11)$ to $v = (0, 00 \dots 00)$ traversing all $n$ stages, with bits of the row address inverted in the order $0, 1, \dots, n - 1$.

After embedding $wBF_{n}$ into $oBF_{n}$, every edge of $P$ must have dilation at most 3, even though the embedded walk has to detour through the last column of $oBF_{n}$ and return.

Step 2: Why the idempotent mapping fails

The naive idempotent mapping maps each column of $wBF_{n}$ to the same-indexed column of $oBF_{n}$. The last edge of $P$ would then have dilation $n + 1$ (i.e. logarithmic in the number of rows), because its route must visit the last column $n$ of $oBF_{n}$ and return to column $0$. This concentrates the entire wraparound cost into a single route of length $\Theta(\log \text{size})$, violating the constant-dilation goal of quasiisometry. The fix has to spread that cost out so every step carries at most a constant share.
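The $n + 1$ figure can be confirmed by a breadth-first search in $oBF_{n}$. This is a small sketch under the assumption that the cross edges between columns $i$ and $i + 1$ flip bit $i$ of the row address (consistent with $P$ inverting bits in the order $0, 1, \dots, n - 1$); the function names are hypothetical.

```python
from collections import deque


def obf_neighbours(v, n):
    """Neighbours of vertex (column, row) in the ordinary butterfly oBF_n."""
    i, x = v
    if i < n:                                  # edges to the next column flip bit i
        yield (i + 1, x)
        yield (i + 1, x ^ (1 << i))
    if i > 0:                                  # edges to the previous column flip bit i - 1
        yield (i - 1, x)
        yield (i - 1, x ^ (1 << (i - 1)))


def obf_distance(src, dst, n):
    """Length of a shortest path between two vertices of oBF_n (plain BFS)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        v = queue.popleft()
        if v == dst:
            return dist[v]
        for w in obf_neighbours(v, n):
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    raise ValueError("destination not reachable")


def last_edge_dilation(n):
    """Dilation of the wraparound edge of P when column i is mapped to column i."""
    tail = (n - 1, 1 << (n - 1))               # bits 0..n-2 already cleared, bit n-1 still set
    head = (0, 0)                              # v = (0, 00...0)
    return obf_distance(tail, head, n)


for n in range(2, 8):
    print(n, last_edge_dilation(n))
```

The distance grows linearly with $n$, which is exactly the non-constant dilation the construction has to avoid.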

Step 3: Reformulating as a walk problem in $o B F_{n}$

We need a walk from $u = (0, 11 \dots 11)$ to $v = (0, 00 \dots 00)$ in $o B F_{n}$ such that:

we visit each column $1, \dots, n - 1$ exactly once,
column $n$ is just transient,
the distance between two neighbours on the walk is at most $3$.

This is analogous to the $K(n) \xrightarrow{\text{emb}} M(n)$ zig-zag with $dil = 2$, but with two complications: we embed $K(n)$ into $M(n + 1)$ instead of $M(n)$, and bit inversions are tightly coupled to column moves (not independent like in orthogonal topologies). Both endpoint column numbers and row addresses are fixed in advance.

Step 4: The bit-permutation construction

Each valid walk corresponds to a specific permutation of the $n$ bits in the row address giving the order in which they are inverted.

For even $n$, the two equivalent permutations are:

(a) $1, 3, \dots, n - 3, n - 1, n - 2, \dots, 0$
(b) $0, 2, \dots, n - 2, n - 1, n - 3, \dots, 1$

For odd $n$, the two equivalent permutations are:

(a) $1, 3, \dots, n - 2, n - 1, n - 3, \dots, 0$
(b) $0, 2, \dots, n - 3, n - 1, n - 2, \dots, 1$

Both possibilities are equivalent in terms of dilations: each yields exactly 1 edge of dilation 3, 1 edge of dilation 1, and the rest have dilation 2. For the running example $n = 4$, the chosen permutation is $1, 3, 2, 0$.
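Both parities fit one description: (a) is "odd bits ascending, then even bits descending" and (b) is "even bits ascending, then odd bits descending". The sketch below generates the two orders and recomputes the step lengths. It rests on assumptions that go beyond the notes: the column visited after each step follows the zig-zag $0, 2, 4, \dots, 3, 1, 0$ for (a) and its mirror image for (b) (for $n = 4$ this matches the column mapping given in Step 6), a bit $b$ can only be flipped on the cross edges between columns $b$ and $b + 1$, and all function names are hypothetical.

```python
def bit_orders(n):
    """The two bit-inversion orders (a) and (b), for even and odd n alike."""
    odd_up = list(range(1, n, 2))
    even_up = list(range(0, n, 2))
    return odd_up + even_up[::-1], even_up + odd_up[::-1]


def step_length(c_from, c_to, bit):
    """Shortest oBF route from column c_from to column c_to that flips only `bit`.

    Assumption: the cross edges flipping `bit` join columns bit and bit + 1.
    """
    via_right = abs(c_from - bit) + 1 + abs(bit + 1 - c_to)     # cross going right
    via_left = abs(c_from - (bit + 1)) + 1 + abs(bit - c_to)    # cross going left
    return min(via_right, via_left)


def walk_dilations(n, variant="a"):
    """Per-step route lengths of the embedded walk for permutation (a) or (b)."""
    order_a, order_b = bit_orders(n)
    if variant == "a":
        bits, columns = order_a, order_b + [0]   # assumed column zig-zag 0, 2, 4, ..., 3, 1, 0
    else:
        bits, columns = order_b, [0] + order_a   # assumed mirror image 0, 1, 3, ..., 4, 2, 0
    return [step_length(columns[k], columns[k + 1], bits[k]) for k in range(n)]


print(bit_orders(4))        # ([1, 3, 2, 0], [0, 2, 3, 1])
print(walk_dilations(4))    # [2, 3, 2, 1]
```

For every $n$ tried, both variants give one step of length 3, one of length 1, and the rest of length 2, in agreement with the claim above.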

Step 5: Using row-symmetry of $wB F_{n}$

The key trick: instead of fighting the rigid column-to-dimension correspondence of butterflies, we exploit row-symmetry to relabel the source graph. A systematic permutation of bits in row addresses of $wBF_{n}$ is an automorphism (a bijection preserving adjacency), so we get the same $wBF_{n}$, just drawn differently. We rename all vertices of $wBF_{n}$ using this automorphism, so that idempotent column mapping into the destination is now valid.
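What the relabelling does can be made concrete on a small instance. The sketch below, an illustration with hypothetical function names, renames every row address of the standard $wBF_{4}$ by moving bit $i$ to position $\sigma(i)$ and checks that the result is again a wrapped butterfly, now with column $i$ flipping bit $\sigma(i)$. It assumes that in the standard layout the cross edges leaving column $i$ flip bit $i$.

```python
def wbf_edges(n, flip_bit):
    """Wrapped butterfly whose cross edges leaving column i flip bit flip_bit[i]."""
    edges = set()
    for i in range(n):
        j = (i + 1) % n
        for x in range(1 << n):
            edges.add(frozenset({(i, x), (j, x)}))                        # straight edge
            edges.add(frozenset({(i, x), (j, x ^ (1 << flip_bit[i]))}))   # cross edge
    return edges


def permute_bits(x, sigma):
    """Move bit i of x to position sigma[i]."""
    return sum(((x >> i) & 1) << s for i, s in enumerate(sigma))


def relabel(edges, sigma):
    """Rename every vertex (column, row) to (column, permute_bits(row, sigma))."""
    return {frozenset((i, permute_bits(x, sigma)) for i, x in e) for e in edges}


n = 4
sigma = [1, 3, 2, 0]                        # the order used in the running example
standard = wbf_edges(n, list(range(n)))     # column i flips bit i
relabelled = relabel(standard, sigma)

# Same graph, drawn differently: column i now flips bit sigma[i].
print(relabelled == wbf_edges(n, sigma))
print(len(relabelled) == len(standard))
```

Both checks print `True`: the renaming preserves adjacency and only changes which bit each column exercises.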

Step 6: The final embedding

For $n = 4$ with permutation $1, 3, 2, 0$:

column 1 of permuted $wBF_{4}$ maps to column 2 of $oBF_{4}$,
column 2 of permuted $wBF_{4}$ maps to column 3 of $oBF_{4}$,
column 3 of permuted $wBF_{4}$ maps to column 1 of $oBF_{4}$.

The per-edge dilations achieved along the embedded walk are lengths $2, 3, 2, 1$ in $oBF_{4}$, all bounded by the target constant $3$ independently of $n$. The construction yields $load = 1$ and $dil = 3$.
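As a worked trace of the $n = 4$ case, assume that bit 0 is the rightmost bit of the row address and that the cross edges between columns $i$ and $i + 1$ of $oBF_{4}$ flip bit $i$ (a convention the notes do not spell out). The images of the vertices of $P$ and the route lengths between them are then:

$$
(0, 1111) \xrightarrow[\text{length } 2]{\text{bit } 1} (2, 1101) \xrightarrow[\text{length } 3]{\text{bit } 3} (3, 0101) \xrightarrow[\text{length } 2]{\text{bit } 2} (1, 0001) \xrightarrow[\text{length } 1]{\text{bit } 0} (0, 0000)
$$

The only route of length 3 is the one that touches the transient column 4: $(2, 1101) \to (3, 1101) \to (4, 0101) \to (3, 0101)$. The other routes are $(0, 1111) \to (1, 1111) \to (2, 1101)$, then $(3, 0101) \to (2, 0001) \to (1, 0001)$, and finally the single cross edge $(1, 0001) \to (0, 0000)$.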

Lessons learned about butterfly symmetry

The constructive proof reveals two structural lemmas about butterflies.

Lemma 12

$wB F_{n}$ has $n!$ automorphisms given by the permutations of bits in row addresses (with the standard layout).

Lemma 13

$o B F_{n}$ has $n!$ automorphisms given by the permutations of bits in row addresses.

Said otherwise, $oBF_{n}$ is not vertex-symmetric, but row-symmetric: for any two rows $r_{1}$ and $r_{2}$ of $oBF_{n}$, there is an automorphism sending $r_{1}$ to $r_{2}$. The row-symmetry is inherited from the hypercube on which the butterfly is built: the hypercube’s dimension symmetry survives into the butterfly structure as a freedom to permute rows.
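One concrete way to observe both properties on a small instance is sketched below. It is an illustration that is not taken from the lecture: it uses the map that XORs every row address with the fixed mask $r_{1} \oplus r_{2}$ (which sends row $r_{1}$ to row $r_{2}$), assumes cross edges between columns $i$ and $i + 1$ flip bit $i$, and uses hypothetical function names.

```python
from collections import Counter


def obf_edges(n):
    """Ordinary butterfly: columns 0..n, rows 0..2^n - 1."""
    edges = set()
    for i in range(n):
        for x in range(1 << n):
            edges.add(frozenset({(i, x), (i + 1, x)}))              # straight edge
            edges.add(frozenset({(i, x), (i + 1, x ^ (1 << i))}))   # cross edge, flips bit i
    return edges


def translate(edges, mask):
    """Apply (column, row) -> (column, row XOR mask) to both ends of every edge."""
    return {frozenset((i, x ^ mask) for i, x in e) for e in edges}


def degrees(edges):
    """Vertex degrees, used to show that columns are not interchangeable."""
    deg = Counter()
    for e in edges:
        for v in e:
            deg[v] += 1
    return deg


n = 3
edges = obf_edges(n)
r1, r2 = 0b101, 0b010

# Row-symmetric: the translation maps the edge set onto itself and row r1 onto row r2.
is_automorphism = translate(edges, r1 ^ r2) == edges
print(is_automorphism)

# Not vertex-symmetric: end-column vertices have degree 2, inner ones degree 4.
deg = degrees(edges)
print(deg[(0, r1)], deg[(1, r1)])
```

Because an automorphism preserves degrees, no automorphism can move a vertex of column 0 into column 1, while rows can be exchanged freely.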

Comparison of the two results

Both results establish quasiisometry between a “sparser” topology (mesh, ordinary butterfly) and its “richer” counterpart (torus, wrapped butterfly), with the easy direction being trivial inclusion and the hard direction requiring a constructive embedding that spreads a wraparound cost evenly across edges.

The mesh-torus proof is simple because meshes and tori are orthogonal: per-dimension embeddings combine cleanly via the Cartesian product. The butterfly proof is more involved because butterflies are not orthogonal: column index and hypercube dimension are tied together, so we cannot independently permute dimensions. The workaround is to exploit the rich row-symmetry of $wB F_{n}$ to relabel vertices first, making idempotent column mapping valid.

In both cases the achieved bounds are $load = 1$, $dil \in \{2, 3\}$ (2 for mesh-torus, 3 for the butterflies), $ecng$ constant - small enough to be considered “constant overhead” for parallel computation.

Potential exam questions

Given the lecturer’s proof-heavy, definition-precise style, expect questions like:

Define quasiisometric networks. State Lemma 3 and explain why the converse (computational equivalence $\Rightarrow$ quasiisometry) does not hold.
Prove Theorem 6: meshes and tori of the same dimensions are quasiisometric. Give the explicit construction of $K(z) \xrightarrow{\text{emb}} M(z)$ with $dil = 2$.
Why is the Cartesian product decomposition usable here? What property of meshes and tori makes the per-dimension argument compose?
State Lemma 11 and prove the easy direction $oBF_{n} \xrightarrow{\text{emb}} wBF_{n}$. What are the embedding measures?
Sketch the proof of the hard direction $wBF_{n} \xrightarrow{\text{emb}} oBF_{n}$ with $load = 1$ and $dil = 3$. Why does the idempotent mapping fail, and what would its dilation be?
For $n = 4$, give the bit-inversion permutation used in the construction and explain the corresponding mapping of columns of the permuted $wBF_{4}$ into columns of $oBF_{4}$.
State the two equivalent bit-inversion permutations for general even $n$ and odd $n$. How many edges of each dilation value does each permutation produce?
Why is the butterfly proof more difficult than the mesh-torus proof? What structural property is missing in butterflies?
State Lemma 12 and Lemma 13. What does “$oBF_{n}$ is row-symmetric but not vertex-symmetric” mean precisely, and how was this used in the embedding construction?
Compare and contrast the two quasiisometry proofs in this lecture. What is the common high-level strategy, and where does each proof exploit a specific structural feature of the topologies involved?

Petrova digitální zahrada 🚀

PDP - Quasiisometric topologies - meshes-tori, ordinary-wrapped butterflies
