| HIGH | PERFORMANCE | COMPUTER | ARCHITECTURE | 27-11-2023 |
|------|-------------|----------|--------------|------------|

| PLEASE RETURN THIS SHEET ALONG WITH ALL THE SHEETS YOU WERE GIVEN |
|-------------------------------------------------------------------|

| SURNAME    |  |  |
|------------|--|--|
| FIRST NAME |  |  |

Consider a bus-based multicore that supports a new cache-coherence protocol called MOESI. Compared to the well-known MESI protocol, the MOESI protocol adds a 5<sup>th</sup> state called O (Owned). A copy in the O state is like an Sm (shared-modified) copy in the Dragon protocol: the owned copy is modified, and the "owner cache" is responsible for providing the copy whenever a BusRd transaction involves it. At the same time, the M state is simplified, as it no longer has to update memory on a Flush (only Flush\* transactions appear in this protocol). A copy enters the O state when another cache requests it for reading while it is in the M state; on a local read, a local write or other bus transactions, the O copy behaves like an S copy.

1a) [Points 8/30] Draw the state diagram of the MOESI protocol according to the above description.











1b) [Points 22/30] Assume a cost of 1 cc (1 clock cycle) for read/write operations, 90 cc for BusRd or BusRdX transactions, 60 cc for BusUpgr, 20 cc for Flush\* and 30 cc for Flush. Evaluate the total cost (in clock cycles) of the following streams:

**MOESI, Stream-1**

| Core Operation | C1 | C2 | C3 | Bus Transaction | Data from | Cycles |
|----------------|----|----|----|-----------------|-----------|--------|
| PrRd1          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrRd1          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrRd2          |    |    |    |                 |           |        |
| PrWr2          |    |    |    |                 |           |        |
| PrRd2          |    |    |    |                 |           |        |
| PrWr2          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrWr3          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrWr3          |    |    |    |                 |           |        |
|                |    |    |    |                 | TOTAL     |        |

**MOESI, Stream-2**

| Core Operation | C1 | C2 | C3 | Bus Transaction | Data from | Cycles |
|----------------|----|----|----|-----------------|-----------|--------|
| PrRd1          |    |    |    |                 |           |        |
| PrRd2          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrWr2          |    |    |    |                 |           |        |
| PrWr3          |    |    |    |                 |           |        |
| PrRd1          |    |    |    |                 |           |        |
| PrRd2          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrWr3          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
|                |    |    |    |                 | TOTAL     |        |

**MOESI, Stream-3**

| Core Operation | C1 | C2 | C3 | Bus Transaction | Data from | Cycles |
|----------------|----|----|----|-----------------|-----------|--------|
| PrRd1          |    |    |    |                 |           |        |
| PrRd2          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrRd3          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrWr1          |    |    |    |                 |           |        |
| PrWr2          |    |    |    |                 |           |        |
| PrWr3          |    |    |    |                 |           |        |
|                |    |    |    |                 | TOTAL     |        |

## **EXERCISE 1a)**

First, we draw the states of the MESI protocol; then we focus on the M-state and the O-state.

**M-state**: PrRd and PrWr behave exactly as in MESI. On a BusRd, however, the copy moves to the O-state and is provided to the requesting cache (via a Flush\* transaction) without updating memory. Since the cache holding the copy in the O-state is now responsible for providing the shared-modified copy, memory is updated only on replacement (i.e., on cache conflicts) of M or O copies. If a BusRdX transaction is observed in the M-state, the cache provides the copy to the requesting cache (via a Flush\* transaction) and changes its state from M to I.

**O-state**: this state now carries the responsibility to update memory on replacement and to provide the copy to other caches. Local reads and writes behave as in the S-state. Likewise, on an observed BusRd, BusRdX or BusUpgr, the O-state reacts as the S-state does: it keeps its state on a BusRd and is invalidated on a BusRdX or BusUpgr.
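
The transitions described above can also be written down as two small lookup functions, one for local accesses and one for snooped bus transactions. This is only a sketch: the function names `on_processor` and `on_snoop` are hypothetical, replacement (write-back to memory) is not modeled, and the Flush\* response of S, E and O copies to a BusRdX is an assumption, since the text only states it for M copies.

```python
def on_processor(state, op, shared=False):
    """Local access: return (next_state, bus_transaction_or_None)."""
    if op == "PrRd":
        if state == "I":
            # Read miss: BusRd; the shared line decides between S and E.
            return ("S" if shared else "E", "BusRd")
        return (state, None)           # read hit in M, O, E or S
    if op == "PrWr":
        if state == "I":
            return ("M", "BusRdX")     # write miss
        if state in ("S", "O"):
            return ("M", "BusUpgr")    # O upgrades exactly like S
        return ("M", None)             # E -> M silently, M stays M
    raise ValueError(f"unknown operation: {op}")


def on_snoop(state, bus):
    """Observed bus transaction: return (next_state, response_or_None)."""
    if state == "I":
        return ("I", None)
    if bus == "BusRd":
        # M becomes O and supplies the line; memory is not updated.
        next_state = {"M": "O", "O": "O", "E": "S", "S": "S"}[state]
        return (next_state, "Flush*")
    if bus == "BusRdX":
        # Assumption: any valid copy may supply the line before invalidating.
        return ("I", "Flush*")
    if bus == "BusUpgr":
        return ("I", None)             # requester already holds the data
    raise ValueError(f"unknown bus transaction: {bus}")
```

For example, `on_snoop("M", "BusRd")` gives `("O", "Flush*")`, which is the only way a copy enters the O-state, and `on_processor("O", "PrWr")` gives `("M", "BusUpgr")`.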



## **EXERCISE 1b)**

**MOESI, Stream-1**

| Core Operation | C1 | C2 | C3 | Bus Transaction   | Data from | Cycles |
|----------------|----|----|----|-------------------|-----------|--------|
| PrRd1          | E  | I  | I  | BusRd(S=0)        | Mem       | 90     |
| PrWr1          | M  |    |    | -                 | -         | 1      |
| PrRd1          | M  |    |    | -                 | -         | 1      |
| PrWr1          | M  |    |    | -                 | -         | 1      |
| PrRd2          | O  | S  |    | BusRd(S=1),Flush* | C1        | 90+20  |
| PrWr2          | I  | M  |    | BusUpgr           | -         | 60     |
| PrRd2          |    | M  |    | -                 | -         | 1      |
| PrWr2          |    | M  |    | -                 | -         | 1      |
| PrRd3          |    | O  | S  | BusRd(S=1),Flush* | C2        | 90+20  |
| PrWr3          |    | I  | M  | BusUpgr           | -         | 60     |
| PrRd3          |    |    | M  | -                 | -         | 1      |
| PrWr3          |    |    | M  | -                 | -         | 1      |
|                |    |    |    |                   | TOTAL     | 437    |

**MOESI, Stream-2**

| Core Operation | C1 | C2 | C3 | Bus Transaction   | Data from | Cycles |
|----------------|----|----|----|-------------------|-----------|--------|
| PrRd1          | E  | I  | I  | BusRd(S=0)        | Mem       | 90     |
| PrRd2          | S  | S  |    | BusRd(S=1),Flush* | C1        | 90+20  |
| PrRd3          | S  | S  | S  | BusRd(S=1),Flush* | C1/C2     | 90+20  |
| PrWr1          | M  | I  | I  | BusUpgr           | -         | 60     |
| PrWr2          | I  | M  |    | BusRdX,Flush*     | C1        | 90+20  |
| PrWr3          |    | I  | M  | BusRdX,Flush*     | C2        | 90+20  |
| PrRd1          | S  |    | O  | BusRd(S=1),Flush* | C3        | 90+20  |
| PrRd2          | S  | S  | O  | BusRd(S=1),Flush* | C3        | 90+20  |
| PrRd3          | S  | S  | O  | -                 | -         | 1      |
| PrWr3          | I  | I  | M  | BusUpgr           | -         | 60     |
| PrWr1          | M  |    | I  | BusRdX,Flush*     | C3        | 90+20  |
|                |    |    |    |                   | TOTAL     | 981    |

**MOESI, Stream-3**

| Core Operation | C1 | C2 | C3 | Bus Transaction   | Data from | Cycles |
|----------------|----|----|----|-------------------|-----------|--------|
| PrRd1          | E  | I  | I  | BusRd(S=0)        | Mem       | 90     |
| PrRd2          | S  | S  |    | BusRd(S=1),Flush* | C1        | 90+20  |
| PrRd3          | S  | S  | S  | BusRd(S=1),Flush* | C1/C2     | 90+20  |
| PrRd3          | S  | S  | S  | -                 | -         | 1      |
| PrWr1          | M  | I  | I  | BusUpgr           | -         | 60     |
| PrWr1          | M  |    |    | -                 | -         | 1      |
| PrWr1          | M  |    |    | -                 | -         | 1      |
| PrWr1          | M  |    |    | -                 | -         | 1      |
| PrWr2          | I  | M  |    | BusRdX,Flush*     | C1        | 90+20  |
| PrWr3          |    | I  | M  | BusRdX,Flush*     | C2        | 90+20  |
|                |    |    |    |                   | TOTAL     | 594    |
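
The three totals can be cross-checked by replaying each stream on a single cache line. The following is a minimal sketch, assuming all caches start in the I state, that any cache holding a valid copy supplies it via Flush\* (as in the "Data from" column), and that no replacement occurs (so the 30 cc Flush is never used). The names `COST`, `total_cycles` and `STREAM_1` to `STREAM_3` are hypothetical.

```python
# Costs in clock cycles, taken from the exercise text.
COST = {"hit": 1, "BusRd": 90, "BusRdX": 90, "BusUpgr": 60, "Flush*": 20}


def total_cycles(stream, n_caches=3):
    """Replay a stream such as "PrRd1 PrWr1" on one line; return total cycles."""
    state = ["I"] * n_caches               # assumption: every cache starts in I
    total = 0
    for token in stream.split():
        op, me = token[2:4], int(token[4:]) - 1    # "PrRd2" -> ("Rd", index 1)
        others = [i for i in range(n_caches) if i != me]
        supplied = any(state[i] != "I" for i in others)  # a cache can Flush*
        if op == "Rd":
            if state[me] != "I":
                total += COST["hit"]       # read hit in M, O, E or S
                continue
            total += COST["BusRd"] + (COST["Flush*"] if supplied else 0)
            for i in others:               # snoopers: M -> O, E -> S
                state[i] = {"M": "O", "E": "S"}.get(state[i], state[i])
            state[me] = "S" if supplied else "E"
        elif op == "Wr":
            if state[me] in ("M", "E"):
                total += COST["hit"]       # M hit, or silent E -> M
            elif state[me] in ("S", "O"):
                total += COST["BusUpgr"]   # data already present locally
            else:
                total += COST["BusRdX"] + (COST["Flush*"] if supplied else 0)
            for i in others:               # the writer ends with the only copy
                state[i] = "I"
            state[me] = "M"
        else:
            raise ValueError(f"unknown operation: {token}")
    return total


STREAM_1 = "PrRd1 PrWr1 PrRd1 PrWr1 PrRd2 PrWr2 PrRd2 PrWr2 PrRd3 PrWr3 PrRd3 PrWr3"
STREAM_2 = "PrRd1 PrRd2 PrRd3 PrWr1 PrWr2 PrWr3 PrRd1 PrRd2 PrRd3 PrWr3 PrWr1"
STREAM_3 = "PrRd1 PrRd2 PrRd3 PrRd3 PrWr1 PrWr1 PrWr1 PrWr1 PrWr2 PrWr3"

print([total_cycles(s) for s in (STREAM_1, STREAM_2, STREAM_3)])  # [437, 981, 594]
```

The sketch tracks only states and costs, not data values, which is enough here because every cost in the tables depends only on the requester's state and on whether another cache holds a valid copy.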