CSCI 8150 Advanced Computer Architecture
Hwang, Chapter 7
Multiprocessors and Multicomputers
7.2 Cache Coherence & Synchronization
The Cache Coherence Problem
Since there are multiple levels in a memory
hierarchy, with some of these levels private
to one or more processors, some levels may
contain copies of data objects that are
inconsistent with others.
This problem is manifested most obviously
when individual processors maintain cached
copies of a unique shared-memory location,
and one of them then modifies its copy. The
inconsistent view of that object obtained from
other processors’ caches and main memory is
called the cache coherence problem.
Causes of Cache Inconsistency
Cache inconsistency only occurs when there
are multiple caches capable of storing
(potentially modified) copies of the same
objects.
There are three frequent sources of this
problem:
Sharing of writable data
Process migration
I/O activity
Inconsistency in Data Sharing
Suppose two processors each use (read) a data item
X from a shared memory. Then each processor’s
cache will have a copy of X that is consistent with
the shared memory copy.
Now suppose one processor modifies X (to X’). That
processor’s cache is then inconsistent with the other
processor’s cache and the shared memory.
With a write-through cache, the shared memory
copy will be made consistent, but the other
processor still has an inconsistent value (X).
With a write-back cache, the shared memory copy
will be updated eventually, when the block
containing X (actually X’) is replaced or invalidated.
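The two cases above can be traced with a toy model. This is only a sketch under simplifying assumptions: one shared word X, two private caches, no coherence protocol, and X’ represented by the value 1. The function name and the dictionary layout are illustrative and do not come from the text.

```python
# Toy model: one shared memory word X and two private caches.
# All names here are illustrative assumptions, not part of the original slides.
def share_and_write(policy):
    """Both processors read X, then P1 writes X'.

    Returns (value in P1's cache, value in P2's cache, value in memory).
    """
    memory = {"X": 0}
    cache1, cache2 = {}, {}

    # Both processors read X, so each cache holds a consistent copy.
    cache1["X"] = memory["X"]
    cache2["X"] = memory["X"]

    # P1 modifies X to X' (here 1). No coherence protocol is modelled.
    cache1["X"] = 1
    if policy == "write-through":
        memory["X"] = cache1["X"]  # memory is updated on every write
    # With "write-back", memory is updated only when the block is later
    # replaced or invalidated, which this sketch does not model.

    return cache1["X"], cache2["X"], memory["X"]


print(share_and_write("write-through"))  # (1, 0, 1)
print(share_and_write("write-back"))     # (1, 0, 0)
```

In both runs the second processor still holds the old value; only the state of shared memory differs between the two policies.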
[Figure: Inconsistency in Data Sharing]
Inconsistency After Process Migration
If a process accesses variable X (resulting in
it being placed in the processor cache), and
is then moved to a different processor and
modifies X (to X’), then the caches on the
two processors are inconsistent.
This problem exists regardless of whether
write-through caches or write-back caches
are used.
[Figure: Inconsistency after Process Migration]
Inconsistency Caused by I/O
Data movement from an I/O device to a shared
primary memory usually does not cause cached
copies of data to be updated.
As a result, an input operation that writes X causes
it to become inconsistent with a cached value of X.
Likewise, writing data to an I/O device usually uses
the data in the shared primary memory, ignoring
any potential cached data with different values.
A potential solution to this problem is to require the
I/O processors to maintain consistency with at least
one of the processors’ private caches, thus “passing
the buck” to the processor cache coherence solution
(which we will see).
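Both I/O directions can be illustrated with a small sketch. It assumes a single processor with a write-back cache and a device that reads and writes shared memory directly; the function name and the values 7 and 9 are arbitrary choices made for illustration.

```python
# Toy model of I/O that bypasses the processor cache.
# Names and values are illustrative assumptions only.
def io_bypassing_cache():
    memory = {"X": 0}
    cache = {"X": memory["X"]}  # the processor has already cached X

    # Input: the device writes a new X straight into shared memory.
    # The cached copy is not updated, so the processor reads a stale value.
    memory["X"] = 7
    stale_read = cache["X"]

    # Output: the processor writes X' into its write-back cache only.
    # The device then reads memory and misses the newer cached value.
    cache["X"] = 9
    device_output = memory["X"]

    return stale_read, device_output


print(io_bypassing_cache())  # (0, 7)
```

The processor sees 0 although memory holds 7, and the device outputs 7 although the cache holds 9.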
[Figure: I/O Operations Bypassing the Cache]
[Figure: A Possible Solution]
Cache Coherence Protocols
When a bus is used to connect processors and
memories in a multiprocessor system, each cache
controller can “snoop” on all bus transactions,
whether they involve the current processor or not.
If a bus transaction affects the consistency of a
locally-cached object, then the local copy can be
invalidated.
If a bus is not used (e.g. a crossbar switch or
network is used), then there is no convenient way to
“snoop” on memory transactions. In these systems,
some variant of a directory scheme is used to ensure
cache coherence.
Snoopy Bus Protocols
Two basic approaches
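write-invalidate – invalidate all other cached
copies of a data object when the local cached
copy is modified (the text sometimes calls
invalidated items “dirty”, meaning they should
not be used; elsewhere “dirty” more commonly
means modified but not yet written back)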
write-update – broadcast a modified value of a
data object to all other caches at the time of
modification
Snoopy bus protocols achieve consistency
among caches and shared primary memory
by requiring the bus interfaces of processors
to watch the bus for indications that require
updating or invalidating locally cached
objects.
[Figure: Initial State – Consistent Caches]
[Figure: After Write-Invalidate by P1]
[Figure: After Write-Update by P1]
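The three situations named above (consistent caches, then a write by P1 under each approach) can be reproduced with a minimal simulation. This is a sketch, not a real protocol implementation: the class `SnoopyBus`, its methods, and the `demo` helper are hypothetical names, and the caches are assumed to be write-through so that memory is always current.

```python
# Toy snoopy bus. Assumptions: write-through caches, one word per block,
# and every cache observes every write on the bus.
class SnoopyBus:
    def __init__(self, n_caches, protocol):
        self.protocol = protocol  # "invalidate" or "update"
        self.memory = {}
        self.caches = [{} for _ in range(n_caches)]

    def read(self, p, addr):
        cache = self.caches[p]
        if addr not in cache:  # miss, or the copy was invalidated earlier
            cache[addr] = self.memory[addr]
        return cache[addr]

    def write(self, p, addr, value):
        self.caches[p][addr] = value
        self.memory[addr] = value  # write-through keeps memory consistent
        for q, other in enumerate(self.caches):
            if q != p and addr in other:  # other caches snoop the write
                if self.protocol == "invalidate":
                    del other[addr]       # write-invalidate: drop the copy
                else:
                    other[addr] = value   # write-update: refresh the copy


def demo(protocol):
    """Three caches read X, then processor 0 writes X' = 1."""
    bus = SnoopyBus(3, protocol)
    bus.memory["X"] = 0
    for p in range(3):
        bus.read(p, "X")  # initial state: all caches consistent
    bus.write(0, "X", 1)
    return bus.caches


print(demo("invalidate"))  # [{'X': 1}, {}, {}]
print(demo("update"))      # [{'X': 1}, {'X': 1}, {'X': 1}]
```

Under write-invalidate the other caches lose their copies and must refetch X on the next read; under write-update they keep a copy that already holds the new value.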
Operations on Cached Objects
Read – as long as an object has not been
invalidated, read operations are permitted,
and obviously do not change the object’s
state
Write – as long as an object has not been
invalidated, write operations on the local
object are permitted, but trigger the
appropriate protocol action(s).
Replace – the cache block containing an
object is replaced (by a different block)
Write-Through Cache
In the transition diagram (next slide), the two
possible object states in the “local” cache (valid and
invalid) are shown.
The operations that may be performed are read,
write, and replace by the local processor or a
remote processor.
Transitions from locally valid to locally invalid occur
as a result of a remote processor write or a local
processor replacing the cache block.
Transitions from locally invalid to locally valid occur
as a result of the local processor reading or writing
the object (necessitating, of course, the fetch of a
consistent copy from shared memory).
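The transitions just described fit in a small state function. This is a sketch of the rules as stated in the text; the names `next_state`, `"local"` and `"remote"` are illustrative, and the cases the text does not list (remote read, remote replace, local replace of an already invalid block) are assumed to leave the state unchanged.

```python
# Two-state model of one object in the local write-through cache.
# Function and label names are illustrative assumptions.
VALID, INVALID = "valid", "invalid"


def next_state(state, who, op):
    """Return the new local state.

    who is "local" or "remote"; op is "read", "write" or "replace".
    """
    if who == "local" and op in ("read", "write"):
        # A local access ends with a valid copy, fetching a consistent
        # one from shared memory first if the object was invalid.
        return VALID
    if (who == "remote" and op == "write") or (who == "local" and op == "replace"):
        # A remote write or a local replacement invalidates the local copy.
        return INVALID
    # Assumed: remote reads and remote replacements do not affect the local copy.
    return state


state = VALID
for who, op in [("remote", "read"), ("remote", "write"), ("local", "read")]:
    state = next_state(state, who, op)
    print(who, op, "->", state)
```

The short trace goes valid, invalid, valid: the remote read changes nothing, the remote write invalidates the local copy, and the next local read brings it back.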
[Figure: Write-Through Cache State Transitions]
[Figure: flowchart with two parallel columns; steps X1 ← 1 / X2 ← 1, work, X1 ← 0 / X2 ← 0, then tests Y1 = 1? / Y2 = 1? with No and Yes branches]