1. System Cache
1.1. Role of Cache in the PC
1.2. "Layers" of Cache
1.2.1. Level 1 (Primary) Cache
1.2.2. Level 2 (Secondary) Cache
1.2.3. Disk Cache
1.2.4. Peripheral Cache
1.3. Function and Operation of the System Cache
1.3.1. Why Caching Works
1.3.2. How Caching Works
1.3.3. Parts of the Level 2 Cache
1.3.4. Structure of the Data Store
1.3.5. Cache Mapping and Associativity
1.3.6. Comparison of Cache Mapping Techniques
1.3.7. Tag Storage
1.3.8. How the Memory Address Is Used
1.3.9. Cache Write Policy and the Dirty Bit
1.3.10. Summary: The Cache Read/Write Process
1.4. Cache Characteristics
1.4.1. Cache Speed
1.4.2. Cache Size
1.4.3. System RAM Cacheability
1.4.4. Integrated vs. Separate Data and Instruction Caches
1.4.5. Mapping Technique
1.4.6. Write Policy
1.4.7. Transactional or Non-Blocking Cache
1.4.8. Cache Bursting
1.4.9. Asynchronous Cache
1.4.10. Synchronous Burst Cache
1.4.11. Pipelined Burst (PLB) Cache
1.4.12. Comparison of Transfer Technology Performance
1.5. Cache Structure and Packaging
1.5.1. Integrated Level 2 Cache
1.5.2. Daughterboard Cache
1.5.3. Motherboard Cache
1.5.4. COASt Modules
1. System Cache
The system cache is responsible for a great deal of the system performance
improvement of today's PCs. The cache is a buffer of sorts between the
very fast processor and the relatively slow memory that serves it. (The
memory is not really slow, it's just that the processor is much faster.)
The presence of the cache allows the processor to do its work while waiting
for memory far less often than it otherwise would.
There are in fact several different "layers" of cache in a modern PC, each
acting as a buffer for recently-used information to improve performance,
but when "the cache" is mentioned without qualifiers, it normally refers
to the "secondary" or "level 2" cache that is placed between the processor
and system RAM. The various levels of cache are discussed here, in the
discussion on the theory and operation behind cache (since many of the
principles are the same). However, most of the focus of this section is
on the level 2 system cache.
1.1. Role of Cache in the PC
In early PCs, the various components had one thing in common: they were
all really slow :^). The processor was running at 8 MHz or less, and taking
many clock cycles to get anything done. It wasn't very often that the
processor would be held up waiting for the system memory, because even
though the memory was slow, the processor wasn't a speed demon either.
In fact, on some machines the memory was faster than the processor.
In the 15 or so years since the invention of the PC, every component has
increased in speed a great deal. However, some have increased far faster
than others. Memory, and memory subsystems, are now much faster than they
were, by a factor of 10 or more. However, a current top-of-the-line processor
has performance over 1,000 times that of the original IBM PC!
This disparity in speed growth has left us with processors that run much
faster than everything else in the computer. This means that one of the
key goals in modern system design is to ensure that to whatever extent
possible, the processor is not slowed down by the storage devices it works
with. Slowdowns mean wasted processor cycles, where the CPU can't do
anything because it is sitting and waiting for information it needs. We
want it so that when the processor needs something from memory, it gets
it as soon as possible.
The best way to keep the processor from having to wait is to make everything
that it uses as fast as it is. Wouldn't it be best to have memory,
system buses, hard disks and CD-ROM drives that just went as fast as the
processor? Of course it would, but there's this little problem called
"technology" that gets in the way. :^)
Actually, it's technology and cost; a modern 2 GB hard disk costs less
than $200 and has a latency (access time) of about 10 milliseconds. You
could implement a 2 GB hard disk in such a way that it would access
information many times faster; but it would cost thousands, if not tens
of thousands of dollars. Similarly, the highest speed SRAM available is
much closer to the speed of the processor than the DRAM we use for system
memory, but it is cost prohibitive in most cases to put 32 or 64 MB of
it in a PC.
There is a good compromise to this however. Instead of trying to make the
whole 64 MB out of this faster, expensive memory, you make a smaller piece,
say 256 KB. Then you find a smart algorithm (process) that allows you to
use this 256 KB in such a way that you get almost as much benefit from
it as you would if the whole 64 MB was made from the faster memory. How
do you do this? The short answer is by using this small cache of 256 KB
to hold the information most recently used by the processor. Computer
science shows that in general, a processor is much more likely to need
again information it has recently used, compared to a random piece of
information in memory. This is the principle behind caching.
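The payoff of this principle can be sketched with a toy simulation. Everything here is an illustrative assumption, not a model of real PC hardware: the access pattern, the cache size, and the least-recently-used replacement policy:

```python
from collections import OrderedDict

def simulate(accesses, cache_size):
    """Count hits for a tiny least-recently-used (LRU) cache."""
    cache = OrderedDict()
    hits = 0
    for addr in accesses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used

    return hits

# A looping program touches the same few addresses over and over.
accesses = [0, 1, 2, 3] * 250             # 1,000 accesses, only 4 distinct
print(simulate(accesses, cache_size=8))   # 996: everything after the first
                                          # pass through the loop is a hit
```

Even though the cache holds only 8 entries, it satisfies 996 of 1,000 accesses, because the program keeps coming back to the same addresses. A truly random access pattern would get almost no benefit from a cache this small.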
1.2. "Layers" of Cache
There are in fact many layers of cache in a modern PC, and this does not
even count the caches built into some peripherals, such as hard disks.
Each layer is closer to the processor and faster than the layer below it.
Each layer also caches the layers below it, due to its increased speed
relative to the lower levels:
Level               Devices Cached
------------------  ---------------------------------------------
Level 1 Cache       Level 2 Cache, System RAM, Hard Disk / CD-ROM
Level 2 Cache       System RAM, Hard Disk / CD-ROM
System RAM          Hard Disk / CD-ROM
Hard Disk / CD-ROM  --
What happens in general terms is this. The processor requests a piece of
information. The first place it looks is in the level 1 cache, since it
is the fastest. If it finds it there (called a hit on the cache), great;
it uses it with no performance delay. If not, it's a miss and the level
2 cache is searched. If it finds it there (level 2 "hit"), it is able to
carry on with relatively little delay. Otherwise, it must issue a request
to read it from the system RAM. The system RAM may in turn either have
the information available or have to get it from the still slower hard
disk or CD-ROM. The mechanics of how the processor (really the chipset
controlling the cache and memory) "looks" for the information in these
various places are discussed here.
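That lookup order can be sketched in code. Everything in this sketch is illustrative: the addresses, the contents of each level, and the simplifying assumption that the disk always holds the data:

```python
def lookup(address, levels):
    """Search each cache level in order; on a miss, fall through to the next.

    `levels` is a list of (name, contents) pairs, fastest level first.
    The last level (here, the disk) is assumed to always have the data.
    """
    for name, contents in levels:
        if address in contents:
            return name       # hit: stop searching, use this level
    return levels[-1][0]      # fell all the way through to the backing store

# Illustrative hierarchy: each faster level holds a subset of the one below.
hierarchy = [
    ("L1 cache",   {100}),
    ("L2 cache",   {100, 200}),
    ("system RAM", {100, 200, 300}),
    ("disk",       set()),    # stands in for "everything else"
]

print(lookup(100, hierarchy))  # L1 cache   (best case: no delay)
print(lookup(300, hierarchy))  # system RAM (missed both caches)
print(lookup(999, hierarchy))  # disk       (worst case: millions of cycles)
```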
It is important to realize just how slow some of these devices are compared
to the processor. Even the fastest hard disks have an access time measuring
around 10 milliseconds. If it has to wait 10 milliseconds, a 200 MHz
processor will waste 2 million clock cycles! And CD-ROMs are generally
at least 10 times slower. This is why using caches to avoid accesses to
these slow devices is so crucial.
Caching actually goes even beyond the level of the hardware. For example,
your web browser itself uses caching--in fact, two levels of caching! Since
loading a web page over the Internet is very slow for most people, the
browser will hold recently-accessed pages to save it having to re-access
them. It checks first in its memory cache and then in its disk cache to
see if it already has a copy of the page you want. Only if it does not
find the page will it actually go to the Internet to retrieve it.
1.2.1. Level 1 (Primary) Cache
Level 1 or primary cache is the fastest memory on the PC. It is, in fact,
built directly into the processor itself. This cache is very small,
generally from 8 KB to 64 KB, but it is extremely fast; it runs at the
same speed as the processor. If the processor requests information and
can find it in the level 1 cache, that is the best case, because the
information is there immediately and the system does not have to wait.
The level 1 cache is discussed in more detail here, in the section on
processors.
Note: Level 1 cache is also sometimes called "internal" cache since
it resides within the processor.
1.2.2. Level 2 (Secondary) Cache
The level 2 cache is a secondary cache to the level 1 cache, and is larger
and slightly slower. It is used to catch recent accesses that are not
caught by the level 1 cache, and is usually 64 KB to 2 MB in size. Level
2 cache is usually found either on the motherboard or a daughterboard that
inserts into the motherboard. Pentium Pro processors actually have the
level 2 cache in the same package as the processor itself (though it isn't
on the same die as the processor and level 1 cache), which means it runs
much faster than level 2 cache that resides separately on the motherboard.
Pentium II processors are in the middle; their cache runs
at half the speed of the CPU.
Note: Level 2 cache is also sometimes called "external" cache since
it resides outside the processor. (Even on Pentium Pros... it is on a
separate chip in the same package as the processor.)
1.2.3. Disk Cache
A disk cache is a portion of system memory used to cache reads and writes
to the hard disk. In some ways this is the most important type of cache
on the PC, because the greatest differential in speed between the layers
mentioned here is between the system RAM and the hard disk. While the
system RAM is slightly slower than the level 1 or level 2 cache, the hard
disk is much slower than the system RAM.
Unlike the level 1 and level 2 cache memory, which are entirely devoted
to caching, system RAM is used partially for caching but of course for
other purposes as well. Disk caches are usually implemented using software
(like DOS's SmartDrive). They are discussed in more detail in the section
on hard disk performance.
1.2.4. Peripheral Cache
Much like the hard disk, other devices can be cached using the system RAM
as well. CD-ROMs are the most common device cached other than hard disks,
particularly due to their very slow initial access time, measured in the
tens to hundreds of milliseconds (which is an eternity to a computer).
In fact, in some cases CD-ROM drives are cached to the hard disk, since
the hard disk, despite its slow speed, is still much faster than a CD-ROM
drive is.
1.3. Function and Operation of the System Cache
This section discusses the principles behind the design of cache memory,
and explains how the secondary (level 2) cache works in detail. This will
give you a much better understanding of how the cache works and what the
issues are in its design--at least I hope it will, because that was my
primary goal in writing this. I was frustrated as I put the site together
with my inability to find anything on the 'net that really explained how
the cache worked.
This section is focused on the secondary cache, but in fact, the function
of the primary (level 1) cache built into modern processors is in many
ways identical: in terms of how associativity works, how the cache is
organized, how the system checks for hits, etc. However, many of the
implementation details are different.
Note: This is an advanced section with some potentially confusing
concepts. I make use of examples in order to hopefully make sure the
explanations make sense. You will find this section most helpful if you
read all the subsections it contains in order. You may also find reading
the section explaining system memory operation and timing instructive.
This page also makes extensive reference to memory addresses and locations,
and binary numbers. If you are not familiar with binary mathematics, you
may want to read this introductory page on the subject.
1.3.1. Why Caching Works
Cache is in some ways a really amazing technology. A 512 KB level 2 cache,
caching 64 MB of system memory, can supply the information that the
processor requests 90-95% of the time. Think about the ratios here: the
level 2 cache is less than 1% of the size of the memory it is caching,
but it is able to register a "hit" on over 90% of requests. That's pretty
efficient, and is the reason why caching is so important.
The reason that this happens is due to a computer science principle called
locality of reference. It states basically that even within very large
programs with several megabytes of instructions, only small portions of
this code generally get used at once. Programs tend to spend large periods
of time working in one small area of the code, often performing the same
work many times over and over with slightly different data, and then move
to another area. This occurs because of "loops", which are what programs
use to do work many times in rapid succession.
Just as one example (there are many), let's suppose you start up your word
processor and open your favorite document. The word processor program at
some point must read the file and then print on the screen the text it
finds. This is done (in very simplified terms) using code similar to this:
Open document file.
Open screen window.
For each character in the document:
    Read the character.
    Store the character into working memory.
    Write the character to the window if the character is part of the first page.
Close the document file.
The loop is of course the three instructions that are done "for each
character in the document". These instructions will be repeated many
thousands of times, and there are hundreds or thousands of loops like these
in the software you use. Every time you hit "page down" on your keyboard,
the word processor must clear the screen, figure out which characters to
display next, and then run a similar loop to copy them from memory to the
screen. Several loops are used when you tell it to save the file to the
hard disk.
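As a concrete (and hypothetical) rendering, the pseudocode above might look something like this in Python; the page size and the file-handling details are made up for illustration:

```python
def display_document(path, page_size=2000):
    """Read a document and display its first page, character by character."""
    working_memory = []                       # "Store ... into working memory."
    with open(path) as document:              # "Open document file."
        for position, character in enumerate(document.read()):
            working_memory.append(character)
            if position < page_size:          # part of the first page?
                print(character, end="")      # "Write ... to the window."
    return working_memory                     # file closes automatically here
```

The three statements inside the `for` loop are the ones the processor fetches over and over, once per character in the document; that repetition is exactly what the cache exploits.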
This example shows how caching improves performance when dealing with
program code, but what about your data? Not surprisingly, access to data
(your work files, etc.) is similarly repetitive. When you are using your
word processor, how many times do you scroll up and down looking at the
same text over and over, as you edit it? The system cache holds much of
this information so that it can be loaded more quickly the second, third,
and next times that it is needed.
1.3.2. How Caching Works
In the example in the previous section a loop was used to read characters
from a file, store them in working memory, and then write them to the screen.
The first time each of these instructions (read, store, write) is executed,
it must be loaded from relatively slow system memory (assuming it is in
memory, otherwise it must be read from the hard disk which is much, much
slower even than the memory).
The cache is programmed (in hardware) to hold recently-accessed memory
locations in case they are needed again. So each of these instructions
will be saved in the cache after being loaded from memory the first time.
The next time the processor wants to use the same instruction, it will
check the cache first, see that the instruction it needs is there, and
load it from cache instead of going to the slower system RAM. The number
of instructions that can be buffered this way is a function of the size
and design of the cache.
Let's suppose that our loop is going to process 1,000 characters and the
cache is able to hold all three instructions in the loop (which sounds
obvious, but isn't always, due to cache mapping techniques). This means
that 999 of the 1,000 times these instructions are executed, they will
be loaded from the cache, or 99.9% of the time. This is why caching is
able to satisfy such a large percentage of requests for memory even though
it has a capacity that is often less than 1% the size of the system RAM.
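The arithmetic behind that 99.9% figure is easy to verify (a hypothetical loop of three instructions run 1,000 times, with each instruction missing only on the first pass):

```python
instructions_per_pass = 3
passes = 1000

total_fetches = instructions_per_pass * passes  # 3,000 instruction fetches
misses = instructions_per_pass                  # only the first pass misses

hit_rate = (total_fetches - misses) / total_fetches
print(f"{hit_rate:.1%}")  # 99.9%
```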
1.3.3. Parts of the Level 2 Cache
The level 2 cache comprises two main components. These are not
usually physically located in the same chips, but represent logically how
the cache works. These parts of the cache are:
The Data Store: This is where the cached information is actually
kept. When reference is made to "storing something in the cache" or
"retrieving something from the cache", this is where the actual data goes
to or comes from. When someone says that the cache is 256 KB or 512 KB,
they are referring to the size of the data store. The larger the store,
the more information that can be cached and the more likelihood of the
cache being able to satisfy a request, all else being equal.
The Tag RAM: This is a small area of memory used by the cache to
keep track of where in memory the entries in the data store belong. The
size of the tag RAM--and not the size of the data store--controls how much
of main memory can be cached.
In addition to these memory areas there is, of course, the cache controller
circuitry. Most of the work of controlling the level 2 cache on a modern
PC is performed by the system chipset.
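To illustrate how the data store and the tag RAM cooperate, here is a toy direct-mapped cache. The line count, the mapping rule, and the 64-word "memory" are all illustrative assumptions; mapping techniques are covered properly in the sections that follow:

```python
LINES = 8  # number of cache lines (entries in the data store)

tag_ram    = [None] * LINES  # which memory block each line currently holds
data_store = [None] * LINES  # the cached data itself

def cache_read(address, memory):
    line = address % LINES    # which cache line this address maps to
    tag  = address // LINES   # identifies which block occupies that line
    if tag_ram[line] == tag:  # tag matches: the data store has our data
        return data_store[line], "hit"
    # Miss: fetch from memory, then record it in both parts of the cache.
    tag_ram[line] = tag
    data_store[line] = memory[address]
    return data_store[line], "miss"

memory = list(range(100, 164))  # pretend main memory: 64 words

print(cache_read(5, memory))    # (105, 'miss') -- first access, loaded
print(cache_read(5, memory))    # (105, 'hit')  -- found via the tag RAM
print(cache_read(13, memory))   # (113, 'miss') -- same line, different tag
```

Note that the tag RAM is what tells the controller whether the bytes sitting in a given line actually belong to the address being requested; the data store alone cannot answer that question.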
1.3.4. Structure of the Data Store
Many people think of the cache as being organized as a large sequence of
bytes (8 bits each). In fact, on a modern fifth-generation or later PC,
the level 2 cache is org