
PCguide - cache

1. System Cache
  1.1. Role of Cache in the PC
  1.2. "Layers" of Cache
    1.2.1. Level 1 (Primary) Cache
    1.2.2. Level 2 (Secondary) Cache
    1.2.3. Disk Cache
    1.2.4. Peripheral Cache
  1.3. Function and Operation of the System Cache
    1.3.1. Why Caching Works
    1.3.2. How Caching Works
    1.3.3. Parts of the Level 2 Cache
    1.3.4. Structure of the Data Store
    1.3.5. Cache Mapping and Associativity
    1.3.6. Comparison of Cache Mapping Techniques
    1.3.7. Tag Storage
    1.3.8. How the Memory Address Is Used
    1.3.9. Cache Write Policy and the Dirty Bit
    1.3.10. Summary: The Cache Read/Write Process
  1.4. Cache Characteristics
    1.4.1. Cache Speed
    1.4.2. Cache Size
    1.4.3. System RAM Cacheability
    1.4.4. Integrated vs. Separate Data and Instruction Caches
    1.4.5. Mapping Technique
    1.4.6. Write Policy
    1.4.7. Transactional or Non-Blocking Cache
    1.4.8. Cache Bursting
    1.4.9. Asynchronous Cache
    1.4.10. Synchronous Burst Cache
    1.4.11. Pipelined Burst (PLB) Cache
    1.4.12. Comparison of Transfer Technology Performance
  1.5. Cache Structure and Packaging
    1.5.1. Integrated Level 2 Cache
    1.5.2. Daughterboard Cache
    1.5.3. Motherboard Cache
    1.5.4. COASt Modules

1. System Cache

The system cache is responsible for a great deal of the system performance improvement of today's PCs. The cache is a buffer of sorts between the very fast processor and the relatively slow memory that serves it. (The memory is not really slow; it's just that the processor is much faster.) The presence of the cache allows the processor to do its work while waiting for memory far less often than it otherwise would.

There are in fact several different "layers" of cache in a modern PC, each acting as a buffer for recently-used information to improve performance, but when "the cache" is mentioned without qualifiers, it normally refers to the "secondary" or "level 2" cache that is placed between the processor and system RAM. The various levels of cache are covered in the discussion of the theory and operation behind cache (since many of the principles are the same). However, most of the focus of this section is on the level 2 system cache.

1.1. Role of Cache in the PC

In early PCs, the various components had one thing in common: they were all really slow :^). The processor was running at 8 MHz or less, and taking many clock cycles to get anything done. It wasn't very often that the processor would be held up waiting for the system memory, because even though the memory was slow, the processor wasn't a speed demon either.
In fact, on some machines the memory was faster than the processor.

In the 15 or so years since the invention of the PC, every component has increased in speed a great deal. However, some have increased far faster than others. Memory, and memory subsystems, are now much faster than they were, by a factor of 10 or more. However, a current top-of-the-line processor has performance over 1,000 times that of the original IBM PC! This disparity in speed growth has left us with processors that run much faster than everything else in the computer.

This means that one of the key goals in modern system design is to ensure that, to whatever extent possible, the processor is not slowed down by the storage devices it works with. Slowdowns mean wasted processor cycles, where the CPU can't do anything because it is sitting and waiting for information it needs. When the processor needs something from memory, we want it to get it as soon as possible.

The best way to keep the processor from having to wait is to make everything that it uses as fast as it is. Wouldn't it be best just to have memory, system buses, hard disks and CD-ROM drives that went as fast as the processor? Of course it would, but there's this little problem called "technology" that gets in the way. :^) Actually, it's technology and cost; a modern 2 GB hard disk costs less than $200 and has a latency (access time) of about 10 milliseconds. You could implement a 2 GB hard disk in such a way that it would access information many times faster, but it would cost thousands, if not tens of thousands, of dollars. Similarly, the highest-speed SRAM available is much closer to the speed of the processor than the DRAM we use for system memory, but it is cost-prohibitive in most cases to put 32 or 64 MB of it in a PC.

There is a good compromise, however. Instead of trying to make the whole 64 MB out of this faster, expensive memory, you make a smaller piece, say 256 KB.
Then you find a smart algorithm (process) that allows you to use this 256 KB in such a way that you get almost as much benefit from it as you would if the whole 64 MB were made from the faster memory. How do you do this? The short answer is by using this small cache of 256 KB to hold the information most recently used by the processor. Computer science shows that, in general, a processor is much more likely to need information it has recently used again than a random piece of information in memory. This is the principle behind caching.

1.2. "Layers" of Cache

There are in fact many layers of cache in a modern PC. This does not even include looking at caches included on some peripherals, such as hard disks. Each layer is closer to the processor and faster than the layer below it. Each layer also caches the layers below it, due to its increased speed relative to the lower levels:

Level                 Devices Cached
Level 1 Cache         Level 2 Cache, System RAM, Hard Disk / CD-ROM
Level 2 Cache         System RAM, Hard Disk / CD-ROM
System RAM            Hard Disk / CD-ROM
Hard Disk / CD-ROM    --

What happens in general terms is this. The processor requests a piece of information. The first place it looks is in the level 1 cache, since it is the fastest. If it finds it there (called a hit on the cache), great; it uses it with no performance delay. If not, it's a miss and the level 2 cache is searched. If it finds it there (level 2 "hit"), it is able to carry on with relatively little delay. Otherwise, it must issue a request to read it from the system RAM. The system RAM may in turn either have the information available or have to get it from the still slower hard disk or CD-ROM. The mechanics of how the processor (really the chipset controlling the cache and memory) "looks" for the information in these various places are discussed here.

It is important to realize just how slow some of these devices are compared to the processor.
Even the fastest hard disks have an access time measuring around 10 milliseconds. If it has to wait 10 milliseconds, a 200 MHz processor will waste 2 million clock cycles! And CD-ROMs are generally at least 10 times slower. This is why using caches to avoid accesses to these slow devices is so crucial.

Caching actually goes even beyond the level of the hardware. For example, your web browser uses caching itself; in fact, two levels of caching! Since loading a web page over the Internet is very slow for most people, the browser will hold recently-accessed pages to save it having to re-access them. It checks first in its memory cache and then in its disk cache to see if it already has a copy of the page you want. Only if it does not find the page will it actually go to the Internet to retrieve it.

1.2.1. Level 1 (Primary) Cache

Level 1 or primary cache is the fastest memory on the PC. It is, in fact, built directly into the processor itself. This cache is very small, generally from 8 KB to 64 KB, but it is extremely fast; it runs at the same speed as the processor. If the processor requests information and can find it in the level 1 cache, that is the best case, because the information is there immediately and the system does not have to wait. The level 1 cache is discussed in more detail in the section on processors.

Note: Level 1 cache is also sometimes called "internal" cache since it resides within the processor.

1.2.2. Level 2 (Secondary) Cache

The level 2 cache is a secondary cache to the level 1 cache, and is larger and slightly slower. It is used to catch recent accesses that are not caught by the level 1 cache, and is usually 64 KB to 2 MB in size. Level 2 cache is usually found either on the motherboard or on a daughterboard that inserts into the motherboard.
Pentium Pro processors actually have the level 2 cache in the same package as the processor itself (though it isn't in the same circuit where the processor and level 1 cache are), which means it runs much faster than level 2 cache that is separate and resides on the motherboard. Pentium II processors are in the middle; their cache runs at half the speed of the CPU.

Note: Level 2 cache is also sometimes called "external" cache since it resides outside the processor. (Even on Pentium Pros... it is on a separate chip in the same package as the processor.)

1.2.3. Disk Cache

A disk cache is a portion of system memory used to cache reads and writes to the hard disk. In some ways this is the most important type of cache on the PC, because the greatest differential in speed between the layers mentioned here is between the system RAM and the hard disk. While the system RAM is slightly slower than the level 1 or level 2 cache, the hard disk is much slower than the system RAM.

Unlike the level 1 and level 2 cache memory, which are entirely devoted to caching, system RAM is used partly for caching and partly, of course, for other purposes. Disk caches are usually implemented using software (like DOS's SmartDrive). They are discussed in more detail in the section on hard disk performance.

1.2.4. Peripheral Cache

Much like the hard disk, other devices can be cached using the system RAM as well. CD-ROMs are the most common device cached other than hard disks, particularly due to their very slow initial access time, measured in the tens to hundreds of milliseconds (which is an eternity to a computer). In fact, in some cases CD-ROM drives are cached to the hard disk, since the hard disk, despite its slow speed, is still much faster than a CD-ROM drive.

1.3. Function and Operation of the System Cache

This section discusses the principles behind the design of cache memory, and explains how the secondary (level 2) cache works in detail.
This will give you a much better understanding of how the cache works and what the issues are in its design--at least I hope it will, because that was my primary goal in writing this. As I put the site together, I was frustrated by my inability to find anything on the 'net that really explained how the cache worked.

This section is focused on the secondary cache, but in fact, the function of the primary (level 1) cache built into modern processors is in many ways identical: in terms of how associativity works, how the cache is organized, how the system checks for hits, etc. However, many of the implementation details are different.

Note: This is an advanced section with some potentially confusing concepts. I use examples to help the explanations make sense. You will find this section most helpful if you read all the subsections it contains in order. You may also find reading the section explaining system memory operation and timing instructive. This page also makes extensive reference to memory addresses and locations, and binary numbers. If you are not familiar with binary mathematics, you may want to read the introductory page on the subject.

1.3.1. Why Caching Works

Cache is in some ways a really amazing technology. A 512 KB level 2 cache, caching 64 MB of system memory, can supply the information that the processor requests 90-95% of the time. Think about the ratios here: the level 2 cache is less than 1% of the size of the memory it is caching, but it is able to register a "hit" on over 90% of requests. That's pretty efficient, and is the reason why caching is so important.

This happens because of a computer science principle called locality of reference. It states basically that even within very large programs with several megabytes of instructions, only small portions of this code generally get used at once.
Programs tend to spend large periods of time working in one small area of the code, often performing the same work many times over and over with slightly different data, and then move to another area. This occurs because of "loops", which are what programs use to do work many times in rapid succession.

Just as one example (there are many), let's suppose you start up your word processor and open your favorite document. The word processor program at some point must read the file and then print on the screen the text it finds. This is done (in very simplified terms) using code similar to this:

- Open document file.
- Open screen window.
- For each character in the document:
    - Read the character.
    - Store the character into working memory.
    - Write the character to the window if the character is part of the first page.
- Close the document file.

The loop is of course the three instructions that are done "for each character in the document". These instructions will be repeated many thousands of times, and there are hundreds or thousands of loops like these in the software you use. Every time you hit "page down" on your keyboard, the word processor must clear the screen, figure out which characters to display next, and then run a similar loop to copy them from memory to the screen. Several loops are used when you tell it to save the file to the hard disk.

This example shows how caching improves performance when dealing with program code, but what about your data? Not surprisingly, access to data (your work files, etc.) is similarly repetitive. When you are using your word processor, how many times do you scroll up and down looking at the same text over and over, as you edit it? The system cache holds much of this information so that it can be loaded more quickly the second, third, and subsequent times that it is needed.
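
The effect of locality of reference can be tried out with a small experiment. The sketch below is illustrative only: it replays a sequence of memory addresses through a tiny cache and reports the fraction of requests that were hits. The instruction addresses, the 8-entry capacity, the least-recently-used (LRU) replacement rule, and the helper name hit_rate are all assumptions made for the example; they do not describe how any particular level 2 cache is built.

```python
from collections import OrderedDict
import random


def hit_rate(trace, capacity):
    """Replay addresses through a small LRU cache; return the fraction of hits."""
    cache = OrderedDict()  # keys are cached addresses, ordered oldest to newest
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)  # mark this address as most recently used
        else:
            cache[addr] = True  # miss: bring the address into the cache
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(trace)


# Hypothetical addresses of the three loop instructions (read, store, write).
LOOP_BODY = [0x1000, 0x1004, 0x1008]
loop_trace = LOOP_BODY * 1000  # the loop runs once per character, 1,000 times

# For contrast: 3,000 accesses scattered at random over 64 MB (fixed seed).
rng = random.Random(0)
random_trace = [rng.randrange(0, 64 * 1024 * 1024, 4) for _ in range(3000)]

loop_rate = hit_rate(loop_trace, capacity=8)
random_rate = hit_rate(random_trace, capacity=8)
print(f"loop trace hit rate:   {loop_rate:.3f}")
print(f"random trace hit rate: {random_rate:.3f}")
```

With these assumed inputs the loop trace misses only on its first pass (3 misses in 3,000 accesses), while the scattered trace almost never finds what it wants in the cache. The cache is the same size in both cases; only the access pattern differs.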

1.3.2. How Caching Works

In the example in the previous section a loop was used to read characters from a file, store them in working memory, and then write them to the screen. The first time each of these instructions (read, store, write) is executed, it must be loaded from relatively slow system memory (assuming it is in memory; otherwise it must be read from the hard disk, which is much, much slower even than the memory).

The cache is programmed (in hardware) to hold recently-accessed memory locations in case they are needed again. So each of these instructions will be saved in the cache after being loaded from memory the first time. The next time the processor wants to use the same instruction, it will check the cache first, see that the instruction it needs is there, and load it from cache instead of going to the slower system RAM. The number of instructions that can be buffered this way is a function of the size and design of the cache.

Let's suppose that our loop is going to process 1,000 characters and the cache is able to hold all three instructions in the loop (which sounds obvious, but isn't always, due to cache mapping techniques). This means that 999 of the 1,000 times these instructions are executed, they will be loaded from the cache, or 99.9% of the time. This is why caching is able to satisfy such a large percentage of requests for memory even though it has a capacity that is often less than 1% the size of the system RAM.

1.3.3. Parts of the Level 2 Cache

The level 2 cache comprises two main components. These are not usually physically located in the same chips, but represent logically how the cache works. These parts of the cache are:

- The Data Store: This is where the cached information is actually kept. When reference is made to "storing something in the cache" or "retrieving something from the cache", this is where the actual data goes to or comes from.
When someone says that the cache is 256 KB or 512 KB, they are referring to the size of the data store. The larger the store, the more information can be cached and the more likely the cache is to be able to satisfy a request, all else being equal.
- The Tag RAM: This is a small area of memory used by the cache to keep track of where in memory the entries in the data store belong. The size of the tag RAM--and not the size of the data store--controls how much of main memory can be cached.

In addition to these memory areas there is, of course, the cache controller circuitry. Most of the work of controlling the level 2 cache on a modern PC is performed by the system chipset.

1.3.4. Structure of the Data Store

Many people think of the cache as being organized as a large sequence of bytes (8 bits each). In fact, on a modern fifth-generation or later PC, the level 2 cache is org