What is the first-level cache of a CPU?

By Mord Maclachlan 2020-01-09

CPU cache (cache memory) is a small, temporary store located between the CPU and main memory. Its capacity is much smaller than that of main memory, but its access speed is much faster. Cache exists mainly to bridge the gap between CPU speed and memory read/write speed: because the CPU operates much faster than memory can be read or written, without a cache the CPU would spend a long time waiting for data to arrive or to be written back to memory.
The data in the cache is a small subset of what is in memory, but it is the subset that the CPU is likely to access in the near future. When the CPU needs that data, it can bypass memory and fetch it directly from the cache, which speeds up reads. Adding a cache to the CPU is therefore an efficient solution: the whole internal storage (cache + memory) behaves like a storage system that combines the speed of the cache with the capacity of memory. Cache has a great impact on CPU performance, mainly because of the order in which the CPU exchanges data and the bandwidth between the CPU and the cache.

The working principle of cache is as follows. When the CPU wants to read a piece of data, it first looks for it in the cache. If the data is found (a hit), it is read immediately and sent to the CPU for processing. If it is not found (a miss), it is read from memory at a relatively slow speed and sent to the CPU for processing. At the same time, the whole data block containing that data is loaded into the cache, so that later reads of the same block can be served from the cache without going to memory.
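The read path described above can be sketched in a few lines of Python. This is only a toy model, not how any real CPU is built: the direct-mapped layout, the `ToyCache` name, and the block and line counts are all assumptions chosen for illustration.

```python
# Toy direct-mapped cache. All names and sizes are illustrative assumptions.
BLOCK_SIZE = 4   # words per block (assumed)
NUM_LINES = 8    # number of cache lines (assumed)


class ToyCache:
    def __init__(self, memory):
        self.memory = memory              # backing "main memory" (a list of words)
        self.lines = [None] * NUM_LINES   # each entry is (tag, block_data) or None
        self.hits = 0
        self.misses = 0

    def read(self, address):
        block_number = address // BLOCK_SIZE
        index = block_number % NUM_LINES   # which cache line the block maps to
        tag = block_number // NUM_LINES    # identifies which block occupies the line
        line = self.lines[index]
        if line is not None and line[0] == tag:
            self.hits += 1                 # hit: serve the word from the cache
        else:
            self.misses += 1               # miss: load the whole block from memory
            start = block_number * BLOCK_SIZE
            line = (tag, self.memory[start:start + BLOCK_SIZE])
            self.lines[index] = line
        return line[1][address % BLOCK_SIZE]


memory = list(range(100, 164))             # 64 words of fake memory
cache = ToyCache(memory)
values = [cache.read(a) for a in range(16)]
print(cache.hits, cache.misses)
```

Reading 16 consecutive addresses touches four blocks. The first access to each block misses and loads the block, and the next three accesses to that block hit, which is the block-fill effect the paragraph describes.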

It is this read mechanism that gives CPU cache reads a very high hit rate (most CPUs reach about 90%), which means that about 90% of the data the CPU reads next is already in the cache and only about 10% has to be read from memory. This greatly reduces the time the CPU spends reading memory directly, so the CPU rarely has to wait for data. In general, the CPU looks for data in the cache first and in memory second.
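To see why a 90% hit rate matters, here is a minimal sketch of the average read time under a simplified model in which a hit costs the cache latency and a miss costs the memory latency. The 1 ns and 100 ns latencies are made-up values for illustration, not figures from any particular CPU.

```python
def average_access_time(hit_rate, cache_time_ns, memory_time_ns):
    """Weighted average of cache and memory latency (simplified model)."""
    return hit_rate * cache_time_ns + (1.0 - hit_rate) * memory_time_ns


# Hypothetical latencies: 1 ns for the cache, 100 ns for main memory.
with_cache = average_access_time(0.90, 1.0, 100.0)
without_cache = average_access_time(0.0, 1.0, 100.0)
print(with_cache, without_cache)
```

Under these assumed numbers the average read takes roughly 10.9 ns instead of 100 ns, even though one read in ten still goes to memory.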

At present, caches are basically built from SRAM. SRAM stands for Static RAM: memory that retains its contents without a refresh circuit, as long as it is powered. DRAM, by contrast, must be refreshed and recharged at regular intervals, otherwise its contents are lost. This is why SRAM offers higher performance. SRAM also has a shortcoming, namely low integration density: for the same capacity, DRAM can be built in a much smaller area, while SRAM needs a large one. This is an important reason why cache capacity cannot currently be made very large. To summarize, the advantages of SRAM are energy saving, high speed, and no need for a memory refresh circuit, all of which improve overall efficiency; the disadvantages are low integration density, large size for a given capacity, and high price. As a result, it is used only in small quantities in critical parts of a system to improve efficiency.

According to the order in which they are read and how tightly they are integrated with the CPU, CPU caches are divided into first-level (L1) cache and second-level (L2) cache, and some high-end CPUs also have a third-level (L3) cache. All the data stored in each level of cache is also part of the next level of cache. Technical difficulty and manufacturing cost decrease from the first level to the third, so capacity increases in the same order. When the CPU wants to read a piece of data, it first looks in the first-level cache; if the data is not there, it looks in the second-level cache; and if it is still not found, it looks in the third-level cache or memory.
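The lookup order described above can be sketched as follows. This is a simplified illustration: each cache level is modeled as an unbounded Python dictionary with no eviction, and the `lookup` helper and its fill policy (copying the value into every faster level that missed) are assumptions rather than a description of a specific CPU.

```python
def lookup(address, levels, memory):
    """Search each cache level in order; fall back to memory on a full miss."""
    for depth, (name, cache) in enumerate(levels):
        if address in cache:
            value = cache[address]
            source = name
            break
    else:
        # No level held the address, so read it from main memory.
        value = memory[address]
        source = "memory"
        depth = len(levels)
    # Assumed fill policy: copy the value into every faster level that missed.
    for _, cache in levels[:depth]:
        cache[address] = value
    return value, source


l1, l2, l3 = {}, {0x10: "a"}, {0x10: "a", 0x20: "b"}
memory = {0x10: "a", 0x20: "b", 0x30: "c"}
levels = [("L1", l1), ("L2", l2), ("L3", l3)]

first = lookup(0x10, levels, memory)    # misses L1, found in L2
second = lookup(0x10, levels, memory)   # now found in L1
third = lookup(0x30, levels, memory)    # misses every level, read from memory
print(first, second, third)
```

After the first call the value has been copied into L1, so the second read of the same address never reaches L2.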

Generally speaking, the hit rate of each level of cache is about 80%. That is, about 80% of all data requested can be found in the first-level cache, and only about 20% has to be read from the second-level cache, the third-level cache or memory. The first-level cache is therefore the most important part of the entire CPU cache architecture.
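The arithmetic behind this can be worked through in a short sketch. It assumes that "80% per level" means each level serves 80% of the requests that actually reach it; that interpretation, and the `served_fractions` helper, are assumptions made for illustration.

```python
def served_fractions(per_level_hit_rate, num_levels):
    """Return the fraction of all requests served by each level, plus the remainder."""
    fractions = []
    remaining = 1.0   # fraction of requests not yet served
    for _ in range(num_levels):
        served = remaining * per_level_hit_rate
        fractions.append(served)
        remaining -= served
    return fractions, remaining


fractions, to_memory = served_fractions(0.80, 3)
for level, fraction in enumerate(fractions, start=1):
    print(f"L{level}: {fraction:.1%}")
print(f"memory: {to_memory:.1%}")
```

Under this assumption, L1 serves about 80% of all requests, L2 about 16%, L3 about 3.2%, and less than 1% reach memory, which illustrates why the first level carries most of the load.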

Level 1 cache (L1 cache) is located next to the CPU core. It is the cache most closely integrated with the CPU, and it is also the earliest CPU cache in history. The first-level cache has the highest technical difficulty and manufacturing cost, so increasing its capacity raises difficulty and cost sharply while the performance gain is not obvious, which makes for a very poor price/performance ratio. Moreover, the hit rate of existing first-level caches is already very high. For these reasons the first-level cache is the smallest of all the caches, much smaller than the second-level cache.

Generally speaking, the first-level cache is divided into a first-level data cache (Data Cache, D-Cache) and a first-level instruction cache (Instruction Cache, I-Cache). The former stores data and the latter stores the instructions that operate on that data, which are decoded in real time. The CPU can access both at the same time, which reduces conflicts caused by contention for the cache and improves processor efficiency. At present, in most CPUs the first-level data cache and first-level instruction cache have the same capacity. For example, AMD's Athlon XP has a 64KB first-level data cache and a 64KB first-level instruction cache, so its first-level cache is written as 64KB+64KB. The first-level caches of other CPUs are written in the same way.

Intel CPUs based on the NetBurst architecture (most typically the Pentium 4) have a special first-level cache: a newly added first-level trace cache (Execution Trace Cache, T-Cache or ETC) replaces the first-level instruction cache. Its capacity is 12K μops, which means it can store 12K, or about 12,000, decoded microinstructions.

The first-level trace cache works differently from a first-level instruction cache. With a first-level instruction cache, instructions are decoded in real time each time they are fetched, and the decoded results are not stored. The first-level trace cache also holds decoded instructions, which are called micro-ops (μops), so that these microinstructions do not have to be decoded again every time the program executes them. The first-level trace cache can therefore effectively increase instruction decoding capacity at high operating frequencies, and it supplies μops to the processor core at high speed.

The Intel NetBurst microarchitecture uses the execution trace cache to separate the decoder from the execution loop. The trace cache supplies μops to the core at high bandwidth, which helps make full use of the instruction-level parallelism in software. Intel does not disclose the actual capacity of the first-level trace cache in bytes, only that it can store 12,000 microinstructions (micro-ops). Therefore, the number of microinstructions cannot simply be used to compare it with the size of an instruction cache.

In fact, for a single-core NetBurst CPU a trace cache of 8K μops is basically sufficient, and the extra 4K μops greatly improve the cache hit rate. With Hyper-Threading enabled, however, 12K μops is somewhat insufficient, which is an important reason why Intel processors sometimes lose performance when Hyper-Threading is used.

For example, the first-level cache of the Northwood core is 8KB + 12K μops, which means an 8KB first-level data cache and a 12K μops first-level trace cache. The first-level cache of the Prescott core is 16KB + 12K μops, which means a 16KB first-level data cache and a 12K μops first-level trace cache. Note that 12K μops is definitely not equal to 12KB: the units are different (one is μops, the other is bytes), and the two caches operate in completely different ways. It is therefore completely wrong to simply add up the first-level cache of an Intel CPU, counting the Northwood core as a 20KB first-level cache and the Prescott core as a 28KB first-level cache, and then conclude that the first-level cache of Intel processors is much smaller than the 128KB of AMD processors. The two are not comparable.

When comparing CPUs whose architectures differ to some degree, many caches no longer have a direct counterpart, and even caches with similar names can differ in design approach and functional definition, so they cannot be compared by simple arithmetic addition. When comparing CPUs with very similar architectures, it is meaningful to compare the sizes of the caches for each function individually.
