
Why multiple processor cores are needed

In the early years of the new millennium, when CPU frequencies finally passed the 1 GHz mark, some companies (let's not point fingers at Intel) predicted that the new NetBurst architecture could eventually reach frequencies on the order of 10 GHz. Enthusiasts anticipated a new era in which CPU clock speeds would spring up like mushrooms after rain. Need more performance? Just upgrade to a processor with a higher clock speed.

Newton's apple fell loudly on the heads of dreamers who saw megahertz as the easiest way to continue to grow PC performance. Physical limitations did not allow an exponential increase in clock frequency without a corresponding increase in heat generation, and other problems associated with production technologies also began to arise. Indeed, in recent years, the fastest processors have been operating at frequencies from 3 to 4 GHz.

Of course, progress doesn't stop as long as people are willing to pay for it - there are plenty of users ready to shell out a considerable sum for a more powerful computer. So engineers began looking for other ways to increase performance, in particular improving the efficiency of instruction execution rather than relying on clock speed alone. Parallelism turned out to be a solution as well: if you can't make one CPU faster, why not add a second identical one to increase computing resources?

The Pentium EE 840 was the first dual-core CPU to hit retail.

The main problem with parallelism is that software must be specially written to distribute the load across multiple threads - so, unlike with frequency, you don't get an immediate return on the investment. In 2005, when the first dual-core processors came out, they did not provide significant performance gains, since very little desktop software could take advantage of them. In fact, most dual-core CPUs were slower than single-core processors for most tasks, since the single-core CPUs ran at higher clock speeds.

However, four years have passed, and a lot has changed since then. Many software developers have optimized their products to take advantage of multiple cores. Single-core processors are hard to find on the market today, and dual-, triple- and quad-core CPUs are considered commonplace.

But the question arises: how many CPU cores do you really need? Is a triple-core processor enough for gaming, or is it worth paying extra for a quad-core chip? Is a dual-core processor enough for the average user, or do more cores really make a difference? Which applications are optimized for multiple cores, and which respond only to changes in other specifications, such as frequency or cache size?

We thought it was a good time to test the applications from our updated benchmark suite (the update is still in progress) on single-, dual-, triple- and quad-core configurations to understand just how valuable multi-core processors have become in 2009.

To make the tests fair, we chose a quad-core processor - an Intel Core 2 Quad Q6600 overclocked to 2.7 GHz. After running the tests, we disabled one of the cores, rebooted, and repeated them. By turning off cores one at a time we obtained results for every number of active cores from four down to one, while the processor and its frequency remained unchanged.

Disabling CPU cores under Windows is very easy. If you want to know how to do this, type "msconfig" in the Windows Vista "Start Search" window and press "Enter". This will open the System Configuration utility.

In it, go to the "Boot" tab and press the "Advanced options" button.

This will bring up the "BOOT Advanced Options" window. Select the "Number of Processors" checkbox and specify the number of processor cores that will be active in the system. Everything is very simple.

After confirming, the utility will prompt you to reboot. Once rebooted, you can see the number of active cores in Task Manager, which is opened by pressing Ctrl + Shift + Esc.

Select the Performance tab in Task Manager. There you can see a load graph for each processor core (whether a physical core or a virtual processor, as with a Core i7 with Hyper-Threading enabled) under "CPU Usage History". Two graphs mean two active cores, three mean three, and so on.
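The number of graphs Task Manager draws matches the logical processors the OS sees, which can also be checked programmatically. A minimal Python sketch (any language with a similar call would do):

```python
import os

# os.cpu_count() reports logical processors: on a chip with
# Hyper-Threading enabled this is twice the physical core count,
# and after limiting cores via msconfig and rebooting it should
# reflect the reduced number.
logical = os.cpu_count()
print(f"Task Manager should show {logical} usage graphs")
```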

Now that you have familiarized yourself with the methodology of our tests, let us move on to a detailed examination of the configuration of the test computer and programs.

Test configuration

System hardware
CPU Intel Core 2 Quad Q6600 (Kentsfield), 2.7 GHz, FSB-1200, 8 MB L2 cache
Platform MSI P7N SLI Platinum, Nvidia nForce 750i, BIOS A2
Memory A-Data EXTREME DDR2 800+, 2 x 2048 MB, DDR2-800, CL 5-5-5-18 1.8 V
HDD Western Digital Caviar WD5000AAJS-00YFA, 500 GB, 7200 RPM, 8 MB cache, SATA 3.0 Gb/s
Network Integrated nForce 750i Gigabit Ethernet Controller
Video cards Gigabyte GV-N250ZL-1GI 1 GB DDR3 PCIe
Power Supply Ultra HE1000X, ATX 2.2, 1000W
Software and drivers
Operating system Microsoft Windows Vista Ultimate 64-bit 6.0.6001, SP1
DirectX version DirectX 10
Platform driver nForce Driver Version 15.25
Graphics driver Nvidia Forceware 182.50

Tests and settings

3D games
Crysis Quality settings set to lowest, Object Detail to High, Physics to Very High, version 1.2.1, 1024x768, Benchmark tool, 3-run average
Left 4 dead Quality settings set to lowest, 1024x768, version 1.0.1.1, timed demo.
World in Conflict Quality settings set to lowest, 1024x768, Patch 1.009, Built-in benchmark.
iTunes Version: 8.1.0.52, Audio CD ("Terminator II" SE), 53 min., Default format AAC
Lame MP3 Version: 3.98 (64-bit), Audio CD "Terminator II" SE, 53 min, WAV to MP3, 160 Kb/s
TMPEG 4.6 Version: 4.6.3.268, Import File: "Terminator II" SE DVD (5 minutes), Resolution: 720x576 (PAL) 16:9
DivX 6.8.5 Encoding mode: Insane Quality, Enhanced Multi-Threading, Enabled using SSE4, Quarter-pixel search
XviD 1.2.1 Display encoding status = off
MainConcept Reference 1.6.1 MPEG2 to MPEG2 (H.264), MainConcept H.264/AVC Codec, 28 sec HDTV 1920x1080 (MPEG2), Audio: MPEG2 (44.1 KHz, 2 Channel, 16-Bit, 224 Kb/s), Mode: PAL (25 FPS), Profile: Tom's Hardware Settings for Quad-Core
Autodesk 3D Studio Max 2009 (64-bit) Version: 2009, Rendering Dragon Image at 1920x1080 (HDTV)
Adobe Photoshop CS3 Version: 10.0x20070321, Filtering from a 69 MB TIF-Photo, Benchmark: Tomshardware-Benchmark V1.0.0.4, Filters: Crosshatch, Glass, Sumi-e, Accented Edges, Angled Strokes, Sprayed Strokes
Grisoft AVG Antivirus 8 Version: 8.0.134, Virus base: 270.4.5 / 1533, Benchmark: Scan 334 MB Folder of ZIP / RAR compressed files
WinRAR 3.80 Version 3.80, Benchmark: THG-Workload (334 MB)
WinZip 12 Version 12, Compression = Best, Benchmark: THG-Workload (334 MB)
3DMark Vantage Version: 1.02, GPU and CPU scores
PCMark Vantage Version: 1.00, System, Memory, Hard Disk Drive benchmarks, Windows Media Player 10.00.00.3646
SiSoftware Sandra 2009 SP3 CPU Test = CPU Arithmetic / MultiMedia, Memory Test = Bandwidth Benchmark

Test results

Let's start with the results of synthetic tests, in order to then evaluate how well they correspond to real tests. It is important to remember that synthetic tests are written for the future, so they should be more responsive to changes in the number of cores than real applications.

We'll start with the 3DMark Vantage synthetic gaming performance benchmark. We chose the "Entry" run, which 3DMark runs at the lowest resolution available, so that CPU performance has a greater impact on the result.

Almost linear growth is quite interesting. The biggest gain is observed when moving from one core to two, but then the scalability can be traced quite noticeably. Now let's move on to the PCMark Vantage benchmark, which is designed to display overall system performance.

The PCMark results suggest that the end user will benefit from up to three CPU cores, while the fourth core will reduce performance slightly. Let's see what this result is related to.

In the memory subsystem test, we again observe the largest performance gain when moving from one CPU core to two.

The productivity test, in our opinion, has the greatest influence on the overall PCMark test result, since in this case the performance gain ends at three cores. Let's see if the results of another SiSoft Sandra synthetic benchmark are the same.

We'll start with SiSoft Sandra's arithmetic and multimedia tests.


Synthetic tests demonstrate a fairly linear performance gain when moving from one CPU core to four. This benchmark was written specifically to make good use of the four cores, but we doubt real-world applications will see the same linear progress.

Sandra's memory benchmark also suggests that three cores will provide more memory bandwidth in iSSE2 integer buffered operations.

After the synthetic tests, it's time to see what we get in the application tests.

Audio encoding has traditionally been a segment in which applications have not benefited greatly from multiple cores or have not been optimized by the developers. Below are the results from Lame and iTunes.

Lame does not show much benefit when using multiple cores. Interestingly, we see a small performance gain with an even number of cores, which is rather strange. However, the difference is small, so it just might be within the margin of error.

As for iTunes, we see a small performance gain after activating two cores, but more cores do nothing.

It turns out that neither Lame nor iTunes are optimized for multiple CPU cores for audio encoding. On the other hand, as far as we know, video encoding programs are often highly optimized for multiple cores due to their inherently parallel nature. Let's take a look at the video encoding results.

We'll start our video encoding tests with the MainConcept Reference.

Notice how much the result is affected by increasing the number of cores: encoding time decreases from nine minutes on a single-core 2.7GHz Core 2 processor to just two minutes and 30 seconds when all four cores are active. It is quite clear that if you often transcode video, then it is better to take a processor with four cores.

Will we get similar benefits in the TMPGEnc benchmarks?

Here you can see how much the encoder itself affects the result. While the DivX encoder is highly optimized for multiple CPU cores, Xvid does not show such a noticeable advantage. However, even Xvid gives a 25% reduction in encoding time when moving from one core to two.

Let's start our graphics tests with Adobe Photoshop.

As you can see, the CS3 version does not notice the extra cores. That is a strange result for such a popular program, though admittedly we were not using the newest version, Photoshop CS4. The CS3 results are still not encouraging.

Let's take a look at the 3D rendering results in Autodesk 3ds Max.

It is quite obvious that Autodesk 3ds Max "loves" additional cores. This was true even when the program ran under DOS: 3D rendering took so long that it was worth distributing the job across several computers on a network. Again, quad-core processors are highly desirable for such programs.

The antivirus scan test is very close to real life conditions as almost everyone uses antivirus software.

AVG Antivirus demonstrates wonderful performance gains when increasing CPU cores. Computer performance can be severely degraded during antivirus scans, and the results clearly show that multiple cores can significantly reduce scan times.


WinZip and WinRAR do not provide noticeable gains on multiple cores. WinRAR demonstrates performance gains on two cores, but nothing more. It will be interesting to see how the just released version 3.90 performs.

In 2005, when dual-core desktop computers began to emerge, there were simply no games that could demonstrate a performance gain in moving from single-core to multi-core CPUs. But times have changed. How do multiple CPU cores affect modern games? Let's run some popular titles and see. We ran the gaming tests at a low 1024x768 resolution and low detail levels to minimize the graphics card's impact and determine how strongly these games depend on CPU performance.

Let's start with Crysis. We've minimized all the options except for the object detail, which we set to "High", and the Physics, which we set to "Very High". As a result, the performance of the game should be more dependent on the CPU.

Crysis showed an impressive dependence on the number of CPU cores, which is quite surprising since we thought it was more sensitive to graphics card performance. In any case, you can see that in Crysis a single-core CPU delivers half the frame rate of four cores (remember, though, that if the game depended more on the video card, the spread between core counts would be smaller). It is also interesting that Crysis uses at most three cores, since adding a fourth makes no noticeable difference.

But we know that Crysis is serious about physics calculations, so let's see what the situation will be in a game with less advanced physics. For example, in Left 4 Dead.

Interestingly, Left 4 Dead shows a similar result, although the lion's share of the gain comes from adding a second core. There is a slight increase with three cores, but the fourth is not needed for this game. An interesting trend. Let's see whether it holds for the real-time strategy World in Conflict.

The results are similar again, but with a curious twist - three CPU cores deliver slightly better performance than four. The difference is close to the margin of error, but it confirms once more that the fourth core goes unused in games.

It's time to draw conclusions. Since we got a lot of data, let's simplify the situation by calculating the average performance gain.

First, the synthetic tests are too optimistic about multiple cores compared to real applications. Their performance gain when moving from one core to several looks almost linear, with each new core adding about 50%.

In applications, we see more realistic progress - about 35% increase from the second CPU core, 15% increase from the third and 32% increase from the fourth. It's strange that by adding a third core, we only get half the advantage that the fourth core gives.
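To make those averages concrete, the quoted per-step gains compound as follows (a small arithmetic sketch using the figures above):

```python
# Per-step application gains quoted above: +35% for the 2nd core,
# +15% for the 3rd, +32% for the 4th.
gains = [1.35, 1.15, 1.32]

speedup = 1.0
for cores, g in zip(range(2, 5), gains):
    speedup *= g
    print(f"{cores} cores: {speedup:.2f}x a single core")
```

Compounded, four cores come out at roughly 2.05x a single core in the applications tested.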

In applications, however, it is better to look at individual programs rather than the overall result. Indeed, audio encoding applications, for example, do not benefit at all from an increase in the number of cores. On the other hand, video encoding applications offer significant benefits from a larger number of CPU cores, although this is quite dependent on the encoder used. In the case of the 3D renderer 3ds Max, we can see that it is heavily optimized for multi-core environments, and 2D photo editing applications like Photoshop are unresponsive to the number of cores. AVG Antivirus has shown significant performance gains on multiple cores, and the gain is not so great on file compression utilities.

As for games, when moving from one core to two, performance increases by 60%, and after adding a third core to the system, we get another 25% lead. The fourth core has no advantage in the games we have chosen. Of course, if we took more games, then the situation could change, but, in any case, the Phenom II X3 triple-core processors seem to be a very attractive and inexpensive choice for the gamer. It is important to note that when moving to higher resolutions and adding visual detail, the difference due to the number of cores will be less, since the graphics card will be the decisive factor affecting the frame rate.


Four cores.

Taking into account all that has been said and done, a number of conclusions can be drawn. Overall, you don't need to be any kind of professional user to benefit from a multi-core CPU setup. The situation has changed significantly compared to what it was four years ago. Of course, the difference does not seem so significant at first glance, but it is quite interesting to note how much applications have become optimized for multithreading in the last few years, especially those programs that can give a significant performance increase from this optimization. In fact, we can say that today it makes no sense to recommend single-core CPUs (if you still find them), except for solutions with low power consumption.

In addition, there are applications for which users are encouraged to buy processors with as many cores as possible. These include video encoding programs, 3D rendering programs, and optimized work applications, including antivirus software. For gamers, gone are the days when a single-core processor with a powerful graphics card was enough.

When buying a processor, many people try to choose something fancier, with several cores and a high clock speed. At the same time, few know what the number of cores actually affects. Why, for example, can an ordinary, unassuming dual-core processor be faster than a quad-core one, or a 4-core chip faster than an 8-core one? This is an interesting topic that is worth exploring in more detail.

Introduction

Before examining what the number of processor cores affects, a small digression. Until a few years ago, CPU designers were confident that rapidly advancing manufacturing technology would yield "stones" with clock speeds up to 10 GHz, letting users forget about performance problems. However, no such success was achieved.

However the process technology developed, both Intel and AMD ran into purely physical limitations that simply did not allow producing processors with clock frequencies approaching 10 GHz. The focus then shifted from frequency to core count, starting a new race to build more powerful and efficient processor dies - a race that continues to this day, though not as actively as at first.

Intel and AMD processors

Today Intel and AMD are direct competitors in the processor market. Judging by revenue and sales, the "blues" have a clear advantage, although lately the "reds" have been catching up. Both companies offer a good range of ready-made solutions for every occasion - from simple processors with 1-2 cores to real monsters with more than 8. Such "stones" are usually found in specialized workstations with a narrow focus.

Intel

So, today Intel offers five processor families: Celeron, Pentium, and Core i3, i5 and i7. Each of these "stones" has a different number of cores and is designed for different tasks. For example, Celeron has only 2 cores and is used mainly in office and home computers. Pentium, also nicknamed the "stump", is likewise aimed at home use but offers much better performance, primarily thanks to Hyper-Threading technology, which adds two virtual cores, called threads, to the two physical ones. A dual-core chip thus works like a budget quad-core processor; that is not strictly accurate, but it is the main point.

As for the Core line, the situation is similar. The junior model, Core i3, has 2 cores and 2 additional threads. The mid-range Core i5 has a full 4 or 6 cores but lacks Hyper-Threading, so it has no threads beyond its 4-6 physical ones. Finally, Core i7 chips are top-end processors that typically carry 4 to 6 cores and twice as many threads - for example, 4 cores and 8 threads, or 6 cores and 12 threads.

AMD

Now for AMD. The company's list of chips is huge, and there is no point listing them all, since most models are simply outdated. Worth noting is the new generation that in a sense mirrors Intel's lineup: Ryzen, whose models are likewise numbered 3, 5 and 7. The main difference from the "blues" is that the youngest Ryzen already provides a full 4 cores, while the oldest has eight rather than six. Thread counts also vary: Ryzen 3 has 4 threads, Ryzen 5 has 8-12 (depending on whether it has 4 or 6 cores), and Ryzen 7 has 16 threads.

Also worth mentioning is another "red" line, FX, which appeared in 2012. The platform is already considered outdated, but because more and more programs and games now support multithreading, the Vishera lineup has regained popularity, and with its low prices that popularity keeps growing.

As for the debate between clock frequency and core count, it is more correct to look at the latter: clock frequencies settled long ago, and even top models from Intel run at a nominal 2.7, 2.8 or 3 GHz. Besides, frequency can always be raised by overclocking, although with a dual-core processor that will not help much.

How to find out how many cores

If you don't know how to determine the number of processor cores, it's easy to do without downloading or installing any special programs. Just open "Device Manager" and click the small arrow next to the "Processors" item.

More detailed information - which technologies your "stone" supports, its clock frequency, revision number and much more - is available through the small free utility CPU-Z. You can download it from the official website, and there is a portable version that requires no installation.
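Part of what CPU-Z reports can also be pulled straight from the OS. A Python sketch using only the standard library (the exact strings returned vary by platform and may be empty on some systems):

```python
import os
import platform

# Basic identification without any third-party tools.
print("Processor   :", platform.processor() or "unknown")
print("Architecture:", platform.machine())
print("Logical CPUs:", os.cpu_count())
```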

The advantage of two cores

What advantage can a dual-core processor offer? It shows in games and applications developed primarily for single-threaded operation. Take World of Tanks, for example: an ordinary dual-core Pentium or Celeron delivers quite decent performance, while an AMD FX or Intel Core chip would use only a fraction of its capabilities for roughly the same result.

The better 4 cores

How can 4 cores be better than two? Better performance. Quad-core "stones" are designed for more serious work, where simple Pentiums or Celerons just can't cope. Any 3D graphics program like 3ds Max or Cinema 4D is a great example.

During the rendering process, these programs use the maximum computer resources, including RAM and processor. Dual-core CPUs will lag very much in render processing time, and the more complex the scene, the longer it will take. But processors with four cores will cope with this task much faster, since additional threads will also come to their aid.

Of course, you could take a budget processor from the Core i3 family, such as the 6100, but its 2 cores and 2 additional threads will still fall short of a full quad-core processor.

6 and 8 cores

Finally, the last multi-core segment: processors with six and eight cores. Their purpose is essentially the same as the CPUs above, only they are needed where ordinary quad-cores can't cope. In addition, "stones" with 6 and 8 cores form the basis of dedicated workstations built for specific activities - video editing, 3D modeling, rendering heavy scenes with large numbers of polygons and objects, and so on.

In addition, such multi-cores show themselves very well in working with archivers or in applications where good computing power is needed. In games that are optimized for multithreading, there is no equal to such processors.

What affects the number of processor cores

So what else does the number of cores affect? First of all, power consumption. Surprising as it sounds, it's true. Don't worry too much, though - in everyday use the difference is barely noticeable.

The second is heat. The more cores, the better the cooling system needs to be. The AIDA64 program can measure processor temperature: at startup, click "Computer" and then select "Sensors". It's worth monitoring the temperature, because a processor that constantly overheats or runs too hot will eventually fail.

Dual-core processors are unfamiliar with such a problem, because they do not have too high performance and heat dissipation, respectively, but multi-core processors - yes. The "hottest" stones are considered to be from AMD, especially the FX series. For example, take the FX-6300. The processor temperature in the AIDA64 program is around 40 degrees and this is in idle mode. Under load, the figure will grow and if overheating occurs, the computer will turn off. So, when buying a multi-core processor, you shouldn't forget about a cooler.

What else does the number of cores affect? Multitasking. Dual-core processors cannot provide stable performance when two, three or more programs run simultaneously. The simplest example is Internet streamers: besides playing a game at high settings, they run a program broadcasting the gameplay online, plus a browser with several open pages where the player reads viewers' comments and monitors other information. Not every multi-core processor can guarantee smoothness here, let alone dual- and single-core ones.

It is also worth saying a few words about a very useful feature of multi-core processors: the third-level (L3) cache. This cache has a certain amount of memory that continuously stores information about running programs, recent actions and so on, all to increase the computer's speed and responsiveness. For example, if a person often uses Photoshop, the data the program needs will stay in the cache, and the time to launch and open it will be noticeably reduced.
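The principle described here - keep frequently used data close so repeated access is fast - is the same caching idea software itself relies on. A tiny, purely illustrative Python sketch (the `slow_lookup` function is hypothetical):

```python
from functools import lru_cache

CALLS = 0

@lru_cache(maxsize=None)
def slow_lookup(key):
    # Hypothetical expensive operation (think: reading from disk).
    global CALLS
    CALLS += 1
    return key * 2

slow_lookup(21)   # miss: does the expensive work
slow_lookup(21)   # hit: answered from the cache, no work done
print("expensive calls:", CALLS)  # -> 1
```

The second call never repeats the expensive work, just as a warm CPU cache spares a trip to main memory.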

Summarizing

Summing up the conversation about what the number of processor cores affects, we can come to one simple conclusion: if you need good performance, speed, multitasking, work in heavy applications, the ability to comfortably play modern games, etc., then your choice is processor with four cores or more. If you need a simple "computer" for office or home use, which will be used to a minimum, then 2 cores are what you need. In any case, when choosing a processor, you first need to analyze all your needs and tasks, and only then consider any options.

Good afternoon, dear readers of our tech blog. Today we have not a review but a comparison: which processor is better, 2-core or 4-core? Who is cooler in 2018? Let's get started. Let's say right away that in most cases the palm goes to the device with more physical modules, but 2-core chips are not as simple as they seem at first glance.

Many have probably guessed that we will consider the current representatives of Intel's Pentium Coffee Lake family and the popular "hyperpen" G4560 (Kaby Lake). How relevant are these models this year, and is it worth paying for a more productive AMD Ryzen or a quad-core Core i3 instead?

The AMD Godavari and Bristol Ridge family is deliberately not considered for one simple reason - it has no further potential, and the platform itself turned out to be not as successful as it might have been expected.

Often these solutions are bought either out of ignorance or as the cheapest possible build for web browsing and online films. We are not particularly happy with this state of affairs.

Differences between 2-core chips and 4-core chips

Let's consider the main points that distinguish the first category of chips from the second. At the hardware level, only the number of computing units differs. Beyond that, the cores share a high-speed data exchange bus and a common memory controller for efficient work with RAM.

Usually each core has its own individual L1 cache, while the L2 cache can be either shared by all cores or per-core as well. In the latter case, a shared L3 cache is additionally used.

In theory, 4-core solutions should be 2 times faster and more powerful, since they perform 100% more operations per clock (let's take as a basis the identical frequency, cache, technical process and all other parameters). But in practice, the situation changes in a completely non-linear way.

But credit where it's due: in multithreaded workloads, the whole point of 4 cores is fully revealed.
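The non-linear scaling has a classic explanation: Amdahl's law. If only a fraction p of a program runs in parallel, n cores give a speedup of 1 / ((1 - p) + p / n). A quick sketch:

```python
def amdahl_speedup(p, n):
    """Speedup on n cores when fraction p of the work parallelizes."""
    return 1.0 / ((1.0 - p) + p / n)

# Even with 80% of the work parallelized, 4 cores fall well short
# of the theoretical 4x.
for n in (1, 2, 3, 4):
    print(f"{n} cores: {amdahl_speedup(0.8, n):.2f}x")
```

With 80% of the work parallelized, four cores deliver 2.5x rather than 4x - which matches the "completely non-linear" behavior noted above.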

Why are 2-core processors still popular?

If you look at the mobile segment of electronics, you will notice the dominance of 6-8-core chips, which look perfectly natural there and load in parallel across all tasks. Why? The Android and iOS operating systems are fairly young, highly competitive platforms, so optimizing every application is key to selling devices.

With the PC industry, the situation is different and here's why:

Compatibility. When developing software, developers try to please both new audiences and older ones with weak hardware, so the emphasis remains on dual-core processors at the expense of 8-core support.

Parallelization of tasks. Despite the state of technology in 2018, making a single program run across multiple cores and CPU threads in parallel is still not easy. When several completely different applications need computing, there are no questions; but within one program it is harder: completely different data must be processed simultaneously, without losing track of each task's completion or introducing calculation errors.

In games, the situation is even more interesting, since it is practically impossible to divide the workload into equal shares. The result is a familiar picture: one computing unit is loaded to 100% while the other three wait their turn.

Continuity. Each new solution builds on previous developments. Writing code from scratch is not only expensive, but also often unprofitable for the development center, since "this is enough for people, and users of 2-core chips are still the lion's share."

Take, for example, many iconic projects like Lineage 2, AION and World of Tanks. All of them were built on ancient engines that can adequately load only one physical core, so the chip's frequency plays the main role in performance.

Financing. Not everyone can afford to create a completely new product designed for 4, 8 or 16 threads. It is too expensive and in most cases unjustified. Take the cult hit GTA V, which will happily "eat" 12 or even 16 threads, never mind cores.

The cost of its development has exceeded a good 200 million dollars, which in itself is very expensive. Yes, the game was a success, as Rockstar's credibility was enormous among the players. What if it was a young startup? Here you already understand everything.
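The "one computing unit at 100% while three wait" picture from the parallelization point above can be put into numbers: with uneven shares, total time is set by the busiest core (the makespan), not by an ideal even split. A sketch with made-up workload figures:

```python
# Hypothetical per-core work shares for one game frame, in ms.
# The engine dumped almost everything on core 0.
shares = [12.0, 1.5, 1.0, 0.5]

ideal = sum(shares) / len(shares)   # perfectly balanced split
actual = max(shares)                # the busiest core sets the pace

print(f"ideal frame time : {ideal:.2f} ms")
print(f"actual frame time: {actual:.2f} ms")
print(f"imbalance penalty: {actual / ideal:.1f}x slower than ideal")
```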

Do you need multi-core processors?

Let's look at the situation from the point of view of a common man in the street. Most users need 2 cores for the following reasons:

  • low needs;
  • most applications are stable;
  • games are not a top priority;
  • low cost of assemblies;
  • processors themselves are cheap;
  • the majority buy ready-made solutions;
  • some users have no idea what is being sold in stores and are perfectly happy.

Can I play on 2 cores? Yes, no problem, as the Intel Core i3 line up to the 7th generation successfully proved for several years. Also very popular were the Kaby Lake Pentiums, which brought Hyper-Threading support to the Pentium line for the first time.
Is it worth buying 2 cores now, even with 4 threads? Only for office tasks. The era of these chips is gradually passing, and manufacturers have begun a mass switch to 4 full physical cores, so it is not worth considering the same Pentium and Core i3 Kaby Lake in the long term. AMD has abandoned dual-core designs entirely.


In this article I will try to describe the terminology used for systems capable of executing multiple programs in parallel: multiprocessor, multicore and multithreaded. The different kinds of parallelism in IA-32 CPUs appeared at different times and in a somewhat inconsistent manner, so it is easy to get confused in all this, especially since operating systems carefully hide the details from not-too-sophisticated application programs.

The purpose of the article is to show that with all the variety of possible configurations of multiprocessor, multicore and multithreaded systems for programs running on them, opportunities are created both for abstraction (ignoring differences) and for taking into account the specifics (the ability to programmatically find out the configuration).

A note on the ® and ™ marks in this article

My comment explains why company employees must use trademark marks in public communications. In this article, I had to use them quite often.

CPU

Of course, the oldest, most often used and controversial term is "processor".

In the modern world, a processor is what we buy in a beautiful retail box or a not-so-beautiful OEM package: an indivisible entity inserted into a socket on the motherboard. Even if there is no socket and the processor cannot be removed — that is, it is soldered on — it is still a single chip.

Mobile systems (phones, tablets, laptops) and most desktops have a single processor. Workstations and servers sometimes boast two or more processors on a single motherboard.

Supporting multiple CPUs in one system requires numerous design changes. At a minimum, it is necessary to ensure their physical connection (provide several sockets on the motherboard), to resolve questions of processor identification (see later in this article, as well as my previous note), of coordinating memory accesses and of delivering interrupts (the interrupt controller must be able to route interrupts to multiple processors), and, of course, to have support from the operating system. Unfortunately, I could not find a documented mention of the first multiprocessor system built on Intel processors; however, Wikipedia states that Sequent Computer Systems was already shipping them in 1987, using Intel 80386 processors. Widespread support for multiple chips in one system became available starting with the Intel® Pentium.

If there are several processors, each of them has its own socket on the board. Each has complete, independent copies of all resources: registers, execution units, caches. What they share is the common memory — RAM. Memory can be connected to them in various and rather non-trivial ways, but that is a separate story beyond the scope of this article. What matters is that in every scenario, the executable programs must be given the illusion of uniform shared memory, accessible from all processors in the system.


Ready for takeoff! Intel® Desktop Board D5400XS

Core

Historically, multi-core in Intel IA-32 appeared later than Intel® HyperThreading, but in the logical hierarchy it comes next.

It would seem that the more processors a system has, the higher its performance (on tasks that can use all the resources). However, if the cost of communication between them is too high, all the gain from parallelism is killed by long delays in transferring shared data. This is exactly what is observed in multiprocessor systems: the processors are very far from each other, both physically and logically. Effective communication in such conditions requires inventing specialized buses such as the Intel® QuickPath Interconnect, and the power consumption, size and price of the final solution certainly do not decrease from all this. High integration of components comes to the rescue: the circuits executing the parts of a parallel program must be brought closer to each other, preferably onto one die. In other words, several cores should be organized within one processor, identical to each other in every way but working independently.

Intel's first multi-core IA-32 processors were introduced in 2005. Since then, the average number of cores in server, desktop, and now mobile platforms has been steadily growing.

Unlike two single-core processors in the same system, which share only memory, two cores can also share caches and other resources responsible for interacting with memory. Most often, the first-level caches remain private (each core has its own), while the second and third levels can be either shared or separate. This organization reduces delays in data delivery between neighboring cores, especially if they are working on a common task.


A micrograph of a quad-core Intel processor, codenamed Nehalem. Visible are the separate cores, the shared L3 cache, the QPI links to other processors, and the common memory controller.

Hyperthreading

Until about 2002, the only way to build an IA-32 system capable of executing two or more programs in parallel was to use multiple processors. The Intel® Pentium® 4, as well as the Xeon line codenamed Foster (NetBurst), introduced a new technology — hyper-threading, Intel® Hyper-Threading (hereinafter HT).

There is nothing new under the sun. HT is a special case of what the literature calls simultaneous multithreading (SMT). Unlike "real" cores, which are complete and independent copies, with HT only a part of the internal units is duplicated within one core — primarily those responsible for storing the architectural state, the registers. The execution units responsible for organizing and processing data exist in a single copy, and at any given time are used by at most one of the threads. Like cores, hyperthreads share the caches, but at which level depends on the specific system.

I will not try to explain all the pros and cons of designs with SMT in general and with HT in particular. The interested reader can find a fairly detailed discussion of the technology in many sources, and of course on Wikipedia. However, I will note the following important point, which explains the current limits on the number of hyperthreads in real products.

Limits on the number of threads
When is "dishonest" multicore in the form of HT justified? If one application thread cannot load all the execution units inside the core, they can be "lent" to another thread. This is typical of applications whose bottleneck is not computation but data access — that is, applications that frequently generate cache misses and have to wait for data to be delivered from memory. During that time, a core without HT would be forced to idle. HT makes it possible to quickly switch the free execution units to another architectural state (which is exactly what is duplicated) and execute its instructions. This is a special case of a technique called latency hiding, in which one long operation, during which useful resources sit idle, is masked by the parallel execution of other tasks. If an application already keeps the core's resources highly utilized, the presence of hyperthreads will not speed it up — "honest" cores are needed here.
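The latency-hiding idea can be illustrated with a toy sketch in Python — purely for illustration, since real HT operates at the level of hardware execution units, not OS threads: while one thread waits, another does useful work, so the total time is close to that of the longest task rather than the sum of both.

```python
# A toy illustration of latency hiding with OS threads: a sleep stands in
# for a long memory access; while one thread waits, the other computes.
import threading
import time

results = {}

def waiter():
    time.sleep(0.05)                      # "cache miss": just waiting
    results["waiter"] = "data"

def worker():
    results["worker"] = sum(range(1000))  # useful work done meanwhile

t1 = threading.Thread(target=waiter)
t2 = threading.Thread(target=worker)
start = time.monotonic()
t1.start(); t2.start()
t1.join(); t2.join()
elapsed = time.monotonic() - start
# Total time is close to the longest single task, not the sum of both.
print(results, round(elapsed, 2))
```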

Typical scenarios for desktop and server applications on general-purpose machine architectures do have parallelism that HT can exploit; however, this potential is quickly "used up". Perhaps for this reason, on almost all IA-32 processors the number of hardware hyperthreads does not exceed two: in typical scenarios the gain from three or more hyperthreads would be small, while the loss in die area, power consumption and cost would be significant.

A different situation is observed in the typical tasks run on video accelerators, which is why those architectures tend to use SMT with a large number of threads. Since the Intel® Xeon Phi coprocessors (introduced in 2010) are ideologically and genealogically quite close to video cards, they have four hyperthreads per core — a configuration unique to IA-32.

Logical processor

Of the three described "levels" of parallelism (processors, cores, hyperthreads), any or all may be missing in a particular system. This is affected by BIOS settings (multi-core and multithreading can be disabled independently), by microarchitectural features (for example, HT was absent in the Intel® Core™ Duo but returned with the release of Nehalem), and by system events (multiprocessor servers can shut down failed processors when malfunctions are detected and continue to "fly" on the remaining ones). How is this multi-tier zoo of concurrency visible to the operating system and, ultimately, to applications?

Further, for convenience, we denote the number of processors, cores, and threads in a system by the triple (x, y, z), where x is the number of processors, y is the number of cores in each processor, and z is the number of hyperthreads in each core. Hereinafter I will call this triple the topology — an established term that has little to do with the branch of mathematics. The product p = xyz defines the number of entities called logical processors in the system. It defines the total number of independent contexts of application processes in a shared-memory system that the operating system is forced to consider. I say "forced" because it cannot control the order of execution of two processes on different logical processors. This also applies to hyperthreads: although they run "sequentially" on the same core, the specific order is dictated by the hardware and is not available for programs to observe or control.
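As a quick illustration of the triple and the product p = xyz (a minimal sketch; the function name is my own):

```python
# A minimal sketch: the product p = x*y*z gives the number of logical
# processors for a topology triple (x, y, z).
def logical_processors(topology):
    x, y, z = topology  # processors, cores per processor, hyperthreads per core
    return x * y * z

# Three different physical topologies that all look like
# "two logical processors" to the operating system:
for t in [(2, 1, 1), (1, 2, 1), (1, 1, 2)]:
    assert logical_processors(t) == 2
```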

Most often, the operating system hides the physical topology of the system it runs on from end applications. For example, it will present the following three topologies — (2, 1, 1), (1, 2, 1) and (1, 1, 2) — identically, as two logical processors, even though the first has two processors, the second two cores, and the third just two threads.


Windows Task Manager shows 8 logical processors; but how many processors, cores and hyperthreads is that?


Linux top shows 4 logical processors.

This is quite convenient for application developers - they do not have to deal with the hardware features that are often irrelevant to them.

Determining the topology programmatically

Of course, abstracting the topology into a single number of logical processors in some cases creates plenty of grounds for confusion and misunderstanding (in heated Internet disputes). Computing applications that want to squeeze maximum performance out of the hardware require detailed control over where their threads will be placed: closer together on neighboring hyperthreads or, conversely, farther apart on different processors. The speed of communication between logical processors within a single core or processor is much higher than the speed of data transfer between processors. The possibility of a heterogeneous organization of RAM further complicates the picture.
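For illustration, here is a hedged, Linux-only sketch of such placement control using Python's os.sched_setaffinity; which logical CPU numbers are hyperthread siblings of each other depends on the machine.

```python
# A hedged sketch (Linux-only): pinning the current process to chosen
# logical processors via os.sched_setaffinity. Which logical CPU numbers
# are hyperthread siblings of each other depends on the machine.
import os

def pin_to(cpus):
    """Restrict the calling process to the given set of logical CPUs."""
    os.sched_setaffinity(0, cpus)   # pid 0 means "the calling process"
    return os.sched_getaffinity(0)  # read back the effective mask

# Example: pin to logical processor 0 only (CPU 0 always exists).
print(pin_to({0}))
```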

Information about the topology of the system as a whole, and about the position of each logical processor in it, is available in IA-32 through the CPUID instruction. Since the appearance of the first multiprocessor systems, the logical processor identification scheme has been extended several times. To date, parts of it live in leaves 1, 4 and 11 of CPUID. Which leaf to consult can be determined from the following flowchart, taken from the article:

I will not bore you here with all the details of the individual parts of this algorithm; if there is interest, the next part of this article can be devoted to them. I will refer the interested reader to the source in which this question is examined in as much detail as possible. Here I will first briefly describe what the APIC is and how it relates to topology, and then look at working with leaf 0xB (eleven in decimal), which is currently the last word in "APIC construction".

APIC ID
The local APIC (Advanced Programmable Interrupt Controller) is a device (now part of the processor) responsible for handling interrupts arriving at a specific logical processor. Each logical processor has its own APIC, and each APIC in the system must have a unique APIC ID value. This number is used by interrupt controllers for addressing when delivering messages, and by everyone else (for example, the operating system) to identify logical processors. The specification of this interrupt controller has evolved from the Intel 8259 PIC through Dual PIC, APIC and xAPIC to x2APIC.

Currently, the number stored in the APIC ID has reached a full 32 bits in width, although in the past it was limited to 16 and, earlier still, to only 8 bits. Today, remnants of the old days are scattered all over CPUID, but CPUID.0xB.EDX returns all 32 bits of the APIC ID. Each logical processor that independently executes the CPUID instruction will report its own value.

Clarification of family ties
The APIC ID value by itself says nothing about the topology. To find out which two logical processors sit inside one core (i.e., are hyperthread "siblings"), which two are inside the same processor, and which are in entirely different processors, you need to compare their APIC ID values. Depending on the degree of kinship, some of their bits will coincide. This information is contained in the CPUID.0xB sub-leaves, which are selected with the ECX operand. Each sub-leaf describes in EAX the position of the bit field of one topology level (more precisely, the number of bits the APIC ID must be shifted right to strip off the lower topology levels), and in ECX the type of that level — hyperthread, core, or processor.

Logical processors inside the same core will have identical APIC ID bits except those belonging to the SMT field. Logical processors in the same processor share all bits except the Core and SMT fields. Since the number of sub-leaves of CPUID.0xB can grow, this scheme will make it possible to describe topologies with more levels should the need arise in the future; moreover, intermediate levels could be introduced between the existing ones.

An important consequence of this scheme is that there may be "holes" in the set of all APIC IDs of all logical processors in the system — they will not necessarily be sequential. For example, in a multicore processor with HT turned off, all APIC IDs may turn out to be even, since the least significant bit, which encodes the hyperthread number, will always be zero.
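The bit-field scheme can be sketched as follows; the field widths here (1 bit for SMT, 3 bits for core) are hypothetical values of the kind a real CPUID.0xB enumeration would report:

```python
# A sketch of the APIC ID field scheme. The shift widths below are
# hypothetical; on a real machine they come from CPUID.0xB sub-leaves.
SMT_BITS = 1    # width of the hyperthread (SMT) field — assumed
CORE_BITS = 3   # width of the core field — assumed

def decode_apic_id(apic_id):
    """Split an APIC ID into (package, core, smt) fields."""
    smt = apic_id & ((1 << SMT_BITS) - 1)
    core = (apic_id >> SMT_BITS) & ((1 << CORE_BITS) - 1)
    package = apic_id >> (SMT_BITS + CORE_BITS)
    return package, core, smt

def same_core(a, b):
    """Two logical processors are hyperthread siblings if their APIC IDs
    differ only in the SMT field."""
    return (a >> SMT_BITS) == (b >> SMT_BITS)

assert same_core(0b0000, 0b0001)       # siblings on the same core
assert not same_core(0b0000, 0b0010)   # different cores
# With HT off, the SMT bit is always zero, so all APIC IDs come out even:
assert all(apic & 1 == 0 for apic in (0, 2, 4, 6))
```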

Note that CPUID.0xB is not the only source of information about logical processors available to the operating system. A list of all processors available to it, along with their APIC ID values, is encoded in the MADT ACPI table.

Operating systems and topology

Operating systems provide logical processor topology information to applications through their own interfaces.

On Linux, topology information is contained in the /proc/cpuinfo pseudo-file and in the output of the dmidecode command. In the example below, I filter the cpuinfo contents of a dual-core system with HT (four logical processors), leaving only the topology-related entries:


$ cat /proc/cpuinfo | grep "processor\|physical id\|siblings\|core id\|cpu cores\|apicid"
processor       : 0
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
processor       : 1
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 1
initial apicid  : 1
processor       : 2
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 2
apicid          : 2
initial apicid  : 2
processor       : 3
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 2
apicid          : 3
initial apicid  : 3
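To illustrate, here is a sketch that recovers the (processors, cores, logical CPUs) counts from text in this format; the sample data is inlined and abbreviated, so this is an illustration rather than a substitute for reading the real /proc/cpuinfo:

```python
# A sketch of recovering the topology from /proc/cpuinfo-style text.
# The sample is inlined (and trimmed to three fields) for self-containment.
sample = """\
processor : 0
physical id : 0
core id : 0
processor : 1
physical id : 0
core id : 0
processor : 2
physical id : 0
core id : 1
processor : 3
physical id : 0
core id : 1
"""

def parse_topology(text):
    """Return (packages, cores, logical CPUs) parsed from cpuinfo text."""
    cpus, current = [], {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor" and current:
            cpus.append(current)   # a new "processor" stanza begins
            current = {}
        current[key] = int(value)
    if current:
        cpus.append(current)
    packages = {c["physical id"] for c in cpus}
    cores = {(c["physical id"], c["core id"]) for c in cpus}
    return len(packages), len(cores), len(cpus)

# The sample above: 1 package, 2 cores, 4 logical CPUs -> HT is on.
print(parse_topology(sample))
```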

In FreeBSD, the topology is reported via the sysctl mechanism in the kern.sched.topology_spec variable, as XML:


$ sysctl kern.sched.topology_spec
kern.sched.topology_spec: <groups>
 <group level="1" cache-level="0">
  <cpu count="8" mask="0xff">0, 1, 2, 3, 4, 5, 6, 7</cpu>
  <children>
   <group level="2" cache-level="2">
    <cpu count="8" mask="0xff">0, 1, 2, 3, 4, 5, 6, 7</cpu>
    <children>
     <group level="3" cache-level="1">
      <cpu count="2" mask="0x3">0, 1</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
     </group>
     <group level="3" cache-level="1">
      <cpu count="2" mask="0xc">2, 3</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
     </group>
     <group level="3" cache-level="1">
      <cpu count="2" mask="0x30">4, 5</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
     </group>
     <group level="3" cache-level="1">
      <cpu count="2" mask="0xc0">6, 7</cpu>
      <flags><flag name="THREAD">THREAD group</flag><flag name="SMT">SMT group</flag></flags>
     </group>
    </children>
   </group>
  </children>
 </group>
</groups>

In MS Windows 8, topology information can be seen in the Task Manager.

Good afternoon, dear readers of our blog. Today we will try to figure out which matters more: the frequency or the number of processor cores. What does each of these parameters affect in everyday use, in games and in professional applications? Does manual overclocking make sense? In short, let's dig into how it all works.

The comparison plan is simplicity itself:

  • the advantages of a high clock speed;
  • the advantages of a large number of processor cores;
  • the need for this or that, depending on the selected tasks;
  • results.

Now let's get started.

High frequency is the key to comfortable gaming

Let's plunge straight into the gaming industry and count on the fingers of one hand the games that need multithreading for comfortable play. Only the latest Ubisoft titles (Assassin's Creed Origins, Watch Dogs 2), the old GTA V, the fresh Deus Ex and Metro Last Light Redux come to mind. These projects will easily "eat up" all the free computing power of the processor, cores and threads included.

But this is rather the exception to the rule, since other games depend more on CPU frequency and video memory resources. In other words, if you decide to run the good old DOOM on an AMD Ryzen Threadripper 1950X with its 16 processing cores (expensive, powerful), you will be extremely disappointed, for the following reasons:

  • FPS will be low;
  • most cores and threads are idle;
  • overpayment is highly doubtful.

And all because this chip is aimed at professional computing, rendering, video processing and other tasks in which it is the threads, not the frequency headroom, that "decide".
Swap the AMD for an Intel Core i5 8600K and we see an unexpected result: the frame count rises, picture stability improves, and all the cores are used optimally. And if you overclock the chip, the picture becomes absolutely gorgeous. This is because games still make good use of 4 to 8 cores (the exceptions described above aside), and further growth in physical and virtual threads is simply unjustified — there is no point chasing it.

When do you need multithreading

Now let's compare two top solutions from Intel and AMD in professional tasks: the Core i7-8700K (6 cores/12 threads, L3 — 9 MB) and the Ryzen 7 2700X (8/16, L3 — 16 MB). Here the number of cores and threads plays the main role, in tasks such as:

  • archiving;
  • data processing;
  • rendering;
  • work with graphics;
  • creation of complex 3D objects;
  • application development.

It should be noted that if a program is not designed for multithreading, Intel takes the palm thanks to its higher frequency; in all other cases the lead stays with the "reds".

Let's summarize

Now let's think logically. Over the past few years, AMD and Intel have drawn roughly level in performance. Both chips are built for the current Ryzen+ (AM4) and Coffee Lake (s1151v2) platforms and have excellent overclocking potential as well as headroom for the future.

If your primary task is to get high FPS in modern gaming projects, then the "blue" platform looks like the better choice here.

However, it should be understood that a high frame rate will be noticeable only on monitors with a frequency of 120 Hz and above. At 60 hertz, you simply won't notice the difference in smoothness.

All other things being equal, the AMD version looks more "omnivorous" and universal, and its extra cores open up new prospects — such as the streaming that is so popular on YouTube.

We hope you now understand what is the difference between the frequency and the number of cores, and in what cases the overpayment for threads is justified.

I believe there can be no winner in this fight, since the contenders in our comparisons fought in different weight categories.

On that note, let's wrap up; don't forget to subscribe to the blog.
