September 01, 2007
Choosing Dual or Quad Core
I'm a big fan of dual-core systems. I think there's a clear and substantial benefit for all computer users when there are two CPUs waiting to service requests, instead of just one. If nothing else, it lets you gracefully terminate an application that has gone haywire, consuming all available CPU time. It's like having a backup CPU in reserve, waiting to jump in and assist as necessary. But for most software, you hit a point of diminishing returns very rapidly after two cores. In Quad-Core Desktops and Diminishing Returns, I questioned how effectively today's software can really use even four CPU cores, much less the inevitable eight and sixteen CPU cores we'll see a few years from now.
To get a sense of what kind of performance improvement we can expect going from 2 to 4 CPU cores, let's focus on the Core 2 Duo E6600 and Core 2 Quad Q6600 processors. These 2.4 GHz CPUs are identical in every respect, except for the number of cores they bring to the table. In a recent review, Scott Wasson at the always-thorough Tech Report presented a slew of benchmarks that included both of these processors. Here's a quick visual summary of how much you can expect performance to improve when upgrading from 2 to 4 CPU cores:
Benchmark | improvement 2 to 4 cores | Task Manager CPU Graph |
The Elder Scrolls IV: Oblivion | none | |
Rainbow 6: Vegas | none | |
Supreme Commander | none | |
Valve Source engine particle simulation | 1.8 x | |
Valve VRAD map compilation | 1.9 x | |
3DMark06: Return to Proxycon | none | |
3DMark06: Firefly Forest | none | |
3DMark06: Canyon Flight | none | |
3DMark06: Deep Freeze | none | |
3DMark06: CPU test 1 | 1.7 x | |
3DMark06: CPU test 2 | 1.6 x | |
The Panorama Factory | 1.6 x | |
picCOLOR | 1.4 x | |
Windows Media Encoder x64 | 1.6 x | |
Lame MT MP3 encoder | none | |
Cinebench | 1.7 x | |
POV-Ray | 2.0 x | |
Myrimatch | 1.8 x | |
STARS Euler3D | 1.5 x | |
SiSoft Sandra Mandelbrot | 2.0 x |
The results seem encouraging, until you take a look at the applications that benefit from quad-core: the ones that aren't purely synthetic benchmarks are rendering, encoding, or scientific applications. It's the same old story. Beyond encoding and rendering tasks, which are naturally amenable to parallelization, the task manager CPU graphs tell the sad tale of software that simply isn't written to exploit more than two CPUs.
Unfortunately, CPU parallelism is inevitable. Clock speed can't increase forever; the physics don't work. Mindlessly ramping clock speed to 10 GHz isn't an option. CPU vendors are forced to deliver more CPU cores running at nearly the same clock speed, or at very small speed bumps. Increasing the number of CPU cores on a die should defeat raw clock speed increases, at least in theory. In the short term, we have to choose between faster dual-core systems, or slower quad-core systems. Today, a quad-core 2.4 GHz CPU costs about the same as a dual-core 3.0 GHz CPU. But which one will provide superior performance? A recent Xbit Labs review performed exactly this comparison:
Benchmark | 3.0 GHz Dual Core | 2.4 GHz Quad Core | improvement 2 to 4 cores |
PCMark05 | 9091 | 8853 | -3% |
SysMark 2007, E-Learning | 167 | 140 | -16% |
SysMark 2007, Video Creation | 131 | 151 | 15% |
SysMark 2007, Productivity | 152 | 138 | -9% |
SysMark 2007, 3D | 160 | 148 | -8% |
Quake 4 | 136 | 117 | -15% |
F.E.A.R. | 123 | 110 | -10% |
Company of Heroes | 173 | 161 | -7% |
Lost Planet | 62 | 54 | -12% |
Lost Planet "Concurrent Operations" | 62 | 81 | 30% |
DivX 6.6 | 65 | 64 | 0% |
Xvid 1.2 | 43 | 45 | 5% |
H.264 QuickTime Pro 7.2 | 189 | 188 | 0% |
iTunes 7.3 MP3 encoding | 110 | 131 | -16% |
3ds Max 9 SP2 | 4.95 | 6.61 | 33% |
Cinebench 10 | 5861 | 8744 | 49% |
Excel 2007 | 39.9 | 24.4 | 63% |
WinRAR 3.7 | 188 | 180 | 5% |
Photoshop CS3 | 70 | 73 | -4% |
Microsoft Movie Maker 6.0 | 73 | 80 | -9% |
It's mostly what I would expect: only rendering and encoding tasks exploit parallelism enough to overcome the 25% clock speed deficit between the dual and quad core CPUs (3.0 GHz versus 2.4 GHz). Outside of that specific niche, performance will actually suffer for most general purpose software if you choose a slower quad-core over a faster dual-core.
However, there were some surprises in here, such as Excel 2007, and the Lost Planet "concurrent operations" setting. It's possible software engineering will eventually advance to the point that clock speed matters less than parallelism. Or eventually it might be irrelevant, if we don't get to make the choice between faster clock speeds and more CPU cores. But in the meantime, clock speed wins most of the time. More CPU cores aren't automatically better. Typical users will be better off with the fastest possible dual-core CPU they can afford.
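To put a rough number on that tradeoff, here is a minimal sketch that uses Amdahl's law to estimate how much of a program has to run in parallel before a 2.4 GHz quad-core pulls ahead of a 3.0 GHz dual-core. It assumes throughput scales linearly with clock speed and that the parallel part scales perfectly across cores, ignoring memory, cache, and scheduling effects. The function names are illustrative and do not come from either review.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Speedup over one core when `parallel_fraction` of the work scales perfectly."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_throughput(parallel_fraction, cores, clock_ghz):
    """Assumed model: throughput = clock speed x Amdahl speedup."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


def break_even_fraction(fast_clock=3.0, fast_cores=2, slow_clock=2.4, slow_cores=4):
    """Find by bisection the parallel fraction at which both CPUs tie."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        quad = relative_throughput(mid, slow_cores, slow_clock)
        dual = relative_throughput(mid, fast_cores, fast_clock)
        if quad < dual:
            lo = mid   # quad still behind: needs a larger parallel fraction
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    p = break_even_fraction()
    print(f"quad wins once more than {p:.0%} of the work is parallel")
```

Under these simplified assumptions the break-even point lands at about 57% parallel work, which fits the pattern in the table: renderers clear that bar easily, and most desktop software does not.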
Posted by Jeff Atwood
I'm disappointed Jeff. Where's the 2 to 4 core comparison for Visual Studio and other compilers?
This is a .NET blog right?
Cameron on September 3, 2007 09:28 PM
Thanks for that! What about servers? Web applications, databases... Will quad-core systems add benefit there?
Pierre on September 3, 2007 09:36 PM
Depends if you want a single application to go faster or you have several apps you want to go faster.
Say.... Running Several instances of Visual Studio and a VMWare... etc etc
Keith Nicholas on September 3, 2007 09:44 PM
Stuff like 3D rendering, or compositing applications, or pretty much anything dealing with processing images, can very easily be split into regions, for rendering by separate cores.
Modo (3d app) is the most obvious example of this.
When you render something with two cores, you see two little blue boxes processing a segment of the image. If you have 4, you see four little boxes. If you have two quad-core machines doing network-rendering, you see four blue boxes (local cores rendering), and four orange boxes (the remote box rendering)
Even I, who am not the best coder in the world, could work out how to write a renderer to take advantage of multiple cores.
Whereas with games, I can't think of anything that could utilize the spare CPU cores..
I wonder if it's even remotely possible, but: To use extra cores as "software-graphics-cards". Since graphics are the only thing that really needs lots more processing power in games, it'd make sense to, say, divide the screen up between them, and use the remaining two cores to process extra effects on their area on screen. Biggest problem being the CPUs aren't as fast at drawing stuff to the screen as graphics cards are...
But, yeh.. For gaming, dual (or even single) core processors are more than enough. CPUs are generally not the bottleneck for games.
Buut... For 3D/compositing workstations, a quad-core CPU (or dual-CPU quad-core) does substantially speed up rendering.
Another thought, to add to this slightly rambling comment:
MP3 encoding. Instead of speeding up a single-MP3 encoding, why not have the application process 4 different files at once? It'd be much simpler to code, since you don't need to worry about parallelizing the encoding process; you just basically need to fork the encoding once for each core.
Since encoding the same MP3 over 4 cores probably wouldn't speed it up that much (The code would spend more time starting the next file than actually processing bits), completing four files at a time would complete the task faster.
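A minimal sketch of that one-file-per-core idea, assuming the encoder is an external command-line program. The child command here is only a placeholder (a tiny Python process standing in for a real encoder such as LAME), and `encode_file` and `encode_all` are hypothetical names.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def encode_file(path):
    """Run one encoder process for one file and return the output name.

    The command is a placeholder: a small Python child process that only
    echoes the output name, standing in for a real encoder.
    """
    out_path = os.path.splitext(path)[0] + ".mp3"
    cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", out_path]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def encode_all(paths, workers=None):
    """Keep up to `workers` encoder processes running, one file each."""
    workers = workers or os.cpu_count() or 1
    # The threads only wait on child processes, so the OS is free to
    # schedule each child on a different core.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even if files finish out of order.
        return list(pool.map(encode_file, paths))


if __name__ == "__main__":
    print(encode_all(["a.wav", "b.wav", "c.wav", "d.wav"]))
```

Nothing inside the encoder has to change for this to work, which is the appeal of the approach. The total batch finishes sooner, but any single file takes just as long as before.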
Please don't use red and green text that are otherwise identical (same saturation, value, font, and so on), to differentiate positive and negative results. Yes, there is a minus sign in front of the negative results, but this is slow for the brain to latch on to, especially since the font is rather thin.
There are many other ways to visually separate good and bad results, almost all of which are better than just red and green and no other differentiation. I've seen some beautiful and effective choices, though many tend to bias the reader (bolding the bad results, for instance). Personally I find just replacing the green with blue to be quite effective.
This is the sort of comparison you see all the time, and it may be an incredibly stupid question, but instead of seeing how one application does across multiple CPU cores, I'd like to know how the Operating System goes distributing several applications across cores.
Or is that not how it works?
Because if I can get four apps working at higher performance, sometimes that's a better scenario.
Of course there is also the point to be made of how many apps have ever been written to take advantage of multiple cores yet?
A lot of the ones I work with there just isn't the need.
Actually with all the background processing and everything I'd love to see how some of the WPF apps coming out are going to go.
Andrew Tobin on September 3, 2007 09:59 PM
The issue is certainly the software. I think that pretty much everything is parallelizable. The issue is that they aren't within our current programming paradigms. I think the question you should be asking is why we really need more powerful computers. The answer, I believe coincides much with the list of things multiple cores are good for!
You say "Unfortunately, CPU parallelism is inevitable.". Unfortunately? I think not! This is the opening for revolution in computer architecture and programming languages.
dbr - Games can actually be very easily parallelizable, and could easily take advantage of all the power offered by quad, 8, 16 cores, etc. The issue is that the game engine would have to be written with parallelization in mind (looks like Valve is doing this), and the benefit isn't huge when not too many have dual/quads. It wouldn't be unreasonable to devote one core to managing the graphics card(s), push & load content, etc. One or two cores could do physics stuff, in the absence of a PPU. I'd love to see a game that actually benefits from an entire core devoted to AI and game play. It is nearly impossible to design a game such that it smoothly scales from single core to several cores, though. It changes the game too much.
Michael Sloan on September 3, 2007 10:14 PM
@dbr:
There is actually quite a bit that can be sped up in games, outside of the pure rendering aspect. For instance, depending on the algorithms used, AI can often be separated into global and individual "thinking" -- the latter can be distributed across cores. Even with a purely global AI design, simply moving the entire AI subsystem to a separate core may work well.
Then of course there's the sound subsystem, which can decently chew CPU when a great many environmental sound effect tracks are mixed by a 3D audio engine. Again, that can be thrown in its own thread.
And then there is physics. Some of it can be parallelized, and other pieces can't -- but certainly physics can be overlapped with rendering. Because the non-destructible portions of the environment will be unaffected by the results of the physics calculations, those can be rendered while physics is still being run for a given frame. Also, once the physics is completed for a given frame, the results can be passed off to the renderer while the physics computation begins on the *next* frame.
Weather and other complex but slowly-changing environmental effects can be computed in separate threads that post results asynchronously to the main engine. Networking/world synchronization can run in a thread of its own.
And the list goes on .... Mind you, such a heavily threaded design is not necessarily *easy*, but it certainly is worth the work, as Valve has been quick to point out.
Yep. Main benefit of multi-core on the desktop is avoiding excessive context switching. As you say, diminishing returns above two unless the software has been explicitly parallelized.
Evan on September 3, 2007 10:34 PM
> What about servers?
Servers are totally different scenarios. There are plenty of users who believe their desktop usage scenarios are similar to servers, but it's utter wishful thinking on their part..
> if you want a single application to go faster or you have several apps you want to go faster.
Within reason, yes, but dual-core gets you 99% of the benefit of (n) core. If you're not careful, this becomes the wishful thinking scenario I just described. No matter how much of an ultra-elite-ninja single user you are, I guarantee you're not generating anything close to the kind of load that a server would experience under even the mildest of loads. Desktops aren't servers.
> Where as with games, I can't think of anything that could utilize the spare CPU cores..
http://news.zdnet.com/2100-9584_22-6119913.html
--
One such company is Remedy, which demonstrated a game called "Alan Wake" at the Intel show.
The game is designed to farm tasks to different processor cores, said Markus Maki, director of development, in an interview. There are three major program threads and each can occupy a core of its own: one for the main game action, one for simulating physics of game objects and one for preparing terrain information that's later sent to the graphics chip for rendering. A fourth core can handle other threads, including playing sound and retrieving data from a DVD, Maki said
--
I have yet to see a single game that shows *anything* close to the kind of scaling that we regularly see with rendering or encoding.
Approaches like this sound good on paper, but developers are seriously hobbled by the existing market of single and dual core CPUs. They have to write AI that can scale between an entire core on a quad-core machine, 1/2 of a core on a dual, or 15% of CPU time on a single.
Jeff Atwood on September 3, 2007 10:34 PM
Why look at today's programs performance with tomorrow's cpu setups? Surely after time programs will be written to take advantage of multiple cores. Remember there was a time when "no user of a pc" would need more than 637k of RAM ;)
kenny on September 3, 2007 10:45 PM
> Where's the 2 to 4 core comparison for Visual Studio and other compilers?
If you can find these kinds of benchmarks, then godspeed. They're rare. The very first link in this post contains one compilation benchmark, but it's dual-core:
http://www.codinghorror.com/blog/archives/000285.html
This review shows no scaling improvement for quad-core in Visual Studio 2005 compilation:
http://xtreview.com/review212.htm
The gcc compiler does support multiple cores and seems to scale fairly well:
http://www.phoronix.com/scan.php?page=article&item=585&num=4
Cheat sheet for the last graph: E5320 is quad 1.86 Ghz; E5150 is dual 2.66 Ghz.
single E5150 -- 12.06 sec
single E5320 -- 11.08 sec
http://techreport.com/articles.x/11237
I think there's a better article though, I don't have time to find it now. One of the interviews with Valve about multi-core support explains some of the benefits and difficulties with programming for multi-core (or more adapting existing code for multi-core).
[ICR] on September 3, 2007 10:52 PM
Con: Amdahl's Law
http://en.wikipedia.org/wiki/Amdahl%27s_law
Pro: Reevaluating Amdahl's Law
http://www.scl.ameslab.gov/Publications/Gus/AmdahlsLaw/Amdahls.html
The Xbit Labs review can't have activated the "threads=x" option for xvid. xvid encoding on a quad core Mac Pro either from command line or from Handbrake maxes out all four cores, and hits about 95fps encoding rate (with all the quality options on).
Matthew on September 3, 2007 11:12 PM
> But in the meantime, clock speed wins most of the time. More CPU cores isn't automatically better.
More CPU cores still allow you to run more applications with less contention for CPU resources (you may get starved for memory bandwidth though).
In this day and age of Firefox and other IntelliJ/Eclipse/Visual Studio (while I do love them you can't consider them lightweight on either memory or resources), having more CPU cores allows your computer to still be responsive even though you're running Firefox *and* your IDE *and* some expensive compilation *and* even more without having to rely on nicing processes.
Masklinn on September 3, 2007 11:13 PM
.NET compilation gets some multi-core love with the 3.5 Framework[1]. I've been using this for a while on my home projects. It helps a bit, but not a ton. If you have a lot of projects and a clean dependency graph, it can shave a decent amount of time off the total build, but it varies a lot.
1. http://blogs.msdn.com/msbuild/archive/2007/04/26/building-projects-in-parallel.aspx
Now, even there the drop-off is significant after /m:2. On my Q6600@3.3Ghz, running with 4 build nodes (/m:4) is rarely any faster than running with 2 (/m:2). Here are some fresh timings for a clean build on a small-to-medium size project:
/m:1 - 4.39s, 4.24s, 4.71s (4.45s avg)
/m:2 - 3.58s, 3.65s, 3.60s (3.61s avg)
/m:3 - 3.86s, 3.52s, 3.74s (3.70s avg)
/m:4 - 3.19s, 3.75s, 3.86s (3.60s avg)
This is around 2.5 MB of source code spread out over 16 projects.
Even so, I'm pleased with my Q6600. It wasn't very much more expensive than the dual core, and usually there are quite a few things going on besides compilation to take advantage of the extra power.
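As a rough check on those numbers, this small sketch turns the timings above into speedups relative to the single-node build. The timings are copied from the list; the helper name `speedups` is just for illustration.

```python
from statistics import mean

# Clean-build timings in seconds, copied from the runs listed above,
# keyed by the number of build nodes (/m:N).
timings = {
    1: [4.39, 4.24, 4.71],
    2: [3.58, 3.65, 3.60],
    3: [3.86, 3.52, 3.74],
    4: [3.19, 3.75, 3.86],
}


def speedups(runs_by_nodes):
    """Return {build nodes: speedup relative to the single-node average}."""
    baseline = mean(runs_by_nodes[1])
    return {n: baseline / mean(runs) for n, runs in runs_by_nodes.items()}


if __name__ == "__main__":
    for nodes, s in speedups(timings).items():
        print(f"/m:{nodes} -> {s:.2f}x")
```

Every multi-node setting comes out at roughly 1.2x, so on this project the gain stops growing after the second build node.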
Derek on September 3, 2007 11:23 PM
I'm very surprised that the Erlang fan boys haven't jumped in here yet.
I test my own OS and play around with other OS's in emulators all the time. I'm only on a single-core CPU at the moment (to upgrade means new everything, pretty much) and a friend with a dual-core allowed me to try some emulation on his system.
First thing I noticed was the difference moving the emulator's process onto the second core (via task manager's "Set Affinity" option) made to the running of the rest of the system. Note that this isn't Virtual PC (which can run at very low CPU usage), these are emulators such as Bochs, QEMU and PearPC, all of which enjoy eating up valuable CPU time.
> Why look at today's programs performance with tomorrow's cpu setups?
Agreed.
pcmattman on September 4, 2007 12:04 AM
It's almost pointless to worry about multicore performance in standard apps at this stage of the game. One can't just go back and "add in" support for multicore in any significant application, beyond trivial stuff that can be stuck in a background thread (which should be done already, for UI interactivity).
To really see the benefit of multicore, applications will not only have to be largely rewritten, but devs will have to start thinking in a completely different way. Not just in a "how can we thread this algorithm" way, but in a "what should we pre-emptively compute just in case the user wants it" way. The latter is where multicore starts to make sense, but it's much harder than the former (which is already pretty hard). As far as I know, the only people really thinking about this are at Microsoft Research...
Regardless, massive redesigns won't be justifiable to the shareholders until everyone *has* the multicore systems. It's chicken-and-egg. Hence, it's our job as forward-thinking developers to convince everyone we know to buy N-core over (N/2)-core, so that we'll have more cores to play with.
Remember, more cores == more awesome. For the children.
I'm actually working on a single, dual, quad MSBUILD benchmark now...soon.
Scott Hanselman on September 4, 2007 12:10 AM
> Regardless, massive redesigns won't be justifiable to the shareholders until everyone *has* the multicore systems.
In an ideal world.
In reality you start redesigning as soon as one of the shareholders has one of the multicore systems.
Here in the Netherlands, the price for a quad-core 6600 is 50 euros higher than the dual-core 6600... it's really a no-brainer.
Frans Bouma on September 4, 2007 12:33 AM
> I'm very surprised that the Erlang fan boys haven't jumped in here yet.
They don't care, their software makes use of multiple cores just fine already.
Masklinn on September 4, 2007 12:38 AM
I don't consider performance tests run in a vacuum to be a great measure of true performance.
I would think running the tests while listening to music, surfing performance tuning sites in IE, Outlook periodically checking for mail, an IM chat app running, a sidebar full of gadgets, and a task bar of the normal bloatware (Adobe or Steam) would be a better representation of the performance gains.
brian on September 4, 2007 12:42 AM
Even Anandtech has a hard time coming up with realistic multitasking benchmarks that stress a quad-core machine:
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2879&p=12
--
When we were trying to think up new multitasking benchmarks to truly stress Kentsfield and Quad FX platforms we kept running into these interesting but fairly out-there scenarios that did a great job of stressing our test beds, but a terrible job and making a case for how you could use quad-core today.
--
Their answer? H.264 blu-ray video playback while "doing something else". Lame. How do you watch a movie and do something else at the same time?
On the other hand, doing a lot of rendering or encoding in the background makes sense. But I'd argue this is an extraordinarily rare activity for mainstream computer users. Perhaps if video editing really takes off, and everyone's a star on YouTube with their own show..
Jeff Atwood on September 4, 2007 12:51 AM
Your bottom four comparison percentages seem to have the wrong "polarity".
Mark on September 4, 2007 12:58 AM
Compiling a C++ project in Visual Studio 2005 with a Quadcore requires to use Incredibuild, otherwise 3 CPUs sit idle for 80% of the time.
Going from a 7200 to a 10000 RPM boot disk has the same effect as going from a Dualcore to a Quadcore machine (from ~17 minutes down to ~13 minutes). So you will probably benefit the same or even more from a faster Dualcore AND a 10k RPM hard drive than from a Quadcore alone. Of course, if you can have both, go for it. But unless you use Incredibuild with Visual Studio, even if only as a standalone solution, you just won't get very good parallelism out of Visual Studio (C++ compiler) alone.
>> I'm very surprised that the Erlang fan boys haven't jumped in here yet.
> They don't care, their software makes use of multiple cores just fine already.
I wonder, what about Stackless Python?..
Vladimir on September 4, 2007 01:06 AM
Re. Visual Studio, isn't it a breach of the license terms to publish performance information? It certainly is with SQL Server! This would explain why there isn't any data out there.
Syd on September 4, 2007 01:07 AM
One area where quad is definitely better is for optimization. With dual core, it's difficult to optimize for quad, but with quad you can optimize for quad, dual, uni-core systems.
Anytime you're optimizing parallel code for a wide audience, you want more cores. And with quads about to become the baseline (and already very cheap), choosing dual cores is probably no longer the right choice.
Andrew Binstock on September 4, 2007 01:10 AM
Hi, first time poster here. I like the blog, very interesting and thought provoking (although I don't always agree!).
I think the problem with current software and multicore CPUs is the threading model the OSes use. It's not easy to scale threads dynamically, let alone load balance them. I've been toying around with the idea of a system based on small chunks of work which are given to whichever processor is free - i.e. a single queue with multiple servers. It's not easy. Incidentally, I came up with the idea whilst working on PS3 hardware.
Skizz
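The "single queue with multiple servers" idea can be sketched in a few lines. This is a minimal illustration only, not Skizz's actual design: the function name is hypothetical, and in CPython threads will not run pure-Python work on several cores at once, so the sketch shows the scheduling pattern rather than a real speedup.

```python
import queue
import threading


def run_work_queue(tasks, worker_count=4):
    """Run callables from one shared queue on `worker_count` worker threads."""
    work = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = work.get()
            if task is None:          # sentinel: no more work for this worker
                break
            value = task()
            with results_lock:        # results list is shared between workers
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for task in tasks:
        work.put(task)
    for _ in threads:
        work.put(None)                # one sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    chunks = [lambda n=n: n * n for n in range(8)]
    print(sorted(run_work_queue(chunks)))
```

Because whichever worker is free takes the next chunk, load balancing falls out of the queue itself. The hard parts in practice are choosing the chunk size and handling dependencies between chunks.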
Skizz on September 4, 2007 01:24 AM
>> I have yet to see a single game that shows *anything* close to the kind of scaling that we regularly see with rendering or encoding.
Main gaming platforms (Xbox360, PS3) have been multicore for quite some time (XBox has 3 cores/6 hw threads. PS3 has a PPU and 6 SPUs). Most of the games these days are multithreaded on these platforms, simply because they have to be in order to survive! "Free GHz ride" never existed on consoles.
Erlang fanboys think their software uses multiple cores, but in fact you need to have multiple Erlang interpreter processes running to do that. The code to start Erlang processes in different native processes is different to just starting them in one process, and Erlang cannot move its processes from one native process to another if one is busy and the other is idle.
Asd on September 4, 2007 02:09 AM
If Intel is going to be pumping these XX-cores out, I'd imagine their friends at Microsoft and elsewhere would feel a push to writing software to fully utilise these cores so people will still enjoy faster experiences when buying their shiny new computers.
Else I'm sure word would get around pretty quickly, from friends and family, that those new XX-cores on the TV from dell aren't much faster than the box under their desk.
transcriber on September 4, 2007 02:32 AM
What about statistical software - "GNU R project" and such - these are the power-hungry applications and they are used by universities, who are always trigger happy to upgrade (and waste money in the process).
jinxs on September 4, 2007 02:35 AM
The noises coming from Intel for their 45nm process generation suggest that they may have cracked the leakage current problems that plagued the 90nm and 65nm generations, which I believe was largely the reason that the clock speed couldn't be ramped up (without causing massive heat dissipation problems).
If they really have fixed it, we could start seeing clock speeds going up again, although we might see lower-voltage, lower-power parts at current clock rates as well. Of course, now that multi-core has been introduced, it won't be taken away - the transistor budget for an out-of-order superscalar processor core is already well below the number of transistors it's possible to put on a chip-sized piece of silicon, so those transistors are effectively going free. What else are you going to do with them, add even more cache?
Mike Dimmick on September 4, 2007 02:51 AM
Two things:
1, Quad Core 2.4 Ghz prices are roughly comparable with Dual Core 3.0 Ghz, so you would expect similar performance. You'd also expect older applications that weren't built with parallelism in mind to not take full advantage of the full complement of cores. That's likely to change drastically over the next 18 months to 2 years.
2, From memory the 2.4 Ghz Quad is far more overclockable than the 3.0 GHz Dual (all other things being equal) - indeed, in your own blog : http://www.codinghorror.com/blog/archives/000908.html : you overclocked a Quad to 3.0 GHz. I'd be interested how a 2.4 clocked to 3.0 fared in the comparisons.
Dino on September 4, 2007 04:01 AM
@Asd
> Erlang fanboys think their software uses multiple cores, but in fact
> you need to have multiple Erlang interpreter processes running to do
> that. The code to start Erlang processes in different native processes
> is different to just starting them in one process, and Erlang cannot
> move its processes from one native process to another it one is busy
> and the other is idle.
That is not entirely true.
The Erlang VM automatically starts a thread for each core, each of which handles execution of one of the upcoming scheduled Erlang processes. Erlang code which has been written with concurrency in mind (that is, several processes that do different things at the same time, which is normal procedure in Erlang) will scale perfectly well on multi core systems without any modification at all.
See:
http://www.ericsson.com/technology/opensource/erlang/news/archive/erlang_goes_multi_core.shtml
That's nice... generalization based on one type of multi-core CPU.
How much of this is directly due to:
1 - non-parallelized code (granted, this is acknowledged)
and
2 - bus saturation
Cross comparisons using only parallelized code between different multi-core architectures PLEASE!
anonymoustroll on September 4, 2007 04:24 AM
Well it is at least good to see that Valve has their code going in the right direction. Let's hope that makes TeamFortress 2 that much more fun!
As for compile speed, disk is still the slow part of that process. I would rather have an SSD for a build drive any day for their high I/O per second compared to spinning platters. Processors, dual or quad, are still so far ahead of storage it is sick. Even tons of RAM for cache didn't seem to help our builds with disk continuing to be the bottleneck.
Even large scale vm systems are using multiple host adapters and more to get a performant virtual array that doesn't choke the vm. In fact an HBA per vm is not uncommon to get speeds acceptable.
And like another poster said, the cost difference is so low, why not just get the quad?
Your data seems contrary to something the Inquirer posted this morning:
http://www.theinquirer.net/default.aspx?article=42114
This suggests that in fact Lost Planet's performance increases considerably with a quad core compared to a dual core.
abhorsen666 on September 4, 2007 04:47 AM
I believe that most of the next batch of games will take advantage of quad-cores. Bioshock already uses about 50-60% of each core in my quad Q6600. It probably means it would run about equally well at 100% in a dual core, but still, it parallelizes remarkably well.
Game developers are way too addicted to adding cool stuff into their games for them to ignore a source of computing power for long. They will eventually find something to take advantage of it. Take for instance how Valve's Gabe Newell went from complaining about the enormous difficulty of programming for multiple cores to loving quad-cores:
http://www.next-gen.biz/index.php?option=com_content&task=view&id=510&Itemid=2
http://www.extremetech.com/article2/0,1697,2050558,00.asp
So, I can't be sure how much games will take advantage of quad-cores in the next year, but I don't think it's a bad moment to get one.
Valls on September 4, 2007 04:55 AM
"Their answer? H.264 blu-ray video playback while "doing something else". Lame. How do you watch a movie and do something else at the same time?"
Well.. not that I disagree with the article, but I do this all the time. I'll catch up on a TV show or watch a movie while I'm doing some photo editing. Lightroom & Photoshop on one monitor, player on the other.
Ben on September 4, 2007 05:10 AM
One thing about LAME encoding: I realize they were evidently using a multithreaded version of LAME, but even single-threaded LAME can see an increase, as you can run multiple instances of it. Not "real" multithreading, but I use foobar to convert things to MP3 all the time, and it spawns as many LAME instances as I have threads, and the speedup from a C2D to a C2Q was huge; not perfectly linear scaling, but a large increase.
Hotdog on September 4, 2007 05:15 AM
Somebody should compare dual vs quad core performance advantages using development tools such as Visual Studio 2005, SQL Server 2005 (Analysis, Reporting, etc., Services) and running multiple instances of Virtual Machines (VMware and/or Virtual PC). This is more meaningful to many of us who use computers for a living... yes we do play games and encode audio/video but we spend more time developing applications.
I myself am wondering whether there is any advantage in going quad core using these dev tools... or whether I am just wasting electricity (quad cores are rated 135 watts vs 65 watts for dual cores) generating heat by running one.
cyclo on September 4, 2007 05:23 AM
The newer game engines (e.g. Crysis) are already starting to utilize quad cores so I would add games to the list.
Josh on September 4, 2007 05:31 AM
Real world applications would be Nikon Capture NX on both XP and Mac and the Canon equivalent, IrfanView and Bibble transcoding and tweaking directories of 12 MP files, using something like MS Publisher to add and delete pages. On a Mac run Tiger and XP at the same time.
Another stress test would be to have all these apps open, plus Firefox with a dozen tabs, throw in QuickBooks, Windows Explorer, a couple of large spreadsheets, and burn a DVD.
I'm looking to replace a G4 Mac and a 3 year old XP box. Should I get a Mac mini, or go for an XServe or MacPro?
Help me, folks.
Mark on September 4, 2007 05:33 AM>> Compiling a C++ project in Visual Studio 2005 with a Quadcore requires to use Incredibuild, otherwise 3 CPUs sit idle for 80% of the time.
Visual Studio's C++ compiler doesn't scale out of the box for projects with a lot of dependencies, because building projects in parallel doesn't help there. But adding the undocumented /MP(X) compiler option under "advanced command-line options" gives a multi-CPU performance gain even at the project level:
/MP2 for dual-core, /MP4 for quad-core
It utilizes 100% of all available CPUs over here, so the statement that the CPUs sit idle is not correct, at least with VC++ 8.0.
As Mark already commented on 12:58 AM: the last 4 entries in the XBit comparison have their sign reversed -- Excel is actually 63% SLOWER on 4 cores.
Plus, while I understand the reasoning behind comparing 2.4GHz to 3.0GHz, I find it worth noting that in many cases in which the quad performs worse, it's still performing better than the clock-speed ratio would predict.
Hex Err on September 4, 2007 05:38 AM
Massive parallelism is coming. What will drive it is robotics, and the need for whatever the desktop evolves into to program and perhaps control it.
In ten years machines with hundreds, perhaps even thousands, of CPUs will be either on the drawing boards or in production, designed specifically for this application. The massively parallel revolution is coming, and it's coming in a big way.
Massively parallel machines are probably the only way to simulate cognitive functions.
Even at the level of an insect’s cognitive abilities, today’s supercomputers still fall far short in comparative processing power. In order to build robotic applications that are truly useful, they will have to be at least as smart as your typical insect, and massively parallel machines at the micro level are the only way to achieve this goal.
I run lots of apps at the same time, not being on the same core is very helpful to me and improves latency. Writing for a single core isn't that bad.
Deathbyfire on September 4, 2007 06:00 AMOne of the tests I'm personally interested in (and which is always missing) is using a sequencer (Cubase, Ableton, Logic, etc.) and running VST plugins. For instance, something like Native Instruments' Massive totally bogs down my old (XP2400+) CPU if it's set to high quality mode, and the number of instances playing simultaneously or doing neat tricks like convolution reverb would be a good workout. With the software studio, lots more people are making music.
Today I'll get a Quad system, so it's interesting to see how the workload is divided; simultaneous audio streams should be parallelized rather easily.
Rob Janssen on September 4, 2007 06:17 AMWhat about increasing the cache size (as mentioned without detail by others above), increasing the register set, or adding special-purpose instructions such as matrix manipulations?
On a related note, has anyone ever done a post-mortem on the RISC vs. CISC wars? Lessons learned?
Michelle on September 4, 2007 06:21 AMIt won't make that much of a difference on current systems, but that's because games specifically are normally not very multithreaded. This is changing.
Every engine that is designed around the PS3 or XBOX 360 will most likely feature a very multithreaded design that will benefit significantly from a quad-core PC.
So I wouldn't buy a quad core now, but I'd expect those utilization numbers to change over the next year or two.
Yrro on September 4, 2007 06:33 AMI can't really see a "negative" to getting a quad-core processor beyond price, and I think it's a poor argument. The difference in price between clock-speed and core number is negligible, as is the performance difference.
Having recently had to purchase an "emergency" replacement for the home computer, I had "no choice" but to spring for a quad-core (the Q6600 mentioned above) from Gateway because the gross price of the system was phenomenally lower than anything I could have assembled by myself ($1000 flat for a complete multimedia PC with decent RAM, video, and hard drive components!)
I'm counting on the fact that this PC will be in use in three years, at which time more mainstream software should leverage multi-core.
Rick Cabral on September 4, 2007 06:51 AMAs is the case for so many other things with me, it all comes down to actual performance. I love reading benchmarks and watching the wars that start between fanboys/girls over the accuracy of the results and the ensuing "My hardware is better than your hardware" mud-slinging.
The way I see it, your average Joe (at this point) doesn't need a quad-core system. We have a situation where the hardware in question is actually ahead of the software being developed to use that hardware. At this point, I can't think of a single scenario where an average computer user would need that much power.
Is this -really- a problem, though? If we, as developers, know that these types of processors are available, then why not write to take advantage of that power? Let's face it...single-core CPUs are on the way out, unless clock speeds start improving significantly with the next generations of CPUs.
Back to my starting comment, though. I've owned my share of machines in my relatively short lifetime, and I've done development on every one of them. As it stands now, I wouldn't dream of having anything less than a dual-core, and the next machine I plan on building will feature a quad-core. It's not that I need that power -right now-, but when the time comes that I will need it, it will already be there.
My old machine (which at this point, is essentially my test machine), has a 2.0 GHz AMD 2000+. I installed the VS Orcas Beta this weekend, and it runs like a pig through molasses. I know if I put it on my brother's dual-core 2.0 it would run better. And if I were to install it on a quad-core, then it would run even better than that, as well as the 40,000 other things I'm doing while hammering on my keyboard. Moral of the story? There -are- people who can take advantage of that technology who exist, so why not harness it?
Benchmarks are helpful, but not what I base my hardware purchases on. Now, gimmie one of them quad-cores. :)
James on September 4, 2007 07:15 AMMaybe we'll measure performance in cores and not Hz in a not so distant future. "I have a 2 MegaCore computer, what do you have?"
Adam on September 4, 2007 07:32 AMAll these benchmarks seem to focus on running a single application, which is hardly ever the case! I usually have at least 4 applications open (web browser, email, IM, download manager etc), not to mention all the OS processes that run in the background.
Surely these different applications can be run on different cores? While the performance of an individual application may not be improved by additional cores surely the performance of the whole system will be?
Ben on September 4, 2007 07:32 AMAs others pointed out, single applications may be slow in handling more than two processors at once (which is what a dual core chip really is). That will depend upon the software writers taking advantage of multiple cores and using threads. However, that doesn't mean there isn't improvement if multiple applications are running.
One of the biggest pains for a developer is building an application in parallel. I know that Xcode and gcc on Linux can take advantage of quad core systems, and it may simply take Windows a while to catch up. Visual Studio 2008 "Orcas" will be able to handle parallel builds, and probably handle quad cores.
As for games, as more and more people start using quad core systems, games will be rewritten to take advantage of them. Rendering engines certainly could be optimized to render more than a single object at once (think of the Cell processor). I am not too familiar with the gaming environment, but I expect that most games will start to take advantage of the new hardware within a year.
As one of the department heads once told me, "The Hardware Fairy only comes once every few years, so always overspec what you need because you may be stuck with that system for five years." You may be right now that today's Windows software may be unable to take advantage of quad core systems, but what about the next twelve months?
I suspect that Visual Studio 2008 "Orcas" will be able to once it finally comes out, and that most games will quickly put out newer revisions that will take advantage of the quad cores. There may even be a service pack for Vista that will take better advantage of quad cores in the next 12 months. I personally would opt for a quad core system based upon my experience with software and operating systems. I suspect that even if there is no improvement in speed now, there certainly will be within six to twelve months.
David on September 4, 2007 07:35 AMMark posted:
'Your bottom four comparison percentages seem to have the wrong "polarity".'
---
Actually, if you look at the source article, the bottom 4 comparisons are measuring run-time in seconds (lower is better), whereas some of the other comparisons are measuring speed (or some other quantity such as FPS, where higher is better).
So the relative percentages are correct, but the numbers reproduced in this blog post lack important contextual information - the actual quantities being measured, and how to interpret them (i.e. which is better - higher numbers or lower numbers).
You are way off target here.
The reason things like games don't use the quad core is that most of them were written before quad cores. Most of them don't easily scale to more cores so you'll have to wait for the games to catch up.
There also is the very real issue of being able to do more things at once.
Someone asked how you can watch a movie and do other things--quite easily. Nothing says that what else you are doing needs a human. There are plenty of things you could leave running while you're watching that movie.
Loren Pechtel on September 4, 2007 08:54 AMIn my previous comment, I should've clarified that the given relative percentages are correct as long as you interpret a positive value to mean "better performance", and a negative value to mean "worse performance". Again, the problem is that the raw numbers, with no units, are meaningless - you have to go back to the original article to see whether "higher" or "lower" means "better performance" for any given comparison.
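The sign confusion discussed above goes away once each measurement carries its direction. The sketch below shows one possible convention in which a positive percentage always means "better"; the function name and the sample numbers are hypothetical and are not taken from the XBit article.

```python
def relative_improvement(baseline, candidate, higher_is_better):
    """Return the percentage change, signed so that positive means better.

    For higher-is-better units (FPS, throughput) this is the relative gain.
    For lower-is-better units (run time in seconds) it is the relative
    reduction, so a shorter run time also comes out positive.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("measurements must be positive")
    if higher_is_better:
        return (candidate - baseline) * 100.0 / baseline
    return (baseline - candidate) * 100.0 / baseline


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    print(relative_improvement(50.0, 60.0, higher_is_better=True))    # frames per second
    print(relative_improvement(100.0, 80.0, higher_is_better=False))  # seconds
```

Both hypothetical cases report the same positive value, even though the raw number went up in one and down in the other, which is exactly the context a table of unitless percentages loses.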
Will on September 4, 2007 09:07 AM> I wonder, what about Stackless Python?..
Dunno, does the stackless modification remove the GIL? If it doesn't, then the tasklets still run in a single thread.
> Erlang fanboys think their software uses multiple cores, but in fact you need to have multiple Erlang interpreter processes running to do that.
Failed troll is failed, the Erlang runtime is natively multithreaded since the release of R11B-0 in May 2006. Since that time, the runtime automatically spawns a thread per core (default, you can ask for more or less) and dynamically maps your erlang processes on the OS threads.
You should update your knowledge before trying your trolls.
Masklinn on September 4, 2007 09:16 AM"Physics" is a singular noun. "The physics don't work" don't work with me.
Howard on September 4, 2007 09:41 AMAs other people have pointed out, the build system for VC++ 2008 will build multiple targets simultaneously. Here's an article on Valve using multiple cores.
http://arstechnica.com/articles/paedia/cpu/valve-multicore.ars
executive summary:
Even if you don't do anything special, you get some benefit, because your subsystems often run in multiple threads.
The graphics subsystem can really benefit from multiple cores. Other subsystems less so.
A side benefit is that the rest of the game gets more CPU, so you can have a more complex AI.
The practical max right now is 4 CPUs. More than that, and you're running into memory starvation issues.
> The last 4 entries in the XBit comparison have their sign reversed -- Excel is actually 63% SLOWER on 4 cores
The numbers are correct; it's a bit confusing because some units are larger->better, others are smaller->better.
Jeff Atwood on September 4, 2007 10:15 AMSo I guess you would want a quad core for development and not so much for gaming.
I guess that applies in scenarios where the extra cores are pinned to a virtual machine.
I'm sure the payoff for multiple cores shows diminishing returns in a normal desktop scenario.
brian on September 4, 2007 10:23 AMOk, now try running an Effect in Paint.NET (which is heavily optimized for "N" cores)!
http://www.getpaint.net/misc/pdn_4x_faster.png
Rick Brewster on September 4, 2007 10:23 AMJeff wrote:"The numbers are correct; it's a bit confusing because some units are larger->better, others are smaller->better."
Maybe you should add the units to the numbers in your post, otherwise people have no way of knowing what those numbers mean without looking at the original article.
Will on September 4, 2007 10:29 AMEven without intelligently-threaded applications, I think most users can make good use of dual-core desktops simply due to multitasking. Quad-cores definitely take more work to utilize, but certain classes of users could definitely use this... video editors, web developers, and other types of programmers and creative professionals often have several distinct intense processes at once.
Gabe da Silveira on September 4, 2007 10:35 AMIt's only a matter of time until all applications are written to take advantage of multiple cores, in which case we'll probably see a leap in software engineering. Maybe the Windows of the future won't take an eternity to boot anymore?
I should think you could occupy even more than 4 cores in game programming.
It's going to take some serious talent to harness the multitude of threads that will be available in the future and then apply them to multiple GPUs. Of course most people won't have SLI or Quad-SLI setups, but I think someone somewhere should write a "Super Game" (read up on your Nietzsche, kids) to demonstrate the sheer power available without worrying about scalability across single cores. I bet John Carmack would do it; he already has enough Ferraris, so he doesn't have to worry about sales volume.
Mattkins on September 4, 2007 10:41 AMAMD issues-
AMD vs Intel - IIRC AMD "Barcelona" is "true" quad core, current intel "quad" cores are two two core units in one package, typical intel/microsoft-style marketing-over-technical trickery.
Also, AMD quad core opterons will support nested page tables, making virtualization perform significantly better. Thus, if you want to play with virtualization, getting a quad core AMD might be your best short term option.
Zonky Zizzymouse on September 4, 2007 10:47 AMAnybody could test 7zip? Would it benefit from more cores?
MaS on September 4, 2007 11:36 AMThe implications of quad and more core processors are limitless.
Someone above said how they couldn't imagine games taking advantage of multi-cores. How about 'bad guys' running in one process (ie: on one core) and you/good guys running in another? Talk about awesome AI. They could respond and learn in real time. How about running a game 'server' locally while multiple people are connected to you?
As far as these benchmarks go, they're silly. Someone already said it, but...
Yesterday's software on tomorrow's technology. It just doesn't matter much. I guarantee you there are aspects of almost every piece of software that could be improved by spreading the workload across multiple cores.
It is physically impossible for single-core processors to execute more than one instruction at a time. With the theoretical ceiling of clock speed fast approaching this means we may be at our limit of speed... but WAIT! we can now process more than one instruction at a time due to multiple cores (CPU's).
How about multi-tasking? How did the quad core hold up against the dual core when doing 4 things at once? How about the quad core vs the dual core with hyperthreading?
Burning a Disk, Encoding a DVD, Playing FarCry and streaming music via Rhapsody?
Those are the sort of comparisons that show the true potential of the multi-core processors. We will eventually hit a core-ceiling where it just doesn't make practical sense to go further but just like the clock speed was 15 years ago that's a long way off.
I say bring on the Cores!
Randy Aldrich on September 4, 2007 01:37 PMJeff,
One can try OpenMP from http://www.openmp.org to parallelize C/C++ applications. I've never tried it, but it would be a cool blog to test it out.
Kashif
Kashif Shaikh on September 4, 2007 02:54 PMI just like the idea that I wouldn't have to worry about my processor getting bogged down by background tasks while I'm gaming. On the other hand, I can't think of a lot of things I'd want to do in the background that wouldn't be using much more precious game resources, like networking and RAM...
My experience so far with the Dual Cores is that they're impressive when compared with a single core, but not so impressive that I'd avoid going with a quad core on a gaming rig, even if just playing a game doesn't stress all 4 cores. I'd rather have the extra overhead available and stop spending so much time optimizing my gaming systems, especially when it looks like many of the actual game benchmarks are making good utilization of 2 of the available cores (something you wouldn't have been able to say of a dual CPU system 5 years ago).
By the time I can afford the quad-core systems, someone will have figured out how to get some use out of it in the big 3 game engines (and many of the others).
Vizeroth on September 4, 2007 03:05 PMIf I were you I would repeat the DivX 6.6.1.4 test to make sure multi-threading is enabled (not by auto but by setting the number of threads manually in encoder properties). I am not sure what you used as a host application but I suggest VirtualDub.
Lame test is also suspicious, I would repeat that too.
Quake 4 should (obviously) be tested with MT patch.
Photoshop is generally a bandwidth bound application. You need to make sure that you perform operations which are not bandwidth but compute bound to see the effects of more cores.
Add some audio processing application (Sonar or Cubase + many VST software synths and effects come to mind). Sound Forge 9.0 too, then flac or MonkeysAudio lossless compression.
There are also multithreaded applications designed to run certain aspects of a program on a certain core; that should significantly increase performance, IMO.
Danny on September 4, 2007 05:28 PM"Thanks for that! What about servers? web applications, database... Will quad cores systems add benefit there?"
Pardon if someone's already covered this, but applications that can handle more simultaneous threads of execution will benefit, otherwise not. A database I use with my day job, Progress, can start up multiple server processes. The last several places I've worked have had quad-CPU machines, and the database will cheerfully spawn multiple servers to serve user requests, and typically use 3 or 4 of those CPUs. A multithreaded web server would probably see benefits, for the same reason.
Rick C on September 4, 2007 06:30 PMCore Unaffinity:
Some multicore processors have a shared L2 cache, while some keep them separate for each core. Skizz's idea of a "system based on small chunks of work which are given to whichever processor is free" (Skizz on September 4, 2007 01:24 AM) is fine when there is a shared cache. Otherwise, bouncing a thread's stack and active data among the cores' caches will waste a lot of cycles.
Running multiple applications:
Who runs two processes?
Using either Windows Task Manager (Ctrl-Alt-Del) or ps -ef on Linux, see how many processes are running, many of which have multiple threads.
Typically, Windows desktop users are probably running 50-100 processes and 500+ threads.
Affinity:
Unless you force core affinity, I find that Windows does not do a good job keeping a CPU intense process on one core (even without any Windows calls in the intense loop)
Power Savings:
In the future, when the HW and OS work together to shut down cores that are not needed that microsecond, multicore will be very helpful in lowering the average power consumption of the processor.
Responsiveness:
I have had a Pentium D machine for two years, and have been very happy with it. It stays very responsive even when doing interactive work while there is a CPU intensive application running, e.g. MP3 encoding, audio filtering, etc.
See more about the multicore topic at http://www.2cpu.com/
David
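Forcing core affinity, as mentioned under "Affinity" above, can also be done programmatically. The sketch below is a minimal illustration using Python's `os.sched_setaffinity`, which is only available on some platforms (Linux in particular); on Windows the equivalent is set interactively through Task Manager. The helper name `pin_current_process` is hypothetical.

```python
import os


def pin_current_process(cpus):
    """Restrict the current process to the given set of CPU indices.

    Returns the resulting affinity set, or None on platforms where
    os.sched_setaffinity is not available (it is not part of every OS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)  # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        allowed = os.sched_getaffinity(0)      # remember the original mask
        print(pin_current_process({min(allowed)}))
        os.sched_setaffinity(0, allowed)       # restore it afterwards
    else:
        print("CPU affinity is not exposed by os on this platform")
```

Pinning keeps a CPU-intensive loop on one core and its cache warm, at the cost of preventing the scheduler from moving it when that core is busy with something else.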
David on September 4, 2007 07:31 PMYou didn't review any web browsers! The majority of the time that I'm on my computer, I'm in Firefox.
I often open 5, 15, 30 or more tabs at once, after which Firefox becomes extremely unresponsive for several seconds. By far the most slowdown and unresponsiveness I ever see is in Firefox, from opening lots of tabs at once.
If more cores can help this situation, it's a definite plus for me!
James Justin Harrell on September 4, 2007 09:21 PMFor something really controversial, why not do the same test with browsers? :)
Monkey on September 4, 2007 11:56 PMIf you can't terminate an application that eats all CPU (rather: will take all it gets), then that is a bug in the OS, not a lack of cores.
Andreas Krey on September 5, 2007 12:32 AM"All these benchmarks seem to focus on running a single application, which is hardly ever the case! I usually have at least 4 applications open (web browser, email, IM, download manager etc), not to mention all the OS processes that run in the background."
You would never be able to tell whether you were on a 2-core or 80-core box, trust me. In fact, the difference between a fast single core and a slow dual core is pretty low (but visible) in this kind of light usage scenario. None of the apps are ever using cpu at the same time, and barely ever use it during the course of their run; if they ever do contend, it's always for memory or disk instead.
"If more cores can help this situation, it's a definite plus for me!"
It doesn't. This is supposed to be addressed in 3.0, if it doesn't end up on the chopping block, but in 2.0 most of the tab & session bookkeeping is serialized through a single thread.
"Photoshop is generally a bandwidth bound application. You need to make sure that you perform operations which are not bandwidth but compute bound to see the effects of more cores."
That's creating a synthetic benchmark; fact is Photoshop is just plain I/O bound nearly all the time and fiddling with the benchmark to remove that is disingenuous.
Note that this is also true for encoding: You can scale amazingly to 4-8 cpus, but then you hit a brick wall somewhere in there because the I/O just plain can't keep up, the disks, bus, or network are just flooded. In the specific case of video, anything that uses avisynth as a backend will be hobbled by its single-threaded execution unless it moves to experimental new versions, this especially includes the popular gordian knot/Auto GK apps, as well as newer Staxrip/Megui. Mencoder GUIs are more multi-threaded but still not optimal.
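The "brick wall" described above is often modelled with Amdahl's law, which the comment does not name and which is brought in here as an assumption: if only a fraction of the work can be spread across cores, the remainder (for example, serialized I/O) caps the achievable speedup. A rough sketch, with an illustrative 90% parallel fraction that is not measured from any real encoder:

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup on `cores` cores when only `parallel_fraction` of the work scales.

    This is Amdahl's law: speedup = 1 / (serial + parallel / cores).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative assumption: 90% of the job parallelizes, 10% does not.
    for cores in (1, 2, 4, 8, 16):
        print(cores, round(amdahl_speedup(0.9, cores), 2))
```

Under that assumed fraction, each doubling of cores buys less than the one before, and no number of cores can exceed a 10x speedup.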
These benchmarks are good for at least figuring out which one you should buy, if you're torn between the two. Massively multicore is going to pave the way for event-driven separable software models and slowly change everything, I expect, but we're still years away from seeing that happen. The language experiments going on now are pretty heartening, I hope to see more soon. :D
Foxyshadis on September 5, 2007 12:53 AMWe've seen dual versus quads. But how about a dual processor rig with the dual core processors on it versus a single quad-core platform?
I'd guess it will be a bit slower due to the fact that the L1 cache is not shared between all the cores. But also I'd guess it will be faster when more applications start since the 2 cpus have more I/O bandwidth /controllers/magic cpu stuff.
re: Foxyshadis: "In fact, the difference between a fast single core and a slow dual core is pretty low (but visible) in this kind of light usage scenario."
There's a difference between queuing theory and reality here. Simple queuing theory assumes that a processor services a request until it is done, without interruption. It assumes that queue management and assignment of work to a processor is not done by one of the processors. Reality includes I/O and timeslice completion interrupts and thread dispatching managed by a kernel dispatcher that also runs on a processor.
Given this reality, and the fact that each core in a dual core processor is NOT half as fast as an equal price uniprocessor, dual-core will make most users happier.
In scientific applications on RISC processors, it is common to run the compute on one core and the I/O on the other.
See the Queue.xls at http://forum.johnson.cornell.edu/faculty/mcclain/Software/Software.htm
DAKra
DAKra on September 5, 2007 06:21 AMI think the main point for quad-core is that you can run _many_more_ processes without weighing down the system. Were you running any of the above benchmarks in combo fashion?
If all users ever do is just work on one mean intensive program, then yes, a fast dual-core would be better. But I, as a developer, would be running a couple of virtual machines, zipping stuff, watching video, listening to music, compiling code. With bags of RAM in an x64 system, I need not close many programs. And certainly yes, it helps to have multiple disks to distribute the IO load.
Having said that, I bought both a dual-core desktop and a dual-core laptop three months back so my next machine probably will be oct-core.
Aaron Seet on September 5, 2007 06:30 AM[quote Jeff]
Their answer? H.264 blu-ray video playback while "doing something else". Lame. How do you watch a movie and do something else at the same time?
[/quote]
I do that all the time..... :-) I love being able to watch video on one monitor while reading email or web sites on the other.
Aaron Seet on September 5, 2007 06:38 AM>Instead of speeding up a single-MP3 encoding, why not have the
>application process 4 different files at once.
Yea, that also works great for mass PNG crunching (via pngout or pngcrunch and the like). With a single core this can take hours or even days. But then again it isn't much of an issue since speed isn't really critical.
With some book-keeping you can let it run every now and then with idle priority. Well, that's what I'm doing on my single core machine.
Jos Hirth on September 5, 2007 07:21 AMBasically, you've ascertained that not many applications currently take much advantage of more than two cpu's. This is a surprise, how?
Quad core CPU's have maybe been on the market a year, if that, and cost a good bit over $1K for a good portion of that time. The market saturation for quad core PC's is probably still a fraction of 1%.
Just because you write a multithreaded program, that doesn't mean it will simply, automatically, and efficiently use all available cores. In many applications, writing code to efficiently use 1-4 cores is more effort than optimizing for just 1-2 cores. And given the low market presence yet for quad-cores, it probably hasn't been a worthwhile cost.
As the quads get cheaper and more commonplace (and eventually 8+ core cpu's appear) more and more software will take advantage of the extra CPU power.
This is no different than, say, putting 2+GB of RAM in a PC from 4 years ago: it might help a few apps, but it wouldn't make much difference for most software since it wasn't written to need/use it.
I see usefulness in large numbers of cores when a single processor is handling many many tasks at once. But this is generally rare (in the name of making a more failsafe system, it's better to give individual devices their own processors than to have a big mainframe for each company/home/etc)
But I need to ask, why aren't they specializing these cores to specific tasks? Adding a specialized GPU can yield huge results; why not try to do the same for CPUs?
Jim Robert on September 5, 2007 09:20 AMI see the benefit when I'm actually doing a couple things at the same time, such as compiling something with -j3 on 3 cores and surfing the net or using an editor on the other core.
Chris G on September 5, 2007 03:43 PMWhy is scalability to N cores something that should concern the application developer at all? Multithreading goes hand in hand with multitasking, but it has to be supported at the OS level. The most responsive GUI I've ever used---making the Windows and pre-OS X Mac experiences seem prehistoric---was on a dual 66-MHz PowerPC with 16 MB of RAM, running BeOS. Not coincidentally, that was a (so-called) "pervasively multithreaded" OS, where applications, including the GUI, were encouraged to spawn as many threads as necessary, which were then distributed across CPUs as needed. If you ran out of memory or cycles, performance degradation was gradual---no more of the freezes and skipping mouse pointers that still plague an overloaded Windows box. My point is that you couldn't help but write an app on BeOS that made efficient use of however many CPUs there were. For a programmer to have to think in terms of "how can I make use of N cores" is premature optimization. (Of course, a lot of people would first have to rid themselves of the mentality that their code is the only thing the user's ever going to run, so playing nice with the OS, drivers, or other apps is irrelevant.)
Alex Chamberlain on September 5, 2007 03:58 PMVery good tests, even if the results are unexpected, you've got to start somewhere.
For a more practical use of the quad core cpu's, how about building your own supercomputer ;)
http://www.clustermonkey.net//content/view/211/1/
Wesley W on September 5, 2007 11:28 PMGeoff: "Please don't use red and green text that are otherwise identical (same saturation, value, font, and so on), to differentiate positive and negative results. Yes, there is a minus sign in front of the negative results, but this is slow for the brain to latch on to, especially since the font is rather thin.
There are many other ways to visually separate good and bad results, almost all of which are better than just red and green and no other differentiation. I've seen some beautiful and effective choices, though many tend to bias the reader (bolding the bad results, for instance). Personally I find just replacing the green with blue to be quite effective."
Personally, I had no problem comprehending the difference between the values. Traditionally, red ink is used to indicate negative, especially in the financial world (you've surely heard the phrase "in the red", haven't you?).
The problem I have is with people telling others how they should design *their own* websites. If you don't like the design, don't read the post. It doesn't matter how positives/negatives are differentiated; if you are unable to figure out the difference, you probably won't comprehend the post well enough to get any value from it anyway.
KenW on September 6, 2007 08:09 AMKenW, did it occur to you that Geoff might be red-green colorblind?
Alex Chamberlain on September 6, 2007 09:15 AMI work at a big game development studio, and you can trust me when I say one of the most important things we do for speed is to move things off the main thread. On console, our code already has to be highly parallelized to make use of what's available. Of course the same thing goes for PC game development, as quad cores (and soon, more) are becoming the standard for high-end gaming PC's.
We use IncrediBuild for parallelized distributed builds. It speeds up the development process a ton.
Johan on September 8, 2007 12:51 AM1. it's been said that supercomputers (multi-multi-multi-multi-core systems) are tools for transforming cpu-bound problems into io-bound problems.
2. there's this part of maths called Mass Service Systems (better known in English as queueing theory) that basically proves that having 1 processor of speed nX is always better than having n processors of speed X, for any n > 1. therefore, if there's a way to increase single core speed, manufacturers should always take it... the problem is, it's easier to just glue several cores together.
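To make point 2 concrete, here is a minimal sketch under textbook queueing assumptions (Poisson arrivals, exponential service times, first come first served). It compares the mean response time of one server running at n times the speed (an M/M/1 queue) with n servers at base speed (an M/M/c queue, via the Erlang C formula). The function names and the sample numbers are made up for illustration, and the result only holds inside this model; it says nothing about cost, heat, or workloads that don't fit it.

```python
from math import factorial


def mm1_response_time(arrival_rate, service_rate):
    """Mean time in system for a single-server M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def mmc_response_time(arrival_rate, service_rate, servers):
    """Mean time in system for an M/M/c queue, using the Erlang C formula."""
    offered = arrival_rate / service_rate      # offered load a = lambda / mu
    rho = offered / servers                    # utilisation per server
    if rho >= 1:
        raise ValueError("unstable queue: total capacity must exceed arrival rate")
    tail = offered ** servers / (factorial(servers) * (1 - rho))
    head = sum(offered ** k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)              # probability an arriving job must queue
    return p_wait / (servers * service_rate - arrival_rate) + 1.0 / service_rate


def compare(arrival_rate, core_speed, cores):
    """Return (one fast core, many slow cores) mean response times."""
    one_fast = mm1_response_time(arrival_rate, cores * core_speed)
    many_slow = mmc_response_time(arrival_rate, core_speed, cores)
    return one_fast, many_slow


if __name__ == "__main__":
    fast, slow = compare(arrival_rate=2.0, core_speed=1.0, cores=4)
    print(f"one core at 4x speed: {fast:.3f}   four cores at 1x speed: {slow:.3f}")
```

With the same total capacity, the single fast server finishes each job sooner because every job gets the full speed, while the four slow servers can never serve one job faster than a single slow core can.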
Do any rendering apps such as 3D Studio MAX or V-Ray utilise 2 physical dual-core processors? I'm thinking of buying a new machine with the Intel dual-core 5160, but wondering whether it is worth getting the 2nd CPU.
Thanks in advance for any advice!
Gavin on September 11, 2007 11:56 AMHow about testing audio apps such as Pro Tools, Cubase SX 3 or Cubase 4? Or Cubase with Reason 3.0 over ReWire. Those are some great programs to test with because they use so much CPU power: not just the application itself, but running several instances of plug-ins (compressors, synthesizers, distortions, etc.) across multiple tracks of audio and MIDI. That would be a great test. Let me know if you ever do a test like that. I want to know what the best is.
"Where as with games, I can't think of anything that could utilize the spare CPU cores..
I wonder if it's even remotely possible, but: To use extra cores as "software-graphics-cards". Since graphics are the only thing that really needs lots more processing power in games, it'd make sense to say divide the screen up between them, and use the remaining two cores to process extra effects on their area on screen. Biggest problem being the CPU's aren't as fast as drawing stuff to the screen as graphics cards are..."
There is a related example here. Systems that run the PhysX engine without a PhysX PCI card installed (software rendering) take a huge hit in performance in PhysX-supported games. One such upcoming title is Fury, though it is still in beta, mind you. When running this game without a PhysX PCI card installed, most rendering, if not all, is done on your CPU. One way to test this theory on a dual-core (or possibly quad-core+) is to press Ctrl-Alt-Del and set the game process's affinity to only one CPU. You will notice that your framerate drops to roughly half of what you get with both processors. Hands down, PhysX-based applications put a huge load on the CPU due to software rendering. If I could get my hands on a quad-core system, it'd be great to test this theory further. However, take into consideration that I'm 'assuming' Fury is coded in a way that makes this relevant.
Josh Vining on September 14, 2007 01:50 AMThank you for your nice article.
Abel on September 28, 2007 01:19 AMAppreciate the informative content. However, the second comparison, which compares a 3GHz dual to a 2.4GHz quad completely invalidates the results. 2.4GHz is NOT the fastest quad available and the resulting data becomes a comparison of clock speed rather than cores (I suspect a 3GHz dual core processor is faster than a 2.4GHz dual core processor as well).
Lupine on September 28, 2007 07:45 AM>However, the second comparison, which compares a 3GHz dual to a 2.4GHz quad completely invalidates the results.
He used the two on purpose to raise a point about price vs. performance. FTA:
--
In the short term, we have to choose between faster dual-core systems, or slower quad-core systems. Today, a quad-core 2.4 GHz CPU costs about the same as a dual-core 3.0 GHz CPU.
--
Quad cores also shine in areas other than running a single app.
I have 18 icon tray programs and 26 apps open right now! My machine gets this way when I'm developing applications and my web site. I can use any of these apps concurrently and they all seem to be running at full speed when I switch (alt-tab).
Another example: I can run all 4 of my anti-virus, trojan, phisher, crap programs and they all run at full speed with processor time to spare!
BigDog on October 13, 2007 07:31 AMIf you want to use all cores of your dual-core / quad-core / multi-core processor with Microsoft Visual C++ 2005 at project level, you can also install this free plug-in: MPCL ( http://www.todobits.es/mpcl.html )
Advisor on October 24, 2007 03:12 PMFKIN NOOBS
BUY a Thermalright Ultra 120 for like $60 and u can overclock the Q6600, whose stock core speed is 2.4GHz; with that cooler u can easily overclock it to 3.2GHz+. FFS im sick of ppl saying that the Q6600 is shit, its not. Play CS:S? Counterstrike makes use of multicores (hence the Q6600), therefore the game will run A LOT quicker, because Counterstrike is a lot more dependent on the CPU to process data than other games. Crysis is another example of a game that makes use of multi-cores, OH and also ALL FUTURE DX10 GAMES, FFS...
NOOBS on November 12, 2007 04:46 AMi run a quad core q6600 at 3.35ghz and believe you me, you will see a difference over a 3.5ghz dual core in a lot of things, though not all.
SiR ReaL on November 20, 2007 10:19 PMCan I point out that computers run at the speed of the slowest component currently in use. This is never the CPU or GPU; it's normally the HDD. So it's really not that important to have either a dual/quad core processor unless you have the other components to support it.
graeme on January 10, 2008 08:31 AMit's kind of disingenuous to test single applications against multiple cores as a benchmark for performance to address the benefits of clock speed over cores. On a real-world system users will be running many different applications at the same time. Even the most steadfast single threaded application could still see performance gains due to having more available CPU time stemming from reduced contention with other applications running on the machine at the same time.
As time progresses more and more applications will take advantage of multiple cores and we will see more benefits, so there's also a 'future-proofing' aspect to take into account here as well.
[quote]
Can I point out that computers run at the speed of the slowest component currently in use. This is never the CPU or GPU; it's normally the HDD. So it's really not that important to have either a dual/quad core processor unless you have the other components to support it.[/quote]
This is untrue. Unless the program is heavily hitting the HDD during its entire operation, this is not an issue. Most programs hit the HDD to load themselves into memory and then happily chug along without needing the HDD at all. In such a case it is memory, bus and CPU (and/or GPU, depending on what kind of program) speed/bandwidth which bounds their performance.
chris on January 14, 2008 08:41 AMHas anyone seen benchmarks with a dual quad-core configuration?
I know that, theoretically speaking, W2K and Vista should be able to fill up all eight cores. Or is it mandatory to buy a server version?
I'm using a quad core processor (Q6600), and I did a light oc on it (from 2.4ghz to 3.ghz). For me, quad core is very good at handling multiple applications at the same time. It works well in my case. I can leave a lot of applications running in the background while I'm doing something else such as playing games, coding in Visual Studio, listening to music or watching a video, and it feels good when everything runs so fast and smoothly.
In my opinion, quad core is a lot better than dual core. I have used a dual core processor before, but I wasn't satisfied with the performance. Now with a quad core processor, I get the best performance for my work, which is what I have dreamed of for a long time.
I don't think so. If you wanna compare Dual Core and Quad Core, you cannot use a single-processor multithreading application. You must use a multiprocessor application such as SQL Server, 3dsMax, Adobe After FX and others. Of course, if you need linear speed, then increase the clock frequency. A Quad Core 2.4GHz is the same as a 2.4GHz dual core processor in that respect. But for mathematical processing, such as non-OpenGL direct rendering, the Quad Core wins, and I certainly find it very, very fast.
My benchmark is rendering a huge 3dsmax file with a complex single frame, which takes around 15 minutes on a Dual Core 3 GHz or approx. 10 minutes on a Core 2 Duo 2.4GHz. Then I replaced it with a Quad Core 2.4 GHz 6600, and it takes less than 3 minutes. For me this is a truly fast processor.
Hope this helps anyone, and we hope all applications such as Photoshop use multiple processors for filter processing and other floating point calculations. Trust me, it will be fast.
Rizzo on February 11, 2008 03:02 AMDo any of you guys know where I can get a gaming computer that can run Crysis for $2000 or less?
matt on February 13, 2008 02:29 PMmatt, just buy any modern Dell, then buy and install a new NVIDIA 8800 GT video card. Problem solved.. seriously!
Jeff Atwood on February 13, 2008 02:38 PMFor those of you wanting to know the difference between dual and quad core on a webserver check out this site: http://www.intel.com/performance/server/xeon/web.htm
You can also see the difference among many other corporate uses
Ed on February 15, 2008 11:09 AMNo offense Ed, but the Intel study you provided is loaded....they are comparing a 3.0 GHz Dual Core with a 3.0 GHz Quad core....I think we can all see how that one is going to play out before even looking at the results. What would be more interesting to see is how a single quad core CPU performs against 2 Single Core CPUs or 2 Dual Core CPUs serving web pages with a variety of server software (Apache, IIS, WebLogic, etc) and OS.
Ed G on March 7, 2008 11:11 AMTwo words: make -j
It took me a while to sort out why this blog post bothered me. It's the hidden elitism of a programming god giving advice to the peasants. The advice is good advice for someone's mom, but not for a programmer. Awfully sweet of you to offer it.
The Q6600 easily overclocks to 3.5 Ghz on air, before voltages and temps get out of hand. And why can I buy commodity motherboards with this option? Just as porn drove the videotape industry, gaming drives high-performance DIY computing. The parallelism is already there in multiple video card setups, and better support for four and eight cores is only a matter of time.
Elitism is relative. You know the meme about our universe being a molecule in some alien's coffee table? This comes to mind, reading a blog by a career Windows programmer. It's entirely possible that the .NET options don't work as well as "make -j", or the easy parallelism offered by Erlang or Haskell.
Syzygies on March 15, 2008 07:45 AMwaiting for the mighty Q9450 2.66Ghz !
wow i feel much better about my Intel Core 2 Extreme X6800 ( oc-ed it to 3.2)
thx a lot for the comparison
max on April 15, 2008 10:08 AMI agree with Syzygies, this post bothered me a bit too for its hidden "elitism of a programming god giving advice to the peasants" as you say it. I don't want to be told "what's best" from an article that I have to trust it's written by someone better and more clever than me in every way, but I want data and tests of the correct problem and maybe a short conclusion with some opinions on these data, that's it.
1 example of this is the reply from the author: "Within reason, yes, but dual-core gets you 99% of the benefit of (n) core. If you're not careful, this becomes the wishful thinking scenario I just described. No matter how much of an ultra-elite-ninja single user you are, I guarantee..."
Where did you get the 99% data?
How can you "guarantee" this applies to my normal desktop use?
This is the session with my desktop in this moment (after a few hours of work):
2 Visual Studio sessions (1 is compiling a very large project at this moment and the other is in debug mode, so the dual core is maxed out), a SQL Server session that I use for local debugging, SSMS, 1 Internet Explorer with 5 tabs open, 3 Firefox windows with 10 tabs (average) open in each, Thunderbird retrieving emails from 20 email accounts every 2-10 minutes depending on importance, 2 Remote Desktop sessions, Last.fm retrieving and playing music, Hamachi VPN, 2 chats, Worktime, Photoshop CS2, 3 Notepad windows with text, Firewall, Antivirus (of course all of this on 2 screens). (And I do believe that this amount of apps is very common with MANY real-life programmers)
My main issue is that I have to wait for VS compilations a few minutes and debugging is not smooth (I have to wait for compilation of ASP.NET page at every change to debug it), I do agree that a SSD would be the most important piece to improve my performance(and I will buy an Mtron ssd this week), but I can "guarantee" you that I max out my dual core Athlon64X2 5600+ very often and this should mean that I would benefit a lot from a Quad core for my use... right?
EVEN if I don't touch my computer and I am not compiling but I just have my usual apps open, around 40% of the 2 cores are used.
Do you agree that with my average use I could make some use of a Quad core?
It would be interesting to see a test from a real life "ultra-elite-ninja" programmer to see what the REAL LIFE benefits are... In codinghorror you have a test of "compiling + multitasking" with dual core (http://www.codinghorror.com/blog/archives/000285.html). This could be extended with a comparison to quad core in real life use, specifying what apps were used at the same time. By varying the number of apps we would see where the turning point is reached, and so when quad core becomes more performant.
THAT would be very interesting..
In the mean time I still stand by my opinion that for heavy desktop users a quad core is VERY beneficial! And so, between a 3.0 GHz dual core and a 2.4 GHz quad core, the quad will perform much better in these cases. This is my 2 cents.
As others have said, imaging is an area where cores help, and our WIC codecs were designed from the beginning with multiple cores in mind.
But there are other areas where multiple cores can be utilized: PNGOUTWin uses algorithms that can't be made parallel, but the app can use every core when processing a batch. It wasn't that hard to do, and the code isn't all that complicated. Everyone should be designing their code this way if they can. All it takes is converting your single background worker thread--which is an idea that has been around for a long time and probably already exists in a lot of apps--into a thread pool and then add some scatter/gather logic. Not all tasks can be subdivided, but I'm sure there are a lot that could take advantage of today's hardware.
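The "worker thread into thread pool plus scatter/gather" idea above can be sketched in a few lines. This is only an illustration under stated assumptions, not how PNGOUTWin actually does it: `process_item` is a hypothetical stand-in for the per-file work (the inner algorithm stays serial), and `process_batch` is a made-up name.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def process_item(item):
    """Hypothetical stand-in for one unit of batch work, e.g. optimising one file.

    The work on a single item is not parallel; only the batch is spread out.
    """
    return sum(ord(ch) for ch in item)


def process_batch(items, workers=None):
    """Scatter the items across a pool of workers, then gather the results."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Scatter: every item becomes an independent task in the pool.
        # Gather: map() hands the results back in the original input order.
        return list(pool.map(process_item, items))


if __name__ == "__main__":
    print(process_batch(["first.png", "second.png", "third.png"]))
```

One caveat for this particular language: in CPython, threads share a global interpreter lock, so for CPU-bound pure-Python work you would swap in `ProcessPoolExecutor` from the same module to actually occupy every core. The scatter/gather shape stays the same.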
David Blake on April 24, 2008 08:55 AM> The Q6600 easily overclocks to 3.5 Ghz on air
That's interesting, since I've worked on two that barely made it to 3.0 GHz.
> In the mean time I still stand by my opinion that for heavy desktop users a quad core is VERY beneficial!
And yet there is no benchmark data, outside of a few highly specialized areas, to support this argument. You believing this does not make it true. I know it feels good to see 4 cores in Task Manager (or the equivalent on your OS), but don't let that warm glow blind you to cold, hard data.
Jeff Atwood on April 24, 2008 04:18 PMJeff Atwood - At the same time there is no article or study showing that a developer using many applications at the same time is better off with 2 cores. All the cases and all the tests that you show are NOT considering how a real developer would use his computer. So your cold, hard data doesn't really answer the question either, and you are in the same situation as me (no hard data). Why don't you run a serious test on this specifically for programmers? It would make a nice article as nobody has done it. I could help you out sorting out the case scenario for an ASP.NET programmer.
I see from this post: http://www.codinghorror.com/blog/archives/001103.html
That this April you finally changed your opinion and you seem to agree with me. Programmers should get the most cores they can get.. as you said.
I am not a blog writer but a programmer, and I can just tell you that with the way I program when I use a quad I can really see a great difference. That's a real life scenario and first-hand experience that none of the tests I have seen up to now can beat or disprove.
If you want to do some serious test let me know if I can help.
P.S. Also consider how handy virtual machines are for programmers and how nicely they could run on 4+ cores! Good programmers should adapt to use their machine to the full, and if you give them more cores they will find some great use for them! I would buy an 8, 16 or 32 core machine now if they were available.
Mark on April 25, 2008 12:57 PMOk, it seems that in some (or most)cases, GHz matter more than cores. But what if you overclock the Q6600 to 3.5 GHz or 3.8 GHz? Won't that make it better in all categories since it will now have more GHz and cores?
Andrey on May 11, 2008 10:55 PMIt's been said that humans aren't smart enough to write threaded software.
I think there's a lot of truth in that. I took a class in college where we had to implement semaphores in C++, and I resolved never to do that again if I could help it. However, you could also make the argument that humans aren't smart enough to juggle memory registers, either, and I'd tend to agree, for similar reasons.
A programming language should abstract the capabilities of the hardware into a mentally "ergonomic" framework, so that humans with human brains can get the computer to do interesting things.
There are a few languages that are very adept at handling multiple CPUs due to their inherent design. They weren't designed for multiple CPUs explicitly, but rather multiple servers and nodes in a network, which turns out to be very similar. Erlang is the classic example. By abstracting everything into processes that communicate by sending messages, your code doesn't have to care whether the process it's talking to is running on a different core on the same machine, or on another machine on the network. Because this is built into the language itself, it's much simpler to write code that keeps getting faster as you throw more cores at it. (In fact, it's practically impossible for an Erlang program *not* to balance across all the cores it has available---AFAIK, there's no way for the programmer to specify it, though of course the OS can distribute CPU time as it sees fit.)
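For readers who haven't seen the style, here is a rough sketch of the "processes that only talk via messages" model, written in Python rather than Erlang. It is an analogy under loud assumptions: the workers are threads with queue mailboxes, the names `squarer` and `run_workers` are invented, and unlike Erlang nothing here is distributed across cores or machines automatically. It only shows the shape of the code: no shared state, just an inbox and an outbox.

```python
import queue
import threading


def squarer(inbox, outbox):
    """A tiny worker that owns no shared state and only reacts to messages."""
    while True:
        message = inbox.get()
        if message is None:          # sentinel message: time to shut down
            break
        outbox.put(message * message)


def run_workers(numbers, worker_count=4):
    """Send each number as a message to a group of workers and collect replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    workers = [
        threading.Thread(target=squarer, args=(inbox, outbox))
        for _ in range(worker_count)
    ]
    for worker in workers:
        worker.start()
    for number in numbers:
        inbox.put(number)
    for _ in workers:                # one shutdown sentinel per worker
        inbox.put(None)
    for worker in workers:
        worker.join()
    results = []
    while not outbox.empty():        # safe here: all workers have finished
        results.append(outbox.get())
    return sorted(results)           # arrival order is not deterministic


if __name__ == "__main__":
    print(run_workers([1, 2, 3, 4, 5]))
```

The sender never needs to know which worker picks up a message, which is the property that lets a message-passing runtime spread the work over however many cores exist.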
Of course, programming in Erlang can be much less "straightforward" in some ways than programming in C or Basic. Objects and methods are a bit easier to grok than state machines and messages. But, the same could be said for programming in C or Basic instead of Assembly. Fundamentally, the point is that the parts of a computer that are hostile for humans to worry about, whether they're memory registers or concurrency, should be handled by a language that presents the programmer with something they can work with more easily. This presents a bit of a learning curve over pushing the bits yourself, but in the end, confers more power to build interesting things, as anyone who's used both Assembly and C# probably knows.
The fact that these programs don't get any faster when you double the computing power only shows that they were written in a soon-to-be outdated language. To a consumer, 4 is bigger than 2, so it's better. Once it's cheap enough to not be much of a difference on the bottom line, any manufacturer would be foolish to not put as many cores as they can fit into every machine, useful or not. (Just try finding a single-core computer today at an Apple store or Best Buy. It's pretty tricky, actually.)
When everyone's computer has 12 cores, the programs that crush their competitors will be the ones that take advantage of that with the least effort.
Isaac Z. Schlueter on May 13, 2008 08:47 PMI was just looking back at the data and I have come to the rather opposite conclusion- that quad core is better. I was suspicious about the long list (with red and green numbers) and I decided to check out the source and found that his list was based on non overclocked speeds. Now I already knew that, but what I did not know and found out was that if you overclock the quad core to 3.6 GHz, it will pretty much wipe the floor with a 3.8 GHz overclocked dual core. On all levels! I think that should have at least been mentioned because those who want the best performance for their buck will overclock.
Hope that clears up some questions,
Andrey
Andrey on May 14, 2008 02:51 PMI'm very conflicted here and was hoping you guys could give me some advice. I use my PC for video editing and audio recording/editing. I'm about to buy a new Dell XPS 630 to replace my 2.4Ghz P4 Dell that's six years old. My two options for processor are the Q6600 Quad 2.4 Ghz or for $100 more an E8500 Dual Core 3.0 Ghz processor. For my needs, I've been given conflicting opinions on which processor to go with and this article hasn't helped me much. However, if I went with the quad and OC'd to 3.0Ghz would that give the clear advantage to the quad core?
Sorry to be such a novice, but that's why I'm looking to you guys for advice. :)
Thanks,
George
> if you overclock the quad core to 3.6 GHz
Power dissipation is doubled for quad-core -- remember Intel (for now) quite literally slaps two copies of the dual core on a die and calls it a quad. The quads never overclock as well as the dual, because there are twice as many things that can go wrong. Plus the power problem, which can be severe. Their top of the line Quad is rated at 135 watts. At 3.6 you'll be pulling much more, 150w or more.
> I use my PC for video editing and audio recording/editing
For video editing, the quad core will be much better.
Jeff Atwood on May 18, 2008 01:00 AMIn real life we use many applications simultaneously. OS processes can be pushed to a separate core. One core for Norton Anti-Virus. The remaining two cores can handle a browser (with multiple tabs) and a music player, as I surf while listening to songs (while I download more songs or videos). I often have a scenario where I want to watch a DVD on my PC and have the TV tuner card recording a program, and the system sputters and hangs. Add some batch image processing or a C++ build along with something else and you soon realise that a quad-core is a necessity and not a luxury.
Harish on May 29, 2008 04:12 AMQuad core is a basic necessity these days!
Harish on May 29, 2008 04:13 AMYou just made these numbers up!
They are not consistent and the calculation is not correct.
We regularly do searches and there is anti-virus to think about. These two processes run concurrently quite often. Plus the core applications we are engaged in. So anybody has at least 4 processes running. We need Quad Core.
Harish on June 1, 2008 09:47 PMFor anyone running Quad Core with Windows XP Pro SP3, try this. Open several programs that you need to run at once that all use lots of CPU time... (I often run Firefox, Adobe Photoshop, Microsoft Frontpage, AVG Anti-virus, Microsoft Outlook and Microsoft Word at the same time.)
Now, do Ctrl + Alt + Del to bring up Task Manager. Then under Applications, right click an application and select "Go to Process". Then on the process it takes you to, right click and select "Set Affinity". Now you can limit that application to one or two CPUs.
Do this with each of the programs you are running (Not always an option with software running as a Service such as my AVG anti virus)
I set my Photoshop to CPU 1 & 2, Microsoft Word to CPU3, Frontpage to CPU4, and Outlook to CPU1
Guess what? The result was faster response from all of the programs. When I Alt Tabbed between programs I did not have to wait anymore.
I say, if you're going to benchmark the Quad vs Duo Core... set it up for real life use. Not all of us are 16 and using the system to play video games. I say give me the 31 CPU system that I see (27 of them grayed out LOL) when I set my CPU Affinity and let me get my work done!
Thanks,
Andrew Albee
Victory Computer Service
Houston, TX
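The Task Manager affinity steps above can also be done from code. The sketch below is a Linux analogue only, not the Windows procedure described in the comment: `os.sched_setaffinity` and `os.sched_getaffinity` exist in Python's standard library on Linux and some other Unix platforms, and as far as I know the standard library has no equivalent on Windows. The helper name `pin_to_single_cpu` is made up for illustration.

```python
import os


def pin_to_single_cpu(pid=0):
    """Restrict a process (0 means the current one) to one CPU it may already use.

    Returns the affinity set before and after the change.
    """
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity calls are not available on this platform")
    before = os.sched_getaffinity(pid)
    os.sched_setaffinity(pid, {min(before)})   # keep only the lowest allowed CPU
    return before, os.sched_getaffinity(pid)


if __name__ == "__main__":
    before, after = pin_to_single_cpu()
    print("allowed CPUs before:", sorted(before))
    print("allowed CPUs after: ", sorted(after))
    os.sched_setaffinity(0, before)            # undo the pin when finished
```

Pinning a CPU-heavy process to one core and watching its throughput is a quick way to repeat the one-core versus all-cores comparison that several comments here describe.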
Anything that is massively multi-threaded will benefit from
multiple cores. I do a lot of DirectX/DirectShow programming,
and without me doing a darn thing, the very nature of the COM objects (DirectShow filters) creates a massively multi-threaded application
that zooms when it can get its hands on multiple processors.
These apps have always run better on dual CPU, or hyper-threaded,
or Core 2 (now) processors than on single CPUs, and in fact,
get an almost perfect 100% performance boost by using 2 CPUs.
So a Core 2 Quad is a no-brainer for me. It will make my apps
run great.