http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3326&p=1
While Nehalem is designed to scale to up to 8 cores per chip, each one of those cores has the hardware necessary to execute two threads simultaneously - yep, it's the return of Hyper Threading. Thus our quad-core Nehalem sample appeared as 8 logical cores under Windows Vista.
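The core counting above can be sketched in a few lines. This is only an illustration: `logical_cores` is a made-up helper, not anything from the article, and `os.cpu_count()` simply reports whatever the operating system exposes on the machine running the script.

```python
import os

def logical_cores(physical_cores: int, threads_per_core: int) -> int:
    """Logical processors the OS sees: physical cores times hardware threads per core."""
    return physical_cores * threads_per_core

# The quad-core Nehalem sample with two threads per core, as described in the article.
nehalem_logical = logical_cores(physical_cores=4, threads_per_core=2)
print(f"Quad-core Nehalem with Hyper Threading: {nehalem_logical} logical cores")

# On your own machine, this returns the logical processor count (or None if unknown).
print(f"This machine reports: {os.cpu_count()} logical cores")
```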
We took Valve's source-engine map compilation benchmark and measured the compile time to execute one instance (4 threads) vs. two instances of the benchmark. The graph below shows the increase in compilation time when we double the workload:
While the 2.66GHz Core 2 Quad Q9450 (Penryn) takes another 127 seconds to execute twice the workload, the 2.66GHz Nehalem only needs another 49 seconds. And if you're curious, this quad-core Nehalem running at 2.66GHz is within 20% of the performance of an eight-core 3.2GHz Skulltrail system. Equalize clock speed and we'd bet that a quad-core Nehalem would be the same speed as an 8-core Skulltrail here. (That suggests Intel has fixed Hyper Threading, so that enabling it no longer costs performance the way it did on previous Hyper Threading CPUs.)
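To put the two figures in proportion, here is a quick sketch of the arithmetic. Only the 127 and 49 second values come from the article; the variable and function names are made up for illustration. The single-instance baseline times are not quoted here, so this compares the added time only and does not give a full scaling percentage.

```python
# Extra seconds needed to run two instances of the benchmark instead of one
# (both figures are taken from the article).
penryn_extra_s = 127   # 2.66GHz Core 2 Quad Q9450
nehalem_extra_s = 49   # 2.66GHz Nehalem sample

def relative_penalty(extra_a: float, extra_b: float) -> float:
    """Return extra_a expressed as a fraction of extra_b."""
    return extra_a / extra_b

ratio = relative_penalty(nehalem_extra_s, penryn_extra_s)
saved_s = penryn_extra_s - nehalem_extra_s
print(f"Nehalem's added time is {ratio:.0%} of Penryn's ({saved_s} s less)")
```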
The DivX test is an important one as it doesn't scale well at all beyond four threads; any performance advantage Nehalem has here is entirely due to microarchitectural improvements and not influenced by its ability to work on twice as many threads at once.
Using AutoMKV we compress the same source file we used in our WME test down to 100MB, but with the x264 codec. We used the 2_Pass_Insane_Quality profile:
Encoding performance here went through the roof with Nehalem: a clock for clock boost of 44%. Once more, Nehalem at today's artificially limited, modest clock speed is already faster than any Penryn out today. What Intel did to AMD in 2006, it is doing to itself in 2008. Amazing.
Our benchmark, as always, is the SPECapc 3dsmax 8 test, but for the purpose of this article we only run the CPU rendering tests and not the GPU tests.
The results are reported as render times in seconds and the final CPU composite score is a weighted geometric mean of all of the test scores.
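For anyone wondering what a weighted geometric mean looks like in practice, here is a minimal sketch. The article does not list the individual scores or the weights SPECapc uses, so the numbers below are purely hypothetical, and `weighted_geometric_mean` is an illustrative helper, not part of any benchmark tool.

```python
import math

def weighted_geometric_mean(values, weights):
    """exp( sum(w_i * ln(x_i)) / sum(w_i) ) for positive values x_i and weights w_i."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and of equal length")
    if any(v <= 0 for v in values):
        raise ValueError("a geometric mean needs strictly positive values")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)

# Hypothetical per-test scores and weights (NOT the real SPECapc values).
scores = [2.0, 8.0]
weights = [1.0, 1.0]
composite = weighted_geometric_mean(scores, weights)
print(f"Composite score: {composite:.2f}")
```

With equal weights this reduces to the ordinary geometric mean, which is less swayed by one unusually large result than a plain average would be.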
POV-Ray is a popular raytracer, also available with a built-in benchmark. We used the 3.7 beta, which has SMP support, and ran the built-in multithreaded benchmark.
Finally, POV-Ray echoes what we've seen elsewhere, with a 36% performance improvement over the 2.66GHz Core 2 Q9450. Note that Nehalem continues to be faster than even the fastest Penryns available today, despite the lower clock speed of this early sample.
AnandTech
And all this with Intel's Sandy Bridge approaching quickly. If what has been said about Sandy Bridge is correct (16-32 CPU cores, each with 2 logical cores, so theoretically 32-64 logical cores per chip), we are in for a new era of computing.