About all that "PS3 is a supercomputer", "Xbox 360's three cores own anything on this planet", megaflops, teraflops, etc. talk... Here is a quote from a removed AnandTech article...
--------------------------
Speaking under conditions of anonymity with real world game developers who
have had first hand experience writing code for both the Xbox 360 and
PlayStation 3 hardware (and dev kits where applicable), we asked them for
nothing more than their brutal honesty. What did they think of these new
consoles? Are they really outfitted with the PC-eclipsing performance we've
been led to believe they have? The answer is actually quite frequently
found in history; as with anything, you get what you pay for.
What about all those Flops?
The one statement that we heard over and over again was that Microsoft was
sold on the peak theoretical performance of the Xenon CPU. Ever since the
announcement of the Xbox 360 and PS3 hardware, people have been set on
comparing Microsoft's figure of 1 trillion floating point operations per
second to Sony's figure of 2 trillion floating point operations per second
(TFLOPs). Any AnandTech reader should know for a fact that these numbers
are meaningless, but just in case you need some reasoning for why, let's
look at the facts.
First and foremost, a floating point operation can be anything; it can be
adding two floating point numbers together, it can be performing a dot
product on two floating point vectors, or it can even be just calculating the
complement of a floating point number. Anything that is executed on an FPU is fair game
to be called a floating point operation.
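
To make the counting problem concrete, here is a minimal sketch (not from the article) of three equally defensible ways to count the "floating point operations" in one n-element dot product. The function name and the three counting conventions are illustrative assumptions, not anything Microsoft or Sony has published.

```python
def dot_product_op_counts(n: int) -> dict:
    """Count the 'floating point operations' in an n-element dot product
    under three hypothetical counting conventions."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        # One multiply per element pair, then n - 1 adds to sum the products.
        "separate_mul_and_add": n + (n - 1),
        # One fused multiply-add per element pair, accumulating from zero.
        "fused_multiply_add": n,
        # The whole dot product treated as a single operation.
        "single_dot_instruction": 1,
    }


counts = dot_product_op_counts(4)
for convention, ops in counts.items():
    print(f"{convention}: {ops}")
```

For a 4-element dot product, the same work comes out as 7, 4, or 1 "operations" depending on which convention you pick, which is why a bare flops figure says little without knowing what was counted.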
Secondly, both floating point power numbers refer to the whole system, CPU
and GPU. Obviously a GPU's floating point processing power doesn't mean
anything if you're trying to run general purpose code on it and vice versa.
As we've seen from the graphics market, characterizing GPU performance in
terms of generic floating point operations per second is far from the full
performance story.
Third, when a manufacturer is talking about peak floating point performance
there are a few things that they aren't taking into account. Being able to
process billions of operations per second depends on actually being able to
have that many floating point operations to work on. That means that you
have to have enough bandwidth to keep the FPUs fed, no mispredicted
branches, no cache misses and the right structure of code to make sure that
all of the FPUs can be fed at all times so they can execute at their peak
rates. We already know that's not the case as game developers have already
told us that the Xenon CPU isn't even in the same realm of performance as
the Pentium 4 or Athlon 64. Not to mention that the requirements for
hitting peak theoretical performance are always ridiculous; caches are only
so big and thus there will come a time where a request to main memory is
needed, and you can expect that request to be fulfilled in a few hundred
clock cycles, where no floating point operations will be happening at all.
So while there may be some extreme cases where the Xenon CPU can hit its
peak performance, it sure isn't happening in any real world code.
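
A rough way to see how quickly memory stalls eat into a peak figure is the toy model below. It is a sketch, not a measurement: the function name, the 8 operations per cycle, and the 300-cycle miss penalty (standing in for "a few hundred") are all assumptions chosen for illustration, and the model ignores any overlap between computation and memory access.

```python
def sustained_fraction(ops_between_misses: float,
                       peak_ops_per_cycle: float,
                       miss_penalty_cycles: float) -> float:
    """Fraction of peak throughput achieved when the FPUs run at peak
    between cache misses and sit completely idle during each miss."""
    busy_cycles = ops_between_misses / peak_ops_per_cycle
    return busy_cycles / (busy_cycles + miss_penalty_cycles)


# Hypothetical numbers: 8 ops/cycle at peak, 300 idle cycles per miss.
for ops in (100, 1_000, 10_000):
    frac = sustained_fraction(ops, peak_ops_per_cycle=8, miss_penalty_cycles=300)
    print(f"{ops:>6} ops between misses -> {frac:.1%} of peak")
```

Under these made-up numbers, even code that manages 1,000 floating point operations between trips to main memory sustains under a third of the quoted peak.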
The Cell processor is no different; given that its PPE is identical to one
of the PowerPC cores in Xenon, it must derive its floating point performance
superiority from its array of SPEs. So what's the issue with the 218 GFLOPs
number (2 TFLOPs for the whole system)? Well, from what we've heard, game
developers are finding that they can't use the SPEs for a lot of tasks. So
in the end, it doesn't matter what the peak theoretical performance of Cell's
SPE array is if those SPEs aren't being used all the time.
Another way to look at this comparison of flops is to look at integer add
latencies on the Pentium 4 vs. the Athlon 64. The Pentium 4 has two double
pumped ALUs, each capable of performing two add operations per clock, that's
a total of 4 add operations per clock; so we could say that a 3.8GHz Pentium
4 can perform 15.2 billion operations per second. The Athlon 64 has three
ALUs each capable of executing an add every clock; so a 2.8GHz Athlon 64
can perform 8.4 billion operations per second. By this silly console
marketing logic, the Pentium 4 would be almost twice as fast as the Athlon
64, and a multi-core Pentium 4 would be faster than a multi-core Athlon 64.
Any AnandTech reader should know that's hardly the case. No code is
composed entirely of add instructions, and even if it were, eventually the
Pentium 4 and Athlon 64 will have to go out to main memory for data, and
when they do, the Athlon 64 has a much lower latency access to memory than
the P4. In the end, despite what these horribly concocted numbers may lead
you to believe, they say absolutely nothing about performance. The exact
same situation exists with the CPUs of the next-generation consoles; don't
fall for it.
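
The peak-add arithmetic above can be reproduced in a few lines. This is only a sketch of the marketing-style calculation the article is mocking; the function and parameter names are made up for illustration, and the ALU counts and clock speeds are the ones quoted in the text.

```python
def peak_adds_per_second(alus: int, adds_per_alu_per_clock: int,
                         clock_ghz: float) -> float:
    """Peak integer adds per second, in billions: every ALU is assumed
    to issue its maximum number of adds on every single clock."""
    return alus * adds_per_alu_per_clock * clock_ghz


# Figures from the text: two double-pumped ALUs at 3.8GHz vs. three ALUs at 2.8GHz.
pentium4 = peak_adds_per_second(alus=2, adds_per_alu_per_clock=2, clock_ghz=3.8)
athlon64 = peak_adds_per_second(alus=3, adds_per_alu_per_clock=1, clock_ghz=2.8)

print(f"Pentium 4: {pentium4:.1f} billion adds/s")
print(f"Athlon 64: {athlon64:.1f} billion adds/s")
print(f"Ratio:     {pentium4 / athlon64:.2f}x")
```

Note that nothing in the formula knows about memory latency, branch mispredictions, or instruction mix, which is exactly why the resulting ratio tells you nothing about real performance.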
There you have it.
It's all marketing bull.