I tried to gather as much as I could, but I'm not exactly sure if the PS3's GPU stats are right.
Let's look at the maximum theoretical numbers for the Xbox 360 and PS3 GPUs.
Triangle Setup
Xbox 360 - 500 Million Triangles/sec
PS3 - 250 Million Triangles/sec
Vertex Shader Processing
Xbox 360 - 6.0 Billion Vertices/sec (using all 48 Unified Pipelines)
Xbox 360 - 2.0 Billion Vertices/sec (using only 16 of the 48 Unified Pipelines)
Xbox 360 - 1.5 Billion Vertices/sec (using only 12 of the 48 Unified Pipelines)
Xbox 360 - 1.0 Billion Vertices/sec (using only 8 of the 48 Unified Pipelines)
PS3 - 1.0 Billion Vertices/sec
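The four Xbox 360 vertex figures all scale linearly with the number of pipelines used. Here is a rough sketch of that arithmetic in Python. The 500MHz clock comes from the fill rate figures in this post. The "one vertex per pipeline every 4 cycles" rate is an assumption inferred from 6.0 billion / 48 pipelines, not a published spec.

```python
CLOCK_HZ = 500e6           # GPU clock used elsewhere in this post
CYCLES_PER_VERTEX = 4      # assumption: inferred from 6.0 billion / 48 pipelines

def vertices_per_sec(pipelines):
    """Theoretical vertex rate when `pipelines` unified pipelines do vertex work."""
    return pipelines * CLOCK_HZ / CYCLES_PER_VERTEX

if __name__ == "__main__":
    for n in (48, 16, 12, 8):
        print(f"{n} pipelines: {vertices_per_sec(n) / 1e9:.1f} billion vertices/sec")
```

Pipelines spent on vertices are not available for pixels, which is why the post lists several splits.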
Filtered Texture Fetch
Xbox 360 - 8.0 Billion Texels/sec
PS3 - 12.0 Billion Texels/sec
Vertex Texture Fetch
Xbox 360 - 8.0 Billion Texels/sec
PS3 - 4.0 Billion Texels/sec
Pixel Shader Processing with 16 Filtered Texels Per Cycle (Pixel ALU x Clock)
Xbox 360 - 24.0 Billion Pixels/sec (using all 48 Unified Pipelines)
Xbox 360 - 20.0 Billion Pixels/sec (using 40 of the 48 Unified Pipelines)
Xbox 360 - 18.0 Billion Pixels/sec (using 36 of the 48 Unified Pipelines)
Xbox 360 - 16.0 Billion Pixels/sec (using 32 of the 48 Unified Pipelines)
PS3 - 16.0 Billion Pixels/sec
Pixel Shader Processing without Textures (Pixel ALU x Clock)
Xbox 360 - 24.0 Billion Pixels/sec (using all 48 Unified Pipelines)
Xbox 360 - 20.0 Billion Pixels/sec (using 40 of the 48 Unified Pipelines)
Xbox 360 - 18.0 Billion Pixels/sec (using 36 of the 48 Unified Pipelines)
Xbox 360 - 16.0 Billion Pixels/sec (using 32 of the 48 Unified Pipelines)
PS3 - 24.0 Billion Pixels/sec
Multisampled Fill Rate
Xbox 360 - 16.0 Billion Samples/sec (8 ROPS x 4 Samples x 500MHz)
PS3 - 8.0 Billion Samples/sec (8 ROPS x 2 Samples x 500MHz)
Pixel Fill Rate with 4x Multisampled Anti-Aliasing
Xbox 360 - 4.0 Billion Pixels/sec (8 ROPS x 4 Samples x 500MHz / 4)
PS3 - 2.0 Billion Pixels/sec (8 ROPS x 2 Samples x 500MHz / 4)
Pixel Fill Rate without Anti-Aliasing
Xbox 360 - 4.0 Billion Pixels/sec (8 ROPS x 500MHz)
PS3 - 4.0 Billion Pixels/sec (8 ROPS x 500MHz)
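The three fill rate groups above all use the same formula (ROPS x samples x clock). A minimal sketch that reproduces them, using only the numbers listed in this post; the function names are just for illustration:

```python
def sample_rate(rops, samples_per_rop, clock_hz):
    """Multisampled fill rate in samples/sec: ROPS x samples per ROP x clock."""
    return rops * samples_per_rop * clock_hz

def pixel_rate_with_msaa(rops, samples_per_rop, clock_hz, aa_samples):
    """Pixel fill rate when every pixel needs `aa_samples` samples written."""
    return sample_rate(rops, samples_per_rop, clock_hz) / aa_samples

if __name__ == "__main__":
    clock = 500e6
    print("Xbox 360 4xAA:", pixel_rate_with_msaa(8, 4, clock, 4) / 1e9, "billion pixels/sec")
    print("PS3 4xAA:     ", pixel_rate_with_msaa(8, 2, clock, 4) / 1e9, "billion pixels/sec")
```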
Frame Buffer Bandwidth
Xbox 360 - 256.0 GB/sec (dedicated for frame buffer rendering)
PS3 - 20.8 GB/sec (shared with other graphics data: textures and vertices)
PS3 - 10.8 GB/sec (with 10.0 GB/sec subtracted for textures and vertices)
PS3 - 8.4 GB/sec (with 12.4 GB/sec subtracted for textures and vertices)
Texture/Vertex Memory Bandwidth
Xbox 360 - 22.4 GB/sec (shared with CPU)
Xbox 360 - 14.4 GB/sec (with 8.0 GB/sec subtracted for CPU)
Xbox 360 - 12.4 GB/sec (with 10.0 GB/sec subtracted for CPU)
PS3 - 20.8 GB/sec (shared with frame buffer)
PS3 - 10.8 GB/sec (with 10.0 GB/sec subtracted for frame buffer)
PS3 - 8.4 GB/sec (with 12.4 GB/sec subtracted for frame buffer)
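The shared-bus figures above are simple subtractions: take the total bandwidth and remove whatever the other consumer is assumed to use. A small sketch; the amounts subtracted are the same illustrative guesses used in the lists above, not measurements:

```python
def remaining_bandwidth(total_gb_s, used_elsewhere_gb_s):
    """Bandwidth left on a shared bus, in GB/sec, rounded to one decimal."""
    return round(total_gb_s - used_elsewhere_gb_s, 1)

if __name__ == "__main__":
    # PS3: 20.8 GB/sec shared between frame buffer and textures/vertices
    print(remaining_bandwidth(20.8, 10.0))
    # Xbox 360: 22.4 GB/sec shared between GPU and CPU
    print(remaining_bandwidth(22.4, 8.0))
```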
Shader Model
Xbox 360 - Shader Model 3.0+ / Unified Shader Architecture
PS3 - Shader Model 3.0 / Discrete Shader Architecture
The PS3's RSX could instead be represented as a GeForce 7 based architecture with 24 pixel pipelines, 8 vertex pipelines, 8 ROPS, 500MHz core, and 650MHz memory. In other words, it dropped from 550MHz/700MHz to 500MHz/650MHz.
I wasn't sure, so I posted the figures I read more often.
Processors: Info taken from http://www.avsforum.com/ because they already had it all.
"When I initially looked at the specifications of the Cell (PS3) and Xenos (Xbox 360) processors, it appeared that the Cell processor had a big advantage over the Xenos processor if both were able to harness the maximum amount of power. After looking into more detail I have come to the conclusion that the Xenos processor will probably be able to perform better than the Cell processor under almost all conditions.
Both processors are stripped down and modified versions of the IBM 970 PowerPC. Each core executes at less than 1/2 the speed of the IBM 970 at the same clock frequency because the IBM 970 has multiple execution units and performs out-of-order execution (parallel processing), whereas the Cell and Xenos processors only have a single execution unit and perform in-order execution (sequential processing). The following link illustrates the performance of a PS3 at 3.2 GHz and a Power Mac G5 at 1.6 GHz using the Linux operating system.
http://www.geekpatrol.ca/2006/11/playstation-3-performance/
Linux runs on the Power Processor Element (PPE) of the Cell processor, so the results should be similar to one core of the Xenos processor, since all three of its cores are the same. Both processors are clocked at 3.2 GHz.
The similarities between the two processors end there. The Xenos processor has 3 identical PPE cores, whereas the Cell processor has only 1 PPE core and 7 SPE cores.
Cell Processor
- One general purpose PPE core that is used for the OS and the game application.
- 512 MB total memory on 2 buses which can be accessed directly only by the PPE core. 256 MB of processor main memory and 256 MB of memory used by the GPU.
- 512 KB L2 cache for the PPE.
- 32 KB L1 instruction cache and 32 KB L1 data cache for the PPE.
- 7 specialized SPE cores. One is used for the OS leaving 6 for the game application.
- 256KB SRAM per SPE. There is no common memory between SPEs, and an SPE cannot access the PPE's main memory directly, but the PPE can access the SPEs' memory directly.
- Communication between SPE memories or with the PPE memory is performed via the Element Interconnect Bus (EIB) by either accessing ports or via DMA.
- SPEs do not have branch prediction capability.
Xenos Processor
- 3 General purpose PPE cores that are used for the OS and game application.
- 512 MB main memory that is shared by all three cores and GPU.
- 1 MB of L2 cache that is shared by the 3 cores (333 KB per core average).
- 32 KB L1 instruction cache and 32 KB data cache for each core.
- 2 Hardware threads per core.
Programming the 360
The OS does not use core 0 and uses only about 3% of the power of core 1 and 3% of the power of core 2. Therefore about 98% of the processor power of all three cores is available for the game application.
Programming the 360 is fairly easy and straightforward since a large amount of shared main memory is available, a relatively large amount of shared L2 cache is available, and information can be quickly and easily passed between different threads (cores) of the application by just passing pointers.
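As a quick check of the 98% figure (a sketch added for illustration, not part of the quoted post), average the share of each core left over after the OS takes its cut:

```python
# Fraction of each core used by the OS, per the paragraph above.
OS_SHARE_PER_CORE = [0.00, 0.03, 0.03]

def available_fraction(os_shares):
    """Average fraction of CPU time left for the game across all cores."""
    return sum(1.0 - share for share in os_shares) / len(os_shares)

if __name__ == "__main__":
    print(f"{available_fraction(OS_SHARE_PER_CORE):.0%} available to the game")
```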
Typically an application will initially be developed using only one thread of a core. Once the application is developed, it can then be segmented to use multiple cores and possibly multiple hardware threads of each core. The easiest segmentation would be to place the game control plus AI code in one core and graphics rendering code in another core. As soon as the AI code completes its operation, it would queue the information for the graphics rendering core and immediately start to process the next frame. The graphics rendering code will be executing code for the current frame and the AI will be executing code for the next frame simultaneously.
Segmenting a program beyond that becomes more difficult. The developer would have to first determine where the bottleneck is occurring. If it was in the AI code, he would then have to determine if parallel processing can be performed on the code (ex. In a racing program, it may be possible for the main program to process the AI for 5 racing cars and another core to process the AI for the other 5 racing cars on the track at the same time). If the bottleneck was in the graphics rendering code, it may be possible for part of the graphics rendering code to be done in parallel in another core.
When a program is segmented among all three cores, one of the cores may be active 100% of the time but the other two may only be active a small part of the time (10%, 20%, 50%, etc.). In this case more segmentation may be required of the work on the core that is active 100% of the time: a new hardware thread can be added to one of the less active cores to handle 2 processes at one time. Once all the available hardware threads are used and more segmentation is still required, software threads (although not as efficient as hardware threads) can then be added until that core approaches 100% usage.
Once all three cores are executing near 100%, the maximum frame rate, sophistication, and detail capabilities will have been achieved. If the AI is issuing frames faster than the GPU can process them (maximum 60 fps at 720p or 30 fps at 1080i), more detail or sophistication can be added.
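The AI/render split described above is a producer/consumer pipeline. Below is a minimal sketch of the structure in Python (added for illustration, not part of the quoted post). The frame contents are made-up placeholders, and Python threads will not actually run on separate cores the way native 360 code would; the point is only the hand-off through a shared queue, where just a reference to the frame data is passed.

```python
import queue
import threading

def run_pipeline(num_frames):
    """Run a two-stage AI -> render pipeline and return the rendered frame numbers."""
    # Bounded queue: keeps the AI thread from running far ahead of the renderer.
    handoff = queue.Queue(maxsize=1)
    rendered = []

    def ai_worker():
        for frame in range(num_frames):
            # Placeholder for game control + AI work on this frame.
            state = {"frame": frame, "positions": [frame * 2, frame * 3]}
            handoff.put(state)   # passes a reference, not a copy
        handoff.put(None)        # sentinel: no more frames

    def render_worker():
        while True:
            state = handoff.get()
            if state is None:
                break
            # Placeholder for rendering the frame described by `state`.
            rendered.append(state["frame"])

    threads = [threading.Thread(target=ai_worker),
               threading.Thread(target=render_worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rendered

if __name__ == "__main__":
    print(run_pipeline(5))
```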
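The racing example above (5 cars on one core, 5 on another) can be sketched like this (added for illustration, not part of the quoted post). `update_car_ai` is a hypothetical stand-in for the real per-car AI, and a thread pool stands in for the two cores:

```python
from concurrent.futures import ThreadPoolExecutor

def update_car_ai(car):
    """Hypothetical per-car AI step: here it just bumps the speed by 1."""
    return {"id": car["id"], "speed": car["speed"] + 1}

def update_group(cars):
    """Process one group of cars sequentially (the work given to one core)."""
    return [update_car_ai(car) for car in cars]

def update_all(cars, workers=2):
    """Split the cars into `workers` groups and update the groups in parallel."""
    if not cars:
        return []
    size = (len(cars) + workers - 1) // workers   # ceiling division
    groups = [cars[i:i + size] for i in range(0, len(cars), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(update_group, groups))  # preserves group order
    return [car for group in results for car in group]

if __name__ == "__main__":
    cars = [{"id": i, "speed": 100} for i in range(10)]
    print(update_all(cars))
```

This only works cleanly if each car's AI does not depend on results computed for another car in the same step, which is the kind of question the developer has to answer before splitting the work.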
Programming the PS3
The PS3 is so much more difficult to program than the 360. In a sense it is designed similar to multiprocessor systems used by specialized customers such as NASA Ames Research Center. The concept is based on the principle that there is a very large amount of repetitive mathematical work that can be performed in a parallel or a segmented sequential fashion (ex. one core multiplies two arrays of 10000 numbers and then passes the output array to another core, which performs divides on individual elements in the array and passes the array to another core, which performs some other operation on the data, etc. After the first core finishes its operation, it will acquire more data and perform the same operation).
Like the 360, the application would initially be developed using the PPE core. Next you would think that the PS3 (just like the 360) would be able to segment the game control plus AI code into one core and the graphics rendering code into another core. However that is not possible! Since the total application code may be about 100 MB and the SPE only has 256KB of memory, only about 1/400 of the total code can fit in one SPE memory. Also, since there isn't any branch prediction capability in an SPE, branching should be done as little as possible (although I believe that the compiler can insert code to cause pre-fetches, so there may not be a big issue with branching).
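The "segmented sequential" idea above can be sketched as a chain of stages, each doing one operation on a batch and handing it to the next (added for illustration, not part of the quoted post). Python generators run the stages one after another rather than on separate cores, and the divisor and offset values are arbitrary, so this only shows the shape of the data flow:

```python
def multiply_stage(batches):
    """Stage 1: multiply two arrays element by element."""
    for a, b in batches:
        yield [x * y for x, y in zip(a, b)]

def divide_stage(batches, divisor):
    """Stage 2: divide every element of the incoming array."""
    for batch in batches:
        yield [x / divisor for x in batch]

def offset_stage(batches, offset):
    """Stage 3: 'some other operation', here adding a constant."""
    for batch in batches:
        yield [x + offset for x in batch]

def run_stages(batches):
    """Feed input batches through all three stages in order."""
    stage1 = multiply_stage(batches)
    stage2 = divide_stage(stage1, 2.0)
    stage3 = offset_stage(stage2, 1.0)
    return list(stage3)

if __name__ == "__main__":
    inputs = [([1, 2], [3, 4]), ([5], [6])]
    print(run_stages(inputs))
```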
Therefore the developer has to find code that is less than 256KB (including needed data space) that will execute in parallel.
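The 1/400 figure and the "code plus data must fit" constraint can be checked with a few lines (added for illustration, not part of the quoted post). The 100 MB application size is the rough guess from the text, and the sizes in the example calls are made up:

```python
LOCAL_STORE_BYTES = 256 * 1024          # SPE local memory
APP_CODE_BYTES = 100 * 1024 * 1024      # rough total application size from the text

def fits_in_local_store(code_bytes, data_bytes):
    """True if a code segment plus its working data fits in one SPE."""
    return code_bytes + data_bytes <= LOCAL_STORE_BYTES

if __name__ == "__main__":
    print("code fits 1 /", APP_CODE_BYTES // LOCAL_STORE_BYTES, "of the application")
    print(fits_in_local_store(200 * 1024, 56 * 1024))   # exactly 256KB
    print(fits_in_local_store(200 * 1024, 57 * 1024))   # 1KB too big
```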
Even if code can be found that can be segmented, data between the PPE and the SPE has to be passed back and forth via DMA, which is very slow compared to passing a pointer to the data as on the 360.
If we assume that enough segmentable code was found to use all 6 SPE cores assigned to the game application, the developer would now try to balance the load among the cores. Like the 360, some or all of the cores may have very low utilization. Adding more hardware threads is not possible since each core has only one hardware thread. Adding software threads probably will not work due to the memory constraint. So the only option is an overlay scheme where the PPE will transfer new code using DMA to the SPE when the last overlay finishes processing. This is very time consuming and code has to be found that does not overlap in the same time frame."
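To make the overlay idea from the quote a bit more concrete, here is a toy simulation. It is only a sketch: the overlay names, sizes, and functions are hypothetical, and a counter stands in for the DMA transfer of code from the PPE to the SPE. It does not use any real Cell SDK calls.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE local memory, from the quote

class SimulatedSpe:
    """Toy model of an SPE that holds one overlay at a time."""

    def __init__(self):
        self.loaded_name = None
        self.transfers = 0   # how many times code had to be swapped in

    def run(self, name, overlay, data):
        size, func = overlay
        if size > LOCAL_STORE_BYTES:
            raise ValueError(f"overlay {name!r} does not fit in local store")
        if self.loaded_name != name:
            # Stands in for the PPE transferring new code to the SPE via DMA.
            self.loaded_name = name
            self.transfers += 1
        return func(data)

# Hypothetical overlays: (code size in bytes, function to run)
OVERLAYS = {
    "physics": (180 * 1024, lambda xs: [x * 2 for x in xs]),
    "audio": (120 * 1024, lambda xs: [x + 1 for x in xs]),
}

def run_schedule(spe, schedule, data):
    """Run the named overlays in order, feeding each one's output to the next."""
    for name in schedule:
        data = spe.run(name, OVERLAYS[name], data)
    return data

if __name__ == "__main__":
    spe = SimulatedSpe()
    result = run_schedule(spe, ["physics", "physics", "audio", "physics"], [1, 2])
    print(result, "with", spe.transfers, "code transfers")
```

Every switch between overlays costs a transfer, so a schedule that keeps bouncing between two overlays pays that cost each time.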
Memory:
Memory - PS3's 256MB XDR @ 3.2GHz vs. Xbox 360's 512MB GDDR3 @ 700MHz
What all this means is that the more memory you have, combined with a fast processor, the faster you can process and display complex graphics, which affects the overall framerate of your game. In this case, the PS3 has half as much of this memory as the Xbox 360 (its other 256 MB is separate memory used by the GPU) but can process data faster. It's like trying to swallow a large chunk of data, where your processor determines how fast you can chew it. Which will be faster: a small chunk chewed at 3.2GHz or a big chunk chewed at 700MHz?
Media Type - PS3's Blu-ray Discs vs. Xbox 360's DVD-9
Most game discs nowadays are DVD-9s, like those of the Xbox 360. What Sony has going for them is that they're also the manufacturer of Blu-ray, which is a high-definition storage disc capable of holding 54 Gigabytes of data. This type of storage could become a necessity in the future.
Sound - both sporting 5.1 Dolby Surround Sound
Online potential - Both consoles feature Ethernet and Wi-Fi capabilities. They come in handy when playing online, or for non-gaming activities like downloading music and videos. Microsoft has an edge because it already has an online network in place.