Good thread. I enjoyed the read.
Sounds like an interesting architecture, and it sounds like the PS4 is going to be easy and stress-free to develop on.
So PS4 is awesome, Confirmed.
[QUOTE="Mrmedia01"]
Good thread. I enjoyed the read.
Sounds like an interesting architecture, and it sounds like the PS4 is going to be easy and stress-free to develop on.
So PS4 is awesome, Confirmed.
lostrib
Unless it's like $600 dollars without any good games
PS3 has good games and continues to have great games come out. I know PS4 will have great games too, including great launch games. True though, the PS4's price could be messed up. I sure hope not; price would be the only thing to hold PS4 back.
[QUOTE="Sali217"]The difference is the PS4 won't go obsolete as quickly as a new PC will. I will still be able to play all the latest games on it in 6 or 7 years without updating anything.lostrib
Don't care
That's because you're probably one of those PC nerds that enjoys tinkering with electronic crap.
Good Read...
How does the PS4 differ from a high-end gaming PC?
http://www.shacknews.com/article/78889/how-does-the-ps4-differ-from-a-high-end-gaming
Sony described its upcoming PlayStation 4 as a "supercharged" PC. Powered by familiar x86 architecture manufactured by AMD, PS4 is more like a gaming PC than any previous Sony console. However, while it may use many parts found in high-end gaming PCs, PS4 system architect Mark Cerny argues that PS4 has many unique features that separate it from today's PCs.
"The 'supercharged' part, a lot of that comes from the use of the single unified pool of high-speed memory," Cerny said, pointing to the 8GB of GDDR5 RAM that's fully addressable by both the CPU and GPU. "If [a PC] had 8 gigabytes of memory on it, the CPU or GPU could only share about 1 percent of that memory on any given frame. That's simply a limit imposed by the speed of the PCIe. So, yes, there is substantial benefit to having a unified architecture on PS4, and it's a very straightforward benefit that you get even on your first day of coding with the system."
According to Cerny, PS4 addresses the hiccups that can come from the communication between CPU, GPU, and RAM in a traditional PC. "A typical PC GPU has two buses," Cerny told Gamasutra in a very detailed technical write-up. "There's a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication--any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
PS4 addresses these concerns by adding another bus to the GPU "that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches." The end result is that it removes synchronization issues between the CPU and GPU. "We can pass almost 20 gigabytes a second down that bus," Cerny said, pointing out that it's "larger than the PCIe on most PCs!"
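For a loose PC-side analogy to a bus that skips the caches, non-temporal (streaming) stores are similar in spirit: they write toward memory without dirtying L1/L2. A minimal sketch, illustrating only the idea of bypassing caches; it is not how the PS4's extra bus works:
[code]
#include <emmintrin.h>  // SSE2: _mm_stream_si32, _mm_sfence
#include <cstddef>
#include <vector>

// Non-temporal stores write toward memory without filling the L1/L2
// caches, avoiding the cache pollution Cerny describes. Illustration of
// the concept only.
void write_bypassing_caches(int* dst, int value, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i)
        _mm_stream_si32(dst + i, value);  // non-temporal (streaming) store
    _mm_sfence();  // order the streamed writes before anything that follows
}

int main() {
    std::vector<int> buf(1 << 20);
    write_bypassing_caches(buf.data(), 42, buf.size());
    return 0;
}
[/code]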
"The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we've worked with AMD to increase the limit to 64 sources of compute commands," Cerny said. According to Cerny, the reason for the increase is that middleware will have a need to use compute as well. "Middleware requests for work on the GPU will need to be properly blended with game requests, and then finally properly prioritized relative to the graphics on a moment-by-moment basis."
There's no TLDR because the entire article is pretty much filled with content. Either read or don't.
[QUOTE="wasted_wisdom"] Good Read... How does the PS4 differ from a high-end gaming PC? (snipped; full post quoted above)
wasted_wisdom
"A typical PC GPU has two buses," said Cerny. "Theres a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication -- any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. ""
"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
http://www.whatmannerofburgeristhis.com/blog/topics/OpenCL/
The latest Tahiti/GCN GPUs have a read/write, incoherent L1 cache local to a compute unit. Since a single workgroup will always run on a single compute unit, memory will be consistent in that group using the cache.
Use atomics to bypass the L1 cache if you need strong memory consistency across workgroups. This is an option for reads that aren't very critical. This was true for one of the N-body kernels. For another it was many times slower than running a single workgroup at time to ensure global consistency.
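For anyone who hasn't touched OpenCL, here is a CPU-side analogy of that advice: atomics buy you consistency across independent workers where plain cached reads/writes wouldn't. std::atomic stands in for OpenCL's cross-workgroup atomics here; it's an analogy, not GPU code:
[code]
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

// Four threads play the role of four independent "workgroups". Because
// the counter is atomic, every increment is globally visible and the
// final total is always exact, with no per-worker cached copies to flush.
int main() {
    std::atomic<long> total{0};  // coherent, shared across all "groups"
    std::vector<std::thread> workers;
    for (int g = 0; g < 4; ++g)
        workers.emplace_back([&total] {
            for (int i = 0; i < 100000; ++i)
                total.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& t : workers) t.join();
    printf("total = %ld\n", total.load());  // always 400000
    return 0;
}
[/code]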
-----
http://developer.amd.com/resources/heterogeneous-computing/what-is-heterogeneous-system-architecture-hsa/
With HSA, applications can create data structures in a single unified address space and can initiate work items on the hardware most appropriate for a given task. Sharing data between compute elements is as simple as sending a pointer. Multiple compute tasks can work on the same coherent memory regions, utilizing barriers and atomic memory operations as needed to maintain data synchronization (just as multi-core CPUs do today).
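A sketch of the "sharing data is as simple as sending a pointer" idea, with two CPU threads standing in for CPU and GPU (a CPU-only analogy under my own naming, not actual HSA runtime code): the consumer works on the very buffer the producer filled, with no copy and one atomic flag for synchronization.
[code]
#include <atomic>
#include <cstdio>
#include <thread>

// One allocation in one address space; "sending" it to the other
// processor is literally publishing its pointer through an atomic.
struct Work { int data[4]; };

Work buffer;                        // the single shared data structure
std::atomic<Work*> ready{nullptr};  // "send the pointer"

int main() {
    std::thread gpu([] {            // pretend GPU-side consumer
        Work* w;
        while (!(w = ready.load(std::memory_order_acquire))) {}
        printf("consumed %d without any copy\n", w->data[0]);
    });
    buffer.data[0] = 123;           // "CPU" fills the shared structure
    ready.store(&buffer, std::memory_order_release);
    gpu.join();
    return 0;
}
[/code]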
The HSA team at AMD analyzed the performance of Haar Face Detect, a commonly used multi-stage video analysis algorithm used to identify faces in a video stream. The team compared a CPU/GPU implementation in OpenCL against an HSA implementation. The HSA version seamlessly shares data between CPU and GPU, without memory copies or cache flushes, because it assigns each part of the workload to the most appropriate processor with minimal dispatch overhead.
----------------------
""Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, weve worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."
AMD modified the PS4's GCN for 8 weaker CPU cores, while PC GCN was designed for stronger CPU cores, e.g. AMD's marketing favorite, the Intel Core i7-3770K.
At the same clock speed (e.g. 2.0 GHz), it would take two Jaguar cores to match one Intel Sandy Bridge/Ivy Bridge core. Desktop Sandy Bridge/Ivy Bridge has a turbo mode (e.g. 3.8 GHz/3.9 GHz) to speed up narrow-threaded performance. AMD's own FX CPUs have turbo modes reaching 4+ GHz.
One Intel Sandy Bridge/Ivy Bridge core @ 3.4 GHz can easily cover 3.5 AMD Jaguar cores @ 2 GHz. An Intel Sandy Bridge/Ivy Bridge quad-core @ 3.0+ GHz fits with flagship PC GCNs.
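Checking that arithmetic with the post's own numbers (two Jaguar cores per Intel core at equal clocks):
[code]
#include <cstdio>

int main() {
    // Back-of-envelope check of "one Sandy/Ivy Bridge core @3.4 GHz
    // covers ~3.5 Jaguar cores @2 GHz", using the post's own ratio of
    // 2 Jaguar cores = 1 Intel core at equal clocks.
    const double per_clock_ratio = 2.0;  // Jaguar cores per Intel core
    const double intel_ghz  = 3.4;
    const double jaguar_ghz = 2.0;

    double jaguar_equiv = per_clock_ratio * intel_ghz / jaguar_ghz;
    printf("1 Intel core @3.4 GHz ~ %.1f Jaguar cores @2 GHz\n",
           jaguar_equiv);  // ~3.4, roughly the post's "3.5"
    return 0;
}
[/code]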
AMD designed GCNs to fit with the targeted CPU.
Idk, I honestly couldn't care less. I think what Sony is trying to say at the end is the "PS4 for gamers by gamers" thing, which I'm completely satisfied by. Microsoft can make hardware 2x more powerful than the PS4 for all I care. But does it have teh games?
so it's not a high end PC?
Which one is it!!!
lostrib
[QUOTE="lostrib"]Idk I honestly couldn't care less. I think what sony is trying to say at the end is the "ps4 for gamers by gamers" thing which I'm completely satisfied by. Microsoft can make a hardware x2 more powerful than the PS4 for all I care. But does It have teh games?so it's not a high end PC?Â
Which one is it!!!
wasted_wisdom
Does it have "teh games"?
Yeah it's stupid how they keep trying to push that theory. PC is an open platform. Your PC's power is up to you and your wallet.
The PS4 is not an equivalent of a high-end gaming PC, it falls into the mid-range territory, and unlike the PC, it's a closed system.
Rocker6
PS4 doesn't contain any performance-tier hardware parts, in PC terms.
That being said, its closed-system nature allows for easier optimization and avoidance of potential bottlenecks...
Anything new here?
[QUOTE="Rocker6"]Yeah it's stupid how they keep trying to push that theory. PC is an open platform. Your PC's power is up to you and your walletThe PS4 is not an equivalent of a high-end gaming PC, it falls into the mid-rage territory, and unlike the PC, it's a closed system.
wasted_wisdom
Yep, pushing a console to compete with constantly evolving PC hardware makes no sense. It can somewhat hold its ground on release, sure, but give it a year or two and it starts to fall significantly behind. That's why I think the traditional console model of buying a static box every ~5 years will undergo some changes soon; PCs evolve constantly, mobile devices evolve constantly, only the console hardware stays the same for a prolonged period of time...
[QUOTE="Rocker6"][QUOTE="wasted_wisdom"] Yeah it's stupid how they keep trying to push that theory. PC is an open platform. Your PC's power is up to you and your wallet.
Yep, pushing a console to compete with constantly evolving PC hardware makes no sense. It can somewhat hold its ground on release, sure, but give it a year or two and it starts to fall significantly behind. That's why I think the traditional console model of buying a static box every ~5 years will undergo some changes soon; PCs evolve constantly, mobile devices evolve constantly, only the console hardware stays the same for a prolonged period of time...
casharmy
PCs do constantly evolve during a console generation... but that is as much of a drawback to the PC architecture as it is a benefit. This is the reason why developers are calling PS4 a high end PC even though the components are somewhere around mid level... PC builds are all over the place and developers can't really just develop for the most powerful of the bunch, but devs will be able to push what's inside of the PS4 to the max without any reservations.
True, the majority of gaming PCs won't ever be high-end, but if you do own one, you'll definitely see gains over consoles, especially later in the generation, as the gap widens.
The devs may optimize and push the PS4 to its limits, but by the time they're entirely familiar with its architecture and can use it to its full potential, the PC hardware will be far ahead, even the mid-range parts. Devs aren't instantly familiar with a new console architecture and it takes a while to get the best results; you can see that best by comparing, say, a launch title for the PS3 to later exclusives like Uncharted 2.
[QUOTE="Rocker6"]Yeah it's stupid how they keep trying to push that theory. PC is an open platform. Your PC's power is up to you and your walletThe PS4 is not an equivalent of a high-end gaming PC, it falls into the mid-rage territory, and unlike the PC, it's a closed system.
wasted_wisdom
I think he means you can MOD games.. that's what open platform means.
[QUOTE="wasted_wisdom"][QUOTE="Rocker6"]
The PS4 is not an equivalent of a high-end gaming PC, it falls into the mid-range territory, and unlike the PC, it's a closed system.
Yeah it's stupid how they keep trying to push that theory. PC is an open platform. Your PC's power is up to you and your wallet.
KBFloYd
I think he means you can MOD games.. that's what open platform means.
Nah, the open nature of the PC platform doesn't apply only to the software; I was also talking about hardware upgradability...
[QUOTE="lostrib"]
[QUOTE="Mrmedia01"]
Good thread. I enjoyed the read.
Sounds like an interesting architecture, and it sounds like the PS4 is going to be easy and stress-free to develop on.
So PS4 is awesome, Confirmed.
Mrmedia01
Unless it's like $600 dollars without any good games
PS3 has good games and continues to have great games come out. I know PS4 will have great games too, including great launch games. True though, the PS4's price could be messed up. I sure hope not; price would be the only thing to hold PS4 back.
PS3 did not have very many good games at the beginning of the gen.
[QUOTE="ronvalencia"][QUOTE="wasted_wisdom"]
Good Read...
How does the PS4 differ from a high-end gaming PC?
http://www.shacknews.com/article/78889/how-does-the-ps4-differ-from-a-high-end-gaming
Sony described its upcoming PlayStation 4 as a "supercharged" PC. Powered by familiar x86 architecture manufactured by AMD, PS4 is more like a gaming PC than any previous Sony console. However, while it may use many parts found in high-end gaming PCs, PS4 system architect Mark Cerny argues that PS4 has many unique features that separate it from today's PCs.
"The 'supercharged' part, a lot of that comes from the use of the single unified pool of high-speed memory," Cerny said, pointing to the 8GB of GDDR5 RAM that's fully addressable by both the CPU and GPU. "If [a PC] had 8 gigabytes of memory on it, the CPU or GPU could only share about 1 percent of that memory on any given frame. That's simply a limit imposed by the speed of the PCIe. So, yes, there is substantial benefit to having a unified architecture on PS4, and it's a very straightforward benefit that you get even on your first day of coding with the system."
According to Cerny, PS4 addresses the hiccups that can come from the communication between CPU, GPU, and RAM in a traditional PC. "A typical PC GPU has two buses," Cerny told Gamasutra in a very detailed technical write-up. "There's a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication--any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
PS4 addresses these concerns by adding another bus to the GPU "that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches." The end result is that it removes synchronization issues between the CPU and GPU. "We can pass almost 20 gigabytes a second down that bus," Cerny said, pointing out that it's "larger than the PCIe on most PCs!"
"The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we've worked with AMD to increase the limit to 64 sources of compute commands," Cerny said. According to Cerny, the reason for the increase is that middleware will have a need to use compute as well. "Middleware requests for work on the GPU will need to be properly blended with game requests, and then finally properly prioritized relative to the graphics on a moment-by-moment basis."
There's no TLDR because the entire article is pretty much filled with content. Either read or don't.
granddogg
"A typical PC GPU has two buses," said Cerny. "Theres a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication -- any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. ""
"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
http://www.whatmannerofburgeristhis.com/blog/topics/OpenCL/
The latest Tahiti/GCN GPUs have a read/write, incoherent L1 cache local to a compute unit. Since a single workgroup will always run on a single compute unit, memory will be consistent in that group using the cache.
Use atomics to bypass the L1 cache if you need strong memory consistency across workgroups. This is an option for reads that aren't very critical. This was true for one of the N-body kernels. For another it was many times slower than running a single workgroup at time to ensure global consistency.
-----
http://developer.amd.com/resources/heterogeneous-computing/what-is-heterogeneous-system-architecture-hsa/
With HSA, applications can create data structures in a single unified address space and can initiate work items on the hardware most appropriate for a given task. Sharing data between compute elements is as simple as sending a pointer. Multiple compute tasks can work on the same coherent memory regions, utilizing barriers and atomic memory operations as needed to maintain data synchronization (just as multi-core CPUs do today).
The HSA team at AMD analyzed the performance of Haar Face Detect, a commonly used multi-stage video analysis algorithm used to identify faces in a video stream. The team compared a CPU/GPU implementation in OpenCL against an HSA implementation. The HSA version seamlessly shares data between CPU and GPU, without memory copies or cache flushes because it assigns each part of the workload to the most appropriate processor with minimal dispatch overhead
----------------------
""Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, weve worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."
AMDmodified the PS4's GCN for 8 weaker CPU cores while PC GCN was designed for stronger CPU cores e.g. AMD's marketing favorite Intel Core i7-3770K.
At the same clockspeed(e.g 2.0Ghz), it would take two Jaguar cores to match one Intel Sandybridge/Ivybridge core. Desktop PC Sandybridge/Ivybridge has turbo mode (e.g. 3.8Ghz/3.9Ghz) to speed up narrow threads performance. AMD's own FX CPUs has turbo modes reaching +4Ghz.
One Intel Sandybridge/Ivybridge @3.4Ghz can easily cover 3.5 AMD Jaguar cores @ 2Ghz. Intel Sandybridge/Ivybridge Quad-Core @ +3.0 Ghz fits with flagship PC GCNs.
AMD designed GCNs to fit with the targeted CPU.
good read man i see u love amd.......i had a 7850 1gig love it but picked up a gtx660 sc 2gig...what do u know about the 8000 coming up..linkd will do
From http://www.planet3dnow.de/cgi-bin/newspub/viewnews.cgi?id=1366210905 it indicates the next model series for Radeon HDs would be 9xx0.
The hints for the future Radeon HDs would come from the Radeon HD 7790 (Bonaire XT), i.e. scale it by two, e.g. the Radeon HD 7870 was a direct scale-up from the Radeon HD 7770.
[QUOTE="lostrib"][QUOTE="Sali217"]The difference is the PS4 won't go obsolete as quickly as a new PC will. I will still be able to play all the latest games on it in 6 or 7 years without updating anything.Sali217
Don't care
That's because you're probably one of those PC nerds that enjoys tinkering with electronic crap.
I do enjoy "tinkering" with my PC, also called a hobby.
[QUOTE="Sali217"]The difference is the PS4 won't go obsolete as quickly as a new PC will. I will still be able to play all the latest games on it in 6 or 7 years without updating anything.clyde46The PS4 is out of date before it hits the shelves. True dat
It reminds me of this:
Ken Kutaragi: PS3 A SUPACOMPUTAH!!!
*Ken Kutaragi Sacked*
Kaz Hirai: PS3 NOT A SUPACOMPUTAH!!!
[QUOTE="clyde46"][QUOTE="Sali217"]The difference is the PS4 won't go obsolete as quickly as a new PC will. I will still be able to play all the latest games on it in 6 or 7 years without updating anything.04dcarraherThe PS4 is out of date before it hits the shelves. True dat So was your card when you bought it the 560ti never was top of the line and you bought it any way.
Some people can't see past the choices they make. The PS4 can do things on its hardware impossible to do on PC, like cache bypass, but somehow some people ignore it.
tormentos
Some people can't read from AMD sources.
[QUOTE="wasted_wisdom"]
Good Read...
How does the PS4 differ from a high-end gaming PC?
http://www.shacknews.com/article/78889/how-does-the-ps4-differ-from-a-high-end-gaming
Sony described its upcoming PlayStation 4 as a "supercharged" PC. Powered by familiar x86 architecture manufactured by AMD, PS4 is more like a gaming PC than any previous Sony console. However, while it may use many parts found in high-end gaming PCs, PS4 system architect Mark Cerny argues that PS4 has many unique features that separate it from today's PCs.
"The 'supercharged' part, a lot of that comes from the use of the single unified pool of high-speed memory," Cerny said, pointing to the 8GB of GDDR5 RAM that's fully addressable by both the CPU and GPU. "If [a PC] had 8 gigabytes of memory on it, the CPU or GPU could only share about 1 percent of that memory on any given frame. That's simply a limit imposed by the speed of the PCIe. So, yes, there is substantial benefit to having a unified architecture on PS4, and it's a very straightforward benefit that you get even on your first day of coding with the system."
According to Cerny, PS4 addresses the hiccups that can come from the communication between CPU, GPU, and RAM in a traditional PC. "A typical PC GPU has two buses," Cerny told Gamasutra in a very detailed technical write-up. "There's a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication--any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
PS4 addresses these concerns by adding another bus to the GPU "that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches." The end result is that it removes synchronization issues between the CPU and GPU. "We can pass almost 20 gigabytes a second down that bus," Cerny said, pointing out that it's "larger than the PCIe on most PCs!"
"The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we've worked with AMD to increase the limit to 64 sources of compute commands," Cerny said. According to Cerny, the reason for the increase is that middleware will have a need to use compute as well. "Middleware requests for work on the GPU will need to be properly blended with game requests, and then finally properly prioritized relative to the graphics on a moment-by-moment basis."
There's no TLDR because the entire article is pretty much filled with content. Either read or don't.
ronvalencia
"A typical PC GPU has two buses," said Cerny. "Theres a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication -- any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."
"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. ""
"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
http://www.whatmannerofburgeristhis.com/blog/topics/OpenCL/
The latest Tahiti/GCN GPUs have a read/write, incoherent L1 cache local to a compute unit. Since a single workgroup will always run on a single compute unit, memory will be consistent in that group using the cache.
Use atomics to bypass the L1 cache if you need strong memory consistency across workgroups. This is an option for reads that aren't very critical. This was true for one of the N-body kernels. For another it was many times slower than running a single workgroup at time to ensure global consistency.
-----
http://developer.amd.com/resources/heterogeneous-computing/what-is-heterogeneous-system-architecture-hsa/
With HSA, applications can create data structures in a single unified address space and can initiate work items on the hardware most appropriate for a given task. Sharing data between compute elements is as simple as sending a pointer. Multiple compute tasks can work on the same coherent memory regions, utilizing barriers and atomic memory operations as needed to maintain data synchronization (just as multi-core CPUs do today).
The HSA team at AMD analyzed the performance of Haar Face Detect, a commonly used multi-stage video analysis algorithm used to identify faces in a video stream. The team compared a CPU/GPU implementation in OpenCL against an HSA implementation. The HSA version seamlessly shares data between CPU and GPU, without memory copies or cache flushes because it assigns each part of the workload to the most appropriate processor with minimal dispatch overhead
----------------------
""Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, weve worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."
AMDmodified the PS4's GCN for 8 weaker CPU cores while PC GCN was designed for stronger CPU cores e.g. AMD's marketing favorite Intel Core i7-3770K.
At the same clockspeed(e.g 2.0Ghz), it would take two Jaguar cores to match one Intel Sandybridge/Ivybridge core. Desktop PC Sandybridge/Ivybridge has turbo mode (e.g. 3.8Ghz/3.9Ghz) to speed up narrow threads performance. AMD's own FX CPUs has turbo modes reaching +4Ghz.
One Intel Sandybridge/Ivybridge @3.4Ghz can easily cover 3.5 AMD Jaguar cores @ 2Ghz. Intel Sandybridge/Ivybridge Quad-Core @ +3.0 Ghz fits with flagship PC GCNs.
AMD designed GCNs to fit with the targeted CPU.
[QUOTE="tormentos"][QUOTE="04dcarraher"] True dat
So was your card when you bought it; the 560ti never was top of the line and you bought it anyway.
clyde46
Maybe he couldn't afford to buy top of the line gear. Ever thought of that?
Just ignore Tormentos. He seems to think the 560ti is some shit card that can't run anything.
[QUOTE="Alienware_fan"]
Neckbeards can only compare numbers. The bigger the better. Stupid.
lostrib
^stupid
so how many TIMES faster is your pc?
PC might be more powerful on paper, but the PS4 is designed for games, by game developers, which is why I believe the PS4's architecture is better and more efficient for gaming. Bigger numbers don't always mean better.