The Secrets of the Cell Processor

Was it really that powerful?

Many great claims were made about the Cell Broadband Engine, the CPU inside the PlayStation 3: it would make this the most powerful console of all time. But as time went by, many people felt that the claim had been oversold, and that the unique architecture meant to make the console more powerful actually hindered the development of video games and, in some titles, made performance worse. Was the Cell Processor really as powerful as claimed? At the time the answer seemed to be "It depends".

In a previous issue we explored Ken Kutaragi’s ambition for the PlayStation 3 to be all things home entertainment, so during the design phase of the console it was clear the CPU would need to be capable of many functions. The Cell processor was designed to have the best of both worlds: a general purpose element to make it easy for regular applications to run, combined with specialized elements for intense number crunching, the type that video games often need. Most importantly, the architecture needed to allow these things to happen in parallel to maximize performance.

The CPU (Central Processing Unit) is the main part of a computer or console. It is responsible for crunching numbers, controlling input and output (such as taking input from the controller and then telling it to vibrate) and handling the flow of information between the other components in the console.

The Cell chip was split into nine parts. Eight of them were the Synergistic Processor Elements (we’ll refer to them as SPUs without getting into the detail why), and these were what really set the Cell processor apart from its rivals. The ninth was a single PowerPC processor, the control center for the chip, responsible for delegating work to the eight SPUs and combining their results.

It is these SPUs that led to PlayStation 3s being used as a supercomputer. The US Air Force found that, cost for cost, it was cheaper to buy 1,760 PlayStation 3s and connect them together to create a supercomputer to perform simulations. And as a bonus I guess they could play some games in their lunch break.

The SPUs were the key revolution. They were designed for SIMD (single instruction, multiple data) operations, i.e. they were not general purpose: they were meant for number crunching, the type used to manipulate graphics in games. However, for any software engineer or video game developer, writing an application that used these SPUs to their full capability was extremely difficult, and the approach didn’t suit all tasks. You had to be extra careful to manage your memory across them and to co-ordinate the delegation and reassembly of the processed data.

Let me explain. It’s time for an analogy. Imagine you are a Chef responsible for creating lots of cakes. Our Chef is like the PowerPC Element (PPE): they are responsible for bringing together all the ingredients, mixing them and putting them in the oven. Working alone, they need to weigh the flour, sugar and butter and crack the eggs themselves, and as they can only do one thing at a time, our Chef has to do each task in sequence. However, they could use their four line cooks to do that for them. The line cooks, or SPUs in the analogy, could each be assigned one of those tasks. If you set the operation up correctly, all four line cooks can weigh and prepare all four ingredients at the same time and hand them over together, allowing you to prepare cakes much faster. Not only can they work in parallel, but as it is their only job they can do it faster than the general purpose Chef can.
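For readers who think in code, the analogy can be sketched roughly like this. It is only an illustration in Python, not Cell code: the recipe, the function names and the use of a thread pool to stand in for the line cooks are all assumptions made for this example. Python threads also won't give a real speed-up for number crunching, so treat it as showing the shape of the work (hand out, wait, reassemble) rather than the performance.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical recipe: ingredient name -> grams needed for one cake.
RECIPE = {"flour": 200, "sugar": 150, "butter": 100, "eggs": 120}


def prepare(ingredient, grams, cakes):
    """One line cook (an SPU in the analogy) prepares a single ingredient."""
    return ingredient, grams * cakes


def bake_sequential(cakes):
    """The Chef (PPE) prepares every ingredient alone, one after another."""
    return dict(prepare(name, grams, cakes) for name, grams in RECIPE.items())


def bake_delegated(cakes, line_cooks=4):
    """The Chef hands each ingredient to a line cook, then collects the results."""
    with ThreadPoolExecutor(max_workers=line_cooks) as pool:
        # Delegation: one task per ingredient, handed to whichever cook is free.
        futures = [pool.submit(prepare, name, grams, cakes)
                   for name, grams in RECIPE.items()]
        # Reassembly: the Chef waits for every cook and combines their work.
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    print(bake_delegated(cakes=3))
```

Both functions produce the same batch of ingredients. The difference is who does the work, and the delegated version only pays off when the tasks are truly independent, which is the part that made real SPU programming hard.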

The grey square of the cell broadband chip on a motherboard — The text etched onto this heatsink has been rubbed off but this is the chip in question.

The PPE inside the Cell Processor was a bit more than just a control center. It was a general purpose processor similar to what you get in a normal computer. When Microsoft was creating the Xbox 360, it also asked IBM for parts while IBM was busy working on the PPE for the Cell Broadband chip. IBM made a deal with Microsoft to provide a CPU for the Xbox 360 consisting of three PPEs very similar to the one used in the Cell Processor. So the comparison landscape now looked more like this: 1 PPE + 8 SPUs in the PlayStation 3 versus 3 PPEs in the Xbox 360. (I must confess to simplifying here a bit.) So the real question was “Are the potential gains from having 8 SPUs worth it compared to simply having 3 general purpose PPEs?”

As it turns out, the Xbox 360’s design was more aligned to PC gaming, and this created a problem for a development company wishing to release games on PlayStation 3, Xbox 360 and PC. The PlayStation 3 architecture required you to do something special: you had to get into the code and make sure your design was using the SPUs to the best of their ability. But many of these games used off the shelf engines like Unreal Engine, which had been purchased and simply didn’t make it easy to do customizations like this. So releases like Skyrim launched on all three platforms, but the PlayStation 3 version had not been modified to make good use of the SPUs. The technical comparison became more like running the game on one PPE on PlayStation 3 versus running it on three on the Xbox 360, so it’s no surprise that many games ran better out of the box on Xbox 360. Later patches would address some of this, but there are still many games in the PlayStation 3 library which run poorly precisely because their developers did not have the time or money to optimize them for the console.

For a game to be effective on the PlayStation 3, its developers had to answer two questions. The first: how do I approach the development of my game so that I can design different elements to offload processing to the SPUs? The second: how much time am I going to spend on maximizing that to improve performance? Games like Uncharted, The Last of Us and Metal Gear Solid 4 all look and play amazing because they were PlayStation 3 exclusives, so their teams had more time to dedicate to using the hardware effectively. Metal Gear Solid 4 was designed from the ground up with the console architecture in mind. For Uncharted 2, there is a behind the scenes video where the team explains that making a sequel gave them more time to experiment with offloading work the GPU would traditionally have done, like lighting effects, to the SPUs, allowing the GPU to create higher fidelity graphics overall.

In hindsight the Cell Architecture seems doomed to fail. It was probably a relic from when the ambitions of the PlayStation 3 were even grander than what was eventually realized. Asking developers to make complicated, deep design choices based on one console’s particular architecture was a huge ask, and something that may not have been fully considered until it was too late. I think PlayStation learnt from this. The PlayStation 4 used a much more typical PC architecture, allowing games to be developed for all three platforms with ease. They realized that games were where it was at, and their newer slogan “For the Players” really encapsulates that whole journey.

But this story is another reason why I love the PlayStation 3 and what it tried to achieve. No facet remained unexplored when it came to trying to revolutionize home entertainment, and I don’t think we will ever see a console get turned into a supercomputer again. So was the Cell Processor really that powerful? I’m afraid the answer is still “It depends”.

If you want more technical insight, I recommend the fantastic book Programming the Cell Processor.