Sieve Offloading Explained
"Why would I use Sieve offloading?"
We are often asked this question, and the best answer is an example. Suppose you have a processor-intensive C++ application originally written to run on a single-core x86 processor. You might have updated it at some point to try to take advantage of multicore x86 processors. You might have implemented threading in the code and, as promised, you got some performance improvement on a dual-core processor and a bit more on a four-core processor... but it's proving difficult to get any further gain at all from running the code on an eight-core processor.
At this stage you turn to consideration of alternative processor architectures - you might look at the Cell processor from IBM, for instance. The Cell processor is capable of absolutely stunning performance - so, where's the catch?
The catch is this: how do you get your application code (which you have already modified to use x86 threading, remember) running on a processor architecture that contains two types of processor core, with different levels of memory associated with each core? The code won't run as it is, and you don't want to rewrite it.
That's why you would want to use Sieve offloading.
With Sieve, you can make some very simple additions to your code, indicating which parts of it you want to be offloaded from the main processor core. If the code is already threaded, then Sieve will take a thread at a time and allocate each thread to (in the case of the Cell processor) an SPU, until all the available SPUs are busy. This frees up processing time on the PPU, the Cell's main processor core.
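To make the idea of handing out work until every SPU is busy more concrete, here is a minimal sketch in standard C++. It is not Sieve syntax and does not run on SPUs: each std::thread simply stands in for one SPU, and the names run_on_workers and worker_count are our own inventions for illustration. Each worker picks up the next waiting task as soon as it is free, so no more than worker_count tasks are ever in flight at once.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for offloading: each std::thread plays the role of
// one SPU. Every task runs exactly once, on at most `worker_count` workers.
// Returns, for each task, the index of the worker that ran it.
std::vector<int> run_on_workers(const std::vector<std::function<void()>>& tasks,
                                unsigned worker_count) {
    assert(worker_count > 0);
    std::vector<int> ran_on(tasks.size(), -1);
    std::atomic<std::size_t> next{0};  // index of the next unclaimed task
    std::vector<std::thread> workers;

    for (unsigned w = 0; w < worker_count; ++w) {
        workers.emplace_back([&tasks, &ran_on, &next, w] {
            for (;;) {
                // Claim the next task; stop when none are left.
                const std::size_t i = next.fetch_add(1);
                if (i >= tasks.size()) return;
                tasks[i]();
                ran_on[i] = static_cast<int>(w);  // each slot has one writer
            }
        });
    }

    // The calling thread (the "main core" in this analogy) waits here.
    for (auto& t : workers) t.join();
    return ran_on;
}
```

Calling run_on_workers(tasks, 6), for example, would run all the tasks on six stand-in workers. In this sketch the calling thread only waits, whereas the point of offloading is that the main core is free to do other work in the meantime.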
If your code is still serial in form, then applying Sieve commands to the code will indicate to the system where to introduce parallelism and distribute the work across all the SPUs for processing, all automatically. Even better, programmers can debug their code the same way they always have, serially, a line at a time, without needing to worry about where the code is being run, cycle timings or contention for data between parts of the code. Sieve handles all that.
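For a sense of what is being automated, the sketch below shows a serial loop next to a hand-split version in standard C++. Again, this is not Sieve code: std::thread stands in for an SPU, and the function names and the chunking scheme are assumptions made purely for illustration. The serial function is the form you would write and debug; the split function is the kind of bookkeeping you would otherwise have to write yourself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Serial reference version: easy to step through a line at a time.
long long sum_of_squares_serial(const std::vector<int>& data) {
    long long total = 0;
    for (int x : data) total += static_cast<long long>(x) * x;
    return total;
}

// Hand-split version (illustrative only): the input is cut into contiguous
// chunks, one per worker, and the partial sums are combined at the end.
long long sum_of_squares_split(const std::vector<int>& data,
                               unsigned worker_count) {
    assert(worker_count > 0);
    std::vector<long long> partial(worker_count, 0);
    std::vector<std::thread> workers;
    // Round up so that every element lands in some chunk.
    const std::size_t chunk = (data.size() + worker_count - 1) / worker_count;

    for (unsigned w = 0; w < worker_count; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partial, w, begin, end] {
            long long local = 0;
            for (std::size_t i = begin; i < end; ++i)
                local += static_cast<long long>(data[i]) * data[i];
            partial[w] = local;  // each worker writes only its own slot
        });
    }

    for (auto& t : workers) t.join();
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

Both functions return the same result for the same input; the difference is how much parallel plumbing ends up in your source. Giving each worker its own slot in partial is what avoids contention for data in this sketch.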
The diagram below illustrates how the process will typically work.


