
Passing by value for performance


Old Content Alert

Please note that this is an old document archive. It will most likely be outdated or superseded by various other products, and is kept here purely for historical purposes.

Since pointer parameters of duplicated SPU functions can be PPU or SPU pointers, it is recommended to use value parameters rather than reference parameters where possible. In standard C++ code, relatively small values are often passed by reference in the hope that this will improve performance. In offloaded functions on the SPU, however, such a reference may be __outer (pointing to PPU memory), so accesses to non-static members involve the software cache or DMA transfers. For example:

struct vec3
{
    double x, y, z;
    vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}
    inline vec3 cross(const vec3 &a) const
    {
        return vec3((y * a.z) - (z * a.y),
                    (z * a.x) - (x * a.z),
                    (x * a.y) - (y * a.x));
    }
};

vec3 global(1,1,1);

int main()
{
    vec3 var(1,1,1);
    __blockingoffload()
    {
        // leads to slow, fragmented memory accesses inside the SPU version of vec3::cross
        var = global.cross(global);
    }
}

The method vec3::cross is called inside a __blockingoffload block on an __outer object (global) with an __outer lvalue argument (also global). The compiler therefore generates a duplicate of vec3::cross with an __outer this pointer and an __outer parameter reference, which leads to 12 __outer accesses (without common subexpression elimination). The command line option -warnonouterreadswrites (see /kb/80.html) prints information on each of those accesses:

* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 6, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 6, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 6, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 6, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 7, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 7, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 7, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 7, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 8, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 8, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 8, column: 0
* WARNING: Generating outer read, Alignment 8 ,Size: 8.--- In file: vec3test.cpp, at line: 8, column: 0

Turning parameter a into a value parameter allows the compiler to make the copy at the call site, with a single __outer memory access covering the full vector instead of several fragmented accesses. Alternatively, caching global inside the __blockingoffload block (see /kb/135.html for caching data) produces no __outer memory accesses inside vec3::cross.
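Both suggestions can be sketched in plain C++ as follows. This is a minimal illustration only: the __blockingoffload block is left out so that the snippet compiles with any standard compiler, and the names cross_with_local_copy and check_cross_by_value are hypothetical helpers introduced here, not part of Offload. Under the Offload compiler, the local copy would be made inside the __blockingoffload block; the exact caching mechanism described in /kb/135.html may differ from this plain copy.

```cpp
#include <cassert>

struct vec3
{
    double x, y, z;
    vec3(double x_, double y_, double z_) : x(x_), y(y_), z(z_) {}

    // 'a' is now a value parameter: the copy is made once at the call site,
    // so the body only reads members of a local object.
    vec3 cross(vec3 a) const
    {
        return vec3((y * a.z) - (z * a.y),
                    (z * a.x) - (x * a.z),
                    (x * a.y) - (y * a.x));
    }
};

vec3 global(1, 2, 3);

// Hypothetical helper: copies 'global' into a local object before calling
// cross(), so that 'this' also refers to local data. In an offloaded version
// this copy would sit inside the __blockingoffload block (assumption).
vec3 cross_with_local_copy(vec3 other)
{
    vec3 local = global;        // one copy of the whole vector
    return local.cross(other);  // both operands are now local
}

// Small self-check: (1,0,0) x (0,1,0) = (0,0,1).
void check_cross_by_value()
{
    vec3 r = vec3(1, 0, 0).cross(vec3(0, 1, 0));
    assert(r.x == 0 && r.y == 0 && r.z == 1);
}
```

The by-value signature costs one extra 24-byte copy per call, which is usually cheaper on the SPU than several separate 8-byte __outer reads; for large objects the trade-off may be different.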