reinterpret_as function in C++ AMP
With C++ AMP, you can use the templated array or array_view to work with your data types in a strongly typed manner by passing the data type in as the element type. Sometimes, however, you might need to reinterpret the element type, typically to adapt data to an existing interface or to manipulate data at the bit level. In this blog post, I will introduce you to the reinterpret_as function available on array and array_view to address these scenarios.
About the API
The reinterpret_as signatures are as follows.
template <typename T, int N>
class array
{
    template <typename ElementType>
    array_view<ElementType,1> reinterpret_as() restrict(cpu,amp);
    template <typename ElementType>
    array_view<const ElementType,1> reinterpret_as() const restrict(cpu,amp);
};
template <typename T>
class array_view<T,1>
{
    template <typename ElementType>
    array_view<ElementType,1> reinterpret_as() const restrict(cpu,amp);
};
template <typename T>
class array_view<const T,1>
{
    template <typename ElementType>
    array_view<const ElementType,1> reinterpret_as() const restrict(cpu,amp);
};
Just like the view_as function, it is only available for array_views of rank 1 and arrays of any rank, and it preserves the constness of the original container.
In addition, the return type of reinterpret_as is always an array_view of rank 1, because it disregards the shape of the data and works directly on the linearized (flattened) data. If you want to work with higher dimensional views of the data, you can always apply view_as to the returned array_view. Note that the total number of elements in the array_view returned by reinterpret_as may differ from that of the original array or array_view if the source and target element types have different sizes.
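To make the change in element count concrete, here is a minimal sketch in plain standard C++ (not C++ AMP). The names float_2_model and reinterpreted_count are hypothetical helpers introduced only for illustration; the struct merely has the same size as float_2, and the function assumes the total byte size is a multiple of the target element size.

```cpp
#include <cstddef>

// Hypothetical stand-in with the same size as concurrency::graphics::float_2.
struct float_2_model { float x, y; };

// Number of To elements that cover the same bytes as `count` From elements.
// Assumes count * sizeof(From) is a multiple of sizeof(To).
template <typename From, typename To>
constexpr std::size_t reinterpreted_count(std::size_t count)
{
    return count * sizeof(From) / sizeof(To);
}

// Reinterpreting to a smaller element type increases the element count...
static_assert(reinterpreted_count<float_2_model, float>(1024) == 2048,
              "float_2 -> float doubles the count");
// ...and reinterpreting to a larger one decreases it.
static_assert(reinterpreted_count<float, float_2_model>(1024) == 512,
              "float -> float_2 halves the count");
```

The static_asserts double as usage examples: 1024 float_2 values occupy the same bytes as 2048 floats.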
Example of reinterpret_as
Now let’s see a simple example that illustrates the usage of reinterpret_as to adapt data to an existing interface. Let’s pick up the same example of a library function that we used in the view_as example:
void random_fill(array_view<float> in);
…and let’s further imagine that you have an array_view of a short vector type, e.g. float_2. You could use reinterpret_as to adapt to the random_fill API as:
void fill(array_view<float_2> av)
{
    random_fill(av.reinterpret_as<float>());
}
Another use of reinterpret_as is bit manipulation. The myfabs function in the following example replaces each float in the given data with its absolute value:
void myfabs(array_view<float> av)
{
    // View the same data as unsigned int so bitwise operators can be applied.
    array_view<unsigned int> uint_av = av.reinterpret_as<unsigned int>();
    parallel_for_each(av.extent, [=](index<1> idx) restrict(amp)
    {
        // Clear the sign bit (bit 31) of the float's bit pattern.
        uint_av[idx] &= 0x7FFFFFFF;
    });
}
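For comparison, the same sign-bit trick can be sketched on the host in standard C++. This is not part of C++ AMP; clear_sign_bit is a hypothetical helper, and it assumes a 32-bit IEEE 754 float. It uses std::memcpy to read and write the bit pattern, which plays the role that reinterpret_as plays for array_view data.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical host-side helper: returns `value` with its sign bit cleared.
inline float clear_sign_bit(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);   // read the float's bit pattern
    bits &= 0x7FFFFFFFu;                       // same mask as in myfabs
    std::memcpy(&value, &bits, sizeof value);  // write the modified pattern back
    return value;
}
```

Calling clear_sign_bit(-3.5f) yields 3.5f, while non-negative inputs are returned unchanged.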
Note that reinterpret_as does not trigger data synchronization by itself. You can therefore invoke parallel_for_each for the first phase of a computation, reinterpret the array_view that holds the intermediate results as an array_view with a different element type without copying the data back to the CPU, invoke another parallel_for_each on the reinterpreted array_view for the remaining computation, and only then copy the data back to the CPU.
One thing to watch out for with reinterpret_as is that the size of the reinterpreted element type must evenly divide the total size of the original array or array_view. For example,
#include <iostream>
#include <amp.h>
#include <amp_graphics.h>
using namespace concurrency;
using namespace concurrency::graphics;
int main()
{
    // 1024 int_2 elements occupy 1024 * 8 = 8192 bytes.
    array<int_2> a(1024);
    try
    {
        // int_3 is 12 bytes, and 8192 is not a multiple of 12, so this throws.
        array_view<int_3> av = a.reinterpret_as<int_3>();
    }
    catch(const runtime_exception &e)
    {
        std::cout << e.what() << std::endl;
    }
}
Output: Element type of reinterpret_as does not evenly divide into extent
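The arithmetic behind this failure can be checked with a small sketch in standard C++. The structs int_2_model and int_3_model and the function can_reinterpret are hypothetical names used only here; they model the divisibility rule described above and are not the runtime's actual implementation.

```cpp
#include <cstddef>

// Hypothetical stand-ins with the same sizes as int_2 (8 bytes) and int_3 (12 bytes).
struct int_2_model { int x, y; };
struct int_3_model { int x, y, z; };

// True when `count` From elements form a whole number of To elements.
template <typename From, typename To>
constexpr bool can_reinterpret(std::size_t count)
{
    return (count * sizeof(From)) % sizeof(To) == 0;
}

// 1024 * 8 = 8192 bytes, and 8192 % 12 == 8, so the reinterpretation is rejected.
static_assert(!can_reinterpret<int_2_model, int_3_model>(1024),
              "8192 bytes is not a multiple of 12");
// 1536 * 8 = 12288 bytes = 1024 * 12, so this size would divide evenly.
static_assert(can_reinterpret<int_2_model, int_3_model>(1536),
              "12288 bytes is a multiple of 12");
```

Under this model, an array of 1536 int_2 elements could be viewed as 1024 int_3 elements, whereas 1024 int_2 elements leave 8 bytes over.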
Another important thing to watch out for is that although the resulting array_view has a different element type, the element type of the underlying array is always retained. This information is used for sanity checks in some operations. For instance, sectioning an array_view in a way that does not align with the underlying element type results in an exception being thrown. Assume a is an array<float_2> object:
array_view<float> av1 = a.reinterpret_as<float>();
array_view<float> av2 = av1.section(index<1>(1), extent<1>(4));
The second line above will result in a runtime_exception being thrown with the following message:
The array_view base extent, view offset and/or view extent is incompatible with the underlying buffer.
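One way to picture why this section is rejected is the following rough model in standard C++. It assumes the rule is that both the byte offset and the byte length of a section must be multiples of the underlying element size; this post does not spell out the runtime's exact check, so section_is_aligned is a hypothetical illustration rather than the actual implementation.

```cpp
#include <cstddef>

// Hypothetical model: a section of `extent` view elements starting at `offset`
// is accepted only if it starts and ends on underlying-element boundaries.
constexpr bool section_is_aligned(std::size_t offset, std::size_t extent,
                                  std::size_t view_elem_size,
                                  std::size_t underlying_elem_size)
{
    return (offset * view_elem_size) % underlying_elem_size == 0
        && (extent * view_elem_size) % underlying_elem_size == 0;
}

// float view (4 bytes) over float_2 storage (8 bytes):
// offset 1 starts 4 bytes in, in the middle of a float_2 element.
static_assert(!section_is_aligned(1, 4, 4, 8), "offset 1 is misaligned");
// offset 2 starts 8 bytes in, on a float_2 boundary.
static_assert(section_is_aligned(2, 4, 4, 8), "offset 2 is aligned");
```

Under this assumption, av1.section(index<1>(2), extent<1>(4)) would start on a float_2 boundary, while the offset of 1 used above does not.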
This concludes my introduction to reinterpret_as. I hope that this post, together with the view_as post, gets you started with these two APIs when you need to reshape or reinterpret your data in C++ AMP.
As always, I welcome your questions and feedback below or on our MSDN forum.
Comments
Anonymous
May 16, 2012
guys, this is cool. no doubt about it. but what you are going to do with AMP C++ performance? it is pretty terrible in comparison to CUDA or OpenCL!
Anonymous
May 16, 2012
Hi Alex, glad you find C++ AMP cool. Our measurements show comparable performance between CUDA, OpenCL, HLSL, and C++ AMP code, when the code is written the same way in all approaches. Sometimes one is slightly faster, sometimes the other is slightly faster. If you have found a workload that shows significant differences, please post the repro code in our forum so we can investigate including the driver details on your system.