如何:使用 Alloc 和 Free 提高内存性能
本文档说明如何使用 concurrency::Alloc 和 concurrency::Free 函数来提高内存性能。 本文将比较为三个不同的类型(每个类型都指定 new 和 delete 运算符)以并行方式反转数组的元素所需的时间。
Alloc 和 Free 函数在多个线程频繁调用 Alloc 和 Free 时非常有用。 由于运行时为每个线程保留单独的内存缓存,因此运行时无需使用锁或内存屏障即可管理内存。
示例
下面的示例演示三个类型,每个类型都指定 new 和 delete 运算符。 new_delete 类使用全局 new 和 delete 运算符,malloc_free 类使用 C 运行时 malloc 和 free 函数,而 Alloc_Free 类使用并发运行时 Alloc 和 Free 函数。
// A type that defines the new and delete operators. These operators
// call the global new and delete operators, respectively.
class new_delete
{
public:
static void* operator new(size_t size)
{
return ::operator new(size);
}
static void operator delete(void *p)
{
return ::operator delete(p);
}
int _data;
};
// A type that defines the new and delete operators. These operators
// call the C Runtime malloc and free functions, respectively.
class malloc_free
{
public:
static void* operator new(size_t size)
{
return malloc(size);
}
static void operator delete(void *p)
{
return free(p);
}
int _data;
};
// A type that defines the new and delete operators. These operators
// call the Concurrency Runtime Alloc and Free functions, respectively.
class Alloc_Free
{
public:
static void* operator new(size_t size)
{
return Alloc(size);
}
static void operator delete(void *p)
{
return Free(p);
}
int _data;
};
下面的示例显示 swap 和 reverse_array 函数。 swap 函数将交换指定索引处的数组的内容。 此函数将为临时变量分配堆中的内存。 reverse_array 函数将创建一个大型数组,并计算以并行方式反转该数组多次所需的时间。
// Exchanges the contents of a[index1] with a[index2].
template<class T>
void swap(T* a, int index1, int index2)
{
// For illustration, allocate memory from the heap.
// This is useful when sizeof(T) is large.
T* temp = new T;
*temp = a[index1];
a[index1] = a[index2];
a[index2] = *temp;
delete temp;
}
// Computes the time that it takes to reverse the elements of a
// large array of the specified type.
template <typename T>
__int64 reverse_array()
{
const int size = 5000000;
T* a = new T[size];
__int64 time = 0;
const int repeat = 11;
// Repeat the operation several times to amplify the time difference.
for (int i = 0; i < repeat; ++i)
{
time += time_call([&] {
parallel_for(0, size/2, [&](int index)
{
swap(a, index, size-index-1);
});
});
}
delete[] a;
return time;
}
下面的示例显示 wmain 函数,此函数将计算 reverse_array 函数对 new_delete、malloc_free 和 Alloc_Free 类型(每个类型都使用不同的内存分配方案)进行操作所需的时间。
int wmain()
{
// Compute the time that it takes to reverse large arrays of
// different types.
// new_delete
wcout << L"Took " << reverse_array<new_delete>()
<< " ms with new/delete." << endl;
// malloc_free
wcout << L"Took " << reverse_array<malloc_free>()
<< " ms with malloc/free." << endl;
// Alloc_Free
wcout << L"Took " << reverse_array<Alloc_Free>()
<< " ms with Alloc/Free." << endl;
}
下面是完整的示例。
// allocators.cpp
// compile with: /EHsc
#include <windows.h>
#include <ppl.h>
#include <iostream>
using namespace concurrency;
using namespace std;
// Calls the provided work function and returns the number of milliseconds
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
__int64 begin = GetTickCount();
f();
return GetTickCount() - begin;
}
// A type that defines the new and delete operators. These operators
// call the global new and delete operators, respectively.
class new_delete
{
public:
static void* operator new(size_t size)
{
return ::operator new(size);
}
static void operator delete(void *p)
{
return ::operator delete(p);
}
int _data;
};
// A type that defines the new and delete operators. These operators
// call the C Runtime malloc and free functions, respectively.
class malloc_free
{
public:
static void* operator new(size_t size)
{
return malloc(size);
}
static void operator delete(void *p)
{
return free(p);
}
int _data;
};
// A type that defines the new and delete operators. These operators
// call the Concurrency Runtime Alloc and Free functions, respectively.
class Alloc_Free
{
public:
static void* operator new(size_t size)
{
return Alloc(size);
}
static void operator delete(void *p)
{
return Free(p);
}
int _data;
};
// Exchanges the contents of a[index1] with a[index2].
template<class T>
void swap(T* a, int index1, int index2)
{
// For illustration, allocate memory from the heap.
// This is useful when sizeof(T) is large.
T* temp = new T;
*temp = a[index1];
a[index1] = a[index2];
a[index2] = *temp;
delete temp;
}
// Computes the time that it takes to reverse the elements of a
// large array of the specified type.
template <typename T>
__int64 reverse_array()
{
const int size = 5000000;
T* a = new T[size];
__int64 time = 0;
const int repeat = 11;
// Repeat the operation several times to amplify the time difference.
for (int i = 0; i < repeat; ++i)
{
time += time_call([&] {
parallel_for(0, size/2, [&](int index)
{
swap(a, index, size-index-1);
});
});
}
delete[] a;
return time;
}
int wmain()
{
// Compute the time that it takes to reverse large arrays of
// different types.
// new_delete
wcout << L"Took " << reverse_array<new_delete>()
<< " ms with new/delete." << endl;
// malloc_free
wcout << L"Took " << reverse_array<malloc_free>()
<< " ms with malloc/free." << endl;
// Alloc_Free
wcout << L"Took " << reverse_array<Alloc_Free>()
<< " ms with Alloc/Free." << endl;
}
下例是四处理器计算机的输出结果。
Took 2031 ms with new/delete.
Took 1672 ms with malloc/free.
Took 656 ms with Alloc/Free.
在此示例中,使用 Alloc 和 Free 函数的类型提供了最佳内存性能,因为 Alloc 和 Free 函数进行了优化,以便能够频繁分配和释放来自多个线程的内存块。
编译代码
将示例代码复制并将其粘贴在 Visual Studio 项目中,或将它粘贴到一个文件,名为 allocators.cpp ,然后在 Visual Studio 命令提示符窗口中运行以下命令。
cl.exe /EHsc allocators.cpp