Running OpenCL kernel on multiple GPUs?
Vladimir: Right now I have several algorithms that run in parallel on a single GPU, but they all share the same problem when I try to execute them on several GPUs (for example, 3): code that runs on one GPU takes exactly the same amount of time on 3 GPUs (it is not faster). I tried running with more data and tried different tasks, but nothing helped. Finally, I ended up running the simplest possible task, an element-wise sum, and still got the same disappointing result. That is why I don't believe this is a problem with any particular algorithm; I suspect there is a mistake in my code (or even in my approach to parallelizing work across several GPUs).
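The way I split the work across the GPUs is roughly the pattern sketched below: each device gets its own chunk of the index range, every chunk is enqueued before any queue is waited on, and only then does the host block. This is an illustrative sketch, not my actual code; the helper name runOnAllGpus and the parameter total_size are placeholders, and the members (kernel, queues, num_devices, input1, input2, output) are the ones declared in the header shown further down.

    // Illustrative sketch only: submit a chunk to every device's queue first,
    // then wait. Calling clFinish inside the first loop would serialize the GPUs.
    void Parallel::runOnAllGpus(size_t total_size)   // hypothetical helper
    {
        clSetKernelArg(kernel, 0, sizeof(cl_mem), &input1);
        clSetKernelArg(kernel, 1, sizeof(cl_mem), &input2);
        clSetKernelArg(kernel, 2, sizeof(cl_mem), &output);

        size_t chunk = total_size / num_devices;     // assume it divides evenly
        for (cl_uint i = 0; i < num_devices; ++i)
        {
            size_t offset = i * chunk;
            // Non-blocking submission: each device works on its own sub-range
            clEnqueueNDRangeKernel(queues[i], kernel, 1, &offset, &chunk,
                                   NULL, 0, NULL, NULL);
            clFlush(queues[i]);    // start the work, but do not wait yet
        }
        for (cl_uint i = 0; i < num_devices; ++i)
            clFinish(queues[i]);   // wait only after all queues have work
    }

The point of flushing first and finishing later is that blocking on each device inside the submission loop would make the GPUs run one after another instead of concurrently.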
Here is the header file for my Parallel class:
    #ifndef PARALLEL_H
    #define PARALLEL_H

    #define __NO_STD_VECTOR // Use cl::vector and cl::string and
    #define __NO_STD_STRING // not STL versions, more on this later

    #include <CL/cl.hpp>

    class Parallel
    {
        public:
            Parallel();
            int executeAttachVectorsKernel(int*, int*, int*, int);
            static void getMaxWorkGroupSize(int*, int*, int*);
            virtual ~Parallel();
        protected:
        private:
            char* file_contents(const char*, int*);
            void getShortInfo(cl_device_id);
            int init(void);

            cl_platform_id platform;
            cl_device_id* devices;
            cl_uint num_devices;
            cl_command_queue* queues;
            int* WGSizes;
            int* WGNumbers;
            cl_context context;
            cl_program program;
            cl_kernel kernel;
            cl_mem input1;
            cl_mem input2;
            cl_mem output;
    };

    #endif // PARALLEL_H
Here is the initialization method init:
    int Parallel::init()
    {
        cl_int err;

        // Connect to the first platform
        err = clGetPlatformIDs(1, &platform, NULL);
        if (err != CL_SUCCESS)
        {
            cerr << "Error: failed to get a platform ID" << endl;
            return -1;
        }
        // ... (rest of init omitted here)
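Based on the devices, num_devices, and queues members in the header, the rest of init presumably enumerates all GPU devices on the platform and creates one command queue per device. Below is a minimal sketch of that kind of setup, assuming those member names; it is not the original code, and error handling is omitted:

    // Sketch: query how many GPUs the platform has, fetch their IDs,
    // then create one shared context and one command queue per GPU.
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 0, NULL, &num_devices);

    devices = new cl_device_id[num_devices];
    err = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, num_devices, devices, NULL);

    context = clCreateContext(NULL, num_devices, devices, NULL, NULL, &err);

    queues = new cl_command_queue[num_devices];
    for (cl_uint i = 0; i < num_devices; ++i)
        queues[i] = clCreateCommandQueue(context, devices[i], 0, &err);

With a single shared context, buffers such as input1, input2, and output created later can be used from any of these queues.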