{"id":6169,"date":"2014-04-13T04:36:35","date_gmt":"2014-04-13T04:36:35","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/04\/13\/running-opencl-kernel-on-multiple-gpus-collection-of-common-programming-errors\/"},"modified":"2014-04-13T04:36:35","modified_gmt":"2014-04-13T04:36:35","slug":"running-opencl-kernel-on-multiple-gpus-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/04\/13\/running-opencl-kernel-on-multiple-gpus-collection-of-common-programming-errors\/","title":{"rendered":"Running OpenCL kernel on multiple GPUs?-Collection of common programming errors"},"content":{"rendered":"<ul>\n<li><img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/9b20dfd81d1bb5c723c22949c40ad0ed?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nVladimir<\/p>\n<p>I have implemented several algorithms that run in parallel on one GPU, but all of them have the same problem when I try to execute them on several GPUs (for example, 3). The problem is that code that runs on one GPU takes exactly the same amount of time on 3 GPUs (it is not faster). I tried running with more data and with different tasks, but nothing helped. Finally, I tried the simplest task I could think of, summing elements, and still saw the same problem. 
That is why I don&#8217;t believe the problem lies in a particular algorithm; I suspect there is a mistake in my code (or even in my approach to parallelizing code across several GPUs).<\/p>\n<p>Here is the header file for my Parallel.cpp class:<\/p>\n<pre><code>#ifndef PARALLEL_H\n#define PARALLEL_H\n\n#define __NO_STD_VECTOR \/\/ Use cl::vector and cl::string and\n#define __NO_STD_STRING \/\/ not STL versions, more on this later\n#include \n\nclass Parallel\n{\n    public:\n        Parallel();\n        int executeAttachVectorsKernel(int*, int*, int*, int);\n        static void getMaxWorkGroupSize(int*, int*, int*);\n        virtual ~Parallel();\n    protected:\n    private:\n        char* file_contents(const char*, int*);\n        void getShortInfo(cl_device_id);\n        int init(void);\n        cl_platform_id platform;\n        cl_device_id* devices;\n        cl_uint num_devices;\n        cl_command_queue* queues;\n        int* WGSizes;\n        int* WGNumbers;\n        cl_context context;\n        cl_program program;\n        cl_kernel kernel;\n        cl_mem input1;\n        cl_mem input2;\n        cl_mem output;\n};\n\n#endif \/\/ PARALLEL_H\n<\/code><\/pre>\n<p>Here is the initialization method init:<\/p>\n<pre><code>int Parallel::init() {\ncl_int err;\n\n\/\/ Connect to the first platform\nerr = clGetPlatformIDs(1, &amp;platform, NULL);\nif (err != CL_SUCCESS) {\n    cerr<\/code><\/pre>\n<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Vladimir I have implemented several algorithms that run in parallel on one GPU, but all of them have the same problem when I try to execute them on several GPUs (for example, 3). 
The problem is that code that runs on one GPU takes exactly the same amount of time on 3 GPUs (not [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-6169","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/6169","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=6169"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/6169\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=6169"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=6169"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=6169"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}