{"id":6156,"date":"2014-04-13T04:35:28","date_gmt":"2014-04-13T04:35:28","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/04\/13\/independent-searches-on-gpu-how-to-synchronize-its-finish-collection-of-common-programming-errors\/"},"modified":"2014-04-13T04:35:28","modified_gmt":"2014-04-13T04:35:28","slug":"independent-searches-on-gpu-how-to-synchronize-its-finish-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/04\/13\/independent-searches-on-gpu-how-to-synchronize-its-finish-collection-of-common-programming-errors\/","title":{"rendered":"independent searches on GPU &mdash; how to synchronize its finish?-Collection of common programming errors"},"content":{"rendered":"<p>Assume I have some algorithm generateRandomNumbersAndTestThem() which returns true with probability p and false with probability 1-p. Typically p is very small, e.g. p=0.000001.<\/p>\n<p>I&#8217;m trying to build a program in JOCL that estimates p as follows: generateRandomNumbersAndTestThem() is executed in parallel on all available shader cores (preferably on multiple GPUs), until at least 100 trues are found. Then the estimate for p is 100\/n, where n is the total number of times that generateRandomNumbersAndTestThem() was executed.<\/p>\n<p>For p = 0.0000001, this means roughly 10^9 independent attempts, which should make it obvious why I&#8217;m looking to do this on GPUs. But I&#8217;m struggling with how to implement the stop condition properly. 
My idea was to have something along these lines as the kernel:<\/p>\n<pre><code>__kernel void sampleKernel(all_the_input, __global unsigned long *totAttempts) {\n    int gid = get_global_id(0);\n    \/\/here code that localizes all_the_input for faster access\n    while (lessThan100truesFound) {\n        totAttempts[gid]++;\n        if (generateRandomNumbersAndTestThem()) \n            reportTrue();\n    }\n}\n<\/code><\/pre>\n<p>How should I implement this without severe performance loss, given that<\/p>\n<ul>\n<li>triggering of the &#8220;if&#8221; will be a very rare event and so it is not a problem if all threads have to wait while reportTrue() is executed<\/li>\n<li>lessThan100truesFound has to be modified only once (from true to false) when reportTrue() is called for the 100th time (so I don&#8217;t even know if a boolean is the right way)<\/li>\n<li>the plan is to buy brand-new GPU hardware for this, so you can assume a recent GPU, e.g. multiple ATI Radeon HD7970s. But it would be nice if I could test it on my current HD5450.<\/li>\n<\/ul>\n<p>I assume that something similar to Java&#8217;s &#8220;synchronized&#8221; modifier can be done, but I can&#8217;t find the exact way to do it. What is the &#8220;right&#8221; way to do this, i.e. any way that works without severe performance loss?<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Assume I have some algorithm generateRandomNumbersAndTestThem() which returns true with probability p and false with probability 1-p. Typically p is very small, e.g. p=0.000001. 
I&#8217;m trying to build a program in JOCL that estimates p as follows: generateRandomNumbersAndTestThem() is executed in parallel on all available shader cores (preferably on multiple GPUs), until at least 100 [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-6156","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/6156","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=6156"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/6156\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=6156"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=6156"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=6156"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}