{"id":4376,"date":"2014-03-30T10:16:47","date_gmt":"2014-03-30T10:16:47","guid":{"rendered":"https:\/\/unknownerror.org\/index.php\/2014\/03\/30\/problem-about-warp-collection-of-common-programming-errors\/"},"modified":"2014-03-30T10:16:47","modified_gmt":"2014-03-30T10:16:47","slug":"problem-about-warp-collection-of-common-programming-errors","status":"publish","type":"post","link":"https:\/\/unknownerror.org\/index.php\/2014\/03\/30\/problem-about-warp-collection-of-common-programming-errors\/","title":{"rendered":"problem about warp-Collection of common programming errors"},"content":{"rendered":"<ul>\n<li><img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/e656e8a79ae5a416f0c3d4de38d302b1?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nAshwin<br \/>\ncuda nvcc warp<br \/>\nThe CUDA Programming Guide (v4.1) describes this about predicated instructions in Sec 5.4.2: The compiler replaces a branch instruction with predicated instructions only if the number of instructions controlled by the branch condition is less or equal to a certain threshold: If the compiler determines that the condition is likely to produce many divergent warps, this threshold is 7, otherwise it is 4. How can a condition produce many divergent warps? A given condition can only split a warp into two<\/li>\n<li><img decoding=\"async\" src=\"http:\/\/www.gravatar.com\/avatar\/86cf4281383bf9b2818433f8292a32d6?s=32&amp;d=identicon&amp;r=PG\" \/><br \/>\nKiaMorot<br \/>\nmultithreading cuda warp<br \/>\nMy understanding is that a warp is a group of threads defined at runtime by the task scheduler. One performance-critical aspect of CUDA is the divergence of threads within a warp. Is there a way to make a good guess at how the hardware will construct warps within a thread block? 
For instance, if I start a kernel with 1024 threads in a thread block, how will the warps be arranged? Can I tell that (or at least make a good guess) from the thread idx? Since by doing this, one can minimize the<\/li>\n<li><img decoding=\"async\" src=\"http:\/\/i.stack.imgur.com\/8UESu.jpg?s=32&amp;g=1\" \/><br \/>\nUfuk Can Bi\u00e7ici<br \/>\nmultithreading synchronization cuda conditional-statements warp<br \/>\nI recently asked a question about synchronization issues among the threads of a block in CUDA, here: Does early exiting a thread disrupt synchronization among CUDA threads in a block? One of the comments to my question gave a link to a similar thread, which quoted the following about the CUDA barrier (__syncthreads()) instruction from the PTX guide: Barriers are executed on a per-warp basis as if all the threads in a warp are active. Thus, if any thread in a warp executes a bar instruction, it is<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Ashwin cuda nvcc warp The CUDA Programming Guide (v4.1) describes this about predicated instructions in Sec 5.4.2: The compiler replaces a branch instruction with predicated instructions only if the number of instructions controlled by the branch condition is less or equal to a certain threshold: If the compiler determines that the condition is likely to produce many divergent warps, this 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-4376","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/4376","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/comments?post=4376"}],"version-history":[{"count":0,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/posts\/4376\/revisions"}],"wp:attachment":[{"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/media?parent=4376"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/categories?post=4376"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/unknownerror.org\/index.php\/wp-json\/wp\/v2\/tags?post=4376"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}