Problems about warp - Collection of common programming errors


  • Ashwin
    cuda nvcc warp
    The CUDA Programming Guide (v4.1) says this about predicated instructions in Sec. 5.4.2: "The compiler replaces a branch instruction with predicated instructions only if the number of instructions controlled by the branch condition is less or equal to a certain threshold: If the compiler determines that the condition is likely to produce many divergent warps, this threshold is 7, otherwise it is 4." How can a condition produce many divergent warps? A given condition can only split a warp into two

  • KiaMorot
    multithreading cuda warp
    My understanding is that a warp is a group of threads that is defined at runtime by the task scheduler. One performance-critical aspect of CUDA is the divergence of threads within a warp. Is there a way to make a good guess at how the hardware will construct warps within a thread block? For instance, if I launch a kernel with 1024 threads in a thread block, how are the warps arranged, and can I tell that (or at least make a good guess) from the thread index? Since by doing this, one can minimize the

  • Ufuk Can Biçici
    multithreading synchronization cuda conditional-statements warp
    I recently asked a question about synchronization issues among the threads of a block in CUDA, here: Does early exiting a thread disrupt synchronization among CUDA threads in a block? One of the comments on my question linked to a similar thread, which quoted the following about the CUDA barrier (__syncthreads()) instruction from the PTX guide: Barriers are executed on a per-warp basis as if all the threads in a warp are active. Thus, if any thread in a warp executes a bar instruction, it is
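
On the question above about how a 1024-thread block is split into warps: the CUDA Programming Guide states that a block is always partitioned the same way, with each warp holding threads of consecutive, increasing thread IDs and the first warp containing thread 0. The sketch below models that mapping on the host in plain C++ so it can be checked without a GPU. The names `Dim3`, `linear_thread_id`, `warp_id`, `lane_id`, and `warps_per_block` are hypothetical helpers introduced for illustration, and the warp size of 32 is an assumption that holds on current NVIDIA hardware (device code should read `warpSize` rather than hard-code it).

```cpp
// Host-side model of how threads in one block map to warps.
// Assumption: warp size is 32 (true for current NVIDIA GPUs; in a kernel,
// use the built-in warpSize instead of this constant).
constexpr unsigned kWarpSize = 32;

// Hypothetical stand-in for CUDA's threadIdx / blockDim triples.
struct Dim3 {
    unsigned x, y, z;
};

// Linear thread ID within a block: x varies fastest, then y, then z.
// In a kernel the same expression would read
//   threadIdx.x + threadIdx.y * blockDim.x
//               + threadIdx.z * blockDim.x * blockDim.y
constexpr unsigned linear_thread_id(Dim3 idx, Dim3 block) {
    return idx.x + idx.y * block.x + idx.z * block.x * block.y;
}

// Index of the warp (within the block) that a thread belongs to.
constexpr unsigned warp_id(Dim3 idx, Dim3 block) {
    return linear_thread_id(idx, block) / kWarpSize;
}

// Position of the thread inside its warp (0..31).
constexpr unsigned lane_id(Dim3 idx, Dim3 block) {
    return linear_thread_id(idx, block) % kWarpSize;
}

// Number of warps needed to cover the block, rounding up so that a
// partially filled last warp is still counted.
constexpr unsigned warps_per_block(Dim3 block) {
    return (block.x * block.y * block.z + kWarpSize - 1) / kWarpSize;
}
```

Under these assumptions a one-dimensional block of 1024 threads forms 32 warps, with threads 0 to 31 in warp 0, threads 32 to 63 in warp 1, and so on. A branch whose condition is uniform across each run of 32 consecutive linear IDs (for example, one that depends only on `warp_id`) does not cause divergence within a warp.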
