Show simple item record

dc.contributor.author: Rosen, Paul (en_US)
dc.contributor.editor: B. Preim, P. Rheingans, and H. Theisel (en_US)
dc.date.accessioned: 2015-02-28T15:30:28Z
dc.date.available: 2015-02-28T15:30:28Z
dc.date.issued: 2013 (en_US)
dc.identifier.issn: 1467-8659 (en_US)
dc.identifier.uri: http://dx.doi.org/10.1111/cgf.12103 (en_US)
dc.description.abstract: We present an approach to investigate the memory behavior of a parallel kernel executing on thousands of threads simultaneously within the CUDA architecture. Our top-down approach allows for quickly identifying any significant differences between the execution of the many blocks and warps. As interesting warps are identified, we allow further investigation of memory behavior by visualizing the shared memory bank conflicts and global memory coalescence, first with an overview of a single warp with many operations and, subsequently, with a detailed view of a single warp and a single operation. We demonstrate the strength of our approach in the context of a parallel matrix transpose kernel and a parallel 1D Haar Wavelet transform kernel. (en_US)
dc.publisher: The Eurographics Association and Blackwell Publishing Ltd. (en_US)
dc.subject: Hardware [B.8.2] (en_US)
dc.subject: Performance and Reliability (en_US)
dc.subject: Performance Analysis and Design Aids (en_US)
dc.title: A Visual Approach to Investigating Shared and Global Memory Behavior of CUDA Kernels (en_US)
dc.description.seriesinformation: Computer Graphics Forum (en_US)

