Lines Matching full:on

17   CONTENTS    = {A comprehensive introduction on SISAL's internal structure.
36 …SaC), a functional C-variant aimed at numerical applications that is based on the proposed design,…
52 on multi-threading. The language design of SaC aims at combining high-level, compositional array p…
77 …howpublished = "\url{http://on-demand.gputechconf.com/gtc-express/2011/presentations/StreamsAndCon…
84 …howpublished = "\url{http://on-demand.gputechconf.com/gtc/2018/presentation/s8430-everything-you-n…
97 on the Graphics Processing Unit (GPGPU), in order to overcome those computational demands. Unfortu…
120 on a special loop construct, the with-loop, which in the functional language SAC (for Single Assig…
131 …booktitle = {6th Workshop on Declarative Aspects of Multicore Programming (DAMP'11), Austin, USA},
146 booktitle = {Proceedings of the Tenth Symposium on Trends in Functional Programming,
165 booktitle = {Proceedings of the 30th Symposium on Implementation and Application of Functional Lang…
180 …enables terse, composable programs to achieve state-of-the-art performance on a wide range of real…
181 booktitle = {Proceedings of the 34th ACM SIGPLAN Conference on Programming Language Design and Impl…
196 on loop interchange and distribution but uses higher-order reasoning rather than array-dependence …
197 booktitle = {Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Impl…
227 …come a de facto standard for programming NVIDIA GPUs. However, CUDA places on the programmer the b…
228 …booktitle = {Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Unit…
237 booktitle={2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO)},
266 booktitle = {Proceedings of the 4th International Workshop on Polyhedral Compilation Techniques},
274 booktitle={2019 IEEE/ACM Workshop on Memory Centric High Performance Computing (MCHPC)},
296 …title = {{Performance models for asynchronous data transfers on consumer Graphics Processing Units…
307 on different GPU architectures. Thus, we illustrate this methodology by deriving expressions of pe…
318 …fectively, task-level parallelism is exploited as a multi-threaded program on a multicore CPU. For…
331 on heterogeneous systems is described in this work. These asynchronous tasks arise from the Uintah…
332 on-node;asynchronous out-of-order scheduling;GPU computational tasks;hybrid runtime system;unified…
348 on GPUs, but tuning GPGPU applications for high performance is still quite challenging. Programmer…
357 … unified, pinned, and host/device memory allocations for memory-intensive workloads on Tegra SoC}},
363 on processing near the source of the data. Edge computing devices using the Tegra SoC architecture…