Lines Matching refs:performance
36 …hose runtime performance is competitive with that of hand-optimized Fortran programs. However, ess…
63 howpublished = "\url{https://devblogs.nvidia.com/maximizing-unified-memory-performance-cuda/}",
120 …s not affect the runtime performance of the compiled code, i.e., the programmer is liberated from …
180 …performance difference between a naive implementation of a pipeline and an optimized one is often …
196 …pact of the language and compiler features and shows application-level performance competitive wit…
227 …marks show that the simplicity and flexibility hiCUDA provides come at no expense to performance.},
238 title={{LIFT: A functional data-parallel IR for high-performance GPU code generation}},
307 …performance computing applications, since the launch of the Compute Unified Device Architecture (C…
331 …in the context of our new Unified multi-threaded scheduler design. The performance of the Unified …
348 …performance is still quite challenging. Programmers need to consider numerous architectural detail…
353 keywords = {CUDA, memory performance, program optimization, GPGPU, performance estimation}
363 … GPU application memory usage can be reduced up to 50\%, and that even performance improvements ca…
376 …n optimization technique is introduced which achieves the same runtime performance regardless of w…