19 skills found
cupy / Cupy: NumPy & SciPy for GPU
NVIDIA / CUDALibrarySamples: CUDA Library Samples
Bruce-Lee-LY / Cuda Hook: Hooks CUDA-related dynamic libraries using automated code-generation tools.
JuliaAttic / CUSPARSE.jl: Julia interface to NVIDIA's CUSPARSE library
ceruleangu / Block Sparse Benchmark: Benchmarks matrix multiplications between dense and block sparse (BSR) matrices in TVM, blocksparse (Gray et al.), and cuSPARSE.
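As a rough illustration of the BSR (block sparse row) layout that benchmark compares across libraries, here is a minimal pure-Python CPU sketch of a block-sparse matrix-vector product. All names and the storage convention shown are illustrative, not taken from any of the listed repos:

```python
# CPU sketch of y = A @ x for a BSR matrix with square bs x bs blocks.
# indptr/indices index block rows/columns CSR-style; blocks[k] is the
# dense bs x bs block stored as a list of lists.

def bsr_matvec(indptr, indices, blocks, x, bs):
    n_block_rows = len(indptr) - 1
    y = [0.0] * (n_block_rows * bs)
    for bi in range(n_block_rows):                      # each block row
        for k in range(indptr[bi], indptr[bi + 1]):     # each stored block
            bj = indices[k]                             # block column
            for r in range(bs):
                for c in range(bs):
                    y[bi * bs + r] += blocks[k][r][c] * x[bj * bs + c]
    return y

# 4x4 matrix with two 2x2 blocks on the diagonal:
# [[1 2 0 0], [3 4 0 0], [0 0 5 6], [0 0 7 8]]
indptr = [0, 1, 2]
indices = [0, 1]
blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
print(bsr_matvec(indptr, indices, blocks, [1, 1, 1, 1], 2))  # -> [3.0, 7.0, 11.0, 15.0]
```

Storing dense blocks rather than individual nonzeros is what lets GPU kernels use dense micro-tiles per block, which is the trade-off such benchmarks measure.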
gishi523 / Cusparse Cholesky Solver: Sample code for a sparse Cholesky solver using the cuSPARSE and cuSOLVER libraries
chenxuhao / Caffe Escoin: Escoin: Efficient Sparse Convolutional Neural Network Inference on GPUs
grlee77 / Python Cuda Cffi: Experimental Python CFFI interface to NVIDIA's cuSOLVER and cuSPARSE libraries.
zishun / CuSolverRf Batch: A complete example of batched refactorization in cuSOLVER.
dgSPARSE / DgSPARSE Wrapper: Unified sparse library wrapper based on cuSPARSE
Ending2015a / ICCG0 CUDA: Incomplete-Cholesky preconditioned conjugate gradient algorithm implemented with cuBLAS/cuSPARSE
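For readers unfamiliar with the algorithm behind that repo, here is a dependency-free CPU sketch of preconditioned conjugate gradient for SPD systems. The repo uses an incomplete-Cholesky preconditioner on the GPU via cuBLAS/cuSPARSE; this sketch deliberately substitutes a simple Jacobi (diagonal) preconditioner so it stays self-contained:

```python
# Preconditioned conjugate gradient for a symmetric positive definite
# system A x = b, with a Jacobi (diagonal) preconditioner standing in
# for the incomplete-Cholesky preconditioner used in the repo above.

def pcg(A, b, tol=1e-10, max_iter=100):
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    x = [0.0] * n
    r = b[:]                                    # residual b - A x (x = 0)
    z = [r[i] / A[i][i] for i in range(n)]      # preconditioner solve M z = r
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:
            break
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        rz = rz_new
        p = [zi + beta * pi for zi, pi in zip(z, p)]
    return x

A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
print(pcg(A, b))  # converges to the exact solution [1/11, 7/11]
```

In a GPU implementation the mat-vec maps to a cuSPARSE SpMV and the dot products and vector updates map to cuBLAS calls; only the loop structure above carries over.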
marcsous / GpuSparse: MATLAB MEX wrappers for cuSPARSE (NVIDIA)
bmsherman / Cublas: Haskell FFI bindings for CUBLAS, CUSPARSE, and CuFFT
AdamBrouwersHarries / Cusparse Spmv: Example use of cuSPARSE's SpMV routine, with benchmarking/reporting code
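The operation that repo benchmarks, cusparseSpMV, is a sparse matrix-vector product over a compressed format such as CSR. A minimal pure-Python CPU reference for the CSR case, with illustrative data only:

```python
# CPU reference for y = A @ x with A in CSR (compressed sparse row)
# format: indptr[i]:indptr[i+1] delimits row i's entries in data/indices.

def csr_spmv(indptr, indices, data, x):
    y = []
    for i in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y.append(acc)
    return y

# 3x3 matrix [[10, 0, 2], [0, 5, 0], [3, 0, 1]]
indptr = [0, 2, 3, 5]
indices = [0, 2, 1, 0, 2]
data = [10.0, 2.0, 5.0, 3.0, 1.0]
print(csr_spmv(indptr, indices, data, [1.0, 2.0, 3.0]))  # -> [16.0, 10.0, 6.0]
```

A GPU SpMV produces the same result but parallelizes over rows (or row segments), which is why benchmarking/reporting code like the repo's matters: performance depends heavily on the sparsity pattern.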
ayazhassan / RT CUDA GUI Development: Recent developments in Graphics Processing Units (GPUs) have opened a new opportunity to harness their computing power as a general-purpose computing paradigm through CUDA parallel programming. However, porting applications to CUDA remains a challenge for average programmers. We have developed a restructuring compiler (RT-CUDA) with the best possible kernel optimizations to bridge the gap between high-level languages and the machine-dependent CUDA environment. RT-CUDA is built on a set of compiler optimizations: it takes a C-like program and converts it into an optimized CUDA kernel, guided by user directives in a configuration file. While invoking external libraries is not possible with the commercial OpenACC compiler, RT-CUDA allows transparent invocation of highly optimized external math libraries such as cuSPARSE and cuBLAS. For this, RT-CUDA uses interfacing APIs, error-handling interpretation, and user-transparent programming, enabling efficient design of linear algebra solvers (LAS). RT-CUDA has been evaluated on a Tesla K20c GPU with a variety of basic linear algebra operators (M+, MM, MV, VV, etc.) as well as solvers for systems of linear equations such as Jacobi and Conjugate Gradient, and we obtained significant speedups over other compilers such as OpenACC and GPGPU compilers. RT-CUDA facilitates the design of efficient parallel software for developing parallel simulators (reservoir simulators, molecular dynamics, etc.), which are critical for the oil & gas industry. We expect RT-CUDA to be useful to many industries dealing with science and engineering simulation on massively parallel computers such as NVIDIA GPUs.
OrangeOwlSolutions / CuSPARSE: No description available
jcuda / Jcusparse: JCusparse - Java bindings for CUSPARSE
Ending2015a / StableFluid CUDA: A really old project implementing Stable Fluids using CUDA, cuBLAS and cuSPARSE
canercandan / Linear Algebra: A linear algebra framework in C++ with a layout abstraction for parallelization paradigms. It provides operators for computing with dense and sparse matrices using generically designed scalar, complex, vector, and matrix types. At this time, the framework supports the CUDA, CUBLAS, CUSP, and CUSPARSE libraries for parallel computing on GPGPUs.