Newest 'cuda' Questions - Code Review Stack Exchange

2 votes

1 answer

92 views

RAII Wrapper For CUDA Pointers

I was recently working on my CUDA wrappers library, and this particular class is one of the oldest pieces of code in the entire project. Since that time, I added tons of other features (for example <...

NeKon

641

asked Sep 25 at 14:27

12 votes

1 answer

799 views

Strongly-typed CUDA device memory

When I discovered that CUDA device memory was represented by plain old void* I was horrified by having to deal with C-style type safety and resource ownership (i.e. ...

Toby Speight

88.7k

asked Sep 20 at 16:14

7 votes

1 answer

265 views

RAII Wrapper For Registering/Mapping CUDA Resources

I've implemented a resource management class for CUDA interop using RAII to ensure exception safety. The goal is to handle the registration/unregistration and mapping/unmapping of graphics resources (...

NeKon

641

asked Sep 6 at 10:51

1 vote

0 answers

95 views

Sphere Generation System With CUDA-OpenGL Interop

This is a follow-up to my previous question; this one focuses more on the actual tessellation pipeline. What I changed from the previous question: Implemented the async sphere ...

NeKon

641

asked Sep 3 at 15:36

1 vote

0 answers

67 views

CUDA Sphere Tesselation With Support For LOD

I was working on my version of "Universe Sandbox", and the first thought that comes to mind is "where the hell are my planets?", so I thought loading models sucks and made this thing. It'...

NeKon

641

asked Aug 8 at 5:53

8 votes

1 answer

291 views

CUDA/NVRTC context switching function

I've implemented a feature in my C++ fractal explorer application to switch between CUDA and NVRTC. The main reason for the NVRTC/Driver API context is to support runtime compilation of custom CUDA ...

NeKon

641

asked May 16 at 14:18

15 votes

1 answer

2k views

CUDA Mandelbrot Kernel

I'm looking for feedback and suggestions on improving the performance and quality of my CUDA kernel for rendering the Mandelbrot set. I've implemented a "ping-pong" style coloring and ...

NeKon

641

asked Apr 21 at 19:18

3 votes

1 answer

103 views

Tracking total iterations in CUDA fractal renderer

I'm developing a fractal renderer in CUDA and need advice on tracking the total number of iterations performed during rendering. This is important for real-time dragging and zooming performance. ...

NeKon

641

asked Apr 1 at 12:58

6 votes

0 answers

169 views

FractalRendering on GPU with CUDA

I am writing a fractal renderer using CUDA, SFML, and C++. I recently optimized it to use less memory, and now I am going to optimize the actual fractals because, for some reason, that is what is most holding back ...

NeKon

641

asked Mar 27 at 19:52

2 votes

1 answer

85 views

I have a pytorch module that takes in some parameters and predicts the difference between one of it inputs and the target

One instance of the following module uses almost 75% of my VRAM, so I was wondering how I could reduce that without slowing down runtime too much. The code is below: ...

Jayson Meribe

21

asked Dec 5, 2024 at 20:55

3 votes

1 answer

129 views

Pytorch code running slow for Deep Q learning (Reinforcement Learning)

I'm a new student in reinforcement learning. Below is the code that I wrote for deep Q learning: ...

Jahid Chowdhury Choton

85

asked May 2, 2024 at 20:12

1 vote

0 answers

252 views

A CUDA kernel for a matrix product as outer product vectors

To multiply the matrices A and B using the outer product of vectors, we can express each row of matrix A as a row vector and each column of matrix B as a column vector. Then, we can take the outer ...

user366312

747

asked Jul 23, 2023 at 5:05

2 votes

1 answer

173 views

Applying cointegration function from statsmodels on a large dataframe

I need to apply the coint function from the statsmodels library pairwise to 207 time series with 1397 points each. Currently, it takes between 35 and 40 minutes on my computer with an Intel 24 Cores ...

Begoodpy

135

asked Jun 2, 2023 at 14:20

5 votes

3 answers

237 views

Summation over different determinants that are independently computed using CUDA

Do you have any suggestions for improving the efficiency of the code below? I believe that better optimization can be implemented in the GPU function cuKer_sum, which is located in the ...

Anomalous Physicst

51

asked May 17, 2023 at 9:38

5 votes

1 answer

223 views

CUDA kernel to compare pairs of matrices

My first time writing anything significant in CUDA. This kernel takes two arrays representing square matrices and compares them pair-wise. It takes into consideration large input arrays, and ...

l3utterfly

153

asked Apr 16, 2023 at 11:14
