cuda/nvenc - Printable Version

cuda/nvenc - enesha - 2024-12-11

Afternoon all :)

I have a few questions today (I think), but I'll ask them separately. This one is just about CUDA.

I have a large library and made some changes in the trickplay settings. No real problem. I totally know that processing trickplay will take forever with my items and settings. That's a whatever.

The only question I have is about trickplay generation and its use of CUDA. HW encoding etc. is configured fine and it is using CUDA. The problem is that it is totally underusing the card. It's not a great card, a bit older, but it was good at the time: PCIe 3.0 x16 it seems, 4 GB of RAM, and ffmpeg's CUDA usage tops out at about 15%.

Is there a way to multithread the GPU? I'm not entirely sure if CUDA/GPU/ffmpeg would be up for it. I have limited CUDA knowledge other than using it for years for BOINC projects etc.

I think you can transcode (TC) for streaming users with more than one stream... right?

cpu i7
GPU nvidia 1080ti
32G Ram
Debian

Anybody more up to date on CUDA and all this? :)

Attached are some relevant pics, I think :)

(Edit 1: just to be more clear)

RE: cuda/nvenc - TheDreadPirate - 2024-12-11

Currently Jellyfin only supports HWA decoding with Nvidia GPUs when generating trickplays. QSV, VAAPI, and Rockchip are supported for MJPEG encoding.

RE: cuda/nvenc - enesha - 2024-12-11

Heya DreadPirate

Yuppers, and Nvidia is all I'm worried about. CUDA is Nvidia only, right? Maybe you thought I was wondering about throwing a stream to the CPU. Nope. I am talking about multiple streams to the GPU. Like I said, I seem to recall that you could use the GPU to transcode for more than one user who needs it. It looks like CUDA can have multiple streams, based on this article:

https://leimao.github.io/blog/Multi-Thread-Single-Stream-VS-Single-Thread-Multi-Stream-CUDA/

Which says as much in the first paragraph before it jumps into the rabbit hole. I am happy that what we have now is using the GPU; using the CPU was killing me at first. I'm just saying 10% GPU usage and 200 MB of VRAM seems like the card is being super underutilized, and I wondered if there was a way to push more than one process at a time :)

Would that be a CUDA limitation, ffmpeg, or even the card? Just curious about it because of how slow it's going lol

RE: cuda/nvenc - gnattu - 2024-12-11

It is ultimately limited by the decoder, and from your image your decoder is already fully saturated at 98%.
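For anyone who wants to check this on their own box without a screenshot, here is a minimal sketch that reads the per-engine columns from `nvidia-smi dmon -s u` (sm, mem, enc, dec) and reports which engine is busiest. The function names and the sample numbers are made up for illustration; the sample only mirrors the rough figures mentioned in this thread.

```python
import subprocess

# Illustrative sample only: shaped like `nvidia-smi dmon -s u -c 1` output,
# with numbers chosen to resemble the figures discussed in this thread.
SAMPLE = """\
# gpu     sm    mem    enc    dec
# Idx      %      %      %      %
    0     15      3      0     98
"""

def read_dmon():
    """Run nvidia-smi once and return its text output (needs an Nvidia driver)."""
    result = subprocess.run(
        ["nvidia-smi", "dmon", "-s", "u", "-c", "1"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def parse_dmon(text):
    """Turn dmon output into a list of {column: value} dicts, one per GPU row."""
    columns = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first header line names the columns; the second holds units.
            if columns is None:
                columns = line.lstrip("#").split()
            continue
        if columns is None:
            continue
        rows.append(dict(zip(columns, line.split())))
    return rows

def busiest_engine(row, engines=("sm", "enc", "dec")):
    """Return (engine, percent) for the most loaded engine in one row."""
    loads = {e: int(row[e]) for e in engines if row.get(e, "-").isdigit()}
    return max(loads.items(), key=lambda kv: kv[1])

rows = parse_dmon(SAMPLE)
print(busiest_engine(rows[0]))
```

On a live system, swap `SAMPLE` for `read_dmon()`. A high `dec` value next to a low `sm` value is the pattern described above: the video decoder is the bottleneck, not the CUDA cores.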

RE: cuda/nvenc - TheDreadPirate - 2024-12-11

In your post and screenshots, I only see you talking about trickplays. Trickplays will only get worked on one at a time to ensure that system resources aren't completely consumed by a non-essential task. On Nvidia systems, it will only use the GPU for decoding and the CPU for encoding trickplays. We do not yet support NVENC for encoding trickplays. So there won't be a lot of GPU utilization when you only have trickplay generation going on besides the decoder, which appears to be maxed out.
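To make the split concrete, here is a rough sketch of what a "GPU decodes, CPU encodes" thumbnail job looks like as an ffmpeg command. This is not Jellyfin's actual command line; the helper name, the file names, the 10 second interval, and the 320 pixel width are all assumptions for illustration.

```python
import shlex

def thumbnail_command(src, out_pattern, interval_s=10, width=320):
    """Build an ffmpeg argument list: hardware decode, software scale and JPEG encode."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",   # input option: decode on the GPU's video decoder
        "-i", src,
        "-an",                # drop audio, only frames are needed
        # Software filters: keep one frame per interval, then shrink it.
        "-vf", f"fps=1/{interval_s},scale={width}:-2",
        "-c:v", "mjpeg",      # software MJPEG encoder, runs on the CPU
        "-q:v", "4",
        out_pattern,
    ]

cmd = thumbnail_command("movie.mkv", "thumb_%05d.jpg")
print(shlex.join(cmd))
```

Only the `-hwaccel cuda` part touches the GPU here. Everything after decoding (frame selection, scaling, JPEG encoding) runs on the CPU, which is why overall GPU utilization stays low while the decoder is pegged.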

As for actual video playback transcoding, ffmpeg will always use 100% of NVENC and transcode as fast as possible. It can already handle multiple users at the same time, up to the software limit set by Nvidia, which is 8 concurrent sessions. But there are ways to remove the limit.
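If you want to see how close you are to that limit, a small sketch like the one below can help. It assumes your driver's `nvidia-smi` exposes the `encoder.stats.sessionCount` query field, and it hard-codes the limit of 8 quoted above, which depends on the driver version. The function names are hypothetical.

```python
import subprocess

NVENC_SESSION_LIMIT = 8  # assumption: the figure quoted above, driver-dependent

def read_session_counts():
    """Ask nvidia-smi for the active NVENC session count on each GPU."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=encoder.stats.sessionCount",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout

def parse_session_counts(csv_text):
    """Parse the csv,noheader output: one integer per GPU line."""
    return [int(line.strip()) for line in csv_text.splitlines() if line.strip()]

def sessions_available(count, limit=NVENC_SESSION_LIMIT):
    """How many more encode sessions fit under the limit (never negative)."""
    return max(limit - count, 0)

# Example with a made-up reading of two active transcodes on one GPU.
counts = parse_session_counts("2\n")
print([sessions_available(c) for c in counts])
```

On a real server, pass `read_session_counts()` to `parse_session_counts` instead of the made-up string. Trickplay generation will not show up in this count, since it does not use NVENC on Nvidia systems.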

RE: cuda/nvenc - enesha - 2024-12-12

Ah, wonderful as always, mate! You explained perfectly what's happening, and even added the bit about TC streams.

I assume it's 93% dec because it is decoding as fast as possible for trickplay, and multiple TC streams would work because they run close to real time (video-wise) and can pause if they get ahead with the TC... sounds like I have a grasp, I hope :)

A fount of knowledge as always :)

Super great thanks!