Jellyfin Forum
Hardware Encoding broken after updating proxmox - Printable Version




Hardware Encoding broken after updating proxmox - Nanometer3912 - 2024-01-04

I'm running Jellyfin on a Proxmox server, in a Debian 11 LXC, via docker-compose. It had been humming along without any issues, but after a routine Proxmox update transcoding is suddenly broken. From what I can tell, passthrough is working fine, as I can run nvidia-smi on the Proxmox host, in the Debian 11 LXC, and inside the docker container:


root@server:/# nvidia-smi
Thu Jan  4 21:45:26 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.113.01            Driver Version: 535.113.01  CUDA Version: 12.2    |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf          Pwr:Usage/Cap |        Memory-Usage | GPU-Util  Compute M. |
|                                        |                      |              MIG M. |
|=========================================+======================+======================|
|  0  Quadro P600                    Off | 00000000:02:00.0 Off |                  N/A |
| 24%  40C    P0              N/A /  N/A |      0MiB /  2048MiB |      0%      Default |
|                                        |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU  GI  CI        PID  Type  Process name                            GPU Memory |
|        ID  ID                                                            Usage      |
|=======================================================================================|
|  No running processes found                                                          |
+---------------------------------------------------------------------------------------+
root@server:/#


Any hardware transcode fails with the following ffmpeg output (final four lines):


[AVHWDeviceContext @ 0x55cde7791d40] cu->cuInit(0) failed -> CUDA_ERROR_UNKNOWN: unknown error
Device creation failed: -542398533.
Failed to set value 'cuda=cu:0' for option 'init_hw_device': Generic error in an external library
Error parsing global options: Generic error in an external library
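
A note on the number in "Device creation failed": FFmpeg packs many of its error codes as a negated four-character tag, so the value can be decoded back into that tag. The sketch below is only an illustration of that decoding (the function name is made up for this post); it shows that -542398533 carries the same information as the "Generic error in an external library" text and nothing more specific.

```python
def decode_averror_tag(code):
    """Return the four-character tag packed into a negative FFmpeg error code."""
    # FFmpeg builds these codes as -(a | b << 8 | c << 16 | d << 24),
    # so the negated value read as little-endian bytes gives the tag back.
    return (-code).to_bytes(4, "little").decode("ascii")


print(repr(decode_averror_tag(-542398533)))  # prints 'EXT '
```

'EXT ' is the tag behind AVERROR_EXTERNAL, so the useful detail is the cuInit line above it, not the number.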


This happens with any transcode; the example above came from transcoding an x265 HEVC mkv file with this command:


/usr/lib/jellyfin-ffmpeg/ffmpeg -analyzeduration 200M -init_hw_device cuda=cu:0 -filter_hw_device cu -hwaccel cuda -hwaccel_output_format cuda -threads 1 -autorotate 0 -i file:"/data/locationredacted/filenameredacted.mkv" -autoscale 0 -map_metadata -1 -map_chapters -1 -threads 8 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -preset p1 -b:v 6707950 -maxrate 6707950 -bufsize 13415900 -g:v:0 75 -keyint_min:v:0 75 -vf "setparams=color_primaries=bt709:color_trc=bt709:colorspace=bt709,scale_cuda=format=yuv420p" -codec:a:0 libfdk_aac -ac 2 -ab 192000 -ar 48000 -copyts -avoid_negative_ts disabled -max_muxing_queue_size 2048 -f hls -max_delay 5000000 -hls_time 3 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/data/transcodes/6ada95c6cc43163d6d99479c3195bfa3%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/data/transcodes/6ada95c6cc43163d6d99479c3195bfa3.m3u8"


ffmpeg version 5.1.4-Jellyfin Copyright © 2000-2023 the FFmpeg developers
  built with gcc 11 (Ubuntu 11.4.0-1ubuntu1~22.04)
  configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-libs=-lfftw3f --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-static --disable-libxcb --disable-sdl2 --disable-xlib --enable-lto --enable-gpl --enable-version3 --enable-shared --enable-gmp --enable-gnutls --enable-chromaprint --enable-libdrm --enable-libass --enable-libfreetype --enable-libfribidi --enable-libfontconfig --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libopenmpt --enable-libdav1d --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libfdk-aac --arch=amd64 --enable-libsvtav1 --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-opencl --enable-vaapi --enable-amf --enable-libmfx --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
  libavutil      57. 28.100 / 57. 28.100
  libavcodec    59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter    8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample  4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100



This is Jellyfin Version: 10.8.13, here's the docker-compose:


version: '3'
services:

  jellyfin:
    image: lscr.io/linuxserver/jellyfin
    container_name: jellyfin
    network_mode: 'host'
    devices:
      - /dev/nvidia-uvm
      - /dev/nvidia-uvm-tools
      - /dev/nvidia-modeset
      - /dev/nvidiactl
      - /dev/nvidia0
#      - /dev/nvidia-caps
    restart: unless-stopped
    environment:
      - PUID=1001
      - PGID=1001
      - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
      - NVIDIA_VISIBLE_DEVICES=all
      - TZ=Europe/London
      - UMASK_SET=022 #optional
    volumes:
      - /opt/appdata/jellyfin:/config
      - /mnt/storage/jellyfin_transcodes:/config/data/transcodes
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]


Config in /etc/pve/lxc/102.conf includes:


lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/nvram dev/nvram none bind,optional,create=file
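
Because every entry above is marked optional, LXC will (as far as I understand the option) skip a bind mount whose source is missing on the host instead of refusing to start the container. A rough sketch for checking that on the host follows; the function names and the two-line sample are only illustrative, and in practice you would feed it the contents of your own container config.

```python
import os


def bind_sources(config_text):
    """Extract the host-side source path from each lxc.mount.entry line."""
    sources = []
    for line in config_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "lxc.mount.entry":
            fields = value.split()
            if fields:
                sources.append(fields[0])
    return sources


def missing_sources(config_text):
    """Return the bind sources that do not exist on this machine."""
    return [path for path in bind_sources(config_text) if not os.path.exists(path)]


# Two lines copied from the config above, used here as sample input.
SAMPLE = """\
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
"""

for path in missing_sources(SAMPLE):
    print("missing on this machine:", path)
```

Run on the Proxmox host, anything it prints is a device node the container would silently start without.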


Given everything was working fine before the update (though I allow this could have been caused by other adjacent changes, such as a pull of the docker image, which has also happened in the meantime), I've duly consulted the interwebs. There's a lot of noise about these nvidia drivers and quite a lot of conflicting information. I gather there may be a bug with jellyfin losing track of the nvidia hw device because it has changed names:

https://github.com/jellyfin/jellyfin/issues/9177

There's a very remote possibility that I'm hitting (arbitrary) limits of concurrent hw encoding, as I haven't bothered to install the patched nvidia driver yet (we rarely run more than 2 streaming sessions concurrently):

https://github.com/keylase/nvidia-patch

I also gather there may be some codec issues, but have struggled to follow exactly what the issues there are:

https://github.com/Artiume/jellyfin-docs/blob/master/general/wiki/main.md

So now I turn to this fine community to see if other users have experienced something similar lately and what steps worked for you.


RE: Hardware Encoding broken after updating proxmox - User 5268 - 2024-01-07

I have a similar problem. I run jellyfin as a standalone application in an lxc. I have assigned my nvidia m2000 to the lxc container. It worked fine until December 2023. Now it shows an error:

ffmpeg version 6.0.1-Jellyfin Copyright © 2000-2023 the FFmpeg developers
built with gcc 12 (Debian 12.2.0-14)
configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-static --disable-libxcb --disable-sdl2 --disable-xlib --enable-lto --enable-gpl --enable-version3 --enable-shared --enable-gmp --enable-gnutls --enable-chromaprint --enable-libdrm --enable-libass --enable-libfreetype --enable-libfribidi --enable-libfontconfig --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libopenmpt --enable-libdav1d --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libfdk-aac --arch=amd64 --enable-libsvtav1 --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-opencl --enable-vaapi --enable-amf --enable-libvpl --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
libavutil 58. 2.100 / 58. 2.100
libavcodec 60. 3.100 / 60. 3.100
libavformat 60. 3.100 / 60. 3.100
libavdevice 60. 1.100 / 60. 1.100
libavfilter 9. 3.100 / 9. 3.100
libswscale 7. 1.100 / 7. 1.100
libswresample 4. 10.100 / 4. 10.100
libpostproc 57. 1.100 / 57. 1.100
[AVHWDeviceContext @ 0x5594b6a55bc0] cu->cuInit(0) failed -> CUDA_ERROR_UNKNOWN: unknown error
Device creation failed: -542398533.
Failed to set value 'cuda=cu:0' for option 'init_hw_device': Generic error in an external library
Error parsing global options: Generic error in an external library

I have experimented with different methods but the error is still there. I have tried the NVIDIA drivers from the official website and from the debian repository, with the cuda toolkit and without it, with jellyfin-ffmpeg5 and jellyfin-ffmpeg6; nothing makes a difference and the error continues. nvidia-smi shows no transcoding going on either. This functioned flawlessly in the past; I did not change anything except the periodic apt updates, and it stopped working.

Does anyone have a solution for this?


RE: Hardware Encoding broken after updating proxmox - TheDreadPirate - 2024-01-07

Check out this walkthrough a community member wrote.

https://forum.jellyfin.org/t-proxmox-lxc-with-nvidia-transcoding-and-network-share


RE: Hardware Encoding broken after updating proxmox - jellyshield - 2024-01-11

For me it used to work with just installing the latest driver from NVIDIA on both host and lxc. I didn't even have to install libnvcuvid1 and libnvidia-encode1 separately. After some kernel update, jellyfin-ffmpeg5/6 is broken. I did what was mentioned and downgraded the driver to match the Debian version. I was also able to install libnvcuvid1 and libnvidia-encode1 in the lxc and successfully ran nvidia-smi on both host and lxc. Still I have the same error.

Both the libraries are installed

Code:
root@jellyfin:~# apt install -y libnvcuvid1 libnvidia-encode1
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
libnvcuvid1 is already the newest version (525.147.05-4~deb12u1).
libnvidia-encode1 is already the newest version (525.147.05-4~deb12u1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.


NVIDIA-SMI

Code:
root@jellyfin:~# nvidia-smi
Thu Jan 11 00:43:19 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05  Driver Version: 525.147.05  CUDA Version: 12.0    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|        Memory-Usage | GPU-Util  Compute M. |
|                              |                      |              MIG M. |
|===============================+======================+======================|
|  0  Quadro M2000        On  | 00000000:21:00.0 Off |                  N/A |
| 56%  28C    P8    6W /  75W |      1MiB /  4096MiB |      0%      Default |
|                              |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU  GI  CI        PID  Type  Process name                  GPU Memory |
|        ID  ID                                                  Usage      |
|=============================================================================|
|  No running processes found                                                |
+-----------------------------------------------------------------------------+



Still get the same error in Jellyfin during transcoding

Code:
ffmpeg version 5.1.4-Jellyfin Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --prefix=/usr/lib/jellyfin-ffmpeg --target-os=linux --extra-libs=-lfftw3f --extra-version=Jellyfin --disable-doc --disable-ffplay --disable-ptx-compression --disable-static --disable-libxcb --disable-sdl2 --disable-xlib --enable-lto --enable-gpl --enable-version3 --enable-shared --enable-gmp --enable-gnutls --enable-chromaprint --enable-libdrm --enable-libass --enable-libfreetype --enable-libfribidi --enable-libfontconfig --enable-libbluray --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libopenmpt --enable-libdav1d --enable-libwebp --enable-libvpx --enable-libx264 --enable-libx265 --enable-libzvbi --enable-libzimg --enable-libfdk-aac --arch=amd64 --enable-libsvtav1 --enable-libshaderc --enable-libplacebo --enable-vulkan --enable-opencl --enable-vaapi --enable-amf --enable-libmfx --enable-ffnvcodec --enable-cuda --enable-cuda-llvm --enable-cuvid --enable-nvdec --enable-nvenc
  libavutil      57. 28.100 / 57. 28.100
  libavcodec    59. 37.100 / 59. 37.100
  libavformat    59. 27.100 / 59. 27.100
  libavdevice    59.  7.100 / 59.  7.100
  libavfilter    8. 44.100 /  8. 44.100
  libswscale      6.  7.100 /  6.  7.100
  libswresample  4.  7.100 /  4.  7.100
  libpostproc    56.  6.100 / 56.  6.100
[AVHWDeviceContext @ 0x562bf18b8d00] cu->cuInit(0) failed -> CUDA_ERROR_UNKNOWN: unknown error
Device creation failed: -542398533.
Failed to set value 'cuda=cu:0' for option 'init_hw_device': Generic error in an external library
Error parsing global options: Generic error in an external library
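
Since the log stops at cuInit, before ffmpeg touches any media, the same call can be tried without Jellyfin or ffmpeg in the picture. The following is a minimal sketch that loads the driver library with ctypes and calls cuInit(0); it assumes the library is reachable as libcuda.so.1, and the helper name is invented for this example. 999 is the CUresult value for CUDA_ERROR_UNKNOWN, the error shown in the log.

```python
import ctypes

CUDA_SUCCESS = 0
CUDA_ERROR_UNKNOWN = 999  # value from the CUDA driver API's CUresult enum


def cu_init_result(library="libcuda.so.1"):
    """Call cuInit(0); return the CUresult, or None if the library cannot be loaded."""
    try:
        cuda = ctypes.CDLL(library)
    except OSError:
        return None
    cuda.cuInit.argtypes = [ctypes.c_uint]
    cuda.cuInit.restype = ctypes.c_int
    return cuda.cuInit(0)


result = cu_init_result()
if result is None:
    print("libcuda.so.1 could not be loaded")
elif result == CUDA_SUCCESS:
    print("cuInit succeeded")
else:
    print(f"cuInit failed with CUresult {result}")
```

If this fails the same way inside the lxc while nvidia-smi still works, the problem sits below ffmpeg, in the driver or the device nodes visible to the container.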



RE: Hardware Encoding broken after updating proxmox - TheDreadPirate - 2024-01-11

Double check on the host that the group IDs didn't change for the nvidia devices in /dev/dri after a kernel and/or driver update. Update your lxc config accordingly if they did change.
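
A quick way to collect those values for comparison is sketched below. It lists the device nodes under /dev/nvidia* and /dev/dri/ with their major and minor numbers, group ID, and mode. The glob patterns and the helper name are assumptions for illustration; run it on the host before and after an update (or on host and lxc) and compare the output with what the lxc config expects.

```python
import glob
import os
import stat


def describe_device(path):
    """Return major/minor numbers, group ID and mode string for a device node."""
    st = os.stat(path)
    is_device = stat.S_ISCHR(st.st_mode) or stat.S_ISBLK(st.st_mode)
    return {
        "path": path,
        "major": os.major(st.st_rdev) if is_device else None,
        "minor": os.minor(st.st_rdev) if is_device else None,
        "gid": st.st_gid,
        "mode": stat.filemode(st.st_mode),
    }


for node in sorted(glob.glob("/dev/nvidia*") + glob.glob("/dev/dri/*")):
    if os.path.isdir(node):
        continue  # skip directories such as /dev/dri/by-path
    info = describe_device(node)
    print(f'{info["mode"]} gid={info["gid"]} {info["major"]}:{info["minor"]} {info["path"]}')
```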