Jellyfin Forum
[closed] Transcoding on i3-N305 / iGPU - Printable Version


[closed] Transcoding on i3-N305 / iGPU - bitstream - 2025-05-27

Transcoding on i3-N305 / iGPU (UHD 770)

Admittedly not the most powerful CPU, and the iGPU even less so. Still, I gave it a chance.

JF is running in a VM on PVE with 4 vCores and 4 GB vRAM, so rather at the lower limit. As it only has to serve one stream at a time, not much more is needed, and JF never hits those limits.

I passed the iGPU through (using the 8.5 Linux kernel, which provides full support for the N305's GPU) and set up JF to use hardware acceleration, testing with both QSV and VAAPI. CPU threads are limited to 4. Both worked; I got the impression that VAAPI performed a bit better.

However, although CPU usage drops significantly, it is still high.

To compare: same video, same client, 3.4 Mbps MP4 H264 AAC, transcoding runtime:

Without hardware acceleration: ffmpeg at 320 - 350% CPU for 2:52
With QSV: ffmpeg at 200 - 250% CPU for 1:40

So a clear reduction, approx. 60% in total CPU time (usage x duration), but I was expecting even more.
I also tried the Intel low-power encoders; no significant difference detected.
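
The "approx. 60%" can be checked by multiplying the midpoint of each CPU range by the runtime. This is only a rough sketch: it assumes the midpoint is representative of the whole run, and the helper name `core_seconds` is made up for this example.

```shell
# core_seconds LOW% HIGH% MM:SS
# Midpoint CPU usage (as a number of cores) times the runtime in seconds.
core_seconds() {
  awk -v lo="$1" -v hi="$2" -v t="$3" 'BEGIN {
    split(t, p, ":")
    printf "%.1f", (lo + hi) / 2 / 100 * (p[1] * 60 + p[2])
  }'
}

sw=$(core_seconds 320 350 2:52)   # without hardware acceleration
hw=$(core_seconds 200 250 1:40)   # with QSV

echo "software: $sw core-seconds, QSV: $hw core-seconds"
awk -v sw="$sw" -v hw="$hw" 'BEGIN { printf "reduction: %.0f%%\n", (1 - hw / sw) * 100 }'
# prints "reduction: 61%"
```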

Am I too optimistic to expect more? Or should I be able to achieve an even bigger reduction in CPU usage?


RE: Transcoding on i3-N305 / iGPU - bitstream - 2025-05-27

I just ran a test on the CLI, feeding ffmpeg directly with the same file I tested above, though of course without streaming. So the runtimes are not directly comparable to the results above, but the difference between the run without hardware acceleration and the run with it is comparable.

without hardware acc:
time ffmpeg -i /nas/test.avi -c:v libx264 -f null - 
real    8m11.390s
user    30m8.818s
sys    0m45.056s

with hardware acc:
time ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i /nas/test.avi -vf 'format=nv12,hwupload' -c:v h264_vaapi  -f null -
real    2m49.731s
user    1m41.904s
sys    0m11.403s

The relevant figure is "user" (CPU time summed over all cores used). The difference is massive: with the iGPU, the encode used about 18x less CPU time, and the whole process finished about 2.9x faster (8m11s vs. 2m50s), compared to no hardware acceleration.
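
The factors can be recomputed from the `time` output above. A minimal sketch using awk; the helper names `to_seconds` and `ratio` are made up for this example.

```shell
# Convert a bash `time` value such as 30m8.818s into seconds.
to_seconds() {
  echo "$1" | awk -F'[ms]' '{ printf "%.3f", $1 * 60 + $2 }'
}

# Ratio of two `time` values, rounded to one decimal place.
ratio() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
    'BEGIN { printf "%.1f", a / b }'
}

echo "user: $(ratio 30m8.818s 1m41.904s)x less CPU time"   # 17.8x
echo "real: $(ratio 8m11.390s 2m49.731s)x faster"          # 2.9x
```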

These are the factors I was expecting to see in JF as well. Why is there such a big difference? I'm aware that there is also audio transcoding and potentially subtitle burn-in (not the case here), but compared to video processing, these tasks seem to consume rather little CPU power.


RE: Transcoding on i3-N305 / iGPU - bitstream - 2025-05-27

I see similar behaviour and similar results with trickplay: for a couple of seconds the GPU load is high (70 - 100%), then it drops to 0.1% for a minute or two while the CPU sits at around 30% on 4 vCores. So very little seems to be offloaded to the GPU.


RE: Transcoding on i3-N305 / iGPU - nyanmisaka - 2025-05-28

Three possibilities:
1. You installed vanilla ffmpeg instead of jellyfin-ffmpeg (zero-copy transcoding will be disabled)
2. Some hardware codecs are not enabled in the dashboard
3. Audio transcoding is happening, which can only run on the CPU
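
A rough way to check these three points is to look at the command line of the ffmpeg process while a stream is playing. The sketch below rests on assumptions: the helper name `check_cmdline` is made up, it assumes a jellyfin-ffmpeg install has "jellyfin-ffmpeg" somewhere in the binary path, and it only pattern-matches codec names, so treat the output as a hint rather than proof.

```shell
# check_cmdline "<ffmpeg command line>"
# Prints one hint per possibility from the list above.
check_cmdline() {
  # 1. Which ffmpeg build is running? (assumes the path contains "jellyfin-ffmpeg")
  case "$1" in
    *jellyfin-ffmpeg*) echo "binary: jellyfin-ffmpeg" ;;
    *)                 echo "binary: other ffmpeg build" ;;
  esac
  # 2. Is a VAAPI or QSV video codec named on the command line?
  if printf '%s\n' "$1" | grep -Eq '(h264|hevc|av1)_(vaapi|qsv)'; then
    echo "video: VAAPI/QSV codec in use"
  else
    echo "video: no VAAPI/QSV codec found"
  fi
  # 3. Is the audio stream copied, or transcoded on the CPU?
  if printf '%s\n' "$1" | grep -Eq -- '-(c:a|codec:a|acodec)[^ ]* copy'; then
    echo "audio: copied"
  else
    echo "audio: transcoded on the CPU"
  fi
}

# Inspect every running ffmpeg process (prints nothing if none is running).
ps -eo args | grep '[f]fmpeg' | while IFS= read -r line; do
  check_cmdline "$line"
done
```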


RE: Transcoding on i3-N305 / iGPU - bitstream - 2025-05-29

>  1. You installed vanilla ffmpeg instead of jellyfin-ffmpeg (zero-copy transcoding will be disabled)
This point I can rule out.

> 2. Some hardware codecs are not enabled in the dashboard
Except for VP8, all offered codecs for decoding are activated.
Hardware encoding, incl. both Intel low-power encoders, is also activated.

> 3. Audio transcoding is happening, which can only run on the CPU
Yes, audio transcoding is happening. See example below. 

Is my understanding incorrect that audio decoding and encoding takes only a fraction of the CPU time that video decoding and encoding does? Meaning, it shouldn't make a big difference and wouldn't explain the results of my tests above?

For example:

Playback Info
Player: Html Video Player
Play method: Transcoding
Protocol: https
Stream type: HLS

Video Info
Player dimensions: 1762x1095
Video resolution: 720x540
Dropped frames: 0
Corrupted frames: 0

Transcoding Info
Video codec: H264
Audio codec: AAC
Audio channels: 2
Bitrate: 3.8 Mbps

Reason for transcoding
The container is not supported
The video codec is not supported
The audio codec is not supported

Original Media Info
Container: avi
Size: 550 MiB
Bitrate: 1.8 Mbps
Video codec: MPEG4 Advanced Simple Profile
Video bitrate: 1.4 Mbps
Video dynamic range: SDR
Audio codec: AC3
Audio bitrate: 192 kbps
Audio channels: 2
Audio sample rate: 48000 Hz


RE: Transcoding on i3-N305 / iGPU - bitstream - 2025-05-29

So, if I get it right, this would be the high-level data flow:

[HW DECODING] → [CPU: Filter, Audio, Subtitle, Container] → [HW ENCODING]

While video decoding and encoding can be shifted to the GPU, the rest remains on the CPU. That also includes demuxing, remuxing, bitrate changes, and so on. In my example, this means that this part takes 250% CPU for 1:40.
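
One way to see how the remaining CPU load splits up would be to time the audio and video parts separately on the CLI, reusing the options from the earlier test. This is a sketch I have not run on this machine; `-vn` and `-an` drop the video or audio stream respectively, and the guard only makes the script skip cleanly where ffmpeg or the test file is missing.

```shell
f=/nas/test.avi   # same test file as in the CLI test above

if command -v ffmpeg >/dev/null && [ -r "$f" ]; then
  echo "audio only (AC3 -> AAC on the CPU, video dropped):"
  time ffmpeg -v error -i "$f" -vn -c:a aac -f null -

  echo "video only (VAAPI encode, audio dropped):"
  time ffmpeg -v error -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i "$f" \
    -an -vf 'format=nv12,hwupload' -c:v h264_vaapi -f null -
else
  echo "ffmpeg or $f not available, nothing measured"
fi
```

Comparing the "user" values of the two runs shows which share of the CPU time belongs to audio and which to the non-offloaded video path.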


RE: Transcoding on i3-N305 / iGPU - bitstream - 2025-06-06

I tested a couple of combinations in the meantime.
When direct streaming is possible, CPU load is low, as expected.
I found almost no combinations where video is played directly while audio is transcoded, or vice versa.
When both video and audio need to be transcoded, I have not found a video that is not shifted to the iGPU for transcoding, which happens very fast. The bottleneck, as already found out above, is the remaining work that has to run on the CPU. That's still a significant load.