(2024-11-08, 09:11 PM)TheDreadPirate Wrote: Our chat over in troubleshooting got me to come back here and update my process. Now that jf-ffmpeg7 has Dolby Vision removal bundled, what used to be a multi-step workflow is a "one-liner". This also removed a bunch of steps I previously needed to keep video, audio, and subtitles in sync.
I had some additional discussions with Nyanmisaka about how to handle video with a lot of film grain, which the encoder handles poorly because of its randomness, causing a significant drop in quality. I had considered your de-noise approach, but ended up choosing to preserve the film grain, so I explored some additional parameters Nyanmisaka had suggested to make better use of the available bitrate.
Okay, so I've found a happy medium. The code you provided here does AMAZING things with recently-produced media and offers some improvement on older, grainy media. For my test media, file sizes are roughly halved for live action with light to zero grain. I see much less improvement in file size with animated media, grainy or not, but non-grainy animated sources already work pretty well with vanilla av1_qsv, so I'm not too hurt on that front.
What I've done is research on the web as well as LOTS of trial and error (mostly error, and I can't overstate how much error) to find a good solution for grainier media, animated and live action alike. I wanted to keep some grain but reduce file sizes while maintaining the visual clarity and detail I've come to expect from my encodes. Earlier I mentioned using nlmeans_opencl, but I didn't have a good tune put together, which hurt performance and didn't give the results I wanted. Now? I have an easily-modified solution that provides amazing results. The only issue is that I wouldn't recommend automating it: I'm not aware of a noise-detecting algorithm that could provide something like a first pass or a scene-by-scene CRF/QP map.
Code:
MEDIA="MEDIA" && INPUT="INPUT" && \
cd "/mnt/media/folder" && \
mkdir -p "/mnt/media/luxe/import/${MEDIA}" && \
echo "Calculating crop values..." && \
cropval=$(jffmpeg -ss 120 -i "${INPUT}" -t 5:00 -filter:v fps=1/2,cropdetect -f null - 2>&1 | awk '/crop/ { print $NF }' | tr ' ' '\n' | sort | uniq -c | sort -n | tail -1 | awk '{ print $NF }') && \
cw=$(echo "${cropval#*crop=}" | cut -d : -f 1) && ch=$(echo "${cropval#*crop=}" | cut -d : -f 2) && \
cx=$(echo "${cropval#*crop=}" | cut -d : -f 3) && cy=$(echo "${cropval#*crop=}" | cut -d : -f 4) && \
jffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129 -init_hw_device qsv=qs@va -init_hw_device opencl=ocl@va \
-analyzeduration 200M -probesize 1G -extra_hw_frames 40 \
-i "${INPUT}" \
-map 0:v:0 -map 0:a:0 -map 0:s:0 -map 0:s:1 \
-c copy -c:v:0 "av1_qsv" -preset "veryslow" -low_delay_brc 1 -extbrc 1 -look_ahead_depth 40 \
-adaptive_i 1 -adaptive_b 1 -b_strategy 1 -bf 39 \
-global_quality:v:0 22 \
-filter:v "hwupload=derive_device=opencl:extra_hw_frames=40,nlmeans_opencl=s=2.3:p=7:pc=3:r=7:rc=3,hwdownload,format=yuv420p,hwupload=derive_device=qsv:extra_hw_frames=40,vpp_qsv=cw=${cw}:ch=${ch}:cx=${cx}:cy=${cy}:format=p010le" \
-filter:s "hwupload=derive_device=qsv:extra_hw_frames=40,vpp_qsv=scale_mode=2:w=1920:h=-1" \
-c:a libopus -b:a 256k -ac 6 -filter:a aformat=channel_layouts="7.1|5.1|stereo" \
-metadata title="${MEDIA}" -metadata:s:v:0 title='AV1 GQ22' \
-metadata:s:a:0 title='5.1 Surround' -metadata:s:a:0 language='eng' -disposition:a:0 default \
-metadata:s:s:0 title='English' -metadata:s:s:0 language='eng' -disposition:s:0 0 \
-metadata:s:s:1 title='English (SDH)' -metadata:s:s:1 language='eng' -disposition:s:1 0 \
"/mnt/media/luxe/import/${MEDIA}/${MEDIA} [Bluray-1080p AV1 OPUS 5.1][EN][EN]-RLSGRP.mkv"
So the big thing here is that I figured out how to run nlmeans_opencl in conjunction with qsv filters via hwdownload and format. This gives you the amazing performance of nlmeans_opencl (compared to vanilla nlmeans) while keeping the flexibility of vpp_qsv and other qsv filters. You can also run software filters BEFORE the first hwupload (e.g., crop, or scale so you can use lanczos or other flags). You could likely run them after as well, or use an alternative like scale_opencl, which supports lanczos as an algorithm. Deinterlacing gets super wonky with this setup; the fix there is to use filter_complex to tie everything together, and that works like a charm.
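As an example of the "software filters before the first hwupload" idea, here's an untested sketch of the video filter chain with the crop and a lanczos scale done in software up front. The nlmeans_opencl and vpp_qsv settings are the same ones from above; since the crop already happened in software, vpp_qsv only does the format conversion:
Code:
-filter:v "crop=${cw}:${ch}:${cx}:${cy},scale=1920:-2:flags=lanczos,hwupload=derive_device=opencl:extra_hw_frames=40,nlmeans_opencl=s=2.3:p=7:pc=3:r=7:rc=3,hwdownload,format=yuv420p,hwupload=derive_device=qsv:extra_hw_frames=40,vpp_qsv=format=p010le"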
Now, I don't really benchmark, but I average around 10x encoding speed with just QSV on 1080p content (it varies widely because that's the nature of media). With nlmeans_opencl I generally only get 0.9-1.7x encoding speed, but I can take a really stubborn source that encodes to around 11-12 GB with QSV alone and get it down to 4-6 GB with very little loss of apparent grain and excellent retention of visual clarity. I've seen anywhere from 30-70% reduction in file size. On to how it's easily modifiable...
Code:
nlmeans_opencl=s=2.3:p=7:pc=3:r=7:rc=3
So this is really the filter of interest. We have strength (s), patch size (p), chroma plane patch size (pc), research window (r), and chroma plane research window (rc); a couple of example tunes follow the list below.
- Cranking up strength obviously denoises more. I've tended to stay between 1.9 and 2.4, as I see a lot of degradation outside those bounds.
- Increasing patch size increases the size (in pixels) of the blocks that get compared when the filter looks for similar areas to average. I leave this at the default, but I've played around and found lower patch sizes work well for retaining fine details that get lost at higher values. High values do very weird things to your video, and moving in either direction affects encoding speed slightly.
- Chroma plane patch size seems to have a less drastic effect. Default is 0, limit is 99. I like the results I get with 3, which was recommended online as a starting place.
- Research window can absolutely tank your encoding speed. Range is 0-99 and the default is 15. With the default I was getting speeds of 0.2-0.6x, which is about what I get with libsvtav1 piped into ffmpeg, and that route gives better end quality and lower file sizes. Increase at your own risk: small adjustments have large effects on speed and, in my experience, limited effect on quality.
- Chroma plane research window is the same, but for chroma. Range is 0-99 with a default of 0. Again, 3 is the starting place my searching suggested. I have cranked this up and not seen much improvement, just a hit to encoding speed.
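To make the "easily modifiable" part concrete, here are a couple of tunes as examples. The first is the one from my command above; the second is a hypothetical lighter-touch variant along the lines of the notes above (lower strength, smaller patch for detail-heavy sources). Treat the exact numbers as starting points to test, not recommendations:
Code:
# my starting point from above
nlmeans_opencl=s=2.3:p=7:pc=3:r=7:rc=3

# hypothetical lighter touch for detail-heavy sources: lower strength, smaller patch
nlmeans_opencl=s=1.9:p=5:pc=3:r=7:rc=3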
So if you have a stubborn source and the ability to use OpenCL filters, I highly recommend cropping your source and giving nlmeans_opencl a try. Play with the strength for each source (which is why automation is difficult) and see what you get. I recently took a source with 22 segments of roughly 15 GB each down to 650 MB-1.7 GB per segment, admittedly with more quality loss than I'd prefer due to bumping up my preferred CRF, but nlmeans_opencl was a godsend for getting that media down to a manageable size.
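If it helps, this is roughly how I'd play with strength on a new source before committing to a full encode. A hypothetical sketch that encodes a 60-second sample at a few strength values so you can eyeball grain retention; the seek point, strength list, and output names are placeholders, and the rest mirrors the command above:
Code:
# Hypothetical strength sweep: encode a 60-second sample at a few s values and compare.
for s in 1.9 2.1 2.3; do
  jffmpeg -init_hw_device vaapi=va:/dev/dri/renderD129 -init_hw_device qsv=qs@va -init_hw_device opencl=ocl@va \
    -ss 10:00 -t 60 -i "${INPUT}" -map 0:v:0 \
    -c:v av1_qsv -preset veryslow -global_quality 22 \
    -filter:v "hwupload=derive_device=opencl:extra_hw_frames=40,nlmeans_opencl=s=${s}:p=7:pc=3:r=7:rc=3,hwdownload,format=yuv420p,hwupload=derive_device=qsv:extra_hw_frames=40,vpp_qsv=format=p010le" \
    "sample_s${s}.mkv"
done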
Jellyfin 10.10.0 LSIO Docker | Ubuntu 24.04 LTS | i7-13700K | Arc A380 6 GB | 64 GB RAM | 79 TB Storage