    Thread Pool Starvation

    All threads pegged near 100%
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #1
    2023-09-04, 03:07 PM
    My system is being flooded with processes from /usr/bin/jellyfin.

    [Image: sid-e42ef379647a1bdb1a6b3468f51b79df0ceb...b?type=raw]

    There are hundreds, possibly thousands, of these entries in htop. Something is obviously hung, and I'd like to know how to investigate it further. I have not rebooted the server, which in my experience does clear it, because I want to find the root cause first.
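
    One way to check whether those htop entries are separate processes or threads of a single process is to compare the process count with the thread count. htop lists userland threads alongside processes unless that is turned off (the H key toggles it), and the systemctl status output further down shows a single PID (732) with Tasks: 3636, so threads of one process look like the more likely explanation. The sketch below assumes a Linux /proc file system and pgrep from procps; the function names are made up for illustration.

```shell
#!/bin/sh
# Count the threads of one process by listing its task directory (Linux only).
count_threads() {
    # $1: PID; each entry under /proc/PID/task is one thread
    ls "/proc/$1/task" 2>/dev/null | wc -l
}

# Count processes whose name matches exactly (pgrep comes from procps).
count_processes() {
    # $1: process name, e.g. jellyfin
    pgrep -x -c "$1"
}

# Example for this thread (PID 732 is taken from the status output):
#   count_processes jellyfin   # number of separate jellyfin processes
#   count_threads 732          # number of threads inside PID 732
```

    If the first number is 1 and the second is in the thousands, the server did not spawn many instances; one instance is running a very large number of threads.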

    More details confirming that the CPU is being starved:

    [Image: sid-81831d7faa9aa79844de86033e2b84486d22...3?type=raw]

    [Image: sid-4d1e1731e230f52b49d43cdda56195b0cf66...c?type=raw]

    ● jellyfin.service - Jellyfin Media Server
        Loaded: loaded (/lib/systemd/system/jellyfin.service; enabled; vendor preset: enabled)
        Drop-In: /etc/systemd/system/jellyfin.service.d
                └─jellyfin.service.conf
        Active: active (running) since Fri 2023-09-01 19:29:02 UTC; 2 days ago
      Main PID: 732 (jellyfin)
          Tasks: 3636 (limit: 18546)
        Memory: 13.7G
            CPU: 2d 19h 15min 58.966s
        CGroup: /system.slice/jellyfin.service
                └─732 /usr/bin/jellyfin --webdir=/usr/share/jellyfin/web --restartpath=/usr/lib/jellyfin/restart.sh --ffmpeg=/usr/lib/jellyfin-ffmpeg/ffmpeg

    Sep 04 14:59:29 jellyfin jellyfin[732]: [14:59:29] [WRN] As of "09/04/2023 14:59:09 +00:00", the heartbeat has been running for "00:00:20.7576155" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 14:59:50 jellyfin jellyfin[732]: [14:59:50] [WRN] As of "09/04/2023 14:59:31 +00:00", the heartbeat has been running for "00:00:10.8371778" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:00:13 jellyfin jellyfin[732]: [15:00:13] [WRN] As of "09/04/2023 14:59:51 +00:00", the heartbeat has been running for "00:00:21.2132112" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:00:42 jellyfin jellyfin[732]: [15:00:42] [WRN] As of "09/04/2023 15:00:22 +00:00", the heartbeat has been running for "00:00:20.3549039" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:00:53 jellyfin jellyfin[732]: [15:00:53] [WRN] As of "09/04/2023 15:00:43 +00:00", the heartbeat has been running for "00:00:10.0880467" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:01:16 jellyfin jellyfin[732]: [15:01:16] [WRN] As of "09/04/2023 15:00:55 +00:00", the heartbeat has been running for "00:00:21.4189816" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:01:25 jellyfin jellyfin[732]: [15:01:25] [WRN] As of "09/04/2023 15:01:18 +00:00", the heartbeat has been running for "00:00:07.2229611" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:01:37 jellyfin jellyfin[732]: [15:01:37] [WRN] As of "09/04/2023 15:01:26 +00:00", the heartbeat has been running for "00:00:10.6905181" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:01:39 jellyfin jellyfin[732]: [15:01:39] [WRN] As of "09/04/2023 15:01:38 +00:00", the heartbeat has been running for "00:00:01.5668541" which is longer than "00:00:01". This could be caused by thread pool starvation.
    Sep 04 15:01:43 jellyfin jellyfin[732]: [15:01:43] [WRN] As of "09/04/2023 15:01:41 +00:00", the heartbeat has been running for "00:00:02.4147117" which is longer than "00:00:01". This could be caused by thread pool starvation.
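
    To see how long the stalls get, the delay can be pulled out of each warning and converted to seconds. This is a minimal sketch that assumes the log format shown above; heartbeat_delays and max_heartbeat_delay are hypothetical helper names, and both read log lines on standard input.

```shell
#!/bin/sh
# Print the heartbeat delay of each warning line on stdin, in seconds.
heartbeat_delays() {
    grep -o 'has been running for "[0-9:.]*"' |
        sed 's/.*for "\(.*\)"/\1/' |
        awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Print the largest delay seen on stdin, in seconds.
max_heartbeat_delay() {
    heartbeat_delays | sort -n | tail -n 1
}

# Example:
#   journalctl -u jellyfin --no-pager | max_heartbeat_delay
```

    Comparing the largest delays against the times of library scans, playback sessions, or NFS hiccups may help narrow down what triggers the starvation.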

    System details

    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 22.04.3 LTS
    Release: 22.04
    Codename: jammy

    NVIDIA-SMI 525.125.06  Driver Version: 525.125.06  CUDA Version: 12.0


    Attached Files
    .zip   09-04-logs.zip (Size: 99.43 KB / Downloads: 52)
    TheDreadPirate
    Offline

    Community Moderator

    Posts: 15,375
    Threads: 10
    Joined: 2023 Jun
    Reputation: 460
    Country:United States
    #2
    2023-09-04, 03:27 PM
    Can you describe your setup? Number of users, GPU used for transcoding, storage for the VM/container, is this storage local or remote? Some local, some remote?
    Jellyfin 10.10.7 (Docker)
    Ubuntu 24.04.2 LTS w/HWE
    Intel i3 12100
    Intel Arc A380
    OS drive - SK Hynix P41 1TB
    Storage
        4x WD Red Pro 6TB CMR in RAIDZ1
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #3
    2023-09-04, 03:30 PM
    (2023-09-04, 03:27 PM)TheDreadPirate Wrote: Can you describe your setup?  Number of users, GPU used for transcoding, storage for the VM/container, is this storage local or remote?  Some local, some remote?

    Day to day, the number of active users can reach 4-6 but is mostly around 2-3.
    GPU is a Quadro P400. Not everything needs to transcode, but I did install the patch for unlocking the limit a while ago.
    Storage for this VM is 200GB for the system; media storage is a local NFS share.
    Venson
    Offline

    Moderator, Server Dev, XBox Maintainer

    Posts: 373
    Threads: 7
    Joined: 2023 Jun
    Reputation: 15
    Country:Germany
    #4
    2023-09-04, 03:32 PM
    Although I cannot put my finger on it, there seems to be something fundamentally wrong with this setup. I see some ffmpeg processes crashing for no apparent reason, lots of network issues with corrupt packets, the playback tracker not being cleaned up, and more. Chapter extractions are also being aborted.

    I don't think it's actually JF's issue, but you really have somehow started tons of JF instances.
    typos are finders, keepers.
    Next Jellyfin release 10.11.0 will be Soon™
    Soon™ is an unregistered trademark of Jellyfin International
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #5
    2023-09-04, 03:38 PM
    (2023-09-04, 03:32 PM)Venson Wrote: Although I cannot put my finger on it, there seems to be something fundamentally wrong with this setup. I see some ffmpeg processes crashing for no apparent reason, lots of network issues with corrupt packets, the playback tracker not being cleaned up, and more. Chapter extractions are also being aborted.

    I don't think it's actually JF's issue, but you really have somehow started tons of JF instances.

    Your comment made me think it might be requests coming from my reverse proxy, but I paused that container and it had no effect. I am watching the CPU counter drop and then shoot back up, so you are right.
    TheDreadPirate
    Offline

    Community Moderator

    Posts: 15,375
    Threads: 10
    Joined: 2023 Jun
    Reputation: 460
    Country:United States
    #6
    2023-09-04, 04:08 PM
    (2023-09-04, 03:30 PM)natzilla Wrote: Storage for this VM is 200GB for the system

    Can you get more specific about the 200GB VM storage?  What I'm trying to get at is whether the storage is local and what file system.  All of the problems here and what Venson mentioned tell me that there is an issue with disk I/O and throughput.

    How many VMs are you running on this machine?
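
    If disk or NFS I/O is the bottleneck, many Jellyfin threads would typically sit in uninterruptible sleep (state D) rather than running (state R). Below is a rough sketch for tallying thread states, assuming a Linux /proc file system; thread_states is a hypothetical helper name, and 732 is the main PID from the status output earlier in the thread.

```shell
#!/bin/sh
# Tally the scheduler state of every thread in a process (Linux only).
# Prints one line per state: a count followed by the state letter (R, S, D, ...).
thread_states() {
    # $1: PID; threads may exit while we read, so errors are silenced
    cat "/proc/$1"/task/*/status 2>/dev/null |
        awk '/^State:/ { print $2 }' |
        sort | uniq -c
}

# Example:
#   thread_states 732
```

    A large share of D-state threads would point toward storage or network file system waits, while mostly R-state threads would point toward CPU-bound work inside the server itself.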
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #7
    2023-09-04, 04:15 PM
    (2023-09-04, 04:08 PM)TheDreadPirate Wrote:
    (2023-09-04, 03:30 PM)natzilla Wrote: Storage for this VM is 200GB for the system

    Can you get more specific about the 200GB VM storage?  What I'm trying to get at is whether the storage is local and what file system.  All of the problems here and what Venson mentioned tell me that there is an issue with disk I/O and throughput.

    How many VMs are you running on this machine?

    Sure. Jellyfin is currently the only VM running on this specific drive in my hypervisor. I have other disks for other VMs but kept Jellyfin on its own. The drive is a Samsung 870 EVO with 500GB total capacity, but I limited the VM disk to 200GB.

    Storage is fully local to the hypervisor. The file system inside the guest should be ext4, and the VM disk is raw.

    [Image: sid-71ea680157634557fa07ea1b08d6b691e7bc...0?type=raw]
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #8
    2023-09-04, 04:29 PM (This post was last modified: 2023-09-04, 04:35 PM by natzilla. Edited 1 time in total.)
    The system appears to have calmed down now. I didn't do anything to it, so I am at a loss. I checked the scheduled tasks page for anything still running, but everything there ran hours ago and took less than a minute.

    Edit: I take that back, the issue has returned.
    natzilla
    Offline

    Junior Member

    Posts: 26
    Threads: 3
    Joined: 2023 Jun
    Reputation: 0
    #9
    2023-10-01, 12:37 AM
    This still appears to be a problem after the 10.8.11 update. It is still very random, and I'm not sure what's causing it.
    pcm
    Offline

    Member

    Posts: 62
    Threads: 4
    Joined: 2024 May
    Reputation: 0
    Country:Uzbekistan
    #10
    2024-06-05, 01:08 AM (This post was last modified: 2024-06-05, 01:59 AM by pcm. Edited 1 time in total.)
    I'm wondering if /usr/lib/jellyfin/restart.sh has something to do with it. This is a wild stab in the dark, but I'm thinking that the jellyfin process somehow believes it is not healthy and keeps trying to restart itself using the restart.sh script.

    Someone familiar with how the --restartpath flag works might be able to weigh in better.

    In the meantime, could you provide the last few lines of journalctl?

    Code:
    journalctl -u jellyfin -n 200 --no-pager