
    Defining expected music metadata behavior

    Proposing to develop a proposal for a metadata handling spec
    Brandon Hill
    Offline

    Junior Member

    Posts: 7
    Threads: 1
    Joined: 2026 Feb
    Reputation: 0
    #1
    2026-02-19, 05:26 PM
    Sound metadata is half the service a media player provides (the other half is serving the media). And of course it can be the messiest part, given the history of metadata in digital media (different standards, ignored standards, users customizing tags and then sharing files, older ripped data without tags, etc.).

    So what is Jellyfin's guiding philosophy on this problem? Yes, parts of the existing behavior are holdovers from its Emby roots, but if, as a community, we can decide on what it should do and why, life becomes a lot easier for developers who want to contribute.

    On one hand, a default Jellyfin Docker image (which includes plugins) clearly tries to automate some aspects of pulling missing data (like guessing an artist and album in lieu of metadata and pulling images). On the other hand, I've seen it argued at least twice in the development forums that code contributions to improve these automations were unnecessary because users can just hand-edit what they don't like. So does Jellyfin believe in automation, or does it place the onus for sound metadata on the user? And of course, things aren't black and white. Pragmatic issues often dictate the final behavior (so we get a best-effort version of a philosophy).

    Community clarity on Jellyfin's philosophy would help. Even better would be hashing out the expected behavior and the reasoning behind it. What is the order of precedence for determining the metadata values Jellyfin will use, and why? I'm aware this isn't simple because, as communities grow, such software systems try to accommodate a variety of user preferences. So some users eschew external metadata sources, others prefer metadata database A, and others prefer metadata database B. And then some want part of their metadata to come from here and other parts from there. It's complex. The reality is that some logic and reasoning already exist; even if we haven't written them down, they are smeared across the code itself. But without an actual spec, how can anyone say if the behavior is broken? Maybe I'm missing the testing of the metadata decision logic (quite possible), but right now all I see is testing of the metadata parsing.

    I'm sure lots of folks come saying what should and shouldn't be done, but in this case, if we have clarity, I'd be glad to help on the code front. In my case, I only use Jellyfin as a music server. So for me, that is my focus. I do suspect some of the practices worked out here could also be transferred to the movie side.

    The fundamental problem is the lack of guaranteed UUIDs for any digital media. Some tracks aren't likely to have them (say home recordings), and others could but don't (some random rip from before MusicBrainz and others). So if Jellyfin is trying for best-effort automation, it needs an order of precedence for which information it considers most precise in lieu of a UUID. The secondary problem is contradictory information. If tags contradict each other, then which do you choose to believe and why?

    If this has been done, I apologize. I've looked around and haven't found it.
    Brandon Hill
    #2
    2026-02-19, 05:55 PM (This post was last modified: 2026-02-19, 06:02 PM by Brandon Hill. Edited 1 time in total.)
    To start the conversation, for me, this started with GitHub issue #9623. A user had two artists with the name of "Meg" but with different capitalizations. Naturally, they wanted the system to recognize them as different. To explore how the system behaved, I first looked through the code, but the management of such metadata isn't all in one place (especially when plugins are installed). So I created a test setup to see what the system does in different situations.

    Set up a test Jellyfin env
    Code:
    mkdir -p jellyfin_tests/srv/jellyfin/{config,cache}
    mkdir -p jellyfin_tests/media/
    docker run -d -v /home/your_id/jellyfin_tests/srv/jellyfin/config:/config -v /home/your_id/jellyfin_tests/srv/jellyfin/cache:/cache -v /home/your_id/jellyfin_tests/media:/media --net=host jellyfin/jellyfin:latest

    All tests were done with Jellyfin v10.11.6. I selected all the defaults during the server setup. The Docker image automatically contains the MusicBrainz and AudioDB plugins (among others).

    Test 1: No metadata
    Code:
    mkdir -p jellyfin_tests/media/{MEG,Meg}
    mkdir -p jellyfin_tests/media/MEG/BEAM
    mkdir -p jellyfin_tests/media/Meg/VESUVIA
    cd jellyfin_tests/media/MEG/BEAM
    flac --no-padding --force-raw-format --endian=little --sign=signed --channels=2 --bps=16 --sample-rate=44100 -o zero.flac - </dev/null
    cd ../../Meg/VESUVIA
    flac --no-padding --force-raw-format --endian=little --sign=signed --channels=2 --bps=16 --sample-rate=44100 -o zero.flac - </dev/null

    In the Music Library page
    Album view:
    • two albums, each has the correct album name but no artist name
    • correct album art for the BEAM album and MEG fan art for the VESUVIA album
    Album Artist view and Artist view
    • no listings


    In the Metadata Manager it looks like:

    - Music
    -- MEG {Path:/media/MEG, Title:MEG, Date Added:02/18/2026, MusicBrainz Artist Id:b3b665a4-5a57-40f3-9df3-cdbb4e9ade59}
    --- BEAM {Path:/media/MEG/BEAM, Title:BEAM, Date Added:02/18/2026, Year:2007, MusicBrainz Album Id:9c3a1dcc-242a-40d7-98da-0f12f09f1e25, MusicBrainz Release Group Id:56936306-80a3-3f01-8acf-b8371932e051}
    ---- zero.flac {Path:/media/MEG/BEAM/zero.flac, Title:zero, Date Added:02/18/2026, Release Date:01/01/1901}
    -- Meg {Path:/media/Meg, Title:Meg, Date Added:02/18/2026, MusicBrainz Artist Id:b3b665a4-5a57-40f3-9df3-cdbb4e9ade59}
    --- VESUVIA {Path:/media/Meg/VESUVIA, Title:VESUVIA, Date Added:02/18/2026, Year:2013, MusicBrainz Album Id:0a88a49f-3039-4dbc-bf73-156e88382e44, MusicBrainz Release Group Id:1db1fd9a-cf5a-4ebe-87b8-6a60b2d641f6}
    ---- zero.flac {Path:/media/Meg/VESUVIA/zero.flac, Title:zero, Date Added:02/18/2026, Release Date:01/01/1901}

    This is interesting. It won't use the directory name for specifying an artist or album in its own library, but it will use them to pull album art, set a year for the album, and create links to the album in MusicBrainz. This seems like having it both ways. It doesn't want to commit to the error of a wrong album assumption, but it will use the art and year from its guess. This data is also associated with the album but not the track on the album.

    Also, across multiple tests, it doesn't always link the VESUVIA album to the same MEG album: sometimes it links to SAVE, other times to PRECIOUS.

    Test 2: ARTISTS
    I specified not to prefer the 'Artists' tag, but surely it should fall back to that?

    Code:
    cd media/MEG/BEAM
    metaflac --set-tag="ARTISTS=MEG" zero.flac
    cd ../../Meg/VESUVIA
    metaflac --set-tag="ARTISTS=Meg" zero.flac

    Selecting all the defaults during server configuration includes leaving the checkbox empty for "preferring the non-standard Artists tag over the Artist tag during metadata reading." This is actually the only tag that I gave each file in this test. It turns out "prefer" is the wrong wording: the system ignores the tag outright rather than falling back to it when nothing else is available. No change in metadata within Jellyfin.

    The fastest fix is for someone to tweak the wording in the default web UI.
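The gap between "prefer" and "fall back" can be made concrete with a small sketch. This is not Jellyfin's code: `resolve_artists` and both of its flags are hypothetical names, and the tag dictionary simply mirrors the single ARTISTS tag set in Test 2.

```python
def resolve_artists(tags, prefer_artists_tag, fallback_when_missing):
    """Pick the artist list from a dict of upper-cased tag names to values.

    Hypothetical helper for illustration only, not Jellyfin's implementation.
    """
    artist = tags.get("ARTIST")
    artists = tags.get("ARTISTS")

    if prefer_artists_tag and artists:
        return [artists]   # checkbox ticked: ARTISTS wins over ARTIST
    if artist:
        return [artist]    # usual case: ARTIST is present
    if fallback_when_missing and artists:
        return [artists]   # "prefer" read as an ordering, not an on/off switch
    return []              # nothing usable


test2_tags = {"ARTISTS": "Meg"}

# What Test 2 observed: checkbox unticked, tag ignored outright.
observed = resolve_artists(test2_tags, prefer_artists_tag=False, fallback_when_missing=False)

# What the word "prefer" suggests: checkbox unticked, tag still used as a last resort.
suggested = resolve_artists(test2_tags, prefer_artists_tag=False, fallback_when_missing=True)

print(observed, suggested)
```

If the ignoring behavior is intended, only the UI label needs to change; if the fallback reading is intended, the third branch is the missing piece.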

    Test 3: ARTIST
    Code:
    cd media/MEG/BEAM
    metaflac --set-tag="ARTIST=MEG" zero.flac
    cd ../../Meg/VESUVIA
    metaflac --set-tag="ARTIST=Meg" zero.flac

    In the Music Library page

    Album view:
    • two albums: BEAM/MEG and VESUVIA/Meg
    • correct album art for the BEAM album and MEG fan art for the VESUVIA album
    • click on BEAM album and it shows other albums by MEG and links to the album VESUVIA with a date of 2013 (though it was released in 2022)
    • click on VESUVIA album and it shows other albums by MEG and links to the album BEAM
    Album Artist view and Artist view
    • lists both MEG and Meg and both use the same MEG artist image
    • both artist pages list the same albums


    In the Metadata Manager it looks like:

    - Music
    -- MEG {Path:/media/MEG, Title:MEG, Date Added:02/18/2026, MusicBrainz Artist Id:b3b665a4-5a57-40f3-9df3-cdbb4e9ade59}
    --- BEAM {Path:/media/MEG/BEAM, Title:BEAM, Date Added:02/18/2026, Artists:MEG, Album Artists:MEG, Year:2007, MusicBrainz Album Id:9c3a1dcc-242a-40d7-98da-0f12f09f1e25, MusicBrainz Release Group Id:56936306-80a3-3f01-8acf-b8371932e051}
    ---- zero.flac {Path:/media/MEG/BEAM/zero.flac, Title:zero, Date Added:02/18/2026, Artists:MEG, Album Artists:MEG, Release Date:01/01/1901}
    -- Meg {Path:/media/Meg, Title:Meg, Date Added:02/18/2026, MusicBrainz Artist Id:b3b665a4-5a57-40f3-9df3-cdbb4e9ade59}
    --- VESUVIA {Path:/media/Meg/VESUVIA, Title:VESUVIA, Date Added:02/18/2026, Artists:Meg, Album Artists:Meg, Year:2013, MusicBrainz Album Id:0a88a49f-3039-4dbc-bf73-156e88382e44, MusicBrainz Release Group Id:1db1fd9a-cf5a-4ebe-87b8-6a60b2d641f6}
    ---- zero.flac {Path:/media/Meg/VESUVIA/zero.flac, Title:zero, Date Added:02/18/2026, Artists:Meg, Album Artists:Meg, Release Date:01/01/1901}

    No surprise here. It is merely committing to the artist now that metadata has given confirmation. Still doesn't have enough info to pull the correct images.

    Test 4: MUSICBRAINZ_ARTISTID
    Give it more info about the problematic artist (Meg)

    Code:
    cd media/Meg/VESUVIA
    metaflac --set-tag="MUSICBRAINZ_ARTISTID=2ba752f0-cc6e-4472-aeda-00f0b2ca2c10" zero.flac

    In the Music Library pages:
    • No change - two artists, two albums, two album artists, but all use the MEG images, all ascribe the albums to MEG

    In the Metadata Manager, the only change is:
    ---- zero.flac {Path:/media/Meg/VESUVIA/zero.flac, Title:zero, Date Added:02/18/2026, Artists:Meg, Album Artists:Meg, Release Date:01/01/1901, MusicBrainz Other Artist Id:2ba752f0-cc6e-4472-aeda-00f0b2ca2c10}

    The Metadata manager lists the MBID under "MusicBrainz Other Artist Id" but doesn't seem to do anything with it.

    Artist page for Meg lists that she has a MusicBrainz Artist entry, but the link still goes to the MEG MusicBrainz page and associates her with both albums.

    Test 5: MUSICBRAINZ_ALBUMARTISTID
    Give it more info about the problematic album (VESUVIA)

    Code:
    cd media/Meg/VESUVIA
    metaflac --set-tag="MUSICBRAINZ_ALBUMARTISTID=2ba752f0-cc6e-4472-aeda-00f0b2ca2c10" zero.flac

    In the Music Library pages:

    Individual Album page
    • Meg/VESUVIA: there is now a link to the correct MusicBrainz Album Artist (Meg), but all the art is still from MEG and the Album and Release Group links still link to SAVE by MEG.
    Album Artist view and Artist view
    • The Meg page in both views has a MusicBrainz link to MEG, not Meg
    • The Meg page in both views still lists both albums as being by Meg
    • The MEG page in both views still lists both albums as being by MEG
    The Metadata manager.
    ---- zero.flac {Path:/media/Meg/VESUVIA/zero.flac, Title:zero, Date Added:02/18/2026, Artists:Meg, Album Artists:Meg, Release Date:01/01/1901, MusicBrainz Other Artist Id:2ba752f0-cc6e-4472-aeda-00f0b2ca2c10, MusicBrainz Album Artist Id:2ba752f0-cc6e-4472-aeda-00f0b2ca2c10}

    So the track is updated, but that information is only used on the album page. The MusicBrainz links on that album page stay linked to the wrong album. The presence of the contradiction never triggers the retrieval of more accurate information.
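A minimal sketch of the check that seems to be missing, using the two artist IDs from the tests above. The `ArtistEvidence` type and both function names are hypothetical; this only illustrates "a tagged ID that disagrees with a guessed ID should trigger a refresh" and makes no claim about how Jellyfin stores either value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArtistEvidence:
    name: str
    guessed_mbid: Optional[str]  # ID inferred from a name lookup
    tagged_mbid: Optional[str]   # ID read from a MusicBrainz tag in the file


def needs_refresh(evidence):
    """True when a tag supplies an ID that disagrees with the guessed one."""
    return (
        evidence.tagged_mbid is not None
        and evidence.tagged_mbid != evidence.guessed_mbid
    )


def effective_mbid(evidence):
    """Prefer the tagged ID; fall back to the guess only when no tag exists."""
    return evidence.tagged_mbid or evidence.guessed_mbid


MEG_ID = "b3b665a4-5a57-40f3-9df3-cdbb4e9ade59"  # guessed for both folders in Test 1
MEG_LOWER_ID = "2ba752f0-cc6e-4472-aeda-00f0b2ca2c10"  # set by hand in Tests 4 and 5

meg_upper = ArtistEvidence("MEG", guessed_mbid=MEG_ID, tagged_mbid=None)
meg_lower = ArtistEvidence("Meg", guessed_mbid=MEG_ID, tagged_mbid=MEG_LOWER_ID)

for evidence in (meg_upper, meg_lower):
    print(evidence.name, needs_refresh(evidence), effective_mbid(evidence))
```

Under this rule only the Meg entry would be flagged, and its images and album links would be re-fetched with the tagged ID.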

    I understand that the goal is to make the best with limited information.

    In Test 1, there is only information from the file path. It does what it can with this and sends an artist and album name to AudioDB for images. Fine, but even then, it is assuming a particular album from MusicBrainz (but limits its commitment to this guess).

    With actual artist information in Test 3, it does commit. It sees two different artist and album names and respects that, except when it comes to MusicBrainz. It doesn't have enough specificity to give MusicBrainz a chance to retrieve an accurate guess, but it tries anyway despite its design decision to respect the different artist directories. In the case of Meg vs MEG, having two artist entries is correct, but in the case of Black Eyed Peas vs The Black Eyed Peas it is wrong. Meanwhile, the links/images will be messed up for Meg vs MEG, but right for the Black Eyed Peas.

    In Test 4, the system has additional information. Barring bad info in the tags (when all bets should be off anyway), it could realize it has the wrong images for the artist Meg. AudioDB actually accepts MusicBrainz ArtistIDs for image requests, so the correct images could be pulled. It could also realize that it has assumed the wrong album associations for Meg. The MusicBrainz ID, when present, provides more specific information than the other tags. I would argue you always use the most specific information you have when you have it. The system already does this when it tries to determine various values: e.g., first try "albumartist", then "album artist", then "album_artist"; if those are all bust, just use whatever artist list was already determined.
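The fallback chain described above can be written down in a few lines. The three tag names and their order come from the paragraph; the function name, the case-insensitive lookup, and the single-value return are assumptions made for this sketch, not a copy of the server's code.

```python
# Tag names tried in order, as listed in the post.
ALBUM_ARTIST_KEYS = ("albumartist", "album artist", "album_artist")


def album_artists(tags, track_artists):
    """Return the album artist list using a first-match-wins fallback chain.

    `tags` maps tag names to values; `track_artists` is whatever artist list
    was already determined for the track.
    """
    # Assumption: tag names are matched case-insensitively.
    normalized = {key.lower(): value for key, value in tags.items()}
    for key in ALBUM_ARTIST_KEYS:
        value = normalized.get(key)
        if value:
            return [value]
    # All three are missing or empty: reuse the track's artist list.
    return list(track_artists)


print(album_artists({"ALBUM ARTIST": "Meg"}, ["someone else"]))
print(album_artists({"ARTIST": "Meg"}, ["Meg"]))
```

The argument in this post amounts to placing an ID-based lookup at the front of chains like this one whenever a MusicBrainz ID tag is present.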

    Test 5 is more of the same. More specific information is present. Especially in this case, when no Album info was ever present except in the directory name.

    In the default server settings, the system doesn't just guess a MusicBrainz ID when none are present; it then uses that to assume dates and images. When they are present, suddenly they shouldn't be relied on.

    In this case, we have two artists that can't be easily differentiated by spelling, but are clearly differentiated by MB IDs. There isn't even a name decision to be made here.

    There is a related but also opposite case: different names that are the same artist. Only with a UUID like an MBID can you resolve this. A simple example is Black Eyed Peas/The Black Eyed Peas. In these kinds of cases, more questions do arise. For anything with a MusicBrainz ID, that ID should be the main grouping index (functionally; I'm not saying to index the DB by it). It is the most precise information you have. And the name for all works by that ID should be the "canonical" name chosen on MusicBrainz. If there are albums in the collection by the same artist that lack a MusicBrainz ID, do what Jellyfin already does: the best you can with the tags you have. Different artist names without an MBID get different Artist entries. Lookups to AudioDB and MusicDB go by names and not MBIDs, because you don't have any MBIDs. The messiness of this was already current behavior. This merely cleans it up when it can.
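A rough sketch of that grouping rule: group by MBID when one is present, otherwise by the exact (case-sensitive) artist name. The field names, the placeholder ID for the Black Eyed Peas, and the album titles "Album A"/"Album B" are invented for illustration; the MEG and Meg IDs are the ones from the tests above, treated here as if both had been tagged.

```python
from collections import defaultdict


def group_key(album):
    """Group by MBID when available, else by the exact artist name."""
    mbid = album.get("artist_mbid")
    if mbid:
        return ("mbid", mbid)
    return ("name", album["artist"])


def group_albums(albums):
    groups = defaultdict(list)
    for album in albums:
        groups[group_key(album)].append(album["title"])
    return dict(groups)


BEP_ID = "placeholder-bep-id"  # placeholder, not a real MusicBrainz ID

albums = [
    {"artist": "MEG", "title": "BEAM",
     "artist_mbid": "b3b665a4-5a57-40f3-9df3-cdbb4e9ade59"},
    {"artist": "Meg", "title": "VESUVIA",
     "artist_mbid": "2ba752f0-cc6e-4472-aeda-00f0b2ca2c10"},
    {"artist": "Black Eyed Peas", "title": "Album A", "artist_mbid": BEP_ID},
    {"artist": "The Black Eyed Peas", "title": "Album B", "artist_mbid": BEP_ID},
    {"artist": "Home Recording", "title": "Demo", "artist_mbid": None},
]

groups = group_albums(albums)
for key, titles in groups.items():
    print(key, titles)
```

MEG and Meg stay separate because their IDs differ, the two Black Eyed Peas spellings merge because their IDs match, and the untagged home recording falls back to name-based grouping. Choosing the displayed "canonical" name would need a MusicBrainz lookup, which this sketch leaves out.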

    In discussions, I see a common "what if" about contradictory tags. If your tags contradict each other, then you were never guaranteed correct behavior. Any system must make a choice in such cases, and it won't always go the way you want. Trashy or limited input results in less-than-perfect performance. We see Jellyfin do that today with Meg/MEG and the images it grabs and assumptions it makes. But when more precise information is present, why would you favor the ambiguous information?

    Maybe I missed it, but one thing I didn't see in the test cases was tests that verify that every release of the system maintains its metadata decision-making process, assumptions, and tag priority. If that is correct, then that further undermines arguments against change. Small breaking changes in those decisions could already have happened or occur in the next release without being caught. If those tests don't exist, then it seems like it is time to define the desired behavior, the justifications for those behaviors, and encode it in testing. But if that is being visited, then it seems logical that a part of that should be favoring the tags with the most precise information.

    Much of the logic for these decisions currently resides here, but not all of it. Enough seems to be scattered around that I found it fastest to actually test a real system. Hopefully, the command lists provided convince others how easy local testing (at least on Linux) can be for tags. It might help others quickly poke at the behavior and debug what tags are needed to get Jellyfin to behave as they hope. Not everyone wants to set up a complete development environment just to get an answer (even those of us who code for a living). Feel free to post the Windows equivalent if that's how you swing.

    Also, someone has pointed out that in the latest version (v10.11.6), there is a known bug where it can take a day or so for Artists or Albums to be resolved. First, that misses the primary point of most of what I'm trying to discuss above. Second, fine, but how and where is that behavior defined? Third, where are the tests for that behavior? Debugging becomes a madhouse if the forums have to ask, "Did you wait an unspecified amount of time to see if the problem went away through some automation that we presume is there because things eventually changed on our system?"
    Brandon Hill
    #3
    2026-02-22, 03:17 PM
    Ok, so 40 views and no comments. Anyone? Anyone? Bueller?

    Did I write too much? Not controversial enough? Is it just bots reading my post? Did I post in the wrong forum? Someone throw me a lifeline here.
    Emailluka
    Offline

    Junior Member

    Posts: 46
    Threads: 2
    Joined: 2023 Nov
    Reputation: 2
    Country:Austria
    #4
    2026-02-22, 08:54 PM
    Sorry that you won't get a reply from me that will help in any way. I am just a user and have absolutely no clue how to contribute to or fix the problem with music libraries.
    But I wanted to let you know that it's not bots reading your post. Further, I guess that to give a proper answer, people who know about fetching music metadata need some time to review and retest your findings.
    Brandon Hill
    #5
    2026-02-22, 09:04 PM
    Your reply does help.
    Brandon Hill
    #6
    2026-02-26, 05:32 PM (This post was last modified: 2026-02-27, 10:45 PM by Brandon Hill. Edited 1 time in total. Edit Reason: Adds Gist link to the code )
    So, for my own testing, I've developed a simple JSON schema and script that reads a JSON file to generate a series of directories and media files, quickly creating a test library. As the examples above show, this is primarily focused on testing library scanning. There are separate test files to test the actual codec performances here. I'm sharing it because others may find it handy too. I'm not suggesting this needs to be anything official.

    The goals were:
    • create minimal examples to reproduce a behavior/bug (and capture that in a simple JSON)
    • create large libraries for profiling library scanning performance
    • avoid testing on real libraries
    • avoid making disk-consuming test libraries with real media
    • quickly reproduce guaranteed metadata flags by always generating from scratch

    For the tech-savvy who aren't ready to commit to downloading the build environment and building from scratch, this works great with a Docker version of the latest to sanity check the behavior you think you see in your own library. That's how I started. Now I use it to build small media libraries to trigger specific sections of code while I piece together how the metadata scanning works.

    Code:
    # generate the library described in the JSON file in the current directory
    $ ./generate_test_library.sh test_library.json
    # generate the library described in the JSON file in /path/to/output
    $ ./generate_test_library.sh test_library.json /path/to/output

    Drawbacks:
    • This isn't unit test-friendly, but meant for manual testing
    • I'm focused on music, so the movie / tv stuff may be underdeveloped (as you can tell by my sad mockup of movie directories in my example)
    • All files are generated by ffmpeg, but many file types require that your ffmpeg was compiled with specific support
    • File types that are specified by your JSON but not supported by your ffmpeg are generated with the touch command
    • Some ffmpeg flag tweaking may be needed to make this more universal
    • I haven't tested this on Windows
    • As I write this, I realize I need to make a companion script to scan an existing library to generate JSON (which gets tiring to write for larger examples)

    The JSON describes the directory and file layout, the file types, and any metadata to include. I also allow the setting of permissions (hex code only) and ownership for files and directories. Hopefully, the example JSON is self-explanatory. 
    JSON:
    • root: array of file and dir objects
    • dir object: must have name. permissions, owner, and contents are all optional 
    • dir.contents: array of file and dir objects
    • file object: must have name and type (extension). permissions, owner, and metadata are optional
    • file.metadata: an object with all the key/values you like where those key/values are translated into tag/values for metadata.
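To show how a layout like this reads in practice, here is a guess at a small library description plus a walker that lists the files it would produce. The field names follow the list above; everything else is an assumption for this sketch and may differ from the script in the Gist: that only file objects carry "type", that the extension is appended to "name", and the `walk` function itself.

```python
import json
from pathlib import PurePosixPath

EXAMPLE = """
[
  {"name": "MEG", "contents": [
    {"name": "BEAM", "contents": [
      {"name": "zero", "type": "flac", "metadata": {"ARTIST": "MEG"}}
    ]}
  ]},
  {"name": "Meg", "contents": [
    {"name": "VESUVIA", "contents": [
      {"name": "zero", "type": "flac", "metadata": {
        "ARTIST": "Meg",
        "MUSICBRAINZ_ARTISTID": "2ba752f0-cc6e-4472-aeda-00f0b2ca2c10"
      }}
    ]}
  ]}
]
"""


def walk(entries, parent=PurePosixPath(".")):
    """Yield (path, metadata) for every file object, depth first."""
    for entry in entries:
        if "type" in entry:  # assumption: only file objects have "type"
            path = parent / f'{entry["name"]}.{entry["type"]}'
            yield path, entry.get("metadata", {})
        else:
            yield from walk(entry.get("contents", []), parent / entry["name"])


files = list(walk(json.loads(EXAMPLE)))
for path, metadata in files:
    print(path, metadata)
```

A dry listing like this is a cheap way to review a large description before generating any files.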

    [Edit] Attachments weren't working for me. Here is a gist of the script and sample JSON.
    Brandon Hill
    #7
    2026-02-26, 05:39 PM (This post was last modified: 2026-02-27, 10:46 PM by Brandon Hill. Edited 1 time in total. Edit Reason: Points to resolution of the issue. )
    Despite the UI saying "YOUR ALLOCATED ATTACHMENT USAGE QUOTA IS UNLIMITED", it doesn't seem to accept my attachments (I've tried both drag and drop and clicking through the file selection dialog). Any suggestions?

    The Bash script is 451 lines long, and the JSON is about 100. I don't want to just paste them in if it isn't necessary.

    [Edit] Resolved. See the Gist link added to the post above.
    elfalem
    Offline

    Junior Member

    Posts: 2
    Threads: 0
    Joined: 2026 Mar
    Reputation: 0
    #8
    2026-03-08, 01:03 AM
    This is a very complex topic. My suggestion is that the music metadata behavior should have defaults for the most likely use case: a collection of files with reasonably accurate but incomplete metadata, probably with just artist, track title, and album title present in the directory/file structure and/or tags.

    However, it should follow a plugin architecture so that different parts are customizable. For example:
    • user can specify which tag names to look at for artist: "artist" then "artists"
    • user can specify a different way to identify tracks, e.g. create a signature from the audio and use that to find info online
    Brandon Hill
    #9
    11 hours ago
    That matches one of my early notions. For each metadata item that Jellyfin determines, it could have a list of sources and order of precedence for each metadata item (e.g., artist (metadata tag), artists (metadata tag), artist from MusicBrainz album artist, directory name) as you suggest. Then the server could have a web interface that lets you select and order those items. I don't even think it needs to be a plugin. If the default setting matched the historical choices, then upgrades would incur no change.

    As someone else pointed out, though, this means that when the admin changes these settings, it will change it for all users. The current internal database schema doesn't allow for a per-user preference of album and artist, and it seems like a pretty significant change to make that possible. I've argued that it would be a hostile admin who changes things without getting user preferences. I'm not sure that protecting users from a "hostile" admin is an achievable design philosophy. I think it blocks not only this change but also many possible improvements.

    Maybe what should be a plugin is a tool that rescans the library using a desired metadata/precedence schema and generates a report of which items would change under that schema from the current settings. It would allow an admin to test the possible effects of a change and share that with users. It might actually be useful to build this first, to play with the idea before making any changes to the server itself.
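A sketch of what such a dry-run report could look like, assuming the precedence schema is an ordered list of source names. The source names, the item fields, both example orders, and the sample paths are made up for illustration; in particular, `current_order` is a stand-in and not a statement of Jellyfin's actual precedence.

```python
# Each source knows how to pull an artist value out of a scanned item.
SOURCES = {
    "artist_tag": lambda item: item["tags"].get("ARTIST"),
    "artists_tag": lambda item: item["tags"].get("ARTISTS"),
    "mb_album_artist": lambda item: item.get("mb_album_artist_name"),
    "directory": lambda item: item.get("artist_dir"),
}


def resolve(item, order):
    """Return the first non-empty value, trying sources in the given order."""
    for source in order:
        value = SOURCES[source](item)
        if value:
            return value
    return None


def change_report(items, current_order, proposed_order):
    """List (path, before, after) for items whose artist would change."""
    report = []
    for item in items:
        before = resolve(item, current_order)
        after = resolve(item, proposed_order)
        if before != after:
            report.append((item["path"], before, after))
    return report


items = [
    {"path": "/media/Meg/VESUVIA/zero.flac", "tags": {"ARTIST": "Meg"},
     "artist_dir": "Meg"},
    # Hypothetical item: the canonical name here is assumed, not looked up.
    {"path": "/media/BEP/Album A/track.flac", "tags": {"ARTIST": "Black Eyed Peas"},
     "mb_album_artist_name": "The Black Eyed Peas", "artist_dir": "BEP"},
]

current_order = ["artist_tag", "directory"]
proposed_order = ["mb_album_artist", "artist_tag", "directory"]

report = change_report(items, current_order, proposed_order)
for path, before, after in report:
    print(f"{path}: {before!r} -> {after!r}")
```

Because the report only compares two resolutions and writes nothing back, an admin could share it with users before touching any server setting.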
    elfalem
    #10
    3 hours ago
    Quote:I don't even think it needs to be a plugin. If the default setting matched the historical choices, then upgrades would incur no change.
    Yes, that logic specifically doesn't need to be a plugin. But the code that provides the metadata from a specific source (whether online or local) should be a plugin. That would enforce a clear boundary. And default/popular plugins such as MusicBrainz can be shipped with the core server.

    Quote:I'm not sure that protecting users from a "hostile" admin is an achievable design philosophy. I think it blocks not only this change but also many possible improvements.

     I agree. It's impractical for this to be per-user. It should remain as a global configuration that the admin manages.