While we’ll never be able to cover every possible edge case, I’d like to propose a standardized test content collection.
Initially it would be a static collection of files. As it grows, an archive of the files would still be kept (just in case), but since much of the content can come from well-known sources with (hopefully) reliable links, the collection could become primarily a download script.
You would run it once to download the base test content, and future updates could then be fetched on demand. Since the test library would consist of free content, items would rarely, if ever, be removed; updates would simply be additive.
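To make the idea concrete, here is a rough sketch of what such a script could look like. Everything in it is an assumption on my part: the manifest layout, the names `MANIFEST`, `pending_items`, and `sync`, and the placeholder URL are all hypothetical. It only shows the additive behaviour: files already on disk are left alone, and only entries missing locally are fetched.

```python
import hashlib
import urllib.request
from pathlib import Path

# Hypothetical manifest: a relative path, a source URL, and an optional SHA-256.
# The entry below is a placeholder, not real test content.
MANIFEST = [
    {"path": "audio/sample.flac", "url": "https://example.com/sample.flac", "sha256": None},
]


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large media files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pending_items(manifest, dest: Path):
    """Return the manifest entries whose files are not yet present under dest."""
    return [item for item in manifest if not (dest / item["path"]).exists()]


def download(url: str, target: Path) -> None:
    """Default fetcher: a plain HTTP(S) download using the standard library."""
    urllib.request.urlretrieve(url, target)


def sync(manifest, dest: Path, fetch=download):
    """Fetch only the missing files; files already on disk are left untouched."""
    fetched = []
    for item in pending_items(manifest, dest):
        target = dest / item["path"]
        target.parent.mkdir(parents=True, exist_ok=True)
        fetch(item["url"], target)
        expected = item.get("sha256")
        if expected and sha256_of(target) != expected:
            # Discard the bad download so the next run retries it.
            target.unlink()
            raise ValueError(f"checksum mismatch for {item['path']}")
        fetched.append(item["path"])
    return fetched
```

With something like this, the first run of `sync(MANIFEST, Path("test-content"))` would pull everything, and later runs against a longer manifest would pull only the new entries. Passing the fetcher in as a parameter is just a convenience so the logic can be exercised without touching the network.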
A good historical example of packaged incremental (diff) updates working well is the High Voltage SID Collection: https://hvsc.c64.org/
For years, they have maintained a collection of items and have always provided a path for keeping it up to date.