Extra metadata in .vvv

This is a draft. I think it would be good to have one standard format for extra metadata (like song titles and textual notes) in .vvv files.


The existing .vvv format

The .vvv format is normally structured in two parts: 128 song headers (despite only 16 being used), followed by all the songs concatenated.

Each of the 128 song headers is structured as follows in a C/C++ struct:

	struct resourceheader
	{
		char name[48];     /* 48 bytes */
		int start;         /* 4 bytes */
		int size;          /* 4 bytes */
		bool valid;        /* 1 byte, plus 3 bytes padding */
	};
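
The byte counts in the comments above depend on the compiler (4-byte int, 1-byte bool, 4-byte alignment). Here is a minimal sketch for checking that on your own platform. The enum names and the function vvv_layout_ok are my own and don't come from the game's source.

```c
#include <stdbool.h>
#include <stddef.h>   /* offsetof */

struct resourceheader
{
	char name[48];
	int start;
	int size;
	bool valid;
};

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_TABLE_SIZE   = VVV_HEADER_COUNT * VVV_HEADER_SIZE  /* 7680 bytes */
};

/* Returns 1 if this compiler lays the struct out the way the .vvv
   header table expects, 0 otherwise. True on common platforms, but
   the C standard does not guarantee it. */
int vvv_layout_ok(void)
{
	return sizeof(struct resourceheader) == VVV_HEADER_SIZE
	    && offsetof(struct resourceheader, name)  == 0
	    && offsetof(struct resourceheader, start) == 48
	    && offsetof(struct resourceheader, size)  == 52
	    && offsetof(struct resourceheader, valid) == 56;
}
```

If the check passes, the 128 headers take up 128 * 60 = 7680 bytes, so song data starts at that offset.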
	

For the songs in VVVVVV, the name attribute is always one of the following:

data/music/0levelcomplete.ogg
data/music/1pushingonwards.ogg
data/music/2positiveforce.ogg
data/music/3potentialforanything.ogg
data/music/4passionforexploring.ogg
data/music/5intermission.ogg
data/music/6presentingvvvvvv.ogg
data/music/7gamecomplete.ogg
data/music/8predestinedfate.ogg
data/music/9positiveforcereversed.ogg
data/music/10popularpotpourri.ogg
data/music/11pipedream.ogg
data/music/12pressurecooker.ogg
data/music/13pacedenergy.ogg
data/music/14piercingthesky.ogg
data/music/predestinedfatefinallevel.ogg

As far as I know, the game relies on the name matching one of these expected values: it looks songs up by these filenames and won't search for any other names. The most you could do by changing them is reorder songs, or keep a song from playing in VVVVVV at all.

The start attribute is not used at all. In fact, it's not even initialized in Terry's original music packing code, so it may be completely random.

The size attribute is the size of the song in bytes. Note that this is stored as little-endian, so length 0x00112233 is stored as 33 22 11 00. Depending on the way you read/write music files, you may not even have to be aware of this. For invalid songs (see below), this can be uninitialized.
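
If you'd rather not depend on the byte order of the machine you're running on, the field can be read and written byte by byte. This is a small sketch; read_le32 and write_le32 are names I made up for it.

```c
#include <stdint.h>

/* Decode a 32-bit little-endian value from 4 bytes,
   independent of the host's byte order. */
uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0]
	     | ((uint32_t)p[1] << 8)
	     | ((uint32_t)p[2] << 16)
	     | ((uint32_t)p[3] << 24);
}

/* Encode a 32-bit value as 4 little-endian bytes. */
void write_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xFF);
	p[1] = (unsigned char)((v >> 8) & 0xFF);
	p[2] = (unsigned char)((v >> 16) & 0xFF);
	p[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

Writing 0x00112233 with write_le32 produces the bytes 33 22 11 00, matching the example above.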

The valid attribute basically indicates how many songs there are; the first song marked invalid and all the ones after will not be loaded. In practice, songs 0-15 are valid and the rest invalid. The 2.0 music file used to mark song 15 as invalid as well, since the Predestined Fate remix didn't exist yet.
Because of alignment, three bytes after valid are used as padding, which may be uninitialized random bytes.
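
The "first invalid header ends the list" rule can be sketched as follows. It works on the raw 7680-byte header table. It assumes any non-zero valid byte counts as valid, which is my reading and isn't confirmed against the game's code. The names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_VALID_OFFSET = 56   /* offset of the valid byte in a header */
};

/* Returns how many songs would be loaded. `table` must point to the
   128 * 60 bytes of headers. Headers are scanned in order and the
   scan stops at the first one marked invalid, so a "valid" header
   after that point is ignored. */
size_t vvv_count_valid(const unsigned char *table)
{
	size_t i;

	for (i = 0; i < VVV_HEADER_COUNT; i++) {
		if (table[i * VVV_HEADER_SIZE + VVV_VALID_OFFSET] == 0)
			break;
	}
	return i;
}
```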

The format can be represented more graphically as follows, assuming not all 128 songs are 'valid':

path [48] [4] size T [3] /* song header 0 */
path [48] [4] size T [3] /* song header 1 */
.
.
.
path [48] [4] size T [3] /* last valid song header (usually 15) */
null bytes [48] [4] [4] F [3] /* first invalid song header (usually 16) */
.
.
.
null bytes [48] [4] [4] F [3] /* song header 126 */
null bytes [48] [4] [4] F [3] /* song header 127 */

song 0 data
[size 0..2^31, given in header 0]
 

song 1 data
[size 0..2^31, given in header 1]
 
.
.
.

last valid song data (usually 15)
[size 0..2^31, given in corresponding header]
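
Since start is unused, the position of each song's data has to be worked out from the sizes: song 0 starts right after the header table, and every next song starts where the previous one ends. Here is a rough sketch of that calculation. All names are hypothetical, and it treats size as an unsigned 32-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_SIZE_OFFSET  = 52,
	VVV_VALID_OFFSET = 56
};

uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
	     | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills offsets[i] with the file offset of song i's data for every
   valid song and returns the number of valid songs. `offsets` must
   have room for 128 entries. */
size_t vvv_song_offsets(const unsigned char *table, uint64_t *offsets)
{
	uint64_t pos = (uint64_t)VVV_HEADER_COUNT * VVV_HEADER_SIZE; /* 7680 */
	size_t i;

	for (i = 0; i < VVV_HEADER_COUNT; i++) {
		const unsigned char *h = table + i * VVV_HEADER_SIZE;
		if (h[VVV_VALID_OFFSET] == 0)
			break;
		offsets[i] = pos;
		pos += read_le32(h + VVV_SIZE_OFFSET);
	}
	return i;
}
```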
 

Extra metadata

What would be cool to store is stuff like song names and textual notes, without breaking VVVVVV of course, and while keeping compatibility with existing music files (we want to reduce the chance of uninitialized garbage data being interpreted as data in the new format). That way, people could make .vvvs without having to keep separate text files or level notes. It would also be nice to have some fields for the music file in general, like album name or information about the level it was made for.

Everything that follows is basically up for discussion and modification. To be clear, don't implement anything here yet, this is still a draft. I should also rewrite this to be shorter, more of a documentation, and less explaining what the rationale behind everything is. Also, todo: given the documentation of .vvv above, give a (TLDR) summary of everything that's different in the new format. Like, as if you have a changelog, but worded slightly differently. "The start field is now used for..., this is added to there..."

Inspired by what Alexia said on Discord, the best thing to do would probably be to add all the extra information at the end of the music file. Information about a song could include a file name, song name, and further comments. I think each of these strings should be terminated with a null byte (including the last), and that it'd be a good idea to use the unused start field in the header to indicate the total size of that song-specific metadata, including the final null. That means, just like size indicates how long one song's data is, start would indicate how long one song's metadata is.

If a song has no extra metadata defined, the data may be absent altogether, with the start field set to 0. It may also be present with every field empty, in which case the start field is set to the length of this 'empty' data, which is just the number of null terminators. (This all assumes we only want strings as attributes; if we want integer/boolean/bitflag types, we should put those at the start.)
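
To make the size rule concrete, here is a sketch of how a writer could compute the value for start from a list of strings. The function name and the idea of passing the fields as an array are assumptions of mine, since the actual field list isn't decided yet.

```c
#include <stddef.h>
#include <string.h>

/* Size in bytes of one song's metadata block when it consists only
   of strings: each string contributes its length plus one null
   terminator. With zero fields the block is absent (size 0). */
size_t vvv_md_size(const char *const *fields, size_t count)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < count; i++)
		total += strlen(fields[i]) + 1;
	return total;
}
```

For three fields that are all empty this gives 3, the 'empty' data case. For no fields at all it gives 0.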

The music file as a whole could also have extra metadata. This could be put after all the songs' metadata, and its total length could be indicated in the start field of the last invalid song header (song 127).

To prevent uninitialized garbage data in start fields from being interpreted as lengths of metadata, we could set the size of the last invalid song (song 127) to the total size of all metadata combined. Then, only if the sum of all start fields for songs 0-15 and song 127 equals size of song 127 (assuming songs 0-15 are valid), the file can be considered to have valid extra metadata. I think it would be unlikely that this equation adds up with uninitialized garbage memory, even if it was a bunch of 01 (therefore solving the problem), unless all garbage memory was zeroes. Which would be perfect, since it means all songs have no extra metadata defined.
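
On the writing side, this comes down to filling in three kinds of fields. Here is a sketch under the assumptions of this draft. The function and enum names are made up, valid_count must be at most 127, and overflow of the 32-bit sum is not handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_START_OFFSET = 48,
	VVV_SIZE_OFFSET  = 52
};

void write_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xFF);
	p[1] = (unsigned char)((v >> 8) & 0xFF);
	p[2] = (unsigned char)((v >> 16) & 0xFF);
	p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Stores each valid song's metadata size in its start field, the
   file metadata size in the start field of header 127, and the
   total of all of those in the size field of header 127. */
void vvv_set_md_fields(unsigned char *table, const uint32_t *md_sizes,
                       size_t valid_count, uint32_t fd_size)
{
	unsigned char *last = table + (VVV_HEADER_COUNT - 1) * VVV_HEADER_SIZE;
	uint32_t sum = fd_size;
	size_t i;

	for (i = 0; i < valid_count; i++) {
		write_le32(table + i * VVV_HEADER_SIZE + VVV_START_OFFSET, md_sizes[i]);
		sum += md_sizes[i];
	}
	write_le32(last + VVV_START_OFFSET, fd_size);
	write_le32(last + VVV_SIZE_OFFSET, sum);
}
```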

Here's what the format would look like now:
(md.size: size of this song's metadata, fd.size: size of the file's metadata, md.sum: sum of all md.size values plus fd.size)

path [48] md.size size T [3] /* song header 0 */
path [48] md.size size T [3] /* song header 1 */
.
.
.
path [48] md.size size T [3] /* last valid song header (usually 15) */
null bytes [48] [4] [4] F [3] /* first invalid song header (usually 16) */
.
.
.
null bytes [48] [4] [4] F [3] /* song header 126 */
null bytes [48] fd.size md.sum F [3] /* song header 127 */

song 0 data
[size 0..2^31, given in header 0]
 

song 1 data
[size 0..2^31, given in header 1]
 
.
.
.

last valid song data (usually 15)
[size 0..2^31, given in corresponding header]
 

song 0 metadata (md)
[size 0..2^31, given in header 0 as start]
 

song 1 metadata (md)
[size 0..2^31, given in header 1 as start]
 
.
.
.

last valid song metadata (md) (usually 15)
[size 0..2^31, given in corresponding header as start]
 

file metadata (fd)
[size 0..2^31, given in header 127 as start]
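
With this layout, a reader can find every metadata block from the header table alone. This is a sketch with made-up names. It assumes the file has already been checked to contain extra metadata, and it treats start and size as unsigned 32-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_START_OFFSET = 48,
	VVV_SIZE_OFFSET  = 52,
	VVV_VALID_OFFSET = 56
};

struct vvv_md_layout {
	size_t   valid_count;               /* number of valid songs */
	uint64_t song_md[VVV_HEADER_COUNT]; /* file offset of each song's metadata */
	uint64_t file_md;                   /* file offset of the file metadata */
};

uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
	     | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

void vvv_locate_metadata(const unsigned char *table, struct vvv_md_layout *out)
{
	uint64_t pos = (uint64_t)VVV_HEADER_COUNT * VVV_HEADER_SIZE;
	size_t n = 0;
	size_t i;

	/* Skip over the song data of every valid song. */
	while (n < VVV_HEADER_COUNT
	       && table[n * VVV_HEADER_SIZE + VVV_VALID_OFFSET] != 0) {
		pos += read_le32(table + n * VVV_HEADER_SIZE + VVV_SIZE_OFFSET);
		n++;
	}
	/* The song metadata blocks follow in the same order. */
	for (i = 0; i < n; i++) {
		out->song_md[i] = pos;
		pos += read_le32(table + i * VVV_HEADER_SIZE + VVV_START_OFFSET);
	}
	out->valid_count = n;
	out->file_md = pos; /* file metadata comes last */
}
```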
 

Format of song metadata

TODO, but imo, first fixed-size things if they exist, then strings, which would just be one after the other, each of them ended by a null. Maybe we could even make the strings start with a key and an "=", like: title=Song's title, but maybe not.
Again, DRAFT
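
Whatever the fields end up being, walking a run of null-terminated strings would look roughly like this. This is only a sketch of the "strings one after the other" idea. The function name is hypothetical, and it doesn't handle fixed-size fields or key=value splitting.

```c
#include <stddef.h>
#include <string.h>

/* Returns the string that starts at *pos inside blob[0..len) and
   advances *pos past its null terminator. Returns NULL when the
   block is exhausted or the remaining bytes have no terminator
   (a malformed block). */
const char *vvv_md_next(const unsigned char *blob, size_t len, size_t *pos)
{
	const unsigned char *start;
	const unsigned char *end;

	if (*pos >= len)
		return NULL;
	start = blob + *pos;
	end = memchr(start, 0, len - *pos);
	if (end == NULL)
		return NULL;
	*pos = (size_t)(end - blob) + 1;
	return (const char *)start;
}
```

A reader would start with pos at 0 and call this until it returns NULL, with len set to the block's size from the start field.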

Format of file metadata

TODO, but imo same text as above

Loading

For loading a .vvv with extra metadata, a program supporting this format must add up the start fields of all valid songs and the start field of song 127, and check whether this sum is equal to the size field of song 127. If these values are not equal, then the file should be interpreted as having no extra metadata. If the values are equal, the metadata may be loaded as documented in this document.
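
Here is a sketch of that check on the raw header table. The names are hypothetical. Two choices here are my own assumptions and not part of the draft: the fields are read as unsigned 32-bit values, and the file is rejected if header 127 is itself marked valid.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, used only for this illustration. */
enum {
	VVV_HEADER_COUNT = 128,
	VVV_HEADER_SIZE  = 60,
	VVV_START_OFFSET = 48,
	VVV_SIZE_OFFSET  = 52,
	VVV_VALID_OFFSET = 56
};

uint32_t read_le32(const unsigned char *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
	     | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the start fields add up to the size field of
   header 127, 0 otherwise. */
int vvv_has_extra_metadata(const unsigned char *table)
{
	const unsigned char *last = table + (VVV_HEADER_COUNT - 1) * VVV_HEADER_SIZE;
	uint64_t sum = 0; /* 64-bit so garbage values cannot wrap around */
	size_t i;

	/* Assumption: the scheme needs header 127 to be an invalid song. */
	if (last[VVV_VALID_OFFSET] != 0)
		return 0;

	for (i = 0; i < VVV_HEADER_COUNT - 1; i++) {
		const unsigned char *h = table + i * VVV_HEADER_SIZE;
		if (h[VVV_VALID_OFFSET] == 0)
			break;
		sum += read_le32(h + VVV_START_OFFSET);
	}
	sum += read_le32(last + VVV_START_OFFSET);
	return sum == read_le32(last + VVV_SIZE_OFFSET);
}
```

A table where all of these fields are zero passes the check, which just means no song has extra metadata.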

Saving

When saving a .vvv with extra metadata, a program supporting this format must:

Decisions to be made in this