Variable Length Value
Latest revision as of 22:43, 15 January 2013
Variable Length Values (VLVs) are an encoding system for multi-byte integers used to store timing information in standard MIDI files. They are a form of compression which minimizes storage for small integers, yet is still able to handle larger integers. MIDI-file VLVs can be 1–5 bytes long, depending on the size of the number they represent. MIDI-file VLVs always unpack into a four-byte integer. The length of the VLV is controlled by the most-significant bit in each byte: a "1" bit indicates that a VLV continues to the next byte in the data, and a "0" bit indicates the end of a VLV.
Motivation for Variable Length Values
MIDI files contain two basic components: (1) MIDI messages to be sent to a synthesizer and (2) the times when the messages will be sent to the synthesizer. When MIDI and MIDI files were initially designed, storage space on computers was limited, and MIDI-file timestamps were encoded as VLVs to save space. MIDI was created in the early 1980s, when personal computers often didn't have hard disks and floppy disks could store 256KB. Timestamps in MIDI files are differential: they record the amount of time to wait since the previous message was sent. So in an active musical texture, many of the delta timestamps will be integers less than 128. These small numbers can be stored as single-byte VLVs, compared to four bytes if using a fixed-width integer. When space is at a premium, the 3-byte savings is useful; however, with modern storage capabilities, the added encoding complexity becomes pointless.
After header information in MIDI files, the data consists of a list of events. Each event is composed of a timestamp followed by a MIDI message. These time/message pairs follow immediately after each other in a MIDI file, so the order of data in a MIDI file looks like this in one long stream:
time message time message time message time message time message time message
time message time message time message time message time message time message
time message time message time message time message time message time message
time message time message time message time message time message time message
The time value is a measurement of the time to wait between sending the message before that time and the message after that time. This method of specifying the time is called delta time, where delta is a jargon term in mathematics and related fields which means "difference". A delta time is therefore a time value which specifies the duration (or time difference) between two events.
Note that MIDI messages usually come after each other rather quickly—especially if chords are played. So the predominant time between each MIDI message in a MIDI file will usually be small. This is the reason that VLVs are used to store MIDI-file timestamps, since they take up less space for small numbers.
MIDI files, like plain MIDI data, are stored in the form of bytes. Bytes can be thought of as numbers in the range from 0 to 255, since a byte is a binary number with 8 digits, which can store numbers from 00000000 to 11111111 in base 2. This poses a problem in binary data such as MIDI files. The MIDI file is just a stream of numbers, each of which is in the range from 0 to 255. How do you tell where a time or message starts and stops? For example, here is some data from a MIDI file (in hexadecimal notation):
00 ff 58 04 04 02 30 08 00 ff 59 02 00 00 00 90 3c 28 81 00 90 3c 00 00 90 3c 1e 81 00 90 3c 00 00 90 43 2d 81 00 90 43 00 00 90 43 32 81 00 90 43 00 00 90 45 2d 81 00 90 45 00 00 90 45 32 81 00 90 45 00 00 90 43 23 82 00 90 43 00 00 90 41 32 81 00 90 41 00 00 90 41 2d 81 00 90 41 00 00 90 40 32 40 90 40 00 40 90 40 28 40 90 40 00 40 90 3e 2d 40 90 3e 00 40 90 3e 32 40 90 3e 00 40 90 3c 1e 82 00 90 3c 00 00 ff 2f 00
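This can easily be parsed with knowledge of VLVs and the MIDI protocol into the following list of time-stamped MIDI messages. The raw delta-time bytes are shown in the first column and the uncompressed delta times (in decimal format) are shown in the second column.

DELTA TIME (raw)   DELTA TIME (decimal)   MESSAGE
00                 0                      ff 58 04 04 02 30 08
00                 0                      ff 59 02 00 00
00                 0                      90 3c 28
81 00              128                    90 3c 00
00                 0                      90 3c 1e
81 00              128                    90 3c 00
00                 0                      90 43 2d
81 00              128                    90 43 00
00                 0                      90 43 32
81 00              128                    90 43 00
00                 0                      90 45 2d
81 00              128                    90 45 00
00                 0                      90 45 32
81 00              128                    90 45 00
00                 0                      90 43 23
82 00              256                    90 43 00
00                 0                      90 41 32
81 00              128                    90 41 00
00                 0                      90 41 2d
81 00              128                    90 41 00
00                 0                      90 40 32
40                 64                     90 40 00
40                 64                     90 40 28
40                 64                     90 40 00
40                 64                     90 3e 2d
40                 64                     90 3e 00
40                 64                     90 3e 32
40                 64                     90 3e 00
40                 64                     90 3c 1e
82 00              256                    90 3c 00
00                 0                      ff 2f 00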
Notice that time values in the above list are sometimes two bytes long and sometimes a single byte. Whenever a VLV byte is greater than or equal to hex 80 (decimal 128), the VLV is not yet finished, and at least one more byte follows as part of the same VLV. When a byte less than hex 80 is encountered, this marks the end of the VLV.
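The end-of-VLV rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `decode_vlv` and its return convention are assumptions made for this example and are not taken from any particular MIDI library.

```python
def decode_vlv(data, offset=0):
    """Decode one VLV starting at data[offset].

    Returns a tuple (value, number_of_bytes_consumed).
    The name and return shape are hypothetical, chosen for this sketch.
    """
    value = 0
    length = 0
    while True:
        if offset + length >= len(data):
            raise ValueError("data ended in the middle of a VLV")
        byte = data[offset + length]
        length += 1
        # Keep the low 7 data bits and append them to the value so far.
        value = (value << 7) | (byte & 0x7F)
        # A byte below hex 80 has its top bit clear, so the VLV ends here.
        if byte < 0x80:
            return value, length


print(decode_vlv(bytes.fromhex("00")))      # (0, 1)
print(decode_vlv(bytes.fromhex("8100")))    # (128, 2)
print(decode_vlv(bytes.fromhex("40903e")))  # (64, 1)
```

Returning the number of bytes consumed lets a caller step past the delta time to the MIDI message that follows it.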
DELTA TIME    MESSAGE
00 00 00 00   ff 58 04 04 02 30 08
00 00 00 00   ff 59 02 00 00
00 00 00 00   90 3c 28
The people who designed the structure of the Standard MIDI file were concerned about storage space. The MIDI protocol was created in the early 1980s, when PCs often didn't have hard disks and floppy disks held 256KB of information at most. Now, in the year 1999, a 1GB hard disk is considered small (and it could hold around 4,000 256KB floppies). We can't easily change the MIDI file format now, however, because there are lots of MIDI files and programs that read MIDI files which would become obsolete for no significant improvement in the file format other than to make it easier for humans to read (which they shouldn't be doing anyway).
Since delta times are usually small, we can usually use just one byte to store the time. However, if delta times are large, then more than one byte is needed to store the delta time. Therefore, by using VLVs, the size of a MIDI file is reduced. So, you can think of VLVs as a form of compression.
Definition of Variable Length Values
A MIDI file Variable Length Value is stored in bytes. Each byte has two parts: 7 bits of data and 1 continuation bit. The highest-order bit is set to 1 if there is another byte of the number to follow. The highest-order bit is set to 0 if this byte is the last byte in the VLV. To recreate a number represented by a VLV, first remove the continuation bit from each byte and then concatenate the leftover bits into a single number. To generate a VLV from a given number, break the number up into 7-bit units and then apply the correct continuation bit to each byte. In theory, a VLV could be arbitrarily long and represent an arbitrarily large number; however, in the standard MIDI file specification, the maximum length of a VLV is 5 bytes, and the number it represents cannot be larger than 4 bytes.
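The encoding direction described above (split into 7-bit units, then set the continuation bits) might look like the following Python sketch. The function name `encode_vlv` is hypothetical, and the four-byte upper limit simply follows the limit stated in the text.

```python
def encode_vlv(number):
    """Encode a non-negative integer as VLV bytes (hypothetical helper)."""
    # Range check follows the text: the value must fit in four bytes.
    if number < 0 or number > 0xFFFFFFFF:
        raise ValueError("number must fit in an unsigned four-byte integer")
    # Break the number into 7-bit units, least-significant unit first.
    units = [number & 0x7F]
    number >>= 7
    while number > 0:
        units.append(number & 0x7F)
        number >>= 7
    # VLVs store the most-significant unit first.
    units.reverse()
    # Set the continuation bit (hex 80) on every byte except the last.
    return bytes([unit | 0x80 for unit in units[:-1]] + [units[-1]])


print(encode_vlv(0).hex())           # 00
print(encode_vlv(128).hex())         # 8100
print(encode_vlv(0xFFFFFFFF).hex())  # 8fffffff7f
```

The last line shows why a four-byte number can need five VLV bytes: 32 data bits do not fit into four 7-bit units.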
Examples
Consider the simplest VLV, a single byte written in binary notation:
00000000
First segregate the continuation bit from the data bits for the value:
0 0000000
What is the continuation bit? It is zero. This means that there are no more bytes to follow this byte for the delta time's value. Now work out what number the data bits represent. 0000000 in binary notation is equal to 0 hex, or 0 decimal. Therefore the VLV 00000000 is equal to 0.
Now look at a harder example: 81 00 (hex). Since 81 is larger than 80, you can see that the 00 byte is also part of the VLV. Let's look at 81 00 in binary notation:
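81 00 (hex) in binary:                       10000001 00000000
Continuation bits separated from data bits:  1 0000001   0 0000000
Data bits only:                              0000001 0000000
Data bits concatenated:                      00000010000000
Regrouped into 4-bit units:                  00 0000 1000 0000
Padded on the left with zeros:               0000 0000 1000 0000
Each 4-bit unit as a hex digit:              0 0 8 0
So the value is 0080 hex, which is the same as 80 hex:
80 hex = 8 * 16^1 + 0 * 16^0 = 8 * 16 + 0 = 128 decimal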
DELTA TIME   MESSAGE
0            ff 58 04 04 02 30 08
0            ff 59 02 00 00
0            90 3c 28
128          90 3c 00
0            90 3c 1e
128          90 3c 00
0            90 43 2d
128          90 43 00
0            90 43 32
128          90 43 00
0            90 45 2d
128          90 45 00
0            90 45 32
128          90 45 00
0            90 43 23
256          90 43 00
0            90 41 32
128          90 41 00
0            90 41 2d
128          90 41 00
0            90 40 32
64           90 40 00
64           90 40 28
64           90 40 00
64           90 3e 2d
64           90 3e 00
64           90 3e 32
64           90 3e 00
64           90 3c 1e
256          90 3c 00
0            ff 2f 00
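The list above can be reproduced from the raw hexadecimal data with a short parsing sketch. It is deliberately limited: it assumes the stream contains only the two kinds of messages that appear in this example (meta messages starting with ff, and three-byte note-on messages starting with 90), and it does not handle running status or any other message type. The names `read_vlv` and `parse_events` are hypothetical.

```python
# Track data copied from the hexadecimal example earlier in this article.
EXAMPLE = bytes.fromhex(
    "00 ff 58 04 04 02 30 08 00 ff 59 02 00 00 00 90 3c 28 81 00 90 3c 00 "
    "00 90 3c 1e 81 00 90 3c 00 00 90 43 2d 81 00 90 43 00 00 90 43 32 "
    "81 00 90 43 00 00 90 45 2d 81 00 90 45 00 00 90 45 32 81 00 90 45 00 "
    "00 90 43 23 82 00 90 43 00 00 90 41 32 81 00 90 41 00 00 90 41 2d "
    "81 00 90 41 00 00 90 40 32 40 90 40 00 40 90 40 28 40 90 40 00 "
    "40 90 3e 2d 40 90 3e 00 40 90 3e 32 40 90 3e 00 40 90 3c 1e "
    "82 00 90 3c 00 00 ff 2f 00"
)


def read_vlv(data, pos):
    """Return (value, position just after the VLV)."""
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        value = (value << 7) | (byte & 0x7F)
        if byte < 0x80:  # top bit clear: last byte of the VLV
            return value, pos


def parse_events(data):
    """Split the stream into (delta_time, message_bytes) pairs."""
    events = []
    pos = 0
    while pos < len(data):
        delta, pos = read_vlv(data, pos)
        status = data[pos]
        if status == 0xFF:
            # Meta message: ff, a type byte, a VLV length, then the data.
            length, data_start = read_vlv(data, pos + 2)
            end = data_start + length
        elif status == 0x90:
            # Note-on message: status byte plus key and velocity.
            end = pos + 3
        else:
            raise ValueError("message type not handled by this sketch")
        events.append((delta, data[pos:end]))
        pos = end
    return events


for delta, message in parse_events(EXAMPLE):
    print(delta, message.hex())
```

Each printed line pairs a decimal delta time with the message bytes that follow it, in the same order as the list above.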
What are the units of measurement for delta times? In a MIDI file, the units are arbitrary, and you have to look at the header of the MIDI file to see what the units mean. For this example, the time units are 128 ticks to the quarter note, so 128 is a quarter-note duration, 256 is a half-note duration, and 64 is an eighth-note duration.
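As a small illustration of that arithmetic, the sketch below converts delta times in ticks into durations measured in quarter notes. The value of 128 ticks per quarter note is the one given for this example; in a real file it would have to be read from the header. The function name `quarter_notes` is hypothetical.

```python
from fractions import Fraction

# Ticks per quarter note for this example only; a real program
# would take this value from the MIDI file header.
TICKS_PER_QUARTER = 128


def quarter_notes(delta_ticks, ticks_per_quarter=TICKS_PER_QUARTER):
    """Express a delta time as an exact number of quarter notes."""
    return Fraction(delta_ticks, ticks_per_quarter)


for ticks in (64, 128, 256):
    print(ticks, "ticks =", quarter_notes(ticks), "quarter notes")
# 64 ticks = 1/2 quarter notes
# 128 ticks = 1 quarter notes
# 256 ticks = 2 quarter notes
```

Using `Fraction` keeps durations such as an eighth note exact instead of relying on floating-point division.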
For the class, you can use the program vlv (vlv.cpp) to convert between variable length values and the numbers that they represent (or vice versa).
Here are some more equivalences between VLVs and the numbers they represent: