©2001 by Joe Monzo
--- In tuning@y..., jpehrson@r... wrote:
http://groups.yahoo.com/group/tuning/message/23368

> I thought somebody said, was it John DeLaubenfels (??)
> that the MIDI unit was not a set division, but could be
> altered somewhat, so it is not a permanent measure...
> Did I get that wrong (??)

Hi Joe,

I had originally coined the term "midipu" (= MIDI Pitch-bend Unit) and defined it as 1/4096 = 1/(2^12) of a Semitone. This was based on my use of pitch-bend in Cakewalk.

Manuel corrected me and posted a link to the official MIDI tuning specification webpage, stating correctly that the finest tuning resolution available in MIDI is 1/16384 = 1/(2^14) of a Semitone, and that the figure in my definition was merely a less-finely-resolved choice that Cakewalk had made. IIRC, John deL. chimed in in agreement.

(I have since renamed that measurement a "cawapu" = CAkeWAlk Pitch-bend Unit, and changed my definition of midipu to agree with Manuel's.)

So there is much variability in how different manufacturers choose to implement the MIDI tuning spec, but the spec itself offers 1/(2^14) Semitone as the limit of resolution.

Now, here's my long essay on how it all works....

-----------------------------------------------------
GENTLE INTRODUCTION TO THE MIDI TUNING SPECIFICATION
by Joe Monzo
-----------------------------------------------------

Much of this is going to be elementary for anyone who knows anything about how computers work internally. But following my explanation should shed some light on how MIDI tuning and pitch-bend work.

I'm going to explain four numbering systems that all have a bearing on learning how to use the MIDI tuning specification:

1. decimal      base_10
2. binary       base_2
3. hexadecimal  base_16
4. octal        base_8

Think in terms of a prime-factor vector, which you've seen me (and Graham Breed) use here many times before.
This use of a prime-factor vector was the confusing aspect of my HEWM notation which Joe Pehrson cited from my paper a couple of months back (in April 2001 posts to the Tuning List), where I omit the primes themselves and just string the exponents out in a series.

Understanding bits and bytes is similar to this. The difference is that instead of representing exponents of prime-factors, the numbers represent values of the base-number raised to successive exponents.

Different numbering systems
---------------------------

For example, let's start with the system you're most familiar with: decimal, also called base_10. Of course I realize everyone understands this, but it's necessary to spell out the procedure.

There are 10 decimal digits: 0 1 2 3 4 5 6 7 8 9. These can be thought of as indicating the value of a number of individual "units".

When we've used up all 10 digits and must continue counting, how do we do it? We imagine that these numbers may appear in a "place" which has a specific meaning: each place refers to an exponent of the base number, and any digit in that place indicates how many of that power are counted.

So to get to the number "ten", simply form a new place to the left of the original one and make that place represent not units but "tens". In math, the difference is indicated by: unit = 10^0, ten = 10^1.

So placing a "1" in the "tens" place indicates 1 "ten", or simply 10. The zero in the units place shows that the total value is only 10 and not more than 10.

Then we cycle thru the units place again until we reach 19, and then go back to zero in the units and bump the tens up to 2, with the result of 20. Etc., etc.

The next place to the left is 100s (= 10^2), the next to the left of that 1000s (= 10^3), etc. So in other words the number 111, for example, really means:

  1*(10^2) + 1*(10^1) + 1*(10^0)
  = 100 + 10 + 1
  = 111.

Now on to binary, also called base_2. There are only 2 binary digits: 0 and 1.
The reason why this is so effective for use in computers is that electronic switches can relay data by being in one of two states: on or off.

The word "bit" is a contraction of the two words "b-inary dig-it". A "byte" is an 8-digit binary string.

So how do we get any numbers bigger than 1 in binary? Easy... the same way as in the decimal system. Each place to the left will be a higher exponent of 2. So the right-most place is 2^0 = 1, the next to the left is 2^1 = 2, the next is 2^2 = 4, the next is 2^3 = 8, etc.

So we cycle thru the whole set by placing first a zero, then a one, in each successive place:

  binary   decimal
       0 = 0
       1 = 1
      10 = 2
      11 = 3
     100 = 4
     101 = 5
     110 = 6
     111 = 7
    1000 = 8
  etc.

So, for example, the number 7 [decimal] is represented in binary with "1"s filling the first three places:

  1 1 1 [_base-2]
  = 1*(2^2) + 1*(2^1) + 1*(2^0)
  = 4 + 2 + 1
  = 7 [_base-10]

The next number, 8 [decimal], is exactly the third power of 2, so it has a "1" filling the 2^3 place and zeros in all the others, and looks like this in binary: 1000.

  1 0 0 0 [_base-2]
  = 1*(2^3) + 0*(2^2) + 0*(2^1) + 0*(2^0)
  = 8 + 0 + 0 + 0
  = 8 [_base-10]

The next number system I'll discuss is hexadecimal, or hex for short, also called base_16. Long strings of 1s and 0s are difficult to comprehend visually, so hex is used by programmers as a convenient shorthand for binary. Its combination of numbers and letters is much easier for humans to parse.

In this system, each place can hold numbers which represent 0 thru 15. But we need to keep our "digits" to, obviously, a single digit, so we invoke the first few letters of the alphabet after we pass 9.

Let's continue counting from the table above, this time with the results in hex rather than decimal (you only see a difference after 9).
I'll give the decimal value at the end, just so you can see what it is:

  binary   hex   decimal
    1001 = 9   = 9
    1010 = A   = 10
    1011 = B   = 11
    1100 = C   = 12
    1101 = D   = 13
    1110 = E   = 14
    1111 = F   = 15

Because 1 hex digit can represent exactly 4 binary digits, this is sometimes found to be a useful grouping, and is called a "nibble" (I've also seen it spelled "nybble"). A nibble is exactly half a byte... get it?

I'm mentioning nibbles here because they play a role in understanding how the MIDI tuning spec uses the data contained in each byte of a signal. I'll use nibble-size binary groupings below to illustrate the hex numbers that we'll come across.

Also note that it's easy to specify binary strings of 1s in decimal format in a manner which makes obvious their binary derivation, by bumping up to the next higher power of 2 and adding "-1". For example, our last number in the above table, 1111 [binary] = 10000 [binary] minus 1, or in decimal, (2^4)-1 = 16 - 1 = 15, which = F in hex.

After F [hex], which = 15 [decimal], we get to 16 [decimal] by using the same procedure as before: bump up to the next higher exponent of 16 and start over again. So a "1" in the next higher place means 16^1, and in the next higher place after that, 16^2, etc.

So the highest 2-digit number in hex is FF [hex], which equals

  (15 * (16^1)) + (15 * (16^0))
  = (15 * 16) + (15 * 1)
  = 240 + 15
  = 255

The next higher number would be 256 [decimal], which is exactly the second power of 16, so we put a "1" in the third place followed by two zeros: 100 [hex] = 256 [decimal]. So, in hex, FF + 1 = 100.

There's an intermediate numbering system called octal, which is (you guessed it) base_8. This system is a bit easier to understand because it works just like decimal but only has digits 0 thru 7. After 7 comes 10 [octal] = 8 [decimal]. In fact it's somewhat akin to our usual diatonic musical numbering system which uses A B C D E F G, then starts again at A when we reach the next "octave".
Octal is not much used these days, hex being preferred. But because of the structure of the MIDI tuning data protocol, octal plays an important role in MIDI tuning calculations.

Now, with that out of the way, on to the specifics of the MIDI tuning data format.

From the official MIDI website:

> Frequency data shall be sent via system exclusive
> messages. Because system exclusive data bytes have
> their high bit set low, containing 7 bits of data,
> a 3-byte (21-bit) frequency data word is used for
> specifying a frequency with the suggested resolution.

In other words, the left-most bit or place in each of the three 8-digit binary strings called a "byte" which belong to a tuning command is set to 0. In this case that zero is simply a flag to let the hardware know that these 3 bytes are data bytes inside a SysEx message rather than status bytes.

This is extremely important, because it means that the rules of enumeration which I elaborated above are not quite followed. There's a different system in use here that's called an "offset".

More from the MIDI website:

> Frequency data shall be defined in units which are
> fractions of a semitone. The frequency range starts
> at MIDI note 0, C = 8.1758 Hz, and extends above
> MIDI note 127, G = 12543.875 Hz. The first byte of
> the frequency data word specifies the nearest
> equal-tempered semitone below the frequency. The next
> two bytes (14 bits) specify the fraction of 100 cents
> above the semitone at which the frequency lies.
> Effective resolution = 100 cents / 214 = .0061 cents.

There's a serious typo on the webpage here, in the denominator of that fraction: "214" should really be 2^14, so that effective resolution should be given as 100 cents / 2^14 = ~0.006103516 cent.

Thus, the greatest possible MIDI tuning resolution is equivalent to 1200 / (0.006103516) = 196608-EDO, or 2^14 = 16384 midipus per Semitone.
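As a quick check of the resolution arithmetic above, here is a minimal Python sketch. The constant and variable names are illustrative choices, not anything defined by the MIDI spec.

```python
# Resolution of the MIDI tuning spec, as described in the text:
# two 7-bit data bytes give 14 bits for the fraction of a semitone.
SEMITONE_CENTS = 100.0
FRACTION_BITS = 14

steps_per_semitone = 2 ** FRACTION_BITS                  # midipus per Semitone
resolution_cents = SEMITONE_CENTS / steps_per_semitone   # size of one step in cents
steps_per_octave = 12 * steps_per_semitone               # the equivalent EDO

print(steps_per_semitone)   # 16384
print(resolution_cents)     # 0.006103515625
print(steps_per_octave)     # 196608
```

Dividing 1200 cents by the step size gives the same 196608 figure quoted above.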
Also, the frequencies may be specified to a more accurate number of decimal places than those published in the MIDI spec, which is particularly important in the case of the low frequencies.

Let's begin by illustrating the nature of the MIDI data. I'll use variables s and m, to stand for Semitone bit and midipu bit, respectively:

  0sssssss 0mmmmmmm 0mmmmmmm

So in other words, the highest value that any of these bytes can have is 1111111 [binary] = (2^7)-1 [decimal], which equals 127. This is the same as saying:

  (2^6)+(2^5)+(2^4)+(2^3)+(2^2)+(2^1)+(2^0)
  = 64 + 32 + 16 + 8 + 4 + 2 + 1
  = 127.

127 [decimal] = 7F [hex], because the first "nibble" is 0111 [binary] = 7 [hex], and the second "nibble" is 1111 [binary] = F [hex].

So all the values in the three MIDI data bytes must be between 0 and 127 [decimal], which is the same as between 0 and 7F [hex].

The Semitone component
------------------------

This is why there are 128 possible different MIDI notes, numbered from C0 to G10; or, to put it another way which relates to the illustration above, 128 different semitone divisions of the total pitch-space.

In Semitones, this is:

  10 "octaves" + the highest "octave" of C + a "5th"
  = (12 * 10) + 1 + 7
  = 128 total MIDI-notes.

Let's examine some specific MIDI-note numbers to see how it works.

First let's try the familiar "octave". This is the 12th note above the starting note. Recapping the hex table I gave above, we see that:

  hex   decimal
    9 = 9
    A = 10
    B = 11
    C = 12

So in hex the 12th note would be the digit "C":

  0C [hex]
  = (0 * (16^1)) + (12 * (16^0)) [decimal]
  = (0 * 16) + (12 * 1)
  = 0 + 12
  = 12 [decimal], or an "octave"
  = MIDI-note C1.

Let's try the highest hex digit, F. Remember, F [hex] = 15 [decimal]:

  0F [hex]
  = (0 * (16^1)) + (15 * (16^0)) [decimal]
  = (0 * 16) + (15 * 1)
  = 0 + 15
  = 15 [decimal]
  = (15 / 12) = 1 & 3/12 "octaves" above C0
  = an "octave" + a "minor 3rd" above C0
  = MIDI-note Eb1/D#1.
And to find our highest MIDI-note:

  7F [hex]
  = (7 * (16^1)) + (15 * (16^0)) [decimal]
  = (7 * 16) + (15 * 1)
  = 112 + 15
  = 127 [decimal]
  = (127 / 12) = 10 & 7/12 "octaves" above C0
  = 10 "octaves" + a "perfect 5th" above C0
  = MIDI-note G10.

Now let's reverse the procedure, so that for any given MIDI-note we find the hex value.

Let's find "middle-C":

  = MIDI-note C5.
  = 5 "octaves" above C0
  = 5 "octaves" * 12 Semitones
  = 60 [decimal] (MIDI-note number 60)

  (60 / 16 = 3 & 12/16, therefore...)

  60 [decimal]
  = 48 + 12
  = (3 * 16) + (12 * 1)
  = (3 * (16^1)) + (12 * (16^0)) [decimal]
  = 3C [hex]

Since A-440 Hz is the MIDI tuning reference, let's find that:

  = MIDI-note A5.
  = 5 "octaves" + a "major 6th" above C0
  = 5 & 9/12 "octaves" above C0
  = ((5 * 12) + 9) Semitones
  = (60 + 9) Semitones
  = 69 [decimal] (MIDI-note number 69)

  (69 / 16 = 4 & 5/16, therefore...)

  69 [decimal]
  = 64 + 5
  = (4 * 16) + (5 * 1)
  = (4 * (16^1)) + (5 * (16^0)) [decimal]
  = 45 [hex]

I chose these MIDI-notes deliberately because they appear in the table on the MIDI spec website. This explanation should make it easier to understand that table.

Knowing the MIDI-note number of A-440 Hz enables us to calculate more accurate frequency values to replace those given on the MIDI website (one needs to be especially careful when rounding off very low frequencies):

  Frequency of lowest MIDI-note
  = 440 Hz / (ratio of A5:C0)
  = 440 / (2^((69-0)/12)) Hz
  = ~8.175798916 Hz

  Frequency of highest MIDI-note
  = (ratio of A5:G10) * 440 Hz
  = (2^((127-69)/12)) * 440 Hz
  = ~12543.85395 Hz

The pitch-bend component
------------------------

That's easy enough for the semitone component of the tuning spec, because it only occupies one byte. But for the fraction-of-a-semitone component (or pitch-bend component), which occupies *two* bytes, the math is a bit more complicated.
You can't simply keep bumping up to the next higher exponent of your base as in normal calculation, because the MIDI spec requires that the first bit of each MIDI data byte must be a zero in order to flag it as a data byte.

That zero in what is called the most significant bit throws the calculation off by one exponent. (It's called the "most significant bit" because it has the highest *potential* value in its byte, even tho in the MIDI spec it is actually equal to zero.)

Here's the solution.

127 [decimal] = 7F [hex] is the highest value we can have in any of the three tuning data bytes. So if our first data byte (the one to the right) has a value of 7F [hex] = 127 [decimal], we can't simply use the regular 80 [hex] to represent 128 [decimal]. We have to skip over the predetermined zero in the most significant bit of this byte, and put a "1" into the next available place, which would be the least significant bit of the next higher byte.

In binary notation, we may designate the mandatory zero in the highest bit with an "x", to illustrate that it cannot be used in our calculation.

This ends up giving us a rather bizarre combination of octal and hex in our calculations, and has made MIDI tuning math more complicated than it probably needed to be. I will refer to this as "octal-hex" in my labels.

We will also find that it is easier to understand the octal-hex combination if we divide the bytes into two nibbles for the purposes of binary notation, and if we use zeros as place-holders in the unused places of the octal-hex numbers and divide them into bytes.

Thus, in effect, the two pitch-bend data bytes are divided into 4 nibbles which are counted in the pattern: octal - hex - octal - hex (from left to right).

So if we start now at:

  127 [decimal]
  = 00 7F [octal-hex]
  = x000 0000 x111 1111 [binary],

this gives a tuning inflection of 0.775146484 (= 3175/4096) cent.

The next number is:

  128 [decimal]
  = 01 00 [octal-hex]
  = x000 0001 x000 0000 [binary].
This gives a tuning inflection of 0.78125 (= 25/32) cent.

So we can see that 01 00 [octal-hex], instead of representing 256 [decimal] as in a regular hex calculation, will now represent 128 [decimal].

So now we may cycle thru all the possible combinations of digits in the lower (right-most) byte until we fill all the places with their highest digit (which is "1" in binary), which would give us:

  x000 0001 x111 1111 [binary]
  = 01 7F [octal-hex]
  = 128 + 127 [decimal]
  = 255 [decimal].

This gives a tuning inflection of ~1.556396484 (= 1 + 2279/4096) cents.

The next number is:

  256 [decimal]
  = 02 00 [octal-hex]
  = x000 0010 x000 0000 [binary]

This gives a tuning inflection of 1.5625 (= 1 + 9/16) cents.

So this rather complicated calculation is achieved by treating the left byte the same way as the right one, then multiplying it by 128, then adding both bytes together.

Alternatively, perhaps it is easier to think of each hex digit as a certain exponent of 2 which follows an alternating irregular pattern:

         4       3       4        ... pattern of exponent increase
        / \     / \     / \
    2^11    2^7     2^4     2^0   ... exponent of 2
  = 2048    128     16      1     ... decimal value

Since this interrupted pattern (i.e., the mandatory zero bit that doesn't count in the calculation) is a non-standard kind of math, let's cycle thru all the remaining pairs of numbers where the next "place" changes, to be absolutely clear on how it works.

  = x000 0010 x111 1111 [binary]
  = 02 7F [octal-hex]
  = 383 [decimal]
  Tuning inflection: 2.337646484 (= 2 + 1383/4096) cents.

  = x000 0011 x000 0000 [binary]
  = 03 00 [octal-hex]
  = 384 [decimal]
  Tuning inflection: 2.34375 (= 2 + 11/32) cents.

  = x000 0011 x111 1111 [binary]
  = 03 7F [octal-hex]
  = 511 [decimal]
  Tuning inflection: 3.118896484 (= 3 + 487/4096) cents.

  = x000 0100 x000 0000 [binary]
  = 04 00 [octal-hex]
  = 512 [decimal]
  Tuning inflection: 3.125 (= 3 + 1/8) cents.

  = x000 0111 x111 1111 [binary]
  = 07 7F [octal-hex]
  = 1023 [decimal]
  Tuning inflection: 6.243896484 (= 6 + 999/4096) cents.
  = x000 1000 x000 0000 [binary]
  = 08 00 [octal-hex]
  = 1024 [decimal]
  Tuning inflection: 6.25 (= 6 + 1/4) cents.

  = x000 1111 x111 1111 [binary]
  = 0F 7F [octal-hex]
  = 2047 [decimal]
  Tuning inflection: 12.49389648 (= 12 + 2023/4096) cents.

  = x001 0000 x000 0000 [binary]
  = 10 00 [octal-hex]
  = 2048 [decimal]
  Tuning inflection: 12.5 (= 12 + 1/2) cents.

  = x001 1111 x111 1111 [binary]
  = 1F 7F [octal-hex]
  = 4095 [decimal]
  Tuning inflection: 24.99389648 (= 24 + 4071/4096) cents.

  = x010 0000 x000 0000 [binary]
  = 20 00 [octal-hex]
  = 4096 [decimal]
  Tuning inflection: 25 cents.

  = x011 1111 x111 1111 [binary]
  = 3F 7F [octal-hex]
  = 8191 [decimal]
  Tuning inflection: 49.99389648 (= 49 + 4071/4096) cents.

  = x100 0000 x000 0000 [binary]
  = 40 00 [octal-hex]
  = 8192 [decimal]
  Tuning inflection: 50 cents.

  = x111 1111 x111 1111 [binary]
  = 7F 7F [octal-hex]
  = 16383 [decimal]
  Tuning inflection: 99.99389648 (= 99 + 4071/4096) cents.

Whew! Still with me? We made it thru the hardest part.

Since 7F 7F [octal-hex] = 16383 [decimal] is the highest possible value, there are a total of 16384 = 2^14 possible divisions of the Semitone in the MIDI tuning spec. This is how I found the error in the fraction on the MIDI website.

Most instruments and software do not take full advantage of this super-fine resolution. As the MIDI spec says:

> An instrument which does not support the full
> suggested resolution may discard any unneeded
> lower bits on reception, but it is preferred
> where possible that full resolution be stored
> internally, for possible transmission to other
> instruments which can use the increased resolution.

Cakewalk [TM] 2.0, the MIDI sequencer I use, gives a tuning resolution of 4096 = 2^12 cawapus (a new term I just coined) per Semitone. Thus it ignores the first two bits available in the spec, and therefore gives a range of possible values from 0 to 4095 [decimal] = 00 00 to 1F 7F [hex].
In other words, the first nibble can only be a 0 or 1 in all four numbering systems considered here:

  let "x" designate the two bits that cannot be used because they are reserved for the SysEx flag.

  let "y" designate the two bits that Cakewalk's tuning spec cannot recognize.

  the cawapu spec uses a total of 1+4+3+4 = 12 bits.

  thus, the maximum possible value is:

    xyy1 1111 x111 1111 [binary]
    = 1 F 7 F [hex]
    = 4095 [decimal]

  since the leading nibble can only designate a binary digit, the cawapu data stream is thus really a weird progression of binary-hex-octal-hex.

-monz
http://www.monz.org
"All roads lead to n^0"
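The two-byte arithmetic worked through in the pitch-bend section (treat each byte as a 7-bit value, multiply the left one by 128, add the right one) can be sketched in Python. This is only an illustration under that description: the function names `fraction_from_bytes`, `fraction_to_bytes` and `inflection_cents` are made up for this example and do not come from the MIDI spec or any library.

```python
STEPS_PER_SEMITONE = 2 ** 14  # 16384 divisions of the Semitone


def fraction_from_bytes(msb, lsb):
    """Combine two 7-bit data bytes into one 14-bit value (0..16383)."""
    for byte in (msb, lsb):
        if not 0 <= byte <= 0x7F:
            raise ValueError("a MIDI data byte must have its high bit clear")
    # Shifting left by 7 skips the mandatory zero bit: same as msb * 128 + lsb.
    return (msb << 7) | lsb


def fraction_to_bytes(value):
    """Split a 14-bit value back into its (left, right) data bytes."""
    if not 0 <= value < STEPS_PER_SEMITONE:
        raise ValueError("value must fit in 14 bits")
    return value >> 7, value & 0x7F


def inflection_cents(value):
    """Tuning inflection in cents above the semitone for a 14-bit value."""
    return value * 100 / STEPS_PER_SEMITONE


print(fraction_from_bytes(0x00, 0x7F))   # 127
print(fraction_from_bytes(0x01, 0x00))   # 128
print(fraction_to_bytes(4096))           # (32, 0), i.e. 20 00
print(inflection_cents(8192))            # 50.0
```

Using a shift and a mask rather than ordinary hex place values is what the "octal-hex" pattern amounts to: each byte contributes only its low 7 bits.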
MIDI note 0 is the 12edo "C" which is 5 "8ves" plus a "major-6th" below A-440, which = 440 * 2^(-5-(9/12)) = ~8.175798916 Hz, which is ~1/4355 of a cent lower than the published figure.
MIDI note 127 is the 12edo "G" which is 4 "8ves" plus a "minor-7th" above the reference tone of A-440, which = 440 * 2^(4+(10/12)) = ~12543.85395 Hz, which is ~1/344 of a cent lower than the published figure. (The published figure appears to be the result of an erroneous calculation, because rounding off intermediate values in the calculation results in other values which are all different from it.)
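The two corrected frequencies, and the size of the discrepancy from the published figures, can be reproduced with a short Python sketch. It assumes only what the text states (12edo, MIDI note 69 = A-440 Hz); the helper names `note_frequency` and `cents_between` are illustrative.

```python
import math

A440_NOTE = 69  # MIDI-note number of the A-440 Hz tuning reference


def note_frequency(note):
    """Frequency in Hz of a 12edo MIDI note number."""
    return 440.0 * 2 ** ((note - A440_NOTE) / 12)


def cents_between(f_high, f_low):
    """Interval in cents from f_low up to f_high."""
    return 1200 * math.log2(f_high / f_low)


lowest = note_frequency(0)      # about 8.175798916 Hz
highest = note_frequency(127)   # about 12543.85395 Hz

# How far the published figures sit above the computed ones, in cents.
print(cents_between(8.1758, lowest))
print(cents_between(12543.875, highest))
```

The two printed values are tiny fractions of a cent, in line with the ~1/4355 and ~1/344 cent figures given above.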
Errata on official MIDI tuning webpage
--------------------------------------
As stated above, there are several errors on the official MIDI tuning page.
At the end of the section titled "FREQUENCY DATA FORMAT" is a table titled "Examples of frequency data:" (almost halfway down the page).
Below I give the correct figures for Hz, using 8 decimal places of precision instead of the 4 given on the MIDI page, along with some additional data showing MIDI-note, pitch-bend amount in both tetradekamus and cents, and ratios from the tuning reference of A-440 Hz.
The "7F 7F 7F" command is reserved to indicate "no change", thus the highest possible frequency obtainable in MIDI is 13289.6566 Hz. ("14mu" is my abbreviation for "tetradekamu".)
  MIDI        MIDI   --pitch-bend--   ratio
  freq.data   note   +14mus  +cents   from A-440    Hz

  00 00 00 =    0        0   0.0000    0.018581361      8.17579892
  00 00 01 =    0        1   0.0061    0.018581427      8.17582774
  01 00 00 =    1        0   0.0000    0.019686266      8.66195722
  0C 00 00 =   12        0   0.0000    0.037162722     16.35159783
  3C 00 00 =   60        0   0.0000    0.594603558    261.62556530
  3D 00 00 =   61        0   0.0000    0.629960525    277.18263098
  44 7F 7F =   68    16383  99.9939    0.999996474    439.99844877
  45 00 00 =   69        0   0.0000    1              440.00000000
  45 00 01 =   69        1   0.0061    1.000003526    440.00155124
  78 00 00 =  120        0   0.0000   19.02731384    8372.01808962
  78 00 01 =  120        1   0.0061   19.02738092    8372.04760546
  7F 00 00 =  127        0   0.0000   28.50875898   12543.85395142
  7F 00 01 =  127        1   0.0061   28.50885949   12543.89817521
  7F 7F 7E =  127    16382  99.9878   30.20376504   13289.65661609
  7F 7F 7F =   --       --       --   --            --

The most egregious error in the table is the second note in the list, which the official MIDI page gives as "00 00 01 = 8.2104 Hz". The interval between this note and the first one is ~7.3111 cents, whereas it states explicitly in the text that 1 unit of pitch-bend equals only 0.0061 cents! The actual frequency data needed to obtain this frequency would be 00 09 2E -- quite a difference! This must have been the result of an error in the calculation.
The other errors are much smaller, and are probably the result of rounding various values at some point in the calculation. (Remember that the larger difference in numbers for the higher frequencies doesn't actually sound as big as it looks, because we perceive pitch logarithmically.)
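Rows of the corrected table can be regenerated with a small Python sketch that converts a 3-byte frequency data word to Hz and back. It rests on the reading of the spec given in this essay (first byte = semitone, next two bytes = 14-bit fraction, A-440 at note 69, "7F 7F 7F" reserved); `decode_frequency` and `encode_frequency` are hypothetical helper names, not part of any MIDI library.

```python
import math

STEPS = 2 ** 14  # tetradekamus (14mus) per Semitone


def decode_frequency(b0, b1, b2):
    """Hz for a 3-byte frequency data word, or None for the reserved 7F 7F 7F."""
    if (b0, b1, b2) == (0x7F, 0x7F, 0x7F):
        return None  # reserved to mean "no change"
    semitones = b0 + ((b1 << 7) | b2) / STEPS
    return 440.0 * 2 ** ((semitones - 69) / 12)


def encode_frequency(hz):
    """Nearest 3-byte frequency data word for a frequency in Hz."""
    units = round((69 + 12 * math.log2(hz / 440.0)) * STEPS)
    note, fraction = divmod(units, STEPS)
    if not 0 <= note <= 127:
        raise ValueError("frequency is outside the MIDI tuning range")
    return note, fraction >> 7, fraction & 0x7F


print(decode_frequency(0x00, 0x00, 0x01))   # about 8.17582774 Hz
print(decode_frequency(0x7F, 0x7F, 0x7E))   # about 13289.6566 Hz
print(["%02X" % b for b in encode_frequency(8.2104)])   # ['00', '09', '2E']
```

Encoding the page's erroneous 8.2104 Hz gives the 00 09 2E word mentioned above, which confirms that the published figure cannot belong to 00 00 01.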
updated:
2003.07.05 -- added last section correcting errata on official MIDI page
2002.10.27 -- added more information on cawapus
2001.7.29
2001.5.21