©2001 by Joe Monzo
(NOTE: Versions of this webpage prior to 2014.0812 used a MIDI-note numbering which started with C0 as the lowest note. On that date the octave-numbers were revised downward by one, to conform to standard usage; thus C-1 is the lowest MIDI-note, C4 is "middle-C", A4 is the standard tuning reference of treble-A at 440 Hz, etc.)

--- In tuning@y..., jpehrson@r... wrote:
http://groups.yahoo.com/group/tuning/message/23368

> I thought somebody said, was it John DeLaubenfels (??)
> that the MIDI unit was not a set division, but could be
> altered somewhat, so it is not a permanent measure...
> Did I get that wrong (??)

Hi Joe,

I had originally coined the term "midipu" (= MIDI Pitch-bend Unit) and defined it as 1/4096 = 1/(2^12) of a Semitone. This was based on my use of pitch-bend in Cakewalk.

Manuel corrected me, posting a link to the official MIDI tuning specification webpage and stating correctly that the finest tuning resolution available in MIDI is 1/16384 = 1/(2^14) of a Semitone, and that the figure in my definition was merely a less-finely-resolved choice that Cakewalk had made. IIRC, John deL. chimed in in agreement.

(I have since renamed that measurement a "cawapu" = CAkeWAlk Pitch-bend Unit, and changed my definition of midipu to agree with Manuel's. In 2001, both of these terms were superseded by the whole family of units called "mus", for "MIDI-units": dodekamu, abbreviated 12mu = 1/4096 Semitone, and tetradekamu, abbreviated 14mu = 1/16384 Semitone.)

So there is much variability in how different manufacturers choose to implement the MIDI tuning spec, but the spec itself offers 1/(2^14) Semitone as the limit of resolution.

Now, here's my long essay on how it all works....

-----------------------------------------------------
GENTLE INTRODUCTION TO THE MIDI TUNING SPECIFICATION
by Joe Monzo
-----------------------------------------------------

Much of this is going to be elementary for anyone who knows anything about how computers work internally.
But following my explanation should shed some light on how MIDI tuning and pitch-bend work.

I'm going to explain four numbering systems that all have a bearing on learning how to use the MIDI tuning specification:

1. decimal      base_10
2. binary       base_2
3. hexadecimal  base_16
4. octal        base_8

Think in terms of a prime-factor vector, which you've seen me (and Graham Breed) use here many times before. This use of a prime-factor vector was the confusing aspect of my HEWM notation which Joe Pehrson cited from my paper a couple of months back (in April 2001 posts to the Tuning List), where I omit the primes themselves and just string the exponents out in a series (since then defined by Gene Ward Smith as a monzo).

Understanding bits and bytes is similar to this. The difference is that instead of representing exponents of prime-factors, the numbers represent multiples of the base-number raised to successive exponents.

Different numbering systems
---------------------------

For example, let's start with the system you're most familiar with: decimal, also called base_10. Of course I realize everyone understands this, but it's necessary to spell out the procedure.

There are 10 decimal digits: 0 1 2 3 4 5 6 7 8 9. These can be thought of as indicating the value of a number of individual "units".

When we've used up all 10 digits and must continue counting, how do we do it? We imagine that these digits may appear in a "place" which has a specific meaning: each place refers to an exponent of the base number, and any digit in that place indicates how many times that power of the base is counted.

So to get to the number "ten", simply form a new place to the left of the original one and make that place represent not units but "tens". In math, the difference is indicated by: unit = 10^0, ten = 10^1.

So placing a "1" in the "tens" place indicates 1 "ten", or simply 10. The zero in the units place shows that the total value is only 10 and not more than 10.
Then we cycle thru the units place again until we reach 19, and then go back to zero in the units and bump the tens value up to 2, with the result of 20. Etc., etc.

The next place to the left is 100s (= 10^2), the next to the left of that 1000s (= 10^3), etc. So in other words the number 111, for example, really means:

1*(10^2) + 1*(10^1) + 1*(10^0)
= 100 + 10 + 1
= 111.

Now on to binary, also called base_2. There are only 2 binary digits: 0 and 1. The reason this is so effective for use in computers is that electronic switches can relay data by being in one of two states: on or off.

The word "bit" is a contraction of the two words "b-inary dig-it". A "byte" is an 8-digit binary string.

So how do we get any numbers bigger than 1 in binary? Easy... using the same method as in the decimal system. Each place to the left will be a higher exponent of 2.

(There is a concept in computer science called "endianness", which refers to whether the most significant byte (or bit) of a number is stored at the smallest or the largest address. All the examples I am using here, where the numbers are written with the most significant digit on the left, i.e. first, correspond to what is called "big-endian". Whether data is actually stored as big-endian or little-endian depends on how the hardware is designed.)

So the right-most place is 2^0 = 1, the next to the left is 2^1 = 2, the next is 2^2 = 4, the next is 2^3 = 8, etc. So we cycle thru the whole set by placing first a zero, then a one, in each successive place:

binary   decimal
    0  =  0
    1  =  1
   10  =  2
   11  =  3
  100  =  4
  101  =  5
  110  =  6
  111  =  7
 1000  =  8
 etc.

So, for example, the number 7 [decimal] is represented in binary with "1"s filling the first three places:

1 1 1 [_base-2]
= 1*(2^2) + 1*(2^1) + 1*(2^0)
= 4 + 2 + 1
= 7 [_base-10]

The next number, 8 [decimal], is exactly the third power of 2, so it has a "1" filling the 2^3 place and zeros in all the others, and looks like this in binary: 1000.
1 0 0 0 [_base-2]
= 1*(2^3) + 0*(2^2) + 0*(2^1) + 0*(2^0)
= 8 + 0 + 0 + 0
= 8 [_base-10]

The next number system I'll discuss is hexadecimal, or hex for short, also called base_16. Long strings of 1s and 0s are difficult to comprehend visually, so hex is used by programmers as a convenient shorthand for binary. Its combination of numbers and letters is much easier for humans to parse.

In this system, each place can hold a value from 0 thru 15. But we need to keep each "digit" to a single character, so we invoke the first few letters of the alphabet after we pass 9.

Let's continue counting from the table above, this time with the results in hex rather than decimal (you only see a difference after 9). I'll give the decimal value at the end, just so you can see what it is:

binary   hex   decimal
 1001  =  9  =   9
 1010  =  A  =  10
 1011  =  B  =  11
 1100  =  C  =  12
 1101  =  D  =  13
 1110  =  E  =  14
 1111  =  F  =  15

Because 1 hex digit can represent exactly 4 binary digits, this is sometimes found to be a useful grouping, and is called a "nibble" (I've also seen it spelled "nybble"). A nibble is exactly half a byte... get it?

I'm mentioning nibbles here because they play a role in understanding how the MIDI tuning spec uses the data contained in each byte of a signal. I'll use nibble-size binary groupings below to illustrate the hex numbers that we'll come across.

Also note that it's easy to specify binary strings of 1s in decimal format in a manner which makes obvious their binary derivation, by bumping up to the next higher power of 2 and adding "-1". For example, our last number in the above table, 1111 [binary] = 10000 [binary] minus 1, or in decimal, (2^4)-1 = 16 - 1 = 15, which = F in hex.

After F [hex], which = 15 [decimal], we get to 16 [decimal] by using the same procedure as before: bump up to the next higher exponent of 16 and start over again. So a "1" in the next higher place means 16^1, and in the next higher place after that, 16^2, etc.
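The place-value rule described above is the same in every base, so it can be sketched in a few lines of Python. This is only an illustration: the function names `to_digits` and `from_digits` are my own, hypothetical, and not part of any MIDI library.

```python
def to_digits(n, base):
    """Digits of the non-negative integer n in the given base,
    most significant digit first (hypothetical helper)."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the lowest place
        digits.append(remainder)
    return digits[::-1]


def from_digits(digits, base):
    """Inverse of to_digits: sum of digit * base**place."""
    value = 0
    for d in digits:
        value = value * base + d
    return value


print(to_digits(7, 2))             # [1, 1, 1]
print(to_digits(8, 2))             # [1, 0, 0, 0]
print(to_digits(15, 16))           # [15], written "F" in hex
print(from_digits([1, 1, 1], 10))  # 111
```

Python's built-in `int("1111", 2)` and `format(15, "X")` do the same job for the common bases; the explicit loop is shown only to make the places visible.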
So the highest 2-digit number in hex is FF [hex], which equals

(15 * (16^1)) + (15 * (16^0))
= (15 * 16) + (15 * 1)
= 240 + 15
= 255

The next higher number would be 256 [decimal], which is exactly the second power of 16, so we put a "1" in the third place followed by two zeros: 100 [hex] = 256 [decimal]. So, in hex, FF + 1 = 100.

There's an intermediate numbering system called octal, which is (you guessed it) base_8. This system is a bit easier to understand because it works just like decimal but only has digits 0 thru 7. After 7 comes 10 [octal] = 8 [decimal], then 100 [octal] = 64 [decimal], etc.

In fact it's somewhat akin to our usual diatonic musical numbering system which uses A B C D E F G, then starts again at A when we reach the next "octave". This type of musical notation functions as a base-7 numbering system: A = 0, B = 1, ... G = 6, A = 0.

Octal was more commonly used from the 1950s to 1970s, but is not much used these days, hex being preferred. But because of the structure of the MIDI tuning data protocol, octal plays an important role in MIDI tuning calculations.

Now, with that out of the way, on to the specifics of the MIDI tuning data format. From the official MIDI website:

> Frequency data shall be sent via system exclusive
> messages. Because system exclusive data bytes have
> their high bit set low, containing 7 bits of data,
> a 3-byte (21-bit) frequency data word is used for
> specifying a frequency with the suggested resolution.

In other words, in each of the three 8-digit binary strings (each called a "byte") which belong to a tuning command, the most-significant bit or place is set to 0, which in this case is simply a flag to let the hardware know that these 3 bytes are data bytes within a SysEx message rather than status bytes. That leaves only 7 bits in each byte to carry a value.

This is extremely important, because it means that the rules of enumeration which I elaborated above are not quite followed. There's a different system in use here that's called an "offset".
More from the MIDI website:

> Frequency data shall be defined in units which are
> fractions of a semitone. The frequency range starts
> at MIDI note 0, C = 8.1758 Hz, and extends above
> MIDI note 127, G = 12543.875 Hz. The first byte of
> the frequency data word specifies the nearest
> equal-tempered semitone below the frequency. The next
> two bytes (14 bits) specify the fraction of 100 cents
> above the semitone at which the frequency lies.
> Effective resolution = 100 cents / 214 = .0061 cents.

There's a serious typo on the webpage here, in the denominator of that fraction: "214" should really be 2^14 (that is, "2 to the 14th power", not "214"), so that the effective resolution should be given as 100 cents / 2^14 = ~0.006103516 cent = exactly 25/4096 cent.

Thus, the greatest possible MIDI tuning resolution divides the 12edo semitone into 2^14 = 16384 equal steps, which defines the 14mu. The 8ve is therefore divided into

1200 / (25/4096) equal steps
= 1200 * (4096 / 25)
= 48 * 4096
= 196608 equal steps,

i.e. 196608-edo, or 2^14 = 16384 14mus per Semitone.

Also, the frequencies may be specified to a more accurate number of decimal places than those published in the MIDI spec, which is particularly important in the case of the low frequencies:
MIDI note 0 is the 12edo "C" which is 5 "8ves" plus a "major-6th" below A-440, which = 440 * 2^(-5-(9/12)) = ~8.175798916 Hz, which is ~1/4355 of a cent lower than the published figure.
MIDI note 127 is the 12edo "G" which is 4 "8ves" plus a "minor-7th" above the reference tone of A-440, which = 440 * 2^(4+(10/12)) = ~12543.85395 Hz, which is ~1/344 of a cent lower than the published figure. (The published figure appears to be the result of an erroneous calculation, because rounding off intermediate values in the calculation results in other values which are all different from it.)
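The two calculations above can be checked with a short Python sketch. It assumes only what the text states: 12edo, with MIDI note 69 (A4) tuned to 440 Hz. The function names are hypothetical, chosen for this illustration.

```python
import math


def midi_note_to_hz(note, a4_hz=440.0):
    """12edo frequency of a MIDI note number, with note 69 = A-440."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def cents_between(f_low, f_high):
    """Size in cents of the interval from f_low up to f_high."""
    return 1200.0 * math.log2(f_high / f_low)


for note, published in ((0, 8.1758), (127, 12543.875)):
    exact = midi_note_to_hz(note)
    error = cents_between(exact, published)
    print(f"note {note}: {exact:.9f} Hz, published figure is "
          f"1/{1 / error:.0f} cent higher")
```

The exponent `(note - 69) / 12` is the same quantity written in the text as `-5-(9/12)` for note 0 and `4+(10/12)` for note 127.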
As stated above, there are several errors on the official MIDI tuning page.
At the end of the section titled "FREQUENCY DATA FORMAT" is a table titled "Examples of frequency data:" (almost halfway down the page).
Below I give the correct figures for Hz, using 8 decimal places of precision instead of the 4 on the MIDI page, along with some additional data showing MIDI-note, pitch-bend amount in both tetradekamus and cents, and ratios from the tuning reference of A-440 Hz.
The "7F 7F 7F" command is reserved to indicate "no change", thus the highest possible frequency obtainable in MIDI is 13289.6566 Hz. ("14mu" is my abbreviation for "tetradekamu".)
MIDI         MIDI   --pitch-bend--    ratio
freq.data    note   +14mus  +cents    from A-440     Hz

00 00 00  =    0        0   0.0000    0.018581361        8.17579892
00 00 01  =    0        1   0.0061    0.018581427        8.17582774
01 00 00  =    1        0   0.0000    0.019686266        8.66195722
0C 00 00  =   12        0   0.0000    0.037162722       16.35159783
3C 00 00  =   60        0   0.0000    0.594603558      261.62556530
3D 00 00  =   61        0   0.0000    0.629960525      277.18263098
44 7F 7F  =   68    16383  99.9939    0.999996474      439.99844877
45 00 00  =   69        0   0.0000    1                440.00000000
45 00 01  =   69        1   0.0061    1.000003526      440.00155124
78 00 00  =  120        0   0.0000   19.02731384      8372.01808962
78 00 01  =  120        1   0.0061   19.02738092      8372.04760546
7F 00 00  =  127        0   0.0000   28.50875898     12543.85395142
7F 00 01  =  127        1   0.0061   28.50885949     12543.89817521
7F 7F 7E  =  127    16382  99.9878   30.20376504     13289.65661609
7F 7F 7F  =   --       --       --   --              --
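The Hz column of the table can be reproduced with a minimal Python sketch of the decoding rule quoted earlier: the first byte is the semitone, and the next two 7-bit bytes together form a 14-bit fraction of a semitone. The function name `decode_frequency_word` is hypothetical, and the reference of note 69 = 440 Hz is taken from the text rather than from any library.

```python
def decode_frequency_word(b1, b2, b3):
    """Convert a 3-byte MIDI tuning frequency data word to Hz.

    Each byte must be a 7-bit data byte (0x00..0x7F). Returns None
    for 7F 7F 7F, which is reserved to mean "no change".
    """
    for b in (b1, b2, b3):
        if not 0 <= b <= 0x7F:
            raise ValueError("data bytes must have the high bit clear")
    if (b1, b2, b3) == (0x7F, 0x7F, 0x7F):
        return None
    fraction = (b2 << 7) | b3           # 14-bit count of 14mus, 0..16383
    semitones = b1 + fraction / 16384   # MIDI note plus fraction of a semitone
    return 440.0 * 2.0 ** ((semitones - 69) / 12.0)


for word in ((0x00, 0x00, 0x01), (0x44, 0x7F, 0x7F), (0x7F, 0x7F, 0x7E)):
    hz = decode_frequency_word(*word)
    print(" ".join(f"{b:02X}" for b in word), f"= {hz:.8f} Hz")
```

Note that the two fraction bytes combine as `b2 * 128 + b3`, not `b2 * 256 + b3`, because each data byte carries only 7 bits.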
The most egregious error in the table is the second note in the list, which the official MIDI page gives as "00 00 01 = 8.2104 Hz". The interval between this note and the first one is ~7.3111 cents, whereas it states explicitly in the text that 1 unit of pitch-bend equals only 0.0061 cents! The actual frequency data needed to obtain this frequency would be 00 09 2E -- quite a difference! This must have been the result of an error in the calculation.
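The "00 09 2E" figure can be checked by running the conversion in the other direction. The sketch below is an assumption-laden illustration, not the spec's own algorithm: it rounds to the nearest 14mu (the spec text quoted above does not say how to round), and the name `encode_frequency` is hypothetical.

```python
import math


def encode_frequency(hz):
    """Nearest 3-byte MIDI tuning frequency data word for a frequency in Hz.

    Assumes note 69 = 440 Hz and rounding to the nearest 14mu
    (1/16384 semitone); the rounding rule is my assumption.
    """
    semitones = 69 + 12 * math.log2(hz / 440.0)
    units = round(semitones * 16384)          # total 14mus above MIDI note 0
    if not 0 <= units <= 127 * 16384 + 16382:
        raise ValueError("frequency outside the range of the tuning format")
    note, fraction = divmod(units, 16384)
    return note, fraction >> 7, fraction & 0x7F  # two 7-bit fraction bytes


word = encode_frequency(8.2104)
print(" ".join(f"{b:02X}" for b in word))  # 00 09 2E
```

The upper limit excludes 7F 7F 7F, since that word is reserved for "no change". The published 8.2104 Hz lands about 1198 14mus above note 0, and 1198 = 9 * 128 + 46, which is where the bytes 09 and 2E come from.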
The other errors are much smaller, and are probably the result of rounding various values at some point in the calculation. (Remember that the larger difference in numbers for the higher frequencies doesn't actually sound as big as it looks, because we perceive pitch logarithmically.)