44.1 kHz is a popular sample rate, for legacy reasons.
PCM encoders generate a lot of digital data. Back in the 1980s, it was simply too much to fit on any household medium, with the exception of videotape. So that's what they recorded to.
|Standard||Framerate (contains two interlaced fields)||Visible lines||Lines used by PCM encoder||Total lines|
|B&W NTSC||30 Hz||480||490||525|
|Colour NTSC||≈29.97 Hz||480||490||525|
By multiplying the framerate (frames per second) by the number of lines used per frame, we get the number of lines used per second:
|Standard||Frames per second||Lines used per frame||Lines used per second|
|B&W NTSC||30 Hz||490||14,700|
|Colour NTSC||≈29.97 Hz||490||≈14,685.3|
Each line can store 3 stereo pairs of 16-bit samples. By multiplying the lines used per second by the pairs of samples per line, we get the sample rate:
|Standard||Lines used per second||Pairs of samples per line||Sample rate|
|B&W NTSC||14,700||3||44,100 Hz|
|Colour NTSC||≈14,685.3||3||≈44,056 Hz|
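The arithmetic in the tables above can be checked with a minimal Python sketch. Two things here are assumptions rather than figures from the tables: the colour NTSC frame rate is taken as exactly 30000/1001 Hz (the usual definition of "≈29.97 Hz"), and a PAL row is added using 25 Hz with 588 lines used per frame. The names `STANDARDS` and `sample_rate` are illustrative only.

```python
from fractions import Fraction

PAIRS_PER_LINE = 3  # stereo pairs of 16-bit samples stored on each video line

# (frames per second, lines used per frame by the PCM encoder)
STANDARDS = {
    "B&W NTSC": (Fraction(30), 490),
    # Assumption: "≈29.97 Hz" is exactly 30000/1001 Hz.
    "Colour NTSC": (Fraction(30000, 1001), 490),
    # Assumption: PAL figures are not in the tables above.
    "PAL": (Fraction(25), 588),
}


def sample_rate(frame_rate, lines_used, pairs_per_line=PAIRS_PER_LINE):
    """Frames per second x lines per frame x sample pairs per line."""
    return frame_rate * lines_used * pairs_per_line


# Exact rational results, so no rounding error creeps in.
RATES = {name: sample_rate(*spec) for name, spec in STANDARDS.items()}

for name, rate in RATES.items():
    print(f"{name}: {float(rate):.1f} Hz")
```

Using `Fraction` keeps the colour NTSC result exact (44,100,000/1001 Hz), which rounds to the familiar 44.056 kHz.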
Since each line stores 3 stereo pairs of 16-bit samples, it carries 3 × 2 × 16 = 96 bits of audio data.
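As a rough sketch of what that means per second, the per-line figure can be multiplied by the 14,700 lines used per second from the table. This counts audio payload only; any extra bits the encoder writes for error detection or correction are not modelled here, and the function name is hypothetical.

```python
def audio_bits_per_line(pairs=3, channels=2, bits_per_sample=16):
    """Audio payload carried by one video line, in bits."""
    return pairs * channels * bits_per_sample


LINES_USED_PER_SECOND = 14_700  # B&W NTSC: 30 frames x 490 lines

BITS_PER_LINE = audio_bits_per_line()
BITS_PER_SECOND = BITS_PER_LINE * LINES_USED_PER_SECOND

print(BITS_PER_LINE, "bits per line")
print(BITS_PER_SECOND, "bits per second")
```

The second figure, 1,411,200 bits per second, is the same as 44,100 samples × 2 channels × 16 bits.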
As far as I can tell, any given version of Sony's PCM-F1 could record at either 44.056 kHz or 44.1 kHz, but seemingly not both. It therefore seems likely that one version was designed for colour NTSC tapes, and another for PAL (judging by its being featured in British magazines of the time) and possibly monochrome NTSC.
This explains the conflicting information I found online about whether 44.1 kHz was chosen because it worked with both PAL and NTSC, or whether 44.1 kHz was for PAL and 44.056 kHz was for NTSC. In fact, 44.1 kHz is compatible with both PAL and monochrome NTSC, while 44.056 kHz is compatible only with colour NTSC. (PAL works out to the same figure as monochrome NTSC: 25 frames per second × 588 lines used per frame × 3 pairs per line = 44,100 Hz.)
Most likely, in NTSC territories, consumers used popular colour videotapes at 44.056 kHz, whereas professionals used rarer monochrome videotapes at 44.1 kHz for global compatibility. (As I'm neither American nor Japanese, this is speculation on my part. I don't know whether 30 Hz tapes were rare while ≈29.97 Hz tapes were common. In the UK, we only had 25 Hz tapes.)
The PAL and monochrome NTSC sample rate of 44.1 kHz was adopted by the global CD format, perhaps because CDs were created in both PAL and NTSC territories, or perhaps simply for the slightly increased sound quality. Either way, CDs were made using the output of a professional PCM encoder, such as Sony's PCM-1600.
Curiously, while CDs are split into sectors and frames (analogous to fields and lines respectively), their dimensions differ from any of the videotape formats above: 75 sectors per second, 98 frames per sector, and 6 stereo pairs of 16-bit samples per frame.
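Despite the different layout, the CD figures multiply out to the same sample rate. A minimal sketch of that check (constant names are illustrative):

```python
SECTORS_PER_SECOND = 75
FRAMES_PER_SECTOR = 98
PAIRS_PER_FRAME = 6       # stereo pairs of samples per frame
CHANNELS = 2
BYTES_PER_SAMPLE = 2      # 16-bit samples

# Stereo pairs per second, i.e. the sample rate in Hz.
CD_SAMPLE_RATE = SECTORS_PER_SECOND * FRAMES_PER_SECTOR * PAIRS_PER_FRAME

# Audio bytes held by one sector.
AUDIO_BYTES_PER_SECTOR = (
    FRAMES_PER_SECTOR * PAIRS_PER_FRAME * CHANNELS * BYTES_PER_SAMPLE
)

print(CD_SAMPLE_RATE, "Hz")
print(AUDIO_BYTES_PER_SECTOR, "audio bytes per sector")
```

So 75 × 98 × 6 gives 44,100 Hz, and each sector holds 2,352 bytes of audio.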
Incidentally, a full sector (1/75 of a second) is the shortest addressable unit on a CD, just as a single field is the shortest addressable unit on a videotape. This explains why some tracks, such as Nine Inch Nails' "A Warm Place", start with a very brief snippet of the previous track. Perhaps they should have addressed the next sector instead...
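To illustrate the trade-off, here is a hypothetical sketch of snapping a track boundary to a sector boundary. It assumes 588 stereo pairs per sector (98 × 6) and is not a description of how any real mastering tool works; `snap_track_start` is an invented name.

```python
SAMPLES_PER_SECTOR = 98 * 6  # 588 stereo pairs per 1/75 s sector
SAMPLE_RATE = 44_100


def snap_track_start(first_sample, round_up):
    """Snap a track's first sample to a sector boundary.

    Returns (sector_index, leaked, lost):
      leaked = samples of the previous track heard at the start
      lost   = samples of the new track that are cut off
    """
    if round_up:
        sector = -(-first_sample // SAMPLES_PER_SECTOR)  # integer ceiling
    else:
        sector = first_sample // SAMPLES_PER_SECTOR
    start = sector * SAMPLES_PER_SECTOR
    leaked = max(0, first_sample - start)
    lost = max(0, start - first_sample)
    return sector, leaked, lost


# A track whose audio really begins at sample 1000:
print(snap_track_start(1000, round_up=False))
print(snap_track_start(1000, round_up=True))
```

Rounding down leaks 412 samples (about 9 ms) of the previous track into the new one; rounding up avoids the leak but trims the first 176 samples of the new track.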
While I'm not clear on how CDs were able to be so much smaller than LaserDiscs, it presumably helped that they have no colour signal, and that the luminance signal can be entirely Boolean, as that's all a PCM decoder needs.
CDs became wildly popular, as the first pre-recorded digital format available to consumers. As a result, the sample rate of 44.1 kHz became even more popular, spreading to home computers and their beige successors, and subsequent formats such as DAT and MiniDisc.
- "ITU-R BT.470-6" International Telecommunication Union, p. 2
- "CD-Recordable FAQ: What's a Frame? CIRC Encoding? How Does ECC Work?" Andy McFadden