24-bit/192kHz music downloads and why they make no sense (2012)

(people.xiph.org)

131 points | by Kaapeine a day ago

256 comments

stego-tech a day ago
I cannot hear the difference between 16/44.1 (and by extension, 16/48) and High-Res Content generally, be they HDCD, SACD, or just straight-up Masters from Qobuz. This is on multiple sets of equipment, ranging from El Cheapo earbuds all the way to HD800 cans and full-fledged tower speakers being bi-amped.
That’s not why I go for High-Res stuff, though.
It’s all about archival, at least for me. With a 24/192 Master in FLAC or ALAC, I can downsample to whatever the destination form factor is. I can transcode to a 320kbps MP3, or a 16/48 WAV stream for a smart speaker, or a 24/96 stream for the theater. The point isn’t that I can hear the difference, it’s the fear that I might lose something irrecoverable by sticking with lower-quality files for bulk storage. Once data has been discarded, it cannot be retrieved, and that influences my preference for storage (and is also why my BD/UHD rips are into MKVs, no re-encoding).
Now that being said, I will absolutely hem and haw and ABX different releases to determine if I opt for the 16/44.1 CD rip of an album from the 80s or the new 202X remaster in 24/192 (spoiler: almost always the former), and I absolutely prefer anything with classic instruments (Jazz, Classical) in higher-quality formats because of a subjective perception of a wider, clearer sound stage, though this is almost certainly a psychological effect from performing in concert bands and orchestras rather than physical or objective in nature.
Like I tell newcomers: if it sounds better enough to you to warrant the purchase price, then that’s all that really matters. Enjoy the hobby.
- saltcured a day ago
  Decades ago, I was treated to an ABX test in my brother's recording studio. I easily recognized and preferred a 24/192 master he played versus the 16/44.1 down-mix. I honestly don't know whether there was something wrong with the down-mix, but qualitatively it did feel like it was "muffled" and coming from speakers, while the master really felt like live performance. He was surprised that I could tell them apart.
  I also spent a lot of time ripping my old CDs to FLAC and trying different MP3 and AAC encoder settings to get playback that felt transparent enough to me. I could never tolerate Sirius/XM radio streaming due to the horrid compression I heard with every futile attempt. I still seem to have more sensitive hearing than most people around me, but in my 50s I know it isn't what it once was.
  I never had huge budgets, but did strive for hi-fi in my limited ways. I used things like toslink and HDMI to send raw PCM data from Linux to my Yamaha A/V receiver's DACs + amplifier to drive somewhat nice Polk tower speakers. But then COVID-19 happened, and this stuff was packed up to move house.
  Nowadays, music playback is streaming with mundane "subwoofer + satellite" PC speakers or MP3 playback with a mini-SD card permanently parked in my car's infotainment system.
  - vor_ 21 hours ago
    > Decades ago, I was treated to an ABX test in my brother's recording studio. I easily recognized and preferred a 24/192 master he played versus the 16/44.1 down-mix. I honestly don't know whether there was something wrong with the down-mix, but qualitatively it did feel like it was "muffled" and coming from speakers, while the master really felt like live performance. He was surprised that I could tell them apart.
    As referenced in the article, a common explanation for those audible differences is that the high-resolution version of the album is sourced from a different master.
    - saltcured 19 hours ago
      > As referenced in the article, a common explanation for those audible differences is that the high-resolution version of the album is sourced from a different master.
      In this case, it was my brother's own 24/192 recording, down-mixed by him to CD format with the intent that it be transparent. I believe he said his software was supposed to be dithering, but this was ~25 years ago and I can't really confirm the details anymore.
    - TheOtherHobbes 19 hours ago
      This is easy to disprove by downsampling from a 24/192 source to 16/44.1. Even if the downsampling is (close to) ideal, there are obvious differences.
      In fact if you can't hear the difference between 24/192 and 16/44.1 you shouldn't be working in audio. (Doesn't apply to consumers. Does apply to musicians and engineers.)
      It's like being colour blind.
      And if you don't understand the math behind quantisation, you shouldn't be posting pseudo-scientific videos where you use an oscilloscope and a cheap spectrum analyser - both tools with very limited resolution - to "prove" your point.
      16 bit isn't enough for hard, objective reasons. One is that the noise spectrum of quantisation is not simple. Most people assume it's something close to plain white noise, but it really isn't. It's actually a very complex spectrum with some prominent peaks at specific subdivisions of the sample rate. Those frequency peaks are significantly above audibility. 24-bit quantisation shrinks them below audibility.
      The other is that most people can hear dither/noise-shaping at 16-bits. That adds a single bit of noise which should - if you're being very literal - be far below the threshold of audibility. But it clearly isn't.
      These two facts are related.
      The more complex reason is that listening is an active perceptual process. The brain does a huge amount of processing to separate sources and place them in a perceptual field which includes information about perceived object type, distance, and ambience cues. Some of those cues are very quiet, and we don't hear them linearly.
      So using sine waves as some kind of perceptual reference for audibility is nonsensical. We hear much more complex signals in an active way, and if there's information missing in the quiet parts - which there is with limited quantisation - then the signal simply isn't accurate.
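A small sketch of what rounding to a fixed step does to a very quiet tone, with and without dither, may help make the terms in this comment concrete. It assumes an idealised rounding quantiser working in units of one LSB and TPDF dither spanning ±1 LSB; the helper names, the 0.4 LSB test tone, and the fixed random seed are illustrative choices, not anything taken from the thread. It says nothing about audibility.

```python
import math
import random

def quantise(x):
    """Round a value (in LSB units) to the nearest integer step."""
    return math.floor(x + 0.5)

def tpdf_dither(rng):
    """Triangular-PDF dither spanning -1..+1 LSB (difference of two uniforms)."""
    return rng.random() - rng.random()

def tone(n, amplitude_lsb=0.4, period=100):
    """Hypothetical test tone: a sine whose peak is below half an LSB."""
    return amplitude_lsb * math.sin(2 * math.pi * n / period)

def recovered_gain(samples, reference):
    """Least-squares estimate of how much of `reference` is present in `samples`."""
    num = sum(s * r for s, r in zip(samples, reference))
    den = sum(r * r for r in reference)
    return num / den

rng = random.Random(0)  # fixed seed so the run is repeatable
N = 20000
reference = [tone(n) for n in range(N)]
undithered = [quantise(x) for x in reference]
dithered = [quantise(x + tpdf_dither(rng)) for x in reference]

print("undithered gain:", recovered_gain(undithered, reference))
print("dithered gain:  ", recovered_gain(dithered, reference))
```

Without dither the tone is rounded away completely (every output sample is 0). With dither the estimated gain comes out close to 1: the tone is still present on average, traded for a constant noise floor.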
      - leothetechguy 6 hours ago
        I agree with most of your points, but saying you shouldn't work in audio if you can't tell the difference between 192khz and 44.1khz is a bit elitist imo. And saying you're color blind if you can't tell the difference is like saying you're blind if you don't have 20/20 vision and shouldn't draw. You can always use meters to check for aliasing artifacts.
        It's not like all of your samples and virtual instruments are 192khz or even 96k. Many are 48khz or even 44.1k.
        I think there are many cases where people never need to go above 44.1khz unless you maybe have saturation on the master bus. I agree that good dithering is important though and think that there hasn't been enough research on that so far.
      - mschulkind 14 hours ago
        But surely the difference is 16 vs 24 and nothing to do with the sample rate?
        hdgvhicv 10 hours ago
        The larger problem with 44.1 is
        But it depends what you're sourcing from. If you source at 44.1k, then you will have a worse recording if you change it to 192k. If you source at 48k, then you just waste samples. If you’re recording analog inputs at 192k with a crappy ADC, then you will have a worse outcome than with a good ADC at 48k (or 44.1k).
        Same with bit depth - the ADC is far more important.
  - amluto 5 hours ago
    One possibility (pure speculation) is a bad antialiasing filter. The Nyquist frequency at 44.1ksps is 22.05kHz, which is only ~10% above the audible band. This means that you need a rather sharp filter both when downmixing and when playing to avoid potentially audible aliasing into the audible band or attenuation within the audible band.
    If you look at a site like audiosciencereview.com and pull up measurements of a DAC or ADC, you can find graphs of the antialiasing filter response. Some are great and some are not.
    One could think of 16/44.1 PCM as being a codec that is potentially perfect but requiring some degree of care to encode and decode correctly.
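The arithmetic behind the "~10% above the audible band" figure, and what aliasing does to a tone that slips past the filter, can be sketched in a few lines of Python. The 20 kHz audible limit is an assumption used only for illustration, and the function names are hypothetical.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

def transition_band_hz(sample_rate_hz, audible_limit_hz=20_000):
    """Room the anti-aliasing filter has between the audible limit and Nyquist."""
    return nyquist_hz(sample_rate_hz) - audible_limit_hz

def alias_hz(tone_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after sampling."""
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

for rate in (44_100, 48_000, 96_000, 192_000):
    print(rate, nyquist_hz(rate), transition_band_hz(rate))

# A 25 kHz tone sampled at 44.1 kHz without filtering folds back to 19.1 kHz.
print(alias_hz(25_000, 44_100))
```

At 44.1 kHz the filter has only 2.05 kHz in which to go from passing to blocking; at 96 kHz it has 28 kHz, which is why a higher rate makes the filter design easier even if the extra bandwidth itself is inaudible.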
  - nullc 20 hours ago
    This is an extremely hard comparison to do well. I'll give a few examples as to why:
    Small differences in gain are ABX-able much more readily than differences in noise at the 16 vs 24 bit level. So if the signal chain gives even a small difference in gain between the samples, that's what you'll track. A reasonable conversion path to 16 bits for mastering will also apply dithering and some kind of brickwall limiting (you have to limit after the dither or as part of the dither as dither can change levels!), and this can result in gain changes. The DAC may behave differently or have outright bugs for some configurations too.
    This is particularly true wrt reconstruction filters for sample rate differences. And if you were comparing 44.1k and 192k then the physical DAC itself was likely running at a different rate and its _analog_ filters are probably better optimized for one vs the other (this is less true for 48k vs 192k, as the hardware likely runs at the same rate for both). So one answer to this comparison can be "on this particular hardware this rate is better than that rate"-- but that's an implementation property, not a property of the format choice.
    You might think, "okay I'll use a mathematically perfect down and up conversion process and run the DAC in the exact same configuration for all cases". But even then you run into issues like after reconstruction the _inter sample_ peak levels will be higher than the levels of the samples, so you have to handle that and in a way that doesn't produce a gain difference between the two configurations. (probably by running your perfect process and finding the gain level that results in no limiting, then making the gain of the original match).
    And then for the high rate vs non-high rate you have to deal with the fact that most amplifiers are not particularly linear (compared to well constructed software at least!) and that any real speaker is very far from linear. This means that the presence or absence of ultrasonics will change the audio in the 0-20khz band. Before you think "well that could be a reason that high rate is better", observe that if there was some consistently good effect from the ultrasonics you could just bake it into the low rate sample.
    > but in my 50s I know
    Yeah if you're in your 50's you're absolutely not hearing differences way up above 20khz (especially if you're male), I bet you can't even hear CRT flybacks from 100 yards anymore. :P Most people have no idea how much their high frequency hearing degrades as they age because it plays approximately no role in your life, but it's real, dramatic, and as far as I know happens to everyone.
    I don't mean to discount your experience: I don't really doubt that it was real. But answering the general question of the necessity of low vs high rate probably takes a team of experts, armed with test gear and the designs of the HW/SW in question, to vet the test configuration. Testing a _particular_ configuration without the ability to distinguish its implementation quirks from format-fundamentals is much easier and that's what most attempts to test this question are actually testing.
    By testing in a recording studio you were doing far better than most such comparisons. Usually people try comparing different files and they're comparing entirely different mastering processes. Files made for the "high res" market will often have much less compression and limiting than files made for commercial radio play / casual listening... and truly do sound obviously much better. Some of my favorite recordings are rips from vinyl. Vinyl is an awful format from the perspective of audio fidelity, but it's also pretty intolerant of excessive compression and limiting because the record will skip if the needle is bouncing off the rails. And more recently I suppose they also avoid over-compression there because of the difference in target listener/environment.
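The inter-sample peak issue mentioned above has a standard textbook illustration, sketched here in Python. It assumes an ideal sine at exactly one quarter of the sample rate with a 45 degree phase offset, a deliberately worst-case choice; the helper names are hypothetical. Every stored sample lands at about 0.707 of full scale even though the underlying waveform reaches 1.0.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, phase_rad, count):
    """Sample a unit-amplitude sine at the given rate and starting phase."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz + phase_rad)
            for n in range(count)]

def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20 * math.log10(ratio)

fs = 44_100
samples = sample_sine(fs / 4, fs, math.pi / 4, 64)
sample_peak = max(abs(s) for s in samples)
true_peak = 1.0  # amplitude of the underlying continuous sine

print("largest stored sample:", sample_peak)
print("inter-sample overshoot in dB:", to_db(true_peak / sample_peak))
```

The overshoot here is about 3 dB, far larger than the small gain differences described above as readily ABX-able, so a comparison that normalises on sample values rather than reconstructed peaks can end up level-mismatched or clipped on one side only.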
    - Tor3 13 hours ago
      Ah, 20kHz and CRT flybacks.. when I was a child I could of course hear that (in Europe that would be 15625 Hz), when I studied electronics and TV repair we could all hear that, and because we had the equipment we "tested" what we could hear using a function generator. The limit for conscious hearing for me was somewhere around 17kHz. Or not 18kHz for sure.
      But I think I lost the ability to hear the flyback not long after I passed twenty. The world turned silent as far as that's concerned (before, you could hear it anywhere and everywhere, in shops, homes, some workplaces..)
      The "20kHz" thing is kind of a myth for most people, at least that's what it looked like to me after all the testing we did at school. I think it can influence what you hear, somehow, but in any case it's for very young people.
      > Most people have no idea how much their high frequency hearing degrades as they age because it plays approximately no role in your life, but it's real, dramatic, and as far as I know happens to everyone.
      I agree completely. I recall some discussions a long time ago on RMMGA (Usenet: rec.music.makers.guitar.acoustic) where some distinguished and experienced, but middle-aged guitarists got practically angry when a young guy described the sound of a certain type of newly-introduced strings "harsh" and "like fingernails on a blackboard" when used on a particular guitar.
      The difference was, of course, that what the young guy could hear is something which stopped existing at least when you had passed 30.. I was at an age where I too couldn't hear that kind of sound from strings, but it was still not that long ago and I remembered and had noticed the difference, i.e. that I could not hear what I could hear before. For example the huge difference between fresh strings and week-old strings (and that fact has, over the decades, saved me tons of money which I would otherwise have spent on replacing strings all the time..)
    - saltcured 19 hours ago
      Yes, perhaps the amplitude was subtly different.
      This was supposed to be running the DACs to match the source configuration, not resampling into some common format. I think that is an unavoidable part of the whole end-to-end ABX test concept.
      Maybe it would be interesting to up-sample back into 24/192 and play both in that mode. But then people would argue about what type of up-sample to use.
      I was in my mid 20s for this test. I understand my high-band hearing was better back then.
      - alexalx666 13 hours ago
        Speaking of up-sampling, I'm curious to know your opinion on its benefits. I'm sending CD-resolution audio as well as web streams from soundcloud.com to a Cambridge Audio Azur 840C, and it's not clear whether it's the up-sampling that makes the difference or its per-channel Wolfson DAC architecture. The iPod Video, with its DAC, sounded amazing with just normal AAC files compared to the iPods before or after it.
    - bigiain 20 hours ago
      > Small differences in gain are ABX-able much more readily than differences in noise at the 16 vs 24 bit level.
      This was common knowledge at least as far back as the mid 80s, when every hifi shop and salesguy knew to ensure the bit of gear with the highest profit margin got played an almost imperceptible bit louder than the gear the customer came in to buy during back to back testing.
      - nullc 20 hours ago
        It's also a reason why double-blind testing is important. If someone doing the setup expects one piece of kit to sound better and it doesn't, they'll check the configuration more, and differences in gain can come from many sources. So errors that result in higher gain in favor of the "better" candidate go uncorrected, while ones that favor the worse one tend to be fixed.
        Point being: it doesn't even require an unscrupulous sales person to get similar results to an unscrupulous sales person! :P
  - empiricus a day ago
    Even for PC, I recommend some cheap studio monitors.
    - bob1029 20 hours ago
      https://www.bhphotovideo.com/c/product/964752-REG/yamaha_hs8...
      - bzzzt 6 hours ago
        Those things have huge drivers and are probably too big for a lot of rooms. Unless you (and your neighbors) absolutely want to have that thumping sound and you go out of your way to kill unwanted bass reflection you're probably better off with the HS5 or something similar.
        bob1029 6 hours ago
        8" is not a huge driver. The point with these is to avoid the need for a separate subwoofer for most program material.
        bzzzt 4 hours ago
        It's the biggest model in the Yamaha monitor series and if you care about bass that much you probably also want a subwoofer ;)
    - saltcured 19 hours ago
      Yeah I'm just lazy about dealing with the room. If I find the motivation, I'll pull the original equipment back out of storage.
  - Applejinx 19 hours ago
    That would be how you'd go about telling, sure enough. You can't go by 'frequencies' or distortions or anything like that, these analog departures from convincing reality aren't how digital failings manifest.
    You try to hear the brickwall by the muffled, enclosed quality and possibly by the weird pre-ring blurriness of the filter making things sound more vague than they have to be, and you hear the truncation not because it is audible 'distortion' as we know it, but because depth collapses and it sounds like it's coming from the speakers and not being a separate space behind/around the speakers. At no point will it be the most glaringly obvious thing but it'll never be 'distortions' as we imagine them, it's more a 'pod people' lack of personality thing.
    Like a much subtler version of listening to AI music :)
    I'm quite happy with 24/96 as suitable overkill for anything I might want to hear or do. Neil Young went hard on the proposition that 192 was necessary. Sold the Ponoplayer, I had one but it died on me, battery failed eventually. It really did sound awesome beyond just about any other listening device I've ever heard…
    - TheOtherHobbes 19 hours ago
      24 > 16 is not debatable. Sample rates are more complex because the higher the clock rate, the more distortion you get from jitter and the design of the DAC/ADC. Most converters introduce different artefacts at different sample rates, especially at the prosumer end, so you're not comparing like for like.
      The last couple of generations of converters have gotten a lot better, so 192kHz today is likely to sound cleaner and smoother than it did ten years ago, where there was a good chance the clock was quite jittery.
      Personally I don't think it's worth the extra bandwidth for playback, but I can understand why some people might want it.
      Generally all of these "debates" come down to people who think math > circuitry. All real designs are imperfect trade-offs. They all have issues, and arguing as if converters are perfect when they never are, and the imperfections can be benched objectively, is... not very scientific.
      - bzzzt 6 hours ago
        >Generally all of these "debates" come down to people who think math > circuitry. All real designs are imperfect trade-offs. They all have issues, and arguing as if converters are perfect when they never are, and the imperfections can be benched objectively, is... not very scientific.
        There is one purely objective benchmark: a true blind test. You can believe something is different or not, but if nobody's capable of hearing the difference, does it matter?
        Applejinx 6 hours ago
        You're missing a few words. To be correct you would have to have said 'if nobody's capable of hearing the difference every single time beyond a statistical doubt, does it matter?'
        You can say that and be correct, while also sounding a little more silly than perhaps you'd like.
        edit: rather than go even harder, I'm instead going to suggest it's perfectly fine to care about things you don't hear every single time, but still like or dislike :)
        My pet example is sand in the lettuce for a salad. If you dislike that particular cronch against your teeth while eating salad, it has a spectacular ability to ruin your enjoyment of your salad, even though you don't perceive it every single time. Digital distortions are like that for some of us, things like wow and flutter and vinyl surface noise are like that for others. People vary. (which is also why not to generalize about what 'people can hear')
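The "beyond a statistical doubt" criterion for a blind ABX run can be made concrete with a one-sided binomial test: how likely is a score at least this good if the listener were purely guessing? A minimal Python sketch, assuming independent trials with a 50% chance per guess; the function name and the 16-trial examples are illustrative only.

```python
from math import comb

def abx_p_value(correct, trials):
    """Chance of getting at least `correct` right out of `trials` by coin-flipping."""
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials

print(abx_p_value(12, 16))  # about 0.038
print(abx_p_value(9, 16))   # about 0.40
```

So a listener does not need to be right every single time: 12 out of 16 is already unlikely to be luck, while 9 out of 16 is entirely consistent with guessing.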
        bzzzt 4 hours ago
        Oh, I care about audio quality a lot more than most people, but not into 'magic cable' territory.
        There's no reason you can't gather statistics about a representative part of the population. It doesn't make sense to make the entire world pay for better audio because some guy somewhere might be a bit more sensitive and he thinks it ruins everything.
        I don't know if such a person exists, but especially in the 'magic' audio territory there's a large amount of bullsh*t going on. There's a reason the James Randi prize was never claimed.
        There's a huge difference between digital 'distortions' caused by sampling at CD quality and things like tape flutter which most people can actually hear. Even then, some people like the imperfections like vinyl or tape artifacts. Some people even prefer MP3 compressed music.
- raxxorraxor 6 hours ago
  https://news.ycombinator.com/item?id=48774112 <-- Here is a test. I would have said the same and I played all samples quite often. It is extremely subtle, but even with a cheap headset it is possible to hear differences, where the high quality version is just a little bit clearer.
- dialogbox 7 hours ago
  > With a 24/192 Master in FLAC or ALAC, I can downsample to whatever the destination form factor is. I can transcode to a 320kbps MP3, or a 16/48 WAV stream for a smart speaker, or a 24/96 stream for the theater.
  I used to think the same. But I realized that downsampling hi-res music to 16/44.1 isn't a transparent conversion. So now I prefer the one downsampled to 16/44.1 by an expert in production env. I almost always download 16/44.1 flac files because of this.
- dtgriscom 19 hours ago
  > I absolutely prefer anything with classic instruments (Jazz, Classical) in higher-quality formats
  High-dynamic-range material benefits from lots of bits.
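How many decibels each bit depth buys can be estimated with the textbook formula for an ideal quantiser, roughly 6.02 dB per bit plus 1.76 dB. A minimal Python sketch, assuming a full-scale sine and an ideal undithered converter; real converters and dithered masters fall short of these figures, and the function name is hypothetical.

```python
import math

def ideal_dynamic_range_db(bits):
    """SNR of an ideal N-bit quantiser for a full-scale sine: 20*log10(2**N * sqrt(1.5))."""
    return 20 * math.log10(2 ** bits * math.sqrt(1.5))

for bits in (16, 24):
    print(bits, round(ideal_dynamic_range_db(bits), 1))
```

This gives about 98 dB for 16 bits and about 146 dB for 24 bits, which is the theoretical gap behind the bit-depth side of the discussion.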
  - javchz 16 hours ago
    This. It may be a niche. But music that works with volume as part of the composition can be amazing.
    But most music today has heavy compressors in the pipeline that kill dynamic range in favor of letting you hear even whispers, even in traffic or in a city with ear pods.
    But if you're in the first group, as you said, the benefits of better codecs and bit depth are more noticeable than with heavily compressed top Billboard songs, where even listening to the master track from the studio falls into diminishing returns.
- Cider9986 a day ago
  I can't hear the difference between 128 kbps opus and FLAC.
  - nullc 20 hours ago
    > I can't hear the difference between 128 kbps opus and FLAC.
    A reasonable definition of transparency for high bitrate compressed audio is "Can the worst files be distinguished by a listener trained in what artifacts sound like". Maybe also add in having to use a high discrimination listening setup, including not running excessively loud (increases masking).
    If that's not the test you're doing, it's unsurprising. At moderately high bitrates no one can reliably distinguish them on arbitrary samples: most inputs are easy.
    If you test on known-difficult "killer samples" you'll probably easily distinguish them, even without first being shown what to look for, and certainly after.
    During the development of Opus I created many 'trained listeners' and selected many killer samples, and I don't recall* ever encountering a tin ear that couldn't be taught to ABX any high rate samples, though some people are obviously much better at it.
    I'm not sure I'd recommend it though: learning to identify artifacts has a frequent side effect of making low rate audio like the HE-AAC used in SiriusXM absolutely intolerable. I'm bothered by it even when I hear cars driving by using it. :)
    [*] My memory for such things sucks, so I could be wrong-- but my point that it's not expected remains.
    - Cider9986 19 hours ago
      I did the ABX test extension in foobar2000 with Octopus's Garden. It was on nice headphones.
      You're right it's just minor details.
  - stego-tech a day ago
    And that's fine! I've got a flatmate who loves 320kbps MP3s on studio monitors, I've got musician friends who swear by CD-audio and Sennheiser HD200s, and others who love how vinyl uniquely degrades over time on big speakers.
    The takeaway from these sorts of posts, at least in my opinion, should be two-fold:
    * Understand the physical limits of human senses and perceptions to help inoculate yourself against outright scams and grifts
    * Liberate yourself from the "tech grind" and allow yourself to enjoy what you like, how you like it.
    - Cider9986 19 hours ago
      The thing I didn't understand with higher quality music files is that it's not like the entire song is different and better when you go from 64 to 128 kbps opus, it's just these super minor details that get changed. It was enlightening doing an abx test, but I still use flacs because it's nice not worrying about the quality mattering.
    - dspillett 21 hours ago
      > Understand the physical limits of human senses and perceptions to help inoculate yourself against outright scams and grifts
      Also understand that while there is an upper limit, we are all different within that. I can hear the difference between 128Kbps and FLAC, at least for some content, but not 256Kbps, maybe not 192. For some content (spoken word etc.), 64Kbps, sometimes less, is perfectly acceptable (to me). There was a time I could hear the difference between some encoders, but that was decades ago and anything in active use is pretty damn good (and my ears are not what they used to be) unless you really crank the bitrate down or tweak other options daftly.
      - sgarland 20 hours ago
        I’ve not tried encoding my own MP3s in at least a decade, but when I was doing so, 128 kbps was instantly distinguishable to me on anything with cymbals, especially hi-hat: it loses that shimmery sound. At 192 kbps I could tell if I really, really tried, but it was so minute I didn’t really care. I was never able to reliably tell the difference between 256 and 320 kbps rips.
      - PaulDavisThe1st 20 hours ago
        > I can hear the difference between 128Kbps and FLAC
        You've established this with double-blind testing, correct?
        dspillett 9 hours ago
        Some time ago, yes. For 128 and there-about & below, not 192+ hence I'm less certain about that (but I'm pretty sure I wouldn't be able to tell the difference).
        Not recently, so it is possible that improvements in encoding methods, and changes in my ears, could mean that I'd get a different result now.
- IshKebab 3 hours ago
  That's not really how down sampling works. You already have all the information with 48 kHz sampling.
  Higher rate sampling is just like storing integers to 3 decimal places, or archiving an upscaled DVD.
  I recommend you actually read the article. I vaguely recall they did it in video form too.
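The claim that the samples already contain all the information is the sampling theorem: a signal with no content at or above half the sample rate can be rebuilt between its samples by sinc interpolation. A minimal Python sketch, assuming an ideal band-limited 10 kHz tone sampled at 48 kHz and a truncated (hence slightly inexact) interpolation window; the tone, phase, and window size are arbitrary choices for illustration.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def reconstruct(sample_at, t_in_samples, half_width=2000):
    """Truncated Whittaker-Shannon interpolation around the nearest sample."""
    centre = round(t_in_samples)
    return sum(sample_at(n) * sinc(t_in_samples - n)
               for n in range(centre - half_width, centre + half_width + 1))

fs = 48_000
tone_hz = 10_000

def sample_at(n):
    """Value of the tone at sample index n (only integer n is ever stored)."""
    return math.sin(2 * math.pi * tone_hz * n / fs + 0.3)

t = 0.5  # halfway between two stored samples
exact = math.sin(2 * math.pi * tone_hz * t / fs + 0.3)
estimate = reconstruct(sample_at, t)

print("true value between samples:", exact)
print("value rebuilt from samples:", estimate)
print("absolute error:", abs(exact - estimate))
```

The small remaining error comes from cutting the infinite sum off at a few thousand terms, not from the sample rate.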
z_open a day ago
As they say, most people listen to their music with equipment. Audiophiles listen to their equipment with music.
- nntwozz a day ago
  This is perfect, thank you this goes straight into my long-term memory bank.
  On a tangent, whenever someone mentions LP sounding warmer or whatever I like to point out that I prefer wax cylinders (a.k.a. phonograph cylinders).
  - _kb 19 hours ago
    Those wax cylinders are a modern hack. The curved surface distorts the real artistic intent. The only way to appreciate the true beauty of sound is the purity of soot etchings on a phonautogram.
    - nntwozz 8 hours ago
      I will add this to my repertoire :)
      I had a good laugh listening to the sample at https://en.wikipedia.org/wiki/Phonautograph
      The sound is very pure indeed.
  - fecal_henge a day ago
    You Edison shill.
- pimeys 21 hours ago
  I might be somewhere in the middle. Yes, I did spend a hefty 5000 euros on my headphone setup. And yes it sounds absolutely magical and every day I'm happy listening to music with it.
  But I also have a large multi-terabyte music collection, I follow new music, go to concerts, go to parties, talk about music with my friends in Signal group chats.
  It's a hobby, and when you get a bit older and start having some savings, if you love music treating yourself with a better system is not that crazy.
  - UmYeahNo 21 hours ago
    When I got old enough to finally afford those toys I discovered I couldn't hear above 16khz anymore.
    - pimeys 20 hours ago
      It is not only that. It's the spacing, how the bass sounds, separation of instruments. There's so many interesting headphones in the midrange to try out. Compare the Hifiman HE1000se to Heddphone 2 GT, or to Focal Clear MG and you'll understand.
      Also with HEDD you get a handcrafted device made in Berlin. And if you go with nicer cables, they are very beautifully done and feel great. There is no difference in sound of course. Some people like jewelry, I can get similar enjoyment from beautiful audio equipment and cables.
  - az226 20 hours ago
    What’s the quality of this trove? As in bitrate or similar.
    - pimeys 20 hours ago
      Depends. I'm more into finding certain masters. And some of the albums are DSD tape transfers. DSD if that was the original recording format, if it was mixed and PCM was needed, DXD flac.
      And so many CDs of course.
- mingus88 a day ago
  That’s true, but I consider myself a collector. Think of how a comic book collector operates.
  If I have an option to get a 16bit version of a recording or a high-res version, I choose the highest quality version every time.
  Same with a physical copy. A limited edition, better quality vinyl LP is more attractive if you are going through the trouble of curating a collection.
  I’ve been curating a music library of digital files since before the iPod was released and I will always go for the highest quality version out of principle. I can always downsample it to anything that makes sense.
- eimrine 13 hours ago
  Why not listen to the equipment from time to time? And why not learn how class D sounds?
rahimnathwani a day ago
The article says "I've run across a few articles and blog posts that declare the virtues of 24 bit or 96/192kHz by comparing a CD to an audio DVD (or SACD) of the 'same' recording. This comparison is invalid; the masters are usually different."
It may be simultaneously true that:
A) Humans cannot tell the difference between 44.1kHz/16-bit audio and any higher resolution, and
B) For a particular song, the best commercially available 44.1kHz/16-bit version may not be the best commercially available version
- zamadatix a day ago
  While 100% true, I'd phrase B) as:
  "The quality of the particular mastering can still make a noticeable difference, regardless of the ability for the digital sampling rates to perfectly represent it perceptually"
  Just to be clear that the statement applies to any releases meeting the A) criteria, not just 44.1 kHz @ 16-bit ones.
- black_knight 20 hours ago
  I usually A/B test the different versions before choosing my canonical one. I will listen to the same sections in each version, flipping back and forth to hear the differences. It is incredible how much finding the right master improves the experience of listening to a track. Often times that means I end up with a hi-res version, but not always.
hackingonempty a day ago
@xiphmont also made an amazing video response to the many responses he received to this article. Using analog equipment he busts a bunch of myths and demonstrates what really happens with digital audio.
https://video.xiph.org/vid2.shtml
or on YT if you can't play it https://www.youtube.com/watch?v=cIQ9IXSUzuM
[-]
- pringk02 7 hours ago
  One of the best educational videos on any topic ever imo. I can't help but watch it again each time. So incredibly well paced and accessible
- niccl 20 hours ago
  Thank you for posting this. I thought I knew a bit about what was going on with audio sampling and reproduction, but I learned a surprising amount from this well presented introduction
- justin66 8 hours ago
  A classic.
ycui7 20 hours ago
24-bit was created because microphones want to record a large dynamic range without a gain-switching circuit.
96kHz was created to better reproduce 20kHz high frequency, so the digital noise shaping filter does not need to be super sharp right at the Nyquist frequency.
Both were introduced for a sound technical reason. Beyond that, most of it is marketing nonsense to cheat consumers.
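For a sense of scale, here is a rough back-of-the-envelope sketch of the two numbers in play. It assumes an ideal quantizer driven by a full-scale sine (the textbook 6.02·N + 1.76 dB figure) and a 20 kHz passband edge; the function names are made up for illustration, and real converters fall short of the ideal figure.

```python
import math


def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer with a full-scale sine input."""
    # 20*log10(2^N) is about 6.02*N dB; 10*log10(1.5) is the 1.76 dB sine term.
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


def transition_band_hz(sample_rate_hz: float, passband_hz: float = 20_000.0) -> float:
    """Room between the top of the audible band and the Nyquist frequency."""
    return sample_rate_hz / 2 - passband_hz


if __name__ == "__main__":
    for bits in (16, 24):
        print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.1f} dB (ideal)")
    for rate in (44_100, 48_000, 96_000):
        print(f"{rate} Hz: {transition_band_hz(rate):.0f} Hz for the filter to roll off")
```

At 44.1 kHz the filter has about 2 kHz to go from passband to stopband; at 96 kHz it has 28 kHz, which is the "does not need to be super sharp" point above.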
geraldmcboing 21 hours ago
The OP is a bit off with their description of why pro audio engineers work in higher bit rates and sample rates. We use 24bit to preserve low level sounds eg reverb, breaths etc and use 32bit float when recording as the headroom is so massive clipping is not an issue (other than of course still needing to avoid overloading microphones with max SPL - cleanly recorded distorted sound is still a fail). Unclipping 32bit float feels like voodoo - I did a test, recording fireworks & unclipping the 32bit float recordings.
I use microphones that can 'hear' up to 100kHz (Sanken CUX100K) and for film sound design playing 192kHz audio at half and quarter speed the results are very significant, and reveal there IS 'content' above human hearing. Irrelevant for general listening but very important for sound design.
[-]
- PaulDavisThe1st 21 hours ago
  Have you ever actually checked the number of actual bits your ADC can use? Most 24 bit converters struggle to get to 18 bits.
  Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being brownian); it's strictly a format for mixing and processing. You don't get any more resolution from 32 bit floating point than you do from 24 bit integer formats, but the result of "clipping" is less dramatic, hence the appeal of the format.
  While there is some evidence that non-auditory human sensory perception may be sensitive to ultrasonic acoustic waves, it's pretty weak right now, and somewhat in the "woo" zone. It may turn out to be significant, or it may not. I wouldn't base an audio production workflow that requires 4x the cpu power and 4x the disk space on such tentative claims, but you're welcome to.
  [-]
  - nullc 20 hours ago
    > Nobody uses 32 bit float for recording
    Yes they do, almost all high end field recorders used for film work are 32-bits now and have been for much of the last decade, often with some fancy preamp integration so that there is no expertise required for gain staging the recording. (I believe the implementations use a second matched 24bit ADC with 48 dB less gain in front of it).
    The result obviously doesn't have a noise floor which is lower (as the noise of a room temperature _resistor_ gets in the way of that even at the 24-bit level) but they have more dynamic range so that your recording isn't ruined by hard clipping some unexpected loud sound.
    It's a big improvement for practical usage, and also likely does improve SNR somewhat because you can run higher gains without as much fear that you'll ruin the recording. The reason it would pay off is that the SNR loss you get from splitting the signal is easily smaller than the SNR loss you would get from gain reduction to avoid clipping.
    (maybe... capsule self noise is also limiting... at these levels, and usually people aren't using microphones designed for the lowest possible self noise unless they're doing something special)
    [-]
    - PaulDavisThe1st 20 hours ago
      There are precisely zero 32 bit ADCs in existence.
      There are ADCs that will provide 32 bits per sample but that's entirely different.
      Current technology limits the bit depth to 18-22 bits and going beyond that you'd be very quickly recording brownian (atomic) noise anyway.
      The point about 32 bit float is that it is a useful format for mixing, editing and general processing, so it is widely used in digital audio tools. But it is not a format that ADCs generate "natively" via their electronics - almost all of them generate a 24 bit integer or fixed point value and then just supply that as a 32 bit float value because the software asked for it (the software could have done it all by itself).
      [EDITED: DAC->ADC since that is what I meant and what this is all about]
      [-]
      - adrian_b 14 hours ago
        The ADCs that do direct sampling of the input signal (i.e. by successive approximation or by the pipelined algorithm) become very expensive at high resolutions and they are limited to 18 bits per sample or at most 20 bits per sample.
        Due to their high cost such ADCs have no longer been used in audio for many decades. They may still be encountered in some expensive measurement instruments that need high resolutions at significantly higher sampling frequencies than needed for audio.
        All audio ADCs have a very low resolution per sample, e.g. 4 bits or even lower, but they sample at a very high frequency, of many MHz. Then the bit stream is digitally processed to generate whatever format is desired for output, at a lower sampling frequency and a higher resolution, e.g. 24 bits @ 192 kHz.
        There is a difference between the actual resolution at the output and the effective resolution, which is limited by noise, e.g. the 24 bit samples may have an effective resolution of 20 bits or 21 bits or 23 bits, etc., i.e. they contain noise with an amplitude corresponding to those effective resolutions.
        The digital algorithm that converts the low resolution input samples (e.g. 4 bits @ 5 MHz) inside the ADC can easily be modified to generate a different numeric output format, e.g. FP32.
        Neither FP32 nor 24-bit is the native format of the A/D conversion. If the ADC outputs FP32, that is even more convenient for further audio processing. Obviously, the quality of the ADC is independent of whether it outputs FP32, and the FP32 samples will have a different effective resolution on each ADC, which seldom would be as high as 24 bits, due to the noise.
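To make the "few bits at many MHz, then digitally processed" idea concrete, here is a toy sketch of a first-order, 1-bit sigma-delta modulator followed by a crude decimator. Everything in it is an assumption for illustration: real audio ADCs use higher-order modulators and proper sinc/CIC and FIR decimation filters, not a plain block average, and the function names are invented.

```python
def sigma_delta_1bit(samples):
    """Turn inputs in [-1, 1] into a stream of +1/-1 values.

    The integrator accumulates the error between the input and the
    previous output bit, so the running average of the bit stream
    tracks the input.
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0 else -1.0
        bits.append(feedback)
    return bits


def decimate_by_averaging(bits, factor):
    """Crude decimator: average each block of `factor` one-bit samples."""
    return [
        sum(bits[i:i + factor]) / factor
        for i in range(0, len(bits) - factor + 1, factor)
    ]


if __name__ == "__main__":
    # A constant input of 0.25, oversampled 64x (both values chosen arbitrarily).
    stream = sigma_delta_1bit([0.25] * 4096)
    output = decimate_by_averaging(stream, 64)
    print(output[:4], sum(output) / len(output))
```

The output samples are ordinary Python floats here, which is the point made above: once the low-resolution stream is being filtered digitally, the numeric format of the result is a choice, not a property of the analog front end.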
      - guenthert 11 hours ago
        > There are precisely zero 32 bit ADCs in existence.
        > There are ADCs that will provide 32 bits per sample but that's entirely different.
        Now that requires elaboration.
        There is e.g. AD's LTC2500 (https://www.analog.com/en/products/ltc2500-32.html). Not meant for audio (too slow at 32b) and not noise free, but it's a bona-fide 32b ADC.
        Now there might be no ADC which provides 32b wide noise-free samples at sample rates needed for audio and given the absurdly low level of a LSB signal that might be as infeasible as it would be pointless, but that's a bit of a different statement.
      - nok22kon 20 hours ago
        Rode NT1-A 5th gen microphone claims 32-bit float output, insisting it will not clip peaks
        so maybe they do sample at 24 bit at a well chosen gain level and then convert to 32 bit float, with the max 24 bit value being above 1.0 float
        or as GP said, use two separate ADCs at two different gains and combine their output
        [-]
        guenthert 10 hours ago
        > use two separate ADCs at two different gains and combine their output
        That's what could be done if ADCs were perfectly linear and noise free and limited only by their bit-width. Sadly, they are not. The non-linearity one can in theory measure and correct for, but the noise can be corrected for only by oversampling. And then you might as well use a single ADC of lesser bit width and higher sampling rate.
        [-]
        amluto 4 hours ago
        I feel like you’re arguing against a straw man.
        No one is arguing that there are practical audio microphones + ADCs that produce accurate, undistorted 32-bit float output across the full representable range. But they don’t need to! For professional use, the ability to produce perceptually accurate output, with inaudible noise, across a very wide dynamic range, is extremely useful. Think of it as fancy, real-time AGC. It does not need to be perfect. If you can record a loud transient without substantial distortion, and also record sounds with 2^16-fold lower amplitude (~96dB lower) while still remaining well above the noise floor immediately after the transient is gone, this ability is useful. Plenty of real-world noises are well above 120dB, and plenty of human-audible sounds are below 20dB. You can’t play back the recording, at least not without making parts inaudible or injuring your audience, but you can edit it. And a setup like this lets you do it with one microphone and no fiddling with gains in advance.
        nok22kon 10 hours ago
        this uses 2 ADCs:
        https://tascam.jp/int/feature/32-bit_float
        [-]
        guenthert 9 hours ago
        One cannot create a noise-free, perfectly linear 32b ADC using 2 lesser ADCs as described above. That is however not needed and I suspect isn't what they are attempting.
        If, say, one took two 24b ADCs (20b noise free, non-linearity 2LSB), with one receiving the input signal at an approximately 10-bit higher gain (+60dB), and combined their outputs with that 10b shift (ignoring the input of the low gain path if the signal falls below a given threshold, to reduce the noise contribution of that ADC, and the input of the high gain path if the signal exceeds another threshold, in order to avoid clipping), then one could construct a 32b float.
        This doesn't improve resolution (which arguably would be pointless) or linearity (not all that critical in audio methinks) but dynamic range, which I can see some appeal of (in extreme recording situations, say you'd want to record the breathing of a shooter followed by the gun shot -- there remains the challenge of finding a microphone capable of a 120dB range, but perhaps one could use two different ones ...).
        PaulDavisThe1st 20 hours ago
        > Rode NT1-A 5th gen microphone claims 32-bit float output, insisting it will not clip peaks
        Of course it does! And that's what it does, of course. But that has absolutely nothing to do with the AD process itself, which is chip-limited to 24 bits and likely physics-limited to somewhat less than that.
        You can't beat the physical limit of a DA circuit by doubling them up at different gains.
        And .. you don't want to. Going beyond 22 bits gets you into brownian noise pretty quickly, which is completely pointless.
        The best you can do (or could do) is get a very, very, very good DA that can really do 22 bits (likely not commercially available because of the expense), and then get the samples from it in whatever format works best for your purpose (24 bit integer, some fixed point value, or 32 bit floating point).
        [-]
        nok22kon 20 hours ago
        you have 22 bits for the typical audio voltage level, which you call 1.0 float
        but what if you "allow" double that voltage and call it 2.0 float? a strong pressure into the microphone generates a stronger voltage
        thermal noise limits you on the quiet signals, but not on the powerful ones
        so 22 bit for typical -1.0 -> 1.0 range and you can add a few more bits on top of that for stronger audio pressures (voltages) which you would traditionally clip
        [-]
        PaulDavisThe1st 19 hours ago
        Sorry, but this is not how AD works. If your idea was valid, we'd have new generations of ADCs in our hands.
        [-]
        nok22kon 19 hours ago
        > In a 32-bit float recorder, you have two ADCs working in tandem to create a single audio file. One “low gain” ADC is optimized for high-level audio, and the other “high gain” ADC is optimized for low-level audio. If the high gain ADC clips due to loud sounds, the low gain ADC does not. And if sounds are too quiet for the low gain ADC to capture clearly above its noise floor, the high gain ADC still has plenty of headroom above its noise floor. Said another way, the low-level ADC handles the quieter sections, and the high-level ADC handles loud sections.
        https://tascam.jp/int/feature/32-bit_float
        [-]
        PaulDavisThe1st 19 hours ago
        The first diagram in that article is pretty ironic in an HN comment thread about Monty @ Xiph's stuff. Have you never seen his takedown of the "stairstep" drawing?
        [-]
        nok22kon 13 hours ago
        I have seen it.
        This is a marketing page after all, explaining point-wise samples, reconstruction filters and "staircases" is way beyond scope.
        Same as in images, pixels are not "little squares"
        nullc 19 hours ago
        It's discontinuous.
        You have some low noise amplifier. There is a signal. You split it. The result on each side has >=1 bit worse noise floor, probably somewhat worse as we're not using superconductors :P-- as you expect: there is no free lunch.
        Now: take one copy and attenuate it 48dB, further degrading its noise floor. Sample both. The attenuated copy is mostly useless, except when the input goes high enough that it would have hard clipped the other ADC.
        So the tradeoff is that you lose a small amount of noise floor constantly-- out at the 20th bit, that you probably didn't care about (microphone self-noise is limiting you out there anyways at normal volume levels), in exchange for never clipping.
        To turn this into a better ADC generally, you'd need the splitting stage to not hurt the noise floor, but it does.
        The reason it's not the same as just lowering the gain so that you won't ever clip is that to get the same dynamic range you'd have to lower it by 48dB and now your ADC doesn't achieve its potential for typical signals. You could lower the gain by 3dB (or whatever the splitting cost you) and get the same results for the low gain signal and a little more headroom, but you would not get the massive headroom increase of this approach.
        For this to work one must also have amplifiers with much wider dynamic range and SNR than ADCs, but we do.
        The natural output for this approach is a float-- the most natural would be a weird float where instead of an exponent one bit tells you which ADC is in use and represents a factor of 256 or whatever, but in practice these recorders just output 32-bit floats. I haven't looked but I wouldn't be surprised if there were only two exponent values ever used in their output.
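A toy model of the split-and-attenuate scheme described above, purely as a sketch. It assumes ideal, noise-free 24-bit converters, a 256x (about 48 dB) attenuation on the second path, and a hard switch at 90% of full scale; all names and thresholds are hypothetical, and actual recorders presumably blend the two paths more carefully.

```python
BITS = 24
LEVELS = 2 ** (BITS - 1)   # codes run from -LEVELS to LEVELS - 1
ATTENUATION = 256.0        # roughly 48 dB, the "factor of 256 or whatever"


def ideal_adc(voltage: float) -> int:
    """Ideal clipping quantizer with a full-scale range of -1.0..1.0."""
    code = round(voltage * LEVELS)
    return max(-LEVELS, min(LEVELS - 1, code))


def dual_range_sample(voltage: float, switch_at: float = 0.9) -> float:
    """Combine a normal-gain path and an attenuated path into one float."""
    high_gain = ideal_adc(voltage)
    low_gain = ideal_adc(voltage / ATTENUATION)
    if abs(high_gain) < switch_at * LEVELS:
        # Normal case: the unattenuated path has the better noise floor.
        return high_gain / LEVELS
    # Near or past clipping: fall back to the attenuated path, rescaled.
    return low_gain * ATTENUATION / LEVELS


if __name__ == "__main__":
    for v in (1e-6, 0.5, 2.0, 100.0):
        print(v, dual_range_sample(v))
```

Inputs far above nominal full scale come back as floats greater than 1.0 instead of being clipped, at the cost of coarser steps in that range.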
        [-]
        PaulDavisThe1st 19 hours ago
        > So the tradeoff is that you lose a small amount of noise floor constantly-- out at the 20th bit
        So, basically, no better than the best AD converters we already have?
        My understanding of the fundamental limit to AD performance is that the brownian noise level is around the 22nd bit level. So even if you come up with techniques to successfully measure down to that level, you're basically picking up .. inevitable, irremovable, irrelevant noise.
        Possibly there are gains to be made by not worrying about the noise floor and caring more about the lack of clipping, but I'm not seeing people screaming about that. The "noise" seems to be "N bits of dynamic range", not "slightly less dynamic range but it will never clip!"
        [-]
        nullc 19 hours ago
        Yeah people describe the benefits incompletely/inaccurately. This approach has a worse theoretical SNR, but an effect that improves the delivered SNR in real usage: Without the clipping protection the user would massively lower the gain, hurting the SNR.
        A common experience for someone doing field recording of performers (my experience is music) is you twiddle your setup to get the gains reasonably high to get good SNR even for quiet parts. ... and then you record the actual performance, and you find that the tuba player really got into it for the real performance and the new peaks are 10dB over where they were in the practice. And now your recording is screwed up with a bunch of hard clipping you have to deal with. So then experience tells you in the future to take whatever you thought was safe and lower gains another dozen db.
        The multi-ranged recorders eliminate that problem and the result is that you don't need to use precautionary gains, and you get a better SNR in your recordings. You probably don't need to adjust gains at all: The gain can be whatever makes the self-noise of the microphone dominate the SNR of the process, ... which would be too high for the loudest samples, but the clipping handling deals with that.
        The samples that need to use the extended range have worse SNR (and probably poor linearity due to mismatches between the converters), but human hearing is much less critical to noise with loud signals anyways.
      - geraldmcboing 20 hours ago
        Why are you obsessed with DAC? It's the ADC that is WHY we capture 32/192.
        [-]
        PaulDavisThe1st 20 hours ago
        If I said DAC, it was a mistyping. I am (in this context) always talking about the ADC.
      - nullc 20 hours ago
        I didn't say anything about DACs! I'm correcting a specific claim you made
        > Nobody uses 32 bit float for recording (to do so is just to capture at least 10 bits of noise, most of that being brownian);
        This is not true and not true for a good and important reason! One which has no bearing on the kind of DACs that exist.
        Modern field recorders allow gains set at a 'reasonable' level that maximizes SNR for recordings but still won't clip when there are much louder peaks. Not so dissimilar to how a 6-digit multimeter can achieve its advertised performance both on a 0-5v range and a 0-300v range but cannot give more than 6 digits at the higher range.
        [-]
        PaulDavisThe1st 20 hours ago
        When I said "nobody uses 32 bit float for recording", I am referring to the result of the DA process that generates sample values used by a recorder.
        Obviously, everyone and their mother uses 32 bit float as an internal sample format because of its fitness for purpose (except the folks who think they need 64 or 80 bit floating point, of course). But they are not using "32 bit floating point samples" - the samples come from an (at best) 18-22 bit integer conversion.
  - geraldmcboing 20 hours ago
    "Nobody uses 32 bit float for recording" - you are just displaying total ignorance here.
    [-]
    - PaulDavisThe1st 19 hours ago
      My comment should have been more emphatic that: nobody uses AD converters that generate 32 bit floating point values natively when recording, or anywhere close to the resolution that format implies.
      I am extremely aware that as a data format in DAWs and other recorders, 32 bit floating point is completely common.
      [-]
      - adrian_b 15 hours ago
        While the best that ADCs can provide is linear 24-bit audio samples, the following audio processing is better done after converting the samples to FP32, and keeping this format until the final 16-bit encoded audio suitable for listening is generated.
        For the same reason, video processing is preferably done on FP16 samples of the color components even if both the input ADCs and the output video signal may use only 10-bit or 12-bit per sample, at most.
        Moreover, most high-resolution audio ADCs do not really sample the input audio at a 24-bit resolution, but they use only a sigma-delta method where the actual samples have only a few bits, possibly even only 1 bit.
        Then DSP techniques are used to convert the audio stream with a high sampling frequency and a low resolution per sample into an audio stream with a low sampling frequency and a high resolution per sample, which is the external output of the ADC.
        If you had access to the raw audio bit stream as actually captured by the ADC, you could modify the decimation algorithm to really output FP32 samples, though no existing ADC could actually have such a high dynamic range (except if the output bandwidth were reduced a lot, to filter the input noise).
        [-]
        Archit3ch 2 hours ago
        > the following audio processing is better done after converting the samples to FP32
        Or, in some cases, FP64.
  - geraldmcboing 20 hours ago
    Dude I've been doing sound design on films using these techniques for years. There is zero 'woo' involved, it is ALL practical evidence based use. I've been using 32bit float multitrack field recorder by Sound Devices MixPre10-II professionally for many years now. The recorder has three preamps per mic input, each gain staged to provide optimum signal to the 32bit float AD. Read this to clarify your thinking: https://www.sounddevices.com/32-bit-float-files-explained/
    Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz and at quarter speed only 6kHz. You can very clearly hear the filter cut off due to Nyquist. Record at 192kHz with mics capable of 100kHz capture and when played at quarter speed, the sound is full spectrum because there is no truncated frequency response. And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz. (not with every sound of course)
    I repeat, I am not talking about 'normal' listening. I am talking about an industry you have no knowledge or lived experience with, so spare me the incorrect claims about what can & can't be heard.
    [-]
    - PaulDavisThe1st 20 hours ago
      > I am talking about an industry you have no knowledge or lived experience with
      I'm the original/lead developer of Ardour, a cross-platform DAW, and have been working with digital audio for more than 25 years.
      There are no 32 bit ADCs - your SD MixPre's are giving you (at best) 22 bits packaged as a 32 bit float value. The preamps make absolutely zero difference to the AD conversion (though they might sound real nice).
      > Surely you understand a recording made at 48kHz has a max freq response of 24kHz and played at half speed that max freq is 12kHz
      This is a very naive version of what "played at half speed" might actually mean. If properly and correctly resampled, this is not true.
      > And when I load a 192kHz recording to izotope RX I can literally see the harmonics going up to 96kHz
      Well, I'd certainly hope so! But the question is: what are the energy levels associated with the partials above Nyquist? If you recorded at 384kHz with sensitive enough equipment, you'd see partials above 96kHz - but at extremely low energies because ... well, that's just how physics works.
      [EDITED to remove AD/DA confusion]
      [-]
      - geraldmcboing 20 hours ago
        I do not use the DACs in the MixPre. It's a recording device. The field recordings & studio recordings are transferred as data and used in a 32bit float 192kHz Protools session. So the recorder's DAC is completely irrelevant. The sounds are then used as source material, for processing and manipulation at 192k, 96k and 48k. There is no debate to be had. This is how film sound designers work & have worked for years now.
        The half speed you call naive is again just showing your ignorance. Sound editors have been using this technique since the days of recording on a Nagra at 15ips and literally replaying at 7.5ips half speed, and at 3.75ips for quarter speed. There is nothing naive about it, it is a very well-known technique. To be able to achieve the same result digitally with full spectrum has impacted every feature film you have experienced in recent years. Again I speak from decades of lived experience.
        [-]
        PaulDavisThe1st 19 hours ago
        Running tape at half speed has almost nothing to do with digital resampling, which is what playing digital audio at half speed is generally all about.
        My use of DAC was a thinko, I've edited at least one post to correct it since in the current context we're always talking about ADC. Apologies for that.
        [-]
        geraldmcboing 19 hours ago
        Wrong again. As a sound designer I can choose to import a 192kHz file into a 48kHz PT session in two ways, one as resampled audio which means pitch & duration stay the same, OR I can choose to import it without SR conversion, in which case the audio plays at quarter speed & pitch is 2 octaves lower. We use both techniques ALL the time, every day. It's a common technique every sound designer uses.
        You are arguing about techniques you have no experience with.
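The arithmetic behind "import without SR conversion" can be sketched in a few lines. This only models the reinterpretation of the sample rate (same samples, different clock), not any particular DAW's import path; the function name is invented for illustration.

```python
import math


def reinterpret_sample_rate(recorded_rate_hz: float, session_rate_hz: float):
    """Play the same samples at a different clock rate.

    Returns (speed factor, pitch shift in octaves, highest frequency in the
    played-back audio, assuming content up to the recording's Nyquist).
    """
    speed = session_rate_hz / recorded_rate_hz
    octaves = math.log2(speed)
    top_frequency_hz = (recorded_rate_hz / 2) * speed
    return speed, octaves, top_frequency_hz


if __name__ == "__main__":
    # 192 kHz recording dropped into a 48 kHz session without conversion:
    print(reinterpret_sample_rate(192_000, 48_000))
    # Slowing a 48 kHz recording by the same factor (i.e. clocking it at 12 kHz):
    print(reinterpret_sample_rate(48_000, 12_000))
```

In the first case, content recorded up to 96 kHz lands between 0 and 24 kHz; in the second, nothing above 6 kHz exists after the slowdown, which matches the figures quoted earlier in the thread.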
        [-]
        PaulDavisThe1st 19 hours ago
        I wrote a DAW that does precisely what you describe. I've been doing it for 25 years.
        [-]
        porridgeraisin 7 hours ago
        cperciva moment right here
jerf a day ago
If you can't hear the squeals of the plants [1] in the studio's reception area, are you really getting the full experience of a piece of music?
[1]: https://www.cnn.com/2023/03/30/world/plants-make-sounds-scn
[-]
- Blackthorn a day ago
  Oh great. And here I thought that fantasy literature where forest elves could hear the screams of the plants they stepped on when they walked was just that -- fantasy.
  [-]
  - SketchySeaBeast a day ago
    Triffid music.
Tsarp a day ago
This really is like driving a muscle/super car, or drinking expensive wine. At the end none of the specs or tests matter. It is a form of art. If it makes the listener feel better (even if it's just psychological) then it's probably worth it.
[-]
- munchler a day ago
  To expand on this a bit, I appreciate some audio overkill because, if I do hear sizzle or distortion, it eliminates one possible reason and helps me figure out what’s actually happening.
  It’s like having gigabit internet to my house: I don’t actually need it, but when a website is slow, I know the problem isn’t in my internet connection.
  [-]
  - ubercow13 21 hours ago
    Would 192khz audio result in less sizzle and distortion? Or more audible band IMD from the sound >22khz?
- smilekzs a day ago
  Well, at least there are objective performance benchmarks on cars, and some of them are okay proxies of performance in motorsports.
  https://www.carwow.co.uk/blog/carwow-quarter-mile-400-metre-...
  https://en.wikipedia.org/wiki/List_of_N%C3%BCrburgring_Nords...
- meowface a day ago
  Correct. I've paid for Tidal for a decade because I just like the peace of mind that it's closer to the original recording. I'm sure it's mostly placebo, but I like it.
  [-]
  - handedness 20 hours ago
    I tried Tidal nearly a decade ago, and the audible fluttering effect caused by their audio watermarking totally ruined certain types of music, like choral recordings, strings and such. It was obviously apparent on $20 ear buds driven by any device, far beyond the more stereotypical audiophile gripes.
    I opened a support ticket but they never responded. After that it was difficult to take their lossless claims seriously when the labels were providing such garbage source material. Their whole value prop was totally hollowed out.
    I don't know whether the labels still impose such horrible practices, but I largely gave up on streaming services after that experience and now focus on keeping good digital archives of my physical library.
    [-]
    - meowface 16 hours ago
      They don't have anything like that anymore
  - starky 13 hours ago
    I'm pretty strongly in the camp of trust the science and measurements for audio stuff. Thus I suspect it's mostly just better sounding masters, but I was shocked at how much I noticed the sound quality of Tidal compared to Spotify when I switched.
  - PaulDavisThe1st 20 hours ago
    The original recording of almost all music on Tidal was done with equipment that was very, very far from the 192kHz "fidelity" it claims.
    [-]
    - meowface 16 hours ago
      I still prefer lossless over MP3
  - yellowapple a day ago
    It's also sort of an inverted “Van Halen demanding a bowl of M&Ms with the brown ones removed” thing for me, too. The vast majority of my Tidal listening happens over Bluetooth, so that 24bit/192kHz FLAC stream is just gonna get downsampled to 16bit/48kHz anyway because that's all any Bluetooth speaker or headset is capable of doing — but the fact that it's an option in the first place signals that other things are being done right, too (namely: that Tidal's whole “we're the streaming service that pays artists the most per listen” premise actually has some semblance of merit rather than being complete marketing bullshit; while recording quality ain't the strongest signal possible for that, it's certainly a good sign when musicians/publishers are willing to send over the highest-bitrate lossless recordings they've got and not just the same ol' compressed-to-shit MPEG audio you can yank off YouTube for free).
- wat10000 a day ago
  I'd distinguish between differences that anyone can detect but some may not care about, and differences that may not be objectively detectable at all. Muscle cars, at least, are different in a way that anyone can see. Push that pedal to the floor and it feels different from a Honda Civic or whatever. Whether that difference is actually interesting or good is, of course, a matter of taste. Whereas audiophile nonsense is often indistinguishable even to the connoisseur and depends entirely on some form of self-deception. Still could be worth it, depending on what one considers worthy.
- mock-possum a day ago
  That’s actually a really good comparison, especially because - yes I can hear the difference between an excruciatingly lossless digitization of a piece of music that I’m intimately familiar with, played back on expertly configured hardware… but the difference is so little, that most of the time, I’m fine just listening to it at medium high quality streaming on a pair of <$50 headphones.
  I’ve played with the nice toys, and they are nice, but for 100x the price, they barely deliver 1.5x the experience.
WarmWash a day ago
Foobar2000 has an extension that allows you to blindly test whether you can tell the difference between two tracks.[1] The prime use is to compare different encodings of the same song from the same lossless master.
It kind of changed me a bit when I ran through 20 lossless tracks I had re-encoded to various mp3 bitrates and realized that even on a fancy system, it can be really hard if not impossible to discern even moderate lossy from lossless.
If you are an audiophile geek, really think about if you want to try this, the reality check might crack your foundations.
[1]https://www.foobar2000.org/components/view/foo_abx
[-]
- pimeys 21 hours ago
  But try streaming that mp3 from your home server at a lower bitrate to save data, e.g. as opus. And now you suddenly hear the lossy encoding.
  We store files in the highest quality because it gives us the option to encode the music without audible loss of quality.
  [-]
  - vitamark 4 hours ago
    I personally encode flacs as 192/256k opus from the start and that's fairly enough for most data save purposes, so no reencode for streaming is needed
    that's a bitrate of 1GB per 9-12h, and for cases when it's too much I just have cached music on my device (I'm lucky to have mostly empty storage on my 256gb phone)
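A quick check of the "1GB per 9-12h" figure, as a sketch. It assumes 1 GB means 10^9 bytes and ignores container overhead and variable-bitrate swings; the helper name is made up.

```python
def hours_per_gigabyte(bitrate_kbps: float) -> float:
    """Hours of audio that fit in 1 GB (10**9 bytes) at a constant bitrate."""
    bits_per_gigabyte = 8 * 10 ** 9
    seconds = bits_per_gigabyte / (bitrate_kbps * 1000)
    return seconds / 3600


if __name__ == "__main__":
    for kbps in (192, 256):
        print(f"{kbps} kbps: {hours_per_gigabyte(kbps):.1f} h per GB")
```

That gives roughly 11.6 hours at 192 kbps and 8.7 hours at 256 kbps, in line with the estimate above.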
cozzyd a day ago
What a human centric view. I like my music to scare neighbor's pets.
kstenerud 19 hours ago
What's really interesting to do with all of these people arguing over audio formats (as always happens on HN) is to point a frontier model at this thread.
In a nutshell: nullc, rahimnathwani, zamadatix and vor_ know their shit, and geraldmcboing and PaulDavis are technically correct but talking past each other. speak_on and TheOtherHobbes are confidently wrong.
And also: 44.1 kHz captures the entire human audible spectrum with room to spare, and 16-bit already goes beyond anything useful for listening. The higher resolution / sample size format is useful for production or archival purposes only.
The two main reasons why you hear a difference between the two formats: (1) it's likely a different master, (2) tiny gain differences in the signal (salesmen use this trick, but it's also easy to do it by mistake).
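The numbers behind the 44.1 kHz / 16-bit claim can be sketched with two textbook formulas: the Nyquist limit (half the sample rate) and the ideal quantizer SNR for a full-scale sine (roughly 6.02 dB per bit plus 1.76 dB). This assumes an ideal converter with no dither or noise shaping, so treat the output as a theoretical ceiling, not a measurement of any real device.

```python
import math


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency representable at a given sample rate."""
    return sample_rate_hz / 2


def ideal_dynamic_range_db(bits: int) -> float:
    """Theoretical SNR of an ideal N-bit quantizer driven by a full-scale sine."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for rate in (44_100, 48_000, 192_000):
    print(f"{rate} Hz sampling -> Nyquist at {nyquist_hz(rate):.0f} Hz")

for bits in (16, 24):
    print(f"{bits}-bit -> about {ideal_dynamic_range_db(bits):.1f} dB")
```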
ryankrage77 20 hours ago
I decided to test for myself, downloaded Lacinato ABX and compared a 32-bit 352.8kHz flac I had lying around against the same file downsampled to 16-bit 44.1kHz. I couldn't tell any difference. Then I tried 192k mp3... still no difference. Couldn't reliably differentiate 128 or 64kbps mp3 either. I had to go down to 32k before I could be certain which was which, and even then I still had to listen carefully. Think I need to get my ears checked. I know I can't hear much above 15-16kHz but I didn't think it was this bad.
[-]
- amanaplanacanal 15 hours ago
  15khz is about my upper limit. Too many loud concerts back in the 70s and 80s without hearing protection.
- jnaina 19 hours ago
  same here. seems all the years of q-tip use is saving me money by not needing to buy expensive Hi-End Audio gear.
- DiabloD3 19 hours ago
  Yeaaaaaaaaaahhhh.... you might be a tiny bit deaf.
  OTOH, we know nothing of your audio equipment nor how it's set up.
  [-]
  - ryankrage77 17 hours ago
    HiFiMan Sundara headphones, focusrite scarlett 2i2 interface. Although, it turns out the Scarlett is set to 48KHz anyway, and I can't seem to change it easily under linux. Not that it seems to matter for my ears, lol.
    EDIT: Did some more ABX testing with a CD-quality track that I'm much more familiar with ('Introduction' from the Mirrors Edge soundtrack, which has been my go-to for comparing audio gear for the last decade). I could sometimes distinguish 128k mp3 this time, though interestingly, I got it consistently wrong rather than right. For some reason the compressed version seems to be my preference. Dropping to 96k mp3, I got it right 100% of the time - though only because there was a very noticeable difference in the stereo positioning of the first sound, rather than a difference in the quality of the sound. I think if it were mono I would still be unable to tell.
sholladay a day ago
Music producer here. High resolution audio is useful for editing and anywhere there might be downstream processing or format conversion that may or may not be high quality, let alone lossless. The article covers that pretty well.
However, the article claims that the final distribution doesn’t need to have a bit depth of more than 16. That does not match my experience. I can tell the difference between my renders that are 16 bit vs 24 bit. I cannot tell the difference between 44.1 kHz and higher sample rates, and that’s consistent with the math (Nyquist-Shannon), but bit depth is a different matter. Would be fun to participate in a double-blind test that includes my own tracks and others.
[-]
- PaulDavisThe1st 20 hours ago
  > I can tell the difference between my renders that are 16 bit vs 24 bit.
  established using double blind testing, I assume?
- nok22kon 21 hours ago
  thermal noise allows about 18-22 bits of real precision at audio level voltages, so it's plausible that 16 bit is somewhat limiting
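A back-of-the-envelope check of the thermal noise figure, using the Johnson-Nyquist formula v = sqrt(4kTRB). Every input below (1 kOhm source resistance, 300 K, 20 kHz bandwidth, 2 V RMS full scale) is an assumption picked for illustration, not something stated in the comment, and real circuits have other noise sources on top of this.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def thermal_noise_vrms(resistance_ohm: float, bandwidth_hz: float,
                       temperature_k: float = 300.0) -> float:
    """RMS Johnson-Nyquist noise voltage of a resistor over a bandwidth."""
    return math.sqrt(4 * BOLTZMANN * temperature_k * resistance_ohm * bandwidth_hz)


def equivalent_bits(full_scale_vrms: float, noise_vrms: float) -> float:
    """Bits whose ideal quantizer SNR (6.02 dB per bit + 1.76 dB) matches this SNR."""
    snr_db = 20 * math.log10(full_scale_vrms / noise_vrms)
    return (snr_db - 10 * math.log10(1.5)) / (20 * math.log10(2))


# Assumed example values: 1 kOhm source, 20 kHz audio band, 2 V RMS full scale.
noise = thermal_noise_vrms(1_000, 20_000)
print(f"noise: {noise * 1e6:.3f} uV RMS")
print(f"equivalent resolution: {equivalent_bits(2.0, noise):.1f} bits")
```

Changing the assumed resistance or signal level moves the answer by a bit or two in either direction.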
  [-]
  - PaulDavisThe1st 20 hours ago
    16 bit may limit it on the input side, but the question is more about human hearing's sensitivity on the "output" side ...
  - guenthert 10 hours ago
    What? You still operate your ADCs without active cooling?
    I'm writing in jest, but long time ago, -hp- used actively cooled FETs (not a very popular approach today as that caused problems with condensation and we have better FETs now).
HelloUsername a day ago
(2012) https://news.ycombinator.com/item?id=3668310 316 comments
(2014) https://news.ycombinator.com/item?id=8689231 424 comments
(2015) https://news.ycombinator.com/item?id=10520639 228 comments
(2017) https://news.ycombinator.com/item?id=15127633 428 comments
(2019) https://news.ycombinator.com/item?id=19318898 314 comments
glimshe a day ago
Just get one of those "hi fi" valve amplifiers from Amazon you see under $100. The valve already distorts the sound, so you don't need to bother paying more for low distortion anywhere else in the audio chain. Saved you thousands of dollars, done!
[-]
- PaulDavisThe1st 20 hours ago
  Distortion is why people love the sound of vinyl.
  And its all good! It's perfectly fine to say "I prefer the sound when the whole mix (or just that guitar) ends up being subject to interesting and possibly harmonically relevant distortion at low levels".
  Just don't say "The version with the distortion is more accurate than the one without", because that's a lie.
manoDev 21 hours ago
They make sense for so called audiophiles who don’t understand Nyquist frequency theory.
It’s like photographers who are confused about the difference between raw and bitmap (jpeg), videographers confused about the difference between linear raw vs log vs gamma encoded, etc.
Just because a data format with higher bit depth/sampling frequency/whatever exists for editing purposes, doesn’t mean it’s “better” or makes sense as a consumption format for a finished work.
[-]
- eimrine 13 hours ago
  Nyquist describes harmonic frequency while music can contain non-harmonic oscillations, also Nyquist theory tells about the maximum possible frequency, but this same frequency can be placed in such a way with respect for quantization points that it becomes a silence.
  I like your point about editing, because trying to mix too many tracks at 44100 or 48000 produces spontaneous click sounds which are not supposed to be there.
- casion 21 hours ago
  They make sense for sound designers and derivative artists (e.g. sampling, which is a real artform).
  Forms of manipulation bring inaudible content into the audible range.
  Of course that doesn't mean audiophiles aren't being audiofooled by it, but there is legitimate usage.
hobonation a day ago
Counter: An ultra high bit rate solves the problem and you can stop worrying if it's the weakest link.
You can then focus on other things.
Example: I Bought the best skis possible. Now I know I need to just focus on my skills and not blame the equipment.
[-]
- RijilV a day ago
  I hate to be the one to break it to you, but high end skis make tradeoffs which are harmful to beginner or intermediate level skiers... also there's sorta no such thing as "best ski". what you'd want for high speed bombing double blacks is going to be different from off piste or moguls or snow park fun.... double also, skis wear out. Depending on who you want to believe it's as low as 20-30 days. Which, granted the average skier is at something like 5 days a year. but if that's you... triple also?
  As for how this relates to audio compression, in particular in the context of 2012. you are making a tradeoff of storage size and decompression cost. Maybe that doesn't matter to you, but maybe it either did in 2012 or still does.
  [-]
  - hobonation an hour ago
    You're acting like I don't have everything from Hellbents all the way to Fisher GS skis all the way into Vole Scaled Skis for Telemarking. Just don't buy shit, and you don't get to blame the equipment.
    And none of them are broken after 20 days unless it's low tide or I fuck up on a cliff band. I ski a minimum of 100 days a year, and the only thing you can notice after 60 days is some slight decambering on softer skis.
    I will say that boots tend to get soft around 100 days. But usually dealing with that's also a skill issue; get good balance, and you don't have to have the boot hold your sorry form in place. It's how people basically skied for a thousand years with leather boots: They were good.
- hackingonempty a day ago
  The point of this article and video is there is no problem with 16-bit 44.1 kHz PCM. It thoroughly covers the audible range and there is absolutely no need for more when distributing music for humans to listen to.
  The problem is the people spreading myths and disinformation out of ignorance or to promote their enterprise.
  The weak links are producers/mastering-engineers, speakers/headphones and the room when using speakers.
me551ah a day ago
Nobody downloads music these days and everybody just streams. Audio at 24 bit still takes a small fraction of the bandwidth that 1080p video takes, so I don’t understand the hate for it.
I use a DAC by focusrite which can do 24-bit, and if I want to listen to higher fidelity audio on my planar headphones then I should be able to. Why should I limit myself to 16-bit?
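For scale, the raw data rates involved can be computed directly. This sketch covers uncompressed stereo PCM only; lossless codecs such as FLAC will come in lower by an amount that depends on the material, and no video bitrate is assumed here.

```python
def pcm_bitrate_kbps(bits: int, sample_rate_hz: int, channels: int = 2) -> float:
    """Raw PCM data rate in kilobits per second (1 kbit = 1000 bits)."""
    return bits * sample_rate_hz * channels / 1000


for bits, rate in ((16, 44_100), (24, 44_100), (24, 96_000), (24, 192_000)):
    kbps = pcm_bitrate_kbps(bits, rate)
    print(f"{bits}-bit / {rate / 1000:g} kHz stereo: {kbps:.1f} kbps")
```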
[-]
- mingus88 a day ago
  Counterpoint: bandcamp is doing well. Vinyl sales are doing well.
  If I like an artist that I find on streaming, I buy an LP and get a lossless download for free. I still have a music library and I will never rent my favorite music.
  Artists prefer to connect directly with their fans and BC is probably the best platform for people who care to pay and support acts directly. They have high res downloads and I import them.
- zamadatix a day ago
  I don't think the hate is about people who know it doesn't actually sound different if the audio file is 16 bit or 24 bit or necessarily about receiving a few more bytes than they need, it's about the pushes by these types of streaming services/offerings or people insisting that it's supposed to be any better for listening when it's not.
  Also the playback rate and the file rate are different topics. The former can get into scenarios more like the audio processing section of the article e.g. I had this one shitty headset for work which required me to set the volume to 1-2 (out of 100) on the computer and I could actually blind test tell when it was in 16 bit or 24 bit mode because it was cutting and boosting it so much it effectively lost precision in 16 bit mode.
- pimeys 21 hours ago
  Wait, what? I do download everything I listen to. And Roon is quite popular in the music communities. How else can you make sure you have that correct mastering of your favorite album?
dlcarrier a day ago
There is a good reason to distribute it though, and compressed it doesn't really change the file size.
There's multiple YouTube channels that I listen to as podcasts, that are professionally created and the creators presume that exported audio works like studio audio, so what you end up with is really quiet audio that can't be turned up without pre-processing.
If we distributed audio the same way we work with it in a studio, we could forgo a lot of problems.
Also, the human ear does have enough dynamic range to make 24 bits worthwhile, though that much dynamic range is rarely used in recordings, and that high of a bit depth provides no benefits within a small dynamic range. A 192 kHz sample rate, on the other hand, is always useless.
PcChip a day ago
I'm curious if the audio was being sent bit-perfect to the DAC for all of these tests (ALSA direct), or if it was being run through the audio mixer and being resampled
I can always tell if my 44.1 songs are being resampled to 48 because they're being run through the OS mixer
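One reason 44.1 to 48 kHz conversion is harder than it looks: the ratio is not an integer, so a resampler conceptually has to interpolate up and decimate down by fairly large factors, with a good low-pass filter in between. A minimal sketch of the ratio arithmetic (the function name is made up for illustration; this is not a resampler):

```python
from fractions import Fraction


def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Reduced ratio target/source: interpolate by the numerator, decimate by the denominator."""
    return Fraction(target_hz, source_hz)


for src, dst in ((44_100, 48_000), (44_100, 88_200), (48_000, 44_100)):
    ratio = resample_ratio(src, dst)
    print(f"{src} -> {dst}: up by {ratio.numerator}, down by {ratio.denominator}")
```

How audible a given mixer's resampling is depends on the filter it uses, which this arithmetic says nothing about.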
[-]
- dist-epoch a day ago
  Proper audio resampling should not be identifiable. Of course, the OS mixer probably doesn't do proper (CPU expensive) resampling.
  But a quality audio player should account for this and do its own.
  [-]
  - PaulDavisThe1st 20 hours ago
    If you're not on a US-based IP, you should check out https://src.infinitewave.ca/
    It is an incredible resource to see the quality of the resampling algorithms used by the actual production software likely used in any digital audio workflow.
    You will see that while the best are indeed almost 100% transparent, many are not.
    [-]
    - danadam 19 hours ago
      > If you're not on a US-based IP, you should check out https://src.infinitewave.ca/
      There is also https://src.hydrogenaudio.org/ (with no IP based restrictions, AFAIK).
      [-]
      - PaulDavisThe1st 19 hours ago
        Currently giving me: "Internal Server error reading database"
    - nok22kon 20 hours ago
      I remember using Adobe Audition for resampling audio, this site shows I had good intuition
      your software is among the best, but not pitch black best :)
      [-]
      - PaulDavisThe1st 20 hours ago
        Yeah, we use Secret Rabbit Code for ours, though we have access to the sox code now and that is "perfect". We might change to that as the default sometime this year.
  - PcChip a day ago
    I'm also one of those audiophile crazies that obsesses over which metals to use in cabling, power filtering, swapping opamps, and builds their own DACs, amps, and speakers
  - rasz a day ago
    "proper" resampling was expensive in 1997 when Intel was introducing fixed sampling AC'97, but was below noise floor of CPU load meter in 2007 when Microsoft released Vista killing hardware mixing.
rz2k a day ago
My good enough amplifier and DAC combo claims up to 24bit/192kHz, I use a cheap optical interface from my computer that claims up to 32bit/192kHz, and the streaming service I use serves most albums at 24bit/44.1kHz.
It would have cost the same for the entire stack to be 16bit/44.1kHz at every step, but with excessive resolution I can control the volume anywhere. The bits right before the analog conversion at the end are essentially the same whether I turn down the volume in the software player, the operating system, or the DAC/amplifier.
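A rough way to put numbers on the digital volume argument: each 6.02 dB of digital attenuation uses up about one bit of the container. The sketch below assumes plain linear gain applied in the integer domain with no dither, which is a simplification of what real players and DACs do.

```python
import math


def bits_lost_to_attenuation(gain_db: float) -> float:
    """Bits of resolution given up by attenuating digitally by `gain_db` (must be <= 0)."""
    if gain_db > 0:
        raise ValueError("this sketch only models attenuation")
    return -gain_db / (20 * math.log10(2))


def effective_bits(container_bits: int, gain_db: float) -> float:
    """Resolution left for the signal after digital attenuation."""
    return container_bits - bits_lost_to_attenuation(gain_db)


for bits in (16, 24):
    left = effective_bits(bits, -40.0)
    print(f"{bits}-bit path at -40 dB: about {left:.1f} bits left")
```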
[-]
- PcChip a day ago
  you might want to see if your DAC re-clocks incoming optical, if not then it's relying on the cheap clock generator from your computer
  [-]
  - rz2k a day ago
    Some people have claimed to hear an improvement with an external clock on a Wiim Ultra, but I do not think it is possible to re-clock the WiiM Amp Ultra with an outboard clock.
    When I play from the computer, I'm not sure whether it is using the clock on my Mac, the clock on the optical interface, or the WiiM's clock. However, I do not notice any difference in fidelity when I use the Qobuz software player on my Mac or use Qobuz Connect to allow the player to directly stream from the source, so either it isn't a difference that I can hear, or the WiiM's internal clock is used for both sources.
  - rasz 6 hours ago
    That cheap clock source is good enough for 16 GHz PCIe signaling, but I'm sure you will be able to hear the jitter!
    [-]
    - porridgeraisin 3 minutes ago
      After that "audio player that lays out the file in contiguous memory to avoid cpu jitter" example from above, this one is my new favourite audiophile trope. "cheap clock on the computer" I giggled.
saidnooneever 7 hours ago
They make sense for DJs and producers. preferably WAV full quality.
in DSP it matters a lot, so if mixing digitally or producing remixes etc. it's useful to have more and larger samples to work with
codedokode 21 hours ago
192 kHz vs 48 kHz can make a difference if you slow down the audio. If you pitch shift down 2 octaves, the ultrasonic range 20-80 kHz turns into 5-20 kHz and there will be large difference between 192 kHz and 48 kHz sources. However, I do not know if it would sound good because the mixing engineer cannot hear those frequencies and mix them properly, or the microphone might not catch it or some of the material could be recorded with lower quality.
Also, sadly consumers are getting used to low quality audio nowadays - they often listen to lossily compressed audio on social media (sometimes decompressed and re-compressed several times) which is then re-compressed to send to bluetooth headphones, or played back on awful smartphone speakers. Streaming services also use compressed audio.
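The pitch-shift arithmetic is easy to check: each octave down halves every frequency, so the highest content a source can hold (its Nyquist frequency) is divided by the same factor. A small sketch, with function names invented for illustration:

```python
def shift_band(low_hz: float, high_hz: float, octaves_down: int) -> tuple[float, float]:
    """Where a frequency band lands after slowing playback by whole octaves."""
    factor = 2 ** octaves_down
    return low_hz / factor, high_hz / factor


def highest_after_shift(sample_rate_hz: float, octaves_down: int) -> float:
    """Top of the spectrum after the shift, limited by the source's Nyquist frequency."""
    return (sample_rate_hz / 2) / 2 ** octaves_down


print("20-80 kHz band, 2 octaves down:", shift_band(20_000, 80_000, 2))
for rate in (48_000, 192_000):
    top = highest_after_shift(rate, 2)
    print(f"{rate} Hz source, 2 octaves down: content up to {top:.0f} Hz")
```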
speak_on a day ago
At a minimum, anything above 16/44.1 requires far more than just files: monitors, a treated room, listening position, DAC, etc... but most importantly - a trained ear. That last one is the most uncomfortable truth.
[-]
- Blackthorn a day ago
  Are you, per chance, a dog posting on the internet? Since 44.1khz sample rate is already past the range of the human ear, regardless of training.
  [-]
  - MertsA a day ago
    You need at least twice the frequency range for sample rate in order to represent the original signal. That's slightly misleading though, that's from the Nyquist-Shannon sampling theorem and it's a mathematical fact but that is true for exact numerical samples, once you add in quantization that muddies the water a bit. Taken at the extreme, it's straightforward to see why a 1 bit quantization per sample at 44.1 kHz would not capture a perfect representation of some analog signal even if there's only a 1 kHz frequency component to the signal. If we instead decide to sample at 10 MHz but still one bit quantization, now that 1 kHz frequency component can be much more accurately represented even though we're still using the worst quantization possible. Don't think of quantization like a square wave or a step pattern, think of it as "the signal is closer to here than any other discrete value".
    Now in terms of realistic audio encoding, 16 bit at 44.1 kHz is designed to be a faithful representation as far as human hearing is concerned. Can someone with a trained ear potentially tell the difference between that and 24 bit at 192 kHz? In a studio environment it's possible. Most audiophile claims are dubious and a blind A/B test catches them out on most of it but the Nyquist-Shannon sampling theorem does not directly apply to quantized samples, it's about exact samples and with quantization, sampling rate is intertwined somewhat with the quantization depth.
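The trade between sample rate and quantization depth can be illustrated with a toy first-order delta-sigma modulator, which is one common way (swapped in here for illustration, not something the comment describes) to make a 1-bit stream track a slow signal. The 2.8224 MHz rate, the 64-sample moving average, and the function names are all assumptions chosen to keep the sketch small.

```python
import math


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: inputs in [-1, 1] become a +1/-1 stream."""
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        # Accumulate the error between the input and the previous 1-bit output.
        integrator += x - feedback
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(feedback)
    return bits


def moving_average(values, window):
    """Crude low-pass filter: mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


rate_hz = 2_822_400          # assumed 1-bit rate (64 x 44.1 kHz)
tone_hz = 1_000
n = 2 * rate_hz // tone_hz   # roughly two cycles of the tone
signal = [0.5 * math.sin(2 * math.pi * tone_hz * i / rate_hz) for i in range(n)]

smoothed = moving_average(delta_sigma_1bit(signal), 64)
worst = max(abs(s - signal[i + 32]) for i, s in enumerate(smoothed))
print(f"worst-case error after smoothing: {worst:.3f}")
```

Lowering `rate_hz` while keeping the tone fixed makes the smoothed output track the sine less closely, which is the point being made above.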
  - move-on-by a day ago
    I don’t have great hearing, so I’m not sure I can really weigh in here (thanks punk concerts in my teens). I remember similar arguments around screens and 60Hz vs ‘the human eye’. I think a lot of people, myself included, can easily perceive the difference between 60Hz and something higher- given the right conditions. I would not be so quick to disregard claims of more sensitive hearing.
    [-]
    - speak_on a day ago
      (I commented on this topic above/below in more detail.) Even with not-so-great hearing you would still be able to identify the difference (ie artifacts are pushed down, not up). Look up articles on the practical limitations of AD/DA converters and why the seemingly counter-intuitive claim that the difference between 44.1 kHz and above is noticeable, is actually a fully industry-accepted practical reality: aliasing, AD/DA lowpass filters, etc.
    - labcomputer a day ago
      I would. It’s really simple.
      The human threshold-of-hearing curve intersects the threshold-of-pain curve at about 20 kHz.
      Above that frequency (or thereabouts) the sound has to be so loud that it will literally instantly damage your hearing before you can hear it.
      This has been replicated across many studies for more than 100 years.
      Flicker threshold is completely different. You can’t damage your vision by increasing the FPS, and it has always been commercially desirable to use a lower frequency because that is cheaper.
      [-]
      - speak_on 21 hours ago
        Would you agree that a trained human could identify artifacts produced by an imperfect conversion process? If you lean "yes", then that's your answer: AD/DA is not a Rust function perfectly implementing the Nyquist theorem, it's a collection of physical components many of which introduce artifacts into the audio path. This thread is not about the theory of human hearing, the electronic components are literally imperfect.
        [-]
        PaulDavisThe1st 20 hours ago
        They're no more imperfect than the pickups on an electric guitar, the assembly inside the microphone, the circuit in the compressor and everything else in the analog signal chain that exists long before AD happens.
        [-]
        speak_on 20 hours ago
        Absolutely! All these examples have imperfect audio paths - that is the point.
        [-]
        PaulDavisThe1st 20 hours ago
        But the central point is that there's no reason to pick on the digital elements in any particular way. Recorded music in 2026 is a pretty good recreation of the original acoustic pressure waves when it is intended to be, but (a) not perfect, even in the pure analog domain and (b) it is frequently not intended to be.
        [-]
        speak_on 20 hours ago
        The central point is that AD conversion can and will introduce artifacts. DA process will introduce more artifacts. The "imperfect" is a huge range and AD/DA converters play a role in that. We are not talking about "golden cables" bs here, conversion does introduce measurable artifacts in the audio path. The more tracks you record the more artifacts you have. Can everyone hear them? Definitely no. Can they be heard - yes, I can hear the difference between an old Digidesign interface and Grace Design interface.
        [-]
        PaulDavisThe1st 20 hours ago
        No, the central point is that the analog signal handling before AD introduces vastly more "artifacts" than the AD or DA does.
        In addition, nobody cares about "measurable" artifacts (or rather, they should not). What matters are "audible" artifacts. We have measuring equipment that is vastly more sensitive than human ears (e.g. your recording equipment that can pick up signals far above 22kHz). What's measurable is not particularly interesting - what's audible is.
        Artifacts do not sum linearly, because they do not originate from correlated sources (unless you're doing something rather unusual).
        Glad you can hear the difference between two converters, but I trust you've tested it in a double blind setting?
        [-]
        speak_on 19 hours ago
        Hm, no. The discussion was never about analog artifacts vs AD conversion artifacts. Both are present. And not sure why you use "artifacts", do you not believe the artifacts are real? How can the lowpass filter not introduce artifacts?
        And absolutely - I blind tested converters extensively. Mbox2, Black Lion Audio upgraded converters, UA, Prism.
        [-]
        PaulDavisThe1st 19 hours ago
        Note: blind testing is not double blind testing. Scientists evolved double blind for a reason: blind testing doesn't remove bias.
        Yes, the discussion was "never about analog vs AD". But my point is that I see little point wasting time on one set of artifacts (in the digital realm) that are tiny compared to those introduced in the analog realm. If there's a mouse and an elephant about to enter your home, you focus on the elephant, no?
        The big difference, of course, is that "everyone" has convinced themselves that most/all of the analog artifacts, as big as they are, are somehow "tasteful" or "artistic", whereas the digital ones are just "math errors". I don't think that is too helpful.
        And look, if lots of people could get through double blind tests and still show they can hear aliasing or whatever the digital artifact du jour is, then I'd say "yes, absolutely, we need to be very aware of this and do everything we can to reduce or eliminate it". But as far as I can tell, this just isn't the case.
        [-]
        speak_on 18 hours ago
        This is a more philosophical take… And I totally agree with you. I mix at 16/44.1 just for the record. I do not buy into the idea of gold plated connectors or 96 kHz mixing. My point was never about quality - I can hear the difference (the point!), doesn't mean for me personally > 44.1 is "better" or "worse".
        To your main point: yes, all artifacts are just our learned, cultural, developed preferences. In the exact same way major/minor thirds were considered dissonant just a few hundred years ago - it's all a learned perception, not an absolute judgment.
        I would go even further, doesn't matter whether people perceive aliasing as a major issue, it's no different from the U47 "warmth". You can't afford this, probably, as a software developer in a way, but at the most fundamental level any sound's - or artifact's - judgment is based on our current diagram of "sounds nice" vs "sounds bad".
        ses1984 20 hours ago
        Can you give any examples of people identifying these artifacts in a/b tests?
        Who has the best ears? What can they detect?
        [-]
        speak_on 20 hours ago
        OK, so we are entering the stage of "can you provide a double-blind study link". I can look it up, I am not a researcher. Here is one: https://qmro.qmul.ac.uk/xmlui/bitstream/handle/123456789/134...
        I know from my 20-ish year mixing experience that I can hear the difference when mixing. Is it good evidence? No. So we can agree to disagree then.
  - speak_on a day ago
    As I responded below, you are confusing math with physical reality. A true 44.1 kHz converter can't realistically capture frequencies ~18-20 kHz due to the limitations of filters used in the process. A perfect lowpass brick-wall filter just does not exist - they all introduce artifacts, which a trained ear can identify. You don't need to be a dog to hear the difference, just someone who does not assume that Nyquist theorem can be magically applied in the real world (and, ideally, someone who utilizes high quality converters with oversampling).
    [-]
    - Blackthorn 20 hours ago
      That extra 4.1 khz sample rate is for headroom for a low pass filter (and not necessarily a brick wall one). Leftovers or any such artifacts are below the noise floor, which is also an important part of the physical reality.
      Would be happy to see an actual, real study to prove that humans can notice, but to my knowledge none exist that confirm they can. Not even any on teenagers or younger (the only group that can even hear close to 20 kHz).
    - vor_ 21 hours ago
      Is there evidence that a trained ear can reliably perceive these artifacts in a blind test of converters? I'd be interested in reading those links since converters typically oversample into the MHz range. At 11.29 MHz (256x 44.1 kHz), Nyquist will be at 5.64 MHz. Even the cheapest consumer converters are performing this type of oversampling.
      [-]
      - speak_on 20 hours ago
        If you are looking for studies, this one comes to mind: https://www.researchgate.net/publication/289039184_The_audib...
        A quick search returned this PDF with a nice diagram of what aliasing looks like: https://download.tek.com/document/76W_30631_0_HR_Letter.pdf
        To draw a design parallel: pixel-perfect design isn't something we are born with, noticing tiny details is a developed skill.
        And yes, you are on point: oversampling is used extensively, but this just points at the exact issue: Nyquist theorem gave us a math algorithm, we still need to account for the electronic component imperfections. And then we are entering a different space of quality/precision/psychoacoustics/perception/etc. Meaning, not all converters, not all pre-amps, not all mics "sound" the same, even when they use same types of components on paper.
        [-]
        vor_ 20 hours ago
        Oh, dear, that AES 2014 paper from Meridian (which was trying to push its controversial proprietary MQA audiophile system the same year) was widely criticized on audio forums when it came out, ranging from the rectangular dithering method to the use of a hard metal tweeter that could cause IM.
        Do you have more convincing sources?
        [-]
        speak_on 19 hours ago
        I don't. Do you? I am not a researcher. Saying that, do you have a double-blind study handy on MP3 256 vs 320 actual audible differences? If not, can you yourself hear the difference? If you can - it might be an illusion.
  - clawlor a day ago
    Max representable frequency is half the sampling rate (nyquist-shannon theorem), which is still a bit above normal but IIRC the extra headroom has something to do with eliminating aliasing
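To make the aliasing point concrete: if a pure tone above half the sample rate reaches the sampler with no anti-alias filter in front of it, it shows up folded back into the representable range. A minimal sketch of that folding (an idealized model, with a hypothetical helper name, ignoring the filters that real converters use):

```python
def alias_frequency(tone_hz: int, sample_rate_hz: int) -> int:
    """Apparent frequency of a pure tone after sampling with no anti-alias filter."""
    folded = tone_hz % sample_rate_hz
    # Anything above Nyquist mirrors back around half the sample rate.
    return min(folded, sample_rate_hz - folded)


for tone in (10_000, 22_050, 25_000, 30_000):
    print(f"{tone} Hz sampled at 44100 Hz appears at {alias_frequency(tone, 44_100)} Hz")
```

This is why the content above Nyquist has to be filtered out before sampling, and why the gap between 20 kHz and 22.05 kHz matters for the filter design.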
    [-]
    - Blackthorn a day ago
      Indeed. And what is the max frequency that a human can hear?
      [-]
      - speak_on a day ago
        The artifacts produced by pure 44.1 kHz conversion are aliased back down to lower frequencies. It's not about a theoretical human ear, it's about the actual physics of AD/DA conversion.
        [-]
        PaulDavisThe1st 20 hours ago
        But the energies of the signal present above the Nyquist frequency (22050Hz in this case) are almost always incredibly weak, and double blind testing rarely shows any indication that humans can actually hear the aliasing.
        [-]
        speak_on 20 hours ago
        Mixing process often involves hundreds of tracks, and if each introduces aliasing, this can become a problem. Some engineers do swear by "the final mix is 16/44.1 so why mix at a different resolution?" mantra - that's fine too.
        [-]
        PaulDavisThe1st 20 hours ago
        This is false. Aliasing is not additive in any meaningful way.
        [-]
        speak_on 20 hours ago
        Ok dude, you obviously never recorded anything. Twelve mics on a drum kit, 60 tracks of rhythm guitars, several bass guitar layers, vocals, backing vocals, electric organ, percussions, saxophone solo. Do you think recording them at 44.1 somehow creates a shared "cloud-based" aliasing artifact that I store in S3?
        [-]
        PaulDavisThe1st 20 hours ago
        > Ok dude, you obviously never recorded anything.
        https://ardour.org/ is my website.
        [-]
        speak_on 19 hours ago
        Ha! OK I take that back.
        Firstly, it's an amazing experience to randomly interact with people like you - I love and use your software. Hats off and thanks for what you offered to the industry!
        But secondly, your statement makes even less sense to me: obviously artifacts do add up. Yes, not linearly, like any complex audio in general. But the more tracks with artifacts I have, the more artifacts I have overall. It's not like they cancel each other (outside of normal frequency cancellation).
      - Rotundo a day ago
        Depends on age of the listener, on average, 30 to 50 year olds hear a maximum frequency of 14 to 16 kHz.
        [-]
        Blackthorn a day ago
        Right. Which are quite below 1/2 of 44.1k!
        [-]
        OkayPhysicist a day ago
        Sure, but those are averages. I'm 30-ish, and my hearing doesn't cut out until somewhere in the 21kHz range. When I was younger, it was even higher. One of my roommates in college had one of those anti-rodent high-frequency noise generators, we almost came to blows over it.
- UtopiaPunk a day ago
  If you want to hear the difference between an audio file recorded at 44.1 and 88.2kHZ, then you need to slow the audio playback down. Otherwise, a trained ear cannot physically hear the difference.
  [-]
  - speak_on a day ago
    44.1 is "enough" only in theory. This assumes a physically impossible steep filter. Realistically, frequencies around 20 kHz will create audible artifacts (aliasing). So yes, a trained ear can tell the difference between 44.1 and even 48 kHz. Like many other commenters in this thread, you are mixing up math theory with physical limitations of AD/DA converters. Oversampling is a common way to address this limitation, but strictly speaking 44.1 kHz is not as obviously "enough" as it seems.
    [-]
    - PaulDavisThe1st 20 hours ago
      > Realistically, frequencies around 20 kHz will create audible artifacts (aliasing)
      The energy of the signal components above the Nyquist is generally very low, and very few double blind tests have given any indication that humans can detect the resulting aliasing (even though many people claim to be able to do, almost always in non-double-blind environments).
      Badly written digital synthesis can generate high energy signal components above 22kHz, but that's because they're badly written, not because the theory is wrong.
      [-]
      - speak_on 20 hours ago
        Generally very low for a single track? What about 200 tracks? Badly written synthesis, or badly recorded live instruments, or bounced and re-bounced dozens of times... we are not talking about the quality-defining aspect here. You can produce an excellent mix on KRKs connected directly to a MacBook.
        This space is not driven by a single precise formula. 48/96 kHz helps some engineers to produce better sounding mixes. Can everyone hear the extended range of Adam tweeters? Probably not. But some can, and they benefit from that. Even if there is no double-blind study to prove this in absolute terms.
        [-]
        PaulDavisThe1st 20 hours ago
        If you recorded 200 tracks of the same instrument, so that the partials above Nyquist were all broadly the same, then sure, summing the tracks would include summing 200 copies of the aliasing results too.
        But very little music is like that, and the energy profile above Nyquist will differ dramatically. Consequently, you're not summing a set of identical aliasing results, and in general, the results will still be undetectable to almost everyone.
        Jacob Collier routinely works with 300+ tracks in Logic. He doesn't worry about this sort of thing, and neither do the Grammy voters who love what he does.
        [-]
        speak_on 19 hours ago
        Got it. Grammy voters love Collier's mixes. What about Tony Maserati? He can clearly tell the difference between 44.1 and 88.2. If your argument is that these engineers can't hear the difference - you are going to be disappointed. They can. Even Dave Pensado who mixes at 16/44.1, does that because he rejects the idea, he can hear the difference according to him.
        [-]
        PaulDavisThe1st 19 hours ago
        I can find no evidence that Maserati (or 99% of any other mix or mastering engineers) has ever tested his "appreciation" of the "crunch" at 44.1 in a double blind environment.
        It is always amazing how much that is claimed about what people can hear fails to show up when tested in this, the only acceptable scientific way.
        Perhaps Maserati has done this, and could still tell the difference. In which case, he should carry on! But he should carry on anyway! People should do what brings them joy, and if he likes working at 44.1kHz or whatever, he should absolutely do that.
        What people should not do is lecture about stuff that isn't true and/or isn't demonstrable in proper test settings, and most (not all, but most) of the SR stuff fits into one or other or both of those categories.
    - vor_ 21 hours ago
      Do you have citations for this claim? The "golden ears" argument is often employed by audiophiles, but even the cheapest converters oversample by up to several hundred times as well as employ antialiasing filters.
- scns a day ago
  A treated room would be the most impactful, DACs the least.
  [-]
  - yellowapple a day ago
    The DAC is pretty impactful if it's outright incapable of outputting anything beyond the usual 48kHz :)
    [-]
    - vor_ 20 hours ago
      Even the cheapest consumer DACs oversample into the megahertz range.
  - speak_on a day ago
    The most impactful for noticing the difference? Again, I would argue it's the trained ear. If you have plenty of mixing experience then all these details add up, and a treated room becomes the most critical - agree with that.
    [-]
    - vor_ 20 hours ago
      So far, there isn't sufficient evidence that anyone has such reliably golden ears.
      [-]
      - speak_on 19 hours ago
        Other than the top engineers in the industry. This is a discussion that always ends up in the "double-blind study" vs actual real engineers working in the industry.
dist-epoch a day ago
The whole audiophile industry is built on stuff which doesn't make any sense
My favourite: "audiophile-grade" audio players which allocate a single continuous buffer of RAM into which they load/decode the whole .WAV/.FLAC file, because supposedly the CPU "jumping" between "fragmented audio" causes audible "jitter".
Of course, they don't know that what looks like continuous memory to user-code is probably discontinuous in kernel/physical RAM.
Didn't check in many years, I wonder if they created kernel level players to account for that, to have "true continuous memory"
[-]
- platinumrad a day ago
  Don't forget: "most players use malloc to get memory while new is the c++ method and sounds better."[1]
  [1] https://www.audioasylum.com/messages/pcaudio/119979/
- lmc a day ago
  > My favourite: "audiophile-grade" audio players which allocate a single continuous buffer of RAM into which they load/decode the whole .WAV/.FLAC file, because supposedly the CPU "jumping" between "fragmented memory" causes audible "jitter".
  Thanks for the laugh... this is absolutely bonkers. In case anyone is wondering, before sound hits our ears it has to go through a digital to analog conversion, which takes place on hardware independent of the CPU, operating with its own clock and buffers etc.
  [-]
  - justsomehnguy a day ago
    An Am486DX/100 was enough to decode and listen to an MP3 at 22 kHz (and maybe mono?) and was more than enough to play 44/16/2 PCM. It's 31 years old today.
    [-]
    - Sohcahtoa82 21 hours ago
      I remember playing 44 kHz 16-bit stereo MP3s encoded at 128 kbit/sec on a 133 MHz 486.
      It gobbled like 90% of the CPU and I had to make sure I gave it a pretty large buffer so it didn't stutter when an app claimed CPU for more than a second, but it worked.
  - wat10000 a day ago
    In addition to that, while it is possible to hit a delay and run out of buffer because memory access is slow (the most obvious case would be if the input got swapped to disk at an inopportune moment), the audible effect is really obvious. This isn't some subtle "oh my music sounds ineffably worse" effect, it's "my computer is glitching and my music is unlistenable."
- billyjobob a day ago
  I can tell when my CPU usage spikes because it causes a hum through my speakers, so this does not seem that far-fetched.
  [-]
  - justsomehnguy a day ago
    It just means you have a shitty audio path with not enough shielding. Move to SPDIF/TOSLINK.
    [-]
    - codedokode 21 hours ago
      I have an external audio card; if I put it on a laptop I can hear modem-like sounds. I wonder why it is so sensitive. Shouldn't a DAC produce a strong signal that cannot be easily affected by radio waves?
      Also my headphones are extremely sensitive. I can touch the ring and sleeve of a jack with a finger, and touch a metal bed frame with a tip and I hear quiet clicks as I move the tip along the metal. Sometimes I do not even need to touch the jack with a finger. It doesn't work with small objects like a knife though.
      [-]
      - PaulDavisThe1st 20 hours ago
        Bad grounding everywhere. This is insanely basic stuff.
      - nok22kon 20 hours ago
        the radio waves could be interfering with the signal before it gets amplified
- emtel 20 hours ago
  audiophiles (https://forums.stevehoffman.tv/threads/turntables-with-pace....) also claim that turntables can be rated on "timing, rhythm, and pace" in which supposedly the timing of the music can be affected by the turntable's mass and other properties.
  How this would occur without also producing grossly audible pitch distortion never seems to be discussed.
- bellowsgulch a day ago
  The latter is probably true, but the former does actually happen, and it's easy to accidentally do--lossless or not.
LarsAlereon a day ago
The main benefit for me is that digital watermarking becomes completely inaudible with high-res audio, but I can sometimes clearly hear it in standard resolution.
dijit a day ago
huh...
So I guess the programmer equivalent is distributing .pdb's (or, symbols)
[-]
- Blackthorn a day ago
  Pretty good analogy. Thing is though, the person who receives the 16-bit, 44.1khz music file can always upsample it to 192khz and not lose anything in the process (heck, lots of audio stuff oversamples internally to this level or beyond, for extra aliasing headroom!). I'm not sure about expansion from 16bit to 24bit though, downward expansion isn't necessarily perfect.
  [-]
  - gizajob a day ago
    You’d be adding 150khz and 8bits of nothing.
hgoel 21 hours ago
I still insist on the higher bitrate stuff. I don't expect to notice the difference, I just think that music where the artists have bothered to prepare those files is probably recorded with more care than otherwise. I'm not generally listening to big artists where this can just be expected, and while I don't have any evidence to support my belief, I choose to continue believing it.
I'm not interested in finetuning everything in my life for efficiency.
viccis a day ago
If you try to use empiricism when it comes to certain groups of audiophiles, you are going to be sorely reminded that it's basically the equivalent of healing crystals for a different type of person. 24/192 is useful for mixing/mastering, but completely unnecessary for the end product to distribute for listening.
[-]
- evo a day ago
  24/192 is also great for digital synthesizers--if you're generating a waveform like a sawtooth that has theoretically instantaneous transitions, they can eat as much frequency as you can give them. Running at 44khz loses noticeable high-end content.
  Most modern digital synths have already caught onto this and run internally at much higher sampling rates even if their output gets downsampled, but sometimes you run across a vintage plugin that runs at the host audio rate and working in a higher sampling rate is audible.
  [-]
  - Blackthorn a day ago
    You can generate perfect band-limited sawtooth waves at 44.1khz, there are multiple techniques for doing this and most production digital synthesizers use them.
    Oversampling gives you headroom for aliases for the rest of the synth that is more vulnerable to it.
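    As a rough illustration of the difference (a sketch, not any particular synth's implementation; the 5 kHz fundamental and the one-second analysis window are arbitrary choices), additive synthesis that stops below Nyquist can be compared with a directly sampled ramp by measuring the level at 19.1 kHz, which is where the 25 kHz fifth harmonic folds back when sampling at 44.1 kHz:

```python
import cmath
import math

FS = 44100   # sample rate in Hz
F0 = 5000    # fundamental in Hz (illustrative choice)
N = FS       # one second of audio, so every integer frequency is an exact DFT bin


def naive_saw(n):
    """Sample an ideal rising ramp directly; harmonics above FS/2 fold back."""
    phase = (F0 * n / FS) % 1.0
    return 2.0 * phase - 1.0


def bandlimited_saw(n):
    """Additive synthesis using only the harmonics that fit below Nyquist."""
    total = 0.0
    k = 1
    while k * F0 < FS / 2:
        total += math.sin(2 * math.pi * k * F0 * n / FS) / k
        k += 1
    # Fourier series of a rising ramp in [-1, 1): -(2/pi) * sum(sin(k w t) / k)
    return -2.0 / math.pi * total


def amplitude_at(samples, freq_hz):
    """Amplitude of one sinusoidal component, measured with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


naive = [naive_saw(n) for n in range(N)]
clean = [bandlimited_saw(n) for n in range(N)]

ALIAS_HZ = FS - 5 * F0   # 25 kHz harmonic folds down to 19.1 kHz

alias_naive = amplitude_at(naive, ALIAS_HZ)
alias_clean = amplitude_at(clean, ALIAS_HZ)

print(f"alias at {ALIAS_HZ} Hz, naive ramp:   {alias_naive:.4f}")
print(f"alias at {ALIAS_HZ} Hz, band-limited: {alias_clean:.2e}")
```

    With these settings the naive ramp should show an alias component of roughly 0.13 of full scale at 19.1 kHz, while the band-limited version has essentially nothing there. Real synths typically use cheaper techniques than additive synthesis to get the same effect; this only shows what "band-limited" buys you.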
    [-]
    - evo a day ago
      Yeah, I was oversimplifying a blit, the raw waveforms are usually okay, but I distinctly remember old-school VSTs where you couldn't achieve a nice saw lead at 44.1.
      [-]
      - Blackthorn a day ago
        It's tough to tell without specific names, but I imagine a lot of particularly old* VSTs were written to use naive sawtooths rather than perfect band-limited ones, which would have terrible aliasing at 44.1 khz. Oversampling those would help a lot!
        * Some people are still making this mistake, despite information on the (many) ways to do it the right way being widely and freely available!
        [-]
        evo a day ago
        I wonder if there's also distortion or ring modulation stages where some of the energy above hearing range might spill into audible sidebands if they're not nyquist-limited first.
        [-]
        Blackthorn a day ago
        Yeah, that's the "rest of the synth" part that's more vulnerable to aliasing.
        There's some ways to do band-limited distortion but...they aren't nearly as widespread, easy, or universal as band-limited oscillators.
        Ring modulation is funny though because you'd ideally want the sidebands to modulate down by default rather than filter them out, that's why you're using it.
  - Applejinx 20 hours ago
    Hydrasynth aliases like a mad thing. My flagship synth ended up being Summit, and its oscillators are digital but run at a crazy high sample rate. Did likewise with some Chord Organ modules: that Teensy board it was built on could do chord audio at 300k and over a megahertz if you were just generating one wave as simply as possible. The freedom from aliasing really helped the sound, for all that it's a 12 bit analog output. A squarewave is a 1 bit signal…
  - nullc 20 hours ago
    > 24/192 is also great for digital synthesizers--if you're generating a waveform like a sawtooth that has theoretically instantaneous transitions, they can eat as much frequency as you can give them.
    So if your synthesizers do not use proper band-limited oscillators then 192KHz is _FAR_ too slow. You'd want to be running at hundreds of KHz, perhaps a few MHz.
    In reality synth software that doesn't sound like crap uses band limited oscillators and should work okay at 48KHz too. That said, even if the oscillators are band limited it may be the case the various modulations aren't band limited properly, as getting those wrong won't sound instantly wrong (in particular because you have to modulate to make it wrong, and the underlying change of the modulation may make it harder to tell it's wrong).
    Though also in those cases if you're not counting on every step being properly band limited then 192KHz may be an improvement but you're still probably getting some meaningful aliasing. I think given how fast computers have become relative to digital audio there is probably a good case to just make any "modular synth" run at 32-bit 480KHz or even 4.8MHz through every stage that could process the audio.
    Maybe 192KHz really is enough to suppress the aliasing artifacts but I think to be convinced of that I'd want to see a system that supported both and validate that the difference between a downsampled 48KHz output from the two modes was below -90dB or something.
    Or otherwise you can just declare that the aliasing is part of the sound and then there are no right choices... 24khz sampling, 48k, 192k ... who cares, use what you like best. :)
    [-]
    - Archit3ch 2 hours ago
      > I think given how fast computers have become relative to digital audio there is probably a good case to just make any "modular synth" run at 32-bit 480KHz or even 4.8MHz through every stage that could process the audio.
      1. It should run at FP64 if you want to preserve filter resonances, etc.
      2. At 10x/100x fixed-rate oversampling, even a modern "fast" CPU will have very few cycles per (higher-rate) sample to run the DSP for 1 "module" of the software modular. Forget about interconnected modules, multiple tracks, or polyphony. For this kind of "analog"-style processing, it's better to run adaptive-rate algorithms (think SPICE) instead of wasting compute on unnecessary extra audio samples.
  - dist-epoch a day ago
    No synth generates sawtooths by literally drawing a saw tooth in PCM. The distortion you get if you do that is not subtle at all.
- colmmacc a day ago
  32-bits are great for recording too because they do an incredible job of capturing the dynamic range without having to be precise on the preamp settings. It removes an entire job from the recording workflow.
  192 for mixing and mastering can be useful especially if you're doing a lot of effects, especially anything that pitch shifts. But I've seen low quality phone-microphone recordings make it to the master; if you capture lightning in a bottle, it hardly matters what the settings were, what the microphone was, or anything else.
  [-]
  - PaulDavisThe1st 20 hours ago
    The limit on current DACs is 18-22 bits. The rest is just brownian noise. Literally.
- Aldipower a day ago
  Even with mixing/mastering 96khz is enough for persisting to files. But as another commenter said, 192 is useful, if you bend and stretch samples!
- tshaddox a day ago
  They literally sell actual crystals that you’re supposed to place on top of speakers and amplifiers to make them sound better.
  [-]
  - Blackthorn a day ago
    We had a really nice crystal decoration that I happened to put on top of one of my TV speakers and, wouldn't you know it, it had this resonant frequency somewhere around specific human speech frequencies that drove us absolutely bonkers until I figured out the cause and moved it.
teach a day ago
(2012)
lokar a day ago
I wonder how many people think that 24 bit audio encodes 50% “more”
[-]
- recursive a day ago
  It is 50% more headroom above the noise floor in logarithmic decibels.
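  The arithmetic behind that, as a small sketch. It uses the common 20*log10(2^bits) approximation (about 6.02 dB per bit) and ignores dither and the extra 1.76 dB that appears in the full sine-wave SNR formula:

```python
import math


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20 * math.log10(2 ** bits)


for bits in (16, 24):
    print(f"{bits}-bit: {dynamic_range_db(bits):.1f} dB")

# Because dB scales linearly with bit count, 24 bits gives 24/16 = 1.5x the dB figure.
ratio = dynamic_range_db(24) / dynamic_range_db(16)
print(f"ratio: {ratio:.2f}")
```

  That works out to about 96 dB versus about 144 dB, so "50% more" holds in dB, even though the number of distinct sample values grows by a factor of 256.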
BoingBoomTschak 9 hours ago
No mention of Toole? Obligatory considering the subject, even if it's about acoustics and not digital audio:
https://www.routledge.com/Sound-Reproduction-The-Acoustics-a...
And a talk by the author covering some of the material for those who prefer that: https://www.youtube.com/watch?v=zrpUDuUtxPM
ChrisArchitect a day ago
(2012)
Some previous discussions:
2023 https://news.ycombinator.com/item?id=34698427
2022 https://news.ycombinator.com/item?id=30138561
2019 https://news.ycombinator.com/item?id=19318898
2017 https://news.ycombinator.com/item?id=15127633
2015 https://news.ycombinator.com/item?id=10520639
2014 https://news.ycombinator.com/item?id=8689231
2012 https://news.ycombinator.com/item?id=3668310
Arodex a day ago
I completely accept that human audition has limits that are easy to determine by playing a pure sound. But is it the same with music, where multiple frequencies are played and interfere with each other? Aren't some harmonics or effects created by these "inaudible" frequencies?
To try to imagine something similar: the human eye is unable to see UV light, yet fluorescent paint has a visible quality of its own compared to "normal" pigments.
[-]
- nok22kon 20 hours ago
  when beams of ultrasounds interact they can produce audible frequencies
  this has practical applications
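  A toy sketch of the mechanism, assuming the interaction can be modelled as a small second-order nonlinearity (the 0.1 coefficient and the 30/31 kHz tones are made-up values for illustration, not measurements of air or of any device). Two inaudible tones produce nothing at 1 kHz in a linear system, but a difference tone appears as soon as a squared term is present:

```python
import cmath
import math

FS = 192000            # sample rate high enough to represent the ultrasonic tones
N = 19200              # 0.1 s, so DFT bins fall on exact multiples of 10 Hz
F1, F2 = 30000, 31000  # two ultrasonic tones in Hz (illustrative)
K2 = 0.1               # hypothetical second-order distortion coefficient


def two_tones(n):
    """Sum of two unit-amplitude cosines at F1 and F2."""
    t = n / FS
    return math.cos(2 * math.pi * F1 * t) + math.cos(2 * math.pi * F2 * t)


def amplitude_at(samples, freq_hz):
    """Amplitude of one sinusoidal component, measured with a single-bin DFT."""
    acc = sum(x * cmath.exp(-2j * math.pi * freq_hz * n / FS)
              for n, x in enumerate(samples))
    return 2.0 * abs(acc) / len(samples)


linear = [two_tones(n) for n in range(N)]
distorted = [x + K2 * x * x for x in linear]   # add a squared term

diff_linear = amplitude_at(linear, F2 - F1)
diff_distorted = amplitude_at(distorted, F2 - F1)

print(f"1 kHz level, linear:    {diff_linear:.2e}")
print(f"1 kHz level, nonlinear: {diff_distorted:.4f}")
```

  The squared term contains the product of the two cosines, which expands into sum and difference frequencies, so the 1 kHz component comes out at the level of K2. The same effect in playback equipment is one argument for not feeding ultrasonic content to amplifiers and speakers in the first place.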
UltraSane 14 hours ago
The only real advantage it has is being able to re-encode it into any lossy format you want. With modern storage capacity the size isn't much of an issue.
0l a day ago
Obligatory mention of https://xiph.org/video/ which clears up a lot of misconceptions.
trashcluster a day ago
24 bits is now ubiquitous and 32 bit is becoming the norm in recording studios.
[-]
- evo a day ago
  32-bit float has become popular in filmmaking/field recording equipment lately because, with a microphone preamp that supports it, you can capture the entire dynamic range of the microphone--there's no accidental clipping if you drive the gain stage too hard.
  It's a bit redundant for a skilled technician, they're already used to setting the gain staging, inbound compression, and feathering the mics to avoid this in 24-bit, but if you're handing a boom mic to a novice and have a scene where e.g. someone's whispering and another person's screaming, it can be nice to not have to worry about it.
- lysace a day ago
  That use case is literally addressed in the first sentence.
metalman a day ago
sheeesh, measly 24-bit/192kHz of course it makes no sense, unless it is downloaded through low oxygen wire, which somehow and unfathomably, must have been omitted or forgotten.
[-]
- b3orn a day ago
  If it has been transmitted via hollow-core fibres it will obviously sound hollow.
waffletower a day ago
For typical listening (though humans can perceive bone-conducted vibrations up to 100 kHz or even 120 kHz) 16-bit-fixed/44.1kHz is a high-fidelity transport format. As a DSP researcher, I prefer 32-bit-float/44.1kHz as a transport format. I often upsample to 32-bit-float/188.2kHz or even 32-bit-float/192kHz for signal processing applications such as high-fidelity reverberation via direct and FFT convolution. While the author advocates for the transport to ear use case, I would argue that 24-bit/192kHz provides greater fidelity and resolution for sound processing. I found the pedantic arrogance of the author to be annoying. But yes, the sampling theory is an important consideration -- but so is the quality of the actual digital filters used in the DAC->ADC pipeline. They are much more forgiving and less lossy at 192kHz.
haunter a day ago
The more the bits the better the music, easy as one two three
Don't forget to buy the new low oxygen platinum plated HDMI cables for the better experience!
/s