Audio File Duration Inaccuracies

My music collection is a mess of jumbled formats, inconsistent naming conventions, and incomplete (or inaccurate) tags. I am writing a script to rectify that, and discovered in the process that finding the duration of an audio file is not trivial in Python.

It’s quite easy to get a library to give you a duration. The problem is that the number might not be accurate.

Most audio files contain metadata. For ease of use, that metadata includes things that could be calculated. For example, the byte rate is calculable from the sample rate, channel count, and bits per sample, but many media formats include it in metadata for convenience.
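To make the redundancy concrete, here is a minimal sketch using WAV, whose header stores a byte rate right next to the fields it can be derived from. It assumes the canonical 44-byte PCM header that Python's standard wave module writes, and the helper names (make_wav, byte_rates) are made up for illustration.

```python
import io
import struct
import wave


def make_wav(seconds, rate=8000, channels=2, sampwidth=2):
    """Build a silent PCM WAV in memory (illustrative helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sampwidth)
        w.setframerate(rate)
        w.writeframes(bytes(seconds * rate * channels * sampwidth))
    return buf.getvalue()


def byte_rates(data):
    """Return (stored, derived) byte rates from a canonical WAV header."""
    # Offset 22 holds: channels (u16), sample rate (u32), byte rate (u32),
    # block align (u16), bits per sample (u16), all little-endian.
    channels, rate, stored, _align, bits = struct.unpack_from("<HIIHH", data, 22)
    derived = rate * channels * bits // 8
    return stored, derived


stored, derived = byte_rates(make_wav(1))
print(stored, derived)  # 32000 32000
```

On a healthy file the two numbers agree. The stored copy is only a convenience, and nothing forces it to stay true.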

The problem arises when the metadata is wrong. This can happen, for example, when a file is truncated. Maybe the file thinks it is three minutes long, but because the program that generated it crashed, only two minutes of data got written. If you rely on the metadata, you will think the file is longer than it is.
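The three-minutes-versus-two scenario can be reproduced with a WAV file, which is simple enough to check by hand. This is only a sketch: it assumes a canonical 44-byte PCM header with no extra chunks, and the helper names are hypothetical. Compressed formats such as MP3 or MP4 need real parsing or decoding to do the same comparison.

```python
import io
import struct
import wave

HEADER_SIZE = 44  # assumption: canonical PCM WAV header with no extra chunks


def make_wav(seconds, rate=8000):
    """Build a silent mono 16-bit WAV in memory (illustrative helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(seconds * rate * 2))
    return buf.getvalue()


def claimed_and_actual_seconds(data):
    """Compare the duration the header claims with the audio really present."""
    (byte_rate,) = struct.unpack_from("<I", data, 28)      # bytes per second
    (claimed_bytes,) = struct.unpack_from("<I", data, 40)  # data chunk size
    actual_bytes = min(claimed_bytes, len(data) - HEADER_SIZE)
    return claimed_bytes / byte_rate, actual_bytes / byte_rate


whole = make_wav(180)                        # header says three minutes
cut = whole[: HEADER_SIZE + 120 * 8000 * 2]  # only two minutes survive
print(claimed_and_actual_seconds(cut))       # (180.0, 120.0)
```

A tool that trusts the header reports the first number. A tool that counts the bytes actually on disk reports the second.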

Relying on metadata is lunacy, and it seems a lot of libraries are lunatics. I shouldn’t complain about inconsistencies in this craziness, but… they are inconsistent! Mutagen [Fn 1], for example, will accurately report the duration of a truncated MP3, but not an MP4.

Even shelling out to well-regarded tools such as ffmpeg fails. Capturing stderr from “ffmpeg -i” will not accurately tell you how long an AIF is. Fortunately, ffmpeg is very good at quickly converting AIF to WAV, and ffmpeg does the right thing with WAV files. It’s inefficient, but getting the duration right would be trivial to implement. [Fn 2] Perhaps inaccurate file durations don’t prevent ffmpeg from getting its job done.
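A rough sketch of that convert-then-time workaround follows. It assumes ffmpeg is on the PATH and that its default WAV output is plain PCM that Python's standard wave module can read. Note that it times the WAV by counting the frames wave can actually read, rather than parsing ffmpeg's stderr. The function names and the temporary file name are made up for illustration.

```python
import subprocess
import wave


def convert_to_wav(src, dst):
    """Shell out to ffmpeg; -y overwrites dst, -loglevel error keeps it quiet."""
    subprocess.run(
        ["ffmpeg", "-y", "-loglevel", "error", "-i", src, dst],
        check=True,  # raise CalledProcessError if ffmpeg fails
    )


def wav_seconds(path):
    """Time a WAV by counting the bytes of audio that can actually be read."""
    with wave.open(path, "rb") as w:
        frame_size = w.getnchannels() * w.getsampwidth()
        rate = w.getframerate()
        data_bytes = 0
        while True:
            chunk = w.readframes(65536)
            if not chunk:
                break
            data_bytes += len(chunk)
    return data_bytes / frame_size / rate


def duration_via_wav(src, tmp_wav="tmp_duration.wav"):
    """Convert src to a temporary WAV (hypothetical name), then time that."""
    convert_to_wav(src, tmp_wav)
    return wav_seconds(tmp_wav)
```

Something like duration_via_wav("song.aif") would then give a duration based on decoded audio instead of a header field, at the cost of a full conversion per file.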

Tools [Fn 3] and file formats [Fn 4] abound, but I still don’t know how to accurately time an MP4. I’m tempted to drill down into the libs to see how far this problem goes, but I

Fn 1: The MPEG Audio Decoder library (pymad) will also give accurate times on MP3s, but rounds to the second. Mutagen is slightly more accurate.

Fn 2: I filed a bug against ffmpeg.

Fn 3: As far as I can tell, few programs can accurately time all their input formats. Totem can time AIF but not M4A, for example. Faad, which only handles MP4 and AAC, cannot time MP4. I didn’t test it with any AAC files.

Fn 4: I didn’t test any Ogg files because I don’t have any. As far as I can tell, the purpose of Ogg is to encrypt a media file so it only works on your Linux box. It’s like DRM working for freedom instead of against it.

Fn 5: Yes, this is a blog post about minor inaccuracies in audio file duration, and yes it contains footnotes and yes I’m disappointed there’s no nerd merit badge for blogging with footnotes.

Stick This In Your Ear

Audio is up from the luncheon talk I did at the Berkman last October. I’m always surprised by how thin my voice sounds on recordings.

Thanks to Amar Ashar for passing me the link.