IDs for lone mp3s

Peter Bierman (pmb nospam at mycds.com)
Wed, 15 Dec 1999 13:19:57 -0800

At 11:22 AM -0800 12/14/99, Mayhem & Chaos Coordinator wrote:
>> I vote for a hash of data *in* the mp3, so that tags don't have to come
>> from an "authoritative" source, and truly different mp3s don't get the
>> same tag.
>>
>> Hash the first 4 bytes of every 20k or something like that.
>
>This is the nicest way of doing it, but it has a *NASTY* side effect. The
>rip-encode process is very sensitive to slight changes -- the audio ripping
>process is not deterministic. If you rip one track once and then rip it
>again, it is unlikely that the two ripped wav files will be bit-for-bit the
>same. Also, using two different MP3 encoders would yield two different MP3s
>at the end. My point is that if two people rip the same track, even though
>it's the same length and they may sound identical, they are not going to
>yield the same hash.
>
>So, the user will not be able to come up with an authoritative hash by just
>deriving it from 4 bytes from every 20k. Unfortunately.
>
>What needs to happen is that when someone is ripping/encoding a track, the
>CD should be looked up in the CD Index. The CD Index will then provide the
>ripper with a set of IDs for each track. The encoder can then stash the
>track's ID in the metadata (ID3vX, or whatever) so that the player can
>uniquely identify this track.

That takes us back to the "authoritative" ID source.

I'm a lot less concerned about having multiple IDs point to the same file
than I am about having to go to "one true location" for getting an ID.
After all, we already have copies of the same CD that have different IDs,
so the database needs to be able to store multiple IDs per user-level
object, which is trivial.
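
To show what I mean by storing multiple IDs per user-level object, here is a minimal sketch in Python. The class and method names (TrackIndex, add_id, lookup, ids_for) and the example keys are made up for illustration; nothing here is an existing CD Index interface.

```python
from collections import defaultdict


class TrackIndex:
    """Hypothetical in-memory index: many file IDs may point at one object."""

    def __init__(self):
        self._id_to_object = {}                  # file ID -> object key
        self._object_to_ids = defaultdict(set)   # object key -> set of file IDs

    def add_id(self, object_key, file_id):
        """Register file_id as one more ID for object_key."""
        existing = self._id_to_object.get(file_id)
        if existing is not None and existing != object_key:
            raise ValueError(f"ID {file_id!r} already belongs to {existing!r}")
        self._id_to_object[file_id] = object_key
        self._object_to_ids[object_key].add(file_id)

    def lookup(self, file_id):
        """Return the object key for file_id, or None if it is unknown."""
        return self._id_to_object.get(file_id)

    def ids_for(self, object_key):
        """Return a copy of every ID known for object_key."""
        return set(self._object_to_ids.get(object_key, ()))


index = TrackIndex()
# Two different rips of the same track produce two different IDs.
index.add_id("some-album/track-01", "id-from-rip-a")
index.add_id("some-album/track-01", "id-from-rip-b")
```

Either ID resolves to the same object, and nobody had to ask a central source which ID was the "right" one.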

If the hash is based on the bits in the file, then we have a unique,
user-accessible hash for every unique file. Not every unique file will
*sound* unique, but they *are* unique files.
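
As a rough sketch of the "first 4 bytes of every 20k" idea, something like the following Python would do. The function name sampled_id is hypothetical, and SHA-1 is only my assumption; nobody in this thread has picked a hash function or settled on the exact stride.

```python
import hashlib
import io

SAMPLE_BYTES = 4      # bytes read at each sample point (from the proposal)
STRIDE = 20 * 1024    # distance between sample points, assuming "20k" = 20 KiB


def sampled_id(stream, stride=STRIDE, sample_bytes=SAMPLE_BYTES):
    """Hash sample_bytes bytes taken at every stride offset of a seekable stream."""
    digest = hashlib.sha1()  # assumption: any stable hash function would work
    offset = 0
    while True:
        stream.seek(offset)
        chunk = stream.read(sample_bytes)
        if not chunk:        # read past the end of the file: no more samples
            break
        digest.update(chunk)
        offset += stride
    return digest.hexdigest()


# Usage with an in-memory stand-in for an mp3 file:
fake_mp3 = io.BytesIO(bytes(range(256)) * 400)
print(sampled_id(fake_mp3))
```

Anyone holding the file can compute this locally. One caveat: because only the sampled bytes are hashed, two files that differ solely in unsampled bytes would get the same ID, so a full-file hash is the stricter option if that matters.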

That might even be useful. A file ripped with a certain encoder at a
certain bit rate will be distinct from a file ripped completely differently.

-pmb

--
Ring around the Internet, | Peter Bierman <pmb nospam at pez.com>
Packet with a bit not set | http://www.mycds.com/pmb/
SYN ACK SYN ACK,          |"Nobody realizes that some people expend
We all go down. -A. Stern | tremendous energy merely to be normal."-Al Camus