Re: IDs for lone mp3s

Johan Pouwelse (pouwelse nospam at pds.twi.tudelft.nl)
Thu, 16 Dec 1999 13:23:12 +0100 (MET)

On Wed, 15 Dec 1999, Mayhem & Chaos Coordinator wrote:
> > That takes us back to the "authoritative" ID source.
> I don't think an 'authoritative' source is necessarily bad. It is bad when
> a company like Escient comes in and starts dictating the rules of the game
> and trying to push their brand on everyone. There is no question about it,
> that's bad.
> However, the current plan for the CD Index is to be mirrored worldwide with
> the data belonging to the public. This in essence makes it impossible for
> any one party to 'sell-out' and give a corporation control over the data.

We do not need "one true location" for getting a valid ID. You can use the
sequence number of the track_ID in combination with the server address.
If the last track added at cdindex had the ID 4048, then the next
valid ID would be 4049@www.cdindex.org
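
As a rough sketch of this scheme, an ID could be built by joining the
server's next sequence number with the server address. The helper names
make_track_id and parse_track_id are hypothetical, and the "@" separator
is an assumption based on the example ID:

```python
from typing import Tuple


def make_track_id(last_sequence: int, server: str) -> str:
    """Build the next ID for one server from its last used sequence number.

    Assumption: IDs have the form "<sequence>@<server address>".
    """
    return f"{last_sequence + 1}@{server}"


def parse_track_id(track_id: str) -> Tuple[int, str]:
    """Split an ID back into (sequence number, server address)."""
    sequence, server = track_id.split("@", 1)
    return int(sequence), server


if __name__ == "__main__":
    next_id = make_track_id(4048, "www.cdindex.org")
    print(next_id)
    print(parse_track_id(next_id))
```

Because the server address is part of the ID, two servers can hand out
the same sequence number without the IDs colliding.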

But what is the exact problem we want to solve?
Why would you want information from CDIndex if the information is already
in the MP3 file?

1. Lone mp3 files from an audio CD can carry the Disc_ID inside an ID3 tag,
along with all other information, such as artist, audio CD title, etc.

2. Lone mp3 files with only a filename and no ID3 tag info could be found
on the CDIndex server using a new perl script that takes the filename and
the presumed artist, looks for matching artists/tracks/albums, and returns
the matches. The user can then select a match and insert an ID3 tag inside
the MP3 file with all information.
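
The lookup in point 2 might look roughly like the sketch below. It is
written in Python rather than perl, and everything in it is an assumption
for illustration: the TRACKS list stands in for the CDIndex track table,
the records are made-up placeholders, and fuzzy matching via difflib is
just one possible way to compare a filename with stored names.

```python
import difflib
import os
import re

# Hypothetical stand-in for the server's track table: (artist, track, album).
TRACKS = [
    ("Example Artist", "First Song", "Example Album"),
    ("Example Artist", "Second Song", "Example Album"),
    ("Other Artist", "First Song", "Other Album"),
]


def normalize(text):
    """Lowercase and turn every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_matches(filename, artist, tracks=TRACKS, cutoff=0.6):
    """Return records for the presumed artist, best filename match first."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    query = normalize(stem)
    wanted_artist = normalize(artist)
    scored = []
    for rec_artist, track, album in tracks:
        if normalize(rec_artist) != wanted_artist:
            continue
        # The filename may or may not contain the artist, so try both forms.
        candidates = [normalize(rec_artist + " " + track), normalize(track)]
        score = max(
            difflib.SequenceMatcher(None, query, c).ratio() for c in candidates
        )
        if score >= cutoff:
            scored.append((score, (rec_artist, track, album)))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [record for _, record in scored]


if __name__ == "__main__":
    for match in find_matches("example_artist_-_first_song.mp3", "Example Artist"):
        print(match)
```

The user would then pick one of the returned records and write it into
the ID3 tag.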

3. Lone mp3 files without an audio CD on CDIndex, with only a filename and
no ID3 tag info, pose a problem. If the mp3 file was not released on an
audio CD we can store the trackname inside a new table or inside the track
table. The combination of artist and songname is unique and can be used as
a key. This information can be stored inside the MP3 file. Is it
OK to require the artist name+songname for registration of an MP3 file?
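
A minimal sketch of using artist+songname as the key, assuming an
in-memory dictionary in place of the real table. The names make_key and
TrackRegistry are hypothetical, and the case and whitespace normalization
is my own assumption, added so that trivial spelling variants map to the
same key:

```python
import re


def make_key(artist, song):
    """Build a composite key from artist and song name."""
    def norm(text):
        # Assumption: ignore case and extra whitespace when comparing names.
        return re.sub(r"\s+", " ", text.strip().lower())
    return (norm(artist), norm(song))


class TrackRegistry:
    """Stores tracks that were never released on an audio CD."""

    def __init__(self):
        self._tracks = {}

    def register(self, artist, song):
        """Add a track; return False if that artist+song is already known."""
        key = make_key(artist, song)
        if key in self._tracks:
            return False
        self._tracks[key] = {"artist": artist, "song": song}
        return True


if __name__ == "__main__":
    registry = TrackRegistry()
    print(registry.register("Example Artist", "First Song"))
    print(registry.register("example artist", "FIRST  song"))
```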

Did I forget a situation?

If you want to solve point 3 without any info and just do hard signal
processing/hashing to match it with an MP3 file of a different coding rate,
you will need some very smart program. Correct?

> > I'm a lot less concerned about having multiple IDs point to the same file
> > than I am about having to go to "one true location" for getting an ID.
> > After all, we already have same-CDs that have different IDs, so the
> > database needs to be able to store multiple IDs per user-level object.
> > Which is trivial.
> I agree, but you need to consider the sheer volume of data that will be
> created in this process. When a batch of CDs is mastered and pressed you may
> get one new ID for 100,000 CDs. That growth is manageable. However, when you
> start recording IDs for *every* single MP3 that people rip, you're going to
> be recording thousands of new IDs every day. Your data load is going to grow
> significantly without any appreciable benefits.
An ID for *every* single MP3 is not an option.

Johan.