Re: [cdin] [DB] Feature Set 1.1

Brian Murray (brian nospam at proximity.com.au)
Fri, 12 Mar 1999 12:19:16 +0000

Thanks David for sorting through this stuff.

On Mar 11, 13:58, Schuetz, David wrote:

> + "Weaker" clients like CD jukeboxes for the home or car,
> that might have small displays (80x4 characters)

We need to work out which fields this would apply to. All fields?
Can any (text) field potentially fork out into multiple-language and
full vs. abbreviated versions?

> + Sub-track and Super-Track information (arbitrarily defined
> by timestamps, related in a hierarchical manner to physical
> CD tracks) in order to break up long physical tracks or group
> short ones.

Definitely a good idea. Especially for CDs with indexes (which aren't
stored in the TOC as such), CDs which should have indexes but don't,
CDs with poor track separation (sloppy mastering), etc.

Perhaps the multiple-CDs-per-album case could be incorporated here.

> Second, the Purpose of all This. I was thinking last night, that it might
> help to remind ourselves what we're trying to accomplish. I figure there are
> three goals of this system. Remember that it's a *CD* information system,
> not a music database or Phonolog.

Why not? We have a golden opportunity to build such a thing. With just
a little more thought put into the database, we can farm the CD-playing
public for their data-entry efforts. Why not build the one, canonical
music database? Those with an interest in particular artists, who run
fan pages etc will have an incentive to ensure their artists' entries
are correct.

Then, with the extra CD identification hooks (whichever ID system gets
chosen), it becomes a live, real, useful thing. Hook the URLs of artists'
fan pages in, and you're one click away from those pages when you
pop the CD in. Lyrics, front cover images, etc. (copyright issues
notwithstanding).

How far down the normalized database / MuRuPu path do we want to go?
It would be great for all CDs by a particular artist to point to
an Artist record, where all centralized artist-based info is located,
including the artist's name, abbreviated name, fan pages, etc. This
would effectively link all albums by a particular artist together,
allowing iteration etc. But who is the "artist" for a classical recording?
Do we link Composers, Conductors, Orchestras etc.? Which fields behave
in this way?

Data entry needn't be as overwhelming as it may seem. You could
still have a plain vanilla entry form: when you submit it, the system
would match the artist field against artists already in the DB, and
offer the user a choice in the case of ambiguity (e.g. a misspelled
artist name).
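
A rough sketch of that matching step, in Python. The function name,
the three outcome labels and the 0.8 similarity cutoff are all
assumptions made for illustration, not part of the proposal; the only
library call used is difflib.get_close_matches from the standard library.

```python
import difflib


def match_artist(entered, known_artists, cutoff=0.8):
    """Classify a submitted artist name against names already in the DB.

    Returns ("exact", name), ("ambiguous", [candidates]) or ("new", name).
    The cutoff value is an arbitrary illustrative choice.
    """
    # Compare case-insensitively, but hand back the stored spelling.
    by_key = {name.casefold(): name for name in known_artists}
    cleaned = entered.strip()
    key = cleaned.casefold()
    if key in by_key:
        return ("exact", by_key[key])
    # Near misses (e.g. a misspelling) are offered to the user as choices.
    close = difflib.get_close_matches(key, list(by_key), n=5, cutoff=cutoff)
    if close:
        return ("ambiguous", [by_key[k] for k in close])
    return ("new", cleaned)
```

An "ambiguous" result is the case where the entry form would come back
and ask the user to pick one of the candidates or confirm a new artist.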

> For any CD I'm playing, I want to:
> + ...know exactly what song I'm listening to
> + ...control how/what I'm choosing to hear
> + ...learn a little more about the CD

Sounds good to me. Whatever happens, we need to balance the interests
of a full-blown music database with those who want plain vanilla
Title/Artist/Track[n] info to appear when they pop a CD in their drive.

> While I'm not averse to building an "UberDB" for Music (sort of like the
> Internet Music Database), with liner notes, Band links, images, video clips,
> and whatnot, that's really not why we're here. If anyone wants to tackle
> *that* project (after we finish this one), I'd love to be involved with it
> (and there's no reason that such a DB can't share information with this
> project). But that's too big a fish to fry...

If so, let's at least work out how we can make a smooth transition
to such a thing, so that as little as possible (if any) of the data entry
this project will yield is wasted.

> Anyway, here's "Version 1.1" of the DB structure *proposal*. Changes
> indicated by a "|" character on the left edge (God, I wish we all used the
> same rich mailers...)

Anything not explicitly mentioned I agree with.

> | * Artist - primary artist as defined on the CD (generally,
> | where it's sorted in most music stores). Might be Conductor,
> | for a classical piece (while the composer is noted elsewhere).

If we don't store last name, first name separately (and I don't think
we should), should we at least recommend the format in which this
is entered, i.e. "Last, First" vs "First Last", or whatever the CD
sleeve says ... what about sorting? Is sorting an issue?

> | * Composer - if the primary composer is different from the artist
> | * Performer - Primary performers (London Symph Orch w/James Galway)

We don't want to duplicate information, so if "Artist" is an umbrella
for things like Composer, Performer, Conductor etc., it should not simply
be an additional field with the same info. Should it inherit the first
value of a set of fields (the first non-blank field), or point to
a specific field?
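
To make the two options concrete, here is a minimal Python sketch. The
role names, their priority order and the artist_source parameter are
hypothetical; nothing here has been agreed on the list.

```python
# Assumed priority order for the "first non-blank field" option.
ROLE_PRIORITY = ("performer", "conductor", "composer")


def display_artist(record, artist_source=None):
    """Derive the "Artist" shown to the user without storing it twice.

    record is a dict of role name -> text. If artist_source names a role,
    "Artist" points at that specific field; otherwise it inherits the
    first non-blank role in ROLE_PRIORITY.
    """
    if artist_source:
        return record.get(artist_source, "").strip()
    for role in ROLE_PRIORITY:
        value = record.get(role, "").strip()
        if value:
            return value
    return ""
```

Either way, "Artist" is computed from the role fields rather than being
a separately entered copy of one of them.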

There has been mention of "Studio", "Producer", "Engineer", and I
can think of "Remixed-By" ... obviously these are non-essential
optional fields, but we should at least standardize on a decent set
of field names for people to draw from, so that for all CDs with
a known "Producer" we know how to extract that information.

Also, there may be multiple producers etc, so you may want
a standard way of representing a "list" of entities (individuals
or whatever) for a field.
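
One possible convention, sketched in Python: keep the field as a single
line of text on the wire, with entries separated by a semicolon. The
separator and both function names are assumptions for illustration, and
a real format would also need an escaping rule for names that contain
the separator.

```python
SEPARATOR = ";"  # assumed list separator, not an agreed standard


def parse_entities(raw):
    """Split a multi-valued field into a list of trimmed entity names."""
    return [part.strip() for part in raw.split(SEPARATOR) if part.strip()]


def format_entities(names):
    """Join entity names back into the single-line field form."""
    return (SEPARATOR + " ").join(names)
```

A client that only wants one producer can take the first element; a
fuller client can show the whole list.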

> | * Title Abbrev - Short (40 char) title, for small-display devices
> | * Artist Abbrev - Short (40 char) artist, for small-display devices

So are we going to define which fields have an abbreviated alternative,
or allow it for any text field?

> | * Language - Language of DB record. Client could request a particular
> language, and if that exists, it'd be returned. Default would likely
> be "english" (sorry, rest of the world.. :-( ).

Touched on above - do we allow multiple language versions of any
text field?
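
If any text field may carry language and abbreviated variants, a lookup
with fallbacks is needed. The Python sketch below assumes each field is
a mapping from (language, form) to text, where form is "full" or
"abbrev". That layout and the fallback order are my assumptions; in
particular, whether a small display should prefer the full text in the
requested language or the abbreviation in the default language is an
open question.

```python
def lookup_text(field, language="english", abbreviated=False,
                default_language="english"):
    """Pick the best available variant of a text field, or None.

    Assumed fallback order: requested language and form, requested
    language in full, default language and form, default language in full.
    """
    form = "abbrev" if abbreviated else "full"
    candidates = (
        (language, form),
        (language, "full"),
        (default_language, form),
        (default_language, "full"),
    )
    for key in candidates:
        if key in field:
            return field[key]
    return None
```

With this layout, a record that has only an English full title still
answers every request, and extra variants can be added field by field.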

> Also, we might have some optional information (maybe this starts to
> get into the "freeform" section of property/value pairs)

Freeform fields are good, but a standard set of fields to draw from
would be a good idea.

> | * Copyright - (c) date of the disc itself
> | * Release Date - when was the CD released

Yep - this could be year granularity only, or a date depending on
what info is available. Also "Recording Date" which may be different...

> | * Genre - some way to catalog or describe the music

See separate posting ...

> | * Production Information (producer, engineer, studio, etc.)

Discussed above

> and for each "track":

> | * Start Time - MM:SS:FF from beginning of disc
> | * Display Track Number - how track is displayed (1, 2.3, 5.2.1, etc.)
> | * Physical Track Number - If track corresponds to an actual track
> as published on the CD, then the track number. Should also
> accomodate "Work/Title" (or whatever, I forget) nomenclature
> for DVDs, and anything else that might later develop (:-)).

This is related to the sub-track / super-track idea. Don't we also
need an End Time, or is it to be inferred from the next track's Start Time?
There may be long pauses between tracks that we don't want included.
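
A small Python sketch of one way to handle this: make End Time optional
and infer it from the next entry's Start Time only when it is missing.
The dict layout and function names are hypothetical. The one hard fact
used is that CD audio addresses run at 75 frames per second, so
MM:SS:FF converts to a frame count as shown.

```python
FRAMES_PER_SECOND = 75  # CD audio: 75 frames (sectors) per second


def parse_msf(text):
    """Convert an "MM:SS:FF" string to an absolute frame count."""
    minutes, seconds, frames = (int(part) for part in text.split(":"))
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def track_spans(tracks, disc_end):
    """Return (start, end) frame pairs for tracks ordered by start time.

    Each track is a dict with "start" and an optional "end" in MM:SS:FF.
    A missing end is inferred from the next track's start, or from
    disc_end for the last track.
    """
    spans = []
    for i, track in enumerate(tracks):
        start = parse_msf(track["start"])
        if track.get("end"):
            end = parse_msf(track["end"])       # explicit: pause excluded
        elif i + 1 < len(tracks):
            end = parse_msf(tracks[i + 1]["start"])
        else:
            end = parse_msf(disc_end)
        spans.append((start, end))
    return spans
```

An explicit End Time then only has to be entered for tracks where the
trailing pause should be cut off.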

> Note that some of the "freeform" information might also pertain to
> individual tracks.

In general there are a number of fields (most of the text fields described
above) which could apply to a track OR an album, so we should separate
them out into "description" fields. Then we can talk about Albums and
Tracks and the fields specific to those, and either can reference
"description" fields. This handles the "various artists" case.

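The "description" idea above might look something like this in Python.
The class and attribute names are hypothetical, and only two
description fields are shown to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """Text fields that can apply to either an album or a track."""
    title: str = ""
    artist: str = ""


@dataclass
class Track:
    number: int
    description: Description


@dataclass
class Album:
    description: Description
    tracks: list = field(default_factory=list)

    def track_artist(self, track):
        """Use the track's own artist if set, else the album's."""
        return track.description.artist or self.description.artist
```

On a "various artists" compilation each Track carries its own artist in
its Description, while an ordinary album leaves the track artist blank
and inherits it from the Album.
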
Cheers,
Brian.

--
-----------------------------------
Brian Murray      Proximity Pty Ltd
http://www.proximity.com.au/~brian/
-----------------------------------