Some suggestions.

Travis Tabbal (bigboss nospam at xmission.com)
Thu, 11 Mar 1999 08:03:26 -0700 (MST)

I'm not on the CDIN list, but they were complaining that cdindex doesn't
have an archive so I'm CC'ing them anyway. Hope it goes through.

I've seen a lot of interesting proposals suggested here over the last few
days. I'd like to combine ideas from a few of them into a proposal for the
communications.

HTTP for requests. Data could be served as HTTP headers or XML. I would
suggest a two-tier approach: HTTP headers for basic disc info, such as
title, artist (primary artist, or various), track names, lengths, and
a suggested play order (from the disc). The program could then request the
XML for extended info like artists for each track, lyrics, liner notes, or
anything else. We could have a basic format for officially recognized tags,
plus the ability to add additional tags. The XML would be static pages
updated on a regular basis, via cron for example, and could be served by a
standard WWW server such as Apache. This is nice for existing programs: they
can provide the same level of capability without any complex modifications
while having an upgrade path.
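
To make the first tier concrete, here is a rough sketch of a client
pulling the basic disc info out of response headers. The header names
(X-Disc-Title and friends) and the "name; seconds" track format are made
up for illustration; nothing here has been agreed on.

```python
from email.parser import Parser

# Hypothetical header names and values; no format has been agreed on.
SAMPLE_RESPONSE_HEADERS = (
    "X-Disc-Title: Example Album\n"
    "X-Disc-Artist: Various\n"
    "X-Disc-Track-1: First Song; 215\n"
    "X-Disc-Track-2: Second Song; 187\n"
    "X-Disc-Playorder: 1,2\n"
    "\n"
)


def parse_basic_info(raw_headers):
    """Turn the basic-tier headers into a plain dict."""
    headers = Parser().parsestr(raw_headers, headersonly=True)
    tracks = []
    n = 1
    # Tracks are numbered from 1; stop at the first missing header.
    while headers.get(f"X-Disc-Track-{n}") is not None:
        name, seconds = headers[f"X-Disc-Track-{n}"].rsplit(";", 1)
        tracks.append((name.strip(), int(seconds)))
        n += 1
    return {
        "title": headers["X-Disc-Title"],
        "artist": headers["X-Disc-Artist"],
        "tracks": tracks,
        "playorder": [int(i) for i in headers["X-Disc-Playorder"].split(",")],
    }
```

A client that only needs this much never has to touch an XML parser; it
would fetch the extended XML page only when the user asks for more.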

Storage: RDBMS. Any type will do if it supports our basic layout. I would
recommend we use basic SQL types for the data fields to provide maximum
compatibility. Tables and other data format specs to be determined.

Replication: Some RDBMSs support replication already, don't they? If
not, how about a nightly (or other time period) diff file to be sent via
HTTP? If the records had a timestamp it would be easy to write a SQL query
for all submissions within a certain timeframe to be exported to a diff
database that could be compressed and made available for the remote
databases to download and merge. Servers looking for an update would just
connect and request. Some type of mirror structure would be nice. The
primary site would only allow certain sites to request the new data, thus
keeping server load down. We could also write some Perl to automate being
added to the distribution lists for mirrors. Then the primary server would
push the data to the servers in its lists and the process would repeat
all the way down the line. Each server admin could set access privileges
for requested connections, so they can manage their own bandwidth and
server resources to prevent collapse.
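
The timestamp-based diff could look roughly like the sketch below. It
uses SQLite and gzip-compressed JSON purely as stand-ins; the table
layout (discs, disc_id, submitted_at) is an assumption, since the real
schema is still to be determined. Timestamps are ISO 8601 strings so
that plain string comparison sorts them correctly.

```python
import gzip
import json
import sqlite3

# Assumed table layout for illustration only.
SCHEMA = (
    "CREATE TABLE discs ("
    "disc_id TEXT PRIMARY KEY, title TEXT, artist TEXT, submitted_at TEXT)"
)


def export_diff(conn, since, until):
    """Collect submissions in [since, until) and return them compressed."""
    rows = conn.execute(
        "SELECT disc_id, title, artist, submitted_at FROM discs "
        "WHERE submitted_at >= ? AND submitted_at < ? "
        "ORDER BY submitted_at",
        (since, until),
    ).fetchall()
    return gzip.compress(json.dumps(rows).encode("utf-8"))


def merge_diff(conn, payload):
    """Apply a downloaded diff on a mirror; returns the row count."""
    rows = json.loads(gzip.decompress(payload).decode("utf-8"))
    # Rows in the diff overwrite any existing row with the same disc_id.
    conn.executemany("INSERT OR REPLACE INTO discs VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

The primary would run export_diff from cron and drop the result where
the approved mirrors can fetch it over HTTP; each mirror runs merge_diff
and then repeats the export for the servers below it.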

Submissions: HTTP seems the best way. Just send headers for the
submission; maybe formatted XML would be a good way. Then the server would
parse it, check for errors/duplicates and send status messages back to the
client. That way if there is an error the user has a chance to correct the
data and resubmit. Submissions collected by the various servers could be
periodically gathered and uploaded to the root server for distribution to
the rest of the group.
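
A server-side check for an XML submission might be sketched like this.
The element names (disc, title, artist, track) and the status strings
are invented for the example; the officially recognized tags have not
been defined yet.

```python
import xml.etree.ElementTree as ET


def check_submission(xml_text, known_ids):
    """Return a (status, detail) pair to send back to the client."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return ("error", f"malformed XML: {exc}")

    disc_id = root.get("id")
    if not disc_id:
        return ("error", "missing disc id")

    # Required fields for the basic tier (assumed, not an agreed list).
    for tag in ("title", "artist"):
        if not (root.findtext(tag) or "").strip():
            return ("error", f"missing {tag}")

    if not root.findall("track"):
        return ("error", "no tracks")

    if disc_id in known_ids:
        return ("duplicate", disc_id)
    return ("ok", disc_id)
```

On an "error" status the client can show the detail to the user and let
them fix the entry and resubmit.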

Duplicates: The root server would have to decide somehow what to do if two
identical submissions came in on the same batch. We should have a policy
for this. Maybe take the latest one? Timestamps would come in handy here
too; this would also help for updates to a particular record. They would
get added automatically.
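
If the policy ends up being "take the latest one", the root server's
batch handling could be as small as the sketch below. The field names
disc_id and submitted_at are assumptions, and ties on the timestamp keep
whichever submission arrived first in the batch, which is just one
possible choice.

```python
def resolve_batch(submissions):
    """Keep only the newest submission per disc in a batch."""
    latest = {}
    for sub in submissions:
        current = latest.get(sub["disc_id"])
        # ISO 8601 timestamps compare correctly as plain strings.
        if current is None or sub["submitted_at"] > current["submitted_at"]:
            latest[sub["disc_id"]] = sub
    return latest
```

The same rule covers updates: a corrected record is just a later
submission for the same disc.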

DNS could be used to round-robin the servers for client systems. For
example, a client program would look for us.cdin.org and get sent to a
system in the US somewhere. Other countries/locales would be supported in
whatever fashion they desire. Perhaps something in the protocol to request
a list of regions to choose from, that way when the client is setting up
they ask the user to select the region they wish to query. This would help
distribute the load.
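
The region list could be a very simple text response, one region per
line. The sketch below assumes a "code hostname" format and an
eu.cdin.org host, neither of which exists in any proposal so far; only
us.cdin.org comes from the example above.

```python
def parse_region_list(text):
    """Parse 'code hostname' lines into a dict; '#' lines are comments."""
    regions = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        code, host = line.split(None, 1)
        regions[code.lower()] = host.strip()
    return regions


def host_for(region, regions, default="us"):
    """Pick the host for the user's region, falling back to a default."""
    return regions.get(region.lower(), regions[default])
```

The client would ask for the list once during setup, let the user pick,
and leave the round-robin among servers inside a region to DNS.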

The GPL CDDB code could be used to create a front end for compatibility.

This does leave a single point of failure, the root server. We should
consider options to cover for this. Some kind of failover method maybe? Or
we could just pick someone with nice stable servers with a hot spare and
hope for the best. The system itself could function for quite some time
without the root server before problems got out of hand.

The above is simple, isn't too chatty (preserve bandwidth if possible!),
and most of the tools exist already. It also seems to cover the primary
topics I see posted in the list and on various pages, and gives us a basic
roadmap for a quick implementation. I would think a basic implementation
could be completed within a few weeks at most if we can get everyone to
agree on a basic database layout. The nice thing is that client access to
the DB doesn't need much of anything. Just basic HTTP, which most
CDDB-capable programs already have. This will help gain converts to the new
format; it's easy to do with the tools they already have. Remember
people, KISS! (Keep It Simple, Stupid)

Travis Tabbal
bigboss nospam at xmission.com