Re: [cdin] Re: Distributed Data

Greg Stein (gstein nospam at lyra.org)
Tue, 09 Mar 1999 13:35:05 -0800

Matt Sergeant wrote:
>
> "Kyle R. Rose" wrote:
> >...
> > Because the hash function required to insert data into the database is
> > one-way. This means that a troll cannot reasonably be expected to
> > know the hash value of ANY CD that he does not possess -- even CDs
> > that won't exist for millennia! Since it is reasonable to expect that
> > only a person who likes the CD is going to rush out, buy it, and put
> > it into their DMI/CDDP+/whatever-enabled player, they aren't going to
> > enter crap into the database. (Of course, they may just be
> > brain-dead, stupid, or poor spellers, but I'm not sure there's a way
> > around this short of creating a trust system for "reliable" data
> > providers.)
>
> Look at the # of hits on cddb.com - entries are about 300 a day. That's
> not too many to check for dupes manually if there isn't an exact match.
> I'd bet that if you used some decent code that checks for duplications
> you'd have only about 10 a day to check manually. One person could cope
> with that.

Excellent.

A daily post-process could scan the new submissions and send a report of
likely duplicates to an editing team. Those 10 or so a day could then be
split across a group to lighten the load even further.
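
To make the idea concrete, here is a rough sketch of what such a daily
scan might look like. Everything in it is an assumption for illustration:
the "artist / title" string format, the function names, and the 0.85
similarity threshold are mine, not part of any existing CDDB code. It
drops exact matches (after normalizing case and punctuation) and reports
only the near-misses that a human editor would need to look at.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    return " ".join(cleaned.split())


def daily_report(new_submissions, existing, threshold=0.85):
    """Return (submission, closest_known, score) for likely duplicates.

    Exact matches after normalization are skipped, since they need no
    human review. The threshold is a hypothetical starting point and
    would have to be tuned against real data.
    """
    report = []
    for new in new_submissions:
        new_key = normalize(new)
        best = None
        exact = False
        for known in existing:
            known_key = normalize(known)
            if new_key == known_key:
                exact = True
                break
            score = SequenceMatcher(None, new_key, known_key).ratio()
            if score >= threshold and (best is None or score > best[1]):
                best = (known, score)
        if not exact and best is not None:
            report.append((new, best[0], round(best[1], 2)))
    return report


if __name__ == "__main__":
    existing = ["Pink Floyd / The Wall", "Miles Davis / Kind of Blue"]
    todays = [
        "pink floyd / the wall",    # exact match once normalized
        "Pink Floyd / The Wal",     # probable typo of a known entry
        "Radiohead / OK Computer",  # genuinely new
    ]
    for new, known, score in daily_report(todays, existing):
        print(f"{new!r} looks like {known!r} (similarity {score})")
```

Comparing every submission against every existing entry is fine at 300
submissions a day against a small database, but a real master database
would want an index or a blocking key (say, the first few letters of
the artist) to narrow the candidates first.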

This implies authenticated, privileged editing access to the master
database (I'm still assuming a master/slave replication model). I don't
think this is a problem, but it merits a mention.

Cheers,
-g

--
Greg Stein, http://www.lyra.org/