[mb-users] Extending MusicBrainz to hold audio checksums
2009-06-05 10:32:00 GMT
How can we extend MusicBrainz so that people ripping digital media may
verify their audio content is correctly media-shifted from audio CD
sources?
There is a bigger challenge here. Reading CD audio is an inexact science:
- Labels may release multiple pressings of the same content, leading to the same audio at slightly differing sample offsets
- Consumer CD drives are not consistent about the sample offset at which they start reading
- No drive on the market can reliably detect all errors when reading audio from CD media
If we can store audio data checksums in a meaningful way, should we?
Possible approaches:
1) No "zero" reference. Cover all possible read offsets of the drives
available on the market today by computing many SHA-1 checksums (3072
of them) per track and per album. This amounts to 33KiB * (num tracks
+ 1) of checksum data per release. It has the benefit of working
without drive calibration, as most CD pressings contain no useful data
in the missing samples (at most about 5 sectors, i.e. 5/75 of a second
of audio, since CD audio has 75 sectors per second). The drawbacks are
the time required to compute the checksums and the nearly 400KiB of
checksum data per release (for an 11-track album).
2) As above, but with a "zero" reference. Maintain a list of approved
"zero offset" drives (I own such a drive, the Plextor PX-712SA). This
zero reference differs from the one used by AccurateRip(TM) by 30
samples. Checksums stored in the database would be moderated and voted
on only by people submitting from approved hardware. This reduces the
data storage requirement to less than 5KiB per release. Client
verification software would still carry the heavy computational load
of generating all possible checksums, as described in method #1.
3) Guess what the actual audio content of the release is, and cover
all possible read offsets for that content as a whole. This would also
include some kind of "inner checksum" calculated at an offset from the
start and end of the useful audio. Required storage is less than 70KiB
per release.
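To make the offset sweep in 1) and 2) a bit more concrete, here is a
rough sketch in Python. It is only an illustration under assumptions
of mine, not a proposed implementation: the function names
(offset_checksums, find_offset) and the max_offset parameter are
hypothetical, the rip is assumed to be one raw 16-bit stereo PCM
stream for the whole disc, and samples a drive cannot reach are
treated as silence. With max_offset = 1536 the sweep yields the 3072
checksums mentioned in 1).

```python
import hashlib

BYTES_PER_SAMPLE = 4  # CD audio: 16-bit stereo, 2 channels x 2 bytes


def offset_checksums(disc, start, length, max_offset):
    """SHA-1 of one track for every read offset in [-max_offset, max_offset).

    disc   -- raw PCM bytes of the whole rip (hypothetical input format)
    start  -- first sample of the track within the rip
    length -- number of samples in the track
    """
    # Pad both ends with silence so every shifted window stays in range.
    pad = b"\x00" * (max_offset * BYTES_PER_SAMPLE)
    padded = pad + disc + pad
    sums = {}
    for offset in range(-max_offset, max_offset):
        lo = (start + offset + max_offset) * BYTES_PER_SAMPLE
        hi = lo + length * BYTES_PER_SAMPLE
        sums[offset] = hashlib.sha1(padded[lo:hi]).hexdigest()
    return sums


def find_offset(disc, start, length, reference_sha1, max_offset):
    """Return the shift at which the rip matches a stored "zero offset"
    checksum (approach 2), or None if no shift in the window matches."""
    for offset, digest in offset_checksums(disc, start, length, max_offset).items():
        if digest == reference_sha1:
            return offset
    return None
```

In approach 1) the database would hold the whole table returned by
offset_checksums and a client would only need to look up the single
checksum of its own rip. In approach 2) the database would hold one
reference checksum and the client would run something like
find_offset, which is where the heavy computation ends up.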
This concept of verifying CD audio rips does walk the line between
what does and does not fall within the purpose of the MusicBrainz
database.
Thoughts? Comments welcomed.
_______________________________________________
MusicBrainz-users mailing list
MusicBrainz-users <at> lists.musicbrainz.org
http://lists.musicbrainz.org/mailman/listinfo/musicbrainz-users