soundfingerprinting is a C# framework designed for companies, enthusiasts, and researchers in the fields of digital signal processing, data mining, and audio/video recognition. It implements an efficient algorithm that provides fast insertion and retrieval of acoustic and video fingerprints with high precision and recall.
Full documentation is available on the Wiki page.
The code snippet below shows how to extract acoustic fingerprints from an audio file and later use them as identifiers to recognize an unknown audio query. These fingerprints are stored in a configurable datastore.
private readonly IModelService modelService = new InMemoryModelService(); // store fingerprints in RAM
private readonly IAudioService audioService = new SoundFingerprintingAudioService(); // default audio library

public async Task StoreForLaterRetrieval(string file)
{
    var track = new TrackInfo("GBBKS1200164", "Skyfall", "Adele");

    // create fingerprints
    var avHashes = await FingerprintCommandBuilder.Instance
        .BuildFingerprintCommand()
        .From(file)
        .UsingServices(audioService)
        .Hash();

    // store hashes in the database for later retrieval
    modelService.Insert(track, avHashes);
}
Once you have inserted the fingerprints into the datastore, you can query the storage to recognize the song that your samples come from. The origin of the query samples may vary: file, URL, microphone, radio tuner, etc. Where the samples come from is up to your application.
public async Task<TrackData> GetBestMatchForSong(string file)
{
    int secondsToAnalyze = 10; // number of seconds to analyze from query file
    int startAtSecond = 0; // start at the beginning

    // query the underlying database for similar audio sub-fingerprints
    var queryResult = await QueryCommandBuilder.Instance.BuildQueryCommand()
        .From(file, secondsToAnalyze, startAtSecond)
        .UsingServices(modelService, audioService)
        .Query();

    return queryResult.BestMatch.Track;
}
The default storage, which comes bundled with the soundfingerprinting NuGet package, is a plain in-memory storage, available via the InMemoryModelService class. If you plan to use an external persistent storage for fingerprints, Emy is the preferred choice. Emy provides a community version which is free for non-commercial use. More about Emy can be found on the wiki page.
Read the Supported Media Formats page for details about processing different file formats or realtime streams.
Since v8.0.0, video fingerprinting is supported. As with audio fingerprinting, video fingerprints are generated from video frames and are used to insert into and later query the datastore for exact and similar matches. You can use SoundFingerprinting to fingerprint audio content, video content, or both at the same time. More details about video fingerprinting are available here.
- Can I apply this algorithm for speech recognition purposes?
No. The granularity of one fingerprint is roughly 1.46 seconds, which is too coarse for recognizing individual words.
- Can the algorithm detect the exact position of the query within the matched track?
Yes.
- Can I use SoundFingerprinting to detect ads in radio streams?
Yes. In fact, this is the most frequent use case in which SoundFingerprinting has been used successfully.
- How many tracks can I store in InMemoryModelService?
100 hours of content with DefaultFingerprintingConfiguration will consume ~5GB of RAM.
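As a rough back-of-the-envelope aid, the figure above works out to about 50 MB of RAM per hour of content. The sketch below scales that figure to other catalogue sizes. It assumes that memory usage grows linearly with content duration, which is an assumption rather than a documented guarantee, and it only applies to DefaultFingerprintingConfiguration. `MemoryEstimate` and `EstimateGigabytes` are hypothetical names used for illustration; they are not part of the SoundFingerprinting API.

```csharp
using System;

public static class MemoryEstimate
{
    // Figures quoted in the FAQ: ~5 GB of RAM for 100 hours of content.
    private const double ReferenceHours = 100.0;
    private const double ReferenceGigabytes = 5.0;

    // Assumption: RAM usage scales linearly with the hours of indexed content.
    public static double EstimateGigabytes(double hoursOfContent)
    {
        if (hoursOfContent < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(hoursOfContent));
        }

        return hoursOfContent * ReferenceGigabytes / ReferenceHours;
    }

    public static void Main()
    {
        // Print a rough estimate for a few catalogue sizes.
        foreach (var hours in new[] { 10.0, 100.0, 500.0 })
        {
            Console.WriteLine($"{hours} h of content: ~{EstimateGigabytes(hours):F1} GB of RAM");
        }
    }
}
```

Treat the result as an order-of-magnitude estimate; measure actual memory usage with your own content before sizing a machine.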
Install-Package SoundFingerprinting
My description of the algorithm, alongside the demo project, can be found on CodeProject. The article is from 2011 and may be outdated. The demo project is an Audio File Duplicates Detector. Its latest source code can be found here. It's a WPF MVVM project that uses the algorithm to detect which files are perceptually very similar.
If you want to contribute, you are welcome to open issues or join the discussion on the issues page. Feel free to contact me with any remarks, ideas, bug reports, etc.
The framework is provided under the MIT license agreement.
© Soundfingerprinting, 2010-2021, [email protected]