forked from seasalt-ai/snowboy
0 parents
commit 23758ba
Showing 17 changed files with 564 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
/lib/libsnowboy-detect.a | ||
snowboy-detect-swig.cc | ||
snowboydetect.py | ||
|
||
*.pyc | ||
*.o | ||
*.so |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,117 @@ | ||
# Snowboy Hotword Detection | ||
|
||
by [KITT.AI](http://kitt.ai). | ||
|
||
[Home Page](https://snowboy.kitt.ai) | ||
|
||
[Full Documentation](https://snowboy.kitt.ai/docs) | ||
|
||
|
||
Version: 1.0.0 (5/10/2016) | ||
|
||
Snowboy is a customizable hotword detection engine for you to create your own | ||
hotword like "OK Google" or "Alexa". It is powered by deep neural networks and has the following properties: | ||
|
||
* **highly customizable**: you can freely define your own magic phrase here – | ||
let it be “open sesame”, “garage door open”, or “hello dreamhouse”, you name it. | ||
|
||
* **always listening** but protects your privacy: Snowboy does not use the Internet and does *not* stream your voice to the cloud. | ||
|
||
* light-weight and **embedded**: it even runs on a Raspberry Pi and consumes less than 10% CPU on the weakest Pi (single-core 700MHz ARMv6). | ||
|
||
* Apache licensed! | ||
|
||
Currently Snowboy supports: | ||
|
||
* all versions of Raspberry Pi (with Raspbian based on Debian Jessie 8.0) | ||
* 64-bit Mac OS X | ||
* 64-bit Ubuntu (12.04 and 14.04) | ||
|
||
It ships in the form of a **C++ library** with **Python** wrappers generated by SWIG. We welcome wrappers for other languages -- feel free to send a pull request! | ||
|
||
If you want support on other hardware/OS, please send your request to [[email protected]](mailto:snowboy.kitt.ai) | ||
|
||
|
||
## Dependencies | ||
|
||
Snowboy's Python wrapper uses PortAudio to access your device's microphone. | ||
|
||
### Mac OS X | ||
|
||
`brew` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`: | ||
|
||
brew install swig portaudio sox | ||
pip install pyaudio | ||
|
||
If you don't have Homebrew installed, please download it [here](http://brew.sh/). If you don't have `pip`, you can install it [here](https://pip.pypa.io/en/stable/installing/). | ||
|
||
Make sure that you can record audio with your microphone: | ||
|
||
rec t.wav | ||
|
||
### Ubuntu | ||
|
||
First `apt-get` install `swig`, `sox`, `portaudio` and its Python binding `pyaudio`: | ||
|
||
sudo apt-get install swig3.0 python-pyaudio python3-pyaudio sox | ||
pip install pyaudio | ||
|
||
Then install the `atlas` matrix computing library: | ||
|
||
sudo apt-get install libatlas-base-dev | ||
|
||
Make sure that you can record audio with your microphone: | ||
|
||
rec t.wav | ||
If you need extra setup on your audio (especially on a Raspberry Pi), please see the [full documentation](https://snowboy.kitt.ai/docs). | ||
|
||
## Compile a Python Wrapper | ||
|
||
cd swig/python | ||
make | ||
|
||
SWIG will generate a `_snowboydetect.so` file and a simple (but hard-to-read) Python wrapper `snowboydetect.py`. We have provided a higher-level Python wrapper, `snowboydecoder.py`, on top of that. | ||
|
||
Feel free to adapt the `Makefile` in `swig/python` to your own system's settings if `make` fails. | ||
|
||
|
||
## Quick Start | ||
|
||
Go to the `swig/python` folder and open your python console: | ||
|
||
In [1]: import snowboydecoder | ||
|
||
In [2]: def detected_callback(): | ||
   ....:     print("hotword detected") | ||
....: | ||
|
||
In [3]: detector = snowboydecoder.HotwordDetector("resources/snowboy.umdl", sensitivity=0.5, audio_gain=1) | ||
|
||
In [4]: detector.start(detected_callback) | ||
|
||
Then say "snowboy" into your microphone to see whether Snowboy detects you. | ||
|
||
The `snowboy.umdl` file is a "universal" model that detects different people saying "snowboy". If you want other hotwords, please go to [snowboy.kitt.ai](https://snowboy.kitt.ai) to record, train and download your own personal model (a `.pmdl` file). | ||
|
||
When `sensitivity` is higher, the hotword is triggered more easily, but you might also get more false alarms. | ||
|
||
`audio_gain` controls whether to increase (>1) or decrease (<1) input volume. | ||
|
||
Two demo files, `demo.py` and `demo2.py`, are provided to show more usage examples. | ||
|
||
Note: if you see the following error: | ||
|
||
TypeError: __init__() got an unexpected keyword argument 'model_str' | ||
|
||
You are probably using an old version of SWIG. Please upgrade. We have tested with SWIG version 3.0.7 and 3.0.8. | ||
|
||
## Advanced Usage & Demos | ||
|
||
See [Full Documentation](https://snowboy.kitt.ai/docs). | ||
|
||
## Change Log | ||
|
||
**5/10/2016** | ||
|
||
* initial release |
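
The Quick Start above says that `audio_gain` increases (>1) or decreases (<1) the input volume. As a rough illustration of what a fixed gain means for 16-bit PCM audio, the sketch below multiplies each sample by the gain and clips it to the valid range. `apply_gain` is a hypothetical helper written for this note; it is not part of Snowboy and is not necessarily how the library applies gain internally.

```python
import struct


def apply_gain(pcm_bytes, gain):
    """Scale little-endian 16-bit signed PCM samples by `gain`, with clipping.

    Hypothetical helper for illustration only; not part of Snowboy.
    """
    count = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % count, pcm_bytes[:count * 2])
    # Clip to the 16-bit signed range so loud samples do not wrap around.
    scaled = [max(-32768, min(32767, int(round(s * gain)))) for s in samples]
    return struct.pack("<%dh" % count, *scaled)


quiet = struct.pack("<3h", 1000, -2000, 30000)
louder = struct.unpack("<3h", apply_gain(quiet, 2.0))
softer = struct.unpack("<3h", apply_gain(quiet, 0.5))
print(louder, softer)
```

Note that a gain above 1 can push loud samples into clipping, which is one reason to raise it only for a weak microphone.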
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
// include/snowboy-detect.h | ||
|
||
// Copyright 2016 KITT.AI (author: Guoguo Chen) | ||
|
||
#ifndef SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ | ||
#define SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ | ||
|
||
#include <memory> | ||
#include <string> | ||
|
||
namespace snowboy { | ||
|
||
// Forward declaration. | ||
struct WaveHeader; | ||
class PipelineDetect; | ||
|
||
//////////////////////////////////////////////////////////////////////////////// | ||
// | ||
// SnowboyDetect class interface. | ||
// | ||
//////////////////////////////////////////////////////////////////////////////// | ||
class SnowboyDetect { | ||
public: | ||
// Constructor that takes a resource file and a list of hotword models | ||
// separated by commas. If the provided models contain more than one hotword, | ||
// RunDetection() will return the index of the hotword whose detection was | ||
// triggered. | ||
// | ||
// CAVEAT: a personal model contains only one hotword, but a universal model | ||
// may contain multiple hotwords. It is your responsibility to figure | ||
// out the index of the hotword. For example, if your model string is | ||
// "foo.pmdl,bar.umdl", where foo.pmdl contains hotword x, bar.umdl | ||
// has two hotwords y and z, the indices of different hotwords are as | ||
// follows: | ||
// x 1 | ||
// y 2 | ||
// z 3 | ||
// | ||
// @param [in] resource_filename Filename of resource file. | ||
// @param [in] model_str A string of multiple hotword models, | ||
// separated by comma. | ||
SnowboyDetect(const std::string& resource_filename, | ||
const std::string& model_str); | ||
|
||
// Resets the detection. This class handles voice activity detection (VAD) | ||
// internally. But if you have an external VAD, you should call Reset() | ||
// whenever your VAD reports the end of a segment. | ||
bool Reset(); | ||
|
||
// Runs hotword detection. Supported audio format is WAVE (with linear PCM, | ||
// 8-bit unsigned integer, 16-bit signed integer or 32-bit signed integer). | ||
// See SampleRate(), NumChannels() and BitsPerSample() for the required | ||
// sampling rate, number of channels and bits per sample values. You are | ||
// supposed to provide a small chunk of data (e.g., 0.1 second) each time you | ||
// call RunDetection(). Larger chunks usually lead to longer delay, but less | ||
// CPU usage. | ||
// | ||
// Definition of return values: | ||
// -1: Error. | ||
// 0: No event. | ||
// 1: Hotword 1 triggered. | ||
// 2: Hotword 2 triggered. | ||
// ... | ||
// | ||
// @param [in] data Small chunk of data to be detected. See | ||
// above for the supported data format. | ||
int RunDetection(const std::string& data); | ||
|
||
// Sets the sensitivity string for the loaded hotwords. A <sensitivity_str> is | ||
// a comma-separated list of floating-point numbers between 0 and 1. For | ||
// example, if there are 3 loaded hotwords, your string should look something | ||
// like this: | ||
// 0.4,0.5,0.8 | ||
// Make sure you properly align the sensitivity value to the corresponding | ||
// hotword. | ||
void SetSensitivity(const std::string& sensitivity_str); | ||
|
||
// Returns the sensitivity string for the current hotwords. | ||
std::string GetSensitivity() const; | ||
|
||
// Applies a fixed gain to the input audio. In case you have a very weak | ||
// microphone, you can use this function to boost input audio level. | ||
void SetAudioGain(const float audio_gain); | ||
|
||
// Writes the models to the model filenames specified in <model_str> in the | ||
// constructor. This overwrites the original model with the latest parameter | ||
// setting. You are supposed to call this function if you have updated the | ||
// hotword sensitivities through SetSensitivity(), and you would like to store | ||
// those values in the model as the default value. | ||
void UpdateModel() const; | ||
|
||
// Returns the number of loaded hotwords. This helps you figure out the | ||
// index of each hotword. | ||
int NumHotwords() const; | ||
|
||
// Returns the required sampling rate, number of channels and bits per sample | ||
// values for the audio data. You should use this information to set up your | ||
// audio capturing interface. | ||
int SampleRate() const; | ||
int NumChannels() const; | ||
int BitsPerSample() const; | ||
|
||
~SnowboyDetect(); | ||
|
||
private: | ||
std::unique_ptr<WaveHeader> wave_header_; | ||
std::unique_ptr<PipelineDetect> detect_pipeline_; | ||
}; | ||
|
||
} // namespace snowboy | ||
|
||
#endif // SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_ |
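
The header comments above describe three conventions that are easy to get wrong: hotword indices are 1-based and run across all models in order, sensitivities are passed as one comma-separated string aligned with those indices, and audio is fed in small chunks. The Python sketch below restates those conventions without calling the library. The helper names are hypothetical, the hotword names per model must be supplied by the caller (nothing here reads a model file), and the 16000 Hz / mono / 16-bit values are assumed for illustration only; the real values come from `SampleRate()`, `NumChannels()` and `BitsPerSample()`.

```python
def hotword_indices(models):
    """Map each hotword to the 1-based index RunDetection() would report.

    `models` is a list of (filename, [hotword names]) pairs, in the same
    order as the comma-separated model string.
    """
    indices = {}
    next_index = 1
    for _filename, hotwords in models:
        for name in hotwords:
            indices[name] = next_index
            next_index += 1
    return indices


def sensitivity_string(values):
    """Join per-hotword sensitivities into the comma-separated form."""
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity must be between 0 and 1: %r" % value)
    return ",".join(str(v) for v in values)


def chunk_bytes(sample_rate, num_channels, bits_per_sample, seconds=0.1):
    """Number of bytes in `seconds` of raw PCM audio."""
    frames = int(round(sample_rate * seconds))
    return frames * num_channels * (bits_per_sample // 8)


# The example from the header comment: foo.pmdl has x, bar.umdl has y and z.
models = [("foo.pmdl", ["x"]), ("bar.umdl", ["y", "z"])]
model_str = ",".join(filename for filename, _ in models)
indices = hotword_indices(models)
sens = sensitivity_string([0.4, 0.5, 0.8])
size = chunk_bytes(16000, 1, 16)  # assumed audio format, see lead-in
print(model_str, indices, sens, size)
```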
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# Example Makefile that converts the Snowboy C++ library (libsnowboy-detect.a) | ||
# to a Python library (_snowboydetect.so, snowboydetect.py), using SWIG. | ||
|
||
# Some versions of swig do not work well. We prefer compiling swig from source | ||
# code. We have tested swig-3.0.7.tar.gz. | ||
SWIG := swig | ||
|
||
SNOWBOYDETECTSWIGITF = snowboy-detect-swig.i | ||
SNOWBOYDETECTSWIGOBJ = snowboy-detect-swig.o | ||
SNOWBOYDETECTSWIGCC = snowboy-detect-swig.cc | ||
SNOWBOYDETECTSWIGLIBFILE = _snowboydetect.so | ||
|
||
TOPDIR := ../../ | ||
CXXFLAGS := -I$(TOPDIR) -O3 -fPIC | ||
LDFLAGS := | ||
|
||
ifeq ($(shell uname), Darwin) | ||
CXX := clang++ | ||
PYINC := $(shell /usr/bin/python2.7-config --includes) | ||
PYLIBS := $(shell /usr/bin/python2.7-config --ldflags) | ||
SWIGFLAGS := -bundle -flat_namespace -undefined suppress | ||
LDLIBS := -lm -ldl -framework Accelerate | ||
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/osx/libsnowboy-detect.a | ||
else | ||
CXX := g++ | ||
PYINC := $(shell python-config --cflags) | ||
PYLIBS := $(shell python-config --ldflags) | ||
SWIGFLAGS := -shared | ||
CXXFLAGS += -std=c++0x | ||
# Make sure you have Atlas installed. You can statically link Atlas if you | ||
# would like to be able to move the library to a machine without Atlas. | ||
LDLIBS := -lm -ldl -lf77blas -lcblas -llapack_atlas -latlas | ||
SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/ubuntu64/libsnowboy-detect.a | ||
endif | ||
|
||
all: $(SNOWBOYDETECTSWIGLIBFILE) | ||
|
||
%.a: | ||
	$(MAKE) -C ${@D} ${@F} | ||
|
||
$(SNOWBOYDETECTSWIGCC): $(SNOWBOYDETECTSWIGITF) | ||
	$(SWIG) -I$(TOPDIR) -c++ -python -o $(SNOWBOYDETECTSWIGCC) $(SNOWBOYDETECTSWIGITF) | ||
|
||
$(SNOWBOYDETECTSWIGOBJ): $(SNOWBOYDETECTSWIGCC) | ||
	$(CXX) $(PYINC) $(CXXFLAGS) -c $(SNOWBOYDETECTSWIGCC) | ||
|
||
$(SNOWBOYDETECTSWIGLIBFILE): $(SNOWBOYDETECTSWIGOBJ) $(SNOWBOYDETECTLIBFILE) | ||
	$(CXX) $(CXXFLAGS) $(LDFLAGS) $(SWIGFLAGS) $(SNOWBOYDETECTSWIGOBJ) \ | ||
	$(SNOWBOYDETECTLIBFILE) $(PYLIBS) $(LDLIBS) -o $(SNOWBOYDETECTSWIGLIBFILE) | ||
|
||
clean: | ||
	-rm -f *.o *.a *.so snowboydetect.py *.pyc $(SNOWBOYDETECTSWIGCC) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,35 @@ | ||
import snowboydecoder | ||
import sys | ||
import signal | ||
|
||
interrupted = False | ||
|
||
|
||
def signal_handler(signal, frame): | ||
    global interrupted | ||
    interrupted = True | ||
|
||
|
||
def interrupt_callback(): | ||
    global interrupted | ||
    return interrupted | ||
|
||
if len(sys.argv) == 1: | ||
    print("Error: need to specify model name") | ||
    print("Usage: python demo.py your.model") | ||
    sys.exit(-1) | ||
|
||
model = sys.argv[1] | ||
|
||
# capture SIGINT signal, e.g., Ctrl+C | ||
signal.signal(signal.SIGINT, signal_handler) | ||
|
||
detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5) | ||
print('Listening... Press Ctrl+C to exit') | ||
|
||
# main loop | ||
detector.start(detected_callback=snowboydecoder.play_audio_file, | ||
               interrupt_check=interrupt_callback, | ||
               sleep_time=0.03) | ||
|
||
detector.terminate() |
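
The demo above stops its main loop through a flag: the SIGINT handler sets `interrupted`, and `interrupt_check` is polled until it returns true. The loop itself lives in `snowboydecoder.py`, which is not shown in this excerpt, so the sketch below is only an assumption about how `interrupt_check` and `sleep_time` are typically used. `run_until_interrupted` and `fake_poll` are hypothetical names, and the interrupt is simulated rather than coming from a real signal.

```python
import time


def run_until_interrupted(poll, interrupt_check, sleep_time=0.03):
    """Call `poll()` repeatedly until `interrupt_check()` returns True.

    Hypothetical stand-in for a detector main loop; `poll` represents
    whatever work is done on each pass.
    """
    passes = 0
    while not interrupt_check():
        poll()
        passes += 1
        time.sleep(sleep_time)
    return passes


state = {"interrupted": False, "polls": 0}


def fake_poll():
    # Simulate Ctrl+C arriving during the third pass.
    state["polls"] += 1
    if state["polls"] == 3:
        state["interrupted"] = True


passes = run_until_interrupted(fake_poll, lambda: state["interrupted"],
                               sleep_time=0.001)
print(passes)
```

Checking a flag between passes keeps the signal handler trivial and lets the loop exit at a safe point, so cleanup such as `detector.terminate()` can run afterwards.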
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
import snowboydecoder | ||
import sys | ||
import signal | ||
|
||
# Demo code for listening for two hotwords at the same time | ||
|
||
interrupted = False | ||
|
||
|
||
def signal_handler(signal, frame): | ||
    global interrupted | ||
    interrupted = True | ||
|
||
|
||
def interrupt_callback(): | ||
    global interrupted | ||
    return interrupted | ||
|
||
if len(sys.argv) != 3: | ||
    print("Error: need to specify 2 model names") | ||
    print("Usage: python demo2.py 1st.model 2nd.model") | ||
    sys.exit(-1) | ||
|
||
models = sys.argv[1:] | ||
|
||
# capture SIGINT signal, e.g., Ctrl+C | ||
signal.signal(signal.SIGINT, signal_handler) | ||
|
||
sensitivity = [0.5]*len(models) | ||
detector = snowboydecoder.HotwordDetector(models, sensitivity=sensitivity) | ||
callbacks = [lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DING), | ||
             lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DONG)] | ||
print('Listening... Press Ctrl+C to exit') | ||
|
||
# main loop | ||
# make sure you have the same number of callbacks as models | ||
detector.start(detected_callback=callbacks, | ||
               interrupt_check=interrupt_callback, | ||
               sleep_time=0.03) | ||
|
||
detector.terminate() |
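
The two-hotword demo passes one callback per model and relies on their order matching. Given the `RunDetection()` return convention in the header (-1 error, 0 no event, N for hotword N), one plausible way to route a result to its callback is shown below. `dispatch` is a hypothetical helper written for this note; the actual dispatch code is in `snowboydecoder.py`, which is not part of this excerpt.

```python
def dispatch(result, callbacks):
    """Invoke the callback matching a 1-based hotword index.

    `result` follows the RunDetection() convention: -1 error, 0 no event,
    N > 0 hotword N triggered. Returns True if a callback was called.
    """
    if result == -1:
        raise RuntimeError("detection error")
    if result == 0:
        return False
    # Hotword indices start at 1, list positions at 0.
    callbacks[result - 1]()
    return True


fired = []
callbacks = [lambda: fired.append("ding"), lambda: fired.append("dong")]
for result in (0, 2, 1, 0):
    dispatch(result, callbacks)
print(fired)
```

With fewer callbacks than hotwords, the list lookup would raise `IndexError`, which is why the counts have to match.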
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
PyAudio==0.2.9 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
../../resources/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
// swig/snowboy-detect-swig.i | ||
|
||
// Copyright 2016 KITT.AI (author: Guoguo Chen) | ||
|
||
%module snowboydetect | ||
|
||
// Suppress SWIG warnings. | ||
#pragma SWIG nowarn=SWIGWARN_PARSE_NESTED_CLASS | ||
%include "std_string.i" | ||
|
||
%{ | ||
#include "include/snowboy-detect.h" | ||
%} | ||
|
||
%include "include/snowboy-detect.h" |