initial commit

simonchen007 · May 10, 2016 · commit 23758ba

Showing 17 changed files with 564 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,7 @@
+/lib/libsnowboy-detect.a
+snowboy-detect-swig.cc
+snowboydetect.py
+
+*.pyc
+*.o
+*.so
diff --git a/README.md b/README.md
@@ -0,0 +1,117 @@
+# Snowboy Hotword Detection
+
+by [KITT.AI](http://kitt.ai).
+
+[Home Page](https://snowboy.kitt.ai)
+
+[Full Documentation](https://snowboy.kitt.ai/docs)
+
+
+Version: 1.0.0 (5/10/2016)
+
+Snowboy is a customizable hotword detection engine for you to create your own
+hotword like "OK Google" or "Alexa". It is powered by deep neural networks and has the following properties:
+
+* **highly customizable**: you can freely define your own magic phrase here –
+let it be “open sesame”, “garage door open”, or “hello dreamhouse”, you name it.
+
+* **always listening** but protects your privacy: Snowboy does not use Internet and does *not* stream your voice to the cloud.
+
+* light-weight and **embedded**: it even runs on a Raspberry Pi and consumes less than 10% CPU on the weakest Pi (single-core 700MHz ARMv6).
+
+* Apache licensed!
+
+Currently Snowboy supports:
+
+* all versions of Raspberry Pi (with Raspbian based on Debian Jessie 8.0)
+* 64bit Mac OS X
+* 64bit Ubuntu (12.04 and 14.04)
+
+It ships in the form of a **C library** with **Python** wrappers generated by SWIG. We welcome wrappers for other languages -- feel free to send a pull request!
+
+If you want support on other hardware/OS, please send your request to [snowboy@kitt.ai](mailto:snowboy@kitt.ai)
+
+
+## Dependencies
+
+Snowboy's Python wrapper uses PortAudio to access your device's microphone.
+
+### Mac OS X
+
+Use `brew` to install `swig`, `sox` and `portaudio`, then use `pip` to install PortAudio's Python binding `pyaudio`:
+
+    brew install swig portaudio sox
+    pip install pyaudio
+
+If you don't have Homebrew installed, please download it [here](http://brew.sh/). If you don't have `pip`, you can install it [here](https://pip.pypa.io/en/stable/installing/).
+
+Make sure that you can record audio with your microphone:
+
+    rec t.wav
+
+### Ubuntu
+
+First use `apt-get` to install `swig`, `sox` and the PortAudio Python binding, then use `pip` to install `pyaudio`:
+
+    sudo apt-get install swig3.0 python-pyaudio python3-pyaudio sox
+    pip install pyaudio
+
+Then install the `atlas` matrix computing library:
+
+    sudo apt-get install libatlas-base-dev
+
+Make sure that you can record audio with your microphone:
+
+    rec t.wav
+
+If you need extra setup on your audio (especially on a Raspberry Pi), please see the [full documentation](https://snowboy.kitt.ai/docs).
+
+## Compile a Python Wrapper
+
+    cd swig/python
+    make
+
+SWIG will generate a `_snowboydetect.so` file and a simple (but hard-to-read) python wrapper `snowboydetect.py`. We have provided a higher level python wrapper `snowboydecoder.py` on top of that.
+
+Feel free to adapt the `Makefile` in `swig/python` to your own system's setting if you cannot `make` it.
+
+
+## Quick Start
+
+Go to the `swig/python` folder and open your python console:
+
+    In [1]: import snowboydecoder
+
+    In [2]: def detected_callback():
+       ....:     print "hotword detected"
+       ....:
+
+    In [3]: detector = snowboydecoder.HotwordDetector("resources/snowboy.umdl", sensitivity=0.5, audio_gain=1)
+
+    In [4]: detector.start(detected_callback)
+
+Then speak "snowboy" to your microphone to see whether Snowboy detects you.
+
+The `snowboy.umdl` file is a "universal" model that detects different people speaking "snowboy". If you want other hotwords, please go to [snowboy.kitt.ai](https://snowboy.kitt.ai) to record, train and download your own personal model (a `.pmdl` file).
+
+When `sensitivity` is higher, the hotword gets triggered more easily, but you might also get more false alarms.
+
+`audio_gain` controls whether to increase (>1) or decrease (<1) input volume.
+
+Two demo files, `demo.py` and `demo2.py`, are provided to show further usage.
+
+Note: if you see the following error:
+
+    TypeError: __init__() got an unexpected keyword argument 'model_str'
+
+You are probably using an old version of SWIG. Please upgrade. We have tested with SWIG version 3.0.7 and 3.0.8.
+
+## Advanced Usages & Demos
+
+See [Full Documentation](https://snowboy.kitt.ai/docs).
+
+## Change Log
+
+**5/10/2016**
+
+* initial release
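The README describes `audio_gain` as raising (>1) or lowering (<1) the input volume. A minimal sketch of what a fixed gain means for 16-bit signed PCM samples follows. The helper `apply_gain` is hypothetical and only illustrates the idea; it is not Snowboy's internal implementation, and the 16-bit sample width is an assumption for the example.

```python
import array


def apply_gain(pcm_bytes, audio_gain):
    """Scale 16-bit signed PCM samples by audio_gain (hypothetical helper).

    Values that would overflow are clipped to the int16 range.
    """
    samples = array.array("h")  # "h" = signed 16-bit integers, native byte order
    samples.frombytes(pcm_bytes)
    scaled = array.array(
        "h",
        (max(-32768, min(32767, int(round(s * audio_gain)))) for s in samples),
    )
    return scaled.tobytes()


quiet = array.array("h", [1000, -1000, 20000]).tobytes()

louder = array.array("h")
louder.frombytes(apply_gain(quiet, 2.0))
print(list(louder))  # [2000, -2000, 32767]  (the last sample is clipped)
```

A gain above 1 helps with a weak microphone but can clip loud input, as the last sample shows.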
diff --git a/include/snowboy-detect.h b/include/snowboy-detect.h
@@ -0,0 +1,112 @@
+// include/snowboy-detect.h
+
+// Copyright 2016  KITT.AI (author: Guoguo Chen)
+
+#ifndef SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_
+#define SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_
+
+#include <memory>
+#include <string>
+
+namespace snowboy {
+
+// Forward declaration.
+struct WaveHeader;
+class PipelineDetect;
+
+////////////////////////////////////////////////////////////////////////////////
+//
+// SnowboyDetect class interface.
+//
+////////////////////////////////////////////////////////////////////////////////
+class SnowboyDetect {
+ public:
+  // Constructor that takes a resource file, and a list of hotword models which
+  // are separated by comma. In the case that more than one hotword exists in
+  // the provided models, RunDetection() will return the index of the hotword,
+  // if the corresponding hotword is triggered.
+  //
+  // CAVEAT: a personal model only contains one hotword, but a universal model
+  //         may contain multiple hotwords. It is your responsibility to figure
+  //         out the index of the hotword. For example, if your model string is
+  //         "foo.pmdl,bar.umdl", where foo.pmdl contains hotword x, bar.umdl
+  //         has two hotwords y and z, the indices of different hotwords are as
+  //         follows:
+  //         x 1
+  //         y 2
+  //         z 3
+  //
+  // @param [in]  resource_filename   Filename of resource file.
+  // @param [in]  model_str           A string of multiple hotword models,
+  //                                  separated by comma.
+  SnowboyDetect(const std::string& resource_filename,
+                const std::string& model_str);
+
+  // Resets the detection. This class handles voice activity detection (VAD)
+  // internally. But if you have an external VAD, you should call Reset()
+  // whenever you see segment end from your VAD.
+  bool Reset();
+
+  // Runs hotword detection. Supported audio format is WAVE (with linear PCM,
+  // 8-bits unsigned integer, 16-bits signed integer or 32-bits signed integer).
+  // See SampleRate(), NumChannels() and BitsPerSample() for the required
+  // sampling rate, number of channels and bits per sample values. You are
+  // supposed to provide a small chunk of data (e.g., 0.1 second) each time you
+  // call RunDetection(). Larger chunk usually leads to longer delay, but less
+  // CPU usage.
+  //
+  // Definition of return values:
+  // -1: Error.
+  //  0: No event.
+  //  1: Hotword 1 triggered.
+  //  2: Hotword 2 triggered.
+  //  ...
+  //
+  //  @param [in]  data               Small chunk of data to be detected. See
+  //                                  above for the supported data format.
+  int RunDetection(const std::string& data);
+
+  // Sets the sensitivity string for the loaded hotwords. A <sensitivity_str> is
+  // a list of floating numbers between 0 and 1, and separated by comma. For
+  // example, if there are 3 loaded hotwords, your string should look something
+  // like this:
+  //   0.4,0.5,0.8
+  // Make sure you properly align the sensitivity value to the corresponding
+  // hotword.
+  void SetSensitivity(const std::string& sensitivity_str);
+
+  // Returns the sensitivity string for the current hotwords.
+  std::string GetSensitivity() const;
+
+  // Applies a fixed gain to the input audio. In case you have a very weak
+  // microphone, you can use this function to boost input audio level.
+  void SetAudioGain(const float audio_gain);
+
+  // Writes the models to the model filenames specified in <model_str> in the
+  // constructor. This overwrites the original model with the latest parameter
+  // setting. You are supposed to call this function if you have updated the
+  // hotword sensitivities through SetSensitivity(), and you would like to store
+  // those values in the model as the default value.
+  void UpdateModel() const;
+
+  // Returns the number of loaded hotwords. This helps you to figure out the
+  // index of the hotwords.
+  int NumHotwords() const;
+
+  // Returns the required sampling rate, number of channels and bits per sample
+  // values for the audio data. You should use this information to set up your
+  // audio capturing interface.
+  int SampleRate() const;
+  int NumChannels() const;
+  int BitsPerSample() const;
+
+  ~SnowboyDetect();
+
+ private:
+  std::unique_ptr<WaveHeader> wave_header_;
+  std::unique_ptr<PipelineDetect> detect_pipeline_;
+};
+
+}  // namespace snowboy
+
+#endif  // SNOWBOY_INCLUDE_SNOWBOY_DETECT_H_
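The header comments above describe two bookkeeping rules: hotword indices are assigned 1, 2, 3, ... in the order the models appear in `model_str`, and the sensitivity string must list one value per hotword in that same order. A small Python sketch of that bookkeeping follows. The helpers `hotword_indices` and `sensitivity_string` are hypothetical names introduced for illustration, and the caller is assumed to already know which hotwords each model file contains.

```python
def hotword_indices(models):
    """Map hotword names to the 1-based indices RunDetection() would return.

    `models` is a list of (filename, [hotword names]) pairs, in the same
    order as the filenames appear in model_str.
    """
    indices = {}
    next_index = 1
    for _filename, hotwords in models:
        for name in hotwords:
            indices[name] = next_index
            next_index += 1
    return indices


def sensitivity_string(indices, sensitivity_by_hotword):
    """Build a comma-separated sensitivity string aligned with the indices."""
    values = []
    for name in sorted(indices, key=indices.get):
        value = sensitivity_by_hotword[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity for %r must be between 0 and 1" % name)
        values.append(str(value))
    return ",".join(values)


# The example from the header comment: foo.pmdl holds x, bar.umdl holds y and z.
models = [("foo.pmdl", ["x"]), ("bar.umdl", ["y", "z"])]
indices = hotword_indices(models)
model_str = ",".join(filename for filename, _ in models)
sens = sensitivity_string(indices, {"x": 0.4, "y": 0.5, "z": 0.8})

print(model_str)  # foo.pmdl,bar.umdl
print(indices)    # {'x': 1, 'y': 2, 'z': 3}
print(sens)       # 0.4,0.5,0.8
```

Keeping the index map and the sensitivity string derived from one list makes it harder for the two to drift out of alignment when models are added or reordered.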
diff --git a/lib/ios/libsnowboy-detect.a b/lib/ios/libsnowboy-detect.a
diff --git a/lib/osx/libsnowboy-detect.a b/lib/osx/libsnowboy-detect.a
diff --git a/lib/ubuntu64/libsnowboy-detect.a b/lib/ubuntu64/libsnowboy-detect.a
diff --git a/resources/common.res b/resources/common.res
diff --git a/resources/ding.wav b/resources/ding.wav
diff --git a/resources/dong.wav b/resources/dong.wav
diff --git a/resources/snowboy.umdl b/resources/snowboy.umdl
diff --git a/swig/python/Makefile b/swig/python/Makefile
@@ -0,0 +1,52 @@
+# Example Makefile that converts snowboy c++ library (snowboy-detect.a) to
+# python library (_snowboydetect.so, snowboydetect.py), using swig.
+
+# Some versions of swig do not work well. We prefer compiling swig from source
+# code. We have tested swig-3.0.7.tar.gz.
+SWIG := swig
+
+SNOWBOYDETECTSWIGITF = snowboy-detect-swig.i
+SNOWBOYDETECTSWIGOBJ = snowboy-detect-swig.o
+SNOWBOYDETECTSWIGCC = snowboy-detect-swig.cc
+SNOWBOYDETECTSWIGLIBFILE = _snowboydetect.so
+
+TOPDIR := ../../
+CXXFLAGS := -I$(TOPDIR) -O3 -fPIC
+LDFLAGS :=
+
+ifeq ($(shell uname), Darwin)
+  CXX := clang++
+  PYINC := $(shell /usr/bin/python2.7-config --includes)
+  PYLIBS := $(shell /usr/bin/python2.7-config --ldflags)
+  SWIGFLAGS := -bundle -flat_namespace -undefined suppress
+  LDLIBS := -lm -ldl -framework Accelerate
+  SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/osx/libsnowboy-detect.a
+else
+  CXX := g++
+  PYINC := $(shell python-config --cflags)
+  PYLIBS := $(shell python-config --ldflags)
+  SWIGFLAGS := -shared
+  CXXFLAGS += -std=c++0x
+  # Make sure you have Atlas installed. You can statically link Atlas if you
+  # would like to be able to move the library to a machine without Atlas.
+  LDLIBS := -lm -ldl -lf77blas -lcblas -llapack_atlas -latlas
+  SNOWBOYDETECTLIBFILE = $(TOPDIR)/lib/ubuntu64/libsnowboy-detect.a
+endif
+
+all: $(SNOWBOYDETECTSWIGLIBFILE)
+
+%.a:
+	$(MAKE) -C ${@D} ${@F}
+
+$(SNOWBOYDETECTSWIGCC): $(SNOWBOYDETECTSWIGITF)
+	$(SWIG) -I$(TOPDIR) -c++ -python -o $(SNOWBOYDETECTSWIGCC) $(SNOWBOYDETECTSWIGITF)
+
+$(SNOWBOYDETECTSWIGOBJ): $(SNOWBOYDETECTSWIGCC)
+	$(CXX) $(PYINC) $(CXXFLAGS) -c $(SNOWBOYDETECTSWIGCC)
+
+$(SNOWBOYDETECTSWIGLIBFILE): $(SNOWBOYDETECTSWIGOBJ) $(SNOWBOYDETECTLIBFILE)
+	$(CXX) $(CXXFLAGS) $(LDFLAGS) $(SWIGFLAGS) $(SNOWBOYDETECTSWIGOBJ) \
+	$(SNOWBOYDETECTLIBFILE) $(PYLIBS) $(LDLIBS) -o $(SNOWBOYDETECTSWIGLIBFILE)
+
+clean:
+	-rm -f *.o *.a *.so snowboydetect.py *.pyc $(SNOWBOYDETECTSWIGCC)
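Since the Makefile and README both note that only some SWIG versions work (3.0.7 and 3.0.8 were tested), it can help to check the installed version before building. The sketch below parses the version out of text shaped like the output of `swig -version`. The `sample` string is an illustrative assumption, not captured output, and `parse_swig_version` is a hypothetical helper.

```python
import re

# Versions the README reports as tested.
TESTED_VERSIONS = {(3, 0, 7), (3, 0, 8)}


def parse_swig_version(version_output):
    """Return (major, minor, patch) from `swig -version` style text, or None."""
    match = re.search(r"SWIG Version (\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Assumed sample text for illustration only.
sample = "\nSWIG Version 3.0.8\n\nCompiled with g++ [x86_64-pc-linux-gnu]\n"
version = parse_swig_version(sample)
print(version, version in TESTED_VERSIONS)  # (3, 0, 8) True
```

In practice the text would come from running `swig -version`, for example through `subprocess`. A version outside the tested set is not necessarily broken; it is only untested according to the README.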
diff --git a/swig/python/demo.py b/swig/python/demo.py
@@ -0,0 +1,35 @@
+import snowboydecoder
+import sys
+import signal
+
+interrupted = False
+
+
+def signal_handler(signal, frame):
+    global interrupted
+    interrupted = True
+
+
+def interrupt_callback():
+    global interrupted
+    return interrupted
+
+if len(sys.argv) == 1:
+    print("Error: need to specify model name")
+    print("Usage: python demo.py your.model")
+    sys.exit(-1)
+
+model = sys.argv[1]
+
+# capture SIGINT signal, e.g., Ctrl+C
+signal.signal(signal.SIGINT, signal_handler)
+
+detector = snowboydecoder.HotwordDetector(model, sensitivity=0.5)
+print('Listening... Press Ctrl+C to exit')
+
+# main loop
+detector.start(detected_callback=snowboydecoder.play_audio_file,
+               interrupt_check=interrupt_callback,
+               sleep_time=0.03)
+
+detector.terminate()
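`demo.py` passes `interrupt_check` and `sleep_time` to `detector.start()`, which suggests a polling loop that checks the flag between reads. Since `snowboydecoder.py` is not shown here, the following is only a stand-in sketch of such a loop, with no audio involved. `poll_until_interrupted`, `fake_read` and the `state` dictionary are hypothetical names for illustration.

```python
import time


def poll_until_interrupted(read_result, on_detect, interrupt_check, sleep_time=0.03):
    """Hypothetical stand-in for a detector main loop.

    Calls read_result() repeatedly; a positive result counts as a detection.
    The loop exits as soon as interrupt_check() returns True.
    """
    detections = 0
    while not interrupt_check():
        if read_result() > 0:
            on_detect()
            detections += 1
        time.sleep(sleep_time)
    return detections


# Simulated detection results instead of a microphone: 0 = no event, 1 = hotword.
results = iter([0, 1, 0, 1])
state = {"interrupted": False}


def fake_read():
    try:
        return next(results)
    except StopIteration:
        # Out of data: raise the flag, as the SIGINT handler in demo.py would.
        state["interrupted"] = True
        return 0


hits = []
count = poll_until_interrupted(
    fake_read,
    lambda: hits.append("ding"),
    lambda: state["interrupted"],
    sleep_time=0.0,
)
print(count, hits)  # 2 ['ding', 'ding']
```

Setting a flag from the signal handler and polling it in the loop keeps the handler trivial and lets the loop finish its current iteration before cleanup such as `detector.terminate()` runs.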
diff --git a/swig/python/demo2.py b/swig/python/demo2.py
@@ -0,0 +1,41 @@
+import snowboydecoder
+import sys
+import signal
+
+# Demo code for listening for two hotwords at the same time
+
+interrupted = False
+
+
+def signal_handler(signal, frame):
+    global interrupted
+    interrupted = True
+
+
+def interrupt_callback():
+    global interrupted
+    return interrupted
+
+if len(sys.argv) != 3:
+    print("Error: need to specify 2 model names")
+    print("Usage: python demo2.py 1st.model 2nd.model")
+    sys.exit(-1)
+
+models = sys.argv[1:]
+
+# capture SIGINT signal, e.g., Ctrl+C
+signal.signal(signal.SIGINT, signal_handler)
+
+sensitivity = [0.5]*len(models)
+detector = snowboydecoder.HotwordDetector(models, sensitivity=sensitivity)
+callbacks = [lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DING),
+             lambda: snowboydecoder.play_audio_file(snowboydecoder.DETECT_DONG)]
+print('Listening... Press Ctrl+C to exit')
+
+# main loop
+# make sure you have the same numbers of callbacks and models
+detector.start(detected_callback=callbacks,
+               interrupt_check=interrupt_callback,
+               sleep_time=0.03)
+
+detector.terminate()
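`demo2.py` requires the same number of callbacks as models. Combined with the return codes documented in `snowboy-detect.h` (-1 error, 0 no event, n for hotword n), one plausible way to route a detection is to call `callbacks[n - 1]`. The sketch below assumes that mapping; `dispatch` is a hypothetical helper, not the code in `snowboydecoder.py`.

```python
def dispatch(result, callbacks):
    """Run the callback for a 1-based hotword index (hypothetical helper).

    Returns True if a callback ran, False for "no event" (0) or error (-1).
    """
    if result <= 0:
        return False
    if result > len(callbacks):
        raise IndexError(
            "hotword %d triggered but only %d callbacks given" % (result, len(callbacks))
        )
    callbacks[result - 1]()
    return True


played = []
callbacks = [lambda: played.append("ding"), lambda: played.append("dong")]

for result in [0, 2, -1, 1]:
    dispatch(result, callbacks)

print(played)  # ['dong', 'ding']
```

Raising on a missing callback surfaces a model/callback count mismatch immediately instead of silently ignoring a triggered hotword.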
diff --git a/swig/python/requirements.txt b/swig/python/requirements.txt
@@ -0,0 +1 @@
+PyAudio==0.2.9
diff --git a/swig/python/resources b/swig/python/resources
@@ -0,0 +1 @@
+../../resources/
diff --git a/swig/python/snowboy-detect-swig.i b/swig/python/snowboy-detect-swig.i
@@ -0,0 +1,15 @@
+// swig/python/snowboy-detect-swig.i
+
+// Copyright 2016  KITT.AI (author: Guoguo Chen)
+
+%module snowboydetect
+
+// Suppress SWIG warnings.
+#pragma SWIG nowarn=SWIGWARN_PARSE_NESTED_CLASS
+%include "std_string.i"
+
+%{
+#include "include/snowboy-detect.h"
+%}
+
+%include "include/snowboy-detect.h"