sig.events

Olivier Lartillot edited this page Feb 1, 2018 · 4 revisions

sig.events estimates a so-called “onset” detection curve, which shows the successive bursts of energy corresponding to successive events. Peak picking is then performed automatically on this curve to show the estimated positions of the events.

The onset detection curve can be computed in various ways:

  • sig.events(..., 'Envelope') computes an amplitude envelope using sig.envelope (default choice). As in sig.envelope, the envelope extraction method can be selected with either the 'Spectro' or the 'Filter' option:
    • Either the 'Spectro' option (default):
      • sig.events(..., 'SpectroFrame', fl, fh) specifies the frame length fl (in seconds) and the hop factor fh (as a value between 0 and 1). Default values: fl = 0.1 s, fh = 0.1.
      • sig.events(..., 'PowerSpectrum', 0) turns off the computation of the power of the spectrum.
      • sig.events(..., 'PreSilence') adds further frames at the beginning of the audio sequence by adding silence before the actual start of the sequence.
      • sig.events(..., 'PostSilence') adds further frames at the end of the audio sequence by adding silence after the actual end of the sequence.
    • or the 'Filter' option. Related options of sig.envelope can be specified: 'FilterType', 'Tau', 'CutOff', 'PreDecim', 'Hilbert', with the same default values as in sig.envelope.
      • sig.events(..., 'PreSilence') adds further silence at the beginning of the audio sequence.
      • sig.events(..., 'PostSilence') adds further silence at the end of the audio sequence.
    • Other available options, related to sig.envelope: 'HalfwaveCenter', 'Log', 'MinLog', 'Power', 'Diff', 'HalfwaveDiff', 'Center', 'Smooth', 'PostDecim', 'Sampling', 'UpSample', all with the same defaults as in sig.envelope. In addition, the 'Normal' option of sig.envelope can be controlled as well, with a default set to 1.
  • sig.events(..., 'SpectralFlux') computes a spectral flux. Options related to sig.flux can be passed here as well:
    • 'Inc' (toggled on by default here),
    • 'Halfwave' (toggled on by default here),
    • 'Complex' (toggled off by default as usual),
    • 'Median' (toggled on by default here, with the same default parameters as in sig.flux).
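
To make the 'SpectralFlux' method more concrete, here is a minimal conceptual sketch in Python. It is not MiningSuite code. The function name `spectral_flux` and the flag `increase_only` are hypothetical, and the sketch assumes that the flux is a Euclidean distance between successive magnitude spectra and that 'Inc' keeps only the bins whose energy increases.

```python
import math

def spectral_flux(frames, increase_only=True):
    """Distance between each pair of successive magnitude spectra.

    frames: list of equal-length lists of non-negative magnitudes.
    increase_only: if True, only bins whose magnitude grows contribute
    (an assumed reading of the 'Inc' idea, not the toolbox implementation).
    """
    flux = []
    for prev, cur in zip(frames, frames[1:]):
        diffs = [c - p for p, c in zip(prev, cur)]
        if increase_only:
            # Discard energy decreases so that only new bursts are counted.
            diffs = [max(d, 0.0) for d in diffs]
        flux.append(math.sqrt(sum(d * d for d in diffs)))
    return flux

# Three toy spectra: energy appears in the second frame, then partly vanishes.
spectra = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 4.0, 0.0]]
print(spectral_flux(spectra))         # [5.0, 0.0]
print(spectral_flux(spectra, False))  # [5.0, 3.0]
```

With `increase_only` enabled, the drop in energy between the second and third frames produces no flux, which is why such a curve tends to peak at onsets rather than at offsets.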

Whatever the chosen method, the detection curve is finally converted into an envelope (using sig.envelope), and further operations are performed in this order:

  • 'Center' (performed if 'Center' was specified while calling sig.events).
  • 'Normal' (performed by default).
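
The order of these two operations can be illustrated with a short Python sketch. It is not MiningSuite code. The function `postprocess` is hypothetical, and it assumes that 'Center' subtracts the mean of the curve and that 'Normal' rescales the curve so that its maximum equals 1.

```python
def postprocess(curve, center=False, normal=True):
    """Apply centering first, then normalization, mirroring the order above."""
    out = list(curve)
    if center:
        # Assumed meaning of 'Center': remove the mean of the curve.
        mean = sum(out) / len(out)
        out = [v - mean for v in out]
    if normal:
        # Assumed meaning of 'Normal': scale so that the maximum becomes 1.
        peak = max(out)
        if peak > 0:
            out = [v / peak for v in out]
    return out

print(postprocess([1.0, 2.0, 4.0]))  # [0.25, 0.5, 1.0]
centered = postprocess([1.0, 2.0, 6.0], center=True)
```

Because normalization comes last, the maximum of the resulting curve is 1 whether or not centering was requested.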

sig.events accepts any of the following as input:

  • envelope curves (resulting from sig.envelope),
  • fluxes (resulting from sig.flux),
  • waveforms, which can be:
    • segmented (using sig.segment),
    • decomposed into channels (using sig.filterbank),
    • decomposed into frames or not (using sig.frame):
      • if the audio waveform is decomposed into frames, the detection curve will be based on the spectral flux;
      • if the audio waveform is not decomposed into frames, the default detection curve will be based on the envelope;
  • a file name or the 'Folder' keyword,
  • any other object: it is decomposed into frames (if not already decomposed) using the parameters specified by the 'Frame' option; by default, the flux is then computed automatically.

Event detection

  • sig.events(..., 'Detect', d) specifies how events are picked from the detection curve:
    • d = 'Peaks' (default choice): local maxima are chosen as event positions;
    • d = 'Valleys': local minima are chosen as event positions;
    • d = 0, or 'no', or 'off': no peak picking is performed.
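
The three 'Detect' modes can be sketched as follows in Python. This is a conceptual illustration, not MiningSuite code. The function `local_extrema` is hypothetical and only looks at interior points of the curve.

```python
def local_extrema(curve, detect="Peaks"):
    """Indices of interior local maxima ('Peaks') or minima ('Valleys')."""
    if detect in (0, "no", "off"):
        return []  # Peak picking disabled.
    if detect not in ("Peaks", "Valleys"):
        raise ValueError("detect must be 'Peaks', 'Valleys', 0, 'no' or 'off'")
    # A valley of the curve is a peak of its negation.
    sign = 1.0 if detect == "Peaks" else -1.0
    found = []
    for i in range(1, len(curve) - 1):
        here = sign * curve[i]
        if here > sign * curve[i - 1] and here >= sign * curve[i + 1]:
            found.append(i)
    return found

curve = [0.0, 1.0, 0.2, 0.6, 0.1, 0.9, 0.0]
print(local_extrema(curve, "Peaks"))    # [1, 3, 5]
print(local_extrema(curve, "Valleys"))  # [2, 4]
print(local_extrema(curve, "off"))      # []
```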

Options associated with the sig.peaks function can be specified as well. In particular:

  • sig.events(..., 'Contrast', c), with default value here c = 0.01,
  • sig.events(..., 'Threshold', t), with default value here t = 0,
  • sig.events(..., 'Single') selects only the highest peak.
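
The following Python sketch illustrates how these three options might filter the local maxima of a normalized detection curve. It is a simplified reading, not the toolbox algorithm. The function `select_peaks` is hypothetical, and it assumes that a peak passes the contrast test when it rises at least c above the lowest value seen since the previously kept peak.

```python
def select_peaks(curve, contrast=0.01, threshold=0.0, single=False):
    """Keep local maxima that satisfy the contrast and threshold tests."""
    kept = []
    floor = curve[0]  # Lowest value seen since the last kept peak.
    for i in range(1, len(curve) - 1):
        floor = min(floor, curve[i])
        is_max = curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]
        if is_max and curve[i] >= threshold and curve[i] - floor >= contrast:
            kept.append(i)
            floor = curve[i]  # Restart the search for the next valley.
    if single and kept:
        # Keep only the highest of the selected peaks.
        kept = [max(kept, key=lambda i: curve[i])]
    return kept

curve = [0.0, 1.0, 0.95, 0.97, 0.1, 0.6, 0.0]
print(select_peaks(curve))                 # [1, 3, 5]
print(select_peaks(curve, contrast=0.1))   # [1, 5]
print(select_peaks(curve, threshold=0.8))  # [1, 3]
print(select_peaks(curve, single=True))    # [1]
```

Raising the contrast removes the small bump at index 3, while raising the threshold removes the lower peak at index 5.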

Segmentation

The temporal localization of events can be used for segmentation of the initial waveform:

o = sig.events('ragtime.wav'); sig.segment('ragtime.wav', o)
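
Conceptually, segmenting at event positions amounts to cutting the sample sequence at the detected times. The Python sketch below shows the idea only. The function `segment_at_events` is hypothetical and is not how sig.segment is implemented.

```python
def segment_at_events(samples, sampling_rate, event_times):
    """Split a list of samples at the given event times (in seconds)."""
    # Convert times to sample indices, dropping duplicates.
    cuts = sorted({int(round(t * sampling_rate)) for t in event_times})
    # Ignore cut points that fall outside the signal or at its edges.
    cuts = [c for c in cuts if 0 < c < len(samples)]
    bounds = [0] + cuts + [len(samples)]
    return [samples[a:b] for a, b in zip(bounds, bounds[1:])]

samples = list(range(10))  # Toy signal: 1 second sampled at 10 Hz.
segments = segment_at_events(samples, sampling_rate=10, event_times=[0.3, 0.7])
print(segments)  # [[0, 1, 2], [3, 4, 5, 6], [7, 8, 9]]
```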

Frame decomposition

The detection curve can be further decomposed into frames if the 'Frame' option has been specified, with a default frame length of 3 seconds and a default hop factor of 10% (that is, a hop of 0.3 seconds).
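
The relation between frame length, hop factor and hop duration can be checked with a small Python sketch. The function `frame_starts` is hypothetical, and it assumes that only complete frames are kept, with no padding at the end of the curve.

```python
def frame_starts(duration, frame_length=3.0, hop_factor=0.1):
    """Return the hop (in seconds) and the start time of each full frame."""
    hop = frame_length * hop_factor  # 3 s * 10% = 0.3 s with the defaults.
    starts = []
    n = 0
    # A small tolerance absorbs floating-point rounding at the boundary.
    while n * hop + frame_length <= duration + 1e-9:
        starts.append(round(n * hop, 6))
        n += 1
    return round(hop, 6), starts

hop, starts = frame_starts(4.0)
print(hop)     # 0.3
print(starts)  # [0.0, 0.3, 0.6, 0.9]
```

A 4-second curve therefore yields four overlapping 3-second frames, each starting 0.3 seconds after the previous one.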

Auditory model

Auditory modelling of event detection is available in aud.events.
