Acoustic features

Library provides basic acoustic feature extractors: dcase_util.features.MelExtractor, dcase_util.features.MfccStaticExtractor, dcase_util.features.MfccDeltaExtractor, dcase_util.features.MfccAccelerationExtractor, dcase_util.features.ZeroCrossingRateExtractor, dcase_util.features.RMSEnergyExtractor, and dcase_util.features.SpectralCentroidExtractor.

Extracting features

Extracting mel-band energies for the audio signal (using default parameters):

# Get audio in a container, mixdown of a stereo signal
audio_container = dcase_util.containers.AudioContainer().load(
  filename=dcase_util.utils.Example.audio_filename()
).mixdown()

# Create extractor
mel_extractor = dcase_util.features.MelExtractor()

# Extract features
mels = mel_extractor.extract(audio_container)

Extracting mel-band energies for specific audio segment:

# Get audio in a container, mixdown of a stereo signal
audio_container = dcase_util.containers.AudioContainer().load(
  filename=dcase_util.utils.Example.audio_filename()
).mixdown()

# Set focus
audio_container.set_focus(start_seconds=1.0, stop_seconds=4.0)

# Create extractor
mel_extractor = dcase_util.features.MelExtractor()

# Extract features
mels = mel_extractor.extract(audio_container.get_focused())

# Plot
dcase_util.containers.DataMatrix2DContainer(
    data=mels,
    time_resolution=mel_extractor.hop_length_seconds
).plot()

Extract features directly from numpy matrix:

# Create an audio signal
t = numpy.linspace(0, 2, 2 * 44100, endpoint=False)
x1 = numpy.sin(220 * 2 * numpy.pi * t)

# Create extractor
mel_extractor = dcase_util.features.MelExtractor()

# Extract features
mels = mel_extractor.extract(x1)

# Plot
dcase_util.containers.DataMatrix2DContainer(
    data=mels,
    time_resolution=mel_extractor.hop_length_seconds
).plot()

All audio extractors provided in the library work for single channel audio.

Visualization

Plotting the extracted features:

# Get audio in a container, mixdown of a stereo signal
audio_container = dcase_util.containers.AudioContainer().load(
  filename=dcase_util.utils.Example.audio_filename()
).mixdown()

# Create extractor
mel_extractor = dcase_util.features.MelExtractor()

# Extract features
mels = mel_extractor.extract(audio_container)

# Plotting
dcase_util.containers.DataMatrix2DContainer(
    data=mels,
    time_resolution=mel_extractor.hop_length_seconds
).plot()

(Source code, png, hires.png, pdf)

_images/tutorial_features-1.png