dcase_util.features.OpenL3Extractor
- class dcase_util.features.OpenL3Extractor(fs=48000, hop_length_samples=None, hop_length_seconds=0.02, model=None, input_repr='mel256', content_type='music', embedding_size=6144, center=True, batch_size=32, verbose=False, **kwargs)
OpenL3 Embedding extractor class
Constructor
- Parameters
- fs : int
Sampling rate of the incoming signal. If the rate is not 48 kHz, the audio will be resampled. Default value 48000
- hop_length_samples : int
Hop length in samples. Default value None
- hop_length_seconds : float
Hop length in seconds. Default value 0.02
- model : keras.models.Model or None
Loaded model object. If a model is provided, input_repr, content_type, and embedding_size are ignored. If None, the model is loaded using the provided values of input_repr, content_type, and embedding_size. Default value None
- input_repr : “linear”, “mel128”, or “mel256”
Spectrogram representation used for the model. Ignored if model is a valid Keras model. Default value “mel256”
- content_type : “music” or “env”
Type of content used to train the embedding model. Ignored if model is a valid Keras model. Default value “music”
- embedding_size : 6144 or 512
Embedding dimensionality. Ignored if model is a valid Keras model. Default value 6144
- center : bool
If True, pads the beginning of the signal so that timestamps correspond to the center of each window. Default value True
- batch_size : int
Batch size used for input to the embedding model. Default value 32
- verbose : bool
If True, prints verbose messages. Default value False
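The two hop-length parameters describe the same quantity in different units. The following is a minimal sketch of that relationship, not the extractor's own code: the helper names are hypothetical, and it is an assumption that the hop is converted by rounding to whole samples and that embedding frames are spaced one hop apart starting at 0 seconds.

```python
def hop_seconds_to_samples(hop_length_seconds, fs=48000):
    """Convert a hop length in seconds to whole samples at sampling rate fs."""
    # Assumption: rounding to the nearest sample; the library may truncate instead.
    return int(round(hop_length_seconds * fs))


def hop_samples_to_seconds(hop_length_samples, fs=48000):
    """Convert a hop length in samples to seconds at sampling rate fs."""
    return hop_length_samples / float(fs)


def frame_timestamps(n_frames, hop_length_seconds=0.02):
    """Timestamps in seconds for n_frames consecutive frames, one hop apart."""
    # Assumption: the first frame is at 0 s and frames advance by one hop.
    return [i * hop_length_seconds for i in range(n_frames)]


print(hop_seconds_to_samples(0.02))  # 960 samples at 48 kHz
print(hop_samples_to_seconds(960))   # 0.02
```

With the default hop_length_seconds of 0.02 and fs of 48000, one hop corresponds to 960 samples, so a one-second signal spans about 50 hops.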
- __init__(fs=48000, hop_length_samples=None, hop_length_seconds=0.02, model=None, input_repr='mel256', content_type='music', embedding_size=6144, center=True, batch_size=32, verbose=False, **kwargs)
Constructor. Accepts the same parameters as the class signature above.
Methods
- __init__([fs, hop_length_samples, ...]): Constructor
- extract(y): Extract features for the audio signal.
- log([level]): Log container content
- show([mode, indent, visualize]): Print container content
- to_html([indent]): Get container information in an HTML formatted string
- to_string([ui, indent]): Get container information in a string
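As an illustration of how the constructor and extract(y) fit together, here is a hedged usage sketch. The helpers make_test_tone and extract_openl3 are hypothetical names introduced for this example. It assumes that dcase_util, its OpenL3 dependency, and numpy are installed, and that extract accepts a mono signal as a numpy array. The layout of the returned embedding is not documented here, so the sketch does not assume one.

```python
import math


def make_test_tone(fs=48000, duration_seconds=1.0, frequency_hz=440.0):
    """Return a mono sine tone as a list of floats (stand-in for real audio)."""
    n_samples = int(round(fs * duration_seconds))
    return [math.sin(2.0 * math.pi * frequency_hz * i / fs) for i in range(n_samples)]


def extract_openl3(y, fs=48000, model=None):
    """Run OpenL3Extractor on a mono signal and return whatever extract() yields.

    Imports are deferred so the module loads even without the optional
    dependencies; calling this function requires dcase_util and numpy.
    """
    import numpy as np
    import dcase_util

    extractor = dcase_util.features.OpenL3Extractor(
        fs=fs,
        hop_length_seconds=0.02,
        model=model,            # a preloaded Keras model, or None to load one
        input_repr='mel256',    # ignored when a model is passed
        content_type='music',   # ignored when a model is passed
        embedding_size=6144,    # ignored when a model is passed
    )
    return extractor.extract(np.asarray(y, dtype=np.float32))
```

Calling extract_openl3(make_test_tone()) would build the extractor with the documented defaults and run it on one second of a 440 Hz tone. Passing an already loaded model through the model argument avoids reloading it for every extractor instance.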
Attributes
- description: Extractor description
- label: Extractor label
- logger: Logger instance