dcase_util.features.EdgeL3Extractor

class dcase_util.features.EdgeL3Extractor(fs=48000, hop_length_samples=None, hop_length_seconds=0.02, model=None, retrain_type='ft', sparsity=95.45, center=True, verbose=False, **kwargs)

EdgeL3 Embedding extractor class

Constructor

Parameters

fs (int): Sampling rate of the incoming signal. If it is not 48 kHz, the audio will be resampled. Default value 48000
hop_length_samples (int): Hop length in samples. Default value None
hop_length_seconds (float): Hop length in seconds. Default value 0.02
model (keras.models.Model or None): Loaded model object. If a model is provided, sparsity is ignored. If None, the model is loaded using the given sparsity value. Default value None
retrain_type ({'ft', 'kd'}): Type of retraining applied to the sparsified weights of the L3 audio model. 'ft' selects the fine-tuned model and 'kd' selects the knowledge-distilled model. Default value 'ft'
sparsity ({95.45, 53.5, 63.5, 72.3, 73.5, 81.0, 87.0, 90.5}): Desired sparsity of the audio model. Default value 95.45
center (bool): If True, pads the beginning of the signal so that timestamps correspond to the center of each window. Default value True
verbose (bool): If True, prints verbose messages. Default value False
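
The constraints in the parameter list can be checked before constructing the extractor. The sketch below is a hypothetical helper, not part of dcase_util: `resolve_hop_length` and `check_edgel3_options` are illustrative names, and the rule that an explicit `hop_length_samples` takes precedence over `hop_length_seconds` is an assumption about how the two hop parameters interact.

```python
# Allowed values copied from the parameter list above.
ALLOWED_SPARSITY = (95.45, 53.5, 63.5, 72.3, 73.5, 81.0, 87.0, 90.5)
ALLOWED_RETRAIN_TYPES = ("ft", "kd")


def resolve_hop_length(fs=48000, hop_length_samples=None, hop_length_seconds=0.02):
    """Return the hop length in samples (hypothetical helper).

    Assumption: an explicit hop_length_samples wins; otherwise the hop
    length is derived from hop_length_seconds and the sampling rate.
    """
    if hop_length_samples is not None:
        return int(hop_length_samples)
    return int(round(fs * hop_length_seconds))


def check_edgel3_options(retrain_type="ft", sparsity=95.45):
    """Raise ValueError if an option is outside the documented set."""
    if retrain_type not in ALLOWED_RETRAIN_TYPES:
        raise ValueError("retrain_type must be one of %r" % (ALLOWED_RETRAIN_TYPES,))
    if sparsity not in ALLOWED_SPARSITY:
        raise ValueError("sparsity must be one of %r" % (ALLOWED_SPARSITY,))
    return retrain_type, sparsity
```

With the default arguments, `resolve_hop_length()` gives 960 samples, which is 0.02 seconds at 48 kHz.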

Methods

`__init__`([fs, hop_length_samples, ...])	Constructor
`extract`(y)	Extract features for the audio signal.
`log`([level])	Log container content
`show`([mode, indent, visualize])	Print container content
`to_html`([indent])	Get container information in an HTML-formatted string
`to_string`([ui, indent])	Get container information in a string
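
A minimal usage sketch for the constructor and `extract`, using only the arguments documented above. The wrapper name `extract_edgel3_embedding` is hypothetical, and the import is done lazily on the assumption that dcase_util and its EdgeL3 model dependencies may not be installed; the shape and type of the returned features are not specified here.

```python
def extract_edgel3_embedding(y, fs=48000, retrain_type="ft", sparsity=95.45):
    """Extract EdgeL3 features for an audio signal y (hypothetical wrapper).

    y is expected to be the audio signal sampled at fs; per the parameter
    description, input that is not at 48 kHz is resampled by the extractor.
    """
    # Lazy import: dcase_util (and the model backend it needs) is only
    # required when the function is actually called.
    from dcase_util.features import EdgeL3Extractor

    extractor = EdgeL3Extractor(
        fs=fs,
        hop_length_seconds=0.02,    # documented default
        retrain_type=retrain_type,  # 'ft' or 'kd'
        sparsity=sparsity,          # one of the documented sparsity values
    )
    return extractor.extract(y)
```

Passing a preloaded `model` to the constructor instead would, per the parameter description, make the `sparsity` argument irrelevant.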

Attributes

`description`	Extractor description
`label`	Extractor label
`logger`	Logger instance