dcase_util.features.EdgeL3Extractor

class dcase_util.features.EdgeL3Extractor(fs=48000, hop_length_samples=None, hop_length_seconds=0.02, model=None, retrain_type='ft', sparsity=95.45, center=True, verbose=False, **kwargs)

EdgeL3 Embedding extractor class

Constructor

Parameters
fs : int

Sampling rate of the incoming signal. If it is not 48 kHz, the audio will be resampled. Default value 48000

hop_length_samples : int

Hop length in samples. Default value None

hop_length_seconds : float

Hop length in seconds. Default value 0.02

model : keras.models.Model or None

Loaded model object. If a model is provided, then sparsity will be ignored. If None is provided, the model will be loaded using the provided sparsity value. Default value None

retrain_type : {'ft', 'kd'}

Type of retraining applied to the sparsified weights of the L3 audio model. 'ft' selects the fine-tuned model and 'kd' selects the knowledge-distilled model. Default value 'ft'

sparsity : {95.45, 53.5, 63.5, 72.3, 73.5, 81.0, 87.0, 90.5}

The desired sparsity of the audio model. Default value 95.45

center : bool

If True, pads the beginning of the signal so that timestamps correspond to the center of each window. Default value True

verbose : bool

If True, prints verbose messages. Default value False
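
The parameters above can be exercised with a short sketch such as the following. The helpers `edgel3_kwargs` and `hop_in_samples` are hypothetical names introduced only for illustration; they are not part of dcase_util. The seconds-to-samples conversion is an assumption about how the two hop-length parameters relate, not something this page confirms. The `demo` function likely requires the `dcase_util`, `edgel3`, and `numpy` packages to be installed, and the type and shape of the value returned by `extract` are not specified here.

```python
# Allowed values copied from the parameter list above.
DOCUMENTED_SPARSITIES = (95.45, 53.5, 63.5, 72.3, 73.5, 81.0, 87.0, 90.5)
DOCUMENTED_RETRAIN_TYPES = ("ft", "kd")


def edgel3_kwargs(fs=48000, hop_length_seconds=0.02,
                  retrain_type="ft", sparsity=95.45):
    """Hypothetical helper: check arguments against the documented value sets."""
    if retrain_type not in DOCUMENTED_RETRAIN_TYPES:
        raise ValueError("retrain_type must be one of %r" % (DOCUMENTED_RETRAIN_TYPES,))
    if sparsity not in DOCUMENTED_SPARSITIES:
        raise ValueError("sparsity must be one of %r" % (DOCUMENTED_SPARSITIES,))
    return {
        "fs": fs,
        "hop_length_seconds": hop_length_seconds,
        "retrain_type": retrain_type,
        "sparsity": sparsity,
    }


def hop_in_samples(fs, hop_length_seconds):
    """Assumed conversion (not confirmed by this page): seconds to whole samples."""
    return int(round(fs * hop_length_seconds))


def demo():
    """Build the extractor and run it on one second of a synthetic 440 Hz tone."""
    # Imported here so the helpers above work without these packages installed.
    import numpy as np
    import dcase_util

    kwargs = edgel3_kwargs()  # documented defaults
    extractor = dcase_util.features.EdgeL3Extractor(**kwargs)

    t = np.arange(kwargs["fs"]) / kwargs["fs"]
    y = np.sin(2.0 * np.pi * 440.0 * t)

    return extractor.extract(y)
```

Calling `demo()` constructs the extractor with the documented defaults. Passing a preloaded Keras model through `model` instead would, per the description above, cause `sparsity` to be ignored.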

__init__(fs=48000, hop_length_samples=None, hop_length_seconds=0.02, model=None, retrain_type='ft', sparsity=95.45, center=True, verbose=False, **kwargs)

Constructor

Methods

__init__([fs, hop_length_samples, ...])

Constructor

extract(y)

Extract features for the audio signal.

log([level])

Log container content

show([mode, indent, visualize])

Print container content

to_html([indent])

Get container information in an HTML-formatted string

to_string([ui, indent])

Get container information in a string

Attributes

description

Extractor description

label

Extractor label

logger

Logger instance