Data

Classes for data handling

Buffers

DataBuffer

dcase_util.data.DataBuffer

Data buffering class, which can be used to store data and metadata associated with an item. Item data is accessed through an item key. When the internal buffer is full, the oldest item is replaced.
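
Usage example, a minimal sketch (the keys and stored values are arbitrary, and get() is assumed to return the stored data and meta for the key):

import dcase_util

data_buffer = dcase_util.data.DataBuffer(size=2)

# Store data and associated metadata under item keys.
data_buffer.set(key='file1.wav', data=[1, 2, 3], meta={'filename': 'file1.wav'})
data_buffer.set(key='file2.wav', data=[4, 5, 6], meta={'filename': 'file2.wav'})

# Access item data through its key; get() is assumed to return (data, meta).
item_data, item_meta = data_buffer.get(key='file1.wav')

# The buffer is now full; inserting a new item replaces the oldest one.
data_buffer.set(key='file3.wav', data=[7, 8, 9])
print(data_buffer.key_exists('file1.wav'))  # False, 'file1.wav' was replaced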

DataBuffer([size])

Data buffer (First in, first out)

DataBuffer.set(key[, data, meta])

Insert item to the buffer

DataBuffer.get(key)

Get item based on key

DataBuffer.clear()

Empty the buffer

DataBuffer.count

Buffer usage

DataBuffer.full

Buffer full

DataBuffer.key_exists(key)

Check that key exists in the buffer

Encoders

BinaryMatrixEncoder

dcase_util.data.BinaryMatrixEncoder

BinaryMatrixEncoder([label_list, ...])

Binary matrix encoder base class

BinaryMatrixEncoder.pad(length[, binary_matrix])

Pad binary matrix along time axis

BinaryMatrixEncoder.plot([plot, ...])

Visualize binary matrix, and optionally synced data matrix.

OneHotEncoder

dcase_util.data.OneHotEncoder

OneHotEncoder([label_list, time_resolution, ...])

One hot encoder class

OneHotEncoder.encode(label[, length_frames, ...])

Generate one hot binary matrix
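
For illustration, a minimal sketch of encoding a single label; the label list and frame count are arbitrary, and it is assumed that encode() returns the encoder itself as a binary-matrix container:

import dcase_util

onehot_encoder = dcase_util.data.OneHotEncoder(
    label_list=['cat', 'dog'],
    time_resolution=0.02
)

# Encode a single label into a (label x frame) binary matrix.
# encode() is assumed to return the encoder as a binary-matrix container.
binary_matrix = onehot_encoder.encode(label='cat', length_frames=100)

# Visualize the encoded matrix.
binary_matrix.plot()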

ManyHotEncoder

dcase_util.data.ManyHotEncoder

ManyHotEncoder([label_list, ...])

Many hot encoder class

ManyHotEncoder.encode(label_list[, ...])

Generate many hot binary matrix
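
The many-hot case is analogous, but several labels can be active at once; a sketch under the same assumptions as the one-hot example above (the time_resolution and length_frames parameters are assumptions by analogy with OneHotEncoder):

import dcase_util

manyhot_encoder = dcase_util.data.ManyHotEncoder(
    label_list=['cat', 'dog', 'bird'],
    time_resolution=0.02  # assumption, by analogy with OneHotEncoder
)

# Both 'cat' and 'dog' are marked active for all encoded frames.
# length_frames is an assumption, by analogy with OneHotEncoder.encode.
binary_matrix = manyhot_encoder.encode(label_list=['cat', 'dog'], length_frames=100)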

EventRollEncoder

dcase_util.data.EventRollEncoder

EventRollEncoder([label_list, ...])

Event list encoder class

EventRollEncoder.encode(metadata_container)

Generate event roll from MetaDataContainer
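
A sketch of encoding event metadata into an event roll; the metadata fields follow dcase_util.containers.MetaDataContainer conventions, while the unique_event_labels property and the time_resolution and length_seconds parameters are assumptions:

import dcase_util

meta = dcase_util.containers.MetaDataContainer([
    {'filename': 'audio.wav', 'event_label': 'cat', 'onset': 1.0, 'offset': 3.0},
    {'filename': 'audio.wav', 'event_label': 'dog', 'onset': 2.0, 'offset': 5.0},
])

event_roll_encoder = dcase_util.data.EventRollEncoder(
    label_list=meta.unique_event_labels,  # assumed property listing distinct event labels
    time_resolution=0.02                  # assumption
)

# Encode the event list into a (label x frame) activity matrix (event roll).
# length_seconds is an assumption (not in the signature above).
event_roll = event_roll_encoder.encode(metadata_container=meta, length_seconds=10.0)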

LabelMatrixEncoder

dcase_util.data.LabelMatrixEncoder

LabelMatrixEncoder([label_list, time_resolution])

Label matrix encoder base class

OneHotLabelEncoder

dcase_util.data.OneHotLabelEncoder

OneHotLabelEncoder([label_list, ...])

One Hot label encoder class

OneHotLabelEncoder.encode(label[, ...])

Generate one hot label matrix

Data manipulators

Normalizer

dcase_util.data.Normalizer

Normalizer([n, s1, s2, mean, std])

Data normalizer to accumulate data statistics

Normalizer.log([level])

Log container content

Normalizer.show([mode, indent, visualize])

Print container content

Normalizer.load([filename])

Load file

Normalizer.save([filename])

Save file

Normalizer.mean

Mean vector

Normalizer.std

Standard deviation vector

Normalizer.reset()

Reset internal variables.

Normalizer.accumulate(data[, time_axis])

Accumulate statistics

Normalizer.finalize()

Finalize statistics calculation

Normalizer.normalize(data, **kwargs)

Normalize data matrix with internal statistics of the class.
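
The methods above form an accumulate → finalize → normalize workflow. A minimal sketch, reusing the example feature repository and Stacker pattern from the Aggregator example further down this page:

import dcase_util

# Get a data matrix to work with.
data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = dcase_util.data.Stacker(recipe='mfcc').stack(data_repository)

normalizer = dcase_util.data.Normalizer()

# Accumulate statistics over the data, then finalize mean and std.
normalizer.accumulate(data_matrix)
normalizer.finalize()

# Normalize the data matrix with the accumulated statistics.
normalized_matrix = normalizer.normalize(data_matrix)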

RepositoryNormalizer

dcase_util.data.RepositoryNormalizer

RepositoryNormalizer([normalizers, filename])

Data repository normalizer

RepositoryNormalizer.load(filename[, ...])

Load normalizers from disk.

RepositoryNormalizer.normalize(data, **kwargs)

Normalize data repository

Aggregator

dcase_util.data.Aggregator

The data aggregator can be used to process a data matrix in processing windows. This processing stage can be used to collapse the data within a window, for example by calculating its mean and standard deviation, or to flatten the windowed data into a single vector.

Supported processing methods:

  • flatten

  • mean

  • std

  • cov

  • kurtosis

  • skew

The processing methods can be combined.

Usage examples:

import dcase_util

data_aggregator = dcase_util.data.Aggregator(
    recipe=['mean', 'std'],
    win_length_frames=10,
    hop_length_frames=1,
)

data_stacker = dcase_util.data.Stacker(recipe='mfcc')
data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = data_stacker.stack(data_repository)
data_matrix = data_aggregator.aggregate(data_matrix)

Aggregator([win_length_frames, ...])

Data aggregator

Aggregator.log([level])

Log container content

Aggregator.show([mode, indent, visualize])

Print container content

Aggregator.load([filename])

Load file

Aggregator.save([filename])

Save file

Aggregator.aggregate([data])

Aggregate data

Sequencer

dcase_util.data.Sequencer

The Sequencer class processes data matrices into sequences (images). Sequences can overlap, and the sequencing grid can be shifted between calls.
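
A minimal sketch, reusing the example feature repository and Stacker pattern from the Aggregator example above; the parameter values are arbitrary:

import dcase_util

data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = dcase_util.data.Stacker(recipe='mfcc').stack(data_repository)

sequencer = dcase_util.data.Sequencer(
    sequence_length=10,
    hop_length=10
)

# Split the 2D data matrix into sequences of 10 frames.
sequenced_data = sequencer.sequence(data_matrix)

# Shift the sequencing grid before the next call.
sequencer.increase_shifting()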

Sequencer([sequence_length, hop_length, ...])

Data sequencer

Sequencer.log([level])

Log container content

Sequencer.show([mode, indent, visualize])

Print container content

Sequencer.load([filename])

Load file

Sequencer.save([filename])

Save file

Sequencer.sequence(data[, shift])

Convert 2D data matrix into sequence of specified length 2D matrices

Sequencer.increase_shifting([shift_step])

Increase temporal shifting

Stacker

dcase_util.data.Stacker

Data stacking class. The class takes a vector recipe and a DataRepository, and creates the corresponding data matrix.

Vector recipe

With a recipe one can select the full matrix, a part of it using start and end indices, or individual rows from it.

Example recipe:

[
    {
        'method': 'mfcc',
    },
    {
        'method': 'mfcc_delta',
        'vector-index': {
            'channel': 0,
            'start': 1,
            'end': 17,
            'full': False,
            'selection': False,
        }
    },
    {
        'method': 'mfcc_acceleration',
        'vector-index': {
            'channel': 0,
            'full': False,
            'selection': True,
            'vector': [2, 4, 6]
        }
    }
]

See dcase_util.utils.VectorRecipeParser for how a recipe string can be conveniently used to generate the above data structure.
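
For illustration, a sketch of constructing a Stacker directly from a recipe string; the 'method=start-end' index syntax and the presence of the delta stream in the example repository are assumptions:

import dcase_util

data_repository = dcase_util.utils.Example.feature_repository()

# Stack the full MFCC matrix together with selected rows of its delta features.
# The 'mfcc_delta=1-16' index syntax is an assumption about the recipe-string format.
stacker = dcase_util.data.Stacker(recipe='mfcc;mfcc_delta=1-16')
data_matrix = stacker.stack(data_repository)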

Stacker([recipe, hop])

Data stacker

Stacker.log([level])

Log container content

Stacker.show([mode, indent, visualize])

Print container content

Stacker.load([filename])

Load file

Stacker.save([filename])

Save file

Stacker.stack(repository, **kwargs)

Vector creation based on recipe

Selector

dcase_util.data.Selector

Data selecting class.

Selector(**kwargs)

Data selector

Selector.log([level])

Log container content

Selector.show([mode, indent, visualize])

Print container content

Selector.load([filename])

Load file

Selector.save([filename])

Save file

Selector.select(data[, selection_events])

Select data from a data repository based on given events

Masker

dcase_util.data.Masker

Data masking class.

Masker(**kwargs)

Data masker

Masker.log([level])

Log container content

Masker.show([mode, indent, visualize])

Print container content

Masker.load([filename])

Load file

Masker.save([filename])

Save file

Masker.mask(data[, mask_events])

Mask data in a data repository based on given events

Probabilities

ProbabilityEncoder

dcase_util.data.ProbabilityEncoder

ProbabilityEncoder([label_list])

Constructor

ProbabilityEncoder.log([level])

Log container content

ProbabilityEncoder.show([mode, indent, ...])

Print container content

ProbabilityEncoder.load([filename])

Load file

ProbabilityEncoder.save([filename])

Save file

ProbabilityEncoder.collapse_probabilities(...)

Collapse probabilities along time_axis

ProbabilityEncoder.collapse_probabilities_windowed(...)

Collapse probabilities with a sliding window.

ProbabilityEncoder.binarization(probabilities)

Binarization
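
A sketch of turning frame-wise class probabilities into decisions; the probability matrix is synthetic, and the operator, binarization_type, and threshold parameter names are assumptions beyond the signatures shown above:

import numpy
import dcase_util

# Synthetic frame-wise probabilities, shape (classes, frames).
frame_probabilities = numpy.random.rand(2, 100)

probability_encoder = dcase_util.data.ProbabilityEncoder()

# Collapse frame-wise probabilities into one value per class along the time axis.
# The operator parameter name and value are assumptions.
class_probabilities = probability_encoder.collapse_probabilities(
    probabilities=frame_probabilities,
    operator='sum'
)

# Binarize frame-wise probabilities with a fixed threshold.
# The binarization_type and threshold parameter names are assumptions.
frame_decisions = probability_encoder.binarization(
    probabilities=frame_probabilities,
    binarization_type='global_threshold',
    threshold=0.5
)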

Decisions

DecisionEncoder

dcase_util.data.DecisionEncoder

DecisionEncoder([label_list])

Constructor

DecisionEncoder.log([level])

Log container content

DecisionEncoder.show([mode, indent, visualize])

Print container content

DecisionEncoder.load([filename])

Load file

DecisionEncoder.save([filename])

Save file

DecisionEncoder.majority_vote(frame_decisions)

Majority vote.

DecisionEncoder.many_hot(frame_decisions[, ...])

Many hot

DecisionEncoder.find_contiguous_regions(...)

Find contiguous regions from bool valued numpy.array.

DecisionEncoder.process_activity(...[, ...])

Process activity array (binary)
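
A sketch of collapsing frame-wise binary decisions into item-level decisions; the decision matrix is synthetic, and the exact input format and return value of majority_vote are assumptions:

import numpy
import dcase_util

# Synthetic frame-wise binary decisions, shape (classes, frames).
frame_decisions = numpy.random.rand(2, 100) > 0.5

decision_encoder = dcase_util.data.DecisionEncoder(label_list=['cat', 'dog'])

# Collapse frame-wise decisions across time by majority vote.
# The expected shape of frame_decisions is an assumption.
item_decision = decision_encoder.majority_vote(frame_decisions=frame_decisions)

# Find [onset, offset) frame indices of contiguous active regions for one class.
regions = decision_encoder.find_contiguous_regions(frame_decisions[0, :])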