Data

Classes for data handling

Buffers

DataBuffer

dcase_util.data.DataBuffer

Data buffering class, which can be used to store data and metadata associated with an item. Item data is accessed through an item key. When the internal buffer is full, the oldest item is replaced.
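
Usage example, a minimal sketch (the keys and stored values are arbitrary, and get() is assumed to return the stored data and meta for the key):

import dcase_util

data_buffer = dcase_util.data.DataBuffer(size=2)

# Store data and associated metadata under item keys.
data_buffer.set(key='file1.wav', data=[1, 2, 3], meta={'filename': 'file1.wav'})
data_buffer.set(key='file2.wav', data=[4, 5, 6], meta={'filename': 'file2.wav'})

# Access item data through its key; get() is assumed to return (data, meta).
item_data, item_meta = data_buffer.get(key='file1.wav')

# The buffer is now full; inserting a new item replaces the oldest one.
data_buffer.set(key='file3.wav', data=[7, 8, 9])
print(data_buffer.key_exists('file1.wav'))  # False, 'file1.wav' was replaced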

DataBuffer([size])

Data buffer (First in, first out)

DataBuffer.set(key[, data, meta])

Insert item to the buffer

DataBuffer.get(key)

Get item based on key

DataBuffer.clear()

Empty the buffer

DataBuffer.count

Buffer usage

DataBuffer.full

Buffer full

DataBuffer.key_exists(key)

Check that key exists in the buffer

Encoders

BinaryMatrixEncoder

dcase_util.data.BinaryMatrixEncoder

BinaryMatrixEncoder([label_list, ...])

Binary matrix encoder base class

BinaryMatrixEncoder.pad(length[, binary_matrix])

Pad binary matrix along time axis

BinaryMatrixEncoder.plot([plot, ...])

Visualize binary matrix, and optionally synced data matrix.

OneHotEncoder

dcase_util.data.OneHotEncoder

OneHotEncoder([label_list, time_resolution, ...])

One hot encoder class

OneHotEncoder.encode(label[, length_frames, ...])

Generate one hot binary matrix
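
For illustration, a minimal sketch of encoding a single label; the label list and frame count are arbitrary, and it is assumed that encode() returns the encoder itself as a binary-matrix container:

import dcase_util

onehot_encoder = dcase_util.data.OneHotEncoder(
    label_list=['cat', 'dog'],
    time_resolution=0.02
)

# Encode a single label into a (label x frame) binary matrix.
# encode() is assumed to return the encoder as a binary-matrix container.
binary_matrix = onehot_encoder.encode(label='cat', length_frames=100)

# Visualize the encoded matrix.
binary_matrix.plot()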

ManyHotEncoder

dcase_util.data.ManyHotEncoder

ManyHotEncoder([label_list, ...])

Many hot encoder class

ManyHotEncoder.encode(label_list[, ...])

Generate many hot binary matrix
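
The many-hot case is analogous, but several labels can be active at once; a sketch under the same assumptions as the one-hot example above (the time_resolution and length_frames parameters are assumptions by analogy with OneHotEncoder):

import dcase_util

manyhot_encoder = dcase_util.data.ManyHotEncoder(
    label_list=['cat', 'dog', 'bird'],
    time_resolution=0.02  # assumption, by analogy with OneHotEncoder
)

# Both 'cat' and 'dog' are marked active for all encoded frames.
# length_frames is an assumption, by analogy with OneHotEncoder.encode.
binary_matrix = manyhot_encoder.encode(label_list=['cat', 'dog'], length_frames=100)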

EventRollEncoder

dcase_util.data.EventRollEncoder

EventRollEncoder([label_list, ...])

Event list encoder class

EventRollEncoder.encode(metadata_container)

Generate event roll from MetaDataContainer
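
A sketch of encoding event metadata into an event roll; the metadata fields follow dcase_util.containers.MetaDataContainer conventions, while the unique_event_labels property and the time_resolution and length_seconds parameters are assumptions:

import dcase_util

meta = dcase_util.containers.MetaDataContainer([
    {'filename': 'audio.wav', 'event_label': 'cat', 'onset': 1.0, 'offset': 3.0},
    {'filename': 'audio.wav', 'event_label': 'dog', 'onset': 2.0, 'offset': 5.0},
])

event_roll_encoder = dcase_util.data.EventRollEncoder(
    label_list=meta.unique_event_labels,  # assumed property listing distinct event labels
    time_resolution=0.02                  # assumption
)

# Encode the event list into a (label x frame) activity matrix (event roll).
# length_seconds is an assumption (not in the signature above).
event_roll = event_roll_encoder.encode(metadata_container=meta, length_seconds=10.0)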

LabelMatrixEncoder

dcase_util.data.LabelMatrixEncoder

LabelMatrixEncoder([label_list, time_resolution])

Label matrix encoder base class

OneHotLabelEncoder

dcase_util.data.OneHotLabelEncoder

OneHotLabelEncoder([label_list, ...])

One Hot label encoder class

OneHotLabelEncoder.encode(label[, ...])

Generate one hot label matrix

Data manipulators

Normalizer

dcase_util.data.Normalizer

Normalizer([n, s1, s2, mean, std])

Data normalizer to accumulate data statistics

Normalizer.log([level])

Log container content

Normalizer.show([mode, indent, visualize])

Print container content

Normalizer.load([filename])

Load file

Normalizer.save([filename])

Save file

Normalizer.mean

Mean vector

Normalizer.std

Standard deviation vector

Normalizer.reset()

Reset internal variables.

Normalizer.accumulate(data[, time_axis])

Accumulate statistics

Normalizer.finalize()

Finalize statistics calculation

Normalizer.normalize(data, **kwargs)

Normalize data matrix with internal statistics of the class.
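
The methods above form an accumulate → finalize → normalize workflow. A minimal sketch, reusing the example feature repository and Stacker pattern from the Aggregator example further down this page:

import dcase_util

# Get a data matrix to work with.
data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = dcase_util.data.Stacker(recipe='mfcc').stack(data_repository)

normalizer = dcase_util.data.Normalizer()

# Accumulate statistics over the data, then finalize mean and std.
normalizer.accumulate(data_matrix)
normalizer.finalize()

# Normalize the data matrix with the accumulated statistics.
normalized_matrix = normalizer.normalize(data_matrix)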

RepositoryNormalizer

dcase_util.data.RepositoryNormalizer

RepositoryNormalizer([normalizers, filename])

Data repository normalizer

RepositoryNormalizer.load(filename[, ...])

Load normalizers from disk.

RepositoryNormalizer.normalize(data, **kwargs)

Normalize data repository

Aggregator

dcase_util.data.Aggregator

The data aggregator can be used to process a data matrix in processing windows. This processing stage can be used to collapse the data within a window, for example by calculating its mean and standard deviation, or to flatten the windowed data into a single vector.

Supported processing methods:

  • flatten

  • mean

  • std

  • cov

  • kurtosis

  • skew

The processing methods can be combined.

Usage examples:

import dcase_util

data_aggregator = dcase_util.data.Aggregator(
    recipe=['mean', 'std'],
    win_length_frames=10,
    hop_length_frames=1,
)

data_stacker = dcase_util.data.Stacker(recipe='mfcc')
data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = data_stacker.stack(data_repository)
data_matrix = data_aggregator.aggregate(data_matrix)

Aggregator([win_length_frames, ...])

Data aggregator

Aggregator.log([level])

Log container content

Aggregator.show([mode, indent, visualize])

Print container content

Aggregator.load([filename])

Load file

Aggregator.save([filename])

Save file

Aggregator.aggregate([data])

Aggregate data

Sequencer

dcase_util.data.Sequencer

The Sequencer class processes data matrices into sequences (images). Sequences can overlap, and the sequencing grid can be shifted between calls.
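
A minimal sketch, reusing the example feature repository and Stacker pattern from the Aggregator example above; the parameter values are arbitrary:

import dcase_util

data_repository = dcase_util.utils.Example.feature_repository()
data_matrix = dcase_util.data.Stacker(recipe='mfcc').stack(data_repository)

sequencer = dcase_util.data.Sequencer(
    sequence_length=10,
    hop_length=10
)

# Split the 2D data matrix into sequences of 10 frames.
sequenced_data = sequencer.sequence(data_matrix)

# Shift the sequencing grid before the next call.
sequencer.increase_shifting()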

Sequencer([sequence_length, hop_length, ...])

Data sequencer

Sequencer.log([level])

Log container content

Sequencer.show([mode, indent, visualize])

Print container content

Sequencer.load([filename])

Load file

Sequencer.save([filename])

Save file

Sequencer.sequence(data[, shift])

Convert 2D data matrix into sequence of specified length 2D matrices

Sequencer.increase_shifting([shift_step])

Increase temporal shifting

Stacker

dcase_util.data.Stacker

Data stacking class. The class takes a vector recipe and a DataRepository, and creates the corresponding data matrix.

Vector recipe

With a recipe one can select the full matrix, a part of it using start and end indices, or individual rows from it.

Example recipe:

[
    {
        'method': 'mfcc',
    },
    {
        'method': 'mfcc_delta',
        'vector-index': {
            'channel': 0,
            'start': 1,
            'end': 17,
            'full': False,
            'selection': False,
        }
    },
    {
        'method': 'mfcc_acceleration',
        'vector-index': {
            'channel': 0,
            'full': False,
            'selection': True,
            'vector': [2, 4, 6]
        }
    }
]

See dcase_util.utils.VectorRecipeParser for how a recipe string can be conveniently used to generate the above data structure.
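
For illustration, a sketch of constructing a Stacker directly from a recipe string; the 'method=start-end' index syntax and the presence of the delta stream in the example repository are assumptions:

import dcase_util

data_repository = dcase_util.utils.Example.feature_repository()

# Stack the full MFCC matrix together with selected rows of its delta features.
# The 'mfcc_delta=1-16' index syntax is an assumption about the recipe-string format.
stacker = dcase_util.data.Stacker(recipe='mfcc;mfcc_delta=1-16')
data_matrix = stacker.stack(data_repository)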

Stacker([recipe, hop])

Data stacker

Stacker.log([level])

Log container content

Stacker.show([mode, indent, visualize])

Print container content

Stacker.load([filename])

Load file

Stacker.save([filename])

Save file

Stacker.stack(repository, **kwargs)

Vector creation based on recipe

Selector

dcase_util.data.Selector

Data selecting class.

Selector(**kwargs)

Data selector

Selector.log([level])

Log container content

Selector.show([mode, indent, visualize])

Print container content

Selector.load([filename])

Load file

Selector.save([filename])

Save file

Selector.select(data[, selection_events])

Select data from a data repository based on given events

Masker

dcase_util.data.Masker

Data masking class.

Masker(**kwargs)

Data masker

Masker.log([level])

Log container content

Masker.show([mode, indent, visualize])

Print container content

Masker.load([filename])

Load file

Masker.save([filename])

Save file

Masker.mask(data[, mask_events])

Mask data in a data repository based on given events

Probabilities

ProbabilityEncoder

dcase_util.data.ProbabilityEncoder

ProbabilityEncoder([label_list])

Constructor

ProbabilityEncoder.log([level])

Log container content

ProbabilityEncoder.show([mode, indent, ...])

Print container content

ProbabilityEncoder.load([filename])

Load file

ProbabilityEncoder.save([filename])

Save file

ProbabilityEncoder.collapse_probabilities(...)

Collapse probabilities along time_axis

ProbabilityEncoder.collapse_probabilities_windowed(...)

Collapse probabilities with a sliding window.

ProbabilityEncoder.binarization(probabilities)

Binarization
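
A sketch of turning frame-wise class probabilities into decisions; the probability matrix is synthetic, and the operator, binarization_type, and threshold parameter names are assumptions beyond the signatures shown above:

import numpy
import dcase_util

# Synthetic frame-wise probabilities, shape (classes, frames).
frame_probabilities = numpy.random.rand(2, 100)

probability_encoder = dcase_util.data.ProbabilityEncoder()

# Collapse frame-wise probabilities into one value per class along the time axis.
# The operator parameter name and value are assumptions.
class_probabilities = probability_encoder.collapse_probabilities(
    probabilities=frame_probabilities,
    operator='sum'
)

# Binarize frame-wise probabilities with a fixed threshold.
# The binarization_type and threshold parameter names are assumptions.
frame_decisions = probability_encoder.binarization(
    probabilities=frame_probabilities,
    binarization_type='global_threshold',
    threshold=0.5
)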

Decisions

DecisionEncoder

dcase_util.data.DecisionEncoder

DecisionEncoder([label_list])

Constructor

DecisionEncoder.log([level])

Log container content

DecisionEncoder.show([mode, indent, visualize])

Print container content

DecisionEncoder.load([filename])

Load file

DecisionEncoder.save([filename])

Save file

DecisionEncoder.majority_vote(frame_decisions)

Majority vote.

DecisionEncoder.many_hot(frame_decisions[, ...])

Many hot

DecisionEncoder.find_contiguous_regions(...)

Find contiguous regions from bool valued numpy.array.

DecisionEncoder.process_activity(...[, ...])

Process activity array (binary)
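
A sketch of collapsing frame-wise binary decisions into item-level decisions; the decision matrix is synthetic, and the exact input format and return value of majority_vote are assumptions:

import numpy
import dcase_util

# Synthetic frame-wise binary decisions, shape (classes, frames).
frame_decisions = numpy.random.rand(2, 100) > 0.5

decision_encoder = dcase_util.data.DecisionEncoder(label_list=['cat', 'dog'])

# Collapse frame-wise decisions across time by majority vote.
# The expected shape of frame_decisions is an assumption.
item_decision = decision_encoder.majority_vote(frame_decisions=frame_decisions)

# Find [onset, offset) frame indices of contiguous active regions for one class.
regions = decision_encoder.find_contiguous_regions(frame_decisions[0, :])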