dcase_util.containers.DataRepository

class dcase_util.containers.DataRepository(data=None, filename=None, default_stream_id=0, processing_chain=None, **kwargs)

Data repository container class to store multiple DataContainers together.

Containers are stored in a dict: the label is used as the dictionary key, and the value is the associated data container.
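The storage layout can be pictured with plain dictionaries. The sketch below is not dcase_util code. It assumes a two-level layout (label first, then stream id), which is inferred from the default_stream_id parameter and the set_container/get_container signatures listed on this page. The helper functions and the strings standing in for containers are hypothetical.

```python
# Illustrative model only: a plain dict stands in for DataRepository,
# and strings stand in for data containers.
DEFAULT_STREAM_ID = 0  # mirrors the documented default_stream_id=0


def set_container(repo, container, label, stream_id=DEFAULT_STREAM_ID):
    """Store a container under a label and stream id (hypothetical helper)."""
    repo.setdefault(label, {})[stream_id] = container


def get_container(repo, label, stream_id=DEFAULT_STREAM_ID):
    """Fetch a container by label and stream id (hypothetical helper)."""
    return repo[label][stream_id]


repo = {}
set_container(repo, "mel features, stream 0", label="mel")
set_container(repo, "mel features, stream 1", label="mel", stream_id=1)
set_container(repo, "mfcc features, stream 0", label="mfcc")

labels = sorted(repo.keys())               # labels held in the model repository
mel_streams = sorted(repo["mel"].keys())   # stream ids stored for one label
```

In this model, omitting stream_id on lookup falls back to the default stream, which is the role the documentation gives to default_stream_id.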

Constructor

Parameters
filename: str or dict

Either one filename (str) or multiple filenames in a dictionary. The dictionary form is used to construct the repository from separate FeatureContainers. Two dictionary formats are supported: 1) label as key and filename as value, and 2) a two-level dictionary with label as the first key, stream as the second key, and filename as value.

default_stream_id: str or int

Default stream id used when accessing data. Default value: 0

processing_chain: ProcessingChain

Processing chain to be included in the repository. Default value: None
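The two dictionary formats accepted by filename can be illustrated with plain Python data. The sketch below is not the library's implementation: normalize_filenames is a hypothetical helper, the file names are made up, and mapping a flat entry to the default stream id is an assumption based on the default_stream_id parameter.

```python
def normalize_filenames(filename, default_stream_id=0):
    """Return {label: {stream_id: filename}} for either dictionary format.

    Hypothetical helper for illustration; not part of dcase_util.
    """
    normalized = {}
    for label, value in filename.items():
        if isinstance(value, dict):
            # Format 2: label -> stream id -> filename
            normalized[label] = dict(value)
        else:
            # Format 1: label -> filename (assumed to use the default stream)
            normalized[label] = {default_stream_id: value}
    return normalized


# Format 1: label as key, filename as value (file names are hypothetical)
flat = {"mel": "mel.cpickle", "mfcc": "mfcc.cpickle"}

# Format 2: label as first key, stream id as second key, filename as value
nested = {"mel": {0: "mel_0.cpickle", 1: "mel_1.cpickle"}}

flat_normalized = normalize_filenames(flat)
nested_normalized = normalize_filenames(nested)
```

Either dictionary would be passed as the filename argument of the constructor. The normalization step only shows how the two shapes relate to each other.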

Methods

__init__([data, filename, ...])

Constructor

clear()

copy()

delimiter([exclude_delimiters])

Use csv.Sniffer to guess the delimiter of a CSV file

detect_file_format([filename])

Detect file format from extension

empty()

Check if file is empty

exists()

Checks that file exists

filter([data, excluded_key_prefix])

Filter nested dict

fromkeys(iterable[, value])

Create a new dictionary with keys from iterable and values set to value.

get(key[, default])

Return the value for key if key is in the dictionary, else default.

get_container(label[, stream_id])

Get container from repository

get_dump_content(data)

Clean internal content for saving

get_file_information()

Get file information, filename

get_hash([data])

Get unique hash string (md5) for given parameter dict.

get_hash_for_path([dotted_path])

Get unique hash string for the data under given path.

get_leaf_path_list([target_field, ...])

Get a list of paths to all leaf nodes in the nested dict.

get_path(path[, default, data])

Get value from nested dict with dotted path

is_package([filename])

Determine if the file is a compressed package.

items()

keys()

load([filename, collect_from_containers])

Load file list

log([level])

Log container content

merge(override[, target])

Recursive dict merge

plot([plot, figsize])

Visualize data stored in the repository.

pop(key[, default])

Remove the specified key and return the corresponding value. If key is not found, default is returned if given; otherwise KeyError is raised.

popitem(/)

Remove and return a (key, value) pair as a 2-tuple.

push_processing_chain_item(processor_name[, ...])

Push processing chain item

save([filename, split_into_containers])

Save file

set_container(container, label[, stream_id])

Store container to repository

set_path(path, new_value[, data])

Set value in nested dict with dotted path

setdefault(key[, default])

Insert key with a value of default if key is not in the dictionary.

show([mode, indent, visualize])

Print container content

stream_ids(label)

Stream ids stored for the label in the repository.

to_html([indent])

Get container information as an HTML-formatted string

to_string([ui, indent])

Get container information in a string

update([E, ]**F)

If E is present and has a .keys() method, then does: for k in E: D[k] = E[k]. If E is present and lacks a .keys() method, then does: for k, v in E: D[k] = v. In either case, this is followed by: for k in F: D[k] = F[k].

validate_format()

Validate file format

values()
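The three cases in the update description are easier to follow with concrete values. The sketch below uses a plain dict, assuming update here behaves as the standard dict method whose description is quoted above. The labels and string values are placeholders, not real containers.

```python
repo = {"mel": "A"}

# E is a mapping (has .keys()): for k in E: D[k] = E[k]
repo.update({"mfcc": "B"})

# E is an iterable of (key, value) pairs: for k, v in E: D[k] = v
repo.update([("zcr", "C")])

# Keyword arguments F are applied after E: for k in F: D[k] = F[k]
repo.update({"mel": "from E"}, mel="from F")
```

Because keyword arguments are applied last, the final value stored under "mel" comes from F, not from E.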

Attributes

bytes

File size in bytes

labels

Item labels stored in the repository.

logger

Logger instance

md5

Checksum for file.

valid_formats

Valid file formats
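The bytes and md5 attributes describe the file behind the repository. The sketch below shows how comparable values can be computed with the Python standard library. It is not the library's implementation: file_md5 is a hypothetical helper, and a temporary file stands in for a saved repository file.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, read in chunks (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# A temporary file stands in for a saved repository file.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"hello")
    path = handle.name

size_in_bytes = os.path.getsize(path)  # comparable to the bytes attribute
checksum = file_md5(path)              # comparable to the md5 attribute
os.remove(path)
```

Reading in chunks keeps memory use flat for large feature files.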