Release notes
v0.2.20
Updates
Add MKV as valid file extension in AudioContainer
Add overlay method in AudioContainer
Add balancing_mode parameter to validation_files_balanced method in SoundEventDataset
Add dataset field to MetaDataItem and add dataset field related properties to MetaDataContainer
Add identifier count reporting for tags in MetaDataContainer
Add map_tags method in MetaDataContainer
Add TAUUrbanAcousticScenes_2022_Mobile_EvaluationSet dataset
Update unique_source_labels method in MetaDataContainer to be more efficient
Update model_summary_string function to work with latest Keras versions
Bug fixes
Fix field override in non_hashable_fields method in AppParameterContainer
Fix process_meta_container in AcousticSceneDataset to retain container filename
Fix is_jupyter function to work when IPython is not installed
Fix plot method in AudioContainer to use figsize parameter for dual plots
v0.2.19
Updates
Add TAUUrbanAcousticScenes_2022_Mobile_DevelopmentSet
Add active_scenes and active_events parameter for Datasets class to select scene and event classes.
Add event activity and inactivity calculation for MetaDataContainer
Add TorchOpenL3Extractor and TorchOpenL3ExtractorProcessor
Add float5 and float6 value types to formatted_value method in FancyStringifier.
Add valid content type ‘all’ for RemoteFile
Bug fixes
Fix unique_files in MetaDataContainer to be more efficient with large number of files.
Fix check_metadata in Dataset to be more efficient with large number of files.
Fix field ‘event_label’ processing to be robust for non-string values in MetaDataItem
Fix data loading to be robust for empty files in DictContainer
Fix ‘float1_percentage+ci’, ‘float2_percentage+ci’, ‘float3_percentage+ci’, and ‘float4_percentage+ci’ value types in FancyStringifier to have fallback data types if values do not have full data.
v0.2.18
Updates
Add TAUUrbanAudioVisualScenes_2021_EvaluationSet class
Add TAUUrbanAcousticScenes_2021_Mobile_EvaluationSet
Bug fixes
Fix normalize method in AudioContainer to work with multi and single channel audio.
Fix pack method in DatasetPacker to correctly identify changed files and trigger package regeneration
Fix TAUUrbanAcousticScenes_2021_Mobile_EvaluationSet to correctly extract zip-packages
v0.2.17
Updates
Add TensorFlow keras utilities (tf.keras)
Add get_media_duration and merge_media_files functions
Add filename_audio and filename_video properties to MetaDataItem class
Add TAUUrbanAudioVisualScenes_2021_DevelopmentSet class
Update travis tests: Python 2.7 tests are dropped, only Python 3.X tests are used
Bug fixes
Fix base_path removal in DatasetPacker
v0.2.16
Updates
Add scene_labels for TAUUrbanAcousticScenes_2020_Mobile_EvaluationSet and TAUUrbanAcousticScenes_2020_3Class_EvaluationSet.
Update SubmissionChecker
v0.2.15
Bug fixes
Fix all_data fold handling in TAUUrbanAcousticScenes_2020_Mobile_EvaluationSet and TAUUrbanAcousticScenes_2020_3Class_EvaluationSet to be uniform.
v0.2.14
Updates
Update PIP package
v0.2.13
New features
TAUUrbanAcousticScenes_2020_Mobile_EvaluationSet and TAUUrbanAcousticScenes_2020_3Class_EvaluationSet datasets
v0.2.12
Updates
Update TAUUrbanAcousticScenes_2020_Mobile_DevelopmentSet to use dataset version 2.0
v0.2.11
New features
Add OpenL3Extractor, EdgeL3Extractor, and EmbeddingExtractor feature extractor classes
Add OpenL3ExtractorProcessor and EdgeL3ExtractorProcessor processors
Add TAUUrbanAcousticScenes_2020_Mobile_DevelopmentSet and TAUUrbanAcousticScenes_2020_3Class_DevelopmentSet datasets
Add get_audio_info function to allow fetching audio file information without reading full file.
Add MP3 audio example file
Updates
Update AudioContainer constructor to allow initialization with multi-channel audio data in form of list of audio data vectors.
Update load method in AudioContainer to have parameter auto_trimming to automatically trim stop parameter to audio file length
Update load method in AudioContainer to check start and stop parameters against actual audio file duration
Update AudioContainer to store channel_labels, update plot_wave and plot_spec methods with channel_labels as well.
Update plot_wave method in AudioContainer to support max_sr parameter and different color per channel.
Update plot method in AudioContainer to plot both waveform and spectrogram at the same time (dual plotting mode)
Update segments method in AudioContainer to support active_segments
Update FancyLogger, FancyHTMLPrinter, and FancyStringifier to accumulate row value when using row method, and add row_sum and row_average methods
Update setup_keras to suppress TensorFlow warnings
Update debug_packages method of Dataset class to show more information about local files
Update FileMixin to allow overriding valid_format through constructor parameter
Update code to support Librosa 0.7.0
Bug fixes
Fix MetaDataContainer sorting to work with numeric filenames
Fix get_byte_string to work with small values
Fix filename handling in FeatureRepository when dict of filenames is used
Fix collapse_probabilities_windowed method in ProbabilityEncoder to accept arrays of probabilities
Fix example system sed_gmm.py to work with current version
v0.2.10
Bug fixes
Fix cross-validation data loading for datasets without reference meta data in load_crossvalidation_data method of Dataset class
v0.2.9
New features
Add TAUUrbanAcousticScenes_2019_EvaluationSet, TAUUrbanAcousticScenes_2019_Mobile_EvaluationSet, and TAUUrbanAcousticScenes_2019_Openset_EvaluationSet datasets.
v0.2.8
New features
Add TAUUrbanAcousticScenes_2019_LeaderboardSet, TAUUrbanAcousticScenes_2019_Mobile_LeaderboardSet, and TAUUrbanAcousticScenes_2019_Openset_LeaderboardSet datasets.
Add is_jupyter function to detect if code is running inside Jupyter
Add shorten method in Path to shorten long paths for visualization purpose
Add FancyHTMLStringifier, FancyHTMLPrinter classes for HTML output
Add plot method in DataArrayContainer
Add plot method in Normalizer
Updates
Update YAML serialization to use yaml.FullLoader
Update formatted_value method in FancyStringifier to be static
Refactor printing methods in containers to allow automatic output mode switching between HTML (Jupyter) and string (console)
Update data printing mechanism for containers
Update plot methods API to include figsize parameter
Update default parameters in plot method in AudioContainer (color bar is hidden by default)
Update error messages in AudioContainer to be more informative
Update load method in MetaDataContainer to support additional row formats
Update feature_extractor_list method to have option to return string or display (print to console or print as HTML output in Jupyter)
Update dataset_list to use table layout and add option to return string or display (print to console or print as HTML output in Jupyter)
Update to_string method in MetaDataContainer with option show_info to control what data is printed
Update API for methods show and log in Dataset to include show_meta parameter and mode parameter to control output format
Update printing in validation_files_balanced method in AcousticSceneDataset to support different output modes (print to console or print as HTML output in Jupyter)
Update ProgressLoggerCallback to include show_timing parameter and notebook output type
Update StasherCallback with to_string and show
Update printing inside setup_keras function
Update model_summary_string function with new parameters (show_parameters and display)
Update plot method in DataMatrix2DContainer with xlabel and ylabel parameters
Update plot method in BinaryMatrix2DContainer with panel_title_position parameter
Update usage of tqdm library in Dataset to allow disabling/enabling the progress bar locally
Bug fixes
Fix single channel audio plotting in AudioContainer
v0.2.7
Updates
Update TAUUrbanAcousticScenes_2019_Mobile_DevelopmentSet, and TAUUrbanAcousticScenes_2019_Openset_DevelopmentSet datasets.
v0.2.6
New features
Add TAUUrbanAcousticScenes_2019_DevelopmentSet, TAUUrbanAcousticScenes_2019_Mobile_DevelopmentSet, and TAUUrbanAcousticScenes_2019_Openset_DevelopmentSet datasets.
Add OneHotEncoder and OneHotEncodingProcessor to allow unknown labels.
Add automatic meta data check ups in datasets classes, and parameter to control it.
Add AudioSequencingProcessor
Add feature_extractor_list to show all available feature extractors classes, and add description to all feature extraction classes.
Updates
Update debug_packages method to allow better control which part of package_list is checked: remote or local.
Update data_collector to have generic data axis handling.
Update load method in ListDictContainer to skip empty rows in CSV files.
Update save method in ListDictContainer for TXT and CSV to avoid extra empty lines under Windows.
Update save method in MetaDataContainer for TXT and CSV to avoid extra empty lines under Windows.
Update relative_to_absolute_path and absolute_to_relative_path to give more informative error messages.
Update EventRollEncodingProcessor to support pad_length parameter.
Update unit tests to be cross-platform compatible (Linux / Windows)
Update SuppressStdoutAndStderr to be more robust
Update MetaDataItem to keep filename field to be posix path when relative path is used.
Update dtypes to be compatible with numpy v1.14
Update setup_keras to warn when GPU was not found.
Update model_summary_string to show activation function of the output layer.
Update all processors, encoders, and manipulators to have __call__ magic class method.
Bug fixes
Fix delimiter detection in load method in MetaDataContainer
Fix MetaDataItem to better handle empty fields (onset, offset, and event_label).
Fix how validation_split and validation_files_dataset methods use training_meta parameter.
v0.2.5
New features
Add SoundDataset base class.
Add feature_extractor_factory to get feature extractor class based on feature label.
Add OneHotLabelEncoder label based encoder.
Add OneHotLabelEncodingProcessor class.
Add DBR_Dataset class.
Add map_events method to MetaDataContainer to map multiple event labels into single target event label.
Add event_inactivity method to MetaDataContainer to get inactivity segments between events.
Add __version__ variable to the module.
Add check_installation function to check module installation.
Add TUTAcousticScenes_2017_FeaturesSet dataset class.
Add check_metadata method to dataset classes to double check meta and cross-validation setups automatically during the dataset initialization.
Updates
Update ProcessingChain to verify that all items in the chain are instances of Processor class.
Update ProbabilityItem to have index property.
Update ProbabilityContainer to support pickle saving and loading.
Update ProbabilityContainer to have as_matrix method.
Update majority_vote method in DecisionEncoder to be more generic (works with both labels and class IDs).
Move processor classes related to encoding into separate file.
Update load method in MetaDataContainer to translate between decimal comma and point.
Update data_collector function to be more generic.
Update formatted_value method in FancyStringifier to support fixed length strings (stf).
Refactor SubmissionChecker to be more flexible.
Update DCASEAppParameterContainer to support secondary data processing chain.
Update create_sequential_model function to return optionally functional API Keras model instead of default Keras sequential model.
Update ProgressLoggerCallback to print estimate of the remaining model learning time.
Bug fixes
Fix dataset class when no remote_file is set
v0.2.4
New features
Add TUTUrbanAcousticScenes_2018_EvaluationSet and TUTUrbanAcousticScenes_2018_Mobile_EvaluationSet dataset classes.
Add DCASE2018_Task5_EvaluationSet dataset class.
Updates
Update formatted_value method in FancyStringifier to have full coverage of float formats (float precision from 1 to 4).
Bug fixes
Fix TUTRareSoundEvents_2017_EvaluationSet dataset class to have correct audio path.
v0.2.3
New features
Add AudioWritingProcessor and MonoAudioWritingProcessor processor classes.
Add FeatureWritingProcessor and RepositoryFeatureWritingProcessor processor classes.
Bug fixes
Fix DataRepository not to have internal variables in the __dict__ after loading container from disk.
v0.2.2
In this version, the external dependencies of this module are minimized. External modules required for non-core functionality are no longer included in setup.py and are not installed automatically. When a user invokes functionality that requires one of these rarely used external modules and the module is not found, an ImportError is raised with instructions to install the correct module through pip. All module requirements are still listed in requirements.txt.
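The deferred-import behaviour described above can be illustrated with a minimal sketch. The helper name require_optional and the wording of the error message are hypothetical and are not taken from the module; only the general pattern (import on first use, raise ImportError with a pip hint) follows the description.

```python
import importlib


def require_optional(module_name, pip_name=None):
    """Import an optional module, or raise ImportError with install instructions.

    Hypothetical helper for illustration; not part of the released module.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        package = pip_name or module_name
        message = (
            "Module [{}] is required for this functionality but was not found. "
            "Install it with: pip install {}".format(module_name, package)
        )
        raise ImportError(message) from error


# A module that is installed imports normally.
json_module = require_optional("json")

# A missing module produces an error that tells the user what to install.
try:
    require_optional("hypothetical_missing_module_xyz")
except ImportError as error:
    error_text = str(error)
```

Calling such a helper inside the function that needs the dependency, rather than at module import time, keeps the core package importable when the optional module is absent.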
New features
Add unique_source_labels property to MetaDataContainer.
Add file_format parameter to load and save method for ListContainer and DictContainer to force specific file format.
Add label_list parameter to ManyHotEncodingProcessor.
Add DatasetPacker class to make DCASE styled dataset packages.
Add dataset_exists helper function to check Dataset classes.
Add multi-channel audio example audio_container_ch4.
Add TUTUrbanAcousticScenes_2018_LeaderboardSet and TUTUrbanAcousticScenes_2018_Mobile_LeaderboardSet dataset classes.
Updates
Update Dataset class to also handle non-text meta files by introducing parameter evaluation_setup_file_extension.
Update package list handling in Dataset to support custom package extraction parameters by extra parameter package_extract_parameters.
Update pad method in AudioContainer to work with multi-channel audio.
Update compress method to produce split packages only if size limit is met.
Update compress method to return package filenames.
Update DCASE2018_Task5_DevelopmentSet dataset.
v0.2.1
New features
Add md5 and bytes properties to FileMixin.
Add two level hierarchical balancing to validation_files_balanced method in AcousticSceneDataset.
Add TUTUrbanAcousticScenes_2018_DevelopmentSet and TUTUrbanAcousticScenes_2018_Mobile_DevelopmentSet datasets.
Add float1_ci, float2_ci, float1_ci_bracket, float2_ci_bracket, float1_percentage+ci and float2_percentage+ci value types to formatted_value method in FancyStringifier.
Add get_set method to AppParameterContainer.
Add data_collector function to collect data and meta.
Updates
Update debug_packages method in Dataset to provide more information.
Update validation subset generation methods (validation_split, validation_files_dataset, validation_files_random, and validation_files_balanced) in Dataset, AcousticSceneDataset, SoundEventDataset, and AudioTaggingDataset to allow external processing of meta data before processing through training_meta parameter.
Update filter method in ListDictContainer to allow filtering based on list of values.
Update set_label property to MetaDataItem.
Update filter method in MetaDataContainer to use filter method from parent class.
Update example applications to use current API.
Update random seed setting for TensorFlow in setup_keras function.
Update dataset_factory to handle dataset classes defined outside dcase_util.
Bug fixes
Fix load_from_youtube method in AudioContainer.
Fix example applications to work on Windows (Python 3.6).
v0.2.0
New features
Add row_reset and row_sep helper methods to FancyStringifier, FancyLogger, and FancyPrinter classes.
Updates
Update download method in RemoteFile to be more robust when encountering SSL problems.
Update AppParameterContainer to handle FEATURE_PROCESSING_CHAIN and DATA_PROCESSING_CHAIN.
Update filter method in MetaDataContainer to accept source_label and source_label_list parameters.
Update DCASE2018_Task5_DevelopmentSet.
Bug fixes
Fix construct_path method in ApplicationPaths to work in Windows as well.
Fix path creation in AppParameterContainer.
v0.1.9
New features
Add new processors FeatureReadingProcessor, DataShapingProcessor, RepositoryAggregationProcessor, RepositorySequencingProcessor, and RepositoryToMatrixProcessor.
Add extract method to SpectralFeatureExtractor.
Add automatic conversion of numeric fields when loading CSV data to ListDictContainer.
Add filter and get_field_unique methods to ListDictContainer.
Add MP4 to valid audio formats for AudioContainer.
Add general path modification method (Path.modify).
Add Keras profile cuda0_fast.
Add Keras utility to create optimizer instance (create_optimizer).
Add DCASE2018_Task5_DevelopmentSet and DCASE2013_Scenes_EvaluationSet datasets.
Add DataMatrix4DContainer.
Add plot method to DataMatrix3DContainer.
Add support for a new annotation format for tags [filename][tab][tags] in MetaDataContainer.
Add zero padding to Sequencer.
Add header field override in load method of MetaDataContainer.
Add support for new compressed audio formats (OGG, MP3) in AudioContainer.
Add segments method in AudioContainer to split signal into non-overlapping segments with optionally skipped regions.
Add pad method in AudioContainer to pad signal into given length.
Add compress method in PackageMixin.
Add Package class to handle local compressed file packages.
Add change_axis method to DataMatrix2DContainer, DataMatrix3DContainer, and DataMatrix4DContainer.
Add KerasDataSequence class for data generation through processing chain.
Add support for data and meta processing chains to DCASEAppParameterContainer.
Add many_hot method in DecisionEncoder.
Updates
Update TUTRareSoundEvents_2017_DevelopmentSet and TUTRareSoundEvents_2017_EvaluationSet datasets.
Update Keras utility model_summary_string to use by default standard method from Keras.
Update FeatureRepository API to be aligned with Container classes.
Update Sequencer, SequencingProcessor, and RepositorySequencingProcessor API.
Update AppParameterContainer to allow change of active set even after process method has been called.
Update mechanism to store meta information about chain item when data is processed using processing chain.
Bug fixes
Fix save method in MetaDataContainer when saving with tags in CSV format.
Fix many methods to give more appropriate error messages.
API changes and compatibility
Sequencer, SequencingProcessor, and RepositorySequencingProcessor API changes:
    frames changed to sequence_length
    hop_length_frames changed to hop_length
    padding parameter now accepts strings (zero and repeat)
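For code written against the older parameter names, the renames listed above can be applied mechanically. The following is a minimal sketch under that assumption; migrate_sequencer_kwargs and RENAMED_PARAMETERS are hypothetical names introduced here for illustration and are not part of the module's API.

```python
# Old-to-new parameter names, taken from the v0.1.9 API changes listed above.
RENAMED_PARAMETERS = {
    "frames": "sequence_length",
    "hop_length_frames": "hop_length",
}


def migrate_sequencer_kwargs(old_kwargs):
    """Return a copy of old_kwargs with pre-v0.1.9 parameter names renamed.

    Hypothetical helper for illustration; values are passed through unchanged.
    """
    new_kwargs = {}
    for name, value in old_kwargs.items():
        new_name = RENAMED_PARAMETERS.get(name, name)
        if new_name in new_kwargs:
            # Refuse ambiguous input such as both 'frames' and 'sequence_length'.
            raise ValueError("Both old and new names given for '{}'".format(new_name))
        new_kwargs[new_name] = value
    return new_kwargs


migrated = migrate_sequencer_kwargs(
    {"frames": 10, "hop_length_frames": 5, "padding": "zero"}
)
```

The migrated dictionary could then be passed as keyword arguments to the updated classes; whether other parameters also need adjusting is not covered by these notes.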
v0.1.8
New features
Add new formats for MetaDataContainer (cpickle, CSV).
Add forced file formats when reading and saving containers.
Add Keras setup function.
Add frame splitting method into AudioContainer.
Bug fixes
Fix unicode string support when printing container information.
Fix data contamination through data references while manipulating data.
Some minor bug fixes.
v0.1.7
New features
Add intersection method for MetaDataContainer.
Updates
Update dataset class API (e.g. copy returned metadata to prevent accidental manipulation, uniform method names).
Bug fixes
Fix data sequencing when overlapping sequencing is used.
Fix datasets CHiMEHome_DomesticAudioTag_DevelopmentSet, TUTAcousticScenes_2017_EvaluationSet, and TUTSoundEvents_2017_EvaluationSet.
v0.1.6
New features
Add CHiMEHome_DomesticAudioTag_EvaluationSet dataset.
Updates
Update example audio to be 16-bit audio file in wav-format instead of FLAC used earlier.
Update ProbabilityContainer API to be more compatible with MetaDataContainer.
Update MetaDataItem to be compatible with field naming used previously in DCASE baseline systems.
Update ui utilities.
Bug fixes
Fix audio reading when target sampling rate is not set.
Some minor bug fixes.
v0.1.5
Fixing PYPI package.
v0.1.4
Release first PYPI package.
v0.1.0
Initial public release.