parakeet.data package

Submodules

parakeet.data.batch module

Utility functions that pad variable-length arrays to a common length and stack them into a batch. Batch functions are provided for text sequences, audio, and spectrograms.

class parakeet.data.batch.SpecBatcher(pad_value=0.0, time_major=False, dtype=<class 'numpy.float32'>)

Bases: object

A wrapper class for batch_spec.

class parakeet.data.batch.TextIDBatcher(pad_id=0, dtype=<class 'numpy.int64'>)

Bases: object

A wrapper class for batch_text_id.

class parakeet.data.batch.WavBatcher(pad_value=0.0, dtype=<class 'numpy.float32'>)

Bases: object

A wrapper class for batch_wav.

parakeet.data.batch.batch_spec(minibatch, pad_value=0.0, time_major=False, dtype=<class 'numpy.float32'>)

Pad spectra to the largest length and batch them.

Args:

minibatch (List[np.ndarray]): list of rank-2 arrays of shape (F, T) for mono-channel spectrograms, or list of rank-3 arrays of shape (C, F, T) for multi-channel spectrograms (F stands for frequency bands), dtype float.

pad_value (float, optional): the pad value. Defaults to 0.

dtype (np.dtype, optional): data type of the output. Defaults to np.float32.

Returns:

np.ndarray: a rank-3 array of shape(B, F, T) or (B, T, F).
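The padding described above can be illustrated with plain NumPy. This is a minimal sketch, not Parakeet's implementation: `pad_and_stack_specs` is a hypothetical name, it handles only the mono-channel (F, T) case, and it assumes that `time_major=True` selects the (B, T, F) layout mentioned under Returns.

```python
import numpy as np


def pad_and_stack_specs(specs, pad_value=0.0, time_major=False, dtype=np.float32):
    """Hypothetical stand-in for the documented behavior (mono-channel only)."""
    # The time axis is the last axis of each (F, T) array.
    max_len = max(s.shape[-1] for s in specs)
    padded = []
    for s in specs:
        extra = max_len - s.shape[-1]
        # Pad only at the end of the time axis; leave the frequency axis alone.
        p = np.pad(s, [(0, 0), (0, extra)], mode="constant", constant_values=pad_value)
        # Assumption: time_major means each example becomes (T, F).
        padded.append(p.T if time_major else p)
    return np.stack(padded).astype(dtype)


# Two spectrograms with 80 frequency bands and different numbers of frames.
specs = [np.ones((80, 5)), np.ones((80, 3))]
spec_batch = pad_and_stack_specs(specs)                      # shape (2, 80, 5)
spec_batch_tm = pad_and_stack_specs(specs, time_major=True)  # shape (2, 5, 80)
```

The shorter spectrogram is filled with `pad_value` in its last two frames, so a model that must ignore padding still needs the original lengths tracked separately.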

parakeet.data.batch.batch_text_id(minibatch, pad_id=0, dtype=<class 'numpy.int64'>)

Pad sequences of text_ids to the largest length and batch them.

Args:

minibatch (List[np.ndarray]): list of rank-1 arrays of text_ids, shape (T,), dtype np.int64.

pad_id (int, optional): the id which corresponds to the special pad token. Defaults to 0.

dtype (np.dtype, optional): the data type of the output. Defaults to np.int64.

Returns:

np.ndarray: the output batch, a rank-2 array of text_ids of shape (B, T), where B is the batch size and T is the largest sequence length.
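As a rough illustration of the behavior described above, the following NumPy sketch pads integer id sequences with a pad id and stacks them. `pad_and_stack_ids` is a hypothetical helper written for this example and is not part of Parakeet.

```python
import numpy as np


def pad_and_stack_ids(seqs, pad_id=0, dtype=np.int64):
    """Hypothetical stand-in: pad rank-1 id arrays to the longest one."""
    max_len = max(len(s) for s in seqs)
    # Start from a batch filled with the pad id, then copy each sequence in.
    batch = np.full((len(seqs), max_len), pad_id, dtype=dtype)
    for i, s in enumerate(seqs):
        batch[i, : len(s)] = s
    return batch


ids = [np.array([5, 2, 9], dtype=np.int64), np.array([7], dtype=np.int64)]
id_batch = pad_and_stack_ids(ids)
# id_batch is [[5, 2, 9], [7, 0, 0]]
```

Choosing a `pad_id` that is reserved for the pad token lets downstream code build a mask with a simple comparison such as `id_batch != pad_id`.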

parakeet.data.batch.batch_wav(minibatch, pad_value=0.0, dtype=<class 'numpy.float32'>)

Pad audio clips to the largest length and batch them.

Args:

minibatch (List[np.ndarray]): list of rank-1 float arrays (mono-channel audio, shape (T,)).

pad_value (float, optional): the pad value. Defaults to 0.

dtype (np.dtype, optional): the data type of the output. Defaults to np.float32.

Returns:

np.ndarray: shape(B, T), the output batch.
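The same idea applies to waveforms. The sketch below is an assumption-laden NumPy illustration of the documented behavior, using a hypothetical `pad_and_stack_wavs` helper rather than Parakeet's own code.

```python
import numpy as np


def pad_and_stack_wavs(wavs, pad_value=0.0, dtype=np.float32):
    """Hypothetical stand-in: pad mono waveforms of shape (T,) to the longest one."""
    max_len = max(w.shape[0] for w in wavs)
    batch = np.full((len(wavs), max_len), pad_value, dtype=dtype)
    for i, w in enumerate(wavs):
        # Copy the samples; the tail keeps pad_value.
        batch[i, : w.shape[0]] = w
    return batch


wavs = [np.linspace(-1.0, 1.0, 4), np.linspace(-1.0, 1.0, 6)]
wav_batch = pad_and_stack_wavs(wavs)  # shape (2, 6), dtype float32
```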

parakeet.data.dataset module

class parakeet.data.dataset.CacheDataset(dataset)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.ChainDataset(*datasets)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.DictDataset(**datasets)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.FilterDataset(dataset, filter_fn)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.SliceDataset(dataset, start, finish, order=None)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.SubsetDataset(dataset, indices)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.TransformDataset(dataset, transform)

Bases: paddle.fluid.dataloader.dataset.Dataset

class parakeet.data.dataset.TupleDataset(*datasets)

Bases: paddle.fluid.dataloader.dataset.Dataset

parakeet.data.dataset.split(dataset, first_size)

A utility function to split a dataset into two datasets.
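The description does not say how the split is made. The sketch below shows one plausible reading on a plain Python list: it assumes `first_size` is an integer count of examples and that the split is sequential, with no shuffling. `split_sequence` is a hypothetical helper, and Parakeet's actual function may behave differently.

```python
def split_sequence(items, first_size):
    """Hypothetical sketch: first `first_size` items, then the remainder."""
    # Assumption: first_size is an integer number of examples.
    if not 0 <= first_size <= len(items):
        raise ValueError("first_size must be between 0 and len(items)")
    return items[:first_size], items[first_size:]


examples = list(range(10))
first_part, second_part = split_sequence(examples, 8)
# first_part holds 8 examples, second_part holds the remaining 2
```

A split of this kind is typically used to hold out part of a dataset for validation.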

Module contents

Parakeet’s infrastructure for data processing.