kedro.io

Description

kedro.io provides functionality for reading from and writing to a number of datasets. At the core of the library is the AbstractDataset class.

Classes

kedro.io.AbstractDataset(*args, **kwds)

AbstractDataset is the base class for all dataset implementations. Every dataset implementation should extend this abstract class and implement the methods marked as abstract. If a specific dataset implementation cannot be used in conjunction with the ParallelRunner, that user-defined dataset should set the attribute _SINGLE_PROCESS = True.
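The subclassing pattern described above can be sketched with the standard library alone. SketchAbstractDataset and TextBufferDataset are hypothetical stand-ins, not Kedro classes, and the abstract method names _load, _save and _describe are an assumption about which methods the real base class marks as abstract; check the AbstractDataset class reference before relying on them.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class SketchAbstractDataset(ABC):
    """Hypothetical stand-in for the base class; not Kedro's implementation."""

    def load(self) -> Any:
        return self._load()

    def save(self, data: Any) -> None:
        self._save(data)

    @abstractmethod
    def _load(self) -> Any:
        """Return the stored data."""

    @abstractmethod
    def _save(self, data: Any) -> None:
        """Persist the given data."""

    @abstractmethod
    def _describe(self) -> Dict[str, Any]:
        """Return a dictionary describing the dataset."""


class TextBufferDataset(SketchAbstractDataset):
    """Hypothetical user-defined dataset that keeps text in an attribute."""

    # Flag named in the description above: marks the dataset as unsuitable
    # for use with the ParallelRunner.
    _SINGLE_PROCESS = True

    def __init__(self) -> None:
        self._text = ""

    def _load(self) -> str:
        return self._text

    def _save(self, data: str) -> None:
        self._text = data

    def _describe(self) -> Dict[str, Any]:
        return {"type": type(self).__name__, "length": len(self._text)}


dataset = TextBufferDataset()
dataset.save("hello")
loaded = dataset.load()
```

Because the three methods are declared abstract, instantiating the base class directly, or a subclass that omits one of them, raises TypeError.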

kedro.io.AbstractVersionedDataset(filepath, ...)

AbstractVersionedDataset is the base class for all versioned dataset implementations.

kedro.io.CachedDataSet

alias of CachedDataset

kedro.io.CachedDataset(dataset[, version, ...])

CachedDataset is a dataset wrapper that caches saved data in memory, so that the user avoids I/O operations on slow storage media.
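The caching idea can be illustrated with a minimal sketch. SlowStorageDataset and CachingWrapper are hypothetical names and this is not Kedro's implementation; the sketch only assumes that the wrapped object exposes load() and save().

```python
from typing import Any


class SlowStorageDataset:
    """Hypothetical dataset standing in for slow storage; counts its reads."""

    def __init__(self) -> None:
        self._data: Any = None
        self.read_count = 0

    def load(self) -> Any:
        self.read_count += 1
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


_EMPTY = object()  # sentinel, so that a cached None still counts as cached


class CachingWrapper:
    """Hypothetical wrapper that keeps the last saved or loaded value in memory."""

    def __init__(self, dataset: SlowStorageDataset) -> None:
        self._dataset = dataset
        self._cache: Any = _EMPTY

    def save(self, data: Any) -> None:
        # Write through to the wrapped dataset and remember the value.
        self._dataset.save(data)
        self._cache = data

    def load(self) -> Any:
        # Touch the wrapped dataset only when nothing is cached yet.
        if self._cache is _EMPTY:
            self._cache = self._dataset.load()
        return self._cache


slow = SlowStorageDataset()
cached = CachingWrapper(slow)
cached.save([1, 2, 3])
first = cached.load()
second = cached.load()
```

Both loads are served from memory, so slow.read_count stays at 0 after the save.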

kedro.io.DataCatalog([data_sets, feed_dict, ...])

DataCatalog stores instances of AbstractDataset implementations to provide load and save capabilities from anywhere in the program.
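A rough sketch of the registry idea, assuming only that each stored object exposes load() and save(). ValueDataset, SketchCatalog and sort_step are hypothetical names; the real DataCatalog constructor takes more arguments than this stand-in.

```python
from typing import Any, Dict


class ValueDataset:
    """Hypothetical dataset that holds a single Python object."""

    def __init__(self, data: Any = None) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data

    def save(self, data: Any) -> None:
        self._data = data


class SketchCatalog:
    """Hypothetical registry mapping names to dataset instances."""

    def __init__(self, datasets: Dict[str, ValueDataset]) -> None:
        self._datasets = dict(datasets)

    def load(self, name: str) -> Any:
        return self._datasets[name].load()

    def save(self, name: str, data: Any) -> None:
        self._datasets[name].save(data)


def sort_step(catalog: SketchCatalog) -> None:
    # Any function that receives the catalog can read and write by name,
    # without knowing how or where the data is stored.
    catalog.save("sorted", sorted(catalog.load("raw")))


catalog = SketchCatalog({"raw": ValueDataset([3, 1, 2]), "sorted": ValueDataset()})
sort_step(catalog)
result = catalog.load("sorted")
```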

kedro.io.IncrementalDataSet

alias of IncrementalDataset

kedro.io.IncrementalDataset(path, dataset[, ...])

IncrementalDataset inherits from PartitionedDataset, which loads and saves partitioned file-like data using the underlying dataset definition.

kedro.io.LambdaDataSet

alias of LambdaDataset

kedro.io.LambdaDataset(load, save[, exists, ...])

LambdaDataset loads and saves data through the load and save callables supplied by the user.
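A minimal sketch of a dataset driven by user-supplied callables, mirroring the parameter names load, save and exists from the signature above. CallableDataset is a hypothetical stand-in rather than Kedro's class, and returning False when no exists callable is given is an assumption of this sketch.

```python
from typing import Any, Callable, Dict, Optional


class CallableDataset:
    """Hypothetical stand-in that delegates every operation to a callable."""

    def __init__(
        self,
        load: Callable[[], Any],
        save: Callable[[Any], None],
        exists: Optional[Callable[[], bool]] = None,
    ) -> None:
        self._load = load
        self._save = save
        self._exists = exists

    def load(self) -> Any:
        return self._load()

    def save(self, data: Any) -> None:
        self._save(data)

    def exists(self) -> bool:
        # Assumption of this sketch: report False when no check was supplied.
        return self._exists() if self._exists is not None else False


store: Dict[str, Any] = {}
lambda_like = CallableDataset(
    load=lambda: store["value"],
    save=lambda data: store.update(value=data),
    exists=lambda: "value" in store,
)

exists_before = lambda_like.exists()
lambda_like.save(42)
exists_after = lambda_like.exists()
value = lambda_like.load()
```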

kedro.io.MemoryDataSet

alias of MemoryDataset

kedro.io.MemoryDataset([data, copy_mode, ...])

MemoryDataset loads and saves data from/to an in-memory Python object.

kedro.io.PartitionedDataSet

alias of PartitionedDataset

kedro.io.PartitionedDataset(path, dataset[, ...])

PartitionedDataset loads and saves partitioned file-like data using the underlying dataset definition.
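One way to picture partitioned loading and saving is a directory with one file per partition, where each file is read and written by the same underlying definition. In the sketch below that definition is the read_text and write_text pair. SketchPartitionedDataset and its suffix parameter are hypothetical, and returning one lazy loader per partition is a design choice of this sketch, not a statement about Kedro's return type.

```python
import tempfile
from pathlib import Path
from typing import Callable, Dict


def read_text(path: Path) -> str:
    """Underlying load definition applied to every partition."""
    return path.read_text(encoding="utf-8")


def write_text(path: Path, data: str) -> None:
    """Underlying save definition applied to every partition."""
    path.write_text(data, encoding="utf-8")


class SketchPartitionedDataset:
    """Hypothetical stand-in: one file per partition under a directory."""

    def __init__(self, path: str, suffix: str = ".txt") -> None:
        self._path = Path(path)
        self._suffix = suffix

    def save(self, partitions: Dict[str, str]) -> None:
        self._path.mkdir(parents=True, exist_ok=True)
        for partition_id, data in partitions.items():
            write_text(self._path / f"{partition_id}{self._suffix}", data)

    def load(self) -> Dict[str, Callable[[], str]]:
        # The default argument binds each loader to its own file, so data is
        # read only when that loader is called.
        return {
            file.stem: (lambda file=file: read_text(file))
            for file in sorted(self._path.glob(f"*{self._suffix}"))
        }


with tempfile.TemporaryDirectory() as tmp_dir:
    partitioned = SketchPartitionedDataset(tmp_dir)
    partitioned.save({"2023-01": "january", "2023-02": "february"})
    loaders = partitioned.load()
    partition_ids = list(loaders)
    contents = {pid: loader() for pid, loader in loaders.items()}
```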

kedro.io.Version(load, save)

This namedtuple is used to provide load and save versions for versioned datasets.
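Since Version is described as a namedtuple with the fields load and save, its behaviour can be sketched with collections.namedtuple. The stand-in below mirrors the documented signature so that the snippet runs without Kedro installed; the version strings "v1" and "v2" are hypothetical placeholders, not a real version format.

```python
from collections import namedtuple

# Stand-in with the documented field order (load, save); with Kedro installed
# the class would be imported from kedro.io instead.
Version = namedtuple("Version", ["load", "save"])

# Hypothetical placeholder version identifiers.
version = Version(load="v1", save="v2")

# Fields are available by name and by tuple unpacking.
load_version, save_version = version
```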

Exceptions

kedro.io.DataSetAlreadyExistsError

alias of DatasetAlreadyExistsError

kedro.io.DataSetError

alias of DatasetError

kedro.io.DataSetNotFoundError

alias of DatasetNotFoundError

kedro.io.DatasetAlreadyExistsError

DatasetAlreadyExistsError is raised by the DataCatalog class when trying to add a dataset that already exists in the DataCatalog.

kedro.io.DatasetError

DatasetError is raised by AbstractDataset implementations when an input/output method fails.

kedro.io.DatasetNotFoundError

DatasetNotFoundError is raised by the DataCatalog class when trying to use a dataset that does not exist.
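The three error conditions above can be sketched with locally defined stand-ins. The exception classes below reuse the documented names but are defined here, not imported from Kedro. Making the two catalog errors subclasses of DatasetError is an assumption of this sketch, and FailingDataset and ErrorDemoCatalog are hypothetical.

```python
class DatasetError(Exception):
    """Stand-in: failure of a dataset's input/output method."""


class DatasetNotFoundError(DatasetError):
    """Stand-in: the requested dataset is not registered."""


class DatasetAlreadyExistsError(DatasetError):
    """Stand-in: a dataset with this name is already registered."""


class FailingDataset:
    """Hypothetical dataset whose load always fails at the I/O layer."""

    def load(self):
        try:
            raise OSError("disk unavailable")
        except OSError as exc:
            # The dataset implementation wraps the low-level failure.
            raise DatasetError("Failed while loading data") from exc


class ErrorDemoCatalog:
    """Hypothetical registry that reports duplicate and missing names."""

    def __init__(self) -> None:
        self._datasets = {}

    def add(self, name, dataset) -> None:
        if name in self._datasets:
            raise DatasetAlreadyExistsError(f"Dataset '{name}' is already registered")
        self._datasets[name] = dataset

    def load(self, name):
        if name not in self._datasets:
            raise DatasetNotFoundError(f"Dataset '{name}' not found")
        return self._datasets[name].load()


demo_catalog = ErrorDemoCatalog()
demo_catalog.add("broken", FailingDataset())

outcomes = []
for action in (
    lambda: demo_catalog.add("broken", FailingDataset()),  # duplicate name
    lambda: demo_catalog.load("missing"),                  # unknown name
    lambda: demo_catalog.load("broken"),                   # I/O failure
):
    try:
        action()
    except DatasetError as exc:
        outcomes.append(type(exc).__name__)
```

Under the subclassing assumption, a single except DatasetError clause handles all three cases, while callers that need to tell them apart can catch the more specific classes first.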