data module

This module provides dataset generators, based on keras.utils.Sequence, for the three main cases in image denoising:

  1. Clean Dataset: Only clean (ground-truth) images are available. Hence, you need to specify a function to corrupt the clean images, so that you can train your network with pairs (noisy, clean). In the section Artificial Noises, we cover built-in functions for adding noise to clean images.
  2. Full Dataset: Both clean and noisy images are available. In that case, the dataset yields the image pairs (clean, noisy).
  3. Blind Dataset: Only noisy images are available. These datasets can be used for qualitative evaluation.

We remark that these classes can be used for training and inference in our Benchmark. It is also noteworthy that MatlabModel objects, which are based on the Matlab Deep Learning Toolbox, need to use the MatlabDatasetWrapper to be trained efficiently. For more information, see the data.MatlabDatasetWrapper and model.MatlabModel class documentation.

Along with these classes, we also provide built-in functionalities for preprocessing, such as Data Augmentation and patch extraction. These are covered in the Preprocessing Functions section.

Data generation

class data.AbstractDatasetGenerator(path, batch_size=32, shuffle=True, name='AbstractDataset', n_channels=1)[source]

Bases: keras.utils.data_utils.Sequence

Dataset generator based on the Keras library. Implementation based on https://stanford.edu/~shervine/blog/keras-how-to-generate-data-on-the-fly

path

String containing the path to image files directory.

Type:str
batch_size

Size of image batch.

Type:int
n_channels

1 for grayscale, 3 for RGB.

Type:int
shuffle

Whether to shuffle the dataset at each epoch or not.

Type:bool
channels_first

Whether data is formatted as (BatchSize, Channels, Height, Width) (True) or (BatchSize, Height, Width, Channels) (False).

Type:bool
name

String containing the dataset’s name.

Type:str
__init__(self, path, batch_size=32, shuffle=True, name='AbstractDataset', n_channels=1)[source]

Initialize self. See help(type(self)) for accurate signature.

__len__(self)[source]

Number of batches per epoch

__str__(self)[source]

Returns the dataset name.

on_epoch_end(self)[source]

Defines and shuffles indexes on epoch end
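
The batching logic these generators inherit can be sketched without Keras. The following stand-alone illustration (the class name and data are hypothetical, not part of the library) mimics how __len__, __getitem__ and on_epoch_end cooperate:

```python
import numpy as np

class MiniDatasetGenerator:
    """Illustrative stand-in for AbstractDatasetGenerator's batching logic."""

    def __init__(self, n_samples, batch_size=32, shuffle=True):
        self.n_samples = n_samples
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        # Number of complete batches per epoch (remainder is dropped).
        return self.n_samples // self.batch_size

    def __getitem__(self, i):
        # Sample indexes of the i-th batch; a real generator would
        # load and preprocess the corresponding images here.
        return self.indexes[i * self.batch_size:(i + 1) * self.batch_size]

    def on_epoch_end(self):
        # Redefine (and optionally shuffle) sample indexes after each epoch.
        self.indexes = np.arange(self.n_samples)
        if self.shuffle:
            np.random.shuffle(self.indexes)

gen = MiniDatasetGenerator(100, batch_size=32, shuffle=False)
print(len(gen))     # 3
print(gen[1][:3])   # first indexes of the second batch: 32, 33, 34
```

Keras calls on_epoch_end automatically between epochs, so with shuffle=True the composition of each batch changes at every pass over the data.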

class data.BlindDatasetGenerator(path, batch_size=32, shuffle=True, name='CleanDataset', n_channels=1, preprocessing=None, target_fcn=None)[source]

Bases: OpenDenoising.data.abstract_dataset.AbstractDatasetGenerator

Dataset generator based on Keras library. This class is used for Blind denoising problems, where only noisy images are available.

path

String containing the path to image files directory.

Type:str
batch_size

Size of image batch.

Type:int
n_channels

1 for grayscale, 3 for RGB.

Type:int
shuffle

Whether to shuffle the dataset at each epoch or not.

Type:bool
name

String containing the dataset’s name.

Type:str
preprocessing

List of preprocessing functions, which will be applied to each image.

Type:list
target_fcn

Function implementing how to generate target images from noisy ones.

Type:function
__getitem__(self, i)[source]

Generates image batches from filenames.

Parameters:i (int) – Batch index to get.
Returns:
  • inp (numpy.ndarray) – Batch of noisy images.
  • ref (numpy.ndarray) – Batch of target images.
__init__(self, path, batch_size=32, shuffle=True, name='CleanDataset', n_channels=1, preprocessing=None, target_fcn=None)[source]

Initialize self. See help(type(self)) for accurate signature.

__next__(self)[source]

Returns image batches sequentially.

__repr__(self)[source]

Return repr(self).

class data.CleanDatasetGenerator(path, batch_size=32, noise_config=None, shuffle=True, name='CleanDataset', n_channels=1, preprocessing=None)[source]

Bases: OpenDenoising.data.abstract_dataset.AbstractDatasetGenerator

Dataset generator based on Keras library. This class is used for non-blind denoising problems where only clean images are available. To use such dataset to train denoising networks, you need to specify a type of artificial noise that will be added to each clean image.

path

String containing the path to image files directory.

Type:str
batch_size

Size of image batch.

Type:int
n_channels

1 for grayscale, 3 for RGB.

Type:int
shuffle

Whether to shuffle the dataset at each epoch or not.

Type:bool
channels_first

Whether data is formatted as (BatchSize, Channels, Height, Width) (True) or (BatchSize, Height, Width, Channels) (False).

Type:bool
name

String containing the dataset’s name.

Type:str
preprocessing

List of preprocessing functions, which will be applied to each image.

Type:list
noise_config

Dictionary whose keys are functions implementing the noise process, and the value is a list containing the noise function parameters. If you do not want to specify any parameters, your list should be empty.

Type:dict

Examples

The following example corresponds to a Dataset Generator which reads images from “./images”, yields batches of length 32, and applies Gaussian noise with intensity drawn uniformly from the range [0, 55], followed by “Super Resolution noise” of intensity 4. Moreover, the dataset shuffles the data, yields it in NHWC format, and does not apply any preprocessing function. NOTE: each parameter list should follow the same order as the corresponding noise function’s arguments.

>>> from OpenDenoising import data
>>> noise_config = {data.utils.gaussian_blind_noise: [0, 55],
...                 data.utils.super_resolution_noise: [4]}
>>> datagen = data.CleanDatasetGenerator("./images", 32, noise_config, True, "MyData", 1, None)
__getitem__(self, i)[source]

Generates batches of data.

__init__(self, path, batch_size=32, noise_config=None, shuffle=True, name='CleanDataset', n_channels=1, preprocessing=None)[source]

Initialize self. See help(type(self)) for accurate signature.

__repr__(self)[source]

Return repr(self).

__str__(self)[source]

Returns the dataset name.

class data.FullDatasetGenerator(path, batch_size=32, shuffle=True, name='FullDataset', n_channels=1, preprocessing=None)[source]

Bases: OpenDenoising.data.abstract_dataset.AbstractDatasetGenerator

Dataset generator based on the Keras library. This class is used for non-blind denoising problems. Unlike the CleanDatasetGenerator class, this class corresponds to the case where both clean and noisy samples are available and paired (for each noisy image, there is one and only one clean image with the same filename).

path

String containing the path to image files directory.

Type:str
batch_size

Size of image batch.

Type:int
n_channels

1 for grayscale, 3 for RGB.

Type:int
shuffle

Whether to shuffle the dataset at each epoch or not.

Type:bool
name

String containing the dataset’s name.

Type:str
preprocessing

List of preprocessing functions, which will be applied to each image.

Type:list
__getitem__(self, i)[source]

Generate batches of data

__init__(self, path, batch_size=32, shuffle=True, name='FullDataset', n_channels=1, preprocessing=None)[source]

Initialize self. See help(type(self)) for accurate signature.

__next__(self)[source]

Returns image batches sequentially.

__repr__(self)[source]

Return repr(self).

class data.MatlabDatasetWrapper(engine, images_path='./tmp/BSDS500/Train/ref', partition='Train', ext=None, patch_size=40, n_patches=16, noiseFcn="@(I) imnoise(I, 'gaussian', 0, 25/255)", channel_format='grayscale', type='Clean')[source]

Bases: object

This class wraps the FullMatlabDataset and CleanMatlabDataset classes. It makes internal calls to Matlab through Matlab’s Python engine to load one of these classes into the workspace.

Notes

This class does not implement the interface provided by keras.utils.Sequence. It should only be used alongside MatlabModel objects.

engine

Matlab engine instance.

Type:matlab.engine.MatlabEngine
images_path

Path to the directory containing the dataset’s images.

Type:str
partition

String containing the dataset partition (‘Train’, ‘Valid’ or ‘Test’).

Type:str
ext

Dictionary holding image extensions. Note that only the dictionary’s keys are used.

Type:dict
patch_size

Size of patches to be extracted.

Type:int
n_patches

Number of patches to be extracted from each image.

Type:int
noiseFcn

For CleanDataset only. Specifies the noising function that will be applied to images. It should be a function that accepts an image as input and returns another image. If you need to specify parameters, you can do so by using Matlab’s anonymous function syntax (for instance, noiseFcn=”@(I) imnoise(I, ‘gaussian’, 0, 25/255)”).

Type:str
channel_format

String containing ‘grayscale’ for grayscale images, or ‘RGB’ for RGB images.

Type:str
type

String containing Clean (for CleanDataset) or Full (for FullDataset).

Type:str

See also

model.MatlabModel
the type of model with which this class was designed to interact.
__call__(self)[source]

Call self as a function.

__init__(self, engine, images_path='./tmp/BSDS500/Train/ref', partition='Train', ext=None, patch_size=40, n_patches=16, noiseFcn="@(I) imnoise(I, 'gaussian', 0, 25/255)", channel_format='grayscale', type='Clean')[source]

Initialize self. See help(type(self)) for accurate signature.

Artificial Noises

data.gaussian_noise(ref, noise_level=15)[source]

Gaussian noise

Parameters:
  • ref (numpy.ndarray) – Image to be noised.
  • noise_level (int) – Level of corruption. Always give the noise_level in terms of the 0-255 pixel intensity range.
Returns:

inp – Noised image.

Return type:

numpy.ndarray
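
A plausible NumPy implementation of such a corruption, assuming images normalized to the [0, 1] range (a sketch; the library’s actual code may differ, e.g. in clipping or dtype handling):

```python
import numpy as np

def gaussian_noise(ref, noise_level=15):
    """Add zero-mean Gaussian noise; noise_level is the standard deviation
    expressed on the 0-255 scale, rescaled here for [0, 1] images."""
    noise = np.random.normal(0.0, noise_level / 255.0, ref.shape)
    inp = np.clip(ref + noise, 0.0, 1.0)  # keep pixels in the valid range
    return inp.astype(ref.dtype)

clean = np.full((8, 8, 1), 0.5)
noisy = gaussian_noise(clean, noise_level=25)
print(noisy.shape)  # (8, 8, 1)
```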

data.poisson_noise(ref)[source]

Poisson noise

Parameters:ref (numpy.ndarray) – Image to be noised.
Returns:inp – Noised image.
Return type:numpy.ndarray
data.salt_and_pepper_noise(ref, noise_level=15)[source]

Salt and pepper noise

Parameters:
  • ref (numpy.ndarray) – Image to be noised.
  • noise_level (int) – Level of corruption.
Returns:

inp – Noised image.

Return type:

numpy.ndarray
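
For reference, salt-and-pepper corruption can be sketched in NumPy as follows, assuming noise_level is interpreted as the percentage of corrupted pixels and images lie in [0, 1] (assumptions, not confirmed by the library):

```python
import numpy as np

def salt_and_pepper_noise(ref, noise_level=15):
    """Set a fraction noise_level/100 of pixels to pepper (0) or salt (1)."""
    inp = ref.copy()
    p = noise_level / 100.0
    mask = np.random.rand(*ref.shape)
    inp[mask < p / 2] = 0.0       # pepper: darkest value
    inp[mask > 1 - p / 2] = 1.0   # salt: brightest value
    return inp
```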

data.speckle_noise(ref, noise_level=15)[source]

Speckle noise

Parameters:
  • ref (numpy.ndarray) – Image to be noised.
  • noise_level (int) – Level of corruption.
Returns:

inp – Noised image.

Return type:

numpy.ndarray

data.super_resolution_noise(ref, noise_level=2)[source]

Noise due to down-sampling followed by up-sampling an image

Parameters:
  • ref (numpy.ndarray) – Image to be noised.
  • noise_level (int) – Scaling factor. For instance, for a (512, 512) image, a factor of 2 down-samples the image to (256, 256), then up-samples it again to (512, 512).
Returns:

inp – Noised image.

Return type:

numpy.ndarray
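
The down-sample/up-sample process can be sketched in NumPy with simple pixel-skipping and repetition (the library may instead use proper interpolation; this is only an illustration of the principle):

```python
import numpy as np

def super_resolution_noise(ref, noise_level=2):
    """Down-sample by keeping every noise_level-th pixel, then up-sample
    by pixel repetition, degrading high-frequency detail."""
    small = ref[::noise_level, ::noise_level]  # e.g. (512, 512) -> (256, 256)
    inp = np.repeat(np.repeat(small, noise_level, axis=0), noise_level, axis=1)
    return inp[:ref.shape[0], :ref.shape[1]]   # crop back to the original size

ref = np.arange(16.0).reshape(4, 4)
out = super_resolution_noise(ref, noise_level=2)
print(out.shape)  # (4, 4)
```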

Preprocessing Functions

data.dncnn_augmentation(inp, ref=None, aug_times=1, channels_first=False)[source]

Data augmentation policy employed on DnCNN

Parameters:
  • inp (numpy.ndarray) – Noised image.
  • ref (numpy.ndarray) – Ground-truth images.
  • aug_times (int) – Number of times augmentation is applied.
Returns:

data.gen_patches(inp, ref, patch_size, channels_first=False, mode='sequential', n_patches=-1)[source]

Patch generation function.

Parameters:
  • inp (numpy.ndarray) – Noised image from which patches will be extracted.
  • ref (numpy.ndarray) – Reference image from which patches will be extracted.
  • patch_size (int) – Size of patch window (number of pixels in each axis).
  • channels_first (bool) – Whether data is formatted as NCHW (True) or NHWC (False).
  • mode (str) – One between {‘sequential’, ‘random’}. If mode = ‘sequential’, extracts patches sequentially on each axis. If mode = ‘random’, extracts patches randomly.
  • n_patches (int) –

    Number of patches to be extracted from the image. Should be specified only if mode = ‘random’. If not specified, then,

    \[n\_patches = \dfrac{h \times w}{patch_{size}^{2}}\]
Returns:
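
The sequential mode can be sketched in NumPy as follows (a simplified, hypothetical version operating on a single HWC image pair rather than a batch, and extracting non-overlapping patches):

```python
import numpy as np

def gen_patches(inp, ref, patch_size):
    """Extract non-overlapping patches sequentially along each spatial axis
    of an HWC image pair; partial patches at the borders are dropped."""
    h, w = inp.shape[:2]
    inp_patches, ref_patches = [], []
    for i in range(0, h - patch_size + 1, patch_size):
        for j in range(0, w - patch_size + 1, patch_size):
            inp_patches.append(inp[i:i + patch_size, j:j + patch_size])
            ref_patches.append(ref[i:i + patch_size, j:j + patch_size])
    return np.stack(inp_patches), np.stack(ref_patches)

img = np.zeros((80, 120, 1))
p_inp, p_ref = gen_patches(img, img, patch_size=40)
print(p_inp.shape)  # (6, 40, 40, 1): (80 * 120) / 40**2 = 6 patches
```

This matches the default patch count above: for an h × w image, sequential extraction yields h·w / patch_size² patches.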