How to use PatchesSelection and PatchesExtraction applications in OTBTF

Could you point me to any tutorials or documentation that would help me better understand how PatchesSelection and PatchesExtraction are to be used to extract the ground truth from the images and create patches?

Here’s the documentation provided: Sampling - OTBTF

Hi srikar,

If you want, I’ve made a tutorial that explains how to use OTBTF with Docker and how to produce a land-use classification with a deep learning method. At the 20-minute mark I explain how to use PatchesExtraction. Sorry, the tutorial is in French, but you can turn on the automatic subtitles, which work relatively well.
The link : Tutoriel : Classification d'occupation du sol deep learning avec Docker et OrfeoToolbox Tensorflow - YouTube

Best regards,



That’s great Adrien. I will follow the tutorial along and report back with an update. Thank you so much for making the tutorial !!


Hey Adrien,

The model that you build uses TensorFlow 1, right? Are otbcli_TensorflowModelTrain and otbcli_TensorflowModelServe made to work only with the TensorFlow 1 model paradigm (placeholders, etc.)? Our team has created patches and label files using PatchesExtraction, but we are planning to build the model in TF 2 and use OTBTF accordingly. Have you tried building the model in TF 2? Do you have any pointers on how to move forward with OTBTF using TF 2?

Hello guys,

Like I mentioned in another post, there is a tutorial in the otbtf doc explaining how to build/train a model with TF v2 (keras). There are only benefits from this: do everything from python, cleaner/simpler code, and native distributed training as bonus. You can then use otbcli_TensorflowModelServe with the savedmodel as always (…from python if you prefer), it works with TF v1 and TF v2 models.


Hey Remi,

Thank you for your reply. Our team followed the prescribed tutorial, but we hit a ValueError while training the tutorial model: the output of the model is incompatible with the target (one-hot encoded with dataset_preprocessing_fn).

ValueError: Shapes (8, 1, 1, 20) and (8, 64, 64, 20) are incompatible

FYI, we have 20 classes and an 8-band satellite image. First, we created the patches images and the corresponding labels files using PatchesExtraction. We then created the TFRecords from the patches images as prescribed in the tutorial, using DatasetFromPatchesImages and to_tfrecords, and divided them into train, test, and validation sets. We then built exactly the same model as in the tutorial. When we try to train it, it raises a ValueError saying that the output of the model, of shape (8, 64, 64, 20), is not compatible with the target, of shape (8, 1, 1, 20). Our patches are 64x64; we also tried patch sizes of 32, 16, 1, etc. Whatever the patch size, we get a similar ValueError.

We are not sure if the prescribed model is missing something (maybe another layer) where it is not churning out the required output or if we have to adjust the one-hot encoding preprocessing step of the target. Please, guide us through this block. Feel free to ask for more details if needed.
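As a side note, the mismatch can be reproduced with plain shape arithmetic, without TensorFlow. The sketch below assumes the tutorial model uses four stride-2 convolutions followed by four stride-2 transposed convolutions, all with "same" padding, and that the label patches are 1x1x1; the helper names are hypothetical.

```python
import math

N_CLASSES = 20
BATCH = 8

def conv_same(size, stride=2):
    """Spatial size after a 'same'-padded strided convolution."""
    return math.ceil(size / stride)

def tconv_same(size, stride=2):
    """Spatial size after a 'same'-padded strided transposed convolution."""
    return size * stride

def prediction_shape(patch_size, depth=4):
    """Shape of the model output for a square input patch (assumed U-Net-like layout)."""
    size = patch_size
    for _ in range(depth):
        size = conv_same(size)
    for _ in range(depth):
        size = tconv_same(size)
    return (BATCH, size, size, N_CLASSES)

def target_shape(label_patch_shape):
    """Shape of the one-hot target: the channel axis is squeezed, then N_CLASSES is appended."""
    height, width, _ = label_patch_shape
    return (BATCH, height, width, N_CLASSES)

for patch_size in (64, 32, 16):
    print(patch_size, prediction_shape(patch_size), target_shape((1, 1, 1)))
```

Under these assumptions the prediction always keeps the spatial size of the input patch, while the target stays at 1x1 as long as the label patches are 1x1, which would explain why changing the patch size alone does not remove the error.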

Below is the training script we have used and the full error:

import argparse
from pathlib import Path
import tensorflow as tf
import os
from otbtf.model import ModelBase
from otbtf import DatasetFromPatchesImages, TFRecords

# Implementation of a small U-Net like model

# Number of classes estimated by the model (20 in our case)
N_CLASSES = 20

# Name of the input in the `FCNNModel` instance, also name of the input node
# in the SavedModel
INPUT_NAME = "input_xs"

# Name of the output in the `FCNNModel` instance
TARGET_NAME = "predictions"

# Name (prefix) of the output node in the SavedModel
OUTPUT_SOFTMAX_NAME = "predictions_softmax_tensor"

class FCNNModel(ModelBase):
    """
    A simple fully convolutional U-Net-like model
    """

    def normalize_inputs(self, inputs: dict):
        """
        Inherits from `ModelBase`

        The model will use this function internally to normalize its inputs,
        before applying `get_outputs()` that actually builds the operations
        graph (convolutions, etc). This function will hence work at training
        time and inference time.

        In this example, we assume that we have an input 12 bits multispectral
        image with values ranging from [0, 10000], that we process using a
        simple stretch to roughly match the [0, 1] range.

        Params:
            inputs: dict of inputs

        Returns:
            dict of normalized inputs, ready to be used from `get_outputs()`
        """
        return {INPUT_NAME: tf.cast(inputs[INPUT_NAME], tf.float32) * 0.0001}

    def get_outputs(self, normalized_inputs: dict) -> dict:
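        """
        Inherits from `ModelBase`

        This small model produces an output which has the same physical
        spacing as the input. The model generates [1 x 1 x N_CLASSES] output
        pixel for [32 x 32 x <nb channels>] input pixels.

        Params:
            normalized_inputs: dict of normalized inputs

        Returns:
            dict of model outputs
        """

        norm_inp = normalized_inputs[INPUT_NAME]

        # Layer arguments below are assumed to match the OTBTF FCNN tutorial
        def _conv(inp, depth, name):
            conv_op = tf.keras.layers.Conv2D(
                filters=depth, kernel_size=3, strides=2,
                activation="relu", padding="same", name=name
            )
            return conv_op(inp)

        def _tconv(inp, depth, name, activation="relu"):
            tconv_op = tf.keras.layers.Conv2DTranspose(
                filters=depth, kernel_size=3, strides=2,
                activation=activation, padding="same", name=name
            )
            return tconv_op(inp)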

        out_conv1 = _conv(norm_inp, 16, "conv1")
        out_conv2 = _conv(out_conv1, 32, "conv2")
        out_conv3 = _conv(out_conv2, 64, "conv3")
        out_conv4 = _conv(out_conv3, 64, "conv4")
        out_tconv1 = _tconv(out_conv4, 64, "tconv1") + out_conv3
        out_tconv2 = _tconv(out_tconv1, 32, "tconv2") + out_conv2
        out_tconv3 = _tconv(out_tconv2, 16, "tconv3") + out_conv1
        out_tconv4 = _tconv(out_tconv3, N_CLASSES, "classifier", None)

        # Generally it is a good thing to name the final layers of the network
        # (i.e. the layers whose outputs are returned from
        # `MyModel.get_outputs()`). Indeed this makes it possible to retrieve
        # them at inference time, using their name. In case you forgot to name
        # the last layers, it is still possible to look at the model outputs
        # using the `saved_model_cli show --dir /path/to/your/savedmodel --all`
        # command.
        # Do not confuse **the name of the output layers** (i.e. the "name"
        # property of the tf.keras.layer that is used to generate an output
        # tensor) and **the key of the output tensor**, in the dict returned
        # from `MyModel.get_outputs()`. They are two identifiers with a
        # different purpose:
        #  - the output layer name is used only at inference time, to identify
        #    the output tensor from which generate the output image,
        #  - the output tensor key identifies the output tensors, mainly to
        #    fit the targets to model outputs during training process, but it
        #    can also be used to access the tensors as tf/keras objects, for
        #    instance to display previews images in TensorBoard.
        softmax_op = tf.keras.layers.Softmax(name=OUTPUT_SOFTMAX_NAME)
        predictions = softmax_op(out_tconv4)

        return {TARGET_NAME: predictions}

def dataset_preprocessing_fn(examples: dict):
    """
    Preprocessing function for the training dataset.
    This function is only used at training time, to put the data in the
    expected format for the training step.
    Do not use this function to normalize the inputs (see
    `otbtf.ModelBase.normalize_inputs` for that).
    Note that this function is not called here, but in the code that prepares
    the datasets.

    Params:
        examples: dict for examples (i.e. inputs and targets stored in a single
            dict)

    Returns:
        preprocessed examples
    """
    return {
        INPUT_NAME: examples["input_xs_patches"],
        TARGET_NAME: tf.one_hot(
            tf.squeeze(tf.cast(examples["labels_patches"], tf.int32), axis=-1),
            depth=N_CLASSES
        )
    }

def train(model_dir, batch_size, learning_rate, nb_epochs, ds_train, ds_valid, ds_test):
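    """
    Create, train, and save the model.

    Params:
        model_dir: output directory of the SavedModel
        batch_size: batch size
        learning_rate: learning rate of the optimizer
        nb_epochs: number of training epochs
        ds_train: training dataset
        ds_valid: validation dataset
        ds_test: testing dataset (can be None)
    """

    strategy = tf.distribute.MirroredStrategy()  # For single or multi-GPUs
    with strategy.scope():
        # Model instantiation. Note that the normalize_fn is now part of the
        # model. It is mandatory to instantiate the model inside the strategy
        # scope.
        model = FCNNModel(dataset_element_spec=ds_train.element_spec)

        # Compile the model (loss and optimizer assumed to follow the OTBTF
        # tutorial; the traceback below confirms categorical cross-entropy)
        model.compile(
            loss=tf.keras.losses.CategoricalCrossentropy(),
            optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
            metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()]
        )

        # Summarize the model (in CLI)
        model.summary()

        # Train
        model.fit(ds_train, epochs=nb_epochs, validation_data=ds_valid)

        # Evaluate against test data
        if ds_test is not None:
            model.evaluate(ds_test, batch_size=batch_size)

        # Save trained model as SavedModel
        model.save(model_dir)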

#---Create TFRecords--------------------------------------------------------------------

def create_tfrecords(patches, labels, outdir):
    patches = sorted(patches)
    labels = sorted(labels)
    outdir = Path(outdir)
    if not outdir.exists():
        outdir.mkdir(parents=True)
    # Create a dataset
    dataset = DatasetFromPatchesImages(
        filenames_dict={
            "input_xs_patches": patches,
            "labels_patches": labels
        }
    )
    # Convert the dataset into TFRecords
    dataset.to_tfrecords(output_dir=outdir, drop_remainder=False)

if __name__=="__main__":
    datapath = "/home/otbuser/all/data/"
    batch_size = 8
    learning_rate = 0.0001
    nb_epochs = 5

    # create TFRecords
    patches = ['/home/otbuser/all/data/area2_0530_2022_8bands_norm_patches_A.tif', '/home/otbuser/all/data/area2_0530_2022_8bands_norm_patches_B.tif']
    labels = ['/home/otbuser/all/data/area2_0530_2022_8bands_norm_labels_A.tif', '/home/otbuser/all/data/area2_0530_2022_8bands_norm_labels_B.tif']
    create_tfrecords(patches=patches[0:1], labels=labels[0:1], outdir=datapath+"train")
    create_tfrecords(patches=patches[1:], labels=labels[1:], outdir=datapath+"valid")

    # Train the model and save the model
    train_dir = os.path.join(datapath, "train")
    valid_dir = os.path.join(datapath, "valid")
    test_dir = None  # define the test directory if a test dataset is available
    kwargs = {
        "batch_size": batch_size,
        "target_keys": [TARGET_NAME],
        "preprocessing_fn": dataset_preprocessing_fn
    }
    ds_train = TFRecords(train_dir).read(shuffle_buffer_size=1000, **kwargs)
    ds_valid = TFRecords(valid_dir).read(**kwargs)

    train(datapath+"sandbox_model", batch_size, learning_rate, nb_epochs, ds_train, ds_valid, ds_test=None)


Epoch 1/5
Traceback (most recent call last):
  File "/home/otbuser/all/code/", line 239, in <module>
    train(datapath+"sandbox_model", batch_size, learning_rate, nb_epochs, ds_train, ds_valid, ds_test=None)
  File "/home/otbuser/all/code/", line 184, in train
    model.fit(ds_train, epochs=nb_epochs, validation_data=ds_valid)
  File "/opt/otbtf/lib/python3/dist-packages/keras/utils/", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/tmp/", line 15, in tf__train_function
    retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
ValueError: in user code:

    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 1284, in train_function  *
        return step_function(self, iterator)
    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 1268, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 1249, in run_step  **
        outputs = model.train_step(data)
    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 1051, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 1109, in compute_loss
        return self.compiled_loss(
    File "/opt/otbtf/lib/python3/dist-packages/keras/engine/", line 265, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/opt/otbtf/lib/python3/dist-packages/keras/", line 142, in __call__
        losses = call_fn(y_true, y_pred)
    File "/opt/otbtf/lib/python3/dist-packages/keras/", line 268, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/opt/otbtf/lib/python3/dist-packages/keras/", line 1984, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "/opt/otbtf/lib/python3/dist-packages/keras/", line 5559, in categorical_crossentropy

ValueError: Shapes (8, 1, 1, 20) and (8, 64, 64, 20) are incompatible

Hello @srikar ,

You must train this model with input/output of the same size.
This is a U-Net-like model, that needs input/output of the same size (in your case, 64x64 pixels).

The confusion comes from the fact that PatchesExtraction can extract an additional 1x1 patch carrying the values of the vector data (of the specified field). This is optional, and you don’t always need this 1x1 patch.

  • If you have dense ground truth (i.e. one ground truth patch per input patch): to extract two sets of patches of the same size, you must use PatchesExtraction with the environment variable OTB_TF_NSOURCES set to 2 (one source for the input, the other for the labels).

  • If your ground truth is sparse (i.e. one label value per patch), you can build a fully convolutional model that outputs a 1x1 label for a 64x64 input (use convolutions without padding to shrink the output size, layer after layer), and use PatchesExtraction with 64x64 patches for the input and 1x1 patches for the labels.

Hello @remi.cresson,

Thank you so much for your prompt response and solutions to overcome our block.

We have tried both of your suggested solutions and we need some more help:

Solution 1:
We tried using OTB_TF_NSOURCES = 2 as you suggested, in the code below. However, we did not see any change in the label output dimension, and no extra file was created. We still get only one label of dimension 1x1x1 per patch (64x64x8), whereas the model expects a 64x64 label. How do we get a label output that has the same size as the input? What are we missing or doing wrong here?

def PatchesExtraction(apptype, datapath, input, vec, out_patches, out_labels, patchsize,OTB_TF_NSOURCES = 2):
        app = otbApplication.Registry.CreateApplication(apptype)
        app.SetParameterStringList("source1.il", [datapath + input])
        app.SetParameterString("source1.out", datapath + out_patches) 
        app.SetParameterInt("source1.patchsizex", patchsize)
        app.SetParameterInt("source1.patchsizey", patchsize)
        app.SetParameterString("vec", datapath + vec)
        app.SetParameterString("field", "class")
        app.SetParameterString("outlabels", datapath + out_labels)

Output shapes being generated, as seen in output_shapes.json:

{
    "input_xs_patches": [64, 64, 8],
    "labels_patches": [1, 1, 1]
}

Solution 2:

Also, as suggested by you, we have built a fully convolutional model that outputs a 1x1 label for an input 64x64.

        out_conv5 = _conv(norm_inp, 8, "conv5")
        out_conv1 = _conv(out_conv5, 16, "conv1")
        out_conv2 = _conv(out_conv1, 32, "conv2")
        out_conv3 = _conv(out_conv2, 64, "conv3")
        out_conv4 = _conv(out_conv3, 64, "conv4")
        out_tconv1 = _tconv(out_conv4, 64, "tconv1") + out_conv3
        out_tconv2 = _tconv(out_tconv1, 32, "tconv2") + out_conv2
        out_tconv3 = _tconv(out_tconv2, 16, "tconv3") + out_conv1
        out_tconv5 = _tconv(out_tconv3, 8, "tconv5") + out_conv5
# Replace the transposed convolutions with global average pooling
        gap = tf.keras.layers.GlobalAveragePooling2D(name="global_avg_pool")(out_tconv4)
        gap = tf.expand_dims(gap, axis=1)
        gap = tf.expand_dims(gap, axis=1)
        softmax_op = tf.keras.layers.Softmax(name=OUTPUT_SOFTMAX_NAME)
        predictions = softmax_op(gap)

        return {TARGET_NAME: predictions}

Although the training was successful and a model was built, we are encountering a runtime error when we try to use TensorflowModelServe.

import otbApplication
app = otbApplication.Registry.CreateApplication("TensorflowModelServe")
app.SetParameterStringList("source1.il", ['/home/otbuser/all/data/area2_0530_2022_8bands_norm.tif'])
app.SetParameterInt("source1.rfieldx", 256)
app.SetParameterInt("source1.rfieldy", 256)
app.SetParameterString("source1.placeholder", "input_xs")
app.SetParameterString("model.dir", "/home/otbuser/all/data/sandbox_model")
app.SetParameterStringList("output.names", ["predictions_softmax_tensor"]) 
app.SetParameterInt("output.efieldx", 128)
app.SetParameterInt("output.efieldy", 128)
app.SetParameterString("out", "/home/otbuser/all/data/softmax.tif")
RuntimeError: Exception thrown in otbApplication Application_WriteOutput: /usr/include/ITK-4.13/itkImageConstIterator.h:210:
itk::ERROR: Region ImageRegion (0x7ffcd3f01910)
  Dimension: 2
  Index: [-64, -64]
  Size: [256, 256]
 is outside of buffered region ImageRegion (0x55affcf20e80)
  Dimension: 2
  Index: [0, 0]
  Size: [257, 257]

Also, the fullyconv parameter raises an exception; it is not recognized at all.

Exception: TensorflowModelServe: parameter 'fullyconv' was not recognized. 

Available keys are ('source1', 'source1.il', 'source1.rfieldx', 'source1.rfieldy', 'source1.placeholder', 'model', 'model.dir', 'model.userplaceholders', 'model.fullyconv', 'model.tagsets', 'output', 'output.spcscale', 'output.names', 'output.efieldx', 'output.efieldy', 'optim', 'optim.disabletiling', 'optim.tilesizex', 'optim.tilesizey', 'out')

In conclusion, we seek clarity on how to create label patches with the same dimensions as the input patches, and on the usage of TensorflowModelServe. It is not clear to us how ‘Postprocessing to avoid blocking artifacts’ is done, and the documentation should explain more clearly what each parameter of TensorflowModelServe does.

Thanks in advance,

Yes, you must also set the source2 parameters (il, patchsizex, …). The error message you probably encounter at this point should tell you what is missing.

Also, do not set the outlabels parameter: you don’t need it, because it only produces the 1x1 patches carrying the class value of the vector data field.
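A minimal sketch of what a two-source extraction could look like from Python, under stated assumptions: the label raster and all file names are hypothetical, the helper functions are not part of OTBTF, and OTB_TF_NSOURCES is assumed to be read when the application is created, so it is exported to the process environment first (passing it as a Python function argument would have no effect).

```python
import os

def build_patches_extraction_params(image, labels_raster, vec, out_patches,
                                    out_label_patches, patch_size, field="class"):
    """Parameter values for a two-source PatchesExtraction run (no outlabels)."""
    params = {"vec": vec, "field": field}
    sources = ((1, image, out_patches), (2, labels_raster, out_label_patches))
    for index, source_image, output in sources:
        prefix = f"source{index}"
        params[f"{prefix}.il"] = [source_image]
        params[f"{prefix}.out"] = output
        params[f"{prefix}.patchsizex"] = patch_size
        params[f"{prefix}.patchsizey"] = patch_size
    return params

def run_patches_extraction(params):
    """Apply the parameters to the OTB application (requires an OTBTF install)."""
    # Assumption: the variable must be set before the application is created
    os.environ["OTB_TF_NSOURCES"] = "2"
    import otbApplication  # imported lazily, only available inside OTB/OTBTF
    app = otbApplication.Registry.CreateApplication("PatchesExtraction")
    for key, value in params.items():
        if isinstance(value, list):
            app.SetParameterStringList(key, value)
        elif isinstance(value, int):
            app.SetParameterInt(key, value)
        else:
            app.SetParameterString(key, value)
    app.ExecuteAndWriteOutput()

# Hypothetical file names, for illustration only
params = build_patches_extraction_params(
    image="image.tif",
    labels_raster="labels_raster.tif",
    vec="patch_centers.shp",
    out_patches="xs_patches.tif",
    out_label_patches="labels_patches.tif",
    patch_size=64,
)
for key, value in params.items():
    print(f"{key} = {value}")
```

Calling `run_patches_extraction(params)` inside the OTBTF container would then write two patches images of the same 64x64 size, one per source.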

Solution 2:

Also, as suggested by you, we have built a fully convolutional model that outputs a 1x1 label for an input 64x64.

Your model is not fully convolutional, since a global pooling is applied at the end, which makes the output size always 1x1 whatever the input size. But I believe that you noticed that and commented line 8, which is okay. Now, you could apply this model in a patch-based fashion, for every pixel, by changing source1.rfieldx/y to 64 and output.efieldx/y to 1: this way the model is applied to the same kind of image as it was trained on.
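For illustration, the patch-based settings described above could be gathered as follows. This is only a sketch: the paths and tensor names are the ones quoted earlier in the thread, `source1.il` is assumed to be the key of the input image list, and the helper function is hypothetical.

```python
def patch_based_serve_params(image, model_dir, out, patch_size=64):
    """TensorflowModelServe parameters for a model with a 1x1 output per patch."""
    return {
        "source1.il": [image],
        "source1.rfieldx": patch_size,   # receptive field = training patch size
        "source1.rfieldy": patch_size,
        "source1.placeholder": "input_xs",
        "model.dir": model_dir,
        "output.names": ["predictions_softmax_tensor"],
        "output.efieldx": 1,             # one output pixel per input patch
        "output.efieldy": 1,
        "out": out,
    }

params = patch_based_serve_params(
    image="/home/otbuser/all/data/area2_0530_2022_8bands_norm.tif",
    model_dir="/home/otbuser/all/data/sandbox_model",
    out="/home/otbuser/all/data/softmax.tif",
)
for key, value in params.items():
    print(f"{key} = {value}")
```

Each entry would be passed to the application with the matching setter (SetParameterStringList, SetParameterInt or SetParameterString), as in the script posted earlier.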

This model will be really slow, and you could maybe build an FCN instead: just stack as many convolutions as needed, with stride 1 and no padding (“valid” instead of “same” in the convolution operator), in order to get a 1x1 output from a 64x64 input. Then you could set fullyconv to 1 at inference time!
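To make the arithmetic concrete: with stride 1 and no padding, a k x k convolution removes k - 1 pixels per dimension, so the kernel sizes of the stack must remove 63 pixels in total to go from 64 to 1. The two stacks below are hypothetical examples chosen only to satisfy that constraint, not a recommended architecture.

```python
def valid_conv_output(size, kernel, stride=1):
    """Output size of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1

def stack_output(size, kernels):
    """Output size after a stack of stride-1 'valid' convolutions."""
    for kernel in kernels:
        size = valid_conv_output(size, kernel)
    return size

# Thirty 3x3 layers remove 60 pixels, a final 4x4 layer removes the last 3
deep_stack = [3] * 30 + [4]
# Nine 8x8 layers remove 7 pixels each, 63 in total
shallow_stack = [8] * 9

for kernels in (deep_stack, shallow_stack):
    print(len(kernels), "layers:", stack_output(64, kernels), stack_output(256, kernels))
```

Because such a stack has no global pooling, a larger input simply yields a larger output (input size minus 63 here), which is the property that allows the model to be applied in fully convolutional mode.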