ImageClassifier and ImageStatistics

arijeannin · July 10, 2019, 9:36am

Hi everybody,

Just a simple question, if someone can give me advice:

For training (TrainImagesClassifier), I provide a statistics file computed over the training raster(s); that part is fine. Then I apply ImageClassifier to another raster.
So should I give ImageClassifier the training statistics file, or the statistics file of the new raster, which is only used at the classification stage?

Thanks a lot!
Ari

arijeannin · July 10, 2019, 10:07am

Okay, I got my answer after testing: apparently the same statistics file is needed for training and classifying.

Cedric · July 10, 2019, 10:16am

Hello,

You should use the statistics file that was used for training when classifying new data. The machine learning model has been trained to classify input vectors normalized specifically with these statistics.
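As a rough sketch of what that looks like in practice, the snippet below only builds the three command lines and prints them; nothing is executed. The file names are hypothetical, and the parameter names are written as I recall them from the OTB application documentation, so please check them against your installed version. Other parameters that TrainImagesClassifier needs (class field, classifier choice, and so on) are left out to keep the focus on the statistics file.

```python
import shlex

# Hypothetical file names, chosen only for illustration.
TRAIN_RASTER = "training.tif"
TRAIN_VECTORS = "training_samples.shp"
NEW_RASTER = "new_scene.tif"
STATS_FILE = "training_stats.xml"  # computed once, on the training raster(s)
MODEL_FILE = "model.txt"


def build_commands(stats_file=STATS_FILE):
    """Return the three OTB command lines as argument lists (nothing is run)."""
    compute_stats = [
        "otbcli_ComputeImagesStatistics",
        "-il", TRAIN_RASTER,
        "-out", stats_file,
    ]
    train = [
        "otbcli_TrainImagesClassifier",
        "-io.il", TRAIN_RASTER,
        "-io.vd", TRAIN_VECTORS,
        "-io.imstat", stats_file,
        "-io.out", MODEL_FILE,
    ]
    classify = [
        "otbcli_ImageClassifier",
        "-in", NEW_RASTER,
        "-imstat", stats_file,  # the training statistics, not new ones
        "-model", MODEL_FILE,
        "-out", "labels.tif",
    ]
    return compute_stats, train, classify


if __name__ == "__main__":
    for cmd in build_commands():
        print(shlex.join(cmd))
```

The point of passing a single `stats_file` argument through all three commands is that the statistics are computed once and then travel together with the model.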

To give an example, imagine you have trained a machine learning algorithm using data normalized with statistics computed over a large zone containing many different classes. Then you use the trained model on a raster containing mostly urban areas. The statistics estimated on this raster will be very different from the statistics used for training. If you use them for normalization, the resulting vectors will be very different from the urban-class vectors in the training sample, and the machine learning model will not be able to classify them as urban.
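The scenario above can be reproduced with a small toy sketch. Everything in it is an assumption made for illustration: the single-band pixel values are made up, the normalization is a plain mean/standard-deviation scaling, and a nearest-centroid rule stands in for whatever classifier you actually train. It is not how OTB implements things internally, only a way to see the effect.

```python
from statistics import mean, pstdev


def compute_stats(values):
    """Mean and standard deviation, playing the role of a statistics file."""
    return mean(values), pstdev(values)


def normalize(values, stats):
    m, s = stats
    return [(v - m) / s for v in values]


def train_centroids(samples, stats):
    """Toy 'model': one normalized centroid per class."""
    return {label: mean(normalize(vals, stats)) for label, vals in samples.items()}


def classify(values, stats, centroids):
    """Assign each normalized value to the class with the nearest centroid."""
    return [
        min(centroids, key=lambda label: abs(z - centroids[label]))
        for z in normalize(values, stats)
    ]


# Made-up training pixels from a large zone with several classes.
training = {
    "water": [10, 12, 14],
    "forest": [50, 52, 54],
    "urban": [90, 92, 94],
}
all_training_values = [v for vals in training.values() for v in vals]
train_stats = compute_stats(all_training_values)
centroids = train_centroids(training, train_stats)

# A new raster that contains only urban pixels.
urban_scene = [88, 90, 92, 94, 96]
scene_stats = compute_stats(urban_scene)

with_train_stats = classify(urban_scene, train_stats, centroids)
with_scene_stats = classify(urban_scene, scene_stats, centroids)

print("training statistics:", with_train_stats)
print("new-raster statistics:", with_scene_stats)
```

With the training statistics, every pixel of the urban scene lands near the urban centroid. With the scene's own statistics, the values are re-centred around zero and spread out, so part of the purely urban scene ends up labelled as water or forest.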

In some sense, the statistics file used for training is part of the learned model.

Cédric

arijeannin · July 10, 2019, 10:20am

Hi Cédric, thank you for those very clarifying explanations!