Just a simple question, if someone can give me advice:
I enter a statistics file for the training (TrainImagesClassifier) that is computed over the training raster(s), fine. Then I apply the ImageClassifier on another raster.
So should I enter the training statistics file for the ImageClassifier or should I enter the statistics file of the new raster that is only used at the classifying stage?
Thanks a lot!
Okay, I got my answer after testing: apparently the same statistics file is needed for both training and classifying.
You should use the statistics file used for training when classifying new data. The machine learning model has been trained to classify input vectors normalized specifically with these statistics.
To give an example, imagine you have trained a machine learning algorithm using data normalized with statistics computed over a large zone containing many different classes. Then you use the trained model on a raster containing mostly urban areas. The statistics estimated on this raster will be very different from the statistics used for training. If you use them for normalization, the resulting vectors will be very different from the training sample vectors of the urban class, and the machine learning model will not be able to classify them as urban.
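To make the urban example concrete, here is a small sketch in plain Python. The pixel values are made up, and I am assuming the normalization is a per-band (value - mean) / standard deviation; it is only meant to illustrate the effect, not to reproduce what the applications do internally.

```python
from statistics import mean, pstdev


def band_stats(samples):
    """Return per-band means and standard deviations of a list of pixel vectors."""
    bands = list(zip(*samples))
    return [mean(b) for b in bands], [pstdev(b) for b in bands]


def normalize(pixel, means, stddevs):
    """Center and scale one pixel vector with the given per-band statistics."""
    return [(v - m) / s for v, m, s in zip(pixel, means, stddevs)]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Made-up two-band training pixels covering several classes.
training_pixels = [
    [200.0, 180.0], [210.0, 190.0],  # urban
    [40.0, 120.0], [50.0, 130.0],    # vegetation
    [10.0, 20.0], [20.0, 30.0],      # water
]
# Made-up pixels from a new raster that is almost entirely urban.
urban_raster_pixels = [
    [195.0, 175.0], [205.0, 185.0], [215.0, 195.0], [200.0, 190.0],
]

train_means, train_stds = band_stats(training_pixels)
new_means, new_stds = band_stats(urban_raster_pixels)

# What the model saw for the urban class during training.
urban_training_vector = normalize(training_pixels[0], train_means, train_stds)

# The same new urban pixel, normalized in the two possible ways.
pixel = urban_raster_pixels[1]
with_training_stats = normalize(pixel, train_means, train_stds)
with_new_stats = normalize(pixel, new_means, new_stds)

dist_training_stats = distance(with_training_stats, urban_training_vector)
dist_new_stats = distance(with_new_stats, urban_training_vector)

print("distance using training statistics:", round(dist_training_stats, 3))
print("distance using new raster statistics:", round(dist_new_stats, 3))
```

With the training statistics, the new urban pixel lands close to the urban training vector. With the statistics of the mostly urban raster, the same pixel is pulled towards zero and ends up far from anything the model learned as urban.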
In some sense the statistics file used for training is part of the learned model.
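In practice this means passing the same XML file to both applications. Below is a rough sketch that only builds and prints the three command lines without running them. The file names are hypothetical, and the parameter keys are the ones I know from the command-line applications, so please check them against the documentation of your OTB version.

```python
# Hypothetical file names, for illustration only.
STATS_FILE = "training_stats.xml"
TRAINING_RASTER = "training_raster.tif"
TRAINING_SAMPLES = "training_samples.shp"
NEW_RASTER = "new_raster.tif"
MODEL_FILE = "model.txt"

# 1. Compute the statistics once, over the training raster(s).
compute_stats_cmd = [
    "otbcli_ComputeImagesStatistics",
    "-il", TRAINING_RASTER,
    "-out", STATS_FILE,
]

# 2. Train with those statistics.
train_cmd = [
    "otbcli_TrainImagesClassifier",
    "-io.il", TRAINING_RASTER,
    "-io.vd", TRAINING_SAMPLES,
    "-io.imstat", STATS_FILE,
    "-io.out", MODEL_FILE,
]

# 3. Classify the new raster with the SAME statistics file,
#    not with statistics recomputed on the new raster.
classify_cmd = [
    "otbcli_ImageClassifier",
    "-in", NEW_RASTER,
    "-imstat", STATS_FILE,
    "-model", MODEL_FILE,
    "-out", "classified.tif",
]

for cmd in (compute_stats_cmd, train_cmd, classify_cmd):
    print(" ".join(cmd))
```

Keeping the statistics file next to the model file, as if the two were a single artifact, is an easy way to avoid mixing them up later.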
Hi Cédric, thank you for those very clarifying explanations!