Sample augmentation

n-soltanzade · September 16, 2020, 3:03pm

Training an image classifier using samples created by SMOTE

Hi everyone,
When creating training data with the SMOTE method (sample augmentation), will all of the generated points have the same coordinates (for example, will they all be at (1,1))? If not, please help.
How can such data be used as training data for classification? I am using OTB in QGIS. I did a lot of searching, but unfortunately I did not find any information or tutorial videos on the Internet.
Sorry for my bad English
Thanks in advance

Cedric · September 18, 2020, 12:34pm

Hello,

SampleAugmentation creates a new vector file containing synthetic samples for one of the classes of the input vector file. The important parameters of the application are:

`in: Input Samples`: the input vector file
`out: Output Samples`: the output vector file
`field: Field Name`: the class label field in the input vector
`label: Label of the class to be augmented`: the class that you wish to augment
`exclude: Field names for excluded features`: the features that should not be modified in the output vector

For example, if you have an input vector with points and the following features :

class: the class integer label, with two classes, 1 or 2
value_1 : the first feature to be used during classification
value_2 : the second feature to be used during classification
value_3 : the third feature to be used during classification
info: a field containing some info that is not useful for the classification

Let’s say you want to augment class 2. You would set:

field = "class"
label = 2
exclude = "class info", because you don’t want to modify the class of the augmented samples, and you also don’t want to modify info as it is not used in the classification.
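As a rough sketch, the settings above could be passed to the command-line version of the application like this. The file names `samples.sqlite` and `class2_augmented.sqlite` are made up for illustration, and the `-strategy smote` choice is an assumption based on the question (if it is left out, the application uses its default strategy). The snippet only builds and prints the command; it does not run OTB.

```python
import shlex


def sample_augmentation_cmd(in_vector, out_vector, class_field, label,
                            exclude, strategy="smote", samples=100):
    """Build the otbcli_SampleAugmentation argument list (does not run it)."""
    return [
        "otbcli_SampleAugmentation",
        "-in", in_vector,             # input vector file
        "-out", out_vector,           # output vector file with synthetic samples
        "-field", class_field,        # field holding the class label
        "-label", str(label),         # class to augment
        "-samples", str(samples),     # number of samples to generate
        "-exclude", *exclude,         # fields that must not be modified
        "-strategy", strategy,        # assumed here: smote
    ]


# Hypothetical file names, used only for this example.
cmd = sample_augmentation_cmd(
    "samples.sqlite", "class2_augmented.sqlite",
    class_field="class", label=2, exclude=["class", "info"],
)
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where OTB is installed.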

This will create a new vector file containing 100 (by default) augmented samples for class 2. They will all have the same position, which is the position of the first class 2 sample in the input dataset. This does not matter for the classification step, because classification only uses the feature values of the input samples, not their coordinates.
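To illustrate why identical positions are harmless, here is a minimal sketch of SMOTE-style interpolation in plain Python. It is not OTB's implementation, only the general idea: each synthetic sample gets feature values interpolated between an existing sample and one of its nearest neighbours, while the position is simply copied from the first sample. The dictionary layout, the `smote_like` name, and the toy values are all assumptions made for this example.

```python
import random


def smote_like(samples, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by interpolating feature vectors."""
    rng = random.Random(seed)
    position = samples[0]["xy"]  # every synthetic sample reuses this position
    feats = [s["features"] for s in samples]
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(feats)
        others = [f for f in feats if f is not base]
        # k nearest neighbours of base in feature space (squared distance)
        others.sort(key=lambda f: sum((a - b) ** 2 for a, b in zip(f, base)))
        neighbour = rng.choice(others[:k])
        t = rng.random()  # interpolation factor in [0, 1)
        new = [a + t * (b - a) for a, b in zip(base, neighbour)]
        synthetic.append({"xy": position, "features": new})
    return synthetic


# Toy class 2 samples: (x, y) position plus value_1, value_2, value_3.
class_2 = [
    {"xy": (10.0, 20.0), "features": [0.1, 0.5, 0.9]},
    {"xy": (11.0, 25.0), "features": [0.2, 0.4, 0.8]},
    {"xy": (15.0, 22.0), "features": [0.3, 0.6, 0.7]},
]
augmented = smote_like(class_2, n_new=5)
for s in augmented:
    print(s["xy"], [round(v, 3) for v in s["features"]])
```

All printed positions are the same, but the feature vectors differ, and the features are what the classifier is trained on.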

You can then create another vector for samples of class 1, and provide both vectors to TrainVectorClassifier.
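A possible way to pass both vectors to the training step from the command line is sketched below. The file names and the model path are hypothetical, the feature names are the ones from the example above, and no classifier is specified, so the application's default would be used. As before, the snippet only builds and prints the command.

```python
import shlex


def train_vector_classifier_cmd(vectors, class_field, features, model_path):
    """Build the otbcli_TrainVectorClassifier argument list (does not run it)."""
    return [
        "otbcli_TrainVectorClassifier",
        "-io.vd", *vectors,        # one or more input vector files
        "-cfield", class_field,    # field holding the class label
        "-feat", *features,        # fields used as features for training
        "-io.out", model_path,     # output model file
    ]


# Hypothetical file names, used only for this example.
train_cmd = train_vector_classifier_cmd(
    vectors=["class1_samples.sqlite", "class2_augmented.sqlite"],
    class_field="class",
    features=["value_1", "value_2", "value_3"],
    model_path="model.txt",
)
print(shlex.join(train_cmd))
```

Note that `info` is not listed under `-feat`, since it is not used for classification.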

I hope this clarifies things,
Cédric

n-soltanzade · September 18, 2020, 2:00pm

Thanks for your reply. Your suggestion worked: using TrainVectorClassifier solved the problem.
Best regards