We regularly run into issues when training over large areas with vector-based training data. There are similar posts in the usage section, where the suggested solution is to use gdal_polygonize to convert the tif into a vector layer. However, this seems crazy to me because 1) the trainImageClassifier then converts the vector back to tif internally at some stage, 2) with a large and fragmented raster, the number of polygons is simply too large and the polygonize step (when it works) becomes the bottleneck in terms of processing time, and 3) if you simplify the polygons, converting back and forth might modify the dataset.
I have written some code in my application to train directly from a raster (with different sampling strategies), and I can share it if needed. But I think you already have everything you need to do it.
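To make the idea concrete, here is a minimal sketch of what sampling straight from a label raster could look like. It is not the code from my application and not an existing API: the function name `sample_label_raster`, its parameters, the two strategies ("equal" and "proportional") and the convention that 0 means "no label" are all assumptions chosen for illustration, and it uses only NumPy on an in-memory array.

```python
import numpy as np


def sample_label_raster(labels, nodata=0, strategy="equal", n_samples=100, seed=0):
    """Pick training pixel positions directly from a 2D label raster.

    Hypothetical helper: returns {class_value: array of (row, col) positions}.
    'equal' draws the same number of pixels per class (capped by the
    smallest class); 'proportional' splits n_samples by class area.
    """
    rng = np.random.default_rng(seed)
    classes = [int(c) for c in np.unique(labels) if c != nodata]
    if not classes:
        raise ValueError("label raster contains no labelled pixels")

    # (row, col) positions of every labelled pixel, grouped by class
    positions = {c: np.argwhere(labels == c) for c in classes}
    total = sum(len(p) for p in positions.values())
    smallest = min(len(p) for p in positions.values())

    picked = {}
    for c, pos in positions.items():
        if strategy == "equal":
            k = min(n_samples, smallest)
        elif strategy == "proportional":
            k = min(len(pos), max(1, round(n_samples * len(pos) / total)))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        # Draw k distinct pixels of this class without replacement
        chosen = rng.choice(len(pos), size=k, replace=False)
        picked[c] = pos[chosen]
    return picked
```

With an image array of shape (rows, cols, bands) aligned to the label raster, the feature vectors for a class would then be `image[p[:, 0], p[:, 1]]` where `p` is the position array returned for that class. No polygon is created at any point, so the raster is never simplified or converted back and forth.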