The paper “Improving Land Cover Classification Using Genetic Programming for Feature Construction”, authored by LASIGE’s PhD student João E. Batista, has been published in Remote Sensing, a top-ranked journal (h5-index 90; Scimago Q1). The paper’s co-authors are Ana I. R. Cabral and Maria J. P. Vasconcelos from CEF/Instituto Superior de Agronomia, Leonardo Vanneschi from NOVA IMS, and LASIGE’s integrated researcher Sara Silva.
Since the establishment of the Warsaw Framework in 2013, Remote Sensing (RS) has been recommended as an appropriate technology for monitoring and for Measuring, Reporting and Verification (MRV) by countries reporting forest land cover and land cover change to the UNFCCC. However, many difficulties have hindered the operational use of this technology for MRV, from the availability of adequate in-situ reference data to the spatial and temporal resolution of freely available satellite imagery and the data processing power required. Now, with the evolution of Earth Observation systems (which provide images with higher spatial and temporal resolution) and with novel open-data distribution policies, there is an opportunity to apply Machine Learning (ML) to induce models that automatically identify land cover types in satellite images, improving the capacity to produce frequent and accurate land cover maps.
This work uses automatic Feature Construction methods (EFS, FFX and M3GP) to create hyper-features for several binary and multiclass classification datasets obtained from satellite imagery of different areas in South America (Brazil) and Africa (Angola, Democratic Republic of Congo, Guinea-Bissau and Mozambique). The importance of these hyper-features is measured by comparing the test accuracy of the Decision Tree, Random Forest and XGBoost classifiers on three versions of each dataset: the original, the original with the NDVI, NDWI and NBR indices added, and the original with the constructed hyper-features added. The results indicate that the classifiers are not significantly affected when the datasets are created from images obtained on a single acquisition date, but show a positive impact when the dataset is built from images with multiple acquisition dates. The results also indicate that, although both indices and hyper-features are robust against the radiometric variations between different images, the improvement occurs more often with the constructed hyper-features.
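To make the step of adding the NDVI, NDWI and NBR indices to a dataset more concrete, here is a minimal sketch in Python. It is an illustration only, not the code used in the paper. The band names (`green`, `red`, `nir`, `swir2`), the helper functions and the sample reflectance values are hypothetical. NDWI is assumed here to be the green/NIR formulation; the paper may use a different variant.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def add_spectral_indices(pixel):
    """Return a copy of `pixel` (a dict of band reflectances) with the
    NDVI, NDWI and NBR indices appended as extra features.

    Band names are hypothetical; NDWI uses the green/NIR formulation,
    which is an assumption rather than something stated in the text.
    """
    extended = dict(pixel)  # keep the original bands untouched
    extended["NDVI"] = normalized_difference(pixel["nir"], pixel["red"])
    extended["NDWI"] = normalized_difference(pixel["green"], pixel["nir"])
    extended["NBR"] = normalized_difference(pixel["nir"], pixel["swir2"])
    return extended


# Hypothetical reflectance values for a single vegetated pixel.
sample = {"green": 0.08, "red": 0.05, "nir": 0.45, "swir2": 0.15}
extended_sample = add_spectral_indices(sample)
```

A constructed hyper-feature would be appended to each sample in the same way, except that its expression is evolved by the feature construction method rather than fixed in advance.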
The paper is available here.