Random forests applied as a soil spatial predictive model in arid Utah
Title: Random forests applied as a soil spatial predictive model in arid Utah
Author: Stum, Alexander Knell
Awarding university: Utah State University
Degree: Master of Science in Soil Science, 2010
Abstract
Initial soil surveys are incomplete for large tracts of public land in the western
USA. Digital soil mapping offers a quantitative approach as an alternative to traditional
soil mapping. I sought to predict soil classes across an arid to semiarid watershed of
western Utah by applying random forests (RF) and using environmental covariates
derived from Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and digital elevation
models (DEM). Random forests are similar to classification and regression trees (CART).
However, RF is doubly random. Many (e.g., 500) weak trees are grown (trained)
independently because each tree is trained with a new randomly selected bootstrap
sample, and a random subset of variables is used to split each node. To train and validate
the RF trees, 561 soil descriptions were made in the field. An additional 111 points were
added by case-based reasoning using aerial photo interpretation. As RF makes
classification decisions from the mode of many independently grown trees, model
uncertainty can be derived. The overall out-of-bag (OOB) error was lower without
weighting of classes; weighting increased the overall OOB error and the resulting output
did not reflect soil-landscape relationships observed in the field. The final RF model had
an OOB error of 55.2% and predicted soils on landforms consistent with soil-landscape
relationships. The OOB error for individual classes typically decreased with increasing
class size. In addition to the final classification, I determined the second and third most
likely classification, model confidence, and the hypothetical extent of individual classes.
Pixels that had high possibility of belonging to multiple soil classes were aggregated
using a minimum confidence value based on limiting soil features, which is an effective
and objective method of determining membership in soil map unit associations and
complexes mapped at the 1:24,000 scale. Variables derived from both DEM and Landsat
7 ETM+ sources were important for predicting soil classes based on Gini and standard
measures of variable importance and OOB errors from groves grown with exclusively
DEM- or Landsat-derived data. Random forests was a powerful predictor of soil classes
and produced outputs that facilitated further understanding of soil-landscape
relationships.
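
The "doubly random" procedure described in the abstract (a fresh bootstrap sample per tree, plus a random subset of variables at each split) can be illustrated with a minimal sketch. This is not the thesis's implementation: to stay short it grows one-split stumps rather than full trees, uses synthetic two-covariate data with three made-up classes "A", "B" and "C", and every function name (grow_stump, grow_forest, ranked_classes, oob_error) is hypothetical. It shows how the mode of the tree votes gives the prediction, how vote shares give a confidence value and the second and third most likely classes, and how the OOB error is computed from the trees that did not see a given sample.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def grow_stump(X, y, rows, n_try, rng):
    """Weak one-split tree; only a random subset of variables is considered."""
    features = rng.sample(range(len(X[0])), n_try)  # second source of randomness
    fallback = majority([y[i] for i in rows])
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # no usable split: always predict the majority class
        return lambda x: fallback
    _, f, t, low, high = best
    return lambda x: low if x[f] <= t else high

def grow_forest(X, y, n_trees=500, n_try=1, seed=0):
    """Grow independent trees, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # first source of randomness
        oob = set(range(n)) - set(rows)              # samples this tree never saw
        forest.append((grow_stump(X, y, rows, n_try, rng), oob))
    return forest

def ranked_classes(forest, x):
    """Classes ordered by share of tree votes; the first entry is the mode."""
    votes = Counter(tree(x) for tree, _ in forest)
    return [(label, count / len(forest)) for label, count in votes.most_common()]

def oob_error(forest, X, y):
    """Error rate when each sample is scored only by trees that left it out."""
    wrong = scored = 0
    for i, (x, label) in enumerate(zip(X, y)):
        votes = Counter(tree(x) for tree, oob in forest if i in oob)
        if votes:
            scored += 1
            wrong += votes.most_common(1)[0][0] != label
    return wrong / scored

# Synthetic data (an assumption for illustration, not the thesis's field data).
data_rng = random.Random(1)
X, y = [], []
for label, (cx, cy) in {"A": (0, 0), "B": (5, 0), "C": (0, 5)}.items():
    for _ in range(30):
        X.append((cx + data_rng.gauss(0, 1), cy + data_rng.gauss(0, 1)))
        y.append(label)

forest = grow_forest(X, y, n_trees=200, n_try=1, seed=0)
ranking = ranked_classes(forest, (5.0, 0.0))  # most to least likely class
confidence = ranking[0][1]                    # vote share of the predicted class
err = oob_error(forest, X, y)
print(ranking, round(err, 3))
```

Because single-split stumps can separate only two groups at a time, the OOB error on this toy data stays well above zero; a full implementation would grow deep trees and re-draw the variable subset at every node.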
Page published 19 May 2023