This paper outlines the challenges of working with satellite imagery data for crop area estimation and presents some new approaches.
A key strategic goal for the Australian Bureau of Statistics (ABS) is to exploit emerging sources of data to partially replace, supplement or validate existing data collections in order to reduce costs, improve quality and produce more responsive and relevant statistical products. Given this imperative, there is a pressing need to assess the available techniques for analysing “big data” problems in terms of their quantifiable statistical reliability and computational feasibility.
In October 2013, the ABS began a methodological research effort to investigate the viability of using satellite imagery data to estimate crop area statistics. Other Australian organisations with decades of experience in analysing satellite imagery data have been pursuing classification of satellite data at the crop type level, but the lack of adequate ground-truth data in Australia from which classification methods can learn has been a major obstacle. However, the unit record level data that the ABS regularly collects as part of the Rural Environment and Agricultural Statistics Program has the potential to provide a rich source of reference data for training classifiers.
In this paper we formulate the statistical problem of crop area estimation with satellite imagery data, outline the challenges in working with such data, and present our proposed methodological approaches for estimating crop area statistics. These include four classification methods: support vector machines, Gaussian maximum likelihood classification, classification with kernel density estimation, and multinomial logistic regression. Because our ultimate goal is to release these crop area estimates as official statistics, we also propose a statistical framework for estimating the bias and variance of crop area estimates calculated from crop type predictions for each pixel. Methods for quantifying the statistical error of such estimates appear to be lacking in the satellite imagery analysis literature, which focuses on prediction.