Bayesian Calibration of AquaCrop Model for Winter Wheat by Assimilating UAV

6 Crop growth model plays a paramount role in smart farming management, which not only provides quantitative 7 information on crop development but also evaluates various management strategies. A reliable model is desirable but 8 challenging due to the presence of unknown and uncertain parameters; therefore, crop model calibration is significant 9 to achieve its potentials. This work is focused on the calibration of AquaCrop model by leveraging advanced Bayesian 10 inference algorithms and UAV multi-spectral images at field scales. In particular, aerial images with high spatial11 temporal resolutions are first applied to obtain Canopy Cover (CC) value by using machine learning based classification. 12 The CC is then assimilated into AquaCrop model and uncertain parameters could be inferred by Markov Chain Monte 13 Carlo (MCMC). Both simulation and experimental validation are performed. The experimental aerial images of winter 14 wheat at Yangling district from Oct/2017 to June/2018 are applied to validate the proposed method against the 15 conventional optimisation based approach by Simulated Annealing (SA). 100 Monte Carlo simulations show that the 16 root mean squared error (RMSE) of Bayesian approach yields a smaller parameter estimation error than optimisation 17 approach. While the experimental results show that: (i) a good wheat/background classification result is obtained for 18 the accurate calculation of CC; (ii) the predicted CC values by Bayesian approach are consistent with measurements 19 by 4-fold cross validation, where the RMSE is 0.0271 smaller than optimisation approach (0.0514); (iii) in addition to 20 parameter estimation, their distribution information is also obtained in the developed Bayesian approach, reflecting 21 the prediction confidence. It is believed that the Bayesian model calibration, although is developed for AquaCrop 22 model, can find a wide range of applications to various simulation models in agriculture and forestry. 23


26
Agricultural crop states are paramount for smart farming management and food security. A timely and accurate 27 estimation of canopy states has become an effective approach for crop monitoring, irrigation decision-making and 28 yield management [1,2]. In this regard, a reliable crop model is desirable for crop state estimation. However, due 29 to the presence of unknown and uncertain parameters in spatial distribution of soil properties and crop parameters,  In addition, RedEdge camera is fixed on a gimbal to attenuate the adverse effects of wind, so that high-quality 123 images can be captured during the survey. The spectral information of RedEdge camera is displayed in Table 1 [27]. Each UAV aerial 127 image is with the necessary information for camera calibration and image stitching. An image of a reflectance cali-128 bration panel was taken (at about 1m height) before and after each flight and used in the process of image calibration 129 to account for the side effects of environmental variations. In addition, commercial Pix4Dmapper software of version 130 4.2.27 is adopted to generate calibrated and georeferenced spectral reflectance data for CC calculation. The detailed 131 process is omitted and can be referred to Section 2.3 of [11] 132 The calculation of CC is first discussed. In this study, UAV remote sensing data (e.g. five-band multispectral 137 image) is preferred due to its high spatial/spectral resolutions. The overall process is displayed in Fig 3, where each 138 element is detailed in the following subsections.   In this study, CC calculation is formulated as a wheat/non-wheat two-class classification problem so that wheat 141 pixel proportion can be calculated for the region of interest. One specific image acquired on 11/December/2017 is 142 used as an illustration example. It is well known that supervised classification relies on labelled data for its training, 143 which include wheat and non-wheat pixels in this study. In this work, wheat/non-wheat pixels are directly labelled   The spectral characteristics of the wheat/non-wheat pixels are also analysed, where the mean spectral reflectance 146 values are shown in Fig 5. It can been seen that the green peak phenomenon is observed for wheat (green) crop where 147 the value of Green band is higher than that of Blue and Red bands. In addition, wheat pixels also have a higher NIR 148 6 reflectance value than non-wheat pixels. The spectral differences provides important information for discriminating 149 wheat pixels from non-wheat pixels. Considering that there are only five spectral bands in the multispectral images, 150 all available bands are used as the features for the classification task in Section 3. 1 as Support Vector Machines (SVMs), neural network, nearest neighbour [10]. In this study, random forest classifier 157 is employed due to its high efficiency and accuracy, where the hyper-parameters are further automatically tuned by 158 Bayesian optimization [11]. The detailed algorithm is omitted, which is referred to [11,29].
159 wheat non-wheat In this work, 70% and 30% of the labelled pixels are for training and testing respectively, where the classification 160 accuracy is calculated on testing dataset. The overall classification accuracy for the example image is 99%. The can be calculated by the formula in Eq 1.
where wp and nwp denote the number of wheat and non-wheat pixels in the region of interest. Repeating the process, 164 the CC measurement values with classification accuracy over time are displayed in Table 2.

165
where W P * is the normalised crop water productivity in g/m 2 ; ET i is daily crop transpiration in mm; ET 0 i is the where F (.) represents the crop model operator and X presents the canopy states (e.g. biomass, canopy cover, root 188 depth) on each simulated date. G(.) denotes the measurement model with measurement noise ξ being with zero mean 189 and a proper covariance σ. posterior distribution. [32]. In particular, the posterior distribution P (θ|Y ) is proportional to the prior parameter 195 distribution P (θ) times the measurement likelihood function P (Y |θ), which is given by where Y is the observational data and θ represents the parameters to be estimated. To simplify the problem, the 197 likelihood function is defined as the error between observations and simulated model outputs (see Eq 5). More details 198 is given in Section 3.3.
where F (.) denotes the function of crop model conditional on parameter θ, E means the error.
Randomly select the first stage proposal candidate parameter θ * 4 Return to step 2 mean and a proper variance, thereby the likelihood function in this study is formulated as where N is the total observation number and the y i −ŷ i (x k , θ) is the error between the measurement of dynamic  while in real-world experiment, the measurable canopy cover is adopted to assess the performance.
The parameter definition and prior interval information are shown in Table 3  AquaCrop-OS model. In order to test the capability of the developed algorithm, the prior information in Table 3 is 251 reduced by increasing the uncertain parameter ranges as shown in Table 4. The iteration is also increased to 6000 to where E opt and E bay denote the parameter estimation errors by optimisation and Bayesian methods, respectively. p opt 265 and p bay represent the average calibrated parameters with p t being the ground truth. The parameter estimations and 266 their error percentages are shown in Table 5. where the results are displayed in Table 6. Similarly, it can be seen that Bayesian approach outperforms optimization 270 approach for all parameter estimation in term of RMSE except the parameter eme.     confidence rule is that the less spread the distribution is, the more reliable the parameter estimation is. However, the 292 optimization based approach can only provide a point estimate without confidence information (see, Fig 11).

293
It can also be seen from Fig 10 that pse and pop are with a large variance. There exist several possible reasons.

294
First, it may be due to the lack of calibration data in the sensitive growth stages. Secondly, the number of observations 295 may be not enough for the estimate of 10 dimensional parameter vector. The estimated parameters for both Bayesian 296 (e.g. mean value) and optimization (e.g. point estimate) methods are calculated and shown in Table 7, which are 297 used for CC prediction in Section 5.2.2.     The CC estimation against ground truth CC data for different approaches under different datasets for calibration 314 is also displayed in Fig 13. X-axis and y-axis represent the ground truth validation data and the estimated CC values; 315 the red and blue points represent the estimated CC by using MCMC Bayesian and SA optimization approaches. It 316 can be visually seen that the results of Bayesian approach are closer to y = x line than SA optimization approach.

317
The RMSE values are also summarised for two approaches in Table 8. It can also be seen that MCMC Bayesian 318 approach results in a smaller RMSE value (in comparison with validation CC) than SA optimization approach. reflecting estimation confidence. Bayesian approach also obtains a smaller root mean square error for parameter 330 estimation and canopy cover prediction than optimization based approach. In addition, Bayesian approach is less 331 sensitive to the selection of data points for calibration. Although the results are very promising, there is also room for 332 further improvement, which are summarized as below.

333
(1) This work is mainly focused on algorithm development and its initial validation by using a small field, algorithm 334 validation by large fields will be more convincing;

335
(2) More advanced Bayesian inference algorithms can be developed to further improve the performance (e.g. reducing 336 the computation load).