Analysis of the accuracy of daily series of global solar radiation simulated by the weather generator PGECLIMA_R in the State of Parana, Brazil

This study aimed to analyze the accuracy of daily series of global solar radiation simulated by the weather generator PGECLIMA_R in the State of Parana, Brazil. For this purpose, historical series of 30 years from 28 localities were used, spatially well distributed so as to represent the entire State. Five replications were run for each locality, allowing the observed and simulated monthly averages to be compared and the accuracy of PGECLIMA_R to be tested through the following statistics: the Pearson correlation coefficient "r", the Willmott agreement index "d", the confidence index "c", the mean bias error (MBE), the root mean square error (RMSE) and the mean absolute error (MAE). The comparison between data generated by PGECLIMA_R and historical data demonstrated a very satisfactory performance of this weather generator for estimating global solar radiation in almost all studied localities.


Introduction
Weather generators are computational tools: mathematical simulation models designed to generate synthetic series of climate data with the same statistical characteristics as the historical series. They have been used in various areas of human activity, since they allow the analysis of information on local climate and, through simulations, make it possible to evaluate the influence of climate on natural or human-induced processes.
Weather generators have also been important in the modeling and analysis of ecosystems. Kittel et al. (1995) used this type of tool to build a bioclimatic database, enabling the analysis of the sensitivity of an ecosystem to climate change.
According to Zanetti (2003), the use of weather generators in the construction of future climate scenarios, aimed at predicting events that might occur at some time at a location of interest, is an alternative of great interest because no observed data series exist for the future, so simulated data can be used instead.
Currently, PGECLIMA_R, the Stochastic Generator of Climate Scenarios (Virgens Filho et al., 2011a; Virgens Filho et al., 2011b), stands out. It can be considered an evolution of SEDAC-R, with the difference that, in addition to simulating weather data, it is also capable of generating climate scenarios [...]

Material and methods
The historical series of global solar radiation, measured in langley per day (ly), for the twenty-eight locations in the State of Parana, Brazil (Figure 1 and Table 1), were obtained from meteorological stations belonging to the Agronomical Institute of Parana (IAPAR).
Thus, the accuracy of PGECLIMA_R was evaluated by statistical comparison using different coefficients, starting with the confidence index "c" (equation 1), proposed by Camargo & Sentelhas (1997), whose criteria are shown in Table 2:

$$c = r \cdot d \quad (1)$$

In more detail, the confidence index "c" was calculated from the Pearson correlation coefficient "r", an indicator of precision that measures the degree of dispersion between the observed and simulated data, and the coefficient of agreement "d" proposed by Willmott (1981), which relates to accuracy. The latter indicates the distance between estimated and observed data, ranging from 0 (no agreement) to 1 (perfect agreement), and is described by the following equation:

$$d = 1 - \frac{\sum_{i=1}^{N} (P_i - O_i)^2}{\sum_{i=1}^{N} \left( |P_i - \bar{O}| + |O_i - \bar{O}| \right)^2} \quad (2)$$

where $P_i$ represents the monthly averages of the series simulated by the weather generator, $O_i$ the monthly averages of the observed historical series and $\bar{O}$ the mean of the historical monthly averages.
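The three indexes described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: it assumes the usual definition of the confidence index as the product of "r" and "d" (Camargo & Sentelhas, 1997), the function names are hypothetical, and the monthly values in the usage example are invented for demonstration.

```python
import math


def pearson_r(observed, simulated):
    """Pearson correlation coefficient "r" (precision)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(simulated) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in simulated)
    return cov / math.sqrt(var_o * var_p)


def willmott_d(observed, simulated):
    """Willmott (1981) index of agreement "d" (accuracy), from 0 to 1."""
    mean_o = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for o, p in zip(observed, simulated))
    denominator = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, simulated)
    )
    return 1.0 - numerator / denominator


def confidence_c(observed, simulated):
    """Confidence index "c", assumed here to be the product r * d."""
    return pearson_r(observed, simulated) * willmott_d(observed, simulated)


# Invented monthly averages (ly), January to December, for illustration only.
observed = [450.0, 430.0, 400.0, 350.0, 300.0, 270.0,
            290.0, 340.0, 380.0, 420.0, 460.0, 470.0]
simulated = [445.0, 433.0, 396.0, 352.0, 304.0, 268.0,
             287.0, 343.0, 377.0, 424.0, 455.0, 472.0]

print("r =", round(pearson_r(observed, simulated), 4))
print("d =", round(willmott_d(observed, simulated), 4))
print("c =", round(confidence_c(observed, simulated), 4))
```

A constant offset between the two series leaves "r" at 1 but lowers "d", which is why "c" combines both: it only approaches 1 when the simulated series is both well correlated with and close to the observed one.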
Table 2. Criteria for interpreting the performance of PGECLIMA_R by the index "c" (Camargo & Sentelhas, 1997).

The disadvantage of the RMSE is that just a few outliers are enough to increase its value significantly (Stone, 1993).

$$\mathrm{RMSE} = \left[ \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i)^2 \right]^{1/2} \quad (4)$$
The mean absolute error (MAE) (equation 5) was also used; according to Willmott (2005), it is a more natural measure of average error and, unlike the RMSE, is unambiguous.

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} |P_i - O_i| \quad (5)$$
In equations 3, 4 and 5, $P_i$ represents the monthly averages of the series simulated by the weather generator, $O_i$ the monthly averages of the observed historical series, and N the number of observed values of the historical series.
To complement the study, the observed averages were also compared visually with the averages of the simulated data through the analysis of graphs.
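As a rough illustration of the three error measures, the sketch below computes MBE, RMSE and MAE from paired monthly averages. It assumes the common convention in which errors are taken as simulated minus observed, so that a positive MBE indicates overestimation. The function names and the numbers are hypothetical and chosen only to show how a single outlier affects RMSE more strongly than MAE.

```python
import math


def mbe(observed, simulated):
    """Mean bias error: mean of (simulated - observed)."""
    return sum(p - o for o, p in zip(observed, simulated)) / len(observed)


def rmse(observed, simulated):
    """Root mean square error."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, simulated))
    return math.sqrt(squared / len(observed))


def mae(observed, simulated):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(observed, simulated)) / len(observed)


# Invented data: twelve monthly averages with alternating errors of +2 and -2 ly.
observed = [300.0] * 12
simulated = [300.0 + (2.0 if i % 2 == 0 else -2.0) for i in range(12)]

# Same series, but the last month is replaced by a single large error of +26 ly.
simulated_outlier = simulated[:-1] + [326.0]

for label, series in (("no outlier", simulated), ("one outlier", simulated_outlier)):
    print(label,
          "MBE =", round(mbe(observed, series), 2),
          "RMSE =", round(rmse(observed, series), 2),
          "MAE =", round(mae(observed, series), 2))
```

In this invented case the single outlier doubles the MAE (from 2 to 4 ly) but nearly quadruples the RMSE (from 2 to about 7.75 ly), because the squared term gives large deviations a disproportionate weight.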
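The monthly averages that enter these comparisons have to be derived from the daily series. The text does not detail that step, so the sketch below is only one plausible reading: it assumes that all days of each calendar month are pooled across years, and that the monthly averages of the simulated replications are then averaged with equal weight. The function names are hypothetical, and the separation into dry and wet days is left out.

```python
from collections import defaultdict
from datetime import date, timedelta


def monthly_means(daily_records):
    """Return 12 means (January to December) from (date, value) pairs.

    Days are pooled by calendar month across all years present.
    Assumes every calendar month occurs at least once in the records.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, value in daily_records:
        sums[day.month] += value
        counts[day.month] += 1
    return [sums[m] / counts[m] for m in range(1, 13)]


def mean_over_replications(replications):
    """Average the monthly means of several simulated daily series."""
    per_replication = [monthly_means(rep) for rep in replications]
    return [sum(values) / len(values) for values in zip(*per_replication)]


def synthetic_year(offset):
    """Build one invented year of daily values: month number * 10 + offset."""
    records = []
    day = date(2001, 1, 1)
    while day.year == 2001:
        records.append((day, day.month * 10.0 + offset))
        day += timedelta(days=1)
    return records


# Two invented replications standing in for the five used in the study.
replications = [synthetic_year(0.0), synthetic_year(2.0)]
print(mean_over_replications(replications))
```

The resulting list of twelve values per locality is the kind of input that the agreement and error indexes in this section take for the simulated series, with the same aggregation applied to the observed series.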

Results and discussion
In Table 3 [...] The occurrence of values greater than 0.99 for the index "r" in almost all localities established a high correlation between predicted and observed data, which can be considered an excellent performance for this index. Baena (2004), testing the model ClimaBR to generate synthetic series of precipitation in Brazil and analyzing solar radiation, also used the correlation coefficient and the agreement and confidence indexes and found values above 0.98, considering this a great performance.
In the analysis of the agreement coefficient "d", regarding accuracy, there [...]

The same occurs with the indexes RMSE and MAE, whose values were small for most locations. Cianorte, for example, showed an RMSE of 2.08 ly and an MAE of 1.57 ly, evidencing small deviations between observed and simulated data. Figure 3 [...] it is concluded that it had a very good [...]

[...] from future statistical disturbances in climate variables. This model simulates daily weather data series of rainfall, air temperature (minimum and maximum), relative humidity and global solar radiation, and can also fill the remaining gaps in historical series, parameterize the existing data and simulate the missing data.

Among the climatic variables, solar radiation can be identified as the main element related to meteorological phenomena, owing to its fundamental and direct influence on life on Earth. According to Valiati (2005), solar radiation is a primary climatological variable, responsible for the distribution of fauna and flora on the planet, directly influencing the physiological activity of living beings and the elements of weather, so plant and animal production depend directly on the availability of solar energy. Pereira et al. (2002) also argue that solar radiation is the primary source of all atmospheric phenomena and of the physical, chemical and biological processes observed in ecosystems. It can also be used in various forms, such as the capture by the biomass, heating ventilation and water for domestic and industrial purposes, photoelectricity for small potentials and sources for thermodynamic cycles. In this context, this study aimed to analyze the accuracy of daily series of global solar radiation simulated by the weather generator PGECLIMA_R in the State of Parana, Brazil.

Figure 1. Selected locations in the State of Parana, Brazil.

$$\mathrm{MBE} = \frac{1}{N} \sum_{i=1}^{N} (P_i - O_i) \quad (3)$$
To analyze the dispersion between the observed and simulated values due to nonsystematic errors, the root mean square error (RMSE) (equation 4) was used, which is related to the real value of the error produced by the model. This index provides information about the performance of the model in the short term: the lower its value, the lower the data dispersion.

[...] and 0.82, considered "Very good". A relationship between the indexes "d" and "c" can be seen: in cities where performance was less satisfactory for one of these indexes, the other showed very similar results, also lower than in the other cities.

Stone (1993) states that the smaller the values of MBE and RMSE, the better the agreement between the observed and simulated data. These indexes are dimensional measures, i.e. they depend on the unit in which the variable of interest is expressed, and that unit must be considered to determine whether an index value indicates agreement or disagreement between simulated and observed values (Willmott, 2006). On this basis, the analysis of the MBE index showed small deviations relative to the values of the variable in question (ly), both of overestimation and of underestimation, indicating closer agreement between the observed and simulated values, i.e. a lower incidence of errors or deviations. The only places where the values were more distinct, representing larger deviations between simulated and observed data, were the cities of Nova Cantu, Paranavaí, Pinhais and Planalto, which likewise showed less satisfactory performance in the indexes "c" and "d".

Figure 2 shows the annual trend of the series of global solar radiation on dry days for the twenty-eight sites studied. It reveals the close similarity between observed and simulated data in most places, even in the cases of underestimation for the towns of Nova Cantu, Paranavaí, Pinhais and Planalto, which was quantified by the indexes MBE, RMSE and MAE.

Figure 3. Annual trend of the series of global solar radiation on wet days to evaluate the performance of PGECLIMA_R.

Table 1. Geographical coordinates of the meteorological stations.

Table 3. Statistical indexes for evaluating the performance of PGECLIMA_R in the simulation of global solar radiation on dry days.

Table 4. Statistical indexes for evaluating the performance of PGECLIMA_R in the simulation of global solar radiation on wet days.