PCA application for identification of NDVI and precipitation patterns in the Pernambuco state

The present study aimed to interpret and analyse the spatial patterns of NDVI and rainfall in the state of Pernambuco. We used monthly mean rainfall data and monthly mean NDVI data, the latter obtained from Terra/MODIS at a spatial resolution of 1 km, for the period 2003-2013. We applied Principal Component Analysis (PCA) to determine the spatial patterns of variability of both variables. Our results showed that, in general, there is a relation between the rainfall distribution and the state relief, not necessarily in amount but in spatial distribution. In addition, the vegetation responds to the regional rainfall.


Introduction
The state of Pernambuco is located in the center-east of Northeast Brazil, and its climatic conditions depend on the amount and distribution of rainfall. Throughout the state, rainfall decreases from east to west and, to a lesser extent, from south to north. Thus, there are three climate types in Pernambuco: a tropical humid climate, predominant on the coast and in the Zona da Mata; a tropical subhumid climate, predominant in the Agreste; and a tropical semiarid climate. Mean annual temperatures recorded for the territory vary from 26 °C to 31 °C (Dantas et al., 2016).
The variability of precipitation and temperature in the state is related to vegetation and relief and is also influenced by meteorological systems that, interacting with each other, give the state peculiar characteristics. Its vegetation ranges from mangrove and tropical forest, located closer to the coast, to caatinga, located in the interior of the territory, where the climate is semiarid. Between these two regions, called Zona da Mata and Sertão respectively, there is a transition area known as the Agreste. Vegetation is directly related to the climatic variability of a given region (Lexer et al., 2002).
Multitemporal remote sensing data from different meteorological and environmental satellites have been widely used for different purposes all over the world. Monthly NDVI series were used by Gutman and Ignatov (1998) to derive the vegetation fraction and to incorporate it into numerical weather and climate prediction models. Ha et al. (2001) analyzed the variability of NDVI, LAI and estimated surface temperature from AVHRR/NOAA data in Korea. The authors observed that the interannual variability of LAI depends strongly on vegetation type and that LAI changes are not related to variations of the NDVI.
Factor Analysis in Principal Components (PCA) is widely used in studies involving a large number of variables, because the technique can reduce their number without distorting the original data. For this reason it is often applied to time series of meteorological data. Sousa et al. (2014) used monthly NDVI data from the NOAA and MODIS satellites in the 2010 decade to find a relationship with rainfall in the state of Paraíba, Brazil. The results showed that the correlations are higher in the drier months than in the rainy season. Dantas et al. (2016) used PCA to analyze the temporal response of vegetation to rainfall in the state of Pernambuco, Brazil.
In view of the above, the objective of this work is to improve on previous studies for the state of Pernambuco by using PCA to identify the spatial patterns of NDVI and precipitation and to associate them with the atmospheric systems that produce rain in the region.
Figure 1 - Location of the state of Pernambuco on the map of Brazil.

Rainfall data
Precipitation data for 81 meteorological stations distributed across the state of Pernambuco, covering 2003 to 2013, were obtained from the Pernambuco State Water and Climate Agency (APAC). The relief of Pernambuco is moderate; most of the state lies below 600 m. On the coastline the relief is almost entirely at mean sea level. Moving away from the coast, one finds a coastal plain with altitudes between 0 and 10 m. Between the Zona da Mata and the Agreste lies the Borborema Plateau, with an average elevation of 600 m and peaks exceeding 1000 m.
Altitude increases from the São Francisco region to the Sertão. The Serra do Araripe, on the border with Ceará, also stands out, with an altitude of approximately 800 m (Figure 2). The volume of precipitation at a locality is related to the characteristics of the relief.

NDVI data from Terra/MODIS
NDVI is an index resulting from the combination of reflectance levels in two satellite image bands, the near infrared (0.725-1.10 μm) and the visible (0.58-0.68 μm). The NDVI is therefore determined by the following equation:
$$\mathrm{NDVI} = \frac{\rho_{NIR} - \rho_{VIS}}{\rho_{NIR} + \rho_{VIS}}$$
where $\rho_{NIR}$ and $\rho_{VIS}$ are the reflectances in the near-infrared and visible bands, respectively.
The MODIS NDVI product, specifically MOD13A3, with a spatial resolution of 1 km for the period 2003 to 2013, was acquired from the REVERB/NASA website. The images were converted from .HDF to .IMG, and the tiles (h13v09 and h14v09) were mosaicked. In addition, the MOD13A3 product was multiplied by the correction factor (0.0001) with the aid of software developed to extract information from digital images. With the processed images, the NDVI value was then extracted for each pixel.
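The two processing steps described above (the band-ratio definition and the 0.0001 correction factor) can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, and the fill value of -3000 for MOD13A3 is an assumption taken from the usual MODIS vegetation-index convention that should be checked against the product metadata.

```python
import numpy as np

MOD13A3_SCALE = 0.0001  # correction factor quoted in the text


def ndvi_from_reflectance(nir, vis):
    """NDVI = (NIR - VIS) / (NIR + VIS), with NaN where the sum is zero."""
    nir = np.asarray(nir, dtype=float)
    vis = np.asarray(vis, dtype=float)
    total = nir + vis
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - vis, total, out=out, where=total != 0)
    return out


def scale_mod13a3(raw, fill_value=-3000):
    """Convert stored integers to NDVI; fill_value is an assumed no-data code."""
    raw = np.asarray(raw)
    scaled = raw.astype(float) * MOD13A3_SCALE
    scaled[raw == fill_value] = np.nan
    return scaled


if __name__ == "__main__":
    print(ndvi_from_reflectance([0.5, 0.3], [0.1, 0.3]))
    print(scale_mod13a3(np.array([8000, -3000, 2500], dtype=np.int16)))
```

Masking the no-data pixels before any averaging keeps them from biasing monthly or annual NDVI means.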

Factor Analysis in Principal Components
Principal component analysis (PCA) starts from an original data matrix X of p variables for n individuals, from which the variance-covariance matrix S is obtained through:
$$S = \frac{1}{n-1} X X^{t}$$
where X is the matrix of centered values, $X^{t}$ is its transpose and n is the number of individuals. Because the observations are standardized, the correlation matrix R is equal to the variance-covariance matrix:
$$R = S$$
The matrix R is a symmetric, positive correlation matrix of dimension (p × p), diagonalizable by an orthogonal change-of-basis matrix A whose columns are the eigenvectors of R:
$$D = A^{-1} R A$$
where D is the diagonal matrix of eigenvalues and $A^{-1}$ is the inverse of A. Since A is the change-of-basis matrix to a new reference system composed of the eigenvectors of R, the principal components (PCs) $U_1, U_2, \dots, U_p$ are obtained as linear combinations between the transpose of the eigenvectors ($A^{t}$) and the standardized observation matrix (X):
$$U = A^{t} X$$
To estimate the values of $X_j$ at the n-th location, the inverse relation is used:
$$X_j = a_{1j} U_1 + a_{2j} U_2 + \dots + a_{pj} U_p$$
where $a_{ij}$ is the j-th element of the i-th eigenvector, with the eigenvectors in A ordered by descending eigenvalue.
The percentage of variance explained by each eigenvalue, taken in descending order, is given by:
$$\mathrm{Var}_i(\%) = \frac{\lambda_i}{\sum_{k=1}^{p} \lambda_k} \times 100$$
The correlation between the j-th original variable and the i-th principal component is:
$$r_{ij} = a_{ij} \sqrt{\lambda_i}$$
where $a_{ij}$ is the j-th element of the i-th eigenvector and $\lambda_i$ is the i-th eigenvalue. Adequate statistical software was used for this analysis.
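The procedure described in this section can be sketched with NumPy as below. This is an illustrative sketch, not the statistical software used by the authors. It assumes a (p × n) data layout (variables in rows), an n - 1 normalization, and eigenvectors stored as columns; the function name `pca_correlation` is hypothetical. No factor rotation is applied here.

```python
import numpy as np


def pca_correlation(data):
    """PCA on the correlation matrix of a (p variables x n individuals) array."""
    data = np.asarray(data, dtype=float)
    p, n = data.shape
    # Standardize each variable (row) to zero mean and unit variance.
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, ddof=1, keepdims=True)
    x = (data - mean) / std
    # Correlation matrix R (p x p) of the standardized data.
    r = x @ x.T / (n - 1)
    # eigh suits symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(r)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]  # column i is the i-th eigenvector
    # Principal components (scores): U = A^t X.
    scores = eigvecs.T @ x
    # Percentage of variance explained by each component.
    explained = 100.0 * eigvals / eigvals.sum()
    # loadings[j, i] = correlation between variable j and component i.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, eigvecs, scores, explained, loadings


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo = rng.normal(size=(4, 60))  # synthetic data, for illustration only
    _, _, _, explained, _ = pca_correlation(demo)
    print(explained.round(2))
```

In this layout the loadings are the correlations that a study of this kind would map spatially, while the rows of `scores` are the associated time series.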

Spatial analysis of precipitation
The spatial and temporal patterns of precipitation and NDVI for the state of Pernambuco in the period 2003 to 2013 are presented and analyzed below.
For the spatial analysis of precipitation, the first five common rotated factors were retained, which together explained 89.5% of the total variance of the series. Only the first three components, which carry the largest weights, are analyzed.
The first spatial pattern (Figure 4a), which explained 27.82% of the variance, has correlations higher than 0.6 in an isolated core in the Sertão and in much of the São Francisco and Agreste regions. Isolated nuclei higher than 0.8 stand out in the south-central region (São Francisco). The lowest correlations are observed in the north of the Zona da Mata.
The third spatial pattern (Figure 4c), explaining 16.83% of the total data variance, presents a positive correlation (> 0.6) in small distinct nuclei throughout the state and above 0.4 in western Pernambuco. Figure 5 shows the time series for this factor, highlighting a maximum in 2004, which shows that more rainfall occurred in the west in that period than in other years, while the minima were observed in 2010 and 2011. Similar results were reported in the literature by Nicácio et al. (2009) and Silva et al. (2014).
Figure 5 - Principal components of the main five common spatial precipitation factors for the state of Pernambuco.

Spatial analysis of NDVI
The first three common spatial factors, which explain 89.64% of the total variance of the annual data, are shown in Figure 6 (a, b, c). For the first factor, which accounts for 52% of the variance of the annual NDVI data, correlations are higher than 0.6 in all of the Sertão and even in the Zona da Mata and Agreste.

Conclusion
Although the PCA applied to the rainfall time series retained five factors, the results discussed show that only the first three common factors represent aspects of rainfall climatology in Pernambuco. The first common factor showed high correlations of NDVI and precipitation throughout almost the entire state. A nucleus located in the Agreste of Pernambuco can be seen in both spatial distributions of this factor. In addition, the north of the Zona da Mata presents negative correlations of NDVI and rainfall.
The second precipitation factor has high correlations in the northwest of the state and from the Agreste to the coast. For the NDVI, the first factor exhibits positive correlations throughout the state, although some of them are close to zero, mainly in portions of the East and South Coast, the north of the Zona da Mata and the northwest of the state.
The third common factor related to rainfall, which showed maximum scores in 2004, allowed us to observe that the vegetation index varies according to the occurrence of rainfall; that is, rainy/dry regions show an increase/decrease in NDVI.
There is a relationship between precipitation distribution and state relief, not necessarily in quantity but in spatial distribution. PCA was effective for the proposed treatment.
Figure 3 shows the average annual precipitation totals for the period from 2003 to 2013 in the state of Pernambuco. The highest values are seen on the coast, exceeding 1800 mm. Meanwhile, the lowest values are observed in the São Francisco and Sertão regions, below 600 mm.

Figure 3 - Spatial distribution of total average annual precipitation (mm) in the state of Pernambuco, period 2003-2013.

Figure 4 - Spatial patterns (correlations) for the first three common precipitation factors in Pernambuco, explaining 70.30% of the total variance.
The time series (Figure 5) associated with the first precipitation factor identifies 2005 and 2010 as rainy years, and 2003 and 2012 as dry years with little rainfall in the eastern sector. The spatial configuration of the second component, which explains 25.65% of the variance, is illustrated in Figure 4b. Positive correlations higher than 0.4 are observed in the east of the state and in some nuclei of the same order in the interior. The time series of this factor presents maximum scores in 2007, 2011 and 2013 (Figure 5), indicating that in those years the most relevant rainfall totals were observed from the Zona da Mata to the coast (Litoral), and minima in 2006, 2008 and 2009, showing a reduction of rainfall in the Zona da Mata.
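As a quick arithmetic check on the percentages quoted for the precipitation factors (27.82%, 25.65% and 16.83% for the first three, and 89.5% for the five retained factors), the values can be combined as follows. The symbols $V_k$ are introduced here only for illustration and denote the percentage of variance explained by factor k.

$$V_1 + V_2 + V_3 = 27.82 + 25.65 + 16.83 = 70.30$$
$$V_4 + V_5 = 89.5 - 70.30 = 19.2$$

The first sum agrees with the 70.30% stated in the caption of Figure 4, and the remaining 19.2% corresponds to the fourth and fifth factors, which are not analyzed individually.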
For the first NDVI factor, the lowest correlations are found in the southeast and northeast of the state. The annual variability of the NDVI for this factor presents maxima in the years 2007 to 2010 and in 2012, and minima in 2003 and 2013 (Figure 7). This factor mainly reflects dry years (APAC, 2014), which harm vegetation development. According to Weng et al. (2008), vegetation behavior and growth are strongly influenced by rainfall. The second factor, which explains 20.38% of the total NDVI variance, has positive correlations throughout the region, but the lowest values are observed in the extreme northwest of the Sertão, in the north of the Zona da Mata and on the East and South Coast. The time series of this factor shows high weights in 2003 and 2011 and minima in 2007 and 2012. Finally, the third factor (Figure 6c), which explains 16.75% of the variance, has positive correlations throughout the state, decreasing from the coast (> 0.8) to the interior, where the correlations are positive but minimal (near zero) in the southwest and central northwest. The annual temporal variability of this factor has its maximum contribution in 2013. In that year, the highest NDVI was observed from the central part of the Agreste to the coast (Litoral). These results corroborate the work of Silva et al. (2014) and Dantas et al. (2016).

Figure 6 - Spatial patterns (correlations) for the first three common factors of NDVI in Pernambuco, explaining 89.64% of the total variance.

Figure 7 - Principal components of the first three common spatial factors (scores) of NDVI for the state of Pernambuco.