Improving small watersheds socioeconomic indicators with nighttime light satellite data to support water management decisions

estimate this type of information for small watersheds (from 5 to 100 km²), applying nighttime light (NTL) satellite images and available socioeconomic records from larger locales. Three socioeconomic indicators were chosen to test the method: Gross Domestic Product, population and jobs. The relationship between these three socioeconomic indicators and the radiance quantified from the NTL images was obtained through simple regression analysis applied to the 497 municipalities of the State of Rio Grande do Sul (RS), southern Brazil. The polynomial fit equations presented the best coefficient of determination and were then validated using data from 50 municipalities of the neighbouring State of Santa Catarina. The validation showed a very good estimation performance. The validated equations were used to estimate these socioeconomic indicators for small watersheds located in the municipality of Caxias do Sul, RS, in three different years: 2011, 2014 and 2018. Findings indicate that this novel application of NTL for estimating socioeconomic data can be a helpful tool for land use and water resources management in small watersheds.


Introduction
Water resources policies and plans require knowledge of the physical, economic and social aspects of the watershed (Loucks and van Beek, 2017) to define the desired socioeconomic standard for the watershed. However, deciding on this standard is difficult when the socioeconomic information is unknown. Such policies cover water allocation (water rights, water transfers, water reuse), water infrastructure (new facilities and improved operation of existing ones) and water demand management (water charges, tariffs and other economic instruments).
Water managers and stakeholders should be able to properly evaluate and compare the performance of different alternatives and combinations thereof, which includes estimating the potential changes in water availability, quality, accounting and budget, as well as the tradeoffs when water is transferred or reallocated from one region or user to another.
While hydroeconomic models have long been used to support this kind of analysis, given their capability to use the economic value of the water (Young and Loomis, 2014), the economic water scarcity (Moncur and Pollock, 1988) and the marginal resource opportunity cost (Pulido-Velazquez et al., 2013) to evaluate and optimize water and land allocation arrangements, they still have to deal with limitations regarding the different scales of the information. For instance, the available methods for hydroeconomic assessment and modelling require data to estimate water demand, to quantify the opportunity cost of conservation measures (Belladona et al., 2019; Harou et al., 2009) and to manage land use in small watersheds whose water is destined for public supply.
Several socioeconomic databases (e.g. Gross Domestic Product (GDP), employment, population) are organized by municipality or other political boundaries rather than by watershed boundaries, which are the ones where changes in water availability, reliability and quality are calculated by the hydroeconomic models. Another aspect is that water management policies and strategies rely on decisions that bring different outcomes to people in the target areas. An example is the design and implementation of land and water management solutions to improve raw water quality and reliability in a small watershed located within one or more municipalities. Such measures may result in opportunity costs of land (e.g. increasing the protected area and limiting agricultural land use) and water (e.g. reducing emissions and limiting water withdrawals). These costs are borne only by the land owners in the small watershed (watershed scale costs), whilst the benefits, which include reduced water treatment costs and added water security, extend beyond the small watershed boundary and accrue to all inhabitants of the municipalities (municipality scale benefits) (Ruijs et al., 2017).
When managers need to design effective water protection programs, such as payment for environmental services, this mismatch in the scale of the benefits and costs needs to be addressed. More specifically, one needs to evaluate the economic impact and costs imposed on the small watershed, as in our example. Since the watershed is often the basic unit of planning in water resources management, socioeconomic indicators at this scale are needed to allow decision makers to measure the performance of the policies implemented in these areas.
There are, however, two limitations in accessing socioeconomic information for water resources management, especially at small watershed scales. The first is that this information is often available only at political scales (e.g. country, state and municipality). For watersheds that are smaller or larger than the municipality, or that span two or more municipalities, there is an information gap for the refinement of socioeconomic indicators. The second is the lack of such information on an adequate time scale for small areas, which makes it difficult to assess changes over time (Taylor et al., 2021).
Geospatial remote sensing allows a significant advance in measuring the relationship between economy and water resources (Booker et al., 2012). This tool has been applied to estimate socioeconomic data of municipalities and countries from nighttime light (NTL) satellite images, such as the GDP (Dai et al., 2017a; Gu et al., 2022; Huang et al., 2021; Li et al., 2013) and population (Archila Bustos et al., 2015; Dória, 2015; Zeng et al., 2022), and to monitor spatial and temporal changes in urbanization in cities and regions (Liu et al., 2016; Nel·lo et al., 2017; Pandey et al., 2013; Sono et al., 2022). However, existing studies still mostly estimate socioeconomic data and indicators for political boundaries, and the mismatch with the scales used in water resources management remains a gap in the literature. The present paper helps fill this gap by using NTL satellite data to estimate socioeconomic indicators for small watersheds (from 5 to 100 km²). It complements existing studies at the municipal, county, provincial or other regional scales and provides the information needed to determine the economic (opportunity) cost at the watershed scale.
Our method correlates socioeconomic data with radiance at the municipality scale and uses the results to quantify GDP, population and jobs, which were selected given their capability as indicators of pressure on land and on water resources (Pozzebon et al., 2022).

Methodology
The methodology was divided into three steps. In the first step, the encoded nighttime images of the years 2011, 2014 and 2018 from the DMSP-OLS and the NPP-VIIRS sensors (Chen et al., 2020) were related to three socioeconomic indicators (GDP, population and jobs) of the 497 municipalities of the State of Rio Grande do Sul (RS), southern Brazil, using simple regression (linear, exponential and polynomial). Sensors of this type have the unique ability to capture a low level of radiance at night, in both visible and infrared wavelengths (Elvidge et al., 1997), detecting lighting in cities, industrial parks and rural communities. The images corresponding to these years make it possible to follow the temporal changes in the amount of nighttime radiance, which can result from various processes: city growth causes a distinctive increase in illumination at the peripheral boundary, whereas in other cases the enrichment of adjacent neighbourhoods increases nighttime light emission (Kyba et al., 2017).
In a second step, the regression equations with the best Coefficient of Determination (R²) value were validated by applying them to 50 municipalities in the neighbouring State of Santa Catarina and comparing the results with the observed data. In the third step, the validated equations were applied to estimate the GDP, population and number of jobs in a central urban area and in six small watersheds located in the urban and rural areas in the municipality of Caxias do Sul, RS, for the years 2011, 2014 and 2018.

First step - Socioeconomic indicators and nighttime light satellite images relationship
The linear, exponential and polynomial fits (Gupta et al., 2020; Ross, 2021) were tested and their respective performance was assessed through the corresponding R². The independent variable (X) of the regression functions was the radiance of the nighttime light satellite images and the dependent variable (Y) was one of three distinct socioeconomic indicators: GDP, population and number of jobs.
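The comparison of the three fits can be illustrated with a minimal sketch. It assumes NumPy, uses synthetic data in place of the municipal records, and fits the exponential model by least squares on log(Y); the function names (`r_squared`, `compare_fits`) and that fitting choice are assumptions for illustration, not the procedure documented by the authors.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against observations y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def compare_fits(x, y):
    """Fit linear, exponential and second-order polynomial models; return R² per model."""
    scores = {}
    # Linear: y = a*x + b
    a, b = np.polyfit(x, y, 1)
    scores["linear"] = r_squared(y, a * x + b)
    # Exponential: y = A*exp(k*x), fitted on log(y) (assumption; requires y > 0)
    k, log_a = np.polyfit(x, np.log(y), 1)
    scores["exponential"] = r_squared(y, np.exp(log_a) * np.exp(k * x))
    # Second-order polynomial: y = c2*x^2 + c1*x + c0
    c2, c1, c0 = np.polyfit(x, y, 2)
    scores["polynomial"] = r_squared(y, c2 * x**2 + c1 * x + c0)
    return scores

# Synthetic stand-in for (NTL, indicator) pairs; NOT the data used in the study.
rng = np.random.default_rng(0)
ntl = rng.uniform(10, 5000, size=200)
gdp = 0.002 * ntl**2 + 3.0 * ntl + 500 + rng.normal(0, 50, size=200)

scores = compare_fits(ntl, gdp)
best = max(scores, key=scores.get)
print(scores, best)
```

Because the linear model is nested in the second-order polynomial, the in-sample R² of the polynomial cannot be lower than that of the linear fit, which is one reason an out-of-sample validation is useful.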
GDP represents the amount of goods and services produced in a specific place and period (Williamson, 2016) and corresponds to total income in the economy (Mankiw, 2018). The population has significant importance in the economy, as it is linked to the supply of work and the dynamics of consumption (Castro et al., 2020). The number of jobs is also an indicator of the economic health of a region, evidencing a moment of growth or a recession (Hall, 2005). The GDP (IBGE, 2021a), the population (IBGE, 2022) and the number of jobs (Brasil, 2021) of the 497 municipalities of RS were taken from official bodies to compose the database for the years 2011, 2014 and 2018.
The radiance (or simply NTL), expressed in nanowatts per square centimetre per steradian (nW·cm⁻²·sr⁻¹), was obtained from nighttime satellite images of the DMSP-OLS and NPP-VIIRS sensors (Chen et al., 2020) (Figure 1), with a spatial resolution of 15 arc-seconds (approximately 500 meters). These images went through a decoding-encoding process, resulting in a third composite image in which the effects of incandescence and light saturation were reduced (Chen et al., 2021). The radiance of each municipality for 2011, 2014 and 2018 was obtained by geoprocessing, using the municipal grid (IBGE, 2021b) in shapefile format, georeferenced in WGS84. The radiance of all pixels within the area of each municipality was summed, which quantified the respective NTL for each of the three years. Table 1 presents the statistical summary of the dataset for the years evaluated.
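The per-polygon summation of radiance can be sketched as a zonal sum. The example below assumes the polygons have already been rasterized onto the NTL grid as an integer zone raster (0 meaning outside every polygon); the tiny arrays and the name `zonal_sum` are hypothetical and stand in for the real images and GIS workflow.

```python
import numpy as np

def zonal_sum(radiance, zone_ids, n_zones):
    """Sum radiance per zone. Zone id 0 is treated as 'outside every polygon'."""
    # Ignore no-data pixels (NaN) and pixels that fall outside all polygons.
    valid = np.isfinite(radiance) & (zone_ids > 0)
    sums = np.bincount(zone_ids[valid], weights=radiance[valid],
                       minlength=n_zones + 1)
    return sums[1:]  # drop the slot for zone 0

# Toy 3x3 radiance grid (nW·cm⁻²·sr⁻¹) and matching zone raster (assumed inputs).
radiance = np.array([[1.0, 2.0, 0.5],
                     [4.0, np.nan, 3.0],
                     [0.0, 1.5, 2.5]])
zones = np.array([[1, 1, 2],
                  [1, 2, 2],
                  [0, 3, 3]])

totals = zonal_sum(radiance, zones, n_zones=3)
print(totals)  # summed NTL for zones 1, 2 and 3
```

The same routine applies unchanged whether the zones are municipalities or small watersheds, which is what lets the municipal regressions be evaluated on watershed polygons.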

Second step - Validation
The regression equations with the best R² were selected for validation. The validation was carried out in 50 municipalities in the State of Santa Catarina (SC) (Figure 2), which borders RS. These municipalities were randomly selected on the condition that they all had information about GDP, population and jobs. Radiance values were obtained as described in Section 2.1. Table 2 presents the statistical summary of the dataset for the 50 selected municipalities for the years 2011, 2014 and 2018. The agreement between observed and estimated data was measured using three metrics: the Nash-Sutcliffe efficiency (NSE) (Equation 1), the root mean square error (RMSE) (Equation 2) and the RMSE-standard deviation ratio (RSR) (Equation 3). According to Moriasi et al. (2007), the NSE performance can be categorized into four classes: very good for 0.75 < NSE ≤ 1, good for 0.65 < NSE ≤ 0.75, satisfactory for 0.50 < NSE ≤ 0.65 and unsatisfactory for NSE ≤ 0.50. RSR was also categorized into four classes: very good for 0.0 ≤ RSR ≤ 0.50, good for 0.50 < RSR ≤ 0.60, satisfactory for 0.60 < RSR ≤ 0.70 and unsatisfactory for RSR > 0.70.
NSE and RSR performances classified as very good or good were considered successful for the application of the method. In addition, the RMSE value must be less than 10% of the range (maximum minus minimum) of the observed target data (Al-Murad et al., 2018; De Vargas et al., 2022b).

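$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
Equation 1

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$$
Equation 2

$$\mathrm{RSR} = \frac{\mathrm{RMSE}}{\sigma_{obs}}$$
Equation 3
Where the index i represents the sample; $y_i$ is the observed output at i; $\hat{y}_i$ is the predicted output at i; $\bar{y}$ is the mean of the observed output; n is the total number of samples (n = 50); and $\sigma_{obs}$ is the standard deviation of the observed data.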
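The three validation metrics, the classes of Moriasi et al. (2007) and the 10% range criterion can be sketched as follows. This is a minimal illustration assuming NumPy; the function names and the five-sample arrays are hypothetical, and RSR is computed in the Moriasi et al. (2007) form, which corresponds to a standard deviation normalized by n.

```python
import numpy as np

def nse(obs, pred):
    """Nash-Sutcliffe efficiency."""
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))

def rmse(obs, pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((obs - pred) ** 2)))

def rsr(obs, pred):
    """RMSE divided by the standard deviation of the observations (normalized by n)."""
    return float(np.sqrt(np.sum((obs - pred) ** 2)) /
                 np.sqrt(np.sum((obs - obs.mean()) ** 2)))

def classify_nse(value):
    if value > 0.75:
        return "very good"
    if value > 0.65:
        return "good"
    if value > 0.50:
        return "satisfactory"
    return "unsatisfactory"

def classify_rsr(value):
    if value <= 0.50:
        return "very good"
    if value <= 0.60:
        return "good"
    if value <= 0.70:
        return "satisfactory"
    return "unsatisfactory"

def rmse_within_range(obs, pred, fraction=0.10):
    """True when RMSE is below the given fraction of the observed range."""
    return rmse(obs, pred) < fraction * (obs.max() - obs.min())

# Illustrative values only; the study used n = 50 municipalities.
observed = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
estimated = np.array([12.0, 18.0, 33.0, 39.0, 49.0])

print(classify_nse(nse(observed, estimated)),
      classify_rsr(rsr(observed, estimated)),
      rmse_within_range(observed, estimated))
```

With this normalization RSR equals the square root of (1 − NSE), so the two classifications carry closely related information; the RMSE criterion adds a check in the units of the indicator itself.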

Third step - Small watersheds
The municipality of Caxias do Sul has a population of 523,716 inhabitants (IBGE, 2022) and is characterized by geomorphological diversity and high altimetric variation (De Vargas et al., 2022a), which makes it dependent on damming small creeks for public water supply. Seven small areas (Figure 1C) within this municipality were selected to test the methodology and have the three socioeconomic indicators estimated. Six of them were defined based on watershed limits, five of which have special land use legislation (Caxias do Sul, 2005). These five watersheds were chosen because of their importance to the city's water supply. Hence the management of their water resources ought not to be limited to preservation, but should also focus on understanding the social and economic aspects that coexist there. The last area, named Central, is not a watershed but a part of the central urban area.
The watersheds of the Faxinal (66.77 km²), Marrecas (53.14 km²), Maestra (15.28 km²), Dal Bó (6.31 km²) and Samuara (6.71 km²) creeks represent 8.98% of the municipality area and are responsible for storing water to supply the city of Caxias do Sul (Pozzebon et al., 2021). The Maestra, Dal Bó and Samuara watersheds are partially or totally included in the urban perimeter. The Faxinal watershed is mostly in the rural area, with small west and northeast portions inside the urban perimeter. In the Marrecas watershed, grassland and some forest formations prevail; a small urban area is found only in its southwest portion.
The watershed of the Belo creek (16.27 km²), in turn, does not contribute to the public water supply. It lies in a portion of the municipality that is under strong real estate and economic pressure. This watershed was selected to test the method because its size matches the scope of this study, while its land use and occupation restrictions are not as strict as those of the five previous ones. The Central area (7.11 km²) was selected as a reference to verify changes in the three indicators in a location without increasing radiance within the limits of the polygon and with few vacant areas.
The regression equations with the best fit, i.e. with the R² closest to 1, are the ones used to estimate the Y values for these seven small areas.

Table 3 presents the results relating the variable X (NTL) to the variables Y (GDP, population and number of jobs), obtained by the simple regression method applied to the 497 municipalities in RS. The R² coefficient varied considerably depending on the type of fit. The exponential functions presented the worst fit in all years, showing that they are not the most appropriate to estimate these socioeconomic indicators for regions located in RS. Overall, only 21 to 38% of the variation in GDP, population and number of jobs is explained by the variation in NTL in the exponential fit. For the linear fit, 84 to 96% of the variation in the socioeconomic indicators is explained by the variation in NTL. The second order polynomial functions presented the best fit, with R² coefficients closest to 1 (ranging from 0.939 to 0.986), and therefore best relate the X and Y variables (Figure 3). These results suggest that these indicators for RS for the years 2011, 2014 and 2018 are best estimated with the second order polynomial functions.
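Applying a fitted second-order polynomial to the summed radiance of a small area amounts to a single polynomial evaluation. In the sketch below the coefficients are PLACEHOLDERS chosen only to make the example run, not the equations fitted in this study, and `estimate_indicator` is a hypothetical helper; the two radiance values are the Faxinal totals reported in the text for 2014 and 2018.

```python
import numpy as np

# PLACEHOLDER coefficients (c2, c1, c0) for Y = c2*X^2 + c1*X + c0.
# They are NOT the fitted equations of the study; substitute the real ones.
placeholder_coeffs = (2.0e-3, 1.5, 0.0)

def estimate_indicator(ntl_sum, coeffs):
    """Evaluate the second-order polynomial at a summed radiance value."""
    return float(np.polyval(coeffs, ntl_sum))

# Summed radiance of the Faxinal watershed (nW·cm⁻²·sr⁻¹) as quoted in the text.
faxinal_ntl = {2014: 162.0, 2018: 192.0}

estimates = {year: estimate_indicator(ntl, placeholder_coeffs)
             for year, ntl in faxinal_ntl.items()}
print(estimates)
```

In practice one equation per indicator and per year would be used, so the coefficient tuple would be looked up by (indicator, year) before the evaluation.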

Relationship between radiance and GDP, population and jobs
While the correlation results alone do not imply causation (Field, 2017), several previous studies point to strong evidence that higher GDP is reflected in brighter NTL in regions and cities (Chen and Nordhaus, 2011; Lu and Coops, 2018; Villa, 2016). Similar results were obtained by Dai et al. (2017), who applied simple regression analysis to some provinces and cities in China using images from the DMSP/OLS and NPP-VIIRS sensors and found that the polynomial fit better related GDP values to NTL. Levin and Zhang (2017) also observed a statistically significant correlation (between 0.60 and 0.66) of GDP per capita with nighttime light radiance in the world's 200 largest urban areas.

Validation of the polynomial functions
The second order polynomial regression functions were submitted to validation, as they were the ones with the best fit. The validation results, represented by the NSE, RSR (Table 4) and RMSE performance metrics, confirm that all equations are adequate to estimate GDP, population and the number of jobs for areas without data in the study region. The NSE and RSR classified the performance of the equations as "Very Good" and all RMSE values are far below 10% of the range (maximum minus minimum) of the observed data.

Estimation of GDP, population and jobs for small watersheds
The change in radiance values across 2011, 2014 and 2018 in the seven areas is depicted in Figure 4. The colour ramp in this figure shows an increase in radiance between 2011 and 2014 (a 54.1% increase in NTL in the period), which may be linked to the economic growth of Caxias do Sul and the region. In the same period, the municipal GDP increased by 36.1% (IBGE, 2021a). On the other hand, from 2014 to 2018 the NTL of the seven areas, summed together, fell by 8.7%. In this period, the increase in municipal GDP was limited to 15.3% (IBGE, 2021a). Nationwide, the variation of Brazilian GDP for the four-year period 2011-2014 compared to the previous four-year period (2007-2010) was 2.3%, while for 2015-2018 it was −1.1% (Balassiano and Pessôa, 2021). Between 2014 and 2018 radiance decreased in six of the seven areas (Table 5), suggesting that the low national and municipal economic performance is reflected in the small areas studied in this period.
In the Faxinal watershed, on the other hand, the total radiance between 2014 and 2018 increased from 162 to 192 nW·cm⁻²·sr⁻¹, with the increase located along the northeast, south and west limits of the area and in a concave arc in the south-west direction (Figure 4). This distinct behaviour of Faxinal within this period can be explained by the population growth observed by Machado et al. (2022), who identified the advance of irregular human occupations in these portions of the watershed. This type of relationship between radiance and anthropic occupation is also evidenced by Ge et al. (2018), who demonstrated the possibility of identifying "ghost" urban regions in cities, corroborating the applicability of the method based on nighttime satellite imagery for studies in small areas.
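The relative changes in radiance quoted in this section follow from a simple percentage-change calculation. As a small worked sketch (the helper name `percent_change` is an assumption), the Faxinal totals above give an increase of roughly 18.5% between 2014 and 2018:

```python
def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old

# Faxinal summed radiance (nW·cm⁻²·sr⁻¹) in 2014 and 2018, as quoted in the text.
faxinal_change = percent_change(162.0, 192.0)
print(round(faxinal_change, 1))  # about 18.5
```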
The socioeconomic indicators estimated from the polynomial functions are presented in Table 5. Verifying the estimated socioeconomic results is difficult, as there is little information available in the literature or from official bodies for such areas. However, simple comparisons were performed when possible. For instance, the GDP estimated by the equation of 2018 (Tables 4 and 5) for the Marrecas watershed was R$ 20,751/ha. Information obtained at the municipal level showed that the agricultural yield per hectare planted in Caxias do Sul for that same year was R$ 17,657/ha (SEBRAE/RS, 2019). The GDP obtained by the method presented in this study overestimates the value by 18.3%, a difference that may be linked to the fact that the former considers the GDP of agriculture and livestock, while the latter represents only agricultural income.
Other values for GDP with a compatible scale were not identified in the literature or in official bodies, making it impossible to validate or perform a comparative analysis of the estimated values for areas with urban characteristics. The same is true for jobs and population in either urban or rural areas, a fact that, per se, reinforces the importance of this study for the generation of estimated data for small watersheds.
Together, these seven areas (171.3 km²) correspond to 10.5% of the total area of Caxias do Sul, but their share of the indicators is considerably larger. Considering the estimated average values for the three years, the GDP of these areas corresponds to 23.4% of the municipal GDP, the population to 30.4% and jobs to 21.5%. This difference between the area share and the indicator shares demonstrates the importance of these small areas in the local socioeconomic scenario.

Limitations of the method
The Maestra watershed, for instance, is located in an urban area with a large portion of remaining vegetation and rural uses. Vegetation in this area was identified as negatively correlated with the NTL, since it may totally or partially block the emission of light into the atmosphere (Levin and Zhang, 2017b). This is a limitation of the method to be considered in the generation of estimated socioeconomic data.
Another limitation identified is the scale of work in relation to the scale of the images used. NTL has established its place as a proxy for socioeconomic indicators at a macro level, but the evidence is incomplete at a local level, mainly due to data constraints at finer scales (Huang et al., 2021; Määttä et al., 2021). This is often the case for water resources applications in small watersheds, as they are not census tract units and socioeconomic data are simply non-existent for these areas. Liu et al. (2022) showed that population density at fine scales had a very strong positive correlation with coarse NTL satellite images, but concluded that NTL emissions at microscales are hardly a proxy for per capita income. On the other hand, Mellander et al. (2015) found a moderate correlation (approximately 0.5) between NTL and economic activity at a micro level. Smaller regions exhibit stronger nonlinearity, since smaller units are more homogeneous in terms of population density, economic activity and economic structure (Bluhm and McCord, 2022), which supports the use of the polynomial fit that performed best in this study. Finally, Määttä et al. (2021) confirmed that NTL data are an even stronger proxy for economic development at a local level than previous literature suggests.
The availability of NTL data with a finer spatial resolution in the near future and, perhaps, the treatment of some watersheds as census tract units to generate more adequate socioeconomic indicators can aid data estimation for hydroeconomic assessment and modelling.

Conclusions
This study presented a methodology to estimate socioeconomic indicators for small watersheds based on nighttime light satellite images. The method was tested in seven small areas in the municipality of Caxias do Sul, southern Brazil, and three indicators (GDP, population and jobs) were estimated for the years 2011, 2014 and 2018. Among the simple regression models tested, the second order polynomial proved to be the most appropriate fit to estimate such information for the State of Rio Grande do Sul. The equations were then validated in 50 municipalities in the neighbouring State of Santa Catarina, confirming their ability to estimate these types of indicators.
Two limitations may influence the estimated data. One is the presence of vegetation, which can partially prevent the emission of light into the atmosphere. The other is the scale of the study areas relative to that of the NTL images. Further investigation of these limitations is beyond the scope of this study, but future studies evaluating these and other socioeconomic indicators at the small watershed scale are strongly recommended. In addition, a thorough analysis over time can be performed when data are available, since nighttime light satellite images form a long time series, contributing even more to the management of water resources in small watersheds without official data.
Despite these limitations, this novel application of NTL has proved to be a helpful tool to estimate socioeconomic data for small watersheds, boundaries that normally lack such information. It thus sheds new light on the ongoing debate in hydroeconomic modelling on how to improve the identification of water management policies in these areas.
Based on these findings, water resources analysts are provided with an easy approach to feed their hydroeconomic models, thus aiding decision makers towards more assertive policies in small watersheds.