Complex network analysis of climate and landscape satellite data to explore spatio-temporal patterns in urban environment: the case of Athens

Motivated by the significance and complexity of exploring spatiotemporal patterns (regions) within an urban environment, particularly in the context of extreme heat events, this research analyzes meteorological time series through complex network analysis. The data cover sections of the urban landscape of Athens, Greece, and were obtained from the Copernicus observation component of the European Union. Initially, the time series are transformed into networks using the correlation network methodology. We then examine the discriminative capability of two topological measures of the networks, degree and modularity, as community (region) detection methods. Our findings suggest that the proposed complex network analysis can extract spatial urban regions closely linked to land use and building heights in the corresponding areas. These results may help investigate the spatial variability of heat in the urban environment and inform urban planning, policy decisions regarding the intensity of urban heat throughout the city, and the planning of climate change adaptation strategies.


Introduction
Analysis of spatiotemporal observations, such as meteorological variables, is of significant importance because the spatial and temporal variability of the observable physical quantities helps scientists understand the underlying processes of urban regions under potential extreme heat events (Yang et al. 2016; Deilami et al. 2018). As identifying and assessing urban regions is a challenging issue, our proposed methodology explores the potential of complex network analysis to identify characteristic regions (patterns) in the urban landscape based on climate data (land surface temperature, wind speed, and relative humidity). This approach aims to support decision makers in land use planning or climate mitigation.
Several methods have been suggested to identify urban areas. The existing methods can be divided into two categories: unsupervised and supervised. Unsupervised methods include those based on structural characteristics of very high resolution (VHR) remote sensing images (Iannelli et al. 2014) and those using a Harris-based feature point set and an adaptive orientation-sensitive voting technique (Kovács and Szirányi 2012). Yuan et al. (2005) describe classification and change detection methods to produce accurate landscape change maps. The other group consists of supervised detection methods. Benediktsson et al. (2003) proposed a method based on feature extraction (or feature selection) and classification. Bruzzone and Carlin (2006) classify urban areas using a support vector machine (SVM). Hu et al. (2016) used multi-scale features to build a supervised framework for built-up area detection. Rawat and Kumar (2015) employed supervised classification methods to illustrate the spatiotemporal dynamics of land use/cover, while Tian et al. (2018) employed deep convolutional neural networks to detect urban regions. Vaz (2016) studied the possible spatial interpretations of landscape change by defining the role of Geographic Information Systems in enabling sounder urban and regional interactions and in guiding the direction of future regional planning.
Graphs that illustrate spatial relationships and interactions between different elements of urban landscapes provide an effective framework for capturing the complex spatial structures inherent in urban environments (Batty 2007, 2009; Boeing 2019; Barthélemy 2011; Ding et al. 2019). By modeling urban areas as graphs, nodes can represent specific geographic locations or features, while edges signify spatial connections or relationships between these locations (Porta et al. 2006, 2009; Crucitti et al. 2006a, b). Techniques such as community detection algorithms and graph partitioning methods enable the identification of cohesive clusters or regions within urban networks based on connectivity patterns (Newman and Girvan 2004; Newman 2018; Latora and Marchiori 2007; Domingues et al. 2022; De Montis et al. 2013). Centrality measures, such as betweenness centrality, closeness centrality, and eigenvector centrality, provide valuable insights into the significance and prominence of various urban elements like streets, intersections, and nodes within transportation networks (Crucitti et al. 2006a, b; Porta et al. 2006). With this aim, the objective of the present work is to investigate the identification of known characteristic regions (patterns), taken from the Urban Atlas land Copernicus data (https://land.copernicus.eu/local/urban-atlas/urban-atlas-2018), from an alternative view focused on the relationships among physical variables and complex network analysis, in combination with satellite data (Sevtsuk and Mekonnen 2012; Sharifi 2019; Zhang et al. 2022; Ziliaskopoulos and Laspidou 2024). Complex network-based time series analysis has gained considerable attention and has provided new insights by using graph theory, with fruitful achievements in addressing interdisciplinary challenges (Gao et al. 2017; Zou et al. 2019; Yang and Yang 2008). By employing network analysis techniques, such as community detection algorithms, we were able to identify spatial clusters within the urban environment exhibiting similar patterns. Community structures, defined as groups of nodes that are more densely connected to each other than to the rest of the network, exist widely in many real-world complex systems. This methodology offers an understanding of urban microclimates and can support policy decision-making regarding them.
The data sets used in this work were obtained from Copernicus (Climate variables for cities in Europe, and Urban Atlas). We focused on an area in Athens's city center, which is of particular interest for analysis since it is characterized by variations in geomorphological relief, land use, and building heights. This area was divided into a grid with 100 m × 100 m resolution, where each grid point was associated with meteorological features. We first performed a Pearson correlation coefficient analysis among the grid points independently for each variable (temperature, relative humidity, and wind speed). This process results in three distinct spatial correlation matrices, one for each variable. Subsequently, we convert each correlation matrix into a correlation network comprising 2417 nodes corresponding to spatial points, with edges between nodes representing the Pearson correlation value obtained above. Then, we explore the topological properties of the networks associated with the series, such as degree and modularity. The discriminative capability of modularity lies in its ability to uncover distinct communities or clusters within a complex network and, thus, a complex system. Modularity, as a measure of the quality of a network partition, assesses how well nodes group into densely connected communities with only sparse connections between them. By identifying modules or groups of nodes with high intra-connectivity and low inter-connectivity, modularity can reveal underlying structures and patterns.
We would like to clarify that our aim is not to classify the land use of the spatial points, since we take this information as given from the Urban Atlas land Copernicus data, as mentioned above. We must also stress that we do not use the information concerning land use or building height as input in the construction of the networks. The only input is the climatic time series that correspond to each spatial point. Land use is mentioned in order to examine, once the network analysis is performed, whether the resulting regions present any qualitative dependence on or correlation with the building height and/or the land use. In an urban environment, the interplay of land elevation and building height with wind speed, temperature, and land use may result in more complex interactions between different spatial regions, which are not directly observable from the statistics of the climatic variables alone and cannot be attributed to land use alone. Complex network analysis that considers interactions between spatial nodes may be helpful in this case and may help classify regions differently.
Overall, our study suggests that complex network analysis, together with classical techniques, could be used to characterize urban regions with distinct properties associated with other features such as land uses, soil elevation, and building height.

Site and data description
The study area for implementing the proposed approach is located in the central area of Athens. Athens is the capital and the largest city of Greece, and its municipality covers a land area of 38.96 km². The examination area lies between points 39°97' N − 23°71' E, 37°96' N − 23°73' E, 38°00' N − 23°79' E and 38°03' N − 23°77' E, as shown in Fig. 1. This area was selected due to its scientific interest, as it includes the downtown core of the city and areas with diverse morphological characteristics. Moreover, this area of Athens is of particular scientific interest, as evidenced by the city's participation in the European H2020 ARSINOE program (https://arsinoe-project.eu/), which, among other issues, examines the detection of areas with different characteristics by studying meteorological and geomorphological features.
The entire area is divided into a 100 m × 100 m resolution grid, comprising 2417 locations (points). Each grid point is associated with a value of Land Surface Temperature (LST), Relative Humidity (RH), and Wind Speed (WS), which were provided by the Copernicus observation component of the EU (https://www.copernicus.eu/en/access-data). We removed extreme values, spikes, and erroneous values from the corresponding data sets.
Land surface temperature and relative humidity are measured near the surface, usually at a height of 2 m, while wind speed is measured at a height of 10 m. The sampling interval was 1 h for all variables, and every measurement is an average over 60 min, resulting in 24 values per day. Thus, the length of each monthly time series is 720 values (24 × 30 in units of Δt = 1 h). The available data cover a period of four years (2014-2017) for the months of April to September. In Fig. 2 we can see the grid area (included within the dashed line) as well as representative temperature, humidity, and wind speed time series from a specific grid point.
Furthermore, the region being studied displays a wide range of variations in terms of land uses, soil elevation, and building heights. Each point is linked with a particular land use type and the height of the associated buildings. The database provides a comprehensive description of the various land use categories, which are outlined in Table 1.
Fig. 1 The study area of Athens city center. The red star indicates the Acropolis, while the blue corresponds to Lycabettus Hill
Fig. 2 The study area of Athens city center. The red time series corresponds to temperature, the blue one to relative humidity, and the black one to wind speed, respectively, for a specific grid point
The different land uses at each point on the grid, as well as the building heights, are derived from the Urban Atlas land Copernicus data (https://land.copernicus.eu/local/urban-atlas/urban-atlas-2018) and are presented in Fig. 3.
In this paper we present the results of three months: April, July and September of the same year (2014).This selection is due to their different behavior, in terms of climatic conditions, as April belongs to the spring season, July is in the middle of summer and September belongs to the end of summer.

Methodology
In this section, we briefly describe the proposed graph-based methodology for urban area detection. The method mainly consists of the steps presented in Fig. 4, which illustrates the methodological framework of the study.
The first step was to match the data set of land surface temperature, relative humidity, and wind speed into grid points.Then, we performed Pearson correlation analysis to examine the correlation between them.By constructing the correlation undirected weighted graphs and applying the network degree and community detection algorithm, we visualized each network based on the computed property.Finally, we projected the nodes of the network at the corresponding points on the map.
By examining the Pearson correlations between these attributes across all grid points, the complex correlation network was constructed and then we evaluated the degree and modularity of the network.The results of the analysis indicate that the area can be divided into sub-areas (three to five).The ability to separate into sub-regions with similar characteristics or dynamics is verified by jointly studying land uses, ground height, and building height.

Linear correlation
One of the most widely used correlation methods is Pearson's correlation (Cohen et al. 2009), which quantifies the relationship between two variables that are measured on the same interval or ratio scale. It serves as a measure of the strength of the association between two continuous variables, and its value varies from −1 to +1, where +1 indicates a strong positive relationship (both variables increase or decrease together), −1 indicates a strong negative relationship (one variable decreases as the other increases), and uncorrelated variables tend to have a Pearson score close to zero. Given paired data $\{(x_1, y_1), \ldots, (x_n, y_n)\}$ consisting of $n$ pairs, $r_{xy}$ is defined as
$$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
where $n$ is the sample length, $x_i$ and $y_i$ are the individual sample points indexed by $i$, and $\bar{x}$ and $\bar{y}$ are the sample means.
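As a minimal sketch of the coefficient described above, the following pure-Python functions compute Pearson's r for two series and the symmetric correlation matrix for a set of grid-point series. The function names `pearson` and `correlation_matrix` and the toy data are illustrative assumptions, not the code used in this study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(series):
    """Symmetric matrix of pairwise Pearson correlations, one row per grid point."""
    n = len(series)
    corr = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            corr[i][j] = corr[j][i] = pearson(series[i], series[j])
    return corr

# Toy example: three short "time series" (made-up values, not study data).
series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # rises with the first series
    [8.0, 6.0, 4.0, 2.0],   # falls as the first series rises
]
corr = correlation_matrix(series)
```

For a full grid of 2417 points and 720 hourly values, a vectorized numerical routine would likely be preferable to these nested loops; the sketch only illustrates the definition.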

Complex network analysis and topological measures
The field of network science has made significant progress and has become a crucial tool for exploring the dynamic and structural features of real systems in various fields such as physics, biology, and social sciences. The manner in which nodes are connected by edges in a network is critical. Numerous approaches have been proposed to transform time series data into complex networks. In this study, we utilize the linear correlation coefficient to establish connections between nodes. A detailed overview of these methods can be found in Zou et al. (2019).
According to the correlation network, two nodes $x(t_i)$ and $x(t_j)$ are connected in the associated graph at time $t$ if the correlation coefficient is larger than a threshold value. In a network mapped using this criterion, each grid point corresponds to a network node.
From a mathematical perspective, a network is represented by a graph $G = (N, E)$, which consists of a set of $N = \{n_1, n_2, \ldots, n_N\}$ vertices or nodes connected by a set of $E = \{e_1, e_2, \ldots, e_E\}$ links or edges. A network can be described using its adjacency matrix $A = [a_{ij}]$, which encodes the graph's connectivity structure. The adjacency matrix is an $N \times N$ matrix for a graph with $N$ nodes.
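The thresholding rule and the adjacency matrix described above can be sketched as follows. The text does not state the threshold used, so the value 0.8, the function name `correlation_network`, and the toy correlation matrix are all hypothetical choices for illustration.

```python
def correlation_network(corr, threshold):
    """Binary adjacency matrix: a_ij = 1 if corr[i][j] > threshold (i != j), else 0."""
    n = len(corr)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if corr[i][j] > threshold:
                adj[i][j] = adj[j][i] = 1  # undirected edge, so the matrix is symmetric
    return adj

# Toy 3x3 correlation matrix (made-up values, not study data).
corr = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.85],
    [0.2, 0.85, 1.0],
]
adj = correlation_network(corr, threshold=0.8)  # 0.8 is a hypothetical threshold
```

A weighted variant would store `corr[i][j]` instead of 1 for each retained edge, which matches the weighted graphs mentioned in the methodology.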
The degree $k_i$ of a node $i$ is defined as the total number of edges adjacent to that node and can be calculated as
$$k_i = \sum_{j=1}^{N} a_{ij}$$
and the average degree $\langle k \rangle$, which is the mean value of $k_i$ over all vertices, is a global measure of the network's connectivity.
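As a small illustration of the node degree and the average degree, the sketch below sums the rows of a binary adjacency matrix. The helper names and the three-node path graph are assumptions made for the example.

```python
def degrees(adj):
    """Degree of each node: the row sums of a binary adjacency matrix."""
    return [sum(row) for row in adj]

def average_degree(adj):
    """Mean degree over all nodes."""
    k = degrees(adj)
    return sum(k) / len(k)

# Toy path graph 0 - 1 - 2 (illustrative only).
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
k = degrees(adj)
k_mean = average_degree(adj)
```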
The modularity of a network, introduced by Newman and Girvan (2004), serves as a measure for identifying communities or clusters within a network. Essentially, modularity quantifies the density of links between nodes within communities relative to what would be expected in a random network. Networks exhibiting high modularity indicate dense connections among vertices within groups, suggesting closely connected communities that facilitate efficient transmission of information.
Numerous community detection methods have been documented in the literature, including the Girvan-Newman and Louvain algorithms, among others. Following the Girvan-Newman approach, consider a network with $N$ nodes divided into two groups. Let $s_i = 1$ if vertex $i$ belongs to group 1 and $s_i = -1$ if it belongs to group 2. The modularity $Q$ is defined as
$$Q = \frac{1}{4m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) s_i s_j$$
where $A_{ij}$ is the adjacency matrix, $k_i k_j / 2m$ is the expected number of edges between vertices $i$ and $j$ if edges are placed at random, $k_i$ and $k_j$ are the degrees of the vertices, and $m = \frac{1}{2} \sum_i k_i$ is the total number of edges in the network.
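To make the two-group modularity concrete, the sketch below evaluates it for a toy graph, assuming the standard form Q = (1/4m) Σ_ij (A_ij − k_i k_j / 2m) s_i s_j with s_i = ±1. The function name `modularity_two_groups` and the example graph (two triangles joined by one edge) are illustrative assumptions.

```python
def modularity_two_groups(adj, s):
    """Two-group modularity, assuming Q = (1/4m) * sum_ij (A_ij - k_i*k_j/(2m)) * s_i*s_j."""
    n = len(adj)
    k = [sum(row) for row in adj]   # node degrees
    m = sum(k) / 2                  # total number of edges
    total = 0.0
    for i in range(n):
        for j in range(n):
            expected = k[i] * k[j] / (2 * m)  # expected edges under random placement
            total += (adj[i][j] - expected) * s[i] * s[j]
    return total / (4 * m)

# Toy graph: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1

q_split = modularity_two_groups(adj, [1, 1, 1, -1, -1, -1])  # one triangle per group
q_single = modularity_two_groups(adj, [1, 1, 1, 1, 1, 1])    # everything in one group
```

Splitting the graph along the bridge gives a clearly positive Q, while placing all nodes in a single group gives Q = 0, which is the behavior a modularity-based partition exploits.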
Another modularity approach is the Louvain Method which has been proposed by Blondel et al. (2008).It is a very fast, efficient heuristic algorithm that maximizes the modularity of nonoverlapping community structure through an iterative, hierarchical optimization process.This method is more efficient for identifying communities in large networks.
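The sketch below illustrates only the first, local-moving phase of a Louvain-style optimization: each node starts in its own community and is repeatedly moved to the neighbouring community that yields the largest modularity gain. The hierarchical aggregation step of the full method is omitted, and the function name `local_moving`, the dictionary-based graph format, and the toy graph are assumptions for illustration rather than the implementation used in this study.

```python
from collections import defaultdict

def local_moving(adj):
    """Local-moving phase of a Louvain-style heuristic.

    adj: dict node -> dict neighbour -> weight (symmetric, no self-loops).
    Returns a dict node -> community label.
    """
    k = {u: sum(nbrs.values()) for u, nbrs in adj.items()}  # weighted degrees
    m2 = sum(k.values())                                    # 2m, the total degree
    comm = {u: u for u in adj}   # every node starts in its own community
    tot = dict(k)                # summed degree of each community
    improved = True
    while improved:
        improved = False
        for u in adj:
            old = comm[u]
            w_to = defaultdict(float)        # edge weight from u to each community
            for v, w in adj[u].items():
                w_to[comm[v]] += w
            tot[old] -= k[u]                 # temporarily remove u from its community
            # Modularity gain of joining community c, up to a constant factor:
            #   k_{u,c} - tot_c * k_u / (2m)
            best, best_gain = old, w_to[old] - tot[old] * k[u] / m2
            for c, w in w_to.items():
                gain = w - tot[c] * k[u] / m2
                if gain > best_gain:
                    best, best_gain = c, gain
            tot[best] += k[u]
            comm[u] = best
            if best != old:
                improved = True
    return comm

# Toy graph: triangles {0,1,2} and {3,4,5} joined by the edge 2-3, unit weights.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
graph = {u: {} for u in range(6)}
for a, b in edges:
    graph[a][b] = graph[b][a] = 1.0

communities = local_moving(graph)
```

On this toy graph the procedure settles on the two triangles as separate communities. In practice a maintained library implementation of the full Louvain method would be the more sensible choice for a network with thousands of nodes.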

Results and discussion
For each observed variable (temperature, relative humidity, and wind speed) we construct a graph, based on the methodology known as the correlation network, where the connectivity between two nodes is defined by the Pearson correlation coefficient between the corresponding time series of the variable at the two points. In this way, we have constructed three networks, one for each variable. As mentioned above, the proposed methodology was applied to each month of the data set, but for brevity we present here the results for three characteristic months of one year (2014): April, July, and September.
The time series of land surface temperature, relative humidity and wind speed of one grid point are displayed in Fig. 5(a-c).
The heat map in Fig. 6 shows variable values for April 2014, giving insights into their distribution across the area. In the map's southern region, where the city center is located, the highest temperatures are observed, which are represented by a bright red color. Moving north, away from the city center, a decline in temperature becomes evident. Additionally, the figures for temperature and relative humidity in all three months demonstrate a negative correlation between the data, as expected. In addition, areas that have either high soil elevation or high buildings in combination with the designation as green areas (Fig. 3) are depicted as white scattered areas in the temperature figures and as bright blue in the relative humidity figures. The wind speed, as it fluctuates greatly, shows greater variation across the map.
Fig. 3 The map shows the land uses (left) and the corresponding building heights (right) for each point on the map
Fig. 4 Illustrated methodological framework
As mentioned above, the three variables have been recorded at each point on the map with a sampling interval of one hour. Applying the Pearson correlation method separately for each variable, we extracted the corresponding correlation matrices between all grid points on the map (a matrix of 2417 × 2417 points).
The core part of this study is to build a graph from the correlation values representing spatial relationships and interactions between variables of urban landscapes. Thus, using the methodology of correlation networks, we construct the network illustrated in Fig. 7, which is based on the above correlation matrix. In this way, we construct a network with as many nodes as there are points in the grid area (2417 nodes).
Modeling urban areas as graphs involves assigning specific geographic locations as nodes and spatial connections or relationships between these locations as edges. Once the network is constructed, we evaluate its topological properties, such as degree and modularity, to detect communities or clusters within the network. Figure 8 provides a visual representation of the network's projection on the map, displaying the constructed network with each node representing a point in the map's coordinate system. Each node is marked with a different color corresponding to its assigned community, and these groups are depicted on the map. It is important to note that the nodes in the network correspond to geographical points on the map.
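Because each node corresponds to a grid point, projecting the communities onto the map amounts to grouping point coordinates by community label. The following minimal sketch shows this bookkeeping step; the function name `group_points_by_community`, the coordinates, and the labels are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

def group_points_by_community(coords, labels):
    """Group grid-point coordinates by the community label of the matching node."""
    regions = defaultdict(list)
    for point, label in zip(coords, labels):
        regions[label].append(point)
    return dict(regions)

# Hypothetical (longitude, latitude) pairs and community labels, for illustration only.
coords = [(23.71, 37.97), (23.72, 37.97), (23.73, 37.98), (23.74, 37.98)]
labels = [0, 0, 1, 1]
regions = group_points_by_community(coords, labels)
```

Each entry of `regions` can then be drawn in its own color to obtain a map of the detected communities.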
Figures 9, 10, and 11 illustrate the outcomes of community formation using the modularity and degree properties. We apply the methodology separately to land surface temperature, relative humidity, and wind speed. The figures are ordered by month to facilitate comprehension and interpretation of the results.
The network partitioning method divides the area into four regions for April and five regions for July and September. Across all three cases, two common areas are identified: one further north, depicted in yellow on the map, and the scattered regions in light green. The boundaries of these areas remain consistent, with the yellow area slightly larger in April. Upon examining Figs. 2 and 3(a-b), it becomes apparent that areas marked in light green correspond to regions designated as green areas in the land cover/use Copernicus Urban Atlas 2018, which are also characterized by low building heights. In July, when average temperatures are higher, five groups of regions are formed, with the central region getting smaller and the southern region being divided into two sub-regions. The dynamics of the regions change in September, when the central region splits into two subsections.
Fig. 6 Heat map of the temperature mean, relative humidity, and wind speed for April, July, and September 2014
The figures depicting the separation and localization of areas using the network degree are of particular interest. This approach reveals distinctly characteristic properties of the formed regions on the map. Notably, we observe scattered areas in brown, corresponding to areas characterized as green according to land cover and to regions with very low altitudes. Besides separating these areas, which are scattered throughout the study area, the method reveals an area of intense grey located west of an area with high soil elevation. In the network domain, this area consists of nodes that have many connections to other nodes. Since the network was formed from data correlations, these nodes are linked by edges with high weights, indicating strong correlations.
These findings are consistent with those obtained from the analysis of relative humidity data. Similarly, we observe (Fig. 10) the formation of groups in the area, whereby, using the modularity method, five groups are formed for April and July and four groups for September. The most characteristic region corresponds to the green area region according to land cover/uses (Fig. 3).
The formation of the regions using the network degree is similar to that obtained from the temperature data. That is, the green areas are identified in brown. At the same time, in the center, a region is formed with characteristic boundaries mainly peripheral to an area with high soil elevation (hill). Furthermore, an additional area with many connections is observed in the northern part in July, possibly related to the distribution of high relative humidity values.
Finally, using wind speed data, the formation of groups is less distinct than in the analysis of temperature values and relative humidity.We can identify areas that correspond to regions characterized as green and with low building-ground height, but in the rest of the regions, the formation has no defined boundaries.This may be due to the dynamic evolution of wind speed values, where fluctuations are most pronounced.
The physical interpretation of these results, in combination with other characteristics of the study area, such as land uses, elevation of buildings, soil altitude and temperature, humidity, and wind speed observations, seems very promising.It provides an alternative way to extract characteristic urban spatiotemporal patterns that may complement the existing techniques.

Conclusions
This study examines time series of hourly surface temperature, humidity, and wind speed on a 100 m grid, provided by Copernicus. It explores the potential of complex network analysis for unveiling urban spatiotemporal region patterns. Of particular interest is that both modularity and network degree precisely identify green areas, as characterized by land use. Based on these results, we suggest that the proposed procedure can provide useful information for characterizing regions with different patterns in the study area. It also appears promising for studying the urban heat island effect, since it can shed light on the interactions between regions of different characteristics. Overall, community detection based on graph analysis is very promising: it provides an alternative way to extract information from time series and may complement existing techniques. An essential future extension of this study is its application to data sets from larger urban regions, incorporating multilayer and multiplex network methodologies.

Fig. 5 Land surface temperature (a), relative humidity (b), and wind speed (c) time series of a grid point for a period of one month (720 values in total: 24 per day × 30 days)

Fig. 8 Projection of the network based on modularity and degree onto the map

Fig. 9 Community detection based on modularity (a) and network degree (b) using land surface temperature variable

Fig. 11 Community formation based on modularity (a) and network degree (b) using wind speed values