Geographically weighted regression for spatial analysis in BigQuery
Although leading data warehouses already offer some level of support for spatial data, they can usually only work with core spatial functionalities and lack some of the advanced analytical capabilities required for many geospatial use cases. The CARTO Analytics Toolbox extends the geospatial capabilities of the most popular cloud data warehouses using spatial SQL, which means both easier integration and wider accessibility, given SQL’s universal adoption within the spatial community. Our Analytics Toolbox already unlocks more than 60 advanced spatial functions through a set of User Defined Functions (UDFs) and procedures covering a broad range of spatial use cases, including data transformations, spatial indexing, and advanced functions to carry out geocoding, clustering, route calculations and more.
As part of the Advanced modules for Google BigQuery, we have now added support for Geographically Weighted Regression (GWR), a statistical regression method that models the local (e.g. regional or sub-regional) relationships between a set of predictor variables and an outcome of interest.
Suppose we have data across the whole UK on the number of crimes per area, as well as other associated variables (such as the unemployed active population or the level of urbanity of the area), and we want to model the number of crimes as a function of those variables. The output from this regression model would be a set of parameter estimates, each reflecting the relationship between the number of crimes and a particular attribute. The parameter estimates in this scenario are global statistics: they describe the average relationship for the whole of the UK, which is assumed to be a good representation of any local relationship across the territory.
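As a point of reference, a global model of this kind is usually written as an ordinary linear regression. The notation below is ours and is only a sketch: we assume $y_i$ is the number of crimes in area $i$, $x_{ik}$ is the value of the $k$-th predictor (for example, the unemployed active population) in that area, and $\varepsilon_i$ is an error term. Each $\beta_k$ is a single number shared by every area, which is what makes the estimates "global".

$$
y_i = \beta_0 + \sum_{k=1}^{p} \beta_k \, x_{ik} + \varepsilon_i
$$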
However, the global relationship might hide local differences, as well as contrasting relationships in different parts of the study area that tend to cancel out at the global level. In other words, the relationship between the variables may have particularities in specific areas of the country (e.g. Central London, the Manchester area, rural areas) that get cancelled out if we only study the behaviour in aggregate.
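The cancelling effect can be illustrated with a small synthetic example. The sketch below is not the GWR algorithm and does not use the Analytics Toolbox. It simply compares one global least-squares fit against separate fits for two hypothetical regions whose relationships have opposite signs. All data, region names and coefficients are made up for illustration.

```python
import numpy as np

# Fixed seed so the illustration is reproducible.
rng = np.random.default_rng(0)

# Hypothetical synthetic data: 200 areas in each of two made-up regions.
# x could stand for a predictor such as an unemployment indicator,
# y for the number of crimes; neither comes from real data.
n = 200
x_north = rng.uniform(0, 10, n)
x_south = rng.uniform(0, 10, n)

# Opposite local relationships: positive in one region, negative in the other.
y_north = 5.0 + 2.0 * x_north + rng.normal(0, 1, n)
y_south = 25.0 - 2.0 * x_south + rng.normal(0, 1, n)


def ols_slope(x, y):
    """Return the slope of a simple least-squares line fitted to (x, y)."""
    slope, _intercept = np.polyfit(x, y, 1)
    return slope


# One global fit over all areas pooled together.
global_slope = ols_slope(
    np.concatenate([x_north, x_south]),
    np.concatenate([y_north, y_south]),
)

# Separate local fits, one per region.
north_slope = ols_slope(x_north, y_north)
south_slope = ols_slope(x_south, y_south)

print(f"global slope: {global_slope:.2f}")
print(f"north slope:  {north_slope:.2f}")
print(f"south slope:  {south_slope:.2f}")
```

In this setup the pooled slope comes out close to zero, while each regional fit recovers a strong relationship of opposite sign. GWR addresses the same problem without requiring predefined regions, by estimating the coefficients locally across the study area.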
| This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 960401. |







