Measure Area and Distance of GIS Data by Geopandas in Colab
We have discussed how to use geopandas in Colab to plot maps from GIS data (Yiu, 2021). This article uses geopandas to measure the areas of polygons defined by boundaries and the distances between coordinate points.
We start by downloading GIS data with polygon boundaries. I use the Wellington City Council’s open GIS data that defines suburb boundaries, available at https://data-wcc.opendata.arcgis.com/datasets/WCC::wcc-suburb-boundaries/about.
1. The first step is always installing the software. Here we install geopandas and contextily, then import them.
!pip install geopandas #mapping
!pip install contextily #basemap
import geopandas as gpd
import contextily as ctx
2. Then we collect a shapefile (.shp) of the suburb boundaries of Wellington, New Zealand, provided by Wellington City Council. Here I save the .shp file and its companion files to my Google Drive first, mount the drive in Colab, and pass the file path to geopandas to read the file. Then print the first 5 rows (Figure 1).
from google.colab import drive
drive.mount('/content/drive/')
#import the Wellington City Council's Suburb Boundaries Map from https://data-wcc.opendata.arcgis.com/datasets/WCC::wcc-suburb-boundaries/about
data = gpd.read_file("drive/MyDrive/Colab Notebooks/WCC_Suburb_Boundaries.shp")
data.head()
Next, plot the suburbs in shades of blue on a Terrain basemap (Figure 2).
ax = data.plot(column="suburb", figsize=(25,12.5), cmap='Blues') #colored suburbs
ctx.add_basemap(ax, crs=data.crs, source=ctx.providers.Stamen.Terrain) #basemap
To measure areas, we set the “suburb” column as the index, then compute the area of each suburb with the .area attribute of geopandas. The areas are in square metres, so I divide them by 1,000,000 to convert them into square kilometres (km²).
data_suburb = data.set_index("suburb") #set suburb column as the index
data_suburb["area"] = data_suburb.area/10**6 #measure the area of each suburb by .area in km2
data_suburb["area"]
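To make the area calculation and the unit conversion concrete, here is a minimal pure-Python sketch of the planar (shoelace) area of a simple polygon without holes. It is an illustration of the idea under stated assumptions, not the geopandas implementation: the function name shoelace_area and the rectangle coordinates are hypothetical, and the coordinates are assumed to be in a projected CRS measured in metres.

```python
def shoelace_area(vertices):
    """Planar area of a simple polygon given as ordered (x, y) tuples."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical rectangle, 2,000 m by 3,000 m, in projected (metre) coordinates
rectangle = [(0, 0), (2000, 0), (2000, 3000), (0, 3000)]

area_m2 = shoelace_area(rectangle)  # area in square metres
area_km2 = area_m2 / 10**6          # same conversion as in the article
print(area_m2, area_km2)
```

The division by 10**6 only gives square kilometres when the input coordinates are in metres, which is why checking the CRS of the data matters.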
Here I follow the example in the geopandas documentation and take the centroid of each suburb as the point from which to measure distances. Taking the centroid of the first suburb as the reference point, we measure the distance from every suburb's centroid to it by using the .centroid attribute and the .distance method. Note that the coordinate reference system (CRS) of the data affects the measured distances, because geopandas computes them in the units of the CRS. A full treatment is beyond the scope of this article; for details, please refer to the geopandas documentation.
data_suburb['centroid']=data_suburb.centroid #define centroid of suburb boundary by the function .centroid
first_point = data_suburb['centroid'].iloc[0] #define the first point as the centroid of the first suburb by the function .iloc[0]
data_suburb['distance'] = data_suburb['centroid'].distance(first_point) #measure the distance from the centroid of each suburb to the first point
data_suburb['distance']
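The centroid-to-centroid measurement can also be sketched in plain Python. The example below is only an illustration, assuming simple polygons without holes and coordinates in a projected CRS in metres; the function polygon_centroid and the two square "suburbs" are hypothetical and are not taken from the Wellington data.

```python
import math


def polygon_centroid(vertices):
    """Centroid of a simple polygon given as ordered (x, y) tuples."""
    cross_sum = 0.0
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x1 * y2 - x2 * y1
        cross_sum += cross
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    signed_area = cross_sum / 2.0
    return cx / (6.0 * signed_area), cy / (6.0 * signed_area)


# Two hypothetical square suburbs in projected (metre) coordinates
suburb_a = [(0, 0), (2000, 0), (2000, 2000), (0, 2000)]
suburb_b = [(3000, 4000), (5000, 4000), (5000, 6000), (3000, 6000)]

centroid_a = polygon_centroid(suburb_a)  # reference point, like first_point
centroid_b = polygon_centroid(suburb_b)

# Straight-line (Euclidean) distance between the two centroids, in metres
distance_m = math.hypot(centroid_b[0] - centroid_a[0],
                        centroid_b[1] - centroid_a[1])
print(centroid_a, centroid_b, distance_m)
```

As in the article's code, the distance of the reference suburb to itself is zero, and every other value is a straight-line distance in the units of the coordinates.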
You can check the EPSG code of the coordinate reference system and its unit of length with the .crs attribute of geopandas.
data_suburb.crs #check the coordinate system used and the unit of measure in metre
The output is as follows:
<Projected CRS: EPSG:2193>
Name: NZGD2000 / New Zealand Transverse Mercator 2000
Axis Info [cartesian]:
- N[north]: Northing (metre)
- E[east]: Easting (metre)
Area of Use:
- name: New Zealand - North Island, South Island, Stewart Island - onshore.
- bounds: (166.37, -47.33, 178.63, -34.1)
Coordinate Operation:
- name: New Zealand Transverse Mercator 2000
- method: Transverse Mercator
Datum: New Zealand Geodetic Datum 2000
- Ellipsoid: GRS 1980
- Prime Meridian: Greenwich
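Because this CRS is projected and its axes are in metres, the areas and distances above come out in metres. If the same data were stored in a geographic CRS such as EPSG:4326 (longitude and latitude), a planar distance would be a number of degrees rather than a length. The rough sketch below illustrates the difference with the standard library only. It makes several assumptions: the city-centre coordinates for Wellington and Auckland are approximate, the Earth is treated as a sphere of radius 6,371 km, and the haversine formula is used purely for illustration (it is not what geopandas computes).

```python
import math


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance in km between two lon/lat points (spherical Earth)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Approximate (longitude, latitude) pairs in degrees; assumed values
wellington = (174.7762, -41.2865)
auckland = (174.7633, -36.8485)

# Planar distance computed directly on lon/lat: the result is in degrees
naive_degrees = math.hypot(auckland[0] - wellington[0],
                           auckland[1] - wellington[1])

# Great-circle distance on a sphere: the result is in kilometres
great_circle_km = haversine_km(*wellington, *auckland)
print(naive_degrees, great_circle_km)
```

The first number is only a few "degrees", while the second is several hundred kilometres, so it is worth confirming the CRS and its units before interpreting .area or .distance results.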
Three new columns are added to the dataframe data_suburb. You can list the dataframe to see how the area, centroid, and distance columns are included in the table (Figure 3).
data_suburb.head() #Area, Centroid, and Distance are included in the dataframe
References
Yiu, C.Y. (2021) Mapping GIS Data on a Basemap by Contextily in Colab, June 29. https://ecyy.medium.com/mapping-gis-data-on-a-basemap-by-contextily-in-colab-dfff5837eec