TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.


Python Hands-on Tutorial

Visualising Global Population Datasets with Python

Summary statistics of geospatial raster and vector datasets bounded by polygon shapefiles

Parvathy Krishnan
Published in TDS Archive · 8 min read · Nov 16, 2021

This work has been done entirely using publicly available data and was co-authored with Kai Kaiser. All errors and omissions are those of the author(s).

Photo by NASA on Unsplash

Mapping where people live is vital to a host of public policy questions in countries across the planet. Capturing the geographic distribution of the population and its key characteristics is integral to measuring exposure to disasters and climate change, differences in access to key services such as health care, and environmental and land-use pressures. Whether for planning, budgeting, or regulatory purposes, evidence-based decision-making requires sufficiently granular and timely population data.

A new generation of high-resolution population estimate layers stands to make an increasingly powerful contribution to public sector decision-making, particularly in developing countries. These mapping layers rely on non-traditional methods of data collection, including the use of satellite imagery. Consequently, they can provide population estimates for any grid cell on Earth at resolutions as fine as 30 meters. Their latest updates can be accessed online through Application Programming Interfaces (APIs), making them a potentially very valuable asset for data-driven decision-makers.

These high-resolution population maps address some critical limitations of traditional administrative or statistical census data. Census data are updated infrequently, since most countries conduct a census only about every ten years. They are generally presented in tables organized by administrative classification, which limits analytics and visualization options compared to more granular grid-based layers. Household-level census data are rarely collected on a geo-referenced basis, or disclosed at that level. The administrative registers of births and deaths maintained by national and subnational governments are also not always reliable or up to date, especially in low- and middle-income countries.
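To make the contrast between administrative tables and gridded layers concrete, here is a minimal sketch of computing summary statistics for the grid cells that fall inside a polygon. Everything in it is illustrative and assumed rather than taken from a real dataset: the 4×4 `population` grid, the square `district` polygon, the helper names `points_in_polygon` and `zonal_stats`, and the convention that a cell belongs to the polygon when its centre lies inside it. Real workflows commonly rely on geospatial libraries such as rasterio and geopandas; the sketch uses only NumPy to stay self-contained.

```python
import numpy as np


def points_in_polygon(xs, ys, polygon):
    """Even-odd ray-casting test applied to arrays of point coordinates."""
    inside = np.zeros(xs.shape, dtype=bool)
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if y1 == y2:
            # A horizontal edge never crosses a horizontal ray; skip it.
            continue
        # True where the edge straddles the horizontal line through the point.
        straddles = (y1 > ys) != (y2 > ys)
        # x coordinate at which the edge crosses that horizontal line.
        x_cross = x1 + (ys - y1) * (x2 - x1) / (y2 - y1)
        # Toggle "inside" for every crossing to the right of the point.
        inside ^= straddles & (xs < x_cross)
    return inside


def zonal_stats(grid, polygon, cell_size=1.0):
    """Summarise grid cells whose centres fall inside the polygon.

    Assumes cell (row, col) spans x in [col, col + 1) * cell_size and
    y in [row, row + 1) * cell_size. Real rasters usually carry an affine
    transform (often with a flipped y axis) that this toy version ignores.
    """
    n_rows, n_cols = grid.shape
    x_centres = (np.arange(n_cols) + 0.5) * cell_size
    y_centres = (np.arange(n_rows) + 0.5) * cell_size
    xs, ys = np.meshgrid(x_centres, y_centres)
    mask = points_in_polygon(xs, ys, polygon)
    values = grid[mask]
    return {
        "cells": int(values.size),
        "sum": float(values.sum()),
        "mean": float(values.mean()) if values.size else float("nan"),
        "max": float(values.max()) if values.size else float("nan"),
    }


# Hypothetical population counts per grid cell (made-up numbers).
population = np.array([
    [12.0, 30.0, 5.0, 0.0],
    [8.0, 45.0, 20.0, 3.0],
    [0.0, 10.0, 60.0, 25.0],
    [1.0, 2.0, 15.0, 40.0],
])

# Hypothetical administrative boundary covering the first 2 x 2 cells.
district = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]

stats = zonal_stats(population, district)
print(stats)
```

Because the statistics are computed from the cells themselves, the same grid can be re-aggregated to any boundary of interest simply by swapping the polygon, which is not possible with figures that are published only per administrative unit.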


Written by Parvathy Krishnan

Lead Data Scientist | CTO at Analytics for a Better World | Public Sector Consultant
