Creates Masks for geographic regions¶
When working with gridded data such as climate model output or reanalysis data it is often important to create regional averages, e.g., over countries, continents or regions defined in the literature. To do so we need to know for each grid point to which region it belongs to. RegionMask allows us to do just that!
Regionmask is a package that:¶
- Can be used sed to create masks of geographic regions for arbitrary longitude and latitude grids.
- These masks indicate which region a gridpoint belongs to.
- They come in two varian-2D integer Masks-3D Boolean Masks.
- takes great care to consistently treat gridpoints and overlapping regions,
- contains a number of defined regions, including:
- countries, landmasks, regions used in the scientific literature, can plot figures of these regions
- supports using arbitrary existing or user-defined region definitions:regions defined as shapefiles can be accessed via geopandas, user-defined regions can be created via numpy or shapely
This tutorial will highlight some of the utilities of using the regionmask package with additional information at the end of the tutorial
References: https://regionmask.readthedocs.io/en/stable/index.html#s://github.com/hydpy-dev/hydpyub.com/hydpy-dev/hydpy
Where to start:¶
1. Installation Instructions¶
In this tutorial there are two ways to install regionmask:
i. One can use conda, which at time can be the easiest way to install, however in some cases you may run into this issue:
"Solving environment: failed with initial frozen solve. Retrying with flexible solve. Solving environment: / failed with repodata from current_repodata.json, will retry with next repodata source."
# Code to install using conda:
conda install -c conda-forge regionmask cartopy
ii. Alternatively one can use pip to install the regionmask package, as well as the other packages used in this tutorial: cartopy.crs, xarray, numpy and matplotlib
#Code to install all required dependencies:
pip install regionmask
2. Importing various packages required for this tutorial:¶
Useful Packages:
regionmask: Creates masks for different regions.
cartopy: Cartopy is a Python package designed for geospatial data processing in order to produce maps and other geospatial data analyses.
xarray: An open source project and Python package that introduces labels in the form of dimensions, coordinates, and attributes on top of raw NumPy-like arrays, which allows for more intuitive, more concise, and less error-prone user experience.
numpy: A Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
matplotlib: a comprehensive library for creating plots and statistical visualizations in Python
# Importing packages that will be necessary for this tutorial
import regionmask
import cartopy.crs as ccrs
import xarray as xr
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patheffects as pe
Example 1: Creating a mask using predefined regions using defined_regions function¶
Using the function regionmask.defined_regions can be made from the outline of the countries obtained from Natural Earth. They are automatically downloaded, cached1 and opened with geopandas. The following countries and regions are defined in regionmask.
- Countries 1:110m
- Countries 1:50
- Countries 1:10m
- US States 1:50m
- US States 1:10m
For this example SREX regions will be used as the predefined regions(Seneviratne et al., 2012)
#Create an object with various predefined regions
srex = regionmask.defined_regions.srex
#Showing what srex looks like
print(srex)
<regionmask.Regions 'SREX'> Source: Seneviratne et al., 2012 (https://www.ipcc.ch/site/assets/uploads/2... overlap: False Regions: 1 ALA Alaska/N.W. Canada 2 CGI Canada/Greenl./Icel. 3 WNA W. North America 4 CNA C. North America 5 ENA E. North America .. .. ... 22 EAS E. Asia 23 SAS S. Asia 24 SEA S.E. Asia 25 NAU N. Australia 26 SAU S. Australia/New Zealand [26 regions]
Every region has two plotting functions that draws the outlines of all regions:¶
- plot(): draws the region polygons on a cartopy GeoAxes (map)
- plot_regions(): draws the the region polygons onlynes:
Visualizing the different regions using plot_regions()¶
Calling .plot_regions() on any region without any arguments draws the region polygons only
srex.plot_regions()
<AxesSubplot:>
Adding a map to the polygons using plot_regions and matplotlib.patheffects:¶
To create a map one can use the subplots and plot method from Matplotlib, which has a large number of arguments to adjust the layout of the axes. For example, you can pass a custom projection, the labels can display the abbreviation insead of the region number, one can add coastlines etc.
This example also shows how to use matplotlib.patheffects to ensure the labels are easily readable without covering too much of the map (compare to the map above):
- import matplotlib.patheffect as pe (if you haven't already)
We will also use Cartopy to produce maps and other geospatial data analyses.
- import cartopy.crs as ccrs (if you haven't already)
#Import cartopy if it was not imported earlier
import cartopy.crs as ccrs
#Creating a figure (map canvas) with projection Robisnon
figure, ax = plt.subplots(subplot_kw=dict(projection=ccrs.Robinson()))
#Creating text labels for each region, but since path effects is being used the labels will not be in big gray boxes covering important information
text_kws = dict(
bbox=dict(color="none"),
path_effects=[pe.withStroke(linewidth=2, foreground="w")],
color="#67000d",
fontsize=8,)
# Plotting the predefined srex regions onto the map canvas above and adding the text labels
srex.plot_regions(ax=ax, line_kws=dict(lw=1), text_kws=text_kws)
#Adding coastlines to the map
ax.coastlines()
# Creating a global map
ax.set_global()
Visualizing the different regions using plot()¶
Calling .plot() on any region without any arguments draws the default map with a PlateCarree() projection and includes the coastlines:
srex.plot();
Creating a more detailed global map using .plot() and matplotlib.patheffects¶
#import matplotlib.patheffect as pe if not done already
import matplotlib.patheffects as pe
#The same text labels (text_kws) will be used as before
ax = srex.plot(
projection=ccrs.Robinson(), label="abbrev", add_ocean=True, text_kws=text_kws)
#Creating a global map
ax.set_global()
Plotting only a subset of the regions:¶
To plot a selection of regions subset them using indexing:
# regions can be selected by number, abbreviation or long name
regions = [11, "CEU", "S. Europe/Mediterranean"]
# choose a good projection for regional maps
projection = ccrs.LambertConformal(central_longitude=15)
#Creating the plot
ax = srex[regions].plot(
add_ocean=True,
resolution="50m",
projection=projection,
label="abbrev",
text_kws=text_kws,)
# fine tune the extent
ax.set_extent([-15, 45, 28, 76], crs=ccrs.PlateCarree())
Example 2: Creating your own regions using the Regions Function¶
Assume you have two custom regions in the US, you can easily use these to create Regions:
# Create two numpy arrays, one for each region
US1 = np.array([[-100.0, 30], [-100, 40], [-120, 35]])
US2 = np.array([[-100.0, 30], [-80, 30], [-80, 40], [-100, 40]])
#Use the Regions function to convert the new variables into regions
regionmask.Regions([US1, US2])
<regionmask.Regions 'unnamed'> overlap: None Regions: 0 r0 Region0 1 r1 Region1 [2 regions]
This creates two unnamed regions, but giving the regions additional information will make them more useful. To set names and abbrevs:
# Creating a names variable for each region
names = ["US_west", "US_east"]
# Creating an abbreviation variable for each region
abbrevs = ["USw", "USe"]
# Setting the name and abbreviation for each region
USregions = regionmask.Regions([US1, US2], names=names, abbrevs=abbrevs, name="US")
#Calling the new variable USregions to see if the names and abbreviations have been set
USregions
<regionmask.Regions 'US'> overlap: None Regions: 0 USw US_west 1 USe US_east [2 regions]
Plotting the assigned regions onto a map:¶
#Plotting the regions variable USregions using the plot() method from matplotlib and assigning abbreviations as the label
ax = USregions.plot(label="abbrev")
#Zooming into the two regions
# fine tune the extent
ax.set_extent([225, 300, 25, 45], crs=ccrs.PlateCarree())
Example 3. Using the Mask Function to create a mask to define points to each regions¶
The function mask determines which gridpoints lie within the polygon making up each region:
# Start by defining a lon/lat grid with 1 degree grid spacing, where the points define the center of the grid.
lon = np.arange(-179.5, 180)
lat = np.arange(-89.5, 90)
#Creating the mask that uses the srex predefined regions with 1 degree grid spacing
mask = regionmask.defined_regions.srex.mask(lon, lat)
mask
<xarray.DataArray 'mask' (lat: 180, lon: 360)> array([[nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], ..., [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan], [nan, nan, nan, ..., nan, nan, nan]]) Coordinates: * lat (lat) float64 -89.5 -88.5 -87.5 -86.5 -85.5 ... 86.5 87.5 88.5 89.5 * lon (lon) float64 -179.5 -178.5 -177.5 -176.5 ... 177.5 178.5 179.5 Attributes: standard_name: region flag_values: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19... flag_meanings: ALA CGI WNA CNA ENA CAM AMZ NEB WSA SSA NEU CEU MED SAH W...
#Verifying the shape of mask
print('Lat shape is:', lat.shape)
print('Lon shape is:', lon.shape)
print('Mask shape is:', mask.shape)
Lat shape is: (180,) Lon shape is: (360,) Mask shape is: (180, 360)
Visualizing the masks created using cartopy and matplotlib¶
# Import if you have not done so before
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
#Creating the mapping canvas and setting the projection to PlateCarree
f, ax = plt.subplots(subplot_kw=dict(projection=ccrs.PlateCarree()))
#Adding Coastlines
ax.coastlines()
#Adding outlines for each srex region to see if the mask lines us
regionmask.defined_regions.srex.plot(ax=ax, add_label=False, line_kws=dict(lw=0.5, color="0.2"))
#Plotting the new masks
mask.plot(ax=ax, transform=ccrs.PlateCarree(), add_colorbar=False)
<cartopy.mpl.geocollection.GeoQuadMesh at 0x7f5c29e809d0>
Part 4: Applying regionmask to real data¶
For this tutorial we will use the NCEP reanalysis air temperature subsets which creates an air temperature field over North America. This data is freely available as an open dataset as a part of the Xarray tutorial library.
https://docs.xarray.dev/en/stable/generated/xarray.tutorial.open_dataset.html
airtemps = xr.tutorial.load_dataset("air_temperature")
#Displaying the data
airtemps
<xarray.Dataset> Dimensions: (lat: 25, time: 2920, lon: 53) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 Data variables: air (time, lat, lon) float32 241.2 242.5 243.5 ... 296.5 296.2 295.7 Attributes: Conventions: COARDS title: 4x daily NMC reanalysis (1948) description: Data is from NMC initialized reanalysis\n(4x/day). These a... platform: Model references: http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanaly...
Visualizing what the first time step in the dataset looks like using cartopy and indexing¶
# choose a good projection for regional maps
proj = ccrs.LambertConformal(central_longitude=-100)
#Creating the map canvas
ax = plt.subplot(111, projection=proj)
#Plotting the first time step an dtransforming the data projection from PlateCarree to the map canvas projection
airtemps.isel(time=1).air.plot.pcolormesh(ax=ax, transform=ccrs.PlateCarree())
#Adding coastline
ax.coastlines();
Creating a mask that matches the srex predefined regions to the air temperature data¶
mask = regionmask.defined_regions.srex.mask(airtemps)
Region mask is able to automatically detect whether the longitude needs to be wrapped around, i.e. if the regions extend from -180° E to 180° W, while the grid goes from 0° to 360° W as in our example:
lon = airtemps.lon.values
print("Grid extent: {:3.0f}°E to {:3.0f}°E".format(lon.min(), lon.max()))
bounds = regionmask.defined_regions.srex.bounds_global
print("Region extent: {:3.0f}°E to {:3.0f}°E".format(bounds[0], bounds[2]))
Grid extent: 200°E to 330°E Region extent: -168°E to 180°E
Now plotting the mask over the data¶
# Creating the map canvas
ax = plt.subplot(111, projection = ccrs.AlbersEqualArea(central_lon, central_lat))
low = mask.min()
high = mask.max()
levels = np.arange(low - 0.5, high + 1)
#Plotting the mask using pcolormesh
h = mask.plot.pcolormesh(
ax=ax, transform=ccrs.PlateCarree(), levels=levels, add_colorbar=False)
# for colorbar: find abbreviations of all regions that were selected
#1. Assigning a unique value for each region
reg = np.unique(mask.values)
#2. Ensuring to only collect values for areas within a mask (isnan Tests element-wise for NaN and return result as a boolean array)
reg = reg[~np.isnan(reg)]
#3. Assigning the abbreviation of each srex region to the corresponding mask regions
abbrevs = regionmask.defined_regions.srex[reg].abbrevs
#Assigning the colorbar to the figure
cbar = plt.colorbar(h, orientation="horizontal", fraction=0.075, pad=0.05)
#Editing the colorbar and adding ticks to go with each region
cbar.set_ticks(reg)
cbar.set_ticklabels(abbrevs)
cbar.set_label("Region")
#Adding coastlines to the map canvas
ax.coastlines()
#Zooming into the area and fine tune the extent
ax.set_extent([200, 330, 10, 75], crs=ccrs.PlateCarree())
Additional Information¶
Useful API information from the regionmask package:¶
Top Level Functions¶
- mask_geopandas(geodataframe, lon_or_obj[, ...]): Creates a 2D float mask of a set of regions for the given lat/ lon grid
- (geodataframe, lon_or_obj): Creates a 3D boolean mask of a set of regions for the given lat/ lon grid
- from_geopandas(geodataframe, *[, numbers, ...]): Creates regionmask.Regions from a geopandas.GeoDataFrame.
- flatten_3D_mask(mask_3D):Flattens 3D masks
- plot_3D_mask(mask_3D, **kwargs): Flattens and plot 3D masks
- set_options(**kwargs):Sets options for regionmask in a controlled context.
- get_options():Gets options for regionmask.et options for regionmask.
Source: https://regionmask.readthedocs.io/en/stable/api.html
References:¶
https://regionmask.readthedocs.io/en/stable/notebooks/plotting.html
https://docs.xarray.dev/en/stable/
https://www.ibm.com/docs/en/watson-studio-local/1.2.3?topic=notebooks-markdown-jupyter-cheatsheet
https://docs.xarray.dev/en/stable/generated/xarray.tutorial.open_dataset.html https://numpy.org/doc/stable/reference/generated/numpy.isnan.html