plot NaNs in gray color and use log color bar by ValentinGebhart · Pull Request #929 · CLIMADA-project/climada_python
Changes proposed in this PR:
- In the plot function geo_im_from_array, NaN values in the data are now plotted in gray. Previously, NaN values were not plotted (i.e. they were transparent), making them indistinguishable from plot regions for which there is no data (no centroids).
- In the plot function plot_from_gdf, the colorbar is shown on a logarithmic scale if a) the gdf contains return periods or impacts, b) there are no zeros in the data, and c) the data's values span at least two orders of magnitude.
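The three conditions for the logarithmic colorbar can be sketched as a small predicate. This is only an illustration, not the code from this PR: the function name use_log_colorbar and the flag is_return_period_or_impact are hypothetical, and ignoring NaNs as well as rejecting negative values (which a log scale cannot show either) are assumptions.

```python
import numpy as np


def use_log_colorbar(values, is_return_period_or_impact=True):
    """Return True if a logarithmic color scale looks appropriate.

    Hypothetical helper illustrating conditions a) to c) of the PR description.
    """
    values = np.asarray(values, dtype=float)
    finite = values[np.isfinite(values)]  # assumption: NaNs/infs are ignored

    # a) only for return periods or impacts
    if not is_return_period_or_impact or finite.size == 0:
        return False
    # b) no zeros (assumption: negative values are excluded as well)
    if np.any(finite <= 0):
        return False
    # c) values span at least two orders of magnitude
    return bool(finite.max() / finite.min() >= 100)


print(use_log_colorbar([1.0, 10.0, 1000.0]))
print(use_log_colorbar([0.0, 10.0, 1000.0]))
```

When the predicate holds, a norm such as mpl.colors.LogNorm(vmin=..., vmax=...) would be passed on to the plotting call, as in the diff discussed further down.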
PR Author Checklist
- Read the Contribution Guide
- Correct target branch selected (if unsure, select develop)
- Descriptive pull request title added
- Source branch up-to-date with target branch
- Documentation updated
- Tests updated
- Tests passing
- No new linter issues
- Changelog updated
PR Reviewer Checklist
- Read the Contribution Guide
- CLIMADA Reviewer Checklist passed
- Tests passing
- No new linter issues
@ValentinGebhart Thank you for that contribution. Can you share an example and compare the resulting plots before and after your changes?
This is an example of plotting the return periods of a hazard object where some values are NaN (the centroid never experienced the given threshold intensity, so the return period is given as NaN) and some centroids have been removed (bottom left corner). This is the code:
import numpy as np
from climada.hazard import Hazard
from climada.util import HAZ_DEMO_H5  # path to CLIMADA's demo hazard file
from climada.util.plot import plot_from_gdf

haz_tc_fl = Hazard.from_hdf5(HAZ_DEMO_H5)  # Historic tropical cyclones in Florida from 1990 to 2004
haz_tc_fl.check()  # Always use the check() method to see if the hazard has been loaded correctly

# Remove the centroids in one corner of the 50 x 50 grid (those with i + j <= 10)
centroids_mask = np.array(
    [(i + j > 10) for j in range(50) for i in range(50)]
)
haz_tc_fl.centroids = haz_tc_fl.centroids.select(sel_cen=centroids_mask)
# Trim the intensity matrix to as many columns as there are remaining centroids (2434)
haz_tc_fl.intensity = haz_tc_fl.intensity[:, -2434:]

# Local return periods for threshold intensities 30 and 40
return_periods, label, column_label = haz_tc_fl.local_return_period([30, 40])
plot_from_gdf(return_periods, colorbar_name=label, title_subplots=column_label)
Note that if the hazard return periods spanned at least two orders of magnitude (and contained no zeros), the color scale would also be logarithmic in the new plots.
I very much like the overall contribution, but I have to say I dislike the approach. Calling griddata again on basically the same data, casting to bool, plotting with a weird colormap...
I think the same thing can be achieved much more easily with the tools Matplotlib already provides. You can set "bad" and "over/under" colors for a colormap. Choosing the right vmin should then give you the expected outcome with a single call to pcolormesh:
# ...
if "norm" in kwargs:
    min_value = kwargs["norm"].vmin
    vmin = None  # We will pass norm
else:
    min_value = np.nanmin(array_im)
    vmin = kwargs.pop("vmin", min_value)
grid_im = griddata(
    (coord[:, 1], coord[:, 0]),
    array_im,
    (grid_x, grid_y),
    fill_value=min_value-1,  # Values outside the grid
)
# ...
cmap = plt.get_cmap(kwargs.pop("cmap", "viridis"))
cmap.set_bad("gray")  # For NaNs and infs
cmap.set_under("white", alpha=0)  # For values below vmin
axis.pcolormesh(
    grid_x - mid_lon,
    grid_y,
    np.squeeze(grid_im),
    transform=proj,
    cmap=cmap,
    vmin=vmin,
    **kwargs
)
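For anyone who wants to try the idea outside of CLIMADA, here is a minimal standalone sketch on a synthetic grid. It only uses Matplotlib and NumPy; the 4 x 4 data, the sentinel standing in for "no centroid here", and the copy of the colormap (so the shared "viridis" instance is not modified) are assumptions for illustration and not part of the PR.

```python
import copy

import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic 4 x 4 grid with values 1..16 (illustrative data, not CLIMADA output)
grid_x, grid_y = np.meshgrid(np.arange(4.0), np.arange(4.0))
data = np.arange(1.0, 17.0).reshape(4, 4)
data[1, 2] = np.nan              # a point that exists but has no valid value
min_value = np.nanmin(data)      # smallest valid value, used as vmin
data[3, 0] = min_value - 1       # sentinel below vmin: "outside the grid"

# Work on a copy so the globally registered colormap stays untouched
cmap = copy.copy(plt.get_cmap("viridis"))
cmap.set_bad("gray")              # NaNs and infs are drawn in gray
cmap.set_under("white", alpha=0)  # values below vmin are fully transparent

fig, ax = plt.subplots()
mesh = ax.pcolormesh(
    grid_x, grid_y, data, cmap=cmap, vmin=min_value, shading="nearest"
)
fig.colorbar(mesh, ax=ax)
fig.canvas.draw()  # render once; use fig.savefig(...) to inspect the result
```

The NaN cell should come out gray and the sentinel cell transparent, while all other cells use the regular colormap.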
Comment on lines +927 to +928
gdf = gdf[['geometry', *[col for col in gdf.columns if col != 'geometry']]]
gdf_values = gdf.values[:,1:].T
):
    kwargs.update(
        {'norm': mpl.colors.LogNorm(
            vmin=gdf.values[:,1:].min(), vmax=gdf.values[:,1:].max()
Suggested change:
- vmin=gdf.values[:,1:].min(), vmax=gdf.values[:,1:].max()
+ vmin=gdf_values.min(), vmax=gdf_values.max()
Thanks for the advice! I agree that the way you describe is easier. I implemented and tested it (example plots from above didn't change), with a small modification for the case of the log colorscale (min_value - 1 did not seem to work, so I used min_value/2).
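One plausible reason why min_value - 1 fails with a log color scale (an assumption, it was not investigated in this thread): LogNorm masks non-positive inputs, so a fill value that ends up at or below zero is treated as "bad" and drawn in gray instead of being treated as "under". Halving the minimum keeps the fill value positive and still below vmin. A small sketch, where the helper outside_fill_value is hypothetical:

```python
import numpy as np
from matplotlib.colors import LogNorm


def outside_fill_value(min_value, log_scale):
    """Hypothetical helper: a fill value strictly below min_value.

    For a log scale the value must stay positive, hence min_value / 2.
    """
    return min_value / 2 if log_scale else min_value - 1


min_value = 0.5
norm = LogNorm(vmin=min_value, vmax=500.0)

# min_value - 1 is negative: LogNorm masks it, so it would get the "bad" color
linear_fill = norm(np.array([outside_fill_value(min_value, log_scale=False)]))
# min_value / 2 stays positive and maps below 0, so it gets the "under" color
log_fill = norm(np.array([outside_fill_value(min_value, log_scale=True)]))

print(np.ma.getmaskarray(linear_fill)[0], np.ma.getmaskarray(log_fill)[0])
```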
Okay, this is looking much better now, thanks for the update! I have a few nitpicky suggestions still 🙈 We can merge once these are resolved!

