Below are a few of the latest posts on my blog.
You can see a full list by year to the left.
Fundamentals: What is Cloud-Optimized Scientific Data?
The article I wish I could have read back when I first heard of Zarr in 2018. Explains how object storage and conventional filesystems are different, and the key properties that make Zarr work so well in cloud object storage.
Science needs a social network for sharing big data
Imagine being able to visit one website, search for any scientific dataset from any institution in the world, preview it, then stream the data out at high speed, in the format you prefer. We have the technology - here's what we should build.
Xarray x NASA: xarray.DataTree for hierarchical data structures
How xarray's new DataTree feature came about, and thoughts on how public agencies can support the open-source scientific software that they depend on.
Cubed: Bounded-memory serverless array processing in xarray
Cubed was designed to address the main problems with Dask, so I integrated it with Xarray.
Dask.distributed and Pangeo: Better performance for everyone thanks to science / software collaboration
Dask's distributed scheduler algorithm got a major fix after we tested its limits on a huge oceanography analysis problem.
Unit-aware arithmetic in Xarray, via pint
All scientific computations involve units, so let's make our analysis software aware of them.
Easy IPCC part 1: Multi-Model Datatree
Analysing CMIP6 data as a motivation for xarray DataTree.
The faulty science, doomism, and flawed conclusions of Deep Adaptation
Calling out pseudoscience claiming that societal collapse due to climate change is inevitable.
Oxford University Divestment Explained
Thoughts on Oxford University's fossil fuel divestment motion.
Coronavirus: The Simplest Model
Solving the simplest possible epidemiological model of the spread of COVID-19.
Nuclear Fusion: too late for the climate
Don't let politicians use funding for nuclear fusion research as greenwashing.