I’m a physicist turned open-source software engineer. I work on projects that make it easier to analyse and share massive scientific datasets.
I strongly feel that progress in most fields of computational science is bottlenecked by how unnecessarily hard that is right now. There are crucial missing pieces of global scientific infrastructure.
Climate science, meteorology, and all downstream impact analyses are similarly hamstrung, and I hope my work contributes in some small way to addressing the global climate crisis by helping enable them.
What I do
Open Source Software and Infrastructure for Science
I contribute to a number of open-source software projects that support this aim, often as part of the Pangeo Community.
A few projects I’m particularly proud of or excited about, and my role in them:
- Xarray - N-D labeled arrays and datasets in Python (core maintainer).
- VirtualiZarr - Cloud-Optimizes your Scientific Data as Virtual Zarr stores, using Xarray syntax (original author and lead developer).
- Cubed - Scalable array processing with bounded memory (cheerleader, and author of the Cubed-Xarray integration).
- FROST - Federated registry of all scientific datasets (originator - I’m trying to make this a thing).
Check out my GitHub page for more details.
Scientific Research Dilettante
At some point I have somehow been involved in research on, or written peer-reviewed pieces about, a wide range of topics, including:
- Nuclear Fusion Plasma Physics
- Economics of Fusion Reactors
- Physical Oceanography
- Ocean-based Carbon Dioxide Removal
- Climate Science
- Superconductivity
- Hypersonic Aerothermodynamics (Spacecraft Re-entry)
- Seismology
I also regularly interact with researchers in all sorts of fields of science, from biology to social science to machine learning.
In every field I see the same kinds of pain around doing computational work, which motivates my software projects.
For a list of publications and scholarly artifacts in which I’ve been involved, check out my ORCID page or my Google Scholar page.
About this website
This website is my fork of Chris Holdgraf’s experiment in hosting a personal website and blog via Sphinx extensions instead of using Jekyll. All credit for the website should go to him.
A rough timeline
Below is a rough timeline of my working life so far.
2025- : Engineer at Earthmover
I joined Earthmover to be able to work full-time on improving tools for science. The open-sourcing of Icechunk was a major catalyst for this move.
I’m hoping we can build something like my vision of a social network for sharing scientific data.
2023-2025: Staff Scientist at [C]Worthy
After Ryan wound his lab down, I looked for places to apply my open-source skills in the climate space, and found [C]Worthy. Their non-profit “Focused Research Organisation” status is a very cool model.
I helped build some cool stuff, but I felt generally frustrated by the lack of powerful tools for working with scientific data in an operational context.
2021-2023: Oceanographer at Columbia University
After I met Ryan Abernathey through the Xarray core development team, he invited me to come work with him at Columbia. This seemed like an ideal way to continue doing open-source development whilst pivoting towards more climate-ish stuff.
This meant a move to New York, and a pivot to physical oceanography research. Luckily it turned out ocean turbulence is surprisingly similar to plasma turbulence!
2019: Became worried about Climate Change
There was a clear moment when I realized exactly how big and urgent the climate crisis is. Many people describe such a “penny drop” moment, and for me it came while watching lectures organized by the Oxford Climate Society.
Through this excellent student society I had the privilege of meeting and interrogating many climate experts of all kinds, which only cemented my decision to somehow work on climate issues.
2018: First contributions to Xarray
I first heard about Xarray during my PhD, and immediately started using it to analyse my plasma physics simulation data.
To get this to work I began making upstream contributions. One of my first big contributions was generalizing xarray.open_mfdataset to work on N-dimensional grids of files, which still sees a lot of use via the combine='by_coords'/'nested' options.
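For readers who haven't met these two options, here is a minimal sketch of what they do. To stay self-contained it uses small in-memory datasets in place of files, and calls xr.combine_nested and xr.combine_by_coords, the functions that open_mfdataset uses for combine='nested' and combine='by_coords'. The make_tile helper, the variable name temperature, and the 2x2 tile layout are all made up for illustration.

```python
import numpy as np
import xarray as xr


def make_tile(x_start, y_start, nx=2, ny=3):
    """Build one small 2-D tile (hypothetical stand-in for one file's contents)."""
    x = np.arange(x_start, x_start + nx)
    y = np.arange(y_start, y_start + ny)
    # Each value encodes its own position as 10*x + y, so results are easy to check.
    data = np.add.outer(x * 10.0, y)
    return xr.Dataset(
        {"temperature": (("x", "y"), data)},
        coords={"x": x, "y": y},
    )


# A 2x2 grid of tiles as a nested list: outer list runs along x, inner along y.
tiles = [
    [make_tile(0, 0), make_tile(0, 3)],
    [make_tile(2, 0), make_tile(2, 3)],
]

# 'nested': the caller spells out the layout through the list nesting,
# with one concat_dim entry per nesting level.
nested = xr.combine_nested(tiles, concat_dim=["x", "y"])

# 'by_coords': the layout is inferred from the coordinate values,
# so a flat list in any order is enough.
flat_shuffled = [tiles[1][1], tiles[0][0], tiles[1][0], tiles[0][1]]
by_coords = xr.combine_by_coords(flat_shuffled)

print(dict(nested.sizes))
print(nested.equals(by_coords))
```

With real files, the equivalent call would pass a (possibly nested) list of paths to xr.open_mfdataset together with the same combine and concat_dim arguments.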
2016-2021: PhD at Culham Centre for Fusion Energy
I did big simulations of turbulent plasmas inside magnetic-confinement fusion experiments (particularly MAST-U). I was a student of the University of York as part of the excellent Fusion CDT, but worked at Culham Centre for Fusion Energy. The simulations generated a lot of netCDF files on HPC...
My proudest work during my PhD wasn’t physics, but economics: a paper about the (lack of) future market for fusion power.
2012-2016: Studied Physics at Oxford
Graduated from Oxford University with an MPhys in Physics, specializing in Theoretical and Condensed Matter Physics. Did my Master’s thesis on modelling and data analysis of certain types of novel superconducting materials.