Pixi: A Conda Replacement
I have heard about pixi for a while now but have not had the chance to try it out until today. When you head to their documentation, you will find the install instructions consist of a bash script piped directly into the terminal. This is not ideal, though admittedly conda does the same thing. There is also an option to compile the binary from source. Since I have cargo installed, that seemed like the more principled approach....
Snow Data in Switzerland
I have recently moved to Switzerland and started skiing again after many years. Curious about historical snow levels, I went looking for public data. Finding out how much snow is available right now is very easy: many webpages provide this information for today. It is much harder to answer the question: How much snow was there yesterday? Or last week? Or a year ago?...
Immich - ML supported tagging plugin
For the last decade or so, I used Seafile, Owncloud and then Nextcloud to self-host my data on a small home server. This has worked wonderfully, and I have nothing but respect for the community that built these powerful tools. But one thing that never worked as smoothly as I wanted was the photo upload from my smartphone to Nextcloud. The upload works, and it rarely fails, but it’s never instant....
MHC and Viruses - Molecular Mimicry
I saw this article, “Molecular mimicry as a mechanism of viral immune evasion and autoimmunity”, and I immediately became interested in reproducing Figure 1b. In it, the authors investigate peptide similarity between viruses and the human proteome. They say certain viruses might have adjusted their peptide use to match peptides found in the human proteome, so that they can evade MHC recognition. This is a biologically cool mechanism, and the method they used is rather simple....
Face Detection with Python
In the past I have explored what I can do with image embeddings and used them to train a very usable set of classifiers that sort out random photos and nature photos from my camera roll. If you want to read about that, you can find the blog post here: openpaul.github.io/posts/2025-04-06-image-sorting and a small intro to embeddings here: openpaul.github.io/posts/2024-09-28-image-embeddings/. Recently I became interested in detecting faces and identifying people in my photos locally....
Python et al. - Getting to a scientific plot on a new machine
From time to time you and I are lucky enough to start fresh. A new MacBook, a new Linux laptop or maybe a new server? And of course, we need to quickly get it up and running to create lovely plots. With Anaconda, Conda, Mamba, UV, Python, virtualenvs and more, it can get confusing quickly. While all of this will change over time, today I want to disentangle the status quo as of Summer 2025 and maybe create a bit of order in this chaos....
Nextflow and nf-core
When analysing data, especially when analysing complicated genomics data, one quickly learns to appreciate the benefits of well-written workflows. In the past, I have developed my own bash, Snakemake and Nextflow pipelines. But since then some people from the bioinformatics community have put in enormous effort to create general standardized pipelines that anyone can use. For Snakemake this effort is called workflow catalogue and for Nextflow it is called nf-core....
Composing with Plotnine
Composing plots with plotnine has just become possible. Well, not quite yet. As of the writing of this blog post, the latest development version of plotnine is plotnine==v0.15.0a1. And there is a discussion issue open where the feature has been teased: https://github.com/has2k1/plotnine/discussions/929 Copying the example from that issue, we can reproduce the tiling mentioned in the post: from plotnine import * from plotnine.data import mtcars p1 = ggplot(mtcars) + geom_point(aes("wt", "mpg")) + labs(tag="a)") p2 = ( ggplot(mtcars) + geom_boxplot(aes("wt", "disp", group="gear")) + labs(tag="b)") ) p3 = ggplot(mtcars) + geom_smooth(aes("disp", "qsec")) + labs(tag="c)") p4 = ggplot(mtcars) + geom_bar(aes("carb")) (p1 | p2 | p3) / p4 /home/paul/miniforge3/envs/post_plotnine/lib/python3....
Plotting With Python
Plotting with Python is a nightmare. At least that’s what I thought after almost a decade of plotting experience with R. In R, ggplot2 is the undefeated champion of plotting. I learned ggplot2 syntax in 2014 and it is beautiful. Like building a tower, in ggplot2 one builds layer on layer until the plot is done. All in one expression. And ggplot2 has sane defaults: You add a color to the data, and you get a legend....
Sorting Images with ML
I got a lot of positive feedback about the last blog post, where I showed that CLIP embeddings are a cool tool for image classification. So I thought I would share how you can use that knowledge to actually sort your images into your own categories on your own machine. In this post, I share the script that I currently use to train my own classifier and then sort images based on it....