I am fortunate enough that some of the work done at QuantCo is open source.


Datajudge is a Python data testing library. It allows for the convenient expression and testing of expectations between different data sources in a database, and it integrates nicely with pytest. For more, read our blog post or have a look at an example in the documentation.


pytsql is a Python library enabling the execution and parametrization of raw MSSQL scripts. Under the hood, it parses a script with an ANTLR grammar.


Life Monitor

This project aims to provide me with notifications about aspects of my life which might otherwise go unnoticed. I use Google Calendar data, sport activity data and daily Org mode log data to inform me about recent achievements as well as to warn me about recent downturns. Notifications are sent via a Telegram chat bot.

I wrote more about it here; all code can be found here. There was also a discussion on Hacker News.


Reimplementation of the CycleGAN paper, with inference served on a webpage via ONNX. This was collaborative work with Lorenz Kuhn and Philip Junker. Source code can be found here.


Characterizing and Approximating the Optimal Allocation for Top-m Arm Identification

Multi-armed Bandits, Bayesian Optimization, Experiment Design.

We characterize the rate at which the posterior of an optimal allocation converges to the underlying true best arms in the context of multi-armed bandits. Moreover, we propose an algorithm inspired by Thompson sampling to identify the m best arms in a pure exploration scenario. The full report can be found here.

Master's thesis in Prof. Andreas Krause’s lab, supervised by Johannes Kirschner and Mojmír Mutný.
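
To give a flavour of the pure exploration setting described above, here is a minimal sketch of a Thompson-sampling-style loop for picking the m best arms. It is not the algorithm from the thesis: the Bernoulli rewards, the Beta(1, 1) priors, the rule of pulling one of the two arms at the sampled top-m boundary, and the function name `top_m_thompson` are all assumptions made for illustration.

```python
import random


def top_m_thompson(true_means, m, rounds, seed=0):
    """Recommend m arms after `rounds` pulls of simulated Bernoulli arms.

    Illustrative heuristic only; assumes 1 <= m < len(true_means).
    """
    rng = random.Random(seed)
    k = len(true_means)
    # Beta(1, 1) prior on each arm's success probability (an assumption).
    successes = [1] * k
    failures = [1] * k

    for _ in range(rounds):
        # Draw one plausible mean per arm from its current posterior.
        samples = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        ranked = sorted(range(k), key=lambda i: samples[i], reverse=True)
        # Pull one of the two arms at the sampled top-m boundary, since these
        # are the arms whose membership in the top m is least certain.
        arm = rng.choice([ranked[m - 1], ranked[m]])
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward

    # Recommend the m arms with the highest posterior mean.
    posterior_means = [successes[i] / (successes[i] + failures[i]) for i in range(k)]
    best = sorted(range(k), key=lambda i: posterior_means[i], reverse=True)[:m]
    return sorted(best)


# Hypothetical problem instance: four arms, identify the best two.
recommended = top_m_thompson([0.9, 0.8, 0.2, 0.1], m=2, rounds=3000)
print(recommended)
```

The loop never tries to maximise reward along the way; pulls are spent only on sharpening the decision about which arms belong to the top m, which is what separates pure exploration from the usual regret-minimising use of Thompson sampling.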

Collaborative Filtering: Stacking Collaborative Filtering and Neural Networks for Improved Recommendations

Abstract: Online businesses face the challenge of recommending relevant products to users based on users’ previous preferences and similar customers. This work explores the use of classic matrix factorization methods on the one hand and recent neural network-based methods on the other hand. Final predictions were further improved using ensembling methods such as bagging and stacking. We report similar, competitive scores for matrix factorization methods and slightly lower accuracy for neural network-based methods with a final ensemble RMSE of 0.964.

This project was collaborative work with Benjamin Hahn and Lorenz Kuhn. The full report can be found here.
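
As a rough illustration of the stacking idea mentioned in the abstract, the sketch below blends the predictions of two base models with a single weight fitted by least squares and compares RMSE values. The toy ratings, the two prediction lists and the helper names are hypothetical; the ensemble in the report was more elaborate than this two-model blend.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between predicted and true ratings."""
    squared = sum((p - t) ** 2 for p, t in zip(predictions, targets))
    return math.sqrt(squared / len(targets))


def fit_blend_weight(preds_a, preds_b, targets):
    """Least-squares weight w for the blend w * a + (1 - w) * b."""
    num = sum((a - b) * (y - b) for a, b, y in zip(preds_a, preds_b, targets))
    den = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    # If both models agree everywhere, any weight gives the same blend.
    return num / den if den > 0 else 0.5


def blend(preds_a, preds_b, weight):
    """Combine two prediction lists with the given weight on the first."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]


# Hypothetical held-out ratings and base-model predictions.
ratings = [5.0, 3.0, 4.0, 1.0, 2.0]
matrix_factorization = [4.6, 3.4, 3.9, 1.8, 2.5]
neural_network = [4.1, 2.5, 4.4, 1.2, 1.6]

w = fit_blend_weight(matrix_factorization, neural_network, ratings)
stacked = blend(matrix_factorization, neural_network, w)

print("MF RMSE:     ", rmse(matrix_factorization, ratings))
print("NN RMSE:     ", rmse(neural_network, ratings))
print("Stacked RMSE:", rmse(stacked, ratings))
```

On the data used to fit the weight, the blend can never be worse than either base model, because a weight of 0 or 1 recovers each model exactly. In practice the weight should be fitted on held-out predictions so that the improvement carries over to unseen ratings.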

Structured Information Retrieval of Natural Language Supporting Clinical Decision-making

We created a topic model with ICD codes, which encode medical diagnoses, to retrieve relevant medical literature based on a patient's electronic health record.

Bachelor's thesis in Prof. Thomas Hofmann’s lab, supervised by Prof. Carsten Eickhoff. The full report can be found here.