Code | Bennett Institute for Applied Data Science

We believe that all researchers should share their analytic code. Open sharing provides an unambiguous record of the analytical methods used, aiding reproducibility and error spotting. It also permits efficient re-use by other researchers. This is why the source code for all our projects, including OpenPrescribing, OpenPathology, OpenSAFELY and FDAAA Trials tracker, is available on our main GitHub account. In addition, every paper we publish has its code shared and linked to from its page, usually in a Jupyter notebook, where you can see exactly how the code runs against the underlying data. This is all part of our open working ethos, and our other core value: combining best practice from the academic and software development communities.

We want to help others work in this way. We have a range of resources we use to onboard external researchers, and we are currently developing a course on Open Analytic Methods for Health Data Analysis. If you are interested in learning how to use Jupyter Notebooks, GitHub and other tools, please get in touch and we will add you to our contact list for the course.

Check out some of our main GitHub repositories below!

Latest Code blog posts

7 Oct 2020

OpenPathology

This is the code related to our OpenPathology project. Specifically, this repo stores ad-hoc analyses, papers, and related research. The code for the website (and the online tool, when developed) is in its own repository.

7 Oct 2020

OpenPrescribing

This is the website code for https://openprescribing.net - a Django application that provides a REST API and dashboards for the English Prescribing Dataset published by the NHS Business Services Authority. Information about data sources used on OpenPrescribing can be found here.

7 Oct 2020

OpenSAFELY Cohort Extractor

This is the code for the OpenSAFELY cohort extractor tool, which supports the authoring of OpenSAFELY-compliant research by:

- Allowing developers to generate random data based on their study expectations. They can then use this as input data when developing analytic models.
- Supporting downloading of codelist CSVs from the OpenSAFELY codelists repository, for incorporation into the study definition.
- Providing tools to understand and visualise the properties of real data, without having direct access to it.

It is also the mechanism by which cohorts are extracted from live database backends within the OpenSAFELY framework.
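
To make the idea of generating random data from study expectations more concrete, here is a minimal sketch in plain Python. It does not use the cohort extractor's own API: the function `generate_dummy_patients`, the `EXPECTATIONS` dictionary, and its keys (`type`, `min`, `max`, `ratios`, `earliest`, `latest`, `incidence`) are all hypothetical names chosen for illustration. The assumption is simply that each column declares the range or categories it is expected to take and how often a value is present.

```python
import random
from datetime import date, timedelta


def generate_dummy_patients(n, expectations, seed=0):
    """Return n dummy patient rows drawn from per-column expectations.

    Illustrative only: this is not the cohort extractor's implementation.
    """
    rng = random.Random(seed)  # seeded so the dummy data is reproducible
    rows = []
    for patient_id in range(1, n + 1):
        row = {"patient_id": patient_id}
        for column, spec in expectations.items():
            # "incidence" is the expected share of patients with a value;
            # the remainder are recorded as missing (None).
            if rng.random() >= spec.get("incidence", 1.0):
                row[column] = None
            elif spec["type"] == "int":
                row[column] = rng.randint(spec["min"], spec["max"])
            elif spec["type"] == "category":
                values = list(spec["ratios"])
                weights = list(spec["ratios"].values())
                row[column] = rng.choices(values, weights=weights)[0]
            elif spec["type"] == "date":
                span = (spec["latest"] - spec["earliest"]).days
                row[column] = spec["earliest"] + timedelta(days=rng.randint(0, span))
            else:
                raise ValueError(f"unknown column type: {spec['type']}")
        rows.append(row)
    return rows


# Hypothetical expectations for an illustrative study.
EXPECTATIONS = {
    "age": {"type": "int", "min": 18, "max": 90},
    "sex": {"type": "category", "ratios": {"F": 0.5, "M": 0.5}},
    "admission_date": {
        "type": "date",
        "earliest": date(2020, 2, 1),
        "latest": date(2020, 10, 1),
        "incidence": 0.2,  # roughly 20% of patients have an admission
    },
}

dummy = generate_dummy_patients(1000, EXPECTATIONS, seed=42)
```

Rows produced this way have the same shape as a real extract, so analysis code can be written and tested against `dummy` without anyone needing access to patient records.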
