Reproducibility in astronomy research papers 7404-REPASTR
Most contemporary astronomy research papers depend extensively on software pipelines. Observations are collected and reduced digitally, and analysed in comparison to numerical models. Analytical calculations are often sufficiently complex that computer algebra systems are needed to reduce the chance of errors. This heavy dependence on powerful computational hardware and software makes it difficult to reproduce any individual astronomy research paper. The course will start with an overview of the broader problem of reproducibility in science [1] and of the current proposal by several astronomers for carrying out reproducible research projects whose aim, method and results are published in peer-reviewed journals as research papers [2]. Other existing methods are reviewed in [2]: Appendix A covers tools, including containers such as Docker, package managers, and Jupyter, and Appendix B covers existing implementations of scientific workflows.
The initial task of the students will be to fully reproduce, on an OS and computer of their choice, an already published paper [3], starting with a source package small enough to fit on a floppy disk [4]. Guidance on shell-level computing skills [5] and the Maneage system [6] will be provided.
The main aim is for each student to develop their own branch of the template far enough to yield a reproducible draft research paper (PDF file), based on the state of the student's observational data and/or models at the end of the semester. International scientific collaboration in the form of bug reports [7] will be encouraged as part of this course.
The course will make extensive use of git repositories. Synchronous communication during tutorials and asynchronous communication at other times will include channels using the IRC protocol (#maneage at OFTC) and the Matrix protocol (#maneage_community:matrix.org). A curated list of Matrix servers is provided at https://servers.joinmatrix.org.
Total student workload
Learning outcomes - knowledge
Learning outcomes - skills
Learning outcomes - social competencies
Teaching methods
Prerequisites
Course coordinators
Assessment criteria
The scale of points (max 5) will be negotiated by rough consensus among the participating students and the lecturer, depending on progress made during the semester. The initial set of parameters is (N0, N1, N2, N3, N4) = (3, 0.5, 0.5, 0.5, 0.5).
N0 points - A git branch of Maneage with the student's own project (prior to submission to a journal: private access only), through to the stage of verification (verify.mk) of some minimal results, that can be reproduced through to the final pdf by (at least) the lecturer.
N1 points - Each properly written bug report (Tatham 1999 [7]) posted on the public bug reporting site for Maneage, a research-level astronomy software package, or a lower-level dependency will count for N1 points. If developers respond within the semester, constructive follow-up will be expected for the report to qualify.
N2 points - Each properly written merge request (MR) posted on a git forge for Maneage, a research-level astronomy software package, or a lower-level dependency will count for N2 points. If developers respond within the semester, constructive follow-up will be expected for the merge request to qualify.
N3 points - Provision of (successive) log files from a POSIX-compatible OS that show previously unknown bugs in upstream Maneage, and participation in their analysis, through to final testing that either establishes a reproducible bug or proposes a fix. Max N3 points.
N4 points - Log files, as for case N3, for project-specific software. Max N4 points.
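As an illustration only, the arithmetic of the scheme above can be sketched in Python. The function name total_points and its arguments are hypothetical. The sketch assumes that N0 is awarded once, as all-or-nothing, that N1 and N2 are awarded per qualifying item, that the N3 and N4 contributions are partial scores capped at N3 and N4 respectively, and that the overall total is capped at the 5-point maximum. The actual weights and rules are those agreed between the students and the lecturer during the semester.

```python
def total_points(project_verified, n_bug_reports, n_merge_requests,
                 upstream_log_points, project_log_points,
                 params=(3.0, 0.5, 0.5, 0.5, 0.5), max_points=5.0):
    """Hypothetical tally of course points under the initial parameters.

    Assumptions (not stated in the syllabus): N0 is all-or-nothing, and
    the final total is capped at max_points.
    """
    n0, n1, n2, n3, n4 = params

    # N0: the student's Maneage branch reproduces through to the final PDF.
    total = n0 if project_verified else 0.0

    # N1 and N2: awarded once per qualifying bug report / merge request.
    total += n1 * n_bug_reports
    total += n2 * n_merge_requests

    # N3 and N4: partial scores, each limited to its own maximum.
    total += min(max(upstream_log_points, 0.0), n3)
    total += min(max(project_log_points, 0.0), n4)

    # The scale of points has an overall maximum (5 in the syllabus).
    return min(total, max_points)


if __name__ == "__main__":
    # Verified project, one bug report, full marks for upstream log files.
    print(total_points(True, 1, 0, 0.5, 0.0))  # prints 4.0
```

With the initial parameters, a verified project plus any four half-point contributions already reaches the 5-point maximum, so further bug reports or merge requests do not raise the total under this interpretation.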
Additional information
Additional information (registration calendar, class instructors, location and schedule of classes) may be available in the USOSweb system.