Miguel Dias Costa, 15/11/2011 14:24

h1. Reproducibility

The "Annals of Improbable Research", which organizes the Ig Nobel Prizes, was founded by a former editor of the "Journal of Irreproducible Results". That title was meant as a joke, but how reproducible are the computational results published in "real" journals?

h2. Open Source

* Reproducibility is not only about the developer being able to reproduce the results; everyone else should be able to as well, which requires access to the source code.

h2. Version Control System

* Subversion
* Git
* Bazaar
* Mercurial, etc.

The output files must be tied to a specific revision of the code. This implies always committing before a production run.

h3. Simplest approach

<pre>
#include <stdlib.h> /* system() */
int main(int argc, char * argv[]) {
    system("svn info > svn.info"); /* record the working-copy revision next to the output files */
    ...
}
</pre>

h3. Improvements

* We could also check the working-copy status, commit automatically before a production run, etc.
** For that, it would be better to use an API instead of system(), e.g. http://rapidsvn.tigris.org/svncpp.html
* Always create a branch for each production run?
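The first improvement, checking the working-copy status before a production run, could be sketched roughly as follows. This is only a sketch under stated assumptions: it still shells out to the "svn" command-line client through POSIX popen() rather than using a Subversion API, the helper names capture_output and working_copy_is_clean are hypothetical, and it assumes that empty output from "svn status -q" means there are no local modifications.

```cpp
#include <cassert>
#include <stdio.h>   // popen, pclose (POSIX), fgets
#include <string>

// Hypothetical helper: run a shell command and collect what it writes to stdout.
// Returns false if the command could not be started or exited with a non-zero status.
bool capture_output(const std::string& command, std::string& output)
{
    assert(!command.empty());
    output.clear();
    FILE* pipe = popen(command.c_str(), "r");
    if (pipe == nullptr) {
        return false;
    }
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        output += buffer;
    }
    return pclose(pipe) == 0;
}

// Hypothetical check: the working copy counts as clean when the status
// command succeeds and prints nothing (assumption: "svn status -q" lists
// only locally modified items).
bool working_copy_is_clean(const std::string& status_command = "svn status -q")
{
    std::string output;
    return capture_output(status_command, output) && output.empty();
}
```

A production run could call working_copy_is_clean() at the top of main() and refuse to start (or commit first) when it returns false, and only then write svn.info. The status command is a parameter so that the same check can be pointed at another version control client.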

h2. Parameters and Configuration Files

* Output files must also be tied to the runtime parameters that produced them.
* Use parsers for parameters and command-line arguments (e.g. getopt, Boost.Program_options).
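As a minimal sketch of both points, the following parses command-line arguments with POSIX getopt and writes the values actually used to a file next to the outputs. The option letters, the RunParameters fields, and the ".params" file name are all hypothetical and chosen only for illustration; std::stoi and std::stod throw on malformed numbers, which a real program would want to handle.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <unistd.h>  // getopt, optarg, optind (POSIX)

// Hypothetical runtime parameters of a simulation, with default values.
struct RunParameters {
    int steps = 1000;
    double temperature = 1.0;
    std::string output_prefix = "run";
};

// Parse "-n <steps> -t <temperature> -o <prefix>".
// Returns false if an unknown option or a missing argument is encountered.
bool parse_parameters(int argc, char* argv[], RunParameters& params)
{
    optind = 1;  // restart scanning so the function can be called more than once
    int option;
    while ((option = getopt(argc, argv, "n:t:o:")) != -1) {
        switch (option) {
            case 'n': params.steps = std::stoi(optarg); break;
            case 't': params.temperature = std::stod(optarg); break;
            case 'o': params.output_prefix = optarg; break;
            default:  return false;
        }
    }
    return true;
}

// Record the parameters that were actually used in "<prefix>.params",
// so that every output file can be traced back to them.
bool write_parameters(const RunParameters& params)
{
    std::ofstream file(params.output_prefix + ".params");
    file << "steps = " << params.steps << '\n'
         << "temperature = " << params.temperature << '\n';
    return static_cast<bool>(file);
}
```

Calling parse_parameters(argc, argv, params) followed by write_parameters(params) at the start of main() leaves a small text file beside the results; combined with the recorded revision, that is enough to identify which code and which inputs produced a given output.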

h2. Automation

* Automation is not only about saving time: an automated task runs the same steps in the same order every time, which makes it much easier to reproduce than a manual procedure.

h1. Other reproducibility concerns

* http://www.johndcook.com/blog/2008/05/26/reproducible-scientific-computing/
* http://www.reproducibility.org/wiki/Reproducibility
* http://www.reproducibility.org/wiki/Reproducible_computational_experiments_using_SCons
* http://wwwcdf.pd.infn.it/~loreti/science.html