mcss-postprocess documentation

Introduction

mcss saves the simulation output it generates, for example time series of species levels, as an HDF5 data file. HDF5 is an open-source, versatile binary data storage format. Many applications (e.g. R) can import HDF5 directly, but because the simulation output may be very large, it is often better to extract only the required data and save it as plain text. The mcss-postprocess application allows you to do this. It can be used to perform common tasks such as extracting time series for specific species, or averaging species levels over many runs. mcss-postprocess outputs plain text, by default to the screen, or to a file if one is specified on the command line.

Installation

For instructions on how to compile and install mcss-postprocess, see the README file included with the mcss distribution.

Running mcss-postprocess

Once installed, mcss-postprocess is generally run by typing the following command:
$ mcss-postprocess [OPTIONS] HDF5_FILE
where HDF5_FILE is the filename of the data file output by mcss (specified by the data_file parameter in the mcss parameter file), and [OPTIONS] are command-line options specifying the actions mcss-postprocess is to perform. These command-line options are described below, and a summary can be seen by running mcss-postprocess with neither options nor an HDF5_FILE.

Running mcss-postprocess with only an HDF5 file (no options) will give a summary of the simulation data, including total time taken, number of reactions executed, and so on. For example, if the module1 model (in the examples/ directory) has been run with mcss, then running the following command:
$ mcss-postprocess module1.h5
will output something similar to:
model input file: module1.sbml
simulation algorithm: Multicompartment Gillespie
number of compartments: 9
number of species: 2
number of rule templates: 2
number of rules in templates: 5
total number of rules: 28
lattice x dimension: 3
lattice y dimension: 3
simulated time: 59 minutes 50 seconds (3590.870557 seconds)
simulation start time: 11:00:28 16/09/08
simulation end time: 11:00:28 16/09/08
total simulation time: 0 seconds
total preprocessing time: 0 seconds
total main loop time: 0 seconds
total reactions simulated: 14047
reactions per second: 14047
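
If the summary needs to be read by a script rather than by eye, the "label: value" lines shown above can be split into a dictionary. The following is a minimal Python sketch, assuming the summary text has already been captured (for example by saving it to a file) and that every line has the form "label: value". The function name parse_summary is hypothetical and is not part of mcss.

```python
def parse_summary(text):
    """Split 'label: value' lines into a dict (values are kept as strings)."""
    summary = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so clock times such as 11:00:28 stay intact.
        label, sep, value = line.partition(": ")
        if sep:
            summary[label.strip()] = value.strip()
    return summary


# A few lines taken from the example summary above.
example = """number of compartments: 9
simulation start time: 11:00:28 16/09/08
total reactions simulated: 14047"""

summary = parse_summary(example)
print(summary["simulation start time"])
print(int(summary["total reactions simulated"]))
```

Values are left as strings because the summary mixes counts, durations and dates; convert individual fields as needed.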

mcss-postprocess command-line options

-l: output a time series of the levels (i.e. number of molecules) of species. Specific compartments and species can be specified with the -c and -s options. The first line of the output is a header showing labels for each column. This may be switched off using the -p option.
-c c1,c2,...: specify which compartments to output data for. c1,c2,... is a comma-separated list of compartment identifiers, each given either as an index or as a position. Compartment indices can be found by first using the -m option to view the information on the compartments and noting their indices. Alternatively, a compartment position (x,y) is written as x.y, with the x and y coordinates separated by a full stop. For example, the following two commands do exactly the same thing: they output the levels of all species in the compartments at positions (0,1), (1,1) and (2,1) from the simulation output for the module1 model in the examples/ directory. The first command specifies the compartments by index, the second by position.
$ mcss-postprocess -l -c 4,3,5 module1.h5
$ mcss-postprocess -l -c 0.1,1.1,2.1 module1.h5
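When compartment positions are produced by a script, the x.y notation described above is easy to generate. The following Python sketch builds the value of the -c option from a list of (x, y) tuples. The helper name compartment_arg is hypothetical, and the sketch only assembles the command line; it does not run mcss-postprocess.

```python
def compartment_arg(positions):
    """Format (x, y) positions as the value of the -c option, e.g. '0.1,1.1'."""
    return ",".join(f"{x}.{y}" for x, y in positions)


# The three compartments at positions (0,1), (1,1) and (2,1) used above.
row = [(0, 1), (1, 1), (2, 1)]
command = ["mcss-postprocess", "-l", "-c", compartment_arg(row), "module1.h5"]
print(" ".join(command))  # mcss-postprocess -l -c 0.1,1.1,2.1 module1.h5
```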
-s s1,s2,...: specify which species to output data for. s1,s2,... is a comma-separated list of species identifiers, which can be obtained by using the -n option. For example, to output the levels of species A in the compartment at (2,2) for a run of the module1 model:
$ mcss-postprocess -l -s 0 -c 2.2 module1.h5
sum the levels of all the species in each compartment.
sum the levels of species over all compartments.
-p: do not output a header line with time series or propensity data.
output information (e.g. indices, names and identifiers) on the dataset, rules, compartments and species.
output data on reaction propensities. mcss must have been run with the log_propensities parameter set to 1 to use this option.
-e: average the levels of species over multiple runs of a simulation. For example, the module1 model may be run ten times. The output of each run must be saved to a file named module1.INDEX.h5, where INDEX is the number of the run. Every index must have the same number of characters, so for ten runs the first run is saved as module1.01.h5, the second as module1.02.h5, and the tenth as module1.10.h5. For 1000 runs the first run would be saved as module1.0001.h5. The -e option is followed by the number of runs, and instead of an HDF5 filename, the basename of the files (e.g. module1) is given after the options. To average the levels of species A in compartment (2,2) over 10 runs, the following command would be used:
$ mcss-postprocess -e 10 -s 0 -c 2.2 module1
For each species, three columns of data are output: the mean level, the standard deviation, and the confidence interval (95% interval by default).
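The file naming scheme for multiple runs can be generated rather than typed by hand. The following Python sketch lists the expected filenames for a given basename and number of runs. It assumes that the index width equals the number of digits in the number of runs, which matches the two examples above (two digits for 10 runs, four digits for 1000 runs). The function name run_filenames is hypothetical.

```python
def run_filenames(basename, n_runs):
    """List the expected per-run filenames, basename.INDEX.h5, for n_runs runs."""
    # Pad every index to the width of the largest one (10 -> 2, 1000 -> 4).
    width = len(str(n_runs))
    return [f"{basename}.{index:0{width}d}.h5" for index in range(1, n_runs + 1)]


names = run_filenames("module1", 10)
print(names[0], names[-1])                # module1.01.h5 module1.10.h5
print(run_filenames("module1", 1000)[0])  # module1.0001.h5
```

Such a list can be used, for example, to set the data_file parameter for each run or to check that every expected file exists before averaging.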
specify the confidence level used to calculate confidence intervals (95% by default).
save the output of mcss-postprocess to the specified file rather than outputting to the screen.
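To illustrate the three columns output when averaging over runs (mean, standard deviation and confidence interval), the following Python sketch computes these quantities for the levels of one species at a single time point across ten runs. It is only an illustration: it assumes the sample standard deviation and a normal approximation for the interval, and the method actually used by mcss-postprocess may differ. The function name summarise_runs and the example levels are hypothetical.

```python
import math
import statistics


def summarise_runs(levels, confidence=0.95):
    """Return (mean, standard deviation, confidence half-width) for one time point."""
    mean = statistics.mean(levels)
    sd = statistics.stdev(levels)  # sample standard deviation (n - 1 denominator)
    # Two-sided critical value of the standard normal distribution
    # (assumption: normal approximation, not necessarily what mcss uses).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sd / math.sqrt(len(levels))
    return mean, sd, half_width


# Made-up levels of one species at one time point in ten runs.
levels = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
mean, sd, half_width = summarise_runs(levels)
print(f"mean={mean:.2f} sd={sd:.2f} interval=+/-{half_width:.2f}")
```

Raising the confidence level widens the interval, while averaging over more runs narrows it.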

License

The mcss distribution, including all source code, model examples, and documentation, is the copyright of Jamie Twycross, and is released under the GNU GPL version 3 license.

Credits

mcss was written by Jamie Twycross, with contributions from Francisco Romero-Campero, Jonathan Blakes and James Smaldon. It is being used on Systems Biology research projects in the Centre for Plant Integrative Biology and the School of Computer Science, University of Nottingham, U.K. This work is funded by BBSRC grant BB/D0196131.

For further information or any questions please contact jpt AT cpib.ac.uk.

copyright 2008, 2009 Jamie Twycross, released under GNU GPL version 3.