We contribute to several open-source projects, including R and Bioconductor, and core member Henrik Bengtsson is the creator of the Aroma Project. Being part of these projects helps us keep up to date with the field and get invaluable feedback on our own software and work. Please contact Henrik Bengtsson to discuss software projects.
Here are some of the software tools that we have developed ourselves or contributed to:
One of our priorities is to provide scientifically sound and reproducible research results. To achieve this, we use a large number of high-quality computational software tools provided by either industry or academia. We try to use open-source software as much as possible, particularly because it is key to reproducible research.
The amount of data being collected in genomic research has grown dramatically. Less than a decade ago, Affymetrix SNP array data (~60 MB/sample) were considered large, and many software tools could handle only 10-20 arrays in multi-sample studies. This was one of the reasons Henrik Bengtsson developed the Aroma Project, which handles tens of thousands of arrays even on systems with limited memory resources. When high-throughput sequencing (HT-Seq) entered the arena, there was a paradigm shift in the amount of data that needed to be processed. Sequencing the DNA of a single human genome at 50 times coverage produces a ~250 GiB data file of aligned reads. Yes, that is a file roughly 4000 times larger than what we get with microarray technologies. (This does not mean that we get 4000 times more "information" from HT-Seq data, but that is a different story.) We are now extending the Aroma Project to support HT-Seq analysis as well.
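As a rough check of the ~4000-fold figure, the short Python sketch below divides the two file sizes quoted above. The variable names are illustrative only, and because the text does not say whether "60 MB" is meant in decimal or binary units, both readings are computed.

```python
# Back-of-the-envelope check of the file-size ratio quoted in the text.
# Assumption: the two sizes (~60 MB and ~250 GiB) are taken at face value.

MB = 10**6    # decimal megabyte
MiB = 2**20   # binary mebibyte
GiB = 2**30   # binary gibibyte

hts_bytes = 250 * GiB  # aligned reads, one human genome at 50x coverage

# Two readings of the "~60 MB/sample" microarray figure.
ratio_decimal = hts_bytes / (60 * MB)
ratio_binary = hts_bytes / (60 * MiB)

print(f"Ratio if 60 MB is decimal: about {ratio_decimal:.0f}x")
print(f"Ratio if 60 MB is binary:  about {ratio_binary:.0f}x")
```

Either reading gives a ratio between 4000 and 4500, which is consistent with the rounded figure in the text.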
We are experienced in programming languages such as C, C++, Java, Perl, Python and R, to name a few.