Getting there

Visiting address

Radboud Institute for Molecular Life Sciences
Geert Grooteplein 28
6525 GA Nijmegen


Enter building at: Radboud Institute for Molecular Life Sciences
Follow route 260

Center for Molecular and Biomolecular Informatics

The Center for Molecular and Biomolecular Informatics (CMBI) does research and education, and provides services in bioinformatics and cheminformatics.

Contact Head of the CMBI

Peter-Bram 't Hoen
head of department

+31 (0)24 361 93 90

Mission & vision

Our mission

Our mission is to add value to personal health data by translating them into integrative knowledge and actionable information.

Our vision 

  • CMBI develops bioinformatics approaches that contribute to the understanding of disease mechanisms, personalized therapies and interventions, and a learning health care system
  • CMBI is committed to the reusability of its data, tools, and services
  • CMBI provides bioinformatics researchers in Radboudumc with a platform for exchange of knowledge and expertise
  • CMBI contributes to the education of BMW, MLW, and MMD students so that they can apply (existing) bioinformatics tools and understand the principles behind them

Radboudumc Technology Center Bioinformatics

The Bioinformatics technology center aims to raise and solve biological questions using the most recent computer technologies and big data solutions.





Researchers at the CMBI contribute to several courses at both the Faculty of Science and the Medical Faculty. We teach Structural Bioinformatics, Comparative Genomics, and Data Analysis, as well as programming courses such as Java. We focus on Molecular Life Sciences students, Biomedical Sciences students, and participants in the master's programme Molecular Mechanisms of Disease, although Biology, Chemistry, and Medical students can also choose some of our courses.

Our mission is to give bachelor students a basic understanding of bioinformatics principles, as this has been shown to benefit those who want to pursue a career in the Life Sciences. Follow-up courses are available for those who want to gain greater insight into our field.

For MLS/Biology students it is even possible to follow the B-track by choosing a combination of (master) Bioinformatics courses and internships at Bioinformatics departments. 

Our researchers also teach in special interest courses and summer schools here at the Radboudumc, the Radboud University and elsewhere.


For more information, you can contact dr. Hanka Venselaar, education coordinator at the CMBI.




The CMBI offers a wide range of possibilities for internships.



Both bachelor and master students from studies such as Molecular Life Sciences, Biomedical Sciences, Chemistry and MMD are welcome. In general, we are flexible in terms of internship length, type of internship and type of research. 

Below you can find our internship projects:

  • Rationale and objective:

    A significant proportion of right-sided colonic precursor lesions of colorectal cancer (CRC) are still missed during colonoscopy because these lesions are often flat and non-ulcerated (sessile serrated lesions: SSLs). The high miss rate of SSLs, combined with their distinct and rapid progression to highly aggressive CRC, calls for significant improvements in detection. Almost all adenomas and carcinomas (89%) of the right-sided colon have solid bacterial biofilms. We postulate that these biofilms might be a good surrogate marker for precursor lesions and could be used to detect adenomas and early cancers. In this project we want to identify extracellular microbial target peptides specific for right-sided precursor lesions using a discovery-based metagenomics and proteomics approach.

    Study population:

    Metagenomics: DNA from archived tissue material of right-sided SSLs (n=20) and CAAs (n=20), collected between 2010 and 2018, has been extracted and sequenced. It will be compared with metagenomic sequencing of normal right-sided biopsies from the control population of the BaCo-study, consisting of patients without neoplastic lesions (n=35). Proteomics: Adult patients who are scheduled for an EMR procedure (n=20) for removal of colonic right-sided precursor lesions >10 mm in size will be asked to participate. Proteomics will be performed on surface-shaved samples of precursor lesions and corresponding normal biopsies.


    Next-generation library preparation and sequencing will be performed by Novogene using Illumina NovaSeq technology, generating an average of 4 million paired-end reads with an average size of 150 bases. The data will be combined with the sequencing data that we have generated for the metagenomes of biofilm-positive/negative normal tissues of 35 healthy controls following the same protocol (pipeline established by Daniel Garza). First, short sequencing reads will be filtered to remove tags, low-quality reads, and human DNA. The remaining reads will be assembled into longer contigs, and these contigs will be binned into metagenome-assembled genomes (MAGs). The quality and completeness of these genomes will be assessed by the distribution of single-copy ubiquitous genes using the CheckM method. The open reading frames (ORFs) of the assembled MAGs, and of high-quality contigs that were not binned into a MAG, will be used to generate a database of metagenome-predicted proteins. The proteins in this database will be extensively annotated by mapping them to a reference database of 10 million protein sequences of microorganisms (bacteria, archaea, viruses, protists, and fungi) that have previously been found in the human gut by metagenomic studies across the world.
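The idea behind assessing a genome bin by its single-copy marker genes can be illustrated with a minimal sketch. This is a simplified illustration, not CheckM itself: CheckM uses lineage-specific, collocated marker sets, whereas the function `assess_bin`, the four-gene marker set, and the example hits below are hypothetical and chosen only to show the arithmetic.

```python
from collections import Counter


def assess_bin(marker_hits, marker_set):
    """Estimate completeness and contamination (in percent) of one genome bin.

    marker_hits: marker-gene IDs detected in the bin, one entry per copy found.
    marker_set:  single-copy marker genes expected exactly once in a complete genome.
    """
    if not marker_set:
        raise ValueError("marker_set must not be empty")
    # Count copies of each expected marker; ignore hits outside the marker set.
    counts = Counter(m for m in marker_hits if m in marker_set)
    present = len(counts)                                # markers seen at least once
    extra_copies = sum(c - 1 for c in counts.values())   # copies beyond the first
    completeness = 100.0 * present / len(marker_set)
    contamination = 100.0 * extra_copies / len(marker_set)
    return completeness, contamination


# Hypothetical marker set and hits, for illustration only.
markers = {"rpsB", "rplA", "recA", "gyrB"}
hits = ["rpsB", "rplA", "rplA", "recA"]
completeness, contamination = assess_bin(hits, markers)
print(f"completeness={completeness:.1f}%  contamination={contamination:.1f}%")
```

In this toy case three of the four markers are present and one is duplicated, so a missing marker lowers completeness and a duplicated marker raises contamination.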

    Proteomics: The intact precursor lesions will be exposed to tryptic digestion to cleave all available protein targets on the outside of the lesion. Peptides will be analyzed by liquid chromatography combined with tandem mass spectrometry at the Radboud Technology Center for Mass Spectrometry, and identified using metagenomic sequence libraries of biofilm-positive cases. Bioinformatic analysis of the candidate marker proteins will be performed to identify potential targets that can be stained and visualized during endoscopy. True membrane-bound peptides will be selected based on annotated protein domains.

    Protein identification specific for precursor lesions: All identified proteins will be annotated by their taxonomic origin, biological function, protein domains, and cellular localization. A specific protocol named Inmembrane for cellular localization will be used for protein sequences of bacterial origin. This protocol is based on integrating different tools and using specific references for gram-negative and gram-positive bacteria to predict cellular localization, identifying outer membrane proteins, cell-wall proteins, extracellular proteins and proteins that are known to be shed and attached to the cell surface. A similar search strategy will be used for non-bacterial proteins, using Pfam.

    The predicted abundance of the identified microbial proteins will be used for target discovery. For this purpose we will first compare proteins individually, modeling the data with a negative binomial distribution using the DESeq protocol. Next, we will perform variable selection using Elastic Net (EN) and partial least squares discriminant analysis (PLS-DA) coupled with variable importance selection. Both models will be applied with 10-fold cross-validation to identify consistent combinations of proteins that best explain the difference between normal/biofilm-negative and precursor lesion/biofilm-positive tissues. Specificity will be tested by comparing the predictive value of our selected proteins on other metagenomes from the same biological material that we have previously generated, including biofilm-positive normal tissues of healthy controls, Lynch syndrome samples, and Inflammatory Bowel Disease samples. Finally, we will preselect proteins following three criteria: (i) significant FDR-corrected p-values from the DESeq analysis; (ii) high consistency scores in the variable selection procedures (EN and PLS-DA); and (iii) high specificity to biofilm-positive precursor lesions compared to healthy tissue of Lynch syndrome and inflammatory bowel disease patients. From these preselected proteins, we will further select a group of proteins that are highly expressed on the precursor lesion/biofilm-positive tissues and are predicted to have cell-wall, secretion, membrane, or extracellular localization domains. These proteins will be selected as potential targets for the visualization of biofilms during colonoscopy.
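The three preselection criteria can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the FDR correction is taken to be Benjamini-Hochberg, the consistency score is read as the fraction of cross-validation folds in which a protein was selected, and the thresholds, field names, and example records are hypothetical rather than part of the project protocol.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def preselect(proteins, alpha=0.05, min_consistency=0.8, min_specificity=0.9):
    """Keep protein IDs that pass all three criteria (thresholds are assumptions)."""
    padj = benjamini_hochberg([p["pvalue"] for p in proteins])
    return [
        p["id"]
        for p, q in zip(proteins, padj)
        if q <= alpha                            # (i) significant after FDR correction
        and p["consistency"] >= min_consistency  # (ii) consistently selected by EN / PLS-DA
        and p["specificity"] >= min_specificity  # (iii) specific to precursor lesions
    ]


# Hypothetical records, for illustration only.
example = [
    {"id": "P1", "pvalue": 0.001, "consistency": 0.90, "specificity": 0.95},
    {"id": "P2", "pvalue": 0.010, "consistency": 0.50, "specificity": 0.95},
    {"id": "P3", "pvalue": 0.020, "consistency": 0.95, "specificity": 0.92},
    {"id": "P4", "pvalue": 0.500, "consistency": 0.99, "specificity": 0.95},
]
selected = preselect(example)
print(selected)
```

Here P2 fails the consistency criterion and P4 fails the FDR criterion, so only P1 and P3 would move on to the localization-based selection.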


    • processing metagenomics and proteomics sequencing reads/profiles
    • annotation of protein targets
    • selection of membranous protein targets
    • differential expression between precursor lesions and controls

    Preferred background:

    Bioinformatics (MSc: MLS/BMS/BIO/MMD)
    Basic programming skills are required (Python, Bash, R)
    Length: at least 5-6 months, can always be prolonged


Bioinformatics Services

We maintain computational facilities, databases, and software packages in bioinformatics.


Diagnostics-in-3D progress update 2021

Update on the EFRO-funded project Diagnostics-in-3D.


About 7,000 rare hereditary diseases affect ±8% of the EU population, which translates to ±36 million people. Identifying the causative mutations of such diseases thus forms an essential step towards diagnosis and towards the development of treatment. At CMBI, together with a multidisciplinary team of experts from BioProdict and Vartion, we are working to predict (and eventually explain) the functional effects of variants of unknown clinical significance.

Our EFRO-funded project Diagnostics-in-3D uses an advanced deep learning framework known as DeepRank. We leverage information on the protein structural features surrounding missense variants, coupled with the evolutionary significance of variant positions, and allow neural networks to learn from these variant environments. The result is a probability estimate of whether a variant is disease-causing.

We have tailored DeepRank's 3D-CNN framework to address our problem statement. One of the key aspects of this project is the diversity of the variant environments captured from protein structures, which differ depending on the protein families they belong to. Taking this aspect into account, we are now evaluating the performance of our tool.


Our researchers CMBI

A list of researchers connected to this department.



Contact Business manager

Nicolai Giling
Business manager

+31 (0)24 3686538

Contact Operational manager

Barbara van Kampen
Operational manager

+31 (0)24 361 93 90

Contact Management assistant

Sandra de Leeuw
Management assistant


Research themes

Affiliated institutes

Radboud Institute for Health Sciences

This department is affiliated with RIHS. The research at this institute aims to improve clinical practice and public health.

Radboud Institute for Molecular Life Sciences

This department is affiliated with RIMLS. Their main aim is to achieve a greater understanding of the molecular mechanisms of disease.