About 7,000 rare hereditary diseases affect roughly 8% of the EU population, or about 36 million people. Identifying the mutations that cause these diseases is therefore an essential step towards both diagnosis and the development of treatments. Together with a multidisciplinary team of experts from BioProdict and Vartion, we are working to predict (and eventually explain) the functional effects of variants of unknown clinical significance.

Our EFRO-funded project, Diagnostics-in-3D, uses DeepRank, an advanced deep learning framework. We combine information on the protein structural features surrounding missense variants with the evolutionary significance of the variant positions, and let neural networks learn from these variant environments. The result is a probability estimate of whether a variant is disease-causing.
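
As a rough illustration of what a "variant environment" can look like as network input, the sketch below maps per-atom feature values onto a 3D grid centered on the variant position, which is the kind of volume a 3D-CNN consumes. This is not DeepRank's API: the function names, box size, grid resolution, Gaussian width, and the untrained linear scoring step that stands in for the network are all assumptions made for illustration, and the atom data is random.

```python
import numpy as np


def voxelize_environment(coords, values, center, box_size=20.0, n_bins=11, sigma=1.5):
    """Spread per-atom feature values onto a cubic grid around `center`.

    Hypothetical helper: each atom contributes a Gaussian blob weighted by
    its feature value (e.g. a charge or a conservation score).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    center = np.asarray(center, dtype=float)

    # Grid point positions along one axis, relative to the variant position.
    half = box_size / 2.0
    axis = np.linspace(-half, half, n_bins)
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    grid_points = np.stack([gx, gy, gz], axis=-1) + center  # shape (n, n, n, 3)

    grid = np.zeros((n_bins, n_bins, n_bins))
    for xyz, value in zip(coords, values):
        dist_sq = np.sum((grid_points - xyz) ** 2, axis=-1)
        grid += value * np.exp(-dist_sq / (2.0 * sigma ** 2))
    return grid


def sigmoid(x):
    """Squash a raw score into a probability-like value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


# Toy data: 30 random atoms within 8 angstroms of a made-up variant position.
rng = np.random.default_rng(0)
variant_position = np.array([12.0, -3.0, 7.5])
atom_coords = variant_position + rng.uniform(-8.0, 8.0, size=(30, 3))
atom_feature = rng.uniform(0.0, 1.0, size=30)

environment = voxelize_environment(atom_coords, atom_feature, variant_position)

# Placeholder for a trained 3D-CNN: random, untrained weights and a sigmoid.
# The resulting number only demonstrates the output format, not a prediction.
weights = rng.normal(0.0, 0.01, size=environment.shape)
probability = sigmoid(np.sum(weights * environment))
print(environment.shape, float(probability))
```

In practice several such grids, one per structural or evolutionary feature, would be stacked as input channels, and the placeholder scoring step would be replaced by a trained network.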

We have tailored DeepRank's 3D-CNN framework to this problem. A key aspect of the project is the diversity of the variant environments captured from protein structures, which differ depending on the protein family each protein belongs to. Taking this diversity into account, we are now evaluating the performance of our tool.