Functional genomics of Indian SARS-CoV-2

Team Lead: Kumar Somasundaram


SARS-CoV-2 is a novel coronavirus that causes an acute respiratory disease (coronavirus disease 2019, COVID-19). It was first identified in China and has since spread all over the world. The World Health Organization (WHO) declared COVID-19 a pandemic on March 11, 2020. While the number of COVID-19 cases has risen to more than 5 million worldwide, it has only just crossed one lakh (100,000) in India (as on 23 May, 2020). This low infection rate, despite India accounting for roughly one-sixth of the world’s population, could be attributed to various factors, such as a long lockdown with effective social distancing, active identification of COVID-19 patients and their quarantine with proper treatment, presumed cross-immune protection, and possibly variation in the viral strains that were introduced into or are prevalent in India. Comparing viral genome sequences from different regions and countries allows us to identify the genetic diversity among viruses, which would help in ascertaining virulence and disease pathogenicity, as well as the origin and spread of SARS-CoV-2 between countries.

The objective of our study is to determine the genetic diversity among Indian SARS-CoV-2 viral isolates in comparison with the strains occurring worldwide. In addition to identifying the types of viral strains present in India, we anticipate that our study will help us understand the source of the virus, its route of spread, its transmission dynamics and disease severity. It may also help in selecting viral strains for vaccine development, choosing the right type of diagnostic kits, and possibly developing models for relaxing social distancing.

Current status

We have completed the analysis of 687 Indian viral genomes (as on 7th June, 2020) and have several interesting findings. The potential origins appear to be countries mainly in the Oceania, Europe, Middle East and South Asia regions, which strongly implies spread of the virus through the most travelled countries. Among the different clades of the virus identified by the Global Initiative on Sharing All Influenza Data (GISAID), certain clades are enriched among Indian SARS-CoV-2 viruses more than others. See below for details.

Phylogeny map

Bar plot showing clade distribution in India

Bar plot showing clade distribution among states of India


Next steps/Future directions

Our effort to analyze Indian SARS-CoV-2 genomes will continue as more sequences become available. We will also start sequencing the SARS-CoV-2 viruses made available to us. We look forward to finding India-specific genetic variation, and we will monitor the dynamics of different viral strains in India over time.

Efforts are also in progress to determine the functional impact of frequently occurring non-synonymous mutations on viral protein function, and to use this information to understand immune escape mechanisms and to develop mutant-specific therapies.
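For readers unfamiliar with the term, a non-synonymous mutation is a nucleotide change that alters the amino acid encoded by a codon, whereas a synonymous mutation leaves it unchanged. The sketch below illustrates that distinction only; it is not the analysis pipeline used in this study. The function name `classify_substitution` and the example codons are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third base varying fastest). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Label a codon change as 'synonymous' or 'non-synonymous'.

    Hypothetical helper for illustration. Accepts DNA or RNA letters
    in either case; raises ValueError for anything that is not a
    valid three-base codon.
    """
    ref = ref_codon.upper().replace("U", "T")
    alt = alt_codon.upper().replace("U", "T")
    if ref not in CODON_TABLE or alt not in CODON_TABLE:
        raise ValueError("codons must be three bases drawn from A, C, G, T/U")
    same_amino_acid = CODON_TABLE[ref] == CODON_TABLE[alt]
    return "synonymous" if same_amino_acid else "non-synonymous"


# Example codons chosen for illustration only.
print(classify_substitution("GAT", "GGT"))  # Asp -> Gly
print(classify_substitution("GAT", "GAC"))  # Asp -> Asp
```

In practice, the reference and mutated codons would come from aligning each isolate's genome against a reference sequence, a step not shown here.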


  • Kumar Somasundaram, faculty member, MCB
  • Mainak Mondal
  • Ankita Lawarde

