A study on the relationship between Bacillus Calmette-Guérin (BCG) vaccination and Covid-19 prevalence: Do other confounders warrant investigation?

The Covid-19 pandemic, which originated in Wuhan, Hubei province, China, and quickly spread to the rest of the globe, is caused by the SARS-CoV-2 coronavirus. Preliminary data suggest a relationship between the BCG vaccine and the prevalence of Covid-19. The vaccine is used in the prevention of tuberculosis, a disease that is most prevalent in developing countries. To determine the potential protective role of Bacillus Calmette-Guérin (BCG) vaccination, this study investigated the occurrence of Covid-19 and compared its spread in countries that offer BCG vaccination with its spread in those that do not. To determine if some SARS-CoV-2 strains were more prevalent than others, the study also performed a phylogenetic analysis of the strains from the representative countries. To achieve these objectives, the study utilized publicly available data on population size, vaccination coverage and Covid-19 cases. The study revealed a significant negative trend between countries that offer the BCG vaccine to the general population and the reported cases of Covid-19. The study proposes future molecular and immunological analyses to determine the potential role of BCG vaccination in protection against Covid-19. These analyses will determine if the BCG vaccine has antiviral properties, with the possibility of recommending it for widespread use if supported by scientific data.


INTRODUCTION
Over the last century, mankind has faced a few pandemics associated with the emergence of new microorganisms, the latest being the Covid-19 pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Andersen et al., 2020). As of 30th April 2020, this virus was present in 212 countries and territories (Kang et al., 2020). Pandemics occur when novel microorganisms, usually viruses, emerge as a result of mutations capable of coding for new potentiator biomolecules, such as the spike glycoprotein of SARS-CoV-2. Consequently, the new machinery may enable the pathogen to jump between hosts, for instance from bats to pangolins and then from the intermediate host to humans (Andersen et al., 2020). Prompt data collection on novel organisms is critical to understanding the behavioral and infectious trends of these pathogens, and to disease mitigation through the generation of relevant antimicrobial agents (AMA) and vaccines. Efforts to generate effective AMA for novel organisms are challenged by a limited understanding of the infectious agent, and hence monitoring basic trends becomes crucial. Following the spread of Covid-19, there have been mixed and unprecedented trends, and varying findings are still accumulating. Interestingly, for instance, countries that widely practice BCG vaccination against tuberculosis (TB), a disease caused by a bacterium (Mycobacterium tuberculosis) that has a deleterious respiratory effect (Curtis et al., 2020), have generally reported low infection and fatality rates (Miyasaka, 2020). This has led to suggestions that BCG vaccination could be slowing the spread of, or helping to protect against, Covid-19 (Curtis et al., 2020). While these low reported rates are likely, in part, a consequence of under-funded health systems, it is vital to investigate the observation.
Previous studies have demonstrated that the BCG vaccine has non-specific beneficial effects (Curtis et al., 2020; Uthayakumar et al., 2018), including increased protection from a variety of respiratory infectious diseases (Prentice et al., 2015). Some of these studies have revealed possible epigenetic changes that subsequently regulate cytokine production (Uthayakumar et al., 2018; Arts et al., 2018). Coincidentally, most Covid-19 victims die due to unregulated cytokine production.
The study sought to determine whether BCG vaccination coverage correlates with Covid-19 cases, and thus whether the suggested pattern indeed arises. It has also been argued that, because of variation in geographical positioning and in the comorbidities of the affected people (Guan et al., 2020), pandemics tend to cluster and generally impact different parts of the world to varying extents (https://coronavirus.jhu.edu/). The novel Covid-19 virus is no exception, with early data showing variation in its distribution across different regions of the world (Ensheng et al., 2020). Herein, we discuss how the impact and the behavioral trend of the novel Covid-19 virus might have been influenced by geographical positioning and by immunization of the populace, especially against TB. The current study aimed to evaluate the relationship between BCG vaccination, the isolated coronavirus strains, and the prevalence of Covid-19.

BCG vaccination vs Covid-19 infected population
To establish if there was a correlation between Covid-19 infections and the BCG vaccination, representative countries across the globe

Phylogeographic analysis
A phylogeographic analysis was performed to understand the historical and contemporary evolutionary changes of SARS-CoV-2 as it spread across the world relative to its origin in Wuhan, China. The genomic data used were mined from publicly available genomic databases: the Global Initiative on Sharing All Influenza Data (GISAID) (https://www.gisaid.org/) (Elbe and Buckland-Merrett, 2017) and https://www.ncbi.nlm.nih.gov/. The accession numbers for the two databases start with EPI_ISL_ and MT, respectively. The nucleotide sequences utilized include the following: MT412225. In some countries, such as the USA and the United Kingdom, several genomes were selected for different geographical locations. In the USA, for instance, genomes from Washington (USA-WA), New York (USA-NY) and Louisiana (USA-LA) were identified and utilized, representing the West Coast, East Coast and Southern states of the USA, respectively.
Phylogenetic analysis of the clades of SARS-CoV-2 viruses from the pandemic was performed to determine the geographic spread of the different clades. This was intended to reveal whether there was a dominant clade and, if so, its distribution across the globe. The analysis was carried out using Nextstrain (https://nextstrain.org/).

BCG vaccination vs Covid-19 infected population
Data obtained from various databases were tabulated, factoring in BCG vaccination coverage vs Covid-19 infected population (Table 1). The details included the population of the candidate countries, BCG vaccinations and Covid-19 infections. Further, the Covid-19 infection data were standardized to cases per million for comparison across the representative countries and represented graphically (Table 1 and Figure 2). Interestingly, there were high

Phylogeographic analysis
To further understand the genetic relationship and changes in the representative countries, a phylogeographic analysis was conducted.
Following alignment of the genomic data, a phylogeographic analysis was performed using the Tamura-Nei model on MEGA X, as shown in Figure 4. The alignment sequences can be accessed via https://doi.org/10.6084/m9.figshare.12246302.v1. Evolutionary history was inferred using the Maximum Likelihood method and the Tamura-Nei model. The tree with the highest log likelihood (-41348.44) is shown (Figure 4) and was rooted on the SARS-CoV-2 nucleotide sequence from China-Wuhan (encircled). Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using the Tamura-Nei model, and then selecting the topology with the superior log likelihood value. There were a total of 29911 positions in the final dataset. Phylogenetic analysis of SARS-CoV-2 viruses revealed that clade A2a was the most common in all regions.
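The Tamura-Nei model treats purine transitions (A/G), pyrimidine transitions (C/T) and transversions as separate classes of substitution. As a minimal illustration, and not a reproduction of the MEGA X computation, the Python sketch below counts the proportions of these three classes between two aligned sequences. The function name and the toy sequences are hypothetical and are not taken from the study's alignment.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_proportions(seq_a, seq_b):
    """Return (P1, P2, Q) for two aligned sequences.

    P1: proportion of sites with a purine transition (A <-> G)
    P2: proportion of sites with a pyrimidine transition (C <-> T)
    Q:  proportion of sites with a transversion
    Sites containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    p1 = p2 = q = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: not comparable
        compared += 1
        if a == b:
            continue
        if a in PURINES and b in PURINES:
            p1 += 1
        elif a in PYRIMIDINES and b in PYRIMIDINES:
            p2 += 1
        else:
            q += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return p1 / compared, p2 / compared, q / compared


# Hypothetical toy sequences, for illustration only.
reference = "ACGTACGTAC"
isolate = "GCGTATGTCC"
print(substitution_proportions(reference, isolate))
```

A full Tamura-Nei distance additionally weights these proportions by the base frequencies; that step is omitted here.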

DISCUSSION
The Covid-19 pandemic has had a global impact (Figure 1) and a high fatality rate. By the end of April 2020, there were 3,303,296 confirmed cases and 235,290 deaths globally. Emerging in silico predictions point to even more infections and fatalities. European countries and the USA have reported the highest number of cases per one million people (Ensheng et al., 2020). India and the source of the pandemic, China, had fewer cases per one million people than the USA, Italy, Spain, the Netherlands, and the UK, even though they are the most populated countries in the world (Table 1). The African countries under study had not reported high numbers of Covid-19 cases by the date this article was written (end of April 2020). However, there was an initial fear that infection rates could be higher owing to weaker healthcare systems, the presence of Chinese expatriates (about 2 million Chinese nationals live and work in Africa), and the associated increased travel between China and Africa for education, business and leisure (Kapata et al., 2020).
Coincidentally, countries with lower cases per million people had one thing in common: BCG vaccination (Figure 2). Interestingly, Iran, a country that abandoned offering a booster BCG vaccine in 1999 (Zwerling et al., 2011), registered high cases of Covid-19 (Figure 3). It is, however, hard to postulate the impact that may have had on Covid-19 incidence. The low number of cases in China could be due to the underlying effects of BCG vaccination and/or massive efforts by the government to contain the virus. Overall, countries with low TB incidence (mostly western countries) that do not administer the BCG vaccine to the general population had the highest Covid-19 cases per million people. Countries such as England stopped BCG vaccination in 2005 (Fine, 2005), whereas Australia abandoned its program in the 1980s due to a reduction in tuberculosis incidence (Zwerling et al., 2011). India has an expansive BCG vaccination program (Zwerling et al., 2011). The country registered 27 cases per million people by the end of April 2020. Additionally, India's first reported case was on January 30, 2020 (Table 1), and by April 5, 2020, the 1.3 billion people in India were put into lockdown. This precaution could have added to the containment of the virus spread. An even more interesting contrast was the low number of Covid-19 cases in the United States' southern neighbor, Mexico: while Mexico reported 152 cases per million by the end of April 2020, the USA had 3,424 cases per million by the same date. A similar trend was seen between the USA and Argentina, a South American state with high BCG vaccination coverage.
Figure 2. Covid-19 infected population in selected countries across the world. The data were standardized to cases per million. There is an increase in Covid-19 infections in countries that do not widely practice BCG vaccination. Data analysis and graphing were performed using GraphPad Prism version 8.
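The cases-per-million standardization behind these comparisons is simple arithmetic. It can be sketched in Python as follows. The function name and the example country counts are hypothetical. Only the two per-million figures for the USA and Mexico are taken from the text.

```python
def cases_per_million(confirmed_cases, population):
    """Standardize a raw case count to cases per one million people."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return confirmed_cases * 1_000_000 / population


# Hypothetical country: 5,000 confirmed cases in a population of 10 million.
print(cases_per_million(5_000, 10_000_000))

# Per-million figures quoted in the text for the end of April 2020.
usa_per_million = 3424
mexico_per_million = 152
ratio = usa_per_million / mexico_per_million
print(f"USA reported about {ratio:.1f} times Mexico's cases per million")
```

Standardizing in this way allows countries with very different population sizes, such as India and the Netherlands, to be placed on the same axis.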
BCG vaccination coverage had a negative correlation with reported Covid-19 cases (p < 0.0001, R² = 0.5707). Regression analysis revealed a strong association between low numbers of Covid-19 cases and BCG vaccination (mean value of 75.54%) (Figure 3). However, high numbers of Covid-19 cases cannot be explained by the lack of BCG vaccination alone. There are other risk factors, such as comorbidity, age, and socio-economic factors including living conditions (Guan et al., 2020). Phylogenetic analysis using genomic data reveals that some highly impacted countries shared strains with the same ancestral nodes as countries with fewer reported Covid-19 cases (Figure 4). This could mean that shared strains alone may not determine case numbers, and that the BCG vaccine, which is common to the countries reporting fewer cases, could be one factor among other confounders. Further analysis revealed that clade A2a was the most common in Africa, North and South America, Europe, and Asia (Figure 5). Clade A2a was also the most common in New York City's most affected areas of Manhattan, Brooklyn, Bronx and Westchester (Gonzalez-Reiche et al., 2020). The population densities of the New York City boroughs and their neighborhoods are considerably high, but these areas are not as poorly sanitized as slums in developing countries (Corburn and Hildebrand, 2015). In such slums, overcrowding (for example, Kenya's Kibera slums have 30,000 persons per square mile) (Njuguna et al., 2013), poor infrastructure, and inadequate social amenities (Kamau and Njiru, 2018) would have been expected to lead to a catastrophic situation. Further, the cases of Saudi Arabia and Egypt (both with >95% BCG vaccine coverage) emphasize the possible existence of other confounders besides BCG vaccination (Table 1). Countries with widespread BCG vaccination continue to manifest low levels of Covid-19 spread.
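For a simple linear regression with one predictor, the coefficient of determination R² equals the square of the Pearson correlation coefficient between the two variables. The Python sketch below shows that calculation under stated assumptions: the coverage and case values are hypothetical placeholders and not the Table 1 data, the function name is illustrative, and no p-value is computed. The study's own analysis was performed in GraphPad Prism.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero variance has no correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder values, NOT the study's data.
bcg_coverage_percent = [10.0, 40.0, 70.0, 95.0]
covid_cases_per_million = [3000.0, 2000.0, 900.0, 100.0]

r = pearson_r(bcg_coverage_percent, covid_cases_per_million)
r_squared = r ** 2  # equals R^2 of the one-predictor linear regression
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

A negative r with a moderate R², as reported in the text, indicates that coverage explains part but not all of the variation in cases, which is consistent with the presence of other confounders.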
Although BCG vaccine-induced changes have been found to correlate with protection against experimental viral infections (Arts et al., 2018), it is not yet clear whether there is such protection against coronaviruses. An understanding of the effects of the BCG vaccine (Ozdemir et al., 2020) and of its mechanism of action will be key to making informed, evidence-based decisions (Kumar and Meena, 2020). Phylogeographic analysis indicates initial occurrence in Wuhan, China, in November to December 2019, leading to transmission across the globe. Although clade A2a, isolated early during the pandemic from EU countries in February 2020, is the most common, evolution within this clade has been difficult to resolve (Gonzalez-Reiche et al., 2020). Consequently, it is assumed that other factors, such as but not limited to weather, sanitation, comorbidity, the intensity of Covid-19 testing, the different stages of the pandemic's spread in each region, and continuous evolution of the virus, are crucial confounders which warrant further individual and/or collective investigation.
Figure 5. Phylogeographic clades of SARS-CoV-2. The phylogeny showed that clade A2a (indicated in freeform) was the most common in representative geographical locations. The clades were generated through Nextstrain (https://nextstrain.org/).

Conclusion
This study assessed whether there is a correlation between BCG vaccination in representative countries across the world and the number of reported Covid-19 cases. The study revealed an inverse trend between countries that widely offer BCG vaccination to the general population and reported cases of Covid-19. Whereas the study establishes a correlation, the contribution of other factors, including but not limited to testing capabilities, demographics, and disease burden, needs to be investigated to further augment these findings. To elucidate any direct antiviral effect of BCG vaccination, molecular analyses should also be performed to determine the role of BCG vaccination in potential protection against coronaviruses. If the BCG vaccine is found to have coronavirus protection benefits, the clinical need for BCG vaccination may need to be re-evaluated in countries that do not predominantly administer it, particularly for vulnerable and easily exposed population groups such as the elderly, healthcare workers, people with pre-existing conditions and those with other concerning comorbidities. Consequently, evidence of protection against viral respiratory infectious agents should underscore the need for developing countries to continue the prompt administration of BCG vaccination to offer continued beneficial effects to their populations.