Research casts doubt on pangolin link to SARS-CoV-2


A new study by researchers at the Broad Institute of MIT and Harvard and the University of British Columbia, published on the preprint server bioRxiv* in July 2020, questions the hypothesis that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) was transmitted to humans from the pangolin acting as an intermediate host. Instead, the authors suggest that the pangolins themselves happened to be infected by other captive animals or by humans.

Study: Single source of Pangolin CoV with spike RBD almost identical to SARS-CoV-2. Image credit: 2630ben / Shutterstock

Similar spike sequences in SARS-CoV-2 and Guangdong pangolin CoV

Since the beginning of the current SARS-CoV-2 pandemic, the spike protein of the virus has been known to have a sequence almost identical to that of the Guangdong pangolin coronavirus (CoV). In fact, the two share all five essential residues and 97% amino acid identity in the spike receptor-binding domain (RBD).

The significance of this finding is that the viral spike protein is the site where the virus binds to host cells. It is also an important antigen and determines the host specificity of the coronavirus. The very high similarity between the spike proteins of the pangolin CoV and the currently circulating virus seemed to indicate that this animal may have been the first animal reservoir.

Bat CoV and SARS-CoV-2

The other closely related spike sequence among coronaviruses belongs to the bat CoV RaTG13, which is 90.1% similar at the amino acid level. The genome of this virus is 96% identical to that of SARS-CoV-2, an even closer match than the Guangdong pangolin strain, which has about 90% identity. Even so, because of the similarity of its spike RBD, the pangolin has been considered the probable last intermediate host of SARS-CoV-2 before the final jump across the species barrier to infect humans.

Guangdong and Guangxi viruses

The Guangdong pangolin CoV was isolated in March 2019 from a single batch of smuggled pangolins in the province of that name. It was first described in October 2019 by Liu and colleagues. Several researchers then analyzed gene sequences of the CoV isolated from the same batch of animals.

Another research team described CoVs from a different batch of pangolins in Guangxi, but these were only 86% to 87% identical to SARS-CoV-2.

The study: same sequences, different papers

The current study aimed to examine the individual sequences from the various studies to find the one that most broadly represents the strain, which would be the best choice for future research.

Surprisingly, the researchers found that most of the pangolin sequences were repeated rearrangements or mappings of the same genetic material, based on the dataset from the metagenomic study by Liu et al. in the journal Viruses.

Read profiles of the metagenomic datasets of Liu et al. (2019, Viruses), Xiao et al. (2020, Nature), and Lam et al. (2020, Nature), mapped to the Xiao et al. Guangdong pangolin CoV genome sequence GD_1 (EPI_ISL_410721). Samples lung08 (described in Liu et al., Viruses, but reintroduced by Xiao et al., Nature, as M4) and pangolin_9 (sample M1, Xiao et al., Nature) have the most sequence data of all the samples analyzed in Liu et al., Viruses, and Xiao et al., Nature, respectively. The “lung08 + pangolin_9” track shows their combined read coverage. The “Liu et al. (2019)” track shows the pooled read coverage from all pangolin samples with mapped reads. The “Xiao et al. (2020)” track shows the pooled read coverage from all samples with mapped reads that are unique to Xiao et al., Nature.

Issues with previous sequence publications

The problems begin with the paper by Xiao et al. in Nature. The sequences in that paper turn out to be the same as those previously published by Liu et al., but they were presented without proper reference. As a result, it has been difficult to recognize that they are identical, especially since Xiao et al. also used the term “total reads” in two different ways.

For two of the samples, total reads refers to the number of sequenced fragments, while for the remaining samples, total reads counts two reads per fragment, which doubles the apparent library size. These counts had to be cut in half before Liu's pangolin samples could be shown to match those used by Xiao et al.
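The halving step can be illustrated with a short Python sketch. The function name, sample names, and read counts below are hypothetical and chosen only for illustration; they are not values from either paper, and this is not the authors' own code.

```python
def fragment_count(total_reads: int, counts_both_mates: bool) -> int:
    """Convert a reported "total reads" figure into a number of fragments.

    If the reported figure counts both reads of every paired-end fragment,
    it is halved; otherwise it is already a fragment count.
    """
    if total_reads < 0:
        raise ValueError("total_reads must be non-negative")
    if not counts_both_mates:
        return total_reads
    if total_reads % 2 != 0:
        raise ValueError("a count of paired reads should be even")
    return total_reads // 2


# Hypothetical libraries: (reported total reads, does it count both mates?)
reported = {
    "sample_A": (1_000_000, False),  # reported as fragments
    "sample_B": (2_000_000, True),   # reported as two reads per fragment
}

normalized = {
    name: fragment_count(reads, both) for name, (reads, both) in reported.items()
}
print(normalized)
```

Once both figures are expressed as fragments, the two hypothetical libraries turn out to be the same size, which is the kind of match that the inconsistent use of “total reads” would otherwise hide.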

Another issue is that one sample from Liu et al. (Viruses) and one from Xiao et al. (Nature) span the same sequences. However, the Liu et al. sample that provided most of the sequence data had a low metagenomic read depth at the spike RBD. The sequence published by Xiao et al. was based on the same metagenomic analysis, but on a sample that does not match the Liu sample. These reads do not cover the amino acids of the spike RBD motif that are important for binding to the ACE2 receptor on human host cells.

In addition, Xiao et al. used six pangolin samples to generate the complete spike gene sequence. Nevertheless, the source data for the spike RBD sequence remain unpublished, except for the final GD_1 sequence.

The paper by Xiao et al. shares another issue with the paper published by Liu et al. in the journal PLoS Pathogens: the sequences obtained by targeted PCR assay to bridge gaps in the genome are not available. Because of this, the current researchers were unable to replicate the genome sequences independently.

The method used by these groups was to assemble the genome by pooling sequences from several different pangolin samples, on the claim that all samples contained the same type of CoV. In the second study by Liu et al., sequences from 2 of 21 pangolins confiscated in March 2019 and 1 of 6 confiscated in July 2019 were pooled to build a single sequence.

The sequences from the July 2019 sample are said to be less abundant than those from March 2019, but they have not been published. How much of the Liu et al. pangolin CoV genome was obtained from the July sequences therefore remains unknown. The problem, as the researchers explain, is that sequence errors cannot be distinguished without access to the raw sequence data, including the gap-filling sequences.

The researchers therefore stress one point: as expected from their dependence on the same dataset and the same pangolin source, the Xiao et al. genome GD_1 (GISAID: EPI_ISL_410721) and the Liu et al. genome MP789 (updated May 18, 2020) share 99.95% nucleotide identity.
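For readers unfamiliar with how a figure such as 99.95% nucleotide identity arises, the following Python sketch computes percent identity from two sequences that have already been aligned. It is a simplified illustration, not the method used in the study: it assumes that alignment columns containing a gap in either sequence are skipped (tools differ on this convention), and the sequences are toy data.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Assumption for this sketch: columns with a gap in either sequence are
    excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == gap or base_b == gap:
            continue  # skip gapped columns
        compared += 1
        if base_a.upper() == base_b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: one mismatch in 2,000 aligned positions.
seq_a = "A" * 2000
seq_b = "A" * 1999 + "C"
print(round(percent_identity(seq_a, seq_b), 2))
```

On this scale, an identity of 99.95% corresponds to roughly one differing position in every 2,000 compared.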

Non-identity of other pangolin or bat CoV with SARS-CoV-2

In other words, the researchers say, there was a single identified source of pangolin CoV with a spike RBD nearly identical to that of SARS-CoV-2: the Guangdong pangolins. Further studies, such as those by Zhang et al. and Lam et al., also used the dataset from Liu et al. (Viruses). In the latter case, the researchers obtained additional pangolin samples from the Guangzhou Customs Technology Center. However, they recovered viral sequences from only one scale sample, which they combined with the data of Liu et al. to generate a pangolin CoV reference sequence.

Lam et al. also sequenced a CoV from the Guangxi Zhuang Autonomous Region and found that its sequence was comparable to both the Guangdong pangolin CoV and the bat CoV RaTG13 in the spike RBD region. However, only one of the five key binding residues required for RBD-ACE2 engagement in SARS-CoV-2 is present in all three strains, whereas the Guangdong pangolin CoV sequence has all five.

The authors note that if any batches of Guangdong pangolins other than those smuggled in March 2019 yielded similar CoV sequences, particularly in the spike RBD, they could not find such data in the publications by Liu et al. and Xiao et al.

Pangolins may be incidental hosts, not intermediate hosts

As previously observed, it remains necessary to clarify whether the Guangdong and Guangxi strains are present in wild pangolins. Although 14 of the 17 infected pangolins in the first smuggled batch died within a month and a half, the virus was not detected again, either in a second batch of confiscated pangolins or in a longitudinal study of 334 confiscated pangolins in Malaysia. As a result, both the authors of the Malaysian study and the authors of the current study hypothesize that the pangolins carrying viruses similar to SARS-CoV-2 were only incidental hosts, infected by humans or by some other infected animal.

If only smuggled pangolins carry the virus, the infection may have originated in another species and been acquired in captivity. This is particularly likely because only one strain of virus was recovered from each batch of smuggled pangolins.


The study concludes that there is only a single source of pangolin CoV sharing an almost identical spike RBD with SARS-CoV-2, and that there is still no direct evidence that pangolins are an intermediate host for SARS-CoV-2. The authors nevertheless emphasize that pangolins and other trafficked animals should continue to be considered carriers of infectious viruses that can be transmitted to humans.

*bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.




