Vol.3, No.3, 234-248 (2011) Natural Science
http://dx.doi.org/10.4236/ns.2011.33030
Copyright © 2011 SciRes. OPEN ACCESS

Receptor binding specificity and origin of 2009 H1N1 pandemic influenza virus

Wei Hu
Department of Computer Science, Houghton College, Houghton, USA; wei.hu@houghton.edu

Received 31 December 2010; revised 2 February 2011; accepted 3 February 2011.

ABSTRACT

Recently, a genetic variant of 2009 H1N1 has become the predominant virus circulating in the southern hemisphere, particularly Australia and New Zealand, and in Singapore during the winter of 2010. It was associated with several vaccine breakthroughs and fatal cases. We analyzed three reported mutations, D94N, N125D, and V250A, in the HA protein of this genetic variant. It appeared that the reason for D94N and V250A to occur in pairs was to maintain the HA binding to human type receptor, so the virus could replicate in humans efficiently. Guided by this interpretation, we discovered a new mutation A30V that could compensate for N125D as V250A did for D94N. We demonstrated that the presence of amino acids 30A and 125N in HA enhanced the binding to human type receptor, while 30V and 125D favored the receptors of avian type and of A/South Carolina/1/18 (H1N1). Furthermore, a combination of 94D, 125D, and 250V made the primary binding preference similar to that of A/South Carolina/1/18 (H1N1), and a combination of 94N, 125D, and 250A resulted in a primary binding affinity for avian type receptor, which clearly differed from that of A/California/07/2009 (H1N1), a strain used in the vaccine for 2009 H1N1. We also re-examined the origin of 2009 H1N1 to refine our knowledge of this important issue. Although the NP, PA, PB1, and PB2 of 2009 H1N1 were closest to North American swine H3N2 in sequence identity, their interaction patterns were closest to swine H1N1 in North America.
Keywords: 2009 H1N1; Hemagglutinin; Influenza; Informational Spectrum Method; Mutation; Receptor Binding Specificity

1. INTRODUCTION

New influenza viruses arise through genetic reassortment. The 2009 H1N1 virus is a novel virus with its eight gene segments derived from North American and Eurasian swine lineages [1]. Intensive research on this virus has been conducted, including a series of papers of our own [2-12]. These papers covered the mutations and correlated mutations in HA and NA, the stalk motifs in NA, HA receptor binding specificity, novel host markers, and interactions of the proteins of 2009 H1N1. Although the World Health Organization (WHO) declared an end to the 2009 H1N1 influenza pandemic on August 10, 2010, continued surveillance of the evolution of the 2009 H1N1 virus is still warranted.

The 2009 H1N1 virus had remained genetically stable since it emerged in March 2009. However, a genetic variant of 2009 H1N1 was first discovered in Singapore in early 2010 and then spread to Australia and New Zealand during the 2010 winter influenza season of the southern hemisphere. This variant became the predominant virus circulating in these three countries and was linked to several vaccine breakthroughs and fatal cases. As such, a vaccine update might be needed sooner than expected [13]. Several mutations were identified in genes HA, NA, PB2, PB1, NP, and NS1 of this variant, including three mutations, D94N, N125D, and V250A, in the HA protein. To examine the impact of these mutations, a structural homology model of HA from the A/Brisbane/10/2010 (H1N1) virus based on the A/California/04/2009 (H1N1) structure was constructed. Mutation N125D was found to be centrally located in the classical Sa epitope, potentially affecting antigenicity, and mutation V250A was located at an internal beta sheet below the receptor binding pocket facing the Sa epitope [13].
Additionally, the two mutations D94N and V250A tended to occur in pairs in the HA of this variant, so far exclusively in viruses circulating in Australia and New Zealand [13]. Given the potential significance of the mutations observed in this variant, it is imperative to further investigate their roles in HA receptor recognition.

In general, human influenza viruses bind preferentially to α2,6 receptors, typically found in the upper airway of humans, whereas avian influenza viruses tend to bind to α2,3 receptors, also found in the lower respiratory tract of humans. It is well documented that mutations in the HA protein may alter HA receptor binding selection. For example, mutations E190D, Q226L, or G228S in H1, H2, or H3 could switch binding preference from avian to human type receptor [14-17]. In [7], the informational spectrum method (ISM) was successfully applied to quantify the effects of mutations in the HA protein on its binding affinity. Position 222 resides in the receptor binding site of the HA protein of 2009 H1N1 and therefore may play a critical role in HA binding specificity. One of the findings in [7] indicated that mutation D222G in the HA of 2009 H1N1 enhances the selection for avian type receptors and reduces the selection for human type receptors. This finding was subsequently verified in an experiment [18], and mutation D222G was further found to be associated with severe clinical outcome [18-20]. Another recent experiment [21] showed that mutation D94N in H5 HA of avian origin increased the binding of HA to human type receptor while decreasing the binding to avian type receptor.

The purpose of this study was to elucidate, with ISM, a bioinformatics technique developed in [7,22-24], the impact of mutations D94N, N125D, and V250A in the HA protein of this variant on receptor affinity. We also sought to explore the origin of 2009 H1N1 and its connections with the swine lineages to enrich our understanding of this novel virus.

2. MATERIALS AND METHODS

2.1. Sequence Data

Protein and gene sequences of influenza were retrieved from the Influenza Virus Resource (http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html) of the National Center for Biotechnology Information (NCBI) and the EpiFlu Database (http://platform.gisaid.org) of GISAID. Only full-length and unique sequences were selected. All sequences used in this study were aligned with MAFFT [25].

2.2. Informational Spectrum Method

The informational spectrum method is a bioinformatics technique that can be used to analyze protein sequences. Prior to this analysis, the protein sequences have to be translated into numerical sequences. One such approach is to assign each amino acid its electron-ion interaction potential (EIIP), which represents the average energy of the valence electrons in the amino acid (Table 1). The application of EIIP to protein function analysis assumes that the strength of the electromagnetic field surrounding the protein is indicative of its biological function. This method was successful in revealing various protein properties.

The numerical sequence $x(m)$, $m = 1, 2, \ldots, N$, of a protein sequence is transformed into the frequency domain using the discrete Fourier transform (DFT). The DFT coefficients $X(n)$ are defined as

$$X(n) = \sum_{m=1}^{N} x(m)\, e^{-j 2\pi n m / N}, \quad n = 1, 2, \ldots, N/2,$$

where $N$ is the length of the sequence $x(m)$. The energy density spectrum is defined as

$$S(n) = X(n)\, X^{*}(n) = |X(n)|^{2}, \quad n = 1, 2, \ldots, N/2.$$

The informational spectrum (IS) of a sequence $x(m)$ comprises the frequencies and the amplitudes of its DFT. Peak frequencies of the IS of a protein sequence reflect its biological or biochemical functions. To determine the biological or biochemical functions common to a group of protein sequences, a consensus informational spectrum (CIS) can be used, which is defined as the product of the energy density spectra $S(n)$ of the sequences in the group. A measure of similarity for each peak is the signal-to-noise ratio (S/N), which is defined as the ratio of the signal intensity at that peak to the mean value of the whole spectrum [22]. The theory of CIS [26] states that:

1) Only one peak exists for a group of protein sequences sharing the same biological function.
2) No signal peak exists for biologically unrelated protein sequences.
3) Peak frequencies are different for different biological functions.

Table 1. The electron-ion interaction potential (EIIP) of amino acids used to encode amino acids.

| Amino acid | EIIP | Amino acid | EIIP |
|---|---|---|---|
| L | 0.0000 | Y | 0.0516 |
| I | 0.0000 | W | 0.0548 |
| N | 0.0036 | Q | 0.0761 |
| G | 0.0050 | M | 0.0823 |
| E | 0.0057 | S | 0.0829 |
| V | 0.0058 | C | 0.0829 |
| P | 0.0198 | T | 0.0941 |
| H | 0.0242 | F | 0.0946 |
| K | 0.0371 | R | 0.0959 |
| A | 0.0373 | D | 0.1263 |

In [7,22-24], it was found that the CIS of HA1 of influenza strains has characteristic dominant peaks at different IS frequencies, as presented in Table 2.

3. RESULTS

3.1. HA Receptor Specificity Altered by Mutations

It was observed in [17] that the HA proteins of 2009 H1N1 primarily bind to human type receptors. However, some of them could bind to both human and avian type receptors. Here we are interested in the receptor preferences of the HA proteins of 2009 H1N1 in Singapore (n = 9) and Oceania (n = 92), which were collected after January 1, 2010. The ISM confirmed that the primary binding specificity of both groups was human type receptor at CIS frequency F(0.295). However, they had different secondary binding frequencies, with the HA proteins from Singapore having F(0.258) and the HA proteins from Oceania having F(0.282).

To numerically analyze the CIS frequency changes induced by mutations D94N, N125D, and V250A in the HA protein (Figure 1), ISM was applied to the HA sequences of 2009 H1N1 in Singapore (Figure 2). It appeared that N125D increased F(0.258) and F(0.282) and decreased F(0.295), making F(0.258) the primary frequency; D94N increased F(0.295) and F(0.282) and decreased F(0.258) dramatically; and V250A increased F(0.258) and decreased F(0.295), making F(0.258) the primary frequency. When combined, D94N and V250A increased F(0.295) and F(0.282) and decreased F(0.258) due to the larger contribution from D94N. It seemed that the biological reason for mutations D94N and V250A to always occur in pairs was to keep F(0.295) as the primary frequency so the virus could replicate in humans efficiently. It could be inferred that mutation V250A compensated for mutation D94N.
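The ISM computation of Section 2.2 and the point-mutation comparisons described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the EIIP values are copied from Table 1, while the function names, the toy peptide, and the use of NumPy's FFT are assumptions made here. NumPy indexes residues from 0 rather than 1, which changes only the phase of X(n) and leaves S(n) = |X(n)|^2 unchanged.

```python
import numpy as np

# EIIP values copied from Table 1.
EIIP = {
    "L": 0.0000, "I": 0.0000, "N": 0.0036, "G": 0.0050, "E": 0.0057,
    "V": 0.0058, "P": 0.0198, "H": 0.0242, "K": 0.0371, "A": 0.0373,
    "Y": 0.0516, "W": 0.0548, "Q": 0.0761, "M": 0.0823, "S": 0.0829,
    "C": 0.0829, "T": 0.0941, "F": 0.0946, "R": 0.0959, "D": 0.1263,
}


def informational_spectrum(seq):
    """Return (frequencies, S) with S(n) = |X(n)|^2 for n = 1..N/2."""
    x = np.array([EIIP[aa] for aa in seq], dtype=float)
    n_res = len(x)
    half = n_res // 2
    coeffs = np.fft.fft(x)                 # DFT with residues indexed from 0
    S = np.abs(coeffs[1:half + 1]) ** 2    # skip the zero-frequency term
    freqs = np.arange(1, half + 1) / n_res
    return freqs, S


def mutate(seq, position, new_aa):
    """Apply a point mutation at a 1-based position."""
    return seq[:position - 1] + new_aa + seq[position:]


def amplitude_at(seq, target_freq):
    """Spectrum value at the sampled frequency closest to target_freq."""
    freqs, S = informational_spectrum(seq)
    return S[np.argmin(np.abs(freqs - target_freq))]


# Arbitrary toy peptide for illustration only; it is NOT an HA sequence.
toy = "MKDNVADLNGERTYWQSCFHPKILMKDNVADLNGERTYWQ"
mutant = mutate(toy, 4, "D")  # N -> D at position 4 of the toy numbering
for f in (0.25, 0.30):
    print(f, amplitude_at(toy, f), amplitude_at(mutant, f))
```

Comparing `amplitude_at` before and after `mutate` mirrors, in spirit, how the gain or loss at a frequency such as F(0.295) is read off for a mutation; a CIS would additionally multiply the spectra of all sequences in a group.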
The consequence of mutation N125D was to make F(0.258) the primary frequency, whereas the original primary frequency of the HA proteins of 2009 H1N1 in Singapore was F(0.295). We wondered whether another mutation in the HA protein could compensate for the effect of N125D and keep F(0.295) as the primary frequency for the whole set of HA sequences in Singapore. Based on the observed mutation pairing of D94N and V250A, we suspected that the desired mutation should be A → V. Sequence examination revealed that mutations A30V and N125D always occurred in pairs in the HA sequences of 2009 H1N1 in Singapore. To learn the contribution of A30V (Figure 1), this mutation was applied to these HA sequences (Figure 2). Its impact was to increase IS at F(0.295) from 73.7483 to 77.0436 (gain = 3.2953) and to increase IS at F(0.258) from 45.1802 to 47.3324 (gain = 2.1522), so the relative gain at F(0.295) was 3.2953 - 2.1522 = 1.1431. We could conclude that the net effect of mutation A30V was to enhance F(0.295).

To further examine the two mutations A30V and N125D, the nine HA sequences of 2009 H1N1 in Singapore were divided into two subsets, one with 30A and 125N (n = 5) and the other with 30V and 125D (n = 4). Figure 3 showed that a combination of 30A and 125N increased F(0.295) and decreased F(0.258) and F(0.282), while a combination of 30V and 125D produced the opposite outcome.

Table 2. Characteristic IS frequencies of HA proteins in 2009 H1N1, swine H1N1/H1N2, avian H1N1, and A/South Carolina/1/18 (H1N1).

| Subtype | 2009 H1N1 | Swine H1N2/H1N1 | Avian H1N1 | A/South Carolina/1/18 (H1N1) |
|---|---|---|---|---|
| Frequency | F(0.295) | F(0.055) | F(0.282) | F(0.258) |

Figure 1. This plot shows in 3D structure the four mutations in the HA of 2009 H1N1 variants found in [13] and in this study. Mutation A30V is colored in yellow, D94N in blue, N125D in red, and V250A in pink (PDB code: 3AL4).

Figure 2. The impact of mutations A30V, N125D, D94N, and V250A in HA on receptor preferences of the HA protein of 2009 H1N1 in Singapore.

Figure 3. The impact of the amino acids at positions 30 and 125 in HA on receptor preferences of the HA protein of 2009 H1N1 in Singapore. The left plot was based on the HA sequences with 30A and 125N, and the right plot was based on the HA sequences with 30V and 125D.

An A/California/7/2009 (H1N1)-like virus, which has amino acids 94D, 125N, and 250V in its HA protein, is part of both the northern hemisphere seasonal vaccine for 2010-2011 and the southern hemisphere seasonal vaccine for 2011 [27]. To investigate the reason for several vaccine breakthroughs associated with this variant virus in Oceania, we needed to compare the HA receptor preferences of A/California/7/2009 (H1N1) with those in Oceania. The HA sequences of 2009 H1N1 in Oceania (n = 92) were divided into three non-overlapping subsets according to the three mutations D94N, N125D, and V250A: the first subset with amino acids 94D, 125D, and 250V (n = 38), the second with 94D, 125N, and 250V (n = 26), and the third with 94N, 125D, and 250A (n = 27). Unlike some of the HA sequences from Singapore, these sequences did not carry mutation A30V.

It appeared that the IS of the consensus HA1 sequence with 94D, 125N, and 250V was the most similar to that of A/California/7/2009 (H1N1), and the IS of the consensus HA1 sequence with 94D, 125D, and 250V was the most dissimilar to that of A/California/7/2009 (H1N1) (Figure 4). Therefore, the viruses with 94D, 125D, and 250V or with 94N, 125D, and 250A in the HA protein were more likely to cause vaccine breakthrough than the one with 94D, 125N, and 250V. The major difference between the former two variants was that the one with 94D, 125D, and 250V had F(0.258) as its primary frequency and the one with 94N, 125D, and 250A had F(0.282) as its primary frequency.
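The partition of aligned HA sequences by the residues at selected positions, followed by a per-subset consensus, can be sketched as follows. This is an illustrative sketch only: the helper names and the six-residue toy alignment are hypothetical, and positions 2, 4, and 6 of the toy stand in for HA positions 94, 125, and 250. It also assumes that positions refer directly to alignment columns, which for real data requires mapping HA numbering onto the alignment.

```python
from collections import Counter, defaultdict


def residues_at(seq, positions):
    """Residues of one aligned sequence at the given 1-based positions."""
    return tuple(seq[p - 1] for p in positions)


def split_by_residues(seqs, positions):
    """Group aligned sequences by their residues at the given positions."""
    groups = defaultdict(list)
    for s in seqs:
        groups[residues_at(s, positions)].append(s)
    return dict(groups)


def consensus(seqs):
    """Column-wise most common residue; ties go to the residue seen first."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


# Hypothetical toy alignment, not real HA data.
toy = ["ADCNEV", "ADCNEV", "ADCDEV", "ANCDEA", "ANCDEA", "GNCDEA"]
groups = split_by_residues(toy, positions=(2, 4, 6))
for key, members in groups.items():
    print(key, len(members), consensus(members))
```

Each consensus string could then be passed to an informational spectrum routine to compare subsets against a reference strain, as was done for A/California/7/2009 (H1N1) in Figure 4.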
The history of HA binding preferences of 2009 H1N1 (Table 3) implied that in the early months of its run, the virus retained the swine characteristic F(0.055), which disappeared in the later months of its course. On the other hand, the frequency F(0.258) of A/South Carolina/1/18 (H1N1) was dominant after August 2009.

Table 3. Primary and secondary IS frequency of 2009 H1N1 HA by month.

| Year-Month | Primary Frequency | Secondary Frequency |
|---|---|---|
| 2009-04 | 0.295 | 0.055 |
| 2009-05 | 0.295 | 0.258 |
| 2009-06 | 0.295 | 0.258 |
| 2009-07 | 0.295 | 0.258 |
| 2009-08 | 0.295 | 0.055 |
| 2009-09 | 0.295 | 0.258 |
| 2009-10 | 0.295 | 0.258 |
| 2009-11 | 0.295 | 0.258 |
| 2009-12 | 0.295 | 0.258 |
| 2010-01 | 0.295 | 0.258 |
| 2010-02 | 0.295 | 0.258 |
| 2010-03 | 0.295 | 0.258 |
| 2010-04 | 0.295 | 0.258 |
| 2010-05 | 0.295 | 0.258 |
| 2010-06 | 0.295 | 0.258 |
| 2010-07 | 0.295 | 0.258 |
| 2010-08 | 0.295 | 0.258 |

3.2. Origin of 2009 H1N1

It is well established that the genes of 2009 H1N1 are of North American and Eurasian swine origins [1,28]. To further learn their origins according to swine subtypes, the Hamming distances between the genes of 2009 H1N1 and those of swine H1N1 and H3N2 in North America and Europe were computed. The distance information in
Tables 4 and 5 suggested that the HA gene of 2009 H1N1 was derived from swine H1N1 in North America, the NA gene from swine H1N1 in Europe, the M1 and M2 genes from swine H3N2 in Europe, and the NS1, NS2, NP, PA, PB1, and PB2 genes from swine H3N2 in North America.

Figure 4. The IS of HA1 of A/California/7/2009 (H1N1) and IS of consensus HA1 of pandemic 2009 (Oceania). The three consensuses were taken from the HA1 sequences with specific amino acids at positions 94, 125, and 250.

Table 4. Hamming distances between consensus protein sequences of 2009 H1N1 and those of North American and Eurasian swine viruses. The minimum distance for each protein is marked with an asterisk.

| Protein | HA | NA | M1 | M2 | NS1 | NS2 | NP | PA | PB1 | PB2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Dist(H1N1 2009, H1N1 N_America) | 48* | 76 | 11 | 10 | 17 | 8 | 12 | 50 | 43 | 46 |
| Dist(H1N1 2009, H3N2 N_America) | | | 13 | 13 | 13* | 5* | 8* | 14* | 15* | 9* |
| Dist(H1N1 2009, H1N1 Europe) | 108 | 24* | 5 | 5* | 42 | 15 | 32 | 31 | 26 | 25 |
| Dist(H1N1 2009, H3N2 Europe) | | | 3* | 5* | 42 | 14 | 35 | 32 | 28 | 26 |

Table 5. Hamming distances between consensus gene sequences of 2009 H1N1 and those of North American and Eurasian swine viruses. The minimum distance for each gene is marked with an asterisk.

| Gene | HA | NA | M1 | M2 | NS1 | NS2 | NP | PA | PB1 | PB2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Dist(H1N1 2009, H1N1 N_America) | 149* | 267 | 91 | 21 | 48 | 21 | 92 | 373 | 423 | 358 |
| Dist(H1N1 2009, H3N2 N_America) | | | 97 | 24 | 32* | 13* | 63* | 88* | 101* | 82* |
| Dist(H1N1 2009, H1N1 Europe) | 431 | 81* | 29 | 8 | 126 | 47 | 233 | 281 | 296 | 356 |
| Dist(H1N1 2009, H3N2 Europe) | | | 28* | 7* | 127 | 51 | 234 | 281 | 301 | 366 |

Hamming distances in Figure 5 gave further evidence that gene segments PA, PB1, and PB2 of 2009 H1N1
were introduced to swine lineages around 1998, and gene segments NA and M of 2009 H1N1 were introduced around 1979 [1]. The findings in [1] also demonstrated that the introduction of HA, NP, and NS of 2009 H1N1 to swine lineages occurred around 1918. However, our analysis suggested that the NP and NS segments were introduced around 1998 and the HA segment around 1957, much later than previously predicted.

Figure 5. Hamming distances by year between consensus protein sequences of 2009 H1N1 and those of the swine viruses that were closest to 2009 H1N1 as indicated in Table 5.

Codon usage bias is a unique molecular feature of many organisms, including influenza viruses. This bias could influence host adaptation and the virulence of the influenza viruses because their replication relies on host cells. As the 2009 H1N1 virus originated from swine lineages, it is subject to host selection pressure after cross-species transmission, which could be reflected in the codon usage of its genes. To learn the subtle codon usage differences between 2009 H1N1 and its closest swine ancestors, the codon usage patterns of their genes were displayed in Figure 6. It demonstrated that the HAs of 2009 H1N1 and North American swine H1N1 had different codon usage in Asp, His, Gln, and Val; the NAs of 2009 H1N1 and Eurasian swine H1N1 had different codon usage in Leu, Lys, and Tyr; the Matrix genes (M1 + M2) of 2009 H1N1 and Eurasian swine H3N2 had different codon usage in Asn, Cys, Glu, Lys, and Tyr; and the genes NS1, NS2, NP, PA, PB1, and PB2 concatenated together (NS1 + NS2 + NP + PA + PB1 + PB2) of 2009 H1N1 and North American swine H3N2 had different codon usage in His. The actual differences in codon usage were summarized in Table 6, which showed that 2009 H1N1 made more use of AAC for Asn and less use of TAC for Tyr than the swine viruses.
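A comparison of synonymous codon usage of the kind summarized above can be sketched as follows. This is a minimal illustration under stated assumptions: the exact statistic plotted in Figure 6 is not specified here, so the sketch simply reports the fraction of each synonymous codon within an amino acid. The function names and the toy coding sequence are hypothetical; the codon assignments are those of the standard genetic code.

```python
from collections import Counter

# Synonymous codons (standard genetic code) for some amino acids named in the text.
SYNONYMOUS = {
    "Asn": ["AAT", "AAC"],
    "Tyr": ["TAT", "TAC"],
    "His": ["CAT", "CAC"],
    "Asp": ["GAT", "GAC"],
    "Gln": ["CAA", "CAG"],
    "Lys": ["AAA", "AAG"],
    "Glu": ["GAA", "GAG"],
    "Cys": ["TGT", "TGC"],
}


def codon_counts(cds):
    """Count in-frame codons of a coding sequence (trailing partial codon ignored)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def relative_usage(cds, codons):
    """Fraction of each codon among the listed synonymous codons."""
    counts = codon_counts(cds)
    total = sum(counts[c] for c in codons)
    return {c: (counts[c] / total if total else 0.0) for c in codons}


# Hypothetical toy coding sequence: AAC AAC AAT TAC TAT TAT
toy_cds = "AACAACAATTACTATTAT"
print(relative_usage(toy_cds, SYNONYMOUS["Asn"]))
print(relative_usage(toy_cds, SYNONYMOUS["Tyr"]))
```

Applying `relative_usage` to a 2009 H1N1 gene and to its closest swine counterpart, codon by codon, would yield "more" or "less" comparisons of the form listed in Table 6.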
Overall, host selection pressure on human influenza viruses does not favor the use of G or C nucleotides and the use of a G nucleotide at the third codon position [29].

Figure 6. Codon bias of genes of pandemic 2009 and genes of swine viruses that were closest to those of pandemic 2009.

Table 6. Actual codon bias of 2009 H1N1 and swine virus observed in Figure 6.

| Gene | 2009 H1N1 | Swine virus |
|---|---|---|
| HA | More GAC for Asp | Less GAC for Asp |
| HA | Less CAC for His | More CAC for His |
| HA | More CAG for Gln | Less CAG for Gln |
| HA | Less CCG for Pro | More CCG for Pro |
| HA | Less (GTC + GTG) for Val | More (GTC + GTG) for Val |
| NA | Less TTC for Phe | More TTC for Phe |
| NA | More AAC for Asn | Less AAC for Asn |
| NA | Less CTG for Leu | More CTG for Leu |
| NA | Less AAG for Lys | More AAG for Lys |
| NA | Less TAC for Tyr | More TAC for Tyr |
| M1 + M2 | More AAC for Asn | Less AAC for Asn |
| M1 + M2 | More TGC for Cys | Less TGC for Cys |
| M1 + M2 | Less TAC for Tyr | More TAC for Tyr |
| M1 + M2 | Less GAG for Glu | More GAG for Glu |
| M1 + M2 | More AAG for Lys | Less AAG for Lys |
| NS1 + NS2 + NP + PA + PB1 + PB2 | More CAC for His | Less CAC for His |

3.3. Comparison of NP, PA, PB1 and PB2 of 2009 H1N1 with Those of Swine Lineages

The influenza viral polymerase, composed of proteins PB1, PB2, and PA, is critical in viral RNA synthesis, host adaptation, and virulence by interacting with NP. To study the interactions between these four proteins of influenza viruses of avian, human, 2009 H1N1, and swine origins, the correlated residue pairs that had a positive mutual information (MI) value were counted according to their location in the proteins [9]. This analysis revealed that in avian, human, 2009 H1N1, and swine viruses, the inter-protein correlation from (NP, PA), (NP, PB1), (NP, PB2), (PA, PB1), (PA, PB2), and (PB1, PB2) was stronger than the intra-protein correlation from (NP, NP), (PA, PA), (PB1, PB1), and (PB2, PB2), with (NP, NP) being the weakest. Further, the correlation pattern of 2009 H1N1 was more similar to that of avian and human influenza than to that of swine, in spite of the swine origin of 2009 H1N1. Using the same approach, we discovered that the interaction pattern of the four proteins of North American swine H1N1 was most similar to that of 2009 H1N1, although in sequence identity the four proteins of North American swine H3N2 were the most similar to those of 2009 H1N1 (Figure 7).
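Counting correlated residue pairs by mutual information can be sketched as below. This is only an illustration of the general idea: the exact procedure is the one described in [9], and the function names, the small tolerance `eps`, and the toy alignments are assumptions made here. The sketch assumes that row k of both alignments comes from the same virus strain.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equally long alignment columns."""
    n = len(col_a)
    count_a, count_b = Counter(col_a), Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) written with raw counts
        mi += p_ab * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def count_correlated_pairs(aln_a, aln_b, eps=1e-9):
    """Number of column pairs (one per protein) with MI above eps."""
    cols_a = list(zip(*aln_a))
    cols_b = list(zip(*aln_b))
    return sum(
        1 for ca in cols_a for cb in cols_b if mutual_information(ca, cb) > eps
    )


# Hypothetical toy alignments; row k of each comes from the same strain.
protein_a = ["AK", "AK", "GR", "GR"]
protein_b = ["TL", "TL", "SL", "SL"]
print(count_correlated_pairs(protein_a, protein_b))
```

In the toy data the second column of `protein_b` is constant, so it carries no information about any column of `protein_a` and contributes no correlated pair.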
Our findings reinforced the concept that sequence identity is only one of the many factors to measure the similarity of two influenza viruses.

Many of the classical markers for adaptation of avian or swine viruses to human hosts do not exist in 2009 H1N1, implying that other previously unrecognized molecular determinants are accountable for its capability to infect humans. The studies in [4,5] discovered novel host markers in the proteins of 2009 H1N1 that were not present in the traditional host markers. These novel markers were identified as the significant residue positions that could separate 2009 H1N1 from human viruses, minus the characteristic positions in avian and swine viruses, which were marked as (a) for avian and (s) for swine positions in Tables 7-10.

To further examine the important positions in NP, PA, PB1, and PB2 of 2009 H1N1, we compared them to those of swine lineages in North America and Europe, and to those of avian, human, and swine (general) viruses (Tables 7-10). There were several important positions in the NP, PA, PB1, and PB2 at which avian, human, swine (general), and swine H1N1 in North America and in Europe shared the same amino acid but 2009 H1N1 had a different amino acid, reflecting the uniqueness of this novel virus. These positions were 53 and 316 in NP; 186, 204, 213, 275, 336, and 626 in PA; 12, 175, 216, 298, 364, 386, and 728 in PB1; and 54 and 684 in PB2. At positions 353, 377, 444, and 498 in NP; 362, 388, and 407 in PA; 179, 339, 361, 486, 584, 638, and 741 in PB1; and 65, 147, 225, 590, 591, and 645 in PB2, swine H3N2 in North America and 2009 H1N1 shared the same amino acid but differed from the other viruses (Tables 7-10). The amino acid (serine) at position 186 in PA of 2009 H1N1 was found to be necessary for its compatibility with the PB2 and PB1 subunits [30], and the amino acids at positions 590 and 591 in PB2 were the SR polymorphism uncovered in [31].
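The selection of positions at which 2009 H1N1 differs from an otherwise unanimous set of reference groups can be sketched as follows. The residues below are the consensus amino acids of the first ten NP positions listed in Table 7; the function name and data layout are assumptions made for illustration, and this is not the code used in the study.

```python
# NP positions and consensus residues taken from the first block of Table 7.
POSITIONS = [21, 31, 53, 119, 189, 190, 217, 289, 313, 316]
TARGET = "DRDVIAVHVM"  # 2009 H1N1
OTHERS = {
    "Avian": "NREIMVIYFI",
    "Human": "NKEIMVSYYI",
    "Swine": "DREVIAIHFI",
    "Swine H1N1 (North America)": "DREVIAIHFI",
    "Swine H1N1 (Europe)": "NREIMVIYFI",
}


def distinct_sites(target, others, positions):
    """Positions where every reference group agrees and the target differs."""
    hits = []
    for idx, pos in enumerate(positions):
        shared = {row[idx] for row in others.values()}
        if len(shared) == 1 and target[idx] not in shared:
            hits.append(pos)
    return hits


print(distinct_sites(TARGET, OTHERS, POSITIONS))  # [53, 316]
```

For this block the sketch returns positions 53 and 316, in agreement with the NP positions named in the text; the remaining NP, PA, PB1, and PB2 positions could be checked the same way from Tables 7-10.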
Figure 7. Averaged correlated pair counts in each individual protein and between proteins of 2009 H1N1 and swine viruses.

Table 7. Consensus amino acids at the sites in NP that have high importance in separating 2009 H1N1 from human viruses [4,5]. The novel host sites in this protein are the positions without an 'a' (for avian) or an 's' (for swine) or both.

| Position | 21 | 31(a,s) | 53 | 119 | 189(s) | 190 | 217(a) | 289(s) | 313(a,s) | 316 |
|---|---|---|---|---|---|---|---|---|---|---|
| Avian | N | R | E | I | M | V | I | Y | F | I |
| Human | N | K | E | I | M | V | S | Y | Y | I |
| 2009 H1N1 | D | R | D | V | I | A | V | H | V | M |
| Swine | D | R | E | V | I | A | I | H | F | I |
| Swine H1N1 (North America) | D | R | E | V | I | A | I | H | F | I |
| Swine H1N1 (Europe) | N | R | E | I | M | V | I | Y | F | I |
| Swine H3N2 (North America) | D | R | E | V | I | A | I | H | F | I |
| Swine H3N2 (Europe) | N | R | E | I | M | V | I | Y | F | I |

| Position | 350(s) | 353 | 371 | 373(a) | 377 | 430(s) | 433 | 444 | 456(s) | 498 |
|---|---|---|---|---|---|---|---|---|---|---|
| Avian | T | V | M | T | S | T | T | I | V | N |
| Human | T | S | M | A | S | T | T | I | V | N |
| 2009 H1N1 | K | I | V | T | N | S | N | V | L | S |
| Swine | K | V | V | A | S | S | N | I | L | N |
| Swine H1N1 (North America) | K | V | V | A | S | I | N | I | L | N |
| Swine H1N1 (Europe) | T | V | M | T | I | T | T | I | V | N |
| Swine H3N2 (North America) | K | I | V | A | N | S | N | V | L | S |
| Swine H3N2 (Europe) | T | V | M | T | I | T | T | I | V | N |

Table 8. Consensus amino acids at the sites in PA that have high importance in separating 2009 H1N1 from human viruses [4,5]. The novel host sites in this protein are the positions without an 'a' (for avian) or an 's' (for swine) or both.

| Position | 28(a,s) | 55(a) | 85 | 100(a,s) | 186 | 204 | 213 | 256 | 262 | 275 | 277 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Avian | P | D | T | V | G | R | R | R | K | P | S |
| Human | L | N | T | A | G | R | R | R | K | P | S |
| 2009 H1N1 | P | D | I | V | S | K | K | K | R | L | H |
| Swine | P | D | T | V | G | R | R | R | K | P | S |
| Swine H1N1 (North America) | P | N | N | V | G | R | R | R | K | P | F |
| Swine H1N1 (Europe) | P | D | T | V | G | R | R | R | R | P | S |
| Swine H3N2 (North America) | S | D | T | V | G | R | R | Q | K | P | S |
| Swine H3N2 (Europe) | P | D | T | V | G | R | R | R | R | P | F |

| Position | 336 | 337(a,s) | 356(a) | 362 | 388 | 400(a,s) | 404(a,s) | 407 | 552(a,s) | 626 |
|---|---|---|---|---|---|---|---|---|---|---|
| Avian | L | A | K | K | S | S | A | I | T | K |
| Human | L | S | R | K | S | L | S | I | S | K |
| 2009 H1N1 | M | A | R | R | G | P | A | V | T | R |
| Swine | L | A | K | K | S | P | A | I | T | K |
| Swine H1N1 (North America) | L | A | R | K | S | F | A | I | T | K |
| Swine H1N1 (Europe) | L | A | K | K | G | M | A | I | T | K |
| Swine H3N2 (North America) | L | A | K | R | G | P | A | V | T | K |
| Swine H3N2 (Europe) | L | A | K | K | G | M | A | I | T | K |

Table 9. Consensus amino acids at the sites in PB1 that have high importance in separating 2009 H1N1 from human viruses [4,5]. The novel host sites in this protein are the positions without an 'a' (for avian) or an 's' (for swine) or both.

| Position | 12 | 175 | 179 | 216 | 298 | 327(a,s) | 339(s) | 361(a,s) | 364 | 386(s) |
|---|---|---|---|---|---|---|---|---|---|---|
| Avian | V | D | M | S | L | R | I | S | L | R |
| Human | V | D | M | S | L | K | I | S | L | R |
| 2009 H1N1 | I | N | I | G | I | R | M | R | I | K |
| Swine | V | D | M | S | L | R | I | N | L | R |
| Swine H1N1 (North America) | V | D | M | S | L | R | V | N | L | R |
| Swine H1N1 (Europe) | V | D | M | S | L | R | I | S | L | R |
| Swine H3N2 (North America) | V | D | I | S | L | R | M | R | L | R |
| Swine H3N2 (Europe) | V | D | M | S | L | R | I | S | L | R |

| Position | 435 | 486 | 517(s) | 584(a,s) | 587 | 618 | 638(s) | 728 | 741(a,s) |
|---|---|---|---|---|---|---|---|---|---|
| Avian | T | R | I | R | A | E | E | I | A |
| Human | T | R | I | R | A | E | E | I | A |
| 2009 H1N1 | I | K | V | Q | V | D | D | V | S |
| Swine | T | R | I | R | A | E | E | I | A |
| Swine H1N1 (North America) | A | R | I | R | A | K | E | I | A |
| Swine H1N1 (Europe) | T | R | V | H | T | E | E | I | A |
| Swine H3N2 (North America) | T | K | I | Q | A | E | D | I | S |
| Swine H3N2 (Europe) | T | R | V | H | A | E | E | I | A |

Table 10. Consensus amino acids at the sites in PB2 that have high importance in separating 2009 H1N1 from human viruses [4,5]. The novel host sites in this protein are the positions without an 'a' (for avian) or an 's' (for swine) or both.

| Position | 9(a) | 54 | 64(a,s) | 65(s) | 81(a,s) | 105(a,s) | 147 | 184(s) | 199(a,s) |
|---|---|---|---|---|---|---|---|---|---|
| Avian | D | K | M | E | T | T | I | T | A |
| Human | N | K | T | E | M | V | I | T | S |
| 2009 H1N1 | D | R | M | D | T | T | T | A | A |
| Swine | D | K | M | E | T | T | I | T | A |
| Swine H1N1 (North America) | N | K | I | E | T | A | I | M | S |
| Swine H1N1 (Europe) | D | K | M | E | T | T | I | T | A |
| Swine H3N2 (North America) | D | K | M | D | T | T | T | T | A |
| Swine H3N2 (Europe) | D | K | M | E | T | T | I | T | A |

| Position | 225 | 292(a,s) | 315 | 340(s) | 453 | 475(a) | 559 | 567(a,s) | 588(a,s) |
|---|---|---|---|---|---|---|---|---|---|
| Avian | S | I | M | R | P | L | T | D | A |
| Human | S | T | M | R | H | M | T | N | I |
| 2009 H1N1 | G | V | I | K | S | L | I | D | T |
| Swine | S | I | M | R | P | L | T | D | A |
| Swine H1N1 (North America) | S | T | V | M | K | R | I | S | I |
| Swine H1N1 (Europe) | S | T | I | M | K | R | I | P | V |
| Swine H3N2 (North America) | G | A | I | M | K | K | I | P | I |
| Swine H3N2 (Europe) | S | I | I | M | K | R | I | P | V |

| Position | 590 | 591(s) | 613(a,s) | 627(a,s) | 645 | 661(a,s) | 667 | 674(a,s) | 684(a) |
|---|---|---|---|---|---|---|---|---|---|
| Avian | G | Q | V | E | M | A | V | A | A |
| Human | G | Q | T | K | M | T | I | T | A |
| 2009 H1N1 | S | R | V | E | L | A | V | A | S |
| Swine | G | Q | V | E | M | A | V | A | A |
| Swine H1N1 (North America) | G | Q | V | K | M | S | A | N | A |
| Swine H1N1 (Europe) | G | Q | A | E | M | A | A | T | A |
| Swine H3N2 (North America) | S | R | V | E | L | A | A | T | A |
| Swine H3N2 (Europe) | G | Q | A | E | M | A | A | T | A |

The PB2 of 2009 H1N1 does not carry the human signature 627K, yet this virus replicates in humans and is efficiently transmitted in humans. The SR polymorphism was recently identified in [31] as a mechanism for 2009 H1N1 to partially overcome the lack of K627 by enhancing polymerase activity. However, as early as 2002, SR occurred in the PB2 of swine H3N2 in North America (A/swine/Iowa/H02AS8/2002(H3N2)), but none was found in the PB2 of Eurasian swine H1N1 and H3N2. Even though the majority of the PB2 proteins of North American swine H1N1 had GQ, some of them had SR as early as 2002 (A/swine/Iowa/H02NJ56371/2002(H1N1)), and GR in 2008 (A/swine/Nebraska/02013/2008(H1N1)).
Typically, GQ was coupled with 627K, whereas SR and GR were tied with 627E.

4. CONCLUSIONS

A genetic variant of 2009 H1N1 recently emerged as a predominant virus in Australia, New Zealand, and Singapore during the winter season of 2010 in the southern hemisphere. Our ISM analysis of the three mutations D94N, N125D, and V250A found in [13,32] suggested that the biological reason for the mutation pairing of D94N and V250A was to keep the human type receptor as its primary binding preference so the virus could replicate in humans efficiently. Mutation V250A compensated for D94N. Based on this interpretation, we searched for and uncovered a new mutation A30V that
W. Hu / Natural Science 3 (2011) 234-248 Copyright © 2011 SciRes. OPEN ACCESS 247247 compensated for N125D. We quantitatively investigated how mutations A30V, D94N, N125D, and V250A in the HA protein of this variant may affect its HA receptor binding affinity. In summary, mutation A30V increased IS frequency F(0.295) and decreased F(0.258), while V250A did the opposite. At the same time, mutation D94N increased F(0.295) and decreased F(0.258) and F(0.282), whereas N125D functioned the opposite. When combined together, D94N and V250A increased F(0.295) and F(0.282) and decreased F(0.258), but A30V and N125D produced the opposite. Our ISM re- sults also implied that the recent vaccine breakthroughs were partially caused by the alteration of HA receptor binding specificity resulted from these HA mutations. As the second task of our investigation, we revisited the origin of 2009 H1N1 to refine our understanding of this important issue. Our findings illustrated that the HA gene of 2009 H1N1 came from that of swine H1N1 in North America, the NA gene from Eurasian swine H1N1, the M1 and M2 genes from Eurasian swine H3N2, and the NS1, NS2, NP, PA, PB1, and PB2 genes from swine H3N2 in North America. In addition, our analysis pro- vided the timeline for the occurrence of genes of swine lineages most similar to those of 2009 H1N1. Although the four proteins NP, PA, PB1, and PB2 of 2009 H1N1 were closest to those of North American swine H3N2 in sequence identity, their interaction patterns were closest to those of swine H1N1 in North America. 5. ACKNOWLEDGEMENTS We thank Houghton College for its financial support. REFERENCES [1] Garten, R.J., Davis, C.T., Russell, C.A., Shu, B., Lind- strom, S., Balish, A., et al. (2009) Antigenic and genetic characteristics of swine-origin 2009 A(H1N1) influenza viruses circulating in humans. Scienc e, 325, 197-201. doi:10.1126/science.1176225 [2] Hu, W. 
(2009) Analysis of correlated mutations, stalk motifs, and phylogenetic relationship of the 2009 influ- enza A virus neuraminidase sequences. Journal of Bio- medical Science and Engineering, 2, 550-558. doi:10.4236/jbise.2009.27080 [3] Hu, W. (2010) The Interaction between the 2009 H1N1 influenza A hemagglutinin and neuraminidase: Mutations, co-mutations, and the NA stalk motifs. Journal of Bio- medical Science and Engineering, 3, 1-12. [4] Hu, W. (2010) Novel host markers in the 2009 pandemic H1N1 influenza A virus. Journal of Biomedical Science and Engineering, 3, 584-601. doi:10.4236/jbise.2010.36081 [5] Hu, W. (2010) Nucleotide host markers in the influenza A viruses. Journal of Biomedical Science and Engineering, 3, 684-699. doi:10.4236/jbise.2010.37093 [6] Hu, W. (2010) Identification of highly conserved do- mains in hemagglutinin associated with the receptor binding specificity of influenza viruses: 2009 H1N1, Avian H5N1, and Swine H1N2. Journal of Biomedical Science and Engineering, 3, 114-123. doi:10.4236/jbise.2010.32017 [7] Hu, W. (2010) Quantifying the effects of mutations on receptor binding specificity of influenza viruses. Journal of Biomedical Science and Engineering, 3, 227-240. doi:10.4236/jbise.2010.33031 [8] Hu, W. (2010) Subtle differences in receptor binding specificity and gene sequences of the 2009 pandemic H1N1 influenza virus. Advances in Bioscience and Bio- technology, 1, 305-314. doi:10.4236/abb.2010.14040 [9] Hu, W. (2010) Correlated mutations in the four influenza proteins essential for viral RNA synthesis, host adapta- tion, and virulence: NP, PA, PB1, and PB2. Natural Sci- ence, 2, 1138-1147. doi:10.4236/ns.2010.210141 [10] King, D., Miller, Z., Jones, W. and Hu, W. (2010) Char- acteristic sites in the internal proteins of avian and human influenza viruses. Journal of Biomedical Science and Engineering, 3, 943-955. doi:10.4236/jbise.2010.310125 [11] Hu, W. 
(2010) Highly conserved domains in hemagglutinin of influenza viruses characterizing dual receptor binding. Natural Science, 2, 1005-1014. doi:10.4236/ns.2009.29123
[12] Hu, W. (2010) Host markers and correlated mutations in the overlapping genes of influenza viruses: M1, M2; NS1, NS2; and PB1, PB1-F2. Natural Science, 2, 1225-1246. doi:10.4236/ns.2010.211150
[13] Barr, I.G., Cui, L., Komadina, N., Lee, R.T., Lin, R.T., Deng, Y., Caldwell, N., Shaw, R. and Maurer-Stroh, S. (2010) A new pandemic influenza A(H1N1) genetic variant predominated in the winter 2010 influenza season in Australia, New Zealand and Singapore. Euro Surveill, 15, 19692.
[14] Stevens, J., Blixt, O., Glaser, L., Taubenberger, J.K., Palese, P., Paulson, J.C. and Wilson, I.A. (2006) Glycan microarray analysis of the hemagglutinins from modern and pandemic influenza viruses reveals different receptor specificities. Journal of Molecular Biology, 355, 1143-1155. doi:10.1016/j.jmb.2005.11.002
[15] Stevens, J., Blixt, O., Tumpey, T.M., Taubenberger, J.K., Paulson, J.C. and Wilson, I.A. (2006) Structure and receptor specificity of the hemagglutinin from an H5N1 influenza virus. Science, 312, 404-410. doi:10.1126/science.1124513
[16] Matrosovich, M., Tuzikov, A., Bovin, N., Gambaryan, A., Klimov, A., Castrucci, M.R., Donatelli, I. and Kawaoka, Y. (2000) Early alterations of the receptor-binding properties of H1, H2, and H3 avian influenza virus hemagglutinins after their introduction into mammals. Journal of Virology, 74, 8502-8512. doi:10.1128/JVI.74.18.8502-8512.2000
[17] Karasin, A.I., West, K., Carman, S. and Olsen, C.W. (2004) Characterization of avian H3N3 and H1N1 influenza A viruses isolated from pigs in Canada. Journal of Clinical Microbiology, 42, 4349-4354. doi:10.1128/JCM.42.9.4349-4354.2004
[18] Liu, Y., Childs, R.A., Matrosovich, T., Wharton, S., Palma, A.S., Chai, W., Daniels, R., Gregory, V., et al. (2010) Altered receptor specificity and cell tropism of D222G hemagglutinin mutants isolated from fatal cases of pandemic A(H1N1) 2009 influenza virus. Journal of Virology, 84, 12069-12074. doi:10.1128/JVI.01639-10
[19] Kilander, A., Rykkvin, R., Dudman, S. and Hungnes, O. (2010) Observed association between the HA1 mutation D222G in the 2009 pandemic influenza A(H1N1) virus and severe clinical outcome, Norway 2009-2010. Euro Surveill, 15, 19498.
[20] Liu, Y., Childs, R.A., Matrosovich, T., et al. (2010) Altered receptor specificity and cell tropism of D222G hemagglutinin mutants isolated from fatal cases of pandemic A(H1N1) 2009 influenza virus. Journal of Virology, 84, 12069-12074. doi:10.1128/JVI.01639-10
[21] Su, Y., Yang, H.Y., Zhang, B.J., Jia, H.L. and Tien, P. (2008) Analysis of a point mutation in H5N1 avian influenza virus haemagglutinin in relation to virus entry into live mammalian cells. Archives of Virology, 153, 2253-2261. doi:10.1007/s00705-008-0255-y
[22] Veljkovic, V., Niman, H.L., Glisic, S., Veljkovic, N., Perovic, V. and Muller, C.P. (2009) Identification of hemagglutinin structural domain and polymorphisms which may modulate swine H1N1 interactions with human receptor. BMC Structural Biology, 9, 62. doi:10.1186/1472-6807-9-62
[23] Veljkovic, V., Veljkovic, N., Muller, C.P., Müller, S., Glisic, S., Perovic, V. and Köhler, H. (2009) Characterization of conserved properties of hemagglutinin of H5N1 and human influenza viruses: possible consequences for therapy and infection control. BMC Structural Biology, 7, 9-21.
[24] Veljkovic, N., Glisic, S., Prljic, J., Perovic, V., Botta, M. and Veljkovic, V. (2008) Discovery of new therapeutic targets by the informational spectrum method. Current Protein and Peptide Science, 9, 493-506. doi:10.2174/138920308785915245
[25] Katoh, K., Kuma, K., Toh, H. and Miyata, T. (2005) MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Research, 33, 511-518. doi:10.1093/nar/gki198
[26] Cosic, I.
(1997) The resonant recognition model of macromolecular bioreactivity, theory and application. Birkhauser Verlag, Berlin.
[27] http://www.cdc.gov/flu/about/qa/1011_vac_selection.htm
[28] Solovyov, A., Palacios, G., Briese, T., Lipkin, W.I. and Rabadan, R. (2009) Cluster analysis of the origins of the new influenza A(H1N1) virus. Euro Surveill, 14, 19224.
[29] Wong, E.H., Smith, D.K., Rabadan, R., Peiris, M. and Poon, L.L. (2010) Codon usage bias and the evolution of influenza A viruses. BMC Evolutionary Biology, 10, 253. doi:10.1186/1471-2148-10-253
[30] Wanitchang, A., Jengarn, J. and Jongkaewwattana, A. (2011) The N terminus of PA polymerase of swine-origin influenza virus H1N1 determines its compatibility with PB2 and PB1 subunits through a strain-specific amino acid serine 186. Virus Research, 155, 325-333. doi:10.1016/j.virusres.2010.10.032
[31] Mehle, A. and Doudna, J.A. (2009) Adaptive strategies of the influenza virus polymerase for replication in humans. Proceedings of the National Academy of Sciences of the USA, 106, 21312-21316. doi:10.1073/pnas.0911915106
[32] Maurer-Stroh, S., Lee, R.T., Eisenhaber, F., Cui, L., Phuah, S.P. and Lin, R.T. (2010) A new common mutation in the hemagglutinin of the 2009 (H1N1) influenza A virus. PLoS Currents, RRN1162.