H3N2

H3N2 is one of the two major subtypes of influenza A circulating in humans. Major outbreaks of A/H3N2 strains in humans include the Hong Kong flu (1968-1969) and the Fujian flu (2003-2004).

We use the official Nextclade dataset nextstrain/flu/h3n2 for sequence alignment and for HA and NA clade assignment. More specifically, we use the CY121680.1 reference for HA and our custom dataset genspectrum/h3n2/seg6/CY114383 for NA; for all other segments we use GCF_000865085.1. Nextclade's clade assignments are based on a GISAID sequence that is not openly available, so we have converted them to use a similar reference sequence that is available on the INSDC.
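The reference choice described above can be written down as a small lookup. This is only an illustrative sketch: the identifiers are the ones named in the text, while the dictionary layout, the "seg4"/"seg6" keys, and the function name reference_for_segment are assumptions made for this example and not part of any Genspectrum or Nextclade API.

```python
# Identifiers taken from the text; the structure below is hypothetical.
HA_REFERENCE = "CY121680.1"
NA_DATASET = "genspectrum/h3n2/seg6/CY114383"
DEFAULT_REFERENCE = "GCF_000865085.1"

# Only HA (segment 4) and NA (segment 6) have a dedicated entry.
SEGMENT_REFERENCES = {
    "seg4": HA_REFERENCE,
    "seg6": NA_DATASET,
}


def reference_for_segment(segment: str) -> str:
    """Return the reference or dataset identifier used for a segment."""
    return SEGMENT_REFERENCES.get(segment, DEFAULT_REFERENCE)
```

For example, reference_for_segment("seg1") falls back to GCF_000865085.1, because only HA and NA have their own entry.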

How We Process Influenza Data

Genspectrum uses all open influenza A data available on the INSDC (taxonid: 197911). To classify influenza segments and subtypes we use nextclade sort; to improve classification, we use half of all k-mers for each subtype defined in https://github.com/anna-parker/InfluenzaAReferenceDB. Where available, we use the assembly information to group segments that come from the same sample/isolate. All remaining segments are grouped by a heuristic algorithm that uses the metadata available for each segment.
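The text does not spell out the heuristic grouping, so the following is only a minimal sketch of what metadata-based grouping could look like, not the algorithm Genspectrum actually uses. It assumes hypothetical metadata fields (isolate, collection_date, country), groups records that agree on all three, and starts a new group whenever a segment would otherwise appear twice in the same group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentRecord:
    accession: str
    segment: str          # e.g. "seg4"
    isolate: str          # hypothetical metadata field
    collection_date: str  # hypothetical metadata field
    country: str          # hypothetical metadata field


def group_segments(records):
    """Group records by shared metadata, at most one record per segment per group."""
    by_key = defaultdict(list)
    for rec in records:
        # Normalise the isolate name so trivial differences do not split a sample.
        key = (rec.isolate.strip().lower(), rec.collection_date, rec.country)
        by_key[key].append(rec)

    groups = []
    for recs in by_key.values():
        open_groups = []  # each group maps segment name -> record
        for rec in recs:
            for group in open_groups:
                if rec.segment not in group:
                    group[rec.segment] = rec
                    break
            else:
                # Every existing group already holds this segment: start a new one.
                open_groups.append({rec.segment: rec})
        groups.extend(open_groups)
    return groups
```

A real implementation would likely need fuzzier matching, since metadata often differs slightly between segments of the same isolate; exact matching keeps the sketch short.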

Protein Functions Explained
  • Segment 1: PB2 protein (a subunit of the RNA polymerase).
  • Segment 2: PB1 protein (a subunit of the RNA polymerase).
  • Segment 3: PA protein (a subunit of the RNA polymerase).
  • Segment 4: HA protein (hemagglutinin, part of the envelope; it functions as an attachment factor and membrane fusion protein and is responsible for binding influenza to sialic acid on the surface of target cells in the upper respiratory tract).
  • Segment 5: NP protein (nucleoprotein; it binds the viral RNA, and at the start of infection the resulting complex enters the host cell nucleus, where the RNA is transcribed and replicated).
  • Segment 6: NA protein (neuraminidase, part of the envelope; an enzyme that cleaves the glycosidic bonds of neuraminic acids (commonly found as sialic acids), which helps new virus particles leave the infected cell and spread to other cells).
  • Segment 7: M1 and M2 proteins (M1 is the matrix protein, which forms a layer between the nucleoprotein and the envelope; M2 is a proton channel protein).
  • Segment 8: NS1 and NEP proteins (non-structural protein 1 and nuclear export protein).
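The segment-to-protein list above maps naturally onto a lookup table. The sketch below encodes exactly the assignments listed; the names SEGMENT_PROTEINS and segment_for_protein are made up for this example.

```python
# Segment number -> proteins encoded on that segment, as listed above.
SEGMENT_PROTEINS = {
    1: ("PB2",),
    2: ("PB1",),
    3: ("PA",),
    4: ("HA",),
    5: ("NP",),
    6: ("NA",),
    7: ("M1", "M2"),
    8: ("NS1", "NEP"),
}


def segment_for_protein(protein: str) -> int:
    """Return the segment number that encodes the given protein."""
    wanted = protein.strip().upper()
    for segment, proteins in SEGMENT_PROTEINS.items():
        if wanted in proteins:
            return segment
    raise KeyError(f"unknown protein: {protein!r}")
```

Looking up "M2" returns segment 7, which it shares with M1, and an unknown name raises a KeyError instead of returning a default.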

For each individual influenza subtype you can view the CDS of each protein in the genome data viewer.

Genome Data Viewer

PB2 (seg1)

PB1 (seg2)

PA (seg3)

HA (seg4)

NP (seg5)

NA (seg6)

MP (seg7)

NS (seg8)