Emerging B.1.X variant lineage in Europe and Africa with conserved Spike mutations: P9L, E96Q, R346S, Y449N, P681H and T859N #297
Comments
Dear Tom, thanks for flagging this. Public Health France and the NRC are currently investigating a cluster in France with this constellation (there are now more sequences on GISAID), and a Pango lineage designation would be super useful!
Ah brilliant, thanks for flagging those extra sequences @Simon-LoriereLab! Extra GISAID IDs from the French cluster: EPI_ISL_5655471. Updated UShER tree:
New sequence from Italy (ID'd by @c19850727): EPI_ISL_5796708
Further sequences from France (currently called as B.1.628 or 'none'): There are now sequences from the North West, Centre and South East of France, indicating there might be quite widespread dispersal through the country (or maybe multiple import clusters).
This has all the signs of the dreaded complete or near-complete escape variant: R346, N394, Y449, F490, plus lots of bad NTD mutations. I would be very surprised if this shows less neutralization reduction than the current champion, C.1.2. It really needs to be designated ASAP so that it is on people's minds and is properly monitored, even if it eventually goes nowhere rather than being a January 2020-type situation.
@russcd @yatisht This is a tough one for the UCSC/UShER tree because it is so far diverged from any other sequences that it can't be placed with confidence. Sequences in this cluster are rejected from the daily tree update for two reasons:
It does seem worth labeling and tracking this. Sorry that the sequences won't have a place in the UCSC/UShER tree unless we figure out how to exempt them from our filters. I am doing a special run of usher in --multiple mode to see whether any of the 6 equally parsimonious possible placements looks better or worse than the others.
Thanks @AngieHinrichs, that's very interesting! Further 7 sequences just uploaded from Congo: Latest UShER tree: Interestingly, the two Paris region samples are outliers. I initially thought they might be down to poor sequencing/WT calling, but I'm starting to wonder whether they might genuinely represent large diversity within this lineage...
First USA sequence (California):
I'd suggest we add this clade as an exception to standard QC. Then, if/when additional samples from this lineage are added, they will presumably have an obvious correct placement relative to each other. The exact placement of the long branch on the main tree is pretty inconsequential given how divergent this clade is, but we should have this in our tree. @AngieHinrichs what do you think?
Yeah, does make sense to make an exception in this case.
Thanks @russcd for pointing out that if all sequences are added without the branch length filter, the long branch is broken up enough for the sequences to be kept when the filter is applied only afterwards, and that not all of the sequences have too many equally parsimonious placements. I will try to work the sequences into today's tree, and move the long-branch filter out of usher into a post-processing step.
For your information, it seems that the cluster from the west of France (our sequences from Rennes Hospital) could be geographically linked to a traveler coming from Congo. It would be interesting to look at whether there is a link with Africa for the other described strains from diverse regions of the world. Unless the virus is under strong pressure, I have difficulty understanding how this very peculiar strain appeared in different countries. Any information on possible travel or links with Africa would be very interesting.
Thanks @thomasppeacock. We've added this as B.1.640 in v1.2.94 and included all 22 of the sequences from your comments above with <5% ambiguity. I'll close this issue now so we know it's been designated, but if there's any more discussion about these sequences, travel history, etc., please do keep that going below.
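For anyone wanting to check their own sequences against the <5% ambiguity cut-off mentioned above, here is a minimal sketch. The exact definition used for designation is not stated in this thread, so this assumes that any character other than A, C, G or T (N, IUPAC ambiguity codes, gaps) counts as ambiguous; the function names are hypothetical.

```python
def ambiguity_fraction(seq: str) -> float:
    """Fraction of positions that are not an unambiguous A, C, G or T.

    Assumption: N, IUPAC ambiguity codes and gap characters all count
    as ambiguous. The thread does not say how ambiguity was measured.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    unambiguous = sum(seq.count(base) for base in "ACGT")
    return (len(seq) - unambiguous) / len(seq)


def passes_ambiguity_filter(seq: str, max_fraction: float = 0.05) -> bool:
    """True if the sequence has strictly less than max_fraction ambiguity."""
    return ambiguity_fraction(seq) < max_fraction
```

A sequence with 4 ambiguous positions out of 100 would pass this check, while one with 5 out of 100 would not, because the comparison is strict.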
I found 25 samples with the 3 S mutations E96Q, P9L, N394S. Here is its UShER build: A - France E (EPI_ISL_5926666) looks a bit of an outlier. If I take the consensus mutations of groups A-D (34), it has only 19 of those. Another sample, EPI_ISL_5588295, appeared in Delta (21J).
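The kind of screen described in the comment above can be sketched roughly as follows. This is an illustration only, not the tool that was actually used: the `gene:mutation` string format, the function names and the sample names are all hypothetical, and "consensus" is assumed here to mean mutations present in every selected sample.

```python
from collections import Counter

# The three Spike markers named in the comment above.
MARKERS = {"S:P9L", "S:E96Q", "S:N394S"}


def carries_markers(mutations, markers=MARKERS):
    """True if a sample's mutation set contains every marker mutation."""
    return set(markers) <= set(mutations)


def consensus_mutations(samples, min_fraction=1.0):
    """Mutations present in at least min_fraction of the given samples."""
    counts = Counter(m for muts in samples for m in set(muts))
    threshold = min_fraction * len(samples)
    return {m for m, n in counts.items() if n >= threshold}


def shared_count(mutations, consensus):
    """How many consensus mutations a single sample carries."""
    return len(set(mutations) & set(consensus))


# Hypothetical toy data, not real GISAID records.
samples = {
    "sample_1": {"S:P9L", "S:E96Q", "S:N394S", "S:R346S", "S:Y449N"},
    "sample_2": {"S:P9L", "S:E96Q", "S:N394S", "S:R346S"},
    "sample_3": {"S:P9L", "S:E96Q", "S:D614G"},
}
hits = [name for name, muts in samples.items() if carries_markers(muts)]
consensus = consensus_mutations([samples[name] for name in hits])
```

Counting how many consensus mutations a sample shares, as with `shared_count(samples["sample_3"], consensus)`, is one simple way to flag a possible outlier or a sequence with wild-type backcalling.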
Dear all, |
Hi @LVerdurme This data is very important. |
Hi, I'm puzzled as some of the sequences in this discussion seem to have been redesignated from B.1 to B.1.576 (e.g. EPI_ISL_6095997 and EPI_ISL_5655478) in the Pango lineage notes in GISAID. I was expecting to see these changing to B.1.640. B.1.576 is an old lineage without sporadic Spike changes. Is this as intended?
It looks like those are spurious assignments, caused by B.1.640 not yet being implemented in GISAID. Of the ones we think are B.1.640, about half are now B.1 and half B.1.576 in GISAID.
New proposed lineage
By Tom Peacock (also identified independently by @c19850727)
Description
Sub-lineage of: B.1
Earliest Sequence: 2021-09-28
Latest Sequence: 2021-10-18
Countries circulating: UK (2 genomes), Switzerland (1 genome), France (1 genome), Republic of Congo (1 genome)
Description:
Conserved Spike mutations - P9L, E96Q, Δ136-144*, R190S*, I210T*, R346S, N394S, Y449N, F490R*, N501Y*, D614G, P681H, T859N, D936H* (*seen conserved in all sequences except the French sequence)
Conserved non-Spike mutations - NSP2 – P129L, E272G*; NSP3 – L1301F*, A1537A*; NSP4 – S386F*, R401H*, T492I; NSP6 – V149A*; NSP12 – P323L; ORF3a – T32I*, Q57H*; M – I82T; ORF8 – Q27*STOP; N – D22Y, T205I, E378Q (*found in all but one sequence)
Although there are currently only 5 sequences, the i) extremely short period of time between first and most recent case, ii) (presumed) repeated import into European countries, iii) presence in Central Africa where surveillance is otherwise quite low, and iv) appearance despite Delta wiping out all other variant lineages globally make me think this was worth flagging even with this few sequences.
It appears likely there is a lot more of this around than surveillance would otherwise imply, and it would be useful to have a Pango lineage assigned to allow rapid identification in future surveillance.
Genomes:
EPI_ISL_5481688
EPI_ISL_5619557
EPI_ISL_5592661
EPI_ISL_5531392
EPI_ISL_5588295 (this French sequence appears like it might have a lot of WT backcalling, rather than being a true ancestral sequence)
Evidence:
https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1a121_aa3ca0.json?c=pango_lineage
Proposed lineage name: B.1.X