Sanger Sequencing Artefacts: Why Careful Scrutiny is Essential Before Calling a Mutation "Real"
- Om Prakash Singh

- Oct 11
In the world of genetics, Sanger sequencing remains a gold standard for detecting DNA mutations—those tiny changes in our genetic code that can drive everything from disease resistance to evolutionary adaptations. This method involves chain-terminating dideoxynucleotides to read DNA sequences base by base, producing those colourful chromatograms we've all seen in labs. It's reliable, cost-effective, and widely used in research, from identifying cancer-causing variants to tracking drug resistance in pathogens and pesticide resistance in pests.
But here's the catch: Sanger sequencing isn't foolproof. Without meticulous observation, editing, and validation, what looks like a ground-breaking mutation might just be a lab artifact, a ghost in the machine caused by technical glitches. Our recent publication in Trends in Parasitology (July 2025) drives this point home with a critique of studies on human lice. The article, "Novel knockdown resistance mutations in human lice: artifacts or emerging resistance?", dissects reports of bizarre "novel" mutations in lice DNA, arguing that many are likely errors rather than signs of evolving resistance.
The Backstory: Lice, Pyrethroids, and the Hunt for Resistance Mutations
Human lice—head lice (Pediculus humanus capitis) and body lice (Pediculus humanus humanus)—are more than just itchy nuisances. They're vectors for diseases and increasingly resistant to pyrethroid insecticides like permethrin, which target the voltage-gated sodium channel (VGSC) in their nervous systems. Known mutations like M815I, T917I, and L920F (called knockdown resistance or kdr) have been linked to this resistance worldwide.
Enter recent studies from Iran and Saudi Arabia, which reported an explosion of novel mutations (up to 19 new ones in VGSC domain II) via Sanger sequencing. Sounds exciting, right? New mutations could signal emerging super-resistant lice populations. But we aren't buying it. We meticulously reviewed these reports and found red flags everywhere, suggesting artifacts from poor experimental design and analysis. This isn't just academic nitpicking; misidentifying artifacts as real mutations could mislead public health strategies, like recommending ineffective treatments.
Artifact #1: Off-Target Amplification and Primer Shenanigans
One of the most damning issues we highlight is off-target primer binding during PCR amplification—the step before sequencing where DNA is copied en masse. In four Iranian studies (references [10–13] in the paper), researchers used the same protocol to amplify a ~900 bp fragment of the VGSC gene. But the reverse primer had unintended complementarity (8 bp at the critical 3' end) with another site in the target region, leading to a sneaky shorter ~250 bp amplicon.
This shorter product wasn't purified out (no gel extraction mentioned), so it mixed with the intended one during sequencing. The result? Misaligned reads and false mutations. Take the cluster of five novel mutations (F927I, A928L, V929R, M930L, M932L) packed into six consecutive codons. Biologically, this is wildly improbable: it would require five amino-acid substitutions side by side, with no intermediates or hypermutable motifs to explain them. But zoom in: these "mutations" perfectly match the sequence of the forward primer (see Figure 1B in Singh et al.). It's as if the primer's sequence got baked into the off-target product, and the sequencer picked up the dominant signal, calling fake variants.
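To put a rough number on that improbability, one can count the fewest nucleotide changes each reported substitution would need under the standard genetic code. The sketch below is illustrative only: the helper name `min_nucleotide_changes` is hypothetical, and the result is a lower bound that ignores which codons the lice actually carry at these positions.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def min_nucleotide_changes(aa_from: str, aa_to: str) -> int:
    """Fewest single-base changes that turn any codon for aa_from into one for aa_to."""
    from_codons = [c for c, aa in CODON_TABLE.items() if aa == aa_from]
    to_codons = [c for c, aa in CODON_TABLE.items() if aa == aa_to]
    return min(
        sum(x != y for x, y in zip(a, b))
        for a in from_codons
        for b in to_codons
    )


# The five clustered substitutions reported at codons 927-932.
cluster = ["F927I", "A928L", "V929R", "M930L", "M932L"]
changes = {m: min_nucleotide_changes(m[0], m[-1]) for m in cluster}
total = sum(changes.values())

for mutation, n in changes.items():
    print(mutation, n)
print("minimum total:", total)  # minimum total: 7
```

Even in the most favourable case, the cluster needs at least seven independent base changes within an 18-nucleotide window, which is hard to reconcile with ordinary point mutation.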
Lesson here? Always check your primers for off-target binding using tools like BLAST. And after PCR, gel-extract your target band to avoid contaminants. Singh et al. emphasize that single-pass sequencing (using only one primer direction) exacerbates this, as it doesn't provide bidirectional confirmation to spot mismatches.
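Even before running BLAST, a quick local check can show whether a primer's 3' end has a second landing site inside the intended amplicon. The following is a minimal sketch, assuming exact matching of the last 8 bases; the sequences are toy data (not the lice primers) and the function names are hypothetical. Real primer-checking tools also tolerate mismatches, which this does not.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def three_prime_sites(primer: str, template: str, k: int = 8) -> list[tuple[str, int]]:
    """Find every template position where the primer's last k bases could anneal.

    Returns (strand, position) pairs, with 0-based positions on the template
    as written. "+" means the 3' tail occurs in the template text itself;
    "-" means its reverse complement occurs there (typical for a reverse primer).
    """
    tail = primer.upper()[-k:]
    template = template.upper()
    hits = []
    for strand, target in (("+", tail), ("-", reverse_complement(tail))):
        start = template.find(target)
        while start != -1:
            hits.append((strand, start))
            start = template.find(target, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


# Toy template carrying the same 8-base site twice (positions 4 and 20).
toy_template = "AAAA" + "GATTACAG" + "CCCCCCCC" + "GATTACAG" + "TTTT"
# Toy reverse primer whose 3' tail is the reverse complement of that site.
toy_reverse_primer = "ACGTACGT" + "CTGTAATC"

print(three_prime_sites(toy_reverse_primer, toy_template))
# [('-', 4), ('-', 20)]
```

More than one hit for a primer within the target region is a warning sign that a second, shorter product may be amplified alongside the intended one.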

Potential artefactual mutations likely due to off-target primer binding. (A) Off-target binding of the reverse primer at an unintended site within the VGSC target region, showing complementarity at the 3′ end, which may lead to amplification of a shorter ~250 bp PCR product alongside the intended ~900 bp amplicon. (B) Incorporation of the forward primer sequence into the off-target amplicon, aligning with the reported mutations at codons 927–932 (F927I, A928L, V929R, M930L, M932L), suggesting primer-derived sequence misidentification as a source of artefactual mutation calls.
Artifact #2: Pooling Samples – Efficiency or Error Magnet?
Another pitfall comes from sequencing pooled lice samples, as in Mohammadi et al. (2022, reference [14]). They grouped 35 lice per pool (10 pools total) and reported six novel mutations unique to certain pools. Sanger sequencing shines for homogeneous samples but struggles with heterogeneity—like in pools where mutations vary between individuals. It can produce noisy chromatograms with overlapping peaks, leading to miscalls.
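A little arithmetic shows why pools are risky. The sketch below rests on two assumptions that are not from the paper: that lice are treated as diploid, and that a minor variant must make up roughly 15% of the template before Sanger chromatograms show it reliably (a commonly quoted ballpark, not a hard limit). The function name is hypothetical.

```python
def minor_allele_fraction(pool_size: int, carriers: int,
                          ploidy: int = 2, copies_per_carrier: int = 1) -> float:
    """Share of all alleles in a pool that carry the variant."""
    return carriers * copies_per_carrier / (pool_size * ploidy)


ASSUMED_DETECTION_LIMIT = 0.15  # assumption: rough Sanger sensitivity, not a measured value
POOL_SIZE = 35                  # lice per pool, as reported for the pooled study

# One heterozygous louse in a pool of 35.
single_carrier = minor_allele_fraction(POOL_SIZE, carriers=1)

# Smallest number of heterozygous carriers that reaches the assumed limit.
min_carriers = next(
    n for n in range(1, POOL_SIZE + 1)
    if minor_allele_fraction(POOL_SIZE, n) >= ASSUMED_DETECTION_LIMIT
)

print(f"{single_carrier:.1%}")  # 1.4%
print(min_carriers)             # 11
```

Under these assumptions, a variant carried by only a few individuals would sit far below the visible signal, so small secondary peaks in a pooled trace are more likely noise than rare alleles.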
Singh et al. note the absence of raw chromatogram data or GenBank deposits for these mutations, making verification impossible. Plus, four of the reported wild-type codons didn't match the reference sequence (AY191157.1) or expected norms (Table 1). One mutation (S879V) required three nucleotide changes—statistically unlikely without evidence. The authors argue this risks "conflating methodological errors with genuine resistance."
Artifact #3: Frameshifts, Misreads, and Reporting Inconsistencies
The Saudi Arabian study by Alghashmari and Zelai (2025, reference [15]) gets its own spotlight in Singh et al.'s critique. They reported three novel mutations: L920H, V966F, and F967L. But inconsistencies abound—the abstract says L920F (a known mutation), while the text and figure claim L920H. GenBank entry PQ569620 confirms L920F, suggesting a chromatogram misread.
Worse, V966F and F967L are adjacent and likely artifacts from upstream deletions causing frameshifts. In Sanger chromatograms, indels (insertions/deletions) can shift the reading frame, turning real codons like GTT-TTT (Val-Phe) into illusory TTT-TTA (Phe-Leu). The GenBank sequence shows multiple upstream deletions, supporting this.
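The frameshift effect is easy to reproduce on a toy sequence. In the sketch below, the reference and sample are invented for illustration (they are not the lice VGSC sequence), and a single upstream base is deleted; any net deletion that is not a multiple of three shifts the frame in the same way.

```python
# Only the codons needed for this toy example (standard genetic code).
CODONS = {
    "ATG": "Met", "GCA": "Ala", "CCC": "Pro", "CCG": "Pro",
    "GTT": "Val", "TTT": "Phe", "TTA": "Leu", "ACG": "Thr",
}


def translate_in_reference_frame(seq: str) -> list[str]:
    """Translate codon by codon from position 0, ignoring a trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [CODONS.get(seq[i:i + 3], "???") for i in range(0, usable, 3)]


reference = "ATGGCACCCGTTTTTACG"          # ... CCC GTT TTT ... = Pro Val Phe
sample = reference[:8] + reference[9:]    # one C deleted upstream of GTT-TTT

print(translate_in_reference_frame(reference))
# ['Met', 'Ala', 'Pro', 'Val', 'Phe', 'Thr']
print(translate_in_reference_frame(sample))
# ['Met', 'Ala', 'Pro', 'Phe', 'Leu']
```

Read naively in the reference frame, the unchanged GTT-TTT pair now appears as TTT-TTA, that is, as apparent Val-to-Phe and Phe-to-Leu "mutations" at adjacent codons.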
Lice VGSC sequencing is tricky anyway, per Singh et al.: Variable intron sizes, frequent indels, and poly A/T stretches distort chromatograms. Without manual review or bidirectional sequencing, errors slip through.
Broader Challenges and Smart Solutions
Singh et al. outline why lice VGSC is a sequencing nightmare—intronic variations mess with alignment, and homopolymers (repetitive A/T) cause "stuttering" in reads. Their proposed solutions are a roadmap for any Sanger user:
Multi-primer strategies: Use exon-specific primers to avoid introns.
Bidirectional sequencing: Sequence from both ends for cross-validation.
cDNA sequencing: Skip genomic DNA altogether by using RNA-derived cDNA, eliminating intron issues.
Gel extraction before sequencing: When PCR yields multiple bands, excise and purify the target band before sequencing.
Transparent reporting: Always share chromatograms and raw data.
Upgrade to NGS: For complex regions, next-generation sequencing (NGS) is more reliable, and multiplexing makes it cost-effective.
These aren't just for lice research; they apply to any field using Sanger for mutation detection, like oncology or microbiology.
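As one concrete illustration of the bidirectional check listed above, the sketch below compares a forward read with the reverse-complemented reverse read and reports every position where they disagree. It assumes both reads have already been trimmed to exactly the same region and contain no indels; the reads are toy data and the function names are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def discordant_positions(forward_read: str, reverse_read: str) -> list[tuple[int, str, str]]:
    """List (position, forward_base, reverse_base) wherever the two reads disagree.

    reverse_read is given as it came off the sequencer (5'->3' on the
    opposite strand) and is reverse-complemented before comparison.
    """
    forward = forward_read.upper()
    reverse_as_forward = reverse_complement(reverse_read)
    if len(forward) != len(reverse_as_forward):
        raise ValueError("reads must be trimmed to the same region first")
    return [
        (i, f, r)
        for i, (f, r) in enumerate(zip(forward, reverse_as_forward))
        if f != r
    ]


forward_read = "ACGTTGCA"
reverse_read = "TGCTACGT"   # reverse complement of ACGTAGCA

print(discordant_positions(forward_read, reverse_read))
# [(4, 'T', 'A')]
```

A base call supported by only one direction, like position 4 here, deserves a look at the raw chromatogram before it is reported as a variant.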
Wrapping Up: Don't Let Artifacts Steal the Show
Singh et al.'s paper is a wake-up call: Sanger sequencing demands careful observation—manual chromatogram editing, artifact checks, and rigorous validation—before declaring a mutation valid. Rushing to publish "novel" findings without this can inflate false positives, wasting resources and confusing the scientific community. As the authors conclude, their critique promotes "rigorous standards that prevent the mischaracterization of sequencing errors as functional resistance mutations."
If you're a researcher or student dabbling in DNA sequencing, take heed. Double-check those peaks, validate with orthogonal methods, and remember: Not every blip is a breakthrough. For the full details, check out the open-access article in Trends in Parasitology (DOI: 10.1016/j.pt.2025.05.011). Have you encountered sequencing artifacts in your work? Share in the comments!