It is part of our standard operating procedure that all sequence data generated at our Core facility are carefully inspected before they are made available to the end user. We have implemented rigorous process controls and a series of appropriate tools that enable us to immediately identify process- and/or instrument-related failures. In these rare cases, all affected samples are automatically resequenced at no extra charge to our customers.
Should there be any issues with your sequencing data, we strongly recommend reviewing our compilation of the most common problems experienced in ABI Sanger sequencing, including example chromatogram images and descriptions of possible causes and solutions (see below).
Understanding and solving failed sequencing reactions and/or problematic sequencing data require thorough analysis of the corresponding sequencing chromatograms. These trace files are labeled with the extension .ab1 and can be viewed with one of the free software programs listed here.
If you would like assistance in analyzing your sequencing data, please contact us.
Good Quality Sequencing Data
1. Failed Sequencing Reactions (Sequence data contains mostly N's)
How to Identify: The trace is messy, with no discernible peaks.
What is the cause? It is difficult to pinpoint exactly why a sequencing reaction fails, because there is no good-quality data to analyze.
Some of the more common reasons and how to fix them can be found here:
2. Chromatograms show a lot of noise or background along the bottom of the trace
How to Identify: The trace has some discernible peaks but also a lot of background noise along the bottom that can interfere with base calling. The quality scores of the peaks are usually very low.
What is the cause? This is usually due to low sample signal intensity. Low signal intensity is a result of poor amplification due to either:
3. Poor data is seen following a region of mononucleotides (single base)
How to Identify: The sequence trace becomes mixed and unreadable after a stretch of mononucleotides (a run of a single base).
What is the cause? The polymerase slips on the stretch of mononucleotides, dissociating from the template and rehybridizing at a different location. This produces fragments of varying size, creating a mixed signal after the region.
How do I fix the problem? There is currently no way to effectively sequence directly through such a region. A primer can be designed that sits just after the mononucleotide region, or one that sequences toward it from the reverse direction.
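Where the expected sequence of the template is already known, mononucleotide runs can be located in advance to help decide where such a primer should sit. The Python sketch below is illustrative only: the function name find_homopolymer_runs, the example sequence, and the min_len=8 threshold are assumptions made for this example, not values taken from our protocols.

```python
import re

def find_homopolymer_runs(seq, min_len=8):
    """Return (start, end, base) for each run of a single base >= min_len long.

    start is 0-based and end is exclusive, so seq[start:end] is the run.
    min_len=8 is an illustrative threshold (assumption), not a fixed rule.
    """
    # ([ACGT]) captures one base; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end(), m.group(1)) for m in pattern.finditer(seq.upper())]

# Hypothetical reference sequence containing a run of 12 A's.
reference = "ATGCCGTC" + "A" * 12 + "GCTTAGCCGATC"
for start, end, base in find_homopolymer_runs(reference):
    # Report 1-based, inclusive coordinates for readability.
    print(f"{base} x {end - start} at positions {start + 1}-{end}")
```

A primer placed just downstream of a reported run, or a reverse primer reading toward it, corresponds to the two options described above.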
4. Good quality data that suddenly comes to a hard stop
How to Identify: The sequence looks fine before it suddenly either terminates or the signal intensity drops dramatically.
What is the cause? This is usually a sign of secondary structure present in the template. Complementary regions fold up on themselves to form hairpin structures that the sequencing polymerase cannot pass through. Long stretches of Gs or Cs can also cause problems for the sequencing polymerase.
Important fact to note: If the secondary structure is weak enough, we often see a reaction successfully sequence through half of the time and terminate early the other half. Simply put, the polymerase successfully passes through the secondary structure only some of the time to give good sequence.
How can I fix the problem? We offer an alternate sequencing protocol that can sometimes help in this situation. It uses a different dye chemistry, designed by ABI, that helps the polymerase pass through secondary structure and other difficult templates. It is not guaranteed to work and should be tested on a few samples before it is used on a large batch. Many people find it helpful, while some do not. It can be ordered by clicking the "difficult template" button on the right-hand side of the order form. Because of the extra reagents involved, it costs a little more than a regular sequencing reaction. You can view the rates by clicking here. Please call us if you would like advice as to whether this protocol might be beneficial to you.
Please note: This protocol will not help sequences that completely fail with the standard sequencing protocol. There must be visible signs in the chromatogram that hint at a problem with secondary structure.
***An alternate method of sequencing a difficult region is to design a primer that either sits directly on the area of secondary structure or avoids the region altogether.
***Regions known to be problematic include sequences containing the loxP site and the att sites of lambda phage used by Gateway entry vectors.
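If a reference sequence for the template is available, regions where complementary stretches could fold into a hairpin can be flagged before sequencing. The following Python sketch looks only for perfect inverted repeats separated by a short loop. It does not estimate thermodynamic stability, and the function name find_hairpin_stems, the default stem and loop sizes, and the example sequence are all assumptions chosen for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def find_hairpin_stems(seq, stem_len=8, min_loop=3, max_loop=10):
    """Return (left_start, right_start, loop_len) for perfect inverted repeats.

    A hit means seq[left_start:left_start + stem_len] can base-pair with
    seq[right_start:right_start + stem_len], leaving loop_len unpaired bases
    between them. All defaults are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len - min_loop + 1):
        # The right arm of the stem must equal the reverse complement of the left arm.
        target = reverse_complement(seq[i:i + stem_len])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > len(seq):
                break
            if seq[j:j + stem_len] == target:
                hits.append((i, j, loop))
    return hits

# Hypothetical template: an 8-base stem, a 4-base loop, then the matching arm.
template = "AAA" + "GGATCCGA" + "TTTT" + "TCGGATCC" + "AAA"
for left, right, loop in find_hairpin_stems(template):
    print(f"possible stem at {left + 1} and {right + 1}, loop of {loop} bases")
```

Positions reported this way are candidates for the primer placement described above, either directly on the structure or clear of it.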
5. Good quality data that turns to double sequence (2 or more peaks in the same location)
How to identify: The sequence trace begins with high quality and then becomes mixed.
What is the cause and how can I correct it? There can be a couple of different causes of this.
6. Double sequence that starts from the beginning of the trace (mixed template)
How to identify: The sequencing trace has two or more peaks in the same location. The quality scores for each peak are very low. The text file will contain a lot of N's.
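The proportion of N's in the text file can be quantified, both overall and along the read, which helps separate reads that are mixed from the very beginning from reads that are poor only in one region. This is a minimal Python sketch: the function names, the 50-base window, and the example read are assumptions for illustration, and it is not a substitute for inspecting the chromatogram.

```python
def n_fraction(seq):
    """Fraction of base calls that are N (0.0 for an empty read)."""
    seq = seq.upper()
    return seq.count("N") / len(seq) if seq else 0.0

def n_fraction_by_window(seq, window=50):
    """N fraction in consecutive, non-overlapping windows along the read."""
    return [n_fraction(seq[i:i + window]) for i in range(0, len(seq), window)]

# Hypothetical read: 40 uncalled bases followed by 160 called bases.
read = "N" * 40 + "ACGT" * 40
print(n_fraction(read))                # overall N fraction
print(n_fraction_by_window(read, 50))  # N fraction per 50-base window
```

A high N fraction in every window is consistent with a failed or fully mixed read, while a high fraction only in the first windows suggests the problem is confined to the start of the trace.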
What are the causes and what can I do to fix them? There can be many different situations that give rise to this problem.
7. Sequence gradually dies out causing early termination in read length
How to identify: The sequence starts out with high-quality peaks and either stops prematurely or becomes messy downstream. The raw data show very high signal intensity at the very beginning of the sequence, followed by a sharp decrease.
What is the cause and how can I fix the problem? The number one reason for this is too much starting template DNA. The template becomes overamplified and uses up all of the fluorescent dye that we label the DNA with at the beginning of the sequence. This can be corrected by lowering your template concentration to between 100 ng/µL and 200 ng/µL. Use lower amounts for short PCR products under 400 bp. View the concentration guide here.
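Bringing an over-concentrated template into this range is a simple dilution, C1 x V1 = C2 x V2. The Python sketch below assumes the stock concentration has already been measured. The function name dilution_volumes and the example numbers (a 600 ng/µL stock diluted to 150 ng/µL in 20 µL) are hypothetical.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2, where C1 is the stock concentration, C2 the
    target concentration, and V2 the final volume.
    """
    if stock_ng_per_ul <= 0 or target_ng_per_ul <= 0 or final_volume_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("stock is below the target; dilution cannot concentrate it")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul

# Hypothetical example: 600 ng/uL stock, 150 ng/uL target, 20 uL final volume.
stock_ul, diluent_ul = dilution_volumes(600, 150, 20)
print(f"Mix {stock_ul:.1f} uL of template with {diluent_ul:.1f} uL of water or buffer")
```

In this example, 5 µL of stock plus 15 µL of diluent gives 20 µL at 150 ng/µL.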
8. Poorly resolved sequence traces where the peaks are very broad and not sharp or distinct
How to identify: The peaks are very broad and blobby instead of narrow and sharp.
What is the cause and how can I fix it? This is a very rare occurrence, though it seems to happen now and then with certain people's samples. The popular theory is that there is some kind of unknown contaminant in the DNA. People have noticed it mostly when using filter purification kits. Try another cleanup method or dilute the template.
This can also occur if the polymer in our sequencers starts to break down. We usually detect this when it happens, because we see a loss of resolution in the data of all users on a run. We automatically rerun these samples before you ever see the data.
9. Large dye blob that occurs around 70 base pairs
How to Identify: The sequence will contain a large dye blob at around 70 base pairs that may or may not overshadow the regular base calling.
What is the cause? This can result from one of a couple of reasons.
10. The sequence starts out noisy or mixed and changes to good quality further down the trace
How to identify: There are many N's in the beginning of the sequence and the peaks appear noisy or mixed. The trace becomes clean with identifiable peaks downstream.
What is the cause and how do I fix it? This is usually the result of primer dimer formation. Primer dimers form when a primer self-hybridizes due to complementary bases on the primer itself. Make sure your primer is not likely to amplify itself by having it analyzed by one of the many free primer analysis programs online. You can also view a primer design guide here.
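As a rough first pass before using a dedicated primer analysis program, the longest stretch of a primer that is complementary to another copy of the same primer can be computed directly. The Python sketch below does this by finding the longest substring shared between the primer and its own reverse complement. The function names and example primers are assumptions for illustration, and the sketch ignores thermodynamics, so how long a stretch is too long remains a judgment call.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_self_complementary_stretch(primer):
    """Length of the longest run of bases that can pair with a second primer copy.

    A substring shared by the primer and its reverse complement means the
    primer contains both that substring and its complement partner, so two
    copies (or two parts of one copy) can anneal over that stretch.
    """
    p = primer.upper()
    rc = reverse_complement(p)
    best = 0
    prev = [0] * (len(rc) + 1)
    # Classic longest-common-substring dynamic programme, one row at a time.
    for i in range(1, len(p) + 1):
        curr = [0] * (len(rc) + 1)
        for j in range(1, len(rc) + 1):
            if p[i - 1] == rc[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best

# Hypothetical primers: one fully palindromic, one with no self-complementarity.
for primer in ("GGGAATTCCC", "AAAAAAAAAA"):
    print(primer, longest_self_complementary_stretch(primer))
```

A long stretch reported here is a reason to run the primer through a full analysis tool or to redesign it.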
11. Sharp signal spike that occurs randomly in the trace
How to identify: The trace reads cleanly and is disrupted by a very sharp signal spike that spans about 2 or 3 base pairs.
What is the cause and how do I fix it? It is believed that these spikes are caused by small air bubbles that can get trapped in the capillaries of the sequencer. We will rerun the sample for you if the spike is in an area of particular interest to you.
12. Chromatogram shows an insertion or deletion (indel)
How to identify: The sequencing trace is good quality in the beginning and becomes mixed further downstream after an indel region. The peaks in the mixed region are often shifted, and a small peak can occur either before or after a larger one.
What is the cause and how do I fix it? An indel is caused by either an insertion or a deletion on the template strand. Indels can be seen when sequencing PCR products with polymorphic regions. Polymorphisms can be a direct result of a heterozygous individual or of a random mutation that occurred during the cloning process. Random mutations occur much more readily if the cloned template is toxic or unstable in the E. coli host. If the mutation is caused by template instability, try cloning into a different strain or using a low-copy vector. You can also try sequencing from the reverse direction.
13. The trace shows a delayed start in the first base of the read
14. The peaks look fine in the chromatogram file but many of them are labeled as N's. The peaks may also blast out of range
How to identify: The peaks look clear and distinct in the chromatogram file, but the base caller marks them as N's. This is particularly prevalent in the earlier portion of the sequence. The signal intensities are extremely high, and much of the time the peaks can be accurately called by eye.
What is the cause and how do I fix the problem? This is always caused by high starting template concentrations. The DNA is overamplified and the resulting signal is blasted out of the range that the base caller on the sequencer can accurately read. We see it most often in short PCR products, where it is easy to submit too much DNA. Make sure to keep template concentrations closer to 100 ng/µL if the PCR products are under 500 bp.