An informal, non-comprehensive guide to successful sequencing
There are many guides to sequencing available on the internet and through manufacturers such as Applied Biosystems (maker of BigDye). In an effort to collate what we feel are some of the more important factors, both in our own direct experience and through conversations with facility users, we decided to put our own recommendations and insights into one place. Much of this material is available in other documents on the DNA Blackbox (http://dna.biotech.wisc.edu; scroll down to "Helpful Documents."), but this document attempts to combine it all. It is a mix of procedural information and troubleshooting hints, and is based on examining each of the components of the sequencing procedure. While sample chromatograms are not present, an attempt has been made to describe certain diagnostic features of the trace file. A good visual source of such files can be viewed from Roswell Park or University of Chicago.
There are several components in a sequencing reaction—template, primer, sequencing reagent, and “other additives.” The mix of these reagents then undergoes 3 steps—cycle sequencing, post sequencing cleanup, and analysis. By examining each of these processes, sometimes in conjunction with controls we provide, we can get you sequencing successfully right away, or, if you’re having difficulty, figure out why you’re not getting the data you desire.
Template is a very common problem area. If your chromatogram is blank, has very low signal, or starts well but gradually dies out, the template should be examined. For plasmids in the 3-10 kb range, 0.2 ug is a good amount of DNA to use. More is almost never better. One problem is that accurate quantification of DNA is not easy. Spectrophotometer readings will invariably overestimate the amount of template DNA, unless you have CsCl banded material. That’s because use of almost all miniprep kits on the market will result in some RNA, chromosomal DNA, and other fluorescent cellular material coming through the purification procedure that will absorb UV light. These other chemicals won’t necessarily inhibit the sequencing reaction, but they will contribute to the A260 reading. As a result, relying on this spec reading alone will cause you to add less DNA than you think (midi and maxi prep kits do a better job, giving a higher plasmid:contaminant ratio). It’s informative to run your template on an agarose gel, using some standard of known concentration and estimate the relative fluorescent intensity following gel staining. We can provide you with a pGEM standard at 0.2 ug/ul to run with your sample. If you are doing many preps and don’t always feel like quantifying your DNA, a reasonable rule of thumb for minipreps of a high copy number plasmid, following lysis of 2-3 mls of a dense overnight culture, is to use 1/10th volume of the eluted sample in the sequencing reaction (but try this first before doing several dozen).
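As a rough illustration of the template arithmetic above, the sketch below converts a gel-estimated concentration into the volume that delivers a target mass of plasmid. The function name and the example concentration are hypothetical; only the 0.2 ug target comes from the text.

```python
def template_volume_ul(target_ug, conc_ug_per_ul):
    """Return the volume (ul) of template that delivers target_ug of DNA."""
    if conc_ug_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ug / conc_ug_per_ul


# Hypothetical prep estimated on a gel at 0.1 ug/ul, aiming for 0.2 ug of plasmid
volume = template_volume_ul(0.2, 0.1)
print(f"Add {volume:.1f} ul of template")
```

If the computed volume is a large fraction of the reaction, that is a hint the prep is too dilute and should be concentrated first.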
We distribute primers that recognize the Amp or Kan resistance gene present on most commonly used plasmids as a control for template purity. We know these primers sequence well on good templates, so using these in a reaction helps you assess your template “sequencability.” [See http://dna.biotech.wisc.edu/documents/Available_Controls.htm ]
The problem of sequence starting out strong then rapidly dying out is common and is usually attributed to "impurities" in the prep. We have found that these impurities can't always be removed by EtOH precipitation and you may need to re-prep the plasmid.
Larger and/or lower copy number plasmids can be difficult to sequence. It is tempting to lyse a greater volume of culture for the miniprep column, but a better strategy is to stay with 2-3 mls culture/miniprep column and do multiple columns (or do a midi or maxiprep). The separate preparations can then be pooled; it may even be necessary to do an additional cleanup and concentration step such as ethanol precipitation. Also, keep in mind that larger templates often require higher amounts of DNA (0.5-1.0 ug) to maintain a good molar ratio of template:primer and reagents. It is better to add a small volume of a concentrated template; adding 10 ul of template to a 20 ul sequencing reaction is most likely not going to give you good data.
BACs, P1 clones, and lambda phage present their own difficulties, and it may be necessary to investigate kits from several manufacturers to get good template. Our experience preparing such templates is limited, and we don’t have specific product recommendations (though our web site does have a BAC prep protocol that a user applied successfully years ago). Most companies are willing to provide free samples so you can test out their products; just give them or their sales rep a call. Since these are also high MW templates you will need to add more DNA to the sequencing reaction, remembering that small volumes of a concentrated prep generally give better results. See more info at http://dna.biotech.wisc.edu/documents/BAC_Sequencing_Information.htm .
PCR fragments seem to sequence either incredibly well or not at all. If one band is produced and the components are not used in gross excess, the products do not need to be cleaned for sequencing. For a strong band, use 1 ul of a 1 to 10 dilution in a 20 ul sequencing reaction. For a medium strength band, use 1 ul of a 1 to 5 dilution in a 20 ul reaction. For a weak band, add 1 ul of undiluted PCR product to a 20 ul sequencing reaction. Otherwise, PCR fragments generally must be cleaned up following the initial amplification and prior to sequencing to remove the two amplifying primers and the unused nucleotides (some mol bio jocks have their reactions fine tuned to the point that following amplification there are no primers or nucleotides left, and they can just add the product directly into a sequencing reaction, but this is not common). Common ways to clean up the PCR fragment are listed below, with advantages and disadvantages for each:
1. Cutting the band out of a gel and doing a gel extraction using any one of a number of kits available. We only recommend this if multiple fragments are seen following amplification, and you need to cut out the fragment that’s the “right” size. Disadvantages of this procedure include low yields, exposure of the DNA to ethidium and UV light, which damages it, and contamination from other bands. This procedure seems to cause the most problems among our users. Unless you’re expecting multiple bands, your time might be better spent adjusting the PCR conditions to give one, or at least a dominant, band. You might also consider cloning the products and sequencing individual plasmid clones if several attempts at sequencing a gel-extracted band fail.
2. Performing a kit based column cleanup protocol. This typically involves applying the entire PCR to a column, then carrying out washing and elution steps to generate fragment free of contaminating primers and nucleotides. This procedure depends on having a single or predominant band following amplification, and can generate DNA that sequences effectively. It is highly recommended an agarose gel be run following this or the previous method in order to see how much material is present following purification.
3. Enzymatic cleanup of the PCR fragment. Commercially available shrimp alkaline phosphatase and exonuclease (SAP-EXO) are used to inactivate excess primers and triphosphates. As with the column cleanup, a single or predominant band must be present. This procedure, which is one we have done at the facility (and offer as a service), has a number of advantages—it is fast, inexpensive, quantitative, and amenable to scaling up in 96 well format. Most importantly, it yields DNA that sequences very well. Kits are available from USBiochemicals (USB) called “PCR Product Pre-Sequence Kit”, cat. # 70995 for the smaller kit.
4. Magnetic bead clean-up: “AMPure” from Beckman Coulter (Agencourt). This method is similar to bead cleanups following sequencing (see below). PCR fragment is bound to magnetic beads, rinsed, then eluted using a special super magnetic plate. It works well and shares the convenience of the other magnetic bead cleanup method, notably ease of sample handling in multiwell format, though it is relatively expensive (it is included on the same UW quote as the dye terminator removal beads).
Official protocols for sequencing PCR fragments recommend using 10 ng of PCR fragment per 100 bp of fragment size. However, it is usually a pain to attempt quantification of PCR fragments, and it often works fine to empirically determine how much to add based on the intensity of the fragment on an agarose gel. PCR fragments can sequence very well, probably due to the high molar ratio of template:primer and the efficient denaturation of a fragment compared to a plasmid. An amount of DNA that is very bright on a gel will often sequence too well and resolution of the chromatogram will suffer. However, if you see any product at all, even if faint, you will probably get decent data from it in a sequencing reaction (IF it’s the right fragment). Amplifying primers work well to sequence PCR fragments, as do internal primers. Additional discussion of sequencing PCR fragments is presented below.
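The two ways of deciding how much PCR product to add (the official 10 ng per 100 bp guideline and the band-intensity rule of thumb given earlier) can be written down as a small sketch. The function names and the dictionary are hypothetical conveniences, not part of any facility software.

```python
NG_PER_100_BP = 10.0  # official guideline quoted in the text

# Rule of thumb from the text: dilution factor applied before adding
# 1 ul of product to a 20 ul sequencing reaction (1 means undiluted).
BAND_DILUTION = {"strong": 10, "medium": 5, "weak": 1}


def pcr_template_ng(fragment_bp):
    """Nanograms of PCR fragment suggested by the 10 ng / 100 bp guideline."""
    if fragment_bp <= 0:
        raise ValueError("fragment length must be positive")
    return NG_PER_100_BP * fragment_bp / 100.0


def dilution_for_band(strength):
    """Dilution factor (1 to N) for a band judged strong, medium, or weak."""
    return BAND_DILUTION[strength]


print(pcr_template_ng(500))          # guideline amount for a 500 bp fragment
print(dilution_for_band("medium"))   # use 1 ul of a 1 to 5 dilution
```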
Template composition is a factor that can cause sequencing problems. GC rich templates will generally sequence fine, unless there are particular regions of very strong secondary structure. A diagnostic trace pattern for this sort of template would be a chromatogram that looks great to a certain point, then suddenly dies out. Methods for dealing with this are discussed below, but basically these involve the addition of denaturants to the sequence reaction or specially formulated versions of BigDye. Poly A/ Poly T regions will often cause difficulty—the chromatogram will look fine up to the polyT (A) stretch, then either be very noisy peaks under peaks, or just long “rolling hills” of the four chromatogram colors. This results from “polymerase slippage” on the poly T(A) region and is a difficult issue to resolve. The addition of “reaction enhancers” (see below) may help if the problem is not severe. Another strategy is to design “anchored primers,” a poly A or poly T sequence with the final 3’ base being either G, C, or T (or G, C, and A for a poly T primer). With these primers the poly A/T stretch needs to be at least 17 bases long, and special conditions for annealing (42°) and cycling (52°) are recommended. Sometimes this works well, but it may require the user experimenting with different conditions. We do have anchored primer mixes available free of charge if you want to try this. Remember you will lose information just downstream of the primer, as is typical in sequencing. Finally, di- and tri-nucleotide repeats can cause poor data. It's usually pretty obvious when such sequences are at fault. Often the use of sequencing reagent enhancers (discussed below) helps a lot.
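To make the anchored primer idea concrete, here is a minimal sketch that writes out the three anchored primers for a poly A or poly T run. The function name is hypothetical, and the default run length of 17 is an assumption borrowed from the minimum stretch length mentioned above; the actual primer mixes the facility hands out may differ.

```python
def anchored_primers(base, run_length=17):
    """Return the three anchored primers for a poly-A or poly-T run.

    Each primer is `run_length` copies of `base` followed by one 3' anchor
    base, where the anchor is any base other than `base` itself.
    """
    if base not in ("A", "T"):
        raise ValueError("base must be 'A' or 'T'")
    anchors = [b for b in "ACGT" if b != base]
    return [base * run_length + anchor for anchor in anchors]


for primer in anchored_primers("T"):
    print(primer)
```

The three primers are normally used as a mix, so that whichever base follows the homopolymer stretch, one of them can anchor at the junction.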
Primer problems can also cause sequencing failures, and some of these give characteristic chromatogram patterns. A blank chromatogram, which is not that diagnostic of a primer problem specifically, can result from use of the wrong primer, too low a concentration of primer, or simply bad or degraded primer. However, assuming you’ve used one of our control Amp or Kan primers, or another primer from your lab on the same template with good results, it makes sense to re-examine the primer component. Primer design is usually not an issue. Compared to PCR, a wider variety of primer compositions can give excellent sequencing data. Check out the "Primer Considerations" document at http://dna.biotech.wisc.edu/documents/Primer_Considerations.htm. Additional primer design and analysis resources can be found at Primer3. For information on MW and oligonucleotide conversions, check out OligoCalc.
Generally it is tough to design a primer that doesn’t work. However, it is easy to miscalculate how much you’re adding so it always is worthwhile re-checking your calculations. This may involve re-quantifying your primer by spectrophotometry. A basic rule of thumb is that you should use about the same amount of primer in a sequencing reaction as you do in a PCR (5-10 pmol, corresponding to 30-60 ng of an 18 mer). If everything has been examined and by all accounts a primer should work, you might just have a bad primer. We can examine any primer by mass spectrometry (for a fee) and tell you if it appears good, i.e., full length, or not. You might want to talk to the people who made it to find out about getting a replacement. Our DNA Synthesis facility will generally re-synthesize any primer made here that should work in a sequencing reaction but doesn’t. Of course this is based on our analysis of the primer and previous controls that have been carried out by you.
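Since miscalculated primer amounts are an easy mistake, a quick sanity check of the pmol to ng conversion may help. This sketch assumes an average mass of about 330 g/mol per nucleotide, a common approximation rather than an exact value for any particular sequence; the function names are hypothetical.

```python
AVG_NT_MW = 330.0  # g/mol per nucleotide; approximate average (assumption)


def pmol_to_ng(pmol, length_nt):
    """Approximate mass in ng of `pmol` picomoles of an oligo of length_nt."""
    return pmol * length_nt * AVG_NT_MW / 1000.0


def ng_to_pmol(ng, length_nt):
    """Approximate picomoles in `ng` nanograms of an oligo of length_nt."""
    return ng * 1000.0 / (length_nt * AVG_NT_MW)


# 5-10 pmol of an 18-mer comes out near the 30-60 ng range quoted above
print(round(pmol_to_ng(5, 18), 1), round(pmol_to_ng(10, 18), 1))
```

For an exact figure, use the sequence-specific molecular weight from the synthesis report or a tool such as OligoCalc.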
A “noisy” chromatogram (good signal strength, but peaks under peaks resulting in numerous ambiguities, or “N’s”) can indicate several primer difficulties such as more than one primer present, more than one primer binding site present, secondary priming at a related sequence, or degraded primer. [This pattern is also evident if a mixed population of template is present, so this needs to be considered as well. It is particularly diagnostic if the peaks under peaks begin at the cloning site used in the construct.] Additional primers in the reaction are usually an accident that is resolved by repeating the reaction or, in the case of sequencing a PCR fragment, re-cleaning the input template to remove amplifying primers. Multiple peaks on the chromatogram can also occur if the primer binding site is within a repeat region on your template. Depending on the structure of your template, this can look pretty strange: you can have good sequence that all of a sudden turns into peaks under peaks, or peaks under peaks that will suddenly resolve into good looking data. If this happens you may need to reexamine the template you’re sequencing: investigate what’s known about it or analyze it yourself and see if you can detect evidence of a repeat pattern. This can be a real problem: sequencing through long repeat regions is one of the big challenges faced by genomic sequencers, and we have unhappily encountered it within our facility.
It is also possible to get secondary primer binding if the primer is very GC rich and is being used to sequence a GC rich template. In these cases, you can try raising the annealing temperature and adding a denaturant such as DMSO (to 5% final concentration), formamide (5%), or betaine (1M final concentration) to increase primer:template specificity. Secondary priming giving peaks under peaks, or too strong a signal resulting in peaks under peaks, can also occur if the primer concentration is too high in PCR fragment sequencing. In contrast to sequencing plasmids, where too much primer shows little effect, adding too much primer to a PCR fragment sequencing will frequently result in noisy data. It is important to limit primer amount to 5 pmol when sequencing PCR fragments.
Finally, if your primer is starting to degrade, or if there is a high proportion of n-1 products in your oligonucleotide preparation, you will also see peaks under peaks since you’re essentially adding multiple primers to the reaction. Primers can last a long time, but they can also degrade and it’s impossible to set an expiration date that covers all primers. As noted above, we can analyze your primer by mass spec as a fee for service, but in the interests of time it may be worth just having a new one synthesized.
In several years of facility operation, we have not seen a clear cut demonstration of the ABI BigDye reagent being a culprit in sequencing difficulties. By “clear cut” we mean a situation where newly acquired reagent worked, but the existing reagent in the lab did not in a parallel experiment. Should any lot of BigDye show difficulties, it would soon be apparent in many labs, including our own, and we would attempt to notify all users right away. Like any enzymatic mix, however, BigDye should be handled with care (keep in an ice bucket during use, don’t leave on benchtop overnight, etc.). Repeated freeze thaw cycles should be avoided, though in pilot experiments carried out in the lab, five freeze thaw cycles led to a barely noticeable decrease in activity. Depending on lab usage, it may be advantageous to re-aliquot what you get from us to minimize this as an issue. The mix stores well at –20°C, though for long term storage (>3 months or so) it can be kept at –80°C.
As stated in our protocol sheets, the amount of reagent to add to your reaction can vary. The “official” reaction recommended by ABI uses 8 ul of reagent in a 20 ul reaction, but even they acknowledge 4 ul works just as well. We have found that 2ul in a 20 ul reaction also works for most templates, and our standard conditions use that amount (see http://dna.biotech.wisc.edu/documents/Facility_Procedures.htm). We recommend experimenting with this in your system so you can get the best data for the least cost. If you decide to try doing parallel reactions with varying amounts of enzyme, we can give you an “R and D” price break on those samples. If you're doing a large # of samples and cost is critical, you might also check out the document "BigDye Dilution Experiment" at DNA blackbox that pushes the amounts of enzyme pretty low while still getting decent results (http://dna.biotech.wisc.edu/documents/BigDyeDilution.htm).
There is a specialized version of the BigDye sequencing reagent that contains dGTP instead of the dITP present in the normal formulation. This is used specifically for templates with problem regions of secondary structure. Such regions are manifested by a chromatogram that looks great up to a certain point, then falls off precipitously. If you are seeing this sort of pattern, we can provide you with an aliquot of the dGTP mix to try. It typically works best in a mixture with the standard BigDye, at a ratio of 3:1 or 2:2 normal:dGTP (it varies from template to template), and should also be used in conjunction with 1M betaine and elevated extension temperature (68 or 72°). It may take a few attempts to get good data using this reagent. A problem that can occur with the use of dGTP is “compressions,” evident as peaks that migrate slightly out of register due to annealing within the extension product. Sometimes the sequence needs to be edited manually if the basecaller can’t resolve these patterns.
Other Reaction Components
We provide you with the sequencing buffer shipped by ABI with every enzyme order. Older versions used a 2.5X dilution buffer comprised of 200 mM Tris pH 9.0, 5 mM MgCl2. The new buffer formulation is different and the recipe is a carefully guarded secret. We (and others) have seen that the new buffer does give better results than the old dilution buffer. The new buffer is at 5X strength; the enzyme mix itself is formulated in half buffer, so for a 20 ul reaction you should add 3 ul 5X buffer and 2 ul enzyme mix, 1 ul of which is 5X buffer. This gives 4ul total 5X buffer in 20 ul--perfect! Other commercially available additives are sold as “sequencing reaction enhancers.” Some do work marginally better than the buffer in maintaining signal strength, especially at very low enzyme dilutions (1ul in 20 ul reactions). However, none are cheap, and our calculations indicate no one would save money through using them. However, some users could conceivably benefit. If you’re curious about these, contact us and we can arrange to give you a limited amount of what we have on hand. Also, many companies are willing to distribute a small sample to try. Following that, it would be up to individual labs to purchase what they wanted.
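The buffer arithmetic above (3 ul of 5X buffer plus 2 ul of enzyme mix in 20 ul) can be generalized as a small sketch. It assumes, as the text states, that half the volume of the enzyme mix counts as 5X buffer; applying this to other reaction or enzyme volumes is an extrapolation, and the function name is hypothetical.

```python
def extra_5x_buffer_ul(reaction_ul, enzyme_mix_ul):
    """Volume of 5X buffer to add so the reaction ends up at 1X.

    Assumes half of the enzyme mix volume is equivalent to 5X buffer.
    """
    needed = reaction_ul / 5.0         # total 5X buffer for a 1X reaction
    from_mix = enzyme_mix_ul / 2.0     # 5X buffer already supplied by the mix
    extra = needed - from_mix
    if extra < 0:
        raise ValueError("enzyme mix alone exceeds 1X buffer for this volume")
    return extra


print(extra_5x_buffer_ul(20, 2))  # 3.0, matching the worked example in the text
```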
Other non-commercial reagents can work well in getting refractory templates to sequence better. DMSO is a favorite additive of sequencers across the country. We have gone back and forth on adding it here; we use DMSO in 90% of our reactions in house. One issue is we have seen it go bad and actually inhibit the reactions, so you'll want to keep your stocks fresh. But it can help, and it's cheap. Stick with 1 ul per 20 ul reaction (final concentration of 5%). Also, people have reported that the addition of formamide to a final concentration of 5% can help when secondary structure is a problem (a diagnostic chromatogram for secondary structure in a template is good looking data that abruptly terminates); we have no data on this. As mentioned above in the dGTP secondary structure discussion, a chemical called betaine can enhance sequencing of these abruptly terminating clones, and can also help other template related problems, including nucleotide repeats. Betaine is used by making up a 5M stock, then adding it to a final concentration of 1-2 M (we typically use 1M though some protocols call for more) in your sequencing reaction. It's also commercially available in mol biol grade at 5M concentration from Sigma. Single strand binding protein (SSBP), available from Promega, has been reported to help sequencing of refractory secondary structure templates, used at a concentration of 1 ug/20 ul reaction. We have no direct data with this additive.
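The additive volumes quoted above follow from the usual dilution relation (stock concentration times stock volume equals final concentration times final volume). A minimal sketch, with a hypothetical function name and any consistent units for concentration:

```python
def stock_volume_ul(stock_conc, final_conc, reaction_ul=20.0):
    """Volume of stock needed to reach final_conc in a reaction of reaction_ul.

    stock_conc and final_conc must be in the same units (for example % or M).
    """
    if stock_conc <= 0 or final_conc < 0:
        raise ValueError("concentrations must be positive")
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc * reaction_ul / stock_conc


print(stock_volume_ul(100, 5))  # neat DMSO to 5% in 20 ul -> 1.0 ul
print(stock_volume_ul(5, 1))    # 5M betaine to 1M in 20 ul -> 4.0 ul
```

Note that 1M betaine takes 4 ul of a 20 ul reaction, which leaves less room for template and is another reason to keep template preps concentrated.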
In the "Helpful Documents" section of our web site is an older document entitled “Cycle sequencing sample protocols” that covers a number of sequencing protocols successfully used by us over the years. However, over time we've narrowed our manipulations to a couple of standard cycling protocols. Different protocols seem to work better with different primer/template concentrations, but we haven't developed definitive rules. We do give you a set of recommended conditions to start that has been broadly successful in our hands. If you get sub-par data with these conditions you may see an improvement using one of the other protocols. However, an important point to note is that there are a variety of conditions that will work, so if you're having trouble with your sequencing it might be worth checking other factors first before playing too much with cycling conditions (check out the document "Available Controls" at http://dna.biotech.wisc.edu/documents/Available_Controls.htm). Our recommended starting conditions are as follows:
96° 2' hot start, then 35 cycles of 96° 10", 52° 15", 60° 3', followed by one cycle of 72° 1'.
Notes on these conditions: the hot start is PCR holdover, and is probably overkill. It’s common for us to limit this to one minute or so, and it may be entirely unnecessary. 35 cycles may be slight overkill too, but we do it to maximize signal. In our facility PCR machine space is not at a premium so it’s usually not a big deal to run the extra cycles. However, if space is restricted in your lab, in the interest of lab harmony you may want to experiment with 25 cycles.
Experiments altering the annealing temperature have not yielded clearcut results. As an intuitive rule of thumb, it's probably better to stay at 52° with 17 or 18 mers, but we also use 55° with indistinguishable results. Another set of conditions combines the annealing and extension steps into a single step done at 58° for 4'. This last protocol has given good data getting through di- and tri-nucleotide repeats. We recommend keeping the extension step at least 3' to avoid a drop off in signal as products get longer (which is always something of a problem with capillary sequencers). However, for PCR fragments or if you only need 500-600 bases, you can shorten the extension to 2' in the interest of speed.
Finally, the 72° 1’ is another PCR holdover that should be considered optional. We will often pull reactions out of the machine by the time they have gone > 30 cycles in the interest of getting data to our users, and these reactions that have bypassed the final extension look fine (or if they don’t, some other problem is responsible).
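When deciding whether to trim cycles or shorten the extension, it can help to see how much machine time each choice costs. The sketch below adds up only the programmed hold times for the recommended conditions; ramping between temperatures is ignored, so real runs take longer. The function and parameter names are hypothetical.

```python
def hold_time_minutes(cycles=35, denature_s=10, anneal_s=15,
                      extension_s=180, hot_start_s=120, final_s=60):
    """Sum of programmed hold times in minutes (ramp times not included)."""
    per_cycle = denature_s + anneal_s + extension_s
    total_s = hot_start_s + cycles * per_cycle + final_s
    return total_s / 60.0


print(round(hold_time_minutes(), 1))                 # recommended conditions
print(round(hold_time_minutes(cycles=25), 1))        # 25 cycles instead of 35
print(round(hold_time_minutes(extension_s=120), 1))  # 2' extension
```

Dropping to 25 cycles or to a 2' extension each saves roughly half an hour of hold time, which is the trade-off against signal strength discussed above.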
Post Sequencing Cleanup
Following cycle sequencing, reactions must be cleaned up to remove excess dye terminators which don’t incorporate into DNA. Ideally, all the excess terminators will be removed and all the extension products will remain. This can be a common source of problems for individuals, either in terms of getting too many unincorporated nucleotides in the sample, or in sample loss during cleanup resulting in lower signal. Low signal is not necessarily a huge problem because the instrumentation is sensitive enough to extract useful information. It can mean that if you are on the threshold of observable signal, any slight decrease will lead to no data at all; there is less “wiggle room.” Another issue is that low signal generally doesn’t look good out past 600 bp or so. From the capillary you should ideally get about 800 bp of readable sequence, and this usually depends on starting out with strong signal (because of the way the capillary instruments operate, signal strength will usually fall over the course of the sequence). Another diagnostic sign of cleanup problems is the so-called “dye blobs.” In the capillary they can be seen at different places in the chromatogram: sometimes early, at about 70 bp, and also at about 300 bp (check out the examples from Roswell Park or University of Chicago). They appear as off-scale, or very tall, broad peaks stretching over what may be other legible peaks below. All colors are sometimes seen, but often they are comprised of just red (the latter resulting from a breakdown product of the T terminators). These peaks confuse the basecalling software, and lead to inaccurate sequence data. Often the true peaks are seen below and the sequence can be manually edited. However, depending on the signal strength of the other bases, and the severity of the dye blobs, all the T-signal can get “sucked up” by the dye blobs and the chromatogram will appear to have no other T’s (this is a function of the way the software works). Ideally, no dye blobs beyond small ones appearing early in the run should be seen.
The three most commonly used methods of dye terminator removal are size exclusion columns (i.e., Sephadex G-50 or G-75, or commercially available columns), alcohol (isopropanol or ethanol) precipitation, and magnetic bead adsorption/elution. Protocols for these techniques are available at DNA blackbox in the "Helpful Documents" section (http://dna.biotech.wisc.edu/documents/Non-bead_cleanups.htm). All procedures have been successfully utilized by our users and by us. However, in our experience the most reproducible and foolproof method is magnetic bead cleanup. The University of Wisconsin has a quote from the company that makes the product (Beckman Coulter Genomics (Agencourt Biosciences)), and it can be purchased directly from them for a substantial discount (ordering information is also in the “Facility Procedures” document, http://dna.biotech.wisc.edu/documents/Facility_Procedures.htm). It costs about $0.30/reaction, so it’s more expensive than precipitation or homemade columns, but less than most commercial columns. Nevertheless, it’s fast, easy, quantitative (i.e., almost no signal is lost) and almost always yields great data. We are always willing to instruct users in this technology: you can bring over a few samples and we’ll process them for free while you observe the procedure. This procedure also integrates particularly well with the capillary sequencers handling most of our workload. We also carry out sequencing cleanup as a service, using the magnetic beads.
“Finally!” you’re saying, “something that’s not my fault!” It’s true, once you drop off your sample it’s up to us to get you back data that looks good, if possible. Fortunately most of the problems that occur at our end result in characteristic trace files, and most can be dealt with fairly easily. The important thing is to let us know as soon as possible when you see one of these patterns. This is particularly important in those cases where the sample may need to be rerun, since we only keep the remaining sample for a week. Diagnostic chromatogram appearances and what to do about them include:
Good signal, but peaks are shifted relative to each other and basecalling results in many N’s. This suggests the data were analyzed using the mobility files from the wrong version of BigDye. We can easily reanalyze these files with the correct input mobilities.
Trace file starts right up in data (i.e., no flatline “leader” sequence), resulting in loss of the beginning of expected sequence. This is a software issue; the automatic analysis has just begun basecalling at an inappropriate point. We can manually set an earlier analysis beginning point that will recover the full sequence.
Trace file starts out well, but turns into “rolling hills” of color, or starts right out with rolling hills. This is either a capillary or a sample issue, but since it’s impossible to tell, (though if it starts out nice it’s usually the capillary), we will re-run such samples if we notice this or if requested (you have a week to tell us). Charged contaminants in the sample will lead to a pattern of rolling peaks throughout the trace file. If several samples from a particular user show the same pattern, it’s not the capillary and you will need to alter your procedures to avoid the problem in the future. In the meantime we’re usually willing to do re-runs.
Trace file looks good but there are one or several sharp multi-color spikes throughout the sequence. This indicates bubble(s) in the capillary and a re-run will help.
Trace file looks good but neither you nor anyone in the lab works on that organism. You got the wrong sequence, baby! We need to know this right away since it’s a sure bet someone else got the wrong sequence too. We can often figure out where the mistake was and get you the correct sequences without a rerun, within a few hours or less. A re-run to make sure can be done if requested. If you get sequence that’s from your organism but doesn’t seem to be what your primer should have given you, or you weren’t expecting exactly that PCR fragment but it’s close, please look over all your manipulations before calling. If it’s not one of the software or switched tube issues, chances are it’s something at your end.
Depending on the traceviewing program you're using to look at your chromatogram, you can find different information about your run. Click around the different windows and/or menu choices and see what you come up with. Of particular use is the relative signal strength--given in arbitrary fluorescent units for each base. Ideally you want to be in the high hundreds or low thousands; this information can be valuable when deciding how to get better data.
We want everyone to get the best sequencing data possible. While there are instances where fluorescent sequencing with BigDye just isn’t going to work, these are rare. Don’t be satisfied with “just OK” sequencing data. If you have used our controls, have gone over the relevant parts of this guide, and continue to have problems, bring your template(s) to us and we’ll try it free of charge. We don’t want people to waste time and money on sequencing when there are more important questions to answer.