Analysis and use of clones and sequences

©2000 written by Gary Roberts, edited by Timothy Paustian, University of Wisconsin-Madison



Rather than provide a description of sequencing technique, this section will touch on the requirements for sequencing and its uses. At the present time, sequencing methods are adequate to analyze 100-300 base pair stretches of DNA at a time. The rate of advances in technique has been so great, however, that this limit will probably be increased significantly in the near future. Similarly, until quite recently, any DNA to be sequenced needed to be cloned so that the region could be amplified. Now, however, the technique of PCR threatens to change the way we think about sequencing.

More important than considerations of specific methods is the question, what information is gained from sequence analysis?

  1. It provides a strong suggestion about the position of transcription start and stop sites since such sites often have recognizable sequences. The biological reality of sites so recognized can be confirmed by other sorts of analyses.

  2. It allows the recognition of regions of DNA that encode products that are non-essential for the phenotype under analysis, since these will still have open reading frames. You would not detect these genes by classical genetics.

  3. It may indicate, by the location of translational start and stop sites, the size and amino acid composition of the newly synthesized protein, so that differences from the protein as isolated would reflect protein processing.

  4. It allows an analysis of the amino acid sequence for hydrophobic and hydrophilic regions, as well as more sophisticated modeling which notes alpha-helices and more complicated structures (motifs).

  5. The sequence can be compared to the large and growing sequence data bank where similar sequences would suggest related functions of the encoded products (but see comments under "homology", section I C2).

  6. The sequence analysis of point mutants coupled with the biochemical analysis of the defective product allows the determination of the functional domains within the protein product.

  7. The sequence facilitates the employment of site-directed mutagenesis where, in distinction to traditional genetics, the genotype is changed specifically and then the phenotype is examined (see below).
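Several of the points above (finding open reading frames, predicting the size and amino acid composition of the product, and scanning for hydrophobic regions) can be illustrated with a minimal Python sketch. It is not a description of any particular analysis package. It assumes the standard genetic code, ATG as the only start codon, an input containing only A, C, G and T, and a scan of the forward strand only. The function names, the example sequence and the window size are hypothetical, and the hydropathy scale used is that of Kyte and Doolittle.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
STOPS = {codon for codon, aa in CODON_TABLE.items() if aa == "*"}

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def find_orfs(dna, min_codons=3):
    """Return (start, end) pairs for ATG...stop ORFs in the three forward frames.

    `end` is the index just past the stop codon. Assumption: ATG is the only
    start codon and only the given strand is scanned.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def translate(orf_dna):
    """Translate an ORF with the standard code, dropping the terminal stop."""
    orf_dna = orf_dna.upper()
    protein = "".join(
        CODON_TABLE[orf_dna[i:i + 3]] for i in range(0, len(orf_dna) - 2, 3)
    )
    return protein.rstrip("*")


def hydropathy_profile(protein, window=3):
    """Average Kyte-Doolittle score over each sliding window of residues."""
    scores = [KD[aa] for aa in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy sequence, far shorter than any real gene.
dna = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(dna):
    protein = translate(dna[start:end])
    print(start, end, protein, dict(Counter(protein)), hydropathy_profile(protein))
```

The window of three residues is chosen only to suit the toy sequence; a real hydropathy scan would use a considerably larger window. The predicted protein is that of the newly synthesized product, so any difference from the protein as isolated would still need to be interpreted as processing.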

The sequence does not necessarily identify functional genes, since there are a number of cases where an "open reading frame" does not correspond to the region encoding a gene product. These include "stop signals" that actually code for amino acids, cases of "natural" frameshifting, non-standard start codons, cases of ribosomes "jumping" over 2-20 codons in an mRNA, and cases of cryptic genes. Sequence analysis also does not identify biochemical functions, since the presence of "motifs" does not prove that such sites are important for the biological function of the gene product. For these reasons, while sequence provides very powerful insights into probable function, it is careless to draw either biochemical or genetic conclusions based solely on sequence analysis.
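The dependence of a predicted reading frame on the assumptions fed into the analysis can be made concrete with a small Python sketch. The function name `first_orf`, its parameters and both toy sequences are hypothetical. The sketch only scans frame 0 of the given strand and is meant to show how the prediction moves when the assumed start or stop codons change, not to model frameshifting or ribosome "jumping".

```python
STANDARD_STOPS = ("TAA", "TAG", "TGA")


def first_orf(dna, starts=("ATG",), stops=STANDARD_STOPS):
    """Return (start, end) of the first ORF in frame 0, or None if there is none.

    `starts` and `stops` are the codons the analysis is willing to treat as
    translational start and stop signals (an assumption, not a fact about
    the organism).
    """
    dna = dna.upper()
    start = None
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if start is None:
            if codon in starts:
                start = i
        elif codon in stops:
            return (start, i + 3)
    return None


# Hypothetical sequence with a GTG upstream of the first ATG.
seq_a = "GTGCCCATGAAATAA"
print(first_orf(seq_a))                                # ATG-only prediction
print(first_orf(seq_a, starts=("ATG", "GTG", "TTG")))  # allow alternative starts

# Hypothetical sequence with an internal TGA.
seq_b = "ATGAAATGACCCTAA"
print(first_orf(seq_b))                                # TGA read as a stop
print(first_orf(seq_b, stops=("TAA", "TAG")))          # TGA read as a codon
```

In the first pair of calls the predicted product gains two residues at its amino terminus once GTG is accepted as a start; in the second pair it gains residues at its carboxyl terminus once TGA is read through. Which prediction is biologically real cannot be decided from the sequence alone.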


There are two rather different reasons for generating mutations in your cloned region. The first is to help identify which region encodes products relevant to the phenotype under analysis and the second is to perform a precise "structure/function" analysis of the encoded products. For the former goal, loss-of-function mutations without polarity problems are best. These might include in-frame deletions or small in-frame insertions, which would be analyzed for any effect on the phenotype of interest. Structure/function analysis would almost certainly require site-directed or localized mutagenesis, coupled with a biochemical analysis of the affected protein product.
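Why in-frame deletions are preferred can be seen in a short Python sketch comparing a deletion of three bases with a deletion of two. The helper names and the toy gene are hypothetical, and the standard genetic code is assumed. A deletion whose length is a multiple of three removes whole codons and leaves the downstream reading frame, including the natural stop codon, intact; any other length shifts the frame.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from position 0 up to (not including) the first stop codon.

    A trailing partial codon is ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_region(dna, start, length):
    """Return the sequence with `length` bases removed beginning at `start`."""
    return dna[:start] + dna[start + length:]


def is_in_frame(length):
    """A deletion or insertion keeps the reading frame if its length is a multiple of 3."""
    return length % 3 == 0


# Hypothetical toy gene: ATG AAA CCC GGG TTT TAA
wild_type = "ATGAAACCCGGGTTTTAA"

for length in (3, 2):
    mutant = delete_region(wild_type, 3, length)
    print(length, is_in_frame(length), translate(wild_type), translate(mutant))
```

With the three-base deletion the product simply lacks one residue. With the two-base deletion every downstream codon is misread and the original stop codon is no longer in frame, so translation would continue into whatever sequence follows.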

Whatever the method of mutagenesis, the resulting mutated clones will need to be analyzed for interesting phenotypes, either by examining the products of the clone directly or by introducing the cloned region into the chromosome to replace the wild-type version. The latter is clearly preferable in terms of its physiological "correctness", but is significantly more time-consuming. A good strategy would be to analyze the products of the mutated clones and then move any "interesting" alleles into the chromosome for better characterization.


Knowledge of the primary structure of a nucleic acid and its encoded product, coupled with the ability to rearrange sequences precisely, allows the production of novel products in novel amounts. The possibilities are vast, but two general examples illustrate them. First, whereas the fusion proteins described in section X are used to monitor regulation, more precise gene fusions can be constructed that produce products retaining the activities encoded by both fused genes. Second, constructions can place the expression of a gene under a novel promoter, for example one that can be regulated by temperature or by the addition of a small molecule to the culture medium. This allows the experimenter to produce large quantities of a product that is toxic to the cell by growing the culture to high cell density and then triggering expression of the gene when cell growth is no longer necessary. This can be particularly useful in the site-directed mutagenesis described immediately above, since large amounts of the altered product are available for easy purification and characterization. As a small note of caution, proteins that are dramatically overproduced are occasionally found to be "odd" in some way because of their overproduction, so that a blind analysis of their properties would yield a misleading result.

[See sample problems 23-24]

This page was last built with Frontier and Web Warrior on a Macintosh on Thu, Sep 21, 2000 at 1:01:15 PM.