Microbiology and Molecular Biology Reviews, June 2001, p. 261-287, Vol. 65, No. 2
Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Universidad Autónoma, Canto Blanco, 28049 Madrid, Spain
1092-2172/01/$04.00+0 DOI: 10.1128/MMBR.65.2.261-287.2001
Copyright © 2001, American Society for Microbiology. All rights reserved.
φ29 Family of Phages
SUMMARY
INTRODUCTION
GENERAL FEATURES OF PHAGES φ29, B103, AND GA-1
SEQUENCE ANALYSIS OF THE GA-1 GENOME
GENETIC AND TRANSCRIPTIONAL ORGANIZATION
TRANSCRIPTIONAL REGULATION
Early Promoters A2b and A2c and Late Promoter A3: Transcriptional Regulation by Proteins p4 and p6
Early Promoter C2: Transcriptional Regulation by Protein p6
Early Promoters C1, C1a, and C1b Present in φ29, B103, and GA-1, Respectively
Promoter A1, Driving Synthesis of the pRNA
Other Promoters in the φ29 Genome
Other Promoters in the GA-1 Genome
TRANSCRIPTIONAL TERMINATION
PROTEIN-PRIMED MECHANISM OF DNA REPLICATION
INITIATION OF DNA REPLICATION
DNA Polymerase-TP Heterodimer Formation
Sliding-Back Mechanism
Transition from Protein-Primed to DNA-Primed Replication
THE FOUR MAIN PROTEINS REQUIRED FOR IN VITRO DNA REPLICATION
DNA Polymerase
C-terminal domain of φ29 DNA polymerase.
N-terminal domain of φ29 DNA polymerase.
(i) Proofreading.
(ii) Strand displacement.
Coordination between synthesis and degradation.
Terminal Protein p3
DBP Protein p6
SSB Protein p5
OTHER GENES AND OPEN READING FRAMES DOWNSTREAM OF GENE 2 IN φ29 AND B103
Gene 1 of φ29
GA-1 OPERONS CONTAINING OPEN READING FRAMES M-O AND P-T
EARLY OPERON LOCATED AT THE RIGHT SIDE OF THE PHAGE GENOMES
Gene 17
Gene 16.7
LATE OPERON
Gene 8.5, Encoding the Head Fiber Protein
Structural Phage Proteins and φ29 Phage Morphogenesis
Prohead formation.
DNA translocating/packaging machine.
(i) Connector.
(ii) pRNA ring.
(iii) ATPase protein p16.
Putative mechanism of φ29 DNA packaging.
Phage maturation.
Lysis cassette.
(i) Holin-encoding genes of φ29, B103, and GA-1.
(ii) Peptidoglycan hydrolase-encoding genes of φ29, B103, and GA-1.
CONCLUSIONS AND FUTURE PERSPECTIVES
ACKNOWLEDGMENTS
REFERENCES
SUMMARY
Continuous research spanning more than three decades has made the Bacillus bacteriophage φ29 a paradigm for several molecular mechanisms of general biological processes, such as DNA replication, regulation of transcription, phage morphogenesis, and phage DNA packaging. The genome of bacteriophage φ29 consists of a linear double-stranded DNA (dsDNA), which has a terminal protein (TP) covalently linked to its 5' ends. Initiation of DNA replication, carried out by a protein-primed mechanism, has been studied in detail and is considered to be a model system for the protein-primed DNA replication that is also used by most other linear genomes with a TP linked to their DNA ends, such as other phages, linear plasmids, and adenoviruses. In addition to continuing progress in unraveling the mechanism of initiation of DNA replication and the role of the various proteins involved in this process, major advances have been made during the last few years, especially in our understanding of transcription regulation, the head-tail connector protein, and DNA packaging. Recent progress in all these topics is reviewed. In addition to φ29, the genomes of several other Bacillus phages consist of a linear dsDNA with a TP molecule attached to their 5' ends. These φ29-like phages can be divided into three groups. The first group includes, in addition to φ29, phages PZA, φ15, and BS32. The second group comprises B103, Nf, and M2Y, and the third group contains GA-1 as its sole member. Whereas the DNA sequences of the complete genomes of φ29 (group I) and B103 (group II) are known, only parts of the genome of GA-1 (group III) had been sequenced. We have determined the complete DNA sequence of the GA-1 genome, which allowed the analysis of differences and homologies between the three groups of φ29-like phages that is included in this review.
INTRODUCTION
The genus Bacillus incorporates many species of gram-positive, aerobic, endospore-forming bacteria that normally inhabit the soil or decaying plant material. In these habitats, a large variety of phages have been isolated that infect bacilli. All of the phages isolated so far have some common features. First, they all contain double-stranded DNA (dsDNA), and second, the virions have prolate icosahedral heads and are tailed. Modern phage taxonomy is based on properties of the virion and its nucleic acid (see references 74 and 131). The order of tailed phages, named Caudovirales, is classified into three families: Myoviridae (phages with contractile tails), Podoviridae (phages with short tails), and Siphoviridae (phages with long noncontractile tails). For a general review on tailed bacteriophages, see reference 4. In addition to taxonomy based on properties of the virion and its nucleic acid, phages can be divided into three groups based on their infection cycle. The first group contains lytic phages that complete their life cycle within a well-defined period after infection and are unable to lysogenize their host. The second group is formed by the so-called pseudo-temperate phages. These are virulent phages with an extended and irregular latent period. Although this stage mimics lysogeny, it does not involve a stable prophage. The third group contains the temperate phages. The genomes of these phages are able to integrate into the host genome and can be maintained in this lysogenic stage for many generations. Generally, during this stage, the cells are immune to infection with the same phage.
This review specifically focuses on the φ29-like genus of phages, which includes, in addition to φ29, phages PZA, φ15, BS32, B103, M2Y (M2), Nf, and GA-1. They are all lytic phages that belong to the Podoviridae family. Most of these phages infect Bacillus subtilis, but often they also infect other related species such as Bacillus pumilus, Bacillus amyloliquefaciens, and Bacillus licheniformis. Phages of this genus have been subclassified into three groups based on serological properties, DNA physical maps, peptide maps, and partial or complete DNA sequences (164, 220, 222). The first group includes phages φ29, PZA, φ15, and BS32; the second group includes B103, Nf, and M2Y; and the third group contains GA-1. Interestingly, the classification of these phages coincides with their geographical distribution. Thus, the phages belonging to group I were isolated in the United States (169), those belonging to group II were isolated in Japan (91, 113, 191), and GA-1 (group III) was isolated in Europe (39).
The genomes of the φ29-like phages consist of a linear dsDNA molecule of about 20 kb that has a phage-encoded protein, named terminal protein (TP), covalently attached at each 5' DNA end. The DNA sequences of the complete genomes of φ29 and PZA (83, 84, 161, 216, 221, 224), belonging to group I, and that of B103 (163), belonging to group II, have been determined. However, only parts of the sequence of GA-1, which belongs to group III, have been determined so far (78, 86, 111, 114, 222). To gain a comprehensive understanding of the relatedness of the three groups of phages, we have determined the complete DNA sequence of GA-1. The genomes of φ29 and B103 are 19,285 and 18,630 bp, respectively (163, 216). However, the GA-1 genome was reported to be approximately 21.5 kb (220, 223). Thus, an additional incentive to determine the complete DNA sequence of GA-1 was to gain insight into possible additional coding sequences present on the GA-1 genome.
Phage φ29 has been subject to extensive studies, and the results have led to the understanding of several molecular mechanisms of general biological processes, such as DNA replication, regulation of transcription, phage morphogenesis, and phage DNA packaging. These various topics will be discussed in this review, and attention is focused specifically on progress made during the last few years. In general, the views presented are based on results obtained with φ29, since most studies concerned analysis of this phage. In addition, an integrated overview of homologies and differences between the three groups of the φ29 genus based on the complete DNA sequences of φ29 (group I), B103 (group II), and GA-1 (group III) is presented.
GENERAL FEATURES OF PHAGES φ29, B103, AND GA-1
The φ29-like phages are the smallest Bacillus phages isolated so far and are among the smallest known phages containing dsDNA. The sizes of the phage particle of each of the three groups of φ29-like phages are shown in Table 1.
Phage φ29 was first isolated by Reilly (169) from garden soil. Phage B103 was first isolated from a nonspecified lysed Bacillus culture (91), and phage GA-1 was first isolated by Bradley (39) from rotting lawn mowings. Electron microscopy analysis showed that the phage particles of φ29, B103, and GA-1 have a sixfold radial symmetry and a short noncontractile tail tube. A schematic representation of a φ29 phage particle in which each protein is indicated is shown in Fig. 1. Analysis of the host range showed that φ29 was able to infect B. subtilis strains 168, 110NA, and Marburg, B. amyloliquefaciens H, and several strains of B. licheniformis and B. pumilus (reviewed in reference 177). The host range of B103 has not been studied, but it is known to infect B. subtilis 9/3 (163). Finally, GA-1 was shown to infect lytically Bacillus species strain G1R (39). Arwert and Venema (10) showed that GA-1 is unable to infect the standard B. subtilis strain 168. Although sequence analysis of G1R 16S rRNA showed that Bacillus strain G1R is most closely related to B. pumilus, GA-1 is unable to infect B. pumilus strains BP1 or B205-L (J. A. Horcajadas, unpublished results). Therefore, the species identification of Bacillus strain G1R, the specific host of GA-1, remains unclear.
SEQUENCE ANALYSIS OF THE GA-1 GENOME
The DNA sequences of the complete genomes of φ29 (group I) (83, 84, 216, 221, 224) and B103 (group II) (163) are known, and they consist of 19,285 and 18,630 bp, respectively. However, only noncontinuous parts of the GA-1 genome had been sequenced before. These include (i) the left (168 bp; GenBank accession number M19512) and right (168 bp; M19519) terminal nucleotide sequences (222); (ii) the central region containing the early promoters A2b and A2c and the late promoter A3 (354 bp; AJ133524) (111); (iii) gene 6 encoding the dsDNA binding protein (DBP) p6 (342 bp; AF148209) (78); (iv) gene 5 encoding the single-stranded DNA (ssDNA) binding protein (SSB) p5 (513 bp; AJ244026) (86); (v) gene 4 encoding the transcriptional regulatory protein p4G (405 bp; AJ133525) (111); (vi) the region spanning genes 3 and 2 encoding the TP (p3) and DNA polymerase (p2), respectively (2,668 bp; X96987) (114); and (vii) a region downstream of gene 2 (549 bp; AJ294726) (A. Bravo and M. Salas, unpublished data). Genes 6 through 2 lie in the same order as the corresponding genes of phages φ29 and B103. Where possible, the individual sequences were integrated into larger contigs, and gaps were filled using primers based on the published sequences and purified GA-1 DNA as the template. Next, the remaining part (~17 kbp) of the GA-1 genome sequence was determined on both strands by a primer-walking strategy using purified phage GA-1 DNA as the template.
The genome of GA-1 was shown to have a total size of 21,129 bp. The complete nucleotide sequence has been deposited in the EMBL/GenBank/DDBJ nucleotide sequence database and was assigned accession number X96987. Whereas the G+C content of GA-1 is 34.7%, those of φ29 and B103 are 40.0 and 37.7%, respectively. Next, computer-assisted and manual analyses of the DNA sequence were used to identify open reading frames (ORFs), direct and inverted repeats, and putative promoters, ribosomal binding sites, and Rho-independent transcriptional terminators. The deduced amino acid sequences of the various ORFs were compared with protein sequences present in the φ29 and B103 genomes as well as with those present in available databases. In cases where the deduced amino acid sequences of the identified ORFs or genes showed significant homology to those of the φ29 and B103 genes, they were given numbers according to the nomenclature used for these phages. The remaining ORFs were identified with letters. The data obtained were used to construct a putative genetic and transcriptional map of GA-1, which is shown, together with those of φ29 and B103, in Fig. 2. This figure shows that genes 2 to 6, 7 to 16 (with the exception of gene 8.5, which is lacking in GA-1), and 17 and 16.7 are conserved in all three genomes. Characteristics of the proteins synthesized by these GA-1 genes and their levels of similarity to corresponding proteins of φ29 and B103 are given in Table 2, which shows that for all the homologous genes shared by φ29, B103, and GA-1, those of GA-1 are less conserved than those of φ29 and B103. This confirms that within the family of φ29-related phages, GA-1 is the most distantly related one, as suggested previously (164, 220, 222). Features of the putative proteins synthesized by the GA-1 ORFs are given in Table 3.
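The G+C percentages quoted above are plain base-composition fractions. As a minimal sketch of how such a value is computed (the function name and the toy sequence are illustrative assumptions; the toy string is not taken from any of the phage genomes):

```python
def gc_content(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N, is ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical toy sequence, for illustration only.
toy = "ATGCATTTAACG"
print(f"G+C content: {gc_content(toy):.1f}%")
```

Applied to a full genome sequence retrieved from a public database, the same function would be expected to give a value close to the percentage reported in the text, provided the same sequence version is used.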
GENETIC AND TRANSCRIPTIONAL ORGANIZATION
Generally, genes with related functions are clustered in phage genomes (4), and Fig. 2 shows that φ29, B103, and GA-1 are no exception to this rule. In addition, Fig. 2 shows that in most aspects, the genomes of φ29, B103, and GA-1 are similarly organized. In all three genomes the genes and ORFs are organized in operons. Depending on the time when they are first expressed during the infection cycle, these can be divided into early and late operons. In all three genomes the early-expressed operons are transcribed leftward and the single late-expressed operon is transcribed rightward. The genes present in the late operon (genes 7 through 16), which is located in the central part of the genome, encode phage structural proteins, proteins involved in phage morphogenesis, and proteins required for lysis of the host. All three genomes contain an early-expressed operon that is divergently transcribed with respect to the late operon (Fig. 2). Genes 6, 5, 3, and 2 of this operon encode the four main proteins required for phage DNA replication. The operon also contains gene 4, which encodes the transcriptional regulator protein. In addition to its role in phage DNA replication, protein p6 also has a role in transcriptional regulation (14, 69, 219). Note that this operon of GA-1 is smaller than the corresponding ones of φ29 and B103. Another early-expressed operon is located at the right side of the phage genomes. However, as described in more detail later, only two genes of this operon, 17 and 16.7, are conserved in all three phage genomes. Finally, another feature shared by all three phages is the presence of a region located in the left part of the genome that encodes an RNA (pRNA) which is required for packaging of phage DNA.
The genome of GA-1 is about 1.8 and 2.5 kb larger than those of φ29 and B103, respectively. Although the structural organization of the GA-1 genome is similar to that of φ29 and B103, it contains additional sequences, located at both genome ends, that may encode several proteins, counterparts of which are not present in the genomes of φ29 and B103 (see Fig. 2).
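The size differences stated above follow directly from the genome lengths given earlier in the text (19,285 bp for φ29, 18,630 bp for B103, and 21,129 bp for GA-1). A short worked check, in which the dictionary and function names are illustrative only:

```python
# Genome sizes in bp, as quoted in the text.
GENOME_BP = {"phi29": 19_285, "B103": 18_630, "GA-1": 21_129}


def excess_kb(larger: str, smaller: str) -> float:
    """Size difference between two genomes in kb (1 kb = 1,000 bp)."""
    return (GENOME_BP[larger] - GENOME_BP[smaller]) / 1000


for other in ("phi29", "B103"):
    print(f"GA-1 vs {other}: {excess_kb('GA-1', other):.1f} kb larger")
```

The exact differences are 1,844 bp and 2,499 bp, which round to the 1.8 and 2.5 kb given in the text.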
TRANSCRIPTIONAL REGULATION
The (putative) promoters and, where already determined, their transcriptional start sites are listed in Table 4. When appropriate, the nomenclature of the GA-1 promoters was adapted to that of φ29 and B103. Expression of most φ29 and GA-1 promoters has been studied. As indicated in Table 4, most of the promoters contain the sequence TG positioned 1 bp upstream of the -10 sequence. This additional sequence is characteristic of the so-called extended -10 promoters first described for Escherichia coli (123, 165). At least in E. coli, the extension of the -10 region is able to compensate for the absence of a good -35 box, helping the σ70 RNA polymerase to recognize and bind such promoters (123, 128, 165). The additional TG sequence is also frequently found in σA-dependent B. subtilis promoters (106, 152). Possible involvement of the TG motif in promoter strength has been recently studied for the φ29 promoters A1, A2c, and A3 (46). In all three promoters, mutation of the TG motif impaired the binding of the σA-RNA polymerase to the promoter. These and additional results support the view that the TG motif provides contact sites for B. subtilis σA RNA polymerase that are important for a specific role in the first steps of transcription (46).
The B. subtilis α-amylase promoters amyP and amyP2 contain the TGTG sequence located 1 bp upstream of their -10 region, called the -16 region. Mutation analysis of the -16 region of these promoters showed that it significantly affected the in vitro promoter strength (217). In addition, a large portion of known gram-positive bacterial promoters contain the -16 TRTG motif (in which R is a purine), suggesting that not only the -10 extended TG motif but also the -16 region is important for promoter strength (217). The -16 region is present in the following phage promoters: A1 and A2b of φ29, A1 of B103, and A1c, A3, and C2 of GA-1 (Table 4). Possible involvement of the -16 region in the activity of these phage promoters has not been studied yet.
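The two sequence features discussed above, a TG dinucleotide 1 bp upstream of the -10 hexamer and the longer -16 TRTG motif, can be located with a simple pattern scan. The following is only a sketch under stated assumptions: it requires an exact match to the consensus hexamer TATAAT (real promoters usually deviate from it), the function name is hypothetical, and the test sequences are made up for illustration rather than taken from Table 4.

```python
import re
from typing import List, Tuple

# Assumed consensus -10 hexamer; real promoters often differ at some positions.
MINUS10 = "TATAAT"


def scan_extended_minus10(seq: str) -> List[Tuple[int, bool]]:
    """Find exact TATAAT hexamers with TG positioned 1 bp upstream.

    Returns a list of (hexamer_start, has_trtg) tuples, where
    hexamer_start is the 0-based index of the first base of the hexamer
    and has_trtg is True when the TG is preceded by T plus a purine
    (A or G), giving the -16 TRTG motif.
    """
    seq = seq.upper()
    hits = []
    # The lookahead (?=...) allows overlapping candidates to be reported.
    for m in re.finditer(r"(?=TG[ACGT]" + MINUS10 + ")", seq):
        tg_pos = m.start()
        upstream = seq[max(0, tg_pos - 2):tg_pos]
        has_trtg = len(upstream) == 2 and upstream[0] == "T" and upstream[1] in "AG"
        hits.append((tg_pos + 3, has_trtg))
    return hits


# Made-up example sequences, not phage promoter sequences.
print(scan_extended_minus10("CCTGTGATATAATGG"))  # TGTG upstream of the hexamer
print(scan_extended_minus10("AACTGCTATAATCC"))   # TG only
```

A realistic scan would tolerate mismatches in the hexamer and would be anchored to mapped transcription start sites; the exact-match version is kept here only to make the geometry of the two motifs explicit.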
Early Promoters A2b and A2c and Late Promoter A3: Transcriptional Regulation by Proteins p4 and p6
As described above, the structural organization of the centrally located late operon and the divergently oriented early operon is conserved in the genomes of φ29, B103, and GA-1. In all three phage genomes the promoters that drive the expression of these early and late genes are localized in a short intergenic region between these two operons. The transcriptional regulation of these promoters has been studied extensively for φ29 (for reviews, see references 171 and 182). Two strong promoters named A2c and A2b drive the expression of the early operon of φ29 containing genes 6 to 1. The late φ29 operon is transcribed from a single promoter named A3 (16, 136, 137, 149, 197). The transition from early to late φ29 transcription is controlled by φ29 protein p4, the product of the early gene 4. Protein p4, which is a dimer in solution, binds to its cognate DNA binding sites as a tetramer (142), contacting only one side of the DNA helix (172). The intergenic region comprising promoters A2c, A2b, and A3 contains two p4 binding sites. The center of one of these is located at position -82 relative to the transcription start site of the late promoter A3 (15). Whereas this promoter contains a good consensus sequence at the -10 region for the vegetative B. subtilis σA RNA polymerase, it lacks a typical -35 box (Table 4). Therefore, the RNA polymerase alone does not bind efficiently to the A3 promoter, which explains why the downstream operon is not expressed during early infection times. Activation of the A3 promoter requires binding of protein p4 to the p4 binding site upstream of the A3 promoter. The main role of protein p4 is to stabilize the binding of RNA polymerase to the A3 promoter as a closed complex, and the protein has little effect on the rest of the steps of the initiation process (157).
The φ29 promoters A2c and A2b drive the expression of the early operon containing genes 6 to 1. Of these, promoter A2b is the one located closest to the oppositely oriented late promoter A3; promoter A2c is located proximal to gene 6. Both early promoters are repressed by protein p4. The p4 binding site that is located upstream of the late A3 promoter and is required for activation of this promoter, as described above, partially overlaps the early A2b promoter. Binding of protein p4 to this site occupies the -35 region of the A2b promoter, preventing the expression of this promoter. Thus, protein p4 activation of the late promoter A3 is accompanied by an efficient repression of the A2b promoter (172). Expression of the other early promoter, A2c, is also repressed by protein p4, but this occurs through a totally different mechanism. In addition to the p4 binding site upstream of the late promoter A3, another p4 binding site is located upstream of promoter A2c (centered at position -72 relative to the transcription start site of A2c). Protein p4 binding to this site is stabilized in the presence of RNA polymerase, indicating that the proteins bind cooperatively to the DNA. In this situation, the RNA polymerase can generate abortive initiation transcripts but is unable to escape from the A2c promoter (150). Thus, repression of the A2c promoter occurs by overstabilization of the RNA polymerase at this promoter (148). Interestingly, both repression at the A2c promoter and activation of the A3 promoter involve interaction between a region of protein p4 containing Arg120 and the C-terminal domain of the RNA polymerase α subunit (140-143, 150, 151, 171).
Recently it was demonstrated that expression of the φ29 A2c, A2b, and A3 promoters is regulated by the viral protein p6 in addition to protein p4 (69). Protein p6 is an abundantly early-expressed dsDNA binding protein that was shown previously to play an important role in initiation of phage DNA replication (see below). Elías-Arnanz and Salas (69) showed that protein p6 promotes p4-mediated repression of the A2b promoter and activation of the A3 promoter by enhancing binding of p4 to its recognition site at promoter A3. In addition, protein p4 promotes p6-mediated repression of the A2c promoter by favoring the formation of a stable p6-nucleoprotein complex that interferes with RNA polymerase binding to promoter A2c.
Although transcriptional regulation of the equivalent promoters of B103 has not been studied, conservation of the main characteristics of this region regarding the A3 and A2b promoters suggests that transcription of these promoters may be regulated in a similar way to those of φ29. Results that at least partially support this assumption come from the analysis of the corresponding region of phage Nf (147, 158), which belongs to the same group of phages as B103. First, it was shown that activation of the late A3 promoter of Nf requires the Nf-encoded protein gpF (homologue of the φ29 protein p4) (147). Second, Nuez and Salas (158) showed that activation of the Nf A3 promoter is responsive to the φ29 protein p4 in a similar way to that observed for the φ29 A3 promoter.
A first in vivo and in vitro analysis of the transcriptional regulation of the equivalent promoters of GA-1 has been reported recently (111). The in vivo activity of the GA-1 A2b and A2c promoters was shown to diminish 10 min after infection, whereas at this time the expression of the late A3 promoter increased significantly. The GA-1-encoded protein p4 (named p4G, 53% similar to φ29 p4) was purified and used to study its involvement in regulation of these promoters in vitro. As in φ29, a p4G binding site is located upstream of the late A3 promoter that overlaps with the early A2b promoter. As in φ29, binding of p4G to this site prevented the binding of RNA polymerase to the GA-1 early A2b promoter. Surprisingly, however, binding of p4G to this site had no effect on the in vitro expression of the late A3 promoter of GA-1. Both in the absence and in the presence of p4G, promoter A3 was expressed efficiently in vitro. Thus, in contrast to the situation in φ29, p4G is not required in vitro to activate the expression of the GA-1 A3 promoter. Moreover, in contrast to the φ29 protein p4, the GA-1 protein p4G was shown not to interact with the RNA polymerase α subunit (111).
Although the A3 promoter of GA-1 was active in the absence of p4G in in vitro assays, it was not active at early infection times in vivo. In addition, in vivo activation of the A3 promoter was completely blocked when protein synthesis was prevented just before infection. Together, these results suggested that the A3 promoter may be repressed in vivo by a host-encoded protein and that protein p4G may function as an antirepressor, permitting A3 expression at late infection times. Finally, it is intriguing that the GA-1 A3 promoter, which, like the A3 promoters of φ29 and B103, lacks a good -35 box, is expressed efficiently in vitro. Studies are under way to unravel the mechanisms that underlie the observed differences in regulation of the φ29 and GA-1 A3 promoters.
At present, it is unknown whether a p4-dependent repression of the A2c promoter, as described for φ29, also applies to the equivalent A2c promoters of Nf/B103 or GA-1. The fact that a typical p4 binding site is lacking upstream of the A2c promoters of B103 (163), Nf (158), and GA-1 (111) may be an indication that p4 is not involved in the repression of these promoters, at least not in a similar way to that in φ29. It is also unknown whether protein p6 of B103/Nf and/or GA-1 plays a role in the regulation of the A2c, A2b, and A3 promoters of these phages.
Early Promoter C2: Transcriptional Regulation by Protein p6
All three phage genomes contain an early-expressed operon located at the right end of their genome, whose expression is under the control of the C2 promoter (Fig. 2). For φ29 it has been demonstrated that the activity of the early promoter C2 decreases rapidly 10 min after infection (110, 122, 149). Protein p6 was shown to be responsible for in vivo and in vitro repression of promoter C2 (14, 219). Thus, the φ29 p6 protein not only plays a role in the regulation of the A3, A2b, and A2c promoters (see above) but also regulates the expression of the C2 promoter. In addition, as described below, it plays an important role in the initiation of φ29 DNA replication. Most probably, binding of p6 to the DNA ends prevents the RNA polymerase from recognizing the C2 promoter (A. Camacho and M. Salas, unpublished results). The φ29 mutant sus6(626) contains a suppressible mutation in gene 6, and therefore protein p6 is not synthesized in nonsuppressor cells infected with this mutant phage. When φ29 sus6(626) mutant phage was used for infection, phage DNA replication did occur in suppressor cells but not in nonsuppressor cells (219). However, under these conditions the C2 promoter was not repressed in either nonsuppressor or suppressor cells. It appeared that whereas the amount of p6 protein synthesized under permissive conditions was sufficient to permit in vivo φ29 DNA replication, it was too small to repress the C2 promoter in vivo (47, 219). The observation that a fairly large amount of p6 is required for repression of the C2 promoter in vitro (14, 219) supports this view.
Equivalent C2 promoters are also present in the genomes of B103 and GA-1. Like the C2 promoter of φ29, the GA-1 C2 promoter is expressed almost exclusively during the first 10 min after infection (Horcajadas, unpublished). In vitro expression of the C2 promoter of GA-1 is inhibited in the presence of purified GA-1-encoded protein p6, as well as, although somewhat less efficiently, by protein p6 of φ29. DNase I footprint analysis indicated that DNA binding of protein p6 prevents the RNA polymerase from recognizing the C2 promoter of GA-1 (Horcajadas, unpublished results). Thus, due to protein p6-mediated repression, the φ29 and GA-1 C2 promoters are expressed only during the initial 10 min after infection. Obviously, this repression will limit the amount of proteins encoded by the downstream genes and ORFs.
Early Promoters C1, C1a, and C1b Present in φ29, B103, and GA-1, Respectively
All three phage genomes contain a promoter within the early operon located at the right side of their genome (Fig. 2 and Table 4). The absence of potential transcriptional terminators upstream of these promoters suggests that the last genes or ORFs of these operons may be expressed from two promoters. In φ29 this additional promoter was named C1. It is located within gene 16.7 and may drive the expression of ORFs 16.6 and 16.5. In B103, the promoter is located within ORF d and may drive the expression of gene 16.7 and ORF 16.5. According to the φ29 nomenclature, this promoter of B103 was named C1 (163). Finally, in GA-1 the promoter is located within ORF G and may drive the expression of ORFs H to L. Since these promoters drive the expression of different genes and ORFs, they are not equivalent. Therefore, we named these promoters of B103 and GA-1 C1a and C1b, respectively.
In vitro transcription analysis showed that expression of the φ29 C1 promoter is repressed by protein p6 (14). Although p6 repressed the C2 promoter at both low and high salt concentrations, p6 affected C1 expression only at low salt concentrations. This difference may be due to the higher affinity of p6 for the terminal φ29 DNA fragment containing the C2 promoter than for the more internal DNA sequences containing the C1 promoter (14).
Promoter A1, Driving Synthesis of the pRNA
For φ29 it has been demonstrated that packaging of TP-DNA into the phage prohead requires a 174-base φ29-encoded RNA (pRNA) (5, 93, 94). This pRNA is produced from promoter A1 (136, 137, 197), which is active throughout the infection cycle (149). Although substantial levels of pRNA were detected at early infection times, a rapid increase in the number of pRNA molecules was detected starting about 15 min after infection, which approximately coincided with the onset of φ29 DNA replication. Therefore, the additional phage DNA templates produced explain this increase of pRNA and suggest a constant transcription rate (149).
Equivalent A1 promoters driving pRNA synthesis of the corresponding phages are present in B103 and GA-1 (Table 4). The pRNA coding sequences of φ29 and B103 are located at the far-left ends of their genomes. Figure 2 shows that the situation is different for GA-1. This genome contains an additional operon downstream of the pRNA-coding region, as well as another operon located between gene 2 and its pRNA-coding region. A promoter is located upstream of each of these two unique operons. Thus, whereas the leftmost region of the φ29 and B103 genomes contains only one promoter, this region of GA-1 contains three promoters. To maintain a consistent nomenclature, the GA-1 promoter upstream of ORF M was named A1a, the one driving the expression of the GA-1 pRNA was named A1b, and the one upstream of ORF P was named A1c.
The expression patterns of GA-1 promoter A1b and B103 promoter A1 during the infection cycle have not been studied. Table 4 shows that the -35 and -10 sequences of the A1 promoters of φ29 and B103 and the equivalent A1b promoter of GA-1 are almost identical and very close to the consensus sequence recognized by σA-containing RNA polymerase. Therefore, it is likely that the A1b promoter of GA-1 and the A1 promoter of B103 behave similarly to the equivalent A1 promoter of φ29.
Other Promoters in the φ29 Genome
In vivo and in vitro experiments revealed two promoters, named B1 and B2, that are located in the φ29 DNA region encoding the late genes (16, 197) (Fig. 2). Transcription from these promoters proceeds leftward. Compared to other φ29 promoters, only minor amounts of RNA were synthesized by the B1 and B2 promoters in vivo (149). No ORF with a reasonable ribosome binding site was found downstream of either of these promoters. Although it has been suggested that the products synthesized by these promoters may function as antisense RNA to modulate the expression of some late genes (16, 136), such a function has not been proven experimentally. The φ29 promoter A1IV, located in the DNA polymerase coding region (Fig. 2), was shown to be weakly expressed in vivo (16) and to contribute to the synthesis of protein p1 (40). The B1, B2, and A1IV promoters are shown in Table 4.
Other Promoters in the GA-1 Genome
The promoters A1c and A1a are unique for GA-1. Primer extension analysis using total RNA isolated at different times after infection showed that these two promoters are active early after infection and that they are progressively downregulated at later infection times (J. A. Horcajadas, unpublished). Therefore, it is likely that promoters A1c and A1a drive the expression of the GA-1 regions containing ORFs P to T and M to O, respectively. At present, the mechanism underlying the in vivo repression of these promoters is unknown. Since the pattern of repression of these promoters is different from that of the abruptly repressed C2 promoter, it is unlikely that these promoters are repressed by protein p6 in a similar way to the C2 promoter.
TRANSCRIPTIONAL TERMINATION
The main early and late in vivo transcription termination sites of ϕ29 have been determined by S1 nuclease mapping (17). Transcription from the late A3 promoter and that from the early promoters C2 and C1 terminated in the short intergenic region between gene 16 and ORF 16.5 (Fig. 2). This DNA region contains an inverted repeat, and stem-loop structures with calculated free energies of −14.8 and −16.8 kcal could be drawn for the early and late transcripts, respectively. In both directions, a uridine-rich tail follows the stem-loop, indicating that it functions as a Rho-independent bidirectional transcription terminator. This terminator was named TD1. Inverted repeats are located at similar positions in the genomes of B103 and GA-1. As in ϕ29, uridine-rich tails on either strand follow the stem-loops of B103 and GA-1, indicating that these also constitute bidirectional Rho-independent transcriptional terminators. According to the ϕ29 nomenclature, these terminators were named TD1. The DNA sequences of the TD1 terminators are shown in Table 5.
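The two hallmarks used above to call TD1 a bidirectional Rho-independent terminator, an inverted repeat able to fold into a stem-loop and a uridine-rich tail following it in both reading directions, can be checked with a short script. The Python sketch below is illustrative only: the function names, the window sizes, and the 30-nucleotide test sequence are assumptions made for this example (they are not the TD1 sequences of Table 5), only perfect inverted repeats are detected, and no free energy is calculated.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def find_terminators(dna, stem=6, min_loop=3, max_loop=8, tail=6, min_t=4):
    """List (start, stem_sequence, loop_length) for every perfect inverted
    repeat that is directly followed by a T-rich window (U-rich in the RNA)."""
    hits = []
    for start in range(len(dna) - 2 * stem - min_loop - tail + 1):
        left = dna[start:start + stem]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            tail_start = right_start + stem
            if tail_start + tail > len(dna):
                break  # no room left for the tail window
            right = dna[right_start:tail_start]
            t_count = dna[tail_start:tail_start + tail].count("T")
            if right == reverse_complement(left) and t_count >= min_t:
                hits.append((start, left, loop))
    return hits


def is_bidirectional(dna, **kwargs):
    """True if a terminator-like element is found on both strands."""
    return bool(find_terminators(dna, **kwargs)) and bool(
        find_terminators(reverse_complement(dna), **kwargs))


# Hypothetical sequence: A-rich run, stem, loop, stem, T-rich run.
example = "CAAAAAAGGCTCCGAAAGGAGCCTTTTTTG"
print(find_terminators(example))
print(is_bidirectional(example))
```

Because the A-rich run upstream of the stem becomes a T-rich run on the complementary strand, the same element is detected in both orientations, which is the sense in which a single inverted repeat can serve transcripts arriving from either side.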
Another Rho-independent transcriptional terminator, named TA1, was found to be present within gene 4 of ϕ29 (17). It has been suggested that part of the transcripts initiated at the A2b and A2c promoters terminate at this terminator. This would result in the synthesis of high levels of mRNA coding for proteins p6 (DBP) and p5 (SSB) and lower levels of longer mRNA coding for proteins p6 to p1 (17). Apart from possible differences in translation initiation efficiencies, this would explain why p6 and p5 are synthesized in far larger quantities than are proteins p4, p3, p2, and p1 (2, 86, 139). Equivalent TA1 transcriptional terminators are present in the genomes of B103 and GA-1, indicating that a regulatory mechanism similar to that proposed for ϕ29 exists in B103 and GA-1. In all three genomes, the TA1 transcriptional terminator is located at very similar positions within gene 4. Thus, the mRNAs synthesized up to the TA1 terminators may allow the synthesis of the N-terminal 28 to 30 amino acids of protein p4. Interestingly, this region of the three p4 proteins is far more conserved than the downstream p4 region (Fig. 3), which might imply that the N-terminal 30 amino acids of p4 have a function of their own.
No potential Rho-independent transcriptional terminator is present downstream of the pRNA coding region of ϕ29, which constitutes the most leftward-reading region of this genome (Fig. 2). This could imply that transcription, starting from the A2b, A2c, and A1 promoters, continues until the left end of the genome is reached. It has indeed been shown that in vivo transcription initiating at these ϕ29 promoters reaches the very left end of the ϕ29 DNA molecule, as if the RNA polymerase ran off the template (16, 17). The same organization and the absence of a potential Rho-independent terminator downstream of the B103 pRNA-coding region suggest a similar situation for B103.
The situation is different, however, for GA-1. As shown in Fig. 2,
three potential Rho-independent terminators are present in the left
part of the GA-1 genome. The one located closest to the left DNA end
(downstream of ORF T), named TA4, would terminate transcription
initiating at the A1c promoter of GA-1. The middle one, named TA3,
located downstream of the pRNA coding region, would terminate
transcription initiating from the GA-1 promoter A1b and possibly A1a.
The third one, named TA2, would terminate transcription initiating from
the GA-1 A2c and A2b promoters. Note that in contrast to the situation in ϕ29 and B103, the GA-1 terminator TA2 is located directly downstream of gene 2. The −35 sequence of the GA-1 promoter A1a is located within this terminator.
PROTEIN-PRIMED MECHANISM OF DNA REPLICATION
The genomes of the ϕ29-like phages consist of a linear dsDNA molecule of about 20 kb with a phage-encoded protein, TP, covalently attached at each 5' end. Genomes consisting of a linear dsDNA molecule with a TP covalently linked to their 5' ends have also been found for (i) other bacteriophages (e.g., the Streptococcus pneumoniae and Escherichia coli phages Cp-1 and PRD1, respectively), (ii) animal viruses (e.g., adenoviruses), (iii) plasmids (e.g., S1 and Kalilo), and (iv) bacteria (e.g., Streptomyces). In most of these cases, initiation of DNA replication occurs via a so-called protein-priming mechanism (for reviews, see references 176, 178, and 181).
The in vitro mechanism of protein-primed DNA replication has been studied in most detail for ϕ29. The basic features of the protein-primed mechanism of DNA replication, based on the ϕ29 system, are outlined here. More detailed descriptions of the different steps and the function of the proteins involved are given below. In addition, it should be mentioned that although the main characteristics of protein-primed DNA replication are conserved, some minor differences with respect to the ϕ29 mechanism have been observed in some cases, especially regarding the sliding-back step (see below). Figure 4 shows a schematic representation of in vitro ϕ29 DNA replication. Initiation of ϕ29 DNA replication starts with recognition of the origin of replication, i.e., the TP-containing DNA ends, by a TP-DNA polymerase heterodimer. The virus-encoded protein p6 forms a nucleoprotein complex that would help to open the DNA ends (187), facilitating the formation of a covalent linkage between the first inserted nucleotide (dAMP) and TP, which is catalyzed by the ϕ29 DNA polymerase (29, 109). The formation of this first TP-dAMP covalent complex is directed by the second nucleotide at the 3' end of the template; then the TP-dAMP complex slides back 1 nucleotide to recover the information of the terminal nucleotide (144). Next, the ϕ29 DNA polymerase synthesizes a short elongation product before dissociating from the TP (146). Replication, which starts at both DNA ends, is coupled to strand displacement. This results in the generation of so-called type I replication intermediates consisting of full-length ϕ29 dsDNA molecules with one or more ssDNA branches of different lengths. The ssDNA stretches generated are bound by the SSB protein (p5). When the two converging DNA polymerases merge, a type I replication intermediate becomes physically separated into two type II replication intermediates. Each of these consists of a full-length ϕ29 DNA molecule in which a portion of the DNA, starting from one end, is double stranded and the portion spanning to the other end is single stranded (102, 117). Continuous elongation by the DNA polymerase completes replication of the parental strand.
INITIATION OF DNA REPLICATION
DNA Polymerase-TP Heterodimer Formation
DNA polymerases are unable to initiate de novo DNA synthesis on a DNA template but require the existence of a primer containing a free hydroxyl group to start DNA elongation (126). Generally, RNA primers provide the 3'-hydroxyl (3'-OH) group needed by the DNA polymerase to elongate the DNA chain. However, in most linear genomes containing a TP covalently linked to their 5' DNA ends, the hydroxyl (OH) group of a specific serine, threonine, or tyrosine residue of the TP is used for DNA elongation (reviewed in reference 181). In ϕ29, the TP deoxynucleotidylation activity of the DNA polymerase is responsible for the covalent linkage of 5'-dAMP, via a phosphoester bond, to the hydroxyl group of Ser232 of the TP (24, 29, 109). This reaction requires the formation of a stable heterodimer complex between the TP and the ϕ29 DNA polymerase (28). Most probably, the active site used for polymerization is also used for the TP deoxynucleotidylation reaction (reviewed in references 31 and 32). This implies that the TP present in the heterodimer complex has to be specifically positioned in order for the DNA polymerase to perform the TP deoxynucleotidylation reaction. Several mutations located in different regions of the ϕ29 DNA polymerase affect its interaction with the TP (37, 61, 145, 206). In addition, interaction of TP with the purified C-terminal portion of the ϕ29 DNA polymerase is severely impaired (209). Together, these results suggest that interaction with the TP involves many contacts with different regions of the DNA polymerase.
Interestingly, a multiple sequence alignment of DNA polymerases belonging to the B-type family showed that DNA polymerases involved in protein-primed DNA replication contain two regions of amino acids, denoted TPR-1 and TPR-2 (Fig. 5), which are not present in other B-type DNA polymerases (33). Analysis of the ϕ29 mutant DNA polymerase in which the conserved Asp332 residue of the TPR-1 region was changed into Tyr showed that it was able to form a stable heterodimer with TP and that it had essentially wild-type levels of synthetic activities in DNA-primed reactions. However, its activity was drastically affected in ϕ29 TP-DNA replication, indicating that the mutant DNA polymerase forms a nonfunctional interaction with the TP and hence supporting the view that at least TPR-1 is involved in proper positioning of the TP in the TP-DNA polymerase heterodimer complex (68).
Sliding-Back Mechanism
Although the TP deoxynucleotidylation reaction can occur in the absence of a DNA template, it is strongly stimulated in the presence of ϕ29 TP-DNA (24). In the latter case, TP-dAMP is preferentially formed. The DNA ends of ϕ29 have a short inverted terminal repeat of 6 nucleotides (3'-TTTCAT-5'). The first TP-dAMP is not directed by the terminal nucleotide but by the penultimate nucleotide of the ϕ29 template strand. Subsequently, the complex slides back 1 nucleotide to recover the information of the 3'-terminal nucleotide (144). Terminal repeats are also present in the genomes of B103 (3'-TTTCAT-5'), GA-1 (3'-TTTATCTT-5'), and all other ϕ29-related phages analyzed so far. Moreover, this feature is also conserved in other linear genomes containing a TP covalently linked to their DNA ends, such as the E. coli and S. pneumoniae phages PRD1 and Cp-1, respectively, linear plasmids, and the eukaryotic adenovirus. Terminal reiteration is a prerequisite for the sliding-back mechanism. Indeed, the replication initiation site in GA-1 (114), PRD1 (43), Cp-1 (132), and adenovirus (125) corresponds to an internal nucleotide close to the 3'-terminal end, and a sliding-back or similar mechanism has been shown to occur in these cases to recover the information of the terminal nucleotide(s). Probably, the sliding-back mechanism applies to all genomes that replicate via a protein-primed mechanism. Since proofreading does not apply to the TP-dNMP product (72), the sliding-back mechanism would be an alternative way to ensure that the replication origin-containing DNA ends are replicated with high fidelity.
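The logic of sliding back, and the reason terminal reiteration is a prerequisite for it, can be made concrete with a small simulation. The following Python sketch is a deliberately simplified model and not a description of the enzyme kinetics: the function name is hypothetical, the template is written 3' to 5' as in the text, initiation is assumed to occur opposite the penultimate nucleotide, and only the single-nucleotide slide described for ϕ29 is modeled.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def protein_primed_start(template_3to5: str) -> str:
    """Return the nascent strand (5'->3') made on a template written 3'->5'.

    Step 1: the first nucleotide is directed by the penultimate
            (second) template position.
    Step 2: the TP-linked nucleotide slides back one position and must
            also pair with the 3'-terminal template nucleotide.
    Step 3: synthesis resumes opposite the second template position.
    """
    first = COMPLEMENT[template_3to5[1]]       # templated by position 2
    if COMPLEMENT[template_3to5[0]] != first:  # pairing after the slide
        raise ValueError("terminal nucleotides are not reiterated; "
                         "sliding back would leave a mismatch")
    nascent = first
    for base in template_3to5[1:]:             # positions 2, 3, 4, ...
        nascent += COMPLEMENT[base]
    return nascent


# phi29 and B103 terminal repeat as given in the text (3'-TTTCAT-5').
print(protein_primed_start("TTTCAT"))
```

For the 3'-TTTCAT-5' end, the product is the full complement of the template, so the information of the terminal nucleotide is recovered even though it never served as the initiating template position. A hypothetical end without reiteration, such as 3'-GTTCAT-5', raises the error instead.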
Transition from Protein-Primed to DNA-Primed Replication
After the sliding-back step, the ϕ29 DNA polymerase and the primer TP do not dissociate immediately. Rather, there is a transition stage in which the DNA polymerase synthesizes a DNA molecule of 5 nucleotides while complexed with the primer TP (initiation mode). During the synthesis of nucleotides 6 to 9 the complex undergoes some structural change (transition mode), and the DNA polymerase finally dissociates from the primer TP when nucleotide 10 is inserted into the nascent DNA chain (elongation mode) (146). This behavior probably reflects a requirement of the ϕ29 DNA polymerase for a DNA primer of a minimum length to efficiently carry out DNA-primed elongation. This view is supported by the following data. First, Méndez et al. (146) demonstrated that primer molecules of 6 nucleotides or less are not elongated. This fits well with the observation that ϕ29 DNA polymerase synthesizes a DNA chain of 5 nucleotides before it changes from the initiation mode to the elongation mode in TP-DNA-primed reactions. Second, abortive replication products consisting of the primer TP linked to up to 8 nucleotides were particularly observed under conditions that decrease the strand displacement capacity of ϕ29 DNA polymerase (146). Finally, de Vega et al. (62) demonstrated that ϕ29 DNA polymerase covers a DNA region of 10 nucleotides, which may be indicative of the optimum length to carry out polymerization. Interestingly, the ϕ29 DNA polymerase mutant in which Asp456, belonging to the conserved "YxDTDS" motif at the polymerization domain (see below), has been changed into Gly is unable to proceed further than 5 nucleotides from the initiation complex. This suggested that the ϕ29 DNA polymerase residue Asp456 is crucial to entry into the transition stage of ϕ29 DNA replication (185).
A similar transition step has also been demonstrated in replication of adenovirus (124) and probably is a general feature of protein-primed DNA replication.
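The three stages described in this section can be summarized as a simple lookup from the length of the nascent chain to the state of the TP-DNA polymerase complex. The Python sketch below only restates the boundaries given in the text (nucleotides 1 to 5, 6 to 9, and 10 onward); the function name is hypothetical and the sharp cutoffs are an idealization.

```python
def replication_mode(nucleotides_inserted: int) -> str:
    """Map the number of nucleotides in the nascent chain to the stage
    described in the text for phi29 TP-primed replication."""
    if nucleotides_inserted < 1:
        raise ValueError("at least one nucleotide must be inserted")
    if nucleotides_inserted <= 5:
        return "initiation"   # polymerase still complexed with primer TP
    if nucleotides_inserted <= 9:
        return "transition"   # complex undergoes structural change
    return "elongation"       # polymerase has dissociated from primer TP


for n in (1, 5, 6, 9, 10):
    print(n, replication_mode(n))
```

In these terms, the Asp456-to-Gly mutant described above would never leave the "initiation" state, since it cannot proceed beyond 5 nucleotides.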
THE FOUR MAIN PROTEINS REQUIRED FOR IN VITRO DNA REPLICATION
In the ϕ29, B103, and GA-1 genomes, genes 6, 5, 3, and 2 are located in a single early-expressed operon (Fig. 2). In ϕ29, these genes are indispensable for in vivo phage DNA replication. Gene 2 encodes the DNA polymerase, gene 3 encodes the TP, gene 5 encodes SSB, and gene 6 encodes DBP. An in vitro ϕ29 DNA replication system, based on these four purified proteins, has been established (27). The availability of this system has allowed a detailed analysis of the in vitro ϕ29 DNA replication mechanism and functional analysis of these four main replication proteins. Characteristics of these four proteins are given below.
DNA Polymerase
Gene 2 of ϕ29, B103, and GA-1 encodes a DNA polymerase. In ϕ29 and GA-1 the DNA polymerase has been shown to be required for replication of its phage DNA (29, 114). The DNA polymerases encoded by ϕ29, B103, and GA-1 belong to the B-type superfamily of DNA-dependent DNA polymerases (also referred to as eukaryotic or α-like polymerases). This family includes a large number of prokaryotic and eukaryotic enzymes that are sensitive to certain drugs (aphidicolin and phosphonoacetic acid) and nucleotide analogs (butylanilino-dATP and butylphenyl-dGTP). The DNA polymerase of ϕ29 has been analyzed in detail (for reviews, see references 31 and 32). The monomeric ϕ29 DNA polymerase, which has a size of only about 66 kDa, catalyzes both the initiation and elongation stages of DNA synthesis (29, 30). To accomplish this, it is able to carry out two distinguishable synthetic reactions: TP deoxynucleotidylation and DNA polymerization. In addition, it has two degradative activities: pyrophosphorolysis and 3'-5' exonucleolysis. Moreover, it has two intrinsic properties: high processivity and strand displacement ability (25). Owing to these properties of the ϕ29 DNA polymerase, in vitro ϕ29 DNA replication does not require accessory proteins or DNA helicases (25).
The enzymatic activities of the ϕ29 DNA polymerase have been mapped by site-directed mutagenesis. A structural map, given in Fig. 5, shows that the ϕ29 DNA polymerase has a bimodular organization, with the N-terminal portion constituting the 3'-5' proofreading domain and the C-terminal portion constituting the domain responsible for its 5'-3' synthetic activities. The bimodular organization of the ϕ29 DNA polymerase has been proven experimentally. Analysis of a purified C-terminal deletion derivative of ϕ29 DNA polymerase containing the 188 N-terminal amino acids showed that it was devoid of any synthetic activity but retained 3'-5' exonuclease activity (31). Reciprocally, a purified N-terminal deletion derivative containing the C-terminal 388 amino acids had neither 3'-5' exonuclease nor strand displacement activity but did have synthetic activities (209). Available three-dimensional structures of other DNA polymerases show that the bimodular organization is characteristic of proofreading-proficient DNA polymerases (reviewed in reference 121).
C-terminal domain of ϕ29 DNA polymerase.
The polymerization activity of the ϕ29 DNA polymerase is confined to the C-terminal domain of the enzyme. This part of the ϕ29 DNA polymerase has three regions containing motifs that are conserved in other DNA polymerases belonging to family B. These three motifs are Dx2SLYP (motif A, also named motif 1), Kx3NSxYG (motif B, also named motif 2a), and YxDTDS (motif C, also named motif 3). The positions of these and other conserved motifs described below are indicated in Fig. 5, together with the amino acid sequence corresponding to each motif present in the DNA polymerase of ϕ29, B103, and GA-1. Site-directed mutagenesis at motifs A, B, and C of ϕ29 DNA polymerase (21, 34-36) showed that these three regions form an evolutionarily conserved polymerization-active site.
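The motif notation used here, in which x stands for any residue and a following digit gives the number of such positions, translates directly into regular expressions, which makes it easy to locate motifs A, B, and C in a protein sequence. The Python sketch below is a minimal illustration: the helper names are hypothetical, and the 30-residue test peptide is invented for the example and is not part of any of the three phage DNA polymerases.

```python
import re

# Motifs A, B, and C exactly as written in the text.
MOTIFS = {"A": "Dx2SLYP", "B": "Kx3NSxYG", "C": "YxDTDS"}


def motif_to_regex(motif: str) -> str:
    """Turn 'Dx2SLYP' into 'D..SLYP': each x (optionally followed by a
    count) becomes that many single-residue wildcards."""
    return re.sub(r"x(\d*)", lambda m: "." * int(m.group(1) or 1), motif)


def find_motifs(protein: str, motifs=MOTIFS):
    """Return {motif_name: [(1-based start, matched residues), ...]}."""
    found = {}
    for name, motif in motifs.items():
        pattern = re.compile(motif_to_regex(motif))
        found[name] = [(m.start() + 1, m.group())
                       for m in pattern.finditer(protein)]
    return found


# Invented test peptide carrying one copy of each motif.
peptide = "MAADGGSLYPQQKAAANSGYGWWYCDTDSK"
print(find_motifs(peptide))
```

The offsets within each pattern agree with the ϕ29 residue numbering quoted in this section: in Dx2SLYP, an Asp at position 249 places Ser, Leu, and Tyr at 252, 253, and 254, and in YxDTDS, a Tyr at position 454 places the two Asp residues at 456 and 458.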
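For motif A, the GA-1 DNA polymerase appears to carry a Met at the position corresponding to ϕ29 Leu253 (Fig. 5). Analysis of a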
ϕ29 DNA polymerase mutant in which Leu253 had been replaced by a Val residue (L253V) showed that whereas it was not affected in template-primer DNA binding, it was strongly affected in reactions involving the use of TP as primer (35). With this result in mind, it would be interesting to study the effects of a ϕ29 L253M DNA polymerase mutant and relate it to the reciprocal mutation in the GA-1 DNA polymerase (M253L). For motif B, the residue corresponding to Asn387 of ϕ29 DNA polymerase is occupied by an Asp in the B103 polymerase (Fig. 5). The involvement of ϕ29 DNA polymerase Asn387 in the correct binding of the primer terminus at the polymerization active site was demonstrated by the analysis of the N387Y mutant (36). Taking into account the protein sequence of the B103 DNA polymerase, it would be interesting to study possible effects of replacing Asn387 by Asp (N387D).
In addition to motifs A, B, and C, two other motifs, Tx2GR (motif 2b) and KxY (motif 4), were identified in the C-terminal portion of ϕ29 DNA polymerase and analyzed by site-directed mutagenesis (37, 145). These two motifs, which are also conserved in the C-terminal portion of B103, GA-1 (Fig. 5), and other B-type DNA polymerases, are involved in primer stabilization at the active site. In addition, motif 2b is involved in TP and metal binding (145). For several DNA polymerases, including the ϕ29 DNA polymerase, it has been demonstrated that three Asp residues form a metal binding triad required for catalysis at the polymerization active site (reviewed in reference 32). In the ϕ29 DNA polymerase, the three Asp residues implicated are Asp249, belonging to motif A, and Asp456 and Asp458, both belonging to motif C (21, 35, 185). These three Asp residues are conserved in the DNA polymerases of B103 and GA-1 and in all other known members of the B-type DNA polymerases. Also, Arg438 of motif 2b of ϕ29 DNA polymerase plays a role in catalysis of the polymerization reaction (145). Moreover, three highly conserved Tyr residues were shown to be involved, directly or indirectly, in interaction with deoxynucleoside triphosphates (dNTPs). These residues, also conserved in the B103 and GA-1 DNA polymerases (Fig. 5), are Tyr254 of motif A (34, 35), Tyr390 of motif B (34, 36), and Tyr454 of motif C (21). Since the ϕ29 residues Tyr254 (motif A) and Tyr390 (motif B) are also involved in selection of dNTP binding, they play an important role in the fidelity of DNA replication (184). In addition, a single and specific replacement of Tyr254 (motif A) by a Val residue enables the mutant ϕ29 DNA polymerase to incorporate ribonucleotides without affecting its wild-type affinity for dNTPs (38). This indicates that ϕ29 Tyr254 is responsible for the discrimination against the 2'-OH group of an incoming ribonucleotide. In addition, seven residues that are invariant or highly conserved in the C-terminal domain of B-type DNA polymerases were shown to be involved in binding template-primer structures. These residues are Ser252 of motif A (35), Asn387 (see above) and Gly391 of motif B (36), Thr434 and Arg438 of motif 2b (145), and Lys498 and Tyr500 of motif 4 (37).
N-terminal domain of ϕ29 DNA polymerase.
(i) Proofreading.
The insertion discrimination values of the ϕ29 DNA polymerase range from 10^4 to 10^6, and the efficiency of mismatch elongation is 10^5- to 10^6-fold lower than that of a properly paired terminus (72). These values illustrate the high fidelity with which the ϕ29 DNA polymerase replicates DNA. As with other proofreading-proficient DNA polymerases, the ϕ29 DNA polymerase owes its high fidelity to its 3'-5' exonuclease activity (81), which is confined to the N-terminal part of the enzyme. Bernad et al. (20) proposed that three N-terminally located regions, ExoI, ExoII, and ExoIII, form the 3'-5' exonuclease active site (Fig. 5) and are evolutionarily conserved in prokaryotic and eukaryotic DNA polymerases. This proposal has been proven valid for various DNA polymerases of eukaryotic and prokaryotic origin (for a review, see reference 60). The three Exo domains contain five invariant residues that are involved in metal binding and 3'-5' exonuclease catalysis. In ϕ29 DNA polymerase, these residues are Asp12 and Glu14 in ExoI, Asp66 in ExoII, and Tyr165 and Asp169 in ExoIII (20).
ϕ29 DNA polymerase and can be extrapolated to other proofreading-proficient DNA polymerases (73). Another invariant residue, Lys143 of ϕ29 DNA polymerase, was analyzed and shown to be important for the catalytic efficiency of the 3'-5' exonuclease activity (63). In addition, other residues in the Exo motifs that are conserved in B103, GA-1, and most other prokaryotic and eukaryotic DNA polymerases were functionally analyzed. Two of these, Thr15 and Asn62, located at the ExoI and ExoII motifs, respectively, were shown to act as single-stranded DNA ligands playing a critical role in the stabilization of the frayed primer terminus at the 3'-5' exonuclease active site (64). Also, Phe65 of the ExoII motif and residues Ser122 and Leu123, which are part of a newly identified motif [S/T]Lx2h, were shown to be important for (i) stable interaction with ssDNA, (ii) 3'-5' exonucleolysis of ssDNA substrates, and (iii) proofreading of DNA polymerization errors (65). In addition, these studies showed that the aromatic ring of Phe65 appeared to be critical to orient the ssDNA substrate in a stable conformation to allow 3'-5' exonucleolytic catalysis. These three residues, Phe65, Ser122, and Leu123, are also conserved in the B103 and GA-1 DNA polymerases.
(ii) Strand displacement.
After the initiation, sliding-back, and transition steps, continuous polymerization, carried out by a single ϕ29 DNA polymerase molecule, completes the replication of the almost 20-kb DNA strand (30). Using primed M13 DNA as the template, the ϕ29 DNA polymerase is able to synthesize DNA chains of more than 70 kb (25). This demonstrates the high processivity and strand displacement activity of the ϕ29 DNA polymerase. Replication of ϕ29 DNA starts nonsimultaneously from either end of the linear DNA molecule (117), generating so-called type I replication intermediates (Fig. 4). Until the two converging DNA polymerases collide, DNA polymerization is coupled to strand displacement, which makes a helicase unnecessary (25). Various DNA polymerases, but not the one encoded by ϕ29, are prone to replication slippage. This particular type of error, which results in deletions, is caused when a polymerizing DNA polymerase slips between two short sequence duplications. Recently, evidence has been presented that the high strand displacement activity of the ϕ29 DNA polymerase prevents replication slippage (48).
Analysis of ϕ29 DNA polymerases containing mutations in one of the five invariant residues in the Exo motifs critical for 3'-5' exonuclease activity, Asp12, Glu14, Asp66, Tyr165, or Asp169, showed that they were also strongly affected in their strand displacement activity (73, 194). In addition, mutants corresponding to Lys143, the residue which is conserved in GA-1 and B103 DNA polymerases and was shown to play an auxiliary role in catalysis of the exonuclease reaction, were affected in strand displacement activity (62). These results indicated that the strand displacement activity of ϕ29 DNA polymerase is located in its N-terminal domain, somehow overlapping with the 3'-5' exonuclease activity.
Mutants of residues Thr15 and Asn62, which were shown to act as ssDNA ligands but do not play a direct role in the ϕ29 DNA polymerase 3'-5' exonuclease catalysis reaction, displayed wild-type levels of strand displacement activity (64). Therefore, it seems that impaired strand displacement activity is restricted to the 3'-5' exonuclease mutants that act directly as metal ligands or to those that affect the metal binding network. Based on these results, it was proposed that contacts with divalent metal ions assist in
interactions with the displaced ssDN