Expression, purification and characterization of a recombinant fusion protein based on the human papillomavirus-16 E7 antigen

A fusion protein comprising a cell-penetrating and immunostimulatory peptide corresponding to residues 32 to 51 of the Limulus polyphemus protein linked to the human papillomavirus (HPV)-16 E7 antigen (LALF32-51-E7) was expressed in E. coli BL21 (DE3) cells. The recombinant protein accounted for approximately 18% of the total cellular protein and was purified with a single affinity chromatographic step. A yield of approximately 38 mg of purified LALF32-51-E7 per liter of induced culture was obtained with an overall recovery of 52%, a promising starting point for future production and scale-up. The purified protein was characterized as soluble aggregates with a molecular weight larger than 670 kDa, a property considered important for increasing the immunogenicity of an antigen preparation. The recombinant fusion protein LALF32-51-E7 is a promising vaccine candidate for the treatment of HPV-16-related malignancies.


Background
Cervical cancer is the second most frequent cancer in women (zur Hausen 2009). It is now well established that infections with so-called high-risk human papillomaviruses (HPV), particularly HPV-16, cause cervical cancer (zur Hausen 2002). The availability of preventive vaccines against HPV represents a milestone in the prevention of this infection (Harper et al. 2006), but no effective therapeutic vaccine or immunological treatment exists for individuals already infected or for the 470,000 women who develop high-grade dysplasia, carcinoma in situ, or cervical cancer each year.
The oncogenic potential of HPV-16 is mainly ascribed to the viral oncoprotein E7, which has been shown to interact with a variety of cellular proteins (Munger et al. 2001). Moreover, being expressed in all the cervical tumors and in precancerous lesions, the E7 protein represents a specific target for immunotherapy (zur Hausen 2002).
We designed a fusion protein comprising a cell-penetrating and immunostimulatory peptide corresponding to residues 32 to 51 of the Limulus polyphemus protein (LALF32-51) linked to the HPV-16 E7 antigen (LALF32-51-E7). We selected E. coli as the expression system because of its relative simplicity, its inexpensive and fast high-density cultivation, its well-known genetics and the large number of compatible biotechnology tools available (Jana and Deb 2005).
In a previous paper we described some of the biological properties of this fusion protein, a promising vaccine candidate for the treatment of HPV-16-related malignancies (Granadillo et al. 2011). Here we describe its expression and purification, together with some results concerning its characterization. We demonstrate that LALF32-51-E7 is highly expressed in E. coli BL21 (DE3) and is easily purified to high purity with a single chromatographic step. The non-optimized yield is on the order of 38 mg/l of bacterial culture, a promising starting point for future production and scale-up. We also show that the protein is obtained in a highly aggregated form, a property considered important for increasing the immunogenicity of an antigen preparation.

Results
Bacterial expression and purification of LALF32-51-E7 fusion protein
The kanamycin resistance gene (KanR) was amplified by PCR from a plasmid template, purified and cloned into the pPEPE7M-7 vector (Granadillo et al. 2011), which expresses the 134-amino-acid LALF32-51-E7 fusion protein. After confirming that the KanR gene had been cloned successfully, BL21 (DE3) cells were transformed with the pPEPE7M-7K plasmid and induced for expression, yielding approximately 7 g/l of biomass at the end of the fermentation process. As shown in Figure 1A, lane 1, LALF32-51-E7 accounted for approximately 18% of the total cellular protein and migrated as an approximately 24 kDa protein in 15% sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The fusion protein was located in the insoluble fraction after cell disruption (Figure 1A, lane 3). It was solubilized from the bacterial pellet using 6 M urea (Figure 1A, lane 4) and further purified by immobilized metal-ion affinity chromatography (IMAC) to up to 94% purity (Figure 1A, lane 9). The fusion protein was recognized by an anti-HPV-16 E7 mouse monoclonal antibody in Western blot (Figure 1B). The 300 mM imidazole eluate contained a major 24 kDa LALF32-51-E7 band and high molecular weight (MW) aggregates of the same protein, as shown by Western blot analysis (Figure 1B, lanes 8 and 9). A yield of approximately 38 mg of purified LALF32-51-E7 per liter of induced culture was obtained with an overall recovery of 52% (Table 1). The IMAC-purified fusion protein was further analyzed by analytical size exclusion HPLC on a Superdex 200 10/300 GL column. A major peak eluting in the void volume of the column and accounting for 100% of the applied protein was obtained (Figure 1C). According to the column calibration standard used, this peak appears to contain soluble aggregates with MW larger than 670 kDa.
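As a rough consistency check on the reported yield and recovery, the short sketch below back-calculates how much fusion protein the starting material would have contained. It assumes that recovery is defined as purified target protein divided by target protein in the starting material; the variable names are illustrative and the result is an implied figure, not a measured one.

```python
# Figures reported in the text.
yield_mg_per_l = 38.0   # purified LALF32-51-E7 per liter of induced culture
recovery = 0.52         # overall recovery, as a fraction

# Assumption: recovery = purified target protein / target protein at the start.
implied_start_mg_per_l = yield_mg_per_l / recovery

# Protein lost across solubilization, IMAC and desalting under the same assumption.
implied_loss_mg_per_l = implied_start_mg_per_l - yield_mg_per_l

print(f"Implied target protein before purification: {implied_start_mg_per_l:.1f} mg/l")
print(f"Implied loss during purification:           {implied_loss_mg_per_l:.1f} mg/l")
```

Under this assumption the induced culture would have contained roughly 73 mg/l of the fusion protein before purification.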

Transmission electron microscopy studies
To confirm whether the LALF32-51-E7 fusion protein was expressed as inclusion bodies, ultrastructural studies were performed. As expected, transmission electron microscopy of cells harboring pPEPE7M-7K indicated that the fusion protein is produced as cytoplasmic inclusion bodies (Figure 2A).
To characterize the purified fusion protein, preparations of LALF32-51-E7 were analyzed by negative staining. Figure 2B shows a representative electron micrograph of the LALF32-51-E7 preparation. The protein appears as aggregates of different shapes and sizes.

Mass spectrometry analysis
According to the gene sequence, LALF32-51-E7 is synthesized as a protein of 134 amino acids containing a hexa-histidine tag at the C-terminus, with a theoretical mass of 15867.85 Da. ESI-MS of the reduced and carbamidomethylated protein (rcm-LALF32-51-E7) gave a major signal at 15736.90 Da (Figure 3A), in good agreement with the theoretical value (15736.65 Da) calculated for the sequence starting from the second amino acid (alanine, abbreviated A in Figure 4). The major signal differs by 130.95 Da from the theoretical mass of the entire protein (15867.85 Da), indicating complete removal of the initiator methionine from the protein.
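The mass arithmetic behind this assignment can be reproduced in a few lines. The sketch below uses only the three masses quoted above plus the average mass of a methionine residue (about 131.20 Da, a standard reference value that is not taken from the paper); the variable names are illustrative.

```python
# Masses quoted in the text, in Da.
theoretical_full = 15867.85     # full 134-residue sequence
theoretical_no_met = 15736.65   # sequence starting at the second residue
observed = 15736.90             # major ESI-MS signal

# Average mass of a methionine residue (C5H9NOS); standard value, not from the paper.
MET_RESIDUE_AVG = 131.20

# Difference between the full-length theoretical mass and the observed signal.
mass_loss = theoretical_full - observed

# Deviation of the observed signal from the des-Met theoretical mass.
error_da = observed - theoretical_no_met
error_ppm = error_da / theoretical_no_met * 1e6

print(f"Observed mass loss: {mass_loss:.2f} Da (Met residue: {MET_RESIDUE_AVG:.2f} Da)")
print(f"Deviation from des-Met mass: {error_da:.2f} Da ({error_ppm:.0f} ppm)")
```

The observed loss of 130.95 Da lies within about 0.25 Da of one methionine residue, which is why the signal is assigned to the protein lacking its initiator methionine.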
To further verify the identity of the molecule, the protein was digested with trypsin and the resulting fragments were analyzed by mass spectrometry. The identified peptides accounted for 93% of the entire sequence of LALF32-51-E7 (Table 2). The undetected peptides corresponded to fragments of fewer than three amino acids, which fall outside the mass range analyzed by the mass spectrometer (400-2000 Th). ESI-MS/MS sequencing of the putative N- and C-terminal peptides confirmed the absence of the initiator methionine (Figure 3B) and the presence of the six-His tag (Figure 3C), respectively.
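The statement that fragments of fewer than three residues fall below the 400 Th limit can be illustrated with an upper-bound estimate. The sketch below computes the m/z of the heaviest dipeptide that can be built from the 20 standard, unmodified amino acids (Trp-Trp), using standard monoisotopic masses; these constants and the helper `mz` are illustrative and not taken from the paper.

```python
# Standard monoisotopic masses in Da (reference values, not from the paper).
WATER = 18.01056    # H2O added to the sum of residue masses
PROTON = 1.00728    # charge carrier in positive-mode ESI
TRP = 186.07931     # tryptophan residue, the heaviest standard residue

def mz(neutral_mass, charge):
    """Return m/z (in Th) of a protonated ion with the given charge."""
    return (neutral_mass + charge * PROTON) / charge

# Upper bound for any dipeptide of standard residues: Trp-Trp.
heaviest_dipeptide = 2 * TRP + WATER

mz_singly = mz(heaviest_dipeptide, 1)
mz_doubly = mz(heaviest_dipeptide, 2)

print(f"Trp-Trp [M+H]+  : {mz_singly:.2f} Th")
print(f"Trp-Trp [M+2H]2+: {mz_doubly:.2f} Th")
```

Even this heaviest case stays below 400 Th in either charge state, so single residues and dipeptides released by trypsin are not expected within the 400-2000 Th acquisition window.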

Discussion
In this paper, we describe the expression and purification of the LALF32-51-E7 fusion protein, a promising vaccine candidate for the treatment of HPV-16-related malignancies, together with some results on its characterization. Antigen design was based on a mutated version of the HPV-16 E7 antigen bearing a T-to-G base substitution in the triplet encoding cysteine at position 24 (Cys to Gly), which disrupts its binding to the Rb protein (Munger et al. 1989; Barbosa et al. 1990; Jones et al. 1990) and thereby reduces possible regulatory objections in the future development of a human vaccine candidate. To improve safety, and since the ampicillin resistance gene (AmpR) is precluded for use in humans, we introduced the KanR gene as the selectable marker of our final expression vector, pPEPE7M-7K. KanR is a frequently used antibiotic resistance marker, whereas AmpR is not acceptable because of concerns about hyperreactivity of some patients to β-lactam antibiotics (Williams et al. 2009).
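The Cys-to-Gly change follows directly from the genetic code, as the minimal sketch below shows. The text does not state which of the two cysteine codons (TGT or TGC) is present at position 24, so both are shown; the partial codon table and the helper name are illustrative.

```python
# Partial standard codon table covering only the codons needed here.
CODON_TABLE = {"TGT": "Cys", "TGC": "Cys", "GGT": "Gly", "GGC": "Gly"}

def substitute_first_base(codon, new_base):
    """Return the codon with its first base replaced by new_base."""
    return new_base + codon[1:]

# Assumption: the T-to-G substitution is at the first codon position, the only
# single T-to-G change in a cysteine codon that yields glycine.
for cys_codon in ("TGT", "TGC"):
    mutant = substitute_first_base(cys_codon, "G")
    print(f"{cys_codon} ({CODON_TABLE[cys_codon]}) -> {mutant} ({CODON_TABLE[mutant]})")
```

Whichever cysteine codon is used, replacing the first T with G gives a glycine codon.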
It is well documented that the choice of host strain is a key factor influencing the expression of heterologous proteins in the E. coli cytoplasm (Sorensen and Mortensen 2005). E. coli BL21 (DE3) is the most common host and has proven outstanding in standard recombinant expression applications; it grows vigorously in minimal media, yet it is non-pathogenic and unlikely to survive in host tissues and cause disease (Chart et al. 2000). In this paper, we show that the fusion protein LALF32-51-E7 is highly expressed (18%) in E. coli BL21 (DE3). In agreement with our results, other researchers have efficiently expressed recombinant fusion proteins based on the HPV-16 E7 antigen in E. coli for therapeutic purposes (Chu et al. 2000; Preville et al. 2005; Liu et al. 2008).
In this work we also show that LALF32-51-E7 was easily purified with a single affinity chromatographic step (up to 94% purity), which can be followed by further polishing steps (e.g. gel filtration) if the protein is manufactured for human vaccine purposes. The non-optimized yield is on the order of 38 mg of purified LALF32-51-E7 per liter of induced culture, a promising figure in terms of production and scale-up. This study also reports that the fusion protein was obtained in a highly aggregated form, a property that could help to enhance the immunogenicity of an antigen preparation. Other researchers have reported that while low-MW aggregates such as dimers and trimers appear inefficient in inducing immune responses, large multimers whose MW exceeds 100 kDa are efficient inducers of immune responses (Rosenberg 2006). As our aim was to obtain a highly immunogenic E7 preparation, we did not focus on obtaining aggregates of identical shape and size, considering that particles of different sizes can be taken up by different types of antigen-presenting cells, such as dendritic cells, macrophages and polymorphonuclear leukocytes, sustaining a more potent immune response (Oyewumi et al. 2010).
In this study we also characterized the LALF32-51-E7 fusion protein by mass spectrometry. LALF32-51-E7 is a 134-amino-acid protein with a hexa-histidine tag at the C-terminus and a theoretical molecular mass of 15867.85 Da. The mass spectrometry analyses were in good agreement with the theoretical molecular mass of the full-length gene product without the N-terminal methionine and verified the identity of the molecule. Although the theoretical mass of LALF32-51-E7 is approximately 16 kDa, the protein migrated in SDS-PAGE under reducing conditions as an approximately 24 kDa protein, larger than predicted. This abnormal migration pattern has been reported previously for the HPV-16 E7 protein and is attributable to its high content of acidic amino acid residues (Armstrong and Roman 1993; Bolhassani et al. 2008).
Table 1 Summary of LALF32-51-E7 purification (from dry weight of the biomass (0.75 g/l))
Proteins expressed in E. coli represent a well-studied and cost-effective means of producing vaccines. Our vaccine candidate represents not only a good substrate for uptake and processing by antigen-presenting cells, but also a promising, cost-effective approach for developing an HPV therapeutic vaccine. A new generation of low-cost HPV vaccines could be the only possibility for women living in developing countries to gain access to HPV vaccination programs to prevent or treat pre-cancerous lesions and cancer.

Conclusions
This report describes a non-optimized process for the expression, purification and characterization of a recombinant fusion protein that represents a promising, cost-effective approach for developing an HPV therapeutic vaccine.

Methods
Fusion protein expression vector
The AmpR gene of the plasmid pPEPE7M-7 (Granadillo et al. 2011), which encodes the recombinant LALF32-51-E7 fusion protein, was interrupted by digestion with ScaI, and a DNA fragment containing the KanR gene was cloned into this site. The KanR gene was PCR-amplified from pUC4K using two primers, each containing a PvuII site: the forward primer 5' CAG CTG GCC ACG TTG TGT CTC AAA ATC 3' and the reverse primer 5' CAG CTG TTC AAC AAA GCC GCC GTC CC 3'. The PCR product was digested with PvuII, purified and ligated to pPEPE7M-7 that had been cut with ScaI. The orientation of the KanR gene was verified by ClaI digestion. Clones yielding bands of approximately 3.2 and 1.3 kb carry KanR in the opposite orientation with respect to the target gene, and clones yielding bands of approximately 2.5 and 2 kb carry it in the same orientation. We selected a clone in which the KanR gene has the opposite orientation with respect to the target gene and designated it pPEPE7M-7K (Figure 4).
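The orientation call from the ClaI digest amounts to matching the observed band pattern against the two expected patterns. The sketch below is one way to express that rule; the function name and the 0.2 kb tolerance are hypothetical choices for illustration, since gel-based size estimates are approximate and the paper gives no tolerance.

```python
# Expected ClaI fragment sizes in kb, as stated in the text.
EXPECTED_PATTERNS = {
    "opposite": (3.2, 1.3),  # KanR opposite to the target gene (pPEPE7M-7K)
    "same": (2.5, 2.0),      # KanR in the same orientation as the target gene
}

def classify_orientation(observed_kb, tolerance_kb=0.2):
    """Match observed band sizes to an expected pattern.

    tolerance_kb is a hypothetical allowance for gel sizing error.
    """
    observed = sorted(observed_kb, reverse=True)
    for label, expected in EXPECTED_PATTERNS.items():
        if len(observed) == len(expected) and all(
            abs(o - e) <= tolerance_kb for o, e in zip(observed, expected)
        ):
            return label
    return "undetermined"

print(classify_orientation([1.3, 3.2]))  # opposite
print(classify_orientation([2.4, 2.1]))  # same
print(classify_orientation([4.5]))       # undetermined
```

Both expected patterns sum to about 4.5 kb, so a single band of that size suggests incomplete digestion and does not resolve the orientation.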

Expression and purification of LALF32-51-E7
BL21 (DE3) cells transformed with pPEPE7M-7K were inoculated into 500 ml of LB medium containing kanamycin (50 μg/ml) and incubated for 6 h at 37°C in a shaker. This culture was used to inoculate a 5 l fermentor (B.E. Marubishi, Japan) containing M9 salt medium enriched with 10 g/l casein hydrolysate, 0.011 g/l CaCl2·2H2O, 0.246 g/l MgSO4·7H2O, 2 g/l glucose and 0.05 g/l kanamycin. After 3 h, 3-β-indoleacrylic acid was added to a final concentration of 0.04 g/l and the culture was grown for another 15 h to obtain as much recombinant protein as possible as insoluble inclusion bodies, according to the standard production protocol for E. coli-based technology implemented in our institute. The fermentation parameters [...]
Cells were harvested by centrifugation at 15 000 g for 20 min at 4°C. Then, 3 g of cellular biomass was resuspended in 30 ml of rupture buffer (50 mM NaH2PO4, 300 mM NaCl, pH 8.0), a biomass/buffer ratio of 1:10. The biomass was disrupted in a French press (Othake, Japan) at 1500 bar with two passes at 4°C. After centrifugation at 15 000 g for 30 min at 4°C, the pellet was recovered and the recombinant protein was completely solubilized in 6 M urea in carbonate-bicarbonate buffer, pH 10.6. Cell debris was removed by centrifugation at 15 000 g for 30 min at 4°C, and the soluble fraction containing the fusion protein, which carries a six-histidine C-terminal tail for purification purposes, was recovered. Because the recombinant protein was completely solubilized only in 6 M urea in carbonate-bicarbonate buffer pH 10.6, and not in buffers at lower pH or at other urea concentrations (data not shown), the purification was necessarily conducted in carbonate-bicarbonate buffer pH 10.6.
The soluble fraction was diluted with an equal volume of 1 M NaCl and loaded onto a 22 ml His-Select® Nickel Affinity Gel column (Sigma, catalog number P6611) equilibrated with loading buffer (3 M urea and 0.5 M NaCl in carbonate-bicarbonate, pH 10.6). The column was then washed with loading buffer containing 10 mM imidazole, and the protein of interest was eluted with 300 mM imidazole. The eluted IMAC fraction (25 ml) was then loaded onto a HiPrep 26/10 desalting column (GE Healthcare) equilibrated with renaturation buffer (10 mM Tris, pH 8.0), following the manufacturer's instructions. In this chromatographic step the protein was refolded, because urea and imidazole were completely removed. The peak fractions containing LALF32-51-E7 were pooled after desalting, and endotoxin was then removed using an EndoClean™ Kit (BioVintage). The final protein preparation contained <0.05 endotoxin units (EU)/μg as measured by the chromogenic Limulus amebocyte lysate assay (Associates of Cape Cod, Inc). Samples from each step of the process were collected and later analyzed by SDS-PAGE and Western blot.
The MW of the recombinant protein was estimated by analytical size exclusion HPLC (YL9100) on a Superdex 200 10/300 GL column (GE Healthcare); briefly, 200 μl of the purified protein was applied at a flow rate of 0.5 ml/min in 10 mM Tris (pH 8.0). MW was estimated from the retention times by comparison with a gel filtration standard preparation (Bio-Rad).
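The equal-volume dilution of the urea-solubilized fraction before IMAC loading, described above, brings the sample to the composition of the loading buffer. The sketch below checks this with a simple mixing calculation; it assumes that volumes are additive and that the solubilization buffer itself contains no NaCl (none is mentioned in the text). The helper `mix` is illustrative.

```python
def mix(sample_conc, sample_vol, diluent_conc, diluent_vol):
    """Concentration of a solute after mixing two solutions (additive volumes assumed)."""
    total_vol = sample_vol + diluent_vol
    return (sample_conc * sample_vol + diluent_conc * diluent_vol) / total_vol

# One volume of solubilized fraction plus one volume of 1 M NaCl.
# Urea: 6 M in the sample, absent from the diluent.
urea_molar = mix(6.0, 1.0, 0.0, 1.0)
# NaCl: assumed absent from the sample, 1 M in the diluent.
nacl_molar = mix(0.0, 1.0, 1.0, 1.0)

print(f"Urea after dilution: {urea_molar:.1f} M")
print(f"NaCl after dilution: {nacl_molar:.1f} M")
```

The result, 3 M urea and 0.5 M NaCl, matches the stated loading buffer, so the sample and the equilibrated column are at the same urea and salt concentrations at the time of loading.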

SDS-PAGE and Western blot
SDS-PAGE (15%) was performed according to Laemmli (Laemmli 1970). Proteins were visualized by Coomassie blue staining. Protein expression and purity were evaluated by densitometry of the SDS-PAGE gels (TDI-1D manager 2.0 software, Spain). The identity of the protein was verified by Western blot (Burnette 1981). [...] Bands were detected with an enhanced chemiluminescence kit (Amersham Pharmacia Biotech).
Figure 4 Schematic representation of pPEPE7M-7K construction for the expression of the LALF32-51-E7 fusion protein.

Transmission electron microscopy
For ultrastructural studies, samples consisting of pelleted cells (2 × 10^7 cells) from non-transformed BL21 (DE3) E. coli or from transformed cells expressing the recombinant fusion protein were fixed with 3.2% glutaraldehyde and post-fixed for 1 h in 2% OsO4. The samples were then rinsed with 0.1 M PBS, pH 7.2, and dehydrated in increasing ethanol concentrations (50%, 70%, 80%, 90% and 100%). The samples were embedded in Spurr resin. The blocks were sectioned with an ultramicrotome (NOVA, LKB), and the ultrathin sections were placed on 400-mesh copper grids. The sections were stained with uranyl acetate and lead citrate and then examined in a JEOL-JEM 2000 EX electron microscope (JEOL, Japan).
For the negative staining studies, a drop of purified LALF32-51-E7 fusion protein was placed onto a 400-mesh copper grid coated with formvar-carbon film. Following 30 min of sample adsorption and washing with water, grids were stained for 30 s with 2% uranyl acetate. After staining, grids were blotted with Whatman no. 1 filter paper and allowed to air dry for 15 min. Samples were then viewed on a JEOL-JEM 2000 EX electron microscope.

Reduction and S-carbamidomethylation of cysteines
The LALF32-51-E7 protein (10 nmol) was dissolved in 100 ml of 6 M guanidinium chloride, 500 mM Tris, pH 8.1, and incubated for 2 h at 37°C under a nitrogen atmosphere with dithiothreitol (DTT) at a 50-fold molar excess over cysteines. Iodoacetamide was then added at a 2-fold molar excess over DTT, and the reaction proceeded at 25°C for 30 min in the dark. The reduced and fully alkylated protein (rcm-LALF32-51-E7) was desalted on an HPLC system (LKB-Pharmacia, Sweden) using an RP-C4 column (4.6 x 50 mm, Vydac). Elution was performed with a linear gradient of solvent B (0.05% TFA in acetonitrile) from 5 to 60% in 30 min at a flow rate of 0.8 ml/min (solvent A: 0.1% TFA in water). The eluate was monitored at 226 nm. An aliquot was submitted to ESI-MS analysis, and the rest of the sample was evaporated to dryness under vacuum before trypsin digestion.
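The reagent amounts implied by these molar excesses scale with the number of cysteines in the protein, which this section does not state. The sketch below therefore takes the cysteine count as an explicit argument and, for illustration only, evaluates the per-cysteine case; the function name and the returned keys are hypothetical.

```python
def alkylation_reagents(protein_nmol, cysteines_per_molecule,
                        dtt_excess=50, iodoacetamide_over_dtt=2):
    """Return nmol of thiol, DTT and iodoacetamide for the stated molar excesses."""
    thiol_nmol = protein_nmol * cysteines_per_molecule
    dtt_nmol = thiol_nmol * dtt_excess
    iodoacetamide_nmol = dtt_nmol * iodoacetamide_over_dtt
    return {
        "thiol_nmol": thiol_nmol,
        "dtt_nmol": dtt_nmol,
        "iodoacetamide_nmol": iodoacetamide_nmol,
    }

# Per-cysteine baseline for 10 nmol of protein. cysteines_per_molecule=1 is a
# placeholder, not the real count; multiply by the actual number of cysteines.
per_cysteine = alkylation_reagents(protein_nmol=10, cysteines_per_molecule=1)
print(per_cysteine)
```

For 10 nmol of protein this gives 500 nmol of DTT and 1000 nmol of iodoacetamide per cysteine residue in the sequence.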

Mass spectrometry
ESI-MS and MS/MS spectra were acquired using a hybrid quadrupole orthogonal acceleration tandem mass spectrometer, QTOF-2™ (Micromass, Manchester, UK), fitted with a Z-spray nanoflow electrospray ion source.
Other measuring conditions and data processing were the same as reported previously (Gonzalez et al. 2003).