
Hindawi Publishing Corporation
Biotechnology Research International
Volume 2013, Article ID 383646, 9 pages
http://dx.doi.org/10.1155/2013/383646

Research Article
Family-Specific Degenerate Primer Design: A Tool to Design Consensus Degenerated Oligonucleotides

Javier Alonso Iserte,1 Betina Ines Stephan,1 Sandra Elizabeth Goñi,1 Cristina Silvia Borio,1 Pablo Daniel Ghiringhelli,2 and Mario Enrique Lozano1

1 LIGBCM-Área Virosis Emergentes y Zoonóticas, Universidad Nacional de Quilmes, B1876BXD Buenos Aires, Argentina
2 LIGBCM-Área Virosis de Insectos, Universidad Nacional de Quilmes, B1876BXD Buenos Aires, Argentina

Correspondence should be addressed to Javier Alonso Iserte; jiserte@unq.edu.ar

Received 15 October 2012; Accepted 11 January 2013

Academic Editor: Goetz Laible

Copyright © 2013 Javier Alonso Iserte et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Designing degenerate PCR primers for templates of unknown nucleotide sequence may be a very difficult task. In this paper, we present a new method to design degenerate primers, implemented in family-specific degenerate primer design (FAS-DPD) computer software, for which the starting point is a multiple alignment of related amino acid or nucleotide sequences. To assess their efficiency, four different genome collections were used, covering a wide range of genomic lengths: Arenavirus ($10 \times 10^{4}$ nucleotides), Baculovirus ($0.9 \times 10^{5}$ to $1.8 \times 10^{5}$ bp), Lactobacillus sp. ($1 \times 10^{6}$ to $2 \times 10^{6}$ bp), and Pseudomonas sp. ($4 \times 10^{6}$ to $7 \times 10^{6}$ bp). In each case, FAS-DPD designed primers were tested computationally to measure specificity. Designed primers for Arenavirus and Baculovirus were tested experimentally. The method presented here is useful for designing degenerate primers on collections of related protein sequences, allowing detection of new family members.

1. Introduction

The polymerase chain reaction (PCR), one of the most important analytical tools of molecular biology, allows highly sensitive detection and specific genotyping of environmental samples, which is especially important in the metagenomic era [1]. A large list of genome typing applications includes arbitrarily primed PCR [2] (AP-PCR), random amplified primed DNAs [3] (RAPDs), PCR restriction fragment length polymorphism [4] (PCR-RFLP), and direct amplification of length polymorphism [5] (DALP). All of these techniques require a high quality and purity of the specific target template, because any available DNA could be a substrate for the amplification step. In view of this, genotyping procedures of large genomes or complex samples are more reliable if they are based on DNA amplification using specific oligonucleotides. Therefore, primer design is crucial for efficient and successful amplification.

Several primer design programs are available (e.g., OLIGO [6], OSP [7, 8], Primer Master [9], PRIDE [10], Primer3 [11], among others). Regardless of their computational working strategy, all of these use a set of common criteria (e.g., G/C content, melting temperature, etc.) to evaluate the quality of primer candidates in a specific target region selected by the user. Alternative programs are aimed at more specific purposes, such as selection of primers that bind to conserved genomic regions based on multiple sequence alignments [12, 13], primer design for selective amplification of protein-coding regions [14], oligonucleotide design for site-directed mutagenesis [15], and primer design for hybridization [16]. Usually, the design of truly specific primers requires the information of the complete nucleotide sequence. This is the starting point for most of the programs described in the literature. However, the need to design specific primers is not always accompanied by complete knowledge of the target genome sequence.

A primer, or more generally any DNA sequence, is called specific if it represents a unique sequence and is called degenerate if it represents a collection of unique sequences. For example, the amino acid sequence "YHP" could be coded by "TATCATCCC", "TACCATCCA", or "TACCACCCG", among others; all of these are unique sequences that can be summarized in a "degenerate" nucleotide sequence, "TAYCAYCCN", using IUPAC code. Operatively, the use of a degenerate primer implies the use of a population of specific primers that cover all the possible combinations of nucleotide sequences coding for a given protein sequence. Primers including modified bases can also be used, since some modified bases can match different bases.
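As a minimal sketch of what a degenerate primer stands for, the Python snippet below expands an IUPAC-coded sequence into the population of specific sequences it represents. The function names `degeneracy` and `expand` are illustrative assumptions and are not part of FAS-DPD; the example uses TAYCAYCCN, the degenerate coding sequence for "YHP" (histidine is coded by CAT or CAC, hence CAY).

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of specific sequences represented by a degenerate primer."""
    total = 1
    for code in primer:
        total *= len(IUPAC[code])
    return total

def expand(primer):
    """List every specific sequence represented by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in primer))]

# Hypothetical usage: the population behind the degenerate sequence for "YHP".
population = expand("TAYCAYCCN")
print(degeneracy("TAYCAYCCN"), len(population))
```

The degeneracy is simply the product of the per-position ambiguities, which is why a few "N" positions inflate the primer population quickly.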

Although an increase in degeneracy raises the chance of unspecific annealing of the designed primers, it also increases the probability of finding unknown divergent variants of a sequence family. This dual behavior must be taken into account during the design. The algorithmic search for primers that include degenerated positions is usually defined as the degenerate primer design (DPD) problem. In recent years, several methods were developed to solve the DPD problem. Each one has a specific scope or is designed to solve a variant of the problem, but all of them aim to minimize the number of degenerations of the resulting primers.

The DPD problem has been expressed in different ways by many researchers. Linhart and Shamir [17] presented the maximum coverage DPD problem (MC-DPD), with the goal of finding a primer that covers the maximum number of input sequences. The selection of primers is constrained by limiting the maximum degeneracy. They also stated the minimum degeneracy DPD problem (MD-DPD), in which the objective is finding a primer with the minimum degeneracy that covers all the input sequences. To solve MC-DPD, they developed the HYDEN program [18]. Wei et al. [19] developed the DePiCt program, which uses hierarchical clustering of protein blocks to design the primers. Rose et al. [20] developed a method for hybrid degenerate-nondegenerate primers, where the 3′ region is degenerated and the 5′ region is a consensus clamp. It was implemented in the CODEHOP [21] and iCODEHOP [22] programs and was used to search for new members of protein families and for identification and characterization of viral genomes. Balla and Rajasekaran [23] described a method for a variant of MD-DPD that tolerates mismatch errors, implemented in the minDPS program. The programs PT-MIPS and PAMPS mainly address the problem of multiple degenerate primer design. The aim of these programs is finding the minimum number of degenerate primers that cover all the input sequences, taking into account that none of them may be more degenerated than an input value.

In this study, a new method for solving the DPD problem is proposed, in which the focus is shifted away from the global minimum degenerated primer in favor of maximizing a score value that includes degeneracy weighted by its proximity to the 3′ end of the primer. This minimizes the degeneracy at that end while allowing more freedom in the remaining positions. Thus, the best scoring primers may not be the least degenerated, but they take into account a biological constraint that is not so heavily considered in other methods. The 3′ end is the essential anchoring site because it is where the polymerase initiates its activity. From a strategic point of view, a decision must be made whether or not to allow degeneracy at this end. The presence of degeneracy at the 3′ end probably assures a greater diversity of sequences to be detected. However, at the same time, it diminishes the proportion of primer specific for a given sequence. Therefore, we decided to be very strict in the search for conserved regions and to minimize the amount of degeneracy incorporated at this end. If the input set of sequences is sufficiently large, it is highly probable that a region identified as conserved among all known sequences will likewise be conserved in any new member of the family.

2. Scoring and Primer Search Strategy

The method presented here can be used starting with DNA or protein sequence alignments (Figure 1(a)). If the input is DNA, the sequences are aligned to obtain one global degenerate DNA consensus. If the input is a protein alignment, each protein of the alignment is backtranslated into a degenerate DNA sequence, and all the degenerate DNA sequences are combined in one global degenerate DNA consensus. This consensus sequence covers all the putative sequences that could be the origin of each protein sequence (Figure 1(b)). The consensus sequence may also code for amino acids that were not detected in the known sequences. This is inevitable given the degeneracy of the genetic code.
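The backtranslation and combination steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the FAS-DPD implementation: the per-amino-acid IUPAC codons in `BACKTRANSLATION` follow the standard genetic code, the names `backtranslate` and `consensus` are hypothetical, and the input is assumed to be a gap-free aligned block (gap handling is not shown).

```python
# IUPAC nucleotide codes and the inverse lookup from a set of bases to its code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}

# One degenerate codon per amino acid (standard genetic code). Codons such as
# YTN, MGN, and WSN cover all codons of Leu, Arg, and Ser, and a few extra ones.
BACKTRANSLATION = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}

def backtranslate(protein):
    """Degenerate DNA sequence that could code for the given protein."""
    return "".join(BACKTRANSLATION[aa] for aa in protein)

def consensus(dna_seqs):
    """Column-wise union of equal-length degenerate DNA sequences."""
    result = []
    for column in zip(*dna_seqs):
        bases = set()
        for code in column:
            bases.update(IUPAC[code])
        result.append(CODE_FOR[frozenset(bases)])
    return "".join(result)

# Hypothetical usage with two short aligned peptides.
print(consensus([backtranslate("WTQ"), backtranslate("WVQ")]))
```

Taking the union of bases at each column is what lets the consensus code for amino acids absent from the input, as noted in the text.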

Figure 1: Minimum degenerated sequence generation. (a) Diagram of the general strategy used. (b) Sample protein alignment showing an example for the steps of the strategy diagram. Each sequence is computationally backtranslated to hypothetical nucleic acid sequences. IUPAC codes were used to show ambiguous positions. These sequences are piled up in order to get the degenerated consensus sequence. Numbers below this indicate the degeneration value of each position.
[Panel (a) steps: multiple protein alignment, backtranslation, degenerate DNA alignment (or DNA alignment), combination, degenerate DNA sequence, collection of oligonucleotides, evaluation of primers score, result with the best-scoring primers.]

Then, the degenerate consensus sequence is analyzed using an overlapping window-based strategy. The window length corresponds to the required oligonucleotide length, and each window corresponds to a putative primer. For each candidate primer, a score is calculated. First, for each position of a candidate primer, a position score ($Sp_i$) is calculated using (1):

$$Sp_i = 1 - \log_{10}(ND_i), \qquad (1)$$

where $ND_i$ is the degeneracy value at position $i$ of the oligonucleotide ($1 \le i \le n$, where $n$ is the length of the primer). $ND_i$ is 1 for "A, C, G, or T"; 2 for "K, M, R, S, W, or Y"; 3 for "B, D, H, or V"; and 4 for "N". This expression takes a value of 1 for nondegenerate bases and decreases for more degenerated bases. On the other hand, it is known that in PCR reactions the 3′ end of the primer is more important than the 5′ end. The region of the 3′ end of the primer must be as little degenerated as possible, because good annealing at this end is imperative in order to minimize unspecific amplifications. Considering this, the value of $Sp_i$ is multiplied by a weighting value ($Wp_i$) defined by a straight line function that increases as it comes closer to the 3′ end (2):

$$Wp_i = pA + \frac{i \times (N_y - pA)}{N_x}, \qquad (2)$$

where $i$ is the position from the 5′ end along the oligonucleotide ($1 \le i \le n$, where $n$ is the length of the primer), and $pA$, $N_y$, and $N_x$ are user-adjustable parameters defining the straight line function: $pA$ is the axis intersection and $(N_y - pA)/N_x$ is the slope. Default values for $pA$, $N_y$, and $N_x$ are 0, 1, and 1, respectively. Changing them allows the user to be more or less strict about including degenerations closer to the 3′ end of the primer. Increasing $pA$ or $N_x$, or decreasing $N_y$, results in lesser stringency on the designed primer. Finally, to obtain a scaled global score ($S_g$), the sum of $Wp_i \times Sp_i$ over all positions is divided by the maximum possible score ($M_s$, (3)). The global normalized score ($S_g$) is calculated according to (4). In this way, the $S_g$ value varies from 0 to 1. The maximum score is obtained when the value of $Sp_i$ is 1 for each position; therefore, $ND_i$ must also be 1, which only happens with nondegenerated primers.

$$M_s = n \times pA + \frac{(n + 1) \times n \times (N_y - pA)}{2 \times N_x}, \qquad (3)$$

$$S_g = \frac{\sum_{i=1}^{n} Sp_i \times Wp_i}{M_s}. \qquad (4)$$
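Equations (1) to (4) and the overlapping-window scan can be sketched in Python as below. This is a minimal illustration, assuming the primer is written 5′ to 3′ so that position $i = 1$ is the 5′ end; the function names and the `scan` helper are hypothetical and do not come from the FAS-DPD source code.

```python
import math

# Degeneracy value ND for each IUPAC code, as listed in the text.
ND = {**dict.fromkeys("ACGT", 1), **dict.fromkeys("KMRSWY", 2),
      **dict.fromkeys("BDHV", 3), "N": 4}

def position_score(code):
    """Equation (1): Sp_i = 1 - log10(ND_i)."""
    return 1.0 - math.log10(ND[code])

def weight(i, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (2): straight-line weight, larger toward the 3' end."""
    return pA + i * (Ny - pA) / Nx

def max_score(n, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (3): sum of the weights of a nondegenerate primer of length n."""
    return n * pA + (n + 1) * n * (Ny - pA) / (2 * Nx)

def global_score(primer, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (4): weighted position scores divided by the maximum score."""
    n = len(primer)
    total = sum(position_score(code) * weight(i, pA, Ny, Nx)
                for i, code in enumerate(primer, start=1))
    return total / max_score(n, pA, Ny, Nx)

def scan(consensus, length, top=5):
    """Score every overlapping window; return the best (score, start, primer)."""
    windows = [(global_score(consensus[s:s + length]), s, consensus[s:s + length])
               for s in range(len(consensus) - length + 1)]
    return sorted(windows, key=lambda w: (-w[0], w[1]))[:top]

# Hypothetical usage: the same "N" costs more at the 3' end than at the 5' end.
print(global_score("NCGT"), global_score("ACGN"))
```

With the default parameters, `global_score("RCNWHRTTNYCRAARCAYTT")` evaluates to approximately 0.8596, which agrees with the score reported for primer GR1058 in Section 4.3. Primers for the opposite strand would be obtained by scanning the reverse complement of the consensus, which is not shown here.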

3. Methods

3.1. Alignment and Sequence Comparison Tools. For global alignment of protein sequences, the program ClustalW 1.83 [24] was used with default parameters. Local alignments of proteins against genomes were made using stand-alone BLAST 2.2.13 [25] with default parameters. Oligonucleotide match searches were made with specifically developed tools written in the C language.

3.2. Sequence Data. Several sets of sequences were used in the tests of the program, both for designing primers and for comparing the primer sequences against genomes. GenBank accession numbers for all sequences are presented in Table 1.

3.3. Filtering Primers. In addition to the scoring process, FAS-DPD can optionally filter the primers individually according to common criteria: melting point temperature (estimated using SantaLucia's method [26]), G + C content, 5′ versus 3′ stability, presence of tandem repeats of the same base occurring at the 3′ end or at any place in the sequence, presence of a degenerated position at the 3′ end, and formation of homodimer structures. Primer pairs can also be filtered according to amplification product size, melting point temperature compatibility, G + C content compatibility, and formation of heteroduplex structures.
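Three of the individual filters listed above (G + C content, runs of the same base, and a degenerated 3′ end) can be sketched in Python as follows. The thresholds and function names are assumptions chosen for illustration; the defaults used by FAS-DPD are not stated in the text, and the melting temperature, stability, and dimer filters are not shown.

```python
import re

# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def gc_range(primer):
    """Lowest and highest G+C fraction among the sequences a primer represents."""
    n = len(primer)
    always_gc = sum(1 for c in primer if set(IUPAC[c]) <= {"G", "C"})
    maybe_gc = sum(1 for c in primer if set(IUPAC[c]) & {"G", "C"})
    return always_gc / n, maybe_gc / n

def has_long_run(primer, max_run=4):
    """True if one nondegenerate base is repeated more than max_run times in a row."""
    pattern = r"([ACGT])\1{%d,}" % max_run
    return re.search(pattern, primer) is not None

def degenerate_three_prime_end(primer):
    """True if the last (3') position is a degenerate code."""
    return primer[-1] not in "ACGT"

def passes_filters(primer, gc_min=0.2, gc_max=0.8, max_run=4):
    """Apply the three filters; the thresholds are illustrative assumptions."""
    low, high = gc_range(primer)
    return (low >= gc_min and high <= gc_max
            and not has_long_run(primer, max_run)
            and not degenerate_three_prime_end(primer))

# Hypothetical usage on one of the primers reported in Section 4.3.
print(gc_range("GGNRYNSWNCCRAAYTGRTT"), passes_filters("GGNRYNSWNCCRAAYTGRTT"))
```

For a degenerate primer the G + C content is a range rather than a single value, so the sketch reports both extremes and requires the whole range to fall inside the accepted interval.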

3.4. PCR Amplification. The PCR conditions used in all experiments follow a common protocol. The reaction mix contained 1X Taq DNA polymerase buffer (Productos Bio-Lógicos, Argentina), 0.2 mM dNTPs, 0.5 μM of each primer, 20 pM template, and different concentrations of MgCl2 and dimethyl sulfoxide (DMSO) in different reactions. MgCl2 was used from 2 mM to 3 mM, and DMSO was used from 0% (v/v) to 5% (v/v). The reactions were performed in a total volume of 10 μL, and the thermal profile consisted of an initial denaturation step of 94°C for 2 min, followed by 35 cycles of denaturation/annealing/extension steps. The denaturation step was at 92°C for 10 seconds; the temperature of the annealing step was not the same in all experiments, varying from 45°C to 60°C, and the time was always 15 seconds (see Figure 4). The extension step was at 72°C for 15 seconds. In all cases, one of the primers was specific for the template, while the other primer was designed by the method described in this work. The last step was a final extension of 5 minutes at 72°C. For Junin virus, the template used was a plasmid containing a cDNA copy of the JUNV S genomic segment. For Baculovirus, the template was a plasmid containing a fragment of the Anticarsia gemmatalis MNPV p74 gene. Sensitivity of the PCR assay was determined by dilution of cloned fragments from Junin virus [27] and Baculovirus template.

Table 1: List of sequences used in the test of FAS-DPD. Accession numbers and brief description are presented.

Acc. number    Sequence description

Arenaviral sequences
AY129248.1     Machupo v. st. Carvallo
U41071.1       Sabia v.
AF485260.1     Machupo v. st. Carvallo
EU260463.1     Chapare v. st. 810419
AY924206.1     Machupo v. st. MARU-216606
AY081210.1     Allpahuayo v. CLHP-2098
AY924202.1     Machupo v. st. Chicava
AY012686.1     Allpahuayo v. from Peru
AY624355.1     Machupo v. st. Chicava
AY012687.1     Allpahuayo v. st. CLHP-2472
AY924205.1     Machupo v. st. 9301012
AF485262.1     Pirital v. st. VAV-488
AY619645.1     Machupo v. st. Mallele
AF277659.1     Pirital v.
AY924203.1     Machupo v. st. 9430084
M16735.1       Pichinde v.
AY924208.1     Machupo v. st. MARU 249121
AF485261.1     Parana v. st. 12056
AY924204.1     Machupo v. st. 200002427
AF512829.1     Parana v. st. 10256
AY924207.1     Machupo v. st. MARU 222688
AF512831.1     Flexal v. st. BeAn 293022
AY571959.1     Machupo v. st. 9530537
AF485257.1     Flexal v. st. Pinheiro
AY746353.1     Junin v. st. Candid-1
AF512831.1     Flexal v. st. BeAn 293022
AY358023.2     Junin v. st. XJ13
AF512830.1     Latino v. st. MARU 10924
AY619641.1     Junin v. st. Rumero
AF485259.1     Latino v. st. Maru 10924
D10072.2       Junin v. st. MC2
U34248.1       Oliveros v.
M20304.1       Tacaribe v.
AY847350.1     LCM v. st. Armstrong 53b
AF485256.1     Amapari v. st. BeAn 70563
M20869.1       LCM v. st. Armstrong 53b
AF512834.1     Amapari v. st. BeAn 70563
EU136038.1     Dandenong v. is. 0710-2678
AF512832.1     Cupixi v. st. BeAn 119303
DQ328874.1     Mopeia v. st. Mozambique
AY129247.1     Guanarito v. st. INH-95551
DQ328877.1     Ippy v. st. Dak-An-B-188-d
AF485258.1     Guanarito v. st. INH-95551
X52400.1       Nigeria Lassa v.
AY497548.1     Guanarito v. st. CVH-960101
AY628206.1     Lassa v. st. Weller
AY924392.1     Bear Canyon v. st. AV 98470029
AY628201.1     Lassa v. st. Macenta
AY924391.1     Bear Canyon v. st. AV A0070039
AY628205.1     Lassa v. st. Z148
AF512833.1     Bear Canyon v. st. A0060209
J04324.1       Lassa v. st. Josiah
DQ865244.1     Catarina v. st. AV A0400135
AY772168.1     Mopeia Lassa reassortant 29
DQ865245.1     Catarina v. st. AV A0400212
AY628203.1     Lassa v. st. Josiah
EU123328.1     Skinner Tank v. st. AV D1000090
AF181853.1     Lassa v. st. LP
EU123331.1     North American arenav. st. AV 96010024
AY628207.1     Lassa v. st. Pinneo
EU123330.1     North American arenav. st. AV 96010151
AY628208.1     Lassa v. st. Acar-3080
AF228063.1     Whitewater Arroyo v. st. 9310135
AF181854.1     Lassa v. st. 803213
AF485264.1     Whitewater Arroyo v. st. 9310141
AY342390.1     Mobala v. st. ACAR-3080-MRC5-P2
EU123329.1     North American arenav. st. AV D1240007
M33879.1       Mopeia v. st. AN-21366
AF485263.1     Tamiami v. st. CDCW-10777
AY772170.1     Mopeia v. st. AN-20410
AF512828.1     Tamiami v. st. W 10777

Baculoviral sequences
AP006270.1     Adoxophyes honmai nucleopolyhedrovirus DNA
X77048.1       Cryptophlebia leucotreta granulosis
AF547984.1     Adoxophyes orana granulovirus
X79569.1       Cryptophlebia leucotreta granulosis
NC_005839.2    Agrotis segetum granulovirus
NC_002816.1    Cydia pomonella granulovirus
L22858.1       Autographa californica nucleopolyhedrovirus clone C6
NC_003083.1    Epiphyas postvittana NPV
L33180.1       Bombyx mori nuclear polyhedrosis virus isolate T3
NC_002654.2    Helicoverpa armigera
NC_005137.2    Choristoneura fumiferana DEF MNPV
AF081810.1     Lymantria dispar
NC_004778.3    Choristoneura fumiferana MNPV
NC_003529.1    Mamestra configurata NPV-A
AY864330.1     Chrysodeixis chalcites NPV
U75930.2       Orgyia pseudotsugata MNPV
AY456389.1     Chrysodeixis chalcites NPV
AF499596.1     Phthorimaea operculella granulovirus
AY456390.1     Chrysodeixis chalcites NPV
NC_002593.1    Plutella xylostella granulovirus
AY545786.1     Chrysodeixis chalcites NPV
NC_004323.1    Rachiplusia ou MNPV
AY545787.1     Chrysodeixis chalcites NPV
NC_002169.1    Spodoptera exigua MNPV
AY229987.1     Cryptophlebia leucotreta granulovirus
NC_003102.1    Spodoptera litura NPV
AY096241.1     Cryptophlebia leucotreta granulovirus
NC_007383.1    Trichoplusia ni SNPV
AY096242.1     Cryptophlebia leucotreta granulovirus

Pseudomonas sp. sequences
NC_007492.2    Pseudomonas fluorescens Pf0-1
NC_004578.1    Pseudomonas syringae
NC_005773.3    Pseudomonas syringae
NC_002947.3    Pseudomonas putida
NC_004129.6    Pseudomonas fluorescens
NC_002516.2    Pseudomonas aeruginosa
NC_007005.1    Pseudomonas syringae

Lactobacillus sp. sequences
NC_005362.1    Lactobacillus johnsonii
NC_002662.1    Lactococcus lactis subsp.
NC_007576.1    Lactobacillus sakei subsp.
NC_004567.1    Lactobacillus plantarum

Figure 2: Primer distribution along one ORF. A collection of the best scoring primers for the nucleoprotein of Arenavirus, comprising 50 primers for the genomic sequence and 50 for the antigenomic sequence, was represented at the corresponding alignment positions. The height of each point indicates the cumulative number of primers corresponding to this position. The alignment was made with 71 arenavirus N protein sequences.
[Plot axes: x-axis, position in the multiple alignment (0 to 1800); y-axis, number of primers (0 to 12).]

4. Results

4.1. Distribution of Generated Primers. The distribution of the resulting primers along the input sequence was analyzed. For this, the best one hundred primers obtained from a protein alignment were selected. For each position in the alignment, the number of selected primers that correspond to this position was recorded (Figure 2). The test was repeated for different protein alignments.
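The per-position count described above can be sketched as follows. This is only an illustration: it assumes that a primer "corresponds" to every alignment position it spans (the text does not say whether only the start position was counted), and the function name `coverage_profile` is hypothetical.

```python
def coverage_profile(starts, primer_length, alignment_length):
    """Count, for each alignment position, how many selected primers span it.

    starts: 0-based start positions of the selected primers.
    """
    counts = [0] * alignment_length
    for start in starts:
        end = min(start + primer_length, alignment_length)
        for pos in range(start, end):
            counts[pos] += 1
    return counts

# Hypothetical usage: two overlapping primers and one isolated primer.
profile = coverage_profile([0, 1, 10], primer_length=3, alignment_length=15)
print(profile)
```

Peaks in such a profile correspond to the hot spots where many near-identical primers pile up.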

The selected primers were located around a few hot spots in the alignment. This behavior indicates that there are generally few regions in a sequence alignment useful for degenerate primer design. Many primers found by the program are almost identical, shifted by one or two bases relative to each other, and located in most cases in a 30–40 base run. Similar results were obtained with all proteins tested.

4.2. Intragenomic Specificity and Score Analysis. Because it is possible that the best primers are not the least degenerated substrings in the collection of candidates, their specificity was tested. It was also necessary to get a more precise understanding of the score assigned by FAS-DPD in terms of specificity. To achieve this, the primers were compared with the complete genome sequences used to design them, looking for unspecific perfect matches.

For this task, a wide range of genome sizes was covered. Four collections of complete genome sequences were used: Arenavirus (genome on the order of $10^{4}$ bases), Baculovirus (genome on the order of $10^{5}$ bases), Lactobacillus (genome on the order of $10^{6}$ bases), and Pseudomonas (genome on the order of $10^{6}$ bases). For each set, a randomly selected genome was used as reference. Each annotated ORF of this genome was used to search for related ORFs in the other genomes of the collection using the local BLAST tool. The expected value of BLAST was used to decide when two ORFs were related. When an ORF of the reference genome had a related one in all other genomes, all of them were aligned with ClustalW and used in further analysis.

Each resulting alignment was used as input for FAS-DPD to search for primers. For each genome polarity, the best fifty nonoverlapping primers were selected. This selection was made to avoid a concentration of overrepresented, hot-spot-derived, high-score primers. This allowed us to find a balanced set of primers with high and low scores.
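One way to pick the best nonoverlapping primers is a greedy pass over the candidates in decreasing score order, sketched below. The text does not describe how FAS-DPD performs this selection, so the greedy rule and the name `select_nonoverlapping` are assumptions made for illustration.

```python
def select_nonoverlapping(candidates, length, limit=50):
    """Greedily keep the highest-scoring primers that do not overlap.

    candidates: iterable of (score, start) pairs for primers of equal length.
    Two primers overlap when their start positions differ by less than length.
    """
    chosen = []
    for score, start in sorted(candidates, key=lambda c: (-c[0], c[1])):
        if all(abs(start - kept_start) >= length for _, kept_start in chosen):
            chosen.append((score, start))
            if len(chosen) == limit:
                break
    return chosen

# Hypothetical usage: the near-duplicate at position 11 is dropped.
picked = select_nonoverlapping(
    [(0.90, 10), (0.89, 11), (0.80, 40), (0.70, 30)], length=20)
print(picked)
```

Because neighbors of an accepted primer are skipped, lower-scoring primers from other regions enter the set, which matches the stated goal of a balanced mix of high and low scores.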

In order to find the relationship between the score calculated for each primer and its specificity, all the primers were compared with all the oligonucleotides of the same size derived from each genome, searching for perfect matches (Figure 3). The results were similar for the four systems despite their differences in genome size.
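The perfect-match search can be sketched as a direct scan of the genome with the degenerate primer. The authors used purpose-built C tools whose details are not given, so the Python below is only an illustrative stand-in; it assumes the genome contains only A, C, G, and T, and the function names are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a nondegenerate DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def count_perfect_matches(primer, genome):
    """Number of genome windows compatible with every position of the primer."""
    n = len(primer)
    allowed = [IUPAC[code] for code in primer]
    hits = 0
    for start in range(len(genome) - n + 1):
        if all(genome[start + k] in allowed[k] for k in range(n)):
            hits += 1
    return hits

def count_both_strands(primer, genome):
    """Perfect matches on the given strand plus its reverse complement."""
    return (count_perfect_matches(primer, genome)
            + count_perfect_matches(primer, reverse_complement(genome)))

# Hypothetical usage on a toy sequence.
print(count_both_strands("TAY", "TATGGTACTAA"))
```

This naive scan costs time proportional to genome length times primer length, which is acceptable for a sketch but slower than an indexed search on bacterial genomes.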

There is an inverse correlation between primer score and the number of unspecific perfect matches, but this correlation is not linear. The number of unspecific perfect matches between primers with a minimal score of 0.85 and their target genome was generally zero. The number of unspecific perfect matches grew enormously with lower primer scores.

Figure 3: Specificity of primers. Primers designed for all ORFs shared among each model organism used were compared against the complete set of genomes for perfect matches with oligonucleotides of the same length. Each point represents the number of perfect matches (in $\log_{10}$ scale) of a primer in relation to its score. The length of the primers was 20 nucleotides. (a) Arenavirus genomes: 71 for S (small) RNA, 24 for L (large) RNA. (b) 22 Baculovirus genomes. (c) 5 Lactobacillus sp. genomes. (d) 7 Pseudomonas sp. genomes. (e) A set of primers for Lactobacillus sp. with scores between 0.85 and 0.90 were tested for nonperfect matches that could anneal unspecifically in PCR. Each bar represents the number of matches against the complete set of Lactobacillus genomes. The number below the bar indicates how many bases are shared.
[Plot axes: panels (a) to (d), x-axis, score (0.45 to 1); y-axis, $\log_{10}$(number of matches). Panel (e), x-axis, base pairing (0 to 20); y-axis, $\log_{10}$(number of matches).]

Figure 4: Experimental challenge of designed primers. (a) Genomic organization of the Arenavirus S RNA and p74 ORF. Arenavirus shows an ambisense coding strategy of the GPC and N ORFs and three noncoding regions: 5′ untranslated region (5′UTR), intergenic region (IGR), and 3′ untranslated region (3′UTR). The location of each designed primer (GR1058, N918, N527, and p74-1334r) and specific primers (Arena, p74-550) is also shown. (b) The results obtained with each pair of primers tested and characteristics of reaction are shown.
[Panel (b) columns: primers (FAS-DPD/specific), product size, annealing temperature (°C), score, and sensitivity (copies/reaction).]

4.3. Experimental Challenge. In addition to theoretical tests to determine the usefulness of FAS-DPD designed primers, experimental challenges were performed using Arenavirus and Baculovirus as models. The assay consisted of performing PCRs using a pair of primers that included a degenerated FAS-DPD designed primer and a standard nondegenerated primer (this allowed each designed primer to be tested individually), optimizing the reaction conditions, and measuring the sensitivity.

For arenavirus, the primers were designed using sequences of 71 different GenBank records for the nucleoprotein (N protein) and the glycoprotein precursor (GPC protein). From the lists of the highest scored primers, three were randomly selected and synthesized for experimental evaluation: one for GPC (GR1058: RCNWHRTTNYCRAARCAYTT, score 0.8596) and two for N (N527: GGNRYNSWNCCRAAYTGRTT, score 0.8494; N918: NANRTTYTCRTANGGRTTNC, score 0.8437) (Figure 4(a)).

Amplification reactions were performed using each of these primers together with the Arena primer (CGCACCGGGGATCCTAGGC) as nondegenerated counterpart. The latter is a generic primer for arenaviruses that matches almost perfectly with the nineteen bases of the 3′ end of the genomic RNA sequence and with the nineteen bases of the 3′ end of the antigenomic RNA sequence of all known arenaviruses. The reaction template was a cDNA corresponding to the Junin virus small RNA segment, which encodes the N and GPC proteins.

For Baculovirus, one primer (p74-1334r: BYRWRNCCVWRNGGRTCSCA, score 0.8281) was designed using 57 p74 sequences from different baculoviruses. As its counterpart, a specific primer for Anticarsia gemmatalis MNPV was used [28] (p74-550: GGcGTGGACGACGTGC). The reaction template was the Anticarsia gemmatalis MNPV p74 isolate 2D [29] gene cloned in a plasmid.

PCRs were assayed with different sets of conditions, and the sensitivity was measured. Sensitivity achieved with arenavirus primers was high: twenty copies/μL or less of specific template were detected. For Baculovirus, the detection was not as sensitive as for arenavirus, but the sensitivity can still be considered good: $2 \times 10^{4}$ copies/μL of specific template were detected. This difference can be explained by taking into account that the divergence observed for baculovirus sequences is greater than for arenavirus. Therefore, the score for the p74-1334r primer was lower than that of the arenavirus primers.

4.4. Increment of Degeneration of FAS-DPD Designed Primers in Relation to Minimum Degenerated Substring. The aim of FAS-DPD is to design universal degenerate primers, which are not necessarily the least degenerate sequences in the collection of candidates. To find out how much degeneration FAS-DPD designed primers acquire, another test was performed. Given an alignment of homologous ORFs, the degeneration value was calculated for the highest-scoring primer selected with FAS-DPD and for the minimum degenerated substring of the same length. Then, the ratio of these two values was obtained. The comparison was made with the complete set of ORF alignments used before (Arenavirus, Baculovirus, Pseudomonas, and Lactobacillus) (Figure 5). In more than 90% of the cases, the increase in degeneration value is at most fourfold (e.g., changing “A” to “N” or “AA” to “RW”). Therefore, these primers have at most two more degenerate positions than the substring with minimum degeneration.

Figure 5: Comparison of FAS-DPD designed primers and minimum degenerated substrings. [Bar chart; x-axis: degeneration value ratio (1, 2, 4, 8, 16, 32, 64); y-axis: number of primers.] The collection of highest-scoring primers designed for all the ORFs shared by all the genomes used was compared against the minimum degenerated subsequence of the same length for each ORF, in order to know how much more degenerate they are. The number below each bar indicates the ratio of degeneration between the designed primer and the minimum degeneration substring. The number above each bar indicates the number of primers with that ratio. The percentages are cumulative with respect to increasing degeneration ratios and refer to the total number of primers used in the test.

It is important to note that, in general, there is more than one minimum degeneracy substring for each ORF. The decision of which primer is better must not take into account only the degeneration value: the position of the degenerate bases in the sequence is crucial. The greatest increase in degeneration found was a ratio of 64, which corresponds to less than 0.1% of the primers. This result shows that FAS-DPD primers are more degenerate than the least degenerate substring, but the increase is slight and does not imply a large compromise in specificity.
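The ratio used in this comparison can be sketched in Python as follows. This is a minimal illustration, not the FAS-DPD implementation: it assumes that the degeneration value of an oligonucleotide is the product of its per-position degeneracies (the number of distinct sequences it represents), which is consistent with the “A” to “N” and “AA” to “RW” examples given for a fourfold increase. All function names are hypothetical.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide symbol.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "K": 2, "M": 2, "R": 2, "S": 2, "W": 2, "Y": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(oligo):
    """Number of distinct sequences a degenerate oligonucleotide represents
    (assumed here to be the 'degeneration value')."""
    return prod(IUPAC_DEGENERACY[base] for base in oligo.upper())


def min_degeneracy_window(consensus, length):
    """First window of the given length with the lowest degeneracy."""
    windows = [consensus[i:i + length]
               for i in range(len(consensus) - length + 1)]
    return min(windows, key=degeneracy)


def degeneracy_ratio(primer, consensus):
    """Degeneracy of a primer relative to the least degenerate window
    of the same length in the consensus."""
    best = min_degeneracy_window(consensus, len(primer))
    return degeneracy(primer) / degeneracy(best)
```

Under this assumption, a ratio of 4 corresponds to one fully degenerate position (“N”) or two twofold positions more than the least degenerate window, and a ratio of 64 to, for example, three additional “N” positions.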

5. Discussion

In this work, we presented a new algorithm, implemented in the FAS-DPD software, as an alternative strategy for solving DPD problems. FAS-DPD takes multiple alignments of proteins or nucleic acids as input and constructs from them a consensus degenerate sequence, which is then used to design the putative primers.

Experimental experience in molecular biology shows that the 3′ ends of primers are key determinants of successful amplification. FAS-DPD takes this property into account by incorporating special considerations into the global score calculation, being stricter at the 3′ end than at the 5′ end.

The specificity of the primers designed with FAS-DPD was tested computationally with several collections of whole genomes ranging from $10^4$ bp to $10^6$ bp. Larger genomes were not tested because whole-genome collections with several members were lacking for genera with larger genomes. In all the genome collections assayed, the results showed the same behavior: there is a relationship between the score value and the number of unspecific perfect matches. This analysis allows us to suggest a cut-off score (0.85) above which primers are more likely to be successful.

PCRs were successfully performed on the arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested under different conditions in order to optimize its yield.

FAS-DPD software is licensed under the GNU General Public License, Version 3, and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for the detection of known or unknown members of gene families or organism families, including different types of pathogens. This tool would also allow a more efficient search for enzymes and other proteins of commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, “Fingerprinting genomes using PCR with arbitrary primers,” Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, “DNA polymorphisms amplified by arbitrary primers are useful as genetic markers,” Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, “Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, “Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species,” Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, “A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA,” Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, “OSP: a computer program for choosing PCR and DNA sequencing primers,” PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, “PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing,” Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, “Primer Master: a new program for the design and analysis of PCR primers,” Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.


[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, “Primer design for large scale sequencing,” Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, “Primer3 on the WWW for general users and for biologist programmers,” Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, “The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses,” Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, “Primaclade: a flexible tool to find conserved PCR primers across multiple species,” Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, “Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets,” Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, “The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis,” BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, “Software to determine optimal oligonucleotide sequences based on hybridization simulation data,” BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, “Degenerate primer design: theoretical analysis and the HYDEN program,” Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, “The degenerate primer design problem,” Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, “Degenerate primer design via clustering,” IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, “CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design,” Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, “CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes,” Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, “iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences,” Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, “An efficient algorithm for minimum degeneracy primer selection,” IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero et al., “Concerning the epidemic outbreak in Junin,” El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich et al., “Functional and structural characterisation of AgMNPV ie1,” Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak et al., “Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus,” Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, “Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy,” Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.


coded by “TATCATCCC”, “TACCATCCA”, or “TACCACCCG”, among others; all of these are unique sequences that can be summarized in a “degenerate” nucleotide sequence, “TAYCAYCCN”, using the IUPAC code. Operationally, using a degenerate primer means using a population of specific primers that covers all the possible combinations of nucleotide sequences coding for a given protein sequence. Primers including modified bases can also be used, since some modified bases can match different bases.
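As an illustration of this “population of primers” view, the following Python sketch expands a degenerate sequence into the specific sequences it represents. The lookup table is the standard IUPAC nucleotide code; the function name `expand` is hypothetical and is not part of FAS-DPD.

```python
from itertools import product

# Standard IUPAC nucleotide code: symbol -> bases it stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand(degenerate):
    """List every specific sequence covered by a degenerate sequence."""
    pools = [IUPAC_BASES[symbol] for symbol in degenerate.upper()]
    return ["".join(bases) for bases in product(*pools)]
```

For example, `expand("TAYCAYCCN")` contains 2 × 2 × 4 = 16 sequences, among them TATCATCCC, TACCATCCA, and TACCACCCG. The list grows multiplicatively with each degenerate position, which is why highly degenerate primers contain only a small proportion of molecules matching any one template.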

Although increasing degeneracy raises the chance of unspecific annealing of the designed primers, it also increases the probability of finding unknown divergent variants of a sequence family. This dual behavior must be taken into account during the design. The algorithmic search for primers that include degenerate positions is usually defined as the degenerate primer design (DPD) problem. In recent years, several methods have been developed to solve the DPD problem. Each one has a specific scope or is designed to solve a variant of the problem, but all of them aim to minimize the degeneracy of the resulting primers.

The DPD problem has been formulated in different ways by many researchers. Linhart and Shamir [17] presented the maximum coverage DPD problem (MC-DPD), whose goal is to find a primer that covers the maximum number of input sequences, with the selection of primers constrained by a limit on the maximum degeneracy. They also stated the minimum degeneracy DPD problem (MD-DPD), in which the objective is to find a primer with the minimum degeneracy that covers all the input sequences. To solve MC-DPD, they developed the HYDEN program [18]. Wei et al. [19] developed the DePiCt program, which uses hierarchical clustering of protein blocks to design the primers. Rose et al. [20] developed a method for hybrid degenerate-nondegenerate primers, in which the 3′ region is degenerate and the 5′ region is a consensus clamp. It was implemented in the CODEHOP [21] and iCODEHOP [22] programs and was used to search for new members of protein families and for the identification and characterization of viral genomes. Balla and Rajasekaran [23] described a method for a variant of MD-DPD that tolerates mismatch errors, implemented in the minDPS program. The programs PT-MIPS and PAMPS mainly address the problem of multiple degenerate primer design. The aim of these programs is to find the minimum number of degenerate primers that cover all the input sequences, subject to the constraint that none of them is more degenerate than an input value.

In this study, a new method for solving the DPD problem is proposed, in which the focus is shifted away from the global minimum degenerated primer in favor of maximizing a score value that accounts for degeneracy weighted by its proximity to the 3′ end of the primer. This minimizes the degeneracy at that end while allowing more freedom in the remaining positions. Thus, the best-scoring primers may not be the least degenerate, but they take into account a biological constraint that is not so heavily considered in other methods. The 3′ end is the essential anchoring site because it is where the polymerase initiates its activity. From a strategic point of view, a decision must be made whether or not to allow degeneracy at this end. The presence of degeneracy at the 3′ end probably ensures that a greater diversity of sequences is detected. However, at the same time, it diminishes the proportion of the primer population that is specific for a given sequence. Therefore, we decided to be very strict in the search for conserved regions and to minimize the amount of degeneracy incorporated at this end. If the input set of sequences is sufficiently large, it is highly probable that a region identified as conserved among all known sequences will likewise be conserved in any new member of the family.

2. Scoring and Primer Search Strategy

The method presented here can start from DNA or protein sequence alignments (Figure 1(a)). If the input is DNA, the aligned sequences are combined to obtain one global degenerate DNA consensus. If the input is a protein alignment, each protein of the alignment is backtranslated into a degenerate DNA sequence, and all the degenerate DNA sequences are combined into one global degenerate DNA consensus. This consensus sequence covers all the putative DNA sequences that could be the origin of each protein sequence (Figure 1(b)). The consensus sequence may also code for amino acids that were not detected in the known sequences. This is inevitable given the degeneracy of the genetic code.
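The combination step can be sketched in Python as a column-wise union of IUPAC symbols. This is a minimal illustration under stated assumptions, not the FAS-DPD code: the input sequences are assumed to be already aligned, of equal length, and free of gaps, and the function name `consensus` is hypothetical.

```python
# Standard IUPAC nucleotide code: symbol -> bases it stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Reverse lookup: set of bases -> IUPAC symbol.
SYMBOL_FOR = {frozenset(bases): symbol for symbol, bases in IUPAC_BASES.items()}


def consensus(aligned):
    """Combine aligned (possibly degenerate) DNA sequences into one
    degenerate consensus covering every input at every column."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    symbols = []
    for column in zip(*aligned):
        bases = set()
        for symbol in column:
            bases.update(IUPAC_BASES[symbol.upper()])
        symbols.append(SYMBOL_FOR[frozenset(bases)])
    return "".join(symbols)
```

For instance, combining the backtranslated arginine codon MGN with the backtranslated lysine codon AAR gives MRN. That consensus also covers codons for other amino acids, which illustrates why the consensus may code for residues absent from the input alignment.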

Figure 1: Minimum degenerated sequence generation. (a) Diagram of the general strategy used: a multiple protein alignment is backtranslated into a degenerate DNA alignment (a DNA alignment is used directly), the aligned sequences are combined into one degenerate DNA sequence, a collection of oligonucleotides is derived from it, the primer scores are evaluated, and the best-scoring primers are reported. (b) Sample protein alignment showing an example for the steps of the strategy diagram. Each sequence is computationally backtranslated to hypothetical nucleic acid sequences. IUPAC codes were used to show ambiguous positions. These sequences are piled up in order to get the degenerated consensus sequence. Numbers below this indicate the degeneration value of each position.

Then, the degenerate consensus sequence is analyzed using an overlapping window-based strategy. The window length corresponds to the required oligonucleotide length, and each window corresponds to a putative primer. For each candidate primer, a score is calculated. First, for each position of a candidate primer, a position score ($Sp_i$) is calculated using (1):

$$Sp_i = 1 - \log_{10}(ND_i), \tag{1}$$

where $ND_i$ is the degeneracy value at position $i$ of the oligonucleotide ($1 \le i \le n$, where $n$ is the length of the primer). $ND_i$ is 1 for “A, C, G, or T”; 2 for “K, M, R, S, W, or Y”; 3 for “B, D, H, or V”; and 4 for “N”. This expression takes a value of 1 for nondegenerate bases and decreases for more degenerate bases. On the other hand, it is known that in PCR the 3′ end of the primer is more important than the 5′ end. Good annealing at the 3′ end is imperative in order to minimize unspecific amplifications, so this region of the primer must be as little degenerate as possible. Considering this, the value of $Sp_i$ is multiplied by a weighting value ($Wp_i$), defined by a straight-line function that increases toward the 3′ end (2):

$$Wp_i = pA + \frac{i \times (N_y - pA)}{N_x}, \tag{2}$$

where $i$ is the position from the 5′ end along the oligonucleotide ($1 \le i \le n$, where $n$ is the length of the primer), and $pA$, $N_y$, and $N_x$ are user-adjustable parameters defining the straight-line function: $pA$ is the axis intercept and $(N_y - pA)/N_x$ is the slope. Default values for $pA$, $N_y$, and $N_x$ are 0, 1, and 1, respectively. Changing them makes the score more or less strict about degenerations close to the 3′ end of the primer. Increasing $pA$ or $N_x$, or decreasing $N_y$, results in lower stringency on the designed primer. Finally, to obtain a scaled global score ($S_g$), the sum of $Wp_i \times Sp_i$ over all positions is divided by the maximum possible score ($M_s$, (3)). The global normalized score ($S_g$) is calculated according to (4). In this way, the $S_g$ value varies from 0 to 1. The maximum score is obtained when $Sp_i$ is 1 for every position; $ND_i$ must then also be 1, which happens only with nondegenerate primers.

$$M_s = n \times pA + \frac{(n + 1) \times n \times (N_y - pA)}{2 \times N_x}, \tag{3}$$

$$S_g = \frac{\sum_{i=1}^{n} Sp_i \times Wp_i}{M_s}. \tag{4}$$
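Equations (1) to (4) and the sliding-window search can be sketched in Python as follows. This is an illustrative reading of the formulas, not the FAS-DPD source: it assumes primers are written 5′ to 3′ (so position 1 is the 5′ end), and the function names and the `top` parameter are hypothetical.

```python
from math import log10

# Degeneracy value ND for each IUPAC nucleotide symbol.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "K": 2, "M": 2, "R": 2, "S": 2, "W": 2, "Y": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def position_score(symbol):
    """Equation (1): Sp_i = 1 - log10(ND_i)."""
    return 1.0 - log10(IUPAC_DEGENERACY[symbol])


def weight(i, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (2): linear weight that grows toward the 3' end."""
    return pA + i * (Ny - pA) / Nx


def max_score(n, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (3): sum of the weights of a nondegenerate primer."""
    return n * pA + (n + 1) * n * (Ny - pA) / (2 * Nx)


def global_score(primer, pA=0.0, Ny=1.0, Nx=1.0):
    """Equation (4): normalized global score of a 5'->3' primer."""
    primer = primer.upper()
    total = sum(position_score(symbol) * weight(i, pA, Ny, Nx)
                for i, symbol in enumerate(primer, start=1))
    return total / max_score(len(primer), pA, Ny, Nx)


def best_windows(consensus, length, top=5):
    """Score every overlapping window; return (score, start, window)
    tuples for the highest-scoring ones."""
    scored = [(global_score(consensus[s:s + length]), s, consensus[s:s + length])
              for s in range(len(consensus) - length + 1)]
    return sorted(scored, reverse=True)[:top]
```

With the default parameters, `global_score("GGNRYNSWNCCRAAYTGRTT")` evaluates to about 0.8495 and `global_score("RCNWHRTTNYCRAARCAYTT")` to about 0.8596, in line with the scores reported for the N527 and GR1058 primers. The weighting also makes `global_score("NACG")` higher than `global_score("ACGN")`, since the same degenerate base costs more at the 3′ end.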

3. Methods

3.1. Alignment and Sequence Comparison Tools. For global alignment of protein sequences, the program ClustalW 1.83 [24] was used with default parameters. Local alignments of proteins against genomes were made using stand-alone Blast 2.2.13 [25] with default parameters. Oligonucleotide match searches were made with specifically developed tools written in C.

3.2. Sequence Data. Several sets of sequences were used to test the program, both for designing primers and for comparing the primer sequences against genomes. The GenBank accession numbers of all sequences are presented in Table 1.

3.3. Filtering Primers. In addition to the scoring process, FAS-DPD can optionally filter the primers individually according to common criteria: melting temperature (estimated using SantaLucia's method [26]), G + C content, 5′ versus 3′ stability, presence of tandem repeats of the same base at the 3′ end or anywhere in the sequence, presence of a degenerate position at the 3′ end, and formation of homodimer structures. Primer pairs can also be filtered according to amplification product size, melting temperature compatibility, G + C content compatibility, and formation of heteroduplex structures.
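Three of the individual filters listed above can be sketched in Python as follows. This is only an illustration: the thresholds (`gc_min`, `gc_max`, `max_run`) are hypothetical values, not FAS-DPD defaults, the function names are invented for the example, and the melting temperature, stability, and dimer filters are omitted.

```python
# Standard IUPAC nucleotide code: symbol -> bases it stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def gc_range(primer):
    """Lowest and highest G+C fraction among the sequences a
    degenerate primer can represent."""
    always_gc = sum(1 for s in primer if set(IUPAC_BASES[s]) <= {"G", "C"})
    maybe_gc = sum(1 for s in primer if set(IUPAC_BASES[s]) & {"G", "C"})
    return always_gc / len(primer), maybe_gc / len(primer)


def has_degenerate_3prime(primer):
    """True if the last (3') position stands for more than one base."""
    return len(IUPAC_BASES[primer[-1]]) > 1


def longest_run(primer):
    """Length of the longest tandem repeat of the same symbol."""
    best = run = 1
    for previous, current in zip(primer, primer[1:]):
        run = run + 1 if current == previous else 1
        best = max(best, run)
    return best


def passes_filters(primer, gc_min=0.3, gc_max=0.7, max_run=4):
    """Apply the three filters with illustrative thresholds."""
    primer = primer.upper()
    low, high = gc_range(primer)
    return (not has_degenerate_3prime(primer)
            and longest_run(primer) <= max_run
            and high >= gc_min
            and low <= gc_max)
```

Because a degenerate primer stands for several sequences, the G + C check here accepts a primer when at least part of its possible G + C range falls inside the allowed interval; a stricter variant could require the whole range to fall inside it.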

3.4. PCR Amplification. The PCR conditions used in all experiments followed a common protocol. The reaction mix contained 1X Taq DNA polymerase buffer (Productos Biologicos, Argentina), 0.2 mM dNTPs, 0.5 μM of each primer, 20 pM template, and different concentrations of MgCl$_2$ and dimethyl sulfoxide (DMSO) in different reactions. MgCl$_2$ was used from 2 mM to 3 mM and DMSO from 0% (v/v) to 5% (v/v). The reactions were performed in a total volume of 10 μL, and the thermal profile consisted of an initial denaturation step at 94°C for 2 min followed by 35 cycles of denaturation/annealing/extension steps. The denaturation step was at 92°C for 10 seconds; the annealing temperature varied between experiments from 45°C to 60°C, always for 15 seconds (see Figure 4); and the extension step was at 72°C for 15 seconds. The last step was a final extension of 5 minutes at 72°C. In all cases, one of the primers was specific for the template, while the other primer was designed by the method described in this work. For Junin virus, the template was a plasmid containing a cDNA copy of the JUNV S genomic segment. For Baculovirus, the template was a plasmid containing a fragment of the Anticarsia gemmatalis MNPV p74 gene. The sensitivity of the PCR assay was determined by dilution of cloned fragments from the Junin virus [27] and Baculovirus templates.

Table 1: List of sequences used in the test of FAS-DPD. Accession numbers and brief description are presented.

Arenaviral sequences

| Acc. number | Sequence description | Acc. number | Sequence description |
|---|---|---|---|
| AY129248.1 | Machupo v. st. Carvallo | U41071.1 | Sabia v. |
| AF485260.1 | Machupo v. st. Carvallo | EU260463.1 | Chapare v. st. 810419 |
| AY924206.1 | Machupo v. st. MARU-216606 | AY081210.1 | Allpahuayo v. CLHP-2098 |
| AY924202.1 | Machupo v. st. Chicava | AY012686.1 | Allpahuayo v. from Peru |
| AY624355.1 | Machupo v. st. Chicava | AY012687.1 | Allpahuayo v. st. CLHP-2472 |
| AY924205.1 | Machupo v. st. 9301012 | AF485262.1 | Pirital v. st. VAV-488 |
| AY619645.1 | Machupo v. st. Mallele | AF277659.1 | Pirital v. |
| AY924203.1 | Machupo v. st. 9430084 | M16735.1 | Pichinde v. |
| AY924208.1 | Machupo v. st. MARU 249121 | AF485261.1 | Parana v. st. 12056 |
| AY924204.1 | Machupo v. st. 200002427 | AF512829.1 | Parana v. st. 10256 |
| AY924207.1 | Machupo v. st. MARU 222688 | AF512831.1 | Flexal v. st. BeAn 293022 |
| AY571959.1 | Machupo v. st. 9530537 | AF485257.1 | Flexal v. st. Pinheiro |
| AY746353.1 | Junin v. st. Candid-1 | AF512831.1 | Flexal v. st. BeAn 293022 |
| AY358023.2 | Junin v. st. XJ13 | AF512830.1 | Latino v. st. MARU 10924 |
| AY619641.1 | Junin v. st. Rumero | AF485259.1 | Latino v. st. Maru 10924 |
| D10072.2 | Junin v. st. MC2 | U34248.1 | Oliveros v. |
| M20304.1 | Tacaribe v. | AY847350.1 | LCM v. st. Armstrong 53b |
| AF485256.1 | Amapari v. st. BeAn 70563 | M20869.1 | LCM v. st. Armstrong 53b |
| AF512834.1 | Amapari v. st. BeAn 70563 | EU136038.1 | Dandenong v. is. 0710-2678 |
| AF512832.1 | Cupixi v. st. BeAn 119303 | DQ328874.1 | Mopeia v. st. Mozambique |
| AY129247.1 | Guanarito v. st. INH-95551 | DQ328877.1 | Ippy v. st. Dak-An-B-188-d |
| AF485258.1 | Guanarito v. st. INH-95551 | X52400.1 | Nigeria Lassa v. |
| AY497548.1 | Guanarito v. st. CVH-960101 | AY628206.1 | Lassa v. st. Weller |
| AY924392.1 | Bear Canyon v. st. AV 98470029 | AY628201.1 | Lassa v. st. Macenta |
| AY924391.1 | Bear Canyon v. st. AV A0070039 | AY628205.1 | Lassa v. st. Z148 |
| AF512833.1 | Bear Canyon v. st. A0060209 | J04324.1 | Lassa v. st. Josiah |
| DQ865244.1 | Catarina v. st. AV A0400135 | AY772168.1 | Mopeia Lassa reassortant 29 |
| DQ865245.1 | Catarina v. st. AV A0400212 | AY628203.1 | Lassa v. st. Josiah |
| EU123328.1 | Skinner Tank v. st. AV D1000090 | AF181853.1 | Lassa v. st. LP |
| EU123331.1 | North American arenav. st. AV 96010024 | AY628207.1 | Lassa v. st. Pinneo |
| EU123330.1 | North American arenav. st. AV 96010151 | AY628208.1 | Lassa v. st. Acar-3080 |
| AF228063.1 | Whitewater Arroyo v. st. 9310135 | AF181854.1 | Lassa v. st. 803213 |
| AF485264.1 | Whitewater Arroyo v. st. 9310141 | AY342390.1 | Mobala v. st. ACAR-3080-MRC5-P2 |
| EU123329.1 | North American arenav. st. AV D1240007 | M33879.1 | Mopeia v. st. AN-21366 |
| AF485263.1 | Tamiami v. st. CDCW-10777 | AY772170.1 | Mopeia v. st. AN-20410 |
| AF512828.1 | Tamiami v. st. W 10777 | | |

Baculoviral sequences

| Acc. number | Sequence description | Acc. number | Sequence description |
|---|---|---|---|
| AP006270.1 | Adoxophyes honmai nucleopolyhedrovirus DNA | X77048.1 | Cryptophlebia leucotreta granulosis |
| AF547984.1 | Adoxophyes orana granulovirus | X79569.1 | Cryptophlebia leucotreta granulosis |
| NC_005839.2 | Agrotis segetum granulovirus | NC_002816.1 | Cydia pomonella granulovirus |
| L22858.1 | Autographa californica nucleopolyhedrovirus clone C6 | NC_003083.1 | Epiphyas postvittana NPV |
| L33180.1 | Bombyx mori nuclear polyhedrosis virus isolate T3 | NC_002654.2 | Helicoverpa armigera |
| NC_005137.2 | Choristoneura fumiferana DEF MNPV | AF081810.1 | Lymantria dispar |
| NC_004778.3 | Choristoneura fumiferana MNPV | NC_003529.1 | Mamestra configurata NPV-A |
| AY864330.1 | Chrysodeixis chalcites NPV | U75930.2 | Orgyia pseudotsugata MNPV |
| AY456389.1 | Chrysodeixis chalcites NPV | AF499596.1 | Phthorimaea operculella granulovirus |
| AY456390.1 | Chrysodeixis chalcites NPV | NC_002593.1 | Plutella xylostella granulovirus |
| AY545786.1 | Chrysodeixis chalcites NPV | NC_004323.1 | Rachiplusia ou MNPV |
| AY545787.1 | Chrysodeixis chalcites NPV | NC_002169.1 | Spodoptera exigua MNPV |
| AY229987.1 | Cryptophlebia leucotreta granulovirus | NC_003102.1 | Spodoptera litura NPV |
| AY096241.1 | Cryptophlebia leucotreta granulovirus | NC_007383.1 | Trichoplusia ni SNPV |
| AY096242.1 | Cryptophlebia leucotreta granulovirus | | |

Pseudomonas sp. sequences

| Acc. number | Sequence description | Acc. number | Sequence description |
|---|---|---|---|
| NC_007492.2 | Pseudomonas fluorescens Pf0-1 | NC_004578.1 | Pseudomonas syringae |
| NC_005773.3 | Pseudomonas syringae | NC_002947.3 | Pseudomonas putida |
| NC_004129.6 | Pseudomonas fluorescens | NC_002516.2 | Pseudomonas aeruginosa |
| NC_007005.1 | Pseudomonas syringae | | |

Lactobacillus sp. sequences

| Acc. number | Sequence description | Acc. number | Sequence description |
|---|---|---|---|
| NC_005362.1 | Lactobacillus johnsonii | NC_002662.1 | Lactococcus lactis subsp. |
| NC_007576.1 | Lactobacillus sakei subsp. | NC_004567.1 | Lactobacillus plantarum |

Figure 2: Primer distribution along one ORF. [Plot; x-axis: position in the multiple alignment (0 to 1800); y-axis: number of primers (0 to 12).] A collection of the best-scoring primers for the nucleoprotein of Arenavirus, comprising 50 primers for the genomic sequence and 50 for the antigenomic sequence, was represented at the corresponding alignment positions. The height of each point indicates the cumulative number of primers corresponding to that position. The alignment was made with 71 arenavirus N protein sequences.

4. Results

4.1. Distribution of Generated Primers. The distribution of the resulting primers along the input sequence was analyzed. For this, the best one hundred primers obtained from a protein alignment were selected. For each position in the alignment, the number of selected primers that correspond to this position was recorded (Figure 2). The test was repeated for different protein alignments.
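The per-position count can be sketched in Python as below. The sketch assumes that a primer “corresponds to” every alignment position it spans and that start positions are 0-based; the original analysis may have counted positions differently, and the function name is hypothetical.

```python
def primer_coverage(alignment_length, primer_starts, primer_length):
    """Count, for each alignment position, how many selected primers span it.

    primer_starts holds 0-based start positions in the alignment.
    """
    counts = [0] * alignment_length
    for start in primer_starts:
        end = min(start + primer_length, alignment_length)
        for position in range(start, end):
            counts[position] += 1
    return counts
```

Primers shifted by one or two bases pile up on the same positions, so a cluster of near-identical primers shows up as a peak in the returned counts.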

The selected primers were located around a few hot spots in the alignment. This behavior indicates that there are generally few regions in a sequence alignment that are useful for degenerate primer design. Many primers found by the program are almost identical, shifted by one or two bases relative to each other, and in most cases located within a run of 30–40 bases. Similar results were obtained with all proteins tested.

4.2. Intragenomic Specificity and Score Analysis. Because the best primers may not be the least degenerate substrings in the collection of candidates, their specificity was tested. It was also necessary to get a more precise understanding of the score assigned by FAS-DPD in terms of specificity. To achieve this, the primers were compared with the complete genome sequences used to design them, looking for unspecific perfect matches.

For this task, a wide range of genome sizes was covered. Four collections of complete genome sequences were used: Arenavirus (genomes on the order of $10^4$ bases), Baculovirus ($10^5$ bases), Lactobacillus ($10^6$ bases), and Pseudomonas ($10^6$ bases). For each set, a randomly selected genome was used as reference. Each annotated ORF of this genome was used to search for related ORFs in the other genomes of the collection using the local Blast tool. The expected value (E-value) reported by Blast was used to decide whether two ORFs were related. When an ORF of the reference genome had a related ORF in all other genomes, all of them were aligned with ClustalW and used in further analysis.

Each resulting alignment was used as input for FAS-DPD to search for primers. For each genome polarity, the best fifty nonoverlapping primers were selected. This selection was made to avoid an overrepresentation of high-scoring primers derived from the same hot spot, and it allowed us to obtain a balanced set of primers with high and low scores.
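One simple way to pick nonoverlapping primers is a greedy pass over the candidates in descending score order. The text does not state which selection procedure was used, so the following Python sketch is an assumption, and the function name and tuple layout are hypothetical.

```python
def select_nonoverlapping(candidates, length, limit=50):
    """Greedily keep the best-scoring primers that do not overlap.

    candidates is an iterable of (score, start) tuples for primers of the
    same length; two primers overlap when their starts are closer than
    that length.
    """
    chosen = []
    for score, start in sorted(candidates, reverse=True):
        if all(abs(start - kept_start) >= length for _, kept_start in chosen):
            chosen.append((score, start))
            if len(chosen) == limit:
                break
    return chosen
```

With this rule, a hot spot contributes only its best primer, so the remaining slots are filled from other regions of the alignment even when their scores are lower.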

In order to find the relationship between the score calculated for each primer and its specificity, all the primers were compared with all the oligonucleotides of the same size derived from each genome, searching for perfect matches (Figure 3). The results were similar for the four systems, despite their differences in genome size.

There is an inverse correlation between primer score and the number of unspecific perfect matches, but this correlation is not linear. Primers with a score of at least 0.85 generally had zero unspecific perfect matches in their target genome, whereas the number of unspecific perfect matches grew enormously at lower primer scores.
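Counting perfect matches of a degenerate primer against a genome can be sketched in Python with a regular expression built from the IUPAC code. The original searches were made with purpose-built C tools, so this is an illustrative substitute rather than the method used; the function names and the `both_strands` option are hypothetical, and the genome is assumed to contain only A, C, G, and T.

```python
import re

# Standard IUPAC nucleotide code: symbol -> bases it stands for.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def primer_regex(primer):
    """Compile a degenerate primer into a regex; the look-ahead makes
    overlapping sites count as separate matches."""
    parts = []
    for symbol in primer.upper():
        bases = IUPAC_BASES[symbol]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return re.compile("(?=" + "".join(parts) + ")")


def count_perfect_matches(primer, genome, both_strands=True):
    """Number of genome sites that the primer matches perfectly."""
    pattern = primer_regex(primer)
    genome = genome.upper()
    hits = len(pattern.findall(genome))
    if both_strands:
        reverse_complement = genome.translate(COMPLEMENT)[::-1]
        hits += len(pattern.findall(reverse_complement))
    return hits
```

The count includes the intended target site, so the number of unspecific matches for a primer is this total minus the matches at the position it was designed for.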

Figure 3: Specificity of primers. [Panels (a) to (d), titled Arenavirus, Baculovirus, Lactobacillus sp., and Pseudomonas sp.: x-axis, score (0.45 to 1); y-axis, $\log_{10}$(number of matches). Panel (e): x-axis, base pairing (0 to 20); y-axis, $\log_{10}$(number of matches).] Primers designed for all ORFs shared among each model organism used were compared against the complete set of genomes for perfect matches with oligonucleotides of the same length. Each point represents the number of perfect matches (in $\log_{10}$ scale) of a primer in relation to its score. The length of the primers was 20 nucleotides. (a) Arenavirus genomes: 71 for S (small) RNA, 24 for L (large) RNA. (b) 22 Baculovirus genomes. (c) 5 Lactobacillus sp. genomes. (d) 7 Pseudomonas sp. genomes. (e) A set of primers for Lactobacillus sp. with scores between 0.85 and 0.90 was tested for nonperfect matches that could anneal unspecifically in PCR. Each bar represents the number of matches against the complete set of Lactobacillus genomes. The number below the bar indicates how many bases are shared.

[Figure 4(a): schematic of the arenavirus S RNA (5'UTR, 86; GPC, 1458; IGR, 92; N, 1695; 3'UTR, 80) and of the p74 ORF (1935), marking the regions used in design, the FAS-DPD primers (G-1058, N-527, N-918, and p74-1334r), and the generic or specific primers used (Arena and p74-550).]

[Figure 4(b): table listing, for each primer pair (FAS-DPD/specific: N527/Arena, N918/Arena, GR1058/Arena, p74-1334r/p74-550), the product size (605 bp, 986 bp, 947 bp, 896 bp), the annealing temperature in °C (54.2, 57.7, 46.8, 58.7), the score (0.849, 0.868, 0.86, 0.828), and the sensitivity in copies/reaction (20, 20, 2 × 10^2, 2 × 10^5).]

Figure 4: Experimental challenge of designed primers. (a) Genomic organization of the arenavirus S RNA and of the p74 ORF. Arenavirus shows an ambisense coding strategy of the GPC and N ORFs and three noncoding regions: 5' untranslated region (5'UTR), intergenic region (IGR), and 3' untranslated region (3'UTR). The location of each designed primer (GR1058, N918, N527, and p74-1334r) and of the specific primers (Arena and p74-550) is also shown. (b) The results obtained with each pair of primers tested and the characteristics of each reaction are shown.

4.3. Experimental Challenge. In addition to the theoretical tests, experimental challenges were performed to determine the usefulness of FAS-DPD designed primers, using Arenavirus and Baculovirus as models. The assay consisted of performing PCRs using a pair of primers comprising a degenerated FAS-DPD designed primer and a standard nondegenerated primer (this allowed each designed primer to be tested individually), optimizing the reaction conditions, and measuring the sensitivity.

For arenavirus, the primers were designed using sequences from 71 different GenBank records for the nucleoprotein (N protein) and the glycoprotein precursor (GPC protein). From the lists of the highest-scoring primers, three were randomly selected and synthesized for experimental evaluation: one for GPC (GR1058: RCNWHRTTNYCRAARCAYTT, score: 0.8596) and two for N (N527: GGNRYNSWNCCRAAYTGRTT, score: 0.8494; N918: NANRTTYTCRTANGGRTTNC, score: 0.8437) (Figure 4(a)).

Amplification reactions were performed using each of these primers together with the Arena primer (CGCACCGGGGATCCTAGGC) as the nondegenerated counterpart. The latter is a generic primer for arenaviruses that matches almost perfectly the nineteen bases at the 3' end of the genomic RNA sequence and the nineteen bases at the 3' end of the antigenomic RNA sequence of all known arenaviruses. The reaction template was a cDNA corresponding to the Junin virus small RNA segment, which encodes the N and GPC proteins.

For Baculovirus, one primer (p74-1334r: BYRWRNCCVWRNGGRTCSCA, score: 0.8281) was designed using 57 p74 sequences from different baculoviruses. As its counterpart, a specific primer for Anticarsia gemmatalis MNPV was used [28] (p74-550r: GGcGTGGACGACGTGC). The reaction template was the p74 gene of Anticarsia gemmatalis MNPV isolate 2D [29], cloned in a plasmid.

PCRs were assayed with different sets of conditions, and the sensitivity was measured. The sensitivity achieved with the arenavirus primers was high: twenty copies/µL or less of specific template were detected. For Baculovirus, detection was not as sensitive as for arenavirus, but the sensitivity can still be considered good: 2 × 10^4 copies/µL of specific template were detected. This difference can be explained by the fact that the divergence observed among baculovirus sequences is greater than among arenavirus sequences. Accordingly, the score of the p74-1334r primer was lower than those of the arenavirus primers.
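The sensitivities above are given per microlitre, whereas Figure 4(b) is labelled in copies per reaction. A minimal sketch of the conversion follows; it assumes, which the text does not state explicitly, that the per-microlitre values refer to the 10 µL reaction volume given in Section 3.4. The function name is hypothetical.

```python
# Assumption: concentrations are expressed per microlitre of the final reaction mix.
REACTION_VOLUME_UL = 10  # total reaction volume stated in Section 3.4


def copies_per_reaction(copies_per_ul, volume_ul=REACTION_VOLUME_UL):
    """Convert a template concentration (copies/uL) into copies per reaction."""
    return copies_per_ul * volume_ul


arenavirus = copies_per_reaction(20)           # 20 copies/uL
baculovirus = copies_per_reaction(2 * 10**4)   # 2 x 10^4 copies/uL
```

Under that assumption, the two values correspond to 2 × 10^2 and 2 × 10^5 copies per reaction, which agrees with the two largest sensitivities listed in Figure 4(b).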

4.4. Increment of Degeneration of FAS-DPD Designed Primers in relation to Minimum Degenerated Substring. The aim of FAS-DPD is to design universal degenerated primers that are not necessarily the least degenerated sequences of the collection of candidates. In order to know how much degeneration FAS-DPD designed primers acquire, another test was performed. Given an alignment of homologous ORFs, the degeneration was calculated for the highest scoring primer selected with FAS-DPD and for the minimum degenerated substring of the same length. Then, the ratio of these two values was obtained. The comparison was made with the complete set of ORF alignments used before (Arenavirus, Baculovirus, Pseudomonas, and Lactobacillus) (Figure 5). In more than 90% of the cases, the increase of the degeneration value is at most fourfold (e.g., changing "A" to "N" or "A A" to "R W"). Therefore, these primers have only up to two more degenerated positions than the substring with minimum degeneration.

[Figure 5: bar chart of the number of primers (y-axis, 0 to 2500) against the degeneration value ratio (x-axis).]
Degeneration value ratio: 1, 2, 4, 8, 16, 32, 64
Number of primers: 1954, 777, 386, 134, 43, 11, 2
Cumulative percentage: 59.1, 82.6, 94.3, 98.3, 99.6, 99.9, 100

Figure 5: Comparison of FAS-DPD designed primers and minimum degenerated substrings. The collection of primers with the highest score designed for all the ORFs shared by all the genomes used was compared against the minimum degenerated subsequence of the same length for each ORF, in order to know how much more degenerated they are. The number below each bar indicates the ratio of degeneration between the designed primer and the minimum degeneration substring. The number above each bar indicates the number of primers that correspond to that ratio. The percentages are cumulative with respect to increasing degeneration ratios and refer to the total number of primers used in the test.

It is important to note that, in general, there is not only one minimum degeneracy substring for each ORF. The decision of which primer is better must not take into account the degeneration value alone: the position of the degenerated bases in the sequence is crucial. The greatest increase of degeneration found was a ratio of 64, which corresponds to less than 0.1% of the primers. This result shows that FAS-DPD primers are more degenerated than the least degenerated substring, but this increase of degeneration is slight and does not imply a major compromise of specificity.
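The degeneration value and the ratio discussed in this section can be reproduced with a few lines. The sketch assumes the usual definition, in which the degeneration of an oligonucleotide is the product of the number of bases allowed at each position; the function names are hypothetical and `math.prod` requires Python 3.8 or later.

```python
from math import prod

# Number of bases represented by each IUPAC nucleotide code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def degeneration(oligo):
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    return prod(DEGENERACY[symbol] for symbol in oligo.upper())


def degeneration_ratio(primer, minimum_substring):
    """Degeneration of a primer relative to the least degenerated substring."""
    return degeneration(primer) / degeneration(minimum_substring)


single = degeneration_ratio("N", "A")    # one position: A -> N
double = degeneration_ratio("RW", "AA")  # two positions: AA -> RW
```

Both substitutions give a fourfold ratio, which is why a ratio of at most four implies no more than two additional degenerated positions.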

5. Discussion

In this work, we presented a new algorithm, implemented in the FAS-DPD software, as an alternative strategy for solving DPD problems. FAS-DPD was designed to use multiple alignments of proteins or nucleic acids as input data; it constructs a consensus degenerate sequence from the alignment, which is then used to design the putative primers.

Experimental background knowledge from molecular biology teaches us that, in the real world, the 3' ends of primers are key determinants of a successful amplification. FAS-DPD takes this property into account and incorporates special considerations in the global score calculation, being stricter for the 3' end than for the 5' end.

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collections of whole genomes, ranging from 10^4 bp to 10^6 bp. The restriction to larger lengths was due to the lack of whole-genome collections, with several individuals, for genera of larger genome sizes. In all genome collections assayed, the results showed the same behavior: there is a relationship between the score value and the number of unspecific perfect matches. This analysis allows us to suggest a cut-off score (0.85) for primers that are more likely to be successful.

PCRs were successfully performed on the arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested under different conditions in order to optimize its yield.

FAS-DPD software is licensed under the GNU General Public License, Version 3, and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for the detection of known or unknown members of gene families or organism families, including different types of pathogens. Also, this tool would allow a more efficient search for enzymes and other proteins of commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, “Fingerprinting genomes using PCR with arbitrary primers,” Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, “DNA polymorphisms amplified by arbitrary primers are useful as genetic markers,” Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, “Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, “Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species,” Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, “A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA,” Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, “OSP: a computer program for choosing PCR and DNA sequencing primers,” PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, “PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing,” Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, “Primer Master: a new program for the design and analysis of PCR primers,” Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.

[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, “Primer design for large scale sequencing,” Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, “Primer3 on the WWW for general users and for biologist programmers,” Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, “The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses,” Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, “Primaclade: a flexible tool to find conserved PCR primers across multiple species,” Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, “Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets,” Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, “The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis,” BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, “Software to determine optimal oligonucleotide sequences based on hybridization simulation data,” BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, “Degenerate primer design: theoretical analysis and the HYDEN program,” Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, “The degenerate primer design problem,” Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, “Degenerate primer design via clustering,” IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, “CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design,” Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, “CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes,” Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, “iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences,” Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, “An efficient algorithm for minimum degeneracy primer selection,” IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero et al., “Concerning the epidemic outbreak in Junin,” El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich et al., “Functional and structural characterisation of AgMNPV ie1,” Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak et al., “Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus,” Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, “Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy,” Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.


[Figure 1(a): flow diagram of the strategy. Multiple protein alignment; backtranslation; DNA alignment / degenerate DNA alignment; combination; degenerate DNA sequence; collection of oligonucleotides; evaluation of primers score; result with the best-scoring primers.]

[Figure 1(b): worked example.]
Protein alignment:
Seq_1  W T Q S L R K G L S
Seq_2  W T Q S L R R G L S
Seq_3  W V Q S L R R G L S
Seq_4  W T Q S L R R E L S
Backtranslated sequences:
Seq_1  TGG ACN CAR WSN YTN MGN AAR GGN YTN WSN
Seq_2  TGG ACN CAR WSN YTN MGN MGN GGN YTN WSN
Seq_3  TGG GTN CAR WSN YTN MGN MGN GGN YTN WSN
Seq_4  TGG ACN CAR WSN YTN MGN MGN GAR YTN WSN
Degenerated consensus:  TGG RKN CAR WSN YTN MGN MRN GRN YTN WSN
Degeneration values:    111 224 112 224 214 214 224 124 214 224
IUPAC codes and degeneracy: A, C, G, T (1); R, W, S, Y, M (2); N (4).

Figure 1: Minimum degenerated sequence generation. (a) Diagram of the general strategy used. (b) Sample protein alignment showing an example for the steps of the strategy diagram. Each sequence is computationally backtranslated to a hypothetical nucleic acid sequence. IUPAC codes were used to show ambiguous positions. These sequences are piled up in order to get the degenerated consensus sequence. Numbers below it indicate the degeneration value of each position.
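The combination step of Figure 1 can be sketched as a column-wise union of the bases allowed by each backtranslated sequence. This is an illustrative reading of the figure, not the FAS-DPD source code; the function names are hypothetical, and the input is the set of four backtranslated sequences from Figure 1(b) written without spaces.

```python
# IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: set of bases -> IUPAC code.
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}


def combine(sequences):
    """Pile up equal-length degenerate sequences into one consensus."""
    consensus = []
    for column in zip(*sequences):
        bases = set()
        for symbol in column:
            bases.update(IUPAC[symbol])  # union of the bases allowed here
        consensus.append(CODE_FOR[frozenset(bases)])
    return "".join(consensus)


def degeneration_values(sequence):
    """Number of bases represented at each position of the sequence."""
    return [len(IUPAC[symbol]) for symbol in sequence]


backtranslated = [
    "TGGACNCARWSNYTNMGNAARGGNYTNWSN",  # Seq_1
    "TGGACNCARWSNYTNMGNMGNGGNYTNWSN",  # Seq_2
    "TGGGTNCARWSNYTNMGNMGNGGNYTNWSN",  # Seq_3
    "TGGACNCARWSNYTNMGNMGNGARYTNWSN",  # Seq_4
]
consensus = combine(backtranslated)
values = "".join(str(v) for v in degeneration_values(consensus))
```

The per-position values computed this way reproduce the row of degeneration values shown under the consensus in Figure 1(b).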

decreasing $N_y$ results in lesser stringency on the designed primer. Finally, to obtain a scaled global score ($S_g$), the result of $W_{pi} \times S_{pi}$ is divided by the maximum possible score ($M_s$, (3)). The global normalized score ($S_g$) was calculated according to (4). In this way, the $S_g$ value varies from 0 to 1. The maximum score is obtained when the value of $S_{pi}$ is 1 for each position. Therefore, $N_{Di}$ must also be 1, and this only happens with nondegenerated primers:

$$M_s = n \times pA + \frac{(n + 1) \times n \times (N_y - pA)}{2 \times N_x}, \tag{3}$$

$$S_g = \frac{\sum_{i=1}^{n} S_{pi} \times W_{pi}}{M_s}. \tag{4}$$
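Equations (3) and (4) translate directly into code. In the sketch below, the parameter values and the linear form used to build the example weights, W_i = pA + i × (N_y − pA) / N_x, are assumptions chosen only because the sum of such weights reproduces (3); the definitions of the weights and position scores given with the earlier equations take precedence.

```python
def max_score(n, pA, Ny, Nx):
    """Maximum possible score M_s, equation (3)."""
    return n * pA + (n + 1) * n * (Ny - pA) / (2 * Nx)


def global_score(position_scores, weights, pA, Ny, Nx):
    """Normalized global score S_g, equation (4)."""
    n = len(position_scores)
    weighted = sum(s * w for s, w in zip(position_scores, weights))
    return weighted / max_score(n, pA, Ny, Nx)


# Hypothetical parameters for a 20-base primer.
n, pA, Ny, Nx = 20, 1.0, 2.0, 20
# Assumed linear weights, rising along the primer (illustration only).
weights = [pA + i * (Ny - pA) / Nx for i in range(1, n + 1)]

perfect = global_score([1.0] * n, weights, pA, Ny, Nx)  # nondegenerated primer
halved = global_score([0.5] * n, weights, pA, Ny, Nx)   # every position scores 0.5
```

With every position score equal to 1, the weighted sum equals M_s and the global score reaches its maximum of 1, as stated in the text.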

3. Methods

3.1. Alignment and Sequence Comparison Tools. For global alignment of protein sequences, the program ClustalW 1.83 [24] was used with default parameters. Local alignments of proteins against genomes were made using stand-alone Blast 2.2.13 [25] with default parameters. Oligonucleotide match searches were made with specifically developed tools written in the C language.

3.2. Sequence Data. Several sets of sequences were used in the tests of the program, both for designing primers and for comparing the primer sequences against genomes. The GenBank accession numbers of all sequences are presented in Table 1.

3.3. Filtering Primers. In addition to the scoring process, FAS-DPD can optionally filter the primers individually according to common criteria: melting temperature (estimated using SantaLucia's method [26]), G + C content, 5' versus 3' stability, presence of tandem repeats of the same base occurring at the 3' end or at any place in the sequence, presence of a degenerated position at the 3' end, and formation of homodimer structures. Also, primer pairs can be filtered according to amplification product size, melting temperature compatibility, G + C content compatibility, and formation of heteroduplex structures.
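Three of the individual filters listed above are simple enough to sketch. The code below is an illustration, not the FAS-DPD implementation: the thresholds and function names are hypothetical, the G + C content of a degenerate position is taken as the fraction of its allowed bases that are G or C (an assumption), and the melting temperature, stability, and dimer filters are left out.

```python
# IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def gc_content(primer):
    """Mean G+C fraction, averaging over the bases allowed at each position."""
    total = 0.0
    for symbol in primer.upper():
        bases = IUPAC[symbol]
        total += sum(base in "GC" for base in bases) / len(bases)
    return total / len(primer)


def has_degenerate_3prime(primer):
    """True when the last (3') position represents more than one base."""
    return len(IUPAC[primer[-1].upper()]) > 1


def longest_run(primer):
    """Length of the longest tandem repeat of a single symbol."""
    best = run = 1
    for previous, current in zip(primer, primer[1:]):
        run = run + 1 if current == previous else 1
        best = max(best, run)
    return best


def passes_filters(primer, gc_range=(0.4, 0.6), max_run=4):
    """Apply the three filters with hypothetical thresholds."""
    low, high = gc_range
    return (low <= gc_content(primer) <= high
            and not has_degenerate_3prime(primer)
            and longest_run(primer) <= max_run)
```

For example, `passes_filters("ACGTACGTAC")` accepts the primer, whereas any primer ending in a degenerate code such as N is rejected by the 3' end check.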

3.4. PCR Amplification. The PCR conditions used in all experiments follow a common protocol. The reaction mix contained 1X Taq DNA polymerase buffer (Productos Bio-Logicos, Argentina), 0.2 mM dNTPs, 0.5 µM of each primer, 20 pM template, and different concentrations of MgCl2 and dimethyl sulfoxide (DMSO) in different reactions. MgCl2 was used from 2 mM to 3 mM, and DMSO was used from 0% (v/v) to 5% (v/v). The reactions were performed in a total volume of 10 µL, and the thermal profile consisted of an initial denaturation step at 94°C for 2 min, followed by 35 cycles of denaturation/annealing/extension steps. The denaturation step was at 92°C for 10 seconds; the temperature of the annealing step was not the same in all experiments, varying from 45°C to 60°C, and its time was always 15 seconds (see Figure 4). The extension step was at 72°C for 15 seconds. The last step was a final extension of 5 minutes at 72°C. In all cases, one of the primers was specific for the template, while the other primer was designed by the method described in this work. For Junin virus, the template used was a plasmid containing a cDNA copy of the JUNV S genomic segment. For Baculovirus, the template was a plasmid containing a fragment of the Anticarsia gemmatalis

Table 1: List of sequences used in the test of FAS-DPD. Accession numbers and brief descriptions are presented.

Acc. number | Sequence description | Acc. number | Sequence description

Arenaviral sequences
AY129248.1 | Machupo v. st. Carvallo | U41071.1 | Sabia v.
AF485260.1 | Machupo v. st. Carvallo | EU260463.1 | Chapare v. st. 810419
AY924206.1 | Machupo v. st. MARU-216606 | AY081210.1 | Allpahuayo v. CLHP-2098
AY924202.1 | Machupo v. st. Chicava | AY012686.1 | Allpahuayo v. from Peru
AY624355.1 | Machupo v. st. Chicava | AY012687.1 | Allpahuayo v. st. CLHP-2472
AY924205.1 | Machupo v. st. 9301012 | AF485262.1 | Pirital v. st. VAV-488
AY619645.1 | Machupo v. st. Mallele | AF277659.1 | Pirital v.
AY924203.1 | Machupo v. st. 9430084 | M16735.1 | Pichinde v.
AY924208.1 | Machupo v. st. MARU 249121 | AF485261.1 | Parana v. st. 12056
AY924204.1 | Machupo v. st. 200002427 | AF512829.1 | Parana v. st. 10256
AY924207.1 | Machupo v. st. MARU 222688 | AF512831.1 | Flexal v. st. BeAn 293022
AY571959.1 | Machupo v. st. 9530537 | AF485257.1 | Flexal v. st. Pinheiro
AY746353.1 | Junin v. st. Candid-1 | AF512831.1 | Flexal v. st. BeAn 293022
AY358023.2 | Junin v. st. XJ13 | AF512830.1 | Latino v. st. MARU 10924
AY619641.1 | Junin v. st. Rumero | AF485259.1 | Latino v. st. Maru 10924
D10072.2 | Junin v. st. MC2 | U34248.1 | Oliveros v.
M20304.1 | Tacaribe v. | AY847350.1 | LCM v. st. Armstrong 53b
AF485256.1 | Amapari v. st. BeAn 70563 | M20869.1 | LCM v. st. Armstrong 53b
AF512834.1 | Amapari v. st. BeAn 70563 | EU136038.1 | Dandenong v. is. 0710-2678
AF512832.1 | Cupixi v. st. BeAn 119303 | DQ328874.1 | Mopeia v. st. Mozambique
AY129247.1 | Guanarito v. st. INH-95551 | DQ328877.1 | Ippy v. st. Dak-An-B-188-d
AF485258.1 | Guanarito v. st. INH-95551 | X52400.1 | Nigeria Lassa v.
AY497548.1 | Guanarito v. st. CVH-960101 | AY628206.1 | Lassa v. st. Weller
AY924392.1 | Bear Canyon v. st. AV 98470029 | AY628201.1 | Lassa v. st. Macenta
AY924391.1 | Bear Canyon v. st. AV A0070039 | AY628205.1 | Lassa v. st. Z148
AF512833.1 | Bear Canyon v. st. A0060209 | J04324.1 | Lassa v. st. Josiah
DQ865244.1 | Catarina v. st. AV A0400135 | AY772168.1 | Mopeia Lassa reassortant 29
DQ865245.1 | Catarina v. st. AV A0400212 | AY628203.1 | Lassa v. st. Josiah
EU123328.1 | Skinner Tank v. st. AV D1000090 | AF181853.1 | Lassa v. st. LP
EU123331.1 | North American arenav. st. AV 96010024 | AY628207.1 | Lassa v. st. Pinneo
EU123330.1 | North American arenav. st. AV 96010151 | AY628208.1 | Lassa v. st. Acar-3080
AF228063.1 | Whitewater Arroyo v. st. 9310135 | AF181854.1 | Lassa v. st. 803213
AF485264.1 | Whitewater Arroyo v. st. 9310141 | AY342390.1 | Mobala v. st. ACAR-3080-MRC5-P2
EU123329.1 | North American arenav. st. AV D1240007 | M33879.1 | Mopeia v. st. AN-21366
AF485263.1 | Tamiami v. st. CDCW-10777 | AY772170.1 | Mopeia v. st. AN-20410
AF512828.1 | Tamiami v. st. W 10777 | |

Baculoviral sequences
AP006270.1 | Adoxophyes honmai nucleopolyhedrovirus DNA | X77048.1 | Cryptophlebia leucotreta granulosis
AF547984.1 | Adoxophyes orana granulovirus | X79569.1 | Cryptophlebia leucotreta granulosis
NC_005839.2 | Agrotis segetum granulovirus | NC_002816.1 | Cydia pomonella granulovirus
L22858.1 | Autographa californica nucleopolyhedrovirus clone C6 | NC_003083.1 | Epiphyas postvittana NPV
L33180.1 | Bombyx mori nuclear polyhedrosis virus isolate T3 | NC_002654.2 | Helicoverpa armigera
NC_005137.2 | Choristoneura fumiferana DEF MNPV | AF081810.1 | Lymantria dispar
NC_004778.3 | Choristoneura fumiferana MNPV | NC_003529.1 | Mamestra configurata NPV-A
AY864330.1 | Chrysodeixis chalcites NPV | U75930.2 | Orgyia pseudotsugata MNPV
AY456389.1 | Chrysodeixis chalcites NPV | AF499596.1 | Phthorimaea operculella granulovirus
AY456390.1 | Chrysodeixis chalcites NPV | NC_002593.1 | Plutella xylostella granulovirus
AY545786.1 | Chrysodeixis chalcites NPV | NC_004323.1 | Rachiplusia ou MNPV
AY545787.1 | Chrysodeixis chalcites NPV | NC_002169.1 | Spodoptera exigua MNPV
AY229987.1 | Cryptophlebia leucotreta granulovirus | NC_003102.1 | Spodoptera litura NPV
AY096241.1 | Cryptophlebia leucotreta granulovirus | NC_007383.1 | Trichoplusia ni SNPV
AY096242.1 | Cryptophlebia leucotreta granulovirus | |

Pseudomonas sp. sequences
NC_007492.2 | Pseudomonas fluorescens Pf0-1 | NC_004578.1 | Pseudomonas syringae
NC_005773.3 | Pseudomonas syringae | NC_002947.3 | Pseudomonas putida
NC_004129.6 | Pseudomonas fluorescens | NC_002516.2 | Pseudomonas aeruginosa
NC_007005.1 | Pseudomonas syringae | |

Lactobacillus sp. sequences
NC_005362.1 | Lactobacillus johnsonii | NC_002662.1 | Lactococcus lactis subsp.
NC_007576.1 | Lactobacillus sakei subsp. | NC_004567.1 | Lactobacillus plantarum

[Figure 2: plot of the number of primers (y-axis, 0 to 12) against the position in the multiple alignment (x-axis, 0 to 1800).]

Figure 2: Primer distribution along one ORF. A collection of the best scoring primers for the nucleoprotein of Arenavirus, comprising 50 primers for the genomic sequence and 50 for the antigenomic sequence, was represented at the corresponding alignment positions. The height of each point indicates the cumulative number of primers corresponding to that position. The alignment was made with 71 arenavirus N protein sequences.

MNPV p74 gene. The sensitivity of the PCR assay was determined by dilution of cloned fragments from the Junin virus [27] and Baculovirus templates.

4. Results

4.1. Distribution of Generated Primers. The distribution of the resulting primers along the input sequence was analyzed. For this, the best one hundred primers obtained from a protein alignment were selected. For each position in the alignment, the number of selected primers that correspond to this position was recorded (Figure 2). The test was repeated for different protein alignments.

The selected primers were located around a few hot spots in the alignment. This behavior indicates that there are generally few regions in a sequence alignment that are useful for degenerate primer design. Many primers found by the program are almost identical, shifted by one or two bases from each other, and located in most cases within a run of 30–40 bases. Similar results were obtained with all proteins tested.
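The per-position count behind Figure 2 can be sketched as follows. It assumes that a primer "corresponds" to every alignment position it covers, which is one possible reading of the text; the function name and the toy start positions are hypothetical.

```python
def primers_per_position(starts, primer_length, alignment_length):
    """Count, for each alignment position, the primers that cover it."""
    counts = [0] * alignment_length
    for start in starts:
        # Clip the primer at the end of the alignment.
        end = min(start + primer_length, alignment_length)
        for position in range(start, end):
            counts[position] += 1
    return counts


# Three primers shifted by one base from each other, as in a hot spot.
profile = primers_per_position([2, 3, 4], primer_length=5, alignment_length=12)
```

Primers that differ by a shift of one or two bases pile up on the same positions, which produces the narrow peaks described above.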

4.2. Intragenomic Specificity and Score Analysis. Because it is possible that the best primers are not the least degenerated substrings in the collection of candidates, their specificity was tested. Also, it was necessary to get a more precise understanding of the score assigned by FAS-DPD in terms of specificity. To achieve this, the primers were compared with the complete genome sequences used to design them, looking for unspecific perfect matches.

For this task a wide range of genome sizes was cov-ered Four collections of complete genome sequences wereused Arenavirus (genome in 104 bases order) Baculovirus(genome in 105 bases order) Lactobacillus (genome in 106bases order) and Pseudomonas (genome in 106 bases order)For each set a randomly selected genome was used asreference Each annotated ORF of this genome was used tosearch related ORFs in the other genomes of the collectionusing the local Blast tool The expected value of Blast wasused to decide when twoORFs were relatedWhen anORF ofthe reference genome had a related one in all other genomesall of them were aligned with ClustalW and used in furtheranalysis

Each resulting alignment was used as input for FAS-DPDto search primers For each genome polarity the best fiftynonoverlapping primers were selected This selection wasmade to avoid concentration of overrepresented hot-spot-derived high score primersThis allowedus to find a balancedset of primers with high and low scores

In order to find the relationship between the scorecalculated for each primer and its specificity all the primerswere compared with all the oligonucleotides of the same sizederived from each genome searching for perfect matches(Figure 3) The results were similar for the four systemsdespite their differences in genome size

There is an inverse correlation between primer score andthe number of unspecific perfectmatches But this correlationis not linear The quantity of unspecific perfect matches ofprimers with a minimal score of 085 and their target genomewas generally zeroThe number of unspecific perfect matchesgrew enormously with lower primer scores

6 Biotechnology Research International

Arenavirus

Score

6

5

4

3

2

1

0045 05 055 06 065 07 075 08 085 09 095 1

log10

(num

ber o

f mat

ches

)

(a)

Score045 05 055 06 065 07 075 08 085 09 095 1

6

7

5

4

3

2

1

0

log10

(num

ber o

f mat

ches

)

Baculovirus

(b)

Score045 05 055 06 065 07 075 08 085 09 095 1

6

7

5

4

3

2

1

0

log10

(num

ber o

f mat

ches

)

Lactobacillus sp

(c)

Score

045 05 055 06 065 07 075 08 085 09 095 1

6

7

8

5

4

3

2

1

0

log10

(num

ber o

f mat

ches

)

Pseudomonas sp

(d)

6

7

5

4

3

2

1

0

log10

(num

ber o

f mat

ches

)

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20Base pairing

(e)

Figure 3 Specificity of primers Primers designed for all ORFs shared among eachmodel organism used were compared against the completeset of genomes for perfect matches with oligonucleotides of the same length Each point represents the number of perfect matches (in log

10

scale) of a primer in relation to its score The length of the primers was 20 nucleotides (a) Arenavirus genomes 71 for S (small) RNA 24for L (large) RNA (b) 22 Baculovirus genomes (c) 5 Lactobacillus sp genomes (d) 7 Pseudomonas sp genomes (e) A set of primers forLactobacillus sp with scores between 085 and 090 were tested for nonperfect matches that could anneal unspecifically in PCR Each barrepresents the number of matches against the complete set of Lactobacillus genomes The number below the bar indicates how many basesare shared

Biotechnology Research International 7

GPC NIGR 3UTR5UTR86 1458 92 1695 80

G-1058

Arena Arena

p741935

FAS-DPD primersRegion used in design

p74-1334r

Regions used in designFAS-DPD primers

N-527N-918Generic primers used Generic primers used

p74ndash550

(a)

Product sizePrimers

N527ArenaN918Arena

GR1058Arena

Sensitivity

20

20

Annealing

542577468587

Score

08490868086

0828

(FAS-DPDspecific) temperature ( ∘C) (copiesreaction)

p74-1334rp74ndash550

605 bp986 bp947 bp896 bp

2 times 102

2 times 105

(b)

Figure 4 Experimental challenge of designed primers (a) Genomic organization of the Arenaviruses S RNA and P74 ORFArenavirus showsan ambisense coding strategy of the GPC and N ORFs and three noncoding regions 51015840 untranslated region (5UTR) intergenic region (IGR)and 31015840 untranslated region (3UTR) The location of each designed primer (GR1058 N918 N537 and p74-1334r) and specific primers (Arenap74-550) is also shown (b) The results obtained with each pair of primers tested and characteristics of reaction are shown

43 Experimental Challenge In addition to theoretic teststo determine the usefulness of FAS-DPD designed primersexperimental challenges were performed using ArenavirusandBaculovirus asmodelsThe assay consisted in performingPCRs using a pair of primers including a degenerated FAS-DPDdesigned primer and a standard nondegenerated primer(this allowed testing individually each designedprimer) opti-mizing the reaction conditions and measuring its sensitivity

For arenavirus the primers were designed usingsequences of 71 different GenBank records for thenucleoprotein (N protein) and the glycoprotein precursor(GPC protein) From the lists of the highest scoredprimers three were randomly selected and synthesizedfor experimental evaluation one for GPC (GR1058RCNWHRTTNYCRAARCAYTT score 08596) andtwo for N (N527 GGNRYNSWNCCRAAYTGRTT score08494 N918 NANRTTYTCRTANGGRTTNC score08437) (Figure 4(a))

Amplification reactions were performed using each of these primers together with the Arena primer (CGCACCGGGGATCCTAGGC) as the nondegenerate counterpart. The latter is a generic primer for arenaviruses that matches almost perfectly with the nineteen bases of the 3′ end of the genomic RNA sequence and with the nineteen bases of the 3′ end of the antigenomic RNA sequence of all known arenaviruses. The reaction template was a cDNA corresponding to the Junin virus small RNA segment, which encodes the N and GPC proteins.

For Baculovirus, one primer (p74-1334r: BYRWRNCCVWRNGGRTCSCA, score 0.8281) was designed using 57 p74 sequences from different baculoviruses. As its counterpart, a specific primer for Anticarsia gemmatalis MNPV was used [28] (p75-550r: GGcGTGGACGACGTGC). The reaction template was the p74 gene of Anticarsia gemmatalis MNPV isolate 2D [29] cloned in a plasmid.

PCRs were assayed with different sets of conditions and the sensitivity was measured. The sensitivity achieved with arenavirus primers was high: twenty copies/μL or less of specific template were detected. For Baculovirus, the detection was not as sensitive as for arenavirus, but it can still be considered good: 2 × 10^4 copies/μL of specific template were detected. This difference can be explained by the greater divergence observed among baculovirus sequences than among arenavirus sequences, which is also why the score for the p74-1334r primer was lower than those of the arenavirus primers.

4.4. Increment of Degeneration of FAS-DPD Designed Primers in relation to Minimum Degenerated Substring. The aim of FAS-DPD is to design universal degenerate primers, which are not necessarily the least degenerate sequences in the collection of candidates. In order to know how much degeneration FAS-DPD designed primers acquire, another test was performed. Given an alignment of homologous ORFs, the degeneration was calculated for the highest scoring primer selected with FAS-DPD and for the minimum degenerated substring of the same length. Then, the ratio of these two values was obtained. The comparison was made with the complete set of ORF alignments used before (Arenavirus, Baculovirus, Pseudomonas, and Lactobacillus) (Figure 5). In more than 90% of the cases, the increase of the degeneration value is at most fourfold (e.g., changing “A” to “N” or “AA” to “RW”). Therefore, these primers have only up to two more degenerated positions than the substring with minimum degeneration.

Figure 5: Comparison of FAS-DPD designed primers and minimum degenerated substrings. The collection of primers with the highest score designed for all the ORFs shared by all the genomes used was compared against the minimum degenerated subsequence of the same length for each ORF, in order to know how much more degenerated they are. The number below each bar indicates the ratio of degeneration between the designed primer and the minimum degeneration substring. The number above each bar indicates the number of primers that correspond to that ratio. The percentages are cumulative with respect to increasing degeneration ratios and refer to the total number of primers used in the test. Bar data (degeneration value ratio: number of primers, cumulative percentage): 1: 1954, 59.1%; 2: 777, 82.6%; 4: 386, 94.3%; 8: 134, 98.3%; 16: 43, 99.6%; 32: 11, 99.9%; 64: 2, 100%.

It is important to note that, in general, there is more than one minimum degeneracy substring for each ORF. The decision of which primer is better must not take into account the degeneration value alone; the position of the degenerated bases in the sequence is crucial. The greatest increase of degeneration found was a ratio of 64, which corresponds to less than 0.1% of the primers. This result shows that FAS-DPD primers are more degenerated than the least degenerated substring, but this increase of degeneration is slight and does not imply a high compromise of the specificity.
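
The degeneration value and the ratio discussed in this section can be illustrated with a short Python sketch. It assumes that the degeneration value of a primer is the number of distinct nondegenerate sequences it encodes under the standard IUPAC nucleotide code; the function names are hypothetical and are not taken from the FAS-DPD source code.

```python
from math import prod

# Number of concrete bases represented by each IUPAC nucleotide code.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneration_value(primer: str) -> int:
    """Count the nondegenerate sequences encoded by a degenerate primer."""
    return prod(IUPAC_DEGENERACY[base] for base in primer.upper())


def degeneration_ratio(designed: str, minimum: str) -> float:
    """Ratio between a designed primer and a minimum degenerated substring.

    Both arguments are assumed to have the same length.
    """
    return degeneration_value(designed) / degeneration_value(minimum)


# The two fourfold examples given in the text: "A" to "N" and "AA" to "RW".
print(degeneration_ratio("N", "A"))
print(degeneration_ratio("RW", "AA"))
```

Under this assumption, a fourfold ratio can arise either from one position widening to “N” or from two positions each widening to a two-base code, which is consistent with the statement that such primers have at most two additional degenerated positions.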

5. Discussion

In this work, we presented a new algorithm, implemented in the FAS-DPD software, as an alternative strategy for solving DPD problems. FAS-DPD was designed to use multiple alignments of proteins or nucleic acids as input data; it constructs a consensus degenerate sequence from the alignment, which is then used to design the putative primers.
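
As an illustration of the consensus step for the nucleotide case, the following Python sketch collapses each alignment column into the IUPAC code that covers every base observed in it. It is a minimal sketch and not the FAS-DPD implementation: the function name is hypothetical, gaps and unknown characters are simply ignored (a column with no A, C, G, or T becomes “N”), and the back-translation needed for protein alignments is not covered.

```python
# IUPAC code for each set of bases observed in an alignment column.
IUPAC_CODE = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def consensus_degenerate(alignment: list[str]) -> str:
    """Build a degenerate consensus from equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*alignment):
        # Keep only unambiguous bases; gaps and other symbols are ignored.
        bases = frozenset(b for b in (c.upper() for c in column) if b in "ACGT")
        # Assumption: a column without any usable base is treated as "N".
        consensus.append(IUPAC_CODE.get(bases, "N"))
    return "".join(consensus)


print(consensus_degenerate(["ACGT", "ACGA", "GCGT"]))
```

Candidate primers would then be the fixed-length substrings of this consensus, each of which can be scored and ranked.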

Experimental background knowledge from molecular biology teaches us that, in the real world, the 3′ ends of primers are key determinants of a successful amplification. FAS-DPD takes this property into account and incorporates special considerations in the global score calculation, becoming more strict for the 3′ end than for the 5′ end.

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collections of whole genomes ranging from 10^4 bp to 10^6 bp. Larger genomes were not tested because of the lack of whole genome collections, with several individuals each, for genera with bigger genomes. In all genome collections assayed, the results showed the same behavior: there is a relationship between the score value and the number of unspecific perfect matches. This analysis allows us to suggest a cut-off score (0.85) for primers that could be more successful.

PCRs were successfully performed on arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested under different conditions in order to optimize its yield.

FAS-DPD software is licensed under the GNU General Public License Version 3 and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for detection of known or unknown members of gene families or organism families, including different types of pathogens. Also, this tool would allow a more efficient search for enzymes and other proteins with commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, “Fingerprinting genomes using PCR with arbitrary primers,” Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, “DNA polymorphisms amplified by arbitrary primers are useful as genetic markers,” Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, “Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, “Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species,” Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, “A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA,” Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, “OSP: a computer program for choosing PCR and DNA sequencing primers,” PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, “PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing,” Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, “Primer Master: a new program for the design and analysis of PCR primers,” Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.

[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, “Primer design for large scale sequencing,” Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, “Primer3 on the WWW for general users and for biologist programmers,” Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, “The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses,” Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, “Primaclade: a flexible tool to find conserved PCR primers across multiple species,” Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, “Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets,” Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, “The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis,” BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, “Software to determine optimal oligonucleotide sequences based on hybridization simulation data,” BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, “Degenerate primer design: theoretical analysis and the HYDEN program,” Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, “The degenerate primer design problem,” Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, “Degenerate primer design via clustering,” IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, “CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design,” Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, “CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes,” Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, “iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences,” Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, “An efficient algorithm for minimum degeneracy primer selection,” IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero, et al., “Concerning the epidemic outbreak in Junin,” El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich, et al., “Functional and structural characterisation of AgMNPV ie1,” Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak, et al., “Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus,” Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, “Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy,” Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.


Table 1: List of sequences used in the test of FAS-DPD. Accession numbers and brief descriptions are presented.

Acc. number | Sequence description | Acc. number | Sequence description

Arenaviral sequences
AY129248.1 | Machupo v. st. Carvallo | U41071.1 | Sabia v.
AF485260.1 | Machupo v. st. Carvallo | EU260463.1 | Chapare v. st. 810419
AY924206.1 | Machupo v. st. MARU-216606 | AY081210.1 | Allpahuayo v. CLHP-2098
AY924202.1 | Machupo v. st. Chicava | AY012686.1 | Allpahuayo v. from Peru
AY624355.1 | Machupo v. st. Chicava | AY012687.1 | Allpahuayo v. st. CLHP-2472
AY924205.1 | Machupo v. st. 9301012 | AF485262.1 | Pirital v. st. VAV-488
AY619645.1 | Machupo v. st. Mallele | AF277659.1 | Pirital v.
AY924203.1 | Machupo v. st. 9430084 | M16735.1 | Pichinde v.
AY924208.1 | Machupo v. st. MARU 249121 | AF485261.1 | Parana v. st. 12056
AY924204.1 | Machupo v. st. 200002427 | AF512829.1 | Parana v. st. 10256
AY924207.1 | Machupo v. st. MARU 222688 | AF512831.1 | Flexal v. st. BeAn 293022
AY571959.1 | Machupo v. st. 9530537 | AF485257.1 | Flexal v. st. Pinheiro
AY746353.1 | Junin v. st. Candid-1 | AF512831.1 | Flexal v. st. BeAn 293022
AY358023.2 | Junin v. st. XJ13 | AF512830.1 | Latino v. st. MARU 10924
AY619641.1 | Junin v. st. Rumero | AF485259.1 | Latino v. st. Maru 10924
D10072.2 | Junin v. st. MC2 | U34248.1 | Oliveros v.
M20304.1 | Tacaribe v. | AY847350.1 | LCM v. st. Armstrong 53b
AF485256.1 | Amapari v. st. BeAn 70563 | M20869.1 | LCM v. st. Armstrong 53b
AF512834.1 | Amapari v. st. BeAn 70563 | EU136038.1 | Dandenong v. is. 0710-2678
AF512832.1 | Cupixi v. st. BeAn 119303 | DQ328874.1 | Mopeia v. st. Mozambique
AY129247.1 | Guanarito v. st. INH-95551 | DQ328877.1 | Ippy v. st. Dak-An-B-188-d
AF485258.1 | Guanarito v. st. INH-95551 | X52400.1 | Nigeria Lassa v.
AY497548.1 | Guanarito v. st. CVH-960101 | AY628206.1 | Lassa v. st. Weller
AY924392.1 | Bear Canyon v. st. AV 98470029 | AY628201.1 | Lassa v. st. Macenta
AY924391.1 | Bear Canyon v. st. AV A0070039 | AY628205.1 | Lassa v. st. Z148
AF512833.1 | Bear canyon v. st. A0060209 | J04324.1 | Lassa v. st. Josiah
DQ865244.1 | Catarina v. st. AV A0400135 | AY772168.1 | Mopeia Lassa reassortant 29
DQ865245.1 | Catarina v. st. AV A0400212 | AY628203.1 | Lassa v. st. Josiah
EU123328.1 | Skinner Tank v. st. AV D1000090 | AF181853.1 | Lassa v. st. LP
EU123331.1 | North American arenav. st. AV 96010024 | AY628207.1 | Lassa v. st. Pinneo
EU123330.1 | North American arenav. st. AV 96010151 | AY628208.1 | Lassa v. st. Acar-3080
AF228063.1 | Whitewater Arroyo v. st. 9310135 | AF181854.1 | Lassa v. st. 803213
AF485264.1 | Whitewater Arroyo v. st. 9310141 | AY342390.1 | Mobala v. st. ACAR-3080-MRC5-P2
EU123329.1 | North American arenav. st. AV D1240007 | M33879.1 | Mopeia v. st. AN-21366
AF485263.1 | Tamiami v. st. CDCW-10777 | AY772170.1 | Mopeia v. st. AN-20410
AF512828.1 | Tamiami v. st. W 10777 | |

Baculoviral sequences
AP006270.1 | Adoxophyes honmai nucleopolyhedrovirus DNA | X77048.1 | Cryptophlebia leucotreta granulosis
AF547984.1 | Adoxophyes orana granulovirus | X79569.1 | Cryptophlebia leucotreta granulosis
NC_005839.2 | Agrotis segetum granulovirus | NC_002816.1 | Cydia pomonella granulovirus
L22858.1 | Autographa californica nucleopolyhedrovirus clone C6 | NC_003083.1 | Epiphyas postvittana NPV
L33180.1 | Bombyx mori nuclear polyhedrosis virus isolate T3 | NC_002654.2 | Helicoverpa armigera
NC_005137.2 | Choristoneura fumiferana DEF MNPV | AF081810.1 | Lymantria dispar
NC_004778.3 | Choristoneura fumiferana MNPV | NC_003529.1 | Mamestra configurata NPV-A
AY864330.1 | Chrysodeixis chalcites NPV | U75930.2 | Orgyia pseudotsugata MNPV
AY456389.1 | Chrysodeixis chalcites NPV | AF499596.1 | Phthorimaea operculella granulovirus
AY456390.1 | Chrysodeixis chalcites NPV | NC_002593.1 | Plutella xylostella granulovirus
AY545786.1 | Chrysodeixis chalcites NPV | NC_004323.1 | Rachiplusia ou MNPV
AY545787.1 | Chrysodeixis chalcites NPV | NC_002169.1 | Spodoptera exigua MNPV
AY229987.1 | Cryptophlebia leucotreta granulovirus | NC_003102.1 | Spodoptera litura NPV
AY096241.1 | Cryptophlebia leucotreta granulovirus | NC_007383.1 | Trichoplusia ni SNPV
AY096242.1 | Cryptophlebia leucotreta granulovirus | |

Pseudomonas sp. sequences
NC_007492.2 | Pseudomonas fluorescens Pf0-1 | NC_004578.1 | Pseudomonas syringae
NC_005773.3 | Pseudomonas syringae | NC_002947.3 | Pseudomonas putida
NC_004129.6 | Pseudomonas fluorescens | NC_002516.2 | Pseudomonas aeruginosa
NC_007005.1 | Pseudomonas syringae | |

Lactobacillus sp. sequences
NC_005362.1 | Lactobacillus johnsonii | NC_002662.1 | Lactococcus lactis subsp.
NC_007576.1 | Lactobacillus sakei subsp. | NC_004567.1 | Lactobacillus plantarum

Figure 2: Primer distribution along one ORF. A collection of the best scoring primers for the nucleoprotein of Arenavirus, comprising 50 primers for the genomic sequence and 50 for the antigenomic sequence, was represented at the corresponding alignment positions. The height of each point indicates the cumulative number of primers corresponding to this position. The alignment was made with 71 arenavirus N protein sequences. (Axes: x, position in the multiple alignment; y, number of primers.)

MNPV p74 gene. Sensitivity of the PCR assay was determined by dilution of cloned fragments from Junin virus [27] and Baculovirus template.

4. Results

4.1. Distribution of Generated Primers. The distribution of the resulting primers along the input sequence was analyzed. For this, the best one hundred primers obtained from a protein alignment were selected. For each position in the alignment, the number of selected primers that correspond to this position was recorded (Figure 2). The test was repeated for different protein alignments.

The selected primers were located around a few hot spots in the alignment. This behavior indicates that there are generally few regions in a sequence alignment useful for degenerate primer design. Many primers found by the program are almost identical, shifted by one or two bases relative to each other and located, in most cases, within a 30–40 base run. Similar results were obtained with all proteins tested.
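
The per-position count plotted in Figure 2 can be sketched in Python as follows. This is an illustrative sketch only: it assumes each selected primer is described by a 0-based start position in the alignment and that all primers share the same length, and the function name is hypothetical.

```python
def primers_per_position(starts: list[int], primer_length: int,
                         alignment_length: int) -> list[int]:
    """Count, for each alignment position, the primers that cover it."""
    counts = [0] * alignment_length
    for start in starts:
        # A primer covers the positions start .. start + primer_length - 1.
        end = min(start + primer_length, alignment_length)
        for position in range(start, end):
            counts[position] += 1
    return counts


# Three primers of length 3, each shifted by one base from the previous one.
print(primers_per_position([0, 1, 2], primer_length=3, alignment_length=6))
```

Primers that differ only by a shift of one or two bases pile up on the same positions, which is what produces the narrow peaks described above.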

4.2. Intragenomic Specificity and Score Analysis. Because the best primers may not be the least degenerate substrings in the collection of candidates, their specificity was tested. It was also necessary to get a more precise understanding of the score assigned by FAS-DPD in terms of specificity. To achieve this, the primers were compared with the complete genome sequences used to design them, looking for unspecific perfect matches.

For this task, a wide range of genome sizes was covered. Four collections of complete genome sequences were used: Arenavirus (genome on the order of 10^4 bases), Baculovirus (genome on the order of 10^5 bases), Lactobacillus (genome on the order of 10^6 bases), and Pseudomonas (genome on the order of 10^6 bases). For each set, a randomly selected genome was used as reference. Each annotated ORF of this genome was used to search for related ORFs in the other genomes of the collection using the local BLAST tool. The BLAST expected value was used to decide when two ORFs were related. When an ORF of the reference genome had a related one in all other genomes, all of them were aligned with ClustalW and used in further analysis.

Each resulting alignment was used as input for FAS-DPD to search for primers. For each genome polarity, the best fifty nonoverlapping primers were selected. This selection was made to avoid a concentration of overrepresented, hot-spot-derived, high score primers. This allowed us to find a balanced set of primers with high and low scores.
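
The text does not state how the nonoverlapping primers were chosen. One simple possibility, shown here purely as an assumption, is a greedy pass over the candidates in decreasing score order that keeps a candidate only if it does not overlap any primer already kept. The `Candidate` type and the function name are hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    start: int    # 0-based start position in the alignment
    score: float  # score assigned to the candidate primer


def best_nonoverlapping(candidates: list[Candidate], primer_length: int,
                        limit: int = 50) -> list[Candidate]:
    """Greedily keep the highest scoring candidates that do not overlap."""
    chosen: list[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        # Two equal-length primers overlap when their starts are closer
        # than one primer length.
        if all(abs(cand.start - kept.start) >= primer_length for kept in chosen):
            chosen.append(cand)
            if len(chosen) == limit:
                break
    return chosen


pool = [Candidate(0, 0.95), Candidate(1, 0.94), Candidate(20, 0.90), Candidate(25, 0.70)]
print([c.start for c in best_nonoverlapping(pool, primer_length=20)])
```

With this rule, a hot spot contributes only its single best primer, so lower scoring primers from other regions enter the selection, in line with the balanced set described above.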

In order to find the relationship between the score calculated for each primer and its specificity, all the primers were compared with all the oligonucleotides of the same size derived from each genome, searching for perfect matches (Figure 3). The results were similar for the four systems, despite their differences in genome size.

There is an inverse correlation between primer score and the number of unspecific perfect matches, but this correlation is not linear. For primers with a score of at least 0.85, the number of unspecific perfect matches with their target genome was generally zero. The number of unspecific perfect matches grew enormously with lower primer scores.
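
The perfect-match search described in this section can be sketched in Python with a regular expression built from the IUPAC code. This is a minimal sketch under stated assumptions, not the procedure used by the authors: it scans only the strand given, counts every window including the intended target site, and uses hypothetical function names.

```python
import re

# Bases allowed at a position for each IUPAC nucleotide code.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_pattern(primer: str) -> re.Pattern:
    """Compile a degenerate primer into a regular expression."""
    body = "".join(f"[{IUPAC_BASES[base]}]" for base in primer.upper())
    # The lookahead makes matches zero-width, so overlapping sites are counted.
    return re.compile(f"(?=({body}))")


def count_perfect_matches(primer: str, genome: str) -> int:
    """Count the genome windows that one of the primer variants matches exactly."""
    return sum(1 for _ in primer_pattern(primer).finditer(genome.upper()))


print(count_perfect_matches("RY", "ACGTAT"))
```

To obtain the number of unspecific matches, the intended binding site would have to be subtracted from this count, and the reverse complement of the genome would have to be scanned as well.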

Figure 3: Specificity of primers. Primers designed for all ORFs shared among each model organism used were compared against the complete set of genomes for perfect matches with oligonucleotides of the same length. Each point represents the number of perfect matches (in log10 scale) of a primer in relation to its score. The length of the primers was 20 nucleotides. (a) Arenavirus genomes: 71 for S (small) RNA, 24 for L (large) RNA. (b) 22 Baculovirus genomes. (c) 5 Lactobacillus sp. genomes. (d) 7 Pseudomonas sp. genomes. (e) A set of primers for Lactobacillus sp. with scores between 0.85 and 0.90 were tested for nonperfect matches that could anneal unspecifically in PCR. Each bar represents the number of matches against the complete set of Lactobacillus genomes. The number below the bar indicates how many bases are shared. (Axes in panels (a) to (d): x, score, from 0.45 to 1; y, log10(number of matches). Axes in panel (e): x, base pairing, from 0 to 20; y, log10(number of matches).)

Biotechnology Research International 7

GPC NIGR 3UTR5UTR86 1458 92 1695 80

G-1058

Arena Arena

p741935

FAS-DPD primersRegion used in design

p74-1334r

Regions used in designFAS-DPD primers

N-527N-918Generic primers used Generic primers used

p74ndash550

(a)

Product sizePrimers

N527ArenaN918Arena

GR1058Arena

Sensitivity

20

20

Annealing

542577468587

Score

08490868086

0828

(FAS-DPDspecific) temperature ( ∘C) (copiesreaction)

p74-1334rp74ndash550

605 bp986 bp947 bp896 bp

2 times 102

2 times 105

(b)

Figure 4 Experimental challenge of designed primers (a) Genomic organization of the Arenaviruses S RNA and P74 ORFArenavirus showsan ambisense coding strategy of the GPC and N ORFs and three noncoding regions 51015840 untranslated region (5UTR) intergenic region (IGR)and 31015840 untranslated region (3UTR) The location of each designed primer (GR1058 N918 N537 and p74-1334r) and specific primers (Arenap74-550) is also shown (b) The results obtained with each pair of primers tested and characteristics of reaction are shown

43 Experimental Challenge In addition to theoretic teststo determine the usefulness of FAS-DPD designed primersexperimental challenges were performed using ArenavirusandBaculovirus asmodelsThe assay consisted in performingPCRs using a pair of primers including a degenerated FAS-DPDdesigned primer and a standard nondegenerated primer(this allowed testing individually each designedprimer) opti-mizing the reaction conditions and measuring its sensitivity

For arenavirus the primers were designed usingsequences of 71 different GenBank records for thenucleoprotein (N protein) and the glycoprotein precursor(GPC protein) From the lists of the highest scoredprimers three were randomly selected and synthesizedfor experimental evaluation one for GPC (GR1058RCNWHRTTNYCRAARCAYTT score 08596) andtwo for N (N527 GGNRYNSWNCCRAAYTGRTT score08494 N918 NANRTTYTCRTANGGRTTNC score08437) (Figure 4(a))

Amplification reactions were performed using each ofthese primers together with the Arena primer CGCAC-CGGGGATCCTAGGC) as nondegenerated counterpartThelatter is a generic primer forArenaviruses thatmatches almostperfectly with the nineteen bases of 31015840 end of the genomicRNA sequence and with the nineteen bases of 31015840 end of theantigenomic RNA sequence of all known arenaviruses Thereaction template was a cDNA corresponding to the Juninvirus small RNA segment which encodes the N and GPCproteins

For Baculovirus one primer (p74-1334r BYRWRNC-CVWRNGGRTCSCA score 08281) was designed using 57sequences of p74 different Baculovirus As its counterpart aspecific primer for Anticarsia gemmatalis MNPV was used

[28] (p75-550r GGcGTGGACGACGTGC) The reactiontemplate was theAnticarsia gemmatalisMNPV p74 isolate 2D[29] gene cloned in a plasmid

PCRs were assayed with different sets of conditions andthe sensitivity was measured Sensitivity achieved with are-navirus primers was high Twenty copies120583L or less of specifictemplate were detected ForBaculovirus the detectionwas notas sensible as for arenavirus but it can be considered as agood sensitivity 2 times 104 copies120583L of specific template weredetectedThis difference can be explained taking into accountthat the divergence observed for baculovirus sequences isgreater than for arenavirus Therefore the score for the p74-1334r primer was lower than that of Arenavirus

44 Increment of Degeneration of FAS-DPD Designed Primersin relation to Minimum Degenerated Substring The aim ofFAS-DPD is to design universal degenerated primers thatare not necessarily the less degenerated sequences of thecollection of candidates In order to know how much degen-eration FAS-DPD designed primers acquire another test wasperformed Given an alignment of homologous ORFs thedegeneration was calculated for the highest scoring primerselected with FAS-DPD and for the minimum degeneratedsubstring of the same length Then the ratio of these twovalues was obtained The comparison was made with thecomplete set of ORF alignments used before (ArenavirusBaculovirus Pseudomonas and Lactobacillus) (Figure 5) Inmore than 90 of the cases the increase of degeneration valueis at most fourfold (eg changing ldquo A rdquo to ldquo N rdquo orldquo A A rdquo to ldquo R W rdquo) Therefore these primers

8 Biotechnology Research International

2500

2000

1500

1000

500

0

Num

ber o

f prim

ers

5911954

826777

943386 983

134 99643

99911

1002

1 2 4 8 16 32 64Degeneration value ratio

Figure 5 Comparison of FAS-DPD designed primers and mini-mumdegenerated substrings Collection of primers with the highestscore designed for all the ORFs shared by all the genomes usedwere compared against the minimum degenerated subsequenceof the same length for each ORF in order to know how muchmore degenerated they are The number below each bar indicatesthe ratio of degeneration between the designed primer and theminimum degeneration substring The number above each barindicates the amount of primers that correspond with the ratiomentioned before The percentages are cumulative with respect toincreasing degeneration ratios and referred to the total number ofprimers used in the test

have only up to two more degenerated positions than thesubstring with minimum degeneration

It is important to note that in general there is notonly one minimum degeneracy substring for each ORF Thedecision of which primer is better must not only take intoaccount the degeneration value The position of degeneratedbases in the sequence is crucialThe ratio of greater increase ofdegeneration foundwas 64 this corresponds to only less than01 of primersThis result shows that FAS-DPD primers aremore degenerated than the less degenerated substring butthis increase of degeneration is slight and does not imply ahigh compromise of the specificity

5 Discussion

In this work we presented a new algorithm implemented inthe FAS-DPD software as an alternative strategy to solvingDPD problems FAS-DPD was designed to use multiplealignments of proteins or nucleic acids as input data andconstructs a consensus degenerate sequence from that whichis then used to design the putative primers

The experimental background knowledge from molecu-lar biology teaches us that in the real world the 31015840 ends ofprimers are key determinants of a successful amplificationFAS-DPD takes into account this property and incorporatesspecial considerations in the global score calculation becom-ing more strict for the 31015840 end than for the 51015840 end

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collectionsof whole genomes ranging from 104 bp to 106 bp The

restriction to higher lengths was due to the lack of wholegenome collections for genus of bigger sizes with severalindividuals In all genome collections assayed the resultsshowed the same behavior there is a relationship betweenthe score value and the number of unspecific perfectmatchesThis analysis allows us to suggest a cut-off score (085) forprimers that could be more successful

PCRs were successfully performed on arenaviral andbaculoviral models For arenavirus the designed GPC or Nprimers were used with the universal Arena primer [30]For Baculovirus the designed p74 primer was used with aspecific p74 primer [28] Each reaction was tested in differentconditions in order to optimize its yield

FAS-DPD software is licensed under GNU GeneralPublic License Version 3 and is available at httpwwwgithubcomjavierisertefas-dpd

In general the results suggest that FAS-DPD could beused to design generalized degenerate primers for detectionof known or unknownmembers of gene families or organismfamilies including different types of pathogens Also this toolwould allow a more efficient search for enzymes and otherproteins with commercial or biotechnological importancemaking for a faster and cheaper research process



Table 1: Continued.

Acc. number     Sequence description
AY096241.1      Cryptophlebia leucotreta granulovirus
NC_007383.1     Trichoplusia ni SNPV
AY096242.1      Cryptophlebia leucotreta granulovirus

Pseudomonas sp. sequences
NC_007492.2     Pseudomonas fluorescens Pf0-1
NC_004578.1     Pseudomonas syringae
NC_005773.3     Pseudomonas syringae
NC_002947.3     Pseudomonas putida
NC_004129.6     Pseudomonas fluorescens
NC_002516.2     Pseudomonas aeruginosa
NC_007005.1     Pseudomonas syringae

Lactobacillus sp. sequences
NC_005362.1     Lactobacillus johnsonii
NC_002662.1     Lactococcus lactis subsp.
NC_007576.1     Lactobacillus sakei subsp.
NC_004567.1     Lactobacillus plantarum

[Figure 2 plot: x-axis, position in the multiple alignment (0 to 1800); y-axis, number of primers (0 to 12).]

Figure 2: Primer distribution along one ORF. A collection of the best scoring primers for the nucleoprotein of Arenavirus, comprising 50 primers for the genomic sequence and 50 for the antigenomic sequence, was plotted at the corresponding alignment positions. The height of each point indicates the cumulative number of primers corresponding to that position. The alignment was made with 71 arenavirus N protein sequences.

MNPV p74 gene. Sensitivity of the PCR assay was determined by dilution of cloned fragments from Junin virus [27] and Baculovirus template.

4. Results

4.1. Distribution of Generated Primers. The distribution of the resulting primers along the input sequence was analyzed. For this, the best one hundred primers obtained from a protein alignment were selected. For each position in the alignment, the number of the selected primers that correspond to this position was recorded (Figure 2). The test was repeated for different protein alignments.

The selected primers were located around a few hot spots in the alignment. This behavior indicates that there are generally few regions in a sequence alignment useful for degenerate primer design. Many primers found by the program are almost identical, shifted by one or two bases relative to each other, and located in most cases within a 30–40 base run. Similar results were obtained with all proteins tested.
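The per-position count described in this subsection can be illustrated with a minimal Python sketch. It assumes that a primer "corresponds" to every alignment column it spans, which is one plausible reading of the text and not necessarily how FAS-DPD records it; the function name and the example coordinates are hypothetical.

```python
def primer_coverage(primers, alignment_length):
    """Count, for each alignment column, how many primers span it.

    primers: iterable of (start, length) pairs with a 0-based start column.
    Assumption: a primer counts toward every column it covers.
    """
    counts = [0] * alignment_length
    for start, length in primers:
        # Clip at the end of the alignment so short tails do not overflow.
        for pos in range(start, min(start + length, alignment_length)):
            counts[pos] += 1
    return counts


# Hypothetical example: two 20-nt primers shifted by one base form a
# small "hot spot", while a third primer sits elsewhere on its own.
example = [(100, 20), (101, 20), (400, 20)]
coverage = primer_coverage(example, 500)
print(coverage[100], coverage[101], coverage[400])  # 1 2 1
```

Plotting such counts against the column index would give a profile of the kind shown in Figure 2, with peaks where near-identical primers pile up.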

4.2. Intragenomic Specificity and Score Analysis. Because it is possible that the best primers are not the least degenerated substrings in the collection of candidates, their specificity was tested. Also, it was necessary to get a more precise understanding of the score assigned by FAS-DPD in terms of specificity. To achieve this, the primers were compared with the complete genome sequences used to design them, looking for unspecific perfect matches.

For this task, a wide range of genome sizes was covered. Four collections of complete genome sequences were used: Arenavirus (genomes on the order of 10^4 bases), Baculovirus (10^5 bases), Lactobacillus (10^6 bases), and Pseudomonas (10^6 bases). For each set, a randomly selected genome was used as reference. Each annotated ORF of this genome was used to search for related ORFs in the other genomes of the collection using the local BLAST tool. The BLAST expected value was used to decide when two ORFs were related. When an ORF of the reference genome had a related one in all other genomes, all of them were aligned with ClustalW and used in further analysis.

Each resulting alignment was used as input for FAS-DPD to search for primers. For each genome polarity, the best fifty nonoverlapping primers were selected. This selection was made to avoid a concentration of overrepresented, hot-spot-derived, high-score primers. This allowed us to find a balanced set of primers with high and low scores.
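The text does not say how the nonoverlapping primers were picked. As an illustration only, the following Python sketch uses a simple greedy rule (take candidates in decreasing score order and skip any that overlap one already chosen); the function name, the tuple layout, and the example values are assumptions made for this sketch.

```python
def select_non_overlapping(candidates, limit=50):
    """Greedily keep the highest-scoring primers that do not overlap.

    candidates: iterable of (score, start, length) tuples.
    Assumption: greedy selection by score; the original procedure
    may differ.
    """
    chosen = []
    for score, start, length in sorted(candidates, key=lambda c: c[0], reverse=True):
        end = start + length
        # Two half-open intervals are disjoint if one ends before the other starts.
        if all(end <= s or start >= s + l for _, s, l in chosen):
            chosen.append((score, start, length))
            if len(chosen) == limit:
                break
    return chosen


# Hypothetical candidates: the second one overlaps the first and is skipped.
picked = select_non_overlapping([(0.90, 10, 20), (0.89, 11, 20), (0.70, 40, 20)])
print(picked)  # [(0.9, 10, 20), (0.7, 40, 20)]
```

A rule of this kind naturally yields a mix of high and low scores, because near-duplicates of the top primer are excluded and lower-scoring regions get a chance to contribute.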

In order to find the relationship between the score calculated for each primer and its specificity, all the primers were compared with all the oligonucleotides of the same size derived from each genome, searching for perfect matches (Figure 3). The results were similar for the four systems, despite their differences in genome size.
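A perfect-match search of this kind can be sketched in a few lines of Python. The sketch below is not the authors' implementation: it assumes the standard IUPAC nucleotide codes, scans only the strand it is given, and uses a hypothetical function name.

```python
# Standard IUPAC nucleotide codes and the bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_perfect_matches(primer, genome):
    """Count genome windows that a degenerate primer matches at every position."""
    primer = primer.upper()
    genome = genome.upper()
    k = len(primer)
    allowed = [IUPAC[base] for base in primer]
    hits = 0
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if all(window[j] in allowed[j] for j in range(k)):
            hits += 1
    return hits


# Toy example: "RY" (purine then pyrimidine) matches AC, GT and AT here.
print(count_perfect_matches("RY", "ACGTAT"))  # 3
```

To count unspecific matches with such a function, one would subtract the intended target site from the total and repeat the scan on the reverse complement of each genome.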

There is an inverse correlation between primer score and the number of unspecific perfect matches, but this correlation is not linear. Primers with a minimal score of 0.85 generally had zero unspecific perfect matches against their target genome. The number of unspecific perfect matches grew enormously with lower primer scores.

[Figure 3 plots: panels (a) Arenavirus, (b) Baculovirus, (c) Lactobacillus sp., and (d) Pseudomonas sp. show log10(number of matches) against score (0.45 to 1); panel (e) shows log10(number of matches) against base pairing (0 to 20).]

Figure 3: Specificity of primers. Primers designed for all ORFs shared among each model organism used were compared against the complete set of genomes for perfect matches with oligonucleotides of the same length. Each point represents the number of perfect matches (in log10 scale) of a primer in relation to its score. The length of the primers was 20 nucleotides. (a) Arenavirus genomes: 71 for S (small) RNA, 24 for L (large) RNA. (b) 22 Baculovirus genomes. (c) 5 Lactobacillus sp. genomes. (d) 7 Pseudomonas sp. genomes. (e) A set of primers for Lactobacillus sp. with scores between 0.85 and 0.90 was tested for nonperfect matches that could anneal unspecifically in PCR. Each bar represents the number of matches against the complete set of Lactobacillus genomes. The number below the bar indicates how many bases are shared.

[Figure 4(a) diagram: Arenavirus S RNA with 5′UTR (86), GPC (1458), IGR (92), N (1695), and 3′UTR (80); p74 ORF (1935). It marks the regions used in design, the FAS-DPD primers (G-1058, N-527, N-918, p74-1334r), and the generic primers used (Arena, p74-550).]

Figure 4(b):

Primers (FAS-DPD/specific)   Product size   Annealing temperature (°C)   Score   Sensitivity (copies/reaction)
N527/Arena                   605 bp         54.2                         0.849   20
N918/Arena                   986 bp         57.7                         0.868   20
GR1058/Arena                 947 bp         46.8                         0.86    2 × 10^2
p74-1334r/p74-550            896 bp         58.7                         0.828   2 × 10^5

Figure 4: Experimental challenge of designed primers. (a) Genomic organization of the Arenavirus S RNA and the p74 ORF. Arenavirus shows an ambisense coding strategy of the GPC and N ORFs and three noncoding regions: 5′ untranslated region (5′UTR), intergenic region (IGR), and 3′ untranslated region (3′UTR). The location of each designed primer (GR1058, N918, N527, and p74-1334r) and of the specific primers (Arena, p74-550) is also shown. (b) The results obtained with each pair of primers tested and the characteristics of each reaction are shown.

4.3. Experimental Challenge. In addition to theoretical tests to determine the usefulness of FAS-DPD designed primers, experimental challenges were performed using Arenavirus and Baculovirus as models. The assay consisted of performing PCRs using a pair of primers, including a degenerated FAS-DPD designed primer and a standard nondegenerated primer (this allowed testing each designed primer individually), optimizing the reaction conditions, and measuring its sensitivity.

For arenavirus, the primers were designed using sequences of 71 different GenBank records for the nucleoprotein (N protein) and the glycoprotein precursor (GPC protein). From the lists of the highest scored primers, three were randomly selected and synthesized for experimental evaluation: one for GPC (GR1058: RCNWHRTTNYCRAARCAYTT, score: 0.8596) and two for N (N527: GGNRYNSWNCCRAAYTGRTT, score: 0.8494; N918: NANRTTYTCRTANGGRTTNC, score: 0.8437) (Figure 4(a)).

Amplification reactions were performed using each of these primers together with the Arena primer (CGCACCGGGGATCCTAGGC) as nondegenerated counterpart. The latter is a generic primer for arenaviruses that matches almost perfectly with the nineteen bases at the 3′ end of the genomic RNA sequence and with the nineteen bases at the 3′ end of the antigenomic RNA sequence of all known arenaviruses. The reaction template was a cDNA corresponding to the Junin virus small RNA segment, which encodes the N and GPC proteins.

For Baculovirus, one primer (p74-1334r: BYRWRNCCVWRNGGRTCSCA, score: 0.8281) was designed using 57 p74 sequences from different baculoviruses. As its counterpart, a specific primer for Anticarsia gemmatalis MNPV was used [28] (p75-550r: GGcGTGGACGACGTGC). The reaction template was the p74 gene of Anticarsia gemmatalis MNPV isolate 2D [29], cloned in a plasmid.

PCRs were assayed with different sets of conditions and the sensitivity was measured. Sensitivity achieved with arenavirus primers was high: twenty copies/μL or less of specific template were detected. For Baculovirus, the detection was not as sensitive as for arenavirus, but the sensitivity can still be considered good: 2 × 10^4 copies/μL of specific template were detected. This difference can be explained by taking into account that the divergence observed for baculovirus sequences is greater than for arenavirus. Accordingly, the score for the p74-1334r primer was lower than those of the arenavirus primers.

4.4. Increment of Degeneration of FAS-DPD Designed Primers in relation to Minimum Degenerated Substring. The aim of FAS-DPD is to design universal degenerated primers that are not necessarily the least degenerated sequences of the collection of candidates. In order to know how much degeneration FAS-DPD designed primers acquire, another test was performed. Given an alignment of homologous ORFs, the degeneration was calculated for the highest scoring primer selected with FAS-DPD and for the minimum degenerated substring of the same length. Then, the ratio of these two values was obtained. The comparison was made with the complete set of ORF alignments used before (Arenavirus, Baculovirus, Pseudomonas, and Lactobacillus) (Figure 5). In more than 90% of the cases, the increase of degeneration value is at most fourfold (e.g., changing "A" to "N" or "AA" to "RW"). Therefore, these primers have only up to two more degenerated positions than the substring with minimum degeneration.

[Figure 5 bar chart: x-axis, degeneration value ratio; y-axis, number of primers (0 to 2500).]

Degeneration value ratio   Number of primers   Cumulative percentage
1                          1954                59.1%
2                          777                 82.6%
4                          386                 94.3%
8                          134                 98.3%
16                         43                  99.6%
32                         11                  99.9%
64                         2                   100%

Figure 5: Comparison of FAS-DPD designed primers and minimum degenerated substrings. The collection of primers with the highest score designed for all the ORFs shared by all the genomes used was compared against the minimum degenerated subsequence of the same length for each ORF, in order to know how much more degenerated they are. The number below each bar indicates the ratio of degeneration between the designed primer and the minimum degeneration substring. The number above each bar indicates the number of primers that correspond to that ratio. The percentages are cumulative with respect to increasing degeneration ratios and refer to the total number of primers used in the test.

It is important to note that, in general, there is not only one minimum degeneracy substring for each ORF. The decision of which primer is better must not only take into account the degeneration value; the position of degenerated bases in the sequence is crucial. The greatest increase of degeneration found was a ratio of 64, which corresponds to less than 0.1% of primers. This result shows that FAS-DPD primers are more degenerated than the least degenerated substring, but this increase of degeneration is slight and does not imply a high compromise of the specificity.
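The ratio discussed in this subsection can be made concrete with a short Python sketch. It assumes that the degeneration value of a primer is the product of the number of bases allowed at each position (the number of distinct sequences the primer represents), which is consistent with the "A" to "N" and "AA" to "RW" examples but is not stated explicitly in the text; the function names are hypothetical.

```python
from math import prod

# Number of bases allowed by each standard IUPAC nucleotide code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(DEGENERACY[base] for base in primer.upper())


def degeneracy_ratio(designed, minimum):
    """Ratio between a designed primer and a minimum-degeneracy substring."""
    return degeneracy(designed) / degeneracy(minimum)


# Both examples from the text give a fourfold increase.
print(degeneracy_ratio("N", "A"))    # 4.0
print(degeneracy_ratio("RW", "AA"))  # 4.0
```

Under this definition, a fourfold increase corresponds either to one fully degenerated position or to two twofold-degenerated positions, which matches the statement that these primers have at most two additional degenerated positions.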

5. Discussion

In this work, we presented a new algorithm, implemented in the FAS-DPD software, as an alternative strategy for solving DPD problems. FAS-DPD was designed to use multiple alignments of proteins or nucleic acids as input data; it constructs a consensus degenerate sequence from the alignment, which is then used to design the putative primers.

Experimental background knowledge from molecular biology teaches us that, in the real world, the 3′ ends of primers are key determinants of a successful amplification. FAS-DPD takes this property into account and incorporates special considerations in the global score calculation, becoming stricter for the 3′ end than for the 5′ end.

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collections of whole genomes, ranging from 10^4 bp to 10^6 bp. The restriction to higher lengths was due to the lack of whole-genome collections for genera of bigger sizes with several individuals. In all genome collections assayed, the results showed the same behavior: there is a relationship between the score value and the number of unspecific perfect matches. This analysis allows us to suggest a cut-off score (0.85) for primers that could be more successful.

PCRs were successfully performed on arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested in different conditions in order to optimize its yield.

FAS-DPD software is licensed under GNU General Public License, Version 3, and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for detection of known or unknown members of gene families or organism families, including different types of pathogens. Also, this tool would allow a more efficient search for enzymes and other proteins with commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, "Fingerprinting genomes using PCR with arbitrary primers," Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, "DNA polymorphisms amplified by arbitrary primers are useful as genetic markers," Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, "Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis," Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, "Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species," Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, "A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA," Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, "OSP: a computer program for choosing PCR and DNA sequencing primers," PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, "PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing," Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, "Primer Master: a new program for the design and analysis of PCR primers," Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.

[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, "Primer design for large scale sequencing," Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, "Primer3 on the WWW for general users and for biologist programmers," Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, "The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses," Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, "Primaclade: a flexible tool to find conserved PCR primers across multiple species," Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, "Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets," Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, "The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis," BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, "Software to determine optimal oligonucleotide sequences based on hybridization simulation data," BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, "Degenerate primer design: theoretical analysis and the HYDEN program," Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, "The degenerate primer design problem," Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, "Degenerate primer design via clustering," IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, "CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design," Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, "CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes," Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, "iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences," Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, "An efficient algorithm for minimum degeneracy primer selection," IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, "CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice," Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, "Basic local alignment search tool," Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, "A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics," Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero et al., "Concerning the epidemic outbreak in Junin," El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich et al., "Functional and structural characterisation of AgMNPV ie1," Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak et al., "Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus," Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, "Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy," Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.



43 Experimental Challenge In addition to theoretic teststo determine the usefulness of FAS-DPD designed primersexperimental challenges were performed using ArenavirusandBaculovirus asmodelsThe assay consisted in performingPCRs using a pair of primers including a degenerated FAS-DPDdesigned primer and a standard nondegenerated primer(this allowed testing individually each designedprimer) opti-mizing the reaction conditions and measuring its sensitivity

For arenavirus the primers were designed usingsequences of 71 different GenBank records for thenucleoprotein (N protein) and the glycoprotein precursor(GPC protein) From the lists of the highest scoredprimers three were randomly selected and synthesizedfor experimental evaluation one for GPC (GR1058RCNWHRTTNYCRAARCAYTT score 08596) andtwo for N (N527 GGNRYNSWNCCRAAYTGRTT score08494 N918 NANRTTYTCRTANGGRTTNC score08437) (Figure 4(a))

Amplification reactions were performed using each ofthese primers together with the Arena primer CGCAC-CGGGGATCCTAGGC) as nondegenerated counterpartThelatter is a generic primer forArenaviruses thatmatches almostperfectly with the nineteen bases of 31015840 end of the genomicRNA sequence and with the nineteen bases of 31015840 end of theantigenomic RNA sequence of all known arenaviruses Thereaction template was a cDNA corresponding to the Juninvirus small RNA segment which encodes the N and GPCproteins

For Baculovirus, one primer (p74-1334r: BYRWRNCCVWRNGGRTCSCA, score: 0.8281) was designed using 57 p74 sequences from different baculoviruses. As its counterpart, a specific primer for Anticarsia gemmatalis MNPV was used [28] (p75-550r: GGcGTGGACGACGTGC). The reaction template was the p74 gene of Anticarsia gemmatalis MNPV isolate 2D [29], cloned in a plasmid.

PCRs were assayed under different sets of conditions, and the sensitivity was measured. The sensitivity achieved with the arenavirus primers was high: twenty copies/µL or fewer of specific template were detected. For Baculovirus, detection was not as sensitive as for arenavirus, but it can still be considered good: 2 × 10^4 copies/µL of specific template were detected. This difference can be explained by the fact that the divergence observed among baculovirus sequences is greater than among arenavirus sequences; consequently, the score of the p74-1334r primer was lower than those of the Arenavirus primers.

4.4. Increment of Degeneration of FAS-DPD Designed Primers in relation to Minimum Degenerated Substring. The aim of FAS-DPD is to design universal degenerate primers, which are not necessarily the least degenerated sequences in the collection of candidates. To find out how much degeneration FAS-DPD designed primers acquire, another test was performed. Given an alignment of homologous ORFs, the degeneration was calculated for the highest scoring primer selected with FAS-DPD and for the minimum degenerated substring of the same length. Then, the ratio of these two values was obtained. The comparison was made with the complete set of ORF alignments used before (Arenavirus, Baculovirus, Pseudomonas, and Lactobacillus) (Figure 5). In more than 90% of the cases, the increase of the degeneration value is at most fourfold (e.g., changing “A” to “N” or “AA” to “RW”). Therefore, these primers have at most two more degenerated positions than the substring with minimum degeneration.

[Figure 5, bar chart. x-axis: degeneration value ratio; y-axis: number of primers. Bar labels:]

Degeneration value ratio    Number of primers    Cumulative percentage
1                           1954                 59.1
2                           777                  82.6
4                           386                  94.3
8                           134                  98.3
16                          43                   99.6
32                          11                   99.9
64                          2                    100

Figure 5: Comparison of FAS-DPD designed primers and minimum degenerated substrings. The highest scoring primers designed for all the ORFs shared by all the genomes used were compared against the minimum degenerated subsequence of the same length for each ORF, in order to know how much more degenerated they are. The number below each bar indicates the ratio of degeneration between the designed primer and the minimum degeneration substring. The number above each bar indicates the number of primers with that ratio. The percentages are cumulative with respect to increasing degeneration ratios and refer to the total number of primers used in the test.
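The fourfold examples above can be checked with a short Python sketch. It assumes that the degeneration value of a sequence is the number of distinct nondegenerate sequences it encodes, that is, the product of the number of bases allowed at each position; the function names `degeneracy` and `degeneration_ratio` are illustrative and are not taken from the FAS-DPD source.

```python
from math import prod

# Number of bases represented by each IUPAC code.
N_BASES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}

def degeneracy(seq):
    """Number of distinct nondegenerate sequences encoded by seq (assumed definition)."""
    return prod(N_BASES[code] for code in seq.upper())

def degeneration_ratio(primer, min_substring):
    """Ratio between the degeneracy of a primer and that of a reference substring."""
    return degeneracy(primer) / degeneracy(min_substring)

print(degeneration_ratio("N", "A"))        # 4.0, one position going from A to N
print(degeneration_ratio("RW", "AA"))      # 4.0, two positions each becoming twofold
print(degeneracy("RCNWHRTTNYCRAARCAYTT"))  # 6144 for the GR1058 primer
```

Under this definition a ratio of 4 corresponds either to one fully degenerate position (N) or to two twofold positions, which is why a fourfold increase means at most two additional degenerated positions.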

It is important to note that, in general, there is more than one minimum degeneracy substring for each ORF. The choice of the best primer must not take into account the degeneration value alone; the position of the degenerated bases in the sequence is crucial. The greatest increase of degeneration found was a ratio of 64, which corresponds to less than 0.1% of the primers. This result shows that FAS-DPD primers are more degenerated than the least degenerated substring, but this increase of degeneration is slight and does not greatly compromise specificity.
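The percentages quoted for Figure 5 can be reproduced from the bar labels. The sketch below assumes that the labels fused in the figure split into the counts 1954, 777, 386, 134, 43, 11, and 2 for ratios 1 to 64; that reading is an interpretation of the figure, checked only by the fact that it yields the cumulative percentages printed on the bars.

```python
from itertools import accumulate

ratios = [1, 2, 4, 8, 16, 32, 64]
counts = [1954, 777, 386, 134, 43, 11, 2]  # assumed reading of the Figure 5 bar labels

total = sum(counts)
# Cumulative percentage of primers with a ratio up to and including each value.
cumulative = [round(100 * running / total, 1) for running in accumulate(counts)]

for ratio, count, pct in zip(ratios, counts, cumulative):
    print(f"ratio {ratio:>2}: {count:>4} primers, cumulative {pct}%")

# Share of primers showing the largest ratio (64).
share_64 = 100 * counts[-1] / total
print(f"ratio 64 share: {share_64:.2f}%")
```

With these counts, 94.3% of the primers have a ratio of at most 4, and the ratio of 64 accounts for about 0.06% of them, in agreement with the "more than 90%" and "less than 0.1%" statements in the text.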

5. Discussion

In this work, we presented a new algorithm, implemented in the FAS-DPD software, as an alternative strategy for solving DPD problems. FAS-DPD was designed to take multiple alignments of proteins or nucleic acids as input and to construct from them a consensus degenerate sequence, which is then used to design the putative primers.

Experimental knowledge from molecular biology teaches us that, in practice, the 3′ ends of primers are key determinants of a successful amplification. FAS-DPD takes this property into account and incorporates special considerations in the global score calculation, being stricter for the 3′ end than for the 5′ end.
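The idea of being stricter at the 3′ end can be illustrated with a toy position-weighted score. The sketch below is emphatically not the FAS-DPD global score: the linear weighting scheme, the per-position score of one over the number of allowed bases, and the name `weighted_score` are all assumptions chosen only to show how the same degenerate base costs more near the 3′ end than near the 5′ end.

```python
# Number of bases represented by each IUPAC code.
N_BASES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4,
}

def weighted_score(primer, min_weight=1.0, max_weight=3.0):
    """Toy score in (0, 1] for a non-empty primer written 5' to 3' (hypothetical scheme)."""
    n = len(primer)
    total = 0.0
    weight_sum = 0.0
    for i, code in enumerate(primer.upper()):
        # The weight grows linearly from the 5' end (i = 0) to the 3' end (i = n - 1).
        weight = min_weight + (max_weight - min_weight) * i / max(n - 1, 1)
        # Per-position score: 1 for a unique base, 1/4 for N.
        total += weight / N_BASES[code]
        weight_sum += weight
    return total / weight_sum

five_prime_n = weighted_score("NACGTACGTA")   # degenerate base at the 5' end
three_prime_n = weighted_score("AACGTACGTN")  # same degenerate base at the 3' end
print(five_prime_n > three_prime_n)           # True
```

Any monotonically increasing weight would show the same effect; the linear form and the 1 to 3 range are arbitrary choices for the illustration.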

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collections of whole genomes ranging from 10^4 bp to 10^6 bp. Larger genome sizes could not be tested owing to the lack of whole genome collections, with several members each, for genera with larger genomes. In all genome collections assayed, the results showed the same behavior: there is a relationship between the score value and the number of unspecific perfect matches. This analysis allows us to suggest a cut-off score (0.85) for primers that could be more successful.

PCRs were successfully performed on arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested under different conditions in order to optimize its yield.

FAS-DPD software is licensed under the GNU General Public License, Version 3, and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for the detection of known or unknown members of gene families or organism families, including different types of pathogens. This tool would also allow a more efficient search for enzymes and other proteins of commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, “Fingerprinting genomes using PCR with arbitrary primers,” Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, “DNA polymorphisms amplified by arbitrary primers are useful as genetic markers,” Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, “Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, “Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species,” Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, “A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA,” Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, “OSP: a computer program for choosing PCR and DNA sequencing primers,” PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, “PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing,” Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, “Primer Master: a new program for the design and analysis of PCR primers,” Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.


[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, “Primer design for large scale sequencing,” Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, “Primer3 on the WWW for general users and for biologist programmers,” Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, “The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses,” Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, “Primaclade: a flexible tool to find conserved PCR primers across multiple species,” Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, “Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets,” Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, “The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis,” BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, “Software to determine optimal oligonucleotide sequences based on hybridization simulation data,” BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, “Degenerate primer design: theoretical analysis and the HYDEN program,” Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, “The degenerate primer design problem,” Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, “Degenerate primer design via clustering,” IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, “CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design,” Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, “CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes,” Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, “iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences,” Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, “An efficient algorithm for minimum degeneracy primer selection,” IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero et al., “Concerning the epidemic outbreak in Junin,” El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich et al., “Functional and structural characterisation of AgMNPV ie1,” Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak et al., “Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus,” Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, “Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy,” Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.


Page 7: Research Article Family-Specific Degenerate Primer Design ...tant analytical tools of molecular biology, allows a highly sensitivedetectionandspeci cgenotypingofenvironmental ... primer.Finally,toobtainascaledglobalscore(

Biotechnology Research International 7

GPC NIGR 3UTR5UTR86 1458 92 1695 80

G-1058

Arena Arena

p741935

FAS-DPD primersRegion used in design

p74-1334r

Regions used in designFAS-DPD primers

N-527N-918Generic primers used Generic primers used

p74ndash550

(a)

Product sizePrimers

N527ArenaN918Arena

GR1058Arena

Sensitivity

20

20

Annealing

542577468587

Score

08490868086

0828

(FAS-DPDspecific) temperature ( ∘C) (copiesreaction)

p74-1334rp74ndash550

605 bp986 bp947 bp896 bp

2 times 102

2 times 105

(b)

Figure 4 Experimental challenge of designed primers (a) Genomic organization of the Arenaviruses S RNA and P74 ORFArenavirus showsan ambisense coding strategy of the GPC and N ORFs and three noncoding regions 51015840 untranslated region (5UTR) intergenic region (IGR)and 31015840 untranslated region (3UTR) The location of each designed primer (GR1058 N918 N537 and p74-1334r) and specific primers (Arenap74-550) is also shown (b) The results obtained with each pair of primers tested and characteristics of reaction are shown

43 Experimental Challenge In addition to theoretic teststo determine the usefulness of FAS-DPD designed primersexperimental challenges were performed using ArenavirusandBaculovirus asmodelsThe assay consisted in performingPCRs using a pair of primers including a degenerated FAS-DPDdesigned primer and a standard nondegenerated primer(this allowed testing individually each designedprimer) opti-mizing the reaction conditions and measuring its sensitivity

For arenavirus the primers were designed usingsequences of 71 different GenBank records for thenucleoprotein (N protein) and the glycoprotein precursor(GPC protein) From the lists of the highest scoredprimers three were randomly selected and synthesizedfor experimental evaluation one for GPC (GR1058RCNWHRTTNYCRAARCAYTT score 08596) andtwo for N (N527 GGNRYNSWNCCRAAYTGRTT score08494 N918 NANRTTYTCRTANGGRTTNC score08437) (Figure 4(a))

Amplification reactions were performed using each ofthese primers together with the Arena primer CGCAC-CGGGGATCCTAGGC) as nondegenerated counterpartThelatter is a generic primer forArenaviruses thatmatches almostperfectly with the nineteen bases of 31015840 end of the genomicRNA sequence and with the nineteen bases of 31015840 end of theantigenomic RNA sequence of all known arenaviruses Thereaction template was a cDNA corresponding to the Juninvirus small RNA segment which encodes the N and GPCproteins

For Baculovirus one primer (p74-1334r BYRWRNC-CVWRNGGRTCSCA score 08281) was designed using 57sequences of p74 different Baculovirus As its counterpart aspecific primer for Anticarsia gemmatalis MNPV was used

[28] (p75-550r GGcGTGGACGACGTGC) The reactiontemplate was theAnticarsia gemmatalisMNPV p74 isolate 2D[29] gene cloned in a plasmid

PCRs were assayed with different sets of conditions andthe sensitivity was measured Sensitivity achieved with are-navirus primers was high Twenty copies120583L or less of specifictemplate were detected ForBaculovirus the detectionwas notas sensible as for arenavirus but it can be considered as agood sensitivity 2 times 104 copies120583L of specific template weredetectedThis difference can be explained taking into accountthat the divergence observed for baculovirus sequences isgreater than for arenavirus Therefore the score for the p74-1334r primer was lower than that of Arenavirus

44 Increment of Degeneration of FAS-DPD Designed Primersin relation to Minimum Degenerated Substring The aim ofFAS-DPD is to design universal degenerated primers thatare not necessarily the less degenerated sequences of thecollection of candidates In order to know how much degen-eration FAS-DPD designed primers acquire another test wasperformed Given an alignment of homologous ORFs thedegeneration was calculated for the highest scoring primerselected with FAS-DPD and for the minimum degeneratedsubstring of the same length Then the ratio of these twovalues was obtained The comparison was made with thecomplete set of ORF alignments used before (ArenavirusBaculovirus Pseudomonas and Lactobacillus) (Figure 5) Inmore than 90 of the cases the increase of degeneration valueis at most fourfold (eg changing ldquo A rdquo to ldquo N rdquo orldquo A A rdquo to ldquo R W rdquo) Therefore these primers

8 Biotechnology Research International

2500

2000

1500

1000

500

0

Num

ber o

f prim

ers

5911954

826777

943386 983

134 99643

99911

1002

1 2 4 8 16 32 64Degeneration value ratio

Figure 5 Comparison of FAS-DPD designed primers and mini-mumdegenerated substrings Collection of primers with the highestscore designed for all the ORFs shared by all the genomes usedwere compared against the minimum degenerated subsequenceof the same length for each ORF in order to know how muchmore degenerated they are The number below each bar indicatesthe ratio of degeneration between the designed primer and theminimum degeneration substring The number above each barindicates the amount of primers that correspond with the ratiomentioned before The percentages are cumulative with respect toincreasing degeneration ratios and referred to the total number ofprimers used in the test

have only up to two more degenerated positions than thesubstring with minimum degeneration

It is important to note that in general there is notonly one minimum degeneracy substring for each ORF Thedecision of which primer is better must not only take intoaccount the degeneration value The position of degeneratedbases in the sequence is crucialThe ratio of greater increase ofdegeneration foundwas 64 this corresponds to only less than01 of primersThis result shows that FAS-DPD primers aremore degenerated than the less degenerated substring butthis increase of degeneration is slight and does not imply ahigh compromise of the specificity

5 Discussion

In this work we presented a new algorithm implemented inthe FAS-DPD software as an alternative strategy to solvingDPD problems FAS-DPD was designed to use multiplealignments of proteins or nucleic acids as input data andconstructs a consensus degenerate sequence from that whichis then used to design the putative primers

The experimental background knowledge from molecu-lar biology teaches us that in the real world the 31015840 ends ofprimers are key determinants of a successful amplificationFAS-DPD takes into account this property and incorporatesspecial considerations in the global score calculation becom-ing more strict for the 31015840 end than for the 51015840 end

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collectionsof whole genomes ranging from 104 bp to 106 bp The

restriction to higher lengths was due to the lack of wholegenome collections for genus of bigger sizes with severalindividuals In all genome collections assayed the resultsshowed the same behavior there is a relationship betweenthe score value and the number of unspecific perfectmatchesThis analysis allows us to suggest a cut-off score (085) forprimers that could be more successful

PCRs were successfully performed on arenaviral andbaculoviral models For arenavirus the designed GPC or Nprimers were used with the universal Arena primer [30]For Baculovirus the designed p74 primer was used with aspecific p74 primer [28] Each reaction was tested in differentconditions in order to optimize its yield

FAS-DPD software is licensed under GNU GeneralPublic License Version 3 and is available at httpwwwgithubcomjavierisertefas-dpd

In general the results suggest that FAS-DPD could beused to design generalized degenerate primers for detectionof known or unknownmembers of gene families or organismfamilies including different types of pathogens Also this toolwould allow a more efficient search for enzymes and otherproteins with commercial or biotechnological importancemaking for a faster and cheaper research process

References

[1] K Nelson Metagenomics as a Tool to Study Biodiversity ASMPress Washington DC USA 2008

[2] J Welsh and M McClelland ldquoFingerprinting genomes usingPCR with arbitrary primersrdquoNucleic Acids Research vol 18 no24 pp 7213ndash7218 1990

[3] J G KWilliams A R Kubelik K J Livak J A Rafalski and SV Tingey ldquoDNApolymorphisms amplified by arbitrary primersare useful as geneticmarkersrdquoNucleic Acids Research vol 18 no22 pp 6531ndash6535 1990

[4] W C Nichols S E Lyons J S Harrison R L Cody and DGinsburg ldquoSevere vonWillebrand disease due to a defect at thelevel of von Willebrand factor mRNA expression detection byexonic PCR-restriction fragment length polymorphism analy-sisrdquoProceedings of theNationalAcademy of Sciences of theUnitedStates of America vol 88 no 9 pp 3857ndash3861 1991

[5] EDesmarais I Lanneluc and J Lagnel ldquoDirect amplification oflength polymorphisms (DALP) or how to get and characterizenew genetic markers in many speciesrdquo Nucleic Acids Researchvol 26 no 6 pp 1458ndash1465 1998

[6] W Rychlik and R E Rhoads ldquoA computer program for choos-ing optimal oligonucleotides for filter hybridization sequencingand in vitro amplification of DNArdquo Nucleic Acids Research vol17 no 21 pp 8543ndash8551 1989

[7] L Hillier and P Green ldquoOSP a computer program for choosingPCR and DNA sequencing primersrdquo PCR Methods and Appli-cations vol 1 no 2 pp 124ndash128 1991

[8] P Li K C Kupfer C J Davies D Burbee G A Evans and HR Garner ldquoPRIMO a primer design program that applies basequality statistics for automated large-scale DNA sequencingrdquoGenomics vol 40 no 3 pp 476ndash485 1997

[9] V Proutski and E C Holmes ldquoPrimer Master a new programfor the design and analysis of PCR primersrdquo Computer Applica-tions in the Biosciences vol 12 no 3 pp 253ndash255 1996

Biotechnology Research International 9

[10] S Haas M Vingron A Poustka and S Wiemann ldquoPrimerdesign for large scale sequencingrdquo Nucleic Acids Research vol26 no 12 pp 3006ndash3012 1998

[11] S Rozen and H Skaletsky ldquoPrimer3 on the WWW for generalusers and for biologist programmersrdquo Methods in molecularbiology vol 132 pp 365ndash386 2000

[12] AGibbs J ArmstrongAMMackenzie andG FWeiller ldquoTheGPRIME package computer programs for identifying the bestregions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests and their use with plant virusesrdquo Journalof Virological Methods vol 74 no 1 pp 67ndash76 1998

[13] M D Gadberry S T Malcomber A N Doust and E AKellogg ldquoPrimaclade a flexible tool to find conserved PCRprimers across multiple speciesrdquo Bioinformatics vol 21 no 7pp 1263ndash1264 2005

[14] C E Lopez-Nieto and S K Nigam ldquoSelective amplification ofprotein-coding regions of large sets of genes using statisticallydesigned primer setsrdquo Nature Biotechnology vol 14 no 7 pp857ndash861 1996

[15] A Turchin and J F Lawler ldquoThe primer generator a programthat facilitates the selection of oligonucleotides for site-directedmutagenesisrdquo BioTechniques vol 26 no 4 pp 672ndash676 1999

[16] D Hyndman A Cooper S Pruzinsky D Coad and MMitsuhashi ldquoSoftware to determine optimal oligonucleotidesequences based on hybridization simulation datardquo BioTech-niques vol 20 no 6 pp 1090ndash1097 1996

[17] C Linhart and R Shamir ldquoDegenerate primer design theoret-ical analysis and the HYDEN programrdquo Methods in MolecularBiology vol 402 pp 221ndash244 2007

[18] C Linhart and R Shamir ldquoThe degenerate primer designproblemrdquo Bioinformatics vol 18 supplement 1 pp S172ndashS1802002

[19] X Wei D N Kuhn and G Narasimhan ldquoDegenerate primerdesign via clusteringrdquo IEEE Computer Society BioinformaticsConference vol 2 pp 75ndash83 2003

[20] T M Rose J G Henikoff and S Henikoff ldquoCODEHOP(COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCRprimer designrdquo Nucleic Acids Research vol 31 no 13 pp 3763ndash3766 2003

[21] T M Rose ldquoCODEHOP-mediated PCR a powerful techniquefor the identification and characterization of viral genomesrdquoVirology Journal vol 2 article 20 2005

[22] R Boyce P Chilana and T M Rose ldquoiCODEHOP anew interactive program for designing COnsensus-DEgenerateHybrid Oligonucleotide Primers from multiply aligned proteinsequencesrdquo Nucleic Acids Research vol 37 no 2 pp W222ndashW228 2009

[23] S Balla and S Rajasekaran ldquoAn efficient algorithm for min-imum degeneracy primer selectionrdquo IEEE Transactions onNanobioscience vol 6 no 1 pp 12ndash17 2007

[24] J D Thompson D G Higgins and T J Gibson ldquoCLUSTALW improving the sensitivity of progressive multiple sequencealignment through sequence weighting position-specific gappenalties and weight matrix choicerdquoNucleic Acids Research vol22 no 22 pp 4673ndash4680 1994

[25] S F AltschulW GishWMiller EWMyers and D J LipmanldquoBasic local alignment search toolrdquo Journal ofMolecular Biologyvol 215 no 3 pp 403ndash410 1990

[26] J SantaLucia ldquoA unified view of polymer dumbbell andoligonucleotide DNA nearest-neighbor thermodynamicsrdquo Pro-ceedings of the National Academy of Sciences of the United Statesof America vol 95 no 4 pp 1460ndash1465 1998

[27] A S Parodi D J Greenway H R Rugiero et al ldquoConcerningthe epidemic outbreak in Juninrdquo El Dıa medico vol 30 no 62pp 2300ndash2301 1958

[28] M F Bilen M G Pilloff M N Belaich et al ldquoFunctional andstructural characterisation of AgMNPV ie1rdquo Virus Genes vol35 no 3 pp 549ndash562 2007

[29] J V de Castro Oliveira J L CWolff A Garcia-Maruniak et alldquoGenome of the most widely used viral biopesticide Anticarsiagemmatalismultiple nucleopolyhedrovirusrdquo Journal of GeneralVirology vol 87 no 11 pp 3233ndash3250 2006

[30] S E Goni J A Iserte B I Stephan C S Borio P DGhiringhelli and M E Lozano ldquoMolecular analysis of thevirulence attenuation process in Junın virus vaccine genealogyrdquoVirus Genes vol 40 no 3 pp 320ndash328 2010

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Anatomy Research International

PeptidesInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation httpwwwhindawicom

International Journal of

Volume 2014

Zoology

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Molecular Biology International

GenomicsInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioinformaticsAdvances in

Marine BiologyJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Signal TransductionJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

BioMed Research International

Evolutionary BiologyInternational Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Biochemistry Research International

ArchaeaHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Genetics Research International

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Advances in

Virolog y

Hindawi Publishing Corporationhttpwwwhindawicom

Nucleic AcidsJournal of

Volume 2014

Stem CellsInternational

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Enzyme Research

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Microbiology

Page 8: Research Article Family-Specific Degenerate Primer Design ...tant analytical tools of molecular biology, allows a highly sensitivedetectionandspeci cgenotypingofenvironmental ... primer.Finally,toobtainascaledglobalscore(

8 Biotechnology Research International

2500

2000

1500

1000

500

0

Num

ber o

f prim

ers

5911954

826777

943386 983

134 99643

99911

1002

1 2 4 8 16 32 64Degeneration value ratio

Figure 5 Comparison of FAS-DPD designed primers and mini-mumdegenerated substrings Collection of primers with the highestscore designed for all the ORFs shared by all the genomes usedwere compared against the minimum degenerated subsequenceof the same length for each ORF in order to know how muchmore degenerated they are The number below each bar indicatesthe ratio of degeneration between the designed primer and theminimum degeneration substring The number above each barindicates the amount of primers that correspond with the ratiomentioned before The percentages are cumulative with respect toincreasing degeneration ratios and referred to the total number ofprimers used in the test

have only up to two more degenerated positions than thesubstring with minimum degeneration

It is important to note that in general there is notonly one minimum degeneracy substring for each ORF Thedecision of which primer is better must not only take intoaccount the degeneration value The position of degeneratedbases in the sequence is crucialThe ratio of greater increase ofdegeneration foundwas 64 this corresponds to only less than01 of primersThis result shows that FAS-DPD primers aremore degenerated than the less degenerated substring butthis increase of degeneration is slight and does not imply ahigh compromise of the specificity

5 Discussion

In this work we presented a new algorithm implemented inthe FAS-DPD software as an alternative strategy to solvingDPD problems FAS-DPD was designed to use multiplealignments of proteins or nucleic acids as input data andconstructs a consensus degenerate sequence from that whichis then used to design the putative primers

The experimental background knowledge from molecu-lar biology teaches us that in the real world the 31015840 ends ofprimers are key determinants of a successful amplificationFAS-DPD takes into account this property and incorporatesspecial considerations in the global score calculation becom-ing more strict for the 31015840 end than for the 51015840 end

The specificity of the set of primers designed with FAS-DPD was computationally tested with several collectionsof whole genomes ranging from 104 bp to 106 bp The

restriction to higher lengths was due to the lack of wholegenome collections for genus of bigger sizes with severalindividuals In all genome collections assayed the resultsshowed the same behavior there is a relationship betweenthe score value and the number of unspecific perfectmatchesThis analysis allows us to suggest a cut-off score (085) forprimers that could be more successful

PCRs were successfully performed on arenaviral and baculoviral models. For arenavirus, the designed GPC or N primers were used with the universal Arena primer [30]. For Baculovirus, the designed p74 primer was used with a specific p74 primer [28]. Each reaction was tested under different conditions in order to optimize its yield.

FAS-DPD software is licensed under the GNU General Public License, Version 3, and is available at http://www.github.com/javieriserte/fas-dpd.

In general, the results suggest that FAS-DPD could be used to design generalized degenerate primers for detection of known or unknown members of gene families or organism families, including different types of pathogens. Also, this tool would allow a more efficient search for enzymes and other proteins with commercial or biotechnological importance, making for a faster and cheaper research process.

References

[1] K. Nelson, Metagenomics as a Tool to Study Biodiversity, ASM Press, Washington, DC, USA, 2008.

[2] J. Welsh and M. McClelland, “Fingerprinting genomes using PCR with arbitrary primers,” Nucleic Acids Research, vol. 18, no. 24, pp. 7213–7218, 1990.

[3] J. G. K. Williams, A. R. Kubelik, K. J. Livak, J. A. Rafalski, and S. V. Tingey, “DNA polymorphisms amplified by arbitrary primers are useful as genetic markers,” Nucleic Acids Research, vol. 18, no. 22, pp. 6531–6535, 1990.

[4] W. C. Nichols, S. E. Lyons, J. S. Harrison, R. L. Cody, and D. Ginsburg, “Severe von Willebrand disease due to a defect at the level of von Willebrand factor mRNA expression: detection by exonic PCR-restriction fragment length polymorphism analysis,” Proceedings of the National Academy of Sciences of the United States of America, vol. 88, no. 9, pp. 3857–3861, 1991.

[5] E. Desmarais, I. Lanneluc, and J. Lagnel, “Direct amplification of length polymorphisms (DALP), or how to get and characterize new genetic markers in many species,” Nucleic Acids Research, vol. 26, no. 6, pp. 1458–1465, 1998.

[6] W. Rychlik and R. E. Rhoads, “A computer program for choosing optimal oligonucleotides for filter hybridization, sequencing and in vitro amplification of DNA,” Nucleic Acids Research, vol. 17, no. 21, pp. 8543–8551, 1989.

[7] L. Hillier and P. Green, “OSP: a computer program for choosing PCR and DNA sequencing primers,” PCR Methods and Applications, vol. 1, no. 2, pp. 124–128, 1991.

[8] P. Li, K. C. Kupfer, C. J. Davies, D. Burbee, G. A. Evans, and H. R. Garner, “PRIMO: a primer design program that applies base quality statistics for automated large-scale DNA sequencing,” Genomics, vol. 40, no. 3, pp. 476–485, 1997.

[9] V. Proutski and E. C. Holmes, “Primer Master: a new program for the design and analysis of PCR primers,” Computer Applications in the Biosciences, vol. 12, no. 3, pp. 253–255, 1996.

[10] S. Haas, M. Vingron, A. Poustka, and S. Wiemann, “Primer design for large scale sequencing,” Nucleic Acids Research, vol. 26, no. 12, pp. 3006–3012, 1998.

[11] S. Rozen and H. Skaletsky, “Primer3 on the WWW for general users and for biologist programmers,” Methods in Molecular Biology, vol. 132, pp. 365–386, 2000.

[12] A. Gibbs, J. Armstrong, A. M. Mackenzie, and G. F. Weiller, “The GPRIME package: computer programs for identifying the best regions of aligned genes to target in nucleic acid hybridisation-based diagnostic tests, and their use with plant viruses,” Journal of Virological Methods, vol. 74, no. 1, pp. 67–76, 1998.

[13] M. D. Gadberry, S. T. Malcomber, A. N. Doust, and E. A. Kellogg, “Primaclade: a flexible tool to find conserved PCR primers across multiple species,” Bioinformatics, vol. 21, no. 7, pp. 1263–1264, 2005.

[14] C. E. Lopez-Nieto and S. K. Nigam, “Selective amplification of protein-coding regions of large sets of genes using statistically designed primer sets,” Nature Biotechnology, vol. 14, no. 7, pp. 857–861, 1996.

[15] A. Turchin and J. F. Lawler, “The primer generator: a program that facilitates the selection of oligonucleotides for site-directed mutagenesis,” BioTechniques, vol. 26, no. 4, pp. 672–676, 1999.

[16] D. Hyndman, A. Cooper, S. Pruzinsky, D. Coad, and M. Mitsuhashi, “Software to determine optimal oligonucleotide sequences based on hybridization simulation data,” BioTechniques, vol. 20, no. 6, pp. 1090–1097, 1996.

[17] C. Linhart and R. Shamir, “Degenerate primer design: theoretical analysis and the HYDEN program,” Methods in Molecular Biology, vol. 402, pp. 221–244, 2007.

[18] C. Linhart and R. Shamir, “The degenerate primer design problem,” Bioinformatics, vol. 18, supplement 1, pp. S172–S180, 2002.

[19] X. Wei, D. N. Kuhn, and G. Narasimhan, “Degenerate primer design via clustering,” IEEE Computer Society Bioinformatics Conference, vol. 2, pp. 75–83, 2003.

[20] T. M. Rose, J. G. Henikoff, and S. Henikoff, “CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design,” Nucleic Acids Research, vol. 31, no. 13, pp. 3763–3766, 2003.

[21] T. M. Rose, “CODEHOP-mediated PCR: a powerful technique for the identification and characterization of viral genomes,” Virology Journal, vol. 2, article 20, 2005.

[22] R. Boyce, P. Chilana, and T. M. Rose, “iCODEHOP: a new interactive program for designing COnsensus-DEgenerate Hybrid Oligonucleotide Primers from multiply aligned protein sequences,” Nucleic Acids Research, vol. 37, no. 2, pp. W222–W228, 2009.

[23] S. Balla and S. Rajasekaran, “An efficient algorithm for minimum degeneracy primer selection,” IEEE Transactions on Nanobioscience, vol. 6, no. 1, pp. 12–17, 2007.

[24] J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.

[25] S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of Molecular Biology, vol. 215, no. 3, pp. 403–410, 1990.

[26] J. SantaLucia, “A unified view of polymer, dumbbell, and oligonucleotide DNA nearest-neighbor thermodynamics,” Proceedings of the National Academy of Sciences of the United States of America, vol. 95, no. 4, pp. 1460–1465, 1998.

[27] A. S. Parodi, D. J. Greenway, H. R. Rugiero et al., “Concerning the epidemic outbreak in Junin,” El Día Médico, vol. 30, no. 62, pp. 2300–2301, 1958.

[28] M. F. Bilen, M. G. Pilloff, M. N. Belaich et al., “Functional and structural characterisation of AgMNPV ie1,” Virus Genes, vol. 35, no. 3, pp. 549–562, 2007.

[29] J. V. de Castro Oliveira, J. L. C. Wolff, A. Garcia-Maruniak et al., “Genome of the most widely used viral biopesticide: Anticarsia gemmatalis multiple nucleopolyhedrovirus,” Journal of General Virology, vol. 87, no. 11, pp. 3233–3250, 2006.

[30] S. E. Goñi, J. A. Iserte, B. I. Stephan, C. S. Borio, P. D. Ghiringhelli, and M. E. Lozano, “Molecular analysis of the virulence attenuation process in Junín virus vaccine genealogy,” Virus Genes, vol. 40, no. 3, pp. 320–328, 2010.
