CHAPTER 2. REVIEW OF LITERATURE -...


Page 1: CHAPTER 2. REVIEW OF LITERATURE - Shodhgangashodhganga.inflibnet.ac.in/bitstream/10603/34383/10/10_chapter 2.pdfCHAPTER 2. REVIEW OF LITERATURE 2.1. Tuberculosis Tuberculosis (TB),

CHAPTER 2. REVIEW OF LITERATURE

2.1. Tuberculosis

Tuberculosis (TB), described as ‘the Captain of all these men of death’ because it has claimed lives throughout the ages (Daniel, 2006), is a widespread infectious disease. It is caused by various mycobacterial strains, most commonly Mycobacterium tuberculosis (MTB), an obligate human pathogen. The disease can be traced in humans back to about 6000 B.C. (Wirth et al., 2008). In the past it was also called phthisis or phthisis pulmonalis (Breathnach & Moynihan, 2004). On 24 March 1882, the German scientist Robert Koch identified MTB as the causative pathogen of TB. This discovery paved the way for vast advances in TB research. For diagnosing TB, Clemens von Pirquet developed the tuberculin skin test in 1907 (Daniel, 2006). Albert Calmette and Camille Guérin developed the BCG (Bacille Calmette-Guérin) vaccine from an attenuated bovine tuberculosis strain in 1921 (Calmette, 1931). BCG is now the world’s most widely used vaccine. A revolution in TB therapy began with the discoveries of para-aminosalicylic acid (PAS) and streptomycin in two successive years, 1943 and 1944, respectively. The first oral anti-TB drug, isoniazid, was developed in 1952, followed by rifampicin in 1963 (Girling et al., 1976; Daniel, 2006). In the early 1970s, short-course chemotherapy regimens were developed and proved highly effective in TB therapy (Dawson & Bateman, 2009; Zumla et al., 2009). In the early 1980s, the TB treatment regimen was worked out. In 1993, the World Health Organization (WHO) declared tuberculosis a “global health emergency”. The “Stop TB Partnership” (http://www.stoptb.org) was established in 2001 with the aim of eliminating tuberculosis as a public health problem. WHO outlined a global plan to reduce the global TB burden by 2015, with a target of eliminating TB as a “public health problem” by 2050 (The Stop TB Strategy, WHO, http://www.who.int/).

2.2. Epidemiology

2.2.1. World

Tuberculosis is a major public health problem worldwide. Globally, an estimated 8.6 million new TB cases and 1.3 million TB deaths occurred in 2012, of which 0.32 million deaths were among HIV-positive people. Of the total cases, an estimated 0.5 million occurred in children and 2.9 million in women. There were an estimated 0.45 million multidrug-resistant TB (MDR-TB) cases and 0.17 million deaths from MDR-TB. The majority of cases worldwide were in the South-East Asia (29%), African (27%) and Western Pacific (19%) regions. India and China alone accounted for 26% and 12% of total cases, respectively (WHO TB Report, 2013). The largest numbers of incident cases were in five countries: India (2.0-2.4 million), China (0.9-1.1 million), South Africa (0.4-0.6 million), Indonesia (0.4-0.5 million) and Pakistan (0.3-0.5 million). Figure 1 shows the estimated number of TB cases in 2012.

Figure 1: Estimated number of TB cases in the year 2012

(WHO TB Report, 2013)

2.2.2. India

India has more new TB cases each year than any other country. Of the estimated 8.6 million incident TB cases globally in 2012, 2.0-2.4 million occurred in India, i.e. 26% of the global total. TB caused 0.27 million deaths in India in 2012. India also accounted for 31% of the estimated 2.9 million missed TB cases (WHO TB Report, 2013). According to WHO, 2-3% of Indian TB patients have multidrug-resistant TB.
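The relationship between the quoted share and the quoted case range can be checked with simple arithmetic. The short Python sketch below uses only the figures cited above from the WHO TB Report (2013); the variable and function names are illustrative and not part of any WHO tool.

```python
# Figures quoted in the text (WHO TB Report, 2013); all counts are in millions.
GLOBAL_INCIDENT_CASES = 8.6   # estimated new TB cases worldwide in 2012
INDIA_SHARE = 0.26            # India's reported share of global cases
INDIA_RANGE = (2.0, 2.4)      # reported range of incident cases in India


def share_to_cases(total_millions: float, share: float) -> float:
    """Convert a share of the global total into a case count (millions)."""
    return total_millions * share


india_cases = share_to_cases(GLOBAL_INCIDENT_CASES, INDIA_SHARE)
within_range = INDIA_RANGE[0] <= india_cases <= INDIA_RANGE[1]
print(f"India: about {india_cases:.2f} million cases; within reported range: {within_range}")
```

The point estimate implied by the 26% share (about 2.2 million) falls inside the reported 2.0-2.4 million interval, so the two figures are mutually consistent.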

2.3. Transmission of TB

TB usually affects the lungs, but it can also affect other parts of the body. It spreads through the air when a person with active pulmonary TB sneezes, coughs, speaks or otherwise expels respiratory fluids. These fluids are released as droplets 0.5 to 5.0 µm in diameter. A single sneeze can release up to 40,000 droplets (Cole & Cook, 1998), and each droplet is capable of transmitting the disease, because the infectious dose of TB is very low (fewer than 10 bacteria may cause infection) (Nicas et al., 2005). People with prolonged, close or frequent contact with a TB-infected person are more prone to infection, with an estimated infection rate of 22% (Ahmed & Hasnain, 2011). A person with active, untreated TB may infect other people. Several factors influence the likelihood of transmission, such as the quantity of infectious droplets expelled by the infected person, the effectiveness of ventilation, the duration of exposure, the virulence of the MTB strain and the immune status of the exposed individual. A newly infected person typically takes 3 to 4 weeks to become infectious enough to transmit the disease to others.

2.4. Pathogenesis of TB

About 90% of people infected with MTB have no symptoms of TB, a condition called latent TB, and only about 10% of them go on to develop active TB disease (Mainous & Pomeroy, 2009). TB infection begins when MTB reaches the pulmonary alveoli, where the mycobacteria invade and replicate within the endosomes of alveolar macrophages (Houben et al., 2006). The lungs may also be infected via the bloodstream. Through this haematogenous spread, TB can also reach the brain, kidneys and bones (Herrmann & Lagrange, 2005). How TB affects other body parts is still not fully understood; however, it rarely affects the thyroid, heart, pancreas and skeletal muscles (Agarwal et al., 2005). If mycobacteria enter the bloodstream from damaged tissue, they can spread to different parts of the body and establish foci of infection that appear as tiny white tubercles (Crowley, 2010). This severe form of disease, called miliary TB, is most common in children and in people with HIV (Harries, 2005) and has a high fatality rate of about 30% (Jacob et al., 2009).

2.5. Diagnosis and treatment of TB

A definitive diagnosis of TB is made when MTB is identified in a clinical specimen such as a tissue biopsy, pus or sputum. However, because culture of this slow-growing organism takes a long time, treatment is often started before culture results are available. TB treatment relies on antibiotics to kill the bacteria. The mycobacterial cell wall is unusual in both structure and chemical composition and hinders drug entry, which makes effective TB treatment difficult (Brennan & Nikaido, 1995). It is very difficult to diagnose active TB from signs and symptoms alone (Bento et al., 2011). However, patients with lung disease or a persistent cough lasting more than two weeks may be considered for TB diagnosis (Escalante, 2009). Multiple sputum cultures and a chest X-ray are usually part of the initial evaluation (Escalante, 2009). In the developing world, “tuberculin skin tests” and “interferon-γ release assays” are rarely used (Metcalfe et al., 2011; Sester et al., 2011).

Adenosine deaminase testing and nucleic acid amplification tests (NAATs) have been introduced for the rapid diagnosis of tuberculosis (Bento et al., 2011; Ling et al., 2008). Blood tests that detect antibodies are neither specific nor sensitive and are therefore not recommended (Steingart et al., 2011). For people at high risk of TB, the Mantoux tuberculin skin test is frequently used for diagnosis (Escalante, 2009). For those who test positive on the Mantoux test, interferon-gamma release assays (IGRAs), which are more sensitive, are recommended (Amicosante et al., 2010).

Drug-sensitive tuberculosis is normally treated under DOTS (Directly Observed Treatment, Short course). The regimen consists of four drugs, isoniazid, rifampicin, ethambutol and pyrazinamide, for the first two months, followed by isoniazid and rifampicin for the next four months. For MDR-TB (multidrug-resistant TB), second-line drugs such as fluoroquinolones, kanamycin and amikacin are used. The treatments currently used for MDR-TB and XDR-TB (extensively drug-resistant TB) are long, toxic and expensive, with limited efficacy (Zumla et al., 2012). More than a dozen new anti-TB drugs are in clinical trials or preclinical development (Cole & Riccardi, 2011), and research into new, effective treatments for tuberculosis continues.
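The two-phase structure of the regimen for drug-sensitive TB described above can be written down as a small data structure. The Python sketch below only restates the durations and drugs given in the text; the phase labels "intensive" and "continuation" and all class and function names are illustrative assumptions, not taken from the cited sources.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Phase:
    name: str                # illustrative label, not from the source text
    months: int              # duration of the phase in months
    drugs: Tuple[str, ...]   # drugs taken throughout the phase


# Regimen for drug-sensitive TB as described in the text.
STANDARD_REGIMEN = (
    Phase("intensive", 2, ("isoniazid", "rifampicin", "ethambutol", "pyrazinamide")),
    Phase("continuation", 4, ("isoniazid", "rifampicin")),
)


def total_months(regimen: Tuple[Phase, ...]) -> int:
    """Total treatment duration in months."""
    return sum(phase.months for phase in regimen)


def drugs_in_month(regimen: Tuple[Phase, ...], month: int) -> Tuple[str, ...]:
    """Return the drugs taken in a given month (1-based) of treatment."""
    elapsed = 0
    for phase in regimen:
        elapsed += phase.months
        if 1 <= month <= elapsed:
            return phase.drugs
    raise ValueError("month lies outside the regimen")


print(total_months(STANDARD_REGIMEN))
print(drugs_in_month(STANDARD_REGIMEN, 3))
```

Under these assumptions the regimen lasts six months in total, and from the third month onwards only isoniazid and rifampicin remain.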

2.6. Mycobacterium tuberculosis (MTB) - the causative pathogen of TB

Mycobacterium tuberculosis is a slow-growing, aerobic, acid-fast, rod-shaped bacterium that causes TB in humans. Other pathogens of the genus Mycobacterium cause various human and animal diseases: Mycobacterium ulcerans causes Buruli ulcer, Mycobacterium leprae causes leprosy, and Mycobacterium avium is frequently reported in association with HIV infection (Horsburgh, 2001). MTB takes approximately 24 hours to divide and nearly 3-4 weeks to form colonies in vitro. It has a complex life cycle in humans that includes a latent, or dormant, phase with reduced metabolism within the host cell. The most characteristic feature of the genus Mycobacterium is its exceptionally flexible cell envelope, which contains a rich diversity of lipids, such as glycolipids and mycolic acids, and polysaccharides. This distinct cell wall is responsible for the acid-fast staining of mycobacteria (Uplekar, 2012).

The hierarchical classification of Mycobacterium tuberculosis (Cavalier-Smith, 2004) is as follows:

Kingdom  : Bacteria
Phylum   : Actinobacteria
Class    : Schizomycetes
Subclass : Actinobacteridae
Order    : Actinomycetales
Suborder : Corynebacterineae
Family   : Mycobacteriaceae
Genus    : Mycobacterium
Species  : tuberculosis

MTB is a genetically diverse organism with varying phenotypes. Different MTB strains are associated with different geographical areas. TB outbreaks are caused by hypervirulent MTB strains. These strains carry deletion mutations in cell wall-modifying enzymes or in regulators that respond to environmental stimuli. Owing to these mutations, MTB can survive in granulomas for long periods, causing persistent infection and making the organism extremely pathogenic (Casali, 2009).

2.7. Factors responsible for pathogenesis and virulence in MTB

Many factors are involved in the pathogenesis and virulence of Mycobacterium tuberculosis. Knowledge of the virulence factors of MTB may lead to a better understanding of host-pathogen interaction (Camacho et al., 1999) and support the development of novel vaccines and drugs. Researchers around the world have carried out many studies to identify these factors.

Disruption of the erp gene in MTB and M. bovis has been shown to decrease the ability of the bacteria to multiply within the host, revealing the important contribution of erp to MTB virulence (Berthet et al., 1998). Secreted antigen 85-A (FbpA), encoded by Rv3804c, was reported to be an important virulence factor of MTB, as this protein has mycolyl transferase activity and contributes to cell wall synthesis (Armitige et al., 2000). Glickman et al. (2000) reported that pcaA is a mycobacterial gene required for cording and for synthesis of the cyclopropane ring of mycolic acids in the cell wall of both MTB and BCG. pcaA was also reported to be a pro-inflammatory activator of macrophages during early infection (Rao et al., 2005).

Dubnau et al. (2000) inactivated the hma (cmaA, mma4) gene to construct a mutant MTB strain. The mutant was unable to synthesize oxygenated mycolic acids, showed altered envelope permeability and was attenuated in mice. This result revealed the importance of oxygenated mycolic acids for MTB virulence in mice.

Phospholipases C are known to play a significant role in the pathogenesis of numerous bacteria, and Raynaud et al. (2002) demonstrated their involvement in MTB virulence. Sirakova et al. (2003) demonstrated that Rv2946c (pks1) and Rv2947c (pks15), which are required for a polyketide synthase involved in the biosynthesis of phthiocerol, are virulence factors of MTB. The largest open reading frame of MTB (pks12 / Rv2048c), required for dimycocerosyl phthiocerol synthesis, was also reported to be involved in MTB pathogenesis (Sirakova et al., 2003).

MmpL8 (Rv3823c), an integral membrane transport protein, was reported to be necessary for “sulfolipid-1 biosynthesis” and MTB virulence (Converse et al., 2003). Domenech et al. revealed the involvement of MmpL family proteins in MTB virulence and drug resistance (Domenech et al., 2004; Domenech et al., 2005). Sander et al. (2004) found lipoprotein metabolism to be a major factor in MTB virulence and pathogenesis. The eukaryotic-like and prokaryotic-like isoforms of the glyoxylate cycle enzyme isocitrate lyase (ICL) were shown to be important for fatty acid catabolism and MTB virulence (Muñoz-Elías and McKinney, 2005). The mymA operon (Rv3083 to Rv3089) of MTB was reported to be important for the pathogenesis of MTB (Singh et al., 2005).

The membrane-bound metalloprotease encoded by Rv2869c was reported to be an important enzyme for regulating cell envelope composition and in vivo growth (Makinoshima and Glickman, 2005). OtsB2 (Rv3372), a trehalose 6-phosphate phosphatase, was shown to be an essential protein in the OtsAB pathway required for trehalose biosynthesis in MTB (Murphy et al., 2005). CFP-10 and ESAT-6, two proteins encoded by the ESX-1 locus, are required for full virulence in MTB (Fortune et al., 2005). The high-affinity phosphate-binding proteins encoded by the pstS1 and pstS2 genes of MTB were demonstrated to be essential for in vivo virulence (Peirs et al., 2005). Deletion of the kasB (Rv2246) gene, which encodes the enzyme 3-oxoacyl-ACP synthase, resulted in loss of acid-fastness and subclinical latent TB in immunocompetent mice (Bhatt et al., 2007). Brzostek et al. (2007) demonstrated that cholesterol oxidase (ChoD), a cholesterol-modifying enzyme, is an important factor for MTB virulence. Lun and Bishai (2007) showed that the cell wall-associated carboxylesterase encoded by the Rv2223c gene is essential for full virulence of MTB.

Gioffré et al. (2005) generated knock-out mutants in the mce1, mce2 and mce3 operons of MTB, found that the mutants had a decreased ability to multiply within the host, and concluded that the mce operons are virulence factors of MTB. Two other studies have also shown the importance of mce operons, namely mce2 (Marjanovic et al., 2010) and mce3 and mce4 (Senaratne et al., 2008), in MTB virulence.

During infection, bacterial proteins are transported through the MTB protein secretion system ESX-1, a significant factor in MTB virulence. Raghavan et al. (2008) revealed that EspR (Rv3849), a key ESX-1 regulator, is required for MTB virulence. The ATP-binding cassette transporter LpqY-SugA-SugB-SugC of MTB was reported to be essential for virulence (Kalscheuer et al., 2010). CtpV, a putative copper exporter, was also shown to be a virulence factor of MTB (Ward et al., 2010). A novel heat shock protein (Hsp22.5) encoded by Rv0990c was shown to be involved in MTB pathogenesis (Abomoelak et al., 2011). The region of difference 2 (RD2) was shown to contribute to MTB virulence (Kozak et al., 2010). The acg gene of MTB was shown to be vital for growth and virulence in vivo (Hu & Coates, 2011).

Besides the factors mentioned above, the following were also reported to be important for the pathogenesis and virulence of Mycobacterium tuberculosis: the two-component system senX3-regX3 (Parish et al., 2003), superoxide dismutase secreted via SecA2 (Braunstein et al., 2003), the extracytoplasmic sigma factor SigE (Manganelli et al., 2004), the AraC family transcriptional regulator Rv1931c (Frota et al., 2004), the catalase-peroxidase KatG (Li et al., 1998; Ng et al., 2004), the extracytoplasmic-function sigma factor SigL (Hahn et al., 2005), the sigma factor SigD (Calamita et al., 2005), the stress-responsive chaperone alpha-crystallin 2 (Stewart et al., 2005), the PhoP protein (Pérez et al., 2001; Martin et al., 2006), the PhoPR two-component system (Walters et al., 2006), the nuoG (Rv3151) gene (Velmurugan et al., 2007), the transcriptional regulator of hypoxia mosR (Abomoelak et al., 2009), the transcriptional regulator Rv0485, which modulates pe and ppe gene expression (Goldstone et al., 2009), the putative matrix metalloprotease Rv0198c (Muttucumaru et al., 2011), the ESX-1 genes espF and espG1 (Bottai et al., 2011) and PE_PGRS30 (Iantomasi et al., 2012).

2.8. Genome sequencing of MTB

The complete genome sequences of different MTB strains (Cole et al., 1998; Fleischmann et al., 2002) have provided valuable insights into its biology. The availability of genome and proteome information for MTB, combined with high-throughput technologies, may open new avenues for the development of novel diagnostic techniques and better vaccines and drugs against TB (Ahmed and Hasnain, 2004). With the declining cost of genome sequencing (Ng & Kirkness, 2010) and advances in molecular biology and functional genomics, whole genome sequences of different MTB strains have been released into the public domain. As of June 2013, complete genome sequences of several clinical and laboratory strains of MTB are available at the “National Center for Biotechnology Information (NCBI)” (Table 1).

Table 1: List of completely sequenced genomes of MTB complex

Organism                                            Size (Mb)  GC%   Genes  Proteins
Mycobacterium tuberculosis H37Rv                    4.41       65.6  4062   4003
Mycobacterium tuberculosis 7199-99                  4.42       65.6  4042   3994
Mycobacterium tuberculosis CAS/NITR204              4.39       65.6  4007   3959
Mycobacterium tuberculosis CCDC5079                 4.4        65.6  3695   3646
Mycobacterium tuberculosis CCDC5079                 4.41       65.6  4204   4156
Mycobacterium tuberculosis CCDC5180                 4.41       65.6  3638   3590
Mycobacterium tuberculosis CDC1551                  4.4        65.6  4293   4189
Mycobacterium tuberculosis CTRI-2                   4.4        65.6  4001   3944
Mycobacterium tuberculosis EAI5                     4.39       65.6  4026   3902
Mycobacterium tuberculosis EAI5 / NITR206           4.39       65.6  4067   4019
Mycobacterium tuberculosis F11                      4.42       65.6  3998   3941
Mycobacterium tuberculosis H37Ra                    4.42       65.6  4084   4034
Mycobacterium tuberculosis KZN 1435                 4.4        65.6  4107   4059
Mycobacterium tuberculosis KZN 4207                 4.39       65.6  4044   3996
Mycobacterium tuberculosis KZN 605                  4.4        65.6  4071   4001
Mycobacterium tuberculosis RGTB327                  4.38       65.6  3739   3691
Mycobacterium tuberculosis RGTB423                  4.41       65.6  3670   3622
Mycobacterium tuberculosis UT205                    4.42       64.9  3812   3794
Mycobacterium tuberculosis str. Beijing / NITR203   4.41       65.6  4158   4110
Mycobacterium tuberculosis str. Erdman = ATCC 35801 4.39       65.6  4301   4245
Mycobacterium tuberculosis str. Haarlem             4.41       65.6  4100   4036
Mycobacterium tuberculosis str. Haarlem / NITR202   4.4        65.6  3729   3680
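The GC% column of Table 1 is the percentage of guanine and cytosine bases in a genome sequence. As a minimal sketch of how such a value is computed, the Python function below counts G and C among the unambiguous bases of a sequence. The function name and the ten-base toy fragment are illustrative assumptions; they are not taken from NCBI tools or from any real MTB genome.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring ambiguous symbols."""
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(base in "GC" for base in bases)
    return 100.0 * gc_count / len(bases)


# Hypothetical toy fragment, not a real MTB sequence.
toy_fragment = "GGCCATGCGC"
print(f"GC content: {gc_content(toy_fragment):.1f}%")
```

Applied to a complete genome sequence read from a FASTA file, the same calculation would give the kind of value listed in the GC% column.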

2.9. Comparative genomics of MTB

Comparative genomics is a field of life science that deals with the comparison of the genomic features of different organisms (Touchman, 2010; Xia, 2013). Genomic features include nucleotide sequences, regulatory sequences, genes and gene order (Xia, 2013). The main goal of comparative genomics is to compare whole genomes, or large parts of them, obtained from genome sequencing projects in order to understand the biological similarities and differences between organisms as well as their evolutionary relationships (Touchman, 2010; Russel et al., 2011; Primrose, 2009). The most important principle of this branch of genomics is that features common to two organisms are encoded by conserved DNA sequences (Hardison, 2003).

Advances in genomics and associated technologies are generating vast data sets, which provide new opportunities for understanding and combating both genetic and infectious diseases in humans (Cole, 2002). Comparative genomic analysis of different mycobacterial strains also helps to identify the genetic basis of varying phenotypes, which may in turn give new insights for the development of novel drugs and vaccines (Brosch et al., 2000). Comparative genomics is a powerful tool for studying microbial evolution and for identifying genes that might encode novel drug targets (Cole, 2002). Such comparisons revealed that all members of the MTB complex share 99.9% identity at the DNA sequence level and have identical 16S rRNA sequences (Brosch et al., 2002; Fleischmann et al., 2002).
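An identity figure such as 99.9% is the percentage of aligned positions at which two sequences carry the same base. The Python sketch below shows one common way to compute it for two sequences that have already been aligned to equal length, skipping columns that contain a gap. The function name, the gap convention and the toy sequences are assumptions for illustration; the cited studies do not specify this exact procedure.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of ungapped alignment columns in which both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment with one mismatch in ten columns.
toy_a = "ATGGCCGATT"
toy_b = "ATGGCTGATT"
print(percent_identity(toy_a, toy_b))  # 90.0
```

Whether gap columns are excluded or counted as mismatches changes the result slightly, so the convention used should always be stated alongside a reported identity value.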

Comparative genomics revealed two tandem duplications, of 29 and 36 kb, in the chromosome of the Mycobacterium bovis BCG Pasteur strain (Brosch et al., 2000). Whole-genome comparison among different strains of the MTB complex revealed the roles of mutation (insertion, deletion and substitution), gene duplication and selection in the evolution of MTB strains. Once genome sequencing was completed for MTB H37Rv (Cole et al., 1998), MTB CDC1551 (Fleischmann et al., 2002) and Mycobacterium bovis AF2122/97, the causative agent of bovine tuberculosis (Garnier, 2003), their whole genome sequences became available in the public domain.

The CDC1551 strain, known to have caused a TB outbreak in the United States in the 1990s (Valway et al., 1998), was observed to be less virulent than MTB H37Rv (Manca et al., 2001). Genomic comparison of MTB H37Rv and MTB CDC1551 revealed 86 InDels and 1075 single nucleotide polymorphisms (SNPs), of which 579 were nonsynonymous, highlighting the association of genotypic changes with phenotypic variation (Fleischmann et al., 2002).
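A SNP in a coding sequence is nonsynonymous when it changes the encoded amino acid and synonymous when it does not. The Python sketch below illustrates this distinction for two aligned, in-frame coding sequences of equal length. It assumes the standard codon-to-amino-acid assignments (which the bacterial translation table shares), builds the codon table compactly, and classifies a codon with several differences as a single event type; all names and the toy sequences are illustrative and do not reproduce the method of Fleischmann et al. (2002).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snps(ref_cds: str, alt_cds: str) -> dict:
    """Count base differences in codons that keep or change the amino acid."""
    ref_cds, alt_cds = ref_cds.upper(), alt_cds.upper()
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")
    counts = {"synonymous": 0, "nonsynonymous": 0}
    for start in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[start:start + 3]
        alt_codon = alt_cds[start:start + 3]
        if ref_codon == alt_codon:
            continue
        differences = sum(r != a for r, a in zip(ref_codon, alt_codon))
        same_amino_acid = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
        # Simplification: all differences within one codon share one label.
        counts["synonymous" if same_amino_acid else "nonsynonymous"] += differences
    return counts


# Hypothetical toy sequences: GCT->GCC keeps Ala, GAA->GAT changes Glu to Asp.
print(classify_snps("ATGGCTGAA", "ATGGCCGAT"))
```

For the toy input, one difference is synonymous and one is nonsynonymous; applied genome-wide, this kind of tally yields figures of the form "579 of 1075 SNPs were nonsynonymous".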

The Mycobacterium bovis genome sequence was found to be 99.95% identical to the genomes of MTB CDC1551 and MTB H37Rv, but the genome is slightly smaller. Comparison of 2504 coding sequences (CDS) among these three genomes revealed 1600 M. bovis CDS identical to their counterparts in MTB H37Rv and MTB CDC1551. There were 2400 SNPs identified between the two MTB strains and M. bovis (Fleischmann et al., 2002). The genome of Mycobacterium leprae (M. leprae), the causative agent of leprosy, has undergone enormous gene loss, leaving only 1604 functional protein-coding genes (Cole et al., 2001). Of the 1439 genes common to MTB and M. leprae, a set of 219 genes was found to be unique to mycobacteria by in silico comparative analysis (Marmiesse et al., 2004). Through whole-genome comparison, Arnold et al. (2006) revealed short sequence repeats in MTB that are used in genotyping schemes. Comparative genomics also provides an efficient way to identify the genetic basis of variation in phenotype, pathogenicity and host range among different mycobacterial species and strains. Recent advances in comparative and functional genomics have also improved our understanding of genetic diversity within the MTB complex. Diaz et al. (2006) explored and identified genetic variability among different MTB strains using DNA microarray technology.

Genome comparison of M. bovis BCG Pasteur 1173P2 (BCG Pasteur) with MTB H37Rv, MTB CDC1551 and M. bovis AF2122/97 identified large sequence polymorphisms (LSPs) that led to the loss of 133 genes in BCG Pasteur (Behr et al., 1999; Brosch et al., 2007).

Most comparative genomics studies have been carried out on MTB H37Rv, MTB H37Ra, MTB Erdman, CDC1551 and Mycobacterium bovis BCG (Uplekar, 2012). These studies revealed the genomic diversity among different MTB strains. In particular, identifying genes that differ between virulent and avirulent or attenuated MTB strains may give insights into the molecular mechanisms of pathogenicity and point to new directions for the development of therapies against TB.

2.10. Comparative proteomics of MTB

Proteomics is the study of different features of proteins, particularly their location, structure and function (Anderson & Anderson, 1998; Blackstock & Weir, 1999). Because proteins are usually highly conserved, amino acid substitutions can be functionally significant and subject to selection. Identifying proteins and comparing similar proteins among different strains of the same organism may reveal the variations in virulence mechanisms that lead to different forms of disease caused by that organism (Uplekar, 2012). Comparative proteomics deals with the comparison of the proteomic features of different organisms, which can reveal the roles and associations of proteins in different biological systems. With the completion and availability of the genome sequences of different MTB strains, extensive information about the proteomes of those strains has also become available. This makes it possible to compare not only their genomes but also their proteomes. Over the last decades, researchers worldwide have carried out comparative proteome analyses among different MTB strains.
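At its simplest, an inter-strain proteome comparison asks which proteins are shared by all strains and which are found in only one. The Python sketch below expresses this with set operations over protein identifiers. The strain labels and identifiers are hypothetical placeholders, and real analyses compare sequences by similarity search rather than by exact identifier matching, so this is only an illustration of the idea.

```python
from typing import Dict, Set, Tuple


def compare_proteomes(proteomes: Dict[str, Set[str]]) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return the proteins shared by all strains and those unique to each strain."""
    if not proteomes:
        raise ValueError("at least one proteome is required")
    core = set.intersection(*proteomes.values())
    unique = {}
    for strain, proteins in proteomes.items():
        others = [p for other, p in proteomes.items() if other != strain]
        unique[strain] = proteins - set().union(*others)
    return core, unique


# Hypothetical identifiers; not real MTB locus tags.
toy_proteomes = {
    "strain_A": {"p1", "p2", "p3"},
    "strain_B": {"p2", "p3", "p4"},
    "strain_C": {"p3", "p5"},
}
core, unique = compare_proteomes(toy_proteomes)
print(sorted(core))
print({strain: sorted(proteins) for strain, proteins in unique.items()})
```

In the toy data only p3 is common to all three strains, while p1, p4 and p5 are each specific to a single strain; strain-specific proteins of this kind are the candidates that comparative studies examine for links to virulence.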

In silico analysis of the MTB proteome identified two novel protein families, PE and PPE (Tekaia et al., 1999). By comparing the proteomes of two non-virulent M. bovis BCG strains (Chicago and Copenhagen) with those of two virulent MTB strains (Erdman and H37Rv), Jungblut et al. (1999) identified distinct proteins by mass spectrometry. Proteomic comparison of culture supernatants from MTB H37Rv and an M. bovis BCG strain identified 27 proteins specific to MTB (Mattow et al., 2003). Miallau et al. (2013) identified “RelBE-like toxin-antitoxin complexes” associated with lethality of MTB.

2.11. MTB databases

Owing to advances in bioinformatics and life science research, several databases on MTB have been developed and made publicly available in the last few years.

The Mycobacterial Genome Divergence Database (MGDD), available at http://mirna.jnu.ac.in/mgdd/, is an online database for accessing different types of genomic variation (SNPs, indels, tandem repeats and divergent regions) among six strains of the MTB complex: MTB H37Rv, MTB H37Ra, MTB CDC1551, MTB F11, Mycobacterium bovis AF2122/97 and Mycobacterium bovis BCG (Vishnoi et al., 2008).

The TB drug resistance mutation database, available at http://www.tbdreamdb.com/, provides comprehensive information on the genetic polymorphisms associated with first- and second-line drug resistance in clinical MTB isolates from around the world.

The Mycobacterium Database (MyBASE), available at http://mybase.psych.ac.cn/, provides integrated information on Mycobacterium tuberculosis (MTB) and Mycobacterium leprae (M. leprae). It focuses mainly on genome polymorphisms and predicted operons, together with annotation of essential and virulence genes and their roles in virulence and pathogenesis (Zhu et al., 2009).

The Tuberculosis Database (TBDB), available at http://www.tbdb.org/, is an online
database providing information on various aspects of TB, such as genome sequences,
assemblies and expression data obtained from pre- and post-publication data, along with
curated literature for various MTB strains and more than 20 strains related to MTB.
Expression data mainly comprise more than three thousand MTB microarrays, 95 real-time PCR
datasets, 2.7 thousand microarrays for mouse and human TB-related research, and 260
microarrays for Streptomyces coelicolor (Reddy et al., 2009; Galagan et al., 2010).

The TubercuList database, available at http://tuberculist.epfl.ch/, is a knowledge base
of MTB that integrates extensive information on MTB genome details, proteins, mutant and
operon annotation, bibliography, drugs and transcriptome data (Lew et al., 2011).

MTBreg, a database of conditionally regulated proteins in MTB available at
http://www.doe-mbi.ucla.edu/Services/MTBreg/, integrates information on proteins that are
up- and down-regulated in MTB when the pathogen is grown under conditions mimicking
infection.

The Mycobacterium tuberculosis Structural Database (MtbSD), available at
http://bmi.icmr.org.in/mtbsd/MtbSD.php, hosts information on 857 MTB protein structures,
comprising the description, domains, reaction catalyzed, structural homologues, active site
and other details for each protein (Hassan et al., 2011).

The Mycobacterium tuberculosis Proteome Comparison Database (MTB-PCDB), available at
http://www.bicjbtdrc-mgims.in/MTB-PCDB/, hosts data from 40252 protein sequence
comparisons obtained through inter-strain proteome comparison of five strains of MTB
(H37Rv, H37Ra, CDC 1551, F11 and KZN 1435) (Jena et al., 2011). The MycoProtease-DB
database, available at http://www.bicjbtdrc-mgims.in/MycoProtease-DB/, holds information on
1324 proteases from 8 strains of the Mycobacterium tuberculosis (MTB) complex and 4


Nontuberculous Mycobacteria (NTM) strains whose complete genome sequences are available
(Jena et al., 2012).

The Mycobacterium tuberculosis genome variation resource (tbvar), available at
http://genome.igib.res.in/tbvar/index.html, comprises more than 29000 single nucleotide
variations obtained from more than 450 isolates of the MTB complex (Joshi et al., 2014).

2.12. MTB H37Rv and MTB H37Ra

MTB H37Rv is the virulent counterpart of its avirulent sister strain H37Ra. In 1935,
William Steenken derived both strains from their parent strain H37 (Steenken & Gardner,
1946). MTB H37Ra has several characteristics that distinguish it from MTB H37Rv. These
include a “raised colony morphology” (Steenken, 1935), lack of neutral red dye binding
(Dubos & Middlebrook, 1948), lack of cord formation (Middlebrook et al., 1947), reduced
survival inside macrophages (Mackaness et al., 1954) or under anaerobic conditions (Heplar
et al., 1954), and decreased virulence in mice (Larson & Wicht, 1964) and guinea pigs
(Alsaadi & Smith, 1973). Despite several genetic and biochemical studies over the past seven
decades, the molecular mechanism underlying the decreased virulence of MTB H37Ra is still
under study (Zheng et al., 2008).

2.12.1. Genome biology of MTB H37Rv

The Mycobacterium tuberculosis strain H37Rv was originally obtained from the human-lung
isolate H37 in 1934 and has since been used widely in biomedical research worldwide. In
1905, Edward R. Baldwin isolated H37 from a nineteen-year-old male patient with pulmonary
tuberculosis (Steenken & Gardner, 1946). MTB H37Rv retains its full virulence in animal
models and is susceptible to anti-tubercular drugs. The whole genome of this pathogenic
strain was sequenced in 1998 (Cole et al., 1998). The genome consists of 4411532 base pairs
(Figure 2.1) with a guanine + cytosine (G+C) content of 65.6 %. It contains more than 4000
protein-coding genes, and the gene density is one gene


per kilobase. Genes in the genome are evenly dispersed on both forward and reverse

strands. Nearly one half of the coding sequences are due to domain shuffling and gene

duplication (Tekaia et al., 1999).
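The G+C content quoted above is the percentage of guanine and cytosine bases among all bases of a sequence. A minimal sketch of that calculation follows; the function name and the toy sequence are illustrative assumptions, and the sequence is not taken from the H37Rv genome.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N or gap characters do not skew the result.
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


# Toy sequence for illustration only; it is not real MTB data.
toy_sequence = "GGCCGATCGGCCATGCGCGT"
print(f"G+C content: {gc_content(toy_sequence):.1f} %")
```

Applied to a complete genome sequence read from a FASTA file, the same function would give the genome-wide G+C percentage.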

2.12.2. Genome biology of MTB H37Ra

MTB H37Ra is an avirulent strain derived from H37. The whole genome of this avirulent
strain was sequenced by the Chinese National Human Genome Center at Shanghai. Its genome is
4419977 base pairs long (Figure 2.2), with a G+C content of 65.6 %. Of its 4084 genes, 4034
are protein-coding, 45 code for tRNA, 3 for rRNA and 2 for other RNAs
(http://www.ncbi.nlm.nih.gov/genome/genomes/166?details=on&project_id=58853).
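The gene counts quoted above can be checked for internal consistency with a short tally. The sketch below uses only the numbers given in the text; the dictionary keys are illustrative labels, not NCBI field names.

```python
# Gene counts for MTB H37Ra exactly as quoted in the text above.
h37ra_genes = {
    "protein_coding": 4034,
    "tRNA": 45,
    "rRNA": 3,
    "other_RNA": 2,
}
reported_total = 4084

# The categories should add up to the reported total number of genes.
computed_total = sum(h37ra_genes.values())
coding_fraction = 100.0 * h37ra_genes["protein_coding"] / reported_total

print(f"computed total: {computed_total}")
print(f"matches reported total: {computed_total == reported_total}")
print(f"protein-coding share: {coding_fraction:.1f} %")
```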

Figure 2.1 Circular map of MTB H37Rv chromosome (Zhu et al., 2009; Stothard & Wishart 2005)


Figure 2.2 Circular map of MTB H37Ra chromosome (Zhu et al., 2009; Stothard & Wishart 2005)

2.12.3. Genomic and proteomic comparison of MTB H37Rv and MTB H37Ra

A genomic approach for identifying the variations between MTB H37Ra and MTB H37Rv at
the genetic level was first carried out by Brosch et al. (2000). Their study revealed two
polymorphisms between these strains: a fragment of 480 kilobases in MTB H37Rv was replaced
by two segments of 260 and 220 kilobases in MTB H37Ra, and a DraI segment of 7900 bases
was present in MTB H37Ra but absent in MTB H37Rv. The reported 7900-base polymorphism was
due to the removal of MTB H37Rv RvD2 in MTB H37Ra. Three IS6110 deletions (RvD3 to RvD5)
from the MTB H37Rv genome were also found in MTB H37Ra. The authors also described the
occurrence and mechanisms of the genomic differences between MTB H37Rv and MTB H37Ra, but
the role of these variations in the attenuation of MTB H37Ra remained unclear (Brosch et
al., 2000).


Genomic comparison between MTB H37Rv and H37Ra also revealed that the genome of MTB
H37Rv is very similar to that of MTB H37Ra and is 8,445 base pairs smaller (Zheng et al.,
2008). Between H37Ra and H37Rv, only 198 “single nucleotide variations (SNVs)” were
identified. Of these, 119 were identical between MTB CDC1551 and MTB H37Ra and three were
due to MTB H37Rv variation, leaving only 76 MTB H37Ra-specific SNVs, which affect only 32
genes (Zheng et al., 2008).
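The figures in this comparison can be cross-checked with simple arithmetic: the difference between the two genome lengths quoted in this chapter should equal the reported size difference, and the three SNV categories should add up to the total number of SNVs. The sketch below uses only numbers quoted in the text; the variable names are illustrative.

```python
# Genome lengths (base pairs) as quoted in sections 2.12.1 and 2.12.2.
H37RV_LENGTH = 4_411_532
H37RA_LENGTH = 4_419_977

size_difference = H37RA_LENGTH - H37RV_LENGTH

# SNV categories as quoted from Zheng et al. (2008) in the text above.
snv_categories = {
    "identical_in_CDC1551_and_H37Ra": 119,
    "due_to_H37Rv_variation": 3,
    "H37Ra_specific": 76,
}
total_snvs = sum(snv_categories.values())

print(f"H37Ra is {size_difference} bp longer than H37Rv")
print(f"total SNVs across the three categories: {total_snvs}")
```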

An in silico analysis of the PE/PPE family of MTB H37Ra and MTB H37Rv revealed
genetic variations between these two strains in the form of numerous SNVs along with some
deletions and insertions. Owing to these variations, changes were also observed in their
physico-chemical properties, protein-protein interaction domains and phosphorylation sites,
which can be correlated with differences in their virulence and pathogenesis (Kohli et al.,
2012).

A link between the avirulence of MTB H37Ra and a single amino acid substitution in
the PhoP protein was observed by Gonzalo-Asensio et al. (2008). Their study focused on the
phoP gene, which was found to have a significant role in MTB virulence. This gene is
completely conserved in all members of the MTB complex, including MTB H37Rv, except MTB
H37Ra. A point mutation in the phoP gene of MTB H37Ra results in an altered protein with a
single amino acid variation, i.e. replacement of the polar residue Ser219 by the nonpolar
residue Leu (Gonzalo-Asensio et al., 2008).
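A substitution such as Ser219 to Leu can be represented computationally as a replacement at a 1-based position together with the change in side-chain class. The sketch below is an illustration only: the function names are hypothetical, the residue grouping is a coarse textbook simplification, and the placeholder sequence is not the real PhoP sequence.

```python
# Coarse side-chain classes; this grouping is a simplification assumed for illustration.
RESIDUE_CLASS = {
    **dict.fromkeys("AVLIMFWPG", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar"),
    **dict.fromkeys("KRH", "basic"),
    **dict.fromkeys("DE", "acidic"),
}


def apply_substitution(protein: str, position: int, ref: str, alt: str) -> str:
    """Return the protein with `ref` at 1-based `position` replaced by `alt`."""
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the sequence")
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[: position - 1] + alt + protein[position:]


def describe_change(ref: str, alt: str) -> str:
    """Summarise the side-chain class change caused by a substitution."""
    return f"{RESIDUE_CLASS[ref]} -> {RESIDUE_CLASS[alt]}"


# Placeholder sequence (NOT the real PhoP sequence): alanines with Ser at position 219.
toy_protein = "A" * 218 + "S" + "A" * 28
mutant = apply_substitution(toy_protein, 219, "S", "L")
print(mutant[218], describe_change("S", "L"))
```

Checking the reference residue before substituting guards against off-by-one errors when positions are copied from the literature.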

Målen et al. (2011) compared membrane proteins of MTB H37Rv with those of its avirulent
sister strain MTB H37Ra and identified more than seventeen hundred proteins. The majority of
these proteins had comparable abundance in both strains. Twenty-nine “membrane-associated
proteins” showed a five-fold or greater difference in relative abundance between the two
strains: nineteen membrane proteins and lipoproteins were more abundant in MTB H37Rv, and
10 other proteins were more abundant in MTB H37Ra (Målen et al., 2011).
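The five-fold criterion described above can be sketched as a simple filter over paired abundance values. The protein names and abundance values below are hypothetical, and the sketch does not reproduce the quantification method of Målen et al. (2011).

```python
def fold_change(abundance_a: float, abundance_b: float) -> float:
    """Return how many times larger the greater abundance is than the smaller."""
    if abundance_a <= 0 or abundance_b <= 0:
        raise ValueError("abundances must be positive")
    high, low = max(abundance_a, abundance_b), min(abundance_a, abundance_b)
    return high / low


def differential_proteins(abundances, threshold=5.0):
    """Map each protein at or above the threshold to the strain where it is more abundant."""
    result = {}
    for protein, (h37rv, h37ra) in abundances.items():
        if fold_change(h37rv, h37ra) >= threshold:
            result[protein] = "H37Rv" if h37rv > h37ra else "H37Ra"
    return result


# Hypothetical names and (H37Rv, H37Ra) abundance values, for illustration only.
toy_abundances = {
    "protein_1": (50.0, 8.0),
    "protein_2": (12.0, 11.0),
    "protein_3": (2.0, 14.0),
}
print(differential_proteins(toy_abundances))
```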


2.13. Bioinformatics tools for genome and proteome analysis

2.13.1. Genome and proteome comparison tools

A number of bioinformatics tools and techniques are available in the public domain
for genome and proteome comparison. GenomeVISTA, available at http://genome.lbl.gov/cgi-bin/,
is an automatic server that can be used to find candidate orthologous regions for a draft or
finished DNA sequence from one species based on the genome of a second species. It also
provides a detailed comparative analysis of these regions (Couronne et al., 2003; Bray et
al., 2003). The Lagan Toolkit, available at http://lagan.stanford.edu/, provides a set of
alignment programs for comparative genomics. LAGAN is used for rapid global alignment of two
homologous genomic sequences, whereas Multi-LAGAN is used for multiple global alignments of
genomic sequences (Brudno et al., 2003). PipMaker is a web-based application that identifies
conserved segments between two long genomic sequences through sequence comparison. It
provides an efficient technique for aligning genomic sequences and returns a comprehensive
result in the form of a plot known as the percent identity plot (pip) (Elnitski et al.,
2003). MUMmer, available at http://mummer.sourceforge.net/, is a system for rapidly aligning
whole genomes, whether in complete or draft form. MUMmer 3.0 specifically aligns large
genomes of eukaryotic organisms at varying evolutionary distances (Kurtz et al., 2004).
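The idea behind a percent identity plot can be illustrated with a small sketch that reports, for each gap-free block of a pairwise alignment, its start position in the first sequence, its length and its percent identity. This is a simplified illustration under assumed inputs (two pre-aligned toy strings with "-" for gaps); it is not the algorithm implemented in PipMaker.

```python
def gap_free_segments(aligned_a: str, aligned_b: str):
    """Yield (start_in_a, length, percent_identity) for each gap-free alignment block."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pos_a = 0      # 0-based position in the ungapped first sequence
    start = None   # start of the current gap-free block in sequence A
    length = matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if base_a == "-" or base_b == "-":
            # A gap closes the current block, if one is open.
            if length:
                yield start, length, 100.0 * matches / length
            start, length, matches = None, 0, 0
        else:
            if start is None:
                start = pos_a
            length += 1
            matches += base_a == base_b
        if base_a != "-":
            pos_a += 1
    if length:
        yield start, length, 100.0 * matches / length


# Toy alignment for illustration only.
for segment in gap_free_segments("ACGTAC--GTTA", "ACCTACGGG-TA"):
    print(segment)
```

Plotting the start positions against the percent identities of the blocks would give a rough analogue of such a plot.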

GenomeBlast is an online tool, available at http://bioinfo-srv1.awh.unomaha.edu/, for
comparative study of small genomes. Besides identifying unique and homologous genes among
multiple genomes, it also illustrates their distribution on the genomes graphically (Lu et
al., 2006).

Artemis Comparison Tool (ACT) is a free tool that allows pair-wise comparisons between

complete genome sequences with annotation. It can also be used for identification and


analysis of regions of similarity and variation between genomes, based on comparison of the
entire sequences (Carver et al., 2008). ABWGAT (Anchor-Based Whole Genome Analysis Tool),
available at http://abwgc.jnu.ac.in/_sarba/cgi-bin/abwgc_retrival.cgi, is a web-based tool
for identifying sequence variations such as SNVs, indels, inversions and repeat expansions
at the genomic level (Das et al., 2009).
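How SNVs and indels are read off an alignment can be shown with a minimal sketch that walks through two pre-aligned sequences column by column. The function name and the toy alignment are assumptions for illustration; this is not the anchor-based method used by ABWGAT, and it does not detect inversions or repeat expansions.

```python
def call_variants(aligned_ref: str, aligned_query: str):
    """Return SNVs and single-base indels found in a pairwise alignment ('-' marks a gap)."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0  # number of reference bases consumed so far (1-based position)
    for ref_base, query_base in zip(aligned_ref, aligned_query):
        if ref_base != "-":
            ref_pos += 1
        if ref_base == query_base:
            continue
        if ref_base == "-":
            # Base present only in the query: insertion after reference position ref_pos.
            variants.append(("insertion", ref_pos, "-", query_base))
        elif query_base == "-":
            # Base present only in the reference: deletion at reference position ref_pos.
            variants.append(("deletion", ref_pos, ref_base, "-"))
        else:
            variants.append(("SNV", ref_pos, ref_base, query_base))
    return variants


# Toy alignment for illustration only.
for variant in call_variants("ACGT-ACGT", "ACCTTAC-T"):
    print(variant)
```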

PROCOM is a web-based tool, available at http://procom.wustl.edu/, for comparing
multiple eukaryotic proteomes. It currently hosts the proteomes of 32 eukaryotic organisms
for comparison (Li et al., 2005). PROMPT (Protein Mapping and Comparison Tool) is a
comprehensive bioinformatics software environment, available at
http://www.geneinfo.eu/prompt/index.php, which can be used for retrieving, analyzing,
mapping and comparing protein sets. Easy mapping of various types of sequence identifiers,
automatic data retrieval and integration, and a user-friendly graphical interface are the
main features of PROMPT (Schmidt & Frishman, 2006).

2.13.2. Mutation analysis tools

Comparative proteome analysis among different pathogenic organisms may reveal proteins
with different types of variations in their amino acid sequences. These variations may play
an important role in the evolution of a particular organism, resulting in the divergence of
different strains. A single amino acid mutation in a protein sequence may alter protein
structure and function, which may account for the virulence and drug resistance properties
of pathogenic organisms. Several mutation analysis systems are available for analyzing the
effect of amino acid variation on the structure and function of proteins, such as PolyPhen
(Adzhubei et al., 2010), SIFT (Ng et al., 2003; Kumar et al., 2009), PROVEAN (Choi et al.,
2012) and Project HOPE (Venselaar et al., 2010). Computational tools like SIFT, PolyPhen and
PROVEAN are able to predict deleterious non-synonymous SNPs, whereas


Project HOPE is a system that can automatically analyze the consequence of a point mutation

on the three dimensional structure of a protein.