
Discrimination of breast tissular structure by FT-IR FPA imaging

Audrey BENARD a and Erik GOORMAGHTIGH a,1

a Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium

Abstract. The recent development of Fourier Transform Infrared spectroscopic imaging devices provides a new imaging approach for cell and tissue pathology as distribution and structure of cellular components can be observed without sample staining. However, in order to translate this emerging technology to the clinic, optimization of the acquisition, classification and validation techniques is a prerequisite step. The information extracted from the spectral data largely depends on the quality of the raw spectra but also on the corrections and processing methods. In this chapter, some of the guidelines for the recording and preprocessing methods will be presented. Unsupervised statistical approaches for tissue type discrimination illustrate the potential of this technology on histopathology.

Keywords. Infrared micro-spectroscopy, infrared imaging, breast cancer.

1. Introduction

Because of its potential to probe tissues and cells at the molecular level without requirement for extrinsic contrast agents, infrared spectroscopy could become an attractive tool in clinical and diagnostic analysis to complement the existing methods. The absence of dyes and contrasting reagents remains one of the major advantages of this technique. IR spectroscopy can provide an accurate fingerprint of biological samples without disturbing their structure and integrity. Hundreds of biological applications of this technology have been published since it was demonstrated, in the 1980s, that FT-IR spectra of bacteria provide a unique fingerprint that allows the identification of bacterial species [1-3]. In cancer research, it was demonstrated that different tumor cell lines can be identified by statistical analysis of their FT-IR spectra [4-6]. Furthermore, tumor cell lines with different biological behaviours can be separated with this technique. For instance, in vivo aggressiveness and in vitro migration of glioma cell lines have been successfully predicted from their IR spectra, whereas no molecular biology technique was available to do so [7].

1 Corresponding Author: Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, CP 206/2, Boulevard du Triomphe, B-1050 Brussels, Belgium; E-mail: [email protected].


As IR spectroscopy is based on the absorption of infrared light by vibrational transitions in covalent bonds, characteristic spectral features are correlated with biological properties of the sample. Useful diagnostic information can therefore be extracted from infrared spectra for different pathologies [8, 9].

Owing to the heterogeneity of cellular samples, many analyses on tissue sections by conventional FT-IR spectroscopy gave rise to spurious results. The recent availability of IR-sensitive multi-channel detector arrays has opened the door to applications of IR spectroscopy in pathology. The combination of the non-invasive nature of FT-IR spectroscopy with the ability to obtain spatially resolved chemical and structural information on tissue samples makes IR micro-spectroscopy a well-suited technology for the differentiation and identification of tissue structures and pathology. The molecule-specific vibrational signatures due to inherent chemical differences found within cells of the tissue provide a spatially resolved chemical characterization of samples and present the pathologist with an important tool for biodiagnostics. Moreover, IR imaging presents several advantages: (i) this technology is rapid: within minutes, IR data can be collected and interpreted; (ii) neither staining of the samples nor addition of chemical reagents is necessary, as the intrinsic molecular vibrations probe the chemical composition and structural properties of the sample; (iii) this technology is non-destructive. Because the alterations in the morphological and biochemical composition of the tissue associated with cancer transformation are subtle, the collection of high-quality spectra at high spatial resolution is necessary for the diagnosis of cancer.

This recent technology is based on the segmentation of the IR radiation at the detection plane. Although the entire tissue region of interest is illuminated, contributions of adjacent areas are separated by an array of thousands of IR-sensitive detectors arranged in a Focal Plane Array (FPA). Multi-channel detectors allow thousands of spectra to be acquired in a period of time comparable to that needed to record a single spectrum conventionally. The amount of data acquired requires automated processing in order to extract significant information. Providing an automated, unbiased, computer-based technique is one of the most important aspects of the IR micro-spectroscopy approach.

More than a decade ago, IR micro-spectroscopy became a new opportunity to study tissue specimens for diagnosis/prognosis of disease and progression monitoring [10-15]. Some applications of IR imaging in disease detection are reported in the literature: from the characterisation of xenografted carcinoma [16] to bone and cartilage [17], through diagnosis of benign and malignant lesions in breast tissue [18-20], histopathologic recognition of axillary lymph node [21], colorectal adenocarcinoma [22] and cell cervical carcinoma [23], IR imaging has in many cases shown its potential for histopathological issues.

As an alternative to manual histopathological examinations, computer-based pattern recognition approaches provide more accurate and reproducible diagnoses. Methodologies based on IR image segmentation produce false colour maps of the sample which can be compared to the histopathologic gold standard and help pathologists make better decisions. As IR pattern recognition can be fully automated, untrained users can interpret these false colour images. Using tissue micro-arrays instead of entire tissue sections further opens the door to high-throughput screening of spectroscopic images.


In the present chapter, practical information about recording and processing spectra acquired by an FT-IR imaging system will be presented. We showed previously that identification of cell types in breast samples can be achieved on formalin-fixed paraffin-embedded (FFPE) tissues, which gives access to a large tumor tissue bank and allows direct comparison with histopathologic gold standard procedures [24]. Furthermore, the use of such preserved tissues allows retrospective studies to be performed, as sufficient hindsight on patient outcome is available. These tissues represent a considerable source of information about pathological state and molecular content and allow the identification of prediction markers. In this chapter, we report examples obtained on FFPE breast cancer tissues.

Breast cancer is the most frequently diagnosed cancer in women in Western countries. This cancer represents approximately 30% of all cancers diagnosed and 16% of all cancer deaths [25]. The prognosis/diagnosis and the management of breast cancer have always been based on clinical variables such as histological type and grade, lymph node involvement, tumor size and status of hormonal receptors. The accurate diagnosis of the benign or malignant nature of a suspicious lesion identified by mammography requires an invasive procedure to obtain a tissue biopsy. The specimen obtained is prepared using histology procedures for viewing under a microscope, and the diagnosis of tumors relies on the visual inspection of stained tissue sections by a trained pathologist. The visualization of the structure and distribution of cellular components in tissue sections using light microscopy has become the histopathologic gold standard procedure. However, this morphological pattern recognition is a subjective assessment, as levels of inter- and intra-observer agreement remain unsatisfactory [26-28].

2. Methodology/ Data analysis

2.1. Sample preparation / Recording of spectra

Human breast tissues were obtained from the FFPE Tumor Bank of the Institute J. Bordet in Brussels. A 3 µm thick tissue section was cut from a paraffin-embedded tissue block, mounted on a glass slide and stained with Haematoxylin and Eosin (H&E) for histological assessment. Morphological interpretation and identification of tissue classes and molecular composition were obtained from the stained section by a trained pathologist. An adjacent 3 µm tissue section was mounted on a barium fluoride (BaF2) disk for IR imaging analysis. This section was subsequently deparaffinized, rehydrated and dried but not stained. As paraffin exhibits strong vibration bands in the methyl and methylene stretching (3000-2800 cm-1) and deformation (~1450 cm-1) regions, it could distort subsequent spectral interpretation. A complete chemical dewaxing of tissues is recommended for spectroscopic analyses. Alternatively, modelling of the paraffin contribution has been proposed [29].

The spectroscopic imaging data were acquired using a Hyperion 3000 IR imaging system (Bruker Optics, Ettlingen, Germany), equipped with a 64*64 Mercury Cadmium Telluride (MCT) Focal Plane Array (FPA) detector. The IR data were collected in transmission mode from sample regions of 184*184 µm2. Every individual element of the array detector covers an area of 2.9*2.9 µm2. The possibility of mapping to cover larger sample areas, as well as pixel binning, will be discussed further.


Recording of spectra is the most important step prior to data analysis. If the signal of sample information is too low or if high noise disturbs data, there is no way to overcome the problem. Consequently, noise level is a critical parameter for defining recording conditions of spectra.

2.2. Noise Level

Noise level is a poorly defined property. Every manufacturer refers to a different definition, which creates confusion as different values are obtained for the noise level parameter. In the present chapter, the noise is defined as the intra-spectrum variation, which mainly depends on the instrumentation. For each spectrum, the noise was measured as follows. First of all, a spectral region where no absorbance of the biological sample occurs is chosen for noise estimation, typically 2250-2100 cm-1. A segment of defined length is moved over this region of the spectrum and, for every position, 1) a linear baseline is fitted to this segment and subtracted to account for any general tilt of the baseline and 2) the standard deviation around the mean is computed and stored (Figure 1). In order to minimize the contribution of real broad bands when present, this procedure is repeated step by step, the segment being shifted by 1 cm-1 each time. The final noise value is the mean of the noise computed over all the segment positions.

Figure 1. Procedure for noise level determination. A spectrum segment and the linear baseline fitted for noise calculation are represented. The absorbance values are presented in arbitrary units (A.U.). The length of the segment was set to 50 cm-1. This segment is moved in steps of 1 cm-1 over the defined spectral region, e.g. 2250-2100 cm-1. Noise is calculated as the mean, over all the segment positions in this region of the spectrum, of the standard deviation obtained after subtraction of a linear baseline.

[Figure 1 plot: segment for noise level evaluation. Y-axis: Absorbance (A.U.). Legend: circles, spectral data points; solid line, fitted linear baseline.]
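The noise estimation procedure of Section 2.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names (`detrend_linear`, `noise_level`) are hypothetical, the linear baseline is assumed to be an ordinary least-squares fit, and the population standard deviation is assumed. The default region (2250-2100 cm-1), segment length (50 cm-1) and step (1 cm-1) are taken from the text.

```python
from statistics import mean, pstdev


def detrend_linear(x, y):
    """Subtract the least-squares straight line fitted to the points (x, y)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [yi - (y_mean + slope * (xi - x_mean)) for xi, yi in zip(x, y)]


def noise_level(wavenumbers, absorbances, low=2100.0, high=2250.0,
                segment=50.0, step=1.0):
    """Mean, over all segment positions, of the standard deviation of the
    baseline-corrected absorbances inside the segment."""
    deviations = []
    start = low
    while start + segment <= high:
        points = [(w, a) for w, a in zip(wavenumbers, absorbances)
                  if start <= w <= start + segment]
        if len(points) >= 3:  # a line through two points leaves no residual
            x, y = zip(*points)
            deviations.append(pstdev(detrend_linear(x, y)))
        start += step
    return mean(deviations)


# Synthetic check (not measured data): tilted baseline plus a +/-0.001 ripple.
axis = [float(w) for w in range(2000, 2301)]
ripple = [0.43 + 1e-5 * w + (0.001 if i % 2 else -0.001)
          for i, w in enumerate(axis)]
print(noise_level(axis, ripple))  # close to 0.001; the tilt is removed by the fit
```

Because each segment is detrended before the standard deviation is taken, a sloping baseline does not inflate the noise estimate; only the point-to-point scatter contributes.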


2.3. Spectral resolution

Because band width in the condensed state is large for reasons developed elsewhere [30], little information is gained when working at high spectral resolution. On the other hand, atmospheric contaminants have intrinsically narrow bands. When FT-IR data are recorded with a nominal resolution of 8 or 12 cm-1, atmospheric water bands appear broad and rather featureless on top of the sample signature. Therefore, their subtraction from a protein spectrum, for example, is difficult and the coefficient applied for the subtraction may vary by about 50 % according to the operator [30]. Conversely, when a spectrum of the same sample is recorded in the same conditions with a nominal resolution of 0.5 cm-1, the sharp atmospheric water bands appear clearly resolved from the broad amide bands [30]. In practice, in conventional FT-IR spectroscopy, contributions of atmospheric contaminants are better corrected when spectra are recorded at relatively high resolution, which takes advantage of the intrinsic bandwidth difference between the atmospheric absorption bands and those of the liquid or solid sample [31].

However, the situation is somewhat different in imaging. First, high resolution requires significantly longer acquisition times; second, the size of the data files increases in proportion to the spectral resolution applied, making them unmanageable for 64*64 (4096 spectra per measure) or 128*128 (16384 spectra per measure) detector arrays. Furthermore, recording at lower resolution discards a large part of the interferogram, which results in a lower noise level.

The effect of the spectral resolution and number of scans averaged per spectrum on noise level (as defined in Section 2.2) is reported in Figure 2.

Figure 2. Evolution of the noise level (see Section 2.2) as a function of the number of scans per spectrum for different spectral resolution (from 2 to 12 cm-1). Noise levels represented here are the mean standard deviations on spectral region 2250-2100 cm-1 obtained on 4096 spectra collected on an FT-IR 64*64 detector array imaging system. The dotted line points to a resolution of 8 cm-1 and 256 scans, the standard condition used in the present study.


For a better understanding of experimental conditions, the same data are reported in Figure 3 as a function of the time required for the data recording. Noise level is known to decrease as the inverse of the square root of the number of scans taken per spectrum, i.e. as the inverse of the square root of the time required for data recording.
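The square-root dependence can be illustrated with a short calculation. The sketch below is only an idealized model (it assumes purely random, uncorrelated noise), and the helper names are hypothetical. The reference point of about 2.4 × 10-4 OD at 256 scans is the value quoted later in this section for 8 cm-1 resolution.

```python
import math


def expected_noise(noise_ref, scans_ref, scans):
    """Noise expected for `scans` co-added scans, scaled from a reference
    measurement, assuming noise is proportional to 1/sqrt(number of scans)."""
    return noise_ref * math.sqrt(scans_ref / scans)


def scans_needed(noise_ref, scans_ref, noise_target):
    """Number of scans required to bring the noise down to `noise_target`."""
    return scans_ref * (noise_ref / noise_target) ** 2


for n in (64, 256, 1024):
    print(n, expected_noise(2.4e-4, 256, n))
```

Under this model, halving the noise costs four times as many scans, and therefore four times the recording time, which is why very low noise levels quickly become impractical.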

Figure 3. Evolution of the noise level (see Section 2.2) as a function of the recording time for different spectral resolutions (from 2 to 12 cm-1). The graph is presented on a double log scale, yielding a linear relationship between noise level and recording time. Noise levels represented here are the mean standard deviations on spectral region 2250-2100 cm-1 obtained on 4096 spectra collected on an FT-IR 64*64 detector array imaging system. The dotted line points to a resolution of 8 cm-1 and 256 scans, the standard condition used in the present study.

Figures 2 and 3 reveal that reaching standard deviations below 5 × 10-4 OD units for resolutions better than 8 cm-1 is time consuming. This practical problem, coupled with the need to handle large matrices later during the processing and analysis stages, needs to be considered when choosing recording conditions, especially on an FT-IR imaging system. Indeed, based on the Nyquist-Shannon sampling theorem, a signal with finite bandwidth should be sampled at a rate at least twice its highest frequency; in practice, this means a data point spacing of half the nominal resolution, resulting in ca. 3100 data points between 3900 and 800 cm-1 at 2 cm-1 resolution, against 775 data points at 8 cm-1.

At the 8 cm-1 / 256 scans recording condition (see Figures 2 and 3), the noise level is about 2.4 × 10-4 OD. As absorbances at the Amide I peak vary from 0.3 to 0.8 in raw spectra acquired in breast tissue, a good Signal-to-Noise ratio, from 1250 to 3333, can be expected.
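The figures quoted above can be reproduced with elementary arithmetic. The following sketch uses hypothetical helper names and assumes that the data point spacing is exactly half the nominal resolution.

```python
def n_data_points(high_cm, low_cm, resolution_cm):
    """Number of data points when the spacing is half the nominal resolution."""
    spacing = resolution_cm / 2.0
    return round((high_cm - low_cm) / spacing)


def signal_to_noise(signal, noise):
    """Ratio of the Amide I absorbance to the noise level (both in OD)."""
    return signal / noise


print(n_data_points(3900, 800, 2))  # 3100 points at 2 cm-1 resolution
print(n_data_points(3900, 800, 8))  # 775 points at 8 cm-1 resolution
print(round(signal_to_noise(0.3, 2.4e-4)))  # 1250
print(round(signal_to_noise(0.8, 2.4e-4)))  # 3333
```

For a 64*64 array, the same arithmetic gives 4096 × 775 values per image at 8 cm-1 against 4096 × 3100 at 2 cm-1, a fourfold difference in matrix size.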

In conclusion, a spectral resolution of 8 cm-1 seems to be a good compromise for imaging. One IR image results in 4096 spectra (ca 4 minutes recording), each one being the average of 256 scans recorded in a spectral range from 3900 to 800 cm-1.


2.4. Spatial resolution

In IR histopathological evaluation, spatial resolution is a critical factor defining image quality. As biological samples such as tissues exhibit distinct levels of morphologic heterogeneity, routine histopathological examination investigates the diverse structural units at various magnifications, starting at low magnification for gross morphological interpretation and ending with high magnification for evaluation of subcellular structures. Unfortunately, due to the so-called diffraction limit, low lateral spatial resolution is one of the drawbacks of any IR imaging system. The diffraction limit is of the same order of magnitude as the wavelength λ of the IR light, which ranges from 2.5 to 10 µm, i.e. 2.5 µm at 4000 cm-1, 5 µm at 2000 cm-1 and 10 µm at 1000 cm-1 [32]. In turn, due to the long wavelength of mid-infrared light, the spatial resolution of IR microscopes cannot reach that of the visible light microscopes used for histopathological evaluation. The best correlation between histology and IR imaging will be found at a particular spatial resolution [32]. The most appropriate pixel size for determining histological structure will be discussed.
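The wavelengths quoted above follow directly from the definition of the wavenumber, λ (µm) = 10^4 / ν (cm-1). A minimal sketch, with a hypothetical function name:

```python
def wavelength_um(wavenumber_cm):
    """Wavelength in micrometres for a wavenumber in cm-1 (1 cm = 10^4 um)."""
    return 1.0e4 / wavenumber_cm


for nu in (4000, 2000, 1000):
    print(nu, "cm-1 ->", wavelength_um(nu), "um")
```

The same conversion places the Amide I band near 1656 cm-1 at a wavelength of roughly 6 µm, which is larger than the 2.9 µm detector element size mentioned in Section 2.1.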

In the example below, the same tissue sample has been imaged with different pixel binning conditions and the noise level is reported in Figure 4. Pixel binning involves taking square groups of neighboring pixels, which are averaged in order to obtain a single larger pixel. This results in a smaller data file size as fewer spectra are stored. Moreover, pixel binning reduces the noise level of the IR image, as averaging spectra decreases the standard deviation. Yet, pixel binning also degrades image resolution. When no pixel binning is applied, 4096 spectra are acquired per IR image, every pixel covering a sample area of 2.9*2.9 µm². 2x2 pixel binning results in 1024 spectra for a pixel size of 5.8*5.8 µm². When binning is set to 4x4, 256 spectra are obtained and the pixel size is about 11.6*11.6 µm².
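Pixel binning as described above amounts to averaging the spectra of each square block of neighbouring pixels. The sketch below is a plain-Python illustration with hypothetical names; it assumes the image is stored as `cube[row][col]`, each entry being a list of absorbances, and says nothing about how the acquisition software performs binning internally.

```python
def bin_pixels(cube, factor):
    """Average the spectra of every factor x factor block of pixels.

    `cube[row][col]` is a spectrum (list of absorbances); the result has
    the same layout with `factor` times fewer rows and columns.
    """
    n_rows, n_cols = len(cube), len(cube[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("image size must be a multiple of the binning factor")
    n_points = len(cube[0][0])
    binned = []
    for r0 in range(0, n_rows, factor):
        binned_row = []
        for c0 in range(0, n_cols, factor):
            block = [cube[r][c]
                     for r in range(r0, r0 + factor)
                     for c in range(c0, c0 + factor)]
            binned_row.append([sum(s[k] for s in block) / len(block)
                               for k in range(n_points)])
        binned.append(binned_row)
    return binned


# Synthetic 64*64 image with 3-point "spectra" (placeholder values only).
cube = [[[float(r), float(c), 1.0] for c in range(64)] for r in range(64)]
for factor in (1, 2, 4):
    image = bin_pixels(cube, factor)
    print(factor, len(image) * len(image[0]))  # 4096, 1024 and 256 spectra
```

The spectrum counts printed for no binning, 2x2 and 4x4 binning match those given in the text, and the pixel edge length scales with the binning factor (2.9, 5.8 and 11.6 µm).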

Figure 4. Noise level (see Section 2.2) as a function of pixel binning parameter. When no pixel binning is applied, 4096 spectra are stored and pixel size is about 2.9*2.9 µm²; for 2x2, 1024 spectra and pixel size of 5.8*5.8 µm²; for 4x4, 256 spectra and pixel size of 11.6*11.6 µm²; for 8x8, 64 spectra and pixel size of 23.2*23.2 µm² and finally for 16x16, 16 spectra stored and pixel size of about 46.4*46.4 µm². Noise levels represented here are the mean standard deviations on spectral region 2250-2100 cm-1 obtained on an FT-IR 64*64 detector array imaging system.


In Figure 4, a reduction of the noise level is observed when going from no binning to 2x2 pixel binning. The additional benefit is only modest when 4x4 or 8x8 binning is applied. Because of intrinsic spatial resolution limits related to the IR wavelength, it appears that a 5.8*5.8 µm² pixel size is a good compromise which keeps most of the spatial information contained in the raw data.

2.5. Primary spectral processing

We present below the preprocessing of the spectra automatically applied to the recorded data for further statistical analysis. This spectral preprocessing includes background and baseline corrections, water vapor compensation and normalization.

2.5.1. Spectral correction for Atmospheric Water

Because it is virtually impossible to maintain a constant humidity level in the microspectrometer room for long-term experiments and despite the fact that the microscope is purged continuously, the water vapor vibrational spectrum may interfere in the 1800-1400 cm-1 range of the spectrum of cells and tissues.

As described previously [30, 31, 33], a water reference spectrum is collected and subtracted from the raw spectra. Since it can be hypothesized that the best reference spectrum is the one collected in the same recording conditions and on top of the sample, the reference spectrum was defined as the mean spectrum of an IR measure acquired in transmission mode at a spectral resolution of 8 cm-1 after stopping the purging system for 20 minutes. A subtraction coefficient is then computed as the ratio of the area of the atmospheric water reference peak between 1956 and 1935 cm-1 in the sample spectrum to the area of the same peak in the reference atmospheric water spectrum. The area of the reference peak is evaluated after subtraction of a straight line drawn between the spectrum points located at 1956 and 1935 cm-1. This level of processing is generally sufficient for IR imaging experiments.
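The subtraction coefficient described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the sample and reference spectra are assumed to share the same wavenumber axis, and the peak area is integrated with the trapezoidal rule. The synthetic spectra in the example are placeholders, not measured data.

```python
import math


def peak_area(wavenumbers, absorbances, low=1935.0, high=1956.0):
    """Area of the band between `low` and `high` after subtracting the
    straight line drawn between the two end points of the interval."""
    points = sorted((w, a) for w, a in zip(wavenumbers, absorbances)
                    if low <= w <= high)
    (w0, a0), (w1, a1) = points[0], points[-1]
    corrected = [(w, a - (a0 + (a1 - a0) * (w - w0) / (w1 - w0)))
                 for w, a in points]
    # Trapezoidal integration over the baseline-corrected band.
    return sum(0.5 * (ya + yb) * (wb - wa)
               for (wa, ya), (wb, yb) in zip(corrected, corrected[1:]))


def subtract_water(wavenumbers, sample, reference):
    """Return the subtraction coefficient and the corrected sample spectrum."""
    coefficient = (peak_area(wavenumbers, sample)
                   / peak_area(wavenumbers, reference))
    corrected = [s - coefficient * r for s, r in zip(sample, reference)]
    return coefficient, corrected


# Synthetic example: a sloping sample spectrum contaminated by 0.37 x reference.
axis = [float(w) for w in range(1900, 2001)]
reference = [math.exp(-((w - 1945.0) / 4.0) ** 2) for w in axis]
sample = [0.2 + 0.001 * w + 0.37 * r for w, r in zip(axis, reference)]
coefficient, corrected = subtract_water(axis, sample, reference)
print(round(coefficient, 6))  # 0.37
```

Subtracting the straight line between the end points makes the coefficient insensitive to the sloping background of the sample, so only the narrow water band contributes to the ratio.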

2.5.2. Spectral rescaling

The intensity of the spectra strongly depends on the amount of material, essentially defined by the thickness of the tissue slice. As this latter parameter is never fully under control, a scaling of the spectra is required to achieve meaningful comparison, PCA or hierarchical clustering on spectral data. Various normalization procedures are reported in the literature [34, 35] but it is apparent that most of the normalizations do not radically change the analysis accuracy. Typically for IR imaging data, the area under a band or several bands is set to a given value so that all the spectra have the same integrated area in a chosen region. Amide I and II bands were selected here for equal area normalization, between 1725 and 1481 cm-1. Thereby, this preprocessing step reduces the influence of absorbance intensity variations caused by changes in cellular density and thickness of the tissue.
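Equal-area normalization over the Amide I and II region can be sketched as below. The function names and the target area of 1.0 are assumptions made for the illustration (the text only states that the area is set to "a given value"), and the area is integrated with the trapezoidal rule; the limits 1725 and 1481 cm-1 are those given above.

```python
def band_area(wavenumbers, absorbances, low, high):
    """Trapezoidal area under the spectrum between `low` and `high` (cm-1)."""
    points = sorted((w, a) for w, a in zip(wavenumbers, absorbances)
                    if low <= w <= high)
    return sum(0.5 * (a0 + a1) * (w1 - w0)
               for (w0, a0), (w1, a1) in zip(points, points[1:]))


def normalize_area(wavenumbers, absorbances, low=1481.0, high=1725.0,
                   target=1.0):
    """Scale the whole spectrum so that its area in [low, high] equals `target`."""
    area = band_area(wavenumbers, absorbances, low, high)
    if area <= 0:
        raise ValueError("non-positive band area; spectrum cannot be normalized")
    return [a * target / area for a in absorbances]


# Two synthetic spectra of the same shape, as if from a thin and a thick region.
axis = [float(w) for w in range(1400, 1801)]
thin = [0.3 * max(0.0, 1.0 - abs(w - 1650.0) / 100.0) for w in axis]
thick = [2.5 * a for a in thin]
thin_n = normalize_area(axis, thin)
thick_n = normalize_area(axis, thick)
print(max(abs(a - b) for a, b in zip(thin_n, thick_n)))  # essentially zero
```

After rescaling, the two spectra coincide, which shows how the step removes intensity differences due to thickness while leaving band shapes untouched.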


2.5.3. Baseline correction

Vertical shifts and sloping backgrounds are typical deviations often observed in FT-IR spectra. They can be handled either by baseline subtraction or by taking the second derivative of the spectra. Second derivative processing enhances peak resolution and does not require any subjective choice of baseline reference points. Nevertheless, the major disadvantage of this method is the enhancement of the noise level that accompanies the apparent resolution enhancement of the spectrum. Contrary to the second derivative procedure, baseline subtraction requires a choice of parameters. However, baseline corrections differing in the number of zeroed absorbances are almost equally efficient at revealing differences between series of spectra [36]. A 6-point baseline correction was used here; absorbances at six wavenumbers ranging from 1800 to 950 cm-1 were set to zero.
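A multi-point baseline correction of this kind can be sketched as a piecewise-linear baseline drawn through the spectrum at the chosen anchor wavenumbers and then subtracted, so that the absorbance at each anchor becomes zero. The sketch below rests on several assumptions: the function name is hypothetical, the baseline is assumed to be piecewise linear, and the six anchor positions in the example are illustrative placeholders, since the text only gives their range (1800 to 950 cm-1).

```python
import math


def baseline_correct(wavenumbers, absorbances, anchors):
    """Subtract a piecewise-linear baseline passing through the spectrum at
    the data points nearest to each anchor wavenumber. Returns a list of
    (wavenumber, corrected absorbance) between the outermost anchors."""
    points = sorted(zip(wavenumbers, absorbances))
    axis = [w for w, _ in points]
    knots = sorted({min(range(len(axis)), key=lambda i: abs(axis[i] - anchor))
                    for anchor in anchors})
    if len(knots) < 2:
        raise ValueError("at least two distinct anchor points are required")
    corrected = []
    for j in range(len(knots) - 1):
        i0, i1 = knots[j], knots[j + 1]
        (w0, a0), (w1, a1) = points[i0], points[i1]
        last_segment = j == len(knots) - 2
        for i in range(i0, i1 + (1 if last_segment else 0)):
            w, a = points[i]
            corrected.append((w, a - (a0 + (a1 - a0) * (w - w0) / (w1 - w0))))
    return corrected


# Hypothetical anchor positions, chosen only for the example.
ANCHORS = (1800.0, 1720.0, 1480.0, 1350.0, 1180.0, 950.0)
axis = [float(w) for w in range(900, 1901)]
spectrum = [0.05 + 1e-4 * w + 0.5 * math.exp(-((w - 1650.0) / 30.0) ** 2)
            for w in axis]
result = dict(baseline_correct(axis, spectrum, ANCHORS))
print([round(result[a], 12) for a in ANCHORS])  # zero (up to rounding) at every anchor
```

In practice the anchors would be placed where the sample is expected not to absorb, which is the subjective choice mentioned in the text.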

2.6. Spectral quality test

Spectral data displaying low absorbance values are sometimes observed at the edges of tissues or cells, mostly due to thin sample sections or incomplete adherence of the sample to the BaF2 window. These spectra can also be distorted by dispersive band shapes superimposed on absorbance features. Several techniques have been reported to reduce or remove these dispersion artifacts from infrared microspectral data [37-41]. This optical effect mostly prevails in reflection or transflection measurement modes, and all the data processing techniques suggested to correct these artifacts are computationally intensive. For transmission IR imaging, the application of adequate filters including a stringent Signal-to-Noise ratio criterion is usually sufficient. This spectral quality test is commonly used in the literature, for example in [21]. Furthermore, spectra with dispersive line shapes present similar spectral features in comparison to non-distorted spectra. In turn, an unsupervised multivariate analysis such as K-means Clustering Analysis can easily identify the distorted spectra.

The two quality tests described below are applied on preprocessed and corrected spectra as described previously.

For noise estimation, a spectral region where no absorbance of the biological sample occurs, typically 2200-2100 cm-1, is used. As defined in Section 2.2, noise is calculated as the standard deviation in this region of the spectrum after subtraction of a linear baseline. The signal value is the maximum absorbance in the Amide I and II region, typically 1750-1480 cm-1, after subtraction of a linear baseline.

Spectra were eliminated from further analyses when the S/N ratio was below 400:1.

A second spectral quality test is based on the absorbance on the Amide I and II region of the spectrum. Only spectra with no absorbance below -10 or larger than 120, after subtraction of a linear baseline were retained for statistical analysis. This filter eliminates spectra with negative lobes on the high wavenumber side of Amide I.
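The two quality tests can be combined into a single pass/fail function, sketched below. Several points are assumptions rather than statements from the text: the function names are hypothetical; the noise is taken as a single standard deviation over 2200-2100 cm-1 after a least-squares line fit (a simplification of the sliding-segment procedure of Section 2.2); the linear baseline under the Amide bands is assumed to be the straight line between the end points of the 1750-1480 cm-1 region; and the thresholds -10 and 120 are assumed to apply to spectra already rescaled as in Section 2.5.2. The synthetic spectra are placeholders, not measured data.

```python
import math
from statistics import pstdev


def region(wavenumbers, absorbances, low, high):
    """Points of the spectrum between `low` and `high`, sorted by wavenumber."""
    return sorted((w, a) for w, a in zip(wavenumbers, absorbances)
                  if low <= w <= high)


def minus_endpoint_line(points):
    """Subtract the straight line joining the first and last points."""
    (w0, a0), (w1, a1) = points[0], points[-1]
    return [a - (a0 + (a1 - a0) * (w - w0) / (w1 - w0)) for w, a in points]


def minus_fitted_line(points):
    """Subtract the least-squares straight line fitted to the points."""
    n = len(points)
    w_mean = sum(w for w, _ in points) / n
    a_mean = sum(a for _, a in points) / n
    sww = sum((w - w_mean) ** 2 for w, _ in points)
    swa = sum((w - w_mean) * (a - a_mean) for w, a in points)
    slope = swa / sww
    return [a - (a_mean + slope * (w - w_mean)) for w, a in points]


def passes_quality(wavenumbers, absorbances, snr_min=400.0,
                   abs_min=-10.0, abs_max=120.0):
    """True if the spectrum passes both the S/N and the min/max tests."""
    amide = minus_endpoint_line(region(wavenumbers, absorbances, 1480.0, 1750.0))
    silent = minus_fitted_line(region(wavenumbers, absorbances, 2100.0, 2200.0))
    noise = pstdev(silent)
    snr = max(amide) / noise if noise > 0 else math.inf
    return snr >= snr_min and min(amide) >= abs_min and max(amide) <= abs_max


def synthetic(amplitude, ripple):
    """Placeholder spectrum: one band at 1650 cm-1 plus an alternating ripple."""
    axis = [float(w) for w in range(1000, 2301)]
    values = [amplitude * math.exp(-((w - 1650.0) / 40.0) ** 2)
              + (ripple if i % 2 else -ripple) for i, w in enumerate(axis)]
    return axis, values


print(passes_quality(*synthetic(100.0, 0.1)))  # True: S/N is about 1000
print(passes_quality(*synthetic(100.0, 1.0)))  # False: S/N is about 100
print(passes_quality(*synthetic(150.0, 0.1)))  # False: maximum exceeds 120
```

Applying the function to every pixel yields a mask of rejected spectra, comparable to the black and grey pixels described for Figure 5.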

Figure 5 illustrates the necessity of applying these filters to IR images obtained with a multi-channel detector array. Some badly distorted or very noisy spectra have been selected for the illustration (Figure 5C). In practice, these filters restrict diagnosis and structural differentiation to good-quality spectra.


Figure 5. Application of minimum, maximum and S/N filters on breast tissue sample. A. Photomicrograph of a 3 µm thin Haematoxylin and Eosin (H&E) stained breast tissue section (734 x 550 µm). B. False colour map representing the Amide I band intensity at 1656 cm-1 obtained from a corresponding unstained section. The colour bar indicates the Amide I absorbance intensity. Spectra with S/N ratio below threshold (S/N < 400) are black colour coded; spectra with minimum and maximum absorbance on 1800-1400 cm-1 inferior to -10 or superior to 120 are respectively light and dark grey colour coded. C. Some spectra eliminated based on the filters. D. Some spectra retained for further analyses after application of filters.

[Figure 5 panels A-D. Scale bar: 100 µm. Panels C and D plot absorbance (A.U.) against wavenumber (cm-1).]


3. Applications

Imaging of a breast tissue section will be presented here in order to illustrate the potential of infrared imaging for histopathological recognition.

Breast tissue is a structurally complex organization composed essentially of fibrous connective and adipose tissues surrounding lobules and ducts, glands and blood vessels. The tissue also contains lymphocytes, macrophages and fibroblasts organised into specific structures.

IR histological differentiation can be highlighted in breast tissue by chemical group mapping, which requires the identification of specific biomarker bands. The spectral hypercube is then converted into a two-dimensional image allowing the determination and quantification of the inherent chemical composition of the sample. As every spectrum associated with a pixel has a unique spatial x,y position in the imaging measure, a false colour map can be assembled. A specific component of the tissue (proteins, lipids, carbohydrates or nucleic acids) can be isolated by measuring the absorbance of specific molecular vibrations. For instance, the relative α helix protein and lipid chain contents, obtained from the absorbance intensities at 1656 and 1468 cm-1 respectively, give rise to the contrasted IR images reported in Figures 6 B and C. The ratio between the absorbances at these two wavenumbers is reported in Figure 6 D. The histomorphological features are preserved in IR spectral images, allowing the comparison between the H&E gold standard histological characterization (see Figure 5 A) and IR spectroscopy.
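Chemical group mapping reduces each spectrum of the hypercube to one number per pixel. The sketch below builds a 1656/1468 cm-1 ratio image; the function names and the data layout (`cube[row][col]` holding a list of absorbances, or `None` for a pixel rejected by the quality tests) are assumptions made for the illustration, and the absorbance is simply read at the data point nearest to each target wavenumber.

```python
def absorbance_at(wavenumbers, spectrum, target):
    """Absorbance at the data point closest to the `target` wavenumber."""
    index = min(range(len(wavenumbers)),
                key=lambda k: abs(wavenumbers[k] - target))
    return spectrum[index]


def ratio_map(wavenumbers, cube, numerator=1656.0, denominator=1468.0):
    """Two-dimensional image of the absorbance ratio numerator/denominator.

    Pixels whose spectrum is None (rejected) or whose denominator is zero
    are reported as None so that they can be colour coded separately."""
    image = []
    for row in cube:
        image_row = []
        for spectrum in row:
            if spectrum is None:
                image_row.append(None)
                continue
            low = absorbance_at(wavenumbers, spectrum, denominator)
            high = absorbance_at(wavenumbers, spectrum, numerator)
            image_row.append(high / low if low else None)
        image.append(image_row)
    return image


# Tiny synthetic hypercube: 2 x 2 pixels, 3 spectral points (placeholder values).
axis = [1468.0, 1560.0, 1656.0]
cube = [[[1.0, 0.5, 2.0], [2.0, 0.1, 1.0]],
        [None, [4.0, 0.2, 1.0]]]
print(ratio_map(axis, cube))  # [[2.0, 0.5], [None, 0.25]]
```

Passing the same wavenumber twice to `absorbance_at` alone gives the single-band images; the ratio image corresponds to panel D of Figure 6.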

Figure 6. Infrared spectroscopic univariate characterization of a breast tissue sample. A. Photomicrograph of a 3 µm thick deparaffinized unstained tissue section, adjacent to that of Figure 5 A, mounted on BaF2 and used for IR spectroscopy. The red outline delimits the 4x3 IR mapping. Every square represents one IR image of 184 x 184 µm, with a pixel size of 5.8 x 5.8 µm (after 2x2 binning). B&C. IR images obtained by chemical group mapping. The image contrast is provided by the relative α-helix protein and lipid chain contents obtained from peak intensities at 1656 cm-1 (B) and 1468 cm-1 (C). D. IR image representing the ratio between α-helix protein and lipid chain contents in the breast sample. The image contrast is provided by the 1656/1468 cm-1 absorbance ratio. Absorbance and ratio intensities are calculated on preprocessed spectra and indicated by the colour bar. Spectra with an S/N ratio below the threshold (S/N < 400) are colour coded in black; spectra whose minimum absorbance in the 1800-1400 cm-1 range is below -10 or whose maximum absorbance is above 120 are colour coded in light and dark grey respectively.

[Figure 6: panels A-D; scale bar 100 µm; colour bar from Low to High]

The different views presented in Figure 6 highlight the different types of cellular species present in the tissue slice. The ductal epithelium surrounding the duct appears in red in Figure 6B (relatively high α-helical protein content) and in blue in Figure 6C (relatively low CH2 content). Combining these two observations in Figure 6D, which reports the ratio of the absorbances at 1656 and 1468 cm-1, results in a particularly sharp highlight of the ductal epithelium. On the one hand, this observation indicates that automated cell identification can be achieved. On the other hand, the fact that a combination of the absorbances at two wavenumbers is necessary to achieve clear-cut discrimination from the connective tissue and other cell species indicates that multivariate approaches might be much more powerful. This can be understood because different histological classes, such as epithelium, fibrous stroma and lymphocytes, contain the same chemical components, so that their characteristic spectra are quite similar. As it only quantifies chemical contents by peak absorbance intensities at specific wavenumbers, chemical group mapping does not take into account band shapes and spectral peak shifts. Yet these features carry information on molecular structural variations. Therefore, multivariate methods need to be used in order to describe the entire structural information present in a tissue sample.

Clustering analysis is a completely unsupervised technique commonly used for pattern recognition. The aim of the clustering process is to group into the same cluster IR spectra from histological regions that display similar spectral characteristics, and into different clusters IR spectra exhibiting different spectral features. The process minimizes the intra-cluster variance and maximizes the inter-cluster variance. The same colour is assigned to all spectra in a particular cluster. Depending on the number of clusters chosen for the analysis, different clustering structures can be observed. In hierarchical clustering, the only user input is the level at which the calculated dendrogram needs to be cut in order to obtain the clustering structure best correlated to morphological interpretations.

K-means clustering is a non-hierarchical process in which an individual spectrum can belong to only one cluster, so that the class membership of a spectrum can only take the values 0 or 1. The clustering process aims at partitioning n observations into k clusters. Every cluster in the partition is defined by its member spectra and by its centroid, the point for which the sum of the distances to all the spectra in that cluster is minimal. An iterative algorithm is used to minimize, over all clusters, the sum of the distances from each spectrum to its cluster centroid.

A p-dimensional space is created in which the IR spectra are defined as points, p being the number of data points in each spectrum. In this space, k points (k being the number of clusters defined by the user) are chosen randomly, each being the origin of a future cluster. After computation of the distances between all the spectra and these k points, the algorithm assigns each spectrum to a cluster, based on the minimal distance. After the centroids of the clusters generated in the previous step have been calculated, the distances between spectra and centroids are recalculated. If a spectrum is not associated with its nearest centroid, the iterative algorithm reorganises the clusters so that every spectrum belongs to the cluster with the nearest centroid. Every time a spectrum changes its cluster membership, the centroid positions are recalculated, until none of the spectra needs to be reassigned to another cluster. At the end of the iterative clustering process, the data are organised so that spectra within a cluster present spectral features as similar as possible and spectra in different clusters are as dissimilar as possible. Sample regions with similar spectra are identically coloured, yielding a pseudo-colour map which can be compared to the adjacent H&E stained section.
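The iterative procedure described above can be sketched as follows. This is a minimal illustration in plain Python, under stated assumptions: it uses the batch (Lloyd) variant, in which all centroids are recalculated after each full reassignment pass rather than after every single membership change, it uses squared Euclidean distances, and it draws the k starting points among the spectra themselves. The function names and the toy three-point "spectra" are invented for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two spectra of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(spectrum, centroids):
    """Index of the centroid closest to the spectrum."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(spectrum, centroids[j]))


def mean_spectrum(spectra):
    """Point-by-point mean of a non-empty list of spectra."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def kmeans(spectra, k, seed=0, max_iter=100):
    """Partition spectra into k clusters; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Random initialisation: k distinct spectra serve as starting centroids.
    centroids = [list(s) for s in rng.sample(spectra, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(s, centroids) for s in spectra]
        if new_labels == labels:      # no spectrum changed cluster: converged
            break
        labels = new_labels
        for j in range(k):
            members = [s for s, lab in zip(spectra, labels) if lab == j]
            if members:               # an empty cluster keeps its old centroid
                centroids[j] = mean_spectrum(members)
    return labels, centroids


# Toy data: two well-separated groups of 3-point "spectra" (made-up values).
spectra = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [1.1, 1.0, 0.1],
           [0.1, 0.2, 1.0], [0.2, 0.1, 0.9], [0.1, 0.1, 1.1]]
labels, centroids = kmeans(spectra, k=2, seed=0)
```

Mapping each label back to the x,y position of its pixel and assigning one colour per label gives the pseudo-colour cluster map; the returned centroids correspond to the mean spectrum of each cluster.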

Figure 7 B depicts a three-cluster pseudo-colour map computed on the 1800–1000 cm-1 spectral region. A clear correlation is observed between the histological structures seen by classical light microscopy and the spectral clusters on the reconstructed pseudo-colour map.

[Figure 7 C axes: wavenumber (cm-1), 1800 to 1000; Absorbance]

Figure 7. A. Photomicrograph of a 3 µm thin Haematoxylin and Eosin (H&E) stained breast tissue section (734 x 550 µm). B. K-means cluster map acquired from infrared data of the adjacent unstained tissue section. Each colour is associated with a particular cluster. Distances for the clustering were computed across the 1800-1000 cm-1 range. C. Cluster centroids extracted from Figure 7B. Colours are identically assigned in B&C.

Importantly, the final result depends on the initial selection of the k points. Consequently, the entire process must be repeated with different initial points to assess its statistical validity. On the other hand, straight hierarchical clustering provides a unique result but is much less efficient when working with large data sets.
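One possible way to compare the cluster maps obtained from repeated runs is to measure how often two runs agree on whether a pair of spectra belongs to the same cluster. The sketch below computes the Rand index for that purpose; this particular measure is an assumption chosen for illustration and is not stated in the text, and the label lists are made-up.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of spectrum pairs on which two partitions agree.

    A pair agrees when both runs put the two spectra in the same cluster,
    or both put them in different clusters. Cluster numbering is irrelevant.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both runs must label the same spectra")
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [2, 2, 0, 0, 1, 1]   # same partition, clusters numbered differently
run_3 = [0, 0, 0, 1, 2, 2]   # one spectrum assigned differently

same = rand_index(run_1, run_2)
changed = rand_index(run_1, run_3)
```

A value of 1 means that two runs produced the same partition. The pairwise loop grows quadratically with the number of spectra, so for a full image a formulation based on a contingency table of the two label sets would be preferable.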

One of the problems of FPA micro-spectroscopy is the amount of spectral data to be handled. For imaging the sample region presented in Figures 5, 6 and 7, even though a 2x2 binning was applied, a total of 12,288 spectra were acquired and need to be preprocessed and analyzed. Each pixel in an IR measurement corresponds to a full infrared spectrum defined by approximately 1000 biologically relevant wavenumbers. The number of variables to be handled by statistical processes thus rises to 12,288,000. Principal Component Analysis (PCA) is one of the most useful tools to reduce the number of variables. As described elsewhere [42, 43], this unsupervised multivariate method enables a variable reduction by building linear combinations of wavenumbers that vary together. The first linear combination, known as the first principal component (PC), explains most of the data variance. The second principal component, uncorrelated to the first one, accounts for most of the residual variance. The subsequent components are calculated similarly and each explains part of the remaining variance. Usually, about 90% of the data variance can be explained by the first five principal components.
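The variable reduction described above can be illustrated on a toy data set. The sketch below is a plain Python illustration under stated assumptions: the principal components are taken as eigenvectors of the sample covariance matrix of the mean-centred spectra, extracted one at a time by power iteration with deflation. The function names and the four made-up three-point "spectra" are invented for the example; for real spectra with about 1000 points, a linear algebra library would normally be used instead.

```python
def center(spectra):
    """Subtract the mean spectrum from every spectrum."""
    n = len(spectra)
    mean = [sum(col) / n for col in zip(*spectra)]
    return [[v - m for v, m in zip(s, mean)] for s in spectra]


def covariance(centered):
    """Sample covariance matrix (p x p) of mean-centred spectra."""
    n, p = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(p)] for i in range(p)]


def leading_component(cov, n_iter=500):
    """Dominant eigenvector and eigenvalue of cov by power iteration."""
    p = len(cov)
    v = [1.0 / p ** 0.5] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue


def pca(spectra, n_components):
    """Return (components, explained_ratio, scores) for the leading PCs."""
    centered = center(spectra)
    cov = covariance(centered)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    components, ratios, scores = [], [], []
    for _ in range(n_components):
        v, eigenvalue = leading_component(cov)
        components.append(v)
        ratios.append(eigenvalue / total_variance)
        # PC weight (score) of each spectrum: projection onto the component.
        scores.append([sum(x * c for x, c in zip(s, v)) for s in centered])
        # Deflation: remove the variance already explained by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    return components, ratios, scores


# Toy data: variance 8/3 along the first axis, 2/3 along the second.
spectra = [[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
components, ratios, scores = pca(spectra, n_components=2)
```

In this toy case the first component carries 80% of the variance and the second 20%. The per-spectrum scores play the role of the PC weights that are mapped pixel by pixel, and each entry of `components` is the spectrum-like profile of one principal component.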

Figure 8 depicts two PCA pseudo-colour maps computed on the 1800–1000 cm-1 spectral region. Each pixel of the IR image is characterized by its PC weight. The pseudo-colour maps are the projections of the PC1 and PC2 weights throughout the sample. As already observed for the K-means clustering, this unsupervised imaging approach has the potential to highlight the histological structures without any staining.

Figure 8. PCA of the breast tissue sample. A&B. PCA pseudo-colour maps for PC1 and PC2 respectively. The image contrast is provided by the distribution of the weights of the first two principal components throughout the sample. C. Spectra of the first five principal components in the 1800-1000 cm-1 spectral region. Their contributions to the total variance are indicated in % in the right margin. The spectra are displayed with an offset for better readability.

[Figure 8: panels A-C; scale bar 100 µm; colour bar from High to Low. Panel C axes: wavenumber (cm-1), 1800 to 1000; Absorbance (A.U.). Contributions to the total variance: PC1 28.8%, PC2 23.7%, PC3 22.8%, PC4 11.1%, PC5 2.5%]

4. Future directions

IR micro-spectroscopy, which provides spatially resolved chemical information on the structural composition of the sample, is a key technology for automating tissue evaluation procedures. An objective analysis of the progression and diagnosis of a tissue disease will benefit the medical community by limiting inter- and intra-observer variability in prognosis and diagnosis decisions.

The success of the IR imaging approach lies in the development and validation of spectral databases, whose reliability depends on the quality of the spectra compiled. Spectral preprocessing as proposed in this paper is an essential step for the development and application of classification models in cancer. The construction of robust algorithms translating spectral data into histopathological information helpful for the management of the disease is another challenge. Unsupervised methods are better suited to exploring the disease characteristics and discovering correlations with clinical parameters; some of these have been discussed in this paper. Once interesting correlations have been found, transferring bench knowledge into a practical tool for cancer management decisions requires a supervised approach, from construction to validation of a histopathological recognition classifier. The validation step implies splitting the database into subsets: one for training, a second for internal validation and a completely independent subset for external validation. Optimization of the classification model requires the selection of the appropriate spectral features in the entire spectral database [14]. The sensitivity, accuracy and specificity of the diagnostic tool developed will be assessed during the external validation. The major challenge is the creation of spectral databases that can be used to build classification models. Procedures for the training and validation of IR-based discrimination protocols are reported in the literature, for example in references [13, 14, 21].
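The splitting and assessment steps mentioned above can be sketched as follows. This is a hedged illustration in plain Python, not a procedure taken from the cited references: the 60/20/20 proportions, the choice to split by tissue sample rather than by individual spectrum (so that spectra from one section do not end up in several subsets), the function names and the toy labels are all assumptions of the example.

```python
import random


def split_three_ways(sample_ids, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle sample identifiers and split them into training,
    internal validation and external validation subsets."""
    shuffled = list(sample_ids)
    random.Random(seed).shuffle(shuffled)
    n_train = round(fractions[0] * len(shuffled))
    n_internal = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_internal],
            shuffled[n_train + n_internal:])


def diagnostic_metrics(truth, predicted, positive="tumour"):
    """Sensitivity, specificity and accuracy of predicted labels.

    Assumes that both classes are present in `truth`; otherwise one of
    the ratios is undefined and a ZeroDivisionError is raised.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Ten hypothetical tissue samples split into the three subsets.
training, internal, external = split_three_ways(range(10))

# Made-up reference diagnoses and classifier outputs for ten spectra.
truth = ["tumour"] * 4 + ["healthy"] * 6
predicted = ["tumour", "tumour", "tumour", "healthy",
             "healthy", "healthy", "healthy", "healthy", "healthy", "tumour"]
metrics = diagnostic_metrics(truth, predicted)
```

In such a scheme, the metrics that matter for the final assessment are those computed on the external subset, which plays no part in building or tuning the classifier.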

5. Summary

Recent multi-channel detector technology allows several thousand spectra to be acquired in only a few minutes. One of the challenges of IR micro-spectroscopy is to handle this enormous quantity of data. The strength of IR spectroscopy relies on its potential to detect small differences between spectra, which makes it necessary to correct the spectral data for environmental artefacts. In this paper, the importance of the recording conditions and of the preprocessing of the spectra has been shown for the construction of reliable databases. We also discussed the application of some filters to eliminate contaminated spectra from the statistical analysis, and the notion of spatial resolution, which proves to be crucial in IR micro-spectroscopy. Strong evidence of the potential of IR spectroscopy for tissue discrimination is provided on an FFPE breast tissue sample. The unsupervised statistical approaches presented here lead to false colour maps, which can be correlated to morphological interpretations on an adjacent stained tissue section. Importantly, we show that the identification of cell types can be achieved on formalin-fixed paraffin-embedded tissues, giving access to an enormous bank of information and paving the way to retrospective studies.

References

[1] L.P. Choo-Smith, K. Maquelin, T. Van Vreeswijk, H.A. Bruining, G.J. Puppels, N.A.G. Thi, C. Kirschner, D. Naumann, D. Ami, A.M. Villa, F. Orsini, S.M. Doglia, H. Lamfarraj, G.D. Sockalingum, M. Manfait, P. Allouch and H.P. Endtz, Investigating microbial (micro)colony heterogeneity by vibrational spectroscopy, Appl. Environ. Microbiol. 67 (2001), 1461-9.

[2] C. Kirschner, K. Maquelin, P. Pina, N.A.G. Thi, L.P. Choo-Smith, G.D. Sockalingum, C. Sandt, D. Ami, F. Orsini, S.M. Doglia, P. Allouch, M. Manfait, G.J. Puppels and D. Naumann, Classification and identification of enterococci: a comparative phenotypic, genotypic and vibrational spectroscopic study, J. Clin. Microbiol. 39 (2001), 1763-70.

[3] D. Naumann, D. Helm and H. Labischinski, Microbiological characterizations by FT-IR spectroscopy, Nature 351 (1991), 81-2.

[4] D.C. Malins, N.L. Polissar and S.J. Gunselman, Models of DNA structure achieve almost perfect discrimination between normal prostate, benign prostatic hyperplasia (BPH), and adenocarcinoma and have a high potential for predicting BPH and prostate cancer, Proc. Natl. Acad. Sci. (USA) 94 (1997), 256-64.

[5] B. Rigas and P.T.T. Wong, Human colon adenocarcinoma cell lines display infrared spectroscopic features of malignant colon tissues, Cancer Res. 52 (1992), 84-88.

[6] B. Rigas, S. Morgello, I.S. Goldman and P.T.T. Wong, Human colorectal cancers display abnormal Fourier-transform infrared spectra, Proc. Natl. Acad. Sci. (USA) 87 (1990), 8140-8144.

[7] A. Gaigneaux, C. Decaestecker, I. Camby, T. Mijatovic, R. Kiss, J.M. Ruysschaert and E. Goormaghtigh, The infrared spectrum of human glioma cells is related to their in vitro and in vivo behaviour, Exp. Cell Res. 297 (2004), 294-301.

[8] D.M. Haaland, D.T.J. Howland and E.V. Thomas, Multivariate classification of the infrared spectra of cell and tissue samples, Appl. Spectrosc. 51 (1997), 340-345.

[9] L.M. McIntosh, M. Jackson, H.H. Mantsch, M.F. Stranc, D. Pilavdzic and A.N. Crowson, Infrared spectra of basal cell carcinomas are distinct from non-tumor-bearing skin components, J. Invest. Dermatol. 112 (1999), 951-956.

[10] I.W. Levin and R. Bhargava, Fourier Transform Infrared Vibrational Spectroscopic Imaging: Integrating Microscopy and Molecular recognition, Annu. Rev. Phys. Chem. 56 (2005), 429-74.

[11] M. Diem, M. Romeo, S. Boydston-White, M. Miljkovic and C. Matthäus, A decade of vibrational micro-spectroscopy of human cells and tissue (1994-2004), The Analyst 129 (2004), 880-5.

[12] D.C. Fernandez, R. Bhargava, S.M. Hewitt and I.W. Levin, Infrared spectroscopic imaging for histopathologic recognition, Nature Biotechnology 23 (4) (2005), 469-74.

[13] R. Bhargava, Towards a practical Fourier transform infrared chemical imaging protocol for cancer histopathology, Anal. Bioanal. Chem. 389 (2007), 1155-69.

[14] P. Lasch, M. Diem, W. Hänsch and D. Naumann, Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging, J. Chemometrics 20 (2006), 209-20.

[15] P. Garidel and M. Boese, Mid infrared microspectroscopic mapping and imaging: a bio-analytical tool for spatially and chemically resolved tissue characterization and evaluation of drug permeation within tissues, Microsc Res Tech 70 (4) (2007), 336-49.

[16] R. Wolthuis, A. Travo, C. Nicolet, A. Neuville, M-P. Gaub, D. Guenot, E. Ly, M. Manfait, P. Jeannesson and O. Piot, IR spectral imaging for histopathological characterization of xenografted human colon carcinomas, Anal. Chem. 80 (2008), 8461-9.

[17] A. Boskey and N.P. Camacho, FT-IR imaging of native and tissue-engineered bone and cartilage, Biomaterials 28 (15) (2007), 2465-78.

[18] H. Fabian, N.A.N. Thi, M. Eiden, P. Lasch, J. Schmitt and D. Naumann, Diagnosing benign and malignant lesions in breast tissue sections by using IR microspectroscopy, Biochim. Biophys. Acta 1758 (2006), 874-82.

[19] H. Fabian, P. Lasch, M. Boese and W. Haensch, Mid-IR microspectroscopic imaging of breast tumor tissue sections, Biopolymers (Biospectroscopy) 67 (2002), 354-7.

[20] H. Fabian, P. Lasch, M. Boese and W. Haensch, Infrared microspectroscopic imaging of benign breast tumor tissue sections, J. Mol. Struct. 661-662 (2003), 411-7.

[21] B. Bird, M. Miljkovic, M.J. Romeo, J. Smith, N. Stone, M.W. George and M. Diem, Infrared micro-spectral imaging: distinction of tissue types in axillary lymph node histology, BMC Clinical Pathology 8 (2008), 8.

[22] P. Lasch, W. Haensch, D. Naumann and M. Diem, Imaging of colorectal adenocarcinoma using FT-IR microspectroscopy and cluster analysis, Biochim.Biophys.Acta 1688 (2004), 176-86.

[23] W. Steller, J. Einenkel, L-C. Horn, U-D. Braumann, H. Binder, R. Salzer and C. Krafft, Delimitation of squamous cell cervical carcinoma using infrared microspectroscopic imaging, Anal Bioanal Chem 384 (2006), 145-54.

[24] A. Bénard, C. Desmedt, V. Durbecq, G. Rouas, D. Larsimont, C. Sotiriou and E. Goormaghtigh, Discrimination between healthy and tumor tissues on formalin-fixed paraffin-embedded breast cancer samples using IR imaging, Spectroscopy 24 (2010), 67-72.

[25] J.P. Brettes, C. Mathelin, B. Gairard and J.P. Bellocq, Cancer du sein, Masson, Issy-Moulineaux, France, 2007.

[26] P. Robbins, S. Pinder, N. de Klerk, H. Dawkins, J. Harvey, G. Sterrett, I. Ellis and C. Elston, Histological grading of breast carcinomas: a study of interobserver agreement, Hum Pathol 26(8) (1995), 873-9.

[27] F. Theissig, K.D. Kunze, G. Haroske and W. Meyer, Histological grading of breast cancer. Interobserver, reproducibility and prognostic significance, Pathol Res Pract 186(6) (1990), 732-6.

[28] J.S. Meyer, C. Alvarez, C. Milikowski, N. Olson, I. Russo, J. Russo, A. Glass, B.A. Zehnbauer, K. Lister and R. Parwaresch, Breast carcinoma malignancy grading by Bloom-Richardson system vs proliferation index: reproducibility of grade and advantages of proliferation index, Modern Pathology 18 (2005), 1067-78.

[29] E. Ly, O. Piot, R. Wolthuis, A. Durlach, P. Bernard and M. Manfait, Combination of FTIR spectral imaging and chemometrics for tumour detection from paraffin-embedded biopsies, The Analyst 133 (2008), 197-205.

[30] E. Goormaghtigh, V. Cabiaux and J.M. Ruysschaert, Determination of soluble and membrane protein structure by Fourier transform infrared spectroscopy. II. Experimental aspects, side chain structure and H/D exchange, Subcell. Biochem. 23 (1994), 363-403.

[31] E. Goormaghtigh, FTIR data processing and Analysis Tools, Advances in Biomedical Spectroscopy, Vol. 2, A. Barth and P.I. Haris, eds. IOS Press, Amsterdam, 2009, pp. 104-128.

[32] P. Lasch and D. Naumann, Spatial resolution in infrared microspectroscopic imaging of tissues, Biochim. Biophys. Acta 1758 (2006), 814-29.

[33] E. Goormaghtigh and J.M. Ruysschaert, Subtraction of atmospheric water contribution in Fourier transform infrared spectroscopy of biological membranes and proteins, Spectrochim. Acta 50A (1994), 2137-44.

[34] K.A. Oberg, J.M. Ruysschaert and E. Goormaghtigh, The optimization of protein secondary structure determination with infrared and circular dichroism spectra, Eur. J. Biochem. 271 (2004), 2937-48.

[35] M. Diem, L. Chiriboga and H. Yee, Infrared spectroscopy of human cells and tissue. VIII. Strategies for analysis of infrared tissue mapping data and applications to liver tissue, Biopolymers (Biospectroscopy) 57 (2000), 282-90.

[36] A. Gaigneaux, Bio-engineering thesis, Determination of diagnostic and prognostic markers in varied tumoral pathologies by ATR-FTIR spectroscopy (2004).

[37] M.J. Romeo and M. Diem, Infrared spectral imaging of lymph nodes: Strategies for analysis and artefact reduction, Vibrational Spectroscopy 38 (1-2) (2005), 115-9.

[38] M.J Romeo and M. Diem, Correction of dispersive line shape artefact observed in diffuse reflection infrared spectroscopy and absorption/reflection (transflection) infrared micro-spectroscopy, Vibrational Spectroscopy 38 (2005), 129-32.

[39] P. Bassan, H.J. Byrne, F. Bonnier, J. Lee, P. Dumas and P. Gardner, Resonant Mie scattering in infrared spectroscopy of biological materials – understanding the dispersion artefact, The Analyst 134 (2009), 1586-93.

[40] P. Bassan, H.J. Byrne, J. Lee, F. Bonnier, C. Clarke, P. Dumas, E. Gazi, M.D. Brown, N.W. Clarke and P. Gardner, Reflection contributions to the dispersion artefact in FTIR spectra of single biological cells, The Analyst 134 (2009), 1171-5.

[41] B. Bird, M. Miljkovic and M. Diem, Two step resonant Mie scattering correction of infrared micro-spectral data: human lymph node tissue, J. Biophoton. 3 (8-9) (2010), 597-608.

[42] A. Gaigneaux, J.M. Ruysschaert and E. Goormaghtigh, Infrared spectroscopy as a tool for discrimination between sensitive and multiresistant K562 cells, Eur. J. Biochem. 269 (2002), 1968-73.

[43] R. Gasper, J. Dewelle, R. Kiss, T. Mijatovic and E. Goormaghtigh, IR spectroscopy as a new tool for evidencing antitumor drug signatures, Biochim. Biophys. Acta 1788 (6) (2009), 1263-70.