The effect of system geometry and dose on the threshold detectable calcification diameter in 2D-mammography and digital breast tomosynthesis

SHORT TITLE

Threshold detectable calcification diameter in 2D-mammography and DBT

AUTHORS

Andria Hadjipanteli 1*, Premkumar Elangovan 2, Alistair Mackenzie 1, Padraig T Looney 1, Kevin Wells 2, David R Dance 1,3, Kenneth C Young 1,3

1 National Coordinating Centre for the Physics of Mammography, Royal Surrey County Hospital, Guildford, Surrey, UK.

2 Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford, UK.

3 Department of Physics, University of Surrey, Guildford, UK.

*Corresponding author: [email protected].

ABSTRACT

Digital breast tomosynthesis (DBT) is under consideration to replace or to be used in combination

with 2D-mammography in breast screening in the United Kingdom. Different DBT geometries should

be compared with each other and with 2D-mammography in terms of their detectability of cancer. The

effect of system dose on cancer detectability should also be investigated. The aim of this study was

the comparison of the detection of one type of breast cancer (microcalcification clusters) by human

observers in breast images using 2D-mammography, narrow angle (15/15 projections) and wide angle

(50/25 projections) DBT. The effect of imaging geometry on calcification detection was tested for

different positions of the microcalcification cluster in the breast. The effect of three dose levels on

calcification detection was also studied. Simulated images of 6 cm thick compressed breasts were

produced with and without microcalcification clusters inserted, using a set of image modelling tools

for 2D-mammography and DBT. Image processing and reconstruction were performed using

commercial software. A series of 4-alternative forced choice (4AFC) experiments was conducted for

signal detection with the microcalcification clusters as targets. Threshold detectable calcification

diameter was found for each imaging modality with standard dose: 2D-mammography (164±5 μm),

narrow angle DBT (210±5 μm) and wide angle DBT (255±4 μm). Statistically significant differences

were found when using different doses, but different geometries had a greater effect. No differences

were found between the threshold detectable calcification diameters when at different heights in the

breast. The 4AFC results were correlated with threshold diameters obtained using the CDMAM test

phantom at different doses, showing that the CDMAM measurements provide a measure of

performance, which is relevant to calcification detection.

1. Introduction and background

Digital breast tomosynthesis (DBT) involves the acquisition of two-dimensional X-ray projections

over a limited angular range and their reconstruction to image planes parallel to the detector

(Sechopoulos et al 2013a, Sechopoulos et al 2013b). DBT is currently under consideration and study

for use either in combination with 2D-mammography or as its replacement in breast cancer screening in the United

Kingdom. For DBT to be combined with 2D-mammography in breast screening, the

additional dose due to DBT would have to be justified in terms of mortality and morbidity. For DBT to replace

2D-mammography in screening, it would have to provide at least the same detectability of cancer

lesions as 2D-mammography, at similar dose levels.

The detectability of a lesion in the breast using X-ray imaging depends on the image acquisition

methods, dose, image processing, reconstruction algorithms, and the physical properties of the breast and

of the lesion itself. It has already been shown that DBT increases the detectability of masses in the breast

and reduces recalls, when used in combination with 2D-mammography (Rafferty et al 2013). Also,

DBT alone was shown to perform better in mass detection than 2D-mammography (Elangovan et al

2015). Masses do not have a high contrast in comparison with the healthy breast tissue when

2D-projected, and can thus be obscured by superimposed surrounding structures. DBT separates overlying

structures into planes thus decreasing the effect of masses being superimposed on healthy breast

tissue. Microcalcifications have a higher contrast in comparison with the healthy breast tissue, but due

to their small size (average diameter of 0.26 mm for real calcifications; Warren et al 2014),

their visualisation can be highly dependent on the resolution of the imaging system. Some

studies have shown that the detectability of microcalcifications with DBT is slightly lower than with

2D-mammography, whereas others have claimed that the converse is true (Spangler et al 2011,

Kopans et al 2011). It is still unclear whether the detectability of microcalcification clusters in DBT

can be as high as that in 2D-mammography.

Image acquisition parameters that are expected to affect the detectability of lesions in DBT include

the tomographic scan angular range and the number of projections (which form part of the system

geometry) and the breast dose. By increasing the DBT scan angular range, depth resolution increases

(Hu et al 2008). However, by increasing the angular range the movement required by the X-ray tube

may increase and, unless the system is step-and-shoot (Shaheen et al 2011), blurring can be

introduced due to tube motion and the oblique incidence angle (Mainprize et al 2006). Wider scan

angles need more projections for an adequate sampling of image data and fewer tomosynthesis

reconstruction artefacts. However, at a fixed dose, the relative quantum noise in each

projection image will increase as the number of projections increases. In combination with insufficient angular

sampling, the quantum noise can have an effect on the detectability of small-scale signals (Reiser et al

2010). Also, electronic noise may become more dominant. Sechopoulos et al have shown that

increasing the number of projections decreases the contrast to noise ratio for microcalcification-like

objects (Sechopoulos et al 2009).

An optimum solution should be sought where the angle and number of projections provide a

combination of low blurring and low relative noise, for the highest possible detectability of

microcalcifications, at a particular dose level. The establishment of the optimum combination of these

variables for DBT is complicated, due to the large number of variables. The optimum angle and

number of projections for microcalcification detection reported by different studies vary and

seem to depend on the study methods, system characteristics (noise levels, blurring levels),

dose, imaging conditions, reconstruction methods used, imaging parameters investigated and

performance metrics employed (Reiser et al 2009, Sechopoulos et al 2009, Tucker et al 2013, Chan et

al 2014, Peterson et al 2015).

Decreasing dose for a fixed geometry increases the relative noise in an image, creating mainly

quantum-noise-limited images, and leads to a decrease in the detectability of microcalcifications.

Previous studies that have investigated the effect of dose on cluster detection include, for 2D-

mammography, Warren et al (2012) and Samei et al (2013) and for DBT Timberg et al (2015). All

three studies have shown a decrease in diagnostic performance with decreasing dose.

For the investigation of the effect of factors like system geometry and dose on the detectability of

lesions, the methods to be used require settings as similar as possible to the clinical case. Ideally, a

direct comparison of different systems in a clinical environment would be used. However, this can be

time consuming and expensive. Furthermore, when using high doses, real images would be ethically

difficult or impossible to acquire. In this study we use simulation methods to study the detectability of

microcalcification clusters by 2D-mammography and DBT. To the best of our knowledge, no study to

date has made a quantitative measurement of the threshold diameter required for microcalcification

detection, using high resolution, realistic images with observers, for the comparison of DBT

geometries with 2D-mammography. The methods used in this study (Elangovan et al 2014) satisfy

these requirements. They involve the realistic simulation of breast images with calcification clusters

and their use in 4-alternative forced choice (4-AFC) observer studies. The values of the threshold

detectable calcification diameter determined from the observer studies were then compared for each

modality and imaging conditions investigated. Using this approach we compared the performance of

2D-mammography, narrow angle DBT and wide angle DBT for microcalcification cluster detection.

We also studied the influence of the height of the cluster above the breast support and breast dose on

this detection task.

Finally, the threshold detectable calcification diameters from the observer studies were correlated with

the threshold diameters obtained at different dose levels using the CDMAM mammography test

object (Artinis Medical System, Zetten, The Netherlands), the standard European method of

measuring 2D mammographic image quality (van Engen et al 2003). It has already been shown that

the clinical effectiveness of four available 2D-mammography systems in detecting calcification

clusters is linked to image quality assessment using the CDMAM phantom (Mackenzie et al 2016).

The European protocol for the quality control of the physics and technical aspects of DBT (van Engen

et al 2015) recommends some limited use of the CDMAM test object in DBT, for example in assessing the stability of

image quality. However, it is currently unknown how the results obtained using the CDMAM test

object relate to the clinical performance of both 2D-mammography and DBT at different dose levels.

2. Methods and Materials

In this investigation the microcalcification detection performance of 2D-mammography and two DBT

systems has been compared using simulated images and a series of 4-AFC observer studies. The

simulation involved three stages: creation of voxel phantoms of the breast, creation and insertion of simulated

calcification clusters into the phantom, and calculation of images. These stages, together with the 4-AFC

methodology and analysis, and the correlation of the results with the CDMAM results, are described in

sections 2.1-2.6 below.

Two comparative performance studies were performed, as detailed in table 1. In arm 1 of the study the

effects of system geometry and cluster insertion height above the breast support on microcalcification

detection were tested, while keeping dose constant between the systems. In arm 2 the effect of dose

was investigated for 2D-mammography and narrow angle DBT (table 1), and the cluster insertion height was kept constant.

In all cases, the breast glandularity, breast thickness, cluster diameter and the processing and

reconstruction methods were kept constant.

Table 1: Details of variables and constants used in each study arm.

           Arm 1                                        Arm 2
Variables  Geometry           Insertion height (cm)     Geometry           Mean glandular dose (mGy)
           2D-mammography     1                         2D-mammography     1.25
           Narrow angle DBT   3                         Narrow angle DBT   2.50
           Wide angle DBT     5                                            5.00
Constants  Mean glandular dose (2.5 mGy)                Insertion height (3 cm)
           Breast glandularity                          Breast glandularity
           Breast thickness                             Breast thickness
           Cluster diameter                             Cluster diameter
           Processing and reconstruction methods        Processing and reconstruction methods

2.1.Mathematical breast phantom

Realistic mathematical breast phantoms were created using a method described by Elangovan et al

(2016). As discussed in that study, radiologists found it difficult to distinguish between image segments of

the phantom and those of real mammograms. A variety of breast tissue structures were first extracted

from reconstructed DBT planes of real patient images. The extracted structures were de-noised using

a series of morphological image operations. These structures were then scaled and inserted into an

empty breast phantom volume containing only adipose tissue. At the end of the simulation process,

each phantom was composed of five different tissue types: skin, glandular tissue, adipose, Cooper’s

ligaments and blood vessels. The phantoms had a voxel size of 100 μm × 100 μm × 100 μm.

Each breast phantom produced had a compressed breast thickness of 6 cm. The glandularity of each breast

phantom was set between 17% and 19% by volume. This glandularity was chosen as it matches the

average glandularity of 21% by mass in the central portion of the breast for women of age 50 to 64

(Dance et al 2000).
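
The relationship between glandularity by volume and by mass can be checked with a short calculation. The sketch below is illustrative only: it assumes a two-component breast (glandular and adipose tissue only, ignoring skin, ligaments and vessels) and assumes typical literature densities of 1.04 g/cm³ and 0.93 g/cm³, none of which are stated in the text. The function name is hypothetical.

```python
# Assumed tissue densities in g/cm^3 (typical literature values, NOT given in the text).
RHO_GLANDULAR = 1.04
RHO_ADIPOSE = 0.93


def glandularity_by_mass(volume_fraction,
                         rho_gland=RHO_GLANDULAR,
                         rho_adipose=RHO_ADIPOSE):
    """Convert a glandular volume fraction (0..1) into a mass fraction (0..1).

    Simplification: the breast is treated as glandular plus adipose tissue only.
    """
    if not 0.0 <= volume_fraction <= 1.0:
        raise ValueError("volume_fraction must lie between 0 and 1")
    gland_mass = volume_fraction * rho_gland
    adipose_mass = (1.0 - volume_fraction) * rho_adipose
    return gland_mass / (gland_mass + adipose_mass)


if __name__ == "__main__":
    for v in (0.17, 0.18, 0.19):
        print(f"{v:.0%} by volume -> {glandularity_by_mass(v):.1%} by mass")
```

Under these assumed densities, the upper end of the 17% to 19% volume range comes out close to the quoted 21% by mass.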

2.2. Simulated clusters

Simulated volumes of clusters composed of five microcalcifications were produced. The detection of

five microcalcifications in a cluster was regarded as a more realistic representation of the clinical task

than detecting a single microcalcification. One high resolution microcalcification image volume was

chosen from a database of 400 real microcalcification image volumes (breast biopsy samples), which

were acquired using a microcomputed tomography system (Shaheen et al 2011). The selected

microcalcification was chosen due to its approximately round shape.

Each calcification was assumed to be calcium oxalate, but with an attenuation coefficient set as the

product of the attenuation coefficient of calcium oxalate and the factor 0.84. This factor corrects for

the difference between the attenuation of calcium oxalate and that of real calcifications (Warren et al 2013).

The selected microcalcification was replicated five times, rotated to a different orientation each

time and randomly placed within a 2.5 × 2.5 × 2.5 mm³ cubic volume. This formed one cluster.

There was no overlap between any of the five microcalcifications in the planar projection of the

cluster. The above process was repeated to produce 15 different microcalcification clusters. The

clusters themselves were then rotated by 90°, 180° and 270°, resulting in 60 different clusters that

could be subsequently inserted into the simulated images. All clusters had the same volume (2.5 × 2.5 × 2.5 mm³);

therefore only the microcalcification size, and not the cluster size (the spread of the

calcifications), had an effect on the detectability.
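
The cluster construction described above can be sketched as a rejection-sampling procedure. This is a minimal illustration, not the authors' implementation: it assumes each calcification is a sphere of a single diameter (so the per-copy orientation is irrelevant here), that "no overlap in the planar projection" means the projected discs do not intersect in the x-y plane, and that the 90° rotations are about the cube's central z axis. The function names are hypothetical.

```python
import math
import random


def place_cluster(n_calcs=5, cube_mm=2.5, diameter_mm=0.25, rng=None, max_tries=10000):
    """Return n_calcs (x, y, z) centres in a cube with no overlap in the x-y projection."""
    rng = rng or random.Random()
    radius = diameter_mm / 2.0
    centres = []
    tries = 0
    while len(centres) < n_calcs:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place calcifications without overlap")
        # Keep the whole (assumed spherical) calcification inside the cube.
        candidate = tuple(rng.uniform(radius, cube_mm - radius) for _ in range(3))
        # Accept only if the projected discs do not intersect any accepted disc.
        if all(math.hypot(candidate[0] - c[0], candidate[1] - c[1]) >= diameter_mm
               for c in centres):
            centres.append(candidate)
    return centres


def rotate_quarter_turns(centres, cube_mm=2.5, quarter_turns=1):
    """Rotate centres about the cube's central z axis in 90 degree steps (assumed axis)."""
    rotated = list(centres)
    for _ in range(quarter_turns % 4):
        rotated = [(cube_mm - y, x, z) for (x, y, z) in rotated]
    return rotated


# 15 base clusters, each used at 0, 90, 180 and 270 degrees.
rng = random.Random(42)
base_clusters = [place_cluster(rng=rng) for _ in range(15)]
clusters = [rotate_quarter_turns(c, quarter_turns=k) for c in base_clusters for k in range(4)]
print(len(clusters))
```

Because every cluster occupies the same cube, changing `diameter_mm` alters only the calcification size and not the spread of the cluster, which mirrors the design choice stated in the text.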

The same 60 clusters were then regenerated with different calcification sizes. The volume of the

cluster was kept the same, while the microcalcifications were scaled to a series of diameters in the

range 110 μm to 275 μm (section 2.3). Figure 1 shows examples of the microcalcification clusters

produced, with two microcalcification diameters: (a) 125 μm and (b) 250 μm.

The microcalcification clusters were inserted into the breast phantoms by voxel replacement. Since

the voxel size of the phantoms was much larger than that of the microcalcification clusters, the

entire phantom was not super-sampled to the cluster resolution, in the interest of execution time and

memory. Instead, only a cubic region around the

insertion site was represented at high resolution by super sampling the background tissue voxels of

the phantom to accommodate the microcalcification clusters without loss of information.

The microcalcification clusters were inserted into the mathematical phantoms at three heights above

the breast support: 1 cm (arm 1), 3 cm (arms 1 and 2) and 5 cm (arm 1). They were positioned so that

a range of positions in the reconstructed image planes was simulated.

Figure 1. 2D projection images of clusters (2.5 × 2.5 × 2.5 mm³ cubic volume) with two different

microcalcification diameters before insertion: (a) 125 μm and (b) 250 μm.

2.3. Image Simulation

The image modelling tools developed and validated by Elangovan et al (2014) were used to calculate

simulated images of the breast phantom for 2D-mammography and DBT. Using these tools together with the

mathematical breast and cluster phantoms described above, simulated breast images with and without

inserted clusters were produced.

A clinically used detector made of amorphous selenium was simulated for both 2D-mammography

and DBT. The physical pixel pitch of the detector was set at 0.07 mm, but for narrow and wide angle

DBT pixel binning was performed before reconstruction (giving a pixel size of 0.14 mm). The 2D-

mammography geometry was based on a clinically existing geometry (Hologic Selenia Dimensions),

with a 0.04 × 0.04 mm² focal spot size and a source to detector distance of 70 cm.

The narrow angle DBT geometry tested used a 15/15 projections configuration, based on the existing

commercial DBT geometry of Hologic Selenia Dimensions. The wide angle DBT geometry tested

used a 50/25 projections configuration, also based on an existing commercial geometry (Siemens

Mammomat Inspiration). However, in this case only the angle, number of projections and source

movement blurring (see below) matched the commercial system. The rest of the system properties,

including the pixel size, detector characteristics and imaging conditions, matched those for narrow

angle DBT. The purpose of the first arm of the study was to test the effect of imaging geometry on

calcification detectability. It was beyond the scope of the study to compare clinically used systems.
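
To make the two DBT geometries concrete, the sketch below lists projection angles and the dose available per projection. It rests on several assumptions that the text does not state: that "15/15" and "50/25" denote angular range in degrees over number of projections, that the projections are evenly spaced and symmetric about the detector normal, and that the mean glandular dose is shared equally between projections. The function names are hypothetical.

```python
def projection_angles(angular_range_deg, n_projections):
    """Evenly spaced projection angles in degrees, symmetric about 0 (assumed layout)."""
    if n_projections < 2:
        raise ValueError("need at least two projections")
    step = angular_range_deg / (n_projections - 1)
    start = -angular_range_deg / 2.0
    return [start + i * step for i in range(n_projections)]


def dose_per_projection(total_mgd_mgy, n_projections):
    """Mean glandular dose per projection, assuming an equal split (assumption)."""
    if n_projections < 1:
        raise ValueError("need at least one projection")
    return total_mgd_mgy / n_projections


if __name__ == "__main__":
    for name, rng_deg, n in (("narrow angle", 15.0, 15), ("wide angle", 50.0, 25)):
        angles = projection_angles(rng_deg, n)
        print(name,
              f"step={angles[1] - angles[0]:.2f} deg",
              f"dose/projection={dose_per_projection(2.5, n):.3f} mGy")
```

At the same total dose, the wide angle configuration under these assumptions has a coarser angular step and less dose in each projection, which is the trade-off between depth resolution and relative quantum noise discussed in section 1.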

Both DBT systems had a “continuous” and not a “step-and-shoot” configuration. In the “continuous”

configuration the source movement introduces blurring, and this effect was incorporated by increasing

the focal spot size in the direction of movement in the simulation model. Based on physical

measurements of the tube rotation speed and the time of exposure for each projection, made on a

Hologic Selenia Dimensions system, for the exposure of a 6 cm thick average breast, the focal spot

size length (in the direction of tube movement) was found to be 0.14 mm for the narrow angle DBT

and 0.22 mm for wide angle DBT. The focal spot size width (in the direction perpendicular to the

movement of the tube) was set to 0.04 mm, again based on physical measurements. The focal spot

size width and length for 2D-mammography were both set to 0.04 mm. The focal spot size was set as

above for both arms of the study.

The kVp and target/filter materials used in the x-ray simulation were: (i) 2D-mammography: 31 kVp

W/Rh; (ii) DBT: 33 kVp W/Al; which are typical of those used clinically (Automatic Exposure

Control) for a 6 cm thick compressed breast on a Hologic Selenia Dimensions system. The primary

images/projections were produced using a ray tracing tool developed for 2D-mammography and DBT

(Elangovan et al 2014), which is based on the Siddon algorithm (Siddon et al 1984). The breast

phantom and the high resolution cube containing microcalcification clusters were ray traced

separately at 35 μm and 12 μm to 35 μm (depending on the size of the calcification) respectively, and

the images were stitched together to produce the final image. The spectra used (Boone et al 1997)

were attenuated by a thickness of aluminium chosen to match the calculated and measured half

value layers (HVL).

For each spectrum, the incident air kerma was calculated and the mean glandular dose (MGD) was

computed using data from Dance et al (2000) and Dance et al (2011). Then the spectrum was scaled

to achieve the required MGD in the simulations. When investigating the effect of geometry and the

height of lesion insertion on calcification detectability (Table 1, arm 1) the MGD was fixed at 2.5

mGy for all three modalities. This dose is used clinically for 2D-mammography and DBT for this

breast thickness (Bouwman et al 2015). When testing the effect of dose on calcification detectability

in 2D-mammography and narrow angle DBT (Table 1, arm 2) three MGD levels were used: 1.25

mGy, 2.5 mGy and 5 mGy.

The ray tracing simulation included transmission through the grid (in 2D-mammography only), as

calculated using Monte Carlo simulations, geometric blurring due to the finite focal spot size, and tube

movement blurring as discussed above. The attenuation in the breast support and compression paddle,

as appropriate to the three imaging modalities, was also taken into account. Breast movement was

ignored. Scatter was calculated using Monte Carlo simulations for 5 cm of PMMA (equivalent to a 6

cm thick breast) and added to the images. The methods of Mackenzie et al (2012, 2014) were used to

further blur and add noise to the images appropriate for the detector and radiation dose being

simulated. In each case, noise was added with the correct magnitude and colour appropriate for the

dose and beam quality simulated. Finally, the energy absorbed per unit area in the detector, as

calculated in the simulations, was converted into pixel values, based on detector response

measurements (signal transfer properties) made on a Hologic Selenia Dimensions system.

Briona (Real Time Tomography, LLC, Philadelphia, USA) software was used for the processing and

reconstruction of images. Briona uses a back projection reconstruction method with iterative and non-

linear processing techniques to mitigate noise and artifacts. Briona software was chosen as it provides

flexibility to the user in modelling different DBT geometry configurations. For the present study the

same reconstruction filters were used for the two DBT imaging modalities simulated. Quantitative

analysis (full width at half maximum, contrast-to-noise ratio (CNR), contrast degradation factor) and

qualitative tests on DBT images reconstructed using Briona showed that it produced images

comparable to those produced by reconstruction and processing software used clinically.

After processing and reconstruction, the 2D-mammography images and DBT planes were cropped

into 30 mm × 30 mm images to be used in the human observer experiments. In DBT, 10 planes were

used, centred on the centre of the cluster volume. In total, 1620 cropped images (60 images × 3

diameters × 3 modalities × 1 dose × 3 heights) with cluster were produced for the geometry study

(Table 1, arm 1) and 1080 cropped images (60 images × 3 diameters × 2 modalities × 3 doses × 1

height) for the dose study (Table 1, arm 2). The appropriate microcalcification diameters for each

modality and dose were chosen after a pilot study. Table 2 summarises the range of microcalcification

diameters produced for each study arm.
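
The image counts quoted above follow directly from the factorial design, as the short check below shows. The dictionary keys are illustrative labels, not names used by the authors.

```python
from math import prod

# Number of levels of each factor in the two study arms, as stated in the text.
arm1 = {"clusters": 60, "diameters": 3, "modalities": 3, "doses": 1, "heights": 3}
arm2 = {"clusters": 60, "diameters": 3, "modalities": 2, "doses": 3, "heights": 1}


def n_signal_images(design):
    """Number of cropped signal-present images for a fully crossed design."""
    return prod(design.values())


print(n_signal_images(arm1), n_signal_images(arm2))
```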

Table 2: The range of microcalcification diameters produced for each study arm.

Arm  Geometry          Insertion height (cm)  Dose (mGy)  Microcalcification diameter range (μm)
1    2D-mammography    1, 3, 5                2.50        125-175
1    Narrow angle DBT  1, 3, 5                2.50        175-225
1    Wide angle DBT    1, 3, 5                2.50        225-275
2    2D-mammography    3                      1.25        135-180
2    2D-mammography    3                      5.00        110-155
2    Narrow angle DBT  3                      1.25        185-245
2    Narrow angle DBT  3                      5.00        160-210

In addition, for arm 1, 768 background cropped images were produced without an inserted cluster for

each geometry. For arm 2, these images were scaled accordingly to match the three different dose

levels. In the 4AFC study (section 2.4), three of these images were selected randomly and shown to

the observer, together with an image that contained the cluster. For each 4AFC experiment, the

background images were chosen from the same phantom as the signal image to ensure that all four

image quadrants exhibited breast texture properties.
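
The assembly of a single 4AFC presentation, as described above, can be sketched as follows. This is a hypothetical illustration rather than the in-house software: images are represented by arbitrary Python objects, the background pool is assumed to come from the same phantom as the signal image, and the signal quadrant is assumed to be chosen uniformly at random.

```python
import random


def build_4afc_trial(signal_image, background_pool, rng=None):
    """Return (images, signal_index): four images, exactly one containing the cluster.

    background_pool should hold signal-absent crops from the same phantom
    as signal_image (the caller is responsible for that selection).
    """
    rng = rng or random.Random()
    if len(background_pool) < 3:
        raise ValueError("need at least three background images")
    images = rng.sample(background_pool, 3)   # three distinct signal-absent crops
    signal_index = rng.randrange(4)           # quadrant that will hold the signal
    images.insert(signal_index, signal_image)
    return images, signal_index


if __name__ == "__main__":
    pool = [f"background_{i}" for i in range(10)]
    trial, answer = build_4afc_trial("signal", pool, random.Random(1))
    print(trial, answer)
```

With four alternatives, an observer who is guessing is expected to be correct on 25% of trials, which sets the floor for the percentage correct values analysed later.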

2.4. Observation and 4AFC study

The images produced using the above methods were used in a series of 4AFC human observer

experiments. The main characteristic of a 4AFC study is the requirement for a “signal known exactly”

and “background known exactly” detection task (Burgess et al 1995). This can be achieved for

simulated images as the ground truth is known (i.e. the observer knows where the signals would be

located). A 4AFC experiment was performed for each imaging modality, breast dose and cluster

height simulated, and through this the threshold diameter for microcalcification detection in each case

was determined.

Five physicists participated as observers in the study. As this was a forced choice study that did not

include the effect of “searching”, non-radiologist observers were acceptable. All observers had

basic training (with at least 40 images for each modality) before undertaking the study. Sets of four

30 mm × 30 mm breast phantom 2D-mammography cropped images or DBT planes were randomly

selected and shown in turn to the observer. In each set one image contained a microcalcification

cluster in the centre and the other three did not. A circle appeared around the centre of all the cropped

images to better define where the signal would be. The images were presented using an in-house

graphical user interface (GUI) (figure 2). The observers had to decide which of the four images

contained the cluster and register their decision by selecting the relevant quadrant. A 2D projection of

the inserted 3D cluster (a reference copy of the signal, or “image cue”) was also shown without the

background breast, as required for an alternative forced choice, “signal known exactly” task.

Figure 2: Example image displayed for the 4AFC GUI showing four 2D-mammography image patches and the signal cue for a 175 μm microcalcification cluster. The red arrow has been added to the image to indicate which of the four images contains the cluster.

In the DBT 4AFC studies the observer could scroll through ten planes. This ensured a clinically

realistic task of observing the cluster through several planes. Ten planes were chosen as this number

included the cluster and its “shadow” spread and 2-3 extra planes without signal. If more planes

without signal had been shown there was the risk of the observer going through a “search” task, which

would not meet the requirements of a 4AFC study. The initial DBT plane displayed to the observer

was chosen at random and the DBT display wrapped around when the observer scrolled to the end of

the stack of planes.
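
The scrolling behaviour described above (random starting plane, wrap-around at either end of the stack) amounts to modular indexing. The class below is a hypothetical sketch of that logic, not the authors' GUI code.

```python
import random


class PlaneScroller:
    """Tracks which DBT plane is displayed, wrapping around at the stack ends."""

    def __init__(self, n_planes=10, rng=None):
        if n_planes < 1:
            raise ValueError("n_planes must be positive")
        rng = rng or random.Random()
        self.n_planes = n_planes
        self.index = rng.randrange(n_planes)   # random initial plane

    def scroll(self, step):
        """Move by step planes (negative scrolls backwards) and return the new index."""
        self.index = (self.index + step) % self.n_planes
        return self.index


if __name__ == "__main__":
    scroller = PlaneScroller(10, random.Random(0))
    print([scroller.scroll(1) for _ in range(12)])
```

Python's `%` operator returns a non-negative result for a positive modulus, so scrolling backwards past the first plane lands on the last one without special handling.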

In the geometry study (Table 1, arm 1), each observer was shown 180 groups of four images for each

modality. Within these 180 scenarios, there were 60 for each of the three different microcalcification

diameters. These were presented to the observer in a random order. This was repeated with the

cluster inserted at three different heights (1 cm, 3 cm and 5 cm) above the breast support. In

the dose study (Table 1, arm 2), the observers were shown 180 groups of four images (one with cluster and

three without) for each dose level (1.25 mGy, 2.5 mGy and 5 mGy) for 2D-mammography and

narrow angle DBT. In some cases, 240 groups of four images were shown to ensure that the percentage correct (PC) values were

within the range 30% to 98% for three different diameters.

All experiments were undertaken on a high-resolution reporting quality monitor (Barco, B-8500,

5MP, Belgium). All images were displayed at 100% magnification (one to one pixel between the

image and the monitor). No changes in magnification were allowed. Low lighting levels were used as

in clinical practice and no time limit was imposed.

2.5. Analysis

Based on the Rose Model (Burgess et al 1995, 1999), a linear relationship can be assumed between the

detail diameter (microcalcification diameter in this case) and detectability, if the experiment is

unbiased. The detectability can be expressed numerically in terms of the detectability index (d'), a

quantity that is related to the percentage correct (PC) for an observer (Macmillan and Creelman

2004).

In this study, following Timberg et al (2013) the threshold detectable calcification diameter was taken

as the size at which the observer makes 92.5% correct decisions. This value of PC corresponds to a d'

of 2.5. For each study arm and observer a linear least squares fit to the calcification diameter versus d'

was used to find the threshold detectable calcification diameter (at a d' of 2.5). The overall mean d' for

each modality, dose and cluster insertion height was then calculated. In addition, the threshold

microcalcification diameter for each of the three modalities, using the standard dose of 2.5 mGy and

with the cluster inserted at 3 cm above the breast support, at a PC of 62.5% was also calculated.
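
The threshold estimation described above can be sketched as an ordinary least squares line through the (diameter, d') points, solved for the diameter at the target d'. This is a minimal sketch under stated assumptions: d' is regressed on diameter (the text does not say which variable was treated as dependent), and the example data points are invented purely for illustration. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matching (x, y) points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_diameter(diameters_um, d_primes, target_d_prime=2.5):
    """Diameter (um) at which the fitted line reaches target_d_prime."""
    slope, intercept = fit_line(diameters_um, d_primes)
    if slope <= 0:
        raise ValueError("detectability should increase with diameter")
    return (target_d_prime - intercept) / slope


if __name__ == "__main__":
    # Invented example values for one observer and one modality (not study data).
    diameters = [125.0, 150.0, 175.0]
    d_primes = [1.0, 2.0, 3.0]
    print(threshold_diameter(diameters, d_primes))
```

When the largest diameter is excluded because its percentage correct is saturated, the same functions can be applied to the remaining two points, in which case the fit passes exactly through both.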

Based on the calcification diameters chosen in the 4AFC studies, the percentage correct PC range

used in the analysis varied between 32% and 98%. In the cases where values were greater than 98%

they were not taken into account in the analysis of the results, to avoid using "saturated" results.

Analysis of variance (ANOVA) was performed on the results to identify any statistically significant

differences between the threshold detectable calcification diameters for the three imaging modalities,

the three heights of cluster insertion and the three doses. ANOVA was also performed on the results

to investigate the differences between the observers.

2.6. CDMAM correlation to calcification detection

The threshold detectable calcification diameters obtained from the observer studies from study arm 2

were correlated with the threshold thicknesses obtained using the CDMAM test object at different

doses, to investigate how well CDMAM correlates to calcification detection.

The CDMAM images used in this analysis were acquired on a 2D-mammography and DBT Hologic

Selenia Dimensions system as described in the European protocol (Perry et al 2006). The test object

was positioned with 20 mm PMMA blocks above and below it on the breast support, as the CDMAM

phantom with 40 mm PMMA is considered to have the equivalent absorption of a 60 mm compressed

breast as used in this study. Sixteen images were acquired in each modality using the automatic

exposure control (AEC) selected imaging parameters. Sixteen images were also acquired in each

modality at half and double exposure of the AEC conditions. Each set of unprocessed images was

automatically analysed to determine the threshold gold thickness at all disc diameters (Young et al

2006, 2008). The threshold gold thicknesses were acquired at slightly different MGD levels (2D-

mammography: 0.98 mGy, 1.95 mGy, 3.90 mGy, narrow angle DBT: 1.33 mGy, 2.66 mGy and 5.32

mGy) to those used in the 4-AFC studies (1.25 mGy, 2.50 mGy, 5.00 mGy) and were therefore

corrected for dose before the comparisons were made. The corrections were made by multiplying the

threshold gold thickness by the square root of the ratio of the actual dose used to the dose of the 4-AFC

study. The CDMAM threshold thickness results for gold discs of diameter 100 μm and

250 μm were used, as they are in the range of calcification sizes used in this study (100 μm to 275

μm).
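The dose correction described above can be sketched as follows. The MGD pairs are those quoted in the text; the function name `dose_corrected_thickness` and the measured thickness of 1.0 μm are hypothetical and serve only to show the size of the correction factor.

```python
import math


def dose_corrected_thickness(measured_thickness_um, actual_mgd_mgy, study_mgd_mgy):
    """Scale a CDMAM threshold gold thickness to the dose of the 4AFC study.

    The thickness is multiplied by sqrt(actual dose / study dose), as
    described in the text.
    """
    return measured_thickness_um * math.sqrt(actual_mgd_mgy / study_mgd_mgy)


# (CDMAM acquisition MGD, 4AFC study MGD) pairs taken from the text, in mGy.
mgd_pairs = {
    "2D-mammography": [(0.98, 1.25), (1.95, 2.50), (3.90, 5.00)],
    "narrow angle DBT": [(1.33, 1.25), (2.66, 2.50), (5.32, 5.00)],
}

hypothetical_thickness_um = 1.0  # illustrative value, not a measured result
for modality, pairs in mgd_pairs.items():
    for actual, study in pairs:
        corrected = dose_corrected_thickness(hypothetical_thickness_um, actual, study)
        print(f"{modality}: {actual} -> {study} mGy, factor {corrected:.3f}")
```

With these doses the 2D-mammography thicknesses are scaled down (the study dose is higher than the CDMAM dose) and the narrow angle DBT thicknesses are scaled up slightly.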

3. Results

3.1. The effect of system geometry on calcification detection

Figure 3 shows the detectability index, d', versus calcification diameter results for 2D-mammography

for a single observer. The increase in detectability index with calcification diameter follows a linear

relationship, as expected (Burgess 1995, 1999). Similarly good fits were obtained

for all modalities and observers where three data points were available. In 4 of the 15

modality/observer combinations, the largest calcification diameter was detected in 100% of cases, so

only two data points could be used, because the fit had to be restricted to the linear part of the

relationship between d' and detail diameter.

Figure 3. The detectability index, d', versus calcification diameter for 2D-mammography for one observer. The

calcification diameter at a detectability index of 2.5 was taken as the threshold detectable calcification

diameter.
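The threshold determination described above can be sketched as a straight-line fit of d' against diameter, evaluated at d' = 2.5. This is a minimal sketch under stated assumptions: the data points are hypothetical, the least-squares fit is unweighted, and a saturated point (100% detection, for which d' is undefined) is represented by `None` and excluded from the fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def threshold_diameter(diameters_um, d_primes, d_threshold=2.5):
    """Diameter at which the fitted line reaches d_threshold.

    Entries of d_primes equal to None (saturated detection) are dropped,
    so that only the linear part of the relationship is fitted.
    """
    points = [(d, dp) for d, dp in zip(diameters_um, d_primes) if dp is not None]
    if len(points) < 2:
        raise ValueError("at least two unsaturated data points are required")
    slope, intercept = fit_line([p[0] for p in points], [p[1] for p in points])
    return (d_threshold - intercept) / slope


# Hypothetical data for one modality/observer combination.
diameters_um = [125, 150, 175]
d_primes = [1.0, 2.0, 3.0]
print(f"threshold diameter: {threshold_diameter(diameters_um, d_primes):.1f} um")
```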

The threshold detectable calcification diameters for the five observers and the three imaging

modalities, at a MGD of 2.5 mGy and with a height of insertion of 3 cm above the breast support, are

presented in figure 4 (a). ANOVA gave a p-value of 0.57 for the effect of observer, indicating no

statistically significant difference between the threshold detectable calcification diameters found by

the five observers. The difference between the three modalities is evident from figure 4 (a). Figure 4 (b)

shows the averages of the observers’ results for each modality at a MGD of 2.5 mGy, a height of

insertion of 3 cm and PC values of 62.5% and 92.5%. It can be seen that 2D-mammography performs

better than both DBT modalities, and that narrow angle DBT performs better than wide angle DBT, for

PC values of both 62.5% and 92.5%. ANOVA was used to test whether the threshold detectable

calcification diameters of the three modalities (for a PC of 92.5%) were significantly different. The

p-value was <0.0001, showing a highly statistically significant difference between the threshold

calcification diameters that can be detected by 2D-mammography, narrow angle DBT and wide angle DBT.

Figure 4. (a) The threshold detectable calcification diameter for each observer for 2D-mammography, narrow

and wide angle DBT, for a MGD of 2.5 mGy and cluster insertion 3 cm above the breast support. (b) The

average of the observers’ results in (a) for each imaging modality and with PC values of 62.5% and 92.5%. The

error bars are two standard errors of the mean.
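For reference, error bars of two standard errors of the mean can be computed as in the sketch below. The five observer values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, as the text does not specify it.

```python
import math
from statistics import mean, stdev


def mean_and_two_sem(values):
    """Return the mean and two standard errors of the mean."""
    sem = stdev(values) / math.sqrt(len(values))  # sample SD / sqrt(n)
    return mean(values), 2.0 * sem


# Hypothetical threshold diameters (um) from five observers.
observer_thresholds_um = [160, 165, 158, 170, 162]
avg, error_bar = mean_and_two_sem(observer_thresholds_um)
print(f"{avg:.1f} +/- {error_bar:.1f} um")
```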

When comparing the results at a PC of 92.5% to those for a PC of 62.5% a difference of 23-33 μm

was found depending on modality. The results in the following sections are for a PC of 92.5%.

3.2. The effect of cluster insertion height on its detection

Figure 5 shows the threshold detectable calcification diameter for each imaging modality, with the

calcification cluster inserted at three heights above the breast support. As before, in each case errors in

the threshold detectable calcification diameter were calculated as two standard errors of the mean.

These results show that the cluster insertion height for a 6 cm breast does not have a significant effect

on its detection for any of the modalities tested. ANOVA gave p-values > 0.05, showing no statistically

significant difference in the results for different heights.

Figure 5. The threshold detectable calcification diameter for 2D-mammography, narrow and wide angle DBT,

with a MGD of 2.5 mGy and the cluster inserted at three heights above the breast support: 1 cm, 3 cm and 5

cm. The error bars are two standard errors of the mean.

3.3. The effect of dose on calcification detection

Figure 6 shows the threshold calcification diameters for MGD values of 1.25 mGy, 2.5 mGy and 5

mGy for 2D-mammography and narrow angle DBT. For both modalities, significant differences were

found between the detectable diameters when the images were acquired at the three different doses.

The maximum p-value was 0.001 for 2D-mammography and 0.003 for narrow angle DBT.

Figure 6. The threshold detectable calcification diameter for 2D-mammography and narrow angle DBT, with

the cluster insertion height at 3 cm and MGD at: 1.25 mGy, 2.5 mGy and 5 mGy. The error bars are two

standard errors of the mean.

3.4. CDMAM correlation to calcification detection

Figure 7 shows the threshold detectable calcification diameter plotted against the threshold gold

thickness determined from the CDMAM images, for both 2D-mammography and narrow angle DBT,

at three different MGD levels (1.25 mGy, 2.50 mGy and 5.00 mGy). Results are shown for 0.10 mm

diameter and 0.25 mm diameter CDMAM discs. As dose increases, the threshold gold thickness in the

CDMAM and the threshold detectable calcification diameter from the observer studies decrease

linearly for both 2D-mammography and narrow angle DBT. The CDMAM measurements provide a

measure of performance, which is relevant to calcification detection. Figure 7 also demonstrates that

narrow angle DBT requires more than four times higher dose than 2D-mammography for the same

detectable calcification diameter. The European protocol for the quality control of the physical and

technical aspects of digital breast tomosynthesis recommends that the acceptable and achievable

limits for mammography image quality parameters should not be applied to DBT; therefore they are

not presented in these results.

Figure 7. Observers’ average threshold detectable calcification diameter, as found in the observer study, plotted

against threshold gold thickness from CDMAM phantom images for (a) 0.10 mm and (b) 0.25 mm gold disc

diameter.

4. Discussion

In this study, the threshold detectable calcification diameter for 2D-mammography, narrow and wide

angle DBT has been quantified through 4AFC observer studies. No previous study has quantified the

threshold detectable calcification diameter using 4AFC observer experiments.

2D-mammography was found to have a lower threshold calcification diameter than both DBT

imaging modalities. Narrow-angle DBT was found to have a smaller threshold calcification diameter

than wide-angle DBT. Thus the narrow angle geometry of the clinically available DBT system

simulated in this study could be expected to have better microcalcification detection than a wide angle

version of the same system. The better performance of 2D-mammography compared to DBT raises

concern that detection of calcification clusters is lower in DBT and also that the loss of the

detectability of the smallest calcifications may affect the radiographic classification of clusters. Even

though DBT could favour the detectability of masses when used in screening (Elangovan et al 2015,

Rafferty et al 2013), when used alone it might miss small calcifications that would usually be

diagnosed with 2D-mammography.

The DBT images are less sharp than 2D-mammography images as the modulation transfer function

(MTF) is lower (Mackenzie et al 2013). This is due to the larger pixel pitch of DBT compared to 2D-

mammography (due to pixel binning in DBT) and tube movement that introduces blurring. Also, a

slight further increase in geometric blurring might be introduced as the projection angle becomes

wider (Mainprize et al 2006). The reduced DBT MTF may partially explain why 2D-mammography

was found to detect smaller calcifications than narrow angle DBT.

At wider angles the path of X-rays through the tissue is greater, the signal reaching the detector is

lower and the relative noise increases. To keep the same MGD between the DBT acquisition modes,

the dose per projection needs to be decreased for a wide angle geometry, thus the relative noise in

each projection is increased. Sechopoulos and Ghetti (2009) found that for a constant angle, the CNR in

reconstructed planes will be lower for a higher number of projections. Also, in this study each

projection contains the same magnitude of electronic noise. The simulated wide angle DBT images

were acquired with more projection images and thus a higher total electronic noise than the simulated

narrow angle DBT images. The above reasons may explain why narrow angle DBT performs better

than wide angle DBT for calcifications. It has to be noted that a different wide angle DBT system with

lower electronic noise might have better performance than the system in this study.
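The argument above can be illustrated with a simple noise budget. This is a sketch under stated assumptions, not the noise model used in the simulation: quantum noise variance is taken as proportional to the total MGD, each projection adds a fixed electronic noise variance, and the projections are simply summed. The projection counts (15 and 25) and the variance constants are hypothetical.

```python
def electronic_noise_fraction(total_mgd_mgy, n_projections,
                              quantum_var_per_mgy=100.0,
                              electronic_var_per_projection=1.0):
    """Fraction of the summed noise variance that is electronic.

    Quantum variance scales with the total dose; electronic variance
    scales with the number of projections and is independent of dose.
    Both variance constants are hypothetical, in arbitrary units.
    """
    quantum_var = quantum_var_per_mgy * total_mgd_mgy
    electronic_var = electronic_var_per_projection * n_projections
    return electronic_var / (quantum_var + electronic_var)


# Hypothetical projection counts for the two geometries.
for label, n_proj in (("narrow angle", 15), ("wide angle", 25)):
    for mgd in (1.25, 2.5, 5.0):
        frac = electronic_noise_fraction(mgd, n_proj)
        print(f"{label}, {mgd} mGy: electronic fraction {frac:.3f}")
```

In this simplified model the electronic fraction rises both when the same MGD is split over more projections and when the MGD is lowered for a fixed number of projections.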

No statistically significant differences were found between the threshold detectable diameters for

either 2D-mammography or DBT when clusters were inserted at different heights within the breast.

Marshall and Bosmans (2012) showed that the MTF decreases with increasing distance from the

detector cover due to geometric blurring and tube movement blurring. Therefore, clusters will be

more blurred in the breast image as the height from the breast support is increased. However, the

effect of magnification may compensate for this blurring, as the calcifications will be larger in the image.

These two opposing effects may partially explain why cluster insertion height had no measurable

effect on detectability in our observer studies.
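The two competing effects can be illustrated with simple projection geometry. The sketch below is not the simulation used in this study: the source-to-detector distance (700 mm), the gap between breast support and detector (25 mm) and the focal spot size (0.3 mm) are assumed values chosen for illustration, and only focal spot blurring is modelled, not tube movement.

```python
def magnification(source_to_detector_mm, object_to_detector_mm):
    """Geometric magnification of an object above the detector."""
    return source_to_detector_mm / (source_to_detector_mm - object_to_detector_mm)


def focal_spot_blur_mm(focal_spot_mm, mag):
    """Width of the focal spot penumbra in the detector plane."""
    return focal_spot_mm * (mag - 1.0)


# Assumed geometry (illustrative values only).
SOURCE_TO_DETECTOR_MM = 700.0
SUPPORT_TO_DETECTOR_MM = 25.0
FOCAL_SPOT_MM = 0.3
CALC_DIAMETER_MM = 0.2

for height_mm in (10.0, 30.0, 50.0):  # cluster heights above the breast support
    mag = magnification(SOURCE_TO_DETECTOR_MM, height_mm + SUPPORT_TO_DETECTOR_MM)
    blur = focal_spot_blur_mm(FOCAL_SPOT_MM, mag)
    print(f"height {height_mm:.0f} mm: magnification {mag:.3f}, "
          f"projected diameter {CALC_DIAMETER_MM * mag:.3f} mm, blur {blur:.3f} mm")
```

Both the projected calcification diameter and the penumbral blur grow with height, which is the trade-off referred to above.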

The threshold diameter for 2D-mammography was between 46 μm and 67 μm smaller than for the

narrow angle DBT modality, depending on dose. In comparison, doubling the dose from 1.25 mGy or

2.5 mGy reduced the threshold diameter by between 11 μm and 33 μm for 2D-mammography

and narrow angle DBT. This agrees with other studies, which have shown that the detection rate of

calcification clusters is reduced if dose is lowered in 2D-mammography (Warren et al 2012). As the

above numbers suggest, the choice of the imaging system can be more important than dose. In DBT,

halving the dose from 2.5 mGy to 1.25 mGy causes a bigger increase (33 μm) in the threshold

calcification diameter than halving the dose from 5 mGy to 2.5 mGy (11 μm). It also has a

bigger effect than decreasing the dose in 2D-mammography (5 mGy to 2.5 mGy: 12 μm, 2.5 mGy to

1.25 mGy: 18 μm). This can be explained by the fact that electronic noise is the same irrespective of

dose, so it makes up a larger fraction of the total noise at low dose levels. Clearly, at 1.25 mGy the

detection task is affected by electronic noise. Lowering the dose from current levels in DBT could

adversely affect clinical outcomes.

This study has the advantage of testing the effect of acquisition method on calcification detection

using real observers. However, the 4AFC methodology does limit the number of acquisition methods

that can be practicably studied. Other authors (Petersson et al 2015, Sechopoulos and Ghetti 2009) used

image metrics (e.g. contrast to noise ratio (CNR), CNR/ASF (artefact spread function)) instead of

observer experiments to assess the visibility of calcifications and tested more geometries. However,

such an approach lacks the link to the detection performance of a real observer.

Limitations of this study include omitting any breast movement in the simulations of the three

imaging modalities, as the scale of movement is unknown. Patient movement could potentially

increase the threshold detectable calcification diameter of DBT more than that of 2D-mammography,

because in DBT there is more time for patient movement to take place. Also, the systems’

simulation did not include any lag or ghosting effects for the DBT images. It is not expected that lag

and ghosting would significantly affect the results of this study. In the dose arm of the study, it was

assumed that the exposure time for each DBT projection was the same for the three dose levels used

and MGD was altered by varying the tube current. The Hologic system decreases the current to reduce

dose; however, it increases the exposure time to increase dose. Therefore, in reality there would be

more blurring included in the 5 mGy case as the tube travel distance would be greater while the X-ray

tube would be on. Inevitably, there is an uncertainty in the image production associated with finite

voxel size and the number of rays used in the image formation. However, these are expected to be

small and not affect the outcome of the study. The simulation could be improved by reducing the size

of the voxel and increasing the number of rays, but the ray tracing would then take impractically long.

A real calcification cluster may contain a variable number of calcifications of different sizes and

shapes, while here we used a simplified cluster with five identical calcifications spread over a volume.

Many clusters contain larger calcifications, which may be more easily detectable and so caution must

be taken in applying these results to the detection of all calcification clusters. It was advantageous to

use a cluster rather than a single calcification as this better simulates the clinical task.

Finally, based on the size of the error bars in the results section, it can be concluded that the study

could have been undertaken with a smaller number of images and/or observers, therefore decreasing

the time of the study.

Although previous publications have compared calcification detection between different DBT

geometries, none have used a reproducible quantitative approach with observers, calcification clusters

and clinically realistic backgrounds to compare different DBT geometries and 2D imaging. An

attempt was made here to employ conditions and methods that are used clinically, which makes this

study relevant to the use of existing clinical systems. It has to be stated that the above findings cannot

necessarily be generalised, for example, to other acquisition geometries or other processing and

reconstruction methods. However, the method used in this study is reproducible and can be applied to

compare different imaging technologies for the detection of masses and microcalcifications using a

clinically realistic background. In a preliminary investigation we found that masses were better

detected by narrow angle DBT than by 2D-mammography (Elangovan et al 2015).

5. Conclusions

This study has shown that the methods described can quantify the detectability of calcification

clusters for different breast imaging modalities using observer studies. The results show that small

calcifications in clusters can be more reliably detected using 2D-mammography than DBT. This

should raise some concern about the detectability of calcifications by DBT. It was also found that

there is no significant difference between the detectability of clusters situated at different heights

above the breast support. As expected, dose has an effect on detectability, but the imaging

modality geometry has a greater impact. Finally, results showed that measurements using the

CDMAM phantom correlate well to calcification detection for 2D-mammography and narrow angle

DBT at different doses.

6. Acknowledgements

This work is part of the OPTIMAM2 project (grant number: C30682/A17321) funded by Cancer

Research UK. The authors would like to thank their colleagues Jack Miskell and Lucy Warren at the

National Co-ordinating Centre for the Physics of Mammography and Isabel Dodson at the Regional

Radiation Protection Service, Royal Surrey County Hospital, for participating in this study. The

authors would like to acknowledge the helpful discussions on the preparation of clusters with Lucy

Warren and the CDMAM image acquisition and analysis by Celia Strudley. We would like to thank

the staff of Real Time Tomography for their help in using their software.

7. References

Boone J M, Fewell T R and Jennings R J 1997 Molybdenum, rhodium, and tungsten anode spectral models using interpolating polynomials with application to mammography Med. Phys. 24 1863-74

Bouwman R W, van Engen R E, Young K C, den Heeten G J, Broeders M J, Schopphoven S, Jeukens C R, Veldkamp W J and Dance D R 2015 Average glandular dose in digital mammography and digital breast tomosynthesis: comparison of phantom and patient data Phys. Med. Biol. 60 7893-7907

Burgess A E 1995 Comparison of receiver operating characteristic and forced choice observer performance measurement method Med. Phys. 22 643-55

Burgess A E 1999 The Rose model revisited OSA Proc. 16 633-46

Chan H P et al 2014 Digital breast tomosynthesis: Observer performance of clustered microcalcification detection on breast phantom images acquired with an experimental system using variable scan angles, angular increments and number of projection views Radiology 273 675-85

Dance D R, Skinner C L, Young K C, Beckett J R and Kotre C J 2000 Additional factors for the estimation of mean glandular breast dose using the UK mammography dosimetry protocol Phys. Med. Biol. 45 3225-40

Dance D R, Young K C and Engen R E 2011 Estimation of mean glandular dose for breast tomosynthesis: factors for use with the UK, European and IAEA breast dosimetry protocols Phys. Med. Biol. 56 453-72

Elangovan P, Warren L M, Mackenzie A, Diaz O, Rashidnasab A, Dance D R, Bosmans H, Young K C and Wells K 2014 Development and validation of a modelling framework for simulating 2D-mammography and breast tomosynthesis images Phys. Med. Biol. 59 4275–93

Elangovan P, Rashidnasab A, Mackenzie A, Dance D R, Young K C, Bosmans H, Segars W P and Wells K 2015 Performance comparison of breast imaging modalities using a 4AFC human observer study Proc. SPIE 9412 94121T-1-94121T-7

Elangovan P, Dance D R, Young K C and Wells K 2016 Simulation of 3D synthetic breast blocks Proc. SPIE 9783 97832E-1-5

Gur D, Zuley M L, Anello M L, Rathfon Y, Chough D M, Ganott M A, Hakim C M, Wallace L, Lu A and Bandos A I 2012 Dose reduction in digital breast tomosynthesis (DBT) screening using synthetically reconstructed projection images: An observer performance study Acad. Radiol. 19 166-71

Hu Y H, Zhao B and Zhao W 2008 Image artifacts in digital breast tomosynthesis: Investigation of the effects of system geometry and reconstruction parameters using a linear system approach Med. Phys. 35 5242-52

Kopans D, Gavenonis S, Halpern E and Moore R 2011 Calcifications in the breast and digital breast tomosynthesis Breast J 17 638-44

Mackenzie A, Dance D R, Workman A, Yip M, Wells K and Young K C 2012 Development and validation of a method for converting images to appear with noise and sharpness characteristics of a different detector and X-ray system Med. Phys. 39 2721-34

Mackenzie A, Marshall N, Dance D R, Bosmans H and Young K C 2013 Characterisation of a breast tomosynthesis unit to simulate images Proc. of SPIE 8668 86684R1-8

Mackenzie A, Dance D R, Diaz R and Young K C 2014 Image simulation and a model of noise power spectra across a range of mammographic beam qualities Med. Phys. 41 121901-1-14

Mackenzie A, Warren L M, Wallis M G, Given-Wilson R M, Cooke J, Dance D R, Chakraborty D P, Haling-Brown M D, Looney P T and Young K C 2016 The relationship between cancer detection in mammography and image quality measurements Phys. Medica 32 568-74

Macmillan N A and Creelman C D 2004 Detection theory, A user’s guide, 2nd edition, Lawrence Erlbaum Associates Inc.

Mainprize J G, Bloomquist, Kempston M P and Yaffe M J 2006 Resolution at oblique incidence angles of a flat panel imager for breast tomosynthesis Med. Phys. 33 3159-64

Marshall N, Jacobs J, Cockmartin L and Bosmans H 2010 Technical evaluations of digital breast tomosynthesis Proc. IWDM 6136 350-6

Marshall N W and Bosmans H 2012 Measurements of system sharpness for two digital breast tomosynthesis systems Phys. Med. Biol. 57 7629-50

Nelson J S, Wells J R, Baker J A and Samei E 2016 How does c-view image quality compare with conventional 2D FFDM? Med. Phys. 43 2538-47

Perry N, Broeders M, de Wolf C, Törnberg S, Holland R and von Karsa L 2006 European guidelines for quality assurance in breast cancer screening and diagnosis The European protocol for the quality control of the physical and technical aspects of mammography screening, Part B: Digital mammography, 4th edn (European Commission, Luxembourg)

Petersson H, Dustler M, Tingberg A and Timberg P 2015 Monte Carlo simulation of breast tomosynthesis: visibility of microcalcifications at different acquisition schemes Proc. SPIE 9412 94121H1-94121H7

Rafferty E A, Park J M, Philpotts L E, Poplack S P, Sumkin J H and Niklason L T 2013 Assessing radiologist performance using combined digital mammography and breast tomosynthesis compared with digital mammography alone: results of a multicentre, multireader trial Radiology 266 104-13

Reiser I and Nishikawa R M 2010 Task-based assessment of breast tomosynthesis: Effect of acquisition parameters and quantum noise Med. Phys. 4 1591-1600

Samei E, Saunders R S, Baker J A and Delong D M 2007 Digital mammography: effects of reduced radiation dose on diagnostic performance Radiology 243 396-404

Sechopoulos I and Ghetti C 2009 Optimization of the acquisition geometry in digital tomosynthesis of the breast Med. Phys. 36 1199-1207

Sechopoulos I 2013a A review of breast tomosynthesis. Part I. The image acquisition process Med. Phys. 40 014301-1-12

Sechopoulos I 2013b A review of breast tomosynthesis. Part II. Image reconstruction, processing and analysis, and advanced applications Med. Phys. 40 014302-1-17

Shaheen E, Marshall N and Bosmans H 2011 Investigation of the effect of tube motion in breast tomosynthesis: continuous or step and shoot? Proc. SPIE 7961 79611E-1-9

Shaheen E, Van Ongeval C, Zanca F, Cockmartin L, Marshall N, Jacobs J, Young K C, Dance D R and Bosmans H 2012 The simulation of 3D microcalcification clusters in 2D digital mammography and breast tomosynthesis Med. Phys. 38 6659-71

Siddon R L 1984 Fast calculation of the exact radiological path for a three-dimensional CT array Med. Phys. 12 252-55

Spangler M L, Zuley M L, Sumkin J H, Abrams G, Ganott M A, Hakim C, Chough D M, Shah R and Gur D 2011 Detection and classification of calcifications on digital breast tomosynthesis and 2D digital mammography: a comparison AJR Am J Roentgenol 196 320-324

Strudley C J, Looney P and Young K C 2014 Technical evaluation of Hologic Selenia Dimensions digital breast tomosynthesis system NHSBSP Equipment report 1307 Version 2 (NHS Cancer Screening Programmes)

Strudley C J, Warren L M and Young K C 2015 Technical evaluation of Siemens Mammomat Inspiration of digital breast tomosynthesis NHSBSP Equipment report 1306 Version 2 (NHS Cancer Screening Programmes)

Timberg P, Båth M, Andersson I, Mattsson S, Tingberg A and Ruschin M 2012 Visibility of microcalcification clusters and masses in breast tomosynthesis image volumes and digital mammography: a 4AFC observer study Med. Phys. 39 2431-7

Timberg P, Dustler M, Petersson H, Tingberg A and Zackrisson S 2015 Detection of calcification clusters in digital breast tomosynthesis slices at different dose levels utilizing a SRSAR reconstruction and JAFROC Proc. SPIE 9416 941604-1-4

Tucker A W, Lu J and Zhou O 2013 Dependency of image quality on system configuration parameters in a stationary digital breast tomosynthesis system Med Phys. 3 1-10

van Engen R E, van Woudenberg A, Bosmans H, Young K and Thijssen M 2003 European protocol for the quality control of the physical and technical aspects of mammography screening 4th edn (EUREF) www.euref.org/downloads?download=26:physico-technical-protocol

van Engen R E et al 2015 Protocol for the quality control of the physical and technical aspects of digital breast tomosynthesis systems 1st edn (EUREF) http://www.euref.org/european-guidelines/physico-technical-protocol

Warren L M, Mackenzie A, Cooke J, Given-Wilson R M, Wallis M G, Chakraborty D P, Dance D R, Bosmans H and Young K C 2012 Effect of image quality on calcification detection in digital mammography Med. Phys. 39 3202-13

Warren L M, Mackenzie A, Dance D R and Young K C 2013 Comparison of the x-ray attenuation properties of breast calcifications, aluminium, hydroxyapatite and calcium oxalate Phys. Med. Biol. 58 104-113

Warren L M, Dummott L, Wallis M G, Given-Wilson R M, Cooke J, Dance D R and Young K C 2014 Characterisation of screen detected and simulated calcification clusters in digital mammography Proc. IWDM 8539 364-71

Young K C, Cook J J H, and Oduko J M 2006 Automated and human determination of threshold contrast for digital mammography systems Proc. IWDM 4046 255-272

Young K C, Alsager A, Oduko J M, Bosmans H, Verbrugge B, Geertse T and van Engen R 2008 Evaluation of software for reading images of the CDMAM test object to assess digital mammography systems Proc. SPIE 6913 69131C
