A Real-timelsp · further exploration. Navigator Interface The user interface of the client applet,...

Sebastian Gerlachand Roger D. HerschEcole Polytechnique Fédéralede Lausanne, Switzerland

IEEE INTERNET COMPUTING, Vol.6,No.2,pp.27-33 ©2002 IEEE http://computer.org/internet/ MARCH • APRIL 2002 27

Rea

l-Ti

me

Nav

igat

ion

A Real-timeNavigator for theVisible HumanAdapting data transfer to network throughput enables

real-time interactive Web-based navigation of large 3D

anatomical data sets.

The Visible Human data set, pro-duced by the National Library ofMedicine’s Visible Human Pro-

ject,1 provides researchers with digitalcross-sections of the human body.Many institutions use the VisibleHuman for research and teaching pur-poses. However, working with the fulldata set on a workstation is cumber-some and requires advanced program-ming skills. Storing the data on Webservers and offering online accessallows many more students, profession-als, and researchers can benefit.

The first Web-based application thatlet researchers explore the images wasthe Northeast Parallel ArchitecturesCenter’s (NPAC) Visible Human Viewerapplet,2 developed in 1995, whichallowed extraction of slices perpendic-ular to the human body’s main axes. TheVisible Human Slice and Surface Serv-er, which went online in 1999, added

access to arbitrarily oriented and posi-tioned slices and surfaces, as well as toslice sequence animations.3 These appli-cations require the user to define theposition and orientation of each slice;each application takes a few seconds topresent the results.

To test the feasibility of letting usersinteract with the data set in real time overthe Internet, we recently developed theVisible Human Navigator (visiblehuman.epfl.ch). We used a client-server architec-ture for the application because the fulldata set is too large (10 Gbytes) to trans-fer to the client in a Web-based applica-tion. The prototype can display severalslices per second on a standard PC con-nected to the Internet. It also extracts andincorporates anatomic labeling informa-tion from the Classified and Labeled Vis-ible Human data set (www.gsm.com/docs/products/segclass.htm) into theslices extracted at any position or orien-

tation. Because stan-dard HTTP uses a text-based request-responsemodel that createshigh overheads, wehad to use a customprotocol based on TCPsockets. And becausewe wanted to let alarge audience accessthe application, wedeveloped the client inJava.

This article focusesmainly on network per-formance and data-encoding issues. Wedescribe our im-ple-mentation and theprinciples of slice andlabel extraction. Eval-uating the proposedsolution’s performance

and the server’s behavior when serving multipleclients simultaneously points to several issues forfurther exploration.

Navigator InterfaceThe user interface of the client applet, shown inFigure 1, divides information into two parts: thenavigation pane and the current slice view. To

facilitate orientation and navigation within theVisible Human data set, the navigation panel pro-vides a 3D view of the human body from whichthe current slice is cut. Three buttons let usersinstantly return to planes with standard orienta-tions (axial, coronal, and sagital), and two sliderscontrol the desired frame rate and the resultantspeed-to-quality trade off. The slice view displaysthe current slice and highlights the current struc-ture with labeling data.

The status bar (below the slice view) shows thecurrent position in the data set and the name ofthe highlighted structure. To verify the current net-work data rate and frame rate, the status bar alsoprovides the current effective network data rate,the required network data rate (based on thedesired frame rate), and other related information.The client applet updates these values in real time.

The navigator provides real-time interactivenavigation using a standard mouse. Navigating a3D data set requires three degrees of freedom fortranslation and three additional degrees of free-dom for rotation around the main axes of the cur-rent slice’s local coordinate system. The clientinterface also allows the user to freely zoom in andout of the slice.

To find specific anatomic structures, such as themetatarsal bone, shown in Figure 2, you first movealong slices until part of the desired structurebecomes visible. Then you can rotate, zoom, andmake fine adjustments to the current position untilyou find the structure you’re looking for.

Visible Human Slice ExtractionThe Visible Man data set — a subset of the VisibleHuman — is a collection of 1,871 color images,each of size 2,048 × 1,212 pixels. The stackedimages form a volume of 682 × 404 × 1,871 mil-limeters. Each voxel (graphic information thatdefines a point in three-dimensional space) repre-sents 0.33 × 0.33 × 1 mm of the body. The VisibleHuman server stores each piece of data as smallcubic subvolumes called extents. As illustrated inFigure 3, each extent consists of a stack of colorbytemaps of equal size extracted from full-sizeaxial slices.

For our Web-based navigator, we set the size ofthe extents at 32 × 32 × 16 voxels to ensure thatan approximately constant amount of data will beloaded for a given slice regardless of its orienta-tion. In addition to the standard full-resolutiondata set, our project stores multiple-resolution ver-sions at reduction factors of 2, 4, 8, 16, and 32, asshown in Figure 4 . This data set keeps the num-

28 MARCH • APRIL 2002 http://computer.org/internet/ IEEE INTERNET COMPUTING

Real-Time Navigation

Figure 1. User interface of the Visible Human navigator.The left sidedisplays the navigation pane, and the right side shows an obliqueview across the mandible in the current slice view.

Figure 2. Oblique view of theright foot.This slice view shows ahighlighted metatarsal bone.

ber of bytes for each extent constant across allresolutions. Lower-resolution extents thereforerepresent a larger volume of the data set.

Our slice-extraction algorithm can produce anarbitrarily oriented and positioned slice from thedata set. As Figure 5 shows, three vectors defineeach slice. The system extracts the slice andresamples it using an incremental fixed-pointalgorithm.4 The algorithm begins rendering at thetop-left corner of the slice and evaluates for eachpixel the 3D coordinates of the correspondingpoint. The algorithm then retrieves the nearestvoxel (using nearest-neighbor interpolation) orsurrounding voxels (using trilinear interpolation)from the data set.

The algorithm traverses the requested sliceincrementally from the current coordinate usingthe right and up vectors of the slice visualizationparameters. To provide acceptable speed, the sys-tem stores the extents in a memory cache.Caching this data turns out to be an effectivetechnique because the position and orientation ofslices varies in small increments during interac-tive navigation.

The classified and labeled Visible Human dataset contains labeling information for eachcryosection voxel. The extents store this labelinginformation in addition to the color data. Eachlabel includes a 16-bit number to indicate whichanatomical structure contains each voxel. We usethe labels to highlight selected anatomical struc-tures on the cross-section or to indicate the nameof the structure over which the user is currentlyholding the cursor. In Figure 6 (next page), forexample, our system uses labeling information tohighlight the carotid artery leaving the aortic arch.

Client-ServerApplication PartitioningWe considered several possible mechanisms forslice compression, including the JPEG and emerg-ing JPEG 2000 (based on wavelet compression)standards. Although JPEG 2000 can provide high-er quality at an identical compression ratio,5 italso requires greater processing capacity. StandardJPEG compression is relatively easy to use, encod-ing libraries are readily available, and JPEG com-pression can be decoded quickly with pure Java.Also, JPEG’s independent macroblock structureprovides a simple mechanism for streaming smallor partial images from server to client.

At application start-up, the server transmitsthe JPEG header information, which contains theHuffman tables and other static information and

IEEE INTERNET COMPUTING http://computer.org/internet/ MARCH • APRIL 2002 29

Visible Human

Extent

Axial slices

Figure 3. Extents in the Visible Human project. Each piece of data isstored as an extent (small cubic subvolumes), which consists of astack of equal-sized bytemaps extracted from full-size axial slices.

Figure 4. Multiple-resolution versions of the images. Subdividing vol-ume data sets into extents of decreasing resolution keeps the num-ber of bytes for each extent constant across all resolutions.

Extents (subvolumes)

High resolution Low resolution

Figure 5. Slice definition.Three vectors define theposition and orientation of each slice in 3D space.

RightUp

Position

consists of only 574 bytes of data. The servertransmits image data as blocks of 16 × 16 pixels.That corresponds to groups of six JPEG mac-roblocks with four 8 × 8 pixel macroblocks forthe luminance channel at full resolution and two8 × 8 pixel macroblocks for the two chrominancechannels at half resolution.

The server compresses these blocks, on average,from 768 bytes down to 37 bytes. We reached thiscompression ratio of 20 with the IndependentJPEG Group’s (IJG) libjpeg compression library(www.ijg.org) by using a quality of 75 on the IJGJPEG quality scale (which ranges from 0 to 100).The instant compression ratio that we obtaindepends on the source data because the JPEG com-pressor strives for constant quality rather than forconstant compression ratio. The source data con-sists of the complete Visible Human data set withall areas outside the body removed.

Decompression happens on the client with acustom Java applet based on code developed atUSC,6 but which we modified to accommodatethe JPEG block-based streaming decompressionrequired by our application. Tests have shownthat the quality loss caused by compression isnot noticeable.

Client-Server InteractionFigure 7 shows the basic client-server interactionpipeline. The sequence must let the client displayslices that match the user’s positioning requests asquickly as possible and with a relatively constantframe rate. To guarantee optimal use of the avail-able network bandwidth and to avoid sending out-

dated information, the server must transmit exact-ly the amount of data that it can transfer in thetime interval between two displayed frames on theclient.

The interval between requests corresponds tothe user’s desired frame rate. The requests con-tain the parameters for the currently requestedslice, including the three vectors that define posi-tion and orientation, as well as an identifier andthe viewport parameters. (The client issues ashorter request when no parameters havechanged.) The request also contains the maxi-mum allowable message size for the reply. Theuser can adjust the allowed reply size to providethe best trade off between response time andimage quality.

Upon receiving the request, the server process-es it and starts sending data back as soon as pos-sible. The response contains only as many bytes asthe maximum set by the client’s request. Anyexcess data would arrive after the client had sentits next request, which would produce a backlogof data in the client’s input buffers and an addi-tional delay between the client request and thedisplay of the corresponding slice. To prevent datatransfer backlogs, the client stops sending requestswhen there are more than two unansweredrequests in circulation. This mechanism lowers theframe display rate when network transfers cannotkeep up with client requests.

The server carries out all transfers using TCPsockets. While UDP might be better suited for mul-timedia streaming applications — where low laten-cy is more critical than reliability7 — UDP is unre-liable in unsigned Java applets. Also, firewalls donot generally let UDP traffic pass.

Slice TransferBecause the server carries out all processing,transferring compressed slices between server andclient is extremely simple. When the serverreceives a request, it extracts the slice from thedata set, compresses it, and sends it back to theclient. The client then decompresses and displaysthe slice to the user.

The data set resides on the server in extents of32 × 32 × 16 voxels at resolutions ranging fromfull to 1:32. At the lowest resolution, the completedata set fits into 2 × 1 × 4 extents, and individualvoxels have a size of 10.56 × 10.56 × 16 mm. Theserver selects the extracted slice’s resolutionaccording to the client’s request. For a new request,the server extracts the slice at a resolution set tofit into the maximum reply size. For a continua-



Figure 6. Sample cross-section from the cryosec-tion data set.The labeling information is used tohighlight the carotid artery in blue.

tion request, the server extracts a full resolutionslice and sends it back in parts distributed acrosssuccessive reply messages.

The client always displays whatever data itreceives after decompression, which means thatthe final high-resolution slice is built progressive-ly. If the server receives a request for a new slicebefore completing a transmission, it aborts thetransfer of the full-resolution slice and insteadtransmits a new low-resolution slice.

The server sends all slices as collections of JPEGblocks; the headers indicate to which request eachblock corresponds. When the slice has to fit into asingle reply message, the client computes its res-olution based on an estimated compression ratioof 14:1, yielding a JPEG block size of 55 bytes. Thesize must also be a multiple of 16 pixels to ensureoptimal use of the JPEG macroblocks. Assuming asquare aspect ratio, the formula for calculating thetransmitted slice size is

(1)

For a square aspect ratio image and a reply sizeof 4,000 bytes, for example, the transmitted slicehas a resolution of 128 × 128 pixels. The Javaclient then scales this lower-resolution image tofill the user’s window.

When the user stops moving the slice frame, theserver transmits a full-resolution image that takesas many reply slots as required. For a full-resolu-

tion slice of 384 × 384 pixels and a reply size of4,000 bytes, the complete transmission of the fullframe typically takes six reply slots, so the clientsees the high-resolution image building up oversix frames. For the same quality factor, high-reso-lution slices achieve higher compression than lowresolution ones.

This server-client transmission mechanismgives the shortest possible response time whenthe user moves the slice frame. When the userstops moving the slice, the highest resolutionversion immediately starts building up on thedisplay. The response time is the sum of thedelays induced by the system, which caninclude network latency, server processing, andclient processing.

Labeled DataTo make the labeled data available to the client,the server extracts the labels simultaneouslywith the high-resolution slice and compressesthe label using the deflate-compression algo-rithm implemented in zlib (http://www.gzip.org/zlib/). This lossless algorithm provides a com-pression ratio of 50:1, resulting in a label slicesize of approximately 6 Kbytes for a typical sliceof 384 × 384 pixels.

The client decompresses the labeled slices usingstandard Java libraries. The server simply trans-fers the compressed label slice after the full reso-lution slice when the user has stopped moving theslice frame. When the server finishes transferring

slicesize floor

replysize= ⋅16

55( )


Visible Human

Server

Client

Client

Client

Send request(1)

(2)

(3)

(4)

Process requestand send reply

Receive anddecompress reply

Extract anddisplay slice

Figure 7. Client-server interaction pipeline model.Yellow represents the time for sending data and bluerepresents the processing time. The arrows represent packets transmitted between client and server.Numbers indicate the sequence of operations.

the labeled data, the highlighting features becomeavailable on the client’s display.

PerformanceThe performance of the slice-based model isstraightforward. For a given network data rate ndr(the sustained throughput available on the client’snetwork connection in the server-client direction)and a client’s requested frame rate fps, we canevaluate the reply size rs as

(2)We compute the resolution of the image pro-

duced during the interactive frame movement —which fits into the reply message size — accord-ing to Equation 1. Both Equations 1 and 2 showthat the system has to make a fundamental com-promise between image quality (depending direct-ly on the reply size rs) and frame rate. The net-work data rate ndr cannot be controlled, since itdepends on the network infrastructure betweenclient and server. Our system leaves the trade-offto the user’s discretion by providing a slider thatlets the user specify a position between high inter-activity (reply size 1 Kbyte) and high quality(reply size 32 Kbytes). Figure 8 shows the expect-

ed frame rates for various network data rates andreply sizes (the frame rate has an upper bound of10 frames per second).

Three factors limit the effective frame rate userscan achieve: the network bandwidth betweenclient and server, the time required to decompressa frame, or the client’s desired frame display rate.The interaction response time on the client is thesum of the following intervals:

■ Network latency for sending the request andthe reply. The interval ranges from less than 1millisecond on a local area network to morethan 100 ms on a slow connection (typically 70to 80 ms for long distances).

■ Request transfer from the client to the server.The interval depends on the effective networkthroughput (typically 2 to 3 ms for 48-byterequests with a data rate of 128 Kbps).

■ Request processing (slice extraction and com-pression). The interval depends on the serverprocessing power and disk throughput (typi-cally 10 to 20 ms to serve a request from theserver’s memory cache on a Pentium III run-ning at 733 MHz).

■ Reply transfer from the server to the client.Again, the interval depends on the effectivenetwork throughput.

A Pentium III at 733 MHz can decompress 1,000JPEG blocks per second (roughly 40 Kbytes ofincoming data, depending on the compressionratio), and decompression starts as soon as thefirst data chunk arrives. Because decompressionis generally quicker than network data transfer,decompression can occur simultaneously withthe transfer.

Experiments have shown that even latencies of250 ms still provide acceptable responsivenesslevels. Because the client displays frames afterslice arrival and no more than two requests canbe pending at any time, the effective frame rateadapts itself to instantaneous network through-

rs

ndrfps

=



8163264128

256

512

1024

1

4

16024

6

8

10

Replysize(Kbyte)

Network data rate (Kbyte/second)

Framerate

Figure 8. Frame display rate for various networkdata rates and reply sizes.

Table 1. Image quality during navigationacross the Visible Human data set.

Network data rate Reply size Frame rate Extracted slice size RMS error(Kbit/s) (bytes) (frames (pixels) (scale 0.255)

per second)128 4,000 4 128 × 128 13.001,000 8,000 16 192 × 192 8.871,000 32,000 4 384 × 384 3.30


Visible Human

put. Table 1 shows the image quality when navi-gating at a fixed speed of 3.33 mm/s across theVisible Human data set for different connectiontypes and reply sizes. The root-means-squared(RMS) error denotes the distance between theactual displayed slice and a full-resolutionuncompressed slice.

Multiple-Client PerformanceBecause our slice-based system places a high loadon the server, it currently supports a rather limitednumber of users. A client that moves the sliceframe continuously, requesting six slices per sec-ond, for example, places a 17 percent processorload on the server for a single Pentium III at 733MHz. Such a server can accommodate at most fiveclients simultaneously.

Other factors that can influence server perfor-mance in multiclient scenarios include the totalavailable bandwidth on the server’s network con-nection, the server’s memory size, and its diskthroughput. The total bandwidth requirement isnot much of a problem for high-throughput insti-tutional network connections, which generallyprovide between 35 and 155 Mbps. The diskthroughput and memory size are critical, howev-er, because the server cannot load the completedata set into memory at one time. To avoid hav-ing to reload data from disk with every trans-ferred slice, the server must be able to hold atleast enough extents for all the slices the clientsare currently viewing. A typical 384 × 384 pixelslice requires from 160 to 300 extents, for exam-ple, or a maximum of 15 Mbytes of memory. Theserver can thus support all five clients with 100Mbytes of RAM.

Future WorkInteractive navigation makes it possible forresearchers to follow complex, twisted anatom-ic structures and move through organs in anygiven direction. With real-time interaction, auser can position slices at odd angles acrossstructures to find the optimal observation direc-tion. The present implementation shows thatusable real-time navigation is feasible on cur-rently available networks. The performanceremains lower than what we can obtain with alocal data set on DVD.8

The next step in our research involves creatingscalable server architectures using clusters of PCsas server systems. Such systems would be of par-ticular interest for extracting information from ter-abyte data sets. To situate detailed slice informa-

tion within larger 3D anatomic structures, weintend to combine the real-time navigation servicewith the 3D visualization of anatomic organs con-structed from the labeled data set.

References

1. M. Ackerman, “The Visible Human Project,” Proc. IEEE,

vol. 86, no. 3, Mar. 1998, IEEE Press, Piscataway, N.J., pp.

504-511.

2. C. North, B. Schneiderman, and C. Plaisant, “User Con-

trolled Overviews of an Image Library: The Visible

Human Explorer,” Proc. Visible Human Conf., Oct. 1996;

http://www.nlm.nih.gov/research/visible/vhp_conf/

vhpconf.htm.

3. R.D. Hersch et al., “The Visible Human Slice Web Server: A

First Assessment,” Proc IS&T/SPIE Conf. Internet Imaging,

The Int. Soc. for Optical Eng., Bellingham, Wash., vol. 3964,

2000, pp. 253-258.

4. A. Kaufman and E. Shimony, “3D Scan-Conversion Algo-

rithms for Voxel-Based Graphics,” Proc. 1986 Workshop

Interactive 3D Graphics, ACM Press, New York, 1986, pp.

45-75.

5. M. Charrier, D. Santa Cruz, and M. Larsson, “JPEG 2000:

The Next Millennium Compression Standard for Still

Images,” IEEE Multimedia Systems 99, IEEE Computer Soc.

Press, Los Alamitos, Calif., vol. 1, 1999, pp. 131-132.

6. A. Ortega, S. Breslin, and K. Lengwehasatit, “Variable Com-

plexity Algorithm for JPEG decoding,” http://biron.usc.

edu/~lengweha/jpeg/jpg.html.

7. W. R. Stevens, TCP/IP Illustrated: The Protocols, Addison-

Wesley, Reading, Mass., 1994.

8. S. Gerlach and R.D. Hersch, “The Real-time Interactive

Visible Human Navigator,” Proc. Third Visible Human

Conf., U.S. Nat’l Library of Medicine, 2000; http://

www.nlm.nih.gov/research/visible/vhpconf2000/.

Sebastian Gerlach is an assistant and PhD candidate at Ecole

Polytechnique Fédérale de Lausanne (EPFL). He received a

diploma in microengineering from EPFL. His research

interests include real-time 3D visualization and parallel

application frameworks.

Roger D. Hersch is a professor at Ecole Polytechnique Fédérale

de Lausanne (EPFL). He received an electrical engineer-

ing diploma from ETH Zurich and a PhD in computer sci-

ence from EPFL. His current research interests include

high-performance server applications (imaging servers,

Web servers, and PC clusters) and novel imaging tech-

niques (color prediction, color reproduction, artistic imag-

ing, and anticounterfeiting).

Readers can contact the authors at {Sebastian.Gerlach,

RD.Hersch}@epfl.ch.

A Real-timelsp · further exploration. Navigator Interface The user interface of the client applet,...

Documents

Transcript of A Real-timelsp · further exploration. Navigator Interface The user interface of the client applet,...