Applications of DIP



An overview of the various fields where digital image processing can be applied to produce efficient results.


M5.1: Introduction and Definition:

A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels, each of which has a particular location. For each pixel, there is an associated number known as a Digital Number (DN) or sample, which dictates the color and brightness of that particular pixel. An image may be defined as a two-dimensional function, f(x, y), where ‘x’ and ‘y’ are spatial (plane) coordinates, and the amplitude of ‘f’ at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point. “Digital image processing” is the technology of applying computer algorithms to process digital images. The outcomes of this process can be either images or a set of representative characteristics or properties of the original images.

What is digital image processing? Digital image processing deals with the manipulation and analysis of pictures by a computer. It can:

• Improve pictorial information for better clarity (human interpretation).
• Automate machine processing of scene data (interpretation by a machine, storage, transmission).

M5.2: Application Areas of Digital Image Processing: Today, there is almost no area of technical endeavor that is not impacted in some way by digital image processing. We can cover only a few of these applications in the context and space of the current discussion. In general, the fields that use digital image processing techniques include, but are not limited to, criminology/forensics, morphology, microscopy, photography, remote sensing, medical imaging, transportation, and military applications.


M5.2.1: CRIMINOLOGY/FORENSICS: Few types of evidence are more incriminating than a photograph or videotape that places a suspect at a crime scene. Ideally, the image will be clear, with all persons, settings, and objects reliably identifiable. Unfortunately, that is not always the case, and the photograph or video image may be grainy, blurry, of poor contrast, or even damaged in some way. In such cases, investigators may rely on computerized technology that enables digital processing and enhancement of an image. The U.S. government, in particular the military, the FBI, and the National Aeronautics and Space Administration (NASA), and more recently private technology firms, have developed advanced computer software that can dramatically improve the clarity of, and the amount of detail visible in, still and video images.

M5.2.2: MEDICAL IMAGING: This is a technology that can be used to generate images of a human body (or part of it). These images are then processed or analyzed by experts, who provide clinical diagnoses and prescriptions based on their observations. Ultrasound, X-ray, Computed Tomography (CT), and Magnetic Resonance Imaging (MRI) are quite often seen in daily life, though each applies a different sensory system.

M5.2.3: REMOTE SENSING: This is a technology of employing remote sensors to gather information about the Earth. Usually the techniques used to obtain the information depend on electromagnetic radiation, force fields, or acoustic energy that can be detected by cameras, radiometers, lasers, radar systems, sonar, seismographs, thermal meters, etc.

M5.2.4: MILITARY: This area has been studied extensively in recent years. Existing applications include object detection, tracking, three-dimensional reconstruction of terrain, etc. For example, a human body or any subject producing heat can be detected at night using infrared imaging sensors, a technique commonly used on the battlefield. Another example: a three-dimensional reconstruction of a target is matched against a template stored in a database before the target is destroyed by a missile.

M5.2.5: TRANSPORTATION: This is a newer area that has developed only in recent years. One of the key technological advances is the design of automatically driven vehicles, where imaging systems play a vital role in path planning, obstacle avoidance, and servo control. Digital image processing has also found applications in traffic control and transportation planning.


M5.2.6: INDUSTRIAL INSPECTION/ QUALITY CONTROL: A major area of digital image processing is in automated visual inspection of manufactured goods. Figure M5.1 shows some examples. Figure M5.1(a) is a controller board for a CD-ROM drive. A typical image processing task with products like this is to inspect them for missing parts. Figure M5.1(b) is an imaged pill container. The objective here is to have a machine look for missing pills. Figure M5.1(c) shows an application in which image processing is used to look for bottles that are not filled up to an acceptable level. Figure M5.1(d) shows a clear-plastic part with an unacceptable number of air pockets in it. Detecting anomalies like these is a major theme of industrial inspection that includes other products such as wood and cloth.

(Fig M5.1: Some examples of manufactured goods often checked using digital image processing: (a) a CD-ROM controller board, (b) a pill container, (c) filled bottles, (d) a clear-plastic part with air pockets.)


M5.2.7: DIGITAL CAMERA IMAGES: Digital cameras generally include dedicated digital image processing chips to convert the raw data from the image sensor into a color-corrected image in a standard image file format. Images from digital cameras often receive further processing to improve their quality, a distinct advantage that digital cameras have over film cameras. The digital image processing typically is executed by special software programs that can manipulate the images in many ways. Many digital cameras also enable viewing of histograms of images, as an aid for the photographer to understand the rendered brightness range of each shot more readily.

M5.2.8: MORPHOLOGY: The word morphology commonly denotes a branch of biology that deals with the form and structure of animals and plants. In image processing, we use the term in the context of mathematical morphology, a tool for extracting image components that are useful in the representation and description of region shape, such as boundaries, skeletons, and the convex hull. The language of mathematical morphology is set theory; sets represent the shapes of objects in an image.

M5.2.9: COMPUTER VISION: Computer vision is the science and technology of machines that see, where see in this case means that the machine is able to extract information from an image that is necessary to solve some task. As a scientific discipline, computer vision is concerned with the theory behind artificial systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras, or multi-dimensional data from a medical scanner. Examples of applications of computer vision include systems for:

• Controlling processes (e.g., an industrial robot or an autonomous vehicle).
• Detecting events (e.g., for visual surveillance or people counting).
• Organizing information (e.g., for indexing databases of images and image sequences).
• Modeling objects or environments (e.g., industrial inspection, medical image analysis, or topographical modeling).
• Interaction (e.g., as the input to a device for computer-human interaction).

Computer vision is, in some ways, the inverse of computer graphics. While computer graphics produces image data from 3D models, computer vision often produces 3D models from image data.


M5.2.10: AUGMENTED REALITY: Augmented reality (AR) is a term for a live direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sensory input, such as sound or graphics. It is related to a more general concept called mediated reality, in which a view of reality is modified (possibly even diminished rather than augmented) by a computer. The technology thus functions by enhancing one’s current perception of reality. By contrast, virtual reality replaces the real world with a simulated one.

M5.2.11: NON-PHOTOREALISTIC RENDERING: Non-photorealistic rendering (NPR) is an area of digital image processing that focuses on enabling a wide variety of expressive styles for digital art. In contrast to traditional computer graphics, which has focused on photorealism, NPR is inspired by artistic styles such as painting, drawing, technical illustration, and animated cartoons. NPR has appeared in movies and video games in the form of "toon shaders," as well as in architectural illustration and experimental animation; a modern example is cel-shaded animation. There are a number of other applications of digital image processing. Face recognition, iris recognition, speaker recognition, and fingerprint classification can be seen in daily life. Digital watermarking is used for image and data security. Medical image processing covers X-ray, sonography, image enhancement, and 3-D image reconstruction. Online inspection of industrial parts is applied in industry. Remotely sensed data captured through multispectral cameras is analyzed, for example, to monitor crops. Of digital image processing we can truly say that “the future is here”.

M5.3: Difference between Image Processing and Computer Graphics: There is often confusion among newcomers about the difference between image processing and computer graphics. Unless specifically mentioned otherwise, computer graphics is all about synthesizing a new image from geometry, lighting parameters, materials, and textures; the emphasis is on digital image synthesis. Image processing is the process of manipulating an image acquired through some device; the image will typically come from photographs, scanners, or medical equipment, and the emphasis is on analysis and enhancement of the image. Computer vision is an area where image analysis is used heavily. Raster operations dominate in image processing, whereas in computer graphics you will mix vector and raster operations to generate the final image.


The key element that distinguishes image processing (or digital image processing) from computer graphics is that image processing generally begins with images in the image space and performs pixel-based operations on them to produce new images that exhibit certain desired features. For example, we may reset each pixel in the image displayed on the monitor screen to its complementary color (e.g., black to white and white to black), turning a dark triangle on a white background to a white triangle on a dark background, or vice versa. While each of these two fields has its own focus and strength, they also overlap and complement each other. In fact, stunning visual effects are often achieved by using a combination of computer graphics and image processing techniques.

M5.4: Fundamental Steps in Digital Image Processing: The fundamental steps in image processing are:

• Image acquisition
• Preprocessing
• Segmentation
• Representation and description
• Recognition and interpretation

M5.4.1: IMAGE ACQUISITION: The first step in this process is to acquire a digital image. Doing so requires an imaging sensor and the capability to digitize the signal produced by the sensor. The imaging sensor could also be a line-scan camera that produces a single image line at a time; in this case, the object's motion past the line scanner produces a 2-dimensional image. If the output of the camera or other imaging sensor is not already in digital form, an analog-to-digital converter digitizes it. Two elements are required to acquire digital images:

• A physical device that is sensitive to a band in the electromagnetic energy spectrum (such as the X-ray, ultraviolet, visible, or infrared bands) and that produces an electrical signal output proportional to the level of energy sensed.
• A digitizer, a device for converting the electrical output of the physical sensing device into digital form.

Image digitization is achieved by feeding the video output of the cameras into a digitizer, as stated earlier, which converts the given input to its equivalent digital form.

M5.4.2: IMAGE PREPROCESSING: After a digital image has been obtained, the next step deals with preprocessing that image. The key function of preprocessing is to improve the image in ways that increase the chances for success of the other processes. It typically deals with techniques for enhancing contrast, removing noise, and isolating regions whose texture indicates a likelihood of alphanumeric information. The three main categories of digital image processing are image compression, image enhancement and restoration, and measurement extraction. Image compression is a mathematical technique used to reduce the amount of computer memory needed to store a digital image. The computer discards some information, while retaining sufficient information to make the image pleasing to the human eye. Image enhancement techniques can be used to modify the brightness and contrast of an image, to remove blurriness, and to filter out some of the noise. Using mathematical procedures called algorithms, the computer applies each change to either the whole image or a targeted portion of the image. The principal objective of enhancement is to process the image so that the result is more suitable than the original image for a specific application. In image measurement, the aim is to extract information about the distribution of the sizes of the objects. This usually involves segmenting the image to separate the objects of interest from the background.

M5.4.3: SEGMENTATION: The first step in image analysis generally is to segment the image. Segmentation subdivides an image into its constituent parts or objects. The level to which this subdivision is carried depends on the problem being solved; that is, segmentation should stop when the objects of interest in an application have been isolated. In general, autonomous segmentation is one of the most difficult tasks in image processing.

M5.4.4: REPRESENTATION AND DESCRIPTION: Representation and description almost always follow the output of a segmentation stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the set of pixels separating one image region from another) or all the points in the region itself. In either case, converting the data to a form suitable for computer processing is necessary. The first decision that must be made is whether the data should be represented as a boundary or as a complete region. Boundary representation is appropriate when the focus is on external shape characteristics, such as corners and inflections. Regional representation is appropriate when the focus is on internal properties, such as texture or skeletal shape. In some applications, these representations complement each other. Choosing a representation is only part of the solution for transforming raw data into a form suitable for subsequent computer processing. A method must also be specified for describing the data so that features of interest are highlighted. Description, also called feature selection, deals with extracting attributes that result in some quantitative information of interest or that are basic for differentiating one class of objects from another.


M5.4.5: RECOGNITION AND INTERPRETATION: Recognition is the process that assigns a label to an object based on the information provided by its descriptors. Interpretation involves assigning meaning to an ensemble of recognized objects. For example, identifying a character as, say, a ‘c’ requires associating the descriptors for that character with the label c. We conclude the coverage of digital image processing by developing several techniques for recognition and interpretation.


M5.5: The Storage and Capture of Digital Images: Almost all graphics software deals with some “real” images that are captured using digital cameras or flatbed scanners. This section deals with the practicalities of acquiring, storing and manipulating such images.

Images are stored in computers as a 2-dimensional array of numbers. The numbers can correspond to different information such as color or gray scale intensity, luminance, chrominance, and so on.

Before we can process an image on the computer, we need the image in digital form. Transforming a continuous-tone picture into digital form requires a digitizer. The most commonly used digitizers are scanners and digital cameras. The two functions of a digitizer are sampling and quantizing. Sampling captures evenly spaced data points to represent an image. Since these data points are to be stored in a computer, they must be converted to a binary form. Quantization assigns each sampled value a binary number.
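As a rough illustration of these two steps, the sketch below samples a continuous intensity function on an evenly spaced grid and then quantizes each sample to an 8-bit gray level. The digitize helper and its normalized-coordinate convention are assumptions made for this example only.

```python
import numpy as np

def digitize(f, width, height, levels=256):
    """Sample a continuous image function f(x, y), with x and y in [0, 1),
    on an evenly spaced grid, then quantize each sample to one of
    `levels` integer gray codes (illustrative helper, not a standard API)."""
    ys, xs = np.mgrid[0:height, 0:width]
    samples = f(xs / width, ys / height)   # sampling: evenly spaced data points
    # quantization: map each sampled value to a binary (integer) code
    return np.clip((samples * (levels - 1)).round(), 0, levels - 1).astype(np.uint8)

# A smooth horizontal gradient becomes an 8-bit digital image:
img = digitize(lambda x, y: x, width=8, height=2)
```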

Any image from a scanner, or from a digital camera, or in a computer, is a digital image. Computer images have been "digitized", a process which converts the real world color picture to instead be numeric computer data consisting of rows and columns of millions of color samples measured from the original picture. The way a digital camera creates this copy of a color picture is with a CCD chip behind the lens, constructed with a grid of many tiny light-sensitive cells, or sensors, arranged to divide the total picture area into rows and columns of a huge number of very tiny subareas. A 3 megapixel camera CCD has a grid of 2048x1536 sensors (3 million of them). Each sensor samples the color of one of those tiny areas, creating an image of size 2048x1536 pixels.


A scanner has a one-row array of similar cells, and a motor moves this row of sensors down the page, making columns to form the full grid. In either case, the color and brightness of each tiny area seen by a sensor is "sampled", meaning the color value of each area is measured and recorded as a numeric value which represents the color there. This process is called digitizing the image. The data is organized into the same rows and columns to retain the location of each actual tiny picture area. Each one of these sampled numeric color data values is called a pixel. Pixel is a computer word formed from PICture ELement, because a pixel is the smallest element of the digital image. In your photo editor program, zoom an image to about 500% size on the screen, and you will see the pixels.

The fundamental thing to understand about digital images is that they consist of pixels, and are dimensioned in pixels. It may help to realize that a picture constructed of colored mosaic tile chips on a wall or floor is a somewhat similar concept, being composed of many tiny tile areas, each represented by a sample of one color. From a reasonable viewing distance, we do not notice the individual small tiles; our brain just sees the overall picture represented by them. The concept of pixels is similar, except that these pixels (digitized color sample values) are extremely small, and are aligned in perfect rows and columns of tiny squares, to compose the rectangular total image. A pixel is the remembered color value of each one of these color samples representing tiny square areas. The size of the image is dimensioned in pixels, X columns wide and Y rows tall. When all of this image data (millions of numbers representing tiny color sample values, each called a pixel) is recombined and reproduced in correct row and column order on printed paper or a computer screen, our human brain recognizes the original image again.

What's a pixel? Numbers. A digital color image pixel is just an RGB data value (Red, Green, Blue). Each pixel's color sample has three numerical RGB components (Red, Green, Blue) to represent the color. These three RGB components are three 8-bit numbers for each pixel. Three 8-bit bytes (one byte for each of R, G, and B) is called 24-bit color. Each 8-bit RGB component can have 256 possible values, ranging from 0 to 255. For example, the three values (250, 165, 0), meaning (Red=250, Green=165, Blue=0), denote one orange pixel. Photo editor programs have an eyedropper tool to show the three RGB color components for any image pixel. 24-bit RGB color images use 3 bytes per pixel, and can have 256 shades of red, 256 shades of green, and 256 shades of blue. This is 256x256x256 = 16.7 million possible combinations or colors for 24-bit RGB color images. The pixel's RGB data value shows "how much" red, green, and blue, and those three colors and intensity levels are combined at that pixel location. The composite of the three RGB values creates the final color for that one pixel area.
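To make the arithmetic concrete, here is a minimal Python sketch packing the three 8-bit components of the orange pixel above into one 24-bit value; the red-in-high-byte layout is one common convention, assumed purely for illustration.

```python
r, g, b = 250, 165, 0                 # the orange pixel from the text

packed = (r << 16) | (g << 8) | b     # three bytes -> one 24-bit value
print(hex(packed))                    # 0xfaa500

# unpack the three components again
print((packed >> 16) & 0xFF, (packed >> 8) & 0xFF, packed & 0xFF)  # 250 165 0

print(256 ** 3)                       # 16777216: the "16.7 million" colors
```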


How to store these pixels? We know that we store the image in a matrix of real numbers. Assume that an image f(x, y) is sampled so that the resulting digital image has ‘M’ rows and ‘N’ columns. Thus, the values of the coordinates at the origin are (x, y)=(0, 0). The next coordinate values along the first row of the image are represented as (x, y)=(0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row.


The notation introduced in the preceding paragraph allows us to write the complete M x N digital image in the following compact matrix form:

f(x, y) = | f(0,0)      f(0,1)      ...  f(0,N-1)   |
          | f(1,0)      f(1,1)      ...  f(1,N-1)   |
          | ...         ...         ...  ...        |
          | f(M-1,0)    f(M-1,1)    ...  f(M-1,N-1) |

The right side of this equation is by definition a digital image. Each element of this matrix array is called an image element, picture element, pixel, or pel. This matrix of all this RGB data is stored as an “Image File”. The image file contains three color values for every RGB pixel, or location, in the image grid of rows and columns. The data is also organized in the file in rows and columns. File formats vary, but the beginning of the file contains numbers specifying the number of rows and columns (which is the image size, like 800x600 pixels) and this is followed by huge strings of data representing the RGB color of every pixel. The viewing software then knows how many rows and columns, and therefore how to separate and arrange the following RGB pixel values accordingly into rows and columns. The image itself is an abstract thing. When we display that color data on the screen, then our human brain makes an image out of it from the appearance of all of these RGB data values.

M5.6: FILE FORMATS: Most RGB image formats use eight bits for each of the red, green, and blue channels. This results in approximately three megabytes of raw information for a single million-pixel image. To reduce the storage requirement, most image formats allow for some kind of compression. At a high level, such compression is either “lossless” or “lossy”. No information is discarded in lossless compression, while some information is lost unrecoverably in a lossy system. Including proprietary types, there are hundreds of image file types. The PNG, JPEG, and GIF formats are most often used to display images on the Internet. These graphic formats are listed and briefly described below, separated into the two main families of graphics: raster and vector.


M5.6.1: RASTER FORMATS: These formats store images as bitmaps (also known as pixmaps). Let’s discuss some of the popular formats in this category:

M5.6.1.1: JPEG: JPEG (Joint Photographic Experts Group) is a compression method; JPEG-compressed images are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG compression is (in most cases) lossy compression. The JPEG/JFIF filename extension in DOS is JPG (other operating systems may use JPEG). Nearly every digital camera can save images in the JPEG/JFIF format, which supports 8 bits per color (red, green, blue) for a 24-bit total, producing relatively small files. When not too great, the compression does not noticeably detract from the image's quality, but JPEG files suffer generational degradation when repeatedly edited and saved. JPGs use a complex compression algorithm, which can be applied on a sliding scale. Compression is achieved by ‘forgetting’ certain details about the image, which the JPG will then try to fill in later when it is being displayed. You can save a JPG with 0% compression for a perfect image with a large file size, or with 80% compression for a small but noticeably degraded image. In practical use, a compression setting of about 60% will result in the optimum balance of quality and file size, without letting the lossy compression do too much damage. Though JPGs can be interlaced, they lack many of the other special abilities of GIFs, like animation and transparency; but as I said, they really are only for photos. Simple graphics with large blocks of color should not be saved as JPGs because the edges get all smudgy. The JPEG format is likely to be replaced at some point in the future by the updated JPEG 2000 format.

Advantages of JPEG Images:

• Huge compression ratios mean faster download speeds.
• JPEG produces excellent results for most photographs and complex images.
• JPEG supports full-color (24-bit, "true color") images.

M5.6.1.2: JPEG2000: JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. JPEG 2000 also adds features that are missing in JPEG. It is not nearly as common as JPEG, but it is used currently in professional movie editing and distribution (e.g., some digital cinemas use JPEG 2000 for individual movie frames).


M5.6.1.3: GIF: GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors such as simple diagrams, shapes, logos and cartoon style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed images or dithered images.

The GIF file format uses a relatively basic form of file compression that squeezes out inefficiencies in the data storage without losing data or distorting the image. LZW, the compression scheme used in the GIF format, is best at compressing images with large fields of homogeneous color; it is less efficient at compressing complicated pictures with many colors and complex textures.

The characteristics of LZW compression can be exploited to improve its efficiency and thereby reduce the size of your GIF graphics. The strategy is to reduce the number of colors in your GIF image to the minimum number necessary and to remove stray colors that are not required to represent the image. A GIF graphic cannot have more than 256 colors, but it can have fewer, down to a minimum of two (black and white). Images with fewer colors compress more efficiently under LZW compression. The GIF file format is also used to combine multiple GIF images into a single file to create animation. There are a number of drawbacks to this functionality. The GIF format applies no compression between frames, so if you are combining four 30-kilobyte images into a single animation, you will end up with a 120 KB GIF file to push through the wire. Another drawback of GIF animations is that there are no interface controls for this file format: GIF animations play whether you want them to or not, and if looping is enabled, the animations play again and again and again.

Advantages of GIF Files:

• GIF is the most widely supported graphics format on the Web.
• GIFs of diagrammatic images look better than JPEGs.
• GIF supports transparency and interlacing.

M5.6.1.4: BMP: The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS. Typically, BMP files are uncompressed, hence they are large; the advantage is their simplicity and wide acceptance in Windows programs.


M5.6.1.5: RAW: RAW refers to a family of raw image formats that are options available on some digital cameras. These formats usually use a lossless or nearly-lossless compression, and produce file sizes much smaller than the TIFF formats of full-size processed images from the same cameras. Although there is a standard raw image format, (ISO 12234-2, TIFF/EP), the raw formats used by most cameras are not standardized or documented, and differ among camera manufacturers. Many graphic programs and image editors may not accept some or all of them, and some older ones have been effectively orphaned already. Adobe's Digital Negative (DNG) specification is an attempt at standardizing a raw image format to be used by cameras, or for archival storage of image data converted from undocumented raw image formats, and is used by several niche and minority camera manufacturers including Pentax, Leica, and Samsung. The raw image formats of more than 230 camera models, including those from manufacturers with the largest market shares such as Canon, Nikon, Sony, and Olympus, can be converted to DNG.

M5.6.1.6: PNG: The PNG (Portable Network Graphics) file format was created as the free, open-source successor to the GIF. The PNG file format supports truecolor (16 million colors) while the GIF supports only 256 colors. The PNG file excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, while lossy formats like JPG are best for the final distribution of photographic images, because in that case JPG files are usually smaller than PNG files. Some older browsers do not support the PNG file format; however, beginning with Mozilla Firefox and Internet Explorer 7, all contemporary web browsers support all common uses of the PNG format, including full 8-bit translucency (Internet Explorer 7 may display odd colors on translucent images only when they are combined with IE's opacity filter).

M5.6.2: VECTOR FORMATS: As opposed to the raster image formats above (where the data describes the characteristics of each individual pixel), vector image formats contain a geometric description which can be rendered smoothly at any desired display size.

M5.6.2.1: CGM: CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster graphics, and text. All graphical elements can be specified in a textual source file that can be compiled into a binary file or one of two text representations. CGM provides a means of graphics data interchange for computer representation of 2D graphical information independent from any particular application, system, platform, or device. It has been adapted to some extent in the areas of technical illustration and professional design, but has largely been superseded by formats such as SVG and DXF.


M5.6.2.2: SVG: SVG (Scalable Vector Graphics) is an open standard created and developed by the World Wide Web Consortium to address the need (and the attempts of several corporations) for a versatile, scriptable, all-purpose vector format for the web and elsewhere. The SVG format does not have a compression scheme of its own, but due to the textual nature of XML, an SVG graphic can be compressed using a program such as gzip. Because of its scripting potential, SVG is a key component in web applications: interactive web pages that look and act like applications.

M5.7: BASIC DIGITAL IMAGE PROCESSING TECHNIQUES: There exist thousands of techniques to enhance the quality of a digital image. Here we will discuss some of the most widely used.

M5.7.1: Anti-Aliasing: Anti-aliasing is a method for improving the realism of an image by removing the jagged edges from it. These jagged edges, or “jaggies”, appear because a computer monitor has square pixels, and these square pixels are inadequate for displaying lines or curves that are not parallel to the pixel grid; another cause is a low sampling rate of the image information, which likewise leads to jaggies. Picture, for example, a darkened circle rendered on a pixel grid: its curved boundary shows a visible staircase.

Anti-aliasing is a method of fooling the eye into seeing a jagged edge as smooth. Anti-aliasing is often referred to in games and on graphics cards; in games especially, the chance to smooth the edges of images goes a long way toward creating a realistic 3D image on the screen. Remember, though, that anti-aliasing does not actually smooth any edges of images; it merely fools the eye. Like a lot of things, it is only designed to be good enough: if you can't tell the difference, then that's fine. Several algorithms have been developed for anti-aliasing; the simplest is to increase the resolution. Doubling the resolution in the horizontal and vertical directions makes the jags half the size and doubles their number, so they look smoother, but this quadruples the use of graphics memory, which is expensive. There are other, cheaper ways to handle the problem.
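The resolution-doubling idea generalizes to supersampling: render the scene at several times the target resolution, then average each block of subpixel samples down to one output pixel. The sketch below is a minimal illustration; the render callback, the disc test scene, and the factor parameter are assumptions of this example, not a fixed API.

```python
import numpy as np

def supersample(render, width, height, factor=2):
    """Naive supersampling anti-aliasing. `render(w, h)` is assumed to
    return a float grayscale image of shape (h, w); we render at
    `factor` times the target size and average factor x factor blocks."""
    hi = render(width * factor, height * factor)
    return hi.reshape(height, factor, width, factor).mean(axis=(1, 3))

def disc(w, h):
    """Hard-edged dark circle test scene (the kind of shape that shows jaggies)."""
    ys, xs = np.mgrid[0:h, 0:w]
    return ((xs - w / 2) ** 2 + (ys - h / 2) ** 2 < (w / 3) ** 2).astype(float)

# Averaging yields gray pixels along the boundary: exactly the
# "intermediate colors" that fool the eye into seeing a smooth edge.
smooth = supersample(disc, 16, 16, factor=4)
```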


“Another Anti-aliasing technique widely used in computer graphics to optimize the look of graphics and typography on the display screen; visually ‘smoothes’ the shapes in graphics and type by inserting pixels of intermediate colors along boundary edges between colors”

Let’s take a look at the example below to demonstrate the effects of Anti-Aliasing.

The letter on the left is a blown up letter ‘a’ with no anti-aliasing. The letter on the right has had anti-aliasing applied to it. In this blown up form it looks like it is simply blurred but if we reduce the size down to a more standard size you may see the difference.

Now look closely at the two letters. You can still tell that the letter on the left is jagged, but the letter on the right looks a lot smoother and less blurry than the example above. Remember, I have only shrunk the image back down to normal size and have not altered anything else in the image at all. So, as you can see, anti-aliasing produces a much more pleasing image to the eye.


Pros and Cons - The Summary: There are pros and cons to using anti-aliasing in both games and applications. We have been through them, but here is a quick summary to help you make up your mind whether using anti-aliasing is right for you and your PC.

Pros:

• Smooths out screen fonts
• Rounded edges look to have smooth curves
• Type can be easier to read due to better-quality fonts
• Games look a lot prettier and more realistic

Cons:

• Small text can be too blurred to read
• Already sharp edges can be made fuzzier
• You can't print out anti-aliased text, as it blurs
• Static image sizes are larger
• Games are affected by lower frame rates

M5.7.2: Convolutions: One of the reasons for capturing an image digitally is to allow us to manipulate it to better serve our needs. Often this will include trying to improve the subjective appearance of an image through smoothing of grainy features or sharpening of indistinct features. These goals sometimes can be accomplished through the use of a “discrete convolution” operation (also called digital filtering). A convolution lets you do many things: calculate derivatives, detect edges, apply blurs, and so on, a very wide variety of operations. And all of this is done with a “convolution kernel”. In general, convolution is a common image processing technique that changes the intensity of a pixel to reflect the intensities of the surrounding pixels. A common use of convolution is to create image filters. Using convolution, you can get popular image effects like blur, sharpen, and edge detection, effects used by applications such as Photo Booth, iPhoto, and Aperture. But first we should know that any picture is stored via pixel intensity levels.


Discrete convolution determines a new value for each pixel in an image by computing some function of that pixel and its neighbors. Often this function simply is a weighted sum of pixel values in a small neighborhood of the source pixel. These weights can be represented by a small matrix that sometimes is called a “convolution kernel”. The dimensions of the matrix must be odd so there will be a central cell to represent the weight of the original value of the pixel for which we are computing a new value. The new value is computed by multiplying each pixel value in the neighborhood of the central pixel by the corresponding weight in the matrix, summing all the weighted values, and dividing by the sum of the weights in the matrix.

(Fig: An example convolution kernel)


The anchor point starts at the top-left corner of the image and moves over each pixel sequentially. At each position, the kernel overlaps a few pixels on the image. Each overlapping pair of numbers is multiplied and added. Finally, the value at the current position is set to this sum. Here’s an example:

The matrix on the left is the image and the one on the right is the kernel. Suppose the kernel is at the highlighted position. The '9' of the kernel overlaps with the '4' of the image, so you calculate their product: 36. Next, the '3' of the kernel overlaps the '3' of the image, so you multiply: 9, and add it to 36, giving a sum of 36+9=45. You do the same for all the remaining 7 overlapping values to get a total sum, and this sum is stored in place of the '2' (in the image). The following figure is another example showing how this is done; here the filter used is an "emboss" kernel:


Now, the question that arises is: where does the convolution kernel or filter matrix come from? A kernel is a precalculated matrix, and the design of kernels is based on higher-level mathematics. You can find ready-made kernels on the Web; we will look at a few examples later in this topic.

Problematic corners and edges: The kernel is two-dimensional, so there are problems when the kernel is near the edges or corners of the image: some kernel cells fall outside the image and have no pixel values to overlap. Usually, to compensate, we create extra pixels near the edges. There are a few ways to create extra pixels:

• Set a constant value for these pixels, e.g., zero
• Duplicate the edge pixels
• Wrap the image around (copy pixels from the other end)

This usually fixes the problems that might arise.
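A minimal direct implementation of the procedure described above, using zero padding (the first strategy in the list) and dividing by the kernel sum when it is nonzero. This is an illustrative sketch; strictly speaking it computes correlation, since the kernel is not flipped, but for the symmetric kernels typical of filtering the two coincide.

```python
import numpy as np

def convolve(image, kernel):
    """Each output pixel is the weighted sum of its neighborhood,
    divided by the sum of the kernel weights (when nonzero).
    Edge pixels are handled by zero padding."""
    kh, kw = kernel.shape                  # kernel dimensions must be odd
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(float), ((ph, ph), (pw, pw)))  # constant zeros
    out = np.zeros(image.shape, dtype=float)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            region = padded[y:y + kh, x:x + kw]    # neighborhood of (y, x)
            out[y, x] = (region * kernel).sum()    # multiply and add
    s = kernel.sum()
    return out / s if s != 0 else out              # normalize by weight sum
```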

Examples: The design of kernels is based on higher-level mathematics, and ready-made kernels are easy to find on the Web. A few common examples are sketched below:
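The following are widely used kernels, written as NumPy arrays so they can be fed straight into the convolve() sketch above; the variable names are ours.

```python
import numpy as np

box_blur = np.ones((3, 3))              # all ones: averages the neighborhood

sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])      # boosts the center pixel against its neighbors

emboss = np.array([[-2, -1, 0],
                   [-1,  1, 1],
                   [ 0,  1, 2]])        # the emboss effect shown earlier

sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])        # responds to horizontal brightness changes

# e.g. blurred = convolve(image, box_blur) for any 2-D grayscale array
```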


Convolutions are used in many engineering and mathematics applications. Many types of blur filters and edge detection use convolutions. The operation is based on the relation g(x, y) = h(x, y) * f(x, y): an enhanced image g(x, y) is produced by convolving the image f(x, y) with an operator h(x, y).


M5.7.3: Thresholding : Image thresholding is a useful method in many image processing and computer vision applications, especially in image segmentations. It can be used to distinguish object and background pixels in a digital image by their gray-level values. The output of the bi-level thresholding operation is a binary image whose one part indicates the object and the other the background. Image thresholding is one of the most common image processing operations, since almost all image processing schemes need some sort of separation of the pixels into different classes.

Method and Variants: During the thresholding process, individual pixels in an image are marked as “object” pixels if their value is greater than some threshold value (assuming the object is brighter than the background) and as “background” pixels otherwise. This convention is known as “threshold above”. Variants include “threshold below”, which is the opposite of threshold above; “threshold inside”, where a pixel is labeled "object" if its value lies between two thresholds; and “threshold outside”, which is the opposite of threshold inside. Typically, an object pixel is given a value of “1” while a background pixel is given a value of “0”. Finally, a binary image (an image containing only white and black) is created by coloring each pixel white or black, depending on the pixel's label. An example is shown in the following figure:

(a) Original Image (b) Image after threshold

(Fig: Example of Thresholding)
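The four conventions described above are easy to state in code. In the sketch below, the function name and the mode strings are inventions of this example; it returns the binary object/background labeling just discussed.

```python
import numpy as np

def threshold(image, t1, t2=None, mode="above"):
    """Label each pixel as object (1) or background (0).
    `image` is a 2-D array of gray levels; t2 is only needed
    for the two-threshold variants."""
    if mode == "above":                     # object brighter than t1
        obj = image > t1
    elif mode == "below":                   # opposite of threshold above
        obj = image < t1
    elif mode == "inside":                  # object lies between two thresholds
        obj = (image > t1) & (image < t2)
    elif mode == "outside":                 # opposite of threshold inside
        obj = (image < t1) | (image > t2)
    else:
        raise ValueError(mode)
    return obj.astype(np.uint8)
```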

Threshold Selection: The key parameter in the thresholding process is the choice of the threshold value. Several different methods for choosing a threshold exist; users can manually choose a threshold value, or a thresholding algorithm can compute a value automatically, which is known as “automatic thresholding”. A simple method would be to choose the mean or median value, the rationale being that if the object pixels are brighter than the background, they should also be brighter than the average. In a noiseless image with uniform background and object values, the mean or median will work well as the threshold; however, this will generally not be the case. A more sophisticated approach might be to create a histogram of the image pixel intensities and use the valley point as the threshold. The histogram of an image might be considered a powerful measure for thresholding, since it represents the distribution of the image brightness. The histogram approach assumes that there is some average value for the background and object pixels, but that the actual pixel values have some variation around these average values. However, this may be computationally expensive, and image histograms may not have clearly defined valley points, often making the selection of an accurate threshold difficult.

(Fig: An example histogram)

One method that is relatively simple, does not require much specific knowledge of the image, and is robust against image noise, is the following iterative method:

1. An initial threshold (T) is chosen; this can be done randomly or according to any other method desired.

2. The image is segmented into object and background pixels as described above, creating two sets:

a) G1 = {f(m,n): f(m,n) > T} (object pixels)
b) G2 = {f(m,n): f(m,n) <= T} (background pixels)
(note: f(m,n) is the value of the pixel located in the mth column, nth row)

3. The average of each set is computed:
m1 = average value of G1
m2 = average value of G2

4. A new threshold is created that is the average of m1 and m2:
T' = (m1 + m2)/2

5. Go back to step two, now using the new threshold computed in step four; keep repeating until the new threshold matches the one before it (i.e., until convergence has been reached).

This iterative algorithm is a special one-dimensional case of the k-means clustering algorithm, which has been proven to converge at a local minimum—meaning that a different initial threshold may give a different final result.
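A direct transcription of steps 1 to 5, using the image mean as one reasonable initial threshold and a small tolerance in place of exact equality for the convergence test; both choices are illustrative assumptions.

```python
import numpy as np

def iterative_threshold(image, t=None, eps=0.5):
    """Iterative threshold selection per steps 1-5 above."""
    f = image.astype(float)
    if t is None:
        t = f.mean()                       # step 1: initial threshold
    while True:
        g1 = f[f > t]                      # step 2: object pixels
        g2 = f[f <= t]                     #         background pixels
        m1 = g1.mean() if g1.size else t   # step 3: average of each set
        m2 = g2.mean() if g2.size else t
        t_new = (m1 + m2) / 2              # step 4: new threshold
        if abs(t_new - t) < eps:           # step 5: repeat until convergence
            return t_new
        t = t_new
```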


Adaptive thresholding: Thresholding is called adaptive thresholding when a different threshold is used for different regions in the image. This may also be known as local or dynamic thresholding.

Multiband thresholding: Color images can also be thresholded. One approach is to designate a separate threshold for each of the RGB components of the image and then combine them with an AND operation. This reflects the way the camera works and how the data is stored in the computer, but it does not correspond to the way that people recognize color. Therefore, the HSL and HSV color models are more often used. It is also possible to use the CMYK color model.

Advantages:

• Simple to implement
• Fast (especially if repeating on similar images)
• Good for some kinds of images (e.g., documents, controlled lighting)

Disadvantages:

• No guarantees of object coherency; the result may have holes, extraneous pixels, etc.

M5.7.4: Image Enhancement: Image enhancement is the process of manipulating an image so that the result is more suitable than the original for a specific application. The word ‘specific’ is important here, because it establishes at the outset that enhancement techniques are problem oriented. Thus, for example, a method that is quite useful for enhancing X-ray images may not be the best approach for enhancing satellite images taken in the infrared band of the electromagnetic spectrum. There is no general “theory” of image enhancement. When an image is processed for visual interpretation, the viewer is the ultimate judge of how well a particular method works. Enhancement techniques are so varied, and use so many different image processing approaches, that it is difficult to assemble a meaningful body of techniques in one topic; consequently, enhancement methods are application specific and are often developed empirically. We will discuss some of the basic image enhancement techniques here. Figure M5.12 illustrates the importance of the application by the feedback loop from the output image back to the start of the enhancement process, and it models the experimental nature of the development. In this figure we define the enhanced image as E(r,c). The range of applications includes using enhancement techniques as preprocessing steps to ease the next processing step, as postprocessing steps to improve the visual perception of a processed image, or as an end in itself.

(Fig M5.12: Image Enhancement process)

First of all, enhancement methods can broadly be divided into the following two categories:

1. Spatial Domain Methods
2. Frequency Domain Methods

In spatial domain techniques, we deal directly with the image pixels; the pixel values are manipulated to achieve the desired enhancement. In frequency domain methods, the image is first transformed into the frequency domain: the Fourier transform of the image is computed first, all the enhancement operations are performed on the Fourier transform of the image, and then the inverse Fourier transform is performed to get the resultant image. Before we proceed further, note that we will consider only gray-level images; the same theory can be extended to color images. A digital gray image can have pixel values in the range 0 to 255. Let's discuss some common image enhancement techniques:
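The contrast between the two categories can be sketched in a few lines: a spatial-domain operation touches pixel values directly, while a frequency-domain operation round-trips through the Fourier transform. The ideal low-pass filter and its keep parameter below are illustrative choices, not a prescribed method.

```python
import numpy as np

# Spatial domain: manipulate pixel values directly.
def spatial_negative(image):
    return 255 - image                     # see Image Negatives below

# Frequency domain: transform, modify the spectrum, transform back.
def frequency_lowpass(image, keep=0.1):
    """Crude ideal low-pass filter: keep only the lowest `keep` fraction
    of spatial frequencies, which suppresses fine detail and noise."""
    F = np.fft.fftshift(np.fft.fft2(image))           # forward Fourier transform
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    mask = ((ys - h / 2) ** 2 + (xs - w / 2) ** 2) < (keep * min(h, w)) ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))  # inverse transform
```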


1. Contrast Enhancements: Contrast enhancements improve the perceptibility of objects in the scene by enhancing the brightness difference between objects and their backgrounds. Contrast enhancements are typically performed as a contrast stretch followed by a tonal enhancement, although these could both be performed in one step. A contrast stretch improves the brightness differences uniformly across the dynamic range of the image, whereas tonal enhancements improve the brightness differences in the shadow (dark), midtone (grays), or highlight (bright) regions at the expense of the brightness differences in the other regions.
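A minimal linear contrast stretch, assuming an 8-bit grayscale NumPy array: it maps the darkest input value to 0 and the brightest to 255, spreading brightness differences uniformly across the full dynamic range as described above.

```python
import numpy as np

def contrast_stretch(image):
    """Linear stretch of gray values to the full 0..255 range."""
    f = image.astype(float)
    lo, hi = f.min(), f.max()
    if hi == lo:                 # flat image: nothing to stretch
        return image
    return ((f - lo) * 255 / (hi - lo)).astype(np.uint8)
```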

2. Image Negatives: The most basic and simplest operation in digital image processing is to compute the negative of an image. The pixel gray values are inverted to compute the negative. For example, let an image of size RxC, where R is the number of rows and C is the number of columns, be represented by I(r,c). The negative N(r,c) of the image I(r,c) can be computed as

N(r,c) = 255 - I(r,c), where 0 <= r < R and 0 <= c < C

Every pixel value of the original image is subtracted from 255, and the resultant image becomes the negative of the original.

Reversing the intensity levels of an image in this manner produces the equivalent of a photographic negative. This type of processing is particularly suited for enhancing white or gray detail embedded in dark regions of an image, especially when the black areas are dominant in size. An example is shown in the figure above.
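With an 8-bit grayscale image stored as a NumPy array, the formula above is a one-liner; the function wrapper is purely for illustration.

```python
import numpy as np

def negative(image):
    """Image negative per N(r, c) = 255 - I(r, c);
    assumes an 8-bit grayscale array (dtype uint8)."""
    return 255 - image
```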


3. Brightness Control: If a digital image has poor brightness, the objects in the image will not be clearly visible. This is typically the case when the image is captured under low-light conditions. To rectify this problem, we can increase the brightness of the captured digital image and make the image more attractive. If we study the histogram of a low-brightness image, we will find that most of the pixels lie in the left half of the gray-value range, as shown in figure M5.7(a) below.

(a) Dark image (b) Bright Image

(Fig M5.7)

The brightness of a dark image can easily be increased by adding a constant to the gray value of every pixel. This addition operation shifts the histogram towards the brighter side by a constant amount. While applying this method to increase the brightness of an image, we must choose the constant wisely so that the complete range of gray values remains within 0 to 255; if the final gray value of any pixel exceeds 255, information in the image is lost.
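A sketch of the operation with the clipping caveat made explicit; doing the addition in a wider integer type before clipping avoids 8-bit wrap-around. The function name is ours.

```python
import numpy as np

def adjust_brightness(image, constant):
    """Add `constant` to every gray value, clipping to 0..255 so that
    no pixel wraps around (the information-loss caveat above)."""
    f = image.astype(int) + constant       # widen before adding
    return np.clip(f, 0, 255).astype(np.uint8)
```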

4. Gamma Correction: The gamma correction operation performs a nonlinear brightness adjustment. Brightness for darker pixels is increased strongly, while it stays almost the same for bright pixels; as a result, more details become visible.

(Example of Gamma Correction)
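A common power-law formulation of gamma correction is out = 255 * (in / 255)^(1/gamma); the sketch below uses it, with the particular gamma value being an illustrative choice. For gamma > 1 this lifts dark pixels far more than bright ones, matching the behavior described above.

```python
import numpy as np

def gamma_correct(image, gamma=2.2):
    """Power-law brightness adjustment of an 8-bit grayscale array."""
    f = image.astype(float) / 255.0        # normalize to 0..1
    return (255 * f ** (1.0 / gamma)).round().astype(np.uint8)
```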


5. Image Subtraction: The difference between two images f(x, y) and h(x, y), expressed as

g(x, y) = f(x, y) - h(x, y)

is obtained by computing the difference between all pairs of corresponding pixels from f and h. The key usefulness of subtraction is the enhancement of differences between images. The higher-order bit planes of an image carry a significant amount of visually relevant detail, while the lower planes contribute more to fine (often imperceptible) detail. Figure 5.8(a) shows a fractal image to illustrate the concept of bit planes. Figure 5.8(b) shows the result of discarding (setting to zero) the four least significant bit planes of the original image. The two images are nearly identical visually. The pixel-by-pixel difference between these two images is shown in Fig. 5.8(c). The differences in pixel values are so small that the difference image appears nearly black when displayed on an 8-bit display. In order to bring out more detail, we can perform histogram equalization; the result is shown in Fig. 5.8(d). This is a very useful image for evaluating the effect of setting the lower-order planes to zero.

(Fig 5.8: (a) Original fractal image. (b) Result of setting the four lower-order bit planes to zero. (c) Difference between (a) and (b). (d) Histogram-equalized difference image.)
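The two operations behind Fig 5.8 are straightforward to sketch: a bit mask zeroes the n least significant planes, and the subtraction is done in a signed type so that small differences are not lost to unsigned wrap-around. The function names are ours.

```python
import numpy as np

def drop_low_bitplanes(image, n=4):
    """Set the n least significant bit planes of an 8-bit image to zero
    (the step used for Fig 5.8(b); n=4 gives mask 0xF0)."""
    mask = 0xFF & ~((1 << n) - 1)
    return image & mask

def difference(f, h):
    """g(x, y) = f(x, y) - h(x, y), computed in a signed integer type;
    the small values typically need histogram equalization to be visible."""
    return f.astype(int) - h.astype(int)
```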


6. Edge Enhancement in Spatial Domain: For many remote sensing earth science applications, the most valuable information that may be derived from an image is contained in the edges surrounding various objects of interest. Edge enhancement delineates these edges and makes the shapes and details comprising the image more conspicuous and perhaps easier to analyze. Generally, what the eyes see as pictorial edges are simply sharp changes in brightness value between two adjacent pixels. The edges may be enhanced using either linear or nonlinear edge enhancement techniques.
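As one concrete linear technique, the sketch below combines the two Sobel kernels into a gradient-magnitude edge map. It reuses the convolve() sketch from M5.7.2 and is an illustrative choice, not the only way to enhance edges.

```python
import numpy as np

def sobel_edges(image):
    """Edge strength at each pixel as the magnitude of the brightness
    gradient; assumes the convolve() sketch from M5.7.2 is in scope."""
    kx = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]])
    ky = kx.T
    gx = convolve(image, kx)     # horizontal brightness changes
    gy = convolve(image, ky)     # vertical brightness changes
    return np.hypot(gx, gy)
```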

The techniques discussed here are some of the most common for image enhancement. There are hundreds more, specific to particular application domains, but we cannot discuss them all here because of space limitations. Here we come to the end of the syllabus for "Computer Graphics and Image Processing". But don't say sayonara; just take a KitKat break for now, as we will help you in MCA also. Happy to help you. -Magix Team