FPGA based Embedded


  • 7/30/2019 FPGA based Embedded


    FPGA-based Embedded

    Fingerprint Verification and Matching System

    Preetham.P (1), Rajagopal.M (2), Sri Ramalinga Ganesa Perumal.N (3)

    (1) UG Scholar, Department of Electronics and Communication
    Engineering, Sudharsan Engineering College, Pudukkottai.
    [email protected]

    (2) UG Scholar, Department of Electronics and Communication
    Engineering, Sudharsan Engineering College, [email protected]

    (3) Professor and Head, Department of Electronics and Communication
    Engineering, Sudharsan Engineering College, Pudukkottai

    Abstract. The development of a fingerprint verification system on a low-cost
    embedded platform remains an open issue in biometrics. This paper describes
    a low-cost fingerprint minutiae extraction and matching system based on a
    Spartan3 family FPGA with an embedded Leon2 open core processor. The
    proposed system architecture incorporates a Floating Point Unit and a Discrete
    Fourier Transform coprocessor to accelerate the minutiae extraction process. The
    whole verification algorithm is based on the NFIS version 2 open source
    software developed by the National Institute of Standards and Technology (NIST).
    Results on execution time reduction and FPGA occupation for different system
    configurations show that the proposed architecture substantially improves the
    performance of the baseline system architecture.

    1 Introduction

    In today's society, identity verification is becoming a crucial issue in several business
    sectors, such as access or border control. As a result, the field of biometrics
    has emerged, which uses unique physiological or behavioural characteristics, not
    shared by any other individual, to positively identify a person. Examples of physical
    characteristics include fingerprints, facial patterns, hand measurements, eye retinas
    and irises, while examples of mostly behavioural characteristics include signature, gait
    and typing patterns.

    Fingerprint-based verification is one of the most widely used biometric techniques today
    due to its ease of acquisition and its high distinctiveness, persistence and acceptance
    by the public [1].

    This paper describes the design flow of a low-cost embedded system for fingerprint
    verification. The proposed system consists of a 32-bit SPARC Leon2 processor, a fingerprint
    image sensor, signal processing hardware acceleration and a floating point unit. It is worth
    noting that the minutiae extraction module can work with any fingerprint sensor;
    only the fingerprint image capture driver would need to change.


    Similar FPGA-based fingerprint verification systems have been developed and proposed

    in the literature [2][3][4]. From the system architecture point of view, the Thumbpod

    project [2] is probably the most important reference due to its similarities to the platform

    presented in this paper. Both systems have been built upon the Leon2 soft processor,

    although for our system implementation we have selected a low-cost Spartan3 family

    FPGA as the core of the design, while a more expensive Virtex II device was chosen in

    the Thumbpod project. Moreover, a hardware coprocessor for floating point management

    (FPU) enabled our system to accurately perform high speed floating point operations in

    contrast with the fixed-point refinement required in the Thumbpod project.

    Regarding software development, both projects are based on the NFIS (NIST
    Fingerprint Image Software) open source software. Nevertheless, the verification system
    proposed in this paper has its roots in the enhanced version 2 (NFIS2) of this software
    and uses the specified input fingerprint image format (500 dpi, 256 grey levels)
    for optimum performance. The Thumbpod project algorithm, on the other hand, uses
    low quality images (3 bits per pixel) as the input to the NFIS version 1
    minutiae extraction flow. The proposed system is also open to most fingerprint sensors,
    in contrast to other platforms which have been customized for a specific fingerprint
    sensor. As far as the matching algorithm is concerned, the BOZORTH3 algorithm has been
    implemented.

    The paper is organized as follows. Section 2 describes the software architecture for the
    minutiae extraction and matching algorithm targeted to a Leon2 based platform.
    Section 3 explains the proposed HW architecture of the design, with emphasis on
    the floating point acceleration achieved by means of an FPU and a DFT
    coprocessing engine. In Section 4, some results on speed enhancement for the best of the
    three proposed system configurations are shown and finally, in Section 5, some concluding
    remarks are drawn.

    2 Software Architecture

    The fingerprint authentication algorithms developed for the target system are based on
    routines of the NFIS2 collection from the National Institute of Standards and
    Technology (NIST). Specifically, custom versions of the MINDTCT and BOZORTH3
    packages have been developed for minutiae acquisition and matching respectively.

    It is worth noting that the algorithm and the implemented software parameters have
    been designed and set for optimum performance with images scanned at 500 ppi with
    256 grey levels. These input image characteristics match those of the
    images provided by the most relevant fingerprint sensors, such as the Fujitsu MBF200,
    which was the sensor chosen for this research.

    2.1 Software Implementation on a Leon2 Platform

    The original software was designed and tested to run on a Linux operating system,
    compiled with gcc. In this project, a bare-C cross-compiler and the GRMON debug
    monitor from Gaisler Research have been used. Dynamically allocated data arrays
    store intermediate results, depending on the application's requirements.

    On the other hand, although the original software only accepts input image files in


    A LOW-COST EMBEDDED FINGERPRINT VERIFICATION AND MATCHING SYSTEM

    ANSI/NIST, WSQ, JPEGB, JPEGL and IHEAD formats, the selected fingerprint sensors
    provide the fingerprint image in RAW format. The algorithm has therefore been modified
    and adapted so that only RAW format image files are accepted.
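    A RAW image carries no header, so the reader has to be told the image dimensions.
    The following minimal Python sketch illustrates that point for an 8-bit greyscale
    buffer. The function name and the tiny example buffer are hypothetical; this is not
    the loader used in the ported NFIS2 code.

```python
def load_raw_greyscale(data, width, height):
    """Interpret headerless 8-bit greyscale bytes as a row-major image."""
    if len(data) != width * height:
        raise ValueError(f"expected {width * height} bytes, got {len(data)}")
    # Slice the flat buffer into rows of `width` pixel values (0..255).
    return [list(data[r * width:(r + 1) * width]) for r in range(height)]


# Illustrative 3x2 buffer; a real capture would use the sensor's dimensions.
image = load_raw_greyscale(bytes([0, 128, 255, 10, 20, 30]), width=3, height=2)
```

    Rejecting a buffer whose length does not match the stated dimensions is the only
    consistency check available for a format without a header.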

    2.2 Minutiae Extraction Algorithm

    The MINDTCT software has been designed in a modular fashion, so
    each step of the algorithm is mainly executed in its own subroutine. Only the modules
    strictly necessary to generate the output minutiae list in XYT format (position,
    direction and quality) have been taken from the original algorithm. The functional
    steps executed in the minutiae extraction algorithm are shown in Figure 1.

    Figure 1: Functional steps of the minutiae extraction algorithm

    In the image map generation phase, degraded fingerprint areas that are likely to produce
    erroneous minutiae are identified. To this end, unreliable image zones are detected based
    on the following criteria:

    Low contrast: Marks low contrast areas in the image which mainly correspond with the

    background of the image or smudges in the fingerprint (Low Contrast Map, LCM)

    (Figure 2).

    Low ridge flow: Identifies those image areas where the dominant ridge flow could not

    be determined initially (Low Flow Map, LFM) (Figure 2).

    High curvature: Flags high curvature areas in the image, such as the fingerprint core or

    possible delta regions (High Curve Map, HCM) (Figure 2).
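    As an illustration of the first criterion, a block can be flagged as low contrast when
    the spread of its pixel intensities is small. The sketch below is one plausible way to
    do this in Python; the percentile bounds and the threshold `min_spread` are assumptions
    chosen for the example, not the values used by MINDTCT.

```python
def is_low_contrast(block, min_spread=5):
    """Flag a block whose 10th-to-90th percentile intensity spread is small.

    block: flat list of 8-bit pixel values for one image block.
    min_spread: hypothetical threshold, not taken from NFIS2.
    """
    ordered = sorted(block)
    n = len(ordered)
    low = ordered[n // 10]             # approximate 10th percentile
    high = ordered[(9 * n) // 10 - 1]  # approximate 90th percentile
    return (high - low) < min_spread


background = [200] * 64       # uniform background block
ridges = [50, 200] * 32       # alternating dark ridges and light valleys
```

    Using percentiles instead of the minimum and maximum keeps a few outlier pixels from
    hiding an otherwise flat block.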


    Figure 2: Generated Image Maps (Low Contrast Map, Low Flow Map, High Curve Map)

    The computed Low Contrast Map, Low Flow Map and High Curvature Map are shown

    in Figure 2.

    A quality map is derived by combining these three maps. It assigns one
    of five possible quality levels to each block in the image (Figure 3). The
    quality levels are ranked as follows:

    0: Poor quality

    1: Fair quality

    2: Good quality

    3: Very good quality

    4: Excellent quality

    In this phase of the algorithm, one of the fundamental maps for the minutiae extraction
    process is also derived: the directional ridge flow map. To obtain this map,
    the original image is divided into blocks of 8x8 pixels. For each image block a
    window of 24x24 pixels is defined, formed by the block itself and the surrounding
    pixels. The window is rotated incrementally through the 16 orientations defined in the
    algorithm (11.25 degrees apart) and a DFT is executed at each orientation. At every
    orientation, the pixels along each rotated row of the window are summed, forming a
    vector of row sums; 16 such vectors result, one per orientation. Each of these vectors
    is then convolved with four waveforms of different frequency. The spatial frequency of
    each waveform discretely represents the width of different ridges and valleys, so that
    widths of 12, 6, 3 and 1.5 pixels are covered by the algorithm. To determine the
    dominant ridge flow within a block, the resonance coefficients obtained from the
    convolutions are evaluated. The result of this module of the algorithm is shown in
    Figure 4.
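    The procedure can be sketched in a few lines of Python. This is a simplified
    illustration under stated assumptions, not the MINDTCT implementation: all function
    names are hypothetical, the rotation uses nearest-neighbour sampling, and the
    "convolution with a waveform" is computed as the power of a single DFT coefficient.
    Ridge widths of 12, 6, 3 and 1.5 pixels correspond to periods of 24, 12, 6 and 3
    pixels, that is, 1, 2, 4 and 8 cycles across a 24-sample row-sum vector.

```python
import math

N_DIRS = 16            # orientations, 11.25 degrees apart (from the text)
WIN = 24               # window side in pixels (from the text)
CYCLES = (1, 2, 4, 8)  # cycles per 24 samples: ridge widths 12, 6, 3, 1.5 px


def rotated_row_sums(image, cx, cy, theta):
    """Sum pixels along each row of a WIN x WIN grid rotated by theta about (cx, cy)."""
    half = (WIN - 1) / 2.0
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sums = []
    for r in range(WIN):
        dy = r - half
        total = 0.0
        for c in range(WIN):
            dx = c - half
            # Nearest-neighbour sample of the rotated grid point (a simplification).
            x = int(round(cx + dx * cos_t - dy * sin_t))
            y = int(round(cy + dx * sin_t + dy * cos_t))
            total += image[y][x]
        sums.append(total)
    return sums


def dft_power(vec, cycles):
    """Squared magnitude of one DFT coefficient of vec."""
    n = len(vec)
    re = sum(v * math.cos(2 * math.pi * cycles * i / n) for i, v in enumerate(vec))
    im = sum(v * math.sin(2 * math.pi * cycles * i / n) for i, v in enumerate(vec))
    return re * re + im * im


def dominant_direction(image, cx, cy):
    """Return (orientation index, cycles) giving the strongest resonance."""
    best = (0.0, 0, CYCLES[0])
    for d in range(N_DIRS):
        sums = rotated_row_sums(image, cx, cy, d * math.pi / N_DIRS)
        for k in CYCLES:
            power = dft_power(sums, k)
            if power > best[0]:
                best = (power, d, k)
    return best[1], best[2]


# Synthetic 48x48 image: stripes with a 6-pixel period (3-pixel ridges) along x.
SIZE = 48
stripes = [[128 + 100 * math.sin(2 * math.pi * x / 6) for x in range(SIZE)]
           for _ in range(SIZE)]
direction, cycles = dominant_direction(stripes, 23.5, 23.5)
```

    The window is sampled from an image larger than 24x24 so that the rotated grid never
    leaves the image. The orientation index returned here is simply the rotation step at
    which the row sums vary most strongly; mapping it to a ridge angle convention is left
    open.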

    Once the image maps have been acquired, the fingerprint image must be binarized
    so that the minutiae can be extracted. To carry out this process, the previously
    computed directional ridge flow map is used to determine the binary value assigned to
    each pixel. After the binarization (Figure 4), the minutiae detection module scans
    the binarized image for candidate minutiae (ridge endings or bifurcations).
    However, not all the ridge patterns selected by this procedure correspond to true
    minutiae, so a false minutiae removal step is carried out. Even after this step,
    false minutiae may remain in the candidate list. To mitigate this, a reliability
    measure is assigned to each minutia based on the quality map and other pixel intensity
    statistics. The resulting minutiae for the template image are shown in Figure 4.

    Figure 3: Computed quality map

    2.3 Matching Algorithm

    The BOZORTH3 matching algorithm, included in the second distribution of NFIS,
    computes a match score that reflects the degree of similarity between a fingerprint's
    minutiae and a template minutiae set, both in XYT format. One of the most remarkable
    features of this algorithm is its invariance to both rotation and translation.

    The first step in the algorithm flow shown in Figure 5 is to construct a comparison
    table for each of the input minutiae sets. Relative measures between each minutia and
    the rest of the minutiae in the same fingerprint are computed and stored in this
    table. Because only relative measures are stored, the algorithm is invariant to
    translation and rotation.
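    The invariance can be checked with a small Python sketch. It assumes that the relative
    measures are the distance between two minutiae and the angle of each minutia relative
    to the line joining them; the exact table layout of BOZORTH3 is not reproduced, and
    all names are hypothetical.

```python
import math


def pair_table(minutiae):
    """Build rows (i, j, distance, beta_i, beta_j) for every minutia pair.

    minutiae: list of (x, y, theta_deg) tuples.
    """
    rows = []
    for i in range(len(minutiae)):
        xi, yi, ti = minutiae[i]
        for j in range(i + 1, len(minutiae)):
            xj, yj, tj = minutiae[j]
            dist = math.hypot(xj - xi, yj - yi)
            # Direction of the line joining the two minutiae, in degrees.
            phi = math.degrees(math.atan2(yj - yi, xj - xi))
            rows.append((i, j, dist, (ti - phi) % 360.0, (tj - phi) % 360.0))
    return rows


def transform(minutiae, angle_deg, dx, dy):
    """Rotate every minutia about the origin, then translate it."""
    a = math.radians(angle_deg)
    return [(x * math.cos(a) - y * math.sin(a) + dx,
             x * math.sin(a) + y * math.cos(a) + dy,
             (t + angle_deg) % 360.0) for x, y, t in minutiae]


def circular_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


probe = [(10.0, 20.0, 45.0), (40.0, 25.0, 120.0), (30.0, 60.0, 300.0)]
moved = transform(probe, 30.0, 100.0, -50.0)
same = all(
    abs(r[2] - s[2]) < 1e-6
    and circular_diff(r[3], s[3]) < 1e-6
    and circular_diff(r[4], s[4]) < 1e-6
    for r, s in zip(pair_table(probe), pair_table(moved))
)
```

    Rotating and shifting the whole minutiae set changes every absolute coordinate but
    leaves each table row unchanged, which is why two tables can be compared directly.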

    The next step is to look for compatible entries between the two tables. The results
    of this analysis are stored in a new compatibility table, which consists of a list of
    compatibility associations between two possibly corresponding minutiae. Each of these
    associations represents a single link in the compatibility graph.

    In the final phase of the matching software flow, the compatibility graph is traversed and
    clusters are created by linking table entries. Once the traversals are complete, compatible
    clusters are combined and a match score is computed by accumulating the linked table
    entries across the combined clusters. Generally, a match score greater than 40 indicates
    that the fingerprint minutiae and the template minutiae belong to the same finger, and
    therefore to the same individual.


    Figure 4: Resulting images of the minutiae extraction process (Direction Map,
    Binarized Image, Final Minutiae)

    3 HW Architecture

    3.1 Initial System Architecture

    The initial HW architecture is composed of a 50 MHz fixed-point Leon2 soft-processor
    with 8 KB of cache memory for data and instructions, embedded in a
    GR-XC3S1500 board. This processor has been chosen for this application not only because
    of its high performance and usability, but also because it is available under the
    LGPL license. According to a report on synthesizable CPU cores [5], in which Leon2,
    MicroBlaze and OpenRISC 1200 were tested under three different hardware configurations
    and three different benchmarks, Leon2 yielded the best performance per clock cycle for
    all the benchmarks and configurations. Moreover, in the opinion of the authors of that
    report, Leon2 is the processor with the highest usability among the tested
    CPU cores. A reason for this may be the availability of the VHDL code and the Tcl/Tk
    based configuration tool, which facilitates the design of a custom Leon2 based system.

    The fingerprint image acquisition is performed by means of an IP (Intellectual Property)
    module connected to the MBF200 fingerprint sensor. This module is attached to the APB
    bus (AMBA Peripheral Bus) and provides the processor with the input fingerprint image,
    as shown in Figure 6. The operation of the sensor is controlled by means of three
    on-chip registers (control, data and status) mapped in the address range allocated to
    the APB bridge.

    3.1.1 Running the application on the initial system

    The original minutiae extraction and matching algorithm was implemented using
    floating-point arithmetic, while the Leon2 soft-processor is fixed-point. In order to
    run the program on the target platform, the -msoft-float option must be set in the
    compiler options. This option forces all floating-point operations to be done in
    software with integer arithmetic. The execution of the algorithm is successful as far
    as the matching results are concerned, but not in terms of execution time. Minutiae
    extraction takes 157 seconds, while the matching process for a one-to-one comparison
    takes 51 seconds. Although the extracted minutiae and the match score are correct,
    the program execution delay is unacceptable for a biometric verification system.

    Figure 5: Functional steps of the matching algorithm

    The excessive execution time is mainly due to the MINDTCT algorithm, and thus
    reducing the time required for minutiae extraction becomes one of the
    main objectives of this paper.

    The MINDTCT module uses a large amount of floating-point data. Emulating
    this data format in software seriously delays the execution of the program. To
    accelerate this process, a floating point unit (FPU) is required. The following section
    analyzes the effects of this core on execution time and floating-point data processing.

    3.2 FPU Tests

    The Leon2 processor provides an interface to different FPUs. The floating-point units
    from Gaisler Research (GRFPU) and Sun Microsystems (Meiko), as well as the
    incomplete LTH FPU [6], are compatible with this interface. To accelerate the
    floating-point processes, the GRFPU was finally chosen for its high performance and
    compliance with the IEEE-754 standard.

    The insertion of an FPU in the embedded system leads to a considerable increase in the
    amount of logic inside the FPGA. For this reason, a reduction in the processor
    clock frequency and/or in the amount of cache memory is required. The effect of
    attaching a floating point unit has been analyzed for the following three system
    configurations:

    31 MHz and 8KB cache memory.

    37 MHz and 8KB cache memory.

    40 MHz and 4KB cache memory.


    Figure 6: Proposed system architecture

    The FPGA utilization is almost complete and very similar for the three analyzed
    hardware configurations. The device utilization summary for the three cases is shown in
    Table 1.

                                      31 MHz,       37 MHz,       40 MHz,
                                      8 KB cache    8 KB cache    4 KB cache
    Number of external IOBs           36%           36%           36%
    Number of LOCed External IOBs     97%           97%           97%
    Number of MUL18X18s               53%           53%           53%
    Number of RAMB16s                 56%           68%           56%
    Number of slices                  99%           99%           99%
    Number of SLICEMs                 1%            1%            1%
    Number of BUFGMUXs                37%           37%           37%
    Number of DCMs                    50%           50%           50%

    Table 1: Device utilization for different system configurations

    The floating-point behaviour has been assessed by means of the Stanford and Paranoia
    benchmarks. The first one measures the execution time in milliseconds of each of
    the ten small programs included in the benchmark. Only two of these programs make use
    of floating-point numbers: FFT and Mm. Paranoia, on the other hand, is a program
    that tests compliance with the IEEE-754 floating-point standard [5].

    It is important to point out that these programs must be compiled
    without the -msoft-float compiler option for those system configurations in which the
    floating-point core is inserted.


                Execution time (ms)
    Program     A       B       C       D
    Perm        34      50      33      34
    Towers      50      83      67      50
    Queens      33      50      33      33
    Intmm       166     133     100     116
    Mm          1000    84      50      67
    Puzzle      317     450     350     350
    Quick       50      50      33      33
    Bubble      50      50      50      50
    Tree        233     334     266     250
    FFT         1067    83      67      50

    3.2.1 Stanford Benchmark

    The outcome of the execution of this program for different system configurations is shown
    in Table 2. The study has been carried out for the following embedded designs:

    A: 50 MHz clock frequency with 8KB cache memory (No FPU).

    B: 31 MHz clock frequency with 8KB cache memory and FPU.

    C: 37 MHz clock frequency with 8KB cache memory and FPU.

    D: 40 MHz clock frequency with 4KB cache memory and FPU.

    It is worth mentioning the considerable execution time reduction achieved in those
    program modules in which floating-point operations are carried out. The decrease in
    execution time for the matrix multiplication program (Mm) is 95% in the best case
    (37 MHz and 8 KB cache memory) and 91.6% in the worst case (31 MHz and 8 KB cache
    memory). The results for the FFT program are similar: the computation time is reduced
    by 95.3% in the best case (40 MHz and 4 KB cache memory) and
    by 92.22% in the worst case (31 MHz and 8 KB cache memory). The execution time for
    integer operations does not show remarkable variations.

    Table 2: Stanford benchmark results for different system configurations
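    The quoted percentages follow directly from the Mm and FFT rows of Table 2, taking
    configuration A (no FPU) as the baseline. The short Python check below reproduces the
    arithmetic; the variable and function names are illustrative only.

```python
# Execution times in milliseconds, copied from the Mm and FFT rows of Table 2.
results_ms = {
    "Mm":  {"A": 1000, "B": 84, "C": 50, "D": 67},
    "FFT": {"A": 1067, "B": 83, "C": 67, "D": 50},
}


def reduction_percent(baseline, accelerated):
    """Relative execution time reduction with respect to the baseline, in percent."""
    return 100.0 * (baseline - accelerated) / baseline


# Reduction of each FPU configuration (B, C, D) relative to configuration A.
reductions = {
    program: {cfg: round(reduction_percent(times["A"], times[cfg]), 2)
              for cfg in "BCD"}
    for program, times in results_ms.items()
}
```

    For Mm the best case is configuration C and for FFT it is configuration D, while
    configuration B is the worst case for both, which matches the figures in the text.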

    3.2.2 Paranoia Benchmark

    The outcome of the Paranoia benchmark was identical for the three proposed
    configurations. The arithmetic is diagnosed as satisfactory, although a minor
    underflow flaw has been found:

    Paranoia version 1.1 [cygnus]

    .. .

    FLAW: X = 3.05947655544740190e-308

    is not equal to Z = 2.22507385850720138e-308.

    yet X - Z yields 0.00000000000000000e+00 .

    This is correct as long as the underflow is notified.


    3.3 Introducing the GRFPU in the design

    Tests on IEEE-754 compliance gave positive results for the Gaisler Research
    floating-point core, and therefore this module was included in the hardware design.
    Computation time improved substantially after the FPU insertion, mainly in the
    execution time required to complete the MINDTCT process. A 94.14% time reduction
    has been achieved for the 40 MHz and 4 KB cache memory configuration. The
    execution time of the matching algorithm improved slightly.

    The timing results for the different system configurations are shown in Figure 7. Note
    that the analyzed system configurations are the same as those in Section 3.2.1.

    Figure 7: Execution Time Results for different Leon2-based Configurations

    3.4 HW Speed Enhancement

    Even though the computation time of the minutiae extraction algorithm has been reduced
    by 94.14%, the program completion delay is still excessive for implementation in a real
    commercial system.

    Several timing analyses show that 75% of the computation time is spent on
    the low contrast map, direction map and low flow map generation process, and that 92%
    of the time dedicated to image map computation is needed to generate the directional
    ridge flow map, as shown in Figure 8. This process is accelerated by means of a hardware
    accelerator that performs the DFT calculations required by the algorithm.

    Figure 8: Profiling of the execution time for: a.- MINDTCT algorithm b.- LCM, LFM

    and DM modules
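    These profiling figures bound what a DFT accelerator can achieve. The Python sketch
    below applies Amdahl's law under two assumptions that the text does not state
    explicitly: that the 94.14% reduction is measured against the 157 s baseline of
    Section 3.1.1, and that the whole direction map time can be offloaded to hardware.
    The numbers it produces are estimates, not measured results.

```python
baseline_s = 157.0       # MINDTCT time without FPU (Section 3.1.1)
fpu_reduction = 0.9414   # reduction reported for 40 MHz and 4 KB cache
map_share = 0.75         # share of MINDTCT time spent generating image maps
direction_share = 0.92   # share of map time spent on the direction map

# Estimated MINDTCT time once the FPU is in place (assumed baseline: 157 s).
time_with_fpu_s = baseline_s * (1.0 - fpu_reduction)

# Fraction of the total MINDTCT time spent on the direction map.
dft_fraction = map_share * direction_share


def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the work runs `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# Upper bound: the accelerated part takes no time at all.
max_speedup = 1.0 / (1.0 - dft_fraction)
```

    Under these assumptions about 69% of the extraction time is spent on the direction map,
    so even an ideal accelerator would make MINDTCT at most roughly 3.2 times faster; the
    remaining map generation and minutiae detection steps set the floor.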

    4 Conclusions

    This paper describes the implementation of a fingerprint minutiae extraction and matching

    algorithm running on a Spartan3 based system with an embedded Leon2 soft-processor.

    The original application developed by NIST has been modified and ported to the target

    platform. Several tests have been carried out to analyze the performance of the software

    algorithms with different Leon2 and GRFPU configurations.

    After the insertion of a floating-point unit, the execution time of the minutiae
    extraction algorithm was reduced by 94.14% for the 40 MHz and 4 KB cache memory
    configuration.

    Acknowledgments

    This work was supported by the BIOSEG PROFIT Project funded by the Spanish Ministry of

    Science and Technology. The authors would also like to thank Jiri Gaisler and Richard Pender

    for their support in setting up the system architecture.

    References

    [1] A.K. Jain. Biometric recognition: How do I know who you are? In Proceedings of the
    IEEE 12th Signal Processing and Communications Applications Conference, April 2004.

    [2] S. Yang, K. Sakiyama and I. Verbauwhede. A Compact and Efficient Fingerprint
    Verification System for Secure Embedded Devices. In Conference Record of the
    Thirty-Seventh Asilomar Conference on Signals, Systems and Computers, pages 2058-2062,
    Pacific Grove, California, November 2003.

    [3] A. Lindoso, L. Entrena and J. Izquierdo. FPGA-Based acceleration of fingerprint
    minutiae matching. In III Southern Conference on Programmable Logic (SPL2007),
    Mar del Plata, Argentina, February 2007.


    [4] M. Lopez Garcia and E. F. Canto Navarro. FPGA Implementation of a Ridge Extraction
    Fingerprint Algorithm Based on a MicroBlaze and Hardware Coprocessor. In 16th
    International Conference on Field Programmable Logic and Applications (FPL 2006),
    Madrid, Spain, August 2006.

    [5] Daniel Mattson and Marcus Christensson. Evaluation of synthesizable CPU cores,
    Master's Thesis, Chalmers University of Technology, Gothenburg, Sweden, 2004.

    [6] Martin Kasprzyk. Floating Point Unit, Digital IC Project 2001, January 2002.