
(IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 10, 2010


Intelligent Online Course Evaluation System Using NLP Approach

Behrang Parhizkar, Kerfalla Kourouma, Siti Fazilah, Yap Sing Nian, Sujata Navartnam, Edmund Ng Giap Wen

Faculty of ICT, Limkokwing University of Creative Technology,

Cyberjaya, Selangor, Malaysia [email protected]

[email protected] [email protected]

[email protected] [email protected]

[email protected]

Abstract: Every semester, students at Limkokwing University are asked to complete course evaluations. The main goal of the course evaluation is to collect feedback from students in order to improve the quality of education. However, the current approach at Limkokwing University is the traditional paper-and-pencil method. In this paper we propose an intelligent online course evaluation system that automates this routine in order to facilitate data gathering, analysis and storage. The first part of the paper summarises the literature on online course evaluation, the second part describes our findings on the approaches used for text mining, and finally we discuss the proposed system and its use of Natural Language Processing.

Keywords: Online course evaluation, data mining, Natural Language Processing.

1. Introduction

The evolution of technology and computing has made the Internet the fastest medium of communication, where information is at your fingertips. This evolution has brought a new era in which real-time information is accessed from everywhere. As part of that evolution, education has reached the point where universities, lecturers and students communicate through the Internet. We are living in the era of the virtual world, where everything seems to be transformed from physical to digital form. Thus, new concepts such as virtual classrooms and digital libraries have been introduced to break the barriers of education and meet the challenges of the new millennium. The idea of an online course evaluation system is to abandon the paper evaluation system that has been used for years. The evolution from paper to online student evaluation is an innovative idea of the new millennium, where everything is automated and accessed from home: if students can study online and register online, why not evaluate their lecturers online? Despite the growth of the World Wide Web, online course evaluation remains a new topic for many institutions of higher education. Most of them are stuck on the traditional approach and have difficulty moving on to a web-based approach. However, some universities have conducted research, implemented online course evaluation and found it effective. An online evaluation system promises lower costs compared to paper-based evaluation. In addition, it saves time for the faculty, provides anonymity for students, offers better safeguards against tampering, and gives more flexibility in questionnaire and report design [2]. Like any system, online evaluation also has drawbacks, which include easier access to sensitive data by unauthorized users, lower response rates, and ratings that may be less favourable to lecturers. In this paper we discuss the argument over online versus offline course evaluation methods and the evolution of online course evaluation systems, and we present an intelligent online course evaluation system proposed to replace the current paper method used at Limkokwing University.

2. Previous Works

2.1. Evolution of Course Evaluation
According to Haskell, student evaluation of faculty members was first used at the University of Wisconsin in the early 1920s to collect student feedback. Many other universities introduced it in the 1960s as a decision-support tool for salary, promotion and tenure. Since then it has been the dominant method for evaluating teaching across North America, and it remains so today, although it is now used for formative purposes, to help faculty improve their teaching, rather than for summative decisions regarding salary, promotion, tenure and merit. With the emergence of the Internet, online evaluation systems were introduced in the 1990s. In 1997, Columbia University implemented its web course evaluation system [30]. This system allowed faculty to customize their surveys and was linked directly to the registrar's office for security. The evaluation results were published on a public web page where anyone could view them. In Australia, Deakin University recognized the potential


savings in time and expense it would gain by shifting from traditional evaluation to an online evaluation system [31]. Before the implementation of its online system in 1998, off-campus students used to mail their evaluation forms, and compiling these forms into electronic form could take up to three months. The implementation of an HTML- and CGI-based online system raised the response rate to 50% in 1997. Later, after the implementation of the complete online evaluation system, both on-campus and off-campus students could log in with their unique ID and complete the evaluation. In the late 1990s three universities in Hong Kong (HKUST, HKU and HKPU) collaborated to create two online evaluation systems [32]: COSSET (Centralised Online System for Student Evaluation of Teaching) and OSTEI (Online System for Teaching Evaluation of Instructors). Comparatively, COSSET provided more features than OSTEI and relied on registration information for student logins, while OSTEI used a combination of instructor ID and questionnaire ID for logins, which was less secure. OSTEI, however, was flexible: it allowed instructors to register and create their own questionnaires and provided a bank of 800 questions for building custom questionnaires. Another system was implemented by Drexel University [30]. This system was based on HTML, SQL and the Perl scripting language. Instructors would submit their questions on a template e-mail, which would be uploaded as evaluation forms into the system. The student's name and birth date were used to log in and complete the evaluation. To encourage participation, e-mail was the main means of communication between students and the faculty for reminding students to complete the evaluations. According to Hmieleski, in a report on higher education in 2000, only two of the 200 most wired institutions were using online evaluation. However, in 2002 the use of online evaluation systems was still considered limited in higher education.
Electronic Evaluation Method vs. Traditional Paper Method: Many universities hesitate to convert to web-based evaluation because of fears regarding cost, return rates and response quality [1], but in one of the earlier studies on this topic, [4] compared traditional course evaluation with online evaluation at Rutgers College of Pharmacy. The evaluation rates of both methods were compared: the paper method had an evaluation rate of 97% with a response rate of 45% to the open-ended questions, whereas the online method had an evaluation rate of 88% and a response rate of 33% for the open-ended questions. Dommeyer also conducted a survey to determine the preferred method of student appraisal, targeting a group of business professors [3]. Out of 159 faculty members, 33% responded, and there was a preference for paper evaluation because they believed it had a higher response rate and more accurate responses. It was concluded that the online approach could be more appealing to faculty members if techniques could be used to increase students' response rates.

In 2002 and 2004, Teaching Questionnaire ratings were collected online in several university departments in pilot tests. The tests showed lower performance for the web-based Teaching Questionnaire compared to the standard one, with a lower response rate and less favourable responses [2]. Several studies have demonstrated the low response rates obtained with web-based questionnaires, as illustrated in Table 1.

Table 1: Comparison between web-based and paper approaches

Researchers        Year   Web-based / E-mail   Mail / Paper
Medlin et al.      1999   28%                  47%
Guterbock et al.   2000   37%                  48%
Kwak and Radler    2000   27%                  42%
Crawford et al.    2001   35%                  -
Ranchhod & Zhou    2001   20%                  6%

A review of e-mail surveys conducted since 1986 [11] noted a drop in e-mail survey response rates from 46% in 1995/1996 to 31% in 1998/1999. Other research also noted a drop in response rates between surveys completed in 1995 and 1998 [12]. Layne et al. also conducted a comparative study between electronic and paper course evaluation [13]. In this study, 2,453 students were evaluated using the same questions in the electronic and the paper-based evaluation. The response rate was 60.6% for the in-class evaluation against 47.8% for the online evaluation. Another study, conducted in 2000, had very low student participation in the online evaluation; the reason given was that students were satisfied with their lecturers' performance, which gave them an excuse not to fill in the evaluation form [14]. Students found the online evaluation easy to use and liked it because of the anonymity. The online method gave them the ability to provide more thoughtful comments than the traditional method. However, some researchers found positive results for e-mail response rates. Unlike Table 1, Table 2 presents findings that show a higher response rate for online evaluation than for the traditional approach.

Table 2: Comparison between e-mail and mail evaluation

Authors              Year        E-mail   Mail
Parker               1992        60%      38%
Kiesler & Sproull    1986        67%      -
Walsh et al.         1992        76%      -
Jaclyn M, Grahan H   1998-2002   64%      -

Other researchers have stated that an advantage of e-mail or web-based surveys over the traditional method is that the savings in paper resources reduce costs by


80 to 95% [11] [19] [20]. Compared to a normal mail survey, an e-mail survey is cheaper, and the cost decreases as the survey size increases [21]. In addition, students provide more answers to open-ended questions online [21] [12], and according to another study from 1998, e-mail surveys are cheaper and faster than paper surveys, encourage respondents to reply and can provide a friendly environment [22]. According to [3], St. Louis College of Pharmacy compared traditional paper evaluation with online evaluation. Out of 169 students in the survey, 50 were randomly chosen to complete the same form online, and the other 119 students completed the traditional paper form. This study showed that, despite the smaller number of students completing the online evaluation, they provided more comments than the larger group that completed the evaluation on paper; the number of words typed online was seven times the number of words written offline. The time spent by students to complete the evaluation was approximately 10 minutes online versus 25 minutes offline. The staff took 30 hours to compile scores and comments from the paper forms versus one hour just to download scores and comments from the online survey.

2.2. Web-Based Survey Methodology
Our research indicates that, whether a survey is paper- or web-based, the methodology matters because it affects the response rate. One study shows that response rates can be maximised by keeping the questionnaire short, and that the follow-up notice is an important factor affecting the response rate: a reminder after two days produced a completion rate of 30.3%, while a reminder after five days produced a completion rate of 24.3% (p < 0.050). In general, a two-day reminder notice is suggested [6]. Other researchers [23], however, have found improved response rates with four contacts. In the Crawford et al. study, the authors mention that the more complicated the access to the questionnaire, the fewer users are motivated to respond. Researchers have demonstrated that ease of access to the survey page is important. Dommeyer and Moriarty showed that an embedded survey with easy access had a better response rate than an attached questionnaire that required downloading, completing and uploading [24]. To minimize the download time of questionnaire pages, other researchers recommend the use of simple designs for the system [25]. The questions should be straightforward and simple, and each item should ask only one question [26]. A lack of anonymity in e-mail surveys has been highlighted as one of the reasons for low response rates [16] [25]. Administrators can track passwords and easily access user answers. With e-mail in particular, an author is easily traceable via the return e-mail address, which may identify the respondent. And if the survey is published online with no password, there is no way to follow up on non-respondents and no control over the number of survey completions per person [6]. Some research teams have provided recommendations on the methodology to be used. There was a recommendation for

the university to implement an online rating system, with communication and training as essential components of the system. They also recommended that comments not be stored in the database or in electronic files after they have been used, as they could easily be accessed by an unauthorized user or disclosed under the Freedom of Information Act (FOIA). They also stated that the authors of comments and ratings should not be identifiable during data collection, and that the list of respondents should be deleted from the system and never be made available to the faculty and teachers. In addition, they gave recommendations for overcoming one of the common concerns about online evaluation systems, namely low response rates. They advised universities not to use incentives and sanctions to improve the response rate, but instead to use targeted announcements and frequent follow-up reminders for students during the period when evaluations are being collected. The completion time indicated in the invitation, the timing of the reminder notice, access to the survey, the perceived anonymity and confidentiality of responses, and rewards are factors that may affect e-mail survey response rates [6]. Another investigation, from 2001, shows that the more time given to complete the survey, the lower the response rate, because users do not focus when they know they have plenty of time to complete it [10]. Online respondents are more distracted by other open windows and may pay less attention. The risk of virus attacks is considerable, and the download time or the number of pages accessed may affect the online survey. Unlike the web-based approach, these are generally not issues with paper or mail surveys [6]. Because of the large number of e-mails users receive, they are likely to ignore e-mail from unknown senders [10].

3. Text Mining

When humans speak or write English, they use many word combinations to express or explain something. Out of 50 spoken or written words, the useful information needed by others might be only 10 to 20 words; the rest serve to make the language flow. When we read a text, our brain tries to extract the useful information and match it with something stored in our mind in order to understand and interpret it for decision support. The computer does the same through text mining. Text mining, also called text data mining, processes unstructured textual information to extract high-quality information through patterns and trends, for example by statistical pattern learning. It involves structuring the text so that it can be stored in a database and retrieved later for interpretation. Text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity-relation modelling. The term text mining, also called knowledge discovery in texts, first appeared in the literature in the 1990s. Un Yong N. and Raymond


J. M. defined it as "the process of finding useful or interesting patterns, models, directions, trends, rules from unstructured text". It refers generally to the process of extracting interesting information and knowledge from unstructured text [32]. It has been described as a truly interdisciplinary method drawing on information retrieval, machine learning, statistics, computational linguistics and especially data mining [32]. This implies that text mining uses techniques for information extraction and natural language processing (NLP) to extract data to which the algorithms and methods of KDD can be applied, a technique widely used by many authors [2] [32]. Text mining has also been described as the extraction of not-yet-discovered information from large collections of texts [35], and it is considered a process-oriented approach to texts. For a good comprehension of the topic, there are a few terms the reader needs to be familiar with.

3.1 Text Mining Approaches
Text mining has several approaches or techniques; among those proposed are knowledge discovery and data mining, information retrieval, and information extraction.
Knowledge Discovery and Data Mining: Knowledge discovery, or knowledge discovery in databases (KDD), is defined by [35] as the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. It covers the application of statistical and machine-learning methods to discover novel relationships in large relational databases [34]. The goal of KDD is to find hidden patterns or facts in a database or text file, and it includes several processing steps that must be applied to the data in order to achieve this goal. The main steps defined by the Cross Industry Standard Process for Data Mining (CRISP-DM) are shown in Figure 1. Business understanding and data understanding are the first steps and consist of analysing and understanding the initial problem. The next step, data preparation, consists of pre-processing to convert the data from text into a format suitable for the data mining algorithm, which is applied in the modelling phase. The process is completed by evaluation and deployment of the obtained model.

Figure 1: Phases of CRISP-DM (Andreas H, Andreas N, Gerhard P 2005)
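To make the data-preparation phase concrete, here is a minimal illustrative sketch (not part of the authors' system) that converts raw open-ended student comments into a token representation a mining algorithm could consume; the comments and the tiny stop-word list are invented for the example.

```python
import re

# Hypothetical raw comments; a tiny stop-word list stands in for real pre-processing resources.
comments = [
    "The lecturer explains clearly but the slides are outdated.",
    "Too much theory, not enough lab exercises!",
]
STOP_WORDS = {"the", "a", "an", "but", "are", "is", "not", "too", "much"}

def prepare(text):
    """Data-preparation phase: lowercase, strip punctuation, tokenise, drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

for c in comments:
    print(prepare(c))
# e.g. ['lecturer', 'explains', 'clearly', 'slides', 'outdated']
```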

KDD and data mining are two terms that are sometimes confused and used as synonyms. The reasons given are that data mining includes all aspects of the knowledge discovery process [33] and that data mining is a part of the KDD process, namely its modelling phase. In practice, data mining represents the same concept as KDD, and its goal is to retrieve useful information from data. It was defined by [33] as the search for valuable information in large quantities of data. For data mining to achieve its goals, a few research areas need to be involved: databases, machine learning and statistics.

• Databases are not only used to store the discovered information; they are also needed to support the data mining algorithms in identifying useful information.

• Machine learning (ML) is a field of artificial intelligence concerned with developing techniques that allow computers to learn by analysing data.

• Statistics is the science of analysing empirical data. Today many statistical methods are used in the field of KDD [33].

Information Retrieval (IR): Information retrieval is the finding of documents that contain answers to questions, not the finding of the answers themselves [35]. IR refers to searching for information using automatic methods for processing text data and comparing questions with answers. It is a research area that has grown widely with the World Wide Web; it was first used for automatic indexing. IR also refers to the extraction of information based on keywords, as in search engines [33].
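As a minimal illustration of keyword-based retrieval as just described, the sketch below (our own example, not a production IR engine) scores documents by how many query terms they contain and returns them in ranked order; the documents and query are invented.

```python
def keyword_search(query, documents):
    """Rank documents by how many distinct query keywords they contain."""
    terms = set(query.lower().split())
    scored = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        score = len(terms & words)
        if score > 0:
            scored.append((score, doc_id))
    return [doc_id for score, doc_id in sorted(scored, reverse=True)]

docs = {
    "d1": "online course evaluation improves teaching quality",
    "d2": "path loss models for mobile communications",
}
print(keyword_search("course evaluation quality", docs))  # ['d1']
```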

3.2 Natural Language Processing (NLP)
Natural Language Processing refers to text processing that aims at an understanding of human language by a computer; its goal is to interpret human language through the use of computers [36]. Natural Language Processing is a linguistic analysis technique for fast text processing. The concept is to understand an input given in natural language and, again through the use of a computer, to produce an interpretation of it as sentences in natural language. The process by which a computer understands a sentence in Natural Language Processing is illustrated in Figure 2.


Figure 2. Modules of Natural language understanding (Josef Leung & Ching-Long Yeh)

The pre-processing block reforms the words of the input sentence into various forms of single words. The pre-processed output is then passed to the parsing block, which constructs a syntactic structure by referring to a syntax-rule database. The syntax rules specify the type of each word (e.g. noun, verb).
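A rough approximation of the pre-processing and parsing-support steps can be obtained with an off-the-shelf toolkit. The sketch below uses NLTK's tokenizer and part-of-speech tagger purely for illustration; the paper does not name a toolkit, so this choice is an assumption.

```python
import nltk

# One-time downloads of the tokenizer and tagger models.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "The lecturer explains the concepts clearly"
tokens = nltk.word_tokenize(sentence)   # pre-processing: split into single words
tagged = nltk.pos_tag(tokens)           # word types such as DT (determiner), NN (noun), VBZ (verb)
print(tagged)
# [('The', 'DT'), ('lecturer', 'NN'), ('explains', 'VBZ'), ...]
```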

Figure 3 demonstrates how a sentence is segmented to form a syntactic structure.

Figure 3. A sample syntactic structure (Josef Leung & Ching-Long Yeh)

The semantics component represents the meaning of the sentences in a semantic representation. Semantic interpretation maps the syntactic structure of a sentence to a logic-based representation, the result of which is an interpretation of the context-independent meaning of the sentence. The process of mapping the semantic representation to the knowledge representation, called contextual interpretation, is then performed to obtain the way the sentence is used in a particular context.

Figure 4 illustrates the process of natural language generation, which is driven by the user's goal. The process accepts a goal as input from the user, and the text-planning component consults the planning operators to find an appropriate operator that suits the goal. The linguistic-realisation component then linearises the message content from the hierarchical structure to generate a cohesive unit of text and finally maps it into the surface sentences.

Figure 4. A general architecture of natural language generation (Josef Leung & Ching-Long Yeh)

3.3 Information Extraction (IE)
Un Yong N. and Raymond J. M. consider information extraction a key component of text mining, with the goal of finding specific data in natural-language text [34]. The extracted information is stored in database-like patterns, and the data to be extracted is specified by a template containing the required fields, with blank slots to be filled with information retrieved from the text. Example: "My name is Kerfalla Kourouma, I was born in 1986 in the US. I am married with two children and live in Paris." Suppose that from this short text our goal is to obtain the personal details of the author. The information we would be looking for is the name, birth date, marital status, address and phone number. Our template would therefore by default contain these titles with blank slots to be filled with information extracted from the text or document:
Name: Kerfalla Kourouma
Birth date: 1986
Marital status: married
Location: Paris
The above information is now in structured form and can be stored in a database and retrieved later for further use. From this example, we can describe information extraction as the retrieval and transformation of unstructured information into structured information stored in a database. Califf suggested using machine learning techniques to extract information from text documents in order to create easily searchable databases from the information, thus making it more easily accessible.
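The template-filling step can be sketched with a few regular expressions over the example sentence above; the patterns are deliberately simplistic and only show how extracted values populate the blank slots, not how a real IE system is built.

```python
import re

text = ("My name is Kerfalla Kourouma, I was born in 1986 in the US. "
        "I am married with two children and live in Paris.")

# Simple patterns for each slot of the template; real IE systems use far richer rules or learning.
patterns = {
    "Name": r"[Mm]y name is ([A-Z][a-z]+ [A-Z][a-z]+)",
    "Birth date": r"born in (\d{4})",
    "Marital status": r"I am (married|single)",
    "Location": r"live in ([A-Z][a-z]+)",
}

template = {}
for slot, pattern in patterns.items():
    match = re.search(pattern, text)
    template[slot] = match.group(1) if match else None

print(template)
# {'Name': 'Kerfalla Kourouma', 'Birth date': '1986', 'Marital status': 'married', 'Location': 'Paris'}
```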


Text Encoding: In text mining it is very important to encode plain text into a data structure suited to processing. Most text mining approaches are based on the idea that a text document can be represented by a set of words [33]; those words are held in a bag-of-words representation. The words are encoded in a vector representation in which a numerical value is stored for each word. The currently predominant approaches are the vector space model [39], the probabilistic model [25] and the logical model.
The Vector Space Model: The vector space model was first introduced for indexing and information retrieval [25], but it is now also used for text mining and in most currently available document retrieval systems [33]. It enables efficient analysis of huge collections of text documents by representing them as vectors in m-dimensional space. Each document d is described by a numerical feature vector w(d) = (x(d, t1), . . . , x(d, tm)); documents can thus be compared by simple vector operations, and even queries can be performed by encoding the query terms, similarly to a document, in a query vector. The query vector can then be compared to each document, and a result list can be obtained by ordering the documents according to the computed similarity [25].
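The following self-contained sketch illustrates the vector space model: each document is encoded as a term-frequency vector over a shared vocabulary and compared with cosine similarity, the same operation used to rank documents against a query vector; the documents are invented for the example.

```python
import math
from collections import Counter

def tf_vector(text, vocabulary):
    """Term-frequency vector x(d, t1..tm) over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[t] for t in vocabulary]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

docs = ["the lectures are clear and well organised",
        "lectures were confusing and badly organised"]
vocabulary = sorted(set(" ".join(docs).split()))

vectors = [tf_vector(d, vocabulary) for d in docs]
query = tf_vector("clear lectures", vocabulary)
scores = [cosine(query, v) for v in vectors]
print(scores)  # the first document scores higher for this query
```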

3.4 Applications of Text Mining
Text mining is currently used in various domains.
Security applications: Text mining is often used to analyse plain-text sources on the Internet. Text can be filtered by removing inappropriate terms, such as bad words in a chat room. It is also used to automate the classification of texts; for instance, it can be applied to filter undesirable junk e-mail based on terms or words that are unlikely to appear in normal messages. Such messages can be automatically discarded or routed to the most appropriate department.
Analysing open-ended survey responses: Text mining is used in survey research in which open-ended questions about the topic are included. The idea is to allow respondents to express their opinions without constraining them to a particular response format. Text mining is now used in marketing applications to analyse customer relationship management data and to improve predictive analytics models for customer attrition. Marketers often use this method to discover the set of words used to describe the pros and cons of a product or service (a minimal sketch of this idea follows the list of example applications below). The proposed system falls into this area of application: it uses the same concept to interpret the students' comments in the open-ended questions of the student appraisal.
Analysing warranty or insurance claims, diagnostic interviews, etc.: In some business domains, most data are collected in open-ended textual form. For example, warranty claims or medical interviews are usually written as free text by a customer to explain problems and needs. This information is typed electronically and is available as input to text mining algorithms, which can produce useful structured information identifying common clusters of problems.
Other applications:
• Various biomedical applications use text mining, such as PubGene, GoPubMed.org and GoAnnotator.
• Online media companies such as the Tribune Company use text mining to monetize content.
• Sentiment analysis may involve analysing movie reviews to estimate how favourable a review is for a movie.
• Academic applications: text mining is an important tool for publishers who hold databases of information requiring indexing for retrieval.
Examples of applications using text mining:
• AeroText: a package of text mining applications for content analysis.
• Attensity: hosted, integrated or stand-alone text mining software that uses natural language processing technology to address collective intelligence in social media and forums.
• Endeca Technologies: provides software to analyse and cluster unstructured text.
• Autonomy: a suite of text mining, clustering and categorization solutions for a variety of industries.
• Expert System: a suite of semantic technologies and products for developers and knowledge managers.
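Following on from the open-ended survey-response application above, this is a purely illustrative lexicon-based scorer for student comments; the positive and negative word lists are invented and far smaller than anything a real system would use.

```python
# Hypothetical pro/con lexicons; a production system would learn or curate these.
POSITIVE = {"clear", "helpful", "engaging", "organised", "excellent"}
NEGATIVE = {"confusing", "boring", "outdated", "late", "unclear"}

def score_comment(comment):
    """Return (positive_hits, negative_hits) for one open-ended response."""
    words = set(comment.lower().replace(",", " ").replace(".", " ").split())
    return len(words & POSITIVE), len(words & NEGATIVE)

comments = [
    "Lectures are clear and engaging.",
    "Slides are outdated and sometimes confusing.",
]
for c in comments:
    pos, neg = score_comment(c)
    label = "positive" if pos > neg else "negative" if neg > pos else "neutral"
    print(c, "->", label)
```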

4. Proposed System

In our research we noticed that most online course evaluation systems share common requirements that many programs meet, such as user authentication to prevent unauthorized use of the system and double evaluation, student anonymity to protect students from being traced by their lecturers, user validation, and reporting. However, we found that only very few existing systems use charts to represent data in the report, and none of them allows the system to provide suggestions to the manager based on the students' comments. That is why we propose an intelligent course evaluation system that generates a report including suggestions and charts: a system that uses algorithms to understand the students' open-ended responses, retrieve the useful data from them, and interpret it into information for the report.

4.1. Proposed System Features and User Roles
The intelligent online evaluation system would have three user roles: the faculty members, who hold the admin role, the student role and the lecturer role. These roles have different levels of access to data and are provided with different features.
Students are the main stakeholders of the system. Upon logging in, they are given a questionnaire link for each module, which they can evaluate only once. They can see the status of each evaluation questionnaire so that they know which modules they have not yet evaluated. Their input is processed and used to generate the report.
Lecturers have a passive, viewer role. They are only able to view the students' comments related to the courses they teach, without any possibility of tracing the authors. This helps them know what students think about the course and how to improve their teaching.
Faculty members can view everything except the student name attached to a


particular response. They create accounts for lecturers and students, allocate modules to students and lecturers, create, edit and delete questionnaires, and view reports. The report includes the students' feedback, statistical results with a chart for every lecturer, and suggestions that help the faculty take strategic and academic decisions. The faculty members are also able to view the response percentage for each module and can set the starting and finishing dates of the questionnaire. The challenges presented here led us into another area of research that would allow us to achieve this innovative and very useful change. We conducted research in the area of Artificial Intelligence, and more specifically in text processing methods. The idea is to understand the students' answers to the open-ended questions (in natural language), process them, and provide reports and suggestions (in natural language). In our research we studied various text mining approaches and found Natural Language Processing (NLP) to be the most suitable approach to meet the system requirements. As the students write their comments in sentences, using their ordinary way of speaking and writing, the system needs robust natural language analysis to process the students' input (sentences) and interpret it to generate a report in sentences. To generate the report in sentences, some elements are needed as input to the report generator. First, the data mining results are input in the form of rules. Then the background information, such as variable names and categories, is provided. The final step is to set the text goal for the report generator to produce sentences accordingly [29].
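To give a flavour of the report-generation steps just outlined (mined results as rules, background information, and a text goal), the sketch below turns a few hypothetical mining results into report sentences using simple templates; the rule format, module names and figures are assumptions made for the illustration, not the system's actual design.

```python
# Hypothetical mined results: (module, aspect, sentiment, share of comments).
rules = [
    ("Multimedia Design", "lecture delivery", "positive", 0.72),
    ("Multimedia Design", "assignment feedback", "negative", 0.41),
]

# Background information mapping sentiment to a suggestion phrase; the text goal is a summary report.
SUGGESTION = {
    "positive": "should be maintained",
    "negative": "needs attention from the faculty",
}

def generate_report(rules):
    sentences = []
    for module, aspect, sentiment, share in rules:
        sentences.append(
            f"For {module}, {share:.0%} of student comments about {aspect} "
            f"were {sentiment}; this aspect {SUGGESTION[sentiment]}."
        )
    return " ".join(sentences)

print(generate_report(rules))
```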

4.2. Proposed System Technology
The system will be built using ASP.NET MVC, a part of the ASP.NET framework adopted by Microsoft to improve productivity when creating web applications and to improve their maintainability, as it gives the developer agility and flexibility in building and maintaining the application. Using this framework keeps the technology current, as ASP.NET MVC 2.0 was recently released along with Visual Studio 2010. SQL Server 2008 Express will be used to store and manage the database; SQL Server 2008 is a robust database management system that is easy to connect to Visual Studio. Syncfusion, a third-party component library for .NET applications, will be used to generate Excel reports and create charts.

5. Conclusion

The literature presented in this paper outlines previous research and evidence about online evaluation systems and shows how important and effective it would be to implement the intelligent online course evaluation system to replace the paper-and-pencil approach at Limkokwing University. The implementation of this system would save time and resources and reduce workload, as it would provide charts and suggestions using Artificial Intelligence (Natural Language Processing). With a proper project plan, the proposed system would be a platform that benefits the University.
Future Enhancement: As time passes, new challenges are presented to mankind. One of the most common challenges of the new millennium is rapid access to information and, very recently, the mobility of information. Technology is moving to mobile computing, and businesses are adopting it to reach a larger number of customers. As a future enhancement of the intelligent online course evaluation system, we plan to provide a platform that would allow students to evaluate their lecturers through mobile phones and PDAs.

Acknowledgment
We would like to express our sincere gratitude to Arash Habibi Lashkari (PhD candidate at UTM) for his supervision and guidance. We would also like to express our appreciation to our parents and to all the teachers and lecturers who helped us understand the importance of knowledge and showed us the best way to gain it.

References
[1] K. Hmieleski & M.V. Champagne, "Barriers to online evaluation: surveying the nation's top 200 most wired colleges", Interactive and Distance Education Assessment Laboratory, Rensselaer Polytechnic Institute, 2000

[2] H Gavin , L Maria , H Lisa Annemarie , K James, N Helen, P Diana, T Michael, W Gretchen;”report one of the task force on online Evaluations & placement Examinations”, Online Course Evaluations, 2005, pp. 1-15

[3] M Heidi, Anderson, Jeff Cain & Eleanora,”Review of literature and a pilot study”, Online Student Course Evaluations, American journal of Pharmaceutical Education, 2005, pp. 34-41

[4] DK Woodward ,”Comparison of course evaluations by traditional and computerized on-line methods”, Am J Pharm Educ., 1998, pp 62-90

[5] CJ Dommeyer , P Baum, KS Chapman, RW Hanna ,” Attitudes of business faculty towards two methods of collection teaching evaluations: paper Vs. Online”, Asses Eval Higher Educ., 2002, pp. 455-462

[6] M Jaclyn, H Grahan ,”Use of electronic surveys in course evaluation”, British journal of Educational technology, 2002, vol 33 No 5. pp. 583-592,

[7] N Kwak, BT Radler ,”using the web for public opinion research: a comparative analysis between data collected via mail and the web paper”, annual meeting of the American Association for public Opinion research Portland Oregon May,2000


[8] C Medlin, S Roy and T Ham Chain, ”World Wide Web versus mail surveys”, ANZMAC99 conference,1999

[9] T Guterbock, B Meekins, A Weaver & J Fries,” ‘Web versus paper’ A mode experiment in a survey of university computing”, American Association for public opinion Research Portland Oregon May,2000

[10] Crawford, A Ranchhod & F Zhou, “Comparing respondents of email and mail surveys”, Marketing Intelligence and planning ,2001, pp. 245-262

[11] K Sheenan,” Email survey response rates”, journal of computer Mediated communication,2001

[12] DP Bachmann, J Elfrink & G Vazzana, ”Email and snail mail face off in rematch Marketing Research”,1999

[13] BH Layne, JR DeCristofor, D McGinty, ”Electronic versus traditional student ratings of instruction”, Res Higher Educ, 1999, pp. 221-232

[14] B Ravelli ,”Anonymous online teaching assessments: preliminary findings”, Annual National Conference of the American Association for Higher Education,2000

[15] L Parker, ”Collecting data the email way training and Development”, 1992 ,pp. 52-54,

[16] S Kiesler and LS Sproull, ”Response effects in the electronic survey public opinion Quarterly”, 1986, pp. 401-413

[17] J Walsh, S Kiesler, LS Sproull and B Hesse, ”Self selected and randomly”, computer network survey Public Opinion Quarterly, 1992, pp. 241-244

[18] K sheenan and Hoy , ”Using email to survey internet users in the United states”, journal of computer Mediated Communication, 1999

[19] Weible and J Wallace, ”the impact of the Internet on data collection”, Marketing Research;1998, pp. 19-23

[20] JH Watt, ”Internet systems for evaluation research in Gay G and Bennington T L”, social moral epistemological and practical implication JosseyBass, San Francisco, 1999, pp. 24-44

[21] GW Yun ,”Comparative response to a survey executed by post email and web form”, journal of Computer Mediated Communication, 2000

[22] ABC Tse ,”Comparing the response rates response speed and response quality of two methods of sending questionnaires’ email vs mail”, journal of the Market Research Society,1998, pp. 353-362

[23] R Mehta & E Sivada, ”Comparing response rates and response content in mail versus electronic mail”, surveys journal of the Market Research Society,1995 pp. 429-439

[24] CJ Dommeyer & E Moriarty, ”Comparing two forms of an email survey’ embedded vs attached”, International Journal of Market Research,1999, pp. 39-50

[25] A Ranchhod and F Zhou,”Comparing respondents of email mail surveys’ understanding the implications of technology”, Marketing Intelligence and Planning, 2001, 245-262.

[26] JK Peat, ”Health Science Research’ A handbook of quantitative methods”, Allen and Unwin Crows Nest,2001

[27] P Thomas, W Jeromie, X Yingcai, ”existing online evaluation system, A Role-based online evaluation System Text Mining”, retrieved on May 15th, 2010 from http://www.statsoft.com/textbook/text-mining/

[28] Text mining retrieved on May 27th 2010 from http://en.wikipedia.org/wiki/Text_mining

[29] Josef Leung & Ching-Long Yeh,”Natural Language Processing for Verbatim Text Coding and Data Mining Report Generation”, 2010, pp. 1-14

[30] J Mc Gourty, K Scoles & S Thorpe, "Web-based course evaluation: comparing the experience at two Universities", 2002

[31] Goodman;”Developing Appropriate Administrative Support for Online Teaching with an Online Unit Evaluation System”, Proceedings of ISIMADE 99 (international symposium on Intelligent Multimedia and Distance Education), 1999, pp. 17-22

[32] T Ha, J Marsh & J Jones ,”A web-based System for Teaching Evaluation, Retrieved on April 12th, 2010 from http://home.ust.hk/~eteval/cosset/ncitt98.pdf

[33] H Andreas, N Andreas, P Gerhard, ”A brief Survey of Text Mining”, LDV Forum, Band 20, 2005, Pp 19-56

[34] N Un Yong and JM Raymond ,”Text Mining with Information Extraction”, American Association for Artificial Intelligence,2002, pp. 60-67

[35] M Hearst,”untangling text data mining. In Proc. Of ACL 99 the 37th Annual Meeting of the Association for computational Linguistics, 1999

[36] L Kaufman & Rouseeuw, ”finding groups in data, an introduction to cluster analysis, 1990

[37] R Feldman & L Dagan,”Kdt – Knowledge discovery in texts”, In Proc. Of the first int. Conf. On knowledge Discovery (KDD), 1995, pp.112-117

[38] G Salton, A wong & C Yang, ”A vector space model for automatic indexing”, Communications of the ACM, 1975, pp. 613-620

[39] G Salton, J Allan & C Buckley, ”Automatic structuring and retrieval of large text files”, Communication of the ACM, 1994, pp. 97-108

[40] R Haskell,” Academic freedom, tenure, and student evaluation of faculty”, Galloping polls in the 21st century,1997


The Effect of Public String on Extracted String in A Fuzzy Extractor

Yang Bo 1, Li Ximing 2 and Zhang Wenzheng 3

1College of Informatics, South China Agricultural University,

Guangzhou, 510642, P.R. China [email protected]

2 College of Informatics, South China Agricultural University,

Guangzhou, 510642, P.R. China [email protected]

3 National Laboratory for Modern Communications,

Chengdu, 610041, P.R. China [email protected]

Abstract: A fuzzy extractor is designed to extract a uniformly distributed string from a noisy input in an error-tolerant manner. It produces two outputs for a noisy input: a uniformly distributed string and a public string. This paper analyses the effect of the public string on the entropy loss of a fuzzy extractor and derives the relationship between the entropy loss and the public string, as well as the relationship between the size of the extracted string and the public string.

Keywords: Cryptography, Secure sketch, Fuzzy extractor, Min-entropy, Entropy loss

1. Introduction

To securely derive cryptographic keys from a noisy input such as biometric data, a fuzzy extractor is designed to extract a uniformly distributed string from this noisy input in an error-tolerant manner [1, 2, 5, 6, 7]. A fuzzy extractor has two outputs for a noisy input: a uniformly distributed string, which is used as the cryptographic key, and a public string, which encodes the information needed to extract the uniformly distributed string. The difference between the min-entropy of the input and the conditional min-entropy of the input given the extracted string is defined as the entropy loss of a fuzzy extractor.

This paper gives the effect of public string on the entropy loss in a fuzzy extractor, and obtains the relationship between the entropy loss and public string, and the relationship between the size of extracted string and public string.

A similar problem in unconditionally-secure secret-key agreement protocol was considered in [3, 4, 8], which dealt with the effect of side-information, obtained by the opponent through an initial reconciliation step, on the size of the secret-key that can be distilled safely by subsequent privacy amplification.

2. Preliminaries

We recall some fundamental definitions and conclusions in this section. Random variables are denoted by capital letters, the alphabet of a random variable is denoted by the corresponding script letter, and the cardinality of a set $S$ is denoted by $|S|$. The expected value of a real-valued random variable $X$ is denoted by $E[X]$. The uniform distribution over $\{0,1\}^{l}$ is denoted by $U_l$.

A useful bound for a nonnegative real-valued random variable $X$, any $a > 0$, and any $r \in \mathbb{R}$ ($\mathbb{R}$ is the set of real numbers) is Markov's inequality, $\Pr[X \ge a] \le E[X]/a$. Taking $a = 2^{r} E[X]$, we have

$\Pr[X \ge 2^{r} E[X]] \le 2^{-r}$.    (1)

The Rényi entropy of order $\alpha$ of a random variable $X$ with distribution $P_X$ and alphabet $\mathcal{X}$ is defined as

$H_\alpha(X) = \frac{1}{1-\alpha} \log \sum_{x \in \mathcal{X}} P_X(x)^{\alpha}$,

for $\alpha \ge 0$ and $\alpha \ne 1$. The min-entropy of $X$ is

$H_\infty(X) = -\log \max_{x \in \mathcal{X}} P_X(x)$.

The conditional min-entropy of $X$ given $Y$ is

$\tilde H_\infty(X \mid Y) = -\log \big( E_{y \leftarrow Y} [\, 2^{-H_\infty(X \mid Y = y)} \,] \big)$.

We have $H_\infty(X) \le H_\alpha(X)$ for every $\alpha$. The statistical distance between two probability distributions $A$ and $B$ with the same alphabet $\mathcal{V}$ is defined as

$SD(A, B) = \frac{1}{2} \sum_{v \in \mathcal{V}} \lvert \Pr[A = v] - \Pr[B = v] \rvert$.

Lemma 1 [1]: Let $A, B$ be two random variables. If $B$ has at most $2^{\lambda}$ possible values, then for any random variable $A$,

$\tilde H_\infty(A \mid B) \ge H_\infty(A) - \lambda$.

A metric space is a set $\mathcal{M}$ with a distance function $\mathrm{dis}: \mathcal{M} \times \mathcal{M} \to \mathbb{R}^{+}$ satisfying $\mathrm{dis}(w, w') = 0$ if and only if $w = w'$, together with symmetry $\mathrm{dis}(w, w') = \mathrm{dis}(w', w)$ and the triangle inequality $\mathrm{dis}(w, w'') \le \mathrm{dis}(w, w') + \mathrm{dis}(w', w'')$.

Definition 1. An $(\mathcal{M}, m, \tilde m, t)$-secure sketch is a pair of randomized procedures, "sketch" ($SS$) and "recover" ($Rec$), with the following properties:

(i) The sketching procedure $SS$ on input $w \in \mathcal{M}$ returns a bit string $s = SS(w)$. The recovery procedure $Rec$ takes an element $w' \in \mathcal{M}$ and a bit string $s$.

(ii) Correctness: If $\mathrm{dis}(w, w') \le t$, then $Rec(w', SS(w)) = w$.

(iii) Security: For any distribution $W$ over $\mathcal{M}$, if $H_\infty(W) \ge m$, then $\tilde H_\infty(W \mid SS(W)) \ge \tilde m$.

Definition 2. An $(\mathcal{M}, m, l, t, \varepsilon)$-fuzzy extractor is a pair of randomized procedures, "generate" ($Gen$) and "reproduce" ($Rep$), with the following properties:

(i) The generation procedure $Gen$ on input $w \in \mathcal{M}$ outputs an extracted string $R \in \{0,1\}^{l}$ and a helper string $P$. The reproduction procedure $Rep$ takes an element $w' \in \mathcal{M}$ and a bit string $P$ as inputs.

(ii) Correctness: If $\mathrm{dis}(w, w') \le t$ and $(R, P) \leftarrow Gen(w)$, then $Rep(w', P) = R$.

(iii) Security: For any distribution $W$ over $\mathcal{M}$, if $H_\infty(W) \ge m$ and $(R, P) \leftarrow Gen(W)$, then $SD((R, P), (U_l, P)) \le \varepsilon$.
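As a small worked example of these quantities (our own illustration, not taken from the paper): suppose the input $W$ is uniform over $n$-bit strings and the public helper string $P$ can take at most $2^{\lambda}$ values. Then

$H_\infty(W) = -\log \max_{w} \Pr[W = w] = -\log 2^{-n} = n$,

and Lemma 1 gives

$\tilde H_\infty(W \mid P) \ge H_\infty(W) - \lambda = n - \lambda$,

so at most $\lambda$ bits of min-entropy can be lost to the published helper string.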

3. The Effect of Public String on Extracted String and the Size of Extracted String

In the following two theorems, we give the relationship between the entropy loss and the public string in a fuzzy extractor.

Theorem 1. In an -fuzzy extractor, let be a random variable with alphabet , be a deterministic function of , and with alphabet . Then with probability approximately 1, we have

. Proof. We first consider the entropy loss of the Rényi entropy of order . Since is a deterministic function of , and , it follows that

Interpreting as a function of , the equation above is equivalent to

or

. Let be an arbitrary constant,

From (1), we have

or

with probability at least .

Divide by and obtain

Because is an arbitrary constant, we take it big enough such that is approximately 1, take limit with , and obtain

, with probability approximately 1.

Let , if is distributed uniformly, then , ,

. Therefore, Lemma 1 is a special case of Theorem 1. Lemma 2. Let be a constant, then

with probability at least . Proof. From

, we have , ,

. From (1), it follows So

, with probability at least . Because the inequality holds for each , we have

,

with probability at least . Theorem 2. Let be the same as in Theorem 1,

be an arbitrary constant. Then for

with probability at least . Proof. From

and Lemma2, we have

with probability at least .

The variance of is

. By chebychef inequality, we have , and

,

with probability at least


. Combined with Theorem 1, it follows that

with probability at least .

In the following theorem, we obtain the relationship between the size of extracted string and the public string in a fuzzy extractor.

Theorem 3. In a fuzzy extractor constructed from a secure sketch and a strong extractor based on pairwise-independent hashing, the length of the extracted string satisfies

with probability approximately 1.

Further, let be two constants, satisfy , , then the length of extracted string

satisfies

Proof. From [1], we have

. From theorem1, it follows

Let , we have

with probability approximately 1.

If be two constants, satisfy , and , then from

and theorem2, we have

.

Let , we have

with probability at least .

From Theorem 3, we have

Therefore, for a fuzzy extractor to extract a uniformly

distributed string with some length from a noisy input, it is necessary that the entropy of public string must be smaller than some value, and the smaller the entropy of public string, the longer the uniformly distributed string extracted by a fuzzy extractor.

Acknowledgement This work is supported by the National Natural Science Foundation of China under Grants 60973134, 60773175, the Foundation of National Laboratory for Modern Communications under Grant 9140c1108010606, and the Natural Science Foundation of Guangdong Province under Grants 10351806001000000 and 9151064201000058.

References
[1] X. Boyen, "Reusable cryptographic fuzzy extractors," In Eleventh ACM Conference on Computer and Communication Security. ACM, October 25-29 2004, 82-91.

[2] X. Boyen, Y. Dodis, J. Katz, Ostrovsky R. and A. Smith, “Secure remote authentication using biometric data,” In Advances in Cryptology-EUROCRYPT 2005, Ronald Cramer, editor, Lecture Notes in Computer Science 3494, Springer-Verlag, 2005, 147-163.

[3] C. Cachin, U. M. Maurer, “Linking information reconciliation and privacy amplification,” EUROCRYPT’94, Lecture Notes in Computer Science, Vol. 950, Springer-Verlag, 1995, 266-274.

[4] C. Cachin,“Smooth entropy and Rényi entropy,” In EUROCRYPT’97, Lecture Notes in Computer Science, Springer Verlag, 1997, 193-208.

[5] R. Cramer, Y. Dodis, S. Fehr, C. Padró and D. Wichs, “Detection of Algebraic Manipulation with Applications to Robust Secret Sharing and Fuzzy Extractors,” Adv. in Cryptology- EUROCRYPT 2008, Lecture Notes in Computer Science 4965, Springer Berlin,2008, 471-488.

[6] Y. Dodis, L. Reyzin and A. Smith, “Fuzzy Extractors: How to Generate Strong Keys from Biometrics and Other Noisy Data,” Adv. in Cryptology- Eurocrypt 2004, Lecture Notes in Computer Science 3027, Springer-Verlag, 2004, 523-540.

[7] Y. Dodis, J. Katz, L. Reyzin and A. Smith, “Robust Fuzzy Extractors and Authenticated Key Agreement from Close Secrets,” In Advances in Cryptology-CRYPTO’06, volume 4117 of Lecture Notes in Computer Science. Springer, 2006, 232-250.

[8] Bo Yang, Tong Zhang, Changxing Pei, “The effect of side information on smooth entropy,” Journal of Discrete Applied Mathematics, 136(2004), 151-157.

Yang Bo received the B.S. degree from Peking University, Beijing, China, in 1986, and the M.S. and Ph.D. degrees from Xidian University, China, in 1993 and 1999, respectively. From July 1986 to July 2005 he was at Xidian University, and from 2002 he was a professor and Ph.D. supervisor at the National Key Laboratory of ISN at Xidian University. He has served as Program Chair for the fourth China Conference on Information and Communications Security (CCICS'2005) in May 2005, vice-chair for ChinaCrypt'2009 in Nov. 2009, and general chair for the Fifth Joint Workshop on Information Security (JWIS 2010) in Aug. 2010. He is currently dean, professor and Ph.D. supervisor at the College of Informatics and College of Software, South China


Agricultural University. His research interests include information theory and cryptography.
Li Ximing received the B.A. degree from the Shandong University of Technology, Jinan, Shandong, China, in 1996 and the M.E. degree from Jinan University, Guangzhou, China, in 2005. He is currently a Ph.D. candidate in the College of Informatics, South China Agricultural University. His research interests include information theory and cryptography.
Zhang Wenzheng received the B.S. and M.S. degrees from the University of Electronic Science and Technology of China in 1988 and 1991, respectively. He is currently general engineer at the National Laboratory for Modern Communications.


Analysis of Statistical Path Loss Models for Mobile Communications

Y. Ramakrishna1, Dr. P. V. Subbaiah2 and V. Ratnakumari 3

1PVP Siddhartha Institute of Technology, Vijayawada, India

[email protected]

2Amrita Sai Institute of Science & Technology, Vijayawada, India [email protected]

3PVP Siddhartha Institute of Technology, Vijayawada, India

[email protected]

Abstract: The ability to accurately predict radio propagation behavior for mobile communications is becoming crucial to system design. Unlike deterministic models, which require more computation, statistical models are easier to implement, require less computational effort and are less sensitive to the environmental geometry. In mobile radio systems, most models of fading apply stochastic processes to describe the distribution of the received signal. It is useful to use these models to simulate propagation channels and to estimate the performance of the system in a homogeneous environment. Propagation models that predict the mean signal strength for an arbitrary Transmitter-Receiver (T-R) separation distance are called large-scale propagation models, since they characterize signal strength over large T-R separation distances. In this paper, the large-scale propagation performance of the COST-231 Walfisch-Ikegami and Hata models is compared while varying the Mobile Station (MS) antenna height, the T-R separation distance and the Base Station (BS) antenna height, with the system operating at 850 MHz. MATLAB simulation shows that the COST-231 model performs better than the Hata model.

Keywords: Path Loss, COST-231 Walfisch Ikegami Model, Hata Model.

1. Introduction

Propagation models have traditionally focused on predicting the received signal strength at a given distance from the transmitter, as well as the variability of the signal strength in a close spatial proximity to a particular location. Propagation models that predict the signal strength for an arbitrary T-R separation distance are useful in estimating the radio coverage area of a transmitter. Conversely, propagation models that characterize the rapid fluctuations of the received signal strength over very short travel distances are called small-scale or fading models [1]. Propagation models are useful for predicting signal attenuation or path loss. This path loss information may be used as a controlling factor for system performance or coverage so as to achieve perfect reception. The common approaches to propagation modeling include physical models and empirical models. In this paper, only empirical models are considered. Empirical models use measurement data to model a path loss equation. To conceive these models, a correlation was found between the received signal

strength and other parameters such as antenna heights, terrain profiles, etc through the use of extensive measurement and statistical analysis. Radio transmission in a mobile communication system often takes place over irregular terrain. The terrain profile of a particular area needs to be taken into account for estimating the path loss. The terrain profile may vary from a simple curved earth profile to a highly curved mountainous profile. A number of propagation models are available to predict path loss over irregular terrain. While all these models aim to predict signal strength at a particular receiving point or in a specific location called sector, the methods vary widely in their approach, complexity and accuracy. Most of these models are based on a systematic interpretation of measurement data obtained in the service area. In this paper, the wideband propagation performance of COST-231 Walfisch Ikegami and Hata models has been compared varying MS antenna height, propagation distance, and BS antenna height considering the system to operate at 850 MHz. Through the MATLAB simulation it turned out that the COST-231 Walfisch Ikegami model outperforms the other large scale propagation models.

2. Models for Predicting Propagation Path Loss

A good model for predicting mobile radio propagation loss should be able to distinguish among open areas, sub urban areas and urban areas. All urban areas, hilly or flat areas are unique in terrain, buildings and street configurations. The models described in this paper are considered to design a prediction model for urban area. A good prediction model follows the same guidelines, so that every user gets the same answer for given conditions. Path loss may occur due to many effects, such as free-space loss, refraction, diffraction, reflection, aperture-medium coupling loss and absorption [2]. Path loss is also influenced by terrain contours, environment (urban or rural, vegetation and foliage), propagation medium (dry or moist air), the distance between the transmitter and the receiver, and the height of antennas.

Path loss normally includes propagation losses caused by


• The natural expansion of the radio wave front in free space.

• Absorption losses (sometimes called penetration losses), when the signal passes through media not transparent to electromagnetic waves.
• Diffraction losses.

The signal radiated by a transmitter may also travel along many different paths to a receiver simultaneously; this effect is called multipath propagation. Multipath propagation can either increase or decrease received signal strength, depending on whether the individual multipath wave fronts interfere constructively or destructively. In wireless communications, path loss can be represented by the path loss exponent, whose value is normally in the range of 2 to 5 (where 2 is for propagation in free space, 5 is for relatively lossy environments) [1]. In some environments, such as buildings, stadiums and other indoor environments, the path loss exponent can reach values in the range of 4 to 6. On the other hand, a tunnel may act as a waveguide, resulting in a path loss exponent less than 2. The free-space path loss is denoted by L_p(d), which is

$L_p(d)\,[\mathrm{dB}] = 20\log_{10}\!\left(\dfrac{4\pi f_c d}{c}\right)$   (1)

where c = velocity of light, fc = carrier frequency and d = distance between transmitter and receiver. For long-distance path loss with shadowing, the path loss is denoted by L_p(d), which is

$L_p(d) \propto \left(\dfrac{d}{d_0}\right)^{n}, \quad d \ge d_0$   (2)

or equivalently,

$L_p(d)\,[\mathrm{dB}] = L_p(d_0) + 10\,n\log_{10}\!\left(\dfrac{d}{d_0}\right), \quad d \ge d_0$   (3)

where n = path loss exponent, d0 = the close-in reference distance (typically 1 km for macro cells, 100 m for micro cells), and d = distance between transmitter and receiver.
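As a concrete illustration of equations (1)-(3), the following Python sketch evaluates the free-space loss and the log-distance model; the function names, the path loss exponent n = 3.5 and the 100 m reference distance are illustrative choices, not values taken from this paper.

import math

C = 3e8  # speed of light (m/s)

def free_space_loss_db(d_m, fc_hz):
    """Equation (1): free-space path loss in dB for T-R separation d."""
    return 20 * math.log10(4 * math.pi * d_m * fc_hz / C)

def log_distance_loss_db(d_m, fc_hz, n=3.5, d0_m=100.0):
    """Equation (3): long-distance path loss with exponent n,
    referenced to the close-in distance d0 (100 m for micro cells)."""
    return free_space_loss_db(d0_m, fc_hz) + 10 * n * math.log10(d_m / d0_m)

if __name__ == "__main__":
    fc = 850e6  # 850 MHz, as used in this paper
    for d in (100, 500, 1000, 4000):
        print(d, round(free_space_loss_db(d, fc), 1),
              round(log_distance_loss_db(d, fc), 1))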

3. Point-to-Point Prediction Models

Calculation of the path loss is usually called prediction. Exact prediction is possible only for simpler cases, such as the above-mentioned free-space propagation or the flat-earth model. For practical cases the path loss is calculated using a variety of approximations. The area-to-area model provides path loss with a long range of uncertainty. Point-to-point prediction reduces the uncertainty range by applying detailed terrain contour information to the path-loss predictions. Point-to-point prediction is very useful in mobile cellular system design, where the radius of each cell is 16 kilometers or less. It can provide information to ensure uniform coverage and avoidance of co-channel interference. Statistical methods (also called stochastic or empirical) are based on fitting curves with analytical expressions that recreate a set of measured data. In cities the user density is high, so a more accurate loss prediction model helps with Base Station Transceiver System (BTS) placement for optimum network design. Among the radio propagation models, city models are analyzed in this paper to find the best fitting city model. The well known propagation models for urban areas are:

i) COST-231 Walfisch Ikegami Model
ii) Hata Model

3.1 COST-231 Walfisch Ikegami Model

This model is being considered for use by International Telecommunication Union-Radio Communication Sector (ITU-R) in the international Mobile Telecommunications-2000 (IMT-2000) standards activities [1]. This model is applicable for frequencies in the range of 150 to 1800 MHz. This utilizes the theoretical Walfisch-Bertoni model, and is composed of three terms:

$L(d) = \begin{cases} L_0 + L_{rts} + L_{msd} & \text{for } L_{rts} + L_{msd} > 0 \\ L_0 & \text{for } L_{rts} + L_{msd} \le 0 \end{cases}$   (4)

where L_0 represents the free space path loss, L_rts is the rooftop-street diffraction and scatter loss, and L_msd is the multi-screen diffraction loss. The free space loss is given by

$L_0 = 32.4 + 20\log_{10} d + 20\log_{10} f$   (5)

where d is the radio-path length (in km), f is the radio frequency (in MHz), and

$L_{rts} = -16.9 - 10\log_{10} w + 10\log_{10} f + 20\log_{10}\Delta h_{Mobile} + L_{ori}$   (6)

Here w is the street width (in m) and

$\Delta h_{Mobile} = h_{Roof} - h_{Mobile}$   (7)

is the difference between the height of the building on which the base station antenna is located, hRoof, and the height of the mobile antenna, hMobile. Lori is the loss that arises due to the orientation of the street. It depends on the angle of incidence (φ ) of the wave relative to the direction of the street. Lori is given by

$L_{ori} = \begin{cases} -10 + 0.354\,\varphi & \text{for } 0^{\circ} \le \varphi < 35^{\circ} \\ 2.5 + 0.075\,(\varphi - 35^{\circ}) & \text{for } 35^{\circ} \le \varphi < 55^{\circ} \\ 4.0 - 0.114\,(\varphi - 55^{\circ}) & \text{for } 55^{\circ} \le \varphi < 90^{\circ} \end{cases}$   (8)

Lmsd is given by

$L_{msd} = L_{bsh} + k_a + k_d \log_{10} d + k_f \log_{10} f - 9\log_{10} b$   (9)

where b is the distance between the buildings along the signal path, and L_bsh and k_a represent the increase of path loss due to a reduced base station antenna height. Using the abbreviation

$\Delta h_{Base} = h_{Base} - h_{Roof}$   (10)

where h_Base is the base station antenna height, we observe that L_bsh and k_a are given by


$L_{bsh} = \begin{cases} -18\log_{10}(1 + \Delta h_{Base}) & \text{for } h_{Base} > h_{Roof} \\ 0 & \text{for } h_{Base} \le h_{Roof} \end{cases}$   (11)

$k_a = \begin{cases} 54 & \text{for } h_{Base} > h_{Roof} \\ 54 - 0.8\,\Delta h_{Base} & \text{for } d \ge 0.5\ \mathrm{km},\ h_{Base} \le h_{Roof} \\ 54 - 1.6\,\Delta h_{Base}\, d & \text{for } d < 0.5\ \mathrm{km},\ h_{Base} \le h_{Roof} \end{cases}$   (12)

The terms kd and kf control the dependence of the multiscreen diffraction loss versus distance and the radio frequency of operation, respectively. They are

$k_d = \begin{cases} 18 & \text{for } h_{Base} > h_{Roof} \\ 18 - 15\,\dfrac{\Delta h_{Base}}{h_{Roof}} & \text{for } h_{Base} \le h_{Roof} \end{cases}$   (13)

And

$k_f = -4 + 0.7\left(\dfrac{f}{925} - 1\right)$   (14)

for medium-sized cities and suburban centers with moderate tree densities, and

$k_f = -4 + 1.5\left(\dfrac{f}{925} - 1\right)$   (15)

for metropolitan centers.

3.2 Hata model

It is an empirical formulation of the graphical path loss data provided by Okumura's model. The formula for the median path loss in urban areas is given by

$L_{50}(\text{urban})\,[\mathrm{dB}] = 69.55 + 26.16\log_{10} f_c - 13.82\log_{10} h_{te} - a(h_{re}) + (44.9 - 6.55\log_{10} h_{te})\log_{10} d$   (16)

where fc is the frequency and varies from 150 to 1500 MHz, hte and hre are the effective height of the base station and the mobile antennas (in meters) respectively, d is the distance from the base station to the mobile antenna, and a(hre) is the correction factor for the effective antenna height of the mobile which is a function of the size of the area of coverage [2]. For small to medium-sized cities, the mobile antenna correction factor is given by

$a(h_{re}) = (1.1\log_{10} f_c - 0.7)\,h_{re} - (1.56\log_{10} f_c - 0.8)\ \mathrm{dB}$   (17)

For a large city, it is given by

$a(h_{re}) = \begin{cases} 8.29\,(\log_{10} 1.54\,h_{re})^{2} - 1.1\ \mathrm{dB} & \text{for } f_c \le 300\ \mathrm{MHz} \\ 3.2\,(\log_{10} 11.75\,h_{re})^{2} - 4.97\ \mathrm{dB} & \text{for } f_c \ge 300\ \mathrm{MHz} \end{cases}$   (18)

When the size of the cell is small, less than 1 km, the street orientation and individual blocks of buildings make a difference in signal reception [3]. These street orientations and individual building blocks do not make any noticeable difference in reception when the signal is well attenuated at a distance over 1 km. Over a large distance, the relatively large mobile radio propagation loss of 40 dB/dec is due to the situation that two waves, direct and reflected, are more or less equal in strength [4] – [6]. The local scatterers (buildings surrounding the mobile unit) reflect this signal, causing only multipath fading, not path loss, at the mobile unit. When the cells are small, the signal arriving at the mobile unit is blocked by individual buildings; this weakens the signal strength and is considered as part of the path loss [7] – [9]. In small cells, the loss is calculated based on the dimensions of the building blocks. Since the ground incident angles of the waves are small due to the low antenna heights used in small cells, the exact height of buildings in the middle of the propagation paths is not important. Although the strong received signal at the mobile unit comes from the multipath reflected waves, not from waves penetrating through buildings, there is a correlation between the attenuation of the signal and the total building blocks along the radio path.

4. Performance Analysis

In this paper, the propagation path loss has been assessed by considering the parameters BTS Antenna height, MS Antenna height and T-R separation for the COST-231 Walfisch Ikegami and Hata models by MATLAB simulation.

Figure 1. Propagation path loss due to the change in the BTS antenna height.

Figure 2. Propagation path loss due to the change in the MS antenna height.

Figure 1 depicts the variation of path loss with base station antenna height, keeping the MS antenna height and T-R separation constant. It is noted that the path loss decreases as the BTS antenna height increases for both models; however, the path loss remains lower for the COST-231 model. Figure 2 evaluates the path loss by varying the MS antenna height and fixing the other two parameters; as the MS antenna height is increased, the path loss decreases in this case as well.

Figure 3 illustrates the change in path loss with the T-R separation distance. It is observed that the path loss of the COST-231 model is lower up to a 4 km radial distance and higher beyond a 4 km separation distance.

In cases 1 and 2 the path loss is low for both models, and in case 3 the trend changes beyond 4 kilometers of separation between transmitter and receiver. Hence the COST-231 model may be preferred for designing cellular networks where the cell radius is less than 4 km, and this model is preferred for densely populated urban areas where call traffic is high.

Figure 3. Path loss due to the change in T-R separation.

5. Conclusions

In this paper, two widely known large scale propagation models are studied and analyzed. The analysis and simulation were done to find the path loss by varying the BTS antenna height, the MS antenna height, and the T-R separation. The COST-231 Walfisch Ikegami model was seen to exhibit lower path loss levels in the curves. The result of this analysis will help network designers to choose the proper model in field applications. This analysis can be extended to higher ranges of carrier frequency.

References
[1] Tapan K. Sarkar, M. C. Wicks, M. S. Palma and R. J. Bonneau, Smart Antennas, John Wiley & Sons, Inc., NJ, 2003.

[2] M. A. Alim, M. M. Rahman, M. M. Hossain, A. Al-Nahid, “Analysis of Large-Scale Propagation Models for Mobile Communications in Urban Area”, International Journal of Computer Science and Information Security, Vol. 7, No. 1, 2010, pp. 135–139.

[3] W.C.Y.Lee, Mobile Communications Design Fundamentals, Sec. Edition, John Wiley & Sons, Inc., 1992.

[4] W.C.Y.Lee, Mobile Cellular Telecommunications, Sec. Edition, Tata McGraw-Hill Publishing Company Ltd., India, 2006.

[5] Robert J. Piechocki, Joe P. McGeehan, and George V. Tsoulos, "A New Stochastic Spatio-Temporal Propagation Model (SSTPM) for Mobile Communications with Antenna Arrays", IEEE Transactions on Communications, Vol. 49, No. 5, May 2001, pp. 855–862.

[6] Frank B. Gross, Smart Antennas for Mobile Communications, The Mc-Graw Hill Companies, 2005.

[7] C. Jansen, R. Piesiewicz , D. Mittleman and Martin Koch, “The Impact of Reflections From Stratified Building Materials on the Wave Propagation in Future Indoor Terahertz Communication Systems”, IEEE Transactions on Antennas and propagation, Vol. 56, No. 5, May 2008, pp. 1413–1419.

[8] J. C. Rodrigues, Simone G. C. Fraiha, Alexandre R.O. de freitas, “Channel Propagation Model for Mobile Network Project in Densely Arboreous Environments”, Journal of Microwaves and Optoelectronics, Vol. 6, No. 1, June 2007, pp. 236–248.

[9] A.R. Sandeep, Y. Shreyas, Shivam Seth, Rajat Agarwal, and G. Sadashivappa, “Wireless Network Visualization and Indoor Empirical Propagation Model for a Campus WI-FI Network”, World Academy of Science, Engineering and Technology, 42, 2008, pp. 730–734.

[10] J. B. Anderson, T.S. Rappaport and Susumu Yoshida, “Propagation Measurements and Models for Wireless Communication Channels”, IEEE Communication Magazine, January 1995, pp. 42-49.

Authors' Profile

Y. Ramakrishna is currently a research student under Dr. P. V. Subbaiah. He received the M.Tech. degree in Microwave Engineering from Acharya Nagarjuna University, India in 2005, and the B.E. degree in Electronics and Communication Engineering from the University of Madras, India in 2002. He

is presently working as Senior Assistant Professor in the Department of Electronics and Communication Engineering, PVP Siddhartha Institute of Technology, Vijayawada, India. His research interests are: Mobile


Communications, Smart Antennas, Satellite Communications.

Dr. P. V. Subbaiah received his Ph.D. in Microwave Antennas from JNT University, India in 1995, his Master's degree in Control Systems from Andhra University, India in 1982, and the B.E. degree in Electronics and Communication Engineering from Bangalore University in 1980. He

is currently working as Principal in Amrita Sai Institute of Science and Technology, Vijayawada, India since 2007. His research interest includes Microwave Antennas, Optical Communications and Mobile Communications.

V. Ratnakumari received M.Tech. degree in Microwave Engineering from Acharya Nagarjuna University, India in 2008. She received B.Tech. degree in Electronics and Communication Engineering from JNT University, India in 2005. She is presently working as Assistant Professor in the Department of

Electronics and Communication Engineering, PVP Siddhartha Institute of Technology, Vijayawada, India. Her research interests are: Mobile communications and Signal Processing.


Development of Smart Antennas for Wireless Communication System

T. B. Lavate1, Prof. V. K. Kokate2 and Prof. Dr. M.S. Sutaone3

1Department of E & T/C ,College of Engineering Pune-5,

Pune University Road, Shivajinagar Pune-5, India [email protected]

2Department of E & T/C ,College of Engineering Pune-5,

Pune University Road, Shivajinagar Pune-5, India [email protected]

3Department of E & T/C, College of Engineering Pune-5,

Pune University Road, Shivajinagar Pune-5, India [email protected]

Abstract: In a 3G wireless system the dedicated pilot is present in the structure of the uplink CDMA frame of the IMT-2000 physical channels, and this dedicated pilot supports the use of smart antennas. A switched beam smart antenna (SBSA) creates a group of overlapping beams that together result in omnidirectional coverage. To reduce the side lobe level and improve the SINR of the SBSA, non-adaptive windowed beamforming functions can be used. In this paper the performance of an eight element linear SBSA with the Kaiser-Bessel window function has been investigated using MATLAB, and it is observed that the use of such an SBSA at the base station of a 3G cellular system improves the capacity by 26% compared to 120° sectorized antennas. However, these SBSAs provide limited interference suppression. The problem of the SBSA can be overcome using an adaptive array smart antenna. In this paper an eight element adaptive array smart antenna is investigated, where the adaptive array estimates the angle of arrival of the desired signal using the MUSIC DOA estimation algorithm, and the received signal of each antenna element is weighted and combined to maximize the SINR using the RLS beamforming algorithm. When such an adaptive array smart antenna is employed at the base station of a 3G cellular system, it provides a 34% system capacity improvement.

Keywords: Adaptive array smart antenna, Beam forming algorithms, DOA, Switched beam smart antenna, System capacity.

1. Introduction

The capacity of a 3G wireless system using CDMA is measured in channels/km² and is given as [3]

$C = \dfrac{W/R}{(E_b/N_o)\times A_c}$   (1)

where W is the bandwidth of the system, R is the data rate of a user, A_c is the coverage area of the cellular system and E_b/N_o is the signal to interference plus noise ratio. From equation (1) it is evident that the 3G CDMA system is still interference limited, and the capacity of such a wireless system can be improved by an interference reduction technique such as a smart antenna. The smart antenna types and their performance analysis in a 3G cellular mobile system are investigated here.
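Since C in equation (1) is inversely proportional to the required E_b/N_o, a reduction of the required E_b/N_o at a fixed BER translates directly into a capacity gain. The short Python sketch below illustrates this arithmetic; the numeric values of W, R and E_b/N_o are illustrative assumptions, not figures from this paper.

def capacity_channels_per_km2(w_hz, r_bps, eb_no_linear, area_km2):
    """Equation (1): C = (W / R) / (Eb/No * Ac)."""
    return (w_hz / r_bps) / (eb_no_linear * area_km2)

# Illustrative only: if a smart antenna lowers the required Eb/No at the
# same BER, the capacity scales up by the same factor.
w, r, area = 3.84e6, 9.6e3, 1.0
for name, eb_no_db in [("sectorized antenna", 7.0), ("smart antenna", 5.7)]:
    c = capacity_channels_per_km2(w, r, 10 ** (eb_no_db / 10), area)
    print(name, round(c, 1), "channels/km^2")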

2. Development of Smart Antennas

To reduce the multiple access interference (MAI) of a 3G wireless communication system it is essential to make the antenna more directional or intelligent. All this leads us to the development of the smart antenna. Depending upon the various aspects of smart antenna technology, they are categorized as switched beam smart antennas and adaptive array smart antennas. In a cellular system, for macro cells the angular spread (AS) at the base station is generally below 15° and the upper bound on the number of elements in the array is about 8 for an AS of 12°. Hence eight element smart antenna arrays are suggested for the 3G cellular wireless system.

3. Development of Eight Element Linear Array Switched Beam Smart Antenna

3.1 Switched Beam Smart Antenna (SBSA)

The switched beam smart antenna (SBSA) has multiple fixed beams in different directions, and this can be accomplished using a feed network referred to as a beamformer; the most commonly used beamformer is the Butler matrix. The receiver selects the beam that provides the greatest signal enhancement and interference reduction, as shown in Fig. 1.

Figure 1. Switched Beam Smart Antenna

However SBSA solutions work best in minimal to moderate MAI scenario. But they are often much less complex and are easier to retrofit to existing wireless technologies.



3.2 Array Pattern of SBSA

In practice the SBSA creates several simultaneous fixed beams through the use of a Butler matrix. With a Butler matrix, for an SBSA of N elements the array factor can be given as [4]

$AF(\theta) = \dfrac{\sin\!\left[N\pi(d/\lambda)(\sin\theta - \sin\theta_{\ell})\right]}{N\pi(d/\lambda)(\sin\theta - \sin\theta_{\ell})}$   (2)

where sin θℓ = ℓλ/(Nd); ℓ = ±1/2, ±3/2, ..., ±(N−1)/2. If the element spacing is d = 0.5λ, the beams of the SBSA are evenly distributed over a span of 120° and they are orthogonal to each other. Using MATLAB, equation (2) for N = 8 is simulated and the simulation results are shown in Fig. 2. It is obvious that the 8-element SBSA forms 8 spatial channels which are orthogonal to each other. Each of these spatial channels has an interference reduction capability depending on the side lobe level (γ). It is apparent from Fig. 2 that the array factor

Figure 2. SBSA array pattern for number of antenna elements N = 8.

of SBSA has side lobe levels of γ = −16 dB. These harmful side lobes of the SBSA can be suppressed by windowing the array elements as shown in Fig. 3 [4].

Figure 3. N element linear antenna array with weights

The array factor of such an N-element linear windowed array is given by

$AF(\theta) = \sum_{n=1}^{N/2} w_n \cos\!\left((2n-1)\,\dfrac{kd}{2}\,\sin\theta\right)$   (3)

To determine the weights w_n, various window functions such as the Hamming, Gaussian and Kaiser-Bessel weight functions can be used in the eight element SBSA. Of these, the Kaiser-Bessel weight function provides the minimum increase in the beam width of the main lobe and hence is investigated in detail. The Kaiser-Bessel weights are determined by

$w_n = \dfrac{I_0\!\left[\pi\alpha\sqrt{1 - \left(n/(N/2)\right)^{2}}\right]}{I_0[\pi\alpha]}$   (4)

where n = 0, ..., N/2, α > 1, and N is the number of elements in the array. The Kaiser-Bessel normalized weights for N = 8 are found using the kaiser(N, α) command in MATLAB. With these weights, equation (3) is simulated using MATLAB and the results are presented in Fig. 4.
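For readers without MATLAB, the following Python sketch evaluates the array factor of equation (3) with and without a Kaiser-Bessel window; the normalization, the choice α = 3 and the use of numpy.kaiser as a stand-in for MATLAB's kaiser command are assumptions made for illustration only.

import numpy as np

def array_factor_db(theta_deg, n_elem=8, d_over_lambda=0.5, window=None):
    """Array factor of an N-element linear array, Eq. (3), with optional
    amplitude weights (e.g. a Kaiser-Bessel window)."""
    theta = np.deg2rad(np.asarray(theta_deg, dtype=float))
    kd = 2 * np.pi * d_over_lambda
    half = n_elem // 2
    w = np.ones(half) if window is None else np.asarray(window)[:half]
    af = sum(w[n] * np.cos((2 * n + 1) * (kd / 2) * np.sin(theta))
             for n in range(half))
    af = np.abs(af) / np.max(np.abs(af))
    return 20 * np.log10(np.clip(af, 1e-6, None))

# Kaiser-Bessel weights for one symmetric half of the array (alpha > 1),
# analogous to MATLAB's kaiser command followed by normalization.
alpha = 3.0
w_full = np.kaiser(8, np.pi * alpha)
w_half = w_full[4:] / w_full[4:].max()

angles = np.linspace(-90, 90, 721)
boxcar = array_factor_db(angles, window=None)
kaiser = array_factor_db(angles, window=w_half)
print("boxcar peak sidelobe (dB):", round(boxcar[np.abs(angles) > 20].max(), 1))
print("kaiser peak sidelobe (dB):", round(kaiser[np.abs(angles) > 20].max(), 1))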

Figure 4. Array factor with Kaiser-Bessel weights and N = 8

Fig. 4 shows that the Kaiser-Bessel function provides side lobe suppression of γ = −33 dB with only a minimum increase in the main lobe beam width (Δ = 1.2).

4. Development of Eight Element Adaptive Array Smart Antenna

4.1 Adaptive array smart antenna

Adaptive array smart antennas are array antennas whose radiation pattern is shaped according to some adaptive algorithm. Smart essentially means computer control of the antenna performance. An adaptive array smart antenna is an array of multiple antenna elements which estimates the angle of arrival of the desired signal using DOA estimation algorithms [8] such as MUSIC (MUltiple SIgnal Classification) or ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques). The estimated DOA is used for beamforming, in which the received signal of each antenna element is weighted and combined to maximize the desired signal to interference plus noise power ratio, which essentially puts a main beam of the antenna in the direction of the desired signal and nulls in the directions of interference. The weights of each element of the array may be changed adaptively [5] and used to provide optimal beamforming in the sense that it reduces the MSE (Mean Square Error) between the desired signal and the actual signal output of the array. Typical algorithms used for this beamforming are the LMS (Least Mean Squares) or RLS (Recursive Least Squares) algorithms.

(IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 10, 2010

20

As this smart antenna generates narrower beams it creates less interference to neighboring users than switched beam approach. Adaptive smart antennas provide interference

Figure 5. N element adaptive antenna array with D arriving signals

rejection and spatial filtering capability, which has the effect of improving the capacity of the wireless communication system. The MUSIC DOA estimation algorithm is highly stable and accurate and provides high angular resolution compared to other DOA estimation algorithms. Similarly, the RLS [6] beamforming algorithm is faster in convergence; hence the MUSIC and RLS algorithms are suitable for mobile communication and are investigated in more detail.

4.2 MUSIC Algorithm

MUSIC is an acronym which stands for MUltiple SIgnal Classification and it is based on exploiting the eigenstructure of the input covariance matrix. As shown in Fig. 5, if the number of signals impinging on the N element array is D, the number of signal eigenvalues and eigenvectors is D and the number of noise eigenvalues and eigenvectors is N − D. The array correlation matrix with uncorrelated noise is given by [4]

$R_{xx} = A\,R_{ss}\,A^{H} + \sigma_n^{2} I$   (5)

where A = [a(θ1) a(θ2) a(θ3) ⋯ a(θD)] is the N×D array steering matrix and Rss is the D×D source correlation matrix of the signal vector [s1(k) s2(k) s3(k) ⋯ sD(k)]T. Rxx has D eigenvectors associated with the signals and N − D eigenvectors associated with the noise. We can then construct the N×(N−D) matrix VN whose columns span the noise subspace:

$V_N = [\,v_1\ \ v_2\ \ v_3\ \cdots\ v_{N-D}\,]$   (6)

The noise subspace eigenvectors are orthogonal to the array steering vectors at the angles of arrival θ1, θ2, θ3, ..., θD, and the MUSIC pseudospectrum is given as

$P_{MUSIC}(\theta) = \dfrac{1}{\left|\,a^{H}(\theta)\,V_N\,V_N^{H}\,a(\theta)\,\right|}$   (7)

However, when the signal sources are coherent or the noise variances vary, the resolution of MUSIC diminishes. To overcome this we must collect several time samples of the received signal plus noise, assume ergodicity, and estimate the correlation matrix via time averaging.
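The following Python sketch implements the MUSIC pseudospectrum of equations (5)-(7) for an eight element, half-wavelength-spaced array; the simulated source powers, noise level and random seed are illustrative assumptions and do not reproduce the authors' MATLAB setup.

import numpy as np

def music_spectrum(X, n_sources, d_over_lambda=0.5, scan_deg=None):
    """MUSIC pseudospectrum, Eq. (5)-(7): eigendecompose the sample
    covariance, keep the noise subspace, and scan steering vectors."""
    n_elem, _ = X.shape
    if scan_deg is None:
        scan_deg = np.linspace(-90, 90, 721)
    Rxx = X @ X.conj().T / X.shape[1]                 # time-averaged covariance
    eigval, eigvec = np.linalg.eigh(Rxx)              # ascending eigenvalues
    Vn = eigvec[:, :n_elem - n_sources]               # noise subspace V_N
    n = np.arange(n_elem)[:, None]
    A = np.exp(1j * 2 * np.pi * d_over_lambda * n * np.sin(np.deg2rad(scan_deg)))
    p = 1.0 / np.sum(np.abs(Vn.conj().T @ A) ** 2, axis=0)
    return scan_deg, 10 * np.log10(p / p.max())

# Illustrative scenario: 8-element array, sources at -5, 10 and 25 degrees.
rng = np.random.default_rng(1)
doas = np.deg2rad([-5, 10, 25])
n_elem, snapshots = 8, 100
steer = np.exp(1j * 2 * np.pi * 0.5 * np.arange(n_elem)[:, None] * np.sin(doas))
signals = rng.standard_normal((3, snapshots)) + 1j * rng.standard_normal((3, snapshots))
noise = 0.1 * (rng.standard_normal((n_elem, snapshots)) + 1j * rng.standard_normal((n_elem, snapshots)))
angles, p_db = music_spectrum(steer @ signals + noise, n_sources=3)
print("spectrum peak at", angles[np.argmax(p_db)], "deg")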

4.3 Simulation results of MUSIC algorithm

For simulation of the MUSIC algorithm MATLAB is used and the array is an eight element linear array with:
* Spacing between array elements d = 0.5λ
* DOAs of desired signals: −5°, 10° and 25°

Figure 6. MUSIC spectrum for N = 8 and DOAs of −5°, 10° and 25°.

Fig. 6 shows the MUSIC spectrum of the eight element adaptive array smart antenna obtained for 10 and 100 snapshots, with directions of arrival of the desired signals at −5°, 10° and 25°. Increasing the number of snapshots leads to sharper MUSIC spectrum peaks, indicating more accurate detection of the desired signals and better resolution.

4.4 RLS Beam forming algorithm

Since the signal sources can change with time, we want to de-emphasize the earliest data samples and emphasize the most recent ones. This can be accomplished by modifying the correlation matrix and correlation vector equations so that we forget the earliest time samples. Thus

$R(k) = \sum_{i=1}^{k} \alpha^{\,k-i}\, X(i)\, X^{H}(i)$

and


$r(k) = \sum_{i=1}^{k} \alpha^{\,k-i}\, d^{*}(i)\, X(i)$

where α is the forgetting factor, a positive constant such that 0 ≤ α ≤ 1. Following the recursion formula for the above equations, the gain vector is

$g(k) = \dfrac{\alpha^{-1} R^{-1}(k-1)\, X(k)}{1 + \alpha^{-1} X^{H}(k)\, R^{-1}(k-1)\, X(k)}$   (8)

After the kth iteration, the weight update equation for the RLS algorithm is

$W(k) = W(k-1) + g(k)\left[d^{*}(k) - X^{H}(k)\,W(k-1)\right]$   (9)

In RLS there is no need to invert a large correlation matrix; the recursive equations allow easy updates of the inverse of the correlation matrix. The RLS algorithm also converges much more quickly than the LMS algorithm, hence it is recommended to use the RLS algorithm for adaptive array beamforming.
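A minimal Python sketch of the recursion in equations (8) and (9) is given below. The initialization of the inverse correlation matrix with delta*I and the complex-conjugate conventions are common RLS assumptions that are not specified in the paper.

import numpy as np

def rls_beamformer(X, d, alpha=0.99, delta=1e2):
    """RLS weight update, Eq. (8)-(9). X: (N, K) snapshots, d: (K,) reference."""
    n_elem, n_snap = X.shape
    w = np.zeros(n_elem, dtype=complex)
    R_inv = delta * np.eye(n_elem, dtype=complex)                 # assumed initialization
    for k in range(n_snap):
        x = X[:, k]
        g = (R_inv @ x) / (alpha + x.conj() @ R_inv @ x)          # gain vector, Eq. (8)
        e = d[k] - w.conj() @ x                                   # a priori error
        w = w + g * np.conj(e)                                    # weight update, Eq. (9)
        R_inv = (R_inv - np.outer(g, x.conj() @ R_inv)) / alpha   # update of R^-1
    return w

# Tiny usage example with random data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((8, 200)) + 1j * rng.standard_normal((8, 200))
d = X[0] + 0.1 * rng.standard_normal(200)
print(np.round(rls_beamformer(X, d), 3))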

4.5 Simulation results of RLS algorithm

For simulation of the RLS algorithm an eight element linear array is used with:
* Spacing between array elements d = 0.5λ
* DOA of desired signal = 25°
* DOAs of interfering signals = −45° and +45°

Figure 7. Adaptive beam forming using RLS algorithm.

Fig. 7 shows the eight element adaptive array smart antenna with the MUSIC and RLS algorithms, which puts the main beam in the DOA of the desired signal at 25° and nulls in the directions of the interfering signals at −45° and +45°; i.e., this array accepts the signal at 25° and rejects the signals at ±45°, and thus improves the SINR of the wireless system. Fig. 7 also shows that as the number of elements in the array is increased from four to eight, DOA detection and beamforming in the desired direction become more accurate and highly stable.

5. Performance Analysis of Kaiser-Bessel Windowed SBSA and Adaptive Array Smart Antenna with MUSIC and RLS Algorithms

We consider the DS-CDMA system in which the data is modulated using the BPSK format. We assume that the PN code length M = 128 and that the power of each mobile station is perfectly controlled by the same base station (BS). The bit error rate (BER) for DS-CDMA, 120° sectorized systems is given by [1], [2]

$P_e = Q\!\left(\left[\dfrac{1}{3M}\sum_{k=2}^{K/3}\dfrac{E_b^{(k)}/N_o}{E_b^{(1)}/N_o} + \dfrac{1}{2\,E_b^{(1)}/N_o}\right]^{-1/2}\right)$   (10)

where Eb(1)/N0 is SINR for user of interest #1, Eb(k)/N0 is the same for interfering users. We extended equation (10) to switched beam smart antenna as[9]

$P_e = Q\!\left(\left[\dfrac{1}{3M}\left(\gamma^{2}\sum_{k=2}^{k_1}\dfrac{E_b^{(k)}/N_o}{E_b^{(1)}/N_o} + \sum_{k=2}^{k_2}\dfrac{E_b^{(k)}/N_o}{E_b^{(1)}/N_o} + \gamma^{2}\sum_{k=2}^{k_3}\dfrac{E_b^{(k)}/N_o}{E_b^{(1)}/N_o}\right) + \dfrac{1}{2\,E_b^{(1)}/N_o}\right]^{-1/2}\right)$   (11)

where k1 is the number of interfering users, with the same PN code as user #1, that affect the side lobes, k2 is the number of interfering users that affect the main lobe, and k3 is like k2 but for the side lobes. For the Boxcar (non-windowed) SBSA the side lobe level (as shown in Fig. 2) is γ = −16 dB, whereas Kaiser-Bessel weights can be selected for the SBSA so that its side lobe level (as shown in Fig. 4) is reduced to γ = −33 dB. We extended equation (11) to the eight element adaptive array smart antenna with the MUSIC and RLS algorithms as

$P_e = Q\!\left(\left[\dfrac{1}{3M}\sum_{k=2}^{k_2}\dfrac{E_b^{(k)}/N_o}{E_b^{(1)}/N_o} + \dfrac{1}{2\,E_b^{(1)}/N_o}\right]^{-1/2}\right)$   (12)

where E_b^(1)/N_o is the SINR for the user of interest #1, E_b^(k)/N_o is the same for the interfering users, and k2 is the number of users affecting the main lobe. In the adaptive array smart antenna with the MUSIC and RLS algorithms the side lobe level γ is reduced to an insignificant magnitude. Now, using MATLAB, equations (10), (11) and (12) are simulated and the simulation results are presented in Fig. 8, where Pe is a function of Eb/No and the number of active users in the service area is 100. As follows from Fig. 8, fixing the level of Pe at the value acceptable for a 3G cellular mobile communication system, Pe = 10⁻⁴, the required Eb/No with the application of the SBSA, the Kaiser-Bessel windowed SBSA and the adaptive array with MUSIC and RLS are determined, and the capacity improvements of the 3G cellular system are found using (1). As follows from Fig. 8, the use of the adaptive array smart antenna in the 3G cellular system provides the maximum capacity improvement.
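The sketch below evaluates a BER comparison in the spirit of equations (10)-(12), under the simplifying assumptions of equal-power interferers and a single side-lobe attenuation factor in place of the k1/k2/k3 split; the numbers it prints are illustrative and are not the Fig. 8 results.

import math

def q_func(x):
    """Gaussian Q-function."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def ber(eb_no_db, n_interferers, m=128, sidelobe_db=None):
    """Gaussian-approximation BER: equal-power interferers, optionally
    attenuated by a side-lobe level given in dB (assumption)."""
    eb_no = 10 ** (eb_no_db / 10)
    atten = 1.0 if sidelobe_db is None else 10 ** (sidelobe_db / 10)
    sinr = 1.0 / (atten * n_interferers / (3 * m) + 1.0 / (2 * eb_no))
    return q_func(math.sqrt(sinr))

users_per_sector = 100 // 3
for eb_no_db in (4, 8, 12):
    print(eb_no_db, "dB:",
          f"sectorized {ber(eb_no_db, users_per_sector):.2e}",
          f"SBSA(-16 dB) {ber(eb_no_db, users_per_sector, sidelobe_db=-16):.2e}",
          f"Kaiser-Bessel SBSA(-33 dB) {ber(eb_no_db, users_per_sector, sidelobe_db=-33):.2e}")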

Figure 8. BER performance of SBSA and adaptive array smart antenna.

6. Conclusion

The array pattern of the eight element SBSA has a side lobe level of −16 dB and, when used in a 3G CDMA system, improves the capacity by 16.8% compared to sectorized antennas. The Kaiser-Bessel weight function with SBSA provides side lobe suppression to −33 dB with a minimum increase in the main lobe beam width (Δ = 1.2) and, when used in a 3G CDMA cellular mobile wireless system, improves the system capacity by 26%. Further, the adaptive array smart antenna with the MUSIC DOA estimation algorithm and the RLS beamforming algorithm rejects the interfering signals completely. When such an adaptive array smart antenna is employed in a 3G CDMA wireless cellular mobile system it improves the system capacity by 34%.

References
[1] T. S. Rappaport, Wireless Communications: Principles and Practice, Prentice Hall, 2005.
[2] J. C. Liberti, T. S. Rappaport, Smart Antennas for Wireless Communications: IS-95 and 3G CDMA Applications, Prentice Hall, 1999.

[3] Ahmed EI Zooghby, “Smart Antenna Engineering”, Artech House, 2005

[4] Frank Gross, “Smart Antennas for Wireless Communication with MATLAB”, McGraw-Hill, NY, 2005

[5] Simon C. Swales, David J. Edwards and Joseph P. McGeehan, "The Performance Enhancement of Multibeam Adaptive Base Station Antennas for Cellular Land Mobile Radio Systems", IEEE Transactions on Vehicular Technology, Vol. 39, No. 1, pp. 56–67, February 1990.

[6] Ch. Shanti , Dr. Subbaiah, Dr. K. Reddy, “ Smart Antenna Algorithms for W-CDMA Mobile Communication systems”, IJCNS , International Journal of Computer Science and Network Society vol.8 No.7 July 2008.

[7] A. Kundu, S. Ghosh, B. Sarkar A. Chakrabarty, “ Smart Antenna Based DS-DMA System Design for 3rd generation Mobile Communication”, Progress in Electromagnetic Research M, Vol.4, 67-80, 2008

[8] Lal C. Godara, "Application of Antenna Arrays to Mobile Communications, Part II: Beam-Forming and Direction-of-Arrival Considerations", Proceedings of the IEEE, Vol. 85, No. 8, pp. 1195–1245, August 1997.

[9] David Cabrera, Joel Rodriguez, “Switched Beam Smart Antenna BER Performance Analysis for 3G CDMA Cellular communication”, May 2008

[10] N. Odeh, S. Katun and A. Ismail, “ Dynamic Code Assignment Algorithm for CDMA with Smart Antenna System”, Proceedings IEEE, International Conference on communication, 14-17, May 2007, Penang Malaysia

Authors' Profile

T. B. Lavate received the M.E. (Microwaves) degree in E & T/C in 2002 from Pune University. He is pursuing his Ph.D. at the College of Engineering Pune-5. To his credit, he has about eleven papers published in international/national conferences and journals of repute. He is a member of IETE/ISTE.

V. K. Kokate received the M.E. (Microwave) degree from Pune University. With over 35 years of teaching and administrative experience, his fields of interest are Radar, Microwaves and Antennas. To his credit, he has about forty papers published in national/international conferences and journals of repute. He is a member of ISTE/IEEE.

Dr. M. S. Sutaone received his B.E. in Electronics from Visvesvaraya National Institute of Technology (VNIT) Nagpur in 1985, his M.E. in E & T/C from the College of Engineering Pune-5 in 1995 and his Ph.D. in 2006 from Pune University. He is currently Professor and HOD of the E & T/C department of the College of Engineering Pune-5. His areas of interest are signal processing, advanced communication systems and communication networks. To his credit, he has about twenty five papers published in international/national conferences and journals of repute. He is a member of ISTE/IETE.


Accumulation Point Model: Towards Robust and Reliable Deployment

Yong Lin, Zhengyi Le and Fillia Makedon

416 Yates Street, P.O. Box 19015, Arlington, TX 76019, USA ylin, zyle, [email protected]

Abstract: The reliability and fault tolerance problems are very important for long strip barrier coverage sensor networks. The Random Point Model and the Independent Belt Model are prevalent deployment models for barrier coverage. In this paper, we propose an economical, robust and reliable deployment model, called the Accumulation Point Model (APM). Given the same length of barrier and the same node activation rate, the number of sensors required by APM is only 1/[t(1−θ)] times that of the random point model. Our analysis indicates that the network failure probability of APM is lower than that of the other two models. In our network simulation, the network lifetime of APM is much higher than that of the random point model. Compared with the independent belt model, APM exhibits good failure tolerance. This paper also presents a lightweight accumulation point protocol for building the network and scheduling the sleep cycle. This is a localized algorithm for k-barrier coverage.

Keywords: sensor networks, barrier coverage, fault tolerance, sensor deployment.

1. Introduction

Barrier coverage sensor networks are deployed in a belt region to detect traversing targets. When barrier areas are very long, the deployment problem becomes more prominent. Sensor networks are usually composed of thousands of relatively economical, energy-constrained and hence failure-prone sensors [9][10]. How to guarantee reliability and robustness is an important issue of sensor network quality of service (QoS). Most applications require the sensor network to be reliable and robust. For example, for a barrier sensor network deployed in a military field or on a country border, it is highly expected that the belt be strong enough and resistant to failure.

Current solutions for the reliability and fault tolerance of sensor networks focus on increasing the node density [10]. This idea originates from area coverage. In a large plane, it is hard to deploy sensors precisely in a regular geometric shape due to the deployment cost, so stochastic deployment is often used instead of deterministic deployment. The random point model is a kind of stochastic deployment. It works on the following assumptions: (1) when the node density reaches a threshold, coverage and connection can be guaranteed, and (2) if we increase the node density further, the sensor network can be made robust and failure-proof. Since barrier coverage considers a narrow strip region, it is different from area coverage.

Node failure is a major concern in barrier coverage. In area coverage, if a node fails, there might be a blind spot, but the entire system can still work. However, in barrier coverage, the field of interest (FoI) is a long strip area, if a sensor node fails, it is possible to disconnect the communication link and break the monitoring belt. This

will be a serious malfunction. In this paper, we address the reliability and fault tolerance

problems of barrier coverage by proposing an asymptotic regular deployment model, called Accumulation Point Model (APM). Barrier coverage considers a long strip region, not a plane like area coverage. This makes the deterministic deployment possible. Summarized from previous literature, there are two models to deploy sensors in barrier coverage: Random Point Model (RPM) and Independent Belt Model (IBM). RPM is to deploy sensors randomly in FoI. While IBM is a regular deployment model, it separates the belts into two sets, the working belts set and the backup belts set. With the same length of barrier and the same node activation rate, our APM requires 1/[t(1-θ)] times the number of sensors compared with RPM, where θ is the overlap factor denoting the overlap degree, t is used by activation rate k/t for k-barrier coverage. Our theoretical analysis proves the failure probabilities of RPM and IBM are higher than APM. The simulation results indicate that the network lifetime of APM is the best of the three deployment models. If we use poor quality sensors, the network lifetime decreases a little in APM. RPM is also robust for poor quality sensors, but its absolute network lifetime is only 1/4 times that of APM. IBM works well for good quality sensors, but it gets extremely bad when we use sensors that have a high failure rate. APM has a high performance to price ratio compared with other models.

The concepts of barrier belt and barrier graph are introduced to make a precise description of barrier coverage. In a barrier graph, the sensor nodes are divided into a barrier brother set and a barrier neighbor set. The barrier graph of APM is the result of three types of elimination of redundant nodes. This can be seen as an application of the relative neighborhood graph (RNG(G)) [4] to barrier coverage. APM is made up of multiple accumulation points, denoted as A_t^k. Every accumulation point contains t sensors, k of them active. By this structure, we build strong k-barrier coverage [11].

2. Related Work

The concept of barrier coverage first appeared in [7]. Kumar et al. formalized the concept of k-barrier coverage as a barrier of sensors such that it can guarantee that any penetration is detected by at least k distinct sensors [11].

Wang et al. consider the relationship of coverage and connection [14]: if R_C > 2R_S, where R_C is the communication radius and R_S is the sensing radius, k-coverage implies k-connection. In APM, we demonstrate that in an A_t^k accumulation belt, the connection degree is at least 3k − 1. A series of studies has been conducted to schedule sleep and


maximize the lifetime of a sensor network. Most of them were for area coverage, such as [6] and [8]. Only two algorithms have been proposed for barrier coverage [13], using IBM and RPM to address the sleep-wakeup problem for homogeneous and heterogeneous lifetimes.

Another important metric of a sensor network in barrier coverage is fault tolerance. Many papers propose the use of high sensor density to provide redundancy [1][5][10][15]. It is generally believed that sensors are relatively economical and unreliable devices, and therefore high density is required to provide a reasonable level of fault tolerance. This high density assumption belongs to RPM for both area coverage and barrier coverage.

For barrier coverage, centralized algorithms are introduced in [11][13]. An L-zone localized algorithm is provided in [2] for k-barrier coverage. Our APM uses a localized algorithm based on accumulation points, subsets of sensor nodes that are located close together.

Weak barrier coverage and strong barrier coverage were introduced in [11]. Weak barrier coverage guarantees detection with high probability only when the intruder does not know the traverse path. QoS improvements are discussed in [3][12]. In this paper, we assume the quality is important and all of the assumptions are based on strong barrier coverage.

3. Preliminary and Network Model

3.3 Barrier Belt and Barrier Graph

Assume the coverage of a sensor is a unit disk with sensing radius R. A Coverage Graph G = (V, E) indicates that for the active sensor set V, for ∀u, v ∈ V, if the distance d_uv < 2R, then there is an edge e_uv ∈ E. There are two virtual nodes s, t ∈ V, corresponding to the left and right boundaries. A Barrier Space is the field of interest for barrier coverage. It is a long strip physical region of arbitrary shape in which the barrier sensors are deployed.

From the above introduction, we can summarize the properties of the barrier belt as:
1) There is no loop in a barrier belt;
2) It is possible that there are crossing edges of two distinct barrier belts in a barrier graph;
3) Except for the virtual terminal nodes {s, t}, there is no shared node for two distinct barrier belts in a barrier graph.

Theorem 1. A sensor network barrier space is k-barrier covered iff ∃ k barrier belts in the barrier graph.
Proof: If each crossing path is covered by k distinct sensors, then these k distinct sensors must belong to k non-intersecting barrier belts.

3.4 Redundancy Elimination and Minimum Barrier Belts

A barrier graph contains a set of sensors that are active for barrier coverage. But what is more meaningful is a minimum set of sensors to meet the QoS. The redundant sensors need to be eliminated or scheduled to sleep.

To change a coverage graph into a barrier graph, first we have to eliminate the round trip edges. We call this kind of elimination Type I Elimination. There is a very special kind of redundant node, the single edge node: if a node (not including the virtual terminal nodes) has only one edge in a coverage graph, then it has to be eliminated in a barrier graph. This is called Type II Elimination. The result of type I and II eliminations is a barrier belt. However, this barrier belt is not the minimum barrier belt, so we need Type III Elimination, by which we eliminate the redundant nodes so that we can use the minimum number of nodes to meet the QoS requirement. There are two reasons for type III elimination: first, we can schedule the redundant nodes to sleep, so that if an active node depletes, a redundant node can take its place; second, the redundant nodes may also be utilized by other barrier belts.

Figure 1. Redundancy Elimination and Minimum Barrier Graph

Let us take Fig. 1 as an example to show more details about the three types of elimination. First we get a coverage graph from the coverage relationship of the sensors. The edge set {e_af, e_fb, e_cd} ⊂ E. But in a barrier graph, every node can have only two edges to support the barrier, so in this barrier graph the edges {e_af, e_fb, e_cd} are eliminated. However, this is only one barrier graph; there are several other barrier graphs for this example, which is left as an exercise for the reader. For this barrier graph G_b = (V_b, E_b), since {e_ab, e_cd} are edges of the coverage graph, e and f are redundant nodes. After the elimination of e and f, we get the final minimum barrier graph. If a node is eliminated from the minimum barrier graph, and it is not utilized by other


barrier belts, then it is a Backup Node in the belt. A backup node keeps sleeping until one of the working nodes depletes its energy or fails. A backup node is also called a sponsor node. A Backup Belt is a redundant barrier belt in the barrier region; it keeps sleeping until one of the working belts depletes or fails. If the minimum number of active sensors required by a certain QoS is denoted as n_k, and the total number of sensors deployed in the field of interest is denoted as n_t, then the Activation Rate is ρ_a = n_k/n_t. A k-Barrier Coverage Fails if the QoS degrades (a barrier hole appears, or the communication link breaks) such that the whole system is no longer k-barrier monitored, and we can find neither a backup node to take the place of the failed node nor a backup belt to work instead of the failed barrier belt.

3.3.1 Overlap Factor and Minimum Pass-through Distance

For ∀ edge e_bc ∈ E_b in G_b, the overlap factor θ reflects the overlap degree of the sensing areas of nodes b and c. If we link b and c by the line bc (see Fig. 2), the intersection points of bc with the circles of the sensing disks of b and c are A and B. Then the overlap factor is θ = d_AB/(2R). The θ cannot be 1, or else it conforms to a type III elimination. In barrier coverage, an important metric for the detection quality of an unauthorized traverse is the Minimum Pass-through Path, and its distance is the Minimum Pass-through Distance. Along the minimum pass-through path, a moving target has the least probability of being detected by sensors. So, the minimum pass-through distance is an important QoS metric for a whole barrier.

Lemma 1. The minimum pass-through path for barrier coverage is the intersection line of the sensing disks of neighbor nodes.

Proof: From Fig. 2, we know the intersection line of the sensing disks of b and c is CD. Suppose E is a point outside both intersection regions of the sensing disks, and EN is vertical. The area of triangle CAF is A_CAF = (2R sin α)(2R cos α)/2 = R² sin(π − 2α). Similarly, A_EAF = R² sin(π − 2β). Since α > β, we get A_CAF < A_EAF. Also, A_CAF = R · d_CM and A_EAF = R · d_EN, so we have d_EN > d_CM. Then it is easy to see that d_CD = 2d_CM is the minimum path to pass through the circle.

Theorem 2. For a barrier coverage system with an overlap factor θ for any neighbor sensors, the minimum pass-through distance is d_CD = 2R√(θ(2 − θ)).

Figure 2. Overlap Factor and Minimum Pass-through Distance
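One way to recover the stated distance from the definition of θ above is the following short derivation; it only uses the notation of Fig. 2, with d_bc the distance between the neighbor nodes b and c:

$d_{bc} = 2R - d_{AB} = 2R(1-\theta)$

$\left(\dfrac{d_{CD}}{2}\right)^{2} = R^{2} - \left(\dfrac{d_{bc}}{2}\right)^{2} = R^{2}\left(1-(1-\theta)^{2}\right)$

$d_{CD} = 2R\sqrt{\theta(2-\theta)}$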

4. Categories of Barrier Deployment

Deterministic deployment is hard for area coverage due to the large plane area, so stochastic deployment is often used in area coverage. Since barrier coverage concerns a long strip region, we need to reconsider deterministic and stochastic deployment in the barrier space. In this section, we summarize two existing deployment models of barrier coverage, and we propose a new model, the accumulation point model. Let us discuss these models in more detail.

4.1 Random Point Model

The Random Point Model (RPM) for barrier coverage is a sensor network whose nodes are distributed randomly along a belt region, according to a uniform distribution or a Poisson distribution, such that the sensors can be deployed equally at any location of the belt region. In RPM, deployments are often characterized by the node density λ. The model is based on the assumption that the deployment of sensors in a deterministic geometric shape over a large area is not practical due to the large deployment cost, so we deploy sensors in the FoI randomly. Given a sufficient number of sensors, and if we do not care how many times we deploy the sensors, then λ is expected to be the same at each point of the FoI. When λ reaches a certain threshold value, both coverage and connection can be guaranteed. In RPM, we need to find the minimum set of sensors to support the whole system and put the other sensors to sleep. When coverage and connection can no longer be met, some of the sleeping sensors wake up to join the minimum serving set.

Theorem 3. For a k-barrier coverage system built in the random point model with an activation rate ρ_a = k/t, denoted as R_t^k, let n_r sensors be deployed in the barrier space with width 2R, where R is the sensing radius of a sensor. Then the total barrier length l_r ≤ 2Rn_r/t².

Proof: In RPM, if the density is high enough, the sensors are deployed in the barrier space according to a uniform distribution, and each sensor occupies a square domain with edge 2R/t. In the barrier space, to support k-barrier coverage with activation rate ρ_a = k/t, we need n_r (2R/t)² ≥ 2R·l_r. So we get l_r ≤ 2Rn_r/t².

If the length l_r = l is given, then the number of sensors required to support k-barrier coverage under RPM is n_r ≥ t²l/(2R).


Figure 3. Barrier Coverage Sensor Network Deployment Models

Figure 4. Barrier Graph and Coverage Graph

If a k-barrier coverage system is built under the random point model, with n_r sensors deployed in the barrier space of width 2R, where R is the sensing radius of a sensor, and with activation rate ρ_a = k/t, then from Equ. (3) the overlap factor θ_r for neighbor nodes satisfies θ_r ≥ 1 − 1/t. When t > 2, θ_r > 1/2, and we know this will result in a type III elimination scheduling a neighbor node to sleep.

4.2 Independent Belt Model

The Independent Belt Model (IBM) for barrier coverage is a sensor network whose nodes are distributed along a belt region as several independent barrier belts: some are active belts, in which all of the nodes are working, and the others are backup belts, in which all of the nodes are sleeping. The assumption for IBM is that for the barrier coverage system we can deploy sensors belt by belt and make these belts work independently. Here we assume the system has already been optimized by the three types of elimination, and each active belt uses the minimum number of sensors.

4.3 Accumulation Point Model

4.3.1 Description of Model

Definition 3. Accumulation Point: An accumulation point is a subgraph G_a = (V_a, E_a) of a sensor network barrier graph G_b = (V_b, E_b) such that for ∀u ∈ V_a and ∀v ∈ V_b, the distance d_uv < R iff v ∈ V_a and the edge e_uv ∉ E_a.

Definition 4. Neighbor Point: If G_a1, G_a2 are two accumulation points of a sensor network barrier graph G_b = (V_b, E_b), and for ∀u ∈ G_a1 there is ∃v ∈ G_a2 such that e_uv ∈ G_b, then we say G_a2 is a neighbor point of G_a1.

For a pair of neighbor points G_a1, G_a2, for ∀ node a ∈ G_a1 and node b ∈ G_a2, a is the Neighbor Node of b and vice versa. For ∀a, c ∈ G_a1, c is the Brother Node of a and vice versa. For each node, the barrier neighbors are separated into barrier left neighbors and barrier right neighbors according to local address information.

The Accumulation Point Model (APM) for barrier coverage is a sensor network whose nodes are distributed along a belt region according to a set of accumulation points {G_a1, G_a2, ..., G_an}; for any node b not in {s, t}, b ∈ G_ai for some i ∈ [1, n]. For a long strip region, it is actually easier to deploy sensors in a series of accumulation points than to deploy them randomly.

Lemma 2. For any accumulation point G_a = (V_a, E_a) in a barrier graph G_b = (V_b, E_b), if nodes A, B ∈ G_a, then A and B cannot be within the same barrier belt.

Proof: Use type III elimination if A, B are in the same belt.

Lemma 3. For a long strip barrier coverage system of arbitrary shape based on the accumulation point model, any accumulation point G_a has exactly two neighbor points.

Proof: Use type I, II eliminations for additional neighbors.

Theorem 4. If a barrier coverage system is based on the accumulation point model, then in each barrier belt the number of vertices equals the number of accumulation points plus 2, and the number of edges equals the number of accumulation points plus 1.

Proof: A barrier belt has two virtual terminal nodes in addition to the nodes it takes from the accumulation points, so the number of vertices equals the number of accumulation points plus 2, and the number of edges equals the number of accumulation points plus 1.

4.3.2 Accumulation Degree and Accumulation Deduction

Definition 5. Accumulation Degree: An accumulation degree is a metric that indicates how well the nodes are accumulated. We denote it as A_t^k, 0 ≤ k ≤ t, where t is the total number of nodes in an accumulation point and k is the number of active nodes required to meet the quality of service.

Specifically, A_0^0 = ∅, i.e. there is no sensor at all. A_t^0 indicates that the accumulation point is a backup point; if we use A_t^0 to describe a barrier belt, then it will be a backup belt.


Observation 1. For a barrier coverage system in a barrier space, if it is constructed from n A_t^k accumulation points, then the activation rate ρ_a = k/t.

Definition 6. Accumulation Deduction: An accumulation deduction is a decomposition and combination of an accumulation degree, converting a high accumulation degree to lower accumulation degrees and vice versa.

Let us look at an example: A_5^3 = A_3^2 + A_2^1 = 3A_1^1 + 2A_1^0. Here we know that for a 3-barrier coverage system we can use A_5^3 APM in one barrier space. If we put the sensors into two barrier spaces, we can use A_3^2 and A_2^1 APM. We can also use the IBM decomposition 3A_1^1 + 2A_1^0, in which we have three A_1^1 working belts and two A_1^0 backup belts.

4.3.3 Analysis of Barrier Length

The barrier lengths of IBM and APM follow the same schema.

Theorem 5. For a barrier coverage deployed in the A_t^k accumulation point model with n accumulation points, total number of nodes n_a = t·n and overlap factor θ, the barrier coverage length is l_a = 2R(1−θ)n.

If the length is given, l_a = l, then the number of nodes required is n_a = tl/(2R(1−θ)). From this and Equ. (3), it is easy to see that under the same activation rate ρ_a = k/t and the same barrier length, the cost of RPM compared with APM is n_r/n_a = t(1−θ).
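The cost comparison can be checked numerically. The Python sketch below evaluates the sensor counts for RPM and APM using the expressions above (as reconstructed); the barrier length, sensing radius, t and θ values are illustrative assumptions.

def rpm_sensor_count(length_m, sensing_radius_m, t):
    """Sensors needed under RPM (Theorem 3 as reconstructed): n_r = t^2 * l / (2R)."""
    return t ** 2 * length_m / (2 * sensing_radius_m)

def apm_sensor_count(length_m, sensing_radius_m, t, theta):
    """Sensors needed under APM (Theorem 5): n_a = t * l / (2R(1 - theta))."""
    return t * length_m / (2 * sensing_radius_m * (1 - theta))

l, R, t, theta = 10_000.0, 50.0, 3, 0.2   # illustrative values
n_r = rpm_sensor_count(l, R, t)
n_a = apm_sensor_count(l, R, t, theta)
print(f"RPM: {n_r:.0f}  APM: {n_a:.0f}  ratio n_r/n_a = {n_r / n_a:.2f} "
      f"(expected t(1 - theta) = {t * (1 - theta):.2f})")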

4.3.4 Permitted Deployment Area

Figure 5. Permitted Deploy Region for Accumulation Point Model

Theorem 6. For a barrier coverage deployed in the A_t^k accumulation point model with overlap factor θ, the permitted deploy region is a circle with radius r = min(Rθ, R(1 − 2θ)/2).

Proof: First, it is easy to see that if θ ≥ 1/2, then by a type III elimination a node will be eliminated, so it must be that θ < 1/2. From Fig. 5, AB = 2Rθ. The minimal distance between each pair of points in the neighboring dotted circles is d_CD, and from the geometry d_CD = 2R(1 − θ) − 2r ≥ R; the maximal distance is d_EF = 2R(1 − θ) + 2r ≤ 2R. So we have r = min(Rθ, R(1 − 2θ)/2). Letting Rθ ≥ R(1 − 2θ)/2, we know the permitted deploy region is a circle with radius r, which gives Equ. (6).

5. Comparison of the Barrier models

5.1 Connection Degree

The connection degree for a sensor node A is the number of active nodes within the communication distance of A. Here, the node we are referring to is a normal node, not including the virtual terminal nodes. We assume the communication distance R_c ≥ 2R.

Observation 2. For barrier coverage in the kA_1^1 + (t − k)A_1^0 independent belt model, the connection degree is at least 2.

Observation 3. For barrier coverage in the A_t^k accumulation point model, the connection degree is at least 3k − 1.

Each internal node A has two sets of neighbors, each with k active nodes, and A also has k − 1 brothers, so its connection degree is at least 2k + (k − 1) = 3k − 1.

Observation 4. For barrier coverage in the R_t^k random point model, the connection degree is at least πk²/2.

Each node lies in a square (Equ. (2)). There are at least t nodes in each cross path, k of them active. The nodes with the least connection degree are the border nodes, for which the number of active nodes in the communication half-disk is n = [π(2R)²/2] / (2R/k)² = πk²/2.

5.2 Network Lifetime If we do not concern the different roles of sensors, we can assume all sensors’ lifetime is independent and follow the

same distribution. Considering a sensor’s lifetime is mainly

determined by battery, we can use the normal distribution to

interpret the lifetime of a sensor node, with random variable

X to indicate a sensor node’s lifetime, the mean lifetime as μ and variance σ, X ∼ N(μ, σ2).

The cumulative distribution function (cdf) is $F_X(x) = P(X \le x) = \Phi\!\left(\dfrac{x-\mu}{\sigma}\right)$, where Φ is the standard normal cdf.

We assume there is no node waste. The sleep state of a node also consumes a little energy, so when a working node depletes, a backup node can only work for a time interval of (1 − )μ. So the lifetime of $A_t^k$ follows a normal distribution of

This can be rewritten as

Denote a random variable Y as an accumulation point's lifetime; then

Next, let us compute the lifetime of a whole belt of $A_t^k$.


Assume there are n accumulation points in the belt. Denote random variable Z as the lifetime of the belt. Then Fap,z(z) ={ the probability that Z ≤ z} = P(Z ≤ z). Denote py as the probability of an accumulation point’s lifetime. Then

Theorem 7. The lifetime for a barrier in the accumulation point model is t(μ − Z_α σ)/k, 0 ≤ α ≤ 1, where Z_α is the z-value in the standard normal cumulative probability table.

Proof: Let us analyze the lower bound and upper bound of the lifetime. Let P(Z ≥ z) = 1 − $F_{ap,z}(z)$.

5.3 Network Failure

5.3.1 Independent Belt Model
There are t belts, k of them active and t − k of them backup, denoted as $kA_1^1 + (t-k)A_1^0$. To ease the analysis, in this evaluation method we assume backup belts are the same as working belts, i.e. they are active, although they do not really work; if one of them fails, we include it in the statistics. The output of this evaluation is a failure probability and a network living time. We do not use the network living time as a real network lifetime, but the failure probability calculated here is a good approximation of the real failure probability. Denote a random variable Y as the living time of a belt in IBM, and $F_{ib}(y)$ = {the probability that Y ≤ y} = P(Y ≤ y). Denote $p_x$ as the probability of the living time for a node. Then

As for the whole barrier system of IBM, there are k active belts and t − k backup belts. We denote a random variable Z as the living time of the whole barrier system. Denote $F_{ib,z}(z)$ = {the failure probability that Z ≤ z} = P(Z ≤ z), and $p_y$ as the probability of the living time for a belt, defined in Equ. (11). Then

5.3.2 Accumulation Point Model
To compute the failure probability, we need to know the living time of an accumulation point $A_t^k$. We use a random variable Y to denote the living time of $A_t^k$. Then $F_{ap,y}(y)$ = {the probability that Y ≤ y} = P(Y ≤ y). We denote $p_x$ as the probability of a node's lifetime defined in Equ. (7). Then

Assume there are n accumulation points in a belt, and denote a random variable Z as the living time of a belt. Then $F_{ap,z}(z)$ = {the probability that Z ≤ z} = P(Z ≤ z). Denote $p_y$ as the probability of an accumulation point's living time defined in Equ. (13). Then

5.3.3 Random Point Model
A belt with an active rate ρa = k/t is denoted as $R_t^k$. Compared with APM, which has n nodes in a belt, from Equ. (5) RPM has $n_r = t(1-\theta)n$ nodes in a belt. So we can use the living time of $A_{t/2}^{k/2}$ APM barrier belts as an approximation for the living time of RPM. We use a random variable Z to denote the living time of $A_{t/2}^{k/2}$. Then $F_{rp,z}(z)$ = {the probability that Z ≤ z} = P(Z ≤ z). Denote $p_z$ as the probability of an $A_{t/2}^{k/2}$ belt's living time. There are $t(1-\theta)n$ nodes in a barrier belt. Then

6. Building the Accumulation Point k-Barrier Coverage

If a barrier space is deployed with sensors in APM, we can implement the building and scheduling algorithms locally using an accumulation point protocol.

6.1 Building Process

Figure 6. After the message "respond barrier edge", the nodes B, C become new heads

Assume the localization of each node is known, using GPS or other localization algorithms. After the sensors are deployed in the barrier space, each sensor, denoted as A, initializes a neighbor-finding process. A broadcasts an "I am here" message, including its location, sensor id, and current status (ready, work or sleep). From the definition of


accumulation point, we know that the neighbors and brothers of A can receive the broadcast message. After a node B receives A's "I am here" message, B can compute the distance dAB and classify A as follows:

if dAB < R then
    A is B's brother;
else if dAB < 2R then
    A is B's neighbor;
else
    discard this message;
end if
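As an illustration only, a minimal C++ sketch of this classification step might look as follows; the `HelloMsg` and `Node` structures and their fields are assumptions made for the sketch, not types from the paper's implementation.

```cpp
#include <cmath>
#include <vector>

// Assumed "I am here" message contents: sender id, position, status.
struct HelloMsg { int id; double x, y; int status; };

struct Node {
    double x, y;                 // this node's own position (from GPS or localization)
    std::vector<int> brothers;   // senders closer than R (same accumulation point)
    std::vector<int> neighbors;  // senders between R and 2R

    // Classify the sender of an "I am here" message; R is the sensing radius.
    void onHello(const HelloMsg& m, double R) {
        double d = std::hypot(m.x - x, m.y - y);
        if (d < R)            brothers.push_back(m.id);
        else if (d < 2.0 * R) neighbors.push_back(m.id);
        // otherwise the message is simply discarded
    }
};
```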

After this process, each sensor will have constructed a list of brothers. Therefore, a sensor can learn the value of t in $A_t^k$ by itself. After the node finds out the value of t, it will broadcast a "my accumulation point" message, including the list of its brothers. Usually, a neighbor node receives the "my accumulation point" message 2t times from its neighbors. It simply separates the neighbors into left neighbors and right neighbors. This finishes our neighbor-finding process.

When the system begins the building process, a node can be chosen randomly to be the Seed. The seed is informed of the value of k in $A_t^k$. The seed selects k − 1 brothers from its brother list to be the Heads of the barrier belts. After a node A is appointed to be a head, it backs off for a random time and then broadcasts a "request barrier edge" message; if it gets responses from left and right neighbors within a waiting time, we go to the next step, or else A backs off for another period.

Let B be the first free left neighbor waiting for a command that hears A's "request barrier edge" message, and let C be the first free right neighbor. B and C will send A a "respond barrier edge" message. When A finds that B and C are the first left and right neighbors to respond, it replies with an "accept" message to B and C respectively. Then B and C become A's barrier neighbors. After this process, B and C become the new barrier heads (see Fig. 6), and each of them repeats the above steps. But this time, B and C only need to receive one "respond barrier edge" message before handing off the role of barrier head. After a node finds that k of its brothers have become barrier heads and it is still free, it will be scheduled to sleep.

6.2 Sleep Schedule
There is no need for an accumulation belt to use belt switch, even if multiple barrier spaces are considered, e.g. ($A_3^2$, $A_4^2$, $A_2^1$). First, for this system, we can use $A_9^5$ to implement it. If we have to use ($A_3^2$, $A_4^2$, $A_2^1$), since in each accumulation point we have already set up backup nodes, we still do not need belt switch. Instead, we need node switch to change the role of nodes in an accumulation point. The following is a sleep scheduling algorithm:

A redundant node A sleeps for ps plus a random delay dr;
A wakes up and broadcasts a "query status" message;
Any working brother of A responds with a "status information" message;
if one message indicates that the node is depleting then
    A invokes the node switch process;
else {after a period pw, A cannot receive k messages back}
    A invokes the barrier building process;
end if

In the above algorithm, we use a wake-up query scheme instead of an activate scheme. The activate scheme is impractical with the current technique [8]: once a sensor has gone to sleep, we cannot wake it up unless the sensor wakes itself up after a short period. In the wake-up scheme, it is possible that a brother node depletes during the sleep period of a node, so there may exist an interval in which the belt degrades to (k − 1)-barrier coverage. Let us denote the period from the time a node begins to report depletion to the time it totally depletes as pd. If we set the fixed sleep period ps = 2pd and the random sleep delay dr ∈ [0, 2pd], then even in the worst case, that is, when t degrades to k + 1 and only one brother node is left as a backup node, the maximum possible degraded monitoring interval is max(ps + dr − 2pd) = 2pd.
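For illustration, a C++ sketch of this wake-up query loop for a redundant node is given below; the radio and scheduling helpers (`sleepFor`, `queryBrothers`, `nodeSwitch`, `rebuildBarrier`) are stubs standing in for functionality the paper does not detail.

```cpp
#include <cstdlib>

// --- Hypothetical stand-ins for the radio/scheduling layer (placeholders) ---
static double uniformDelay(double maxDelay) { return maxDelay * std::rand() / (double)RAND_MAX; }
static void   sleepFor(double /*seconds*/) { /* platform sleep */ }
static int    queryBrothers(bool* someoneDepleting) { *someoneDepleting = false; return 0; }
static void   nodeSwitch()     { /* take over for the depleting brother */ }
static void   rebuildBarrier() { /* re-run the barrier building process  */ }

// Wake-up query scheme for a redundant node (pd: time from the first
// "depleting" report until full depletion, k: barrier degree).
void redundantNodeLoop(double pd, int k) {
    const double ps = 2.0 * pd;                  // fixed sleep period ps = 2*pd
    for (;;) {
        sleepFor(ps + uniformDelay(2.0 * pd));   // sleep ps plus random delay dr in [0, 2*pd]
        bool depleting = false;
        int replies = queryBrothers(&depleting); // broadcast "query status", collect answers
        if (depleting)        nodeSwitch();      // a working brother is depleting
        else if (replies < k) rebuildBarrier();  // fewer than k status replies within pw
    }
}
```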

If we decrease ps, the maximum possible degraded monitoring interval max(ps + dr − 2pd) can be decreased. For k > 1, the wake-up time is not a problem for APM, since node switch in each accumulation point is done locally; even if one belt degrades for a short time due to node switch, the other belts are still working. In IBM, by contrast, the barrier domino phenomenon will be explicit during belt switch unless we assume nodes can be activated, or we use the wake-up scheme and set dr to be very small, which loses the benefit of saving energy while sleeping. RPM does not need belt switch either, so its wake-up scheme is the same as that of APM.

Figure 7. Network Lifetime for Barrier Coverage

Deployment Models

Figure 8. Network Energy Residue Rate for Barrier

Coverage Deployment Models


7. Simulation Results
To evaluate our analysis, we built a simulation platform based on C++. We set the sensing radius R = 10 m and the communication radius Rc = 30 m. Sensors are deployed in a belt region of 3000 m, and for RPM and APM the width of the belt is 20 m. As for IBM, the width is not fixed, in order to make sure the different belts are independent. The value of k is fixed at 3, and t is a variable. The overlap factor is θ = 0.1. The sensors' mean lifetime is μ = 400 and the variance of the lifetime is σ = 20. We deploy different numbers of sensors, so that the node densities are different. Here the network energy residue rate ρr is a metric that measures how much energy remains when the barrier fails.

Fig. 7 indicates that RPM needs more sensors to maintain a certain network lifetime. When the number of sensors is lower than 1800, the network lifetime of RPM is zero. This is because RPM needs a high node density to work. After the density reaches a threshold, RPM's lifetime increases quickly, but it is still lower than IBM's, while IBM's network lifetime is lower than APM's. In this experiment, we schedule the backup nodes and backup belts to sleep in order to save energy and maximize the network lifetime. That is why the network lifetime can reach 3000 days for APM, nearly 9 years.

Figure 9. The Effect of Sensor Quality on Network Lifetime

The network energy residue rate in Fig. 8 is an indicator of the performance-to-price ratio: a low residue rate indicates a high performance-to-price ratio. APM's residue rate is the lowest, which means its waste is the least of all the models. RPM has a residue rate higher than 50%, which means that when the network can no longer meet the quality requirement, less than half of the total energy has been consumed. The graph also indicates that as more sensors are deployed, the residue rate tends to decrease.

In the next experiment, we examine the network failure resistance of the different deployment models. We fix the number of sensors at 3000 and change the sensor lifetime variance σ from 15 to 130 days, in order to model sensors of poor quality. All the other parameters are the same as in the first experiment. We find that IBM's lifetime deteriorates quickly from 1800 to 0, even though the number of sensors remains the same (Fig. 9). As for RPM, although its lifetime is not long, only 390 at the beginning, it does not decrease as sharply as that of IBM. APM's lifetime is the longest; it only decreases a little when the low-quality sensors are used. This indicates that APM is a very robust and reliable structure. Even if the sensors are of poor quality and vulnerable to all kinds of failures, we can still maintain a good network service if we deploy sensors according to APM.

8. Conclusions
Compared with the existing models of barrier coverage, the Random Point Model and the Independent Belt Model, the Accumulation Point Model is an economical, robust and reliable structure: its cost is the least and its failure resistance is the best. The Accumulation Point Model is a long-lived deployment model for sensor networks. Although it originated from barrier coverage, it is possible to use the Accumulation Point Model for area coverage. If we deploy accumulation points instead of single sensors, the fault tolerance and network lifetime will definitely increase. This will be more helpful than simply increasing the node density.

Acknowledgments

This work is supported in part by the National Science Foundation under award number CT-ISG 0716261.

References
[1] P. Balister, B. Bollobas, A. Sarkar, and S. Kumar. Reliable density estimates for coverage and connectivity in thin strips of finite length. In Proceedings of the 13th annual ACM international conference on Mobile computing and networking (MobiCom), pages 75–86, NY, USA, 2007.

[2] Ai Chen, Santosh Kumar, and Ten H. Lai. Designing localized algorithms for barrier coverage. In Proceedings of the 13th annual ACM international conference on Mobile computing and networking (MobiCom), pages 63–74, New York, NY, USA, 2007. ACM.

[3] Ai Chen, Ten H. Lai, and Dong Xuan. Measuring and guaranteeing quality of barrier-coverage in wireless sensor networks. In Proceedings of MobiHoc, pages 421–430, NY, USA, 2008.

[4] X. Cheng, X. Huang, and Xiang-Yang Li. Applications of computational geometry in wireless networks, 2003.

[5] Qunfeng Dong. Maximizing system lifetime in wireless sensor networks. In Proceedings of the 4th international symposium on Information processing in sensor networks (IPSN), page 3, Piscataway, NJ, USA, 2005. IEEE Press.

[6] D. W. Gage. Command control for many-robot systems. In AUVS-92, the Nineteenth Annual AUVS Technical Symposium, pages 28–34. Unmanned Systems Magazine, 1992.

[7] J.A. Fuemmeler and V.V. Veeravalli. Smart sleeping policies for energy efficient tracking in sensor networks. In IEEE Transactions on Signal Processing, pages 2091–2101, Washington, DC, USA, 2008. IEEE Computer Society.

[8] Rajagopal Iyengar, Koushik Kar, and Suman Banerjee. Low-coordination topologies for redundancy in sensor networks. In Proceedings of the 6th ACM international


symposium on Mobile ad hoc networking and computing (MobiHoc), pages 332–342, NY, USA, 2005.

[9] S. Kumar, T.H. Lai, and A. Arora. Barrier coverage with wireless sensors. Wirel. Netw., 13(6):817–834, 2007.

[10] Benyuan Liu, Olivier Dousse, Jie Wang, and Anwar Saipulla. Strong barrier coverage of wireless sensor networks. In MobiHoc ’08: Proceedings of the 9th ACM international symposium on Mobile ad hoc networking and computing, pages 411–420, New York, NY, USA, 2008. ACM.

[11] S. Kumar, T. H. Lai, M. E. Posner, and P. Sinha. Optimal sleep wakeup algorithms for barriers of wireless sensors. In Fourth International Conference on Broadband Communications, Networks, and Systems, Raleigh, NC, 2007.

[12] Xiaorui Wang, Guoliang Xing, Yuanfang Zhang, Chenyang Lu, Robert Pless, and Christopher Gill. Integrated coverage and connectivity configuration in wireless sensor networks. In Proceedings of SenSys, pages 28–39, NY, USA, 2003.

[13] Jerry Zhao and Ramesh Govindan. Understanding packet delivery performance in dense wireless sensor networks. In Proceedings of SenSys, pages 1–13, NY, USA, 2003.

Author’s Profile

Yong (Yates) Lin is a Ph.D. candidate at the University of Texas at Arlington. He received his M.S. degree in Computer Science from the University of Science and Technology of China in 2003. His research interests include sensor networks, pervasive computing, robotics, artificial intelligence, and machine learning.

Zhengyi Le received her Ph.D. in Computer Science from Dartmouth College. She received her B.S. degree in Computer Science from Nanjing University, China. Her research interests include computer security & privacy, recommendation systems, collaboration systems, P2P and sensor networks.

Fillia Makedon received a doctorate in computer science from Northwestern University in 1982. She joined the faculty of the University of Texas at Arlington in 2006. Between 1991 and 2006 she was a professor of Computer Science and Chair of the Master's Program in the Department of Computer Science at Dartmouth College.


Object Handling Using Artificial Tactile Affordance System

Naoto Hoshikawa1, Masahiro Ohka1 and Hanafiah Bin Yussof2

1Nagoya University, School of Information Science,

Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan [email protected]

2Universiti Teknologi MARA, Faculty of Mechanical Engineering

Shah Alam, Selangor Darul Ehsan, 40450, Malaysia

Abstract: If the theory of affordance is applied to robots, performing the entire process of recognition and planning is not always required in its computer. Since a robot’s tactile sensing is important to perform any task, we focus on it and introduce a new concept called the artificial tactile affordance system (ATAS). Its basic idea is the implementation of a recurrent mechanism in which the information obtained from the object and the behavior performed by the robot induce subsequent behavior. A set of rules is expressed as a table composed of sensor input columns and behavior output columns, and table rows correspond to rules; since each rule is transformed to a string of 0 and 1, we treat a long string composed of rule strings as a gene to obtain an optimum gene that adapts to its environment using a genetic algorithm (GA). We propose the Evolutionary Behavior Table System (EBTS) that uses a GA to acquire the autonomous cooperation behavior of multi-finger robotic hands. In validation experiments, we assume that a robot grasps and transfers an object with one or two hands, each of which is equipped with two articulated fingers. In computational experiments, the robot grasped the object and transferred it to a goal. This task was performed more efficiently in the one-hand case than in the two-hand case. Although redundant grasping induces stability, it decreases the efficiency of grasping and transferring.

Keywords: Artificial intelligence, Affordance, Genetic Algorithm, Tactile sensing.

1. Introduction
Tactile sensation possesses a salient characteristic among the five senses because it does not occur without interaction between sensory organs and objects. Touching an object induces deformation of both the object and the sensory organ. Since the tactile sensing of robots is crucial to perform any task [1], we previously focused on tactile sensing and introduced a new concept called the artificial tactile affordance system (ATAS) [2], whose basic idea is an extension of the theory of affordance [3].

ATAS is based on a recurrent mechanism in which the information obtained from the object and the behavior performed by the robot itself induce subsequent behavior. To explain ATAS, we introduce its schematic block diagram (Fig. 1). When the tactile information obtained from the environment is input into ATAS, a command for robot behavior is output, and the robot actuators are controlled based on that command. After that, the environment is

changed due to the deformation and the transformation caused by the robot behavior, and the result is sent as tactile information to the ATAS inlet by a feedback loop.

In ATAS, a key point is producing a set of rules in which the sensor input patterns and behavior output patterns are if-clauses and then-clauses, respectively. This ATAS concept resembles an expert system in artificial intelligence [4] in which the matching degree between the fact selected by a fact database and the if-clause is evaluated; if the fact matches the if-clause, then the then-clause is displayed as a temporal conclusion, and the then-clause is simultaneously added to the fact database. The biggest difference between ATAS and an expert system is that in the latter, the fact database is possessed inside a computer, but in ATAS the whole environment is treated as the fact database.

Figure 1. Artificial tactile affordance system (ATAS)

Since ATAS is categorized as a behavior-based control system, it more closely resembles subsumption architecture (SSA) [5][6]. Although in SSA each connection between sensors and actuators has a priority, in ATAS no modules have priority, and they are arranged in parallel. In each module, we can include a relatively complex procedure if needed. Therefore, ATAS is suited to such rather complex tasks as assembly and walking on uneven ground.

In a previous paper [2], we implemented ATAS based on the following two methodologies. In methodology 1, ATAS is composed of several program modules, and a module is selected from the set of modules based on sensor information to perform a specified behavior intended by a designer. In methodology 2, a set of rules is expressed as a


table composed of sensor input columns and behavior output columns, and each rule is transformed to a string of 0 and 1 treated as a gene to obtain an optimum gene that adapts to the environment using a genetic algorithm (GA) [7]. While methodology 1 was very effective for such fine control as handling tasks of humanoid robots, methodology 2 was very useful to obtain general robotic behavior that was suitable for its environment. However, the possibility of method 2 for object handling must be further pursued because the robot can automatically obtain the relationship between sensor input and behavior output patterns after it has committed to the environment.

In this paper, we apply the Evolutionary Behavior Table System (EBTS) of methodology 2 to the object handling of a two-hand-arm robot, because we are examining the possibility of methodology 2 for fine control. After explaining the behavior acquisition method using EBTS, we derive a simulation procedure for a two-hand-arm robot that was developed in another project. In computational experiments, the robot grasps an object and transfers it to a goal. We examine the difference in grasping efficiency between the one- and two-hand tasks.

Table 1: Behavior tables

2. Behavior Acquisition Method

2.1 Behavior Table
We must define the relationship between environmental information and the behavior rules in a single-layered reflex behavior system to control the behavior of a two-hand-arm robot. In this paper we treat the fingertip as an agent. Stimulus and response are defined by sensor-truth and finger-movement-state tables, respectively (Table 1). These tables are prepared for a robot agent equipped with three sensing elements, as introduced in the next section.

S01, S02, S11, S12, S21, and S22 show the status of the three sensing elements mounted on an agent. Since each sensing element can measure four grades of contact force, a 2-bit signal is used for each sensing element. A0, A1, A2, and A3 show the status of the actuators. While A0 shows whether the agent movement is stopped or active, A1, A2, and A3 show the agent's movement direction. Because the sensor status is described by Boolean values (1 or 0), the total number of sensor status patterns is 2^6 = 64. An agent refers to the current values of A0, A1, A2, and A3 of Table 1 in every set period to decide its behavior.
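A minimal C++ sketch of this table lookup is shown below. The paper does not fix the exact bit layout of the gene, so the row ordering used here is an assumption; `behaviorLookup` maps the three 2-bit sensor grades to the 4-bit actuator command A0–A3.

```cpp
#include <bitset>
#include <cstdint>

// 64 sensor patterns x 4 output bits = 256-bit gene (Eq. (1) with S = 3, M = 4).
constexpr int SENSOR_PATTERNS = 64;
constexpr int OUTPUT_BITS     = 4;
using Gene = std::bitset<SENSOR_PATTERNS * OUTPUT_BITS>;

// Look up the actuator command (A0..A3) for the current sensor reading.
// s0, s1, s2 are the three 2-bit contact-force grades (0..3).
uint8_t behaviorLookup(const Gene& gene, int s0, int s1, int s2) {
    int row = (s0 << 4) | (s1 << 2) | s2;   // 6-bit sensor status -> table row (assumed order)
    uint8_t action = 0;
    for (int b = 0; b < OUTPUT_BITS; ++b)   // read A0..A3 of that row
        if (gene[row * OUTPUT_BITS + b]) action |= (1u << b);
    return action;                          // bit0 = A0 (stop/active), bits1-3 = direction
}
```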

2.2 Behavior Acquisition Method
The behavior table is composed of the truth and motor status tables of the sensors. These tables can be expressed as a one-dimensional array, like a gene, because all of the truth values are Boolean data. Therefore, the behavior table can be designed as a model of a simple genetic algorithm (SGA) comprised of Boolean data [7]. Since the behavior table is evolvable, we call it the Evolutionary Behavior Table System (EBTS). The design procedure of the behavior table using the SGA is as follows.

First, we design a genotype array that holds the information of the behavior table. The length of the genotype array is given by the following formula:

$G = 4^S \times M$  (1),

where S is the number of sensors and M is the bit number of the output gradation. The number of sensor patterns is $4^S$ because each sensor has a 2-bit gradation.

The genotype model used for the numerical experiments described in the subsequent section is shown in Fig. 2. The agent possesses three sensing elements and an output of 4-bit gradation, giving a length of $4^3 \times 4 = 64 \times 4 = 256$ bits as the gene data information.

The agent’s fitness value is calculated in a simulator that is equipped with internal functions that evaluate the efficiency and the task accuracy degrees of the agent. Then the simulator generates a behavior table from the genotype array of a one-dimensional vector composed of G elements (Fig. 2). The agent is evaluated on the basis of the task achievement degree in the simulator field during a specified period. The evaluation value obtained by this simulation is sent to the calculating system for genetic algorithms as fitness.

Figure 2. Genotype model

3. Simulation Procedure

3.5 Two-hand-arm System
Since communication between robots is performed through tactile data in ATAS, we adopted a two-hand-arm robot as an example. Fig. 3 shows a two-hand-arm robot, which is an


advancement over the previously presented one-hand-arm robot [8]. The arm system’s DOF is 5, and each finger’s DOF is 3. To compensate for the lack of arm DOF, this robot uses its finger’s root joint as its wrist’s DOF.

Figure 3. Two-hand-arm robot equipped with optical three-

axis tactile sensors

Figure 4. Three-axis tactile sensor system

On each fingertip, it has a novel, optical three-axis tactile sensor that can obtain not only normal force distribution but also tangential force distribution. See previous articles [8]-[10] for explanatory details. Although ordinary tactile sensors can only detect either normal or tangential force, since this tactile sensor can detect both, the robot can use the sensor information as an effective key to induce a specified behavior.

The tactile sensor is composed of a CCD camera, an acrylic dome, a light source, and a computer (Fig. 4). The light emitted from the light source is directed into the acrylic dome. Contact phenomena are observed as image data, which are acquired by the CCD camera and transmitted to the computer to calculate the three-axis force distribution. The sensing element presented in this paper is comprised of a columnar feeler and eight conical feelers. The sensing elements, which are made of silicone rubber, are designed to maintain contact with the conical feelers and the acrylic dome and to make the columnar feelers touch an object.

3.6 Simulator Design for Robotic Hands
As an object handling task, we adopted an object transportation problem for easy comparison to previous research. In the task, a hemispherical fingertip equipped with three tactile sensing elements is modeled (Fig. 5). The motion of the modeled robotic hand is restricted to two dimensions. Since the wrist is assumed to be fixed, it only manipulates an object with its fingers. We presume that the robotic fingertip transports the object to the goal.

Figure 5. Model of hand and tactile sensor

Figure 6. Evolutionary Behavior Table System (EBTS)

A simulator is composed of a map field (work area) and the objects on it. They are categorized into two types: an autonomous mobile agent defined as a fingertip and a non-autonomous object defined as the transportation object. The autonomous mobile agent can be equipped with multiple sensor inputs and behavior outputs as its module, which is formed to function as a suitable module based on the assumptions of numerical experiments.

In this object transportation problem, the autonomous mobile agent is equipped with three tactile elements. An overall view of the evolutionary behavior table is shown in Fig. 6. The optimization procedure of the genetic algorithm is summarized as follows (a code sketch of this loop is given after the list):

1) The population of random gene data is produced as an initial value.
2) The evolutionary computation engine sends gene data to the simulator to evaluate the gene fitness.
3) Elite genes are selected based on their fitness.
4) A set of individuals is chosen based on roulette wheel selection.
5) A pair of individuals is selected and used for uniform crossover.
6) The newborn children from the pair mutate with a certain probability.
7) The children's gene data are sent to the simulator to evaluate their fitness.
8) The fitness of the elite group is compared with that of its children group. The population of the next


generation is selected from the high-score group in descending order.
9) If it is not the final generation, return to step 3) above.
10) Evolutionary computation is finished.
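A compact C++ sketch of such an SGA loop is given below; `evaluate` is a placeholder for the simulator-based fitness, and the elite handling is simplified compared with step 8) (the elites are carried over directly).

```cpp
#include <algorithm>
#include <random>
#include <vector>

// Minimal simple-GA skeleton following steps 1)-10) above.
using Gene = std::vector<bool>;
static double evaluate(const Gene&) { return 0.0; }   // simulator fitness (stub)

void runSGA(int popSize = 10, int eliteN = 2, int generations = 1000,
            double mutationRate = 0.10, int geneLen = 256) {
    std::mt19937 rng(42);
    std::bernoulli_distribution bit(0.5), mut(mutationRate), coin(0.5);
    std::vector<Gene> pop(popSize, Gene(geneLen));
    for (auto& g : pop) for (int i = 0; i < geneLen; ++i) g[i] = bit(rng);    // 1) random init

    for (int gen = 0; gen < generations; ++gen) {
        std::vector<double> fit(popSize);
        for (int i = 0; i < popSize; ++i) fit[i] = evaluate(pop[i]);          // 2),7) fitness
        std::vector<int> idx(popSize);                                        // 3) elites
        for (int i = 0; i < popSize; ++i) idx[i] = i;
        std::sort(idx.begin(), idx.end(), [&](int a, int b){ return fit[a] > fit[b]; });
        std::vector<Gene> next;
        for (int e = 0; e < eliteN; ++e) next.push_back(pop[idx[e]]);
        double total = 0; for (double f : fit) total += f;
        auto roulette = [&]() {                                               // 4) selection
            if (total <= 0) return pop[rng() % popSize];
            std::uniform_real_distribution<double> u(0, total);
            double r = u(rng), acc = 0;
            for (int i = 0; i < popSize; ++i) { acc += fit[i]; if (acc >= r) return pop[i]; }
            return pop[popSize - 1];
        };
        while ((int)next.size() < popSize) {
            Gene a = roulette(), b = roulette(), child(geneLen);
            for (int i = 0; i < geneLen; ++i) child[i] = coin(rng) ? a[i] : b[i]; // 5) crossover
            for (int i = 0; i < geneLen; ++i) if (mut(rng)) child[i] = !child[i]; // 6) mutation
            next.push_back(child);
        }
        pop = next;                                                           // 8)-9) next generation
    }
}
```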

3.7 Fitness Evaluation by Simulator
After the agents performed object transportation, the task's achievement degree was evaluated as the fitness value of the gene. The evaluation functions for task achievement were composed of the transportation accuracy value $E_1$, which was evaluated from the geometrical relationship between the agents and the transportation object, and the transportation efficiency value $E_2$, which was decided by reaching the goal. Evaluation function $E_1$ is defined as follows:

$E_1 = \dfrac{K_1}{\|\mathbf{c}_A - \mathbf{c}_B\|}$  (2),

where $\mathbf{c}_A$ and $\mathbf{c}_B$ are the position vectors of the object and the goal, respectively. Coefficient $K_1$ is assumed to be 1,000.

On the other hand, evaluation function $E_2$ is defined by

$E_2 = \dfrac{K_2}{t}$  (3),

using the spent time t to transfer the object to the goal area. In this study, we assumed $K_2 = 160$.

The fitness value of the genetic algorithm is calculated as the sum of $E_1$ and $E_2$. In the fitness calculation, the agent often accidentally pushed the transportation object into the goal. To avoid such accidents, we divided the simulation time into task execution and task evaluation, and compared the positioning between the agent and the transportation object at the end of the execution time with that at the end of the evaluation time.

Figure 7 One- and two-hand cases

3.8 Simulator Conditions in Numerical Experiments
In the numerical experiments, we assume one- and two-hand cases (Fig. 7). The map field of each case is assumed to be a Gaussian plane of 800 × 800 [pixel²] with the origin at coordinates (0, 0) in a series of numerical experiments. The transportation object and the agents are placed at specified positions. In Fig. 7(b), four agents are placed near each corner of the rectangular field. Eight initial positions of the object are used so that the adaptation obtained in the task procedure does not depend on a particular starting position.

An individual mobile agent attempts object transportation eight times in each simulation. Evolutionary computation, which continues for 1,000 generations, is repeated ten times for each numerical experimental condition. The number of individuals, the number of maintained elites, and the mutation rate per gene bit are 10, 2, and 10%, respectively.

4. Numerical Experimental Results

4.1 Grasping Behavior
Figure 8 shows the task process using two hands with the four fingers shown in Fig. 7(b). The robot hands successfully transported the object to the center circle, which is the goal. After examining other computational experiments with several initial conditions, we verified that this task does not depend on the initial position of the object.

Figure 8. Four fingertips acquired cooperative behavior as

collective task to convey object

4.2 Comparison Between One- and Two-hand Grasping

Figure 9 shows the result for the one-hand grasping test, which depicts a variation in the fitness value of the first elite, the second elite, and the top score of one population under different generations. The variation in the fitness value of the first elite almost coincided with that of the second elite. Even at 1,000 generations, they don’t seem saturated yet and exceed 600. Besides the first and second elites, the top score of one population increased with additional generations. Therefore, the efficiency of object transportation with one-hand grasping is enhanced with more generations.

The two-hand result is shown in Fig. 10. Although the fitness value of the first and second elites increased with an increase of generation, similar to the one-hand case, it almost becomes saturated at 1,000 generations. The


difference between the fitness values of the first and second elites is greater than that of the one-hand case. The variation in the top score of one population becomes steady after around 100 generations. Moreover, the fitness value of the first elite at 1,000 generations is around 560, which is smaller than that of the one-hand case. Therefore, the efficiency of object transportation with two-hand grasping saturates very quickly.

Since object grasping and transportation can be performed by one hand, the two-hand case is obviously redundant. Although redundant grasping induces stability, it decreases the efficiency of grasping and transferring.

Finally, we discuss the efficiency of acquiring the autonomous cooperation behavior of multiple fingers in this EBTS using the GA. In this numerical experiment, the number of simulation trials was 100,000, as a result of optimizing the gene data over 1,000 generations. The final truth table obtained from the gene data does not always assure an optimum solution, but the calculation cost is reduced from $1.2 \times 10^{77}$ to $1.0 \times 10^{5}$, because the number of combinations of the input and output patterns is $2^{256} \cong 1.2 \times 10^{77}$ according to Eq. (1). If we used a top-down methodology, $1.2 \times 10^{77}$ trials would be needed to specify the optimum pattern. Therefore, we accomplished the automatic design of cooperative behavior of multi-agents for collective tasks.

Figure 9. Relationship between maximum fitness and generation for one-hand case

Figure 10. Relationship between maximum fitness and generation for two-hand case

5. Conclusion
We implemented an artificial tactile affordance system (ATAS) based on the following methodology. A set of rules is expressed as a table composed of sensor input columns and behavior output columns, and a row of the table corresponds to a rule. Since each rule is transformed to a string of 0 and 1, we treat a long string composed of rule strings as a gene to obtain an optimum gene that adapts to the environment using a genetic algorithm (GA). We propose the Evolutionary Behavior Table System (EBTS) using a GA to acquire the autonomous cooperation behavior of multiple mobile robots.

In validation experiments, we assume that a robot grasps and transfers an object with one or two hands; each hand is equipped with two articulated fingers. In computational experiments, the robot grasped the object and transferred it to the goal. This task is performed more efficiently in the one-hand case than in the two-hand case. Although redundant grasping induces stability, it decreases the efficiency of grasping and transferring. Although the present method requires many computational resources, it is attractive because it is very useful for obtaining general robotic behavior that is suitable for its environment.

References
[1] M. Ohka, "Robotic Tactile Sensors," Encyclopedia of Computer Science and Engineering, Wiley Encyclopedia of Computer Science and Engineering, 5-Volume Set, Editor: Benjamin W. Wah, pp. 2454–2461 (Vol. 4), 2009.
[2] M. Ohka, N. Hoshikawa, J. Wada, and H. B. Yussof, "Two Methodologies Toward Artificial Tactile Affordance System in Robotics," International Journal on Smart Sensing and Intelligent Systems, Vol. 3, No. 3, pp. 466-487, 2010.
[3] J. J. Gibson, "The Ecological Approach to Visual Perception," Houghton Mifflin Company, 1979.
[4] P. H. Winston, "Artificial Intelligence (second edition)," Addison-Wesley, pp. 159-204, 1984.
[5] Rodney A. Brooks, "A robust layered control system for a mobile robot," IEEE Journal of Robotics and Automation, RA-2(1), pp. 14-23, 1986.
[6] Rodney A. Brooks, "Intelligence without representation," Technical Report, MIT AI Lab, 1988.
[7] D. E. Goldberg, "Genetic Algorithms in Search, Optimization and Machine Learning," Addison-Wesley, 1989.
[8] H. B. Yussof, J. Wada, and M. Ohka, "Object Handling Tasks Based on Active Tactile and Slippage Sensations in Multi-Fingered Humanoid Robot Arm," 2009 IEEE International Conference on Robotics and Automation, pp. 502-507, 2009.
[9] M. Ohka, H. Kobayashi, J. Takata, and Y. Mitsuya, "An Experimental Optical Three-axis Tactile Sensor Featured with Hemispherical Surface," Journal of


Advanced Mechanical Design, Systems, and Manufacturing, Vol. 2, No. 5, pp. 860-873, 2008.

[10] M. Ohka, J. Takata, H. Kobayashi, H. Suzuki, N. Morisawa, and H. B. Yussof, "Object Exploration and Manipulation Using a Robotic Finger Equipped with an Optical Three-axis Tactile Sensor," Robotica, Vol. 27, pp. 763-770, 2009.


Combined Tactile Display Mounted on a Mouse

Yiru Zhou1, Cheng-Xue Yin1, Masahiro Ohka1 and Tetsu Miyaoka2

1Nagoya University, School of Information Science, Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan

[email protected]

2Shizuoka Institute of Science and Technology, Faculty of Comprehensive Informatics 2200-2, Toyosawa, Fukuroi, 437-8555, Japan

Abstract: In previous haptic presentation devices, the combination effects of distributed pressure and slippage force sensations have not been investigated despite their potential for virtual reality systems. In this study, we developed a mouse capable of presenting combined stimulation in order to discuss combination effects on virtual reality. The mouse was equipped with a bimorph-piezoelectric-actuator array and a two-dimensional electro-magnetic linear motor to present pressure and slippage force, respectively. To evaluate the mouse's presentation accuracy, we performed a series of tracing experiments on virtual figure contours. Since the deviation errors of the combined presentation were smaller than those of the pressure-only presentations, the combined presentation is effective for virtual reality because the slippage force applied at the edges makes edge tracing easier.

Keywords: Human interface, Virtual reality, Tactile mouse, Combination presentation, Slippage force.

1. Introduction
In the research fields of virtual reality technology and

tele-existence [1], several display mechanisms have been tentatively presented for tactile displays, and visual and auditory displays have already been established, including head-mounted displays and five-channel surround-sound systems. So far, for display mechanisms, researchers have adopted different mechanisms such as vibrating pin arrays [2], surface acoustic waves [3], pin arrays driven by pneumatic actuators [4], stepping-motor arrays [5][6], DC-servo-motor arrays [7], piezoelectric actuators [8][9], haptic devices [10][11], and mechanochemical actuators made of ionic conducting polymer gel film (ICPF) [12].

On the other hand, the mechanoreceptive units of human tactile organs are known as the Fast Adapting type I unit (FA I), the Fast Adapting type II unit (FA II), the Slowly Adapting type I unit (SA I), and the Slowly Adapting type II unit (SA II). FA II can perceive a mechanical vibration of 0.2 μm in amplitude. FA I or II can perceive a surface unevenness of 3 μm in amplitude. SA I can perceive a pattern formed with Braille dots. According to one of this paper’s author’s previous studies, SA II is a mechanoreceptive unit that contributes to sense tangential stimulus such as finger slippage [13], [14]. As mentioned above, the four mechanoreceptive units accept different stimuli corresponding to each unit’s characteristics, while the above tactile displays only present a single kind of stimulus.

One of this paper’s authors, Miyaoka, obtained human psychophysical thresholds for normal and tangential vibrations on the hand by measuring the vibrotactile thresholds on the distal pad of the left index finger by transmitting tangential-sinusoidal vibrations onto the skin surface with a small contactor [13], [14]. The tangential and normal vibration stimuli were transmitted with a 2.5-mm diameter contactor. Miyaoka attached the contactor to a vibrator that was assembled to produce tangential vibrations, which were transmitted to the distal pad of the subject’s left index finger. The stimuli amplitudes were changed based on the Parameter Estimation Sequential Testing (PEST) procedure [15]. The average tangential-threshold curve decreased linearly from 4 to 50 Hz, increased gently until 100 Hz, and decreased again above 100 Hz. A U-shaped curve was observed between 100 and 350 Hz. The shape of the tangential-threshold curve indicates that at least two types of mechanoreceptors determine the shape of the curve. The normal- and the tangential-threshold curves did not overlap between 4 and 50 Hz and were U-shaped, but they did overlap between 100 and 350 Hz. From the above experimental results, Miyaoka suggested that FA II is the mechanoreceptor above 100 Hz and that the slowly adapting type II unit (SA II) contributed to determining the shapes of tangential-threshold curves under 50 Hz.

In the present paper, we developed a new tactile display mounted on a mouse capable of presenting not only distributed pressure but also tangential force because in Miyaoka’s above experiment, human tactile sensation showed different characteristics between normal and tangential stimuli. The mouse is equipped with a bimorph-piezoelectric-actuator array and a two-dimensional electro-magnetic-linear motor to present pressure and slippage forces, respectively. To evaluate the mouse’s presentation accuracy, we performed a series of experiments that traced the virtual figure contours. The distributed pressure display stimulates FA I, FA II, and SA I, while the x-y linear motor stimulates SA II. We evaluated the deviation from the desired trajectory for each virtual figure to confirm the effect on the present combined stimulus.

(IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 10, 2010

39

2. Tactile Display

2.3 Overview
Based on Miyaoka's experimental results, the tactile

display should generate not only normal force but also tangential force, since FA I and SA II accept different directional stimuli that correspond to each other’s characteristics. In this paper, we are developing a new tactile display mounted on a mouse capable of presenting both distributed pressure and tangential forces.

A block diagram for controlling the present tactile mouse is shown in Fig. 1. The x and y-directional position data of the mouse cursor are sent to a computer through a USB interface to calculate the output stimuli of the distributed pressure and the tangential forces on the basis of the positioning relationship between the mouse cursor and the virtual texture. The stimulus pins of the display pad are driven by the driver boards for the piezoelectric actuator installed on the Braille dot cell controlled by the DIO board to generate the specified pin protruding pattern. The signals for the tangential force are transmitted to a motor control board installed in the mouse through a USB interface to generate slippage force.

Figure 1. Schematic block diagram of present system

Figure 2. Braille dot cell for pin protruding pattern

2.4 Distributed Pressure Display
In this paper, we use the Braille dot cell (SC2, KGS, Co.) [16] as the distributed pressure display. Since the piezoelectric ceramic actuator in the SC2 is a bimorph type, its motion is controlled by the voltage applied to the core electrode. When a source voltage of 200 V is applied between the upper and bottom electrodes, the end of the actuator bends downward if 200 V is applied to the core electrode. If 0 V is applied, the end of the actuator bends upward. The stroke of the upward and downward motion is about 1 mm. The strength of the generated force is around 0.1 N. The stimulus pin is moved up and down based on the upward and downward bending of the piezoelectric ceramic actuator. Since we used three Braille dot cells, a 6 × 4 dot pattern is formed on the display pad.

Figure 3. Principle of x-y linear motor

Figure 4. Combined tactile display mounted on a mouse

2.5 Slippage Force Display and Combined Stimulation Type of Tactile Mouse

We used Fuji Xerox’s tactile mouse [17] as a slippage force display and a mouse to transmit the tangential force on the fingertip. The mouse is equipped with an x-y linear motor on an optical mouse to generate static shearing force and its vibration.

Figure 3 shows the principle of the linear motor. It is equipped with x- and y-directional sliders. The x-directional slider resembles a square frame, and the y-directional slider


is a bar that moves linearly within the square frame. Each slider has an 8-shaped coil (Fig. 3). If the current flows as shown by the arrows, the magnetization directions generated in the top- and bottom-half x-directional coils are upward and downward, respectively. As a result, the slider moves to the right. The y-directional slider moves in the downward direction, since the magnetization directions of the right- and left-half coils are downward and upward, respectively. Force signals are transmitted to a motor control board installed in the mouse through the USB interface. The x-y linear motor generates x-y directional forces of up to about 7 N.

The left figure in Fig. 4 shows the appearance of Fuji Xerox’s tactile mouse installed in the x-y linear motor. Users put their index fingers on the circular concave portion of the mouse. Since the circular concave portion is vibrated tangentially using the x-y linear motor, they can experience a virtual texture feeling.

We developed a combined stimulation type tactile mouse by combining the distributed pressure display and the Fuji Xerox’s tactile mouse, as shown on the right in Fig. 4. In this mouse, the distributed pressure display is mounted on the x-y linear motor after the circular concave portion and the top cover of the tactile mouse are removed. If user fingers are placed on the tactile mouse’s display pad and move the mouse on the virtual texture, users can feel a convex or concave surface based on the virtual texture.

3. Experimental Procedure
To evaluate the present tactile mouse, we performed edge tracing tests for virtual circles, triangles, and squares (Fig. 5). According to a previous feasibility study, we found that operators could not judge whether their fingers were on the virtual figure or not: if the positions of all the stimulus pins are on a virtual figure and the fingers remain on it, all stimulus pins keep protruding and operators cannot judge whether the mouse cursor is on the figure, because their tactile sense adapts to the continued protrusion of the pins. To prevent this problem, the shearing force was generated in proportion to the cursor's speed when the cursor is on a virtual figure. In addition, when the cursor enters a virtual figure, a reaction force is generated in the direction of the edge of the virtual figure to emphasize the edge line (Fig. 6). Four male subjects performed the edge tracing tests. Their average age was about 26.
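As an illustration of this force rule, a C++ sketch for a virtual circle is given below; the gains `kSpeed` and `kEdge` and the exact function shape are assumptions made for the sketch, not values from the experiment, and here the edge-directed force is applied whenever the cursor is inside the figure, which is a simplification.

```cpp
#include <cmath>

struct Vec2 { double x, y; };

// Slippage-force rule sketched from the description above, for a virtual
// circle of radius r centred at c: while the cursor is on the figure the
// shearing force is proportional to the cursor speed, and a reaction force
// toward the figure's edge is added to emphasize the edge line.
Vec2 slippageForce(Vec2 cursor, Vec2 velocity, Vec2 c, double r,
                   double kSpeed = 0.5, double kEdge = 2.0) {
    double dx = cursor.x - c.x, dy = cursor.y - c.y;
    double d  = std::sqrt(dx * dx + dy * dy);
    if (d >= r) return {0.0, 0.0};          // cursor outside the virtual figure
    Vec2 f{0.0, 0.0};
    if (d > 1e-9) {                         // reaction toward the nearest edge point
        f.x = kEdge * dx / d;
        f.y = kEdge * dy / d;
    }
    f.x += kSpeed * velocity.x;             // shearing proportional to cursor speed
    f.y += kSpeed * velocity.y;
    return f;
}
```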

Figure 5. Virtual objects

Figure 6. Slippage stimulation generated on virtual edge

Figure 7. Trajectory obtained by virtual circle tracing: (a) only distributed pressure; (b) distributed pressure and slippage force. (Axes: x-coordinate and y-coordinate, in pixels.)


4. Experimental Results and Discussion
We exemplify subject A's edge tracing trajectories for the circles in Fig. 7. The abscissa and ordinate show the x- and y-coordinates of the cursor on the computer screen. The desired trajectory is depicted by the solid line in Fig. 6. As shown in Fig. 7, the tracing precision of the pressure and tangential force presentation is higher than that of the pressure-only presentation.

Additionally, the number of points recorded with pressure-only stimulation is larger than that recorded with the pressure and tangential force presentation: 986 points were recorded in the pressure-only presentation, but 468 points in the combined stimulation. Since the sampling time is constant, the times required for the pressure-only stimulation and the combined stimulation were about 30 and 15 seconds, respectively. This result indicates that operator recognition of the virtual figure is enhanced by the pressure and tangential force stimulation.

Figure 8 shows the edge tracing trajectories for a square. Compared to Fig. 7, the enhancement of the tracing precision obtained using combined stimulation is not as clear. Table 1 summarizes the mean deviation from the desired trajectory among the four subjects. Since the deviation of the pressure and tangential force presentation is smaller than that of the pressure-only presentation for every shape, the presentation capability is enhanced by the combined stimulation, even though the effect is small for the triangles and squares. Since the stimulus pins are aligned in a square grid, horizontal and vertical lines, including those in the triangles and squares, are easily recognizable without the tangential stimulus. Therefore, using a combined stimulus is especially effective for such difficult shapes as circles, which do not include straight lines.

Figure 8. Trajectory obtained by virtual square tracing: (a) only distributed pressure; (b) distributed pressure and slippage force. (Axes: x-coordinate and y-coordinate, in pixels.)

5. Conclusion
To enhance the presentation capability of a tactile

mouse, we developed a tactile mouse capable of distributed pressure and tangential force. The distributed pressure stimulation was generated using a distributed pressure display based on a two-point differential threshold. It possesses a 6-by-4 array of stimulation points. The distance between the two adjacent stimulus pins is 2.0 mm. We adopted Fuji Xerox’s tactile mouse as a slippage force display and a mouse to transmit tangential force on the fingertip. We equipped the tactile mouse with an x-y linear motor on the optical mouse to generate static shearing force and its vibration. We also developed the combined stimulation type of tactile mouse by combining the distributed pressure display and Fuji Xerox’s tactile mouse. To evaluate our tactile mouse, we performed edge tracing tests for virtual circles, triangles, and squares with four male subjects. The experimental results showed that presentation capability was enhanced with the combined stimulation because the edge tracing precision obtained using combined stimulation exceeded that using only pressure stimulation. Subsequently, we are continuing to verify the present tactile mouse with a new psychophysical experiment.

Table 1: Deviation errors (Unit: pixel)

Shape        Pressure stimulus only    Pressure plus shearing stimuli
Circles      15.9                      6.2
Triangles    14.8                      12.2
Squares      15.9                      15.2


Acknowledgments The authors express their appreciation to Fuji Xerox for their donation of a tactile mouse.

References
[1] S. Tachi and K. Yasuda, "Experimental Evaluation of

Tele-Existence Manipulation System,” Proc. of the 1993 JSME Int. Conf. on Advanced Mechatronics, Tokyo, 110-115, 1993.

[2] Y. Ikei, M. Yamada, and S. Fukuda: Tactile Texture Presentation by Vibratory Pin Arrays Based on Surface Height Maps, Proc. of Int. Mechanical Engineering Conf. and Exposition, 51-58, 1999.

[3] M. Takahashi, T. Nara, S. Tachi, and T. Higuchi: “A Tactile Display Using Surface Acoustic Wave,” Proc. of the 2000 IEEE Inter. Workshop on Robot and Human Interactive Communication, pp. 364-377, 2000.

[4] Y. Tanaka, H. Yamauchi, and K. Amemiya, “Wearable Haptic Display for Immersive Virtual Environment,” Proc. 5th JFPS International Symposium on Fluid Power, Vol. 2, pp. 309-314, 2002.

[5] M. Shinohara, Y. Shimuzu, and A. Mochizuki, “Three-Dimensional Tactile Display for the Blind,” IEEE Trans. on Rehabilitation Engineering, Vol-3, pp. 249-1998, 1998.

[6] M. Shimojo, M. Shinohara, and Y. Fukui, “Human Shape Recognition Performance for 3-D Tactile Display,” IEEE Trans. on Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol. 29-6, pp. 637-647, 1999.

[7] H. Iwata, H. Yano, F. Nakaizumi, and R. Kawamura, “Project FEELEX: Adding Haptic Surface to Graphics,” Proc. ACM SIGGRAPH 2001, pp. 469-475, 2001.

[8] T. Watanabe, K. Youichiro, and T. Ifukube: “Shape Discrimination with a Tactile Mouse,” The Journal of the Institute of Image Information and Television Engineers, Vol. 54, No. 6, pp. 840-847 2000. (in Japanese)

[9] M. Ohka, H. Koga, Y. Mouri, T. Sugiura, T. Miyaoka, and Y. Mitsuya: Figure and Texture Presentation Capabilities of a Tactile Mouse Equipped with a Display Pad Stimulus Pins, Robotica, Vol. 25, pp. 451-460, 2007.

[10] J. K. Salisbury and M. A. Srinivasan: “Phantom-Based Haptic Interaction with Virtual Objects,” IEEE Computer Graphics and Applications, Vol.17, No.5, pp6-10, 1997.

[11] PHANToM, http://www.sensable.com/products/phantom_ghost

[12] M. Konyo, S. Tadokoro, T. Takamori, K. Oguro, and K. Tokuda: “Tactile Feeling Display for Touch of Cloth Using Soft High Polymer Gel Actuators,” JRSJ, Vol. 6-4, pp. 323-328, 2000. (in Japanese)

[13] T. Miyaoka: “Measurements of Detection Thresholds Presenting Normal and Tangential Vibrations on Human Glabrous Skin,” Proceedings of the Twentieth Annual Meeting of the International Society for Psychophysics, 20, 465-470, 2004.

[14] T. Miyaoka: “Mechanoreceptive Mechanisms to Determine the Shape of the Detection-threshold Curve Presenting Tangential Vibrations on Human Glabrous Skin,” Proceedings of the 21st Annual Meeting of The International Society for Psychophysics, 21, 211-216, 2005.

[15] M. M. Taylor and C. D. Creelman: “PEST: Efficient Estimate on Probability Functions,” The Journal of the Acoustical Society of America, l41, 782-787, 1967.

[16] Braille Cells, http://www.kgs-jpn.co.jp/epiezo.html [17] K. Sakamaki, K. Tsukamoto, K. Okamura, T. Uchida,

and Y. Okatsu: “A Tactile Presentation System Using a Two-dimensional Linear Actuator,” Research Report of Human Interface Association, Vol. 1, No.5, pp. 83-86, 1999. (in Japanese)


Active Constellation Extension combined with Particle Swarm Optimization for PAPR Reduction

in OFDM Signals

A. Ouqour1, Y. Jabrane2, B. Ait Es Said3, A. Ait Ouahman4

1,3Cadi Ayyad University, Faculty of Sciences Semlalia, Department of physics, Avenue

Prince My Abdellah, P.O. Box 2390, 40001, Marrakech, Morocco [email protected] , [email protected]

2,4Cadi Ayyad University, National School of Applied Sciences, Avenue Abdelkarim

Khattabi, P.O. Box 575, 40001, Marrakech, Morocco [email protected] , [email protected]

Abstract: The Active Constellation Extension - Approximate Gradient Project (ACE-AGP) algorithm is an interesting method for reducing the envelope fluctuations of multi-carrier signals, but it incurs a very high computational complexity and thus a long convergence time. In this paper, Particle Swarm Optimization (PSO) is introduced to combat this complexity by searching for and reducing the Peak-to-Average Power Ratio (PAPR) with respect to ACE-AGP. Simulation results show that the proposed method solves the PAPR problem with very low complexity and better performance.

Keywords: OFDM, PAPR, PSO, ACE-AGP.

1. Introduction
Multi-carrier modulations such as Orthogonal Frequency Division Multiplexing (OFDM) have been advocated for the Long Term Evolution (LTE) of wireless personal communications [1]. However, they undergo large envelope fluctuations, causing a loss in energy efficiency due to the need for power back-off at the High Power Amplifiers (HPA). Several proposals in the literature try to reduce these envelope fluctuations to combat this problem, which is one of the most important drawbacks of multi-carrier modulations [2]-[8]. Active Constellation Extension (ACE) [9] is an interesting technique since it is able to achieve large reductions, it does not need side information, and it only needs a small increase in transmit power. However, ACE requires many iterations for convergence, and this, unfortunately, constitutes its main weakness [10]. In this paper, ACE will be used combined with PSO, with the aim of drastically reducing its implementation complexity. To that end, we will use Particle Swarm Optimization (PSO), a well-known heuristic tool for solving complexity problems [11]-[14]. The balance of this paper is organized as follows. Section 2 introduces the system model. Section 3 describes the ACE method. In Section 4, the proposed PSO-ACE-AGP architecture to reduce the envelope fluctuations is described and analyzed. Then, the obtained results are presented and discussed in Section 5. Finally, conclusions are drawn in Section 6.

2. System Model
In a multi-carrier system, the time-domain complex base-band transmitted signal for the ℓ-th symbol can be written as an inverse FFT of the frequency-domain complex base-band symbols, where N is the number of sub-carriers and each frequency-domain symbol is modulated on the k-th sub-carrier of the ℓ-th OFDM symbol. The classical metric to evaluate the peak power is the PAPR, defined as the ratio of the peak power to the average power of the transmitted signal; the expectation appears in the average, and the peak and the average involve the ∞-norm and the 2-norm of the signal, respectively.
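The two display equations referenced above did not survive extraction. In the standard notation commonly used for OFDM and PAPR (an assumption, since the original symbols are not recoverable), they read:

```latex
x_\ell(n) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_{k,\ell}\, e^{j 2\pi k n / N}, \quad n = 0, \dots, N-1,
\qquad
\mathrm{PAPR}(x_\ell) = \frac{\lVert x_\ell \rVert_\infty^2}{E\{\lVert x_\ell \rVert_2^2\}/N},
```

where X_{k,ℓ} denotes the frequency-domain symbol on the k-th sub-carrier of the ℓ-th OFDM symbol.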

3. ACE method
The ACE method modifies and expands the constellation points within an allowable region which does not affect the demodulation slicer, and thus it does not need side information to be sent. By using this new degree of freedom, multi-carrier signals with arbitrarily low envelope fluctuations can be obtained. In [9], different algorithms to achieve PAPR reduction are provided. In this paper, the Approximate Gradient-Project (AGP) algorithm is used. Let the frequency-domain version of the time-domain signal (i.e. its FFT) and the clipping amplitude value Q be given. The algorithm proceeds as follows:

• Initialization: i accounts for the iteration index.
• Clip any time-domain sample whose magnitude exceeds Q, preserving its phase.
• Compute the clipped-off signal, i.e. the difference between the clipped and the original time-domain signals.


• Calculate the FFT of the clipped-off signal to obtain its frequency-domain version.
• Maintain its values when they are valid point extensions of the constellation and set them to 0 when not. Apply an IFFT to obtain c.
• Determine the step size according to some criterion and compute the new version of the time-domain signal.
• Calculate the PAPR of the new signal. If it is acceptable, stop the algorithm and return the signal as output; otherwise, increase i and go to step 2. Iterate until the target is accomplished or a maximum number of iterations is reached.
A minimal sketch of this clipping-and-projection iteration is given below.
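The following Python sketch illustrates one way such an AGP-style iteration could be implemented for QPSK. It is only an illustration: the function and parameter names, the choice of clipping level and the stopping rule are assumptions, not the authors' implementation.

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a complex base-band block, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

def ace_agp(X, clip_db=4.0, mu=1.0, max_iter=10):
    """AGP-style sketch for one QPSK frequency-domain OFDM symbol X."""
    N = len(X)
    x = np.fft.ifft(X) * np.sqrt(N)                  # time-domain symbol
    Q = np.sqrt(np.mean(np.abs(x) ** 2)) * 10 ** (clip_db / 20)   # clip level
    for _ in range(max_iter):
        mag = np.abs(x)
        if mag.max() <= Q:
            break                                    # target reached
        x_clip = np.where(mag > Q, Q * x / mag, x)   # clip, preserving phase
        c = np.fft.fft(x_clip - x) / np.sqrt(N)      # clipped-off portion (freq.)
        # Keep only components that are valid extensions of the QPSK points
        # (they push the point outward on both axes); zero all the others.
        keep_re = np.sign(c.real) == np.sign(X.real)
        keep_im = np.sign(c.imag) == np.sign(X.imag)
        c = c.real * keep_re + 1j * c.imag * keep_im
        x = x + mu * np.fft.ifft(c) * np.sqrt(N)     # gradient-project step
    return x

# Example: one QPSK symbol with N = 256 sub-carriers
rng = np.random.default_rng(0)
X = (rng.choice([-1.0, 1.0], 256) + 1j * rng.choice([-1.0, 1.0], 256)) / np.sqrt(2)
print(round(papr_db(np.fft.ifft(X) * 16), 2), "->", round(papr_db(ace_agp(X)), 2))
```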

4. Particle Swarm Optimization
Particle swarm optimization (PSO) is a population-based stochastic optimization technique: the system is initialized with a population of random solutions and searches for optima by updating generations. However, PSO has no evolution operators such as crossover and mutation. In PSO, the potential solutions, called particles, fly through the problem space by following the current optimum particles according to a specific algorithm. In a previous work [15], a suboptimal partial transmit sequence (PTS) scheme based on the particle swarm optimization (PSO) algorithm was presented for low computational complexity and for the reduction of the peak-to-average power ratio (PAPR) of an orthogonal frequency division multiplexing (OFDM) system. The procedure of standard PSO can be summarized as follows:

PSO algorithm

In this work, during the PSO process, each potential solution is represented as a particle with a position vector, referred to as the clipped-off portion, and a moving velocity.

PSO ACE-AGP implementation

Thus, for a K-dimensional optimization, the position and the velocity of the i-th particle are represented as K-dimensional vectors. Each particle has its own best position, corresponding to the individual best objective value obtained so far at time t, referred to as pbest. The global best (gbest) particle represents the best particle found so far at time t in the entire swarm. The new velocity at time t + 1 for particle i is updated by:
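The velocity-update equation itself is missing from the extracted text. Its standard form, consistent with the description that follows (inertia weight w, acceleration factors c1 and c2, random numbers r1 and r2), is reproduced below; the symbol names are assumptions.

```latex
v_i(t+1) = w(t)\, v_i(t) + c_1 r_1 \left( pbest_i(t) - x_i(t) \right) + c_2 r_2 \left( gbest(t) - x_i(t) \right)
```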

The first term involves the old velocity of particle i at time t. As is apparent from this equation, the new velocity is related to the old velocity weighted by the inertia weight w, and is also associated with the position of the particle itself and with that of the global best one through the acceleration factors c1 and c2. The factors c1 and c2 are therefore referred to as the cognitive and social rates, respectively, because they represent the weighting of the acceleration terms that pull the individual particle toward the personal best and global best positions. The inertia weight w in eq. (7) is employed to manipulate the impact of the previous history of velocities on the current velocity. Generally, in population-based optimization methods, it is desirable to encourage the individuals to wander through the entire search space, without clustering around local optima, during the early stage of the optimization. A suitable value of w(t) provides the desired balance between the global and local exploration ability of the swarm and, consequently, improves the effectiveness of the algorithm. Experimental results suggest that it is preferable to initialize the inertia weight to a large value,

[Figure: block diagram of the proposed scheme, in which the original OFDM signals are processed by the modified ACE-AGP together with PSO to produce OFDM signals with reduced PAPR, and flowchart of the PSO algorithm: define the solution space; generate a random initial population; evaluate the fitness of each particle and store the personal and global best positions; update the personal and global best positions according to the fitness values; adjust the inertia weight (a monotonically decreasing function); update each particle's velocity and position; check the stop criterion (G = G + 1); take the decision and end.]


giving priority to global exploration of the search space, and then to decrease w(t) linearly so as to obtain refined solutions [13], [14]. For the purpose of simulating the slightly unpredictable component of natural swarm behaviour, two random functions r1 and r2 are applied to independently provide uniformly distributed numbers in the range [0, 1], in order to stochastically vary the relative pull of the personal and global best particles. Based on the updated velocities, the new position of particle i is computed by adding the new velocity to the current position. The population of particles is then moved according to the new velocities and locations calculated by (11) and (12), and tends to cluster together from different directions. Thus, the evaluation of the fitness associated with the new population of particles begins again. The algorithm runs through these processes iteratively until it stops. In this paper, the current position can be modified following [16]:

where the quantities involved are the initial inertia weight, the final inertia weight, the maximum number of iterations, and the current iteration number, respectively.
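To make the procedure concrete, the following is a minimal sketch of standard PSO with a linearly decreasing inertia weight. The parameter names (w_max, w_min, c1, c2) and the toy fitness function are assumptions for illustration; in the proposed scheme the fitness would be the PAPR obtained after the modified ACE-AGP step.

```python
import numpy as np

def pso(fitness, dim, n_particles=20, n_gen=40, w_max=0.9, w_min=0.4,
        c1=2.0, c2=2.0, bounds=(-1.0, 1.0), seed=0):
    """Standard PSO minimizing `fitness`; returns the global best position/value."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))      # positions
    v = np.zeros((n_particles, dim))                 # velocities
    pbest = x.copy()                                 # personal best positions
    pbest_val = np.array([fitness(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()         # global best position
    for t in range(n_gen):
        # linearly decreasing inertia weight (cf. [13], [14])
        w = w_max - (w_max - w_min) * t / max(n_gen - 1, 1)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)                   # position update
        vals = np.array([fitness(p) for p in x])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, float(pbest_val.min())

# Toy usage: minimize the squared norm (a stand-in for the PAPR objective)
best_x, best_val = pso(lambda p: float(np.sum(p ** 2)), dim=8)
print(round(best_val, 6))
```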

5. Results
Monte Carlo simulations have been carried out with 10000 randomly generated QPSK-modulated OFDM symbols for N=256, N=512 and N=1024. In Figure 1, we can observe that the proposed method outperforms the one based on ACE-AGP, since the PAPR reduction of PSO-ACE-AGP when Gn = 40 is about 7 dB, 5.9 dB and 4.9 dB for N=256, 512 and 1024, respectively, compared to 7.3 dB, 6.8 dB and 5.9 dB for N=256, N=512 and N=1024 when using ACE-AGP (the convergence of ACE-AGP is slower). It is worth noting that these results are achieved with very low complexity and with only one iteration, by choosing different numbers of generations (Gn = 10, 20, 30, 40), against 2000 iterations for ACE-AGP. The complexity of the ACE-AGP algorithm, in terms of complex multiplications and additions per OFDM symbol, grows with the number of iterations, which is usually high. Besides, on each iteration the PAPR needs to be evaluated to determine whether the target goal has been reached. These operations are also required in many other methods. Besides, a DFT/IDFT pair is needed per iteration. A complexity summary and comparison is detailed in Table 1.

Figure 1. Comparison of PAPR reduction between ACE-AGP and PSO-ACE-AGP using QPSK: (a) N=256, (b) N=512, (c) N=1024.


Table 1: Computation complexity per symbol for the Optimal, ACE-AGP and PSO ACE-AGP (Gn = 40) methods, for N = 256, 512 and 1024.

6. Conclusion
In this paper, a proposed algorithm for reducing the envelope fluctuations in OFDM signals has been described. Simulation results show that this algorithm is able to obtain signals with low envelope fluctuations at a very low complexity and is valid for any number of sub-carriers; moreover, it achieves better performance than previous work. It is worth noting that the obtained improvement of the envelope behaviour is larger for large numbers of sub-carriers.

References
[1] 3rd Generation Partnership Project (3GPP), "UTRA-UTRAN Long Term Evolution (LTE) and 3GPP System Architecture Evolution (SAE)," http://www.3gpp.org/Highlights/LTE/LTE.htm, [Accessed: Sept. 1, 2010].
[2] M. Breiling, S. H. Muller-Weinfurtner, and J. B. Huber, "SLM peak power reduction without explicit side information," IEEE Communications Letters, V (6), pp. 239-241, 2001.
[3] J. C. Chen, "Partial Transmit Sequences for peak-to-average power ratio reduction of OFDM signals with the cross-entropy method," IEEE Signal Processing Letters, XVI (6), pp. 545-548, 2009.
[4] J. Tellado, "Peak-to-average power reduction," Ph.D. dissertation, Stanford University, 1999.
[5] M. Ohta, Y. Ueda, and K. Tamashita, "PAPR reduction of OFDM signal by neural networks without side information and its FPGA implementation," Transactions of the Institute of Electrical Engineers of Japan, pp. 1296-1303, 2006.
[6] Y. Jabrane, V. P. G. Jimenez, A. G. Armada, B. A. E. Said and A. A. Ouahman, "Reduction of Power Envelope Fluctuations in OFDM Signals by using Neural Networks," IEEE Communications Letters, XIV (7), pp. 599-601, 2010.
[7] I. Sohn, "RBF neural network based SLM peak-to-average power ratio reduction in OFDM systems," ETRI Journal, XXIX (3), pp. 402-404, 2007.
[8] S. H. Han and J. H. Lee, "An overview of peak-to-average power ratio reduction techniques for multicarrier transmission," IEEE Wireless Communications, XII (2), pp. 56-65, 2005.
[9] B. S. Krongold and D. L. Jones, "PAR reduction in OFDM via Active Constellation Extension," IEEE Transactions on Broadcasting, XLIX (3), pp. 258-268, 2003.
[10] Y. Jabrane, V. P. G. Jimenez, A. G. Armada, B. E. Said, and A. A. Ouahman, "Evaluation of the effects of Pilots on the envelope fluctuations reduction based on Neural Fuzzy Systems," in Proceedings of the IEEE International Workshop on Signal Processing Advances for Wireless Communications (SPAWC), 2010.
[11] S. Xu, Q. Zhang and W. Lin, "PSO-Based OFDM Adaptive Power and Bit Allocation for Multiuser Cognitive Radio System," in Proceedings of Wireless Communications, Networking and Mobile Computing (WiCOM), pp. 1-4, 2009.
[12] A. Ratnaweera, S. K. Halgamuge, and H. C. Watson, "Self-organizing hierarchical particle swarm optimizer with time-varying acceleration coefficients," IEEE Transactions on Evolutionary Computation, VIII (3), pp. 240-255, 2004.
[13] M. Clerc and J. Kennedy, "The particle swarm - explosion, stability, and convergence in a multidimensional complex space," IEEE Transactions on Evolutionary Computation, VI (1), pp. 58-73, 2002.
[14] Y. Shi and R. Eberhart, "A modified particle swarm optimizer," in Proceedings of the IEEE International Conference on Evolutionary Computation (ICEC), pp. 69-73, 1998.
[15] J.-H. Wen, S. H. Lee, Y. F. Huang and H. L. Hung, "A Suboptimal PTS Algorithm Based on Particle Swarm Optimization Technique for PAPR Reduction in OFDM Systems," EURASIP Journal on Wireless Communications and Networking, pp. 1-8, 2008.
[16] J. Robinson and Y. Rahmat-Samii, "Particle swarm optimization in electromagnetics," IEEE Transactions on Antennas and Propagation, LII (2), pp. 397-407, 2004.


Author’s Profile

Ahmed Ouqour received the state engineering degree in telecommunications in 1998 from the Air Force Academy of Marrakech, Morocco. He also obtained a state engineering degree in computers and networks in 2005 from the National School of Mineral Industry, Rabat, Morocco. He is now with the Team of Telecommunications and Computer Networks of Cadi Ayyad University of Marrakech, Morocco, preparing his PhD.

Younes Jabrane received his PhD in telecommunications and informatics from Cadi Ayyad University of Marrakech, Morocco. He is now Assistant Professor at the National School of Applied Sciences of Marrakech. He carried out several research stays in the Department of Signal Theory and Communications of Carlos III University of Madrid, Spain. His research interests are CDMA and OFDM.

Brahim Ait Es Said is a Professor at the Faculty of Sciences Semlalia of Marrakech. He supervises several PhD students and has carried out several research stays at the University of Valenciennes, France. His research interests are channel equalization, image processing and OFDM.

Abdellah Ait Ouahman is a Professor and Director of the National School of Applied Sciences of Marrakech. He supervises several PhD students, coordinates several projects, and has served several times as local chair of IEEE conferences organized in Marrakech. His research interests are logistics, telecommunications and informatics networks.


Some Results of T-Fuzzy Subsemiautomata over Finite Groups

M. Basheer Ahamed1 and J.Michael Anna Spinneli2

Department of Mathematics, Karunya University, Coimbatore-641114, Tamilnadu, INDIA. [email protected]; [email protected]

Abstract: In 1965, Zadeh introduced the concept of fuzzy sets. Wee introduced the concept of fuzzy automata in 1967. Malik et al. introduced the concepts of fuzzy state machines and fuzzy transformation semigroups based on Wee's concept of fuzzy automata in 1994. Group semiautomata have been extensively studied by Fong and Clay. Das fuzzified this concept and introduced fuzzy semiautomata over a finite group. Sung-Jin Cho, Jae-Gyeom Kim and Seok-Tae Kim introduced the notions of T-fuzzy semiautomata, T-fuzzy kernels and T-fuzzy subsemiautomata over a finite group. In this paper, we give some further properties of T-fuzzy subsemiautomata over finite groups.

Keywords: T-fuzzy normal subgroup, T-fuzzy kernel, T-fuzzy subsemiautomata.

1. Introduction
The concept of fuzzy automata was introduced by Wee in 1967 [8]. Using this concept, Malik et al. [5] introduced the concepts of fuzzy state machines and fuzzy transformation semigroups. Group semiautomata have been extensively studied by Fong et al. [3]. This concept was fuzzified by Das [2], who introduced fuzzy semiautomata over finite groups. The notion of a T-fuzzy semiautomaton over a finite group was introduced by Cho et al. [1]. In this paper we obtain some further results on T-fuzzy subsemiautomata and T-fuzzy kernels.

2. Preliminaries
In this section we summarize the preliminary definitions and results that are required for developing the main results.

2.1 Definition [1]
A binary operation T on [0,1] is called a t-norm if
(1) T(a, 1) = a,
(2) T(a, b) ≤ T(a, c) whenever b ≤ c,
(3) T(a, b) = T(b, a),
(4) T(a, T(b, c)) = T(T(a, b), c), for all a, b, c ∈ [0,1].
The maximum and minimum will be written as ∨ and ∧, respectively. Define T0 on [0,1] by T0(a, 1) = T0(1, a) = a and T0(a, b) = 0 if a ≠ 1 and b ≠ 1, for all a, b ∈ [0,1]. Here T will always mean a t-norm on [0,1]. A t-norm T on [0,1] is said to be ∨-distributive if T(a, b ∨ c) = T(a, b) ∨ T(a, c) for all a, b, c ∈ [0,1]. Throughout this paper, T shall mean a ∨-distributive t-norm on [0,1] unless otherwise specified. By an abuse of notation we will denote T(a1, T(a2, ..., T(a(n-1), an)...)) by T(a1, ..., an), where a1, a2, ..., an ∈ [0,1]. The legitimacy of this abuse is ensured by the associativity of T (Definition 2.1(4)).
Note: For further discussions we consider (G, +) as a finite group.
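As a concrete illustration of Definition 2.1, the following minimal Python sketch checks the four t-norm axioms for the minimum and for T0 on a small grid of membership values; the grid and the helper names are chosen only for illustration.

```python
from itertools import product

def t0(a, b):
    """Drastic t-norm T0 of Definition 2.1."""
    if a == 1.0:
        return b
    if b == 1.0:
        return a
    return 0.0

def is_t_norm(T, grid):
    """Check axioms (1)-(4) of Definition 2.1 on a finite grid of values."""
    ok = all(T(a, 1.0) == a for a in grid)                                # (1)
    ok &= all(T(a, b) <= T(a, c)
              for a, b, c in product(grid, repeat=3) if b <= c)           # (2)
    ok &= all(T(a, b) == T(b, a) for a, b in product(grid, repeat=2))     # (3)
    ok &= all(T(a, T(b, c)) == T(T(a, b), c)
              for a, b, c in product(grid, repeat=3))                     # (4)
    return ok

grid = [0.0, 0.25, 0.5, 0.75, 1.0]
print(is_t_norm(min, grid), is_t_norm(t0, grid))   # expected: True True
```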

2.2 Definition [1]
A fuzzy subset λ of G is called a T-fuzzy subgroup of G if
(i) λ(x + y) ≥ T(λ(x), λ(y)),
(ii) λ(x) = λ(−x), for all x, y ∈ G.

2.3 Definition [1]
A T-fuzzy subgroup λ of G is called a T-fuzzy normal subgroup of G if λ(x + y) = λ(y + x) for all x, y ∈ G.

2.4 Definition [1]

A triple M = (Q, X, τ), where (Q, +) is a finite group, X is a finite non-empty set and τ is a fuzzy subset of Q × X × Q (that is, τ is a function from Q × X × Q to [0,1]), is called a fuzzy semiautomaton if Σ_{q∈Q} τ(p, a, q) ≤ 1 for all p ∈ Q and a ∈ X. If Σ_{q∈Q} τ(p, a, q) = 1 for all p ∈ Q and a ∈ X, then M is said to be complete.

2.5 Definition [1]

Let M = (Q, X, τ) be a fuzzy semiautomaton. Define τ*: Q × X* × Q → [0,1] by
τ*(p, Λ, q) = 1 if p = q, and τ*(p, Λ, q) = 0 if p ≠ q,
and
τ*(p, a1 a2 ... an, q) = ∨{ T(τ(p, a1, q1), τ(q1, a2, q2), ..., τ(q(n−1), an, q)) : q1, ..., q(n−1) ∈ Q },
where p, q ∈ Q and a1, ..., an ∈ X. When T is applied on M as above, M is called a T-fuzzy semiautomaton.


Note: Hereafter, a fuzzy semiautomaton will always be written as a T-fuzzy semiautomaton, because a fuzzy semiautomaton always induces a T-fuzzy semiautomaton as in Definition 2.5.

2.6 Definition [1]
A fuzzy subset µ of Q is called a T-fuzzy kernel of a T-fuzzy semiautomaton M = (Q, X, τ) if
(i) µ is a T-fuzzy normal subgroup of Q,
(ii) µ(p − r) ≥ T(τ(q + k, x, p), τ(q, x, r), µ(k)) for all p, q, r, k ∈ Q, x ∈ X.

2.7 Definition [1]
A fuzzy subset µ of Q is called a T-fuzzy subsemiautomaton of a T-fuzzy semiautomaton M = (Q, X, τ) if
(i) µ is a T-fuzzy subgroup of Q,
(ii) µ(p) ≥ T(τ(q, x, p), µ(q)) for all p, q ∈ Q, x ∈ X.

2.8 Definition [7]

Let λ and µ be T-fuzzy subsets of G. Then the sum of λ and µ is defined by
(λ + µ)(x) = ∨{ T(λ(y), µ(z)) : y, z ∈ G such that x = y + z }
for all x ∈ G.

2.9 Remark

Let λ be a fuzzy subset of G. Whether λ is a fuzzy subgroup of G depends on the membership values that are chosen, as the following examples show.

2.10 Example [4]
Let G = {e, a, b, c} be the Klein four-group. Define the fuzzy subset A of G by A(e) = 1, A(a) = 1, A(b) = 3/4, A(c) = 3/4. Clearly A is a fuzzy subgroup of G.
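To make Definition 2.8 concrete, the following minimal sketch computes the sum of two fuzzy subsets of the Klein four-group under the minimum t-norm; the second fuzzy subset's membership values are chosen only for illustration.

```python
from itertools import product

# Klein four-group (G, +): e is the identity and every element is its own inverse
G = ["e", "a", "b", "c"]
add = {(x, "e"): x for x in G} | {("e", x): x for x in G}
add |= {(x, x): "e" for x in G}
add |= {("a", "b"): "c", ("b", "a"): "c",
        ("a", "c"): "b", ("c", "a"): "b",
        ("b", "c"): "a", ("c", "b"): "a"}

lam = {"e": 1.0, "a": 1.0, "b": 0.75, "c": 0.75}   # the subset A of Example 2.10
mu  = {"e": 0.9, "a": 0.5, "b": 0.5, "c": 0.4}     # illustrative values

def fuzzy_sum(lam, mu, T=min):
    """(lam + mu)(x) = sup{ T(lam(y), mu(z)) : y + z = x }, as in Definition 2.8."""
    return {x: max(T(lam[y], mu[z])
                   for y, z in product(G, G) if add[(y, z)] == x)
            for x in G}

print(fuzzy_sum(lam, mu))
```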

2.11 Example [6]
Let G = {e, a, b, c} be the Klein four-group. Define the fuzzy set λ of G by λ(e) = 0.6, λ(a) = 0.7, λ(b) = 0.4, λ(c) = 0.4. Clearly λ is not a fuzzy subgroup of G.

3. Main results
This section consists of some further results on T-fuzzy subsemiautomata and T-fuzzy kernels.

3.1 Proposition
Let µ be a T-fuzzy kernel of a T-fuzzy semiautomaton M = (Q, X, τ) over a finite group Q. Then µ is a T-fuzzy subsemiautomaton of M if and only if µ(p) ≥ T(τ(e, x, p), µ(e)) for all p ∈ Q, x ∈ X.

Proof: Suppose that the given condition is satisfied. For all p, q, r ∈ Q and x ∈ X we have
µ(p) = µ(p − r + r)
 ≥ T(µ(p − r), µ(r))
 ≥ T(T(τ(e + q, x, p), τ(e, x, r), µ(q)), µ(r))   (by Definition 2.6)
 ≥ T(τ(e + q, x, p), τ(e, x, r), µ(q), µ(e))   (by the given condition)
 ≥ T(τ(e + q, x, p), µ(q))   (since µ(e) ≥ µ(p) and τ(e, x, r) ≥ τ(e + q, x, p))
 = T(τ(q, x, p), µ(q)).
Then µ is a T-fuzzy subsemiautomaton of M = (Q, X, τ), and the converse is immediate.

3.2 Example
Consider a finite T-fuzzy semiautomaton M = (Q, X, τ) where Q = {p, q, r, k} and X = {a, b}.

Figure 1. Finite T-fuzzy semiautomaton

Clearly, τ is a T-fuzzy kernel. Let µ be a fuzzy subset of Q defined by µ(p) = 0.4, µ(q) = 0.2, µ(r) = 0.1, µ(k) = 0.1. The t-norm is defined by T(a, b) = (a + b − 1) ∨ 0, and (Q, +) is the group defined by
p + p = p, p + q = q, p + r = r, p + k = k,
q + p = q, q + q = p, q + r = k, q + k = r,



r + p = r, r + q = k, r + r = q, r + k = p,
k + p = k, k + q = r, k + r = p, k + k = q.
p is the identity element, and q⁻¹ = q, r⁻¹ = k, k⁻¹ = r.
Then, µ is a T-fuzzy subsemiautomaton.

3.3 Proposition
Let µ be a T-fuzzy kernel and ν be a T-fuzzy subsemiautomaton of M = (Q, X, τ). Then µ + ν is a T-fuzzy subsemiautomaton of M.

Proof:
(µ + ν)(p) = (µ + ν)(p − r + r)
 ≥ T(µ(p − r), ν(r))
 ≥ T(τ(a + b, x, p), τ(a, x, r), µ(b), τ(a, x, r), ν(a))   (by Definition 2.6)
 ≥ T(τ(a + b, x, p), µ(b), ν(a)),   since τ(a + b, x, p) ≤ τ(a, x, r).
Then, for all p, q ∈ Q and x ∈ X,
(µ + ν)(p) ≥ ∨{ T(τ(a + b, x, p), µ(b), ν(a)) : a + b = q, a, b ∈ Q }
 ≥ T(τ(a + b, x, p), ∨{ T(µ(b), ν(a)) : a + b = q, a, b ∈ Q })
 = T(τ(a + b, x, p), (µ + ν)(q))   (by Definition 2.8)
 = T(τ(q, x, p), (µ + ν)(q)).
Hence µ + ν is a T-fuzzy subsemiautomaton.

3.4 Example

Consider a finite T-fuzzy semiautomaton M = (Q, X, τ) where Q = {p, q, r, k} and X = {a}.

Figure 2. Finite T-fuzzy semiautomaton

Let µ be a fuzzy subset of Q defined by µ(p) = 0.4, µ(q) = 0.1, µ(r) = 0.25, µ(k) = 0.25. Clearly, µ is a T-fuzzy kernel. Let ν be a fuzzy subset of Q defined by ν(p) = 0.6, ν(q) = 0.2, ν(r) = 0.1, and ν(k) = 0.1. Clearly, ν is a T-fuzzy subsemiautomaton. The t-norm is defined by T(a, b) = ab, and (Q, +) is the group defined by

p + p = p, p + q = q, p + r = r, p + k = k,
q + p = q, q + q = p, q + r = k, q + k = r,
r + p = r, r + q = k, r + r = q, r + k = p,
k + p = k, k + q = r, k + r = p, k + k = q.
p is the identity element. We have q⁻¹ = q, r⁻¹ = k, k⁻¹ = r.
Then µ + ν is a T-fuzzy subsemiautomaton of M = (Q, X, τ).

3.5 Theorem
Let µ and ν be T-fuzzy kernels of M = (Q, X, τ). Then µ + ν is a T-fuzzy kernel of M = (Q, X, τ).
Proof: Since µ and ν are T-fuzzy normal subgroups of Q, µ + ν is also a T-fuzzy normal subgroup. Moreover,
(µ + ν)(p − r) = (µ + ν)(p − q + q − r)
 ≥ T(µ(p − q), ν(q − r))
 ≥ T( T(τ(a + b + c, x, p), τ(a + b, x, q), µ(c)), T(τ(a + b, x, q), τ(a, x, r), ν(b)) )   (by Definition 2.6)
 = T(τ(a + b + c, x, p), τ(a + b, x, q), µ(c), τ(a, x, r), ν(b))
 = T(τ(a + b + c, x, p), µ(c), τ(a, x, r), ν(b)),   since τ(a + b + c, x, p) ≤ τ(a + b, x, q).
Then, for all p, q, r, k ∈ Q and x ∈ X,
(µ + ν)(p − r) ≥ ∨{ T(τ(q + b + c, x, p), τ(q, x, r), µ(c), ν(b)) : b, c ∈ Q, b + c = k }
 = T(τ(q + b + c, x, p), τ(q, x, r), ∨{ T(µ(c), ν(b)) : b, c ∈ Q, b + c = k })
 = T(τ(q + k, x, p), τ(q, x, r), (µ + ν)(k)).
Hence µ + ν is a T-fuzzy kernel of M = (Q, X, τ).

3.6 Example



Consider a finite T-fuzzy semiautomaton M = (Q, X, τ), where Q = {p, q, r, s, t, u} and X = {a}.

Figure 3. Finite T-fuzzy semiautomaton

Here (Q, +) is a group defined by
p + p = p, p + q = q, p + r = r, p + s = s, p + t = t, p + u = u,
q + p = q, q + q = s, q + r = u, q + s = p, q + t = r, q + u = t,
r + p = r, r + q = u, r + r = q, r + s = t, r + t = p, r + u = s,
s + p = s, s + q = p, s + r = t, s + s = q, s + t = u, s + u = r,
t + p = t, t + q = r, t + r = p, t + s = u, t + t = s, t + u = q,
u + p = u, u + q = t, u + r = s, u + s = r, u + t = q, u + u = p.
p is the identity element. We have p⁻¹ = p, q⁻¹ = s, r⁻¹ = t, s⁻¹ = q, t⁻¹ = r, u⁻¹ = u. The t-norm is defined by T(a, b) = ab.

Let µ be a fuzzy subset of Q defined by µ(p) = 0.1, µ(q) = 0.2, µ(r) = 0.1, µ(s) = 0.2, µ(t) = 0.1, µ(u) = 0.3. Clearly, µ is a T-fuzzy kernel. Let ν be a fuzzy subset of Q defined by ν(p) = 0.1, ν(q) = 0.2, ν(r) = 0.2, ν(s) = 0.2, ν(t) = 0.2, ν(u) = 0.1. Clearly, ν is a T-fuzzy kernel. Then µ + ν is a T-fuzzy kernel of M = (Q, X, τ).

References
[1] S. J. Cho, J. G. Kim, S. T. Kim, T-fuzzy semiautomata over finite groups, Fuzzy Sets and Systems 108 (1999) 341-351.

[2] P. Das, On some properties of fuzzy semiautomaton over a finite group, Information Sciences 101 (1997) 71-84.

[3] Y. Fong, J. R. Clay, Computer programs for investigating syntactic near-rings of finite group semiautomata, Academia Sinica 16 (4) (1988) 295-304.

[4] D. S. Malik, J. N. Mordeson, and P. S. Nair, Fuzzy normal subgroups in fuzzy subgroups, J. Korean Math. Soc. 29 (1992), No. 1, 1-8.

[5] D. S. Malik, J. N. Mordeson, and M. K. Sen, On subsystems of a fuzzy finite state machine, Fuzzy Sets and Systems 68 (1994) 83-92.

[6] S. K. Bhakat, (∈, ∈∨q)-fuzzy normal, quasinormal and maximal subgroups, Fuzzy Sets and Systems 112 (2000) 299-312.

[7] S. Sessa, On fuzzy subgroups and fuzzy ideals under triangular norms, Fuzzy Sets and Systems 13 (1984) 95-100.

[8] W. G. Wee, On generalizations of adaptive algorithms and application of the fuzzy sets concept to pattern classification, Ph.D. Thesis, Purdue University, 1967.

[9] L. A. Zadeh, Fuzzy sets, Information and Control 8 (1965) 338-353.

Author’s Profile

Basheer Ahamed M received the Ph.D. degree in Mathematics from Bharathidasan University, Tiruchirappalli, Tamilnadu, in 2005. He is working as Associate Professor of Mathematics at Karunya University, Coimbatore, India. His research areas are Fuzzy Mathematics and Discrete Mathematics.

J. Michael Anna Spinneli received her M.Sc. and M.Phil. degrees in Mathematics from Manonmaniam Sundaranar University, Tirunelveli, Tamilnadu. She is now working as an Assistant Professor of Mathematics at Karunya University, Coimbatore, India. She is doing research on Fuzzy Automata.



Semi-deflection Routing: A Non-minimal Fully-adaptive Routing for Virtual Cut-through Switching Networks

Yuri Nishikawa1, Michihiro Koibuchi2, Hiroki Matsutani3 and Hideharu Amano1

1Keio University,

3-14-1 Hiyoshi Kouhoku-ku Yokohama, Kanagawa 223-8522, Japan {nisikawa,hunga}@am.ics.keio.ac.jp

2National Institute of Informatics,

2-1-2, Hitotsubashi Chiyoda-ku Tokyo 101-8430, Japan

3 The University of Tokyo 7-3-1 Hongo Bunkyo-ku Tokyo 113-8656, Japan

Abstract: In this paper, we propose a routing methodology called "Semi-deflection" routing, a non-minimal fully-adaptive deadlock-free routing mechanism for system area networks (SANs), which usually employ virtual cut-through switching. We show that by adding a simple turn-model-based restriction to classical deflection (or hot-potato) routing, namely allowing non-blocking transfer between specific pairs of routers, the Semi-deflection routing mechanism guarantees deadlock- and livelock-free packet transfer without the use of virtual channels in a two-dimensional Mesh topology. As a result of throughput evaluation using batch and permutation traffic patterns, Semi-deflection routing improved throughput by a maximum of 2.38 times compared with that of the North-last turn model, which is a typical adaptive routing. The routing also reduces the average hop count compared to deflection routing.

Keywords: SAN, virtual cut-through, non-minimal fully adaptive routing, deflection routing, turn-model

1. Introduction Network-based parallel processing using system area networks (SANs) has been researched as potential cost-effective parallel-computing environments [1][2][3], as well as traditional massively parallel computers. SANs, which consist of switches connected with point-to-point links, usually provide low-latency high-bandwidth communications. Modern SANs and interconnection networks of massively parallel computers usually use virtual cut-through [4] as their switching technique, and they achieve reliable communications at the hardware level with deadlock-free routing. The following two strategies can be taken when a deadlock free routing algorithm is designed. Deterministic routing takes a single path between hosts, and it guarantees in-order packet delivery between the same pair of hosts [5]. On the other hand, adaptive routing [6][7][8][9] dynamically selects the route of a packet in order to make the best use of bandwidth in interconnection networks. In adaptive routing, when a packet encounters a faulty or congested path, another bypassing path can be selected. Since this allows for a better balance of network traffic, adaptive routing improves throughput and latency. In spite of the adaptive routing’s advantages, most current SANs do not always employ it[1][2]. This is because

it does not guarantee in-order packet delivery, which is required for some message-passing libraries, and the logic to dynamically select a single channel from among a set of alternatives might substantially increase the switch’s complexity. However, simple sorting mechanisms for out-of-order packets in network interfaces have been researched [10][11], and real parallel machines, such as the Cray T3E [12], the Reliable Router [13], or BlueGene/L[14] have shown the feasibility of adaptive routing. A simple method to support adaptivity in InfiniBand switches has also been proposed [15]. We thus consider that these switches and routers will employ adaptive routing not only in interconnection networks of massively parallel computers but also in SANs. There are two basic approaches to avoid deadlocks in adaptive routing. The simpler strategy removes cyclic channel dependencies in the channel dependency graph (CDG) [6][7][8]. The more complex ones deal with cyclic channel dependencies by introducing escape paths [9]. It is difficult to apply the latter strategy to interconnection networks with no virtual channels. In general, the most flexible routing algorithm is non-minimal fully-adaptive routing, because it allows each router to maximize the number of alternative paths. Non-minimal fully-adaptive routing supports dynamic selection of various paths among multiple routers. The challenge of such algorithm is to guarantee deadlock-free. Fully-adaptive routing algorithms usually introduce virtual channels to guarantee deadlock freedom of packet transfers. For example, a minimal fully-adaptive routing called Duato’s protocol uses two virtual channels for Mesh topologies, and three for Torus topologies. However, virtual channels require addition of channel buffers for each router ports, and they are sometimes used for providing different quality of services (QoS) classes. In addition, they are not always supported in SANs. In this paper, we propose a non-minimal fully-adaptive routing called Semi-deflection routing which does not require virtual channels in virtual cut-through networks. Semi-deflection routing allows non-blocking transfer among certain routers in order to guarantee deadlock and livelock freedom. Since non-blocking transfer means that a packet that arrived in a router must secure an output port in


minimum transfer time, it cannot necessarily choose the shortest path to the destination. This is a similar approach to deflection (hot-potato) routing [16][17][18] and the chaos router [19], in which all incoming packets are moved to output ports at every cycle in a non-blocking manner, which may introduce non-minimal paths in order to avoid packet blocking. These studies have already shown that the proposed routing is deadlock-free and livelock-free. Unlike these existing works, in Semi-deflection routing some packets wait for the appropriate ports to their destinations when other packets occupy them. The rest of this paper is organized as follows. In Section 2 we propose Semi-deflection routing. Section 3 gives evaluation results, and Section 4 discusses related work. Finally, Section 5 concludes the paper.

Figure 1. Prohibited turns of the North-last turn model
Figure 2. Example of non-minimal routing

2. Semi-deflection Routing This section proposes the mechanism of Semi-deflection routing. This routing avoids deadlocks using the approach of Turn-model, and it requires to update output selection functions and arbitration mechanism of routers as described below.

2.1 Turn model
The turn model is a family of adaptive routing which provides deadlock freedom by removing all cyclic channel dependencies in a network topology [6]. The North-last model and the West-first model are representative examples. Figure 1 shows the North-last turn model in a two-dimensional Mesh topology. A router has four input/output ports, one in each direction (north, south, east and west), and there are eight possible turns. A turn model prohibits a set of certain turns (called "prohibited turns" in this paper) in order to remove cyclic channel dependencies; in the case of the North-last model they are the southwest turn and the southeast turn. In other words, packets must arrive at a destination node by finally taking zero or more hops of northward movement. The turn model allows non-minimal paths to destination nodes, but certain paths cannot be taken due to the prohibited turns.

2.2 Deadlock removal
This section describes the mechanism of Semi-deflection routing, which allows prohibited turns of the turn model under certain conditions while guaranteeing deadlock freedom. We focus on the fact that deadlocks do not occur as long as packets are moved in a non-blocking manner on every prohibited turn in virtual cut-through switching, because each packet can independently detour around blocked packets. First, we define "non-blocking packet transfer" as follows.
Definition 1: (Non-blocking packet transfer) A packet transfer is called "non-blocking" when a packet at a certain input port is allocated to an output port via the router's crossbar switch, given priority by the arbiter, and transferred to the selected output port without being blocked by other packets.
Even when a prohibited turn is taken, deadlock does not occur as long as it is guaranteed that packets are transferred to adjacent routers in a non-blocking manner. Thus, fully deadlock-free routing is obtained by always transferring packets that intend to take prohibited turns in a non-blocking manner. Figure 2 shows the pairs of input and output ports between which non-blocking transfer needs to be guaranteed in the case of the North-last turn model. Non-blocking transfer between these ports can be attained because each packet can independently detour when virtual cut-through switching is applied. More precisely, we place a limitation on when packets may wait before taking prohibited turns, as follows. Assume that an incoming packet is expected to take a path along a prohibited turn, but its output channel is locked by another packet. In this case, the packet cannot wait only for that output port to become vacant. In other words, the packet must be transferred to another output channel if one is vacant, even when that direction is not towards the destination. This is a similar idea to deflection routing [20], which requires all packets to be moving around the network constantly at every router at every cycle. For example, suppose that the gray-colored node in Figure 2 receives a packet from the south input port, and the applied turn model is the North-last model. Since selection of the west, east or south output port makes a prohibited turn, the packet must choose the north output port if the rest of the ports are locked (occupied) by others. Only in the case when the north output port is locked does a packet that came from a south input port wait for a certain output port to become vacant. This is because the non-prohibited output ports behave as escape paths that prevent cyclic channel dependency. The idea of escape paths was originally suggested in adaptive routing algorithms such as Duato's protocol [9], where they are implemented by using virtual channels. Black-colored input ports in Figure 2 indicate that any available output port selection becomes a packet transfer along a prohibited turn. Thus, packets from these input ports must be transferred somewhere in a non-blocking manner. Next, we define the term "non-waiting port" as follows.


Definition 2: (Non-waiting port) An input port and its pair connected by a link in the neighboring node are defined as "non-waiting ports" when either of the following is satisfied: (a) any available output port selection becomes a packet transfer along a prohibited turn, or (b) the port is paired with a port defined in (a). Theorem 1: There is at most one non-waiting port per router when the network topology is a two-dimensional Mesh (N × N) and the applied turn model is either the North-last or the West-first model. Proof: Packets that enter a router from a south input port can only take a prohibited turn under the North-last model. Consequently, only south input ports can possibly be non-waiting ports according to Definition 2 (a). On the other hand, for Definition 2 (b), since there is only one input port that can be paired with (a), Theorem 1 is obviously satisfied. For West-first routing, only packets that go through routers at coordinate x = N−1 can take a prohibited turn. Here, there is an output port which does not make a prohibited turn either in the x or the y direction in the case y ≠ 0, y ≠ N−1. The condition of Definition 2 (a) is satisfied either when x = N−1, y = 0 or when x = N−1, y = N−1. In the former case, the condition is satisfied for the north input port, and in the latter case for the south input port. On the other hand, for Definition 2 (b), since there is only one input port that can be paired with (a), satisfaction of Theorem 1 is obvious. ■ Definition 3: (Loopback transfer) When a packet is transferred directly from a router's port back to the adjacent router's port from which the packet was transferred, it is called a "loopback transfer". Theorem 2: When there is only one non-waiting port, a packet does not stay at the non-waiting port after a certain period of time when the following conditions are satisfied: (a) loopback transfer is permitted, and (b) the highest arbitration priority is given to the non-waiting port. Proof: When the buffer of the forwarding input port is vacant, a packet at a non-waiting port is transferred to one of the vacant ports. According to Definition 2, a non-waiting port always exists on the router that a packet is transferred back to in a loopback manner. If every input port of the forwarding router is unavailable, a packet is transferred in a loopback manner. Even when a packet exists in the forwarding non-waiting input port, packets are simply exchanged between the paired non-waiting ports when the highest arbitration priority is given to these ports. Since these packet transfers and exchanges are done within a certain period, Theorem 2 is satisfied. ■ Non-waiting ports in Figure 2 are marked with black color or diagonal lines. Livelock removal for non-waiting ports is described in the following section.

2.3 Router structure
This section describes the modifications of the router's output selection function and arbiter that are necessary to support the Semi-deflection routing mechanism.

2.3.1 Output selection function
Adaptive routing is characterized by its adaptive routing algorithm and its output selection function (OSF). The adaptive algorithm decides the set of available output ports, and the OSF prioritizes the output ports for assignment to packets.
Definition 4: (Output Selection Function (OSF) of Semi-deflection routing) The OSF of Semi-deflection routing prioritizes each port in the following order:
1) Output towards the destination
2) Output that goes away from the destination, but does not loop back
3) Output that loops back
It is possible to prevent livelock between non-waiting ports by giving the loopback selection the lowest priority.

2.3.2 Arbitration
An arbiter is in charge of allotting input packets to output ports. The priority of arbitration is described as follows; if none of the rules match, the order is based on the timestamp at which packets entered the router.
Definition 5: (Arbitration order of Semi-deflection routing)
1) Packets that were injected from a non-waiting input port
2) Packets whose highest-prioritized output port given by the OSF makes a prohibited turn
3) Other packets
It is possible to prevent livelock by arbitrating (1) with the highest priority. Also, a non-blocking loopback transfer is realized when the other output ports are busy. By arbitrating (2) with high priority, non-blocking transfer to the prohibited direction is satisfied when the output port is available. If it is blocked, other vacant ports are allotted in priority over other input packets. Finally, the arbiter selects the other packets, which are not injected from a non-waiting port and do not wish to take a prohibited turn. If the only available output port would keep a packet away from its destination node, the packet can wait at the router input buffer until another port becomes vacant. This suppresses an unprofitable increase of hop counts.
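The following minimal Python sketch shows one way the OSF priorities (Definition 4) and the arbitration order (Definition 5) could be encoded. The data structures, helper names and the concrete prohibited-turn set are illustrative assumptions, not the authors' router implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Prohibited (incoming port, outgoing port) pairs of the applied turn model.
# The concrete set depends on the chosen turn model and on the port-naming
# convention, so this particular value is only an illustrative assumption.
PROHIBITED_TURNS: Set[Tuple[str, str]] = {("S", "W"), ("S", "E")}

def is_prohibited(in_port: str, out_port: str) -> bool:
    """True when forwarding from in_port to out_port takes a prohibited turn."""
    return (in_port, out_port) in PROHIBITED_TURNS

@dataclass
class Packet:
    dest: Tuple[int, int]
    in_port: str                  # input port the packet arrived on
    timestamp: int                # cycle at which the packet entered the router
    from_non_waiting: bool = False

def osf_priority(towards_dest: bool, loops_back: bool) -> int:
    """Definition 4 (OSF): smaller value means higher priority."""
    if towards_dest:
        return 0                  # 1) output towards the destination
    if not loops_back:
        return 1                  # 2) away from the destination, but no loopback
    return 2                      # 3) loopback output (lowest priority)

def arbitration_key(pkt: Packet, preferred_is_prohibited: bool) -> Tuple[int, int]:
    """Definition 5: sort key for packets competing for the crossbar."""
    if pkt.from_non_waiting:
        rank = 0                  # 1) packets from a non-waiting input port
    elif preferred_is_prohibited:
        rank = 1                  # 2) preferred output would take a prohibited turn
    else:
        rank = 2                  # 3) all other packets
    return (rank, pkt.timestamp)  # remaining ties broken by arrival time

def select_output(free_ports: List[Tuple[str, bool, bool]]) -> Optional[str]:
    """Pick the vacant output port with the best OSF priority.

    free_ports holds (direction, towards_dest, loops_back) for each vacant port;
    returning None means the packet may wait for a better port to become vacant.
    """
    if not free_ports:
        return None
    return min(free_ports, key=lambda p: osf_priority(p[1], p[2]))[0]
```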

2.4 Semi-deflection routing mechanism
Based on the above, a routing that adopts the following rules is defined as the Semi-deflection routing mechanism.
Definition 6: (Semi-deflection routing mechanism)
• Prohibited turns and non-waiting ports are determined based on a turn model in the given network.
• Each router transfers packets based on the priorities given by the OSF and the arbiter of the previous subsections.
Because a turn model can be applied to an arbitrary topology [21], the application of Semi-deflection routing is not limited to a two-dimensional Mesh topology.


Theorem 3: Semi-deflection routing is deadlock-free.
Proof: (i) If a packet does not take a prohibited turn, it does not deadlock because no cyclic dependency occurs. (ii) In the case of taking a prohibited turn, a packet can be transferred if the buffers among the forwarding input ports are vacant. (iii) Even when packets exist in all buffers of the forwarding input ports, no deadlock occurs unless a packet transfer takes a prohibited turn. Thus, the buffers of the forwarding input ports will become vacant according to (i), and packet transfer will be possible. According to the above, the only possible condition for causing a deadlock is when all packet transfers to forwarding input ports take prohibited turns and packets also exist in their buffers. However, input ports fulfilling these conditions are non-waiting ports according to Definition 2, and a packet does not stay there after a certain period of time according to Theorem 1 and Definition 5. Consequently, Semi-deflection routing is deadlock-free. ■
Theorem 4: Semi-deflection routing is livelock-free.
Proof: According to Definition 4 and Definition 5, an output port toward the destination is given higher priority. Also, Semi-deflection routing is a fully-adaptive non-minimal routing which assumes virtual cut-through switching. These two conditions satisfy the conditions for livelock freedom of the chaos router [19]. Consequently, Semi-deflection routing is livelock-free. ■

Figure 3. Uniform (1-flit)   Figure 4. Matrix transpose (1-flit)   Figure 5. Bit reversal (1-flit)
Figure 6. Uniform (4-flit)   Figure 7. Matrix transpose (4-flit)   Figure 8. Bit reversal (4-flit)
Figure 9. Uniform (8-flit)   Figure 10. Matrix transpose (8-flit)   Figure 11. Bit reversal (8-flit)

3. Evaluations
This section shows throughput evaluation results of Semi-deflection routing for an 8×8 Mesh topology with various packet lengths.

3.1 Simulation conditions
The throughput was evaluated with 64 nodes in a two-dimensional Mesh topology using irr sim [22], a C++ flit-level network simulator. Here, latency is the number of cycles between the time a source core injects a packet into the network and the time the destination core receives it. Accepted traffic is the average number of flits each core receives per cycle. Maximum throughput is defined as the maximum value of accepted traffic for which latency is 1000 cycles or lower. For the evaluation, we used the following batch and permutation routing problems [23].


• Batch routing: all processors generate a single packet simultaneously.
  - Uniform traffic: all processors generate a single packet to identically distributed random destinations. The traffic loads of all nodes are equally balanced.
• Permutation routing: each node is the destination of a single packet.
  - Matrix transpose: when the array size is k, node (x, y) transmits data to node (k − y − 1, k − x − 1). Nodes on the diagonal axis (x + y = k − 1) transmit data to node (k − x − 1, k − y − 1).
  - Bit reversal traffic: when the index bits of a source node are given as (a0, a1, ..., an−1), it sends data to the node with index (an−1, ..., a1, a0). A small sketch computing these destination functions is given below.
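The following tiny Python sketch makes the two permutation patterns concrete; the helper names are hypothetical.

```python
def matrix_transpose_dest(x, y, k):
    """Destination of node (x, y) under the matrix-transpose pattern (array size k)."""
    if x + y == k - 1:                      # nodes on the diagonal axis
        return (k - x - 1, k - y - 1)
    return (k - y - 1, k - x - 1)

def bit_reversal_dest(node, n_bits):
    """Destination obtained by reversing the n_bits-bit index of a node."""
    return int(format(node, f"0{n_bits}b")[::-1], 2)

# 8x8 Mesh: k = 8, and the 64 nodes use 6-bit indices
print(matrix_transpose_dest(1, 2, 8))       # -> (5, 6)
print(bit_reversal_dest(0b000110, 6))       # -> 24 (= 0b011000)
```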

Table 1. Average hop count (8×8 Mesh)

Routing           Traffic pattern    low    moderate   very high
North-last        Uniform            5.34   5.36       5.29
                  Mtx. transpose     6.25   5.60       5.06
                  Bit reversal       6.25   5.67       5.78
Semi-deflection   Uniform            5.37   5.51       5.81
                  Mtx. transpose     6.39   6.61       7.22
                  Bit reversal       6.34   6.37       7.31
Deflection        Uniform            5.63   6.46       5.72
                  Mtx. transpose     6.25   7.84       5.14
                  Bit reversal       6.84   8.07       6.70

3.2 Throughput
The throughput of Semi-deflection routing is shown in Figures 3 to 11. The throughput was compared with North-last adaptive routing and, exclusively for 1-flit packets, we also made a comparison with the most classical Hot-potato routing algorithm, in which input packets are deflected from the router at the next cycle and cannot wait for output ports in order to take the shortest path. The other parameters are specified as follows.
• Topology: 8×8 two-dimensional Mesh
• Buffer size: 16 flits
• Packet size: 1, 4, and 8 flits
• Throttle threshold value: 3 ports

To maintain performance of Semi-deflection routing and Hot-potato routing, we applied a very simple throttle mechanism to each router; when more than three input ports of a router are busy, its local host core does not inject a packet. Throughput evaluation with different packet length and traffic patterns revealed following features of Semi-deflection routing. First of all, comparing the routing with North-last methodology, it performed quite well on permutation routing patterns even with long packet length. With any packet length, Semi-deflection routing provided nearly twice or even higher throughput, as Figures 4, 5, 7, 8, 10 and 11 indicate. According to Figure 7, approximately 2.38 times higher throughput was obtained when packet length was one flit and traffic pattern was Matrix transpose. On the other hand, according to Figures 3, 6, and 9,

advantage of Semi-deflection routing using uniform traffic pattern was moderate. Maximum throughput was lower than North-last routing when packet length was four, was equal when packet length was eight, and slightly better for single-flit packet. Comparing with classical Hot-potato routing, maximum throughput of Semi-deflection routing was equal or outperformed according to Figures 3, 4, and 5. This is a striking result because Hot-potato routing intuitively provides higher flexibility compared to Semi-deflection routing. In addition, throughput of Hot-potato routing decreased when network load surpassed certain amount, indicating that it requires stricter injection limitation technique which presumably would reduce maximum throughput of Hot-potato routing.

3.3 Average hop count
Table 1 shows the average hop count of each traffic pattern obtained by applying North-last, Semi-deflection, and Hot-potato routing to the 8×8 two-dimensional Mesh topology. Again, here we assume single-flit packets. The traffic load was set as follows.
• Low traffic: packets are injected every 50 cycles at all routers.
• Moderate traffic: with R the rounded-off value of the average hop count under low traffic, packets are injected every R cycles at all routers. Concretely, R = 5 was applied for uniform traffic and R = 6 for the other traffic patterns.
• Very high traffic: packets were injected per one

cycle to all routers. As Table 1 indicates, Semi-deflection routing effectively suppresses the increase of hop counts. Semi-deflection and the other routing algorithms show contrasting characteristics as traffic load increases. The case of Semi-deflection routing is intuitive; when packets have lower chances to take minimal path, they take detour route and average hop count increases. Contrary, hop count decreases for North-last routing and Hot-potato routing. This is due to difference in packet delivery capacity of both routing mechanisms. When traffic becomes closer to the maximum capacity, packet movement slows down and eventually terminates. Thus, hop count indicates the average value when traffic was below saturation. Since Semi- deflection routing provides escape paths, packet movement does not slow down, and this leads to the higher throughput as shown in previous section. 4. Related Works Deterministic routing is a routing that always selects a single path between each pair of source and destination nodes. A most typical example is dimension-order routing, which routes packets first in the x-dimension, and then in the y-dimension to reach the destination node. Although this routing is advantageous in terms of average hop count because it only allows minimal path, it may cause imbalance for some traffic patterns. However, because of its simplicity and feasibility to avoid deadlocks, dimension-order routing was applied to many supercomputers such as Cray T3D[24].


On the other hand, an adaptive routing can select multiple possible paths, and make much of path diversity of the topology. The main challenge of adaptive routing is to provide a large number of alternative paths so as to maintain average network load constant, balance the load among multiple routers, and keep average hop count low as much as possible. Another challenge is to guarantee deadlock- and livelock-freedom. A turn model described in Section 2 defines prohibited turns to eliminate all cyclic channel dependency of the topology. Duato’s protocol is an adaptive routing which makes the best use of path diversity, and is a fully-adaptive routing. It allows all possible paths by adding one virtual channel for each path. In contrast, Semi-deflection differs from the above in terms of the following features. l Router design becomes lightweight because it does not

require virtual channels.
• It is a non-minimal fully-adaptive routing that has the largest path diversity.
Another approach to cope with deadlocks is deadlock-recovery-based routing, which usually employs minimal fully-adaptive routing. It is useful only when deadlocks are infrequent, and recently, techniques applicable to efficient deadlock-recovery-based routing have been proposed. When a deadlock is found, one of the packets is removed from the deadlocked paths. Overall, there are two types of deadlock removal strategies: a progressive method that prepares escape paths, and a regressive method that discards and retransmits packets. The former requires virtual channels, and the latter requires control mechanisms for packet disposal and retransmission. Semi-deflection routing is a deadlock-free mechanism, so it does not require such control mechanisms. Also, some methods handle deadlocks more proactively. For example, Deflection routing guarantees deadlock freedom by making a router send out a larger number of flits than it has received per cycle. Of course, each router cannot necessarily select the proper output port of each packet towards its destination; however, deadlock does not occur because packets do not collide. Also, studies of the Chaos router have proved its livelock freedom. Drawbacks of Deflection routing are that it cannot be applied to wormhole routing, that it increases the average hop count, and that it increases the hardware amount due to the addition of a dedicated buffer holding one packet per router [25]. Semi-deflection routing differs from Deflection routing in terms of the following.
• Semi-deflection routing only requires non-blocking transfer for selected ports of some routers. Other packets can wait for output ports in order to take appropriate paths. Thus, the average hop count is smaller than that of Deflection routing, as shown in Section 3.
• No dedicated buffer is required for implementation.

5. Conclusions
In this paper, we proposed Semi-deflection routing, which is a non-minimal fully-adaptive routing that makes the best use of virtual cut-through switching. Semi-deflection routing does not require the use of virtual channels, by allowing non-blocking transfer among certain routers in interconnection networks of massively parallel computers and SANs. Evaluation results show that the throughput improvement was at most 2.38 times in the case of an 8×8 two-dimensional Mesh topology. As future work, we are planning to test the proposed routing with other topologies, such as the torus, and to study other tuning techniques.

References

[1] N.J.Boden and et al., “Myrinet: A Gigabit-per-Second Local Area Network,” IEEE Micro, vol. 15, no. 1, pp. 29–35, 1995.

[2] I.T.Association, “Infiniband architecture. specification volume1,release 1.0.a,” available at the InfiniBand Trade Association, http://www.infinibandta.com, Jun. 2001.

[3] F.Petrini, W. Feng, A.Hoisie, S.Coll, and E.Frachtenberg, “The Quadrics network: high-performance clustering technology,” IEEE Micro, vol. 22, no. 1, pp. 46–57, 2002.

[4] P. Kermani and L. Kleinrock, “Virtual cut-through: A new computer communication switching techniques,” Computer Networks, vol. 3, no. 4, pp. 267–286, 1979.

[5] W. Dally and C. Seitz, “Deadlock-Free Message Routing in Multiprocessor Interconnection Networks,” IEEE Transaction on Computers, vol. 36, no. 5, pp. 547–553, May 1987.

[6] C. J. Glass and L. M. Ni, “The Turn Model for Adaptive Routing,” Proceedings of International Symposium on Computer Architecture, pp. 278–287, 1992.

[7] W. J. Dally and H. Aoki, “Deadlock-Free Adaptive Routing in Multicomputer Networks Using Virtual Channels,” IEEE Transaction on Parallel and Distributed Systems, vol. 4, no. 4, pp. 466–475, 1993.

[8] A.A.Chien and J.H.Kim, “Planar-adaptive routing: low-cost adaptive networks for multiprocessors,” Journal of the ACM, vol. 42, no. 1, pp. 91–123, Jan. 1995.

[9] J. Duato, “A Necessary And Sufficient Condition For Deadlock-Free Adaptive Routing In Wormhole Networks,” IEEE Transaction on Parallel and Distributed Systems, vol. 6, no. 10, pp. 1055–1067, 1995.

[10] J.C.Martinez, J.Flich, A.Robles, P.Lopez, J.Duato, and M.Koibuchi, “In-Order Packet Delivery in Interconnection Networks using Adaptive Routing,” in Proceedings of IEEE International Parallel and Distributed Processing Symposium, Apr. 2005, p. 101a.

[11] M.Koibuchi, J.C.Martinez, J.Flich, A.Robles, P.Lopez, and J.Duato, “Enforcing In-Order Packet Delivery in System Area Networks with Adaptive Routing,” Journal of Parallel and Distributed Computing, vol. 65, pp. 1223–1236, Oct. 2005.

[12] S. L. Scott and G. Thorson, "The Cray T3E network: adaptive routing in a high performance 3D torus," in Proceedings of Hot Interconnects IV, Aug. 1996, pp. 147-156.


[13] W.J.Dally and et al., “Architecture and implementation of the reliable router,” in Proceedings of Hot Interconnects Symposium II, Aug. 1994.

[14] P. Coteus, H. R. Bickford, T. M. Cipolla, P. G. Crumley, A. Gara, S. A. Hall, G. V. Kopcsay, A. P. Lanzetta, L. S. Mok, R. Rand, R. Swetz, T. Takken, P. L. Rocca, C. Marroquin, P. R. Germann, and M. J. Jeanson, “Packaging the Blue Gene/L supercomputer,” IBM Journal of Research and Development, vol. 49, no. 2/3, pp. 213–248, Mar/May 2005.

[15] J. Martinez, J. Flich, A. Robles, P. Lopez, and J. Duato, “Supporting Adaptive Routing in IBA Switches,” Journal of Systems Architecture, vol. 49, pp. 441–449, 2004.

[16] P. Baran, “On Distributed Communication Network,” Communications Systems, IEEE Transactions on, 1962. [Online]. Available:

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1088883

[17] C. Kaklamanis and S. Rao, “Hot-Potato Routing on Processor Arrays,” in Fifth annual ACM symposium on Parallel Algorithms and Architectures, Velen, Germany, 1993, pp. 273 – 282.

[18] T. Moscibroda and O. Mutlu, “A Case for Bufferless Routing in On-Chip Networks,” in Proceedings of the International Symposium on Computer Architecture (ISCA’09), June 2009.

[19] S. Konstantinidou and L. Snyder, “The chaos router,” IEEE Transactions on Computers, vol. 43, no. 12, pp. 1386–1397, 1994.

[20] E. Nilsson, “Design and Implementation of a Hot-Potato Switch in Network On Chip,” Master’s thesis, Laboratory of Electronics and Computer Systems, Royal Institute of Technology (KTH), June 2002.

[21] A. Jouraku, M. Koibuchi, and H. Amano, “An Effective Design of Deadlock-Free Routing Algorithms Based on 2-D Turn Model for Irregular Networks,” IEEE Transactions on

Parallel Distributed Systems, vol. 18, no. 3, pp. 320–333, 2007.

[22] H. Matsutani, M. Koibuchi, D. Wang, and H. Amano, “Adding Slow-Silent Virtual Channels for Low-Power On-Chip Networks,” in Proceedings of the International Symposium on Networks-on-Chip (NOCS’08), Apr. 2008, pp. 23–32.

[23] W.J.Dally and B.Towles, Principles and Practices of Interconnection Networks. Morgan Kaufmann, 2003.

[24] S. L. Scott and G. Thorson, “Optimized routing in the Cray T3D,” in PCRCW ’94: Proceedings of the First International Workshop on Parallel Computer Routing and Communication. London, UK: Springer-Verlag, 1994, pp. 281–294.

[25] J. Duato, S. Yalamanchili, and L. Ni, Interconnection Networks: an engineering approach. Morgan Kaufmann, 2002.

Author’s Profile Yuri Nishikawa received the B.E. and M.E. degrees from Keio University, Japan, in 2006 and 2008. She is currently a Ph.D. candidate at Keio University. Her research interests include the areas of interconnection networks and high-performance computing.

Michihiro Koibuchi received the B.E., M.E., and Ph.D. degrees from Keio University, Japan, in 2000, 2002, and 2003. He was a Visiting Researcher at the Technical University of Valencia, Spain, and a Visiting Scholar at the University of Southern California, in 2004 and 2006. He is currently Assistant Professor at the National Institute of Informatics (NII) and the Graduate University for Advanced Studies, Japan. His research interests include the areas of high-performance computing and interconnection networks.

Hiroki Matsutani received the B.A., M.E., and Ph.D. degrees from Keio University, Japan, in 2004, 2006, and 2008. He is currently a Cooperative Researcher at the Research Center for Advanced Science and Technology (RCAST), The University of Tokyo, Japan, and a Visiting Researcher at Keio University. His research interests include the areas of Networks-on-Chips and interconnection networks.

Hideharu Amano received the Ph.D. degree from Keio University, Japan, in 1986. He is currently Professor in the Department of Information and Computer Science, Keio University. His research interests include the areas of parallel processing and reconfigurable systems.


Protecting Consumers from the Menace of Phishing

Vivian Ogochukwu Nwaocha

National Open University of Nigeria, School of Science and Technology,

Victoria Island, Lagos [email protected]

Abstract: The number and sophistication of phishing scams sent out to consumers continue to grow drastically. Banks, vendors, and many organizations that provide their services online have had incidents in which their clients were swindled by phishers. The Internet industry is starting to take the threat very seriously, given the exploding trend of attacks and the tendency of phishing hits to afflict large industries. Today, both the spam and phishing enterprises are booming. These fraudsters send spam or pop-up messages to lure personal and financial information from unsuspecting victims. The hostile party then uses this information for criminal purposes, such as identity theft and fraud. In spite of the measures being taken by researchers, Internet service providers and software vendors to curb this scam, phishing has been on the rise as phishers continue to devise new schemes to deceive consumers. In this paper, we present the different forms of phishing, highlighting specific phishing features that would help consumers identify an imminent phishing scam in order to avoid being phished. It is hoped that promoting valuable consumer education will help protect Internet users worldwide from becoming victims of phishing scams. By providing their consumers with the tools, resources, and guidance they need to protect themselves from these threats, industries and organizations would equally help reduce the threat of phishing attacks.

Keywords: consumers, emails, phishing, vishing, websites.

1. Introduction

Phishing attacks are rapidly increasing in frequency. According to the Anti-Phishing Working Group (APWG) [1], reports of phishing attacks increased by 180% in April 2004 alone, and by 4,000% in the six months prior to April. A recent study by the antispam firm MailFrontier Inc. found that phishing emails fooled users 28% of the time [2]. Estimates of losses resulting from phishing approached $37 million in 2002 [3]. The term phishing refers to the act of sending an e-mail to a user falsely claiming to be an established legitimate enterprise in an attempt to scam the user into surrendering private information that will be used for identity theft. The e-mail directs the user to visit a Web site where they are asked to update personal information, such as passwords and credit card, social security, and bank account numbers, that the legitimate organization already has [4]. A phishing attack is said to be successful when a user is tricked into forming an inaccurate mental model of an online interaction and thus takes actions that have effects contrary to the user's intentions. The attacker can then use this information for criminal purposes, such as identity theft or fraud. Users are tricked into disclosing their information either by providing it through a web form or by downloading and installing hostile software. Once this is done, the attackers have the information they want, which puts the ball squarely in their court. This has been a very successful avenue for attackers in the past. They have been able to harvest various users' personal information with ease. As a whole, the Internet is insecure because many of the constituent networks are insecure [5].

The first major phishing attempt was made in 1995 against AOL users (ASTALAVISTA, 2010). Back then, AOL had just finished adopting measures that prevented the use of fake credit card numbers to open new AOL accounts. Because of this, crackers resorted to phishing to get real credit card numbers from authentic users in order to create their accounts. Phishers usually posed as AOL employees. These fake AOL employees contacted their victims using instant messaging in an attempt to get them to reveal their credit card details [6].

Because many phishers were successful in obtaining credit card details from AOL customers, they realized that it might be profitable to attack online payment institutions. Phishing has become a critical problem for every major financial institution in the world. Nowadays, phishers usually target people who deal with online payment services and banks. Phishers now have the ability to target specific customers of a particular financial institution. By narrowing down which bank a victim uses, phishers can send targeted emails while posing as employees of that institution. This makes their data-gathering attempts much more efficient and difficult to stop. This process is referred to as ‘spear phishing’. Some phishers have targeted VIPs and high-ranking executives in a practice that has been labeled ‘whaling’.

With the advent of social networking sites such as Facebook and MySpace, phishers have now moved to new hunting grounds. The details obtained from phishing on social networking sites are known to be used in identity theft. Phishers prefer targeting social networking sites because the success rate is often high. In fact, experts have estimated that 70% of all phishing attacks on social networking sites are successful. This is because phishers use a fake login page to trick social networkers into punching in their login details. File-sharing sites like Rapidshare and Megaupload have also been targeted by phishing schemes. Phishers


attempt to obtain login details to various premium accounts in order to gain access to the unlimited upload and download services provided by these sites.

There is yet another form of phishing where the scammers exploit the phone channel to ask for sensitive information, rather than sending e-mails and cloning trustworthy websites. In some sense, the traditional phone scams are streamlined by attackers using techniques that are typical of modern, e-mail-based phishing.

2. Related Work

A number of preventive and detective solutions for phishing threats have been provided by MarkMonitor, Panda Security, VeriSign, Internet Identity, Cyveillance, RSA, WebSense, etc. [7]. Most of them are based on detecting fraudulent emails and embedded URLs, identifying and closing down the scam site, or bombing phishing sites with dummy (but apparently real) information in order to confuse the attacker, making it difficult to distinguish real data from dummy data. The use of digital certificates has also been proposed as a countermeasure for phishing attacks. However, investigations reveal that the use of digital certificates for server authentication is not enough to mitigate phishing threats. This is for many reasons: for example, many users do not pay enough attention to the digital certificate details, and many others do not have the knowledge to perform a correct validation of the digital certificate [8, 9]. In addition, the attacker could decide not to use encrypted traffic (HTTP instead of HTTPS). These solutions are not sufficient to provide a secure environment because most of them are reactive solutions and others do not comply with security policies (e.g. deny by default, allow only what is permitted). In particular, for blocking an attacker site, detecting fraudulent emails amounts to building a black list, which is the opposite of allowing only what is permitted. Other solutions such as the use of two-factor authentication are not enough. If we authenticate the user, we also have to authenticate the server, because both entities must be considered mutually untrusted. For this reason, in order to work securely in the presence of innumerable phishing attempts, a multi-factor solution is required.

3. Common Phishing Procedure

The most common phishing scam involves sending a fraudulent email that claims to be from a well-known company. Below is an illustration of a typical phishing procedure:

The Modus Operandi of Phishing

• A fraudster initiates phishing by sending thousands, even millions, of emails to different mail accounts disguised as messages from a well-known company. The typical phishing email will contain a concocted story designed to lure you into taking an action such as clicking a link or button in the email or calling a phone number. [10]

• In the email, there will be links or buttons that take unsuspecting consumers to a fraudulent website.

• The fraudulent website will also mimic the appearance of a popular website or company. The scam site will ask for personal information, such as credit card number, Social Security number, or account password.

• As soon as the user is tricked into taking actions contrary to his intention, phishing is said to be successful. Thus, the user thinks he is giving information to a trusted company when, in fact, he is supplying it to a criminal.

4. Types of Phishing

4.1 Email and Bogus Website Phishing

The most common form of phishing is by email. In this mode of phishing, phishers pretending to be from a genuine financial institution, or a legitimate retailer or government agency, ask their targeted victim to “confirm” their personal information for some made-up reason. Typically, the email contains a link to a phony Web site that looks just like the real thing – with sophisticated graphics and images. In fact, the fake Web sites are near-replicas of the real one, making it difficult even for experts to distinguish between the real and fake Web sites. As a result, the victim enters his personal information onto the Web site – and into the hands of identity thieves.

4.2 Vishing

As computer users have become more educated about the dangers of phishing emails, perpetrators have begun incorporating the telephone into their schemes. This variation on the phishing ploy has been termed vishing, indicating that it is a combination of voice (phone) and phishing. In a typical vishing attempt, you would receive a legitimate-looking email directing you to call a number. This would connect you to an automatic voice system, which would ask for your credit card information. In some cases email wouldn't be involved at all. Instead, you would receive an automated phone call requesting your account information. Often the perpetrators would already have your credit card number and would be requesting only the security code from the back of the card.

Internet Voice, also known as Voice over Internet Protocol (VoIP) or Internet telephony, is a relatively new technology that allows you to make phone calls over the Internet. Depending on the provider, VoIP can have several advantages over conventional phone service, such as a flat rate for long distance calls and no extra charge for popular



features such as caller ID and voice mail. Internet voice (VoIP) vulnerabilities are facilitating this form of fraud. Users can telephone anonymously. In addition, caller ID devices can be fooled into displaying a false source for a call.

4.2.1 Samples of Phishing

Sample 1: "Is this Mr. Shola? I'm calling from PSP Bank. Do you have a Visa® card? I need to verify your account number because it seems that someone may be fraudulently charging purchases to your account. Can you read me the account number and expiration date on the front of your Visa® card? OK, now the last four digits on the back..."

Sample 2: "Hello, Mr. Peter Johnson? I represent the ICC Company and our records show that you have an overdue bill of $500 plus interest and penalties. You don't know anything about this bill? Well, there could be a mix-up. Is your address 34 Hall Street? What is your Social Security number...?"

Sample 3: "This is Inspector Danladi calling from the Economic and Financial Crimes Commission. Are you Mr. Samuel? We have received several reports of telemarketing fraud involving attempted withdrawals from bank accounts in your area. In order to safeguard your account, we need to confirm your account number, could you please call out your account number...” 3.3.2 Common Phishing Features

While phishing scams can be sophisticated, one needs to be vigilant in order to recognize a potential scam. The following features are often pointers that something is amiss:

• Someone contacts you unexpectedly and asks for your personal information such as your financial institution account number, an account password or PIN, credit card number or Social Security number. Legitimate companies and agencies don’t operate that way.

• The sender, who is a supposed representative of a company you do business with, asks you to confirm that you have a relationship with the company. This information is on record with the real company.

• You are warned that your account will be shut down unless you “reconfirm” your financial information.

• Links in an email you receive ask you to provide personal information. To check whether an email or call is really from the company or agency, call it directly or go to the company’s Web site (use a search engine to find it).

• You’re a job seeker who is contacted by someone claiming to be a prospective employer who wants your personal information.

5. Tips for Spotting Phishing Scams

Essentially, fraudulent emails and websites are designed to deceive you and can be difficult to distinguish from the real thing. Whenever you get an email about your account, the safest and easiest course of action is to open a new browser, type the website address of your online transaction and log in to your account directly. Do not click on any link in an email that requests personal information.

5.1 Identifying Fraudulent Emails

There are many telltale signs of a fraudulent email [11]; an illustrative sketch that applies several of them programmatically follows the list below.

a. Sender's Email Address. To give you a false sense of security, the “From” line may include an official-looking email address that may actually be copied from a genuine one. The email address can easily be altered – it’s not an indication of the validity of any email communication.

b. Generic Email Greeting. A typical phishing email will have a generic greeting, such as “Dear User.” Note: All PayPal emails will greet you by your first and last name.

c. False Sense of Urgency. Most phishing emails try to deceive you with the threat that your account will be in jeopardy if it’s not updated right away. An email that urgently requests you to supply sensitive personal information is typically fraudulent.

d. Fake Links. Many phishing emails have a link that looks valid, but sends you to a fraudulent site whose URL may or may not differ from the link text. Always check where a link is going before you click. Move your mouse over the URL in the email and look at the URL in the browser. As always, if it looks suspicious, don't click it. Open a new browser window, and type https://www.paypal.com.

e. Attachments. Similar to fake links, attachments can be used in phishing emails and are dangerous. Never click on an attachment. It could cause you to download spyware or a virus. PayPal will never email you an attachment or a software update to install on your computer.
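To make the signs above concrete, the following is a minimal illustrative Python sketch (not part of the original guide) that scores a raw email against several of them: a generic greeting, urgency wording, anchor text whose visible URL does not match the link target, and the presence of attachments. The keyword list, regular expressions and scoring weights are assumptions chosen for the example, not a production filter.

import re
from email import message_from_string
from urllib.parse import urlparse

URGENCY_PHRASES = ("verify your account", "account will be suspended", "update immediately")

def phishing_score(raw_email: str) -> int:
    """Return a rough suspicion score for an RFC 822 message (illustrative only)."""
    msg = message_from_string(raw_email)
    payload = msg.get_payload()
    body = payload if isinstance(payload, str) else ""
    score = 0

    # b. Generic greeting instead of the recipient's name.
    if re.search(r"dear (user|customer|member)", body, re.IGNORECASE):
        score += 1

    # c. False sense of urgency.
    if any(phrase in body.lower() for phrase in URGENCY_PHRASES):
        score += 1

    # d. Fake links: the visible URL in the anchor text differs from the href target.
    for href, text in re.findall(r'<a[^>]+href="([^"]+)"[^>]*>([^<]*)</a>', body, re.IGNORECASE):
        shown = re.search(r"https?://[^\s<]+", text)
        if shown and urlparse(shown.group()).hostname != urlparse(href).hostname:
            score += 2

    # e. Attachments are themselves a warning sign.
    if any(part.get_filename() for part in msg.walk()):
        score += 1

    return score

An email greeting "Dear User", urging an immediate account update and hiding a foreign host behind a PayPal-looking link would score 4 here; how such a score is acted on (warn, quarantine, discard) is left open.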

Model of a fraudulent email

5.2 Identifying a Fraudulent Website

A phishing email will usually try to direct you to a fraudulent website that mimics the appearance of a popular website or company. The fraudulent website, commonly referred to as a ‘spoof’ website, will request your personal information, such as credit card number, Social Security number, or account password.


You think you are giving information to a trusted company when, in fact, you are supplying it to an online criminal.

a. Deceptive URLs. Be cautious. Some fraudsters will insert a fake browser address bar over the real one, making it appear that you’re on a legitimate website. Follow these precautions: even if a URL contains the word "PayPal," it may not be a PayPal site (a small illustrative check appears after this list).

Examples of fake PayPal addresses: http://[email protected]/ http://83.16.123.18/pp/update.htm?=https:// www.paypal.com/=cmd_login_access www.secure-paypal.com

Always log in to PayPal by opening a new browser and typing in the following: https://www.paypal.com. The term "https" should precede any web address (or URL) where you enter personal information. The "s" stands for secure. If you don't see "https," you're not in a secure web session, and you should not enter data.

b. Out-of-place lock icon. Make sure there is a secure lock icon in the status bar at the bottom of the browser window. Many fake sites will put this icon inside the window to deceive you.
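The kind of URL check suggested under "Deceptive URLs" above can be automated. The short Python sketch below is an illustrative assumption (the helper name and the expected_domain default are made up for the example); it checks only that a URL uses HTTPS and that its hostname really belongs to the expected domain.

from urllib.parse import urlparse

def looks_legitimate(url: str, expected_domain: str = "paypal.com") -> bool:
    """Illustrative check: HTTPS scheme and a hostname that is the expected domain or one of its subdomains."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Reject userinfo tricks where the real host hides after an "@" sign.
    if "@" in parsed.netloc:
        return False
    return parsed.scheme == "https" and (
        host == expected_domain or host.endswith("." + expected_domain)
    )

# The fake addresses listed earlier would all fail this check:
print(looks_legitimate("http://83.16.123.18/pp/update.htm"))   # False: plain HTTP, IP-address host
print(looks_legitimate("https://www.secure-paypal.com"))       # False: look-alike domain
print(looks_legitimate("https://www.paypal.com"))              # True

Such a check complements, rather than replaces, the lock-icon and certificate advice, since a phishing site can also be served over HTTPS.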

Model of a Bogus Website

6. Phishing Protection: Multi-factor Approach

The primary responsibility for protecting yourself from phishers lies with YOU. Here are some steps you can take:

• Be on guard. Be wary of any email with an urgent request for personal, account or financial information. Unless the email is digitally signed (a method of authenticating digital information), you can't be sure it is authentic.

• Don't fill out a form on a Web site unless you know it is secure. You should communicate information such as credit card numbers or account information only through a secure Web site or over the telephone. To ensure that you're on a secure Web server, check its address in your browser's address bar. It should begin with "https" rather than just "http." In addition, there should be a symbol such as a padlock or key, usually at the bottom of your browser window (not in the Web page window). Double click on the symbol to see the security certificate for the site and make sure that it matches the site you think you're visiting. But beware - a scammer may also use a secure Web site.

• Regularly check your bank, credit and debit card statements (paper and online). Verify each account at least once a month. Ensure that all transactions are legitimate. If anything is suspicious, contact your bank and all card issuers.

• Ensure that your browser is up to date. Make sure that you have applied the latest security patches and updates. If you use the Microsoft Internet Explorer browser, go to http://www.microsoft.com/security/ and download a special patch relating to certain phishing schemes.

• Install and maintain antivirus and anti-spyware software. Some phishing emails may contain software that can track your activities, disrupt your computer or simply slow it down. Detect, manage and delete these threats by installing effective antivirus and anti-spyware software and keeping it updated, either automatically or by downloading updates manually from the manufacturer's Web site.

• Consider installing a phish-blocking toolbar on your Web browser. EarthLink ScamBlocker is part of a free browser toolbar that alerts you before you visit a page that's on EarthLink’s list of known fraudulent phisher Web sites. It's free to all Internet users and can be downloaded at EarthLink Toolbar.

Handle a vishing attempt as you would a phishing situation:

• Don't respond to it.

• Don't call a number given in an email.

• Don't give out your account information in response to a phone call you didn't initiate.

Contact your credit card company directly and only by your usual means.

7. Conclusion

It is thus important that consumers are watchful in handing out critical user-specific information. Creating passwords that use a combination of upper-case and lower-case letters and special characters will also make accounts harder to compromise. For businesses, educating employees on how to recognise a phishing attempt strengthens their overall security posture. It is also wise to install advanced browsers that alert users when fraudulent or suspicious websites are visited. Moreover, exchanging details should be done in a secure manner over channels where strong cryptography is used for server authentication. In the struggle against phishers and Internet scam perpetrators, being a smart Internet user makes a difference. Internet fraud can be eliminated or reduced to a great extent when common sense and safety precautions are applied.

References

[1] Anti-Phishing Working Group, "Phishing Attack Trends Report," April 2004. [Online]. Available: http://antiphishing.org/APWG_Phishing_Attack_Report-Apr2004.pdf

[2] B. Sullivan, "Consumers Still Falling for Phish," MSNBC, July 28, 2004. [Online]. Available: http://www.msnbc.msn.com/id/5519990/

[3] N. Chou, R. Ledesma, Y. Teraguchi, and J. C. Mitchell, "Client-Side Defense Against Web-Based Identity Theft," 11th Annual Network and Distributed System Security Symposium, 2004. [Online]. Available: http://theory.stanford.edu/people/jcm/papers/spoofguard-ndss.pdf

[4] Webopedia, "Phishing," 2010. [Online]. Available: http://www.webopedia.com/TERM/P/phishing.html

[5] C. Douglas, The Internet Book: Everything You Need to Know About Computer Networking and How the Internet Works, Fourth Edition, pp. 311-312, 2006.

[6] ASTALAVISTA, "The Hacking and Security Community: Introduction to Phishing," July 2010. [Online]. Available: http://www.astalavista.com/blog/5/entry-90-introduction-to-phishing/

[7] Anti-Phishing Working Group, Vendor Solutions, 2010. [Online]. Available: http://www.antiphishing.org/solutions.html

[8] R. S. Katti and R. G. Kavasseri, "Nonce Generation for the Digital Signature Standard," International Journal of Network Security, vol. 11, no. 1, pp. 23-32, July 2010.

[9] C. Yang, "Secure Internet Applications Based on Mobile Agents," International Journal of Network Security, vol. 2, no. 3, pp. 228-237, May 2006.

[10] PayPal, Phishing Guide Part I. [Online]. Available: https://www.paypal.com/cgi-bin/webscr?cmd=xpt/Marketing/securitycenter/general/UnderstandPhishing-outside

[11] PayPal, Phishing Guide Part II, "Recognizing Phishing." [Online]. Available: https://www.paypal.com/cgi-bin/webscr?cmd=xpt/Marketing/securitycenter/general/RecognizePhishing-outside

Author Profile Vivian Ogochukwu Nwaocha is currently involved in coordinating Computer Science and Information Technology programs at the National Open University of Nigeria. Her main research interests are computer security, artificial intelligence, mobile learning and assistive technologies. A good number of papers authored by Vivian have been published in various local and international journals. Vivian has equally written a number of books which are accessible online. She has participated in several community and service development projects in Nigeria and beyond. Vivian is a member of the Computer Professionals Registration Council of Nigeria, the Nigeria Computer Society, Prolearn Academy, elearning Europe, the IAENG Society of Computer Science, the IAENG Society of Artificial Intelligence, the IAENG Society of Bioinformatics and several online social networking communities.


An Intelligent Bidirectional Authentication Method

Nabil EL KADHI 1 and Hazem EL GENDY 2

1 Computer Engineering Department Chairman Ahlia University Bahrain

[email protected]

2 Faculty of Computer Science & IT, Ahram Canadian University, Egypt [email protected]

Abstract: A new Bluetooth authentication model using some game theory concepts is presented in this paper. Bluetooth is a wireless communication protocol designed for WPAN (Wireless Personal Area Network) use. Game theory is a branch of mathematics and logic which deals with the analysis of games. Authentication between two Bluetooth devices is a unidirectional challenge-response procedure and consequently has many vulnerabilities. We propose a bidirectional authentication scheme in which the authentication is considered as a non-cooperative non-zero-sum bi-matrix game. Three strategies are developed for each player, and the best-response strategies (also called the Nash equilibrium) for this game are computed. Using the Simplex algorithm, we find only one Nash equilibrium, corresponding to the case where both Bluetooth devices are authentic and trying to securely communicate together. In a Nash equilibrium, no player has an incentive to deviate from such a situation. Then, we generalize our authentication method to other protocols.

Keywords: Computer/Communications Protocols, ISO (International Standards Organization), Bluetooth security, Bluetooth authentication, game theory, Nash equilibrium, Transport Layer Protocol.

1. Introduction

The role of Information Technology in various aspects of our lives has been growing rapidly. This has in turn increased the importance of having digital information bases and electronic connectivity between various sites of the same organization and between various organizations. These may be spread over multiple networks in different countries on different continents [16, 17].

This has in turn significantly and substantially increased the importance of having security guarantees for these information data bases and this electronic connectivity. Unfortunately, the security risks have also increased. This has triggered research on and development of security methods and systems to provide security guarantees to the communicating users and to users of the information data bases. This includes the work on developing authentication methods to authenticate the identity of the communicating parties [1].

The explosive growth of electronic connectivity and wireless technologies revolutionized our society. Bluetooth is one of these technologies. It is a recently proposed standard [8] that allows for local wireless communication and facilitates the physical connection of different devices [2]. Unfortunately, this wireless environment attracted many

malicious individuals. Wireless networks are exposed to many risks and hacker attacks, ranging from data manipulation and eavesdropping to virus and worm attacks. On one hand, security needs are increasingly vital. On the other hand, many security problems have been addressed by game theory. In fact, game theory is the formal study of interactive decision processes [11], offering enhanced understanding of conflict and cooperation through mathematical models and abstractions.

Bluetooth networks are proliferating in our society. Unfortunately, the Bluetooth security has many weaknesses. Del Vecchio and El Kadhi [8] explain many attacks based on the Bluetooth protocol and Bluetooth software implementations.

The application of game theory to networks security has been gaining increasing interest within the past few years. For example, Syverson [14] talks about “good” nodes fighting “evil” nodes in networks and suggests using game theory for reasoning. In [3], Browne describes how game theory can be used to analyze attacks involving complicated and heterogeneous military networks. Buike [4] studies the use of games to model attackers and defenders in information warfare.

In this paper, we focus on the vulnerability of Bluetooth authentication. Since this process is unilateral, a malicious Verifier can considerably damage its correspondent, threatening the operability of that device on the one hand and the confidentiality and integrity of the exchanged data on the other. To counter this weakness, a game-theoretic framework is used to model a bidirectional authentication between two Bluetooth devices. Using the Nash equilibrium concept, a secure authentication process is defined in which the authentication is successful if and only if both devices are trusted. This paper is structured as follows: first, the Bluetooth protocol is reviewed with a focus on its security procedures and vulnerabilities in section 2. Then, section 3 is dedicated to a background on game theory. Next, in section 4 we introduce our game-theoretic model, and some results are presented in section 5. The new bidirectional Bluetooth authentication protocol is described in section 6. In section 7, we generalize our intelligent authentication method to other protocols. Section 8 presents concluding remarks.


2. An overview of Bluetooth security

2.1 Bluetooth technology

Bluetooth is a short-range wireless cable replacement technology. It was researched and developed by an international group called the Bluetooth Special Interest Group (SIG). It has been chosen to serve as the baseline of the IEEE (Institute of Electrical and Electronics Engineers) 802.15.1 standard for Wireless Personal Area Networks (WPANs) [6]. Bluetooth communication adopts a master-slave architecture to form restricted types of an ad-hoc network (a collection of nodes that do not need to rely on a predefined infrastructure to keep the network connected) called piconets. A Bluetooth piconet can consist of eight devices, of which one is the master and the others are slaves. Each device may take part in three piconets at most, but a device may be master in one piconet only. Several connected piconets form a so-called scatternet. One of the main practical applications of Bluetooth technology is the ability to transfer files, audio data and other objects, such as electronic business cards, between physically separate devices such as cell phones and PDAs (Personal Digital Assistants) or laptops. In addition, the piconets formed by Bluetooth can be useful, for example, in a meeting where all participants have their own Bluetooth-compatible laptops and want to share files with each other.

2.2 Bluetooth link-level security The Bluetooth specifications include security features at the link level. These features are based on a secret link key that is shared by a pair of devices. Bluetooth link-level security supports key management, authentication and encryption [10].

2.2.1 Security entities In every Bluetooth device there are four entities used for managing and maintaining security at the link level, namely [7]:

• The Bluetooth device address (BD_ADDR).

• The private link key.

• The private encryption key.

• A random number (RAND). There is also a Bluetooth Personal Identification Number (PIN) used for authentication and to generate the initialization key before exchanging link keys [13].

2.2.2 Key management

A key management scheme is used to generate, store, and distribute keys for the purpose of encryption, authentication and authorization [13][5]. Bluetooth specifies five different types of keys: four link keys (an initialization key, a unit key, a combination key and a master key) [7][13] and one encryption key [5].

2.2.3 Authentication

Bluetooth authentication uses a challenge-response scheme, which checks whether the other party knows the link key [9]. Thus one device adopts the role of the Verifier and the other the role of the Claimant [7]. Authentication is unilateral, i.e. one device (the Claimant) authenticates itself to another device (the Verifier). If mutual authentication is required, the authentication process is repeated with the roles

exchanged [15]. The authentication process is shown in figure 1:

2.2.4 Encryption

The encryption procedure follows on from the authentication procedure. After the link key has been determined, and authentication is successful, the encryption key is generated by the Bluetooth E3 algorithm [9][12]. The stream cipher algorithm, E0, is used for Bluetooth packet encryption and consists of three elements: the keystream generator, the payload key generator and the encryption/decryption component [7].

3. Game theory

Game theory is a systematic and formal representation of the interaction among a group of rational agents (people, corporations, animals...). It attempts to determine mathematically and logically the actions that players should take in order to optimize their outcomes. We distinguish two main types of game-theoretic models: strategic (or static) games and extensive games. The strategic form (also called normal form) is a basic model studied in non-cooperative game theory. A game in strategic form is given by a set of strategies for each player, and specifies the payoff for each player resulting from each strategy profile (a combination of strategies, one for each player). Each player chooses his plan of action once and for all, and all players make their decisions simultaneously at the beginning of the game. When there are only two players, the strategic-form game can be represented by a matrix commonly called a bi-matrix. The strategic game solution is, in fact, a Nash equilibrium. Every strategic game with a finite number of players, each with a finite set of actions, has an equilibrium point. This Nash equilibrium is a point from which no single player wants to deviate unilaterally. By contrast, the model of an extensive game specifies the possible orders of the events. The players can make decisions during the game and they can react to other players’ decisions. Extensive games can be finite or infinite. An extensive game is a detailed description of the sequential structure corresponding to decision problems encountered by the players within strategic situations.


4. Proposed model: a game-theoretic protocol

4.1 Assumptions and notations

The bidirectional Bluetooth authentication between two devices is described by a non-cooperative, non-zero-sum game for two players in a normal-form representation, also known as a bimatrix game. Our game is a non-cooperative one because the authentication procedure is considered under the worst-case assumption. In other words, the Verifier device and the Claimant are assumed to be in conflict because each of them has to consider that the other one may be malicious. Both devices are trying to reach the same optimal situation: communicate together without any risk. Thus, what one device gains is not necessarily what the other loses. This yields a non-zero-sum game. We define three strategies for each player i:

i ∈ {v, c}

where v refers to the Verifier and c refers to the Claimant:

• Ti: tell the truth and communicate with player j.

• Ii: tell the truth and do not communicate with player j.

• Li: lie and try to damage player j.

where j ∈ {v, c} and i ≠ j. To allow only secure devices to communicate together, we assign reward and cost values defining a utility function Ui for each player i. In practice, each strategy choice is assigned some value of the players’ utility functions. The set of values assigned to the different strategies is determined according to statistical computations, empirical studies, or user-specified values. In this work, such values are defined according to a set of secure bidirectional Bluetooth authentication rules. Note that we suggest specifying these rules according to the authentication game context and logic. Thus:

Rule 1 A bidirectional authentication between two Bluetooth devices is secure if and only if both devices are trusted. Rule 2 A Bluetooth device is a winner when it is trusted and is a loser otherwise. Rule 3 A bidirectional Bluetooth authentication between two Bluetooth devices is successful if and only if it is secure and both devices cooperate together. In addition, the following assumptions illustrate our authentication game:

Assumption 1 Each player knows that his correspondent may be a trusted device or a malicious one (note that this assumption will justify the use of cryptographic parameters in our model).

Assumption 2 Each player knows that if it cooperates, in other words if it tells the truth and communicates with its correspondent, it will win some value ω in the best case (when its correspondent is trusted) and it will lose some value λ in the worst case (when its correspondent is malicious).

Assumption 3 Each player knows that if it tries to damage its correspondent, in other words if it lies, it will lose some value η when its correspondent is trusted and it will win some value ι when its correspondent is malicious.

Assumption 4 Each player knows that it had better be trusted in any case: ω > ι, λ < η and (ω + λ) > (η + ι).

Assumption 5 Each player knows that if it does not cooperate, in other words if it tells the truth and does not communicate with its correspondent, it will neither win nor lose.

4.2 Costs and rewards

Next, the meanings of win and lose are defined for the Bluetooth devices. Consider each player's payoff as a function of an energy class constant G and a trust level constant Q. In fact, Bluetooth devices need to save operating power, and a device's level of trust defines the interoperability authorization. The utility function is then described as Ui = αi·G − βi·Q. For each player, the term αi·G defines the reward value whereas the term βi·Q defines the cost value. The value of αi depends only on the trustworthiness of player i, whereas βi depends on the trustworthiness of both players i and j. For example, if a player i is trusted and faces an untrusted correspondent j, i will be rewarded for its authenticity but it should pay for the non-authenticity of j. Thus, we define the following values for the coefficients αi and βi.
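Since the table of coefficient values is not reproduced above, the short Python sketch below only illustrates how a payoff bi-matrix could be assembled from Ui = αi·G − βi·Q once concrete coefficients are chosen; every numeric value in it (G, Q and the α/β entries) is a placeholder assumption picked so that mutual truth-telling pays best, in the spirit of Rules 1-3, and none of them are the paper's actual figures.

# Strategies: T (tell the truth and communicate), I (truth, no communication), L (lie).
G, Q = 2.0, 1.0                      # assumed energy-class and trust-level constants
STRATEGIES = ("T", "I", "L")

alpha = {"T": 3.0, "I": 0.0, "L": 1.0}          # reward coefficient, depends on own strategy only
beta = {                                         # cost coefficient, depends on both strategies
    "T": {"T": 1.0, "I": 1.0, "L": 4.0},
    "I": {"T": 0.0, "I": 0.0, "L": 0.0},
    "L": {"T": 3.0, "I": 3.0, "L": 1.0},
}

def utility(own: str, other: str) -> float:
    """Ui = alpha_i * G - beta_i * Q, with beta depending on both players' strategies."""
    return alpha[own] * G - beta[own][other] * Q

# Bi-matrix entry (verifier strategy, claimant strategy) = (Verifier payoff, Claimant payoff).
bimatrix = {(v, c): (utility(v, c), utility(c, v)) for v in STRATEGIES for c in STRATEGIES}

for profile, payoffs in sorted(bimatrix.items()):
    print(profile, payoffs)

With these placeholder numbers, (T, T) is the only pure-strategy profile from which neither device can gain by deviating, which is the qualitative behaviour the rules above ask for.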

4.3 The Nash equilibrium of our game

To achieve a secure bidirectional Bluetooth authentication preserving the confidentiality and the integrity of the data in transit, we use the Nash equilibrium theorem:

Theorem 1 A Nash equilibrium of a strategic-form game is a mixed-strategy profile σ* ∈ Σ such that "every player is playing a best response to the strategy choices of his opponents". More formally, σ* is a Nash equilibrium if

Ui(σi*, σ−i*) ≥ Ui(si, σ−i*) for every si ∈ Si and every i ∈ P,

where P = {1, ..., n} is the player set, Si is player i's pure-strategy space, Σi is player i's mixed-strategy space (the set of probability distributions over Si), −i denotes the set P\{i}, σi is player i's mixed strategy, and Ui(σ) is player i's expected utility from a mixed-strategy profile σ.
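Read operationally, Theorem 1 says that at σ* no player can gain by switching to any pure strategy. The Python sketch below checks exactly that condition for a two-player bi-matrix game; the payoff matrices and candidate profiles are toy placeholders, and this is a verification of the definition rather than a reimplementation of the paper's Simplex-based computation of equations (3) and (4).

from typing import List, Sequence

def expected_payoff(payoff: List[List[float]], p: Sequence[float], q: Sequence[float]) -> float:
    """Expected utility when the row player mixes with p and the column player with q."""
    return sum(p[i] * q[j] * payoff[i][j] for i in range(len(p)) for j in range(len(q)))

def is_nash(A: List[List[float]], B: List[List[float]],
            p: Sequence[float], q: Sequence[float], tol: float = 1e-9) -> bool:
    """Check Ui(sigma*) >= Ui(si, sigma*_-i) for every pure strategy si of each player.
    A is the row player's payoff matrix, B the column player's."""
    u_row = expected_payoff(A, p, q)
    u_col = expected_payoff(B, p, q)
    for i in range(len(A)):                      # row player's pure deviations
        pure = [1.0 if k == i else 0.0 for k in range(len(p))]
        if expected_payoff(A, pure, q) > u_row + tol:
            return False
    for j in range(len(A[0])):                   # column player's pure deviations
        pure = [1.0 if k == j else 0.0 for k in range(len(q))]
        if expected_payoff(B, p, pure) > u_col + tol:
            return False
    return True

# Toy example (matching pennies): the unique equilibrium mixes 50/50 on both sides.
A = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-1.0, 1.0], [1.0, -1.0]]
print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))   # True
print(is_nash(A, B, [1.0, 0.0], [0.5, 0.5]))   # False

Because it suffices to check deviations to pure strategies, the same routine can confirm any mixed profile returned by a linear-programming (Simplex) resolution.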


To compute our game’s Nash equilibrium, we first formulate the Verifier’s and the Claimant’s mixed-strategy best-response correspondences (respectively, MBRv(r, s) and MBRc(p, q)):

where p, q, r and s ∈ [0, 1]. The probabilities p, q, r and s, corresponding to the players’ mixed strategies, are computed using the linear programs described in equations (3) and (4):

Then, the Simplex algorithm is used to solve equations (3) and (4). This resolution leads to the following values:

and t = 0.

5. Results

After the optimal results are computed by the Simplex resolution, the algorithm matches the Verifier and Claimant probabilities with the mutual best-response correspondences MBRv(r, s) and MBRc(p, q). The Claimant probability r = 1/73 corresponds to the case where Tv is the best strategy for the Verifier; in fact, r is greater than (3/8)s and also greater than (1/5)s. Analogously, the Verifier probability p = 1/73 yields the case where Tc is the Claimant’s best strategy; in fact, p is greater than (3/8)q and also greater than (1/5)q. Thus, the mixed-strategy Nash equilibrium of our game corresponds to the situation where telling the truth and cooperating is the best strategy for both players. Consequently, the best strategy for the Verifier is Tv, the best strategy for the Claimant is Tc, and neither player has an incentive to deviate from this situation. This means that, according to our bidirectional authentication, the two Bluetooth devices in communication are better off trusting each other.

6. Our bidirectional Bluetooth authentication protocol

Our method includes two main phases: the authentication security parameters phase and the authentication game establishment phase. The first phase is used to define the devices’ trustworthiness and consequently the players’ strategies. The second phase corresponds to our game-theoretic model where the bidirectional authentication is considered a bimatrix game.

6.1 The security parameters check phase

According to the classic Bluetooth authentication (see figure 1), the Verifier and the Claimant devices use their input parameters to produce the SRES and ACO outputs. For both devices, there is only one secure parameter, the BDDR_C relative to the Claimant, and only the Verifier checks whether the two SRES values correspond. The Verifier can establish the trustworthiness or the untrustworthiness of its correspondent. Consequently, it can accept or refuse the communication without any risk. But, if the Verifier is a malicious device, the Claimant is incapable of discovering this, and the Verifier can easily damage its correspondent. Consequently, in our bidirectional model, we consider additional input parameters for both players: RAND(C) and BDDR_V. Thus, the security parameters check phase includes two main steps. First, the Verifier checks the Claimant’s identity. Next, the Claimant takes the role of the Verifier and checks its correspondent’s identity. Note that this identity check is done during two different sessions and is not a single bidirectional exchange. In each step, each device computes an output and then the two devices check for correspondence. The Verifier and the Claimant compute, respectively, SR1 and SR2 in the first step, and SR3 and SR4 in the second step.
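As a concrete illustration of the two-step check (and only as an illustration: HMAC-SHA-256 below is a stand-in for the Bluetooth E1 function, and the parameter names simply mirror the description above rather than the actual LMP messages), a bidirectional session could look like this in Python:

import hmac, hashlib, os

def e1_stand_in(link_key: bytes, rand: bytes, bd_addr: bytes) -> bytes:
    """Stand-in for E1: a keyed hash over the challenge and an address."""
    return hmac.new(link_key, rand + bd_addr, hashlib.sha256).digest()

def bidirectional_auth(verifier_key: bytes, claimant_key: bytes,
                       bddr_v: bytes, bddr_c: bytes) -> bool:
    """Session 1: the Verifier checks the Claimant (SR1 vs SR2).
    Session 2: roles swap and the Claimant checks the Verifier (SR3 vs SR4)."""
    rand_v = os.urandom(16)                                   # challenge issued by the Verifier
    sr1 = e1_stand_in(verifier_key, rand_v, bddr_c)           # expected value on the Verifier side
    sr2 = e1_stand_in(claimant_key, rand_v, bddr_c)           # response computed by the Claimant
    if not hmac.compare_digest(sr1, sr2):
        return False                                          # Claimant is not trusted

    rand_c = os.urandom(16)                                   # RAND(C) issued by the Claimant
    sr3 = e1_stand_in(claimant_key, rand_c, bddr_v)           # expected value on the Claimant side
    sr4 = e1_stand_in(verifier_key, rand_c, bddr_v)           # response computed by the Verifier
    return hmac.compare_digest(sr3, sr4)                      # both directions must match

shared = os.urandom(16)
print(bidirectional_auth(shared, shared, b"\xaa" * 6, b"\xbb" * 6))          # True: same link key
print(bidirectional_auth(shared, os.urandom(16), b"\xaa" * 6, b"\xbb" * 6))  # False: keys differ

Only when both comparisons succeed would the devices move on to the authentication game phase described next.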


6.2 The authentication game phase

The authentication game phase consists of modeling the bidirectional Bluetooth authentication as a game between the Verifier and the Claimant. Results achieved in the previous step of our algorithm are used to define the players’ strategies. In fact, the strategy retained by each device is derived from output matching. On one hand, SR1 = SR2 means that the Claimant is trusted and ready to communicate; otherwise, the Claimant is considered a malicious device. On the other hand, if the Claimant does not return a result, it is indifferent to the communication. The same reasoning is used for the Verifier where, this time, the SR3 and SR4 results are used. After deriving the players’ strategies, the utility function parameters are defined. These parameters represent the cost and reward coefficients assigned to each player, depending on its strategy and that of its correspondent. Next, the Nash equilibrium is computed as detailed in section 4.3 (the best-response correspondences). Consequently, our Nash equilibrium represents a pair of strategies (one per device) where each player tells the truth and wants to securely communicate with its correspondent. Recall that in a Nash equilibrium, no player has an incentive to deviate from its strategy. In terms of Bluetooth security, our bidirectional authentication is successful if and only if both devices are trusted and there is no risk of damage or impersonation.

6.3 BiAuth algorithm

We summarize our bidirectional authentication procedure in an algorithm called BiAuth, which is described as follows:

Algorithm BiAuth 1. Security parameters check:

(a) Define the authentication security parameters. (b) Compute the security parameters correspondences.

2. Authentication game: (a) Define the game basic elements:

• Define the set of players (a Verifier device and a Claimant device).

• Define the players’ pure strategies (depending on the verification of security parameters).

• Define the players’ mixed strategies. • Define the players’ utility functions.

(b) Find mixed Nash equilibrium:

• Compute Verifier and Claimant pure-strategy best-response correspondences.

• Compute Verifier and Claimant mixed-strategy best-response correspondences.

(c) Formulate Verifier and Claimant problems as linear programs.

(d) Compute mixed strategies’ probabilities: Simplex resolution.

(e) Compute mixed Nash equilibrium.

Figure 2 illustrates our bidirectional Bluetooth authentication protocol, where:

• RV and RC are Verifier and Claimant random-generated numbers.

• BV and BC are the Verifier and the Claimant Bluetooth addresses (BDDR).

• LK is the link key.

• ACO is the Authenticated Ciphering Offset generated by the authentication process.

• FV and FC are the Verifier and the Claimant functions used to check their identities.

• E1 is the cryptographic function used during the unidirectional Bluetooth authentication.

• SSV and SSC are the sets of all possible strategies for the Verifier and the Claimant.

• PRV and PRC are the Verifier and Claimant strategy probabilities.

• UV and UC are the Verifier and the Claimant utility functions.

• CNEV and CNEC are the functions used to compute the Verifier and the Claimant best-response correspondences.

• NEV and NEC are the Verifier and the Claimant Nash strategies.

6.4 Attack scenarios

As previously noted, an important risk incurred in the classical Bluetooth authentication is linked to a malicious Verifier. Such a device can attack a trusted Claimant with a set of messages and damage it. According to our authentication model, such a scenario will not occur. In fact, when considering our game, the strategy pairs in which one device lies and tries to damage its correspondent while the other tells the truth and tries to communicate do not represent a Nash equilibrium. Another possible attack is the man-in-the-middle attack, where an attacker device inserts itself "in between" two Bluetooth devices. The attacker connects to both devices and plays a masquerade role. Our bidirectional authentication can prevent such an attack. Indeed, the attacker could not impersonate any device in the communication: the attacker must authenticate itself as a trusted device to each Bluetooth device; otherwise, the authentication fails.

Figure 2: Our bidirectional Bluetooth authentication protocol.

7. Generalization of the Security Method to Other Protocols

In this section, we generalize our authentication scheme to protocols other than the Bluetooth protocol. We extend the authentication scheme to end-to-end protocols of wired ISO networks (given in Figure 3) that utilize the ISO OSI Transport Layer Protocol. To do this, we extend the ISO Transport Layer protocol to include an authentication phase. We also require that successful authentication of the other party (the party responding to a request to communicate from the initiating party, or the party requesting communication with the current party) be a necessary condition for the transfer of user data (the normal Data Transfer state of the ISO Transport Layer protocol). Figure 4 represents the extended ISO Transport Layer protocol, while Figure 3 represents the ISO Transport Layer protocol before extension.


Figure 3. Block Diagram representing normal ISO Transport Layer protocol

Referring to the ISO Transport Layer protocol given in Lotos in [16, 17], we have the following. Consider the Lotos specification for the Class 0 transport protocol in the case where the protocol entity is the initiator.

Process TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq] : noexit :=
  ( ?tcreq; !tdind;
      TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
    []
    ?tcreq; !cr;
      ( ( ?dr; !tdind;
            TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq] )
        []
        ( ?cc; !tccon; exit ) ) )
  >> ( Data_phase[tdatr,dt]
       [> Disconnection_phase[tdreq,ndreq,ndind,tdind] )
endproc

where

Process Data_phase[tdatr,dt] : exit :=
  ?tdatr; i; Data_phase[tdatr,dt]
  []
  ?dt; i; Data_phase[tdatr,dt]
endproc

Process Disconnection_phase[tdreq,ndreq,ndind,tdind] : noexit :=
  ?tdreq; !ndreq;
    TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
  []
  ?ndind; !tdind;
    TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
endproc

Applying our authentication scheme and our extension, we get the following specification.

Figure 4: Block Diagram of the Extended ISO Transport Layer protocol with Authentication

Process AuthenticatedTPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq] : noexit :=
  ( ?tcreq; !tdind;
      TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
    []
    ?tcreq; !cr;
      ( ( ?dr; !tdind;
            TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq] )
        []
        ( ?cc; !tccon; exit ) ) )
  >> ( Authentication_Data_phase[RV,RC,SR2,tdatr,dt]
       [> Disconnection_phase[tdreq,ndreq,ndind,tdind] )
endproc

where

Process Authentication_Data_phase[RV,RC,SR2,tdatr,dt] : exit :=
  ( !RV; ?RC; SR2;
      ( i; ?tdatr; i; Data_phase[tdatr,dt] )
      []
      ( i; Disconnection_phase[tdreq,ndreq,ndind,tdind] ) )
  []
  ( !RV; ?RC; SR2;
      ( i; ?dt; i; Data_phase[tdatr,dt] )
      []
      ( i; Disconnection_phase[tdreq,ndreq,ndind,tdind] ) )
endproc

Process Disconnection_phase[tdreq,ndreq,ndind,tdind] : noexit :=
  ?tdreq; !ndreq;
    TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
  []
  ?ndind; !tdind;
    TPC0[tcreq,tdind,cr,cc,tccon,dr,ndind,tdreq,dt,tdatr,ndreq]
endproc

8. Conclusions

In this work, we present a solution to strengthen Bluetooth security as well as other protocols, including those for wired networks. A classical Bluetooth authentication is unidirectional and consequently is vulnerable to malicious device attacks. The idea is to propose a bidirectional authentication scheme. Game theory is useful for such modeling since it is a global framework with formal means for representing real-life problems. Thus, the authentication between two Bluetooth devices is viewed as a game. The new bidirectional authentication is modeled as a simultaneous two-player game (bi-matrix). The possible strategies for each player are defined (based on a check of some security parameters) and formulated with the utility function. Such a function assigns cost and reward values to each player depending on its strategy and its correspondent's. Then, each player's best strategy is computed (defining the Nash equilibrium). The algorithm uses the Simplex technique to calculate the players' total gains. Recall that in such conditions only one Nash equilibrium can be derived. This equilibrium corresponds to the case where both players are telling the truth. In Bluetooth security terms, the two devices have to be trusted during bidirectional authentication. In other words, the bidirectional authentication is successful if and only if both devices are authentic. To implement this protocol, two options are possible: outside the Bluetooth core protocol (in the application layer) or within the Bluetooth core protocol (in the LMP layer). In the first case, the classical Bluetooth authentication will be replaced by our bidirectional authentication. In the second case, some changes in the cryptographic function used during a classical Bluetooth authentication are necessary in order to incorporate the described model. We are finalizing some benchmarks to compare the efficiency between our



algorithm and the standard Bluetooth authentication model. Our work can be extended in different ways. For example, we can model our bidirectional authentication as an N-player game. According to such a model, an authentication process can be performed between many devices at the same time. This will be useful when piconets or scatternets are formed. In addition, we can exploit the extensive form in order to describe dynamic behavior. A player will take into account the effect of its current behavior on the other players' future behavior. This principle can forewarn trusted Bluetooth devices of possible threats and malicious devices. Also, our model can be applied to any authentication process just by adapting the utility function parameters.

References

[1] Alexoudi, M., Finlayson, E., & Griffiths, M. (2002). Security in Bluetooth.

[2] Bray, J., & Sturman, C. F. (2002). Bluetooth 1.1: connect without cables. Second Edition, Prentice Hall PTR (Eds.).

[3] Browne, R. (2000). C4i defensive infrastructure for survivability against multi-mode attacks. In Proc. 21st Century Military Communications - Architectures and Technologies for Information Superiority.

[4] Buike, D. (1999). Towards a game theory model of information warfare. Master’s Thesis, Technical report, Airforce Institute of Technology

[5] Candolin, C. (2000). Security Issues for Wearable Computing and Bluetooth Technology. Telecommu-nications Software and Multimedia Laboratory, Helsinky University of Technology, Finland.

[6] Cordeiro, C. M., Abhyankar, S., & Agrawal, D. P. (2004). An enhanced and energy efficient commu-nication architecture for Bluetooth wireless PANs. Elsevier.

[7] De Kock, A. Bluetooth security. University Of Cape Town, Department Of Computer Science, Network Security.

[8] Del Vecchio, D., & El Kadhi, N. (2004). Bluetooth Security Challenges, A tutorial. In proceedings of the 8th World Multi-Conference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA.

[9] Kitsos, P., Sklavos, N., Papadomanolakis, K., & Koufopavlou, O. (2003) Hardware Implementation of Bluetooth Security. IEEE CS and IEEE Commu-nications Society, IEEE Pervasive Computing.

[10] Muller, T. (1999). Bluetooth security architecture -Version 1.0. Bluetooth white paper.

[11] Osborne, M.-J., & Rubinstein, A. (1994). A course in game theory. Massachusetts Institute of Technology.

[12] Persson, J., & Smeets, B. (2000). Bluetooth security - An overview. Ericsson Mobile Communications AB, Ericsson Research, Information Security Technical Report, Vol 5, No. 3, pp. 32-43.

[13] Pnematicatos, G. (2004). Network and InterNetwork Security: Bluetooth Security.

[14] Syverson, P. F. (1997). A different look at secure distributed computation. In Proc. 10th IEEE Computer Security Foundations Workshop.

[15] (2003). Bluetooth: threats and security measures. Bundesamt für Sicherheit in der Informationstechnik, Local Wireless Communication Project Team, Germany.

[16] Hazem El-Gendy, “Formal Method for Automated Transformation of Lotos Specifications to Estelle Specifications”, International Journal of Software Engineering & Knowledge Engineering, USA, Vol. 15, No. 5, October 2005, pp. 1-19. 2005.

[17] Hazem El-Gendy and Nabil El Kadhi, “Testing Data Flow Aspects of Communications Protocols, Software, and Systems Specified in Lotos”, International Journal on Computing Methods in Science and Engineering, Published in Greece, 2005.

(IJCNS) International Journal of Computer and Network Security, Vol. 2, No. 10, 2010

71

Optimization of DTN routing protocols by using forwarding strategy (TSMF) and queuing drop policy (DLA)

Sulma Rashid1, Qaisar Ayub1, M. Soperi Mohd Zahid1, A. Hanan Abdullah1

1Universiti Teknologi Malaysia (UTM), Faculty of Computer Science & Information System, Department of Computer System & Communication, Skudai, Johor, 81310, Malaysia
[email protected], [email protected], [email protected], [email protected]

Abstract: Delay tolerant networks (DTNs) are wireless networks in which disconnections may occur frequently. In order to attain an acceptable delivery probability, researchers have proposed the store-carry-forward paradigm, in which a node may accumulate messages in its buffer and carry them for long periods of time until a forwarding opportunity arises. In this context, multi-copy schemes, which flood multiple copies of each message to increase the delivery probability, have become popular. This combination forces protocols to store messages for a long time in node buffers, while the limited duration of contacts restricts message forwarding and increases bandwidth overhead. Effective scheduling and drop policies are therefore extremely important: they decide in which order messages are forwarded from the queue when transmission time is limited, and which message is dropped when a new message arrives at a node whose buffer is full. In this paper we propose DLTs, a combination of the drop policy DLA and the forwarding scheduling policy TSMF, which optimizes DTN routing protocols in terms of overhead while boosting the delivery probability and buffer time average. We evaluate their efficiency and tradeoffs through simulation and show that DLTs performs better than FIFO.

Keywords: store-carry-forward networks, routing protocols, drop policies, forwarding policies.

1. Introduction
In conventional routing schemes it is necessary to establish an end-to-end path from source to destination prior to the transmission of data. Most wireless applications, however, such as sensor networks for ecological monitoring [16], ocean sensor networks [18], [17], biological sensor networks [11] and vehicular networks [19], [20], cannot rely on such schemes because their paths are highly unstable and may change or break while being discovered.

Disruption tolerant networks (DTNs) enable the transmission of data by using intermittently connected mobile nodes. DTNs [9], [6] work by using the store-carry-forward paradigm: each node in the network stores a message in its buffer, carries the message while moving, and forwards it when it encounters another node.

Due to long delays, frequent disruptions between intermittently connected nodes, and limited resources, routing has become the prime issue in DTNs [10].

Based on how messages are forwarded, routing schemes for DTNs can be divided into two major categories, single copy and multi copy [12]. In single-copy schemes only one copy of a message exists in the network and is forwarded along a single path [5]; examples are First Contact [5] and Direct Delivery [5]. In multi-copy schemes more than one copy of the same message is forwarded over multiple paths; examples are the Epidemic router [8], Spray and Wait [14], Prophet [1], MaxProp [20] and probabilistic forwarding [3].

As shown in [13], the multi-copy policy has a high impact on message delivery and robustness, at the cost of more bandwidth, energy and memory usage. However, an important issue that was not investigated in previous work is the use of efficient buffer management strategies and message forwarding policies. Recent work [1], [21], [7] has proposed a few forwarding and buffer management strategies.

In this paper we combine the buffer management strategy DLA with the forwarding queue mode TSMF to optimize the performance of DTN routing protocols in terms of delivery probability, overhead ratio and buffer time average. This technique is called DLTs.

The remainder of the paper is organized as follows. Section 2 elaborates on existing buffer and forwarding policies. Section 3 describes the routing protocols chosen for optimization, Section 4 presents the performance metrics, Section 5 describes our approach, Section 6 discusses the simulation results, and Section 7 concludes the paper.

2. Existing drop and forwarding policies
When nodes in a resource-constrained (buffer-limited) DTN communicate, congestion arises frequently. The issue is then which message should be dropped from the congested buffer so that transmission can continue.

2.1 Queuing drop policies
The following queuing drop policies determine the order in which messages are discarded when a new message is received at a node whose buffer is full.

2.1.1 Drop Random (DR)
The message to be dropped is selected at random.

2.1.2 Drop Least Recently Received (DLR)
The message with the longest stay time in the buffer is dropped. The idea is that a packet that has been in the buffer for a long time has a low probability of being passed to other nodes.


2.1.3 Drop Oldest (DOA)
The message with the shortest remaining lifetime (TTL) in the network is dropped. The idea is that if a packet's TTL is small, the packet has been in the network for a long time and thus has a high probability of having already been delivered.

2.1.4 Drop Last (DL)
The newly received message is dropped.

2.1.5 Drop Front (DF)
The message that entered the queue first is dropped first.

2.1.6 N-Drop
In N-Drop [2], the message that has undergone N forwardings is selected to drop.

2.1.7 MOFO
The message that has been forwarded the maximum number of times is dropped first [1].

2.1.8 MOPR
Each message in a node is associated with a forwarding predictability FP, initially set to 0. Whenever the message is forwarded, its FP value is updated, and the message with the highest FP value is dropped first [1].

2.1.9 SHLI
The message with the smallest TTL is selected to drop [1].

2.1.10 LEPR
"Since the node is least likely to deliver a message for which it has a low P-value, drop the message for which the node has the lowest P-value." [1]

2.1.11 Drop Largest (DLA)
In Drop Largest (DLA) [21], the largest message in the buffer is selected to drop; a sketch of these drop policies follows.
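To make the drop-policy idea concrete, the following is a minimal Python sketch, not taken from the ONE simulator: the Message record and the drop_until_fits helper are illustrative names of our own, and only a few of the policies above (DLA, DF, SHLI) are shown as selection functions.

```python
# Minimal sketch of buffer-drop policies; Message and drop_until_fits are
# illustrative, not the ONE simulator's classes.
from dataclasses import dataclass

@dataclass
class Message:
    mid: str        # message identification (Mid)
    size: int       # size of message (SM), e.g. in KB
    arrival: float  # arrival time at this node (AT)
    ttl: float      # remaining time to live

def drop_until_fits(buffer, capacity, incoming_size, policy):
    """Drop messages chosen by `policy` until the incoming message fits."""
    used = sum(m.size for m in buffer)
    while buffer and used + incoming_size > capacity:
        victim = policy(buffer)   # the policy picks which message to drop
        buffer.remove(victim)
        used -= victim.size

def dla(buf):         # DLA: drop the largest buffered message first
    return max(buf, key=lambda m: m.size)

def drop_front(buf):  # DF: drop the message that entered the queue first
    return min(buf, key=lambda m: m.arrival)

def shli(buf):        # SHLI: drop the message with the smallest remaining TTL
    return min(buf, key=lambda m: m.ttl)
```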

2.2 Forwarding policies

2.2.1 First In First Out (FIFO)
In FIFO queue mode all messages are arranged according to arrival time, and the message with the oldest arrival time is transmitted first.

2.2.2 Random queue mode (RND)
A message is selected at random for transmission.

2.2.3 GRTR “Assume A, B are nodes that meet while the destination is D, P(X, Y) denote the delivery predictability that a node X has for Destination Y. GRTR forward the message to node only if P (B-D) >P (A-D)” [1].

2.2.4 GRTRSort “GRTRSort looks at difference P (B-D) – P(A-D) values for each message between the nodes and forward the message only if P(B-D)>P(A-D).” [1]

2.2.5 GRTRMax “Select messages in descending order of P (B-D) forward the message only if P (B-D)> P (A-D).” [1]

2.2.6 TSMF
In the TSMF [7] forwarding queue, the message with the smallest size is placed at the top of the queue.
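The difference between FIFO and TSMF can be sketched as two sort keys over the same queue. The helper below, which packs messages into the available contact time, is an assumption used for illustration and reuses the Message record sketched in Section 2.1.

```python
# FIFO sends by arrival time; TSMF sends the smallest messages first.
def fifo_order(queue):
    return sorted(queue, key=lambda m: m.arrival)

def tsmf_order(queue):
    return sorted(queue, key=lambda m: m.size)

def forward(ordered_queue, available_time, tx_time):
    """Return the prefix of messages that fits into the contact, given a
    per-message transmission-time estimate tx_time(message)."""
    sent, used = [], 0.0
    for m in ordered_queue:
        if used + tx_time(m) > available_time:
            break
        sent.append(m)
        used += tx_time(m)
    return sent
```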

3. Routing protocols for optimization

3.1 Epidemic
In Epidemic routing [8], application messages are flooded to relay nodes called carriers; when a carrier node, while moving, comes into contact with another connected portion of the network, it forwards the message to that further island of nodes. This redundancy of forwarding makes delivery of the message to its destination highly likely.

3.2 Spray and Wait (binary)
Spray and Wait (binary) starts with N copies of a message. When the carrying node encounters a node that holds no copy of the message, it hands over half of its copies (N/2) and keeps the remaining half. When it is left with one copy (N = 1), it performs direct transmission only. Spray and Wait combines the speed of Epidemic routing with the simplicity of direct transmission. A sketch of this copy-splitting rule follows.
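A minimal sketch of the binary copy-splitting rule described above; the function name and the per-message copy counter are assumptions, and delivery to the destination itself (direct transmission once a single copy remains) is outside this helper.

```python
def spray(n_copies: int, peer_has_copy: bool):
    """Return (copies kept, copies handed to the encountered node)."""
    if n_copies > 1 and not peer_has_copy:
        give = n_copies // 2          # binary spraying: hand over half
        return n_copies - give, give
    return n_copies, 0                # wait phase, or the peer already has a copy

print(spray(8, False))  # (4, 4): half the copies move to the new node
```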

3.3 Direct Delivery
The source node transmits a message to another node only when that node is its destination. Direct Delivery [5] can be viewed as a limiting case of flooding in which the only path ever selected is the direct one between source and destination.

3.4 First Contact
In First Contact [5] a message is forwarded along a single path by selecting a node at random from the available connections. If no connection exists, the node waits and transmits the message to the first available contact.

3.5 Prophet router
These routing protocols perform variants of flooding. Epidemic [8] replicates messages to all encountered peers, while Prophet [3] tries to estimate which node has the highest "likelihood" of being able to deliver a message to the final destination, based on node encounter history.


4. Performance oriented metrics

4.1 Delivery probability
It is the ratio of messages delivered to their destinations to those generated by the sources. A high probability means that more messages are delivered to their destinations.


4.2 Overhead ratio
It is the number of relayed messages, less the number of messages delivered, divided by the number of messages delivered. A low overhead value means that less processing is required to deliver the relayed messages.

4.3 Buffer time average
It is the sum of the time spent by all messages in buffers divided by the number of messages delivered.
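The three metrics can be computed directly from simulation counters. The sketch below uses illustrative counter names (created, delivered, relayed, buffer_times) rather than the ONE simulator's report fields, and the overhead formula follows the relayed-minus-delivered convention assumed in Section 4.2.

```python
def delivery_probability(delivered: int, created: int) -> float:
    # Messages delivered to their destination per message generated.
    return delivered / created

def overhead_ratio(relayed: int, delivered: int) -> float:
    # Relay transmissions that did not end in a delivery, per delivered message.
    return (relayed - delivered) / delivered

def buffer_time_average(buffer_times, delivered: int) -> float:
    # Total time messages spent in node buffers per delivered message.
    return sum(buffer_times) / delivered

# Hypothetical run: 400 messages created, 260 delivered, 1900 relay events.
print(delivery_probability(260, 400), overhead_ratio(1900, 260))
```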

5. Approach DLTs
Consider a scenario of two nodes A and B, each with a buffer size of 1000 KB. Mid is the message identification, which must be unique for each message, SM is the size of the message, AT represents the arrival time, TT is the transmission time of a message, and ATT is the available transmission time.

Figure 2. FIFO-FIFO queue mode

Assume B wants to transmit a message to node A; according to the forwarding queuing policy (FIFO), M810 will be selected. The buffer at node A is congested, so according to the drop policy (FIFO), messages (M110, M563, M120, M111, M150) at node A will be dropped until space becomes available for M810. With the available ATT, only M810, with a TT of 5 s, will be transmitted.

Figure 3. DLTs (Drop Largest-Transmit smallest)

Figure 3 depicts the DLTs mechanism. As the buffer at node A is congested, DLTs drops the largest message, which is M190. When B starts transmission, it first sorts the forwarding queue so that the smallest messages move to the top. With the available transmission time, M101, M920, M126, M137 and M115 will be transmitted, which increases the delivery probability.
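Putting the two policies together, the following sketch replays a scenario in the spirit of Figures 2 and 3, reusing the Message record and helpers sketched in Section 2; the concrete sizes, times and the 50 KB/s transmission rate are invented for illustration and are not read from the figures.

```python
buffer_a = [Message("M190", 400, 10.0, 300.0),
            Message("M110", 120, 2.0, 200.0),
            Message("M563", 150, 4.0, 250.0)]

queue_b = [Message("M810", 300, 1.0, 280.0),
           Message("M101", 40, 3.0, 260.0),
           Message("M126", 60, 5.0, 240.0)]

# DLTs at the receiver: free space for an incoming 100 KB message by
# dropping the largest buffered message first (here M190).
drop_until_fits(buffer_a, capacity=600, incoming_size=100, policy=dla)

# DLTs at the sender: within a 5 s contact, transmit the smallest
# messages first (1 s per 50 KB assumed).
sent = forward(tsmf_order(queue_b), available_time=5.0,
               tx_time=lambda m: m.size / 50.0)
print([m.mid for m in sent])   # ['M101', 'M126'] - small messages go out first
```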

6. Simulation and results
In this section we examine the routing protocols of Section 3 with the existing (FIFO-FIFO) policies and the proposed DLTs.

All the experiments were carried out using the ONE simulator, a discrete-event simulator written in Java. The simulator is designed to model DTN (store-carry-forward) message handling over long time spans, where the likelihood of disconnections and failures is high.

Figure 4. Delivery probability for FIFO-FIFO and DLTs

Figure 4 compares DLTs with the FIFO queue and forwarding mode with respect to delivery probability. Epidemic, Spray and Wait and Prophet routers show a high delivery probability because the frequency of node encounters is high, resulting in more congestion, where transmitting the smallest messages first raises the number of delivered messages. FC and DD are single-copy cases; the chance of congestion is lower than in multi-copy schemes, so they deliver fewer messages than the other routers, but DLTs still increases their delivery compared with the existing queue policy. Moreover, the DD router passes messages only to nodes that are their destination. In all router configurations, the message delivery probability of DLTs is better than that of FIFO.

Table 1: Simulation setup

Number of nodes: 126
Movement model 1: Random Waypoint
Movement model 2: Map route movement
Number of groups: 6
Buffer size: 5 MB
Transmission range: 10 m
Transmission speed: 250 K
Message creation interval: 25-35 seconds


Figure 5. Buffer time average FIFO-FIFO and DLTS

Figure 5 shows the buffer time averages with DLTs and FIFO. It can be clearly seen that DLTs has a higher buffer time average with all routers. Buffer occupancy time is a very scarce resource in DTNs, where the store-and-carry paradigm is used. As expected, in this architecture a buffer should retain a message for as long as it can, so that the delivery of messages increases and the drop ratio decreases. DLTs improves the buffer time occupancy for all routers, both multi-copy and single-copy, and increases the delivery ratio, as shown in Figure 4.

Figure 6. Overhead ratio FIFO-FIFO and DLTs

Figure 6 shows the impact of DLTs and FIFO with respect to overhead ratio. The overhead ratio with DLTs decreases for all routers, irrespective of whether they use a multi-copy or single-copy approach. For DD the overhead is zero because of direct transmission, so we excluded that case for both algorithms, while for Epidemic, Spray and Wait, Prophet and FC the overhead is reduced to a considerable degree.

7. Conclusion and future work
In this paper we proposed DLTs, a combination of the drop policy DLA and the forwarding scheduling policy TSMF, which optimizes DTN routing protocols in terms of overhead while boosting the delivery probability and buffer time average. We evaluated their efficiency and tradeoffs through simulation and showed that DLTs performs better than FIFO.

The results presented in this paper can be used as a starting point for further studies in this research field, and give helpful guidelines for future DTN protocol design.

References

[1] A. Lindgren and K. S. Phanse, "Evaluation of queuing policies and forwarding strategies for routing in intermittently connected networks," in Proc. of IEEE COMSWARE, pp. 1-10, Jan. 2006.

[2] Y. Li, L. Zhao, Z. Liu, and Q. Liu, "N-Drop congestion control strategy under epidemic routing in DTN," Research Center for Wireless Information Networks, Chongqing University of Posts & Telecommunications, Chongqing 400065, China, pp. 457-460, 2009.

[3] A. Lindgren, A. Doria, and O. Schelen, "Probabilistic routing in intermittently connected networks," SIGMOBILE Mobile Computing and Communication Review, vol. 7, no. 3, pp. 19-20, 2003.

[4] A. Keränen and J. Ott, "Increasing reality for DTN protocol simulations," Tech. Rep., Helsinki University of Technology, Networking Laboratory, July 2007.

[5] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, "Single-copy routing in intermittently connected mobile networks," IEEE/ACM Transactions on Networking (TON), vol. 16, pp. 63-76, Feb. 2008.

[6] T. Small and Z. Haas, "The shared wireless infostation model - a new ad hoc networking paradigm (or where there is a whale, there is a way)," in Proceedings of the Fourth ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc 2003), pp. 233-244, June 2003.

[7] Q. Ayub, S. Rashid, and M. S. Mohd Zahid, "Optimization of Epidemic router by new forwarding queue mode TSMF," International Journal of Computer Applications, vol. 7, no. 11, pp. 5-8, October 2010. Published by the Foundation of Computer Science.

[8] A. Vahdat and D. Becker, "Epidemic routing for partially-connected ad hoc networks," Technical Report CS-2000-06, Duke University, July 2000.

[9] J. Scott, P. Hui, J. Crowcroft, and C. Diot, "Haggle: networking architecture designed around mobile users," in Proceedings of IFIP WONS, 2006.

[10] K. Fall, "A delay-tolerant network architecture for challenged internets," in Proc. of ACM SIGCOMM, 2003.

[11] Z. J. Haas and T. Small, "A new networking model for biological applications of ad hoc sensor networks," IEEE/ACM Transactions on Networking, vol. 14, no. 1, pp. 27-40, 2006.

[12] T. Spyropoulos, K. Psounis, and C. Raghavendra, "Efficient routing in intermittently connected mobile networks: the multi-copy case," IEEE/ACM Transactions on Networking, 2007.

[13] T. Small and Z. J. Haas, "Resource and performance tradeoffs in delay-tolerant wireless networks," in SIGCOMM Workshop on Delay Tolerant Networking (WDTN), 2005.

[14] T. Spyropoulos, K. Psounis, and C. S. Raghavendra, "Spray and wait: an efficient routing scheme for intermittently connected mobile networks," in SIGCOMM Workshop on Delay Tolerant Networking (WDTN), 2005.

[15] J. Scott, P. Hui, J. Crowcroft, and C. Diot, "Haggle: a networking architecture designed around mobile users," in Proc. IFIP Conf. Wireless On-Demand Network Systems and Services (WONS), 2006.


[16] P. Zhang, C. M. Sadler, S. A. Lyon, and M. Martonosi, "Hardware design experiences in ZebraNet," in Proc. ACM SenSys, pp. 227-238, Nov. 2004.

[17] Maffei, K. Fall, and D. Chayes, "Ocean Instrument Internet," in Proc. AGU Ocean Sciences Conf., Feb. 2006.

[18] J. Partan, J. Kurose, and B. N. Levine, "A survey of practical issues in underwater networks," in Proc. ACM WUWNet, pp. 17-24, Sept. 2006.

[19] J. Ott and D. Kutscher, "A disconnection-tolerant transport for drive-thru Internet environments," in Proc. IEEE INFOCOM, pp. 1849-1862, Mar. 2005.

[20] J. Burgess, B. Gallagher, D. Jensen, and B. N. Levine, "MaxProp: routing for vehicle-based disruption-tolerant networks," in Proc. IEEE INFOCOM, April 2006.

[21] S. Rashid and Q. Ayub, "Effective buffer management policy DLA for DTN routing protocols under congestion," International Journal of Computer and Network Security, vol. 2, no. 9, pp. 118-121, Sep. 2010.


An Empirical Study to Investigate the Effectiveness of Masquerade Detection

Jung Y. Kim1, Charlie Y. Shim2 and Daniel McDonald3

1Utica College, Computer Science Department,

1600 Burrstone Road, Utica, NY, 13492 [email protected]

2Kutztown University of Pennsylvania, Computer Science Department,

PO Box 730, Kutztown, PA, 19530 [email protected]

3Kutztown University of Pennsylvania, Computer Science Department,

PO Box 730, Kutztown, PA, 19530 [email protected]

Abstract: Masquerade detection is an important research area in computer security. A masquerade attack can be identified when an audited user's patterns significantly deviate from his or her normal profile. One of the popular approaches in masquerade detection is to use Support Vector Machines (SVMs). The main goal is to maximize detection rates while minimizing the number of false alarms. In this paper, we explore various aspects of masquerade detection using SVMs to determine how the overall effectiveness of the system can be enhanced. Setting a proper threshold level has a greater influence on the false alarm rate than on the detection rate. In addition, we have found that the classifier that takes the order of commands within an instance into account outperformed the other type when the instance length is not overly long.

Keywords: SVM (Support Vector Machine), masquerade detection, detection rate, false alarm rate.

1. Introduction
The main purpose of a masquerade detection framework is to identify masquerade attempts before serious loss or damage occurs. Masquerade attacks are difficult to detect since masqueraders enter the system as valid users and thus are not affected by existing access control schemes [1]. Masquerade detection can be designed as a class of anomaly detection, in that a test instance is declared anomalous if it does not fall within the boundary of normal behavior [2]. Note that the behavior of masqueraders is unusual and thus deviates from that of legitimate users. The goal of anomaly detection is to maximize detection rates while minimizing false alarm rates. Various approaches have been tried for masquerade detection, and one of the most recent is to use the Support Vector Machine (SVM), because the SVM has achieved excellent classification performance in a wide range of applications such as text, images, and computer security [3], [4], [5]. However, the focus has usually been placed on demonstrating the superiority of the proposed method over other approaches.

The important topic that has been overlooked is how to maximize the effectiveness of SVM-based masquerade detection. Factors such as the relationship between the type of classifier employed and the ideal length of instances can be examined further. Unfortunately, few studies on SVM-based masquerade detection have discussed these issues. Changes in these parameters directly affect the overall effectiveness of masquerade detection, and this is what we studied in this research. The rest of the paper is organized as follows. Section 2 surveys previous work related to the topic under investigation. In Section 3 we present our empirical study and illustrate the effect of adjusting different parameters in detail. Section 4 summarizes our work and concludes with our findings.

2. Related Work
Masquerade detection is an important field of study, and various approaches such as Naïve Bayes classifiers [6], [7] and Support Vector Machines [8], [9], [10] have been attempted. Applying a Naïve Bayes classifier is simple and effective. A drawback of this classifier, however, is that new, "unseen" characteristics are more likely to be considered a legitimate user's patterns, which allows a masquerader to elude detection [11].

Wang and Stolfo found that employing the SVM in masquerade detection performed better than a Naïve Bayes classifier, in that it showed higher detection rates [6]. As an attempt to increase the efficiency of the system, they used "truncated" UNIX commands and a large command set. They used a one-class training algorithm to detect masquerade attacks and asserted that increasing the detection threshold might allow a higher detection rate [6]. However, even though higher detection accuracy could be achieved, their system left the problem of false alarm rates escalating simultaneously. Therefore, the idea of combining the output of the system with other sensors was suggested to reduce the number of false alarms.

Maxion applied "enriched" UNIX commands – commands with their corresponding arguments – to a Naive Bayes classifier [11]. Higher detection rates were achieved with minimally increased false alarm rates. Moreover, irregularly used arguments of enriched commands could be


identified [11]. However, the problem of proper threshold setting still remained. Another study showed that the composition of two kernel methods improved detection accuracy while slightly lowering the false alarm rate [8].

3. Empirical Study and Experimental Results
As we surveyed in the previous section, the SVM has been popularly employed in masquerade detection. Nevertheless, these studies mainly focused on demonstrating the superiority of the proposed model when compared to other approaches. The main purpose of our research is to provide a guideline for modeling an ideal set of features when utilizing the SVM, so that the effectiveness of masquerade detection can be maximized. Our study analyzes the performance of masquerade detection with respect to three parameters: threshold levels, the type of classifier, and the length of instances. Section 3.1 describes our experimental design and overall test results.

3.1 Dataset and Experimental Design
We used the most popular dataset, provided by Schonlau et al., for our experiments. This dataset is called the SEA data and it includes 15,000 UNIX commands for each of 50 users [7]. We believed that the sequence of UNIX commands was a good identifier for determining the identity of each user; this approach has been widely used by many researchers [8], [9], [11]. The sequence of commands was parsed and partitioned to generate meaningful subgroups which were fed to the SVM. That is, each user's command history in the dataset was divided into multiple files which were broken down into two distinct categories: training data and test data. Commands were first taken from the dataset to compile a 500-line file for training the appropriate SVM, which generated a profile for each user. Next, multiple files were generated for each sequence length, 4 to 13, for the purpose of identifying the effect of sequence length. Each user's profile was then trained on the appropriate SVM, and the profile was used to classify each test file for each user. For each user, 500 tests were conducted.

We analyzed detection rates by classifying a user's profile against other users' test files. Comparing a user's test data against his or her own normal profile generated false alarms. Data was then collected to determine the average detection rate and false alarm rate for each user in terms of different instance lengths. This data was further broken down by three threshold values: 35%, 50%, and 70%, and then averaged to determine the average detection rate and false alarm rate for each sequence length. In this way, we could determine the relationship between the threshold level and the performance of masquerade detection.
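For concreteness, the sketch below shows one way such a partitioning can be done; the file layout (one plain-text command per line per user) and the helper names are assumptions, not the exact pre-processing used in our experiments.

```python
def load_commands(path: str):
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]

def make_instances(commands, length: int):
    """Cut a command stream into consecutive, non-overlapping instances."""
    return [commands[i:i + length]
            for i in range(0, len(commands) - length + 1, length)]

# Hypothetical layout: one file of 15,000 commands per SEA user.
commands = load_commands("sea/user01.txt")
train, test = commands[:500], commands[500:]
instances = {n: make_instances(test, n) for n in range(4, 14)}  # lengths 4..13
```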

Different types of classifiers were used for the SVMs, and we divided them into two distinct groups: ordered and unordered. The order of the command sequence is considered in ordered classifiers, whereas it is not taken into account in unordered classifiers. LIBSVM (A Library for Support Vector Machines) is an integrated implementation of support vector classification [12] and was selected for unordered classification. SVMHMM (Support Vector Machines Hidden Markov Model) is an implementation of SVMs for sequence tagging [13] and was selected for ordered classification.
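As an illustration of the unordered case only, the sketch below builds a bag-of-commands profile with a one-class SVM; it uses scikit-learn rather than the LIBSVM and SVMHMM packages employed in our experiments, and the feature construction, kernel and nu value are assumptions.

```python
from collections import Counter

import numpy as np
from sklearn.svm import OneClassSVM

def bag_of_commands(window, vocabulary):
    """Map a window of UNIX commands to an order-free frequency vector."""
    counts = Counter(window)
    return np.array([counts.get(cmd, 0) for cmd in vocabulary], dtype=float)

# Hypothetical training stream for one user, split into length-6 windows.
train_commands = ["ls", "cd", "vi", "make", "gcc", "ls"] * 100
window_len = 6
vocabulary = sorted(set(train_commands))
windows = [train_commands[i:i + window_len]
           for i in range(0, len(train_commands), window_len)]
X_train = np.vstack([bag_of_commands(w, vocabulary) for w in windows])

profile = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1).fit(X_train)

# predict() returns -1 for outliers, i.e. windows the profile rejects.
test_window = ["rm", "rm", "scp", "scp", "wget", "chmod"]
flagged = profile.predict(
    bag_of_commands(test_window, vocabulary).reshape(1, -1))[0] == -1
print(flagged)
```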

The overall results of the tests we conducted are presented in Figure 1 and Figure 2. Figure 1 shows detection rates when different threshold (TH) values, 35%, 50%, and 70%, were applied to each SVM, and Figure 2 shows the corresponding false alarm rates. Note that both detection rates and false alarm rates increase as the instance length gets longer. A detailed analysis of our findings is given in Sections 3.2-3.4.

Figure 1. Comparison of detection rates

Figure 2. Comparison of false alarm rates

3.2 Threshold values
The threshold value is the selected minimum matching percentage used to decide whether the audited behavior is classified as a masquerade attack. Determining an appropriate threshold level directly affects the performance of the system: in general, an increase in the threshold value causes an increase in both detection rates and false alarms. Figure 3 and Figure 4 show the average detection rates and false alarm rates when three threshold values, 35%, 50%, and 70%, were applied to each SVM.
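The decision rule and the two rates can be written down compactly. The sketch below is an assumed formalization (the function names and the way matches are counted are ours), not the exact scoring code used in the experiments.

```python
def is_masquerade(match_flags, threshold: float) -> bool:
    """Flag a test block when the fraction of instances accepted by the
    user's profile falls below the chosen threshold (e.g. 0.35, 0.5, 0.7)."""
    matching_fraction = sum(match_flags) / len(match_flags)
    return matching_fraction < threshold

def rates(decisions, labels):
    """Return (detection rate, false alarm rate); labels[i] is True when
    test block i really is a masquerade."""
    hits = sum(d and l for d, l in zip(decisions, labels))
    false_alarms = sum(d and not l for d, l in zip(decisions, labels))
    attacks = sum(labels)
    return hits / attacks, false_alarms / (len(labels) - attacks)
```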

Our testing showed that threshold values had a profound effect on detection and false alarm rates. Increasing the


threshold value increased both detections and false alarms. Although the lowest threshold, 35%, had the lowest detection rates, it produced minimal false alarms in testing (see Figure 3 and Figure 4). Much higher detection rates were seen at a threshold of 50% than at 35%. While detection rates as high as 93.3% (SVMHMM) and 96.3% (LIBSVM) were achieved at a threshold level of 70%, this came at the cost of high false alarm rates, 83.5% (SVMHMM) and 89.1% (LIBSVM) respectively. Thus, a threshold of 70% or higher is seen as impractical due to the excessively high false alarm rates.

Figure 3. Analysis of thresholds (SVMHMM)

Figure 4. Analysis of thresholds (LIBSVM)

One important finding is that the number of false alarms increased at a much faster rate than detections as the threshold value was increased: an average 67% increase in the false alarm rate was found, whereas there was only a 23.2% increase in the detection rate when the threshold value was raised to 70%. Thus, an appropriate threshold level needs to be selected such that reasonable detection rates and tolerable false alarm rates can be achieved.

3.3 The type of classifiers
The SVMHMM (ordered classifier) outperformed the LIBSVM (unordered classifier) in minimizing false alarms when the instance lengths were not overly long. There was a significant difference in the false alarm rates between the two SVMs when the smallest instance length was used. The greatest difference between the two classifiers was seen at a 50% threshold with the two smallest sequence lengths (see Figure 2): false alarm rates differed by 26 points at sequence length four and by 18 points at length five.

Note that the SVMHMM outperformed the LIBSVM in most cases with respect to detection rates when the instance lengths were less than 10 (see Figure 1). However, as the instance length was increased, the performance of both SVMs converged at length 10. The performance degradation of the SVMHMM appears to be caused by increasing specificity as the instance length grows too long: it becomes less likely that users always enter a long series of commands in exactly the same pattern.

The performance of the LIBSVM, however, turned out to be less dependent on the instance length, as shown in Figure 1 and Figure 2. The reason for this is that the specific order of commands entered by users is not considered by the LIBSVM, so there is no significant change in performance as the instance length varies.

3.4 Instance lengths
In order to determine the effect of applying different instance lengths, we classified the employed instance lengths into three groups: short (lengths 4-6), medium (lengths 7-9), and long (lengths 10-13). Testing results were averaged and redrawn using these groups; they are presented in Figures 5, 6, 7, and 8.

Figure 5. Analysis of detection rates (SVMHMM)

Figure 6. Analysis of false alarm rates (SVMHMM)

When the SVMHMM was used, increasing the instance length was shown to increase both detections and false alarms (see Figure 5 and Figure 6). Thus, detection rates can be maximized by using larger instance lengths whereas


a smaller instance length is desirable in order to maintain lower false alarm rates.

As we previously mentioned in section 3.3, the performance of the LIBSVM is less affected by the instance length (see Figure 7 and Figure 8). Note that there was a slight benefit in the detection rate as the instance lengths increased under the 35% threshold setting.

Figure 7. Analysis of detection rates (LIBSVM)

Figure 8. Analysis of false alarm rates (LIBSVM)

However, as the instance length increased, results using different classifiers (SVMHMM and LIBSVM) began to converge (see Figure 5~8). Results showed that both classifiers converge to the same detection and false alarm rates when a long instance group was used.

Figure 9. Analysis of Instance Length Change

Therefore, we can assert that there is no significant benefit in employing a longer instance; however, it is still

attractive to use a medium length instance; note that there was a 21.86% increase in the detection rate (see Figure 9) when the instance was lengthened from short to medium.

4. Conclusion
There have been many approaches to tackling masquerade attacks. However, these studies primarily focused on demonstrating the advantage of the proposed model when compared to other approaches. The main goal of our research is to investigate the effectiveness of masquerade detection using SVMs. We analyzed the performance of masquerade detection with respect to three parameters: threshold levels, the type of classifier, and the length of instances.

In conclusion, none of the parameters that were selected and tested was able to improve detection rates while decreasing false alarms. In all tests, increased detection rates correlate with increased false alarm rates. However, masquerade detection using sequence classification was more successful at limiting false alarms when smaller instance lengths were used. Increasing threshold values to 70% showed little benefit, since false alarm rates increased significantly with only a slight increase in detection rates. This study shows that there is an advantage to using smaller instance lengths with a classifier that considers order, as an effort to minimize false alarm rates. If maximizing detection capability is the main goal, the type of classifier is less relevant; instead, it is desirable to use a longer instance at a threshold level at which reasonable limits on false alarms can be retained.

Finally, a new dataset, if available, could be used to support and reinforce the validity of our findings. This research helps provide a principle for modeling an ideal set of rules so that the effectiveness of masquerade detection can be maximized.

References
[1] B. Szymanski and Y. Zhang, "Recursive Data Mining for Masquerade Detection and Author Identification," Proceedings of the Fifth Annual IEEE SMC, pp. 424-431, 2004.

[2] V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Computing Surveys (CSUR), Volume 41, Issue 3, pp. 15:1-15:58, 2009.

[3] Z. Liu, J. Liu, and Z. Chen, “A generalized Gilbert's algorithm for approximating general SVM classifiers,” Neurocomputing, Volume 73 , Issue 1-3, pp. 219-224, 2009.

[4] J. Wu, Z. Lin, and M. Lu, “Asymmetric semi-supervised boosting for SVM active learning in CBIR,” Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 182-188, 2010

[5] T. Joachims, “Text categorization with support vector machines: Learning with many relevant features,” Proceedings of the European Conference on Machine Learning (ECML), pp. 137-142, 1998.

[6] K. Wang, and S. Stolfo, “One-Class Training for Masquerade Detection,” Proceedings of the ICDM Workshop on Data Mining for Computer Security (DMSEC), Melbourne, pp. 2-7, 2003.

[7] M. Schonlau, W. DuMouchel, W. Ju, A. Karr, M. Theus, and Y. Vardi, “Computer Intrusion: Detecting Masquerades,” Statistical Science Vol.16, No.1, pp. 58–74, 2001.


[8] J. Seo, and S. Cha, “Masquerade detection based on SVM and sequence-based user commands profile,” Proceedings of the 2nd ACM symposium on Information, computer and communications security (ASIACCS '07), 2007.

[9] H. Kim, and S. Cha, “Empirical evaluation of SVM-based masquerade detection using UNIX commands,” Computers & Security Vol. 24, No. 2, pp 160-168, 2005.

[10] S. Mukkamala, and A. Sung, “Feature Ranking and Selection for Intrusion Detection Systems Using Support Vector Machines,” Proceedings of the Second Digital Forensic Research Workshop (DFRWS), Syracuse, 2002.

[11] R. Maxion, "Masquerade Detection Using Enriched Command Lines," Proceedings of the International Conference on Dependable Systems & Networks, pp. 22-25, 2003.

[12] C. Chang, and C. Lin, “LIBSVM -- A Library for Support Vector Machines,” [Online]. Available: http://www.csie.ntu.edu.tw/~cjlin/libsvm. [Accessed: Jun. 14, 2010].

[13] T. Joachims, “SVMHMM - Sequence Tagging with Structural Support Vector Machines,” August 14, 2008, [Online]. Available: http://www.cs.cornell.edu/People/tj/svm_light/svm_hmm.html. [Accessed: Jul. 26, 2010].


An Efficient Intrusion Detection System for Mobile Ad Hoc Networks

B.V. Ram Naresh Yadav1, B.Satyanarayana2, O.B.V.Ramanaiah3

1Dept. of CSE, JNTUHCEJ, Karimnagar, Andhra Pradesh, India.

[email protected]

2Dept. of CST, S.K.University, Anantapur, Andhra Pradesh, India. [email protected]

3Dept. of CSE, JNTUH College of Engineering, Hyderabad, Andhra Pradesh, India.

[email protected]

Abstract: A mobile ad hoc network is a collection of nodes connected through a wireless medium, forming rapidly changing topologies. Mobile ad hoc networks are vulnerable due to their fundamental characteristics, such as open medium, dynamic topology, distributed cooperation and constrained capability. A real-time intrusion detection architecture for ad hoc networks has previously been proposed for detecting black hole and packet dropping attacks. The main problem with that approach is that the detection process relies on a state-based misuse detection system in which every node needs to run the IDS agent, and it does not make use of a distributed architecture to detect attacks that require more than one hop of information. In this paper we propose an Efficient IDS (EIDS), a novel architecture that uses specification-based intrusion detection techniques to detect active attacks such as packet dropping and black hole attacks against the AODV protocol. Our architecture involves the use of finite state machines (FSMs) for specifying AODV routing behavior and distributed network monitors for detecting the attacks. Our method can detect most of the bad nodes with a low false positive rate, and the packet delivery ratio can be increased while maintaining a high detection rate. The EIDS architecture does not introduce any changes to the underlying routing protocol, since it operates as an intermediate component between the network traffic and the protocol with minimal processing overhead. We have developed a prototype that was evaluated in AODV-enabled networks using the network simulator (ns-2).

Keywords: MANETs, types of attacks, AODV, IDS.

1. Introduction
Mobile ad hoc networks are one of the recent active research fields and have received considerable attention because of their self-configuration and self-maintenance. Early research assumed a friendly and cooperative wireless environment. In order to maintain connectivity in a mobile ad hoc network, all participating nodes have to perform routing of network traffic. Therefore, a network layer protocol designed for such self-organized networks must enforce connectivity and security requirements in order to guarantee the undisrupted operation of higher layer protocols. Unfortunately, the widely used ad hoc routing protocols have no security considerations and trust all participants to correctly forward routing and data traffic. The routing protocol sets the upper limit to security in any packet network: if routing can be misdirected or modified, the entire network can be paralyzed [2]. Several efforts have been made to design a secure routing protocol for ad hoc networks. The main problems with that approach are that it requires changes to the underlying protocol and that manual configuration of the initial security associations cannot be completely avoided. The Efficient Intrusion Detection System for mobile ad hoc networks is based on previous research proposed to detect active attacks against AODV, a routing protocol that is widely used in wireless networks [1]. We have adopted the successful approach of employing distributed network monitors for detecting attacks in real time and have applied it to the domain of ad hoc routing. EIDS can be characterized as an architecture model for intrusion detection in ad hoc networks, while its implementation specifically targets AODV [9]. We describe our system as an architecture model since it does not perform any changes to the underlying routing protocol but merely intercepts traffic and acts upon recognized patterns. In the remainder of this paper we start by briefly presenting related work in Section 2. In Section 3 we describe the AODV routing protocol and the threat model associated with it. In Section 4 we describe in detail our proposed architecture and the design of EIDS for AODV-based networks. In Section 5 we evaluate our prototype, which has been implemented using the ns-2 simulator. Section 6 concludes by describing the strengths and shortcomings of our proposal and identifying directions for future work.


2. Related Work
A specification-based intrusion detection system has been used to detect attacks on AODV. This approach involves a finite state machine for specifying correct AODV routing behavior and distributed network monitors for detecting runtime violations of the specifications [3]. Specification-based systems are particularly attractive as they successfully detect both local and distributed attacks against the AODV routing protocol with a low number of false positives. A real-time intrusion detection model for ad hoc networks has also been developed specifically for AODV [2]. The model is composed of four main layers: a traffic interception module, an event generation module, an attack analysis module, and a countermeasure module. The traffic interception module captures incoming traffic from the network and selects which packets should be processed further. The event generation module is responsible for abstracting the essential information required by the attack analysis module to determine whether there is malicious activity in the network. The event generation and attack analysis modules are implemented using timed finite state machines (TFSMs). The final component of the architecture is the countermeasure module, which is responsible for taking appropriate actions to keep network performance within acceptable limits. The results of that research clearly demonstrate that the approach can detect active attacks in real time. In our Efficient Intrusion Detection System for mobile ad hoc networks we use this work as a basis and apply the developed concepts to the ad hoc networking environment, and more specifically to the AODV routing protocol. The watchdog and path rater scheme suggests two extensions to the DSR ad hoc routing protocol that attempt to detect and mitigate the effects of nodes that do not forward packets although they have agreed to do so [7]. The watchdog extension is responsible for monitoring, by listening in promiscuous mode, that the next node in the path forwards data packets; the path rater uses the results of the watchdog and selects the most reliable path for packet delivery. As the authors of the scheme have identified, the main problem with this approach is its vulnerability to blackmail attacks. The intrusion detection and response model proposes a solution to attacks that originate from a node internal to the ad hoc network where the underlying protocol is AODV [8]. The intrusion detection model claims to capture attacks such as distributed false route requests, denial of service, compromised destination, impersonation, and routing information disclosure. The intrusion response model is a counter that is incremented whenever a malicious activity is encountered; when the value reaches a predefined threshold, the malicious node is isolated. The authors have provided statistics for the accuracy of the model. A cooperative distributed intrusion detection system (IDS) has been proposed by Zhang and Lee [10]. This method employs cooperative statistical anomaly detection techniques. Each intrusion detection agent runs

independently and detects intrusions from local traces. Only one hop of information is maintained at each node for each route. If local evidence is inconclusive, the neighboring IDS agents cooperate to perform global intrusion detection. The authors utilize misuse detection techniques to reduce the number of false positives. A context-aware detection of selfish nodes utilizes hash chains in the route discovery phase of DSR, together with destination-keyed hash chains and the promiscuous mode of the link layer, to observe malicious acts of neighboring nodes [11]. This approach instils in malicious nodes a fear-based awareness that their actions are being watched and rated, which helps reduce mischief in the system. A potential problem of this system is node mobility: since a malicious node can go out of range and re-enter the network with a different IP address, it can still take advantage of the network. Because this method uses cryptographic mechanisms to detect malicious attacks, it cannot be classified as a pure intrusion detection system. Finally, a specification-based intrusion detection system for AODV [3] involves the use of finite state machines for specifying correct AODV routing behavior and distributed network monitors for detecting runtime violations of the specifications; an additional field in the protocol message is proposed to enable the monitoring.

3. AODV Security Problems
In this section we present an overview of the AODV ad hoc routing protocol and the threat model associated with it.

3.1 AODV overview
AODV can be thought of as a combination of DSR and DSDV [9]. It borrows the basic on-demand mechanism of route discovery and route maintenance from DSR and the use of hop-by-hop routing and sequence numbers from DSDV. AODV is an on-demand routing protocol, which initiates a route discovery process only when desired by a source node. When a source node S wants to send data packets to a destination node D but cannot find a route in its routing table, it broadcasts a Route Request (RREQ) message to its neighbors, including the last known sequence number for that destination. The neighbors of the node then rebroadcast the RREQ message to their neighbors if they do not have a fresh route to the destination node. This process continues until the RREQ message reaches the destination node or an intermediate node that has a fresh enough route. AODV uses sequence numbers to guarantee that all routes are loop-free and contain the most recent routing information [9]. An intermediate node that receives a RREQ replies to it using a Route Reply (RREP) message only if it has a route to the destination whose corresponding destination sequence number is greater than or equal to the one contained in the RREQ; otherwise, the intermediate node rebroadcasts the RREQ packet to its neighbors until it reaches the destination. The destination unicasts a RREP back to the node that initiated route discovery by transmitting it to


the neighbor from which it received the RREQ. As the RREP is propagated back to the source, all intermediate nodes set up forward route entries in their tables. The route maintenance process utilizes link layer notifications, which are intercepted by the neighbors of the node that caused the error. These nodes generate and forward Route Error (RERR) messages to the neighbors that have been using routes that include the broken link. In general, a node may update the sequence numbers in its routing table whenever it receives RREQ, RREP, RERR and RREP-ACK messages from its neighbors.

3.2 AODV threat model
In this section the most important attacks that can easily be performed by an internal node against AODV are presented [2, 12].

3.2.1 Sequence number (black hole) attack
This is a type of routing attack in which a malicious node advertises itself as having the shortest path to all the nodes in the environment by sending fake route replies. By doing this, the malicious node can draw traffic away from the source node. It can be used as a DoS attack, where the malicious node later drops the packets. The setup for a black hole attack is similar to a routing loop attack, in which the attacker sends out forged routing packets. The attacker can set up a route to some destination via itself, and when the actual data packets arrive they are simply dropped, forming a black hole where data enters but never leaves.

3.2.2 Packet dropping attack
It is essential in an ad hoc network that all nodes participate in the routing process. However, a node may act selfishly and process only routing information related to itself in order to conserve energy. This behavior can create network instability or even segment the network.

3.2.3 Resource consumption attack
In this attack, the malicious node attempts to consume both network and node resources by generating and sending frequent, unnecessary routing traffic. The goal of this attack is to flood the network with false routing packets, consuming all available network bandwidth with irrelevant traffic and draining the energy and processing power of the participating nodes. Several other similar attacks are presented in the literature [4, 5, 6]; they exploit more or less the same routing protocol vulnerabilities to achieve their goals. The sequence number attack is specific to AODV, while the other two can be applied to any routing protocol.

4. Efficient Intrusion Detection System Architecture

The EIDS architecture uses finite state machines for specifying AODV routing behavior and distributed network monitors, which enable the system to detect attacks in real time rather than relying on statistical analysis of captured traffic. EIDS detects attacks against the AODV routing protocol in wireless mobile ad hoc networks. The architecture of EIDS is shown in Figure 1.

Figure 1. Architecture of the Efficient Intrusion Detection System

EIDS is used to detect both local and distributed attacks against the AODV routing protocol with a low number of false positives. It uses network monitors to trace RREQ and RREP messages in a request-reply flow across the distributed network. A network monitor employs an FSM for detecting incorrect RREQ and RREP messages. Figure 2 shows the architecture of a network monitor.

Figure 2. Architecture of a Network Monitor

Network monitors detect incorrect RREQ and RREP messages by listening passively to the AODV routing messages. A request-reply flow can be uniquely identified by the RREQ ID and the source and destination IP addresses, and messages are grouped based on the request-reply flow to which they belong. A network monitor employs a finite state machine (FSM) for detecting incorrect RREQ and RREP messages and maintains an FSM for each branch of a request-reply flow. A request flow starts in the Source state. It transitions to the RREQ Forwarding state when a source node broadcasts the first RREQ message (with a new RREQ ID). When a forwarded (rebroadcast) RREQ is detected, it



stays in the RREQ Forwarding state unless a corresponding RREP is detected. If a unicast RREP is then detected, it moves to the RREP Forwarding state and stays there until the RREP reaches the source node and the route is set up. If any suspicious fact or anomaly is detected, it goes to the Suspicious or Alarm states.
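A minimal sketch of such a per-flow state machine is shown below; the event names, the transition table and the FlowMonitor class are assumptions chosen to mirror the states named above, not the monitor's actual implementation.

```python
# One FSM per branch of a request-reply flow (RREQ ID, source IP, dest IP).
TRANSITIONS = {
    ("Source", "rreq_broadcast"): "RREQ Forwarding",
    ("RREQ Forwarding", "rreq_rebroadcast"): "RREQ Forwarding",
    ("RREQ Forwarding", "rrep_unicast"): "RREP Forwarding",
    ("RREP Forwarding", "rrep_forward"): "RREP Forwarding",
    ("RREP Forwarding", "route_established"): "Done",
}

class FlowMonitor:
    def __init__(self, flow_id):
        self.flow_id = flow_id      # (RREQ ID, source, destination)
        self.state = "Source"

    def on_event(self, event, violates_constraint=False):
        if violates_constraint:
            self.state = "Alarm"       # e.g. a modified AODV header field
        elif (self.state, event) in TRANSITIONS:
            self.state = TRANSITIONS[(self.state, event)]
        else:
            self.state = "Suspicious"  # unexpected message for this state
        return self.state

m = FlowMonitor(("42", "10.0.0.1", "10.0.0.9"))
m.on_event("rreq_broadcast"); m.on_event("rrep_unicast")  # -> RREP Forwarding
```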

Figure 3. Finite State Machine Diagram

Figure 4. Suspicious and Alarm State Machine Diagram

When a network monitor compares a new packet with the old corresponding packet, the primary goal of the constraints is to make sure that the AODV header of the forwarded control packet has not been modified in an undesired manner. If an intermediate node responds to the request, the network monitor verifies this response against its forwarding table as well as against the constraints, in order to make sure that the intermediate node is not lying. In addition, the constraints are used to detect packet dropping and spoofing.

5. Evaluation
The experiments for the evaluation of the Efficient Intrusion Detection System for mobile ad hoc networks were carried out using the network simulator (ns-2). We evaluated AODV without any modifications, AODV with one malicious node present, and AODV with the EIDS component enabled while a malicious node is present in the network. The scenarios developed to carry out these tests use as parameters the mobility of the nodes and the number of active connections in the network. The simulator parameters presented in Table 1 were chosen considering both the accuracy and the efficiency of the simulation.

Table 1: Simulation Parameters

Simulation (grid) area: 1000 x 1000 m
Simulation duration: 900 seconds
Number of mobile hosts: 30
Type of packet traffic: CBR
Maximum speed: 20 m/sec
Node mobility: Random waypoint
Transmission range: 250 m
Routing protocol: AODV
MAC layer: 802.11, peer to peer
Dropped packet timeout: 10 seconds
Dropped packet threshold: 10 packets
Clear delay: 100 seconds
Host pause time: 15 seconds
Modification threshold: 5 events
Neighbor hello period: 30 seconds

The following are the metrics we chose to evaluate the impact of the implemented attacks: (1) false positives, (2) detection rate, (3) packet delivery ratio, and (4) routing packets dropped ratio. These metrics were used to measure the severity of each attack and the improvement that EIDS manages to achieve during active attacks. Every point in the produced graphs is an average value of data collected from repeating the same experiment ten times, in order to achieve more realistic measurements.

5.1 Sequence number (black hole) attack detection

The four metrics used in the evaluation of the sequence number attack detection and counter mechanisms are the delivery ratio, the number of false routing packets sent by the attacker, the false positive rate and the detection rate.

Figure 5. Packet Delivery ratio against Number Of connections


Figure 6. Packet Delivery ratio against Speed of Nodes

Figure 7. Percentage of False Positives Against Percentage of bad nodes

Figure 8. Percentage of Detected bad Nodes against Percentage of bad nodes

In Figures 5 and 6 the delivery ratio is plotted as the node mobility or density increases. The normalized overhead of AODV is 2-4 times higher when the network is loaded; in the graphs, the overhead of AODV is considered for a fully loaded network. As can be seen from the graphs, with EIDAODV running, the delivery ratio is increased by as much as 72%.

The second metric used in the evaluation of this attack was the number of false packets sent by the attacking node versus the number of active connections and the node mobility. This metric was used to examine the overhead of the sequence number attack, and we considered only the extra cost on communication imposed by the attack. We observed that the average number of RREPs sent by the malicious node in all the experiments was 1856, and the number of nodes that inserted the false route into their routing tables was 20 out of 30. In Figure 7, false positives are nodes incorrectly labeled as malicious. As expected, the performance of the active response protocol improved with respect to false positives as the density of malicious nodes increased. Figure 8 shows the detection rate: in the best case, 93% of the attacks can be detected, whereas the worst-case detection rate is 80%. There are several reasons why a bad node may go undetected. First, the bad node may not be in any path in the routing cache at the time the monitors begin to check; since the paths are based solely on those maintained by the routing cache, if a node is not contained in any path, its forwarding function will not be monitored. Second, there may be two consecutive bad nodes in a path, so that the bad behavior of one node is hidden by the other.

5.2 Packet drop attack detection
To evaluate this attack, the metrics chosen were the delivery ratio and the routing overhead ratio. The following graphs show the performance.

Figure 9. Packet Delivery ratio against Number of Connections

Figure 10. Packet Delivery ratio against Speed of Nodes


Figure 11. Percentage of False Positives against Percentage of bad nodes

Figure 9 shows that the EIDAODV system improves the delivery ratio by 51% compared to plain AODV. Figure 10 shows that the routing overhead introduced by the attack is reduced by 52%; EIDAODV reduces the routing overhead ratio to approximately the level that normal AODV demonstrates. In Figure 11 we see that the performance of the active response protocol improves with respect to false positives as the density of malicious nodes increases. Figure 12 shows that in the best case 93% of the bad nodes can be detected, while the worst-case detection rate is 77%.

Figure 12. Percentage of Detected bad nodes against Percentage of bad nodes

6. Conclusions
An Efficient Intrusion Detection System aimed at securing the AODV protocol has been developed using a specification-based technique. It is based on previous work by Stamouli. The performance of EIDS in detecting misuse of the AODV protocol has been discussed. In all cases, the attack was detected as a violation of one of the AODV protocol specifications. From the results obtained, it can be concluded that our EIDS can effectively detect the sequence number attack and the packet dropping attack with incremental deployment. The method has been shown to have low overhead and a high detection rate. Our intrusion detection and response protocol for MANETs has been demonstrated to perform better than the one proposed by Stamouli in terms of false positives and percentage of packets delivered. Simulation results validate the ability of our protocol to successfully detect both local and distributed attacks against the AODV routing protocol, with a low number of false positives.

References
[1] Stamouli, P. G. Argyroudis, and H. Tiwari, "Real time intrusion detection for ad hoc networks," Proceedings of the Sixth IEEE Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), 2003.

[2] W. Wang, Y. Lu and B. K. Bhargava, "On vulnerability and protection of ad hoc on-demand distance vector protocol", Proceedings of the International Conference on Telecommunications, 2003.

[3] C.-Y. Tseng, et al., "A specification-based intrusion detection system for AODV", Proceedings of the 1st ACM Workshop on Security of Ad Hoc and Sensor Networks (SASN '03), Fairfax, VA, 2003.

[4] Y.-C. Hu, A. Perrig and D. B. Johnson, "Ariadne: A secure on-demand routing protocol for ad hoc networks", Proceedings of the Eighth ACM International Conference on Mobile Computing and Networking (MobiCom 2002), September 2002.

[5] F. Stajano and R. Anderson, "The Resurrecting Duckling: Security issues for ad hoc wireless networks", 7th International Workshop on Security Protocols, Proceedings, 1999.

[6] C. Siva Ram Murthy and B. S. Manoj, "Ad Hoc Wireless Networks: Architectures and Protocols", Prentice Hall, New Jersey, USA, May 2004.

[7] S. Marti, T. J. Giuli, K. Lai and M. Baker, "Mitigating routing misbehavior in mobile ad hoc networks", Proceedings of MobiCom, 2000.

[8] S. Bhargava and D. P. Agrawal, "Security enhancements in AODV protocol for wireless ad hoc networks", Proceedings of the IEEE Semi-Annual Vehicular Technology Conference (VTC '01), 2000.

[9] C. E. Perkins, "Ad hoc on-demand distance vector (AODV) routing", Internet draft, draft-ietf-manet-aodv-01.txt, August 1998.

[10] Y. Zhang, W. Lee and Y. Huang, "Intrusion detection for wireless ad hoc networks", Mobile Networks and Applications, ACM, 2002.

[11] P. Papadimitratos and Z. J. Haas, "Secure routing for mobile ad hoc networks", Proceedings of the SCS Communication Networks and Distributed Systems Modeling and Simulation Conference (CNDS '02), January 2002.

[12] V. Madhu and A. A. Chari, "An approach for detecting attacks in mobile ad hoc networks", Journal of Computer Science, 2008.

[13] C. E. Perkins, E. M. Belding-Royer and S. Das, "Ad hoc on-demand distance vector (AODV) routing", RFC 3561, 2003.


Author’s Profile

B. V. RamNaresh Yadav is working as a Research Scholar in the CSE Department of JNT University, Hyderabad, Andhra Pradesh, India. His areas of interest include network security, compilers, and computer networks.

Dr. O. B. V. Ramanaiah is working as a Professor in the CSE Department of JNT University, Hyderabad, Andhra Pradesh, India. His areas of interest include mobile computing, computer networks, and operating systems.

Dr. B. Satya Narayana is working as a Professor in the Department of CST of S.K. University, Anantapur, Andhra Pradesh, India. His areas of interest include network security, data warehousing and data mining, computer networks, and artificial intelligence.


Veracity Finding From Information Provided on the Web

D.Vijayakumar1, B.srinivasarao2, M.Ananda Ranjit Kumar 3

JNTU UNIVERSITY, P.V.P.S.I.T, C.S.E Dep., Vijayawada, A.P., India,

[email protected] , [email protected]

3Asst.Professor in C.S.E Dep., L.B.R.College of engineering, Vijayawada, A.P., India,

[email protected]

Abstract: The quality of information on the web has always been a major concern for Internet users. In machine learning approaches, the quality of a web page is defined by human preference, and two approaches, PageRank and Authority-Hub analysis, are used to find pages with high authority. Unfortunately, the popularity of a web page does not necessarily imply the accuracy of its information. In this paper, we propose TruthFinder, which utilizes the relationships between web sites and their information: a web site is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy web sites. Our experiments show that TruthFinder finds high-quality information better than PageRank and Authority-Hub analysis, and identifies trustworthy web sites better than popular search engines.

Keywords: PageRank, hub analysis, trustworthiness.

1. Introduction

THE World Wide Web has become a necessary part of our lives and may have become the most important information source for most people. Every day, people retrieve all kinds of information from the Web. For example, when shopping online, people find product specifications on websites like Amazon.com or ShopZilla.com; when looking for interesting DVDs, they get information on websites such as NetFlix.com.

Unfortunately, the popularity of web pages does not necessarily lead to accuracy of information. Two observations are made in our experiments: 1) even the most popular websites may contain many errors, whereas some comparatively less popular websites may provide more accurate information; and 2) more accurate information can be inferred by using many different websites instead of relying on a single website.

2. Problem Definition
In this paper, we propose a new problem called the Veracity problem, which is formulated as follows: given a large amount of conflicting information about many objects, provided by multiple websites, how can we discover the true fact about each object? We use the word "fact" to represent something that is claimed as a fact by some website, and such a fact can be either true or false. In this paper, we only study facts that are either properties of objects or relationships between two objects, and we also require that the facts can be parsed from web pages. There are often conflicting facts on the Web, such as different sets of authors for a book. There are also many websites, some of which are more trustworthy than others. A fact is likely to be true if it is provided by trustworthy websites, and a website is trustworthy if most facts it provides are true. At each iteration, the probabilities of facts being true and the trustworthiness of websites are inferred from each other. This iterative procedure is rather different from Authority-Hub analysis. First, we cannot compute the trustworthiness of a website by adding up the weights of its facts, nor can we compute the probability of a fact being true by adding up the trustworthiness of the websites providing it; instead, we have to resort to probabilistic computation. Second, and more importantly, different facts influence each other. For example, if a website says that a book is written by "Jessamyn Wendell" and another says "Jessamyn Burns Wendell", then these two websites actually support each other although they provide slightly different facts. We incorporate such influences between facts into our computational model. In summary, we make three major contributions in this paper. First, we formulate the Veracity problem of how to discover true facts from conflicting information. Second, we propose a framework to solve this problem by defining the trustworthiness of websites, the confidence of facts, and the influences between facts. Finally, we propose an algorithm called TRUTHFINDER for identifying true facts using iterative methods. Our experiments show that TRUTHFINDER achieves very high accuracy in discovering true facts, and it can select trustworthy websites better than authority-based search engines such as Google.

3. Basic Definitions
Confidence of facts: The confidence of a fact f is the probability of f being correct, according to the best of our knowledge, and is denoted by s(f).


Trustworthiness of websites: The trustworthiness of a website w is the expected confidence of the facts provided by w, and is denoted by t(w).

Our Problem Setting

• Each object has a set of conflicting facts
• E.g., different author names for a book
• Each web site provides some facts
• How to find the true fact for each object?

Figure 1. Input of TruthFinder

3.1 Trustworthiness of the Web

The Web has a trustworthiness problem. According to a survey on the credibility of web sites:

• 54% of Internet users trust news web sites most of the time
• 26% trust web sites that sell products
• 12% trust blogs

Given a large amount of conflicting information about many objects, provided by multiple web sites, how can we discover the true fact about each object?

Different websites often provide conflicting information on a subject, e.g., the authors of "Rapid Contextual Design".

Table 1: Conflicting Information about Book Authors

Online Store        Authors
Powell's Books      Holtzblatt, Karen
Barnes & Noble      Karen Holtzblatt, Jessamyn Wendell, Shelley Wood
A1 Books            Karen Holtzblatt, Jessamyn Burns Wendell, Shelley Wood
Cornwall Books      Holtzblatt-Karen, Wendell-Jessamyn Burns, Wood
Mellon's Books      Wendell, Jessamyn
Lakeside Books      Wendell, Jessamynholtzblatt, Karenwood, Shelley
Blackwell Online    Wendell, Jessamyn, Holtzblatt, Karen, Wood, Shelley

3.2 Basic Heuristics for Problem Solving
There is usually only one true fact for a property of an object, and this true fact appears the same or similar on different web sites, e.g., "Jennifer Widom" vs. "J. Widom". The false facts on different web sites are less likely to be the same or similar, because false facts are often introduced by random factors. A web site that provides mostly true facts for many objects will likely provide true facts for other objects.

3.3 Overview of Our Method

3.3.1 Confidence of facts ↔ Trustworthiness of web sites

A fact has high confidence if it is provided by (many) trustworthy web sites, and a web site is trustworthy if it provides many facts with high confidence. In our method, TruthFinder, each web site is initially considered equally trustworthy; based on the heuristics above, fact confidence is inferred from web site trustworthiness and then web site trustworthiness from fact confidence, repeating until a stable state is reached.

Figure 2. Facts ↔ Authorities, Web sites ↔ Hubs

3.3.2 Difference from authority-hub analysis

Linear summation cannot be used: a web site is trustworthy if it provides accurate facts, not merely many facts, and confidence is the probability of being true. In addition, different facts about the same object influence each other.

Figure 3. Context Diagram

4. Modules

Data collection: First, we collect the specific data about an object from different websites. The collected data is stored in a related database: a table is created for each specific object to store the facts about it.

Data search: Searching the related data links according to the user input. In this module the user retrieves the specific data about an object. The user can search data in three ways: 1. normal search, 2. PageRank search, and 3. TruthFinder search.

TruthFinder search: We design a general framework for the Veracity problem and develop an algorithm called TruthFinder.



It utilizes the relationships between web sites and their information: a web site is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy web sites.

Result calculation: For each response to the query we calculate the performance; using the calculated count, we find the best link and show it as the output.

Figure 4. System Architecture

5. Computational Model
A website has high trustworthiness if it provides facts with high confidence. Website trustworthiness and fact confidence are therefore determined by each other, and because true facts are more consistent than false facts, we can use an iterative method to compute both. We introduce the model of iterative computation below.

5.1 Computation Model (1): t(w) and s(f)
The trustworthiness of a web site w, t(w), is the average confidence of the facts it provides:

t(w) = ( Σ_{f ∈ F(w)} s(f) ) / |F(w)|

where F(w) is the set of facts provided by w. The confidence of a fact f, s(f), is one minus the probability that all web sites providing f are wrong:

s(f) = 1 − Π_{w ∈ W(f)} (1 − t(w))

where W(f) is the set of websites providing f and (1 − t(w)) is the probability that w is wrong.
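A minimal sketch of these two update rules, with facts and web sites held in plain Python dictionaries (the variable and function names are ours, purely for illustration):

def trustworthiness(site_fact_ids, confidence):
    """t(w): average confidence s(f) of the facts F(w) provided by web site w."""
    return sum(confidence[f] for f in site_fact_ids) / len(site_fact_ids)

def fact_confidence(providing_site_ids, trust):
    """s(f): one minus the probability that every site in W(f) is wrong."""
    p_all_wrong = 1.0
    for w in providing_site_ids:
        p_all_wrong *= (1.0 - trust[w])
    return 1.0 - p_all_wrong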

5.2 Computation Model (2): Influence between related facts

Example: for a certain book B, web site w1 says B is written by "Jennifer Widom" (fact f1), and w2 says B is written by "J. Widom" (fact f2). Facts f1 and f2 support each other. If several other trustworthy web sites say this book is written by "Jeffrey Ullman", then f1 and f2 are likely to be wrong.

5.3 Computation Model (3)

A user may provide an "influence function" between related facts (e.g., f1 and f2), such as the similarity between people's names. The confidence of related facts is adjusted according to the influence function.
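Putting the pieces together, the sketch below iterates the two update rules and applies a user-supplied influence function between related facts. The initial trust value, the damping weight and the fixed iteration count are illustrative choices of ours; the published TRUTHFINDER also rescales confidences with a logistic-style transformation that is omitted here.

def truth_finder(facts_by_site, sites_by_fact, influence, iterations=10):
    """Iteratively infer web-site trustworthiness t(w) and fact confidence s(f).

    facts_by_site: dict, site id -> list of fact ids that the site provides
    sites_by_fact: dict, fact id -> list of site ids providing that fact
    influence:     function (fact_a, fact_b) -> weight in [0, 1], e.g. name similarity
    """
    trust = {w: 0.9 for w in facts_by_site}        # every site starts equally trustworthy
    conf = {f: 0.0 for f in sites_by_fact}
    for _ in range(iterations):
        # s(f) = 1 - product over providing sites of (1 - t(w))
        for f, sites in sites_by_fact.items():
            p_all_wrong = 1.0
            for w in sites:
                p_all_wrong *= (1.0 - trust[w])
            conf[f] = 1.0 - p_all_wrong
        # Related facts adjust each other's confidence via the influence function.
        adjusted = {}
        for f in conf:
            boost = sum(influence(f, g) * conf[g] for g in conf if g != f)
            adjusted[f] = min(1.0, conf[f] + 0.1 * boost)  # 0.1: illustrative damping weight
        conf = adjusted
        # t(w) = average confidence of the facts provided by the site
        for w, fact_ids in facts_by_site.items():
            trust[w] = sum(conf[f] for f in fact_ids) / len(fact_ids)
    return trust, conf

For example, if facts f1 ("Jennifer Widom") and f2 ("J. Widom") are given a high influence weight by a name-similarity function, the confidence of each is raised by the other, so the two sites providing them end up supporting each other, as described above.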

Figure 5. Computing the confidence of a fact

Experiments: Finding Truth of Facts
Determining the authors of books: the dataset contains 1265 books listed on abebooks.com, and we analyze 100 randomly selected books (using book images).



Table 2: Comparison of the Results of Voting, TruthFinder, and Barnes & Noble

Case                        Voting   TruthFinder   Barnes & Noble
Correct                     71       85            64
Miss author(s)              12       2             4
Incomplete names            18       5             6
Wrong first/middle names    1        1             3
Has redundant names         0        2             23
Add incorrect names         1        5             5
No information              0        0             2

Experiments: Trustable Information Providers

Finding trustworthy information sources: the most trustworthy bookstores found by TruthFinder vs. the top-ranked bookstores returned by Google (query "bookstore").

Table 3: Comparison of the Accuracies of Top Bookstores by TRUTHFINDER and by Google

TruthFinder
Bookstore           Trustworthiness   #book   Accuracy
TheSaintBookstore   0.971             28      0.959
MildredsBooks       0.969             10      1.0
Alphacraze.com      0.968             13      0.947

Google
Bookstore           Google rank       #book   Accuracy
Barnes & Noble      1                 97      0.865
Powell's books      3                 42      0.654

6. Conclusion
In this paper, we introduce and formulate the Veracity problem, which aims at resolving conflicting facts from multiple websites and finding the true facts among them. We propose TRUTHFINDER, an approach that utilizes the interdependency between website trustworthiness and fact confidence to find trustworthy websites and true facts. Experiments show that TRUTHFINDER achieves high accuracy at finding true facts and at the same time identifies websites that provide more accurate information.

References
[1] Logistic Equation, Wolfram MathWorld, http://mathworld.wolfram.com/LogisticEquation.html, 2008.

[2] T. Mandl, “Implementation and Evaluation of a Quality-Based Search Engine,” Proc. 17th ACM Conf. Hypertext and Hypermedia, Aug. 2006.

[3] R. Guha, R. Kumar, P. Raghavan, and A. Tomkins, “Propagation of Trust and Distrust,” Proc. 13th Int’l Conf. World Wide Web (WWW), 2004.

[4] G. Jeh and J. Widom, “SimRank: A Measure of Structural-Context Similarity,” Proc. ACM SIGKDD ’02, July 2002.

[5] J.M. Kleinberg, “Authoritative Sources in a Hyperlinked Environment,” J. ACM, vol. 46, no. 5, pp. 604-632, 1999.

[6] J.S. Breese, D. Heckerman, and C. Kadie, “Empirical Analysis of Predictive Algorithms for Collaborative Filtering,” technical report, Microsoft Research, 1998.

Author Profile

D. Vijayakumar received the B.Sc. degree from ANU University in 2002 and the M.Sc. degree in mathematics from ANU University in 2004. He is pursuing an M.Tech. in Computer Science & Engineering at P.V.P.S.I.T, Vijayawada, Andhra Pradesh, India, and is currently working as a lecturer at Sri Viveka Institute of Technology.


The Online Scaffolding in the Project-Based Learning

Sha Li

Alabama A & M University, School of Education

4900 Meridian St., AL 35762, USA [email protected]

Abstract: This is a case study of students' attitudes and perspectives on the effectiveness of online scaffolding in a graduate computer literacy course. The study used mixed qualitative and quantitative methods. Sixty-four students and one faculty member participated. The findings show that online scaffolding is an effective approach for integrating Internet technology into computer project-based learning and helps to "Leave Nobody Behind." It is beneficial for both the students and the faculty. Learning through the online scaffolding environment also models for the students the effective use of online learning resources, with an impact that lasts well beyond the course.

Keywords: scaffolding, online learning, distance education, resource-based learning

1. Introduction
Vygotsky asserted that learning is a social process [19]. Social Learning Theory (SLT) is a category of learning theories grounded in the belief that human behavior is determined by a three-way relationship between cognitive factors, environmental influences, and behavior [2]. "If people observe positive, desired outcomes in the observed behavior, then they are more likely to model, imitate, and adopt the behavior themselves" [20]. Observing peers' behaviors can produce an even stronger peer-modeling effect in the learner. The metaphor of scaffolding is grounded in the social learning theories of Vygotsky's "Zone of Proximal Development" (ZPD). Vygotsky said:

…the difference between the child’s developmental level as determined by the independent problem solving and the higher level of potential development as determined through problem solving under adult guidance or in collaboration with more capable peers. [19]

Learning support falls into three categories: cognitive, affective, and systematic [18]. They are usually combined in the teaching process, and the concept also encompasses guidance, counseling, coaching, tutoring, assessment, and so on [13]. Scaffolding is an effective teaching approach to support learners [1], and the theoretical concept of social learning underlies the scaffolding perspective. With scaffolding, learners are guided, supported and facilitated during the learning process [18]. "Scaffolding refers to providing contextual supports for meaning through the use of simplified language, teacher modeling, visuals and graphics, cooperative learning and hands-on learning" [7]. Combined with scaffolding, learning to create a new multimedia project provides more interest and motivation, keeping students on track and increasing retention [14]. Scaffolding structured through the Internet platform has attracted the interest of educators in blended classrooms as well as in distance education classes [1, 13]. Hands-on learning in an authentic context is known as experiential learning [21], and learning is most effective when it takes place in an authentic context [11]. Authentic instruction uses teaching strategies such as structuring learning around genuine tasks, employing scaffolding, engaging students in inquiry and social discourse, and providing ample resources for the learners [8]. When rich learning resources are incorporated and made available to learners, effective learning can be achieved [9].

2. Method
This is a case study of students' perspectives on the effectiveness of online resource-based scaffolding in a computer project-based learning course. It adopts mixed qualitative and quantitative methods. The quantitative method is one in which the investigator uses postpositivist claims for developing knowledge, such as cause-and-effect thinking and reduction to specific variables, hypotheses and questions, and takes measurements to test hypotheses or theories. The qualitative method is one in which the inquirer makes knowledge claims mainly based on constructivist perspectives, participatory perspectives, or both [3]. Quantitative research statistically measures situations, attitudes, behavior, and performance, utilizing a series of tests and techniques; it often yields data that is projectable to a larger population, and because it relies heavily on numbers and statistics, it can effectively transform data into quantifiable charts and graphs. Qualitative research depends on the depth to which explorations are conducted and descriptions are written, usually resulting in sufficient detail for the reader to grasp the idiosyncrasies of the situation or phenomenon [16]. The quantitative approach uses predetermined closed-ended questions to collect numeric data, employing statistical procedures;


the qualitative approach uses open-ended questions in interviews, observation and document review to collect data through human interaction. Both methodologies have their own strengths and weaknesses. The mixed qualitative and quantitative research method is regarded as better able to explain the process of an event and to give a more meaningful result [6]. The mixed research method draws from the strengths of both qualitative and quantitative approaches and minimizes the weaknesses of either in a single research study [10]. It increases the validity and reliability of the findings by allowing examination of the same phenomenon in different ways and promotes better understanding of the findings [4]. This study adopts quantitative descriptive statistics mixed with a qualitative method. The course FED 529 Computer-Based Instructional Technology was used as a case for this study. The purpose of this study is to explore the learners' perspectives on the effectiveness of the online scaffolding in the FED 529 project-based learning class. This study uses three graduate classes of FED 529 Computer-Based Instructional Technology as a case; the data were collected in three FED 529 classes in the spring, summer and fall semesters of 2010. Sixty-four graduate education students and one instructor participated. The FED 529 class was a project-based computer literacy class taught in the traditional classroom. Rich online learning resources were created on the class website as an enhancement or support to learning in this class: project models, tutorials, multimedia resources, and writing help are available online as scaffolding resources for learners. The learning content covers the concepts and relationships that are relevant to real-world instructional projects for various content areas, such as PowerPoint presentations for science, math, etc., instructional web page design, graphics design, video editing, and sound editing. The writing project is a research paper on the use of educational technology. Though the FED 529 class is taught in the traditional classroom, the classroom teaching and the online resource-based support exist at the same time, making it a blended format (a traditional instructional format that also integrates plenty of online learning resources) so as to enhance learning in a more effective way [1]. The course website is at http://myspace.aamu.edu/users/sha.li.

3. Data Analysis
The FED 529 Computer-Based Instructional Technology course is a traditional classroom computer literacy class. The students range in age from their 20s to their 50s, and their computer knowledge and skills vary widely. Even though learning is in a face-to-face format, the students still expect extra support or help during the learning process because some students have limited experience using computers; they easily get lost in class and need extra time and help to catch up. Others might have missed class because of personal events. To follow the principle of No Student Left Behind, the instructor designed the online learning support for the students by integrating rich course information and resources to scaffold the learners, so as to ensure that everybody learns successfully. Through the interviews and surveys, students gave responses about their experiences and

perspectives on their use of the online scaffolding for learning. The first part of the responses is around the use of the project samples or examples provided online. Students value the effect of using the online project samples to learn to create new projects.

I think it is a great help to access our course website. In conducting activities to do projects and assignments, this website gave us a number of sample projects in various ways (to tell us how) to do PowerPoint presentations, and using Excel to create charts and graphs. It helps me with learning how to effectively use track change, making flyers for class activities. And (by following the examples) it saves my time in doing mail-outs to parents by creating the mail merge project.

This website provides a lot of good resources for our projects, saving us a lot of research time. There were also some good examples of projects developed by other students which helped (us) better understand our class assignments to meet higher requirement. I am a visual learner. The computer is pretty new to me. But the class website gave me great examples to go by…. I have used the website tutorials to learn some projects, and used the resources to insert sound, music, and clip arts into my PowerPoint projects, flyer, and web pages. I depended on the previous students’ model projects to create in learning to do the quality work.

The old proverb says that "seeing is believing." Novice students need visual experience, or a "first sight experience," to gain preliminary understanding [5]. Providing only verbal explanation without visual presentation would result in ineffective teaching, so giving the students samples, examples, or models enhances knowledge acquisition as well as comprehension. The computer literacy class provides a variety of trainings, and the cognitive load is high, especially for students whose background is weak. Involving students in the learning activities by starting with good examples can smoothly engage the learners in setting their own goals based on interesting, high-quality projects. Tutoring is the next major issue of the teaching. Once goals are set, hands-on learning starts. About two thirds of the students could follow in-class instruction to create the projects, but one third of the students need extra time and help to finish them, according to Dr. Lee, the instructor of the class. "Some students are old students, and some are from the families which could not provide enough computer access for the students. So both the teacher and the students have to spend more effort to accomplish the class objectives with assignments. When I can provide one on one support to those students, I provide. When I am short of time, I might refer students to the online tutorials, called FAQs, to learn during after-class period. This is also effective and also


saves me a lot of time. About 90% of the students who used the online tutorials told me that they feel satisfied to learn with the help of the online tutorials, because they are clear to follow and easy to access. The rest of the students are either too old to learn effectively by themselves, or have limited access to the computer/Internet at home," Dr. Lee added. The online FAQs come in two formats: text tutorials and video tutorials. Some students preferred using the text FAQs, but the majority of the low-skill students preferred the video FAQs. Some of the students' survey feedback about the online scaffolding follows:

The good side of video tutorials is it teaches you visually and with the instructor’s sound. The downside of the video tutorials is sometimes it is hard to download at home, because I use the dial up phone line to connect to Internet. When I use videos (to learn) at home, I have to be patient to download them. But they do help me (learn). I am familiar with most of the commonly used computer programs like Microsoft Word, Excel, PowerPoint, Internet, etc. I can finish most of the projects in class except some of the advanced projects which have more advanced skills I don’t know. When I need to use the FAQs, the text FAQs are enough for me, because I read text faster. I only occasionally use video FAQs, not very much to learn the things I missed.

I prefer video FAQs. The video FAQ allows for you to view as exactly as it displays on the computer screen. It is easy to understand for people like me. Since I am not good at computers, I like the tutorial to teach me slowly. The windows media player can pause and slide back to view video repeatedly. That is exactly what I like.

The video tutorials were some of the students' favorite help. The FAQ files are small and easy to download, and the video with the instructor's voice improves understanding. Dr. Lee recorded those video tutorials using free online screen-capture software. Screen-capture software records the computer screen throughout the project-creation process from beginning to end, displaying the real procedure of creating a project on the computer step by step, visually and auditorily. With the variety of screen-capture tools available, integrating screen-capture programs into the scaffolding has become an easy and convenient practice; it empowers facilitation and solicits enhanced student engagement in learning. Dr. Lee noted that the currently popular screen-capture tool is Camtasia (www.camtasia.com), but Camtasia is a software package that costs money, so he chose to use the free Windows Media Encoder (http://technet.microsoft.com/en-us/library/bb676137.aspx) or Instant Demo (www.instant-demo.com) to capture the on-screen project processes. Doing this saves money in one way, and models educational technology use for the teacher students in another. Dr. Lee

chose to use Instant Demo more often to capture the screen for the project tutorials, because Windows Media Encoder could malfunction for some time whenever a new version of the Windows operating system came out. Instant Demo is more stable across Windows versions because it uses the Flash Player to display the video tutorials, and Flash is upgraded for new versions of Windows faster than other programs; Flash also works well with multimedia and is easy to play on the Internet. Students liked to view the video FAQs, which engaged the students with lower technology skills in effective learning so that they could catch up to a more advanced level. The FED 529 class was enhanced with the integration of Internet technology, and the learners benefited from the online learning resources. They offered feedback in the interviews on their use of the online resources.

I think that the website is helpful for those who did not understand or missed a class. They can have a general idea of what is expected of them. They can also follow those step by step tutoring to learn. And the website is easy to access.

The website has good examples of the finished assignments with tutorials. I felt like it was very helpful. Whenever I needed an answer, I could just look at this website for examples. It included many other useful website links also, such as BrainyBetty’s multimedia resources, Youtube video insertion skills with PowerPoint, clip art download, sample research articles, and APA writing guide.

Our class website is very informational. It is neat. It helped me tremendously throughout this semester. I really wish other teachers could consider doing websites too. I think all students should have opportunities to see the examples of class tasks. I will create my own instructional websites for my classes when I start teaching in a school, just like Dr. Lee’s. The teacher’s instructional website contains useful information and examples that helped me understand the class content. With the online tutor and demonstration, I was able to preview and practice the projects prior to actually doing them in class. It allowed me to access very valuable information with very little effort on my part. I really appreciate the time and effort that Dr. Lee put into this website.

Dr. Lee said that he contributed a great deal of time to the class website design, but the benefit his class received was also great. In this class, not only did Dr. Lee teach and facilitate students; his website also taught and facilitated the class. Dr. Lee provided instruction and scaffolding in person, and his website provided teaching and scaffolding through online informational resources. "It is a two-way teaching strategy in this learning environment," as


Dr. Lee said. There is another gain: the integration of Internet technology into the instruction has had a conspicuous impact on the learners. The teacher has modeled the use of online resources to scaffold learning for students. The students not only learned the course content, but also learned how to integrate Internet technology into education, which has a long-term effect on them. The students' responses to the survey on their attitudes and perspectives on the online scaffolding are displayed in Table I.

From the data in Table I, we can see that the mean of the students' overall evaluation of the class is 4.15, which is quite high; it means the students highly value their experience in this class aided by online scaffolding resources. The highest items are number 1 (about the online examples), number 10 (the overall evaluation of the class), number 7 (about the comfort level of this class), and number 4 (about the user-friendliness of the online resources). Item 1 is one of the highest probably because most of the students prefer a "first sight" experience of a satisfying project outcome; viewing project examples can increase the students' interest and motivation so that they invest plenty of time and effort in creating excellent work [12]. The lowest items are number 2 (about the use of the online tutorials) and number 8 (about expecting other teachers to create instructional websites too). Item 2 is the lowest probably because the students who could learn without the online tutorials had little use for them. "The actual students who need more of the online tutorials are about one third of the total students. The others occasionally access online tutorials to learn when it is necessary for them," Dr. Lee said. "It verifies our scaffolding has targeted that group to leave nobody behind. That is exactly what our scaffolding means."

Table II displays the students' attitude toward each specific item of the scaffolding resources. We can see that the students' preference for using the online scaffolding resource items is moderately high as a whole. The lowest item is Tutorials/FAQs (Mean = 2.28), probably because a larger number of students did not need to depend on the online scaffolding to learn. That seems normal in this class because it is a blended course, and the online scaffolding is a supplement for those students who need more help than others.

4. Findings and Conclusion
The findings show that the online scaffolding is effective in supporting the students in the FED 529 Computer-Based Instructional Technology course. Integrating the online scaffolding enhanced the class instruction, motivated the learners' involvement and promoted the learning outcome [9]. Students value all the scaffolding components with respect to their learning. The course website is a desirable place to help the less advanced students catch up, so as to "Leave Nobody Behind." Even though more of the low-technology-level students resorted to the online scaffolding than the high-level students, the majority of the students in this class agreed that the online scaffolding was an effective tool, and they would like to create websites for their own classes when they are ready to teach in a school. In addition, the use of online resource-based scaffolding models the integration of Internet technology for the students, so that they could observe the instructional approach and strategies and learn from their classroom experience. Integrating online scaffolding shaped the students' perspectives on how to be an effective teacher by creating an instructional website to teach with. The project-based learning through doing not only prepared the students in the intended content areas and skills, but also left them with a new concept of being a competitive teacher in this Information Age [1]. This impact is important and long lasting. This online scaffolding approach also tells us that although integrating Internet technology requires painstaking effort, it is effective once the resources are created: they are long lasting, upgradeable, and convenient for learning. In teaching students how to integrate technology, we also need to teach them the disposition of a hardworking teacher who is dedicated to education and willing to invest personal time and intellectual effort into instructional design in order to use technology for education creatively and effectively.


References
[1] C.C. Ang, A.H. Tey, K.P. Khoo, Blended Project-Based

Learning with use of eMath Tools. In Z. Abas et al. (Eds.), Proceedings of Global Learn Asia Pacific 2010 (pp. 278-283). AACE, 2010.

[2] A. Bandura, Social Learning Theory. General Learning Press, 1977.

[3] J. Brannen, Mixing methods: qualitative and quantitative research. Avebury, Aldershot, 1992.

[4] J. Brewer, A. Hunter, Multimethod research. A Synthesis of styles. Sage, Newbury Park, 1989.

[5] R.F. Chong, X. Wang, M. Dang, D.R. Snow, Understanding DB2®: Learning Visually with Examples, 2nd Edition. IBM Press, New York, 2007.

[6] J.W. Creswell, Research Design: Qualitative, Quantitative, and Mixed Methods Approaches (2nd Ed). Sage Publications, Thousand Oaks, 2003.

[7] L.T. Diaz-Rico, K.Z. Weed, The cross cultural, language, and academic development handbook: A complete K-12 reference guide (2nd Ed.), p. 85. Allyn & Bacon, Boston, 2002.

[8] C. Donovan, L. Smolkin, Children's Genre Knowledge: An Examination of K-5 Students' Performance on Multiple Tasks Providing Differing Levels of Scaffolding. Reading Research Quarterly, 37(4), p. 428-465, 2002.

[9] J.R. Hill, M.J. Hannafin, The resurgence of resource-based learning. Educational Technology, Research and Development, 49(3), p. 37-52, 2001.

[10] R.B. Johnson, A.J. Onwuegbuzie, Mixed methods research: A research paradigm whose time has come. Educational Researcher, 33(7), p. 14-26, 2004.

[11] J. Lave, E. Wenger, Situated Learning: Legitimate Peripheral Participation. Cambridge University Press, New York, 1991.

[12] S. Li, D. Liu, The Online Top-Down Modeling Model. Quarterly Review of Distance Education, 6(4), p. 343-359, 2005.

[13] S. Ludwig-Hardman, J.C. Dunlap, Learner support services for online students: scaffolding for success. The International Review of Research in Open and Distance Learning, 4(1), p. 1-15, 2003.

[14] C. McLoughlin, L. Marshall, Scaffolding: A model for learner support in an online teaching environment. 2000. Available: http://lsn.curtin.edu.au/tlf/tlf2000/mcloughlin2.html

[15] M. Myers, Qualitative research and the generalizability question: Standing firm with Proteus. The Qualitative Report, 4(3/4). 2000. Available: http://www.nova.edu/ssss/QR/QR4-3/myers.html

[16] E. Stacey, Collaborative learning in an online environment. Journal of Distance Education, 14(2), p. 14 – 33. 1999. Available: http://cade.icaap.org/vol14.2/stacey.html

[17] A. Tait, Planning student support for open and distance learning, Open Learning, 15(3), p. 287 – 299, 2000.

[18] L.S. Vygotsky, Mind in Society: The Development of Higher Psychological Processes, p. 76. Harvard University Press, Cambridge, 1978.

[19] Wikipedia, Social learning theory. 2009. Available: http://en.wikipedia.org/wiki/Social_learning_theory

[20] D.A. Kolb., R. Fry. Toward an applied theory of experiential learning, in C. Cooper (ed.) Theories of Group Process. John Wiley, London. 1975.

[21] C.R. Rogers, H.J. Freiberg. Freedom to Learn (3rd Ed). Merrill/Macmillan, Columbus. 1994.

Author Profile
Sha Li received his doctoral degree in educational technology from Oklahoma State University in 2001. He is an Associate Professor at Alabama A&M University. His research interests are e-learning in the networked environment, distance education, multimedia production, and instructional design with technology. He is also an instructional design facilitator for the local public school systems.


Hybrid Portable Document Format Protection Scheme based on Public Key and Digital Watermarking

Karim Shamekh1, Ashraf Darwish2 and Mohamed Kouta3

1Computer Science Department, High Technology Institute, 10th of Ramadan, Egypt, [email protected]

2Computer Science Department, Faculty of Science, Helwan University, Cairo, Egypt [email protected]

3Arab Academy for Science, Technology and Maritime, Cairo, Egypt [email protected]

Abstract: The Portable Document Format (PDF), developed by Adobe Systems Inc., is a flexible and popular document distribution and delivery file format, and it is supported within various operating systems and devices. The ease of reproduction, distribution, and manipulation of digital documents creates problems for authorized parties that wish to prevent illegal use of such documents. To this end, digital watermarking has been proposed as a last line of defence to protect the copyright of PDF files through visible watermarks in such files. In addition, to preserve the integrity of the digital watermark, a cryptographic algorithm (DES) is employed. The proposed watermarking method does not change the values of the stored data objects. Experimental results show the feasibility of the proposed method, and a detailed security analysis and performance evaluation show that the digital watermarking is robust and can withstand various types of attacks.

Keywords: Portable Document Format (PDF), Watermarking (WM), Data Encryption Standard (DES), Copyright protection, Cryptosystems.

1. Introduction
The number of files that are published and exchanged through the Internet is constantly growing, and electronic document exchange is becoming more and more popular among Internet users. The diversity of platforms, formats and applications has called for a common technology to overcome those differences and produce universally readable documents that can be exchanged without limitations. Even though it is supported by nearly every application on any machine, plain ASCII text has failed to become popular because it does not allow text formatting, image embedding and other features that are required for efficient communication. Portable Document Format (PDF) files [1] are popular nowadays, so using them as carriers of secret messages for covert communication is convenient. Though there are some techniques for embedding data in text files [2-4], studies using PDF files as cover media are very few, except Zhong et al. [5], in which the integer numerals specifying the positions of the text characters in a PDF file are used to embed secret data. In this paper a new algorithm for PDF document protection is presented. PDF, created by Adobe Systems for document exchange [1], is a fixed-layout format for representing documents in a manner independent of the application software, hardware,

and operating system. Each PDF file contains a complete description of a 2-D document, which includes text, fonts, images, and vector graphics described by a context-free grammar modified from PostScript. Many PDF readers are available for reading PDF files; each PDF file appears in the window of a PDF reader as an image-like document. The main advantage of the PDF format is that it allows documents created within any desktop publishing package to be viewed in the original typeset design, regardless of the system on which they are displayed. Documents with text, images, hyperlinks and other desirable document-authoring features can be easily created with the packages distributed by Adobe or with any other authoring application (e.g., Microsoft Office, OpenOffice, LaTeX, etc.) and then converted to the PDF format. The result is an easy-to-distribute, small document that will be displayed exactly the way it was created, on any platform and using any viewer application. Besides being very flexible and portable, PDF documents are also considered to be secure. Popular document formats like the Microsoft Compound Document File Format (MCDFF) have been proven to have security flaws that can leak private user information (see Castiglione et al. (2007) [6]), while PDF documents are widely regarded as immune to such problems. This is one of the reasons why many governmental and educational institutions have chosen PDF as their official document standard. In this paper, we start by giving a concise overview of the PDF format, focusing on how data is stored and managed. Digital watermarking is a relatively new research area that has attracted the interest of numerous researchers both in academia and in industry and has become one of the hottest research topics in the multimedia signal processing community. Although the term watermarking has slightly different meanings in the literature, one definition that seems to prevail is the following [7]: watermarking is the practice of imperceptibly altering a piece of data in order to embed information about the data. This definition reveals two important characteristics of watermarking. First, information embedding should not cause perceptible changes to the host medium (sometimes called the cover medium or cover data). Second, the message should be related to the host medium. In this sense, watermarking techniques form a subset of information hiding techniques. However, certain authors use the term watermarking with a


meaning equivalent to that of information hiding in the general sense. Digital media are replacing traditional analog media and will continue to do so. By digital media, we mean digital representations of audio, text documents, images, video, three-dimensional scenes, etc. These media offer many benefits over their analog predecessors (e.g., audio and video cassettes). Unlike analog media, digital media can be stored, duplicated, and distributed with no loss of fidelity, and can also be manipulated and modified easily. Clearly, digital media offer many benefits, but they also create problems for parties who wish to prevent illegal reproduction and distribution of valuable digital media (e.g., copyrighted, commercial, privileged, sensitive, and/or secret documents). Two classic methods for protecting documents are encryption and copy protection. However, once decrypted, a document can be copied and distributed easily, and copy-protection mechanisms can often be bypassed. As a safeguard against failures of encryption and copy protection, digital watermarking has been proposed as a last line of defence against unauthorized distribution of valuable digital media. This paper provides insight into some of the security issues in the use of digital watermarking and its integrity protection with a cryptographic algorithm (DES). The remainder of this paper is organized as follows. Section 2 surveys related work, while Section 3 analyzes the Portable Document Format (PDF). Section 4 reviews digital watermarking and cryptographic systems. Section 5 introduces the proposed scheme, experimental results and analysis. Conclusions and future work are given in Section 6.

1.1 Copyright Protection
The main requirements for copyright-protection watermarking algorithms are robustness (how well the watermark survives malicious or unintentional transformations), visibility (whether the watermark introduces perceptible artifacts), and capacity (the amount of information which can be reliably hidden in and extracted from the document). For copyright applications, robustness should be as high as possible and visibility as low as possible in order to preserve the value of the marked document. Note, however, that capacity can be low, since copyright information generally requires only a small amount of data, which can be an index into a database holding the copyright information. Other requirements can be outlined, namely security (from the cryptographic point of view) and that the scheme should be oblivious (the original or cover image is not needed for the extraction process). Many robust watermarking schemes have been proposed, consisting of either spatial-domain or transform-domain watermarks. The main issue addressed for these schemes in recent years is the robustness of watermarks against the various intentional or unintentional alterations, degradations, geometrical distortions or media conversions which can be applied to the watermarked image. The four main issues of state-of-the-art watermark robustness are described in more detail in [8, 9].

2. Related Work
Information about the Portable Document Format can be found in [10]; those documents are the main source of information on how PDF documents are structured and managed by various PDF-compliant applications. Even though information leakage in published documents is a well-known issue, only a few publications investigate the problem. Byers (2004) [11] showed how hidden text can be extracted from Microsoft Word documents. He collected over 100,000 documents and compared the text that appears when each document is opened with Microsoft Word with the text extracted from the same documents using widely known text extraction tools. Almost every processed document had some hidden content such as previously deleted text, revisions, etc. Castiglione et al. (2007) [6] conducted a more extensive study on this popular document format, investigating the way Microsoft Compound documents are created and structured. The same authors developed tools to extract hidden document metadata as well as sensitive publisher information, and showed how the slack space responsible for this threat can be used as a steganographic means. Several companies and institutions have distributed guidelines to avoid information leakage in published documents, after the media reported news about documents published on the Web containing sensitive information which was not supposed to become public. For example, in May 2005 the Coalition Provisional Authority in Iraq published a PDF document on the "Sgrena-Calipari Incident": black boxes were used to conceal the names of some people involved in the incident, but all of them were easily revealed by copying the text from the original document into a text editor [12]. Several papers discuss the PDF structure [13, 14], and some of them introduce tools for content extraction from PDF documents [15] or tools to use PDF documents as a steganographic means [16].

3. The Portable Document Format (PDF)
This section gives a brief overview of the PDF format, highlighting the parts that are relevant to our work. The Portable Document Format is based on the PostScript (Adobe Systems Inc., 1999) page description language and was introduced to improve performance and provide some form of user interactivity. A PDF document consists of a collection of objects which together describe the appearance of the document's pages. Objects and structural information are all encoded in a single, self-contained sequence of bytes. The structure of a PDF file has four main components:

• A header identifying the version of the PDF specification to which the file complies. The file header contains a five-character magic number, "%PDF-", and a version number of the form 1.x, where x is a digit between 0 and 7, as shown in Table 1.

• One or more body sections containing the objects that constitute the document as it appears to the user;


• One or more cross-reference tables storing information and pointers to objects stored in the file;

• One or more trailers that provide the location of the cross-reference tables in the file.

Table 1: PDF file header for a version 1.7 document

Header   Version
%PDF-    1.7

A newly created PDF document has only one body section, one cross-reference table and one trailer. When a document is modified, its previous version is not updated; instead, any changes and new contents are appended to the end of the file, adding a new body section, a new section of the cross-reference table and a new trailer. This incremental update avoids rewriting the whole file, resulting in a faster saving process, especially when only small amendments are made to very large files. Objects stored in the body section have an object number used to unambiguously identify the object within the file, a non-zero generation number and a list of key-value pairs enclosed between the keywords obj and endobj. Generation numbers are used only when object numbers are reused, that is, when the object number previously assigned to an object that has been deleted is assigned to a new one. Due to incremental updates, whenever an object is modified, a copy of the object with the latest changes is stored in the file; the newly created copy will have the same object number as the previous one. Thus, several copies of an object can be stored in the file, each one reflecting the modifications made to that object from the time it was created onwards. The cross-reference table is composed of several sections and allows random access to file objects. When a document is created, the cross-reference table has only one section, and new sections are added every time the file is updated. Each section contains one entry per object, for a contiguous range of object numbers.

Table 2: An example of a cross-reference table

xref
0 36
0000000000 65535 f
0000076327 00000 n
0000076478 00000 n
0000076624 00000 n
0000078478 00000 n
0000078629 00000 n
0000078775 00000 n
0000078488 00000 n
0000078639 00000 n
. . .
0000100661 00000 n

An example of a cross-reference table section is given in Table 2. As shown, each section starts with the keyword xref, followed by the object number of the first object that has an entry in that section and the number of its entries. Table 2 shows a section with entries for 36 objects, from object 0 to object 35. Each entry provides the following information:
• the object offset in the file;
• the object generation number;
• the free/in-use flag, with value n if the object is in use or f if the object is free, that is, if the object has been deleted.
Object 0 is a special object and is always marked as free, with generation number 65535. The latest trailer is stored at the end of the file and points to the last section of the cross-reference table. A PDF document is always read from the end (except when generated with the "Fast Web View" flag enabled), looking for the offset of the last section of the cross-reference table, which is required to identify the objects that constitute the latest version of the document. Each time the document is updated (adding new objects or modifying existing ones), a new body, cross-reference table section and trailer are appended to the file. The body section will contain the newly created objects or the updated versions of existing ones, the cross-reference table section will store the information needed to retrieve those objects, and the trailer will have a reference to the newly created cross-reference table section, as well as a pointer to the previous one.
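As a small illustration of this layout, the sketch below reads the header line and the first classic cross-reference section of a PDF file. It assumes an uncompressed, single-section cross-reference table (it ignores the cross-reference streams introduced in later PDF versions), and the helper name is ours, not part of any PDF library.

import re

def read_xref_section(path):
    """Parse the PDF header and the first classic cross-reference section.

    Returns (version, entries), where each entry is
    (object_number, offset, generation, in_use).
    Assumes an uncompressed, single-section xref table.
    """
    data = open(path, "rb").read()
    # Header: five-character magic number followed by the version, e.g. "%PDF-1.7".
    header = re.match(rb"%PDF-(1\.[0-7])", data)
    version = header.group(1).decode() if header else None

    # Locate the "xref" keyword; the next line gives the first object number
    # and the entry count, then one 20-byte entry per object follows.
    xref_pos = data.find(b"xref")
    lines = data[xref_pos:].split(b"\n")
    first_obj, count = map(int, lines[1].split())        # e.g. "0 36"
    entries = []
    for i, line in enumerate(lines[2:2 + count]):
        offset, generation, flag = line.split()[:3]       # e.g. "0000076327 00000 n"
        entries.append((first_obj + i, int(offset), int(generation), flag == b"n"))
    return version, entries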

4. Digital Watermarking and Cryptosystems: basics and overview

4.1 Digital watermarking Digital watermarking requires elements from many disciplines, including signal processing, telecommunications, cryptography, psychophysics, and law. In this paper, we focus on the process of protecting PDF documents. An effective watermark should have several properties, listed below, whose importance will vary depending upon the application.

• Robustness The watermark should be reliably detectable after alterations to the marked document. Robustness means that it must be difficult (ideally impossible) to defeat a watermark without degrading the marked document severely, so severely that the document is no longer useful or has no (commercial) value.

• Imperceptibility or a low degree of obtrusiveness To preserve the quality of the marked document, the watermark should not noticeably distort the original document. Ideally, the original and marked documents should be perceptually identical.

• Security Unauthorized parties should not be able to read or alter the watermark. Ideally, the watermark should not even be detectable by unauthorized parties.

• No reference to original document For some applications, it is necessary to recover the watermark without requiring the original, unmarked document (which would otherwise be stored in a secure archive).


• Unambiguity A watermark must convey unambiguous information about the rightful owner of a copyright, point of distribution, etc. Of these properties, robustness, imperceptibility, and security are usually the most important. When speaking of robustness, we often talk about attacks on a watermark. An attack is an operation on the marked document that, intentionally or not, may degrade the watermark and make the watermark harder to detect. For text documents, an attack might consist of photocopying.

4.2. Visible watermarking Visible watermarking techniques are used to protect the copyright of digital multimedia (audio, image or video) that have to be delivered for certain purposes, such as digital multimedia used in exhibition, digital library, advertisement or distant learning web, while illegal duplication is forbidden. Braudaway et al. [17] proposed one of the early approaches for visible watermarking by formulating the nonlinear equation to accomplish the luminance alteration in spatial domain. In this scheme, dimensions of the watermark image are equal to those of the host image. There is a one-to-one correspondence between pixel locations in the watermark image and those in the host image. According to their brightness, pixels in the watermark image can be divided into transparent and non-transparent categories. The brightness of each pixel in the host image in proportion to the non-transparent regions of the watermark will be increased or reduced to a perceptually equal amount by using a nonlinear equation while the brightness of each pixel in proportion to the transparent regions of the watermark will remain the same after watermark embedding. Meng and Chang [18] applied the stochastic approximation for Braudaway’s method in the discrete cosine transform (DCT) domain by adding visible watermarks in the video sequences. Mohanty et al. [19] proposed a watermarking technique called dual watermarking which combines a visible watermark and an invisible watermark in the spatial domain. The visible watermark is adopted to establish the owner’s right to the image and invisible watermark is utilized to check the intentional and unintentional tampering of the image. Chen [20] has proposed a visible watermarking mechanism to embed a gray level watermark into the host image based on a statistic approach. First, the host image is divided into equal blocks and the standard deviation in each block is calculated. The standard deviation value will determine the amount of gray value of the pixel in the watermark to be embedded into the corresponding host image. Kankanhalli et al. [21] proposed a visible watermarking algorithm in the discrete cosine transform (DCT) domain. First, the host image and the watermark image are divided into 8 x8 blocks. Next, they classify each block into one of eight classes depending on the sensitivity of the block to distortion and adopt the effect of luminance to make a final correction on the block scaling factors. The strength of the watermark is added in varying proportions depending on the class to which the image block belongs. Mohanty et al. [22] proposed a modification

scheme of their watermark insertion technique of [23] in order to make the watermark more robust. Hu and Kwong [24, 25] implemented adaptive visible watermarking in the wavelet domain, using a truncated Gaussian function to approximate the effect of luminance masking for image fusion. Based on image features, they first classify the host and watermark image pixels into different perceptual classes. Then they use the classification information to guide the pixelwise watermark embedding. In the high-pass subbands they focus on image features, while in the low-pass subband they use the truncated Gaussian function to approximate the effect of luminance masking. Yong et al. [26] also proposed a translucent digital watermark in the DWT domain and used error-correcting codes to improve its resistance to attacks. None of the above schemes was devoted to better feature-based classification and the use of sophisticated visual masking models, so Huang and Tang [27] presented a contrast-sensitive visible watermarking scheme with the assistance of the HVS. They first compute the CSF mask of the discrete wavelet transform domain. Secondly, they use a square function to determine the mask weights for each subband. Thirdly, they adjust the scaling and embedding factors based on the block classification with the texture sensitivity of the HVS. However, their scheme should further consider the following issues:

• The basis function of the wavelet transform plays an important role during the application of CSF for the HVS in the wavelet transform domain, but the study [27] didn’t consider this key factor.

• The embedding factors place too much weight on the low-frequency domain, rather than giving equal emphasis to the medium- and high-frequency domains.

• The interrelationship of block classification and the characteristics of the embedding location should be further analysed.

For the first issue, the direct application of the CSF for the HVS in the wavelet transform domain needs to be studied further [28–32], since the basis function of the wavelet transformation is a critical factor affecting the visibility of noise in the DWT domain. For the second issue, watermark embedding in the low-frequency components results in high degradation of the image fidelity; the best trade-off between the visibility of the watermark and its resistance to removal still needs to be established. For the third issue, the plane, edge and texture block classification in [27] is a reasonable approach; however, the local and global characteristics of the wavelet coefficients should be carefully considered, and a content-adaptive approach is necessary for making the optimal decision.

4.3. DES (Data Encryption Standard)
Data security research has attracted a lot of interest in the research community, in areas such as databases [33], data mining [34], and Internet transactions [35, 36].


The DES algorithm is the most widely known block cipher in the world and, even today, resists most practical attacks short of exhaustive key search. It was created by IBM and defined in 1977 as U.S. standard FIPS 46. It is a 64-bit block cipher with 56-bit keys and 16 rounds. A round in DES is a substitution (confusion) followed by a permutation (diffusion). For each 64-bit block of plaintext that DES processes, an initial permutation is performed and the block is broken into two 32-bit halves, a left half L_i and a right half R_i. The 16 rounds of substitutions and permutations, called function f, are then performed. In each round, a 48-bit DES round key K_i and the current R_i are input to function f. The output of f is then XORed with the current L_i to give R_{i+1}, and the current R_i becomes L_{i+1}. After the 16 rounds, the two halves are rejoined and a final permutation is the last step. This process is shown in Fig. 1.

Figure 1. The DES algorithm
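As a minimal sketch of the Feistel structure just described (an illustration only: the initial and final permutations, the S-box round function f and the key schedule of real DES are left as abstract placeholders):

def feistel_rounds(block64, round_keys, f):
    """Run the 16-round Feistel structure of DES on one 64-bit block.
    round_keys holds the 16 subkeys K_i; f is the round function."""
    # split the 64-bit block into two 32-bit halves L and R
    left = (block64 >> 32) & 0xFFFFFFFF
    right = block64 & 0xFFFFFFFF
    # 16 rounds: L_{i+1} = R_i and R_{i+1} = L_i XOR f(R_i, K_i)
    for key in round_keys:
        left, right = right, left ^ (f(right, key) & 0xFFFFFFFF)
    # rejoin the halves (real DES also applies a final permutation, omitted here)
    return (left << 32) | right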

In January 1999, a DES cracking contest was held by RSA Data Security Inc. After only 22 hours and 15 minutes, the DES-encrypted message was cracked by an international effort combining a custom hardware DES cracking device built by the Electronic Frontier Foundation (EFF) and a group of distributed computing enthusiasts known as distributed.net. The EFF hardware cost US$250,000 to design and build; creating a second device would be much cheaper. Distributed.net used the idle CPU time of its members' computers at no extra cost. With more money, a DES cracking device could be built to decrypt messages within a few hours or even minutes. In this paper we use the DES algorithm to encrypt the PDF documents.

5. The Proposed Scheme

5.1 Embed and encrypt PDF document
The goal is to protect the copyright of the PDF document after releasing it. It is assumed that the owner publishes the KPDF document and keeps the source PDF document for himself. By showing the watermark and the owner details, the owner can prove that copies of the released KPDF document belong to him. The software consists of two major entities: the KPDF creator and the KPDF viewer. The KPDF creator is shown in Fig. 2 and its flowchart in Fig. 3; it performs the following steps:

1. Embed the digital watermark into the PDF document.

2. Generate the header string, which contains the publisher info and the parameter values (expire date, allow password, allow print and allow print screen), convert it into an array of bytes, and append a last byte containing the number of header bytes.

3. Read the work document (with or without the watermark) into an array of bytes, then transpose the array of bytes from Z to A into a new array of bytes.

4. Create a new array of bytes containing the header bytes, the reversed bytes, and one trailer byte holding the header length.

5. To preserve the integrity of the digital watermark, a symmetric cryptographic algorithm (DES) is employed, using a hashed key to create a unique 32-character (256-bit) value.

6. Write all the bytes into the KPDF document.
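A minimal sketch of the byte layout produced by steps 2-4 above (the function and field names here are hypothetical; the watermark embedding of step 1 and the DES encryption of step 5 are not shown):

def build_kpdf_payload(pdf_bytes, publisher, expire_date,
                       allow_password, allow_print, allow_print_screen):
    """Assemble header bytes + reversed document bytes + one trailer byte."""
    # Step 2: header string with publisher info and parameter values, as bytes
    header = "%s|%s|pwd=%d|print=%d|prtsc=%d" % (
        publisher, expire_date,
        int(allow_password), int(allow_print), int(allow_print_screen))
    header_bytes = header.encode("utf-8")
    assert len(header_bytes) < 256          # header length must fit in the trailer byte
    # Step 3: "transpose the array of bytes from Z to A", i.e. reverse the document bytes
    reversed_doc = pdf_bytes[::-1]
    # Step 4: header bytes + reversed bytes + one trailer byte holding the header length
    payload = header_bytes + reversed_doc + bytes([len(header_bytes)])
    # Steps 5-6 (not shown): the payload would be DES-encrypted and written to the KPDF file
    return payload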

(a)

(b)

Figure 2. (a) KPDF Creator without watermark Selection (b) KPDF Creator with watermark Selection


Figure 3. KPDF Creator flowchart

5.2. Decrypt document
The KPDF viewer is shown in Fig. 4 and its flowchart in Fig. 5; it performs the following steps:

1. Decrypt the KPDF document with the encryption key.

2. Get the last byte of the document, which holds the header byte length.

3. Convert the header array of bytes into a string of publisher information and parameter values (expire date, allow password, allow print and allow print screen).

4. Transpose the reversed array back to its original form, then write the array into a PDF document.

5. Display the PDF document in the PDF control of the main viewer.

Figure 4. KPDF Viewer

Figure 5. KPDF Viewer flowchart
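The corresponding sketch of steps 2-4 of the viewer (again with hypothetical names, mirroring the packing sketch above; the DES decryption of step 1 is assumed to have already been applied to the payload):

def read_kpdf_payload(payload):
    """Recover the header parameters and the original PDF bytes from a payload."""
    # Step 2: the last byte holds the header length
    header_len = payload[-1]
    header = payload[:header_len].decode("utf-8")
    publisher, expire_date, allow_password, allow_print, allow_print_screen = header.split("|")
    # Step 4: undo the byte reversal to recover the original PDF bytes
    pdf_bytes = payload[header_len:-1][::-1]
    params = {"publisher": publisher, "expire": expire_date,
              "password": allow_password, "print": allow_print,
              "print_screen": allow_print_screen}
    return params, pdf_bytes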

5.3. Discussion
KPDF converts existing PDF files to a copy-protected format in which digital watermarking is proposed as a last line of defense, protecting the copyright of PDF files through visible watermarks. In addition, to preserve the integrity of the digital watermark, a symmetric cryptographic algorithm (DES) is employed. The KPDF format generated by the software provides the following capabilities:

• Embed a digital watermark into PDF documents
• Protect PDF documents with an encryption key
• Distribute secure PDF documents that cannot be copied
• Prevent print screen
• Stop all screen capture software
• Password protect PDF documents
• Prevent printing of PDF documents
• Set an expiry date for PDF documents, validated by a time server
• Prevent extraction of original images and resources

6. Conclusion and Future Work

The Portable Document Format is an almost universally supported file format. During the past sixteen years it has become a de facto standard for electronic document distribution. Besides its portability, it provides its own security features, such as encryption and digital signatures, and it is regarded as a secure document standard compared to other popular document file formats. PDF is the most popular standard for document exchange over the Internet. PDF files are supported on almost all platforms, from common general-purpose operating systems and web browsers to more exotic platforms such as mobile phones and printers. Such universal support is both a blessing and a burden. It is without doubt a blessing for authors, who can trust that a PDF document can be read practically anywhere by anybody. At the same time, however, all of these machines share a common surface that is exposed and can be exploited. Being very complex to parse (the ISO standard document is over 700 pages long), the format is also vulnerable, as implementation errors are likely to happen. This is clearly visible from the vulnerability history of Adobe Reader, which is by no means Adobe's problem alone but concerns all other implementations of PDF readers and writers as well. However, as Adobe's own Reader is likely the most common tool for browsing PDF documents, it is also the one most likely to be attacked.

We have presented a new method for digital watermarking of PDF documents. Watermarking embeds ownership information directly into the document, and it is proposed as a "last line of defense" against unauthorized distribution of digital media. Desirable properties of watermarks include imperceptibility, robustness, and security. An application showed how watermarking has been implemented for PDF files with the new algorithm. The success of these methods encourages the development of more sophisticated watermarking algorithms as part of a larger system for protecting valuable digital documents. In this digital watermarking process the watermark is protected from attacks by both a transposition cipher and the DES cryptographic algorithm; other cipher functions and cryptographic algorithms should be evaluated in future work. A further study could investigate how to support online registration and online requests for PDF documents.


References

[1] Adobe Systems Incorporated, Portable Document Format Reference Manual, version 1.7, November 2006, http://www.adobe.com.

[2] W. Bender, D. Gruhl, N. Morimoto, A. Lu, Techniques for data hiding, IBM System Journal 35 (3, 4) (February 1996).

[3] H.M. Meral, E. Sevinc, E. Unkar, B. Sankur, A.S. Ozsoy, T. Gungor, Syntactic tools for text watermarking, in: Proceedings of SPIE International Conference on Security, Steganography, and Water- marking of Multimedia Contents, San Jose, CA, USA, January 29–February 1, 2007.

[4] M. Topkara, U. Topkara, M.J. Atallah, Information hiding through errors: a confusing approach, in: Proceedings of SPIE International Conference on Security, Steganography, and Watermarking of Multimedia Contents, San Jose, CA, USA, January 29–February 1, 2007.

[5] S. Zhong, X. Cheng, T. Chen, Data hiding in a kind of PDF texts for secret communication, International Journal of Network Security 4 (1) (January 2007) 17–26.

[6] Castiglione, A., De Santis, A., Soriente, C., 2007. Taking advantages of a disadvantage: digital forensics and steganography using document metadata. Journal of Systems and Software, Elsevier 80, 750–764.

[7] Brassil, J., Low, S., Maxemchuk, N. and O'Gorman, L., Electronic marking and identification techniques to discourage document copying. Proceedings of IEEE INFOCOM '94, 1994, 3, 1278–1287.

[8] S. Voloshynovskiy, F. Deguillaume, O. Koval, T. Pun, Robust watermarking with channel state estimation, Part I: theoretical analysis, Signal Processing: Security of Data Hiding Technologies, (Special Issue) 2003–2004, to appear.

[9] S. Voloshynovskiy, F. Deguillaume, O. Koval, T. Pun, Robust watermarking with channel state estimation, Part II: applied robust watermarking, Signal Processing: Security of Data Hiding Technologies, (Special Issue) 2003–2004, to appear.

[10] Adobe Systems Inc., 2010. Adobe PDF Reference Archives. http://www.adobe.com/devnet/pdf/pdf_reference_archive.html (Last updated January 2010).

[11]Byers, S., 2004. Information leakage caused by hidden data in published documents. IEEE Security and Privacy 2 (2), 23–27.

[12] Wikipedia, the Online Encyclopedia, 2009. The Calipari Incident. http://en.wikipedia.org/wiki/Nicola_Calipari/, http://en.wikipedia.org/wiki/Rescue_of_Giuliana_Sgrena/ (Last updated December 2009).

[13]King, J.C., 2004. A format design case study: PDF. In: HYPERTEXT’04: Proceedings of the Fifteenth ACM Conference on Hypertext and Hypermedia. ACM Press, New York, NY, USA, pp. 95–97.

[14] Bagley, S.R., Brailsford, D.F., Ollis, J.A., 2007. Extracting reusable document com- ponents for variable data printing. In: DocEng’07: Proceedings of the ACM Symposium on Document Engineering. ACM Press, New York, NY, USA, pp. 44–52.

[15] Chao, H., Fan, J., 2004. Layout and Content Extraction for PDF Documents. Lecture Notes in Computer Science LNCS 3163, 213–224.

[16] Zhong, S., Cheng, X., Chen, T., 2007. Data hiding in a kind of PDF texts for secret communication. International Journal of Network Security 4 (1), 17–26.

[17] G.W. Braudaway, K.A. Magerlein, F.C. Mintzer, Protecting publicly available images with a visible image watermark, Proc. SPIE, Int. Conf. Electron. Imaging 2659 (1996) 126–132.

[18] J. Meng, S.F. Chang, Embedding visible video watermarks in the compressed domain, Proc. ICIP 1 (1998) 474–477.

[19] S.P. Mohanty, K.R. Ramakrishnan, M.S. Kankanhalli, A dual watermarking technique for image, Proc. 7th ACM Int. Multimedia Conf. 2 (1999) 9–51.

[20] P.M. Chen, A visible watermarking mechanism using a statistic approach, Proc. 5th Int. Conf. Signal Process. 2 (2000) 910–913.

[21] M.S. Kankanhalli, R. Lil, R. Ramakrishnan, Adaptive visible watermarking of images, Proc. IEEE Int’l Conf. Multimedia Comput. Syst. (1999) 68–73.

[22] S.P. Mohanty, M.S. Kankanhalli, R. Ramakrishnan, A DCT domain visible watermarking technique for image, Proc. IEEE Int. Conf Multimedia Expo 20 (2000) 1029–1032.

[23] S.P. Mohanty, K.R. Ramakrishnan, M.S. Kankanhalli, A dual watermarking technique for image, Proc. 7th ACM Int. Multimedia Conf. 2 (1999) 9–51.

[24] Y. Hu, S. Kwong, Wavelet domain adaptive visible watermarking, Electron. Lett. 37 (20) (2001) 1219–1220.

[25] Y. Hu, S. Kwong, An image fusion-based visible watermarking algorithm, in: Proc. Int'l Symp. Circuits Syst., IEEE Press, 2003, pp. 25–28.

[26] L. Yong, L.Z. Cheng, Y. Wu, Z.H. Xu, Translucent digital watermark based on wavelets and error-correct code, Chinese J. Comput. 27 (11) (2004) 1533–1539.

[27] B.B. Huang, S.X. Tang, A contrast-sensitive visible watermarking scheme, IEEE Multimedia 13 (2) (2006) 60–66.

[28] J.L. Mannos, D.J. Sakrison, The effects of a visual fidelity criterion on the encoding of images, IEEE Trans. Info. Theory 20 (4) (1974) 525–536.

[29] D. Levický, P. Foriš, Human Visual System Models in Digital Image Watermarking, Radio Eng. 13 (4) (2004) 38–43.

[30] A.P. Beegan, L.R. Iyer, A.E. Bell, Design and evaluation of perceptual masks forwavelet image compression, in: Proc. 10th IEEE Digital Signal Processing Workshop, 2002, pp. 88–93.

[31] S. Voloshynovskiy, et al., A stochastic approach to content adaptive digital image watermarking, in: Proc. 3rd Int. Workshop Information Hiding, Dresden, Germany, 1999, pp. 211–236.

[32] A.B. Watson, G.Y. Yang, J.A. Solomon, J. Villasenor, Visibility of wavelet quantization noise, IEEE Trans. Image Proc. 6 (8) (1997) 1164–1175.


[33] Iyer B, Mehrotra S, Mykletun E, Tsudik G, Wu Y. A framework for efficient storage security in RDBMS. Proceedings of EDBT, 2004, p. 147–64.

[34] Barbara D, Couto J, Jajodia S. ADAM: a testbed for exploring the use of data mining in intrusion detection. SIGMOD Record 2001;30(4):15–24.

[35] Bouganim L, Pucheral P. Chip-secured data access: confidential data on untrusted servers. In: Proceedings of very large databases (VLDB), 2002, Hong Kong China.

[36] Rubin A, Greer D. A survey of the world wide web security. IEEE Computer September 1998;31(9):34–41.

Author’s Profile

Karim Shamekh received the B.S. degree in Computer Science from High Technology Institute in 2006. During 2007-2010, he is preparing the M.S. degree in Computers and Information Systems from the Arab Academy for Science, Technology and Maritime Transport. His interests are concerned with Information Security, Web Mining

and Image processing.

Dr. Ashraf Darwish received his PhD degree in computer science in 2006 from computer science department at Saint Petersburg State University (specialization in Artificial Intelligence), and joined as lecturer (assistant professor) at the computer science department, Faculty of Science,

Helwan University in 25 June 2006.

Dr. Mohamed Kouta received the B.S. degree in Electrical Engineering from Military Technical College in 1972. He received the M.S. and Ph.D. degrees in computer science from Jons Hopkins University and Clarkson University in 1982 and 1985, respectively. He is the Chairman of Business information system(BIS) Department (Cairo Branch),college of management and technology, Arab academy

for science and technology (AAST) and the Vice Dean for Education.


Analysis of the Performance of Registration of Mono and Multimodal Brain Images using Fast Walsh-Hadamard Transform

D. Sasikala1 and R. Neelaveni2

1Assistant Professor, Department of Computer Science and Engineering,

Bannari Amman Institute of Technology, Sathyamangalam. Tamil Nadu-638401, India. [email protected]

2Assistant Professor, Department of Electrical and Electronics Engineering,

PSG College of Technology, Coimbatore, Tamil Nadu -641004, India.

Abstract: Image registration has great significance in medicine, and numerous techniques have been developed for it. This paper introduces a method for medical image registration using the Fast Walsh-Hadamard Transform and measures its performance using the correlation coefficient and the time taken for registration. The algorithm can be used to register images of the same or different modalities. Each image patch is expanded in terms of Fast Walsh-Hadamard basis functions, and each basis function captures an aspect of local structure, e.g., a horizontal edge, a corner, etc. The coefficients are normalized and used as digits in a chosen number system, which allows one to form a distinct number for each type of local structure. The results show that the Fast Walsh-Hadamard transform achieves better results than the conventional Walsh transform: it performs medical image registration in less time and with a higher correlation coefficient. Since in medical images information is more important than time, the correlation coefficient is used as a measure, and it shows that the Fast Walsh-Hadamard Transform is better than the Walsh Transform in terms of both measures. Keywords: Walsh Transform, Fast Walsh-Hadamard Transform, Local Structure, Medical Image Registration, Normalization.

1. Introduction
Digital image processing aims to build systems that can perform visual functions: enhancement by improving image quality through noise filtering, restoration of images, and compression to save storage space and channel capacity during transmission. It is a rapidly growing field with emerging applications in many areas of science and engineering. The main principle of registration is to combine sets of data, together with any deviations or similarities between them, into a single dataset. These sets of data are obtained by sampling the same scene or object at different times, from different perspectives, or in different co-ordinate systems. The purpose of registration is to produce a single dataset merged with all the details of the sets acquired at different times, perspectives or co-ordinate systems. Such data is vital in medicine, for example when doctors prepare for surgery. The most familiar and significant classes of image analysis algorithms with medical applications [1, 3] are image registration and image segmentation. In image analysis, the same input yields a fairly detailed description of the scene whose image is being considered, and image analysis algorithms perform registration as part of producing that description. In single-subject analysis, the statistical analysis is done either before or after registration, whereas in group analyses it is done after registration. Registration transforms the different sets of data into one co-ordinate system, which is required for comparison or integration.

Registration is generally one of the most difficult tasks in image processing, because the images must be aligned so that common features overlap and any differences are emphasized and immediately visible to the naked eye.

There is no general registration algorithm [1-17] that works reasonably well for all images. A suitable registration algorithm must be chosen or developed for the particular problem, as such algorithms are ad hoc in nature. The algorithms can be incorporated explicitly or implicitly, or even in the form of various parameters, and this step determines the success or failure of image analysis. The method generally involves determining a number of corresponding control points in the images and, from the correspondences, determining a transformation function that establishes the correspondence between the remaining points in the images. Registration techniques may be classified based on four different aspects: (i) feature selection (extracting features from an image) using similarity measures and a correspondence basis, (ii) the transformation function, (iii) the optimization procedure, and (iv) the model for processing by interpolation.

Amongst the numerous algorithms developed for image registration so far, methods based on image intensity values are particularly excellent as they are simple to automate as solutions to optimization problems. Pure translations, for example, can be calculated competently, and universally, as the maxima of the cross correlation function between two images [11] [15] [17]. Additional commands such as rotations, combined with scaling, shears, give rise to nonlinear functions which must be resolved using iterative nonlinear optimization methods [11].

In the medical imaging field, image registration is regularly used to combine the complementary and synergistic information of images attained from different


modalities. A widespread problem when registering image data is that one does not have direct access to the density functions of the image intensities. They must be estimated from the image data. A variety of image registration techniques have been used for successfully registering images that are unoccluded. This is generally practiced with the use of Parzen windows or normalized frequency histograms [12].

The work proposed in this paper uses Fast Walsh-Hadamard Transform (FWHT) [18, 19] for image registration. The coefficients obtained are normalized to determine a unique number which in turn represents the digits in a particular range. The experiments conducted on clinical images show that proposed algorithm performed well than the conventional Walsh Transform(WT) method in medical image registration. In addition, this paper provides a comparative analysis of Fast Walsh-Hadamard transform and Walsh transform in Medical image registration in terms of correlation coefficient and time taken for registration.

The remainder of the paper is ordered as follows. Section 2 provides an overview on the related work for image registration. Section 3 explains Walsh transform in image registration. Section 4 describes the proposed approach for image registration using Fast Walsh-Hadamard Transform. Section 5 illustrates the experimental results to prove the efficiency of the proposed approach in image registration in terms of correlation coefficient and time taken for registration and Section 6 concludes the paper with a discussion.

2. Related Works
This section provides a brief look at relevant research work in image registration. An automatic scheme using global optimization techniques for retinal image registration was put forth by Matsopoulos et al. in [1]. A robust approach that estimates the affine transformation parameters necessary to register any two digital images misaligned due to rotation, scale, shear, and translation was proposed by Wolberg and Zokai in [2]. Zhu described an approach based on cross-entropy optimization in [3]. Jan Kybic and Michael Unser put forth an approach for fast elastic multidimensional intensity-based image registration with a parametric model of the deformation in [4]. Bentoutou et al. in [5] offered an automatic image registration method for applications in remote sensing. A novel approach that addresses the range image registration problem for views having low overlap, possibly with substantial noise, was described by Silva et al. in [6]. Matungka et al. proposed an approach based on the Adaptive Polar Transform (APT) for image registration in [7, 10]. A feature-based, fully unsupervised methodology dedicated to the fast registration of medical images was described by Khaissidi et al. in [8]. Wei Pan et al. in [9] proposed a technique for image registration using the Fractional Fourier Transform (FFT).

3. Walsh Transform
The Walsh and Haar transforms [13] are examples of orthogonal transforms. The coefficients of such an expansion indicate how strongly the corresponding structure is present at the particular position. If these coefficients are normalized by the dc coefficient of the expansion, i.e., the local average gray value of the image, then they measure purely the local structure, independent of modality. Walsh basis functions correspond to local structure in the form of a positive- or negative-going horizontal or vertical edge, a corner of a certain type, etc. In addition, registration schemes based on wavelet coefficient matching do not provide a general mechanism for combining the matching results across different scales.

Suppose there are two images I1 and I2, where I1 is the reference image and I2 is an image that has to be deformed to match I1. First, we consider around each pixel, excluding border pixels, a 3×3 neighborhood and compute from it the nine Walsh coefficients (the 3×3 WT of a 3×3 image patch). If f is the input image patch, the matrix of coefficients g is computed using equation (1):

g = W^{-1} f (W^{-1})^T   (1)

Matrix g contains the coefficients of the expansion of the image patch in terms of the basis images formed by taking the vector outer products of the rows of matrix W and its inverse W^{-1}. These basis images are shown in Fig. 1(a). The coefficients are denoted by a00, a01, a02, a10, a11, a12, a20, a21, a22, arranged in matrix form as shown in Fig. 1(b). After normalization and quantization these coefficients take integer values in the range [0, 9]. The normalization given by equation (2) makes the method robust to global changes of illumination:

α_ij = a_ij / a_00   (2)

The dense features and the rigid-body transformation provide plenty of redundancy in the system, making it robust to noise and to bad matches of individual pixels, which effectively represent a lack of local information. One may construct a unique number out of the eight normalized coefficients by using them as the digits of that number. The number of quantization levels depends on the number system adopted; if one sticks with the decimal system, the normalized coefficients are quantized so that they take integer values in the range [0, 9].

(a) WT basis images for a 3×3 image

(b) The nine coefficients in matrix form:
a00 a01 a02
a10 a11 a12
a20 a21 a22

Figure 1. Walsh Transformation
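A small sketch of the coefficient-to-number construction described above; the 3×3 basis matrix W is a placeholder argument here (the paper derives it from the Walsh functions), and the scaling and clipping used for quantization are assumptions for illustration only:

import numpy as np

def unique_number(patch, W,
                  ordering=((0, 1), (1, 0), (2, 0), (0, 2), (1, 1), (2, 1), (1, 2), (2, 2))):
    """Compute the nine coefficients g = W^-1 f (W^-1)^T of a 3x3 patch (eq. (1)),
    normalize them by the dc coefficient a00 (eq. (2)), quantize the eight ac
    coefficients to decimal digits and concatenate them into one number
    (the default ordering corresponds to ordering IA, listed below)."""
    W_inv = np.linalg.inv(np.asarray(W, dtype=float))
    g = W_inv @ np.asarray(patch, dtype=float) @ W_inv.T
    alpha = g / g[0, 0]                                              # normalization by a00
    digits = np.clip(np.rint(np.abs(alpha) * 9), 0, 9).astype(int)   # map to [0, 9] (assumption)
    return int("".join(str(digits[i, j]) for i, j in ordering))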

In Figure 1(a) the coefficients along the first row and the first column are of equal importance, as they measure the presence of a vertical or a horizontal edge, respectively. The remaining four coefficients measure the presence of a corner.

The following orderings of coefficients are used:
Ordering IA: α01, α10, α20, α02, α11, α21, α12, α22
Ordering IB: α10, α01, α02, α20, α11, α12, α21, α22
Ordering IIA: α22, α21, α12, α11, α02, α20, α10, α01
Ordering IIB: α22, α12, α21, α11, α20, α02, α01, α10
The performance of the FWHT is better than that of the WT in terms of the correlation coefficient (CC) and the time taken for registration, for both mono- and multimodal image registration.

The drawbacks of the WT are as follows:
(i) Errors are present in the final registration stage.
(ii) It consumes more CPU time for registration, as many calculations are involved; this is its major drawback compared to the FWHT.
The advantages of the FWHT are as follows:
(i) It is more reliable, producing the least error in the final registration stage compared to the WT.
(ii) It consumes much less CPU time for registration, as the calculations are performed with a divide-and-conquer method; this is its major advantage over the WT.
(iii) The performance of the FWHT is better than that of the WT in terms of CC for monomodal and multimodal image registration.

Computed Tomography (CT) or Magnetic Resonance (MR) images represent the monomodal cases, while CT-CT, CT-MR, and MR-MR pairs represent the multimodal experiments. Successful registration shows that the FWHT produces accurate results, in terms of both the correlation coefficient and the time taken for registration, on a diversity of image sets and without any tuning in the preprocessing step.

4. Proposed Approach

4.1 Fast Walsh Hadamard Transform
A fast transform algorithm can be considered as a sparse factorization of the transform matrix, with each factor referred to as a stage. The proposed algorithms have a regular interconnection pattern between stages; consequently, the inputs and outputs of each stage are addressed from or to the same positions, and the factors of the decomposition (the stages) are identical to one another. The 2×2 Hadamard matrix H_2 is defined by equation (3) as

H_2 = [ 1  1 ;  1  -1 ]   (3)

A set of radix-R factorizations in terms of identical sparse matrices can be obtained rapidly from the FWHT property that relates the matrix H with its inverse, given in equation (4):

H_{R^n} = R^n (H_{R^n})^{-1}   (4)

where H_{R^n} is the radix-R Walsh-Hadamard transform matrix of order R^n.

The FWHT is used to obtain the local structure of the images. Its basis functions can be used effectively to obtain digital numbers in the sense of coefficients [18], [19]. If these coefficients are normalized by the dc coefficient of the expansion, i.e., the local average gray value of the image, then they measure purely the local structure, independent of modality. These numbers are then normalized to obtain the unique number, which can be used as a feature for image registration. The implementation of the FWHT substantially reduces the time consumed in medical image registration compared with the conventional WT technique.
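For illustration, the classical radix-2 in-place butterfly below computes a (Hadamard-ordered) fast Walsh-Hadamard transform of a power-of-two-length signal; the paper itself follows the identical-sparse-factor formulation of [14], but the divide-and-conquer structure responsible for the speed advantage over the direct WT is the same:

import numpy as np

def fwht(signal):
    """Radix-2 fast Walsh-Hadamard transform (Hadamard order).
    The length of the input must be a power of two."""
    a = np.asarray(signal, dtype=float).copy()
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for j in range(start, start + h):
                # butterfly: sum and difference of paired samples
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a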

5. Experimental Results
A series of experiments was performed using medical images of different sizes. A set of CT and MR medical images depicting the head of the same patient is considered. In order to remove the background and the head outline, the original images are cropped, creating sub-images of various dimensions. The algorithms are evaluated by determining the CC and the time taken for registration.

The CC is a statistical measure of how well trends in the predicted values follow trends in the actual values; it measures how well the predicted values from a model fit the real data. As the strength of the relationship between the predicted and actual values increases, so does the CC; thus, the higher the CC, the better.

Correlation Coefficient
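The paper does not restate the formula; assuming the standard Pearson correlation coefficient between the intensities of the registered image A and the reference image B over corresponding pixels, it can be written as

CC(A, B) = Σ_{i,j} (A(i,j) − Ā)(B(i,j) − B̄) / sqrt( Σ_{i,j} (A(i,j) − Ā)^2 · Σ_{i,j} (B(i,j) − B̄)^2 ),

where Ā and B̄ are the mean intensities of A and B; a value of 1 indicates a perfect linear relationship between the two images.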

(i) Correlation Coefficient and time taken for registration for monomodal images:

For the evaluation of the algorithm, MR T2-registered images are used. (a) MRI T2-Registered Sagittal Image, 400 x 419, 88.8 kB.

Figure 2. MRI T2-Registered Sagittal Image (400 x 419, 88.8 kB) using FWHT

The results of the WT and FWHT are obtained. Figure 3 shows the pictorial outputs from the WT, which differ from Figure 2, whereas the outputs from the FWHT are the same as Figure 2.


Image for S.No 2    Image for S.No 4    Image for S.No 5

Figure 3. MRI T2-Registered Sagittal Image (400 x 419, 88.8 kB) using WT

Table 1: Results of WT & FWHT using CC

S.No | X (mm) | Y (mm) | Angle (deg) | CC after registration (WT) | CC after registration (FWHT)
1    |   4    |  -10   |      9      |   0.8342                   |   0.8343
2    |  -12   |   -7   |     13      |   0.8459                   |   0.8464
3    |   5    |   -7   |      5      |   0.0416                   |   0.8864
4    |  -14   |  -15   |      2      |   0.8933                   |   0.8934
5    |   -8   |   -7   |      1      |   0.2542                   |   0.9426


Figure 4. Comparison of WT & FWHT using CC.

Table 2: Time consumption for image registration using WT & FWHT

S.No | X (mm) | Y (mm) | Angle (deg) | Elapsed time for WT (s) | Elapsed time for FWHT (s)
1    |   4    |  -10   |      9      |   138.297               |   6.829
2    |  -12   |   -7   |     13      |   135.922               |   7.328
3    |   5    |   -7   |      5      |   133.406               |   6.328
4    |  -14   |  -15   |      2      |   136.000               |   6.750
5    |   -8   |   -7   |      1      |   141.125               |   6.125


Figure 5. Comparison of WT & FWHT in terms of time.

The above analysis shows that the performance of the FWHT is better than that of the WT in terms of both time and correlation coefficient.

(ii)Correlation Coefficient and Time taken for registration for Multimodal images: (i) Sagittal 840 x 754 - 69k - jpg CT & Sagittal 500 x 500 - 72k - jpg MRI – WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT

Figure 6. Images obtained using WT

(ii) Sagittal 840 x 754 - 69k - jpg CT & Sagittal 500 x 500 - 72k - jpg MRI – FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 7. Images obtained using FWHT

(iii) Sagittal 500 x 500 - 72k - jpg MRI & Sagittal 840 x 754 - 69k - jpg CT - WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT

Figure 8. Images obtained using WT (iv) Sagittal 500 x 500 - 72k - jpg MRI & Sagittal 840 x 754 - 69k - jpg CT - FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 9. Images obtained using FWHT



(v) Axial 320 x 420 - 40k - jpgCT Axial 553 x 642 - 38k – jpg-MRI - WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT Figure 10. Images obtained using WT

(vi) Axial 320 x 420 - 40k - jpgCT Axial 553 x 642 - 38k – jpg-MRI - FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 11. Images obtained using FWHT (vii) Axial 553 x 642 - 38k – jpg-MRI Axial 320 x 420 - 40k - jpgCT –WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT Figure 12. Images obtained using WT

(viii) Axial 553 x 642 - 38k – jpg-MRI Axial 320 x 420 - 40k - jpgCT – FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 13. Images obtained using FWHT

(ix) Sagittal 432 x 427 - 41k – jpg- CT Frontal 400 x 400 - 18k – jpg- MRI –WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT

Figure 14. Images obtained using WT

(x) Sagittal 432 x 427 - 41k – jpg- CT Frontal 400 x 400 - 18k – jpg- MRI –FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 15. Images obtained using FWHT (xi) Frontal 400 x 400 - 18k – jpg- MRI Sagittal 432 x 427 - 41k – jpg- CT- WT

a) Registered Image obtained using WT

b) Difference in images obtained using WT

Figure 16. Images obtained using WT

(xii) Frontal 400 x 400 - 18k – jpg- MRI Sagittal 432 x 427 - 41k – jpg- CT- FWHT

a) Registered Image obtained using FWHT

b) Difference in images obtained using FWHT

Figure 17. Images obtained using FWHT


Table 3: Results of WT & FWHT using CC

S.No | CC after registration (WT) | CC after registration (FWHT)
1    |   -0.0459                  |   0.5330
2    |    0.1838                  |   0.4429
3    |    0.0130                  |   0.6752
4    |   -0.0498                  |   0.7483
5    |   -0.0557                  |   0.5452
6    |   -0.0837                  |   0.5638

Figure 18. Comparison of WT & FWHT using CC.

Table 4: Time consumption for image registration using WT & FWHT

S.No | Elapsed time after registration for WT (s) | Elapsed time after registration for FWHT (s)
1    |   647.359                                  |   39.843
2    |   196.437                                  |    9.516
3    |    41.015                                  |    4.531
4    |   318.922                                  |   18.297
5    |   114.438                                  |    8.047
6    |   125.140                                  |    7.453

In this work, it is observed from the results that the FWHT gives better results than the WT in terms of both time and CC. The figures show that the pictorial outputs from the WT differ from those of the FWHT: registration of these images shows a loss of pixel information when registering with the WT, reflected in its lower CC, whereas the FWHT outputs do not show this loss.

Figure 19.Comparison of WT & FWHT in terms of time

The loss of pixel information with the FWHT is much smaller than with the WT, which is reflected in the higher CC of the FWHT. Hence the FWHT is preferable to the WT for image registration, and its registration time is also the shortest. The FWHT outperforms the WT in terms of both time and correlation coefficient for monomodal as well as multimodal images.

6. Conclusion
This paper proposes the application of the FWHT to monomodal and multimodal medical image registration. The transform reduces the time consumed in image registration and therefore proves to be a better approach for medical image registration than the conventional WT. The coefficients obtained using this transform are normalized to obtain a unique number, which represents the local structure of the image and serves as the feature used for registration. The experimental results show that the FWHT performs well in image registration. Future work will concentrate on mutual information and its application to image registration, with an analysis of its performance on monomodal and multimodal brain images using the FWHT, and on further improving the results by using other transforms together with correlation coefficients.

References

[1] George K. Matsopoulos, Nicolaos A. Mouravliansky,

Konstantinos K. Delibasis, and Konstantina S. Nikita, “Automatic Retinal Image Registration Scheme Using Global Optimization Techniques,” IEEE Transactions on Information Technology in Biomedicine, vol. 3, no. 1, pp. 47-60, 1999.

[2] G. Wolberg, and S. Zokai, “Robust image registration using log-polar transform,” Proceedings of International Conference on Image Processing, vol. 1, pp. 493-496, 2000.

[3] Yang-Ming Zhu, “Volume Image Registration by Cross-Entropy Optimization,” IEEE Transactions on Medical Imaging, vol. 21, no. 2, pp. 174-180, 2002.

[4] Jan Kybic, and Michael Unser, “Fast Parametric Elastic Image Registration,” IEEE Transactions on Image Processing, vol. 12, no. 11, pp. 1427-1442, 2003.

[5] Y. Bentoutou, N. Taleb, K. Kpalma, and J. Ronsin, “An Automatic Image Registration for Applications in Remote Sensing,” IEEE Transactions on Geosciences and Remote Sensing, vol. 43, no. 9, pp. 2127-2137, 2005.

[6] Luciano Silva, Olga R. P. Bellon, and Kim L. Boyer, “Precision Range Image Registration Using a Robust Surface Interpenetration Measure and Enhanced Genetic Algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 5, pp. 762-776, 2005.

[7] R. Matungka, Y. F. Zheng, and R. L. Ewing, “Image registration using Adaptive Polar Transform,” 15th IEEE International Conference on Image Processing, ICIP 2008, pp. 2416-2419, 2008.

[8] G. Khaissidi, H. Tairi and A. Aarab, “A fast medical image registration using feature points,” ICGST-GVIP Journal, vol. 9, no. 3, 2009.

[9] Wei Pan, Kaihuai Qin, and Yao Chen, “An Adaptable-Multilayer Fractional Fourier Transform Approach for Image Registration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 3, pp. 400-413, 2009.





[10] R. Matungka, Y. F. Zheng, and R. L. Ewing, “Image registration using Adaptive Polar Transform,” IEEE Transactions on Image Processing, vol. 18, no. 10, pp. 2340-2354, 2009.

[11] Jr. Dennis M. Healy, and Gustavo K. Rohde, “Fast Global Image Registration using Random Projections,” 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2007, pp. 476-479, 2007.

[12] C. Fookes and A. Maeder, “Quadrature-Based Image Registration Method using Mutual Information,” IEEE International Symposium on Biomedical Imaging: Nano to Macro, vol. 1, pp. 728-731, 2004.

[13] M. Petrou and P. Bosdogianni, Image Processing—The Fundamentals. New York: Wiley, 1999.

[14] Pere Marti-Puig, “A Family of Fast Walsh Hadamard Algorithms With Identical Sparse Matrix Factorization,” IEEE Transactions on Signal Processing Letters, vol. 13, no. 11, pp. 672-675, 2006.

[15] J. L. Moigne, W. J. Campbell, and R. F. Cromp, “An automated parallel image registration technique based on correlation of wavelet features,” IEEE Trans. Geosci. Remote Sens., vol. 40, no. 8, pp. 1849–1864, Aug. 2002.

[16] J. P. W. Pluim, J. A. Maintz, and M. A. Viergever, “Image registration by maximization of combined mutual information and gradient information,” IEEE Trans. Med. Imag., vol. 19, no. 8, pp. 899–814, Aug. 2000.

[17] Z. Zhang, J. Zhang, M. Liao, and L. Zhang, “Automatic registration of multi-source imagery based on global image matching,” Photogramm. Eng. Remote Sens., vol. 66, no. 5, pp. 625–629, May 2000.

[18] M. Bossert, E. M. Gabidulin, and P. Lusina, “Space-time codes based on Hadamard matrices proceedings,” in Proc. IEEE Int. Symp. Information Theory, Jun. 25–30, 2000, p. 283.

[19] L. Ping, W. K. Leung, and K. Y. Wu, “Low-rate turbo-Hadamard codes,” IEEE Trans. Inf. Theory, vol. 49, no. 12, pp. 3213–3224, Dec. 2003.

Author’s Profile

D.Sasikala is presently working as Assistant Professor, Department of CSE, Bannari Amman Institute of Technology, Sathyamangalam. She received B.E.( CSE) from Coimbatore Institute of Technology, Coimbatore and M.E. (CSE) from Manonmaniam Sundaranar University, Tirunelveli. She is now pursuing Phd in Image

Processing. She has 11.5 years of teaching experience and has guided several UG and PG projects. She is a life member of ISTE. Her areas of interests are Image Processing, System Software, Artificial Intelligence, Compiler Design.

R. Neelaveni is presently working as a Assistant Professor, Department of EEE, PSG College of Technology, Coimbatore. She has a Bachelor’s degree in ECE, a Master’s degree in Applied Electronics and PhD in Biomedical Instrumentation. She has 19 years of teaching experience and has guided many UG and PG projects. Her research and teaching interests includes

Applied Electronics, Analog VLSI, Computer Networks, and Biomedical Engineering. She is a Life member of Indian Society for Technical Education (ISTE). She has published several research papers in national and international Journals and Conferences.


A Robust Steganographic Method Against Noise for RGB Images Based on PCA

Hamid Dehghani1, Majid Gohari2 1 Malekashtar University of Technology, Department of Computer Sciences,

Lavizan Road, Tehran, Iran [email protected]

2 Malekashtar University of Technology, Department of Secret Communication,

Lavizan Road, Tehran, Iran [email protected]

Abstract: In this paper an effective color image steganography method based on Principal Component Analysis (PCA) is proposed. After applying the PCA technique to the RGB trichromatic system, eigenimages are obtained, and the least significant bits of the eigenimage pixel values are replaced with the data bits. The inverse PCA is then applied and the stego image results. The method exploits the correlation between the colors of an RGB image. Experimental results show that the proposed algorithm is robust against noise, attacks and detection, compared with LSB methods that replace the LSBs of the pixel values directly. With further optimization, the method may become an effective steganographic technique for secure communication.

Keywords: Steganography, Principal Component Analysis, RGB image, correlation.

1. Introduction
Steganography is a combination of science and art for concealing a secret message so that its very existence is not detectable [1]. The major characteristics of steganography are providing a large hidden capacity for data hiding and maintaining good perceived quality [2]. In the steganography process, the first and most important requirement is undetectability, followed by robustness (resistance to various image processing methods and compression) and the capacity of the hidden data; these requirements separate it from related techniques such as watermarking and cryptography. Steganographic algorithms can be divided into two groups: spatial/time domain and transform domain techniques. In the spatial domain, in the case of images, the secret message is embedded directly in the pixel values of the image. The transform domain methods operate in the Discrete Cosine Transform, Fourier or wavelet transform domains of the host signal [2]. The proposed method belongs to the transform domain, but to a color transform domain. Applying the PCA transform produces images called eigenimages [3]. In natural RGB images, the R-, G- and B-components are highly correlated [4], while the eigenimages are uncorrelated and the distribution of their data is quite heterogeneous. Hence a change made directly to the pixel values of the RGB image causes a more perceptible variation than the same change made to the pixel values of the eigenimages [5]. These two characteristics, the decorrelation and the data distribution of the eigenimages, can be used for effective steganography. Watermarking based on the PCA transform for tamper-proofing of web pages was proposed by Qijun and Hongtao [6]. Their scheme generates watermarks based on the PCA technique. The watermarks

are then embedded into the web page through the upper and lower cases of letters in HTML tags. When a watermarked web page is tampered with, the extracted watermarks can detect the modifications to the page. A. Abadpour and S. Kasaei [7] used PCA for compression and watermarking of color images. Steganography based on the PCA technique, however, is to the best of our knowledge a novel, first attempt. The rest of this paper is organized as follows: Section 2 briefly discusses the PCA technique. The proposed PCA-based steganography scheme is described in Section 3. Section 4 contains the experimental results and discussion. Finally, Section 5 concludes the paper.

2. Principal Component Analysis

Let I be an N×N matrix, denoted by I ∈ F^{N×N}, where F represents either the real or the complex number domain. The first step of PCA is to calculate the covariance matrix V of I, which is defined as

V = (1/N) Σ_{i=1}^{N} (x_i − x̄)^t (x_i − x̄),

where x_i is the i-th row vector in I, t denotes the transpose operation, and x̄ ∈ F^{1×N} is the average vector of the row vectors in I, i.e.

x̄ = (1/N) Σ_{i=1}^{N} x_i.

Then eigen decomposition (ED) is applied to V:

V = U L U^{-1},

where U^{-1} denotes the inverse matrix of U, L is a diagonal matrix with the eigenvalues of V as its diagonal elements λ_1, λ_2, ..., λ_N, and the columns of U, u_1, u_2, ..., u_N, are the eigenvectors of V. According to the theory of linear algebra, the primary information of V is in the larger eigenvalues and their corresponding eigenvectors. Without loss of generality, we assume that the diagonal elements of L have been sorted in descending order (λ_1 ≥ λ_2 ≥ ... ≥ λ_N). A property of PCA is its optimal signal reconstruction in the sense of minimum mean square error (MSE) when only a subset of eigenvectors, called principal vectors, is used as basis vectors of a feature space S:

S = span(u_1, u_2, ..., u_m),  m ≤ N.

With the feature space S, we can obtain another representation of the original data I by projecting it into S:

y_i = (x_i − x̄) · [u_1, u_2, ..., u_m],  i = 1, 2, ..., N,

where y_i ∈ F^{1×m} can be viewed as the coordinates of the original data in the feature space S. These are what we call "principal components", which, to some extent, can distinguish the original data well [7]. In the context of the proposed steganographic method, we generate the principal components (eigenimages) with m = N = 3, as shown in the next section. Further details on PCA can be found in Refs. [8, 9]; we do not discuss them further here due to limits of space.

3. Data embedment and extraction

In order to apply PCA, we first construct a matrix I from the given RGB image of size M×N; the elements of the R-, G- and B-components are taken as the three rows of the matrix:

I = [ r_1 r_2 ... r_{MN} ;  g_1 g_2 ... g_{MN} ;  b_1 b_2 ... b_{MN} ],

where r_k, g_k and b_k are the k-th pixel values of the R-, G- and B-components, respectively.

PCA is then applied to the matrix I, and we keep all m = N = 3 principal components as eigenimages (PCA1, PCA2, PCA3). Secret message bits are embedded in the pixel values of these eigenimages with common LSB steganographic methods. This requires rounding the elements of PCA1, PCA2 and PCA3 at both the transmitter and the receiver, and the rounding introduces errors. To overcome these errors, convolutional (error-correcting) coding [10] is applied to the message bits. After applying the inverse PCA, the stego image is obtained. The flowchart of the data embedding process is shown in Fig. 1(a).

Figure 1. The embedding and extraction schemes of the proposed method: (a) embedding scheme and (b) extraction scheme.

On the receiver side, the extraction process is carried out by reversing the embedding procedure, as shown in the block diagram of Fig. 1(b). From this figure it is clear that the method is blind: the receiver first computes the eigenimages of the stego image, then applies LSB extraction, and after decoding the retrieved data the secret message is obtained.
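A minimal numpy sketch of the embedding side, under stated assumptions (the helper name is hypothetical, only PCA1 carries the bits here, no convolutional coding is shown, and the final rounding back to 8-bit RGB is exactly the lossy step that the paper's error-correcting code compensates for):

import numpy as np

def embed_pca_lsb(rgb, bits):
    """Project RGB pixels onto their principal colour axes (eigenimages), replace
    the LSB of the rounded PCA1 coefficients with message bits, and apply the
    inverse PCA to obtain the stego image."""
    M, N, _ = rgb.shape
    pixels = rgb.reshape(-1, 3).astype(float)              # one row per pixel: (R, G, B)
    mean = pixels.mean(axis=0)
    _, U = np.linalg.eigh(np.cov((pixels - mean).T))       # eigenvectors of the 3x3 colour covariance
    coeff = np.rint((pixels - mean) @ U).astype(np.int64)  # rounded eigenimages PCA1..PCA3
    coeff[:len(bits), 0] = (coeff[:len(bits), 0] & ~np.int64(1)) | bits   # LSB embedding in PCA1
    stego = coeff.astype(float) @ U.T + mean               # inverse PCA
    return np.clip(np.rint(stego), 0, 255).astype(np.uint8).reshape(M, N, 3)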

4. Implementation Results

One of the error metrics used to compare the host image and the stego image is the Peak Signal-to-Noise Ratio (PSNR):

PSNR = 10 log10( 255^2 · M · N / Σ_{i,j} (c(i,j) − s(i,j))^2 ) dB,

where M and N are the image dimensions and c(i,j) and s(i,j) are the pixels of the cover and stego images, respectively [11].
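A direct numpy rendering of this metric (illustrative only; per-component PSNRs, as reported in Table 1, would be obtained by applying it to each colour channel separately):

import numpy as np

def psnr(cover, stego, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between a cover image and a stego image."""
    diff = cover.astype(float) - stego.astype(float)
    mse = np.mean(diff ** 2)                       # mean squared error over all pixels
    return 10.0 * np.log10(peak ** 2 / mse)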

4.1. Image quality
To quantify the invisibility of the embedded secret message, three 512×512 images, namely Lena, Baboon and Pepper, shown in Fig. 2, were used as host images. The secret message is 35000×3 bits (35000 bits embedded in each eigenimage), coded via convolutional coding of rate 4/8. Table 1 shows the average PSNR of the R-, G- and B-components and the percentage of correctly retrieved data for different embedding rates and different LSB locations (bpp: bits per pixel; LSB loc.: the LSB location(s) used for embedding). Errors usually occur in the lower LSBs, so selecting higher LSB location(s) for embedding the message bits decreases the error; however, raising the LSB location also decreases the PSNR. Experiments were carried out for 1, 2, 3, 4, 6 and 7 bit(s) per RGB pixel. The LSB location is selected for maximum data retrieval and, among comparable settings, for maximum PSNR. Since the proposed colour-transform-domain method differs from frequency-transform-domain methods both in its transform domain and in the supported formats (RGB for the colour transform domain; greyscale and JPEG for the FT, DCT and wavelet domains), we do not compare the proposed method with those methods.

4.2. Embedding capacity Since the proposed method is, in general, lossy, error-correcting codes are needed; the embedding capacity therefore depends on the image properties and the coding rate. Maximum capacity and PSNR can be optimized by an adaptive process that depends on the coding rate and LSB locations. The total number of secret message bits, Ns, that can be embedded is given by the equation below:

Ns = Σ(i = 1..3) bi · ri · M · N,

where bi is the bpp of PCAi and ri is the coding rate used for embedding in PCAi of the M-by-N RGB image. The results in Table 1 show that the proposed algorithm has acceptable capacity, although (because of the coding) it is lower than that of the plain LSB method.
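As a rough worked example of this capacity expression (a minimal sketch; the helper name and the chosen settings are illustrative, using the 512×512 images and rate-4/8 coding of Section 4.1):

def payload_capacity(M, N, bpp, rates):
    # Total payload bits: sum over the eigenimages of bpp_i * r_i * M * N.
    return sum(b * r * M * N for b, r in zip(bpp, rates))

# 1 bpp in each of the three eigenimages with rate-4/8 convolutional coding:
print(payload_capacity(512, 512, bpp=[1, 1, 1], rates=[0.5, 0.5, 0.5]))  # 393216.0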

4.3. Robustness against noise In this evaluation, as an active attack, we generate white Gaussian noise using the wgn(m,n,p) function of MATLAB and add it to the image components. wgn(m,n,p) generates an m-by-n matrix of white Gaussian noise with power p (in dBW). The parameter p is selected equal to the number of bits embedded in a pixel (bpp). The secret message is coded by convolutional coding with rate 4/8. The LSB and proposed methods were simulated and the results for the test images are gathered in Table 2.


It can be seen from this table that the PCA-based method is considerably more robust against white Gaussian noise than the LSB method. Hiding the secret message bits sequentially in the LSBs of the PCAi pixels causes changes that are spread across all components in the image domain, so the effect of the noise is reduced.
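The noise attack itself is easy to reproduce outside MATLAB; the sketch below (an analogue of wgn followed by an addition, with the helper name ours) adds zero-mean white Gaussian noise of a given power to one image component:

import numpy as np

def add_wgn(component, power_dbw, rng=None):
    # Add white Gaussian noise of the given power (dBW) to an image component.
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.sqrt(10.0 ** (power_dbw / 10.0))   # std of zero-mean noise with power p dBW
    noisy = component.astype(np.float64) + rng.normal(0.0, sigma, component.shape)
    return np.clip(noisy.round(), 0, 255).astype(np.uint8)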

Figure 2. Test images

Table 1: The average PSNR and percent of correct retrieval data for different bpp and optimum locations.
(Columns per row: bpp – LSB loc. for PCA1 / PCA2 / PCA3, then percent of correct retrieval data for PCA1 / PCA2 / PCA3, then PSNR.)

Lena:
  not used | 1 – 3   | not used || -     | 100   | -     || 49.62
  1 – 2    | 1 – 2   | not used || 100   | 100   | -     || 46.25
  1 – 3    | 1 – 2   | 1 – 2    || 100   | 100   | 99.99 || 41.60
  1 – 3    | 2 – 2:3 | 1 – 2    || 100   | 100   | 100   || 40.44
  1 – 2    | 2 – 2:3 | 3 – 1:3  || 100   | 99.99 | 99.41 || 41.81
  1 – 2    | 3 – 1:3 | 3 – 1:3  || 100   | 99.62 | 99.35 || 42.47

Baboon:
  not used | not used| 1 – 2    || -     | -     | 100   || 50.48
  1 – 2    | 1 – 2   | not used || 100   | 100   | -     || 46.51
  1 – 2    | 2 – 3:4 | not used || 100   | 100   | -     || 41.84
  1 – 3    | 2 – 2:3 | 1 – 2    || 100   | 100   | 100   || 40.59
  2 – 3:4  | 2 – 3:4 | 2 – 2:3  || 100   | 100   | 100   || 36.08
  2 – 3:4  | 2 – 3:4 | 3 – 2:4  || 99.99 | 100   | 99.69 || 36.31

Pepper:
  not used | 1 – 3   | not used || -     | 100   | -     || 47.95
  not used | 2 – 2:3 | not used || -     | 100   | -     || 49.56
  not used | 2 – 2:3 | 1 – 1    || -     | 100   | 97.71 || 46.88
  1 – 1    | 2 – 1:2 | 1 – 1    || 96.76 | 99.64 | 98.01 || 48.26
  2 – 2:3  | 2 – 1:2 | 2 – 1:2  || 96.25 | 97.05 | 95.78 || 43.64
  2 – 2:3  | 3 – 1:3 | 2 – 1:2  || 96.01 | 98.63 | 95.46 || 42.43

Table 2: Percent of correct retrieval data in presence of white Gaussian noise with variance equal to bpp, for LSB and PCA-based methods (secret message size is 35000×3 bits and coding rate is 4/8)

Method      | Image  | 1 bpp | 2 bpp | 3 bpp | 4 bpp
LSB         | Lena   | 50.27 | 50.24 | 49.96 | 50.03
            | Baboon | 50.02 | 49.60 | 50.33 | 49.49
            | Pepper | 49.78 | 50.02 | 50.39 | 50.11
PCA-based   | Lena   | 98.23 | 97.50 | 94.50 | 80.37
            | Baboon | 99.96 | 75.76 | 69.03 | 78.44
            | Pepper | 99.32 | 98.91 | 87.12 | 89.33

5. Conclusion In this paper a novel transform domain steganography method based on Principal Component Analysis for secure and robust steganography was proposed. The main idea is based on embedding secret data in the pixel values of the images obtained from the colour transform domain. The proposed blind steganography method has acceptable capacity and reduces the visual distortion of the stego image by hiding the secret data in eigenimages. Robustness against white Gaussian noise is the major characteristic of the proposed method.


References
[1] S. Katzenbeisser, F. Petitcolas, Information Hiding Techniques for Steganography and Digital Watermarking, Artech House Publishers, 2000.

[2] A. Cheddad, Digital image steganography: Survey and analysis of current methods, Signal Processing 90 (2010) 727–752.

[3] Arash Abadpour & Shohreh Kasaei, Color PCA eigenimages and their application to compression and watermarking, Image and Vision Computing 26 (2008) 878–890.

[4] H. Palus, Representation of color images in different color spaces, in The Colour Image Processing Handbook, S.J. Sangwine and R.E.N. Horne, Eds., pp. 67–90, Chapman and Hall, London, 1998.

[5] A. Abadpour, Color image processing using principal component analysis, Master’s thesis, Sharif University of Technology, Mathematics Science Department, Tehran, Iran, 2005.

[6] Q. Zhao, H. Lu, PCA-based web pagewatermarking, Pattern Recognition 40 (2007) 1334 – 1341.

[7] A. Abadpour, S. Kasaei, Color PCA eigenimages and their application to compression and watermarking, Image and Vision Computing 26 (2008) 878–890.

[8] J.E. Jackson, A User’s Guide to Principal Components, Wiley, New York, 1991.

[9] I.T. Jolliffe, Principal Component Analysis, Springer, New York, 1986.

[10] M. Bossert and F. Hergert, "Hard- and soft-decision decoding beyond the half minimum distance: an algorithm for linear codes," IEEE Trans. Inform. Theory, vol. IT-32, pp. 709–714, Sept. 1986.

[11] Z. Wang, A. Bovik, A universal image quality index, IEEE Signal Process. Lett. 9 (March 2002) 81–84.

Majid Gohari received the B.S. degree in Electrical Engineering from Ferdowsi University of Mashhad in 2008. He is now a graduate student in Secret Communication at Malek Ashtar University of Technology.


Evolution Of FTTH As A Novel Subscriber’s Broadband Service Access Technology

P. Rajeswari1, N. Lavanya2 and Shankar Duraikannan3

1Student M. E. Communication Systems,

Department of Electronics And Communication Engineering, Ranippettai Engineering College, Walajah – 632 513 [email protected]

2Student M. E. Communication Systems,

Department of Electronics And Communication Engineering, Ranippettai Engineering College, Walajah – 632 513 [email protected]

3Assistant Professor,

Department of Electronics And Communication Engineering, Ranippettai Engineering College, Walajah – 632 513 [email protected]

Abstract: Internet access has become so convenient that it is now often seen by customers as a "utility", a relatively undifferentiated key service. With the emergence of new equipment that demands optimal use of bandwidth, such as HDTV, mobile TV and wireless sound systems, operators have in recent years developed more and more services requiring higher bandwidth. Without fiber, operators face a bottleneck on service development and therefore on the ability to develop new services and revenues. This paper focuses on the deployment of FTTH using Ethernet and PON architectures. The comparative performance analysis of the architectures and the survey of different access networks emphasize that FTTH (using fiber optic cables) will be a promising technology for future bandwidth requirements: it offers a way to eliminate the bottleneck in the last mile, with speeds up to 100 times higher than copper, and enables new applications and services within the digital home.

Key Words: FTTH, Ethernet, Access, PON

1. Introduction
The increasing demand for telecommunication services is the key driver behind developments in access networks. Among the various broadband access technologies, such as digital subscriber loop (DSL) and cable modem, fiber-to-the-home (FTTH) is the end game for many service providers.

Figure 1. General architecture of FTTH

Since packets and frames of other categories of networks are transmitted over the broadband access networks, we need to find the technologies and architectures that will enable cost-effective transport of this traffic all the way to the home via an access network.

FTTH deployed with Passive Optical Network (PON) technology seems to be the best solution to alleviate the bandwidth bottleneck in the access network. FTTH is an access technology in which the optical fiber runs from the central office to the subscriber's living or work space. The optical line terminal (OLT) resides at the central office and the optical network unit (ONU) is on the customer premises; the OLT and ONU are interconnected by means of an optical fiber. The function of an ONU is to deliver the services received from the OLT to the home. An ONU can also serve many homes; in that case the network is called FTTC (fiber to the curb). Many applications, such as CATV and VoIP, reach the central office, where they are converted into a single wavelength and transmitted via optical fiber. Figure 1 represents the architecture of FTTH. 2. FTTH Architectures FTTH can be deployed using either Ethernet or PON architectures. Ring, star and tree are the topologies considered, of which tree is the most preferred and widely used topology [1].

FTTH architectures that have been deployed can be classified into three broad categories.

• Ring Architecture of Ethernet switches. • Star Architecture of Ethernet switches. • Tree architecture using PON technologies.

2.1 Point to Point Ethernet Based Architectures Ethernet (IEEE 802.3) is a widely used LAN standard. The requirements for rapid time to market and low cost per subscriber have favored network architectures based on Ethernet switching. Ethernet transmission and switching have become commodities in the enterprise networking market and have led to attractive costs, mature products, and rapid innovation cycles.


Initially, in Europe, FTTH using Ethernet was based on architectures in which the switches are located in the basements of multiple dwelling units and interconnected in the form of a ring. Such architectures provide excellent resilience against fiber cuts and are cost effective. However, the bandwidth shared over each ring is comparatively small, and for this reason star architectures are preferred.

Figure 2. General representation of point to point Ethernet networks

Figure 2 represents a simple point-to-point FTTH Ethernet architecture using a star topology. A dedicated link runs from the OLT to the home; the fiber may be a single fiber (100BASE-BX or 1000BASE-BX) or a pair of fibers (100BASE-LX or 1000BASE-LX). A number of specifications have been released in recent years, defining different interfaces to the physical layer [1]:
1. 100BASE-LX10: point-to-point 100 Mbps Ethernet links over a pair of single-mode fibers up to at least 10 km.
2. 100BASE-BX10: point-to-point 100 Mbps Ethernet links over an individual single-mode fiber up to at least 10 km.
3. 1000BASE-LX10: point-to-point 1000 Mbps Ethernet links over a pair of single-mode fibers up to at least 10 km.
4. 1000BASE-BX10: point-to-point 1000 Mbps Ethernet links over an individual single-mode fiber up to at least 10 km.

The supported maximum transmission speed is 100Mbit/s for slower links (100BASE–X) or 1000Mbit/s for faster links (1000BASE–X).

2.2 Passive Optical Networks-PON Architectures

Passive optical networks (PONs) are identified as an economical and future-safe solution to alleviate the bandwidth bottleneck in the access network [4], [5]. A PON is a single, shared optical fiber that uses inexpensive optical splitters to divide the single fiber into separate strands feeding individual subscribers. As the name implies, passive optical networks are passive in the sense that they employ simple passive optical splitters and combiners for data transport. As shown in Figure 3, a PON typically has a physical tree topology with the central office (CO) located at the root and the subscribers connected to the leaf nodes of the tree. The OLT is the root of the tree, and the ONUs are connected to it by optical fibers through passive optical splitters/combiners. The function of an optical splitter (OS) is to divide the power among the users on the link. The maximum splitting ratio is 1:64 or 1:128 [6], i.e., an OS can serve up to 128 users simultaneously.

Figure 3. Implementation of PON

2.2.1. Traffic Flow In PON In a PON, traffic flow comprises downstream and upstream data flows. Transport from the service provider to the subscriber is referred to as downstream, and its counterpart as upstream.

2.2.1. a. Downstream Data Flow

Figure 4. Traffic flow in downstream direction


The downstream direction represents data transmission from the OLT to the ONUs. The preferred wavelength is 1490–1550 nm, since the attenuation is very low, about 0.2 dB/km. As shown in Figure 4, services such as voice, data and video from different application networks are transported over the optical network to the OLT and are distributed to the ONUs through the OS by means of power splitting.

The optical splitter splits the power of the signal, i.e., if there are N users the splitting ratio is 1:N. Due to the power splitting the signal is attenuated, but its structure and properties remain the same. Each home receives the packets intended for it through its ONU.
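As a back-of-the-envelope illustration of the two loss mechanisms just described (fiber attenuation and 1:N power splitting), the sketch below estimates the received downstream power; the launch power and the helper name are illustrative assumptions, while the 0.2 dB/km figure is the attenuation quoted above:

import math

def downstream_power_dbm(launch_dbm, distance_km, split_n, fiber_db_per_km=0.2):
    # Received power = launch power - fiber loss - ideal 1:N splitting loss.
    fiber_loss = fiber_db_per_km * distance_km
    split_loss = 10.0 * math.log10(split_n)       # ideal splitter loss in dB
    return launch_dbm - fiber_loss - split_loss

# Assumed +3 dBm launch, 20 km reach, 1:64 splitter:
print(round(downstream_power_dbm(3.0, 20.0, 64), 1))   # about -19.1 dBm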

2.2.1. b. Upstream Data Flow

The upstream direction represents data transmission from the ONUs to the OLT. The preferred wavelength is 1310 nm. If signals from different ONUs arrive at the splitter input at the same time on the same 1310 nm wavelength, the result is a superposition of the ONU signals when they reach the OLT. Hence TDMA [1] is adopted to avoid interference between the ONU signals. In TDMA, time slots are provided to each user on demand for the transmission of their packets. At the optical splitter the packets arrive in order, are combined and transmitted to the OLT.
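A minimal sketch of the idea behind these upstream grants (illustrative only; real BPON/EPON/GPON grant protocols carry much more state, and the line rate and helper name are assumptions) assigns each requesting ONU a non-overlapping transmission window sized to its burst:

def grant_upstream_slots(requests_bytes, line_rate_bps=1_250_000_000):
    # Assign consecutive, non-overlapping upstream windows (start, end in
    # microseconds) to each ONU based on the burst size it requested.
    grants, t = {}, 0.0
    for onu, nbytes in requests_bytes.items():
        burst_us = nbytes * 8 / line_rate_bps * 1e6
        grants[onu] = (t, t + burst_us)
        t += burst_us
    return grants

print(grant_upstream_slots({"ONU1": 1500, "ONU2": 3000}))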

Figure 5. Traffic flow in upstream direction

2.3 PON Flavors

Figure 6. Evolution of PON

2.3.1 Broadband PON (BPON) BPON was the first PON standard to be introduced; it was standardized by the ITU-T as ITU-T G.983 in 1999 [7]. The Asynchronous Transfer Mode (ATM) protocol is used to carry user data, hence access networks based on this standard are sometimes referred to as APONs [8], [9]. ATM uses 53-byte cells (5 bytes of header and 48 bytes of payload). Because of the fixed cell size, ATM implementations can enforce quality-of-service guarantees, for example bandwidth allocation, delay guarantees, and so forth. ATM was designed to support both voice and data payloads. Yet ATM itself proved to be the main obstacle to the deployment of BPON, and despite many field trials [10], [11] BPON did not gain much popularity.

The APON protocol operates differently in the downstream and upstream directions. All downstream receivers receive all cells and discard those not intended for them based on ATM addressing information. Due to the broadcast nature of the PON, downstream user data is churned, or scrambled, using a churn key generated by the ONT to provide a low level of protection for downstream user data. In the upstream direction, transmission is regulated with a time-division multiple access (TDMA) system. Transmitters are told when to transmit by receipt of grant messages. Upstream APON modifies ATM and uses 56-byte ATM cells, with the additional 3 bytes of header being used for guard time, preamble bits, and a delimiter before the start of the actual 53-byte ATM cell.

2.3.2 Ethernet PON (EPON) EPON standards are set by the IEEE through the IEEE 802.3ah Ethernet in the First Mile task force, formed in 2001. Ethernet Passive Optical Network (EPON) is a point-to-multipoint network topology implemented with passive optical splitters that follows the Ethernet standard. It follows the 100BASE-X and 1000BASE-X specifications.

2.3.3 Gigabit PON (GPON) In 2001 a new effort was started to standardize PON networks operating at bit rates above 1 Gbps. The GPON standards were accepted by the ITU-T in January 2003 and are known as ITU-T recommendation G.984 [1]. Apart from the need to support higher bit rates, the overall protocol was opened for reconsideration, and the sought solution should be the most optimal and efficient in terms of support for multiple services.

The main GPON requirements are: • Full Service Support including voice (TDM, both

SONET and SDH), Ethernet (10/100 BaseT), ATM, leased lines and more.

• Physical reach of at least 20 km with a logical reach support within the protocol of 60 km.

• Support for various bit rate options using the same protocol, including symmetrical 622 Mb/s, symmetrical 1.25 Gb/s, 2.5 Gb/s Downstream and 1.25 Gb/s upstream and more.

• Security at the protocol level for downstream traffic due to the multicast nature of PON.


3. Performance of FTTH Networks Demand for bandwidth is increasing with the emergence of new applications such as tele-working, video conferencing and video telemetry. FTTH provides enormous bandwidth and a rich offering for triple-play services (voice, data, video). As Figure 7 shows, the extent to which FTTH can provide greater bandwidth at lower cost is unmatched by any other technology [15].

Figure 7.Estimated bandwidth demand for future [15]

Based on measurements of performance, FTTH performs

better than other types of broadband and this performance gap is widening over time.

Figure 8. Tested download performance of broadband media (Mbps) by year [14]

Recent surveys on the use of broadband (Figure 8) show that FTTH download speeds are currently 1.5 times faster than cable modem download speeds, and 5.7 times faster than the median DSL download speeds. In terms of upload speeds, FTTH is 3.2 times faster than cable modem, and 5.7 times faster than DSL [14]. 4. Cost Considerations As Figure 9 shows, the cost of FTTH network equipment and installation is lower than that of all other technologies, as the dominant part of the cost is the civil works, which can be considerably reduced if the construction is planned in advance.

Furthermore, while the FTTH network and its electronic elements have a lifecycle of many years, the fiber plant and

the optical distribution network have a longer lifecycle of at least 30 years.

Figure 9. Cost considerations

This longevity and the high cost of labor required in

physical construction place strong demands on the proper design of the fiber plant. 5. FTTH Forecast According to the FTTH Worldwide Market-Technology Forecast, 2006-2011, as shown in Figure 10, the number of homes connected to fiber will grow from about 11 million at the end of 2006 to about 86 million at the end of 2011, representing about 5% of all households worldwide. Growth will be dominated by Asia (59 million households in the Asia Pacific Region – APAC – will have fiber by 2011). The rest of the subscriber base will be split equally between the Americas and the Europe Middle-East and Africa (EMEA) region [13].

Figure 10. Estimated growth of FTTH in EMEA (Europe Middle East and Africa), APAC (Asia Pacific region) and

America (2005-2011) [13] 6. Comparison of PON flavors

The selected characteristics of existing PON flavors are compared and summarized in Table 1.


Table 1. Comparison of PON flavors

PARAMETER              | BPON          | EPON          | GPON
STANDARD               | ITU-T G.983   | IEEE 802.3ah  | ITU-T G.984
PACKET SIZE            | 53 bytes      | 1518 bytes    | 53–1518 bytes
DOWNSTREAM DATA RATE   | 622 Mbps      | 1.25 Gbps     | 2.5 Gbps
UPSTREAM DATA RATE     | 155 Mbps      | 1.25 Gbps     | 1.25 Gbps
DOWNSTREAM WAVELENGTH  | 1490–1550 nm  | 1550 nm       | 1490–1550 nm
UPSTREAM WAVELENGTH    | 1310 nm       | 1310 nm       | 1310 nm
TRAFFIC MODES          | ATM           | Ethernet      | ATM, Ethernet, TDM
MAX PON SPLITS         | 32            | 16            | 64 or 128
BANDWIDTH              | 40 Mbps       | 75 Mbps       | 1.25 Mbps
BER                    | -             | 10^-12        | 10^-10
EFFICIENCY             | 72%           | 49%           | 94%

7. Conclusion This paper has provided an overview of FTTH and its architectures based on Ethernet and PON. The comparative analysis of the features of the architectures in Table 1 shows that access based on PON will be a promising technology of the future, and the survey results show it to be a fast-growing technology. FTTH not only eliminates the bottleneck problem at the last mile but also offers much higher bandwidth and supports applications like HDTV, VoIP and telecom services when compared to other access technologies like DSL and cable.

References
[1] Dawid Nowak and John Murphy, "FTTH: The Overview of Existing Technologies", University College Dublin, Dublin 4, Ireland.

[2] Albert Domingo Vilar, “Modeling and deployment of NGAN in competitive market” 2009.

[3] Ziad A.Elsahn ,“Smooth upgrade of existing FTTH access networks: SAC-OCDMA and Dense SS-WDM solutions” 2010.

[4] C. Lin, “Broadband optical access networks and fiber-to-the-home: Systems technologies and deployment strategies”, John Wiley & Sons Ltd., September 2006.

[5] A. Girard, “FTTx PON technology and testing”, Quebec City, Canada: Electro- Optical Engineering Inc., 2005.

[6] T.Koonen, “Trends in optical access and in-building networks’ COBRA - Eindhoven the Netherlands.

[7] ITU-T, “G.983.1 - Broadband Passive Optical Networks (BPON): General characteristics,” June 1999.

[8] David Faulkner, Rajendrakumar Mistry, Tom Rowbotham, Kenji Okada, Wsewolod Warzanskyj, Albert Zylbersztejn, and Yves Picault, “The Full Services Access Networks Initiative,” IEEE Communications Magazine 35, pp. 58–68, Apr. 1997.

[9] Yoichi Maeda, Kenji Okada, and David Faulkner, “FSAN OAN-WG and future issues for broadband optical access networks,” IEEE Communications Magazine 39, pp. 126–132, Dec. 2001.

[10] Ingrid Van de Voorde and Gert Van der Plas, “Full Service Optical Access Networks: ATM Transport on Passive Optical Networks,” IEEE Communications Magazine 35(4), pp. 70–75, 1997.

[11] Hiromi Ueda, Kenji Okada, Brian Ford, Glenn Mahony, Stephen Hornung, David Faulkner, Jacques Abiven, Sophie Durel, Ralph Ballart, and John Erickson, “Deployment status and common technical specifications for a B-PON system,” IEEE Communications Magazine 39, pp. 134–141, Dec. 2001.

[12] www.ftthcouncilap.org, www.ftthcouncil.com
[13] "FTTH white paper on Cable solutions for operator diversity and lower CAPEX", February 2008.
[14] "Consumer study report about FTTH", RVA, 2010.
[15] John George, "Cost Innovations Speed Fiber Past Copper to Enable Widespread FTTH Deployment", OFS Optics.

Author’s Profile

P. Rajeswari received her B.E. degree in Electronics and Communication from Anna University, Chennai, India in 2009 with a rank of 41 among 15535. Presently she is doing her M.E. in Communication Systems at Ranippettai Engineering College affiliated to Anna University.

N. Lavanya received her B.E. degree in Electronics and Communication from Anna University, Chennai, India in 2008. Presently she is doing her M.E. in Communication Systems at Ranippettai Engineering College affiliated to Anna University.

Shankar Duraikannan received his B.E. degree from University of Madras and M.Tech. from Anna University in 1996 and 2000 respectively. He has a decade of teaching and research experience in the field of Optical Communication and Networks. Presently he is working as

an Assistant Professor in Department of Electronics and Communication Engineering, Ranippettai Engineering College, Tamil Nadu, India.


A Modified Framework of Knowledge Management System Components for Collaborative Software Maintenance

Mohd Zali Mohd Nor1, Rusli Abdullah2, Masrah Azrifah Azmi Murad3 and Mohd Hasan Selamat4

1Universiti Putra Malaysia, Faculty of Computer Science and Information Technology,

43400 Serdang, Malaysia [email protected]

2Universiti Putra Malaysia, Faculty of Computer Science and Information Technology,

43400 Serdang, Malaysia [email protected]

3Universiti Putra Malaysia, Faculty of Computer Science and Information Technology,

43400 Serdang, Malaysia [email protected]

4Universiti Putra Malaysia, Faculty of Computer Science and Information Technology, 43400 Serdang, Malaysia [email protected]

Abstract: A Knowledge Management System (KMS) is critical to software maintenance (SM) due to its highly complex, knowledge-driven and collaborative environment. We propose a KMS framework for the collaborative SM environment to model the requirements for sharing and sustaining knowledge in SM. The initial framework was based on the literature on Knowledge Management (KM), KMS and SM frameworks, used to identify the knowledge, content and technology components related to the SM environment. To verify the model, the questionnaires were subjected to a pilot study before being sent out to users and maintainers. The final questionnaire survey responses were analyzed using the Rasch methodology. As a result, several less important components were excluded from the initial model. The revised model shall be further used in our ongoing study to develop a tool to assist SM Community of Practice (CoP) members to perform their activities better.

Keywords: Knowledge Management, Knowledge Management System, Software Maintenance, Collaborative Environment

1. Introduction Within the software engineering activity cycle, software

maintenance (SM) has yet to receive proper attention [29]. It is a costly process: previous works [14][23][29][34] estimated SM costs at between 60% and 90% of total software life cycle costs. The motivation for applying KM in SM is driven by the fact that the activities are knowledge-intensive and depend largely on the expertise of the maintainers. However, SM organizations often have problems identifying resources and the use of knowledge [30]. Managing knowledge in this area is therefore critical to ensure that maintainers can perform SM activities properly and in a timely manner, by sharing and obtaining vital knowledge.

An SM organization consists of individuals (users and maintainers) working in interconnected groups/teams called Communities of Practice (CoP). Therefore, knowledge flows not only within individuals, but also within teams and

organizations. The interrelations between these levels not only affect the way knowledge is shared and transferred, but may also become a barrier to efficient knowledge flows [36].

However, there are various issues associated with SM knowledge, which make organizing, storing, sharing and disseminating knowledge difficult. Among the problems are:

• Some of these skills and expertise are documented as explicit knowledge, but more are hidden as tacit knowledge due to the scarcity of documentation [33].

• Maintainers have to collaborate with colleagues and other parties to obtain various information to enable them to carry out their software maintenance tasks.

• Domain knowledge is becoming more important to software maintainers but is seldom stored in KMS or other electronic means [25]. Often, maintainers have to rely on experts and also on code to understand the details [9][42]. While domain knowledge is important to maintainers, it is lacking, not stored properly or not readily available [33]. As such, maintainers, especially newcomers, spend a lot of time searching for, collaborating on and understanding this knowledge. Changes to domain knowledge due to enhancements or new business practices often affect the usage of the software application.

• Many SM tools are still not integrated to allow seamless information combination, which hampers information acquisition and sharing [25].

To address the above issues of knowledge within the SM environment, a KMS framework is proposed. A KMS framework is required to ascertain that KM requirements are fulfilled, include the necessary conceptual levels and support integration of the individual, team and organizational perspectives [38]. Before the KMS framework for the SM environment is proposed, the concepts of KM, KMS and SM activities are reviewed,


as follows:

Knowledge, KM and KMS Frameworks
As an overview, knowledge is defined as "a fluid mix of framed experience, values, contextual information and expert insight that provides a framework for evaluating and incorporating new experiences and information. It originates and is applied in the mind of knowers. In organizations, it often becomes embedded not only in documents and repositories but also in organizational routines, processes, practices and norms." [10].

Meanwhile, KM, in technical perspective, is defined as the strategies and processes of identifying, understanding, capturing, sharing, and leveraging knowledge [2][3][10][35]. One of the major challenges is to facilitate flow of knowledge not only within individual, but also within teams and organizations [3].

For the individual knowledge cycle, many KM frameworks have been formulated [27][41], based on Polanyi's tacit and explicit knowledge. Nonaka and Takeuchi's model describes the knowledge creation components as socialization, internalization, combination and externalization. This SECI model has been used and synthesized by many others to model KM at the team and organization levels. In SM, the usage of the SECI model can be depicted as per Figure 7.

• Tacit to tacit knowledge, via socialization: SM knowledge is exchanged through experience sharing, brainstorming, observation and practice. Today's technologies: collaboration tools such as teleconferencing, desktop video conferencing, live meetings, village wells and synchronous collaboration.

• Tacit to explicit knowledge, via externalization: tacit knowledge is articulated into explicit form via concepts, metaphors or models. In SM, these could take the form of screenshots of errors, shadow sessions, emails and conversations. Today's technologies: email, terminal sessions, chat.

• Explicit to tacit knowledge, via internalization: knowledge is documented or verbalized to help maintainers internalize and transfer knowledge, and to help other maintainers 're-experience' bug scenarios. Today's technologies: Helpdesk and SCM applications used to store bug reports and changes; visualization tools to read or listen to success stories.

• Explicit to explicit knowledge, via combination: knowledge is combined, sorted, added, exchanged and categorized, via specifications, SCM entries and error analysis. Today's technologies: collaboration tools such as e-mail, GroupWare and homepages, consolidated in SCM; data mining to sort and filter information.

Figure 7. Nonaka's SECI Model (elaborated for SM)

When individuals collaborate and share knowledge, they

commonly do so within a team or group. Wenger defines this group as the Community of Practice (CoP): "a group of people who share a concern, a set of problems, or a passion about a topic, and who deepen their knowledge and expertise in this area by interacting on a regular basis" [40].

Meanwhile, KM frameworks for the organizational structure include Szulanski's model of knowledge transfer [36], APQC's organizational KM model [4], Choo's model of the knowing organization [8], Selamat et al.'s KM framework with feedback loop [35] and Holsapple and Joshi's 3-fold collaborative KM framework [19]. This last framework synthesizes the knowledge resources from the Leonard-Barton and Petrach and Sveiby models; KM activities from the Nonaka, APQC, Wiig, Van der Spek and Alavi models; and KM influences from the Wiig, APQC, Van der Spek, Szulanski and Leonard-Barton models.

To transform a KM framework into a system, a KMS framework is conceptualized. KMS is defined as an "IT-based system developed to support and augment the organizational process of knowledge creation, storage, retrieval, transfer and application" [3]. In general, a KMS framework consists of the influential factors of KMS initiatives and their interdependent relationships and a model of KMS implementation [15]. However, systems and technology alone do not create knowledge [10]; various other social "incentives" and organizational strategy and culture are often required to stimulate the use of technology to share knowledge. As examples, the KMS frameworks by Meso and Smith and by Abdullah et al. are illustrated in Figure 8 and Figure 9, respectively.

Figure 8. Meso and Smith KMS Framework [24]

Figure 9. Abdullah et al. KMS Framework


Software Maintenance Knowledge
Software maintenance (SM) is defined as "the totality of activities required to provide cost-effective support to a software system. Activities are performed during the pre-delivery stage as well as the post-delivery stage" [20].

After studying the SM process flow in an SM organization [25], we envisage that the knowledge flow in the SM environment is somewhat analogous to a river. Starting from several streams or sources (call tickets, bug reports, testing bugs, requests for enhancements, etc.), the inputs merge into the maintenance request mainstream, which goes through many plains before ending at a delta of changed objects, released applications and changes to domain rules. To support this flow of maintenance information, several SM governance activities and tools are required. Tools such as Helpdesk, Software Configuration Management (SCM), Source Control, Project Management and others allow the team and organization to monitor and control the processes. In addition, collaborative tools and platforms allow users and maintainers to communicate, cooperate and coordinate the required information to ensure a good maintenance process.

The knowledge required in SM can be summarized as follows [16][31][32]:

• Organizational knowledge, such as roles and resources. The parties involved in software maintenance activities consist of various application users and software maintainers. The list may include enduser, superuser, maintenance manager, business analyst, systems analyst, project manager, QA personnel, build manager, implementation personnel and trainer. Attached to these roles are the areas of expertise.

• Managerial knowledge - such as resource management, task and project tracking and management.

• Technical knowledge – such as requirement analysis, system analysis, development tools, testing and implementation. Critical to this is also the knowledge on supporting groupware and CASE tools such as SCM, helpdesk and testing tools

• Domain knowledge – knowledge of the products and business processes.

• Knowledge on sources of knowledge – where the knowledge resides, such as source code, documentation, supporting CASE tools and, more importantly, where the experts are.

Based on past research on KMS in SM, many researchers concentrate only on SM processes, tools, ontology and knowledge flow. A proper study of a KMS framework in a collaborative SM environment is still lacking. The purpose of this paper is therefore to conceptualize such a KMS framework.

This paper is structured as follows: Section II discusses related work on KMS frameworks, and Section III reviews the methodology for the study. Then, Section IV discusses the proposed KMS component model for the collaborative SM environment. Section V elaborates the findings of a questionnaire survey on the above KMS components and proposes a revised framework of KMS

Components for Collaborative SM.

2. Related Works KMS for SM has been studied since the late 1980s by Jarke

and Rose [22], who introduced a prototype KMS to control database software development and maintenance, mainly to facilitate program comprehension. The KMS is a decision-based approach that facilitates communication across time and among multiple maintainers and users, thus improving maintenance support. Even though this research was carried out quite some time ago, it provides the foundation on how decisions could be assisted via shared knowledge. However, facilitating program comprehension is not enough as SM is more than just understanding codes and extracting knowledge from codes.

Similarly, Deraman introduced a KMS model for SM which, albeit very simple, could provide us with the main essence of SM knowledge – the Software Knowledge, Change Request Knowledge and their functional interaction [11]. However, these alone, are not enough for users and maintainers. Newer technologies such as software agents are used to capture SM process knowledge in several researches [31][39]. However, no proper KMS framework was conceptualized by these studies.

Looking at the wider perspective of software engineering (SE), KMS in SE has been studied by many, including Santos et al. [33], Rus and Lindval [32] and Aurum et al. [5]. Rus and Lindval described the three main tasks of SE (individual, team and organization) and identified three levels of KM support for each task. The 1st level includes the core support for SE activities, document management and competence management. Meanwhile, the 2nd level incorporates methods to store organizational memory using methods such as design rationale and tools such as source control and SCM. The 3rd KM support level includes packaged knowledge to support knowledge definition, acquisition and organization. Although an illustrative model is not given, the above should describe the KMS framework for SE. However, this model does not consider the social, psychological and cultural aspects of KM, as identified by the other generic KMS frameworks described previously.

Therefore, the main motivation for this study is to formulate a more detailed KMS framework for the collaborative SM environment. The long-term goal of this study is to formulate a tool to support and automate KM tasks within the collaborative SM environment. As such, the KMS framework shall place more emphasis on the technological perspective.

3. Methodology
To formulate the KMS framework for Collaborative SM, the following steps are used:
1. Review the existing KM and similar KM frameworks.
2. Review the existing KMS frameworks within the related areas of SM.
3. List the KMS dimensions and components which include the SM processes, tools and KM activities.


4. To verify the above dimensions and components, suitable questionnaires are developed.

5. To ensure reasonable face validity, questionnaires were further deliberated and refined by academic lecturers in Software Engineering, a statistician and several SM managers. A pilot study of 13 respondents was carried out to do the following:

a. Verify the constructs of the questions, based on the responses given by a pilot group of respondents.

b. Determine if the questions are well understood by respondents (i.e. hard to answer questions)

c. Determine if the questions are mundane and trivial (i.e. too easy) that perhaps the questions could just be left out.

d. Rasch Unidimensional Measurement Model (Rasch) was used to analyze the above pilot data. Some problematic questionnaire items are identified, revised and some discarded to make the questionnaire more acceptable.

6. Distribute questionnaires to three in-house software maintenance organizations in Klang Valley, Malaysia.

7. Analyze questionnaire responses using Rasch methodology to:

a. Determine reliability of respondents and items b. Determine outliers for both respondents and

items. c. Determine the component groups’ cut-off points

to exclude the components from the model. This should give us the final KMS component model for collaborative SM environment.

8. Revise the model for KMS Components for Collaborative SM Framework

The questionnaire was developed using a 4-point Likert scale of importance, with 1 denoting less important and 4 denoting more important. Evaluation of the questionnaire results is performed using Rasch, a probabilistic model that uses the 'logit' as its measurement unit, transforming ordinal data and raw scores into a linear scale [7]. Being linear, Rasch enables us to conduct more constructive analyses.
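For readers unfamiliar with the logit scale, the sketch below illustrates the underlying idea (the dichotomous Rasch model and a naive raw-score-to-log-odds transform); it is a simplification of the polytomous, iterative estimation actually performed by Rasch analysis software, and the maximum score of 180 (45 items × 4 points) is our illustrative assumption:

import math

def rasch_probability(person_ability, item_difficulty):
    # Dichotomous Rasch model: P(endorse) as a function of the logit difference.
    return 1.0 / (1.0 + math.exp(-(person_ability - item_difficulty)))

def raw_score_to_logit(raw_score, max_score):
    # Naive transform of a raw score into log-odds (the 'logit' scale).
    p = raw_score / max_score
    return math.log(p / (1.0 - p))

print(round(raw_score_to_logit(134, 180), 2))   # about 1.07 logits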

4. Initial KMS Dimensions and Components To build the list of dimensions and components for a

KMS framework, a review of current KMS frameworks for generic KMS and related SE/SM KMS is conducted. The theoretical constructs for the KM framework, KMS framework and SM knowledge dimensions are summarized in Appendix B, Appendix C and Appendix D, respectively.

Based on the construct summary, components suitable for SM KMS framework are identified as follows:

• Required knowledge, such as organizational knowledge, managerial knowledge, technical knowledge, business domain knowledge and knowledge on source of knowledge, are derived from Ghali [16, Rus and Lindval [31], Dias et al [12] and

Rodriguez et al. [30] • KM Activities are derived from Nonaka and Takeuchi

[27] and Holsapple & Joshi [19]. This includes Acquiring knowledge, Selecting knowledge, using knowledge, Providing/ Creating knowledge and Storing knowledge.

• SM governance tools are from Rus and Lindval [31], IEEE 14764 [20] and Mohd Nor et al. [25]. To support these flow of SM information, tools such as Helpdesk, SCM, Source Control and Project Management (PM) are crucial to monitor MRs.

• KM Components and Infrastructure are derived from the Abdullah et al. [1], Meso & Smith [24], Dingsoyr & Conradi [13], and Rus and Lindval [32] frameworks. The major components include computer-mediated collaboration, Experience Management System, Document Management, KM portal, EDMS, OLAP, and Middleware tools.

• Automation and knowledge discovery tools are from Meso and Smith [24], Abdullah et al. [1], Rodriguez et al. [31] and new internet tools in the market; these include GDSS, Intelligent Agents, Data mining/warehousing, Expert systems and Case-Based Reasoning (CBR). Active tools such as RSS are also useful to get the right knowledge to the right users at the right time.

• KM Influences are derived from Holsapple and Joshi [19] and Abdullah [2]. Among these are the managerial influences and strategy, and psychological and cultural influences.

The questions are: what do users and maintainers want from such a KMS, which of these components affect the KMS and, most important of all, how do these components affect KM activities in collaborative SM? To answer the above questions, questionnaires are drafted and distributed to selected users and maintainers to gauge their perspective on these KM components, mainly to explain the following issues:

4.1 The Important And Required General KM And SM Knowledge

Required knowledge, such as organizational knowledge, managerial knowledge, technical knowledge, business domain knowledge and knowledge on sources of knowledge, is important in SM processes. However, in most cases, we suspect that the business domain knowledge is important but not stored for maintainers to access. 4.2 How Important Are SM Governance Tools As Enablers In Knowledge Activities

Based on IEEE 14764, SM processes include process implementation, problem and modification analysis, review and acceptance, development/modification, migration and retirement. Whilst the KM components deal with converting tacit knowledge to explicit knowledge and then storing it, the SM governance tools are used to manage the flow of information and knowledge relating to daily SM activities and processes. Tools such as the Helpdesk application, SCM, source control and project management are used throughout the SM processes and hence provide a good deal of input to the knowledge contents. The questions are: how well and how important are these tools in the KM activities (acquiring, selecting, using, providing/creating and storing knowledge)? 4.3 The Importance Of KMS Foundation Components And Infrastructure

KMS foundation components include, among others, computer-mediated collaboration, Experience Management System, Document Management, KM portal, EDMS, OLAP, Middleware tools and the Knowledge Map. While collaboration tools allow users to impart and share both tacit and explicit knowledge synchronously and asynchronously, the other tools are useful for searching and extracting available explicit information. A knowledge map is a navigation aid to both explicit and tacit knowledge that illustrates the knowledge flows within the organization [17]. In many aspects, it involves establishing a knowledge ontology, mapping/linking the knowledge and validating the knowledge map. 4.4 How Important Are Different Automations And Automation Tools To The Overall Activities Of KMS?

Automation speeds up and assists maintainers in their daily activities. Technologies such as knowledge-map, CBR, expert system, agent technology and RSS are useful to assist users and maintainers to get the right knowledge at the right time. 4.5 How Important Are Managerial Influences And Strategies To The KMS Activities And Processes?

Managerial influences such as leadership, coordination, control and measurement [19] may affect the general SM KM activities and processes. Strategy deals with how KMS is planned for use, whether through codification (storing the explicit information), or personalization (storing the knowledge map). 4.6 How Important Are Psychological And Cultural Influences In The Overall Activities Of KMS?

Psychological issues include motivation, reward and awareness. Meanwhile, cultural influences include trust, beliefs, values, norms and unwritten rules.

5. Discussion The pilot study [26] revealed that the item reliability in the

initial questionnaire was poor, and a few respondents and items were identified as misfits with distorted measurements. Some problematic questions were revised and some predictably easy questions were excluded from the final questionnaire.

In the final questionnaire survey, 41 respondents from three organizations participated in the survey. Among these, 27% are users and superusers, 22% are systems analysts, 15% are programmers, 15% SM managers, 10% business analysts and the rest are DBAs, Helpdesk and other technical staff. In years of service, 40% of respondents have more than 10 years experience, 34% have between 1 to 5 years, 24% have between 6 to 10 years and only 3% have less than 1 year of service.

The results of the survey, based on the components discussed in Section 4 above, are analyzed in three parts: data reliability, fitness of the respondent and questionnaire item data, and determination of the component groups' cut-off points.

5.2 Data Reliability Summary statistics for respondents (person) and items

(questions) are depicted in Table 1 and Table 2, respectively. 41 respondents returned the survey questionnaire; out of these, Rasch identified one extreme score, which is excluded from further analysis.

Table 1. Summary of 40 Measured (Non-Extreme) Persons

        Raw score | Count | Measure | Infit mnsq | Infit zstd | Outfit mnsq | Outfit zstd
mean    133.8     | 42.8  | 0.49    | 1.02       | -0.2       | 1.01        | -0.2
s.d.    14.9      | 3.5   | 0.69    | 0.52       | 2.1        | 0.53        | 2
max.    167       | 45    | 2.64    | 3.14       | 6.4        | 3.37        | 6.7
min.    86        | 30    | -0.65   | 0.28       | -4.5       | 0.28        | -4.4

Real RMSE = .30, Adj. SD = .62, Separation = 2.10, Person Reliability = .82
Model RMSE = .27, Adj. SD = .64, Separation = 2.35, Person Reliability = .85
S.E. of Person Mean = .11; Maximum Extreme Score: 1 person; Valid Responses: 95.0%
Person raw score-to-measure correlation = .51 (approximate due to missing data)
Cronbach alpha (KR-20) person raw score reliability = .94 (approximate due to missing data)

Table 2. Summary of 45 Measured Items

        Raw score | Count | Measure | Infit mnsq | Infit zstd | Outfit mnsq | Outfit zstd
mean    118.9     | 38    | 0       | 1          | 0          | 1           | 0.1
s.d.    16.6      | 3.3   | 0.62    | 0.12       | 0.6        | 0.15        | 0.7
max.    150       | 40    | 1.16    | 1.29       | 1.5        | 1.4         | 1.9
min.    88        | 29    | -1.2    | 0.83       | -1.3       | 0.74        | -1.3

Real RMSE = .32, Adj. SD = .54, Separation = 1.69, Item Reliability = .74
Model RMSE = .27, Adj. SD = .64, Separation = 2.35, Item Reliability = .75
S.E. of Person Mean = .09

The spread of person responses, at 3.29 logits, is fair. This is due to extreme responses by one person (code PAUS2). However, reliability of 0.82 and Cronbach alpha of 0.94 indicate highly reliable data, and hence the data can be used for further analyses.
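For reference, the Cronbach alpha (KR-20) figure quoted above follows the standard formula sketched below (a minimal NumPy illustration, not the Rasch software's own routine, which additionally handles the missing data noted in Table 1):

import numpy as np

def cronbach_alpha(responses):
    # responses: (n_persons x n_items) matrix of Likert ratings.
    responses = np.asarray(responses, dtype=np.float64)
    k = responses.shape[1]
    item_variances = responses.var(axis=0, ddof=1).sum()
    total_variance = responses.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances / total_variance)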

On the questionnaire item side, the summary of the 45 measured items (see Table 2) reveals that the spread of the data, at 2.36 logits, and the reliability of 0.74 are good and fair, respectively. Details of the measured items are listed in Appendix A. None of the items is beyond the critical measures (0.4 < acceptable point-measure correlation < 0.8, 0.5 < outfit mean square < 1.5, and -2.0 < outfit z-standardized value < 2.0). The previous pilot study has therefore proven helpful in making the questionnaire more reliable.


Table 3. Item Statistics – Measure Order
(Columns for each item: item no. | measure (logit) | infit mnsq | infit zstd | outfit mnsq | outfit zstd | point-measure correlation | item code and label.)

1 -1.18 0.98 0 1 0.1 0.31 B1 Roles 2 0.12 0.88 -0.7 0.84 -0.8 0.44 B2 Resources 3 -0.32 0.88 -0.6 0.85 -0.7 0.43 B3 SM_Tasks 4 -0.26 0.84 -0.8 0.81 -1 0.46 B4 Req_analysis 5 -0.4 0.97 0 1.13 0.6 0.37 B5 Sys_analysis 6 -0.29 0.92 -0.3 0.9 -0.4 0.43 B6 Dev_skill 7 -0.1 0.87 -0.5 0.88 -0.4 0.49 B7 Testing_skill 8 -0.01 0.84 -0.7 0.84 -0.8 0.48 B8 Implementation 9 -0.28 1.03 0.2 1.02 0.2 0.31 B9 Info_Source

10 -0.82 1.04 0.3 1 0.1 0.28 BA Domain 11 -0.56 1.06 0.3 1.1 0.4 0.24 CA SCM 12 -0.53 1.19 0.6 1.27 0.9 0.13 CB Helpdesk 13 0.05 0.9 -0.2 0.93 -0.1 0.48 CC VCS 14 0.11 0.83 -0.5 0.87 -0.4 0.56 CD PM 15 -0.72 0.89 -0.6 0.8 -0.7 0.31 D1 Email 16 0.92 1.27 1.2 1.25 1.1 0.38 D2 Fax 17 0.99 1.06 0.4 1.07 0.4 0.49 D3 Memo 18 0.15 1.17 0.6 1.17 0.6 0.28 D4 E-Group 19 0.09 1.03 0.2 1 0.1 0.43 D5 BB 20 -0.36 1.06 0.4 1.04 0.3 0.3 D6 F2F 21 -0.36 0.97 -0.1 0.96 -0.1 0.36 D7 Phone 22 1.06 1.29 1.5 1.35 1.7 0.29 D8 Chat 23 0.85 1.11 0.6 1.1 0.6 0.37 D9 Conference 24 0.83 0.86 -0.6 0.86 -0.6 0.53 DA Audio/Video 25 -0.12 1.11 0.5 1.09 0.4 0.3 DB Portal 26 0.47 1.25 1.2 1.24 1.2 0.24 DC Intranet 27 -0.45 1.06 0.3 1.05 0.3 0.32 DD Search 28 -1.2 0.99 0 1.03 0.2 0.29 DE Groupware 29 0.17 0.89 -0.2 0.88 -0.2 0.47 EA GDSS 30 0.76 0.95 0 0.91 -0.1 0.31 EB CBR 31 0.2 0.83 -0.2 0.74 -0.3 0.56 EC Agent 32 0.02 1.1 0.4 1.21 0.8 0.28 ED SMS 33 0.53 0.91 -0.2 0.89 -0.3 0.44 EE RSS 34 -0.09 0.86 -1.3 0.79 -1.3 0.39 F1 Leadership 35 0.81 0.9 -0.9 0.87 -1 0.41 F2 Coordination 36 -0.25 0.9 -0.3 0.87 -0.4 0.49 F3 Control 37 0.75 0.95 -0.2 0.96 -0.2 0.45 F4 Audit 38 1.13 1.1 0.8 1.14 0.9 0.26 F5 Codification 39 -0.5 1 0.1 1.05 0.3 0.4 F6 Personalization 40 1.16 1 0 1.01 0.1 0.35 F7 Combination 41 -0.99 1.16 0.9 1.4 1.9 0.2 G1 Mgmt 42 -0.74 0.97 -0.1 1.01 0.1 0.41 G2 Org_value 43 -0.08 1.02 0.2 1.04 0.3 0.35 G3

Hoard_knowledge 44 0.24 1.03 0.2 1.02 0.2 0.37 G4 Rewards 45 -0.79 0.98 0 0.93 -0.2 0.39 G5 CoP

avg 0 1 0 1 0.1 63.5 S.D. 0.62 0.12 0.6 0.15 0.7 11

Fitness of Respondent and Questionnaire Items data Rasch provides the Person Item Distribution Map

(PIDM), which is similar to a histogram (see Figure 10). PIDM allows both persons and items to be mapped side by side on the same logit scale to give a better perspective on the relationship of person responses to the items. PIDM indicates a higher person mean (0.48) compared to the constrained item mean. This indicates a tendency to rate the prescribed questionnaire items as highly important.

Figure 10. Person-Item Distribution Map

Person PAUS2 (a User), being the highest in the PIDM, has

a tendency to give high importance ratings to most of the questionnaire items, whilst P4SA2 (a Systems Analyst) tends to rate lower. On detailed post-questionnaire inspection, person PAUS2 is a new user who is not familiar with the technology components asked about in the questionnaire and has a tendency to give equally high ratings to all questions. Conversely, person P4SA2 is a long-serving user who is familiar with the components. Since PAUS2 has been identified by Rasch as extreme, this person is excluded from further analysis. This is reasonable, since our priority is to evaluate the items rather than the persons.

On the item side, the distribution is quite closely bunched together, with no obvious outliers. Among these items, B1 Roles, BA Domain, DE Groupware, G1 Management and G5 CoP are below the minimum measure of persons. This indicates overall agreement on the high importance of these components.

5.3 Component Group Cut-off Points There are no hard and fast rules on how to determine

which of the less important KMS components should be excluded from the framework. We listed the components and sorted them in descending logit values. The list was distributed to four experts from knowledge management and software engineering fields, and three software maintenance managers, to identify the cutoff point for important components.

Based on the overall experts’ judgments, the following components are selected to be excluded from the model:

• B8 - Knowledge on implementation is important in my line of work.
• B2 - Knowledge on how to manage resources in SM organization.
• CD - Project Management tool to store and share explicit knowledge on SM resources and activities.
• D2 - Electronic fax.
• D3 - Paper memo.
• D8 - On-line chat (MSN Messenger, Yahoo Chat, etc.).
• EE - RSS technology to disseminate information.
• EB - Expert Systems or CBR tools.
• F5 - Codification strategy (storing explicit information).
• F7 - Combination of both codification and personalization strategies.
• G4 - Rewards should be implemented to promote more knowledge sharing.

6. Revised KMS Framework Based on the above reduced components, the revised

framework is depicted in Figure 11, which consists of the components evaluated using the questionnaire survey, and the following fixed components:

• SM Process/Activities are standard activities described by ISO/IEC 14764 standard

• The KM components and infrastructure are derived from the other standard KMS infrastructure and frameworks.

Figure 11. The Revised Model for KMS Components for Collaborative SM. (The figure groups the retained components into: SM activities; knowledge required; KM components and infrastructure; SM tools; automation and knowledge-discovery tools; KM soft influences (managerial, psychological and cultural); KM activities; collaborative components; and the SM CoP roles.)

7. Conclusion and Future Works In the SM environment, a KMS is critical to ensure that KM

activities such as knowledge acquisition, storage and retrieval, and the related processes, include not only the hard components (tools and infrastructure) but also the soft components (managerial, psychological and cultural).

To formulate the KMS framework for collaborative SM, the components on KMS, SM governance, and automation and knowledge discovery were compiled from various literatures. An initial model of modified KMS components for collaborative SM was proposed. The relationships between these components were used to construct the questionnaire, which was first tested in a pilot study. The pilot study was undertaken to evaluate the construct validity of the questionnaire, as well as to identify the expected measures.

A survey using revised questionnaire items was carried out in three SM organizations in Klang Valley Malaysia to gain a better perspective on the importance of the SM and KM components. As a result, several less important components were excluded from the initial model. The revised model was further deliberated, by experts’ opinion,

to finalize the important components for KMS framework for collaborative SM environment. This new framework shall be used in our ongoing study to develop a tool to assist SM CoP members to perform their SM activities better.

Acknowledgment

This research is funded by the Malaysian Ministry of Science, Technology and Innovation (MOSTI) e-ScienceFund, Project No. 01-01-04-SF0966.


Mohd Zali Mohd Nor received the B.Sc. in Mathematics from the University of Michigan, Ann Arbor in 1988 and the Master of Management in I.T. from Universiti Putra Malaysia in 2005. He is now an I.T. development and maintenance manager in a shipping company, whilst pursuing his Ph.D. at Universiti Putra Malaysia. His main area of interest is Knowledge Management in Collaborative Software Maintenance.


Effect of Handoff on End-to-End Delay over Mobile IP Networks

Reza Malekian1, Abdul Hanan Abdullah2 and Elmarie Biermann3

1Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia,

Johor, Malaysia [email protected]

2Faculty of Computer Science and Information Systems, Universiti Teknologi Malaysia,

Johor, Malaysia

3F’SATI, Cape Peninsula University of Technology, Cape Town, South Africa,

Abstract: Mobile IP allows a Mobile Node (MN) to remain reachable while handoff occurs to a new foreign network. When a MN moves to a new network, it is unreachable for a period of time referred to as handoff latency. In general, this latency is caused by the time used to discover the new network. This includes obtaining and validating a new Care-of Address (CoA) that identifies the current location of the mobile node, obtaining authorization to access the new network, making the decision that a handoff should be initiated, and, finally, executing the handoff. Executing the handoff involves notifying the home agent (HA) of the new CoA and awaiting the acknowledgment from the home agent. IP mobility must be able to support performance in terms of initializing handoffs as well as smoothing the process. In this paper, we study the effect of handoff latency on end-to-end delay in Mobile IPv4 through extensive simulation.

Keywords: Mobile IP, handoff, tunneling, end-to-end delay.

1. Introduction

When a Mobile Node moves to a foreign network, it is required to register its care-of address with the home agent. The CoA is the secondary address of the mobile node, reflecting its current "away from home" location. This address is temporary, and whenever a MN changes its foreign network a new CoA must be registered. Figure 1 depicts the registration process and the message flows between these entities [5]. The MN sends a registration request message to the HA via a Foreign Agent (FA). The HA updates its binding table, changes the CoA entry related to the MN and sends a registration reply indicating that the mobile's registration request has been received. Once this is completed the MN is able to continue communicating with the correspondent node (CN) [1].

Once the MN hands off (handover) and moves away from its home network, packets are delivered via a tunnel. In our scenario the MN follows a specific trajectory, visiting FA1, FA2 and FA3 in turn, as depicted in Figure 2. The MN registers the new CoA with the HA and sends or receives packets via tunnels 1, 2 and 3.

Figure 2. Tunneling in Mobile IPv4

Figure 1. CoA Registration



2. Handoff Latency in Mobile IPv4

The main problem which arises with handoff is the time span in which a MN is not able to receive packets. During this time the mobile node obtains a new CoA and updates its registration [8]. This period of time can exceed the threshold required for the support of real-time services [3]. The authors in [4] proposed pre-registration and post-registration handoff methods to minimize handoff latency in Mobile IP networks. In the pre-registration handoff method, the MN communicates with the new foreign agent (nFA) while still being connected to the old foreign agent (oFA), so the MN is able to perform registration prior to handing off to the new network. In post-registration handoff, a tunnel is set up between the nFA and the oFA; the MN thus remains connected to the oFA while on the new foreign agent's subnet, and can perform registration after communication with the nFA is established. Both of these methods minimize handoff latency. The focus of our research is an in-depth study of the effect of handoff latency on end-to-end delay within Mobile IPv4. We set up a simulation scenario in which the connectivity of the mobile node to the home and foreign agents is analyzed.
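To make the latency decomposition above concrete, the following is a minimal sketch that models the handoff blackout as the sum of its phases and contrasts standard handoff with the pre-registration method. The phase names follow the description above, but all timing values are illustrative assumptions, not measurements from this study.

```python
# Hedged sketch: decomposes Mobile IPv4 handoff latency into its main phases
# (agent discovery, CoA configuration, authorization, registration with the HA)
# and contrasts standard handoff with the pre-registration method, in which
# registration is completed via the old FA before the link switches.
# All timing values are illustrative assumptions.

PHASES_MS = {
    "agent_discovery": 100.0,   # detect the new FA (agent advertisements)
    "coa_configuration": 20.0,  # obtain and validate the new CoA
    "authorization": 30.0,      # gain access to the new network
    "registration": 150.0,      # Registration Request/Reply round trip to the HA
}

def blackout_standard():
    # Every phase happens after connectivity to the old FA is lost.
    return sum(PHASES_MS.values())

def blackout_pre_registration():
    # Discovery, CoA setup and registration are done while still attached
    # to the old FA, so only the remaining phase contributes to the blackout.
    return PHASES_MS["authorization"]

if __name__ == "__main__":
    print(f"standard handoff blackout:         {blackout_standard():.0f} ms")
    print(f"pre-registration handoff blackout: {blackout_pre_registration():.0f} ms")
```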

3. Simulation Results

In order to study the effect of handoff latency on end-to-end delay in Mobile IPv4, we set up the simulation scene shown in Figure 3. It consists of one MN, one correspondent node, one HA, three FAs, and one IP cloud that interconnects them. At the start of the simulation the MN is located on the home network, and from there it moves to the foreign networks along a defined trajectory with an average speed of 10 km/h. The simulations are carried out using OPNET 14.

Figure 3. Simulation Scene

Figure 4 shows the results of layer-2 connectivity between the MN and the HA/FAs. The Base Station Subsystem (BSS) ID numbers shown in the graph identify the agent to which the MN is connected, and a value of -1 indicates that the MN has lost connectivity with any agent [2]. As the MN follows the trajectory, it establishes layer-2 connectivity with all the agents. When the MN moves out of the home network, it loses connectivity with the home agent (BSS ID = 0) at 10 min and connects to FA1 (BSS ID = 1). Disconnection occurs again at approximately 25 min when the MN leaves FA1 and enters FA2. Connection with FA2 is lost at approximately 41 min when it roams to FA3. Finally, the MN loses agent connectivity upon leaving FA3 at 55 min.

Figure 4. Agent connectivity

Figures 5 and 6 illustrate both tunneled and un-tunneled traffic received during periods of 1 hour (packets per second) and 3 hours (bits per second), respectively. When the MN is within its home network it does not need to receive packets from the CN via a tunnel, because the MN uses the plain IP protocol when residing in its home network. When the MN traverses foreign networks it sends and receives traffic via a tunnel. Gaps appear in the tunneled traffic received as the MN roams between the various foreign agents. For example, the MN is unable to receive tunneled traffic upon leaving FA1 and entering FA2 at 25 min.


Figure 5. Traffic (packets/sec)

Figure 6. Traffic (bits/sec)

End-to-end packet delay is depicted in Figure 7. In the simulation results, packet loss begins at approximately 7 min, when the MN moves out of the home network and loses connection with the HA; packet flow resumes at approximately 10 min, when the MN successfully registers its current location with FA1. Figure 7 indicates a gap between 23 and 25 min when the MN roams between FA1 and FA2. Packet loss occurs again at approximately 40 min when the MN leaves FA2 and enters FA3, and packet flow resumes at approximately 43 min.

Figure 7. End-to-end delay

When the MN is in the home network, the minimum end-to-end delay is much smaller than the end-to-end delay in foreign networks. The main reason is that the MN does not use the Mobile IP protocol when it resides in the home network and communicates with the CN using the plain IP protocol. When the MN moves to foreign networks it utilizes the MIP protocol: it needs to register its CoA with the home agent and also needs to send/receive packets via a tunnel. This is shown in Figure 8.

Figure 8. Average end-to-end delay

4. Conclusion

In this paper we presented the end-to-end delay and average end-to-end delay based on handoff in Mobile IPv4. From this study we distinguish important metrics that should be considered in order to increase performance within mobile networks. Performance evaluation for handoff includes handoff latency as well as the number of performed handoffs. When several networks are candidates to serve as the target for a handoff, the network that provides the most bandwidth and the most stable connection should be the first choice.


Acknowledgments

This work is supported by the Fundamental Research Grant Scheme (FRGS). The authors would like to acknowledge the FRGS for their support of our research project.

References

[1] Jhy-Cheng Chen, Tao Zhang, "IP-Based Next Generation Wireless Networks", Wiley-Interscience, 2004.

[2] OPNETWORK 2005, 1348 Planning and Analyzing Mobile IP networks

[3] R. Malekian, "The Study of Handoff in Mobile IP Networks", In Proceedings of BROADCOM'08 (IEEE), pp. 181-185, Pretoria, South Africa, 2008.

[4] Svetoslav Yankov, Sven Wiethoelter, "Handover Blackout Duration of Layer-3 Mobility Management Schemes", Telecommunication Networks Group, Technical University Berlin, May 2006.

[5] E. Fogelstroem, A. Jonsson, C. E. Perkins, "Mobile IPv4 Regional Registration", RFC 4857, June 2007.

[6] Y. Chen, W. Zhuang, “DiffServ resource allocation for fast handoff in wireless mobile Internet”, IEEE Communication Magazine, 2002.

[7] Michal Skorepa, "Mobile IPv4 Simulation and Implementation", In Proceedings of Student EECIT 2008, Czech Republic, 2008.

[8] Jeng-Yueng Chen, Chen-Chuan Yang, Li-Sheng Yu, "HH-MIP: An Enhancement of Mobile IP by Home Agent Handover", EURASIP Journal on Wireless Communications and Networking, Vol. 10, 2010.

Author's Profile

Reza Malekian is conducting research in the area of Mobile IP in the Department of Computer Science and Information Systems at Universiti Teknologi Malaysia. His research interest is in wireless communication. He is a member of the IEEE Vancouver section and also editor-in-chief of The International Journal of Wireless Communication and Simulation. During summer 2010, he was a visiting researcher at the Communications Network Laboratory (CNL), Simon Fraser University, Canada, working on a new proposal for Mobile IPv6.

Professor Dr Abdul Hanan Abdullah is the Dean of the Dept. of Computer Systems & Communication, Universiti Teknologi Malaysia. He received the B.Sc. and M.Sc. from the University of San Francisco, California, and the Ph.D. degree from Aston University, Birmingham, UK, in 1995. His research interest is in information security. He is also the head of the Pervasive Computing Research Group (PCRG) at UTM and a member of the ACM.

Adjunct Professor Dr. Elmarie Biermann is a research professor at the French South African Institute of Technology (F'SATI) at the Cape Peninsula University of Technology (CPUT), Cape Town. She completed the BSc, BSc (Hons) and MSc at Potchefstroom University and her PhD at the University of South Africa. She specializes in computer security and software agents and is also the manager and founder of a research and training company in South Africa.


Design of Cryptographic Hash Algorithm using Genetic Algorithms

Siva Prashanth J1, Vishnu Murthy G 2 and Praneeth Kumar Gunda3

1,2,3Dept. of Computer Science and Engineering,

CVSR College of Engineering, Ghatkesar, Andhra Pradesh, India

[email protected], [email protected], [email protected]

Abstract: Hash functions play a prominent role in signing digital documents. The classical SHA-1 is still in widespread use, even though it is costly in terms of space and time complexity. In this paper we propose a new one-way, collision-resistant and economical hash function, designed to be as secure and efficient as SHA-1, using genetic algorithms and a pseudo-random number generator that makes it secure against timing attacks. A comparison with the classical hash functions MD5 and SHA-1 is also given.

Keywords: Hash algorithm, Genetic functions.

1. Introduction A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value. The data to be encoded is often called the "message", and the hash value is sometimes called the message digest or simply digest. Cryptographic hash functions have many information security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for functions with rather different properties and purposes. An important application of secure hashes is verification of message integrity. Determining whether any changes have been made to a message (or a file), for example, can be accomplished by comparing message digests calculated before, and after, transmission (or any other event). A message digest can also serve as a means of reliably identifying a file; several source code management systems, including Git, Mercurial and Monotone, use the sha1sum of various types of content (file content, directory trees, ancestry information, etc) to uniquely identify them. Hashes are used to identify files on peer-to-peer filesharing networks. For example, in an ed2k link, an MD4-variant hash is combined with the file size, providing sufficient information for locating file sources, downloading the file

and verifying its contents. Magnet links are another example. Such file hashes are often the top hash of a hash list or a hash tree, which allows for additional benefits. Hash functions can also be used in the generation of pseudorandom bits, or to derive new keys or passwords from a single, secure key or password. The rest of this paper is organized as follows: Section 2 describes the Blum Blum Shub cryptographically secure pseudo-random bit generator (CSPRBG) and the genetic functions used, Section 3 presents the proposed algorithm, Section 4 analyses its security and compares it with SHA-1, MD5 and RIPEMD-160, and Section 5 concludes and proposes future enhancements.

2. Literature Survey

2.1 Blum Blum Shub Random Bit Generator

A popular approach for generating secure pseudorandom numbers is known as the Blum, Blum, Shub (BBS) generator, named for its developers [3]. The procedure is as follows. First, choose two large prime numbers, p and q, such that both have a remainder of 3 when divided by 4.

That is, p ≡ q ≡ 3 (mod 4), meaning (p mod 4) = (q mod 4) = 3. Let n = p × q. Next, choose a random number s such that s is relatively prime to n; this is equivalent to saying that neither p nor q is a factor of s. Then the BBS generator produces a sequence of numbers Xi according to the following algorithm:

X0 = s^2 mod n
Xi = (Xi-1)^2 mod n, for i = 1, 2, 3, ...

The BBS generator is referred to as a cryptographically secure pseudorandom bit generator (CSPRBG). A CSPRBG is defined as one that passes the next-bit test, which is stated as follows: "A pseudorandom bit generator is said to pass the next-bit test if there is no polynomial-time algorithm that, on input of the first k bits of an output sequence, can predict the (k+1)st bit with probability significantly greater than 1/2".
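As an illustration, the generator described above can be sketched in a few lines of code. The tiny parameters p = 7, q = 19 and s = 100 are the ones used in the worked example of Section 3.1 and are far too small for real use.

```python
# Minimal sketch of the Blum Blum Shub generator described above.
# The tiny parameters (p=7, q=19, s=100) match the worked example in
# Section 3.1 and are for illustration only; secure use requires large
# primes with p % 4 == q % 4 == 3 and a seed s relatively prime to n = p*q.

def bbs_sequence(p, q, s, count):
    assert p % 4 == 3 and q % 4 == 3, "both primes must be congruent to 3 mod 4"
    n = p * q
    x = (s * s) % n            # X0 = s^2 mod n
    values = [x]
    for _ in range(count - 1):
        x = (x * x) % n        # Xi = (Xi-1)^2 mod n
        values.append(x)
    return values

def bbs_bits(p, q, s, count):
    # The pseudo-random bit stream is commonly taken as the parity of each Xi.
    return [x & 1 for x in bbs_sequence(p, q, s, count)]

if __name__ == "__main__":
    print(bbs_sequence(7, 19, 100, 5))   # -> [25, 93, 4, 16, 123]
    print(bbs_bits(7, 19, 100, 5))       # -> [1, 1, 0, 0, 1]
```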

An interesting characteristic of the Blum Blum Shub generator is that any Xi value can be calculated directly (via Euler's theorem) as Xi = X0^(2^i mod λ(n)) mod n, where λ(n) = lcm(p-1, q-1).

The security of BBS is based on the difficulty of factoring


n. That is, it is hard to determine its two prime factors p and q. If integer factorization is difficult (as is suspected) then B.B.S. with large n should have an output free from any nonrandom patterns that can be discovered with any reasonable amount of calculation. Thus it appears to be as secure as other encryption technologies tied to the factorization problem, such as RSA encryption.

In the proposed algorithm we use two genetic functions, CROSSOVER and MUTATION. Crossover is a genetic function that can be described by the following figure: as illustrated in Figure 1, the binary representations of the key and the plain text are crossed over. We use two forms of crossover, single and double, taking one breaking point for single crossover and two breaking points for double crossover.

Figure 1: Crossover

Mutation is a genetic function in which the bit at a given position is inverted (i.e., 0 to 1 and vice versa). Mutation can be applied at a single point or at multiple points.
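For illustration, the two genetic functions can be sketched on bit strings as follows; the helper names, the list-of-bits representation and the example breaking points are our assumptions, not part of the paper.

```python
# Hedged sketch of the two genetic functions used by the proposed hash.
# Bit strings are represented as lists of 0/1; breaking points and helper
# names are illustrative assumptions.

def single_crossover(a, b, point):
    # Exchange the tails of two equal-length bit strings after `point`.
    return a[:point] + b[point:], b[:point] + a[point:]

def double_crossover(a, b, p1, p2):
    # Exchange the middle segment between the two breaking points p1 < p2.
    return (a[:p1] + b[p1:p2] + a[p2:],
            b[:p1] + a[p1:p2] + b[p2:])

def mutate(bits, positions):
    # Invert the bit at each given position (0 -> 1 and vice versa).
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out

if __name__ == "__main__":
    x = [1, 0, 1, 1, 0, 0, 1, 0]
    y = [0, 1, 0, 0, 1, 1, 0, 1]
    print(single_crossover(x, y, 3))
    print(double_crossover(x, y, 2, 6))
    print(mutate(x, [0, 7]))
```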

3. Proposed Algorithm

The algorithm consists of two phases: the first generates the keys and the second performs the hashing.

3.1 Key Generation

The algorithm uses a 3-tuple key {p, q, s}, where p and q are large prime numbers and s is a chosen random number that is relatively prime to n, the product of p and q. The algorithm then uses the Blum Blum Shub generator (Section 2.1) to produce the random numbers that are used as keys in each iteration of the hash computation. The following example illustrates the key generation:

1. Choose p = 7 and q = 19.
2. This implies n = 7 × 19 = 133.
3. Choose s = 100, relatively prime to 133.
4. Then X0 = s^2 mod n = (100)^2 mod 133 = 25, X1 = (X0)^2 mod n = (25)^2 mod 133 = 93, and so on.

Here the key is represented as {7, 19, 100}.

3.2 Hash Algorithm

1. Read the message and store it in array X.
2. Obtain the ASCII value of each element of X[ ], convert it into binary format, and store the result in array Y.
3. While y[i] != null or y[i+2] != null do:
   If the number is even, perform a right shift by p % 8 bits; else perform a left shift by Xi % 8 bits.
   (i) Perform CROSSOVER between y[i] and y[i+2].
   (ii) Perform MUTATION on the result and store it in array Z.
4. Copy the elements of Z[ ] into res[ ].
5. For i = 0 to N, where N is the number of elements in res: hash = res[N+1] = res[0] res[1] ... res[N].
6. Append the hash to Z[ ].
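A rough sketch of how these steps might be wired together is given below. Several details are under-specified in the listing (the shift rule, the pairing of y[i] with y[i+2], and the combination in step 5), so the interpretations used here — byte-wise processing, midpoint single crossover, single-point mutation and XOR as the final combination — are our assumptions, not the authors' definitive design.

```python
# Hedged sketch of the hashing phase (Section 3.2). The shift rule, the
# crossover point and the final combination are under-specified in the
# listing; the choices below are our interpretations only.

def gha_hash(message, p, q, s):
    n = p * q
    x = (s * s) % n                                   # BBS seed (Section 3.1)
    y = [format(ord(ch), "08b") for ch in message]    # steps 1-2
    z = []
    for i in range(len(y) - 2):                       # step 3
        x = (x * x) % n                               # next BBS value Xi
        bits = [int(b) for b in y[i]]
        if x % 2 == 0:                                # even -> right shift by p % 8
            k = p % 8
            bits = bits[-k:] + bits[:-k]
        else:                                         # odd -> left shift by Xi % 8
            k = x % 8
            bits = bits[k:] + bits[:k]
        partner = [int(b) for b in y[i + 2]]
        mid = len(bits) // 2
        crossed = bits[:mid] + partner[mid:]          # (i) single-point crossover
        crossed[0] ^= 1                               # (ii) single-point mutation
        z.append(crossed)
    res = [int("".join(map(str, blk)), 2) for blk in z]   # step 4
    digest = 0
    for v in res:                                     # step 5, read as XOR
        digest ^= v
    return digest

if __name__ == "__main__":
    print(gha_hash("hello world", 7, 19, 100))
```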

4. Analysis

Table 1 compares GHA-1 with SHA-1, RIPEMD-160 and MD5 in terms of their processing features, and Table 2 shows how vulnerable GHA-1 is to known attacks compared with the same algorithms.

Table 1: Comparison of various algorithms in terms of processing

Function   | Word size (bits) | Computation values | Endianness | Operations
RIPEMD-160 | 32               | 320                | Little     | A, B, S
MD5        | 32               | 512                | Little     | A, B, S
SHA-1      | 32               | 160                | Big        | A, B, S
Proposed   | 64               | 2                  | Big        | B

In the above table the notations used are: A (addition), B (bitwise operations), S (shift/rotation). From the above we infer that the processing of the proposed algorithm is economical in all aspects when compared with the other algorithms.


Function   | Hash size (bits) | No. of rounds | Best-known collision attack | Best-known preimage attack
RIPEMD-160 | 160              | 80            | 2^51.48                     | -
MD5        | 128              | 64            | 2^10                        | 2^127
SHA-1      | 160              | 80            | 2^51                        | -
Proposed   | 512              | 2             | 2^35.24                     | -

Table 2: Comparison of various algorithms in terms of vulnerability to attacks

The above table clearly shows that GHA-1 is much more resistant to all the attacks than the above said algorithms.

5. Conclusion and Future Enhancements

The above analysis indicates that the proposed algorithm is economical and efficient, and is competitive with SHA-1 in all the aspects considered.

Future work will be dedicated to making this algorithm work with PGP and S/MIME.

References

[1] Praneeth Kumar G. and Vishnu Murthy G., "Design of a Novel Cryptographic Algorithm Using Genetic Functions", IJCNS, Vienna, Austria, Vol. 2, No. 4, April 2010, pp. 55-57.

[2] Vishnu Murthy G., Praneeth Kumar G., and A. Krishna Kumari, "Design of a Novel Image Encryption Technique Using Genetic Algorithms", Proceedings of SPVL, June 2010, pp. 459-461.

[3] Lenore Blum, Manuel Blum, and Michael Shub, "Comparison of Two Pseudo-Random Number Generators", Proc. CRYPTO '82, pp. 61-78, New York, 1983.

[4] William Stallings, “Cryptography and Network Security”, Prentice Hall, 3rd Edition.

[5] Subramil Som, Jyotsna Kumar Mandal and Soumya Basu, “A Genetic Functions Based Cryptosystem (GFC)”, IJCSNS, September 2009.

Author’s Profile

J Siva Prashanth received the B.Tech Degree in Computer Science and Engineering and pursuing M.Tech, working as Asst. Professor at CVSR College of Engineering. His areas of interest include software engineering and Information Security.

Vishnu Murthy G received the B.Tech. and M.Tech. degrees in Computer Science and Engineering. He is a resource person for IEG and Birla off-campus programmes. He is presently pursuing his Ph.D. at J.N.T.U. and heading the Department of Computer Science and Engineering at CVSR College of Engineering. His areas of interest include software engineering,

Information Security and Image Processing.

Praneeth Kumar Gunda received the B.Tech Degree in Computer Science and Engineering from Progressive Engineering College in 2008. He is pursuing an M.Tech at CVSR College of Engineering. From May 2008 to August 2009 he worked at Concepts in Computing (CIC) as a Software Engineer. He is presently working at

CVSR College of Engineering as an Assistant Professor. His areas of interest include Image Processing and Information Security.


Design of CMOS Active-RC Low Pass Filter Using 0.18 µm Technology

Dilip Singar1, D.S. Ajnar2 and Pramod Kumar Jain3

1 Department of Electronics & instrumentation engineering,

Shri G.S. Institute of technology and science 23, Park Road, Indore (M.P.)

[email protected] 2 Department of Electronics & instrumentation engineering,

Shri G.S. Institute of technology and science 23, Park Road, Indore (M.P.)

[email protected] 3 Department of Electronics & instrumentation engineering,

Shri G.S. Institute of technology and science 23, Park Road, Indore (M.P.)

[email protected]

Abstract: In this paper, advances in analog filter design for telecom transceivers are addressed. Portable devices require a strong reduction in power consumption to increase battery life. Since a considerable part of the power consumption is due to the analog baseband filters, improved and/or novel analog filter design approaches have to be developed. We design an active-RC filter for this application. To demonstrate the proposed techniques, a ±0.8 V, 2-MHz second-order filter fabricated in a conventional 0.18 µm CMOS process is presented. The filter achieves a THD of 40 dB, and the filter alone consumes about 0.19 mW from a ±0.8 V supply. Design and simulation of the circuit are done in the Cadence Spectre environment with the UMC 0.18 µm CMOS process.

Keywords: Analog IC design, active RC filter, THD, op amp, telecom transceiver

1. Introduction

Active filters are widely used in instrumentation and communication systems. Technical evolution and market requirements demand high-performance, fully integrated telecom transceivers. The most popular receiver architecture is the Direct Conversion (DC) one, to which the following discussion applies. Figure 1 shows the typical DC receiver architecture. The signal from the antenna is processed by an external prefilter to reject part of the out-of-band interferers. The front-end consists of an LNA and a quadrature mixer that down-converts the signal. The baseband part is composed of the low-pass filter (LPF), variable gain amplifier (VGA) and analog-to-digital converter (ADC). The LPF and VGA perform the following functions:
- The LPF selects the channel and reduces the noise and the out-of-band interferers, relaxing the ADC requirements.
- The VGA amplifies the in-band input signal in order to optimize the analog-to-digital conversion performed by the following ADC.

Figure 1. Block Diagram of Direct Conversion (DC) receiver

For example, in UWB systems the input signal power is typically very low (about -40 dBm) and therefore needs to be amplified by more than 40 dB [2]. The LPF can be implemented with different solutions, depending on several considerations:
- Power consumption minimization is strongly required by portable devices to increase battery life;
- Different communication standards require very different analog active-RC filter performance in terms of bandwidth, distortion and noise.

2. Circuit Implementation

2.1 Architecture of Filter

Figure 2 shows the active-RC low-pass filter using the Ackerberg-Mossberg (AM) biquad topology. It is a suitable filter structure for achieving a high filter pole frequency for a given unity-gain bandwidth. The biquad is a slightly modified form of the original AM biquad, in which C2 is omitted and a resistor is added in parallel with C1 of the first integrator to control the Q factor. The advantage of the modification is that it allows the Q factor to be adjusted by adjusting the value of C2 only, for a given value of C1. The pole frequency depends on the values of C1 and C2. In the above figure four op-amps (a so-called quad) are used in one integrated circuit. The circuit can be adjusted in a non-interactive manner for precise filter parameters. A3 is a non-inverting op-amp amplifier and A1, A2 and A4 are inverting


amplifiers. A1 and A3 have no dc-feedback path between them and convert the signal directly. The quality factor is set by the resistor QR.

Figure 2. Schematic of the CMOS active-RC low-pass filter

The output of a second-order low-pass filter with a very high quality factor responds to a step input by quickly rising above, oscillating around, and eventually converging to a steady-state value. The low-pass filter gain is controlled by the resistor R/K; by varying the value of this resistor the gain can be adjusted to the filter specifications. We consider this for the Ackerberg-Mossberg circuit shown in Figure 2, whose low-pass output is realized by C1, C2 and the indicated resistors [2]. The transfer function takes the standard second-order low-pass form

H(s) = -K ω0^2 / (s^2 + (ω0/Q) s + ω0^2)

We have chosen, as shown in Figure 2, two identical capacitors, and have identified the quantities K and Q with the corresponding resistors that determine these filter parameters. The two resistors R1 and R2 of the inverter are arbitrary. A fourth op-amp is employed as a summer, as shown in the lower part of Figure 2. The bandwidth equation has four solutions; the two positive ones are ω1 and ω2, and their difference can be shown to be

BW = ω2 - ω1 = ω0 / Q    (1)

From (1) the value of Q required to meet the specified response over a band ∆f can be determined (2). The specifications for the desired filter are given in Table 1; a numerical sketch of this response is given after the table.

Table 1: Specification Results

Specification                  | Value
Open-loop gain                 | ≥ 23 dB
3 dB frequency                 | ≥ 1.5 MHz
Input-referred noise (1 kHz)   | ≤ 160 nV/√Hz
Power dissipation              | ≤ 0.52 mW
PSRR (Vdd)                     | ≥ 29 dB
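To make the relationship between K, Q, ω0 and the -3 dB bandwidth concrete, the following sketch numerically evaluates the standard second-order low-pass response assumed above. The 2 MHz pole frequency and the Q and K values are illustrative choices tied to the target specification, not the component-level values of the fabricated design.

```python
# Hedged sketch: magnitude response and -3 dB bandwidth of the standard
# second-order low-pass form H(s) = -K*w0^2 / (s^2 + (w0/Q)*s + w0^2)
# assumed above. f0, Q and K below are illustrative assumptions.
import math

def lp_magnitude(f, f0, q, k):
    w, w0 = 2 * math.pi * f, 2 * math.pi * f0
    s = complex(0, w)
    h = -k * w0 ** 2 / (s ** 2 + (w0 / q) * s + w0 ** 2)
    return abs(h)

def minus3db_bandwidth(f0, q, k, f_max=20e6, steps=20000):
    # Numerically locate where the gain falls 3 dB below its DC value.
    target = lp_magnitude(0.0, f0, q, k) / math.sqrt(2)
    for i in range(1, steps):
        f = f_max * i / steps
        if lp_magnitude(f, f0, q, k) < target:
            return f
    return None

if __name__ == "__main__":
    f0, q, k = 2e6, 0.707, 1.0   # illustrative: 2 MHz pole, Butterworth-like Q
    print("gain at DC:", lp_magnitude(0.0, f0, q, k))
    print("-3 dB bandwidth ~", minus3db_bandwidth(f0, q, k), "Hz")
```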

2.2 Op-Amp Design

The design of the op-amp is very important to obtain accurate results. The op-amp is characterized by various parameters such as open-loop gain, bandwidth, slew rate and noise. These performance measures are determined by design parameters such as transistor sizes and bias currents. Transistors M8 and M9 function as a constant current source, and transistors M1, M2 and M3 function as two current-mirror pairs. Transistors M4, M5, M6 and M7 form the differential amplifier, and transistor M10 is the output amplifier stage [4]. Figure 3 shows the op-amp design used in the filter.

Figure 3. Schematic of CMOS Op-Amp

3. Measured Performance

The active-RC filter is designed and simulated in the Cadence Spectre environment with UMC 0.18-µm silicon technology. The whole circuit is biased with a supply voltage of ±0.8 V. Figure 4 plots the open-loop gain of the filter: the open-loop gain is found to be 40 dB with a 3 dB bandwidth of 2 MHz, and the unity-gain frequency is 80 MHz. With a ±0.8 V supply sweep, the DC sweep response shows that the offset voltage is reduced to 5 mV. Figure 5 plots the change in power supply rejection ratio (PSRR) with frequency; the PSRR of the filter is found to be 28 dB.


Figure 4. Simulation result of Gain and phase response

Figure 5. Simulation result of PSRR Response

The input-referred noise is the noise generated by a voltage source inserted in series with the input source. The input-referred noise at 1 kHz has been found to be 62.08 nV/√Hz; the input-referred noise plot is shown in Figure 6. Since the first-order components grow linearly and the third-order components grow cubically, they eventually intersect as the input power level increases. The IP3 is defined as the cross point of the power of the first-order tones, w1 and w2, and the power of the third-order tones, 2w1 - w2 and 2w2 - w1, on the load side. The input intercept point (IIP3) was found to be 0.448 dBm.

Figure 6. Simulation result of Input referred noise response

The total harmonic distortion plot is shown in Figure 7: it gives the THD performance versus different peak-to-peak magnitudes at the low-pass output of the main filter for two different frequencies (5 and 100 kHz). Less than -40 dBc of THD can be achieved over the peak-to-peak output voltage range. The final results of the active-RC filter, with comparisons, are presented in Table 2.

Figure 7 Simulation result of Total harmonic distortion

Table 2: Summary of experimental results

Experimental Results             | This Work  | Ref [2]
Open-loop gain                   | 40 dB      | 23 dB
3 dB frequency                   | 2 MHz      | 1.5 MHz
Unity-gain frequency             | 80 MHz     | 50 MHz
Input-referred noise (1 kHz)     | 62 nV/√Hz  | 160 nV/√Hz
Output-referred noise (10 kHz)   | 30 nV/√Hz  | 120 nV/√Hz
Power dissipation                | 0.19 mW    | 0.52 mW
Input offset voltage             | 5.34 mV    | 8 mV
PSRR (Vdd)                       | 29 dB      | 29 dB
Total harmonic distortion        | 40 dBm     | 60 dBm

4. Conclusion

In this design, a low-voltage CMOS active-RC low-pass filter is designed using the Ackerberg-Mossberg topology. The proposed techniques can be used to design low-voltage and


low-power active-RC low-pass filters in a standard CMOS process. To demonstrate the proposed techniques, a ±0.8 V, 2-MHz second-order filter is implemented in a standard 0.18 µm CMOS process.

References

[1] A. M. Durham, W. Redman-White, and J. B. Hughes,

"High Linearity Continuous-Time Filters in 5-V VLSI CMOS," IEEE Journal of Solid-State Circuits, vol. 27, pp. 1270-1276, Sept. 1992.

[2] M. De Matteis, S. D'Amico, A. Baschirotto, "Advanced Analog Filters for Telecommunications," IEEE Journal of Solid-State Circuits, vol. 65, pp. 06-12, Sept. 2008.

[3] H. Huang and E. K. F. Lee, “Design of low – voltage CMOS continuous time filter with on chip automatic tuning ,” IEEE Journal of Solid-State Circuits, volume 36 page no. 1168–1177 Aug. 2005.

[4] Eri Prasetyo, Dominique Ginhac, Michel Paindavoine, "Design and Implementation of an 8-bit Pipeline Analog to Digital Converter in 0.6 µm CMOS Technology," In Proceedings of ISSM05, Paris, 30 September - 1 October 2005.

[5] Akerberg, D., "Comparison of Method for Active RC," Telecommunication Theory, Royal Institute of Technology, Stockholm, Sweden, Technical Report 19, June 1999.

Authors Profile

Dilip Singar received the B.E. degree in Electronics and Communication Engineering from Rajiv Gandhi Technical University, Bhopal, India in 2008 and the M.Tech in Microelectronics and VLSI Design from S.G.S.I.T.S., Indore, India in 2010. He is currently working on analog filter design and analysis.

D. S. Ajnar received the B.E. degree in Electronics and Communication Engineering from D.A.V.V. University, India in 1993 and the M.E. degree in Digital Techniques & Instrumentation Engineering from Rajiv Gandhi Technical University, Bhopal, India in 2000. He has been in the teaching and research profession since 1995. He is now working as a Reader in the Department of Electronics & Instrumentation Engineering, S.G.S.I.T.S., Indore, India. His research interests are in the design of analog filters and current conveyors.

P. K. Jain received the B.E. degree in Electronics and Communication Engineering from D.A.V.V. University, India in 1987 and the M.E. degree in Digital Techniques & Instrumentation Engineering from Rajiv Gandhi Technical University, Bhopal, India in 1993. He has been in the teaching and research profession since 1988. He is now working as a Reader in the Department of Electronics & Instrumentation Engineering, S.G.S.I.T.S., Indore, India. His field of research interest is analog circuit design.


Identification of Critical Factors for Fast Multiple Faults Recovery Based on Reassignment of Task in Cluster Computing

Sanjay Bansal1, Sanjeev Sharma2

1Medi-Caps Institute of Technology and Management

Indore, India [email protected]

2 School of Information Technology, Rajiv Gandhi Proudyogiki Vishwavidyalaya

Bhopal, India [email protected]

Abstract: The performance of a recovery algorithm based on reassignment of tasks in a distributed system can be improved by a better scheduling algorithm. Through equal distribution of the recovered tasks, execution time and resource utilization can be improved; this is done by distributed scheduling algorithms. Speed and efficiency are the most desirable features of a recovery algorithm. In this paper the important critical issues involved in fast and efficient recovery are discussed, along with the impact of each issue on the performance of reassignment-based recovery. Relationships among the issues are also explored. Finally, the important issues are compared between simple reassignment-based recovery and fast reassignment-based recovery.

Keywords: Splitting Ratio, Reassignment, Distributed System.

1. Introduction

Fault tolerance is a crucial requirement in distributed computing. As the size and complexity of distributed systems increase to provide services to millions of users with large data transfers, the probability of faults has also increased. Faults are now inevitable and cannot be completely prevented. Air traffic control, defense applications, online railway reservation and online banking are a few applications where the user must remain unaware of faults and must continue with normal operation; even a single fault can lead to great loss of human lives and money. In such situations the inclusion of fault tolerance becomes essential. This inclusion of fault tolerance introduces an overhead which affects the performance of the whole system, and in the case of multiple faults it becomes more severe. In real-time applications, producing the output within the predetermined time for which the system is designed becomes impossible without fast and efficient recovery. A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected computers cooperatively working together as a single, integrated computing resource; a cluster is a multi-computer architecture. The Message Passing Interface (MPI) or the Parallel Virtual Machine (PVM) is used to facilitate inter-process communication in a cluster. Clusters and distributed systems offer fault tolerance and high performance through load sharing; for this reason clusters are attractive for real-time applications. When one or more computers fail, the available load must be redistributed evenly in order to have fault tolerance with performance [1]. The redistribution is determined by the recovery scheme, which should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down [13]. Fault tolerance with performance in cluster computing can be achieved by distributing the available load evenly and equally among the computing devices. In the case of multiple faults or multiple node failures, up to a reasonable number, the load can be redistributed among all available nodes. This reassignment-based recovery can be easily implemented in cluster computing. A reassignment-based recovery algorithm can be improved by reducing the various overheads of reassignment and by improving the scheduling algorithm used for reassignment. Redistribution of tasks is done by the scheduling algorithm. Recovery based on reassignment of tasks using a scheduling algorithm must be fast, in the sense that it must take few iterations and little time to redistribute the tasks evenly and equally among all working (non-faulty) nodes. A reassignment-based recovery can be made fast by investigating and optimizing the critical issues related to reassignment, the system environment and the nature of computation and communication. In a fast reassignment-based recovery algorithm, the available load is redistributed to the different computing nodes in fewer iterations and less time, so fast recovery based on reassignment is a special case of reassignment-based recovery with fast convergence. A fast recovery based on reassignment takes few or optimal iterations to converge, i.e., to reassign or redistribute the load evenly across all computing nodes in a distributed system with low communication overhead when one or more computers


break down. In the next section, various critical issues related to fast reassignment based recovery are discussed.

2. Critical Factors in Fast Recovery Based on Reassignment

2.1 Instability and Thrashing

Recovery based on reassignment of tasks is performed by distributing the available tasks among all available computing nodes; this is known as reassignment-based recovery. Recovery time can be reduced by making the reassignment even and equal [2]. This can be achieved by task migration from heavily loaded computing nodes to lightly loaded computing nodes. In doing this, the cluster sometimes becomes unstable: it can enter a state in which all the computing nodes spend their time on load transfer without accomplishing useful work. This is known as thrashing. A token-based algorithm has been proposed to overcome thrashing or instability [3]: a node's computation capability is divided into several parts, each called a token, and a process that has to execute on that node requests a token before it can execute. The algorithm works well if the computation times of all nodes are known in advance. Load balancing is driven by load-balancing decisions, and contradictory decisions are the main cause of thrashing. An algorithm is said to be thrashing-free or stable if it can be shown that the load calculation remains consistent during load balancing. By forming groups, thrashing can be prevented to some extent, but in highly dynamic systems thrashing remains a problem; sometimes a limit can be set for such systems. In this way, thrashing or instability depends on the strategy chosen for taking the decision, and the selection of a particular strategy depends on the characteristics of the system (static, dynamic, highly dynamic, etc.).

2.2 Effectiveness

Another issue with fast recovery algorithms is effectiveness. An effective reassignment-based recovery policy ensures optimal use of the distributed resources, whereby no computing node remains idle while any other computing node is busy. Effectiveness in reassignment-based recovery depends on the accuracy of the knowledge of the state of each computing node; this level of accuracy determines how the tasks of the failed node are assigned to appropriate computing nodes by the reassignment policy. A regeneration-theory approach has been used to measure effectiveness: it analytically characterizes the average overall completion time in a distributed system, considering the heterogeneity in the processing rates of the nodes as well as the communication delays. An optimal one-shot algorithm is proposed by Sagar Dhakal [4]. This algorithm is effective since it minimizes the average completion time per task while improving the system processing rate; however, it considers only two nodes, whereas in practical situations multi-node distributed computing takes place. A fast reassignment-based algorithm should redistribute the load evenly, but in reality no algorithm achieves perfectly even or uniform redistribution; only some degree of uniform or even reassignment can be achieved. Heterogeneity is the main issue that must be addressed carefully to achieve effectiveness in a reassignment-based algorithm; weighted load reassignment is one approach to address it. Heterogeneity must be taken into account before any decision on redistribution of load in a recovery technique.

2.3 Splitting Ratio

In reassignment-based recovery, time can be reduced by assigning more tasks to lightly loaded computing nodes as well as by transferring load from heavily loaded nodes to lightly loaded nodes. This is done by process migration or load transfer. The load transfer module transfers a fraction of the load from the most loaded node to the least loaded node; this fraction is called the load transfer ratio (LTR). A larger LTR needs fewer iterations to balance the load, whereas a smaller LTR needs more iterations [9]. The sketch below illustrates this effect.
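The following is a minimal sketch of this effect, assuming a simple rule in which a fraction (the LTR) of the max-min load gap is moved per iteration; the rule and the example loads are illustrative, not the scheme of [9].

```python
# Hedged sketch: how the load transfer ratio (LTR) affects the number of
# iterations needed to even out node loads after a failure. The
# "move a fraction of the max-min gap" rule and the example loads are
# illustrative assumptions.

def iterations_to_balance(loads, ltr, tolerance=1.0, max_iters=10000):
    loads = list(loads)
    for step in range(max_iters):
        hi = max(range(len(loads)), key=loads.__getitem__)
        lo = min(range(len(loads)), key=loads.__getitem__)
        gap = loads[hi] - loads[lo]
        if gap <= tolerance:
            return step                  # loads considered balanced
        transfer = ltr * gap             # fraction of the gap moved per step
        loads[hi] -= transfer
        loads[lo] += transfer
    return max_iters

if __name__ == "__main__":
    # Load of a failed node already dumped onto node 0.
    initial = [90, 20, 25, 15, 30]
    for ltr in (0.1, 0.25, 0.5):
        print(f"LTR={ltr}: {iterations_to_balance(initial, ltr)} iterations")
```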

2.4 Convergence

Convergence of a fast reassignment-based algorithm concerns the number of iterations and the time taken by the algorithm to reach an optimal reassignment state [5]. Rapidly converging algorithms are a necessity for today's distributed systems. Different recovery algorithms are based on various reassignment strategies: while some strategies provably converge in polynomial time, for others the convergence may require an exponential number of steps [6]. A fast reassignment-based recovery must have a high convergence speed and a low communication overhead; however, fast convergence can cause instability, thrashing, etc. A reassignment-based algorithm must converge within an upper bound on the number of iterations and on time, and for a fast reassignment-based algorithm this upper bound is smaller. Fast convergence is a requirement, but it also demands a fast transfer mechanism and can cause instability. Convergence can further be improved by minimizing communication delay and traffic and by increasing the load transfer ratio.

2.5 Cost Function

A cost function predicts the recovery time for any given reassignment-based recovery in a multi-user heterogeneous network environment. Finding an accurate optimization cost function for a reassignment-based recovery algorithm is very difficult [7]. Williams suggested a cost function that is the sum of a part that accounts for load imbalance and a part that accounts for communication [8]. The cost function can be reduced through fast convergence and by reducing communication delay and the various scheduling overheads. Minimizing the load imbalance depends on the effectiveness of the algorithm: redistribution is effective if it results in an almost uniform and even load distribution on every working node. In the case of nodes of different speed and performance, effectiveness is achieved if the load is distributed in proportion to their computational power; the cost function is therefore more crucial for heterogeneous systems than for homogeneous ones. Communication cost can be reduced by selecting an appropriate scheduling technique. In one algorithm, the task of resource management is handled by


dividing the nodes into mutually overlapping subsets, so that a node obtains system state information by querying only a few nodes. This minimizes the scheduling and communication overhead and therefore the cost function [10]. A sketch of such a composite cost function is given below.
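The following is a minimal sketch of such a composite cost function; the weights, the variance-based imbalance term and the example numbers are illustrative assumptions rather than the exact formulation of [8].

```python
# Hedged sketch of a Williams-style composite cost function: a weighted sum
# of a load-imbalance term and a communication term. Weights, the variance
# measure and the per-transfer communication volumes are illustrative.

def reassignment_cost(node_loads, comm_volumes, alpha=1.0, beta=0.5):
    mean = sum(node_loads) / len(node_loads)
    imbalance = sum((l - mean) ** 2 for l in node_loads) / len(node_loads)
    communication = sum(comm_volumes)     # total data moved between nodes
    return alpha * imbalance + beta * communication

if __name__ == "__main__":
    balanced = reassignment_cost([25, 26, 24, 25], comm_volumes=[40, 35])
    unbalanced = reassignment_cost([60, 10, 20, 10], comm_volumes=[5])
    print(f"balanced plan cost:   {balanced:.1f}")
    print(f"unbalanced plan cost: {unbalanced:.1f}")
```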

2.6 Reassignment estimation policy

A reassignment estimation policy determines how to estimate the workload of a particular node of the system [1]. Estimating the workload of a particular node is a difficult problem for which no completely satisfactory solution exists. A node's workload can be estimated from measurable guidelines, which may include time-dependent and node-dependent factors such as the number of processes on the node, the resource demands of these processes, the instruction mixes of these processes, and the architecture and speed of the node's processor [3].

2.7 Process Transfer Policy

The process transfer policy determines whether to execute a process locally or remotely. This issue mainly concerns dynamic reassignment-based recovery algorithms [10]. The process transfer decision is a major issue that affects the performance of a reassignment-based recovery algorithm. Generally this policy is based on a threshold value: a value, or set of values, used as a criterion to determine whether a processor is overloaded or not [11].
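As an illustration of such a threshold-based transfer policy, the sketch below uses a double threshold; the threshold values and the utilisation metric are assumptions for the example only.

```python
# Hedged sketch of a threshold-based process transfer policy: a double
# threshold marks a node as overloaded or underloaded, and a newly arriving
# process is executed remotely only when the local node is overloaded and
# a suitably underloaded node exists. Threshold values are illustrative.

HIGH_THRESHOLD = 0.80   # utilisation above which a node is considered overloaded
LOW_THRESHOLD = 0.40    # utilisation below which a node may accept remote work

def transfer_decision(local_util, remote_utils):
    if local_util <= HIGH_THRESHOLD:
        return "execute locally"
    candidates = [i for i, u in enumerate(remote_utils) if u < LOW_THRESHOLD]
    if not candidates:
        return "execute locally"         # nowhere better to send it
    target = min(candidates, key=lambda i: remote_utils[i])
    return f"transfer to node {target}"

if __name__ == "__main__":
    print(transfer_decision(0.92, [0.65, 0.30, 0.55]))   # -> transfer to node 1
    print(transfer_decision(0.50, [0.20, 0.10]))         # -> execute locally
```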

2.8 Reassignment Strategies

A reassignment strategy can be sender-initiated vs. receiver-initiated, global vs. local, and centralized vs. distributed. In sender-initiated policies, heavily loaded nodes attempt to move work to lightly loaded nodes; in receiver-initiated policies, lightly loaded nodes look for heavily loaded nodes from which work may be received [12]. With global strategies the load balancer uses the performance of all workstations, whereas with local strategies the workstations are partitioned into groups. In a centralized scheme the load balancer is located on one master workstation node; in a distributed scheme the load balancer is replicated on all workstations [13].

2.9 Nature and Type of Application Running on the Cluster

A reassignment-based algorithm that works well with one type of task may not work well with another type of load. Thus the nature and type of load is a major issue that determines the performance of a load balancing algorithm. Loads can be communication-intensive vs. computation-intensive, or I/O-intensive vs. CPU-intensive. Researchers have suggested a fast reassignment-based algorithm for jobs with intensive I/O and memory requirements, allocating each job to the node with the largest unused I/O and memory capacity; video on demand is an example of such an I/O- and memory-intensive application [14].

2.10 Communication Complexity

An algorithm must have low communication complexity. Researchers have suggested schemes in which each node receives information about the other nodes from all nodes without redundancy; for a distributed system with n nodes the communication complexity is O(n²), and algorithms with lower complexity have been developed [15].

2.11 Failure Detection Time

Fast multiple-fault recovery depends on immediate detection of the faults that have occurred. In some cases it is difficult to determine the cause of a fault in order to provide fast fault restoration/isolation [16]. A fault must be detected as soon as it has occurred: the time elapsed between the occurrence of a fault and its detection must be as small as possible. Such fast fault detection is difficult, because a healthy processor may be declared faulty due to heavy load and traffic. The total recovery time is the sum of the failure detection time and the time taken to run the recovery algorithm after detection. Avoiding extensive undo or redo of application or system data upon recovery is key to providing fast recovery [17]. Conventional recovery algorithms redo the computation of the crashed process since the last checkpoint on a single processor; as a result, the recovery time of all such protocols is no less than the time between the last checkpoint and the crash, and improved algorithms are needed to address this drawback in fast recovery [18]. If faults are detected immediately, further loss is smaller and the recovery process can limit it as quickly as possible; in that case the job of recovery is easier and hence recovery is fast.
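As an illustration of the detection-time trade-off just described, the following sketch shows a heartbeat-style detector: a shorter timeout detects crashes sooner but is more likely to declare a healthy, heavily loaded node faulty. The heartbeat period, timeout values and example timestamps are illustrative assumptions.

```python
# Hedged sketch of heartbeat-based failure detection: each node is expected
# to send a heartbeat periodically; a node whose last heartbeat is older
# than the timeout is suspected to have failed. A small timeout shortens
# detection time but risks false suspicion of slow (heavily loaded) nodes.
import time

HEARTBEAT_PERIOD = 1.0          # seconds between heartbeats (assumption)

def suspected_failures(last_heartbeat, now, timeout):
    # last_heartbeat: dict node -> timestamp of the most recent heartbeat
    return [node for node, t in last_heartbeat.items() if now - t > timeout]

if __name__ == "__main__":
    now = time.time()
    last = {"node-1": now - 0.4, "node-2": now - 2.5, "node-3": now - 6.0}
    print("aggressive (2 s timeout):  ", suspected_failures(last, now, timeout=2.0))
    print("conservative (5 s timeout):", suspected_failures(last, now, timeout=5.0))
```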

2.12 Selection of Node(s) to Run Recovery

Fast recovery also depends on the selection of the node that runs the recovery algorithm. The list of available computers and their current workloads is dynamic within clusters, which has a direct impact on which computers should be used each time a recovery operation begins [19]. The recovery manager has to select the best node to run the recovery algorithm from a list of several computers; a reliable, fast and idle node has to be selected. In a cluster, nodes are added and removed dynamically and loads fluctuate dynamically as well, so the selection of an idle node to run recovery is tedious. The recovery process can be run on a single node or on more than one; running recovery on multiple nodes avoids a single point of failure and constitutes a distributed recovery scheme.

2.13 Performance

The performance of a reassignment policy is affected by various delays. These include the delay due to transfer of load from one computing node to another, which varies with the load and also depends on the traffic between the source and destination nodes. Another significant delay is the communication delay, and the scheduling overhead is a further delay that affects the performance of recovery based on reassignment [20]. In order to make a reassignment-based recovery fast, all delays and overheads must be minimized.

2.14 One-Time Assignment vs. Dynamic Reassignment The one-time assignment of a task may be dynamically done but once it is scheduled to a given processor, it can never be rescheduled to another one [21]. On the other hand, in the dynamic reassignment process, jobs can migrate from one node to another even after the initial placement is made.


3. Comparisons

Table 1: Issues Comparison

Issue                              | Simple reassignment-based recovery | Fast reassignment-based recovery
Thrashing                          | No                                 | Yes
Effectiveness                      | Less                               | More
Convergence                        | Slow                               | Fast
Instability                        | Less                               | More
Adaptive                           | No                                 | Yes
Response time                      | Less                               | More
Cost                               | Less                               | More
Splitting ratio                    | Fixed                              | Variable
Communication complexity           | Less                               | More
Process transfer policy required   | No                                 | Yes
Failure detection                  | Medium                             | Fast

4. Conclusion We have discussed many issues of reassignment based recovery for multiple faults. These issues were discussed in terms of their impact on the overall performance of a distributed system. For a real time distributed system, convergence and response time are important. Some issues are not relevant to simple reassignment based recovery; fast reassignment requires handling more issues than a simple one. To make recovery adaptive and fast, the splitting ratio must be high, but at the same time a high splitting ratio demands a fast mechanism for transferring load to the destination nodes. Fast and efficient recovery has two major factors: minimizing the various overheads and distributing the load as equally as possible by properly addressing the issues discussed. This will also reduce cost. Instability and thrashing are the main issues that must be addressed efficiently while designing a fast reassignment based recovery for multiple fault tolerance. Recovery time can also be reduced by detecting multiple faults as early as possible.

References [1] A. Chhabra, G. Singh, Sandeep S. Waraich, B. Sidhu,

and G. Kumar , “Qualitative Parametric Comparison of Load Balancing Algorithms in Parallel and Distributed Computing Environment” , World Academy of Science, Engineering and Technology pp 39-42 , 2006.

[2] R. D. Babu1 and P. Sakthive, “Optimal Recovery Schemes in Distributed Computing,” IJCSNS International Journal of Computer Science and Network Security, Vol.9 No.7, July 2009.

[3] P. Neelakantan, M. M .Naidu, “Token Based Load Balancing in Case of Process Thrashing”, pp 152-155, International Journal of Computer Science and Network Security, Vol .7 No.12, Dec 2007.

[4] S. Dhakal, M. M. Hayat, J. E. Pezoa, C. Yang, “Dynamic Load Balancing in Distributed Systems in the Presence of Delays: A Regeneration-Theory Approach”, Parallel And Distributed Systems IEEE, pp 485- 497, vol. 18, No. 4, Apr. 2007.

[5] J. E. Gehrke , C. G. Plaxton , R. Rajaraman , “Rapid Convergence of a Local Load Balancing Algorithm for Asynchronous Rings”, Theoretical Computer Science,Vol. 220 , Issue 1, pp 247 – 265, 1999.

[6] E. E-Dar, A. Kesselman, Y. Mansour, “Convergence Time to Nash Equilibrium in Load Balancing”, ACM Transactions on Computational Logic, Vol. 2, No. 3, 09 2001, Pages 111- 131.

[7] Y. P. Chien and et.al. “Cost Function for Dynamic Load Balancing of Explicit Parallel CFD Solvers with Variable Time-stepping Strategies, ” International Journal of Computational Fluid Dynamics, Vol. 15, Issue 3 Nov. 2001 , pp. 183 – 195.

[8] Roy D. Williams, Performance of Dynamic Load Balancing Algorithms for Unstructured Mesh Calculations,” Concurrency: Practice and Experience, Vol 3 , Issue 5 Oct 1991,pp 457 – 481

[9] G. Murugesan, A.M. Natarajan and C. Venkatesh , “Enhanced Variable Splitting Ratio Algorithm for Effective Load Balancing in MPLS Networks,” Journal of Computer Science 4 (3): 232-238, 2008.

[10] Md. Abdur Razzaque and Choong Seon Hong, “Dynamic Load Balancing in Distributed System: An Efficient Approach,” www.networking.khu.ac.kr/.../.

[11] A. C. Filte, "A new distributed diffusion algorithm for dynamic load balancing in parallel systems," Doctoral Thesis, Universitat Autònoma de Barcelona, 2000.

[12] D. Eager, E. Lazowska, J. Zahorjan, "A comparison of receiver-initiated and sender-initiated adaptive load sharing," Performance Evaluation, vol. 6, no. 1, pp. 53-68, March 1986.

[13] S. Malik, “Dynamic Load Balancing in a Network of Workstations”, pp 1-16, 95.515F Research Report, November 29, 2000.

[14] P. Kumar Chandra, B. Sahoo, "Performance Analysis of Load Balancing Algorithms for Cluster of Video on Demand Servers", pp. 408-412, IEEE International Advance Computing Conference (IACC 2009), India, 2009.

[15] I. Chung and Y. Bae, “The design of an efficient load balancing algorithm employing block design”, J. Appl. Math. & Computing Vol. 14(2004), pp. 343 – 351.

[16] H. G. Hotel, “UPnP based Service Discovery and Service Violation Handling for Distributed Fault Management in WBEM-based Network Management,” 5th ACIS International Conference on Software Engineering Research, (SERA 2007).

[17] R. Christodoulopoulou, K. Manassiev, A. Bilas and C. Amza, “Fast and Transparent Recovery for Continuous Availability of Cluster-based Servers,” PPoPP’06, March 29–31, 2006, New York, USA, 2006 ACM.

[18] X. Yang, Y. Du, Panfeng ,W., H. Fu, and J. Jia, “FTPA: Supporting Fault-Tolerant Parallel Computing through Parallel Recomputing,” IEEE Transactions on Parallel And Distributed Systems, vol. 20, oct. 2009.

[19] A. Maloney and A. Goscinski, “Transparent and Autonomic Rollback-Recovery in Cluster Systems,” 2008 14th IEEE International Conference on Parallel and Distributed Systems.

[20] J. Ghanem, “Implementation of Load Balancing Policies in Distributed Systems,” THESIS, American University of Beirut, 2002.


[21] J. Ghanem et al., "On load balancing in distributed systems with large time delays: Theory and experiments," IEEE Mediterranean Conference on Control and Automation, Turkey, 2004.

Authors Profile Sanjay Bansal received his B.E. (Electronics & Telecommunication Engineering) in 1994 and M.E. (CSE) in 2001. Presently he is working as a Reader at Medi-Caps Institute of Technology, Indore. He is pursuing a PhD at Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, India. Sanjeev Sharma received his B.E. (Electrical Engineering) in 1991 and M.E. in 2000, and holds a PhD. His research areas are mobile computing, data mining, security and privacy, and ad hoc networks. He has published many research papers in national and international journals. Presently he is working as an Associate Professor at Rajiv Gandhi Proudyogiki Vishwavidyalaya, Bhopal, India.


Evaluating Clustering Performance for the Protected Data using Perturbative Masking Techniques in Privacy Preserving Data Mining

S.Vijayarani1, Dr.A.Tamilarasi2

1 School of Computer Science and Engg., Bharathiar University, Coimbatore, Tamilnadu, India

[email protected]

2Dept. of MCA, Kongu Engg. College, Erode, Tamilnadu, India [email protected]

Abstract- Privacy preserving data mining has become very popular for protecting confidential knowledge that could be extracted by data mining techniques. Privacy preserving data mining is the study of how to produce valid mining models and patterns without disclosing private information. Several techniques are used for protecting sensitive data, among them statistical, cryptographic, randomization, k-anonymity and l-diversity approaches. In this work, we have analyzed two statistical disclosure control techniques, namely additive noise and micro aggregation, and examined their clustering performance. The experimental results show that the clustering performance of the additive noise technique is comparatively better than that of micro aggregation.

Keywords- Data Perturbation, Micro Aggregation, Additive Noise, K-means clustering

1. Introduction The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users and the increasing sophistication of data mining algorithms that leverage this information. Many data mining applications, such as financial transactions, health-care records, and network communication traffic, deal with private sensitive data. Data is an important asset that business organizations and governments analyze for decision making. Privacy regulations and other privacy concerns may prevent data owners from sharing information for data analysis. In order to share data while preserving privacy, the data owner must come up with a solution that achieves the dual goal of privacy preservation and accurate data mining results. The main considerations in privacy preserving data mining are twofold. First, sensitive raw data like identifiers, names, addresses and the like should be modified or trimmed out from the original database. Second, sensitive knowledge which can be mined from a database by using data mining algorithms should also be excluded, because such knowledge can equally well compromise data privacy. The main objective in privacy preserving data mining is to develop algorithms for modifying the original data in some

way, so that the private data and private knowledge remain private even after the mining process [8]. Data modification is one of the privacy preserving techniques used to modify the sensitive or original information in a database that needs to be released to the public; it ensures high privacy protection. The rest of this paper is organized as follows. In Section 2, we present an overview of micro data and masking techniques. Section 3 discusses different types of micro data protection techniques. Additive noise and micro aggregation techniques are discussed in Section 4. Section 5 gives the performance results of additive noise and micro aggregation. Conclusions are given in Section 6. 2. Micro Data

Micro data is static data about individual respondents. It can be represented as tables consisting of tuples (records) with values from a set of attributes. A micro data set V is a file with n records, where each record contains m attributes on an individual respondent [3]. The attributes can be classified in four categories which are not necessarily disjoint:

Ø Identifiers. These are attributes that unambiguously identify the respondent. Examples are the passport number, social security number, name and surname, etc.

Ø Quasi-identifiers or key attributes. These are attributes which identify the respondent with some degree of ambiguity. Examples are address, gender, age, telephone number, etc.

Ø Confidential outcome attributes. These are attributes which contain sensitive information on the respondent. Examples are salary, religion, political affiliation, health condition, etc.

Ø Non-confidential outcome attributes. Those attributes which do not fall in any of the categories above.
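As a small illustration of this taxonomy, the following sketch classifies the attributes of a hypothetical employee record (the attribute names are invented for the example):

# Illustrative classification of the attributes of a hypothetical employee record.
ATTRIBUTE_CATEGORIES = {
    "passport_number": "identifier",
    "name":            "identifier",
    "age":             "quasi-identifier",
    "gender":          "quasi-identifier",
    "zip_code":        "quasi-identifier",
    "salary":          "confidential outcome",
    "health_status":   "confidential outcome",
    "favourite_color": "non-confidential outcome",
}

quasi_identifiers = [a for a, c in ATTRIBUTE_CATEGORIES.items() if c == "quasi-identifier"]
print(quasi_identifiers)   # ['age', 'gender', 'zip_code']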


3. Classification of micro data protection techniques (MPTs) [3]

Figure 1. Micro-data Protection Techniques

3.1 Masking Techniques

Protecting sensitive data is a very significant issue for government, public and private bodies. Masking techniques are used to protect the confidential information in a table. Masking techniques can operate on different data types, which can be categorized as follows.

Ø Continuous. An attribute is said to be continuous if it is numerical and arithmetic operations are defined on it. For instance, the attributes age and income are continuous attributes.

Ø Categorical. An attribute is said to be categorical if it can assume a limited and specified set of values and arithmetic operations do not make sense on it. For instance, the attributes marital status and sex are categorical attributes.

Masking techniques are classified into two categories: Ø Perturbative Ø Non-Perturbative

3.1.1 Perturbative Masking

Perturbation is the alteration of an attribute value by a new value. The data set is distorted before publication, in a way that affects the protected data set, i.e. it may contain some errors. In this way original combinations of data items may disappear and new unique combinations may appear in the perturbed dataset; in the perturbation approach, statistics computed on the perturbed dataset should not differ significantly from the statistics obtained on the original dataset [3]. Some of the perturbative masking methods are: Ø Micro aggregation Ø Rank swapping Ø Additive noise Ø Rounding Ø Resampling Ø PRAM Ø MASSC, etc.

3.1.2 Non-Perturbative Masking Non-perturbative techniques produce protected microdata by eliminating details from the original microdata. Some of the Non-perturbative masking methods are Ø Sampling Ø Local Suppression Ø Global Recoding Ø Top-Coding Ø Bottom-Coding Ø Generalization

3.2 Synthetic Techniques The original set of tuples in a microdata table is replaced with a new set of tuples generated in such a way as to preserve the key statistical properties of the original data. The generation process is usually based on a statistical model, and key statistical properties that are not included in the model will not necessarily be respected by the synthetic data. Since the released micro data table contains synthetic data, the re-identification risk is reduced. The techniques are divided into two categories: fully synthetic techniques and partially synthetic techniques. The first category contains techniques that generate a completely new set of data, while the techniques in the second category merge the original data with synthetic data. 3.2.1 Fully Synthetic Techniques Ø Bootstrap Ø Cholesky Decomposition Ø Multiple Imputation Ø Maximum Entropy Ø Latin Hypercube Sampling

3.2.2 Partially Synthetic Techniques Ø IPSO (Information Preserving Statistical Obfuscation) Ø Hybrid Masking Ø Random Response Ø Blank and Impute Ø SMIKe (Selective Multiple Imputation of Keys) Ø Multiply Imputed Partially Synthetic Dataset [3]

4. Analysis of the SDC techniques The main steps involved in this work are:

• A sensitive numerical data item is selected from the database
• The sensitive data item is modified using micro aggregation and additive noise
• The statistical performance is analyzed
• The accuracy of privacy protection is analyzed
• The clustering accuracy is evaluated


4.1 Micro aggregation Micro aggregation is an SDC technique consisting in the aggregation of individual data. It can be considered an SDC sub-discipline devoted to the protection of micro data. Micro aggregation can be seen as a clustering problem with constraints on the size of the clusters. It is somehow related to other clustering problems (e.g., dimension reduction or minimum squares design of clusters). However, the main difference of the micro aggregation problem is that it does not consider the number of clusters to generate or the number of dimensions to reduce, but only the minimum number of elements that are grouped in the same cluster [9]. For any type of data, micro aggregation can be operationally defined in terms of the following two steps:

Ø Partition: The set of original records is partitioned into several clusters in such a way that records in the same cluster are similar to each other and so that the number of records in each cluster is at least k.

Ø Aggregation: An aggregation operator (for example, the mean for continuous data or the median for categorical data) is computed for each cluster and is used to replace the original records. In other words, each record in a cluster is replaced by the cluster's prototype.

From an operational point of view, micro aggregation applies:

Ø A clustering algorithm to a set of data, obtaining a set of clusters. Formally, the algorithm determines a partition of the original data. Then, micro aggregation proceeds by calculating a cluster representative for each cluster. Finally,

Ø Each original datum is replaced by the corresponding cluster representative [7].
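As an illustration of the partition-and-aggregate idea just described, the following is a minimal sketch (not taken from the paper) of fixed-size micro aggregation on a single numeric attribute: the values are sorted, cut into consecutive groups of at least k records, and each value is replaced by its group mean. Grouping sorted neighbours is a simplifying heuristic assumed here.

import numpy as np

def microaggregate(values, k=3):
    """Replace each value by the mean of its group of >= k sorted neighbours.

    A simplified, univariate sketch of micro aggregation: sort, cut into
    consecutive groups of k (the last group absorbs any remainder so every
    group has at least k members), then substitute the group mean.
    """
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    protected = np.empty_like(values)

    n = len(values)
    n_groups = max(n // k, 1)
    for g in range(n_groups):
        start = g * k
        end = n if g == n_groups - 1 else (g + 1) * k   # last group takes the rest
        idx = order[start:end]
        protected[idx] = values[idx].mean()
    return protected

if __name__ == "__main__":
    income = [1200, 5400, 1250, 5600, 3100, 3000, 1190, 5500, 3050]
    print(microaggregate(income, k=3))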

After modifying the values, the k-means algorithm is applied to determine whether the original value and the modified value in the micro aggregation table fall in the same cluster. 4.2 Additive Noise Additive noise perturbs a sensitive attribute by adding to it, or multiplying it with, a random variable with a given distribution [2].

Ø Masking by uncorrelated noise addition. The vector of observations xj for the j-th attribute of the original dataset Xj is replaced by a vector zj = xj + εj, where εj is a vector of normally distributed errors drawn from a random variable εj ∼ N(0, σ²εj), such that Cov(εt, εl) = 0 for all t ≠ l. This preserves neither variances nor correlations.

Ø Masking by correlated noise addition.

Correlated noise addition also preserves means and additionally allows preservation of correlation coefficients. The difference with the previous method is that the covariance matrix of the errors is now proportional to the covariance matrix of the original data, i.e. ε ∼ N(0, Σε), where Σε = αΣ.

In this work we have used the following additive noise algorithm:

• Consider a database D consisting of n tuples, D = {t1, t2, …, tn}, where each tuple consists of a set of attributes {A1, A2, …, Ap}.
• Identify the sensitive or confidential numeric attribute AR.
• Calculate the mean of the attribute, mean = (Σ i=1..n ARi) / n.
• Initialize countgre = 0 and countmin = 0.
• For each i: if ARi >= mean, store the value in group1 and set countgre = countgre + 1; else (ARi < mean) store the value in group2 and set countmin = countmin + 1.
• Calculate the noise1 value as 2*mean/countgre.
• Calculate the noise2 value as 2*mean/countmin.
• Subtract the noise1 value from each data item in group1.
• Add the noise2 value to each data item in group2.
• Release the new modified sensitive data.
• The mean of the modified data is the same as that of the original data, since the total noise subtracted (noise1 × countgre = 2*mean) equals the total noise added (noise2 × countmin = 2*mean), so the two cancel out.
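The following minimal sketch implements the algorithm listed above (the variable names countgre and countmin follow the paper; everything else, including the use of NumPy, is our assumption). It splits the values around the mean, subtracts noise1 from group1, adds noise2 to group2, and confirms that the mean is preserved; the sketch assumes both groups are non-empty.

import numpy as np

def additive_noise_mask(values):
    """Perturb a sensitive numeric attribute as described in Section 4.2.

    Values >= mean form group1 and have noise1 = 2*mean/countgre subtracted;
    values < mean form group2 and have noise2 = 2*mean/countmin added.
    The total amount subtracted and added are both 2*mean, so the overall
    mean of the released data equals the mean of the original data.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean()

    ge_mask = values >= mean                  # group1
    lt_mask = ~ge_mask                        # group2
    countgre, countmin = ge_mask.sum(), lt_mask.sum()   # assumed non-zero

    noise1 = 2 * mean / countgre
    noise2 = 2 * mean / countmin

    modified = values.copy()
    modified[ge_mask] -= noise1
    modified[lt_mask] += noise2
    return modified

if __name__ == "__main__":
    income = np.array([1200, 5400, 1250, 5600, 3100, 3000, 1190, 5500, 3050])
    masked = additive_noise_mask(income)
    print("original mean :", income.mean())
    print("masked mean   :", masked.mean())   # identical up to floating point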

5. Experimental Results In order to conduct the experiments, a synthetic employee dataset with 500 records was created. From this dataset, we select the sensitive numeric attribute income. The additive noise and micro aggregation techniques are used to modify the attribute income. The following performance factors are considered for evaluating the two techniques.

5.1 Statistical Calculations The statistical properties mean, standard deviation and variance of the modified data are compared with those of the original data. Both techniques produced the same results.


Figure 2. Statistical Performance

5.2 Privacy Protection In order to verify the privacy protection, we analyzed whether all the sensitive data items were modified. The results show that there is 100% privacy protection.

Figure 3. Data Modification

5.3 Accuracy of Clustering

Ø K-Means Clustering Algorithm

The k-means algorithm for partitioning, where each cluster's center is represented by the mean value of the objects in the cluster:

Input
• K: the number of clusters
• D: a data set containing n objects

Output: a set of k clusters

Method
• Arbitrarily choose k objects from D as the initial cluster centers;
• Repeat
• (Re)assign each object to the cluster to which the object is most similar, based on the mean value of the objects in the cluster;
• Update the cluster means, i.e. calculate the mean value of the objects for each cluster;
• Until no change.

In order to verify the clustering accuracy we have used the k-means clustering algorithm. The clusters of the original data are compared with the clusters of the modified data. The clustering of the dataset modified by additive noise is the same as the clustering of the original data.
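A minimal sketch of this evaluation step follows (using scikit-learn's KMeans, which is our choice and not mentioned in the paper): both the original and the masked attribute are clustered with the same k, cluster labels are aligned by ranking the cluster centers, and the fraction of records that stay in the same cluster is reported.

import numpy as np
from sklearn.cluster import KMeans

def cluster_labels_1d(values, k=3, seed=0):
    """k-means on one attribute; relabel clusters by ascending center so that
    labels are comparable between two clusterings."""
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(x)
    rank = np.empty(k, dtype=int)
    rank[np.argsort(km.cluster_centers_.ravel())] = np.arange(k)
    return rank[km.labels_]

def clustering_agreement(original, masked, k=3):
    """Fraction of records placed in the same (rank-aligned) cluster."""
    return float(np.mean(cluster_labels_1d(original, k) == cluster_labels_1d(masked, k)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    income = rng.normal(30000, 8000, size=500)        # synthetic sensitive attribute

    # Mask with the additive noise scheme of Section 4.2 (re-stated inline).
    masked, mean = income.copy(), income.mean()
    ge = income >= mean
    masked[ge] -= 2 * mean / ge.sum()
    masked[~ge] += 2 * mean / (~ge).sum()

    print("cluster agreement:", clustering_agreement(income, masked, k=3))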

Figure 4. Accuracy of additive noise and micro aggregation.

6. Conclusions Preserving privacy in data mining activities is a very important issue in many applications. In this paper, we have analyzed the micro aggregation and additive noise perturbative masking techniques in privacy preserving data mining. Both micro aggregation and additive noise perform well in the statistical calculations and in privacy protection. We then used the modified datasets for clustering; the results show that the additive noise technique is comparatively better than micro aggregation. ACKNOWLEDGEMENT I would like to thank the UGC, New Delhi, for providing the necessary funds.

References [1] Brand R (2002). “Micro data protection through noise

addition". In Domingo-Ferrer J, editor, Inference Control in Statistical Databases, vol. 2316 of LNCS, pp. 97-116. Springer, Berlin Heidelberg.

[2] Charu C. Aggarwal (IBM T.J. Watson Research Center, USA) and Philip S. Yu (University of Illinois at Chicago, USA), "Privacy preserving data mining: Models and algorithms".

[3] V. Ciriani, S. De Capitani di Vimercati, S. Foresti, and P. Samarati, "Micro data protection", Springer US, Advances in Information Security, 2007.

[4] Feng Li, Jin Ma, Jian-hua Li, "Distributed anonymous data perturbation method for privacy-preserving data mining", School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200030, China.

[5] G. R. Sullivan. The Use of Added Error to Avoid Disclosure in Microdata Releases. PhD thesis, Iowa State University, 1989.

[6] J. J. Kim. A method for limiting disclosure in microdata based on random noise and transformation. In Proceedings of the Section on Survey Research Methods, pages 303–308, Alexandria VA, 1986. American Statistical Association.

[7] Vicenç Torra, "Constrained micro aggregation: Adding constraints for Data Editing", IIIA - Artificial Intelligence Research Institute, CSIC - Spanish Council


for Scientific Research, Campus UAB s/n, 08193 Bellaterra (Catalonia, Spain).

[8] Vassilios S. Verykios, Elisa Bertino, Igor Nai Fovino, Loredana Parasiliti Provenza, Yucel Saygin, Yannis Theodoridis, "State-of-the-art in Privacy Preserving Data Mining", SIGMOD Record, Vol. 33, No. 1, March 2004.

[9] Xiaoxun Sun, Hua Wang, Jiuyong Li, "Microdata Protection Through Approximate Microaggregation", 2009, Australian Computer Society, Inc. Thirty-Second Australasian Computer Science Conference (ACSC2009), Wellington, New Zealand. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 91. Bernard Mans, Ed.

Authors Profile

Mrs. S. Vijayarani has completed MCA and M.Phil in Computer Science. She is working as Assistant Professor in the School of Computer Science and Engineering, Bharathiar University, Coimbatore. She is currently pursuing her Ph.D in the area of privacy preserving data mining. She has published two papers in international journals and presented six research papers in

international and national conferences.

Dr. A. Tamilarasi is a Professor and Head in the Department of MCA, Kongu Engineering College, Perundurai. She has supervised a number of Ph.D students. She has published a number of research papers in national and international journals and conference proceedings.


Security Architecture Objectives for Beyond 3G Mobile Networks: A Survey and Critical Evaluation

Mana Yazdani1, and Majid Naderi2

1Faculty of Electrical Engineering, Iran University of

Science & Technology, Tehran, Iran [email protected]

2 Faculty of Electrical Engineering, Iran University of

Science & Technology, Tehran, Iran [email protected]

Abstract: This paper provides a review on the different authentication and key agreement candidate protocols, EAP-SIM, EAP-AKA, EAP-TLS, EAP-TTLS, and PEAP, for the interworking of the WLAN with 3GPP networks. Each protocol’s message flow is also presented. A critical evaluation and comparison between the protocols is provided in order to reveal the deficiency and vulnerability of each procedure. The identity protection, the Man-in-the-Middle attack, possibility of replay attack, latency, energy consumption and the total size of these protocols are evaluated.

Keywords: Security, WLAN, 3G, EAP-SIM, EAP-AKA, EAP-TLS, EAP-TTLS, PEAP.

1. Introduction Wireless communication systems are being integrated to comply with the recent increase in demand and rapid development. The integration of different wireless networks inevitably raises new security issues. It should be taken into consideration that the act of integrating two secure networks must not negatively impact the overall security, bit rate, and mobility of each network.

The Universal Mobile Telecommunication System (UMTS) as a 3rd Generation (3G) mobile network and Wireless Local Area Networks (WLAN) are among the most practical technologies providing wireless services. 3G networks benefit from the wide area coverage, roaming, and mobility while WLAN systems offer very high bit rates. On the other hand, 3G and WLAN systems suffer from limited capacity and less area coverage respectively. Practically, the 3G-WLAN interworking keeps the advantages of both 3G and WLAN networks intact providing the users with the ubiquitous services. Numerous studies have been conducted to improve various security aspects of such a heterogeneous network [13]-[16].

Although the 3rd Generation Partnership Project (3GPP) has accepted two access scenarios for the 3G-WLAN interworking, Extensible Authentication Protocol - Subscriber Identity Module (EAP-SIM) and Extensible Authentication Protocol - Authentication and Key Agreement (EAP-AKA) [1]-[4], these protocols have shown security flaws and deficiencies [11]. Besides, some other authentication protocols have also been evaluated to fulfill the interworking requirements. These protocols were essentially proposed by the Internet Engineering Task Force (IETF) and are widely used for internet

applications. EAP-TLS (Transport Layer Security) [5], EAP-TTLS (Tunnel TLS) [6], and PEAP (Protected EAP) [7] are among the most applicable Internet protocols. A brief introduction to these protocols with the critical security assessment will be provided in this paper.

2. Interworking Authentication Protocols EAP is a wrapper for authentication protocols that encapsulates the protocols and provides an end to end security between the Authentication, Authorization, and Accounting (AAA) server and the User Equipment (UE) [8].

In this section the authentication protocols which have been evaluated as candidates for the integration of 3G-WLAN are briefly presented. As mentioned earlier, two protocols, EAP-SIM and EAP-AKA, have been accepted and are currently employed by 3GPP as the authentication protocols used for the interworking of 3G-WLAN [1], [2]. The other authentication protocols have been assessed in articles as alternative candidates.

2.1. EAP-SIM EAP-SIM is a protocol used in the interworking of 3G and WLAN when the SIM-card of the GSM/GPRS is being applied in the UE side. Although the security credentials used in the GSM are also engaged in the authentication process of EAP-SIM, some enhancements have been implemented to eliminate the known security weaknesses of GSM.

The main vulnerability of the GSM networks emerged from the assumption that owning a base station is not affordable for a potential attacker. Consequently, mutual authentication is not supported in GSM/GPRS and only the user is authenticated to the network. Another security flaw is the weak ciphering systems used in the authentication process: many attacks have been published on A5/1 and especially A5/2, the main cryptographic primitives of GSM/GPRS [9], [10]. EAP-SIM is enhanced in comparison to the GSM/GPRS authentication to provide mutual authentication and use a longer security key. In practice, the security key for GSM/GPRS is 64 bits, which is enhanced up to 128 bits for EAP-SIM [3].

The EAP-SIM authentication and key agreement procedure is presented in Fig. 1. The steps of the authentication procedure are shown by numbers 1 through 10 in the Fig. 1. The first and second steps are the


initialization of the authentication procedure in which the user communicates with the wireless AP via the EAP Over LAN (EAPOL). The user sends his identity in the format of the Network Access Identifier (NAI). This identity can be the International Mobile Subscriber Identity (IMSI) or his temporary identity (TMSI). The IMSI must be sent in a plain text in the first connection setup and the TMSI is used in the other setups.

Figure 1. EAP-SIM authentication protocol

The AAA server recognizes the user's identity through steps 3 and 4, and the authentication procedure starts at this point. The NONCE sent in the fifth step is the user's challenge to the network. In steps 6 and 7, the AAA server obtains n (n = 2 or n = 3) authentication triplets (RAND, SRES, and Kc) for the specific user. The generation of these triplets is based on a permanent secret key shared between the user and the network. Step 8 sends the n RANDs and the MACserver; the MACserver is calculated over the NONCE and the n RANDs using a MAC algorithm, so the user can authenticate the network by verifying the received MACserver. The ninth step includes the MACuser and the n XRES values calculated by the UE. Upon receiving this message, the AAA server verifies the MACuser and checks whether the received XRES is equal to the SRES. If the check succeeds, the user is authenticated to the AAA server and an EAP success message is sent to the user, indicating the completion of the authentication procedure. At the end of this procedure, the session key is sent to the access point via the AAA server by an AAA protocol (RADIUS or DIAMETER). This session key is used for encryption purposes between the UE and the AP. Further details on the EAP-SIM authentication procedure can be found in references [3] and [11].
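As a schematic illustration of the challenge/MAC exchange described above (this is not an implementation of RFC 4186: the HMAC-SHA-256 construction, the key derivation and the message encoding here are simplifying assumptions), the network computes MAC_server over the RANDs and the peer's NONCE with a key derived from the shared secret, and the peer recomputes it to authenticate the network:

import hmac, hashlib, os

def derive_k_aut(master_secret):
    # Simplifying assumption: a single hash of the shared secret stands in for
    # the full EAP-SIM key hierarchy defined in RFC 4186.
    return hashlib.sha256(b"k_aut" + master_secret).digest()

def mac_over(k_aut, rands, nonce):
    # MAC_server binds the server-chosen RANDs to the peer's NONCE.
    return hmac.new(k_aut, b"".join(rands) + nonce, hashlib.sha256).digest()

# --- toy run --------------------------------------------------------------
shared_secret = os.urandom(16)               # secret shared between (U)SIM and network
k_aut = derive_k_aut(shared_secret)

nonce = os.urandom(16)                       # peer's challenge to the network (step 5)
rands = [os.urandom(16) for _ in range(2)]   # n GSM RANDs fetched by the AAA server

mac_server = mac_over(k_aut, rands, nonce)   # sent to the peer in step 8

# Peer side: recompute with its own copy of the key and compare in constant time.
peer_mac = mac_over(derive_k_aut(shared_secret), rands, nonce)
print("network authenticated:", hmac.compare_digest(mac_server, peer_mac))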

2.2. EAP-AKA EAP-AKA is another authentication protocol which is used in the interworking of 3G-WLAN when the user owns a USIM card [4], [11]. A USIM card is the application utilized on the smart card (UICC) of the 3G user equipment. EAP-AKA implements the UMTS authentication and key agreement procedure and applies the same protocols for

communication between the network components as EAPOL, Radius, DIAMETER, etc.

The vulnerabilities mentioned for the GSM/GPRS authentication were addressed in the structure of the UMTS-AKA protocol; this authentication protocol benefits from mutual authentication and new cryptography with a higher degree of security. Because EAP-AKA is an encapsulation of the AKA procedure in EAP, it does not suffer from the GSM/GPRS vulnerabilities.

Figure 2. EAP-AKA authentication protocol

Fig. 2 depicts the authentication procedure and the key agreement in the EAP-AKA protocol through steps 1 to 10. The first two steps are the initialization process. Similar to the EAP-SIM protocol, the identities transmitted in an NAI format (in the third step) may be either permanent (IMSI) or temporary (TMSI). If the AAA server does not possess a 3G Authentication Vector (AV) gained from a previous authentication, it will request AVs from the HSS/HLR (Home Subscriber Server / Home Location Register). The HSS generates n AVs for the specific user by the permanent secret key shared between the UE and the HSS/HLR. The AVs transmitted in the fourth and fifth step include a random challenge (RAND), the authentication token (AUTN), the calculated response (SRES), the encryption key (CK), and the integrity key (IK). The AAA server chooses one of the AVs for the current session and stores the rest for other sessions. The CK and IK with the identity of the user are calculated in a MAC algorithm to generate the master key in the EAP-AKA process. The produced master key is eventually used to generate the master session key. The MACserver calculated in the AAA server from the master key is sent with the RAND and AUTN in the sixth step. The verification of the MACserver and AUTN will be used to authenticate the network to the user. The AAA server verifies the calculated MACuser and XRES sent in the seventh step to authenticate the user. If XRES is equal to the SRES received in the fifth step and the MACserver value is acceptable, then the user is authorized. An EAP success message terminates the EAP-AKA procedure while the session key is transmitted via the AAA server to the AP to be applied for the security purposes between the user and the AP.


2.3 EAP-TLS Secure Socket Layer (SSL) is the most widely used security protocol on the wired Internet and employs a public key infrastructure [12]. As a result, many works have focused on applying SSL based authentication protocols to wireless networks to make a compatible integration between wireless and wired networks [13]-[16]. Performance considerations have discouraged the use of SSL based protocols in resource-constrained environments such as the wireless environment. On the other hand, the relatively small sizes of wireless data transactions imply that public key encryption dominates the security processing requirements in wireless networks.

Figure 3. EAP-TLS authentication protocol

EAP-TLS is an authentication and key agreement protocol mainly based on SSL v3. Similar to the SSL protocol, EAP-TLS uses public key cryptography to communicate securely with the AAA server, and it is known as one of the most secure EAP standards on wireless LANs. The requirement for a client to possess a certificate as part of the authentication procedure has cast doubt on the feasibility of implementing EAP-TLS on wireless networks. The papers in references [13], [14] present some practical aspects of the implementation of EAP-TLS on wireless networks.

Fig. 3 illustrates the structure of EAP-TLS authentication protocol proposed in the references [13], [14]. The message flow in the figure includes the essential adaptations to the EAP-TLS to make it “mobile-enabled” [14]. The initialization procedure is NAI based and similar to the protocols mentioned in the previous sections. The user sends his identity (IMSI or TMSI) along with the certificate in an EAP response message and the EAP server verifies the user identity by this certificate. On the other side, the client checks the server certificate validity which is signed by a trusted Certification Authority (CA).

In the EAP-TLS architecture proposed in the reference [14], the use of PKI is mandatory; so, a CA must be connected to the 3G core network to issue the certificates. Different structures are proposed in the reference [14] to

support the PKI in the UMTS architecture. A procedure for the session resumption is also presented that improves the efficiency of the repeated connection attempts. In this structure, the need for the generation of the AVs by the HSS/HLR is eliminated. Finally, EAP-TLS provides an end to end authentication procedure.

2.4 EAP-TTLS EAP-TTLS is a revision of EAP-TLS that removes the need for a client-side PKI, which was a deficiency of EAP-TLS in wireless networks [6]. EAP-TTLS utilizes the secure connection established by the TLS protocol. The TLS handshake used in TTLS may be either mutual or one way (only the server is authenticated to the client). The client may be authenticated using an AAA protocol such as RADIUS, and the client authentication method may be EAP or another protocol such as CHAP (Challenge Handshake Authentication Protocol) [15], [16].

Figure 4. EAP-TTLS authentication protocol

EAP-TTLS has the advantage of easy deployment on an existing structure in a wireless network. This protocol is in fact a combination of two protocols: an outer and an inner protocol. The inner is the legacy authentication protocol and the outer protects the inner protocol's messages. The outer protocol provides a tunnel that enables the network to perform functions such as client authentication and key distribution; it includes a TLS handshake which is used to authenticate the server to the client based on a public or private key certificate.

Fig. 4 shows the EAP-TTLS authentication procedure. In this figure, the TLS protocol is used to authenticate the server and the CHAP protocol performs the client authentication. The server must verify the value of the CHAP challenge to authenticate the user. Steps 1 through 3 in Fig. 4 are the initialization procedure, similar to the other protocols, and steps 4 through 8 demonstrate the creation of a TLS tunnel in which the server is authenticated. The rest of the steps authenticate the client in the established tunnel. In EAP-TTLS, if the client uses a certificate for authentication, the protocol follows the same procedure as EAP-TLS [15], [16].



2.5. PEAP PEAP provides a wrapping of the EAP protocol within TLS [7]. The PEAP, similar to the EAP-TTLS, implements a tunnel to transfer the protocol authentication messages. One of the protocols encapsulated in the PEAP tunnel is the EAP-AKA authentication. As mentioned earlier, the tunnel derives the session keys.

The message flow in PEAP with the EAP-AKA authentication is illustrated in the Fig. 5. The UE and the HSS own a common secret key which is used during the authentication. In the initialization phase, the UE sends an Identity (IMSI/TMSI) as part of the EAP-AKA. An AAA protocol like MAP or DIAMETER or RADIUS sends the IMSI from AAA server to the HSS/HLR. Then, HSS/HLR calculates and sends the AVs (RAND, AUTN, XRES, IK, and CK) to the AAA server. The chosen AV is sent to the UE for the verification so that the network is authenticated to the user. The RES is sent back to the AAA server and if RES=XRES, the UE is authenticated. After the AKA procedure is completed, the session keys are derived and shared between the UE and the AP. These session keys are not the same as those derived in the 3G-AKA but derived from the TLS master secret.

Figure 5. PEAP authentication protocol

3. Deficiencies and Vulnerabilities The five authentication protocol candidates for the integration of wireless networks were explained earlier. In this section, a critical evaluation is made introducing the deficiencies and vulnerabilities of each protocol separately. Some of these vulnerabilities are revealed by the proposed attacks in the literature which are also addressed in this section. Additionally, the deficiencies are unveiled by making critical comparisons between different protocols.

3.1 Vulnerabilities of the protocols Generally, the authentication protocols presented in the

previous section set up connections between the cellular networks and the WLAN. The security protocols included in the WLANs are mainly based on the different versions of 802.11. The basic version of the 802.11 is considered as one of the most vulnerable security protocols. Currently, new

versions of this security protocol are implemented in wireless networks. One of the new security architectures for the 802.11 security protocol is called WiFi Protected Access (WPA). The WPA2 version, which is widely used in wireless networks, suffers from a number of vulnerabilities such as denial of service attacks, session hijacking in the absence of encryption, and the lack of a trust relationship within the WPA architecture. On the other hand, the user equipment itself may become a weak point. This happens when, for instance, a Trojan in the terminal originates a challenge-response with the UICC and forwards the results to an active attacker; the attacker then analyzes the messages and sets up an attack. Another example is malicious software residing in a different host which can launch a Distributed Denial of Service (DDoS) attack. When a user intends to access a WLAN service via a cellular authentication procedure, the SIM/USIM must be used remotely from the WLAN client through a serial, infrared, or Bluetooth connection; sending credentials over these connections can endanger user confidentiality.

3.1.1 EAP-SIM The EAP-SIM protocol establishes a secure connection between the GSM and WLAN [1]-[3]. The GSM network suffers from many security weaknesses, such as the unidirectional authentication and key agreement protocol, the possibility of replay attacks, and the weak cryptographic primitives that have resulted in many successful attacks on this architecture [9], [17]. EAP-SIM claims to have solved many of the security flaws of GSM, though.

Some of the vulnerabilities of EAP-SIM can be summarized as follows.
• The mobile user is obliged to send his permanent identity (IMSI) in plain text during the first authentication attempt. Correspondingly, a passive eavesdropper may steal this identity and use it in a later active attack.
• The messages transmitted between the UE and the Radio Network Controller (RNC) are the only messages provided with integrity protection; hence, the protocol may be vulnerable to replay attacks.
• Many EAP-SIM messages (EAP-Request/Notification, EAP Success, or EAP Failure) are exchanged unprotected, enabling an attacker to send false notifications and mount denial of service attacks.
• Although EAP-SIM mandates the use of fresh authentication triplets, there is no mechanism that enables the user to check whether the authentication triplets received from the AAA server are fresh. Therefore, if an attacker has access to authentication triplets, he may use the compromised triplets as long as the master secret key remains unchanged for the target user.
• A possible way of implementing a Man-in-the-Middle (MitM) attack on EAP-SIM is when the same authentication triplets are used in both GSM and WLAN access. If the HSS is not used specifically for the interworking of the GSM and WLAN, then the HLR will be used as the database that stores the authentication credentials. Accordingly, the authentication triplets stolen from a GSM connection can be misused to mount a false WLAN connection by an adversary in EAP-SIM.

3.1.2 EAP-AKA

EAP-AKA is the authentication protocol used in the interworking of the WLAN and the UMTS cellular networks [1], [2], [4]. In this protocol, EAP encapsulates the AKA procedure, which is known to provide sufficient security. Moreover, the authentication token (AUTN) and the sequence number in the message flow of the authentication procedure are employed to defeat the possibility of replay and impersonation attacks. In spite of all the attempts to make a secure protocol, it is claimed to have the vulnerabilities listed below.
• EAP-AKA does not support cipher suite or protocol version negotiation, and the key sizes and the algorithm are fixed, making it a less secure and inflexible protocol.
• Integrity protection is only guaranteed when communicating between the radio network controller and the user equipment; hence, the protocol may be vulnerable to replay attacks.
• The IMSI is sent in plain text on the first authentication attempt; so, an adversary pretending to be a valid server may force the user to send his IMSI and gain his permanent identity.
• Many EAP-AKA messages (EAP-Request/Notification, EAP-Success, and EAP-Failure) are exchanged unprotected, enabling an attacker to mount denial of service attacks.
• Although the AKA procedure is strong enough to defeat the MitM attack, the integration of the UMTS with the GSM has resulted in the interception of all the UE initiated calls [18], an eavesdropping attack, and an impersonation attack [19]. If the HSS is not used specifically for the interworking of the UMTS and the WLAN, a MitM attack is likely to happen. The authentication credentials gained from mounting the previously mentioned attacks on the HLR assist the attacker in initiating a MitM attack on EAP-AKA.

3.1.3 EAP-TLS

EAP-TLS has proved to provide an acceptable level of security in wired networks and has not yet shown vulnerability to MitM attacks. Nevertheless, similar to the other interworking authentication protocols, the Network Access Identifier (NAI) can divulge the permanent user identity under certain circumstances, thus compromising user privacy.

3.1.4 EAP-TTLS EAP-TTLS was proposed to eliminate the need for a PKI in EAP-TLS and to provide more security by tunneling, which itself increased the possibility of a MitM attack [20]. The attack suggested in reference [20] stems from the fact that the legacy client authentication protocol is not aware whether it is run in a protected or unprotected mode. The main cause of the MitM attack in EAP-TTLS is the ability of an authentication to proceed without tunneling. The message flow of the MitM attack on EAP-TTLS is depicted in Fig. 6.

Figure 6. MitM in the EAP-TTLS

As it is shown in the Fig. 6, the MitM captures the initialization procedure of a legitimate user and sets up a tunneled authentication protocol with the AAA server using the UE identity. Afterwards, the MitM forwards the legitimate client authentication protocol messages through the tunnel. The MitM unwraps the messages received from the AAA server and forwards them to the legitimate user. After the successful completion of the procedure, the MitM derives the session keys and starts an active or passive attack.

3.1.5 PEAP PEAP is a tunneling protocol similar to EAP-TTLS, which provides a wrapping for legacy protocols such as EAP-AKA. The most significant vulnerability of this protocol arises from its tunneling procedure. The MitM attack on PEAP with EAP-AKA is displayed in Fig. 7.

Figure 7. MitM in the PEAP with EAP-AKA

According to the Fig. 7, the MitM initiates a tunneled authentication protocol with the network while masquerading as the legitimate AP to the user. MitM unwraps the tunneled messages received from the AAA server and forwards them to the victim. At the end of the procedure, the MitM owns the session keys.


3.2 Deficiencies of the protocols Each candidate protocol has its advantages and disadvantages when employed in the interworking of the WLAN with the cellular networks. The most notable drawback of EAP-SIM and EAP-AKA is their dependency on the network structure, which prevents them from being dynamic. The advantage of EAP-TLS/TTLS or PEAP, by contrast, is that the user can be authenticated locally and does not need to first connect to the cellular access gateway. Another deficiency of the two accepted protocols is the latency of the authentication procedure, which is exacerbated by the frequent roaming of users among different WLANs; this frequency is caused by the comparatively small range of each WLAN AP. A further advantage of EAP-TLS/TTLS or PEAP in comparison with EAP-SIM/AKA is their applicability in beyond 3G heterogeneous networks, since they have been successfully deployed on the Internet, which is the backend of the beyond 3G networks. Many studies have compared the energy consumption, latency, and total size of these authentication protocols in an interworking scenario [13]-[16]. All of these studies demonstrate that EAP-SIM and EAP-AKA suffer from considerable latency but benefit from a small total size on the UE.

The most significant problem in implementing the legacy wired Internet protocols for the interworking of the WLAN with the cellular networks is the infrastructure required for public keys and the PKI. EAP-TLS/TTLS and PEAP use a public key infrastructure and a certificate authority, which are not present in the existing 2G and 3G cellular networks. Another problem of using a certificate authority is that the USIM, which is a constrained resource, must be preloaded with all the CA public keys. Furthermore, most UEs are not equipped with digital certificates.

Table 1 summarizes the main vulnerabilities and deficiencies mentioned earlier.

Table 1: Vulnerabilities and deficiencies comparison

Criterion                       EAP-SIM   EAP-AKA   EAP-TLS   EAP-TTLS   PEAP
User identity protection        No        No        No        Yes        Yes
Secure against the MitM         No        No        Yes       No         No
Secure against replay attack    No        No        No        No         No
Interworking with Internet      No        No        Yes       Yes        Yes
Short latency                   No        No        Yes       Yes        Yes
Low energy consumption          Yes       Yes       No        No         No
Small total size                Yes       Yes       No        No         No

4. Conclusion The authentication and key agreement procedures of the interworking architecture between the WLAN and the cellular networks were discussed for the different candidate protocols. The two protocols accepted by the 3GPP are EAP-SIM and EAP-AKA, because of their easy compatibility with the existing cellular network infrastructures. On the other hand, EAP-TLS/TTLS and PEAP, which are used on the Internet, show promising advantages for employment in the interworking structure. The security vulnerabilities and deficiencies of each authentication protocol were addressed and compared. Although 3GPP has accepted the interworking protocols for the WLAN-cellular network, more studies on the efficiency of the security protocols for the beyond 3G networks are required.

References [1] 3GPP, “3GPP system to Wireless Local Area Network

(WLAN) interworking; System description,” 3GPP TS 23.234 V9.0.0, Jan. 2010.

[2] 3GPP, “Wireless Local Area Network (WLAN) interworking security,” 3GPP TS 23.234 V9.2.0, June 2010.

[3] H. Haverinen, J. Salowey, "EAP-SIM authentication," RFC 4186, Jan. 2006.

[4] J. Arkko, H. Haverinen, “EAP-AKA authentication,” RFC 4187, Jan. 2006.

[5] B. Aboba, D. Simon, “PPP EAP TLS Authentication Protocol,” IETF RFC 2716, Oct. 1999.

[6] P. Funk, S. Blake-Wilson, "EAP Tunneled TLS Authentication Protocol version0," IETF RFC 5281, Feb. 2005.

[7] H. Anderson, S. Josefsson, “Protected Extensible Authentication Protocol (PEAP)” IETF RFC 2026, Aug. 2001.

[8] L. Blunk, J. Vollbrecht, “Extensible Authentication Protocol (EAP)” IETF RFC 3748, March 1998.

[9] E. Barkan, E. Biham, N. Keller, “Instant ciphertext-only cryptanalysis of GSM encrypted communication,” Journal of Cryptology, Vol. 21 Issue3, 2008.

[10] A. Bogdanov, T. Eisenbarth, A. Rupp, "A hardware-assisted real time attack on A5/2 without precomputations," in Cryptographic Hardware and Embedded Systems, vol. 4727, pp. 394-412, 2007.

[11] Ch. Xenakis, Ch. Ntantogin, “Security architectures for B3G mobile networks,” Journal of Telecommunication Systems, vol.35, pp. 123-139, Sept. 2007.

[12] V. Gupta, S. Gupta, “Experiments in wireless internet security,” Proc. IEEE Wireless Communication and Networking Conf., Vol. 1, pp. 859-863, March 2002.

[13] G. Kambourakis, A. Rouskas, S. Gritzalis, “Using SSL in authentication and key agreement procedures of future mobile networks,” Proc. 4th IEEE Int. Conf. on Mobile and Wireless Communication Networks 2002, pp. 152-156, Sept. 2002.

[14] G. Kambourakis, A. Rouskas, G. Kormentzas, S. Gritzalis, “Advanced SSL/TLS-based authentication for secure WLAN-3G interworking,” IEEE Communications Proceedings, Vol. 151, pp.501-506, Oct.2004.

[15] P. Prasithsangaree, P. Krishnamurthy, “ A new authentication mechanism for loosely coupled 3G-WLAN integrated networks,” In Proceeding of Vehicular Technology Conference 2004, IEEE, Vol. 5, pp.2284-3003, May 2004.


[16] Y. Zhaho, Ch. Lin, H. Yin, “Security authentication of 3G-WLAN interworking,” 20th International Conference on Advanced Information Networking and Applications, Vol. 2, pp. 429-436, 2006.

[17] C.J. Mitchell, “The security of the GSM air interface protocol,” Technical Report, Royal Holloway University of London, RHUL-MA-2001-3, 2001.

[18] U. Meyer, S. Wetzel, “A Man-in-the-Middle attack on UMTS,” in ACM workshop on wireless security 2004, pp. 90-97, 2004.

[19] Z. Ahmadian, S. Salimi, A. Salahi, “New attacks on UMTS network access,” Wireless Telecommunications Symposium 2009, Prague, pp. 1-6, April 2009.

[20] N. Asokan, N. Valteri, K. Nyberg, “Man-in-the-Middle in tunneled authentication,” Internet Drafts, Nokia Research Center, Oct. 2003.


A Modified Hill Cipher Involving Permutation, Iteration and the Key in a Specified Position

V.U.K.Sastry1 and Aruna Varanasi2

1Department of Computer Science and Engineering, SNIST,

Hyderabad, India, [email protected]

2Department of Computer Science and Engineering, SNIST, Hyderabad, India,

[email protected] Abstract: In this paper we have developed a block cipher by modifying the classical Hill cipher. In it we have introduced additional features, namely permutation and iteration. For the simplicity and elegance of this cipher, the plaintext is multiplied by the key from the left side in the first iteration and from the right side in the second iteration, and this process is continued in the subsequent iterations. The above mentioned operations lead to thorough confusion and diffusion of the information. The avalanche effect and the cryptanalysis carried out in this paper clearly indicate that the cipher is a strong one. In this analysis the permutation plays a very prominent role in strengthening the cipher. Key words: Encryption, Decryption, Cryptanalysis, avalanche effect, permutation. 1. Introduction The classical Hill cipher [1], which came into existence long before the advent of computers, brought a revolution in the area of cryptography. It is governed by the relations

C = KP mod 26, and (1)
P = K⁻¹C mod 26. (2)

In these relations P is a plaintext vector, C the ciphertext vector, K the key matrix and K⁻¹ the modular arithmetic inverse of K. Mod 26 is used because the 26 characters of the English alphabet are used in the development of the cipher. In the literature of cryptography, it is well established that this cipher can be broken by a known plaintext attack. This is because equation (1) can be brought to the form

Y = KX mod 26, in which X and Y are matrices containing an appropriate number of column vectors of the plaintext and the corresponding column vectors of the ciphertext, and the modular arithmetic inverse of X can be obtained in several instances. Before proceeding to the development of the Feistel cipher [2-3], Feistel noticed that the Hill cipher, involving only a linear transformation, is quite vulnerable and can be broken. Thus he introduced an XOR operation linking portions of the plaintext and the key, and an iterative procedure, so that confusion and diffusion, which offer strength to the cipher, can be achieved.
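To make the known-plaintext weakness concrete, here is a minimal sketch (the 2 x 2 key and plaintext columns are invented for the example): given plaintext columns X that are invertible mod 26 and the matching ciphertext columns Y, the key is recovered directly as K = Y X⁻¹ mod 26.

import numpy as np

def inv_mod_2x2(X, m=26):
    """Modular inverse of a 2 x 2 integer matrix, when det(X) is invertible mod m."""
    a, b, c, d = X[0, 0], X[0, 1], X[1, 0], X[1, 1]
    det = (a * d - b * c) % m
    det_inv = pow(int(det), -1, m)        # raises ValueError if gcd(det, m) != 1
    adj = np.array([[d, -b], [-c, a]])
    return (det_inv * adj) % m

def recover_key(X, Y, m=26):
    """Known-plaintext attack on the classical Hill cipher: K = Y * X^-1 mod m."""
    return (Y @ inv_mod_2x2(X, m)) % m

if __name__ == "__main__":
    K = np.array([[3, 3], [2, 5]])        # a valid Hill key (det = 9, gcd(9, 26) = 1)
    X = np.array([[7, 8], [4, 11]])       # two known plaintext column vectors
    Y = (K @ X) % 26                      # the corresponding ciphertext columns
    print(recover_key(X, Y))              # reproduces K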

In a recent investigation, Sastry et al. [4] have applied a key matrix on both sides of the plaintext matrix and have avoided the direct possibility of breaking the cipher. They have strengthened the cipher by using a function called mix() for mixing the plaintext bits at every stage of the iteration process involved in the cipher. They have found that the relation between the ciphertext and the plaintext is highly nonlinear and hence the cipher cannot be broken by any cryptanalytic attack. Of late, Sastry et al. [5] have developed a cipher wherein they have taken a pair of keys. In this they have introduced an iterative procedure, which includes a permutation, and have shown that the cipher is significantly strong. In the present paper our objective is to develop a block cipher which involves a nonlinear relation between the ciphertext and the plaintext. In this analysis we have introduced the key matrix K as a left multiplicant in one step of the iteration, and as a right multiplicant in the next step of the iteration, and we have continued this process till all the iterations are completed. The procedure is strengthened by using a function named Permute() for achieving confusion and diffusion. In what follows we present the plan of the paper. In Section 2, we introduce the development of the cipher, and put forth the algorithms for encryption and decryption. Section 3 is devoted to illustration of the cipher. In Section 4 we discuss the cryptanalysis in detail. Finally, in Section 5 we mention the computations related to the cipher and draw conclusions. 2. Development of the cipher Consider a plaintext P. Let this be written in the form of a matrix given by P = [Pij], i = 1 to n, j = 1 to n. (3) Here each element of P is a decimal number lying between 0 and 255. Let us take a key matrix K, which can be represented in the form K = [Kij], i = 1 to n, j = 1 to n, (4) where each Kij is a decimal number in the interval 0 to 255. Let the ciphertext C be given by C = [Cij], i = 1 to n, j = 1 to n, (5) in which all the elements of C also lie between 0 and 255.


Here we have used 0-255 as we have made use of the EBCDIC code for the conversion of the plaintext into the corresponding decimal numbers. The process of encryption and the process of decryption are described by the flowchart given in Figure 1. The basic relations governing the process of encryption are given by

if (r mod 2 = 1) then P = KP mod 256 else P = PK mod 256,

and P = Permute(P).

The process of decryption is given by

C = IPermute(C), and if (r mod 2 = 1) then C = K⁻¹C mod 256 else C = CK⁻¹ mod 256.


The process of permutation can be described as follows. Let P be a matrix containing n rows and n columns. This can be written in the form of a matrix containing m (= n²) rows and eight columns as shown below:

[ p111  p112  ...  p118 ]
[   .     .           . ]
[   .     .           . ]
[ pnn1  pnn2  ...  pnn8 ]        (6)

This is obtained by converting each element of the P into its binary form. On assuming that the n is an even number, the above matrix can be divided into two equal halves, where in each half contains m/2 rows. Then the upper half is mapped into a matrix containing m rows and four columns. In the process of mapping we start with the last element of the upper half and place it as the first row, first column element of a new matrix. Then we place the last but one element of the upper half as the element in the second row and first column. We continue this process of placing the remaining elements of the upper half, one after another, till we get m rows and four columns of the new matrix. Then we place the elements of the lower half from the beginning to the end in their order , such that they occupy four more columns and m rows. Thus we again get a matrix of size mx8.

For example when n=4 (i.e., m=16), let us suppose that the plaintext matrix, P is given by

P =
[  86  103   83   56 ]
[  35   43   36  219 ]
[ 210  242   12   16 ]
[ 114  165   10  193 ]        (7)

This can be brought to the binary form given by

01010110
01100111
01010011
00111000
00100011
00101011
00100100
11011011
11010010
11110010
00001100
00010000
01110010
10100101
00001010
11000001        (8)

On permuting the above matrix in accordance with the process explained earlier, we get

11011000
11011010
00010010
11101010
10100101
01110100
10011011
10000000
01101011
01111001
10011010
00001100
00110000
11000010
00111000
00000011        (9)

In order to have clear insight into the process of permutation let us explain what we have done. Here we have started with the last element of the last row of the upper half in (8), and have gone in the backward direction ( placing the elements one after another in a column wise manner in the first column, second column, etc., ) till we have reached the first element of the matrix. This results in the first four columns of (9). Then we have commenced with the first element of the lower half and filled up, element after element, in the remaining four columns of (9) till we have reached the last element of (8). This has enabled us to obtain (9). This process of permutation is expected to thoroughly permute the binary bits which are obtained as a consequence of the product of the plaintext matrix and the key matrix. The matrix (9) can be brought into its corresponding decimal form as shown below.

[ 216  218   18  234 ]
[ 165  116  155  128 ]
[ 107  121  154   12 ]
[  48  194   56    3 ]        (10)

As expected, the elements of the original matrix P (7) and the elements of the matrix (10) obtained on permutation are totally different. It may be noted here that IPermute(), used in the decryption, is the reverse process of the Permute() used in the encryption.
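To make the rearrangement above concrete, the following Python sketch re-implements the bit-level permutation exactly as described (the paper's own implementation is in Java and is not reproduced here; the function and variable names are ours). Applied to the matrix (7), it reproduces the permuted matrix (10).

def permute(block):
    # block: list of m = n*n integers in 0..255, read row-wise from the matrix P
    m = len(block)                                               # m is assumed to be even
    bits = [int(b) for v in block for b in format(v, "08b")]     # the m x 8 bit matrix, row-major
    upper, lower = bits[:m * 4], bits[m * 4:]                    # upper and lower halves (m/2 rows each)
    # Columns 1-4 of the new m x 8 matrix: the upper half written backwards, column by column.
    # Columns 5-8: the lower half written forwards, column by column.
    cols = [list(reversed(upper))[k * m:(k + 1) * m] for k in range(4)]
    cols += [lower[k * m:(k + 1) * m] for k in range(4)]
    return [int("".join(str(cols[c][i]) for c in range(8)), 2) for i in range(m)]

# Example: the matrix (7) permutes to the matrix (10).
P7 = [86, 103, 83, 56, 35, 43, 36, 219, 210, 242, 12, 16, 114, 165, 10, 193]
assert permute(P7) == [216, 218, 18, 234, 165, 116, 155, 128, 107, 121, 154, 12, 48, 194, 56, 3]

IPermute() used in the decryption simply inverts this bit rearrangement.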

The algorithms for encryption and decryption are written below.

Algorithm for Encryption
1. Read n, P, K, r
2. for i = 1 to r
   {
      if ((i mod 2) = 1)
         P = (KP) mod 256
      else
         P = (PK) mod 256
      P = Permute(P)
   }
   C = P
3. Write(C)
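Assuming the Permute() routine sketched earlier and a matrix product taken modulo 256, the encryption loop can be written in a few lines of Python (an illustrative sketch, not the authors' Java program; the matrices are held as flat row-major lists):

def matmul_mod(A, B, n, mod=256):
    # product of two n x n matrices stored as flat row-major lists, reduced modulo 256
    return [sum(A[i * n + k] * B[k * n + j] for k in range(n)) % mod
            for i in range(n) for j in range(n)]

def encrypt(P, K, n, r):
    # P, K: flat lists of n*n numbers in 0..255; r: number of rounds
    for i in range(1, r + 1):
        P = matmul_mod(K, P, n) if i % 2 == 1 else matmul_mod(P, K, n)
        P = permute(P)
    return P   # the ciphertext C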


Algorithm for Decryption
1. Read n, C, K, r
2. K⁻¹ = Inverse(K)
3. for i = r to 1
   {
      C = IPermute(C)
      if ((i mod 2) = 1)
         C = (K⁻¹C) mod 256
      else
         C = (CK⁻¹) mod 256
   }
4. P = C
5. Write(P)

Algorithm for Inverse(K)
1. Read A, n, N

// A is an n x n matrix. N is a positive integer with which modular arithmetic is carried out. Here N= 256.

2. Find the determinant of A. Let it be denoted by Δ, where Δ ≠ 0.

3. Find the inverse of A. The inverse is given by [Aji]/Δ, i = 1 to n, j = 1 to n.
   // [Aij] are the cofactors of aij, where aij are the elements of A
4. for i = 1 to N
   {
      // Δ is relatively prime to N
      if ((iΔ) mod N == 1) break;
   }
   d = i;
5. B = [d Aji] mod N. // B is the modular arithmetic inverse of A.
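The Inverse(K) procedure uses only exact integer arithmetic: compute the determinant Δ and the cofactors of A, search for d with dΔ ≡ 1 (mod 256), and form B = [dAji] mod 256. A small Python sketch of these steps (ours, for illustration; it assumes Δ is relatively prime to 256, i.e. odd) is given below.

def minor(M, n, r, c):
    # delete row r and column c of the flat row-major n x n matrix M
    return [M[i * n + j] for i in range(n) if i != r for j in range(n) if j != c]

def det(M, n):
    # determinant by cofactor expansion along the first row
    if n == 1:
        return M[0]
    return sum((-1) ** j * M[j] * det(minor(M, n, 0, j), n - 1) for j in range(n))

def inverse_mod(A, n, N=256):
    delta = det(A, n) % N
    d = next(i for i in range(1, N) if (i * delta) % N == 1)   # step 4: d with d*delta = 1 (mod N)
    cof = [(-1) ** (i + j) * det(minor(A, n, i, j), n - 1)     # cofactor of a_ij
           for i in range(n) for j in range(n)]
    # B[i][j] = d * cofactor of a_ji (the adjugate), taken modulo N
    return [(d * cof[j * n + i]) % N for i in range(n) for j in range(n)]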

3. Illustration of the cipher

Let us consider the plaintext given below:

Father! it is really a pleasure to see India as a visitor. Each village has a temple, each town has an engineering college and of course, each place has a liquor shop.        (11)

We focus our attention on the first sixteen characters of the plaintext (11). This is given by

Father! it is re        (12)

On using EBCDIC code the plaintext (12) can be written in the form of a matrix, P given by

P =
[ 198  129  163  136 ]
[ 133  153   79   64 ]
[ 137  163   64  137 ]
[ 162   64  153  133 ]        (13)

Let us take the key, K in the form

K =

92855539752091994811201713467925123

(14)

On applying the encryption algorithm, described in section 2, with r=16, we get the Ciphertext given by

C =

19514119301012181336357872146

127140168115

(15)

On adopting the decryption algorithm, we get back the original plaintext given by (13). In order to estimate the strength of the algorithm, let us study the avalanche effect.
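The avalanche comparisons that follow count the number of binary bits in which two ciphertexts differ. A one-line helper of the following kind (ours, purely for illustration) performs that count on blocks stored as lists of byte values:

def bit_difference(c1, c2):
    # number of differing bits between two blocks given as lists of integers in 0..255
    return sum(bin(a ^ b).count("1") for a, b in zip(c1, c2))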

Here we replace the fifth character “e” of the plaintext (12) by “d”. The EBCDIC codes of “e” and “d” are 133 and 132, which differ by one bit in their binary form. Thus, on using the modified plaintext, the key (14) and the encryption algorithm, let us compute the corresponding ciphertext. This is given by

C=

25514822224912323675

10653181713060680

(16)

On converting (15) and (16) into their binary form, we notice that the two ciphertexts differ by 69 bits (out of 128 bits). This shows that the cipher is a strong one. Let us now change a number in key, K. We replace 199,the third row second column element of (14), by 198. These two numbers also differ by only one binary bit. On performing the encryption with the modified key and the original plaintext intact, we get the ciphertext given by

C=

9524825579232221253825813722823965756322

(17)

Now on comparing the binary forms of (15) and (17), we find that they differ by 68 bits (out of 128 bits). This also shows that the cipher is a potent one.

4. Cryptanalysis

In the literature of cryptography the general methods of cryptanalytic attack are

• Ciphertext only attack (brute force attack)
• Known plaintext attack
• Chosen plaintext attack and


• Chosen ciphertext attack

In this analysis we have taken the key K consisting of 16 numbers, where each number can be represented in terms of 8 binary bits. Thus the length of the key is 128 bits; in view of this fact, the size of the key space is

2^128 = (2^10)^12.8 ≈ (10^3)^12.8 = 10^38.4.

If the determination of the plaintext for each value of the key takes 10^-7 seconds, then the time required for computation with all the possible keys in the key space is given by

(10^38.4 x 10^-7) / (365 x 24 x 60 x 60) = 3.17 x 10^(38.4-15) = 3.17 x 10^23.4 years.
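The order of magnitude quoted above can be checked with a few lines of arithmetic (a rough check of the estimate, not part of the original analysis):

keys = 2 ** 128                          # size of the key space
seconds = keys * 1e-7                    # at 10^-7 seconds per trial decryption
years = seconds / (365 * 24 * 60 * 60)   # about 1.1e24, the same order as 3.17 x 10^23.4
print(f"{years:.2e} years")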

Thus the cipher cannot be broken by the ciphertext only attack.

In the case of the known plaintext attack, we know as many pairs of plaintext and ciphertext as we require. For example, we confine our attention only to two rounds of the iteration process in the encryption. For convenience of presentation, let us denote the function Permute() as F(). Then we have

P = (KP) mod 256,        (18)
P = F(P),                (19)
P = (PK) mod 256,        (20)
P = F(P), and            (21)
C = P.                   (22)

From (18)-(22), we get

C = F((F((KP) mod 256) K) mod 256).        (23)

In (23) the innermost K and P are multiplied and the result is operated with mod 256. On converting the resulting numbers into their binary form, the permutation is performed, as mentioned earlier, and the resulting matrix containing decimal numbers is obtained. Then this matrix is multiplied by K on the right side and mod 256 is taken. Thus we get a new matrix whose permutation yields C. In this process K and P interact thoroughly, and their binary bits undergo diffusion and confusion in a very strong chaotic manner. This is all on account of the permutation and the mod operation. K and P lose their shapes and get thoroughly mixed, so that no trace of them can be found separately.

In the above analysis we have taken only two rounds. In general, as we have sixteen rounds in our analysis of this problem, we get

C = F(( ... F((F((KP) mod 256) K) mod 256) ... ) mod 256).

In this relation, as the multiplication, mod 256 and the permutation play a vital role, the binary bits of K and P undergo several changes several times. Thus we are not able to obtain the key, or a function of the key, with which the cipher can be broken. Hence the cipher is a very strong one.

The third and fourth attacks, namely the chosen plaintext and chosen ciphertext attacks, merely depend upon the vision and the intuition of the attacker. Though the cipher is an elegant one, suggesting a plaintext or a ciphertext to be chosen is found to be impossible. Hence the cipher cannot be broken in the last two cases either. The above analysis clearly shows that the cipher cannot be broken by any attack.

5. Computations and Conclusions

In this paper we have modified the Hill cipher by introducing a permutation, iteration and the key in a specified position. In the first round the key is on the left side of the plaintext and it is on the right side in the second round. The same pattern is continued in the subsequent rounds. The algorithms for encryption and decryption, mentioned in Section 2, are implemented in Java. On using the program for encryption, the ciphertext corresponding to the entire plaintext given in Section 3 is obtained as

In carrying out the computation, we have taken a block of 16 characters of the plaintext in each round. In the last round, when the plaintext has fallen short of sixteen characters by a few characters, we have appended additional characters at random to complete the block. In this analysis we have seen that the multiplication of K and P, the mod operation which modifies the product, and the permutation causing a thorough diffusion and confusion play a prominent role in strengthening the cipher. The potential of the cipher is seen markedly in the discussion of the avalanche effect. The cryptanalysis clearly indicates that the cipher cannot be broken by any cryptanalytic attack. The key K as a left multiplicand in all the odd rounds and as a right multiplicand in all the even rounds is expected to play a significant role in diffusing the information in all directions, so that the strength of the cipher is enhanced. From the development and the analysis of the above cipher, we notice that the modified Hill cipher is a strong one and it can be applied very confidently for the security of information.

References:
[1] William Stallings, Cryptography and Network Security: Principles and Practice, Third Edition, Pearson, 2003.
[2] Feistel, H., "Cryptography and Computer Privacy," Scientific American, May 1973.
[3] Feistel, H., Notz, W., and Smith, J., "Some Cryptographic Techniques for Machine-to-Machine Data Communications," Proceedings of the IEEE, November 1975.


[4] V.U.K. Sastry, D.S.R. Murthy, S. Durga Bhavani, "A Large Block Cipher Involving a Key Applied on both the Sides of the Plaintext," Vol. 2, No. 2, pp. 10-13, February 2010.

[5] V.U.K. Sastry, V. Aruna, S. Udaya Kumar, "A Modified Hill Cipher Involving a Pair of Keys and a Permutation," Vol. 2, No. 9, pp. 105-108, September 2010.


Neural Network based Regression Model for Accurate Estimation of Human Body Fat - Obesity

Assessment using Circumference Measures

N.Revathy1 and Dr.R.Amalraj2

1Department of Computer Applications, Karpagam College of Engineering, Coimbatore, India [email protected]

2Department of Computer Science, Sri Vasavi College, Erode, India

Abstract: Obesity is a matter of great concern for individuals and organizations including the armed forces. Excess fat is detrimental to physical fitness and work performance. Although the height-weight tables adopted by the armed forces for obesity assessment are simple to use, yet they are not accurate enough, as a muscular individual may be labeled obese while a lean person with normal weight range may escape detection although he may have an abnormal fat content. The author compared the skin-fold method of body fat assessment, circumferences method of body fat assessment and the Body Mass Index (BMI). The circumference method is recommended to be used in the armed forces rather than the inaccurate height-weight measures. keywords: Obesity, Overweight, Body Mass Index, Skin-fold, Circumference, Neural Networks

1. Introduction

Throughout most of human history, a wide girth has been viewed as a sign of health and prosperity. Yet it is ironic that several million people are sent early to their graves by the damaging effects of eating too much and moving too little. Clearly, overweight cannot be taken lightly. The problem of obesity is a matter of serious concern to the individual as well as the state. Obesity increases mortality and morbidity at all ages. Excess fat is considered a detriment to physical fitness and work performance, which is also affected by a number of other variables such as age, sex, training, attitude and motivation, and genetic and environmental factors. Obesity is a condition in which there is an excessive accumulation of fat in the body. Estimation of human body fat has many applications in different areas of medicine and other medical fitness tests. The most accurate way of calculating body fat percentage is the one provided by Siri's equation [6], which requires a measurement of density via weighing under water. But this is expensive and impractical. Anthropometry (height, weight and multiple skinfold thicknesses), measured by trained observers using a standardized technique, can be used to calculate body fat by Durnin and Womersley's equation [15]. But taking

skinfold based measurement is very tedious and may lead to inaccurate measurements. The other laboratory based techniques such as total body water and body fat by bioelectrical impedance analysis (BIA) and deuterium oxide dilution (D2O) are also comparatively expensive.

But there are simple and cost-effective methods which use the age and body circumference measurements, which are easy to obtain without much measurement error. There are a lot of statistical methods and formulas to estimate the percentage of fat in the human body. In this paper, we have proposed a neural network based model for body fat estimation using simple circumference measures.

Measurement Based Body Fat Estimation Methods

There are two commonly used methods for body fat measurement. They are
• the skin fold thickness method
• the circumference method

The skin fold method was standardized by Durnin and Womersley [15]. In this method, the body fat is estimated using different skin fold measurements taken from different parts of the body such as the mid-triceps, biceps, subscapular, suprailiac, etc., regions. These measurements are made using an apparatus called a constant-force caliper.

Circumference methods were first developed by Wright, Davis and Doston. This method estimates the percentage of body fat using different circumference measures taken at different parts of the body, such as the upper and lower arm, waist, neck, abdomen, hip, thigh, etc.

Skin fold thickness method of body fat measurement: Since most adipose tissue is in the subcutaneous layer, percent body fat has traditionally been estimated by measuring skin folds at the mid-triceps, biceps, subscapular and suprailiac regions, using constant-force calipers. The method was standardized by Durnin and Womersley [15]. Body fat is indirectly assessed from body density. The percent body fat is calculated by the equation:

% BF = [(4.95 / body density) - 4.5] * 100.

Though more accurate than the H-W measures, the main drawback of the skin fold measurements is their poor repeatability in the absence of a skilled observer. Moreover, measurements may be difficult


or impossible in very obese people whose skin folds would not fit between the jaws of the measuring caliper.

2. Techniques Used in Measurement Based Body Fat Estimation Methods

Regression and multi-dimensional regression are the most commonly used statistical techniques in the design of the above two types of measurement based fat estimation methods. There is lot of statistical methods available to solve multi-dimensional regression problem. In statistics, regression analysis includes any techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables.

Most of the existing techniques for measurement based body fat estimation formulas were based on statistical relationships in the data. But, in the recent years, new techniques such as; artificial neural networks and fuzzy inference systems were employed for developing of the predictive models to estimate the needed parameters. Soft computing techniques are also now being used as alternate statistical tool.

Some previous work shows that radial basis function (RBF) neural network based prediction systems exhibit higher performance than the multilayer perceptron (MLP) network, the adaptive neuro-fuzzy inference system (ANFIS) and the multiple regression (MR) technique. So in this work, we explored the possibility of using a neural network to solve the body fat estimation problem.

A radial basis function (RBF) neural network is trained to perform a mapping from an m-dimensional input space to an n-dimensional output space. RBFs can be used for discrete pattern classification, function approximation, signal processing, control, or any other application which requires a mapping from an input to an output. Radial basis functions represent an alternative approach to MLPs in universal function approximation. RBFs were first used in solving multivariate interpolation problems and in numerical analysis. Their prospect is similar in neural network applications, where the training and query targets are rather continuous. While an MLP performs a global mapping (i.e., all inputs cause an output), an RBF network performs a local mapping (i.e., only inputs near specific receptive fields will produce an activation). The units in the hidden layer receiving the direct input from a signal may see only a portion of the input pattern, which is further used in reconstructing a surface in a multidimensional space that furnishes the best fit to the training data. This ability of the RBF network to recognize whether an input is near the training set or outside the trained region provides a significant benefit over MLPs. RBFs are useful in safety-critical applications, as well as in those having a high financial impact.

3. Definitions and Measurement

The height-weight tables suggested by the body composition working group have been extensively adopted as a first-line screening technique for obesity, and the same have been adopted by the Indian Armed Forces. Obesity is defined as a weight 20% greater than the desirable weight for that particular person. Several methods have been used to assess body composition. Most methods of measuring the fat content in the living subject are, to a lesser or greater degree, indirect. The three classical methods used in the laboratory are the measurements of whole body density, whole body water, and whole body potassium. Though more accurate, these methods are time consuming and not feasible for routine body fat assessment. Several workers have developed height-weight indices, also called indices of relative weight or 'indices of adiposity'. These are easily calculated, requiring nothing more than the height (H), weight (W) and age (A). A commonly used index is the Body Mass Index (BMI), or Quetelet's index, which is weight (kg) / height (m)². The acceptable (normal) range is 20 to 25; a BMI greater than 27 is overweight, obesity is taken to start at a BMI of 30, and a BMI of 40 and above indicates gross (morbid) obesity. The standards are the same for men and women. However, the H-W indices measure 'overweight' rather than obesity: since they cannot distinguish between excess fat, excess muscle mass (e.g. in weight lifters), fluid retention (oedema), large bones, etc., these drawbacks lead to a high degree of inaccuracy.

Circumference methods of body fat measurement: In circumference methods, estimates of percent body fat are made with equations based on circumference measures, typically involving areas prone to excess fat accumulation such as the upper and lower arm, waist, hip and thigh. The equations were first developed by Wright, Davis and Doston in 1981 and include two body circumferences for males, viz. the neck circumference (NC), measured around the neck with the measuring tape passing just below the larynx, and the abdomen circumference (AC), measured around the abdomen at the level of the umbilicus. For females, three circumferences are included in addition, viz. the bicep circumference (BC), measured at the largest circumference of the arm with the arm extended and the palm facing up; the forearm circumference (FC), measured at the largest circumference of the forearm; and the thigh circumference (TC), measured on the left thigh just below the buttock. All circumferences are measured in centimeters. In 1984, Hodgdon and Beckett derived equations for the estimation of percent body fat using the same circumference measures and the height (in cm) of the individual. The equations are:

Percent body fat (men) = (0.740 * AC) - (1.249 * NC) + 0.528
Percent body fat (females) = (1.051 * BC) - (1.522 * FC) - (0.89 * NC) + (0.326 * AC) + (0.597 * TC) + 0.707
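These circumference equations translate directly into code. The following Python sketch (illustrative only; the function and argument names are ours) evaluates the estimates for men and women from circumferences given in centimetres, together with the conversion from body density used by the skin fold method above:

def percent_body_fat_men(ac, nc):
    # AC: abdomen circumference (cm), NC: neck circumference (cm)
    return 0.740 * ac - 1.249 * nc + 0.528

def percent_body_fat_women(bc, fc, nc, ac, tc):
    # BC: bicep, FC: forearm, NC: neck, AC: abdomen, TC: thigh circumferences (cm)
    return 1.051 * bc - 1.522 * fc - 0.89 * nc + 0.326 * ac + 0.597 * tc + 0.707

def percent_body_fat_from_density(density):
    # conversion used by the skin fold method: % BF = [(4.95 / density) - 4.5] * 100
    return (4.95 / density - 4.5) * 100.0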


The circumference measures and height are used to estimate body density by the following equation:

Body density (men) = -0.19077 * log10(AC - NC) + 0.15456 * log10(Ht) + 1.0324

Percent body fat is then estimated from the body density by the same equation as used in the skin fold method of body fat measurement. The Hodgdon and Beckett estimate has been used by the U.S. Navy since October 1986 for the initial screening for obesity during the initial medical examination of its personnel. Conway et al. (1989) found that the estimates of percent body fat were more strongly related to physical fitness than were the H-W indices, and thus concluded that circumference methods of body fat estimation assess actual body fat more reliably than the H-W indices.

4. Implementation of RBF Neural Network Based Fat Percentage Estimation Model

Like most feed-forward networks, the RBFN has three layers, namely, an input layer, a hidden layer, and an output layer. The nodes within each layer are fully connected to the previous layer nodes. A schematic diagram of an RBFN with n inputs and m outputs is given in Fig. 1.

Figure 1. Schematic diagram of an RBFN [5]

The input variables are each assigned to a node in the input layer and pass directly to the hidden layer without weights. The hidden layer nodes are the RBF units. Each node in this layer contains a parameter vector called a center. The node calculates the Euclidean distance between the center and the network input vector, and passes the result through a nonlinear function Φ(.). The output layer is essentially a set of linear combiners. The ith output yi due to the input vector x, x = [x1, . . ., xn]T, can be expressed as

yi = θ0i + Σ (j = 1 to Ms) θji Φ(||x - cj||, σj),

where Ms is the number of hidden units, cj and σj are the center and the width of the jth hidden unit respectively, θji represents the weight between the jth hidden unit and the ith output unit, and θ0i represents the bias term of the ith output unit.
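As a concrete illustration of this output equation, a minimal forward pass with a Gaussian basis function Φ can be sketched as follows (an assumption made for illustration; the experiments in this paper rely on the Matlab RBF implementation, and the exact form of Φ is not restated here):

import numpy as np

def rbf_forward(x, centers, widths, weights, bias):
    # x: (n,) input; centers: (Ms, n); widths: (Ms,); weights: (Ms, m); bias: (m,)
    dist = np.linalg.norm(centers - x, axis=1)            # ||x - c_j||
    phi = np.exp(-(dist ** 2) / (2.0 * widths ** 2))      # Gaussian Phi(||x - c_j||, sigma_j)
    return bias + phi @ weights                           # y_i = theta_0i + sum_j theta_ji * phi_j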

In this work,
• The dimension of the input layer will be equal to the dimension of the training patterns, that is, the number of circumference measures in the dataset.
• The number of hidden layer units will be decided automatically by the Matlab version of the RBF network that is used.
• The dimension of the output layer will be equal to one, since there is only one output, 'percentage of body fat', which is to be mapped with respect to the m-dimensional input.

Several ways have been proposed for training RBF networks. Recently, Professor Simon has proposed the use of Kalman filters for training RBF networks . We are going to use this model for our application.

After training the RBF network with the selected number records (50/100/150/200/250 records of the 13 measurements) the whole data set (252) is considered as test data and fed in to the trained network and the value of percentage of fat has been predicted.

Then the correlation between predicted values and the originally calculated values (of under water weighing method) has been calculated.

The correlation between two variables reflects the degree to which the variables are related. The most common measure of correlation is the Pearson Product Moment Correlation or called Pearson's correlation in short. When measured in a population the Pearson Product Moment correlation is designated by the Greek letter rho (ρ). When computed in a sample, it is designated by the letter "r" and is sometimes called "Pearson's r."

Pearson's correlation reflects the degree of linear relationship between two variables. It ranges from +1 to -1. A correlation of +1 means that there is a perfect positive linear relationship between the variables.

5. Evaluation and Results

The Evaluation Dataset

The proposed model has been tested with the human body fat data set distributed by Roger W. Johnson, Department of Mathematics & Computer Science, South Dakota School of Mines & Technology, Rapid City, South Dakota, North America. This dataset has been originally provided by Dr. A. Garth Fisher (personal communication, October 5, 1994), age, weight, height, and 10 body circumference measurements are recorded for 252 men. Each man's percentage of body fat was accurately estimated by an underwater weighing and the density as well as the measured percentage of body fat were provided along with that data.

• Age (years)

• Weight (lbs)

• Height (inches)

• Neck circumference (cm)

• Chest circumference (cm)


• Abdomen circumference (cm)

• Hip circumference (cm)

• Thigh circumference (cm)

• Knee circumference (cm)

• Ankle circumference (cm)

• Biceps (extended) circumference (cm)

• Forearm circumference (cm)

• Wrist circumference (cm)
• Percent body fat from Siri's (1956) equation (%)
• Density determined from underwater weighing (gm/cm³)

These data are used to produce predictive equations for lean body weight. The scope is to predict the body fat percentage on the basis of these variables, using the proposed RBF based regression model. All body measurements are continuous variables. The first 13 variables are used as inputs to the RBF network and the 14th value (the percent body fat) is used as the expected output during training of the network. The density value (the 15th) is given here only as information and is not used for training or testing.

The following two-dimensional plot roughly shows the distribution of the data. Singular Value Decomposition is used to find the first principal component of the 13-feature circumference measure data set. On the x axis we used that first principal component, and on the y axis we used the percentage of fat. So the plot shows only the approximate distribution of the data.
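The first principal component used for this plot can be obtained from a singular value decomposition of the mean-centred 13-column measurement matrix, for example as in the sketch below (assuming the 252 x 13 circumference data have already been loaded into a NumPy array X; the plotting itself is omitted):

import numpy as np

def first_principal_component(X):
    Xc = X - X.mean(axis=0)                             # centre each circumference measure
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]                                   # projection of every record onto the first PC

# This projection is plotted against the measured percent body fat to give Figure 2.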

Figure 2 : The 2D Plot of Input Data Space

Evaluation of RBF Training Performance

We have trained the RBF network with different numbers of records and measured the time taken for training as well as the mean square error of the last two consecutive iterations of the training. The following table lists the obtained results.

Table 4: Performance of Training

Sl. No | Number of records used for training | Time taken for training the RBF (s) | Training error
1      | 50                                  | 2.4218                              | 0.0230
2      | 100                                 | 3.3594                              | 0.0645
3      | 150                                 | 3.2812                              | 0.1056
4      | 200                                 | 4.1562                              | 0.1439
5      | 250                                 | 7.9687                              | 0.1802

The following graph shows the time taken for training for the different numbers of records used for training. As shown in the graph, the time gradually increases with the size of the training data set.

6. Conclusion

The proposed idea has been successfully implemented using Matlab. The input parameters of the RBF network has been selected based on several experiments. The experiments were repeated for several times with different number of records.

The obtained results were significant and comparable. The performance, measured in terms of Pearson's correlation between the originally calculated values and the newly estimated values, reached almost 0.9, which is better than most of the classical methods and almost equal to that of the previous skinfold-based methods.

The accuracy of the system was good when the number of records used for training was high. As far as we have tested, the network can be trained in a few seconds with a few hundred records. After that, it was capable of predicting the percentage of fat of a few hundred records within a fraction of a second. So there will not be any practical difficulty in using the system with different classes of training sets.

7. Scope for Future Enhancements

It is obvious that the data used in this experiment contain some outliers and small errors. Even with those outliers and errors, the proposed system performed well. If we apply suitable methods to remove the outliers and isolate any abnormal records in the training data set, then we can expect better accuracy in prediction. Our future work will address this issue. In this work, we have only used the proposed regression model to estimate the percentage of body fat. But the method is a generic model which can be applied to any multidimensional nonlinear regression problem. So our future work will also address other application areas where the proposed model can bring better results.

Reference
[1] Wayt Gibbs, W., "Gaining on Fat," Scientific American, August 1996.
[2] Isik Yilmaz and Oguz Kaynar, "Multiple regression, ANN (RBF, MLP) and ANFIS models for prediction of swell potential of clayey soils," Geophysical Research Abstracts, Vol. 12, EGU General Assembly, Turkey, 2010.

[3] Siri, W.E., "Gross composition of the body", in Advances in Biological and Medical Physics, vol. IV, edited by J.H. Lawrence and C.A. Tobias, Academic Press, Inc., New York, 1956.

[4] Roger W. Johnson, Fitting Percentage of Body Fat to Simple Body Measurements, Journal of Statistics Education v.4, 1996


[5] Katch, Frank and McArdle, William, Nutrition, Weight Control, and Exercise, Houghton Mifflin Co., Boston, 1977

[6] Behnke, A.R., Wilmore, J.H. An anthropometric estimation of body density and lean body weight in young men. Journal of Applied Physiology, 1969.

[7] Jackson, A.S., Pollock, M.L. Generalized equations for predicting body density. British Journal of Nutrition, 1978.

[8] Behnke, A.R. and Wilmore, J.H., Evaluation and Regulation of Body Build and Composition, Prentice-Hall, Englewood Cliffs, N.J. 1974

[9] Katch, F.I., McArdle, W.D., Nutrition, Weight Control, and Exercise. Lea & Febiger: Philadelphia. PA, 1983.

[10] Wilmore, Jack, Athletic Training and Physical Fitness: Physiological Principles of the Conditioning Process, Allyn and Bacon, Inc., Boston, 1976

[11] Dumin & Womersley J, Body fat assessed from total body density and its estimation from skinfold thickness: measurement on 481 men and women aged from 16 to 72 years. Br J Nutr 1974

Authors Profile

N. Revathy received the BSc in Computer Science and the Master of Computer Applications (MCA) degree from Bharathiar University in 2000 and 2003 respectively, and the M.Phil. degree in Computer Science from Alagappa University in 2005. At present she is working as an Assistant Professor at Karpagam College of Engineering, Coimbatore, India.

Dr. R. Amalraj completed the BSc in Computer Science, the Master of Computer Applications (MCA) and a PhD in Computer Science. At present he is working as a Professor in the Department of Computer Science at Sri Vasavi College, Erode, India.


An Innovative Simulation of IPv4 to IPv6 BD-SIIT Transition Mechanism Methodology between IPv4

Hosts in an Integrated IPv6/IPv4 Network J.Hanumanthappa1,D.H.Manjaiah2

1Dept of Studies in Computer Science,University of Mysore Manasagangotri,Mysore,India

[email protected] 2Dept of Computer Science,Mangalore University

Mangalagangotri,Mangalore,India [email protected]

Abstract: The transition from IPv4 networks to IPv6 networks is being pursued vigorously. Extensive research is presently being done on this topic, as a transition from IPv4 to IPv6 requires a high level of compatibility and a clear procedure for easy and independent deployment of IPv6. The transition between the IPv4 Internet and IPv6 will be a long process, as they are two completely incompatible protocols, and it will take a significant amount of time. For the smooth interoperation of the two protocols, various well-defined transition mechanisms have been proposed so far. In this paper a comparative study of the behavior of an IPv4-only network with that of BD-SIIT under various types of traffic patterns is carried out. In the proposed BD-SIIT enabled network architecture, hosts in one IPv4 network initiate a connection with hosts in another IPv4 network over an integrated IPv4/IPv6 network. The performance metrics considered in this research paper are throughput, end-to-end delay (EED), and round trip time (latency). The various simulations are performed using Network Simulator 2 (ns-2).

Keywords: BD-SIIT, IPv4, IPv6, Transition mechanism.

1. Introduction

Over the last decade, the IETF has been working on the deployment of IPv6 [1,2] to replace the current Internet Protocol version (IPv4). The motivations behind IPv6 are briefly discussed in the following section and are covered in the literature [4,5,6,7]. The Internet Protocol version 6 (IPv6) is now gaining momentum as an improved network layer protocol. The current version of the Internet Protocol, IPv4, has been in use for almost 30 years and exhibits some challenges in supporting emerging demands for address space cardinality, high-density mobility, multimedia, strong security, etc. IPv6 is an improved version of IPv4 that is designed to coexist with IPv4 and eventually provide better internetworking capabilities than IPv4. IPv6 offers the potential of achieving scalability, reachability, end-to-end internetworking, quality of service (QoS), and commercial-grade robustness for data as well as for VoIP, IP-based TV (IPTV), distribution and triple play networks. The aim of this paper is to examine the behavior of a transition mechanism that involves communication between two IPv4 hosts over an IPv6 network under various traffic conditions. This makes possible the exchange of information between IPv4-only network hosts

through an integrated IPv6/IPv4 network, and hence we call it Bi-directional Stateless Internet Protocol/Internet Control Messaging Protocol Translation (BD-SIIT), as the IPv6/IPv4 network maintains a dual stack of both IPv4 and IPv6. The necessity of re-examining the problem arises because research in these areas has not been very widely explored. The rest of this paper is structured as follows. Section 2 discusses the proposed new network architecture. Section 3 illustrates the simulation models of the BD-SIIT translator. The performance measurement procedure is discussed in Section 4, and Section 5 shows the implementation and evaluation of BD-SIIT translation for the UDP and TCP protocols. Finally, the whole paper is concluded in Section 6.

2. Proposed Novel Network Architecture

In this section, we present a description of the architecture of the simulation environment for our work. The scenario given in Figure 2 depicts a conversation between two IPv4 based nodes over an IPv6 based network. The assumption in BD-SIIT is made based on the data exchange. In this paper we propose a new transition algorithm called BD-SIIT and show how it works with the UDP and TCP protocols. As we know, SIIT (Stateless Internet Protocol/Internet Control Messaging Protocol Translation) is an IPv6 transition mechanism that allows IPv6-only hosts to talk to IPv4-only hosts. BD-SIIT is said to be a stateless IP/ICMP translation, which means that the translator is able to process each conversion individually without any reference to previously translated packets. In this paper, the authors have adopted BD-SIIT to study network performance with a few types of traffic sources: voice over IPv4 (VoIPv4), FTP over IPv6 (FTPv6) and MPEG-4 over IPv6. The performance is evaluated considering the bandwidth, throughput, percentage of packets dropped, and mean end-to-end delay of each traffic flow for both IPv4 and IPv6. The ns-2 simulation shows that when the traffic density of the IPv6 session increases, the bandwidth of IPv6 increases at the expense of the IPv4 session. On the other hand, an increase in the traffic density of the IPv4 session does not increase its bandwidth.


Figure 1. Proposed network architecture of BD-SIIT
Figure 2. Network-based BD-SIIT translation process

2.1 IP Header Conversion

As we know, BD-SIIT is a stateless IP/ICMP translation. The BD-SIIT translator is able to process each conversion individually, without any reference to previously translated packets. Although most of the IP header field translations are relatively simple to handle, one of the issues related to the BD-SIIT translator is how to map the IP addresses between IPv4 and IPv6 packets. The BD-SIIT translation process supports the following additional types of IPv6 addresses.
1. IPv4-mapped address (0:FFFF:v4): This is an IPv6 address simply created by including the IPv4 address of the IPv4 host (v4) with the prefix shown. BD-SIIT mainly uses this type of address for the conversion of IPv4 host addresses to IPv6 addresses.
2. IPv4-translated address (0:FFFF:0:v4): According to the IETF specification this address is created from an IPv4 address temporarily assigned to the IPv6-only host, and allows for the mapping of the IPv4-translated address of the IPv6 host to an IPv4 address.
3. IPv6 over IPv4 dynamic/automatic tunnel addresses: These addresses are designated as IPv4-compatible IPv6 addresses and allow the sending of IPv6 traffic over IPv4 networks in a transparent manner. They are represented as ::156.55.23.5.
4. IPv6 over IPv4 addresses, automatic representation: These addresses allow IPv4-only nodes to still work in IPv6 networks. They are designated as IPv4-mapped IPv6 addresses and carry the ::FFFF prefix, for example ::FFFF:156.55.43.3.

2.2 The IPv6-mapped address in the BD-SIIT translator

BD-SIIT resides on an IPv6 host and converts outgoing IPv6 headers into IPv4 headers, and incoming IPv4 headers into IPv6 headers. To perform this operation the IPv6 host must be assigned an IPv4 address, either configured on the host or obtained via a network service left unspecified in RFC 2765. When the IPv6 host wants to communicate with an IPv4 host, based on DNS resolution to an IPv4 address, the BD-SIIT algorithm recognizes the IPv6 address as an IPv4-mapped address, as shown in Fig. 3. One mechanism is to translate the resolved IPv4 address into an IPv4-mapped address.

Figure 3. IPv4-mapped-IPv6 address
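The IPv4-mapped form used by BD-SIIT can be illustrated with Python's standard ipaddress module (an illustration of the address format only, not part of the BD-SIIT implementation):

import ipaddress

v4 = ipaddress.IPv4Address("156.55.43.3")
mapped = ipaddress.IPv6Address("::ffff:" + str(v4))   # 80 zero bits, FFFF, then the IPv4 address

print(mapped)               # ::ffff:9c37:2b03
print(mapped.ipv4_mapped)   # 156.55.43.3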

2.3 The BD-SIIT architecture

In this paper we have implemented the IPv4/IPv6 BD-SIIT, which operates when hosts in the native IPv4 network initiate connections with hosts in the native IPv6 network and vice versa. The newly implemented BD-SIIT system depends on the following two important characteristic features.

1v4-v6 DNS(DNS46):-It’s a server which identifies two public IPv4 and IPv6 addresses statistically or dynamically for each IPv4/IPv6 communication system. The Table-1 shows the DNS46 Database of the IPv6 and IPv4 Public addresses.

2.3.1 v4-v6 enabled gateway It’s a part of BD-SIIT translator which performs address mapping between an IPv4 and an IPv6 addresses as well as header conversions process between the IPv4 and IPv6 packet headers. Therefore the v4-v6 DNS(DNS46)server in our proposed system (BD-SIIT) has information regarding work station(host names) and their IP addresses which are available in both network regions.Table-1 clearly states that the Database table which stores two groups of IPv4 and IPv6 public addresses and their leased times which is used to

IPv4-mapped IPv6 address format (Figure 3): bits 0-79 = 80 zero bits, bits 80-95 = FFFF (16 bits), bits 96-127 = 32-bit IPv4 address.

Table 1: DNS46 database table of IPv6 and IPv4 public addresses

Public IPv4 | Leased time | Public IPv6 | Leased time
IPv4-1      | 180 s       | IPv6-1      | 180 s
IPv4-2      |             | IPv6-2      |
...         | ...         | ...         | ...
IPv4-N      |             | IPv6-N      |


The mapping table situated at the v4-v6 enabled gateway shows the actual IP addresses of the two communicating hosts (the IPv4 and IPv6 hosts) together with their corresponding public addresses (IPv4 and IPv6) and the two derived mapping values G and F. Table 2 and Table 3 show the IPv4 and IPv6 DNS entries on the DNS46 and the v4-v6 enabled gateway.

Table 2: IPv4 and IPv6 DNS on DNS46

Table 3: V4-V6 enabled Gateway

2.4 The BD-SIIT: a novel transition address-mapping algorithm for the forward operation

Based on the presence of an IPv4-mapped-IPv6 address in the destination IP address, the BD-SIIT algorithm performs the header translation as described in Algorithm 1 and Algorithm 2 to obtain an IPv4 packet for transmission via the data link and physical layers. Figure 4 shows the protocol stack view of BD-SIIT.

Figure 4. Protocol stack view of BD-SIIT (IPv6 and IPv4 applications, Sockets API, TCP/UDP v6 and TCP/UDP v4, IPv6 and IPv4, L2, L1)

Algorithm 1: IPv4 -> IPv6 — how BD-SIIT works with UDP
1. Host X, belonging to the X (IPv4) zone, initiates a request (query message) to the DNS46 in order to get the IP address of Host Y, which belongs to the IPv6 zone.
2. When the DNS46 receives the query message, it checks its Table 2 to identify whether Host Y has an IPv6 address, which is unknown to Host X. The DNS46 knows that the whole IPv6 zone has a public IPv4 address (as in the NAT method), i.e. 195.18.231.17; this address is placed in the destination address field of the IPv4 header and the packet is then forwarded via the network.

3. Simultaneously, the DNS46 sends another message to the v4-v6 enabled router in order to update Table 3.

Algorithm 2: IPv6 -> IPv4 — how BD-SIIT works with TCP
The following steps signify how BD-SIIT works with TCP in the feedback operation from IPv6 to IPv4.
[Note: Consider Host X as a client and Host Y as a server. The client X has sent an HTTP GET command to retrieve a web page from the server Y.]
1. As a response to the received command from client X, server Y creates packet(s) and forwards them via the network to client X using the public IPv6 zone address (ABC2::4321) as the destination address.
2. When the v4-v6 enabled router receives a packet sent by server Y, it looks up Table 1 and Table 3; depending on the address-mapping value (37 in our scenario), it uses 220.12.145.10 as the sender address and 223.15.1.3 as the destination address from Table 2, instead of the IPv6 addresses 1C::DACF and ABC2::4321.
3. After that, the v4-v6 enabled router creates a new IPv4 packet based on the accepted IPv6 packet and forwards it to the destination (client X). When client X receives the IPv4 packet, it starts to process it successfully without any problem.

Case 1: v4-to-v4 and v6-to-v6 direct connection by using a router

Figure 5. v6-to-v4 BD-SIIT connection via BD-SIIT translator

Figure 6. v4-to-v6 BD-SIIT connection via BD-SIIT translator

3. Performance measurement procedures

The majority of the tests were run for a sufficiently long period of time and resulted in the exchange of 50,000 to 1,000,000 packets, depending on the size of the packets sent and the corresponding test. We have conducted an empirical measurement test based on the following parameters: throughput, end-to-end delay (EED), and round trip time (RTT, or latency). All these performance metrics are simulated using the NS-2 simulator.

Table 2: IPv4 and IPv6 DNS on DNS46

Sl. No | IPv4 Address | IPv6 Address | DNS | Address mapping value
1      | 212.17.1.5   | ----         | B   | 4
2      | 223.15.1.3   | 1C::DACF     | Y   | 37

Table 3: v4-v6 enabled gateway

IPv4         | IPv6     | P_IPv4 | P_IPv6 | TTL   | M_Value1 | M_Value2
245.87.09.68 | 2D::BDEF | IPv4-1 | IPv6-1 | 120 s | G1       | F1


3.1 Throughput

Throughput is the rate at which a bulk data transfer can be transmitted from one node to another over a sufficiently long period of time; this performance is measured in Mbits/s. We calculated the throughput performance metric in order to identify the rate of received and processed data at the intermediate device (router or gateway) during the simulation period. The mean throughput for a sequence of packets of a specific size is calculated using equations (1) and (2):

Thr = ( Σ (j = 1 to N) Thrj ) / N        (1)

Where Thrj = (2)

where Thrj is the value of the throughput when packet j is received at an intermediate device such as the BD-SIIT gateway or the v4/v6 router, N is the number of packets received at the intermediate device, Pr is the number of packets received at the intermediate device, and Pg is the number of packets generated by the source host.

3.2 End-to-End Delay (EED)

We calculate the mean end-to-end delay for the IPv4-only network as well as for the IPv6 network with BD-SIIT enabled. The end-to-end delay has been calculated for varying packet sizes for the same network architecture. The mean end-to-end delay is calculated by taking into consideration the time at which a packet starts at the source, the time at which the packet reaches the destination, and the number of packets received, as given in (3). This information is extracted from the trace file obtained for the corresponding tcl script used for the simulation, with the help of a Perl script.

Hd = ( Σ (i = 1 to Nr) Hi ) / Nr        (3)

Hi = end-to-end delay of packet i = Tdi - Tsi        (4)
Tsi = time at which packet i is sent at the source.
Tdi = time at which packet i is received at the destination.
Nr = number of packets received at the destination.

3.3 Round Trip Time (RTT)

RTT is also known as latency: it is the amount of time taken by a packet to travel from a source node to a destination node and back to the original source node; this performance metric is measured in microseconds per RTT. The round trip time is one of the performance metrics computed in our simulation of BD-SIIT. The mean RTT for a sequence of packets with a specific size is computed as follows:

RTT = ( Σ (i = 1 to N) RTTi ) / N        (5)

where i is the packet number and N is the number of packets sent. It is worth noting that the packet size is directly proportional to the round trip time (refer to Fig. 9).

RTTi = Tri - Tsi        (6)

where RTTi is the round trip time of packet i, Tsi is the time at which packet i is created at the source host, and Tri is the time at which packet i is received back at the source host at the end of its trip.
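All three metrics reduce to simple averages over per-packet records. The sketch below (our own illustration; in the paper these values are extracted from the ns-2 trace files with a Perl script) computes them from lists of timestamps:

def mean_throughput(thr_samples):
    # thr_samples: per-packet throughput values Thr_j observed at the intermediate device
    return sum(thr_samples) / len(thr_samples)

def mean_eed(send_times, recv_times):
    # H_i = Td_i - Ts_i, averaged over the Nr received packets
    return sum(td - ts for ts, td in zip(send_times, recv_times)) / len(recv_times)

def mean_rtt(send_times, return_times):
    # RTT_i = Tr_i - Ts_i, averaged over the N packets
    return sum(tr - ts for ts, tr in zip(send_times, return_times)) / len(return_times)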

4. Implementation and Evaluation

We compute the mean end-to-end delay (EED), throughput, and RTT for the IPv4-only network as well as for the IPv6 network with BD-SIIT enabled. The EED, RTT and throughput have been calculated for varying packet sizes for the same network architecture. The mean end-to-end delay is calculated by taking into consideration the time at which a packet starts at the source node, the time at which the packet reaches the destination node, and the number of packets received, as given in equations (3) and (4). This information is extracted from the trace file obtained for the corresponding tcl script used for the simulation, with the help of a Perl script.

Case 2: v4-to-v6 and v6-to-v4 connections via the BD-SIIT translator.

4.1 Proposed Network Architecture for BD-SIIT in IPv6

To realize the DSTM, ns-2.26 with the MobiWan patch is used to create an IPv6 based scenario. The utility of this ns-2 MobiWan patch is that it provides some implemented protocols through which IPv6 based network simulations can be performed. The proposed DSTM scenario consists of an IPv4 source node which sends its node id to a DSTM server. The IPv4 to IPv6 DSTM network architecture consists of two DSTM servers, the DSTM TSP and TEP, and IPv4 hosts.

4.2 The Simulation Scenario of BD-SIIT

The proposed transition mechanism, BD-SIIT, is evaluated through simulation using the network simulator ns-2. The figure shows the topology for the BD-SIIT transition mechanism. This topology is designed for simulation in ns-2.26 with the MobiWan patch. It consists of two IPv4 hosts which communicate with each other over two IPv6 networks. To realize BD-SIIT, ns-2.26 with the MobiWan patch has been used in an IPv6 based scenario. The utility of the MobiWan patch is that it provides some implemented protocols through which IPv6 based network simulations can be performed.

Figure 7. BD-SIIT and DSTM Mean EED


Figure 8. BD-SIIT and DSTM Mean Throughput

Figure 9. Mean RTT of different packets in BD-SIIT

5. Conclusions

The aim of this paper is to examine the behavior of a transition mechanism that involves communication between two IPv4 hosts over an IPv6 network; hence we call the BD-SIIT transition mechanism the integrated IPv6/IPv4 network approach. The necessity of re-examining the problem arises because research in this area has not been widely explored. This research is an attempt to show the current scenario of the impact of transition mechanisms on a network. This work also concludes that, in spite of imposing extra delay on the network, DSTM is significant as a transition mechanism for two reasons. Firstly, a transition mechanism is required for the smooth interoperation of both protocols, and secondly, DSTM has been shown to have several features of the tunneling and dual stack approaches and can be taken as an intermediate of these two transition mechanisms. In this way DSTM supports better reliability and low data loss by combining the specific characteristic features of the two transition mechanisms.

References
[1] G.C. Kessler, "IPv6: The Next Generation Internet Protocol," The Handbook on Local Area Networks.
[2] Jivika Govil, Jivesh Govil, Navkeerat Kaur, Harkeerat Kaur, "An Examination of IPv4 and IPv6 Networks: Constraints and Various Transition Mechanisms," IEEE Region 3 Section (IEEE SoutheastCon 2008), April 3-6, 2008, Huntsville, Alabama, USA, pp. 178-185.

[3] Jivika Govil, Jivesh Govil, "On the Investigation of Transactional and Interoperability Issues between IPv4 and IPv6," in Proceedings of the IEEE Electro/Information Technology Conference (EIT 2007), May 2007, Chicago, USA, vol. 2, pp. 604-608.
[4] S. Deering and R. Hinden, "Internet Protocol Version 6 (IPv6) Specification," RFC 2460, December 1998.
[5] John J. Amoss and Daniel Minoli, Handbook of IPv4 to IPv6 Transition: Methodologies for Institutional and Corporate Networks, Auerbach Publications.
[6] S.G. Glisic, Advanced Wireless Networks: 4G Technologies, John Wiley and Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, 2006.
[7] Juha Wiljakka, Jonne Soininen, "Managing IPv4-to-IPv6 Transition Process in Cellular Networks and Introducing New Peer-to-Peer Services."
[8] Ioan R., Sherali Z., "Evaluating IPv4 to IPv6 Transition Mechanisms," West Lafayette, USA, 2003, vol. 1, pp. 1091-1098.
[9] Andrew S. Tanenbaum, Computer Networks, Third Edition, Prentice Hall Inc., 1996, pp. 686, 413-436, 437-449.

Author’s Profile

Hanumanthappa J. received his Bachelor of Engineering degree in Computer Science and Engineering from University B.D.T. College of Engineering, Davanagere, Karnataka, India (Kuvempu University, Shankarghatta, Shimoga) in 1998, and his Master of Technology in Computer Science and Engineering from NITK Surathkal, Karnataka, India, in 2003. He is currently pursuing his PhD in Computer Science and Engineering at Mangalore University under the supervision of Dr. Manjaiah D.H., on the topic "Investigations into the Design, Performance and Evaluation of a Novel IPv4/IPv6 Transition Scenarios in 4G Advanced Wireless Networks". Presently he is working as an Assistant Professor at the DoS in Computer Science, Manasagangotri, Mysore. He has authored more than 40 research papers in international conferences and reputed journals.

Dr. Manjaiah D.H. is currently Reader and Chairman of the BoS in both UG and PG Computer Science at the Department of Computer Science, Mangalore University, Mangalore. He is also a BoE member of all universities of Karnataka and other reputed universities in India. He received his Ph.D. degree from Mangalore University, his M.Tech. from NITK Surathkal, and his B.E. from Mysore University. Dr. Manjaiah D.H. has extensive academic, industry and research experience. He has worked with many technical bodies such as IAENG, WASET, ISOC, CSI, ISTE and ACS. He has authored more than 65 research papers in international conferences and reputed journals. He has been invited to give several talks in his area of interest on many public occasions. He is an expert committee member of the AICTE and various technical bodies. He has written a Kannada textbook entitled "COMPUTER PARICHAYA" for the benefit of the teaching and student community of Karnataka. His areas of interest are computer networking and sensor networks, mobile communication, operations research, e-commerce, Internet technology and web programming.


Optimizing AODV routing protocols through queuing creation and converting flow model into

packet model in wireless sensor networks

Ehsan Kharati 1, Alireza Baharizadeh 2, Meysam Chegini 3 and Mohsen Amerion 4

1 Islamic Azad University of Arak, Computer Engineering Department,

Arak, Iran [email protected]

2 Sama Technical and Vocational Training School,

Islamic Azad University, Shahrekord Branch, Shahrekord, Iran

[email protected]

3 Sharif University of Technology, Mechanic Department,

Tehran, Iran [email protected]

4 Halmstad University

School of Information Science, Computer and Electronic Engineering Halmstad, Sweden

[email protected]

Abstract: In this paper, a combined model of network routing is presented. Flow models and the packet model are combined together as discrete events in a queue created in the AODV routing protocol. In the proposed combined model, the flow model available in the network is analyzed dynamically, which reduces the routing time. Simulation results for this method show that, as the number of network nodes increases, the latency and the rate of packet loss are reduced in comparison with previous methods. Thus, by replacing the flow model with the packet model, the computational overhead is reduced and network performance can be improved.

Keywords: Routing protocol, Fluid flow, Packet model, rate of packet lost, wireless sensor networks

1. Introduction

In this paper, traffic flow and discrete event models are combined for routing in the AODV protocol of wireless sensor networks. In doing so, there is a packet loss problem; to solve this issue, a queuing method is used in the middle nodes [1]. To calculate the rate of packet loss, it is first essential to convert packets to flows; subsequently, using the related differential equations and parameters such as queue length, bandwidth, etc., the routing delay and packet loss rate in the nodes can be calculated. Afterwards, by evaluating and comparing the rates obtained with the proposed method against previous methods while increasing the network scale, the efficiency of the proposed method can be assessed. The AODV protocol is a routing protocol in wireless sensor

networks which sends packets via adjacent nodes to the destination. This protocol is demand-driven and provides routing information whenever it is required; hence there is no need for periodic refreshes. The protocol selects the minimum distance to support QoS and operates based on the DSDV protocol algorithm. It maintains established routes as long as they are active; therefore it is suitable for non-ring and large networks. When the location of nodes and the network topology change, the AODV protocol can quickly establish new routes between nodes using response packets of previous routes, which causes an overhead increase in this protocol. This is because of releasing packets from the source node to the other nodes; to reduce this overhead, different methods have been proposed. For example, in [1], using a queue in the nodes results in storing packets and routing them in the next periods, and also reduces the time and routing restrictions. In [2], a special algorithm is used to reduce routing packets. This algorithm sends routing packets only to adjacent nodes that respond to a routing request during a particular time. Lack of coordination in timing causes slow routing and reduces the response time to interactions. With distributed and parallel computational resources in discrete event simulation, the routing processing rate can be increased [3]. In routing of flow model networks, the user is able to direct network traffic and exchanges, similar to flowing streams, in contrast to single packets. The two models differ in the foreground traffic of


the packet model and the background traffic of the flow model. Background traffic matters less for accuracy and precision, but the two kinds of traffic compete for network resources [4]. The difficulty with the flow model in the AODV protocol is its integration with discrete-event packet routing. In contrast to packet streams, which use discrete events to describe routing, the flow model describes the network and the traffic streams with a set of differential equations and continuous variables. Flow-model routing based on discrete events is more accurate than the packet model in the AODV protocol, and the packet-loss rate decreases as well. In [5], a combination of the two models is proposed in which packet flows are passed through a virtual network, so flow-model events can be routed over packets. In [6], a flow-based model is proposed that uses a set of ordinary differential equations (ODEs) to describe the behaviour of steady AODV streams; these equations are solved numerically with considerable computation. In [7], the model of [6] is extended with improved network-topology information, a set of delay differential equations, and a fixed-step Runge–Kutta algorithm. The results show better accuracy and run time for a large number of AODV flows compared with closed-surface routers. In that reference, flow-based traffic is combined with discrete-event packet flows; the network must be divided into a packet part and a flow part, and all flows compete for network resources. In the flow network, the primary core is global and its state is determined by solving differential equations; in the packet network, transactions are routed as discrete events. The combination takes place while packets enter and the network flow is steadily converted into packet flows. That reference also derives the probabilities of delay, packet loss, and leaving the network as functions of time. In [8], a similar approach implemented in MATLAB is presented that solves the differential equations for packet events passing through the flow network more quickly. Since the flow- and packet-model traffic in the AODV protocol are separate and the network structure cannot be changed during routing, most of the listed methods divide the network into flow and packet parts before routing. The method proposed in this paper instead merges the packet and flow models rather than dividing the network, so queue-size dynamics and other details of the flow network can be examined during routing. Because packet flows are only available as foreground traffic in the network, sending packets in the virtual network is limited, and the flow model alone can only route packet streams from one node to another; it cannot specify packet loss and delay. In real-time routing with the AODV protocol, the user can send real packet streams to any part of the virtual network. Each router in the virtual network therefore combines the incoming fluid and packet streams and enqueues them. The resulting dynamic queue is governed by a set of differential equations, which are solved with a fixed-interval Runge–Kutta method. Enqueued packets are then routed as discrete events and sent according to the queue state, so the packet-loss and delay probabilities can be calculated. Other features and advantages of the proposed method include the following:

• The network is not divided between the packet and flow models; flows and packets are combined dynamically. Network traffic can therefore be transferred to any section of the virtual network and can interact with all of the flows.

• Packet streams are converted to fluid only inside the network queues and nowhere else, whereas in [7] packet flows are converted to fluid flows as soon as they arrive in the network. The proposed method can therefore maintain packet-level accuracy with respect to changes in individual packets.

• The user can switch network traffic between the packet and flow models dynamically, so the accuracy and precision of the routing can be traded against the time limit. As the ratio of packet streams in the combined model increases, more traffic and detail can be routed at the cost of computational time.

The rest of this article is organised as follows. Section 2 describes the features of the flow model. Section 3 describes how the packet and flow models are converted into one another. Section 4 evaluates the accuracy, precision, and performance of the proposed method. Finally, Section 5 presents conclusions and future work.

2. Flow model
The first flow model of a network was developed for Active Queue Management (AQM) routers [9]. This model uses a set of differential equations to describe traffic behaviour, generating the network traffic with the help of flow classes. Each class i contains n_i homogeneous flows with the same specification that traverse the same route. These equations control the frame (window) size of every flow class and the queue state, govern release, and reduce the risk of loss and delay inside the network. The frame size at time t for each AODV flow in flow class i satisfies

$$\frac{dW_i(t)}{dt} = \frac{1}{R_i(t)} - \frac{W_i(t)}{2}\,\lambda_i(t) \qquad (1)$$

where W_i(t) is the frame size at time t, R_i(t) is the round-trip delay of the flow, and λ_i(t) is the packet-loss rate at time t. This equation captures the behaviour of AODV traffic in the congestion-avoidance stage, but it does not bound the frame size. The size of queue l at time t satisfies

$$\frac{dq_l(t)}{dt} = \Lambda_l(t)\,\bigl(1 - p_l(t)\bigr) - C_l \qquad (2)$$


where q_l(t) is the length of queue l at time t, C_l is the bandwidth of the link, and p_l(t) is the packet-loss probability. Λ_l(t) is the arrival rate of the flow classes passing through the queue and equals

$$\Lambda_l(t) = \sum_{i \in N_l} A_{il}(t) \qquad (3)$$

where A_{il}(t) is the instantaneous arrival rate of flow class i and N_l is the set of flow classes passing through link l. This equation likewise does not enforce the constraint that the queue length stay between 0 and the maximum queue size. In a network that uses random early detection (RED) for its queues, the packet-loss probability is based on the average queue size, which can be computed from the instantaneous queue size:

$$\frac{dx_l(t)}{dt} = \frac{\ln(1-\alpha)}{\delta}\,x_l(t) - \frac{\ln(1-\alpha)}{\delta}\,q_l(t) \qquad (4)$$

where x_l(t) is the average queue size, δ is the sampling step size, and α is the weight used in the exponential weighted moving average (EWMA). The packet-loss probability in the AODV protocol is then

$$p(x) =
\begin{cases}
0, & 0 \le x < q_{\min}\\[4pt]
\dfrac{x - q_{\min}}{q_{\max} - q_{\min}}\,p_{\max}, & q_{\min} \le x \le q_{\max}\\[8pt]
\dfrac{x - q_{\max}}{q_{\max}}\,\bigl(1 - p_{\max}\bigr) + p_{\max}, & q_{\max} \le x \le 2\,q_{\max}\\[8pt]
1, & \text{otherwise}
\end{cases} \qquad (5)$$

where q_min, q_max, and p_max are the RED queue configuration parameters. The arrival rate of flow class i at its first queue s_i is therefore

$$A_{i s_i}(t) = \frac{n_i\,W_i(t)}{R_i(t)} \qquad (6)$$

As the input rate of flow class i increases, the incoming packets and flows enter the next queue rather than remaining in the intermediate nodes. The arrival rate at the next queue is therefore

$$A_{i\,g_i(l)}(t + a_l) = D_{il}(t) \qquad (7)$$

where A_{i g_i(l)} is the arrival rate at the next queue, a_l is the link propagation delay, and g_i(l) is the next node whose queue flow i enters after queue l. The arrival at queue l is delayed by queuing; this delay equals

$$t_f = t + \frac{q_l(t)}{C_l} \qquad (8)$$

where t_f is the time at which a flow entering queue l at time t leaves it (the queuing delay in the AODV protocol) and C_l is the bandwidth of the link. Thus, if the total arrival rate after dropping does not exceed the service capacity, the departure rate of a flow equals its arrival rate after dropping; otherwise it is proportional to the flow's share of the arrival rate competing for the shared service, so we have

$$D_{il}(t_f) =
\begin{cases}
A_{il}(t)\,\bigl(1 - p_l(t)\bigr), & \text{if } \Lambda_l(t)\,\bigl(1 - p_l(t)\bigr) \le C_l\\[8pt]
\dfrac{A_{il}(t)}{\Lambda_l(t)}\,C_l, & \text{otherwise}
\end{cases} \qquad (9)$$

where A_{il}(t) is the arrival rate of flow class i at queue l. The cumulative delay experienced by flow class i up to queue l is

$$d_{il}(t_f) =
\begin{cases}
\dfrac{q_l(t)}{C_l}, & \text{if } l = s_i\\[10pt]
d_{i\,b_i(l)}\bigl(t - a_{b_i(l)}\bigr) + a_{b_i(l)} + \dfrac{q_l(t)}{C_l}, & \text{otherwise}
\end{cases} \qquad (10)$$

where d_{il}(t) is the delay of flow class i at queue l at time t and b_i(l) is the queue preceding queue l on the route of flow i. The cumulative loss rate for the packets of the flow class is likewise

$$r_{il}(t_f) =
\begin{cases}
A_{il}(t)\,p_l(t), & \text{if } l = s_i\\[6pt]
r_{i\,b_i(l)}\bigl(t - a_{b_i(l)}\bigr) + A_{il}(t)\,p_l(t), & \text{otherwise}
\end{cases} \qquad (11)$$

where r_{il}(t_f) is the cumulative loss rate of the packets of flow i up to queue l. Since packet loss and queuing delays on the return path are negligible, and assuming that traffic exists only on the forward path, we have

$$R_i(t) = d_{i f_i}(t - \pi_i) + \pi_i \qquad (12)$$

$$\lambda_i(t) = \frac{r_{i f_i}(t - \pi_i)}{n_i} \qquad (13)$$

where R_i(t) is the round-trip time, λ_i(t) is the packet-loss rate of flow class i, π_i is the one-way path delay of flow class i, and f_i is the last queue that the flow class passes on the forward path. The proposed model can only send AODV flows through RED queues, but by changing equation (1), which controls the size of the sent frame, it can easily be extended to model a wider range of flow classes, such as User Datagram Protocol (UDP) traffic and deadline-constrained processing [10]. UDP, like TCP, operates in the transport layer, but unlike TCP it is connectionless; it is therefore faster but cannot control errors properly, so its reliability is low. Its main use is sending and receiving data at high rates. Hence, for a network whose traffic is produced by compact fixed-rate flows, this model offers a natural solution. In [11], a solution based on computing a periodic fixed point of a set of nonlinear equations is proposed; here, the problem can be solved directly, and without any changes to the fluid model, by shortening the RED queue. In this case the packet-loss probability equals

$$p_l(t) =
\begin{cases}
\dfrac{\Lambda_l(t) - C_l}{\Lambda_l(t)}, & \text{if the queue is full and } \Lambda_l(t) > C_l\\[10pt]
0, & \text{otherwise}
\end{cases} \qquad (14)$$
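As an illustration only (not part of the paper), the RED drop probability of equation (5) and a fixed-step update of the queue equations (2) and (4) can be sketched in Python. A simple Euler step stands in for the Runge–Kutta integration used by the authors, and all numeric values are taken loosely from the experiment in Section 4.

```python
import math

def red_drop_prob(x, q_min, q_max, p_max):
    """Gentle-RED drop probability of eq. (5) as a function of the average queue size x."""
    if x < q_min:
        return 0.0
    if x <= q_max:
        return (x - q_min) / (q_max - q_min) * p_max
    if x <= 2 * q_max:
        return (x - q_max) / q_max * (1 - p_max) + p_max
    return 1.0

def fluid_queue_step(q, x, arrival_rate, C, delta, alpha, q_min, q_max, p_max):
    """One fixed time-step update of the instantaneous queue q (eq. 2) and EWMA average x (eq. 4)."""
    p = red_drop_prob(x, q_min, q_max, p_max)
    dq = arrival_rate * (1 - p) - C                # eq. (2)
    q = max(0.0, q + delta * dq)                   # queue length cannot go negative
    dx = (math.log(1 - alpha) / delta) * (x - q)   # eq. (4): drives x toward q
    x = x + delta * dx
    return q, x, p

# Illustrative run: an overloaded 100 Mbps link (rates in bytes/s), RED params as in Section 4
q, x, p = 0.0, 0.0, 0.0
for k in range(1000):
    q, x, p = fluid_queue_step(q, x, arrival_rate=120e6 / 8, C=100e6 / 8,
                               delta=1e-3, alpha=0.0004,
                               q_min=100e3, q_max=4.5e6, p_max=0.1)
print(q, x, p)
```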

3. The proposed model for converting the flow model to the packet model

Packet and fluid flows pass through the network and compete for network resources, but only packets that are enqueued can be routed. Whether a packet is accepted into a queue depends on the queue state at its arrival time, the instantaneous queue size, and the packet-loss probability. The queue


state is controlled by the set of differential equations presented in the previous section. The fixed-interval Runge–Kutta method is used to compute the queue-state changes and the flow-class frame sizes over each time interval. Let δ be the time interval; for every flow class that passes through a queue, the model tracks, in each interval, the flow arrival rate, send rate, traffic delay, packet-loss rate, frame size, instantaneous queue length, average queue length, packet arrival rate, packet-loss probability, and the cumulative number of packets that have entered the queue since the beginning. To obtain the number of packets of a flow that have entered a network queue, the packets are tracked up to the previous Runge–Kutta interval. The packet arrival rate at queue l is then

$$A^{P}_{l}(k\delta) = \frac{N^{P}_{l}(k\delta) - N^{P}_{l}\bigl((k-1)\delta\bigr)}{\delta} \qquad (15)$$

Packets, like flows, compete for queue space on arrival, so the queue length required for packet arrivals can be obtained from equation (2). The queue length therefore satisfies

$$\frac{dq_l(t)}{dt} = \xi_l(t)\,\bigl(1 - p_l(t)\bigr) - C_l \qquad (16)$$

where ξ_l(t) is the combined arrival rate of the packet and fluid flows and C_l is the bandwidth of the link, so that

$$\xi_l(t) = \Lambda_l(t) + A^{P}_{l}(t) \qquad (17)$$

When the arrival rate exceeds the service capacity, the rate of packet loss is proportional to the arrival rates of the packet and fluid flows currently competing for the shared service. We therefore have

$$D_{il}(t_f) =
\begin{cases}
A_{il}(t)\,\bigl(1 - p_l(t)\bigr), & \text{if } \xi_l(t)\,\bigl(1 - p_l(t)\bigr) \le C_l\\[8pt]
\dfrac{A_{il}(t)}{\xi_l(t)}\,C_l, & \text{otherwise}
\end{cases} \qquad (18)$$

The difficulty here is that packets arriving at the queue between Runge–Kutta time intervals cannot be routed directly. To solve this, the changes are computed with three variables: the instantaneous queue length, the average queue length, and the packet-loss probability. Their initial values at the end of the previous Runge–Kutta interval are, respectively,

$$\tilde{q}_l := q_l(k\delta), \qquad \tilde{x}_l := x_l(k\delta), \qquad \tilde{p}_l := p_l(k\delta)$$

If we assume that packet events E_1, E_2, …, E_m are currently entering queue l and that t_1, t_2, …, t_m are their arrival times within the interval, with t_0 = 0, then the time between successive packet-arrival events is Δt_i = t_i − t_{i−1}, for i = 1, 2, …, m. Since packets arrive individually while fluid enters the queue continuously, the queue length changes at every event-processing step. The instantaneous queue length at a packet arrival is therefore first updated as

$$\tilde{q}_l = \min\Bigl\{Q_l,\ \max\bigl\{0,\ \tilde{q}_l + \Delta t_i\bigl(\Lambda_l(k\delta)\,(1 - \tilde{p}_l) - C_l\bigr)\bigr\}\Bigr\}$$

If the packet is dropped with probability \tilde{p}_l, or the queue is full, the arriving packet is discarded and the next event is processed. Otherwise the packet length is added to the instantaneous queue length, and the average queue length and packet-loss probability are recomputed according to the RED method. Processing of the packet, with its scheduling and queuing delay, is then complete. In this way the flow model is made compatible with the Runge–Kutta packet model through these variables: the queue state can be specified at the beginning of each interval and updated at every packet arrival. The impact of the packet flows on the fluid flows is obtained by solving the fluid-model equations (17) and (18).
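A minimal sketch (not from the paper) of the per-event update described in this section: the instantaneous queue is advanced by the fluid arrivals over the inter-event gap, the arriving packet is accepted or dropped, and the RED average and drop probability are re-estimated. The discrete EWMA update and the linear RED profile in the usage example are simplifying assumptions.

```python
import random

def process_packet_event(q, x, p, dt, fluid_rate, C, Q_max, pkt_len, alpha, red_prob):
    """Advance queue state over the inter-event gap dt with fluid traffic,
    then accept or drop one arriving packet (per-event update of Section 3)."""
    # fluid contribution over the gap, clipped to [0, Q_max] as in the min/max update above
    q = min(Q_max, max(0.0, q + dt * (fluid_rate * (1.0 - p) - C)))

    # packet-level decision: RED drop or queue overflow
    if random.random() < p or q + pkt_len > Q_max:
        accepted = False                  # packet discarded, next event processed
    else:
        q += pkt_len                      # packet length added to the instantaneous queue
        accepted = True

    # re-estimate the RED average queue and drop probability after the event
    # (discrete EWMA used here in place of the continuous eq. (4), for brevity)
    x = (1.0 - alpha) * x + alpha * q
    p = red_prob(x)
    return q, x, p, accepted

# Example usage with an illustrative linear RED profile (all values made up)
red = lambda avg: min(1.0, max(0.0, (avg - 100e3) / (4.5e6 - 100e3) * 0.1))
state = process_packet_event(q=1e6, x=1e6, p=red(1e6), dt=1e-4,
                             fluid_rate=120e6 / 8, C=100e6 / 8, Q_max=5e6,
                             pkt_len=1500, alpha=0.0004, red_prob=red)
print(state)
```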

4. Efficiency evaluation of the proposed method

Simulation with a time limit makes interaction with the physical world and coordination with real time possible; it also allows traffic to be directed from router nodes to other routers and links in the network [2, 4]. The difficulty is the time limit and guaranteeing that events are processed before their deadlines. The combined packet/flow scheme is implemented and simulated in the PRIME SSFNet test network, which is designed for routing with time limits; most of its models derive from the RINSE simulator from UIUC [9]. Figure 1 shows the topology of this network, first used in [8]. The second experiment uses the dumbbell topology shown in Figure 2. The experiments were run in the SSFNet simulator for 100 seconds on a machine with a 2.2 GHz AMD Athlon64 CPU and 2 GB of RAM. In the first experiment the network consists of 12 nodes and 11 links; every link has a delay of 10 milliseconds and a bandwidth of 100 Mbps. Matching the number of nodes, there are 22 RED queues with a maximum size of 5 MB each. For packet routing, the TCP Reno model is used with a maximum frame size of 128. The interval size of the flow model is fixed at 1 ms, the weight used in the EWMA calculation is 0.0004, and the RED queue parameters are q_min = 100 KB, q_max = 4.5 MB, and p_max = 0.1. There are four flow classes in the test network: classes 0 and 1 each consist of 10 flows, class 2 of 20 flows, and class 3 of 40 AODV flows. Classes 0, 1, and 2 start at time zero and last for the whole routing run, while class 3 starts at time 30 and lasts only 30 seconds. As a result, classes 2 and 3 compete, and congestion occurs on the link between nodes 4 and 7 between 30 and 60 seconds. Figure 3 shows the packet and flow routing results as the number of packets that must be stored in the queue between nodes 4 and 7. Packet routing fluctuates strongly, while the fluid model behaves almost steadily, because the overhead of mode changes is reduced for computational efficiency. Moreover, the overall queuing level produced by the fluid model is higher than with packet routing, so the flow


model needs further tuning; a method for this tuning is proposed in [7]. To examine the accuracy and precision of the combined model, the flow model is used in classes 0, 1, and 3, while in class 2 the AODV flows are routed as a combination of packet and fluid flows. In this test, the packet-flow ratio in class 2 is increased from 10% to 50% and finally to 100%. Figure 4 shows the total source-to-destination delay of classes 0 and 2 under the flow and combined models. As the packet-flow ratio increases, the queue length and the total source-to-destination delay increase as well. The next test uses the dumbbell topology of Figure 2 to investigate the computational efficiency of the combined model. This topology is often used to analyse AODV traffic-control algorithms. It has N server nodes as traffic sources on the left and N client nodes as traffic sinks on the right. During routing, the server and client nodes simultaneously participate in M sessions; each node has a queue with a maximum size of M MB and connections towards the routers with a bandwidth of (10 × M) Mbps. The bandwidth between the two routers is (10 × M × N) Mbps and the maximum queue size of each router is (M × N). The RED parameters are set so that q_min is one hundredth of the maximum queue size and q_max is half of it. Figure 5 shows the results when the number of server and client nodes is 5 and the number of simultaneous AODV packet flows is increased from 5 to 40. As the figure shows, when the number of flows equals the number of nodes, virtually no extra time is needed to complete the routing, but as the packet flows increase, the routing time increases.

Figure 1. Tentative topology of the network with four categories of flow

Figure 2. Dumbbell network topology

Figure 3. Packet flow compared with fluid flow and its effect on queue size

Figure 4. Average latency for different flow-routing models



Figure 5. Comparison of run time for different network models

5. Conclusion
In this paper, a combined routing model for the AODV protocol was introduced that combines and converts discrete and continuous events through packet and fluid-flow interactions in each router. It was also shown that, by solving the differential equations in each router, the flow model is much faster than packet routing, and that increasing the share of fluid flow in the system adds little overhead compared with the cost of routing AODV packet flows.

References
[1] E. Kharati, "A Survey and Optimization of Routing Protocols for Wireless Ad-hoc Networks", Master Thesis Project, Islamic Azad University of Arak, pp. 98-119, 2004.

[2] E. Kharati and Ali Movaghar, “An object-oriented routing structure in J-Sim environment for Wireless Sensor Networks”, In 13th International CSI Computer Conference, March 2008.

[3] E. Kharati and Ali Movaghar, “Optimization of S-Mac routing protocol to reduce energy consumption for Wireless Sensor Networks”. In 13th International CSI Computer Conference, March 2008.

[4] E. Kharati, “Create a Debugger in simulation for DiSenS Sensor Networks”. In First Annual Symposium of Computer Engineering, Electronic Engineering, IT Engineering, Islamic Azad University of Hamedan, March 2008.

[5] C. Kiddle, R. Simmonds, C. Williamson, and B. Unger, “Hybrid packet/fluid flow network simulation”, In Proceedings of the 17th Workshop on Parallel and Distributed Simulation (PADS’03), pp. 143–152, June 2003.

[6] V. Misra, W-B. Gong, and D. Towsley, “Fluid based analysis of a network of AQM routers supporting TCP flows with an application to RED”, In Proceedings of the 2000 ACM SIGCOMM Conference, pp. 151–160, August 2000.

[7] Y. Liu, F. L. Presti, V. Misra, D. F. Towsley, and Y. Gu, "Scalable fluid models and simulations for large-scale IP networks", ACM Transactions on Modeling and Computer Simulation (TOMACS), Vol. 14, Issue 3, pp. 305-324, July 2004.

[8] J. Zhou, Z. Ji, M. Takai, and R. Bagrodia, “Integrating hybrid network modeling to the physical world”, ACM Transactions on Modeling and Computer Simulation (TOMACS), MAYA, Vol 14. Issue (2): pp. 149-169, April 2004.

[9] Y. Gu, Y. Liu, and D. Towsley. “On integrating fluid models with packet simulation”. In Proceedings of IEEE INFOCOM 2004, pp. 2856-2866, March 2004.

[10] M. Zukerman, T. D. Neame, and R. G. Addie, "Internet traffic modeling and future technology implications", In Proceedings of IEEE INFOCOM 2003, pp. 587-596, April 2003.

[11] D. M. Nicol, M. Liljenstam, and J. Liu, “Advanced concepts in large-scale network simulation”. In Proceedings of the 2005 Winter Simulation Conference (WSC’05), pp. 153-166, 2005.


Evaluation and improvement of resource retrieving methods using reinforcement learning in Grid networks

Erfan Shams1, Abolfazl Toroghi Haghighat2 and Ehsan Ghamari3

1Islamic Azad University Qazvin Branch, School of Computer Engineering, Qazvin, Iran

[email protected]

2Islamic Azad University Qazvin Branch, School of Computer Engineering, Qazvin, Iran

[email protected]

3Islamic Azad University Qazvin Branch, School of Computer Engineering, Qazvin, Iran

[email protected]

Abstract: Grid computing enables virtual organisations to share geographically distributed resources in order to achieve common goals; in general it lacks a central location, central control, and pre-established trust relationships. To solve the grid problem it is necessary to find the most suitable resource in the shortest time, and this purpose is addressed as one part of the overall solving process. All existing information-retrieving approaches try to serve every request optimally in the shortest time, but they cannot match or adapt to changes in the grid network, so flexibility in retrieving and allocating resources is necessary. In this paper, a component is inserted into the protocol to manage decision control based on reinforcement learning. Using learning patterns in the grid network, state recognition, agent recognition, the number of agents, and the resource information obtained from the grid network, this component performs retrieval and resource allocation more optimally than other methods.

Keywords: resource retrieving, grid, reinforcement learning.

1. Introduction
Grid computing is a computational model in which huge computations can be processed using the computing power of many networked computers while presenting them as a single virtual computer. In other words, a grid can solve enormous computational problems using the computing power of several separate computers, mostly connected through a network (the Internet) [1, 3]. One of the most important current topics in computer networking is the distributed network with grid architecture. Given the development of computing applications and rapid hardware advances, building integrated systems from free, heterogeneous resources for multi-purpose processing while supporting maximum resource efficiency has become essential. Because many investigative and computational applications run on these networks, solving the bottleneck problem and dedicating optimal resources to users for executing these processes is a significant issue in the field of distributed networks with grid architecture. In this paper we add a component, called decision-control management, based on reinforcement learning to the resource-management unit of the grid. Over time it learns more about the resources and maps requests onto resources optimally and rapidly. We use reinforcement learning because this method suits the grid network: it is an online learning technique, it can be applied in environments whose state is observed through interaction, and it does not depend on large bodies of training data for an instructor (unlike neural-network and genetic-based methods). In our framework, based on the rewards obtained along the path, the recognition of each resource (node) becomes stronger as the traversed steps are repeated, so the network is understood better over time. In the following, we introduce the new approach and show, through comparison and testing, its importance and efficiency relative to previous studies.

2. Information retrieving in grid networks

Requests are served in two manners: (1) real-time and (2) non-real-time. In the first, serving is performed immediately: as soon as a request is received from the broker, every resource that satisfies the stated conditions sends a message back to the broker; the broker then selects the best resource based on factors such as distance and forwards the request to that resource for serving. This method clearly makes the network traffic heavier. In the second, information about resources is already available at request-reporting


time: the discovered resources are saved by the brokers in the resource-management part. Optimal resources are searched and selected as soon as the request is reported, and the request is then sent to the selected resources. In this way network traffic is significantly reduced, and requests are therefore answered more rapidly.

2.1 Breadth First Search (BFS)
BFS is one of the simplest and most widely used search methods, with a straightforward procedure. Every node that has a request sends it to all of its neighbours and also searches its local information for a matching answer. These steps are repeated by every node that receives the request; when the necessary resource is found, a message requests the resource through the nearest node that replied [8]. The disadvantage of this method is the heavy traffic it creates. Assume there are n nodes and every node has m neighbours. In the first step the broker sends a message to its m neighbours; after a few steps a very large number of messages have been sent, and the network becomes congested.

2.2 Random Breadth First Search (RBFS)
This method is similar to the previous one, with the difference that a node does not send the message to all its neighbours at every step but only to a subset of them. The method has both advantages and disadvantages: it reduces network traffic and searches quickly, since nodes need no information about the neighbours to which the message is sent, so no time is wasted on verification and decision. In other words, every node randomly selects some of its neighbours and sends the message to them. Random neighbour selection is the significant disadvantage, because the dead parts of the network that are only weakly connected are almost never queried.

2.3 Random Breadth First Search with RND-Step
This is an improvement of the previous methods: the search is started from n brokers instead of only one (n depends on the given number of steps), and each of these nodes then searches for free resources. The disadvantages are the same as for RBFS; because of the random behaviour in the search step, optimal results are not obtained. On the other hand, searching along several paths (and the linear increase in the number of neighbours of all nodes) raises the efficiency.

2.4 Searching with stored information
Unlike the three methods above, this approach answers requests in non-real-time. Methods of this kind, which record the status of neighbours and responses to requests, include directed breadth-first search (DBFS) and hashing; their efficiency is higher than that of the random-based methods. In particular, they significantly reduce network traffic and the number of queries, and can therefore find the required resources quickly enough for in-time responses to requests [10, 12]. The notable difference between our method and these is the use of reinforcement learning, which builds up suitable recognition of the resources over time in order to allocate them optimally.

3. Reinforcement learning
Reinforcement learning [13] is, in general, the art of finding a strategy that improves one's situation towards a certain goal, given knowledge of the environment, the consequences of interacting with it, and the benefits and costs of the possible actions. Simply put, reinforcement learning is learning through interaction with the environment to achieve a specified goal. The decision maker and learner is called the agent; everything the agent interacts with (everything external to the agent) is called the environment. The interaction proceeds continuously: the agent makes a decision and performs an action, the environment responds by granting a reward, and the agent moves to a new state. More precisely, agent and environment interact at a sequence of time steps t = 1, 2, 3, …; at each step t the agent receives a new state from the environment. In this paper we take the whole grid space as the state set, with s_t ∈ S, where S is the set of possible states of the resource-allocation environment, and a_t an action from the set of actions the agent can take in state s_t. At the next step the environment grants a reward r_{t+1} ∈ R at time t + 1 and, based on its previous action, the agent moves to the new state s_{t+1}. Mathematically, a policy is a mapping, for instance

$$\Pi : S \times A \to [0,1]$$

That is, a number in [0, 1] is assigned to every state–action pair (s, a) ∈ S × A:

$$\Pr\{a_t = a \mid s_t = s\} = \Pi(s, a)$$

The sequence of states, actions, and rewards considered in reinforcement learning is shown in Figure 1.

Figure 1. The sequence of states, actions, and rewards.

The value of a state s under policy Π is given by the state-value function, and the value of an action–state pair (s, a) by the action-value function; computing the true value of every state under a fixed policy Π is known as policy evaluation and is a prerequisite for complete learning.

Example: suppose the reinforcement learning system is to be trained to pick the most suitable resource to serve each request. Based on the amount and time of processing required to allocate a resource, the reward unit is initialised as +1: appropriate, −1: inappropriate, 0: neutral.
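The value-function formulas referenced here were lost in the source; for completeness, the standard definitions from the cited textbook [13], with discount factor γ, are:

$$V^{\Pi}(s) = E_{\Pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \,\Big|\, s_t = s\right], \qquad
Q^{\Pi}(s,a) = E_{\Pi}\!\left[\sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \,\Big|\, s_t = s,\ a_t = a\right]$$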


It should be noted that rewards must be granted in such a way that the agent satisfies our goal by maximising reward, not by learning how to satisfy only the reward itself. For instance, in the example above, a score of +1 is granted only when the best resource is selected for processing.

4. Proposed approach to improve resource retrieving

In this paper it is assumed that, initially, there is no exact information about the grid network environment (about the existing nodes); nevertheless, resource allocation becomes more optimal over time by learning patterns in the grid network, recognising the state space and the agents, and gathering resource information from the grid. There are three general decision states for allocating a resource to an agent:

• The resource has suitable processing power and execution memory, as well as suitable bandwidth.

• The resource does not have suitable processing power and execution memory, but the bandwidth is suitable (note that two of the resource's components are not judged adequate for processing the information needed to keep the network processing rate stable, and this is not recorded in the RLDU).

• The resource has suitable processor power and execution memory, but the allotted bandwidth is not suitable (note that such a resource is considered appropriate for small processing tasks, and the resource-allocation unit assigns small tasks to it).

The state space is therefore S = {Good Node, Bad Node, Normal Node}, and the action space is A = {assign the process to this resource, do not assign the process}. To optimise resource allocation in a grid network, resource selection affects the efficiency of the whole network and of resource management; the selector uses the three general states above to choose the best resource in the state space S. In reinforcement learning, the agent's goal is expressed as a reward signal perceived from the environment; at every time step this reward is a simple number r_t ∈ R. Simply put, the agent's goal is to maximise the total of these rewards, and this maximisation should be done over the long term rather than at every single step. To apply this method to resource allocation, a unit is added to the grid network layout that rewards or punishes the resource-allocation unit based on two factors: (1) the processing time allotted to the resource, and (2) the information about the resource recorded in workload management. Figure 2 shows the two new units added to the grid layout. In this figure, two different units are added to the grid protocol; together they make the resource-allocation decision for each process. Reinforcement learning decision unit (RLDU):

Figure 2. Adding reward and punishment units to the grid model

This unit models the reinforcement-learning core, based on a Markov model and dynamic programming, and it interacts dynamically with three other layers (scheduler, broker, GRAM). It decides whether to allot a resource according to the processing time of previous mappings and the information recorded in the unit above, such as processing rate and data-transfer bandwidth. The RLDU announces its proposed resources (its output) to the grid resource manager (GRAM). The reward/punishment (RE/PE) unit grants a score of +1 to the RLDU for a suitable resource allocation and −1 otherwise.

4.1 Resource classification
Information is saved in the RLDU for every mapping of a resource; it includes the processor power, the bandwidth, the score granted by the reward/punishment unit, and the resource's class under the three conditions above, recorded before the resource is allocated and used at search time for comparing resources. If no suitable resource exists for a process, the request is queued until a suitable resource is found. The main reason for this strategy is to keep resource allocation optimal and profitable: if a request is delivered to an inappropriate resource, it will probably not be completed on time, and the resource-allocation and planning operations must then be repeated, which is a disadvantage. In this method the score is granted to resources, not to requests, and no score is granted to the resources in the RLDU until resource allocation and request planning have been completed. To avoid an excessive growth of information in the RLDU, a selection function can be used to choose a batch of candidate resources; for example, we separate 20 resources that match the request from the bank and then search among them for the optimal resource, so that the search time is reduced. After some time, as information is collected across the network, the resources are classified into the three classes of the classification space defined above. The dynamic nature of the grid must also be considered, since any system's resource may go offline unexpectedly. In that case the corresponding node must be deleted from the classification, and the RE/PE unit grants a score of −2 because the resource did not respond to the RLDU. This process removes the resource; however, the information about resources


is updated permanently by the broker, so if a deleted resource rejoins the network it will be classified again. The RE/PE unit also grants a score of −2 whenever resource allocation and planning have been completed and the request has been sent to the resource, but the resource leaves the network before the request is delivered.
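A toy sketch (not from the paper) of the scoring and classification behaviour described above; the class thresholds, resource identifiers, and the greedy selection rule are illustrative assumptions.

```python
class RLDU:
    """Toy sketch of the reinforcement-learning decision unit described above."""

    def __init__(self):
        self.scores = {}   # resource id -> cumulative reward/punishment score

    def reward(self, res_id, appropriate):
        # +1 for a suitable allocation, -1 otherwise (granted by the RE/PE unit)
        self.scores[res_id] = self.scores.get(res_id, 0) + (1 if appropriate else -1)

    def punish_unavailable(self, res_id):
        # -2 when the resource left the grid before the request was delivered
        self.scores[res_id] = self.scores.get(res_id, 0) - 2

    def classify(self, res_id, good_thr=2, bad_thr=-2):
        # thresholds are illustrative; the paper only names the three classes
        s = self.scores.get(res_id, 0)
        if s >= good_thr:
            return "Good Node"
        if s <= bad_thr:
            return "Bad Node"
        return "Normal Node"

    def select_resource(self, candidates):
        # prefer the highest-scoring candidate among a pre-filtered batch
        return max(candidates, key=lambda r: self.scores.get(r, 0))

# Usage
rldu = RLDU()
rldu.reward("node-7", appropriate=True)
rldu.reward("node-7", appropriate=True)
rldu.punish_unavailable("node-3")
print(rldu.classify("node-7"), rldu.classify("node-3"))
print(rldu.select_resource(["node-3", "node-7", "node-9"]))
```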

5. Experimental results and evaluation
The simulation results show that the proposed approach matches a distributed network with grid architecture and that its efficiency is stable. Compared with other search and allocation methods, our approach has higher efficiency and optimality. In addition, because it uses an intelligent agent that interacts with the other units, the proposed approach adapts to a dynamic, heterogeneous grid network. Using a limiting search function, we can reduce the rate at which the information recorded in the RLDU must be searched; to achieve this, a search strategy can combine different resource-selection methods, optimising search time and resource selection simultaneously. Figure 3 shows several search methods over the search space, using the reinforcement learning proposed in our model, compared with the others.

Figure 3. Optimization of resource allocation based on reinforcement learning.

As shown, the search time is reduced compared with the existing methods. The improvement is evaluated in terms of time and of selecting the optimal resource for a request. Figure 4 compares the search methods with the proposed approach in a distributed grid network; our approach is optimal with respect to search time and cost, improving the results by about 15%, which is significant in both time and cost.

Figure 4. Comparison of search methods with the presented approach

6. Conclusion
In this paper we proposed an approach for retrieving resources and verified its efficiency in order to evaluate resource-retrieving methods in grid networks. In general we conclude that:
• The efficiency of the presented work becomes more optimal with the passing of time and growing resource recognition.
• When the network changes significantly, with many nodes joining and leaving, the efficiency of our approach decreases.

In future work we intend to improve the efficiency of the approach and to present further resource-search methods for the network, using a genetic algorithm and reinforcement learning at the same time.

References
[1] B. Jacob, L. Ferreira, N. Bieberstein, C. Glizean, J. Girars, R. Strachowski, and S. Yu, Enabling Applications for Grid Computing with Globus, IBM Red Book Series, 3P, 2003.

[2] I. Foster, C. Kesselman, and S. Tuecke, "The Anatomy of the Grid: Enabling Scalable Virtual Organizations", International Journal of High Performance Computing Applications, XXX (3), pp. 200-222, 2001.

[3] A. Abbas, "Grid Computing, A Practical Guide to Technology and Applications", Charles River Media, 2004.

[4] M. D. Dikaiakos,” Grid benchmarking: vision, challenges, and current status”, Concurrency and Computation: Practice & Experience, 19 (1), pp. 89-105, Jan 2007.

[5] ”How does the Grid work?”, Available: http://gridcafe.web.cern.ch/gridcafe/openday/How-works.html.

[6] R. Buyya, S. Venugopal, “A Gentle Introduction to Grid Computing and Technologies”, CSI Communications, Computer, Science of India,July 2005.

[7] "What is the difference between the Internet, the Web and the Grid?", Available: http://gridcafe.web.cern.ch/ gridcafe/openday/web-grid.html.

[8] “Job life-cycle”, Available: http://gridcafe.web.cern.ch/ gridcafe/openday/New-JobCycle.html.

[9] David Johnson, "Desktop Grids", Entropia.

[10] Rajkumar Buyya, "Grid Technologies and Resource Management Systems", PhD thesis, School of Computer Science and Software Engineering, Monash University, Melbourne, Australia, April 12, 2002.

[11] “Sun Powers the Grid, An Overview of Grid Computing”, Sun Microsystems, 2001.

[12] I. Foster, “What is the Grid? A Three Point Checklist”, Argonne National Laboratory & University of Chicago, July 20, 2002.

[13] "Reinforcement Learning: An Introduction", Richard S.Sutton and Andrew G.Barto, Cambridge, MA, 1998, MIT Press.


Node Realization: Proposed techniques of random walk, statistical and clustering

Abstract: This paper deals with positional node detection on the basis of random walk motion. The concept of curve fitting is applied to random walk observation in n dimensions. The paper also points out research issues of node inspection using statistical approaches, and nodal communication strategies in cluster-based network structures are studied, with graphical representations and justification.

Keywords: node, random walk, curve fitting, statistical approaches, cluster

1. Introduction
The random walk model dates back to the irregular motion of individual pollen grains observed experimentally by Robert Brown (1828) [1], [2], [3], now called Brownian motion. Brownian motion, which in some systems can also be described as noise, is a continuous motion of small particles. The striking features of this concept are:

• The motion is very irregular, composed of translations and rotations, and the trajectory of the motion appears to have no tangent.

• Two particles appear to move independently, even when they approach one another, within a distance less than their diameter.

• The smaller the particle, the more active the motion.

• The composition and density of the particles have no effect.

• The less viscous the fluid, the more active the motion.

• The higher the temperature, the more active the motion.

• The motion never ceases.

In a simple random walk model for a node moving in a mobile ad hoc network, the following assumptions are made:

• There is a starting point.
• The distance from one point in the path to the next is constant.
• The direction from one point in the path to the next is chosen at random, and no direction is more probable than another.

Consider a node moving on an infinite one-dimensional uniform line. Let the node start its journey at the origin (x = 0); it then moves a small distance δ either left or right in a short time τ. The motion is completely random, and the probabilities of moving left and right are both 1/2. After one step the node is at distance δ to the left or right of the origin. At the next time step the node will be at a position 2δ to the left or right of the origin, each with probability 1/4, or it may return to its original position. Under these assumptions, the probability that the node is at distance mδ to the right or left of the origin after n time steps follows a binomial distribution with mean 0 and variance n. The mean location in one dimension is zero and the mean squared displacement equals 2Dt. We therefore conclude that, in the absence of a directional bias, there is no overall movement in any direction. Graphical representations of random walk motion in n dimensions [4], i.e. in 1D, 2D, and 3D, are shown in Figures 1, 2, and 3.
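A short simulation, not part of the original paper, that checks the two properties stated above: the mean displacement stays near zero and the mean squared displacement grows like the number of steps n (for unit step size).

```python
import random

def random_walk_1d(n_steps, delta=1.0):
    """Simulate one unbiased 1-D random walk of n_steps steps of size delta."""
    x = 0.0
    for _ in range(n_steps):
        x += delta if random.random() < 0.5 else -delta
    return x

n_steps, n_walks = 1000, 5000
finals = [random_walk_1d(n_steps) for _ in range(n_walks)]
mean = sum(finals) / n_walks
msd = sum(x * x for x in finals) / n_walks
print(f"mean displacement ~ {mean:.2f} (expected 0)")
print(f"mean squared displacement ~ {msd:.1f} (expected ~ n = {n_steps})")
```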

Figure 1. Random Walk Motion in 1D

A.Kumar1, P.Chakrabarti2 and P.Saini3

1Dept. of Comp. Sci. & Engg., Sir Padampat Singhania University, Udaipur-313601,Rajasthan, India

[email protected]

2Dept. of Comp. Sci. & Engg., Sir Padampat Singhania University, Udaipur-313601,Rajasthan, India

[email protected]

3Dept. of Comp. Sci. & Engg., Sir Padampat Singhania University, Udaipur-313601,Rajasthan, India

[email protected]


Figure 2. Random Walk Motion in 2D

Figure 3. Random Walk Motion in 3D

2. Proposed Node Realization using Curve Fitting
Since mobile nodes are spread throughout the network and every node has an independent coordinate position, a mathematical procedure for finding the best-fitting curve to a given set of points is to minimize the sum of the squares of the offsets of the points from the curve. We need to derive an approximate function that fits the data, i.e. the curve that minimizes the discrepancy between the data points and the curve; let us therefore first quantify this discrepancy by fitting a straight line to a set of paired observations. The mathematical expression of the straight line is

$$y = a_0 + a_1 x + e \qquad (1)$$

where a_0 and a_1 are the intercept and slope of the line, and e is the error between the model and the observation:

$$e = y - a_0 - a_1 x \qquad (2)$$

The best-fit line through the data minimizes the sum of the squared residual errors over all the data points,

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}\bigl(y_i - a_0 - a_1 x_i\bigr)^2 \qquad (3)$$

Since the nodes need not lie on a straight line (by default they are scattered throughout the network), the least-squares procedure is extended to fit a higher-order polynomial; this is called polynomial regression, e.g. a second-order (quadratic) polynomial

$$y = a_0 + a_1 x + a_2 x^2 + e \qquad (4)$$

Here a_0, a_1, and a_2 are the coefficients, x and y are the coordinate values of the mobile node, and e is the error value. The sum of squared residuals is

$$S_r = \sum_{i=1}^{n}\bigl(y_i - a_0 - a_1 x_i - a_2 x_i^2\bigr)^2 \qquad (5)$$

Taking the derivative of this expression with respect to each coefficient, we get

$$\frac{\partial S_r}{\partial a_0} = -2\sum_{i}\bigl(y_i - a_0 - a_1 x_i - a_2 x_i^2\bigr) \qquad (6)$$

$$\frac{\partial S_r}{\partial a_1} = -2\sum_{i} x_i\bigl(y_i - a_0 - a_1 x_i - a_2 x_i^2\bigr) \qquad (7)$$

$$\frac{\partial S_r}{\partial a_2} = -2\sum_{i} x_i^2\bigl(y_i - a_0 - a_1 x_i - a_2 x_i^2\bigr) \qquad (8)$$

Setting these derivatives to zero and rearranging, we obtain the normal equations

$$n\,a_0 + a_1\sum_i x_i + a_2\sum_i x_i^2 = \sum_i y_i \qquad (9)$$

$$a_0\sum_i x_i + a_1\sum_i x_i^2 + a_2\sum_i x_i^3 = \sum_i x_i y_i \qquad (10)$$

$$a_0\sum_i x_i^2 + a_1\sum_i x_i^3 + a_2\sum_i x_i^4 = \sum_i x_i^2 y_i \qquad (11)$$

In general, for an mth-order polynomial, we can write

$$y = a_0 + a_1 x + a_2 x^2 + \dots + a_m x^m + e \qquad (12)$$

$$s_{y/x} = \sqrt{\frac{S_r}{n - (m + 1)}} \qquad (13)$$

Here s_{y/x} is the standard error of the estimate; the subscript y/x indicates that the error concerns the predicted value of y corresponding to a value of x. The quadratic fitting curve (least-squares parabola) is therefore used for the node data. In Figure 4 we observe that the nodes are scattered throughout the network; the least-squares line, as shown in Figure 5, fails to cover the plotted data, so we use the least-squares parabola, as shown in Figure 6, to fit the data of the nodes scattered in the network.
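As an illustration (not part of the original paper), the quadratic fit of equation (4) can be computed by solving the least-squares problem directly; numpy's lstsq is used here in place of explicitly forming the normal equations (9)–(11), and the sample coordinates are made up.

```python
import numpy as np

# Illustrative node coordinates scattered in the network (made up for the example)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

# Design matrix for y = a0 + a1*x + a2*x^2 and least-squares solution
A = np.vstack([np.ones_like(x), x, x**2]).T
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)      # [a0, a1, a2]

residuals = y - A @ coeffs
Sr = float(residuals @ residuals)                    # eq. (5)
s_yx = np.sqrt(Sr / (len(x) - 3))                    # standard error of estimate, eq. (13) with m = 2
print("a0, a1, a2 =", coeffs)
print("standard error s_y/x =", s_yx)
```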

Figure 4. Data points plotted and failure of the least Square fit line to cover all the points.

Figure 5. Least square parabola

Figure 6. Parabola covering all the points scattered


3. Statistical Relation Based Node Sensation
3.1 Quadrant Based Classification
We consider the nodal positions at the time instants t1, t2, …, tn to be (x1, y1), (x2, y2), …, (xn, yn) respectively. We propose a technique of selecting a central point and then realizing the existence of nodes in virtual quadrants. The density of each quadrant is estimated, and the densest one is selected as the area of observation.
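A minimal sketch of the quadrant-based selection just described; the node positions are made up, and the use of the centroid of the observed positions as the central point is an assumption for illustration (the text below also considers the origin as the centre).

```python
from collections import Counter

def densest_quadrant(positions):
    """Pick a central point, assign each node to a virtual quadrant, and return the densest one."""
    cx = sum(x for x, _ in positions) / len(positions)   # central point: centroid (assumption)
    cy = sum(y for _, y in positions) / len(positions)

    def quadrant(p):
        x, y = p
        if x >= cx and y >= cy:
            return "I"
        if x < cx and y >= cy:
            return "II"
        if x < cx and y < cy:
            return "III"
        return "IV"

    counts = Counter(quadrant(p) for p in positions)
    return counts.most_common(1)[0], counts

# Nodal positions observed at instants t1..tn (illustrative)
nodes = [(1.2, 3.4), (2.5, 0.7), (-1.0, 2.2), (3.3, 4.1), (0.5, -1.5), (2.9, 3.8)]
print(densest_quadrant(nodes))
```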

Issue 1: If the nodes are more or less equally distributed among the quadrants, the origin (0, 0) is taken as the centre and, based on that, an imaginary circle of radius r is drawn.

3.2 Relation Based Classification

Issue 2: Node tracing can be facilitated if there exists a relation between the coordinates of the nodes [5].
Justification: We assume the observed nodes are N1(x1, y1), N2(x2, y2), …, Nn(xn, yn), such that for each node yi = c·xi (1 ≤ i ≤ n), where c is a constant. The representation is shown in Table 1:

Table 1: Nodes and respective co-ordinate representation

Nodes    xi     yi = c·xi
N1       x1     y1 = c·x1
N2       x2     y2 = c·x2
…        …      …
Nn       xn     yn = c·xn

Multiplying the y-coordinates,

$$y_1 y_2 \cdots y_n = (c\,x_1)(c\,x_2)\cdots(c\,x_n) = c^{\,n}\,(x_1 x_2 \cdots x_n)$$

Therefore

$$\bigl(y_1 y_2 \cdots y_n\bigr)^{1/n} = c\,\bigl(x_1 x_2 \cdots x_n\bigr)^{1/n}$$

so the geometric mean of the y-coordinates is equal to the product of the constant and the geometric mean of the x-coordinates.

4. Node Realization in Cluster Networks
This section focuses on the nodal communication between the farthest nodes in an N×N structure. Let us assume each cluster consists of 16 nodes and then try to communicate between the source and the destination node. To establish a communication link between adjacent elements (units) of the cluster, the communication must proceed in exactly reverse order in the two adjacent elements. The order of the communication is

The condition can be visually imagined as follows:

Now let us consider the case of only one element, i.e. 1×1. In this case, if we want to communicate between the farthest nodes, there is only one node between the source and the destination, which can be further visualized as follows:

If we denote the number of intermediate nodes by the function f(x), then f(x) = 1, and the intermediate node is II. Next, for a 2×2 matrix, f(x) = 1 + 2 = 3; the intermediate nodes are 1(2,3), 2(4,3).

For a 3×3 matrix, f(x) = 1 + 2 + 2 = 5,

and similarly for a 4×4 matrix, f(x) = 1 + 2 + 2 + 2 = 7. So far there were only 4 nodes in each ring element. Suppose instead that there are 8 nodes in the ring; we must then compute the number of nodes required to establish the connection between the farthest nodes.
Justification: For a 1×1 matrix, communicating between the farthest nodes requires 3 nodes, i.e. f(x) = 3. For a 2×2 matrix it requires 7 nodes, i.e. f(x) = 3 + 4; for a 3×3 matrix 11 nodes, i.e. f(x) = 3 + 4 + 4; and for a 4×4 matrix 15 nodes, i.e. f(x) = 3 + 4 + 4 + 4. With 16 nodes per element in the ring we proceed similarly: for a 1×1 matrix, communicating between the farthest nodes requires 7 nodes, i.e. f(x) = 7; for a 2×2 matrix 15 nodes, i.e. f(x) = 7 + 8; for a 3×3 matrix we



need 23 nodes, i.e. f(x) = 7 + 8 + 8; and for a 4×4 matrix 31 nodes, i.e. f(x) = 7 + 8 + 8 + 8. In general, the total number of intermediate nodes is given by (N/2 − 1) + (M − 1)·(N/2), where N is the number of nodes in a unit (element) and M is the dimension of the square matrix. The data are represented in tabular form in Table 2.

Table 2: Number of intermediate nodes corresponding to the matrix dimension

No. of nodes    1×1    2×2    3×3    4×4
4                1      3      5      7
8                3      7     11     15
16               7     15     23     31
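The general formula can be checked against Table 2 with a few lines of Python (illustrative, not part of the original paper):

```python
def intermediate_nodes(N, M):
    """Number of intermediate nodes between the farthest nodes: (N/2 - 1) + (M - 1) * (N/2)."""
    return (N // 2 - 1) + (M - 1) * (N // 2)

for N in (4, 8, 16):
    row = [intermediate_nodes(N, M) for M in (1, 2, 3, 4)]
    print(N, row)
# prints 4 [1, 3, 5, 7], 8 [3, 7, 11, 15], 16 [7, 15, 23, 31] -- matching Table 2
```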

The graphical representation of nodal communication in a cluster is shown in Figure 7, where the x-axis represents the M×M matrix with M varying from 1 to 3, and the y-axis represents the optimum number of communication nodes required to establish the path between the source node and the farthest node. The number of nodes per element is indicated by the three colours.

Figure 7. Graphical representation of nodal communication in a cluster

5. Conclusion
The paper deals with random-walk-based node sensation. It has been shown how the random walk representation varies with dimension. The paper also presents proposed schemes of node realization on the basis of curve fitting and statistical approaches. Nodal communication between the farthest nodes in a cluster-based network structure has also been studied, with the relevant graphical representation.

References
[1] Edward A. Codling, Michael J. Plank and Simon

Benhamou, "Random walk models in biology", published online in Journal of The Royal Society Interface, 15 April 2008.

[2] M. I. MohdSaad, Z. A. Zukarnain, “ Performance Analysis of Random-Based Mobility Models in MANET Routing Protocol” published in European Journal of Scientific Research, Vol. 32 No 4(2009), pp 444-454.

[3] Christian Bettstetter, “Mobility modeling in wireless networks: categorization smooth movement and border effects” published in ACM SIGMOBILE mobile computing and Communication review 2001.

[4] A. Kumar, P. Chakrabarti, P. Saini, "Approach towards analyzing motion of mobile nodes - A survey and graphical representation", International Journal of Computer Science and Information Security (IJCSIS), USA, Vol. 8, No. 1, 2010, pp. 250-253.

[5] P. K. Giri, J. Banerjee, Introduction to Statistics, Academic Publishers, 2001.


Adoption of Business Continuity Planning in IT Services. Case Study: MyLinE

Parvaneh Sarshar1 and Mohd Zaidi Abd Rozan2

Universiti Teknologi Malaysia, Faculty of Computer Science and Information System,

[email protected], [email protected]

Abstract: The importance of information technology (IT) in every part of any organization is undeniable; however, the systems built on it face threats, hazards, and risks. One of the most suitable methods for preventing or recovering from such risks is Business Continuity Planning (BCP). BCP was first introduced in IT departments, but since IT has become widespread it is now applied beyond the IT sector. The case study, MyLinE, was chosen because BCP is not utilized in that organization, which was not capable of mitigating or even identifying existing vulnerabilities and threats. This research was therefore conducted to investigate the threats faced by MyLinE, the vulnerabilities exploited by those threats and, significantly, the impact of the incidents they may cause. By adopting BCP in MyLinE, the levels of threats and vulnerabilities were assessed and mitigation strategies were delivered to help MyLinE reduce the risk level. In addition, the Business Impact Analysis (BIA) that was conducted illustrated the priority of mitigation based on the impact of each incident on the stakeholders. Finally, this paper documents the results of the risk assessment and BIA in MyLinE; the researchers intend to arrive at a comprehensive plan that can be applied to all IT services.

Keywords: Business Continuity Planning, risk assessment, BIA, IT services

1. Introduction
IT services have faced unpleasant incidents and disasters from the moment of their birth, and there have always been many attempts to overcome them. There are two types of incidents: premises-based incidents, such as power outages, fire, or flood, and service-based incidents, such as failures of e-mail, venue facilities, or network services. Specialists have always tried to defeat these threats to protect IT services, and a great number of solutions have been continuously provided. One of the most powerful remedies used to overcome and recover from the likely risks and their impacts on the business before, during, and after an incident or a disaster is Business Continuity Planning (BCP), which is studied in this research. BCP is an essential tool used in many businesses, and the need for this plan in all organizations, especially for their IT services, is undeniable. Over the past ten years, concepts such as computer-based training and computer-based assessment have been introduced to the IT world, especially among academic organizations; today those terms are represented by "e-learning". Factors such as flexibility, greater collaboration, convenience, portability, lower cost, and higher retention are benefits of e-learning that have made it very popular all over the world. There are also many types of risk and challenge associated with e-learning, such as hacking, fire, Internet-infrastructure outages, communication-infrastructure outages, and so on. The case study examined here is an online resource for learning English called MyLinE. Since MyLinE, as an IT service and a form of e-learning, may be confronted with any of the risks mentioned, and there is no suitable plan or strategy in the current service, organized procedures are needed that enable the organization to recover from a disaster that may interrupt the business, especially its critical parts. Since MyLinE is used by 20 universities and institutes all over Malaysia, interruption of the system is not an option the organization can afford.

2. Literature Review
The British Standards Institution (BSI) [1] defines BCP as a methodology used to develop a plan to maintain or restore business operations within the required time scales following interruption to, or failure of, critical business processes. In addition, [2] states that BCP is a documented collection of procedures and information that is developed, compiled, and maintained in readiness for use in an incident, to enable an organization to continue to deliver its critical activities at an acceptable predefined level. Like any other plan, BCP has objectives and goals when adopted by an organization, including:
• avoiding financial ruin

• maintaining market share

• minimizing negative publicity

• identifying hazards that may affect critical functions or activities

Based on [9] the overall goals of a business continuity plan are to ensure customers, trading partners and regulatory agencies maintain confidence in the business and to resume business as usual for employees as soon as possible. Fulmer [4] says that the most common reasons for neglecting BCP are:

• lack of time and resources

• lack of top management support

• lack of money


• too many causes of disasters to plan for effectively

• little awareness of potential hazards

• lack of knowledge in developing a plan

• lack of a sense of urgency

Although in today's environment, where technology reaches into every corner of almost every organization, business continuity planning has become imperative, it unfortunately falls very low on a long list of IT priorities [5]. The distinction between BCP and BCM (Business Continuity Management) is that BCP refers to Business Continuity Planning (a process) or the Business Continuity Plan (the documentation), and before an organization can develop a BCM programme it should have BCP in place. BCM is inclusive of BCP activities as well as the on-going activities [14]. Figure 1 shows this difference.

Figure 1. Differences between BCP and BCM [14] Since understanding the differences between contingency plans such as BCP, DRP (Disaster Recovery Planning) and IRP (Incident Recovery Planning) has been a controversial issue, Figure 2, taken from [13], shows these differences and the contingency planning steps so that they can be understood by everyone.

Figure 2. Contingency Plan steps

3. Framework A wide variety of frameworks and models are available for business continuity planning (BCP) [10], [11], [7] & [8]. Except for the project initiation stage of BCP development, these models are not exactly the same in the other stages [6]. For this research, since there is no single framework that suits all organizations, and since BCP frameworks vary based on:

• the objectives,

• the type of products and services that the company is delivering,

• the size of the organization,

the researcher decided to come up with a combination of frameworks that best fits MyLinE, which is an IT service. This modified framework is a combination and modification of three different frameworks retrieved from the literature review. It has five phases and is cyclic, which means that BCP never ends; the process continues and is continually updated.

Figure 3. Modified framework

3.1 Phase One: Project foundation Project foundation or project initiation is the very first phase of developing a Business Continuity Plan. The most important factor for starting a Business Continuity Plan is having senior management support. To kick off the project, these steps are critical [12]:

• establish a business continuity working group and give it specific objectives,

• empower the group by including key business and technical stakeholders who have the decision-making authority to make it happen.

This phase of the framework has been derived from MAMPU (Malaysia Administrative Modernization and Management Planning Unit) standard for BCM [15]. It consists of five sub-steps:

3.1.1. Purpose In the very first step of each plan or project, the purpose of the project should be mentioned.

3.1.2. Objectives In this sub-process the objectives of developing a BCP and Implementation of a suitable framework must be covered.

3.1.3. Scope The scope of the plan should be defined.

3.1.4. BCM team structure The members of the plan who are going to be engaged before, during and after a disaster should be identified.

3.1.5. Roles and responsibilities


All the responsibilities should be well defined and assigned to the corresponding employees. Training may be provided where needed.

3.2 Phase Two: Business Assessment This phase has been retrieved from British Standard [16] and it consists of two very prominent sub-processes:

3.2.1. Risk Assessment Risk Assessment is an evaluation of the exposures present in an organization’s external and internal environment. It identifies whether or not the facility housing the organization is susceptible to floods, hurricanes, tornadoes, hack, sabotage, etc. It then documents what mitigating steps have been taken to address these threats [3]. Based on [16], Risk assessment consists of risk analysis and risk evaluation. • Risk analysis should include:

1. identification of assets

2. valuation of the identified assets

3. identification of significant threats and vulnerabilities for the identified assets

4. assessment of the likelihood of the threats and vulnerabilities to occur

• Risk evaluation includes: 1. calculation and evaluation of risks based on a predefined risk scale

3.2.2. Business Impact Analysis A BIA is an assessment of an organization’s business functions to develop an understanding of their criticality, recovery time objectives, and resource needs. By going through a Business Impact Analysis, the organization will gain a common understanding of functions that are critical to its survival. It will enable the client to achieve more effective planning at a lower cost by focusing on essential corporate functions [3]. The business impact analysis is an evaluation of the effects of extended outages on the ability to continue mission critical business functions. An analysis is business impact driven, and is both qualitative and quantitative. A business impact analysis should measure impacts on business elements including, financial, operations, customers, regulatory compliance and long-term obligations [17].

3.3 Phase Three: Strategy selection This phase is about selecting strategies to mitigate the risks and vulnerabilities. One of its most important objectives is to decrease the total cost of the impact and of the chosen solution. This phase is derived from the framework proposed by J.C. Barnes [3].

3.4 Phase Four: Plan development In this step, the Business Continuity Plan for the case study is delivered. This phase is also taken from the framework proposed by J.C. Barnes [3]. Since, when a disaster happens, only companies with a strong BCP can survive, the phase of developing a comprehensive plan is very important. The objectives of a plan are to get the organization back into business as soon as possible and to keep the extraordinary expenses to a minimum. Two important steps in developing a plan are recovery team notification and documentation. In BCP, responsibilities are assigned to each member of the team, and the documentation of the plan should be clear and user friendly so that the members can understand their duties quickly and no time is wasted.

3.5 Phase Five: Testing and maintenance In this final step, which has also been derived from the framework proposed by J.C. Barnes [3], once the plan is completed and approved by senior management, it needs to be tested to make sure that it works well. Testing is an important step, since it shows the planners and the team that the plan is accurate, but some organizations ignore or neglect it. In organizations with an exhaustive plan, testing is done every six months to once a year. After making sure that the plan works, it should be maintained. Most BCPs that are written are not maintained; within a year or less the plan becomes useless because staff have changed, vendors are different, and the resources required to get the product out the door have evolved. By maintaining the plan on a regular basis, the organization avoids the time required to create a plan from scratch and is prepared whenever a disaster strikes.

4. Application and findings Before proceeding with any analysis, it is very important to understand the case study, which is MyLinE at UTM. Interviews have been conducted with the MyLinE manager and the MyLinE administrator. The goal of the interviews was to find out the current situation of MyLinE towards risk, what has been done so far to prevent or resolve a disaster when it occurs, and how risky the system being housed is. From these interviews, the following results were obtained:

• It is an online self-access resource for learning English to enhance English language communication skills among students at tertiary level.

• The goal of making it a self-access learning resource is that it persuades students to be responsible for their learning.

• MyLinE has lots of activities and learning and teaching resources to help the students and lecturers to improve their English proficiencies.

• Currently MyLinE has over 200,000 users in 20 universities and institutes all over Malaysia.

A large number of users would be affected and the consequences would be disastrous: a disruption would definitely have a severe impact on the reputation of MyLinE and UTM. In addition, the threats are most likely to arise from technical problems and usually result in outages of more than 4 hours but less than 24 hours, although they may repeat frequently within 30 days. Besides, when a business disruption happens it causes delays and missed deliverables and it will affect


all 20 universities, which shows that the risk level and vulnerability in MyLinE are definitely high. The BCP framework illustrated in Figure 3 has been applied to this case study. After developing the project foundation in the first step, two types of questionnaires were conducted: a threats and vulnerabilities questionnaire and a BIA questionnaire. For identifying the threats and evaluating the identified threats and vulnerabilities, a questionnaire with two phases was needed. Through factor analysis, the number of factors obtained from the questionnaire was reduced. For identifying the impact of the risks, a second questionnaire for BIA, in four different versions for four different kinds of stakeholders, was developed. Based on this questionnaire, the respondents were asked to rank the impact that these risks could have on them if they occurred on the assets of MyLinE, using the following scale: 1 → almost no impact; 2 → moderate impact; 3 → significant impact. The results of the questionnaires and the analysis of the data are shown in Appendix 1. In step three, for the threats and vulnerabilities that were threatening MyLinE, strategies were required that could mitigate these threats and the resulting impact of the risks on the system and the stakeholders. Based on the vulnerabilities of MyLinE, the researcher came up with these strategies, and it is hoped that they can be useful in helping MyLinE prevent or overcome disasters. In phase four, a comprehensive business continuity plan was developed and submitted to the MyLinE manager, and the plan was tested in phase five by the MyLinE employees; the required changes and strategies are being applied, some plan exercise programmes have been established and training for employees was considered.

5. Conclusion In this paper, several achievements have been obtained. From the interviews, the current situation of MyLinE with respect to disasters and business continuity planning has been defined. From the literature review, the assets, threats and vulnerabilities that may threaten MyLinE have been identified. Subsequently, via questionnaires, the valuation of assets, threats and vulnerabilities and the risk assessment have been conducted. From the questionnaires, a business impact analysis (BIA) has been delivered and, finally, some useful mitigation strategies have been proposed by the researcher. Since BCP is critical for all organizations, especially for those holding important information and data such as MyLinE, the researcher highly recommends that the MyLinE unit take BCP seriously, because the importance of having BCP in any organization has been demonstrated. Another recommendation is to consider the mitigation strategies that the researcher has suggested, and to try to adopt them based on their importance, priorities, budget and alignment with the mission, vision and goals of MyLinE. Finally, MyLinE staff should not neglect to update the existing BCP and test it regularly, so that it remains applicable; if a change occurs in the system, the BCP can be updated easily, and this avoids losing the existing BCP and having to create a new one, which is very costly for any organization.

References
[1] BSI, Information technology – Code of practice for information security management, BS ISO/IEC 17799:2000, BSI, pp. 56-60, 2001.

[2] BSI, Business continuity management –Part 2: Specification, BS 25999-2, 2007

[3] J.C.Barnes, 'A Guide to Business Continuity Planning', John Wiley & Sons, Chichester, UK, 2001.

[4] Kenneth L. Fulmer, 'Business Continuity Planning: A Step-by-Step Guide with Planning Forms', Rothstein Associates, Third Edition 2005.

[5] Susan Snedaker, ' Business Continuity and Disaster Recovery Planning for IT Professionals', Burlington, MA. Syngress Publishing, Inc., 2007.

[6] Roberta J.Witty, 'Research Roundup: Business Continuity Management and IT Disaster Recovery', Gartner, January 2009

[7] Pitt, M.and Goyal, S. (2004), “Business continuity planning as a facilities management tool”, Facilities, Vol. 22, No. 3/4, 2004, pp 87-99.

[8] BCPG, (1998), PACE - Business Continuity Planning Guide (BCPG), Office of Government Commerce (OGC), London, UK, May 1998.

[9] Jim Hoffer, 'Backing Up Business - Industry Trend or Event', Health Management Technology, Jan, 2001

[10] Elliott, D. et al. (2002), Business continuity management-a crisis management approach, Routledge, 2002

[11] Savage, M. (2002), “Business continuity planning”, Work study, Vol. 51, No. 5, 2002, pp 254-261.

[12] Wing Lam, 'Ensuring business continuity', 2002.
[13] Michael E. Whitman and Herbert J. Mattord, 'Management of Information Security', Course Technology - Cengage Learning, 2008.

[14] http://www.bcprm.com/demo/bcm/htmlhelp/ProjectManagement.htm, [online] (Retrieved on 10/12/2009)

[15] http://gcert.mampu.gov.my/doc, [online] (Retrieved on 15/04/2010)

[16] BSI , Information security management systems – Part 3: Guidelines for information security risk management, BS 7799-3: 2006

[17] Robert McDonald, 'New Considerations for Security Compliance, Reliability and Business Continuity', 2008.

Author’s Profile Parvaneh Sarshar received her B.S. degree in Computer Engineering from Azad University of Lahijan in 2008 and her M.S. degree in IT Management from Universiti Teknologi Malaysia (UTM) in 2010. She is now doing research on the impact of social networks on different concepts and on new ideas in BCP.


Mohd Zaidi Abd Rozan (Dr.) received his B.Sc. (Hons.) in Physics & Comp w. Ed., and M.Sc. IT from Universiti Teknologi Malaysia (UTM), Malaysia. He received a Doctorate of Engineering (D. Eng) in Information Science & Control Engineering from Nagaoka University of

Technology, Japan. He is also a PRINCE2 Certified & Registered Project Management Practitioner. Currently, he is the Head Department of Information Systems, Faculty of Computer Science & Information Systems, Universiti Teknologi Malaysia (UTM), and also the UTM MSc IT-Entrepreneurship (SKIT) Programme Coordinator. He is the Founder and Leader of PRIMELAB (Project Innovation Management & tEchnoentrepreneurship). His research interests are IT Project Management, Technopreneurship, Disaster Management, Profiling and Data Mining utilizing Multivariate Approach. He holds a Radio Amateur Licence, with callsign 9W2DZD. Appendix 1

Table 1: Risk Assessment and BIA


Denoising of Magnetic Resonance Images using Wavelets- A Comparative Study

S Satheesh1 Dr.KVSVR Prasad2 P.Vasuda3

1Asst. Prof., Dept. of ECE, G Narayanamma Institute of Technology and Science, Hyderabad, India

[email protected] 2Prof. & Head Dept. of ECE, D.M.S.S.V.H. College of Engineering, Machilipatnam, India

3Asst.prof., Dept. of ECE, G Narayanamma Institute of Technology and Science, Hyderabad, India

Abstract: Image denoising has become an essential exercise in medical imaging, especially in Magnetic Resonance Imaging (MRI). As additive white Gaussian noise (AWGN) exhibits the finest-grain property of noise, multi-resolution analysis using the wavelet transform is gaining popularity. The aim of this work is to compare the effectiveness of three wavelet-based denoising algorithms, viz. the Wiener filter, hard thresholding and soft thresholding, on MRI images in the presence of AWGN. The Wiener filter performs better, both visually and in terms of PSNR, than the thresholding techniques. Keywords: Denoising, Wavelet, MRI, Wiener filtering, Threshold

1. Introduction Image denoising is a procedure in digital image processing that aims at the removal of noise, which may corrupt an image during its acquisition or transmission, while retaining its quality. Medical images obtained from MRI are the most common tool for diagnosis in the medical field. These images are often affected by random noise arising in the image acquisition process. The presence of noise not only produces undesirable visual quality but also lowers the visibility of low-contrast objects. Noise removal is essential in medical imaging applications in order to enhance and recover anatomical details that may be hidden in the data. The wavelet transform has recently entered the arena of image denoising and has firmly established itself as a powerful denoising tool. There has been a fair amount of research on filtering and wavelet coefficient thresholding [8], because wavelets provide an appropriate basis for separating the noisy image from the original image. These wavelet-based methods mainly rely on thresholding the discrete wavelet transform (DWT) coefficients, which have been affected by AWGN. There has been much research by Donoho and Johnstone [1, 2, 3] on finding thresholds; however, few are specifically designed for images. One of the most popular methods consists of thresholding the wavelet coefficients (using the hard threshold or the soft threshold), as introduced by Donoho. Another denoising method in the wavelet domain consists of Wiener filtering the wavelet coefficients. In this paper, the performance of these methods is evaluated on a degraded image X such that X = S + N, where S is the original image and N is AWGN. The performance of the three denoising techniques, hard thresholding, soft thresholding and the Wiener filter, is compared both visually and in the PSNR sense.

2. Wavelet Based Image Denoising The DWT has attracted much interest in image denoising [5]. The DWT can be interpreted as the decomposition of an image into a set of independent, spatially oriented frequency channels. The image is passed through two complementary filters and emerges as two images, Approximation and Details; this is called Decomposition or Analysis. The components can be assembled back into the original image without loss of information; this process is called Reconstruction or Synthesis. The mathematical manipulation that implements analysis and synthesis is called the DWT and inverse DWT. For a 2D image, an N-level decomposition can be performed, resulting in 3N+1 different frequency bands, namely LL, LH, HL and HH. Denoising algorithms that use the wavelet transform consist of three steps:

• Calculate the wavelet transform of the noisy image.

• Modify the noisy wavelet coefficients according to some rule.

• Compute the inverse transform using the modified coefficients.
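A minimal sketch of this three-step pipeline, assuming NumPy and the PyWavelets (pywt) package and leaving the coefficient-modification rule as a user-supplied function (these library choices and names are not from the paper), could look as follows:

import numpy as np
import pywt

def wavelet_denoise(image, modify, wavelet="haar", level=2):
    # Step 1: forward 2D wavelet transform of the noisy image
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Step 2: modify the detail coefficients according to some rule
    #         (the approximation band coeffs[0] is left untouched)
    new_coeffs = [coeffs[0]] + [
        tuple(modify(band) for band in detail) for detail in coeffs[1:]
    ]
    # Step 3: inverse transform using the modified coefficients
    return pywt.waverec2(new_coeffs, wavelet)

The rule passed in as modify can be any of the three techniques compared in this paper (Wiener gain, soft threshold or hard threshold).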

2.1 Wiener Filter In signal processing, the Wiener filter is a filter proposed by Norbert Wiener during the 1940s. Its purpose is to reduce the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal. The discrete-time equivalent of Wiener's work was derived independently by Kolmogorov in 1941; hence the theory is often called the Wiener-Kolmogorov filtering theory.

Inverse filtering is a restoration technique for deconvolution: when the image is blurred by a known lowpass filter, it is possible to recover the image by inverse filtering or generalized inverse filtering. However, inverse filtering is very sensitive to additive noise. The approach of reducing one degradation at a time develops a restoration algorithm for each type of degradation and simply combines them. Wiener filtering executes an optimal trade-off between inverse filtering and noise smoothing: it removes the additive noise and inverts the blurring simultaneously.


The Wiener filtering is optimal in terms of the mean square error. In other words, it minimizes the overall mean square error in the process of inverse filtering and noise smoothing. The Wiener filtering is a linear estimation of the original image. The approach is based on a stochastic framework [6, 7]. 2.1.1 Wiener Filter in the Wavelet Domain

In the model we assume that the wavelet coefficients are conditionally independent Gaussian random variables. The noise is also modeled as a stationary independent zero-mean Gaussian variable. Let us consider an image corrupted by zero-mean Gaussian noise. The coefficients of the noisy image in the wavelet domain are given by [4]

y_{i,j} = s_{i,j} + n_{i,j},    (1)

where y_{i,j} represents the coefficients of the noisy image in the wavelet domain, s_{i,j} the coefficients of the undegraded image, and n_{i,j} the coefficients of the noise. Without loss of generality, we can assume that the E\{y_{i,j}^2\} values can be determined by averaging the squared values of y_{i,j} in a window centered at (i, j). This information can be expressed as

Q_{i,j} = \sum_{k=-R}^{R} \sum_{l=-R}^{R} y_{i-k,\,j-l}^{2},    (2)

M = (2R + 1)^2,    (3)

q_{i,j} = Q_{i,j} / M.    (4)

As a result, the coefficients of the Wiener filter can be expressed as

a_{i,j} = (q_{i,j} - \sigma_n^2) / q_{i,j}.    (5)

Restricting the values to only positive values, the numerator takes the form \max(q_{i,j} - \sigma_n^2, 0), and so

\hat{s}_{i,j} = \max(a_{i,j}, 0)\, y_{i,j},    (6)

where \hat{s}_{i,j} is the best linear estimate of the signal component s_{i,j}. The noise variance is estimated using the mean absolute deviation (MAD) method and is given by

\sigma_n^2 = \left( \mathrm{mad}(w_i) / 0.6745 \right)^2,    (7)

\mathrm{mad}(w_i) = \mathrm{median}_i \left( \left| w_i - \mathrm{median}_i(w_i) \right| \right),    (8)

where w_i represents the wavelet coefficients.

When using the Haar wavelet transform, the steps for implementing denoising with the Wiener filter technique are as follows:

i. Apply the Haar wavelet transform to the original image.

ii. \{q_{i,j}\} is computed by convolving \{y_{i,j}^2\} with a kernel of size 9.

iii. The Wiener filter is then applied using the formula \hat{s}_{i,j} = \hat{a}_{i,j}\, y_{i,j} = \frac{\max(q_{i,j} - \sigma_n^2, 0)}{q_{i,j}}\, y_{i,j}.

iv. Apply the inverse Haar wavelet transform.

2.2 Soft Thresholding In soft thresholding, the wavelet coefficients with

magnitudes smaller than the threshold are set to zero, but the retained coefficients are also shrunk towards zero by the amount of the threshold value in order to decrease the effect of noise assumed to corrupt all the wavelet coefficients. Soft thresholding shrinks the coefficients above the threshold in absolute value.

When using the Haar wavelet transform, the steps for implementing denoising using the soft thresholding technique is as follows:

• Apply the Haar wavelet transform to the original image

• Apply soft thresholding to the wavelet coefficients y(i, j):

\hat{s}(i, j) = y(i, j) - T    if y(i, j) \ge T
\hat{s}(i, j) = y(i, j) + T    if y(i, j) \le -T
\hat{s}(i, j) = 0              otherwise    (9)

T = \sigma \sqrt{2 \log n},    (10)

where \sigma is the standard deviation of the noise, n is the number of wavelet coefficients, \hat{s}(i, j) are the de-noised wavelet coefficients, and T is the universal threshold; the noise variance is estimated using the MAD method.

• Apply the inverse Haar wavelet transform.

2.3 Hard Thresholding In hard thresholding, the wavelet coefficients with greater

magnitudes than the threshold are retained unmodified as they are thought to comprise the informative part of data, while the rest of the coefficients are considered to represent noise and set to zero. However, it is reasonable to assume that coefficients are not purely either noise or informative but mixtures of those.

The denoising method described in the previous subsection (soft thresholding) can be carried out using the hard threshold instead of the soft threshold on the wavelet coefficients in the second step.

The hard thresholding formula is given as


\hat{s}(i, j) = y(i, j)    if |y(i, j)| \ge T
\hat{s}(i, j) = 0          if |y(i, j)| < T    (11)
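As an illustration (not taken from the paper), the thresholding rules of Eqs. (9) and (11) with the universal threshold of Eq. (10) can be written directly in NumPy; pywt.threshold offers equivalent 'soft' and 'hard' modes:

import numpy as np

def universal_threshold(coeffs, sigma):
    # Eq. (10): T = sigma * sqrt(2 * log(n)), n = number of coefficients
    return sigma * np.sqrt(2.0 * np.log(coeffs.size))

def soft_threshold(y, T):
    # Eq. (9): shrink retained coefficients towards zero by T
    return np.sign(y) * np.maximum(np.abs(y) - T, 0.0)

def hard_threshold(y, T):
    # Eq. (11): keep coefficients above T unchanged, zero the rest
    return np.where(np.abs(y) >= T, y, 0.0)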

3. Results and Discussion In this section, simulation results are presented for four MRI images: Brain, Knee, Spine and Abdomen. White Gaussian noise is added to the MRI images, which are then denoised with the methods described previously. The performance of the three techniques is compared using the PSNR, which is defined as

PSNR = 10 \log_{10} \frac{255^2}{MSE},    (12)

where MSE denotes the mean square error for two m \times n images l(i, j) and k(i, j), where one of the images is considered a noisy approximation of the other, and is given as

MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ l(i, j) - k(i, j) \right]^2.    (13)
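A direct NumPy implementation of Eqs. (12)-(13) for 8-bit images (an illustrative helper, not part of the original paper):

import numpy as np

def psnr(reference, test):
    # Eq. (13): mean square error between the two images
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    # Eq. (12): PSNR in dB for 8-bit images (peak value 255)
    return 10.0 * np.log10(255.0 ** 2 / mse)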

From the simulation results it has been observed that the Wiener filter outperforms both thresholding methods, visually and in terms of PSNR. More details were lost with the thresholding methods, especially with the hard threshold, where the background was not well denoised. If the Wiener filter is thought of as another thresholding function, it performs better because its shape is smoother than the hard and soft thresholds. It can be clearly seen from Figure 1 and Figure 2 that the background of the images denoised with the Wiener filter appears smoother. The Wiener filter removes the noise quite well in the smooth regions but performs poorly along the edges.

The comparison of the PSNR of the three wavelet filters for the different MRI images is tabulated in Table 1; it is observed that the Wiener filter gives better values than soft and hard thresholding for the different noise levels (σ) of 15, 20, 25 and 30.

Table 1: Comparison of PSNR (in dB) of different wavelet filters for different MRI images corrupted by AWGN

Image      Noise (σ)   Hard Thresholding   Soft Thresholding   Wiener Filter
Brain         15             25.441             26.946             27.692
              20             22.936             25.065             25.615
              25             21.031             23.507             23.957
              30             19.452             22.181             22.577
Knee          15             25.219             26.231             26.803
              20             22.810             24.542             24.977
              25             20.924             23.087             23.457
              30             19.357             21.836             22.170
Spine         15             25.093             26.069             26.561
              20             22.753             24.458             24.812
              25             20.869             23.073             23.359
              30             19.348             21.871             22.126
Abdomen       15             25.402             27.135             27.656
              20             22.917             25.227             25.614
              25             21.011             23.660             23.981
              30             19.436             22.324             22.622

Figure 1. Denoising of Brain MRI image for variance = 20: (a) original image, (b) noisy image, (c) denoised image with hard threshold, (d) denoised image with soft threshold, (e) denoised image with Wiener filter


Figure 2. Denoising of Spine MRI image for variance = 30: (a) original image, (b) noisy image, (c) denoised image with hard threshold, (d) denoised image with soft threshold, (e) denoised image with Wiener filter


4. Conclusions

The paper presents a comparative analysis of three image denoising techniques using wavelet transforms. The analysis of the experimental results demonstrates that the Wiener filter surpasses the other methods discussed. There are a couple of areas that could be improved. One is the denoising along edges, as the Wiener method did not perform well there. Another improvement would be to develop a better optimality criterion, since the MSE is not always the best optimality criterion.

Acknowledgements We wish to express our sincere thanks to Dr. K. Jitender Reddy, Consultant Radiologist, Dept. of Radiology & Imaging Sciences, Apollo Health City, Hyderabad for providing us with different MRI image datasets.

References
[1] Donoho, D.L. and Johnstone, I.M., "Ideal spatial adaptation via wavelet shrinkage", Biometrika, Vol. 81, pp. 425-455, 1994.

[2] Donoho D.L “De-noising by soft-thresholding” IEEE Transactions on Information Theory, Volume: 41, Issue: 3, Pages: 613 – 627, May 1995

[3] Donoho, D.L “Wavelet Shrinkage and W.V.D.: A 10-minute Tour; (David L. Donoho's website)

[4] Kazubek, M “Wavelet domain image denoising by thresholding and Wiener filtering” IEEE Signal Processing Letters, Volume: 10, Issue: 11, Nov. 2003

[5] Kother Mohideen S, Dr. Arumuga Perumal. S , Dr. Mohammed Sathik M “Image Denoising using Discrete Wavelet Transform”, International Journal of Computer Science and Network Security, Vol.8, No.1, January 2008.

[6] Lakhwinder Kaur , Savita Gupta , R.C. Chauhan “Image Denoising using Wavelet Thresholding” Third Conference on Computer Vision, Graphics and Image Processing, India, Dec 16-18, 2002

[7] Nevine Jacob and Aline Martin “Image Denoising in the Wavelet domain using Wiener filtering”, December 17, 2004.

[8] Zhong, S, Cherkassky V “Image Denoising using Wavelet Thresholding and Model Selection” Image Processing 2000, Proceedings. 2000 International Conference on, Volume: 3, Pages: 262 -265, 10-13 Sept. 2000

Author’s Profile

S. Satheesh received the B.Tech degree in Electronics and Communication Engineering from VRSEC (ANU, Vijayawada, India) in 2001, and the M.E (ECE) degree with Communication Engineering specialization from CBIT (OU, Hyderabad, India) in 2005. He is currently pursuing the Ph.D. degree under the guidance of Dr. KVSVR Prasad at Jawaharlal Nehru Technological

University Hyderabad, India. He is the member of ISTE, IAENG and IACSIT. His research interests are in the area of Medical Image Processing and Signal Processing.

Dr. KVSVR Prasad obtained B Sc. Degree in 1963 from Andhra University, B.E (Telecommunication Engineering) in 1967 and M.E (ECE) Microwave Engineering specialization in 1977 from Osmania University. He received the Ph.D. in 1985 from IIT Kharagpur in strip and micro strip transmission lines. He published six papers in IEEE Transactions in MTT, Antenna and Propagation and EMI/EMC and three papers in National

Conferences. He is fellow of IETE (life member). He worked in various capacities in the Department of ECE, Osmania University, Hyderabad. Presently he is working as professor and head, Department of ECE, D.M.S.S.V.H. college of engineering, Machilipatnam, India.

P. Vasuda received the B.Tech degree from G. Narayanamma Institute of Technology and Science, Hyderabad, India in Electronics and Communication Engineering in 2007. Currently she is pursuing her M.Tech degree from G. Narayanamma Institute of Technology and Science, Hyderabad, India in Digital Electronics and Communication Engineering. Her areas of interest include

Image Processing and Digital Communications.


A Novel Data Imputing Algorithm

Ahmed sobhy1, Hany Harb 2 , Sherif Zaky 3 and Helmi Mahran4

1Department of Computer Science, Faculty of Computers & Informatics, Suez Canal University, Ismailia, Egypt

[email protected]

2Department of Computers & Eng., Faculty of Engineering, Al Azhar University, Cairo, Egypt.

[email protected]

3 Department of Mathematics, Faculty of Science, Suez Canal University, Ismailia, Egypt

[email protected]

4 Department of Basic Science , Faculty of Computers & Informatics, Suez Canal University, Ismailia, Egypt

[email protected] Abstract: DNA microarray analysis has become the most widely used functional genomics approach in the bioinformatics field. Microarray gene expression data often contain missing values due to various reasons. Clustering algorithms for gene expression data require complete information, that is, there should not be any missing values. In this paper, a clustering-based method, called "Clustering Local Least Squares Imputation (ClustLLsimpute)", is proposed to estimate the missing values. In ClustLLsimpute, a complete dataset is obtained by removing each row with missing values. K clusters and their centroids are obtained by applying a non-parametric clustering technique to the complete dataset. Genes similar to the target gene (the gene with missing values) are chosen as those in the cluster whose centroid has the smallest Euclidian distance to the target gene. The target gene is then represented as a linear combination of the similar genes. The experiments undertaken show that this algorithm is more accurate than the other algorithms introduced in the literature.

Keywords: Missing Values, Imputation, Microarray, Regression.

1. Introduction In the last decade, molecular biologists have been using DNA microarrays as a tool for analyzing information in gene expression data. During the laboratory process, some spots on the array may be missing due to various factors, e.g. insufficient resolution, image corruption, or simply dust or scratches on the slide. Repeating the experiments is often very costly or time consuming. As a result, molecular biologists, statisticians and computer scientists have made attempts to recover the missing gene expressions by ad-hoc and systematic methods. Microarray gene expression data have been formulated as a gene expression matrix E with m rows, which correspond to genes, and n columns, which correspond to experiments. Many analysis methods, such as principal component analysis, singular value decomposition or clustering analysis, require complete matrices. Missing log2-transformed data are often replaced by zeros [1] or, less often, by an average expression over the row, or 'row average'. This approach is not optimal, since these methods do not take into consideration the correlation structure of the data. Thus, many analysis techniques, as well as other analysis methods such as hierarchical clustering, k-means clustering and self-organizing maps, may benefit from more accurately estimated missing values. There is not a lot of work in the literature that deals with missing value estimation for microarray data, but much work has been devoted to similar problems in other fields. The question has been studied in the context of non-response issues in sample surveys and missing data in experiments [2]. Common methods include filling in least squares estimates, iterative analysis of variance methods [3], randomized inference methods, and likelihood-based approaches [4]. An algorithm similar to nearest neighbours was used to handle missing values in CART-like algorithms [5]. The most commonly applied statistical techniques for dealing with missing data are model-based approaches. Local least squares imputation, k-nearest neighbour imputation (KNNimpute) [6] and an estimation method based on Bayesian principal component analysis (BPCA) [7] have been introduced. In this paper, a local least squares imputation is proposed, where a target gene that has missing values is represented as a linear combination of similar genes. A k-means clustering algorithm has been used to cluster the complete microarray matrix. Rather than using all available genes in the data, only the genes with high similarity to the target gene are used in the proposed method; these are the genes in the cluster whose centroid has the smallest Euclidian distance to the target gene. The rest of the paper is organized as follows: Section 2 describes the mathematical model of local least squares imputation based on a regression model. Section 3 discusses the k-means algorithm used in the clustering process. Section 4 introduces the proposed PCA-based solution for the initial number of clusters


parameter and the initial centroid for each of the clusters. The results of numerical experiments are given in Section 6. Section 7 concludes the paper. 2. Local Least Squares Imputation

A matrix G \in R^{m \times n} denotes a gene expression data matrix with m genes and n experiments, and we assume m \gg n. In the matrix G, a row g_i^T \in R^{1 \times n} represents the expressions of the ith gene over the n experiments. In order to recover the total of q missing values in any locations of a target gene g, the k-nearest-neighbour genes of g,

g_{s_1}, g_{s_2}, \ldots, g_{s_k} \in R^{n \times 1},

are found. In this process of finding the similar genes, the q components of each gene at the q locations of missing values in g are ignored. Then, based on these k-nearest-neighbour genes, a matrix A \in R^{k \times (n-q)}, a matrix B \in R^{k \times q}, and a vector w \in R^{(n-q) \times 1} are formed. The ith row vector of the matrix A consists of the ith nearest-neighbour gene g_{s_i}^T, with its elements at the q locations of missing values of g excluded. Each column vector of the matrix B consists of the values at the jth location of the missing values (1 ≤ j ≤ q) of the k vectors g_{s_1}, \ldots, g_{s_k}. The elements of the vector w are the n − q elements of the gene vector g whose missing items are deleted. After the matrices A and B and the vector w are formed, the least squares problem is formulated as

\min_x \left\| A^T x - w \right\|_2.    (1)

Then, the vector u = (\alpha_1, \ldots, \alpha_q)^T of q missing values can be estimated as

\hat{u} = B^T x = B^T (A^T)^{\dagger} w,    (2)

where (A^T)^{\dagger} is the pseudo-inverse of A^T. For example, assume that the target gene g has two missing values, in the 1st and the 10th positions, among a total of 10 experiments. If the missing values are to be estimated from the k similar genes, the elements of the matrices A and B and the vector w are constructed as

g = (\alpha_1, w_1, w_2, \ldots, w_8, \alpha_2)^T,
A = (g_{s_i,2}, \ldots, g_{s_i,9}) for row i = 1, \ldots, k,
B = (g_{s_i,1}, g_{s_i,10}) for row i = 1, \ldots, k,
w = (w_1, \ldots, w_8)^T,

where \alpha_1 and \alpha_2 are the missing values and g_{s_1}, \ldots, g_{s_k} are the k genes that are most similar to g. The known elements of w can be represented by

w \approx A^T x = x_1 a_1 + x_2 a_2 + \cdots + x_k a_k,

where the x_i are the coefficients of the linear combination (a_i being the ith row of A), found from the least squares formulation (1). And the missing values in g can be estimated by

\alpha_1 \approx x_1 g_{s_1,1} + \cdots + x_k g_{s_k,1},  \alpha_2 \approx x_1 g_{s_1,10} + \cdots + x_k g_{s_k,10},

where \alpha_1 and \alpha_2 are the first and the second missing values in the target gene.
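For illustration, Eqs. (1)-(2) translate into a few lines of NumPy; the function and variable names below are assumptions, not taken from the paper:

import numpy as np

def lls_impute(target, neighbors, missing_idx):
    # Estimate the missing entries of `target` (1-D array with np.nan at the
    # missing positions) from the k neighbor genes (k x n array), per Eqs. (1)-(2).
    known_idx = np.setdiff1d(np.arange(target.size), missing_idx)
    A = neighbors[:, known_idx]      # k x (n - q): neighbors at known positions
    B = neighbors[:, missing_idx]    # k x q:       neighbors at missing positions
    w = target[known_idx]            # (n - q):     known elements of the target
    # Eq. (1): least squares solution x of A^T x ~ w
    x, *_ = np.linalg.lstsq(A.T, w, rcond=None)
    # Eq. (2): estimated missing values u = B^T x
    return B.T @ x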

For estimating the missing values of each gene, we need to build the matrices A and B and a vector w, and solve the least squares problem of Eqn. (1). 2. K-Means Clustering K-means [8] is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations cause different results; the best choice is to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point we need to re-calculate k new centroids as the centres of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be made between the same data set points and the nearest new centroid, generating a loop. As a result of this loop we may notice that the k centroids change their location step by step until no more changes occur. Finally, this algorithm aims at minimizing an objective function, in this case a squared error function. The objective function is

J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2,    (3)

where \| x_i^{(j)} - c_j \|^2 is a chosen distance measure between a data point x_i^{(j)} and the cluster centre c_j, and J is an indicator of the distance of the n data points from their respective cluster centres. The algorithm is composed of the following steps:

• Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.

• Assign each object to the group that has the closest centroid.

• When all objects have been assigned, recalculate the positions of the K centroids.

• Repeat Steps 2 and 3 until the centroids are no longer moving. This produces a separation of the objects


into groups from which the metric to be minimized can be calculated.
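A compact NumPy sketch of these four steps, assuming the usual Euclidean distance of Eq. (3) (scikit-learn's KMeans would serve equally well; the names below are illustrative):

import numpy as np

def kmeans(X, centroids, n_iter=100):
    # Minimize the objective of Eq. (3) by alternating assignment and update.
    # X: (n, d) data matrix; centroids: (K, d) initial centroids.
    for _ in range(n_iter):
        # assign each point to the group with the closest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recalculate the positions of the K centroids
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(len(centroids))
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids are no longer moving
        centroids = new_centroids
    return labels, centroids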

3. Principal Component Analysis

Principal component analysis (PCA) is probably the most popular multivariate statistical technique and it is used by almost all scientific disciplines. It is also likely to be the oldest multivariate technique. In fact, its origin can be traced back to Pearson [9] or even Cauchy [10]. The modern instantiation was formalized by Hotelling [11], who also coined the term principal component. PCA analyzes a data table representing observations described by several dependent variables, which are, in general, inter-correlated. Its goal is to extract the important information from the data table and to express this information as a set of new orthogonal variables called principal components. PCA also represents the pattern of similarity of the observations and the variables by displaying them as points in maps [12][13]. The data table to be analyzed by PCA comprises I observations described by J variables and it is represented by the matrix X, whose generic element is x_{i,j}. The matrix X has rank L, where L \le \min(I, J).

The matrix X has the following singular value decomposition [14][15]:

X = P \Delta Q^T,    (4)

where P (the principal directions) is the matrix of left singular vectors, Q (the principal components) is the matrix of right singular vectors, and \Delta is the diagonal matrix of singular values. Equation 4 can also be rewritten as

X = \sum_{l=1}^{L} \delta_l\, p_l\, q_l^T,    (5)

with L being the rank of X and \delta_l, p_l and q_l being (respectively) the lth singular value and the left and right singular vectors of X. This shows that X can be reconstituted as a sum of L rank-one matrices (i.e., the \delta_l p_l q_l^T terms). The first of these matrices gives the best reconstitution of X by a rank-one matrix, the sum of the first two matrices gives the best reconstitution of X with a rank-two matrix, and so on, and, in general, the sum of the first M matrices gives the best reconstitution of X with a matrix of rank M. The goals of PCA are to (a) extract the most important information from the data table, (b) compress the size of the data set by keeping only this important information, (c) simplify the description of the data set, and (d) analyze the structure of the observations and the variables. In order to achieve these goals, PCA computes new variables called principal components which are obtained as linear combinations of the original variables. The first principal component is required to have the largest possible variance. Therefore, this component will "explain" or "extract" the largest part of the inertia of the data table. The second component is computed under the constraint of being orthogonal to the first component and to have the

largest possible inertia. The other components are computed likewise. The values of these new variables for the observations are called factor scores; these factor scores can be interpreted geometrically as the projections of the observations onto the principal components. In PCA, the components are obtained from the singular value decomposition of the data table X. Specifically, with X = P \Delta Q^T, the matrix of factor scores, denoted F, is obtained as

F = P \Delta.    (6)

The matrix Q gives the coefficients of the linear combinations used to compute the factor scores. This matrix can also be interpreted as a projection matrix, because multiplying X by Q gives the values of the projections of the observations on the principal components. This can be shown by combining Equations 4 and 6 as

F = P \Delta = P \Delta Q^T Q = X Q.    (7)

The components can also be represented geometrically by the rotation of the original axes. The matrix Q is also called a loading matrix. The matrix X can be interpreted as the product of the factor scores matrix by the loading matrix as

X = F Q^T.    (8)

This decomposition is often called the bilinear decomposition of X. 4. The proposed algorithm

In this section, a Local Least Squares imputation that depends on a clustering model is introduced. Clustering the complete data set into K clusters, each with its own centroid, is also discussed. A target gene that has missing values is represented as a linear combination of similar genes; the similar genes are those in the cluster whose centroid has the smallest Euclidian distance to the target gene. 4.1. Getting the number of clusters

Clustering algorithms are unsupervised learning processes i.e. users are usually required to set some parameters for these algorithms. These parameters vary from one algorithm to another, but most clustering algorithms require a parameter that either directly or indirectly specifies the number of clusters. This parameter is typically either k, the number of clusters to return, or some other parameter that indirectly controls the number of clusters to return, such as an error threshold. Setting these parameters requires either detailed prior knowledge of the data, or time-consuming trial and error. The latter case still requires that the user has sufficient domain knowledge to know what a good clustering “looks” like. However, if the data set is very large or is multidimensional, human verification could become difficult. It is necessary to have an algorithm that can


efficiently determine a reasonable number of clusters to return from any clustering algorithm. The following proposed algorithm will identify the correct number of clusters to return from a clustering algorithm. The algorithm is composed of the following steps:

• Let X̂ be the complete microarray matrix obtained by removing each gene row with missing values.

• By Eq. 4, obtain the singular value decomposition X̂ = PΔQ^T; the right singular vectors (the rows of Q^T) are the eigengenes.

• Compute the relative contribution of each eigengene to the total expression level, δ_l² / Σ_j δ_j².

• Choose the number of leading eigengenes that together contribute about 70%-75% of the total expression level as the number of clusters K.
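An illustrative NumPy sketch of this rule; the use of squared singular values as the eigengene contribution and the 75% cut-off reflect the assumptions stated in the steps above:

import numpy as np

def choose_k(X_complete, cutoff=0.75):
    # SVD of the complete matrix (Eq. 4); singular values delta_l
    _, delta, _ = np.linalg.svd(X_complete, full_matrices=False)
    # relative contribution of each eigengene to the total expression level
    contrib = delta ** 2 / np.sum(delta ** 2)
    # smallest number of leading eigengenes reaching the cut-off
    return int(np.searchsorted(np.cumsum(contrib), cutoff) + 1)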

4.2. Getting initial centroids of clusters

The k-means algorithm starts by initializing the K cluster centres. Two simple approaches to cluster-centre initialization are either to select the initial values randomly, or to choose the first K samples of the data points. However, testing different initial sets is considered impracticable, especially for a large number of clusters [16]. Therefore, different methods have been proposed in the literature [17][18][19]. When random initialization is used, different runs of k-means typically produce different cluster groupings, and the resulting clusters are often poor. Another problem with the basic k-means algorithm given earlier is that empty clusters can be obtained. This paper proposes that the principal components are actually the continuous solution of the cluster membership indicators in the k-means clustering method. The main basis of PCA-based dimension reduction is that PCA picks up the dimensions with the largest variances (Eq. 5). Mathematically, this is equivalent to finding the best low-rank approximation of the data via the singular value decomposition (Eq. 6). As a result, the first component is used as an index indicator for the K initial centroids.

The algorithm is composed of the following steps:

• Let X̂ be the complete microarray matrix obtained by removing each gene row with missing values.

• By Eq. 4, obtain X̂ = PΔQ^T.

• Compute the first principal component (the first column of the factor score matrix F) by Eq. 6.

• Sort the first-component vector.

• Let the gene rows at the first K indexes of the sorted first-component vector be the K initial centroids.
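A hedged sketch of this initialization; the paper does not specify the sort direction, so taking the K largest first-component scores is an assumption:

import numpy as np

def pca_initial_centroids(X_complete, K):
    # SVD (Eq. 4); first factor score f1 = delta_1 * p_1 (first column of F = P Delta)
    P, delta, _ = np.linalg.svd(X_complete, full_matrices=False)
    f1 = delta[0] * P[:, 0]
    # sort genes by their first-component score and take the first K as centroids
    order = np.argsort(f1)[::-1]
    return X_complete[order[:K]]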

4.3. Clustering

The k-means clustering algorithm described above has been used as the clustering algorithm for our proposed imputation method. After applying this algorithm, K disjoint subsets are obtained; each cluster is identified by its centroid. 4.4. Imputing

The k-nearest neighbour method (KNNimpute) does not provide an optimal, principled way to find the nearest neighbours. Bayesian principal component analysis (BPCA) depends on a probabilistic model that requires certain statistical parameters to be known in advance. Local Least Squares Imputation (LLSimpute) depends on K coherent genes that have large absolute values of the Pearson correlation coefficient, which can be computationally costly. This paper proposes a clustering-based Local Least Squares imputation, which represents a target gene that has missing values as a linear combination of similar genes; the similar genes are those in the cluster whose centroid has the smallest Euclidian distance to the target gene. The algorithm is composed of the following steps:

• Let G be the original microarray matrix.

• Let g be the target gene (with q missing elements).

• Using the algorithm proposed in Section 4.1, get K.

• Using the algorithm proposed in Section 4.2, get the K initial centroids.

• Using the algorithm proposed in Section 4.3, get the K clusters.

• Get the cluster nearest to the target gene, and from it:

6.1. form A, with columns corresponding to the complete elements of g;

6.2. form B, with columns corresponding to the missing elements of g.

• Form w from the entries of g corresponding to its complete elements.

• Solve Eq. 2 to get the estimated q missing values of g.

• Repeat steps 2 to 8 until the missing values of all genes have been estimated.
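Putting the pieces together, an illustrative end-to-end sketch of ClustLLsimpute that reuses the helper functions sketched earlier (choose_k, pca_initial_centroids, kmeans, lls_impute — all assumed names, not from the paper):

import numpy as np

def clust_lls_impute(G):
    # Impute missing values (np.nan) in the gene-expression matrix G.
    G = G.copy()
    complete = G[~np.isnan(G).any(axis=1)]           # rows with no missing values
    K = choose_k(complete)                           # Section 4.1
    centroids = pca_initial_centroids(complete, K)   # Section 4.2
    labels, centroids = kmeans(complete, centroids)  # Section 4.3
    for i in np.where(np.isnan(G).any(axis=1))[0]:   # every target gene
        g = G[i]
        missing = np.where(np.isnan(g))[0]
        known = np.where(~np.isnan(g))[0]
        # nearest cluster: smallest Euclidian distance over non-missing components
        d = np.linalg.norm(centroids[:, known] - g[known], axis=1)
        neighbors = complete[labels == d.argmin()]
        G[i, missing] = lls_impute(g, neighbors, missing)  # Eqs. (1)-(2)
    return G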

5. Results and Discussion

5.1. Data Sets

Six microarray datasets were obtained for the purpose of comparison. The first data set was obtained from the α-factor block release that was studied for the identification of cell-cycle-regulated genes in Saccharomyces cerevisiae [20]. A


complete data matrix of 4304 genes and 18 experiments (ALPHA) that does not have any missing value is used to assess the missing value estimation methods. The second data set, a complete matrix of 4304 genes and 14 experiments (ELU), is based on an elutriation data set [20]. The 4304 genes originally had no missing values in the α-factor block release set and the elutriation data set. The third data set was from 784 cell-cycle-regulated genes, which were classified by Spellman et al. [20] into five classes, for the same 14 experiments as the second data set. The third data set consists of 2856 genes and 14 experiments (CYC-a). The fourth data set consists of 242 genes and 14 experiments (CYC-b). The fifth data set is from a study of the response to environmental changes in yeast [21]. It contains 5431 genes and 13 experiments that have time-series of specific treatments (ENV). The sixth data set is the cDNA microarray data relevant to human colorectal cancer (CRC) [22]. This data set contains 758 genes and 205 primary CRCs that include 127 non-metastatic primary CRCs, 54 metastatic primary CRCs to the liver and 24 metastatic primary CRCs to distant organs exclusive of the liver, and 12 normal colonic epithelia (CRC). This is a challenging data set with multiple experiments with no time-course relationships. ALPHA, ELU and CRC are the same data sets that were used in the study of BPCA [7] and LLsimpute [23]. The performance of the missing value estimation is evaluated by the normalized root mean squared error (NRMSE):

NRMSE = \sqrt{ \mathrm{mean}\left[ (y_{\mathrm{guess}} - y_{\mathrm{ans}})^2 \right] / \mathrm{variance}(y_{\mathrm{ans}}) },    (9)

where y_{\mathrm{guess}} and y_{\mathrm{ans}} are vectors whose elements are the estimated values and the known answer values, respectively, for all missing entries. The similarity between a target gene and the closest centroid is defined by the reciprocal of the Euclidian distance calculated over the non-missing components. 5.2. Experimental results

In the experiments, we randomly removed some percentage, i.e. the missing rate, of expression levels to create missing values (between 1% and 20% of the data were deleted). Each method was then used to recover the introduced missing values for each data set, and the estimated values were compared to those in the original data set. From the plots of NRMSE values (Figure 1) achieved by all five methods on the six datasets, we can see that the KNNimpute method always performs the worst and ClustLLsimpute always performs the best. The other three methods perform equally well on the env and crc datasets, but ClustLLsimpute performs better than all of them. In fact, from Figures 1(b) and 1(e), it is hard to tell which of the other three performs best, whereas ClustLLsimpute clearly outperforms them. The other three methods again perform equally well on the elu, cyc-a and alpha datasets when the missing rate is small, i.e. less than 5% (cf. Figures 1(a), 1(d) and 1(f)), and again ClustLLsimpute outperforms all of them. However, the performances differ when the missing rate is large. Our method ClustLLsimpute performs very close to the other three in Figure 1(f) at a 20% missing rate, though still a little better. From these results, it is deduced that the ClustLLsimpute method performs better than both BPCA and LLSimpute, the two most recent imputation methods [24][25].


Figure 1. NRMSE comparison for ILLsimpute, BPCA, LLSimpute, KNNimpute and ClustLLsimpute on six datasets with various percentages of missing values. 7. Conclusions

This paper proposes a novel version of the Local Least Squares Imputation method (ClustLLsimpute) to estimate missing values in microarray data. In ClustLLsimpute, the complete dataset is clustered using a k-means method with a novel, PCA-based initialization to obtain k clusters and their centroids. The set of nearest neighbours for every target gene is automatically determined as the cluster with the nearest centroid, rather than being prespecified as in most existing imputation methods. The experimental results on six real microarray datasets show that ClustLLsimpute outperforms the four most well-known recent imputation methods, BPCA, LLSimpute, ILLSimpute and KNNimpute, on all datasets with simulated missing values. References

[1] Alizadeh,A.A., Eisen,M.B., Davis,R.E., Ma,C., Lossos,I.S., Rosenwald, A., Boldrick,J.C., Sabet,H., Tran,T., Yu,X., Powell,J.I., Yang,L., Marti,G.E., Moore,T., Hudson,Jr,J., Lu,L., Lewis,D.B., Tibshirani,R., Sherlock,G., Chan,W.C., Greiner,T.C., Weisenburger, D.D., Armitage,J.O., Warnke,R. and Staudt,L.M., et al. "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling". Nature, 403, pp. 503–511, 2000.

[2] Little,R.J.A. and Rubin,D.B. "Statistical Analysis with Missing Data,". Wiley, New York,1987.

[3] Yates,Y. "The analysis of replicated experiments when the field results are incomplete". Emp. J. Exp. Agric., 1, pp. 129–142, 1933.

[4] Wilkinson,G.N. "Estimation of missing values for the analysis of incomplete data". Biometrics, 14, pp. 257–286, 1958.

[5] Loh,W. and Vanichsetakul,N. "Tree-structured classification via generalized discriminant analysis'. J. Am. Stat. Assoc., 83, pp. 715–725, 1988.

[6] O. Troyanskaya, M. Cantor, G. Sherlock, P. Brown, T. Hastie, R. Tibshirani, D. Botstein, and R. B. Altman. "Missing value estimation methods for DNA microarray". Bioinformatics, 17(6), pp. 520–525, 2001.

[7] S. Oba, M. Sato, I. Takemasa, M. Monden, K. Matsubara, and S. Ishii. "A Bayesian missing value estimation method for gene expression profile data". Bioinformatics, 19(16), pp. 2088–2096, 2003.

[8] J. B. MacQueen. "Some Methods for classification and Analysis of Multivariate Observations". Proceedings of 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1, pp. 281-297, 1967.

[9] Pearson, K. "On lines and planes of closest fit to systems of points in space". Philosophical Magazine, 6, pp. 559-572, 1901.

[10] Grattan-Guinness, I. "The rainbow of mathematics". New York: Norton, 1997.

[11] Hotelling, H. "Analysis of a complex of statistical variables into principal components". Journal of Educational Psychology, 25, pp. 417-441, 1933.

[12] Jolliffe, I.T. "Principal component analysis". New York: Springer, 2002.

[13] Saporta, G, and Niang, N. "Principal component analysis: application to statistical process control". In G. Govaert (Ed), Data analysis. pp. 1-23, 2009. London: Wiley.

[14] Takane, Y. "Relationships among various kinds of eigenvalue and singular value decompositions". In Yanai, H., Okada, A., Shigemasu, K., Kano, Y., and Meulman, J. (Eds.), New developments in psychometrics , 45-56,2002. Tokyo:Springer Verlag.


[15] Abdi, H., and Valentin, D. "Multiple correspondence analysis". In N.J. Salkind (Ed), Encyclopedia of measurement and statistics, pp. 651-657, 2007. Thousand Oaks, CA: Sage.

[16] M. Ismail and M. Kame. "Multidimensional data clustering utilization hybrid search strategies". Pattern Recognition Vol. 22 (1), pp. 75-89, 1989.

[17] G. Babu and M. Murty. "A near optimal initial seed value selection in kmeans algorithm using a genetic algorithm". Pattern Recognition Letters Vol. 14, pp. 763-769, 1993.

[18] Y. Linde, A. Buzo and R. Gray. "An algorithm for vector quantizer design. IEEE trans". Comm. Vol. 28 (1), pp. 84-95,1980.

[19] Ding, C. and He, X. "K-means clustering via principal component analysis". Proceedings of the 21st International Conference on Machine Learning, Banff, Canada,2004.

[20] Spellman, P. T., Sherlock, G., Zhang, M. Q., Iyer, V. R., Anders, K., Eisen, M. B., Brown, P. O., Botstein, D. and Futcher, B. " Comprehensive identification of cell cycle-regulated genes of the yeast saccharomyces cerevisiae by microarray hybridization". Mol. Biol. Cell, 9,pp. 3273– 3297, 1998.

[21] Gasch, A. P., Huang, M., Metzner, S., Botstein, D., Elledge, S. J. and Brown, P. O. "Genomic expression responses to DNA-damaging agents and the regulatory role of the yeast ATR homolog Mec1p". Mol. Biol. Cell, 12 (10), pp. 2987–3003, 2001.

[22] Takemasa, I., Higuchi, H., Yamamoto, H., Sekimoto, M., Tomita, N., Nakamori, S., Matoba, R., Monden, M. and Matsubara, K. "Construction of preferential cDNA microarray specialized for human colorectal carcinoma: molecular sketch of colorectal cancer. Biochem". Biophys. Res. Commun., 285, pp. 1244–1249, 2001.

[23] H. Kim, G. H. Golub, and H. Park. "Missing value estimation for DNA microarray gene expression data: Local least squares imputation". Bioinformatics, 20:pp. 1–12, 2004.

[24] Cai Z, Heydari M and Lin G. "Iterated local least squares microarray missing value imputation". J Bioinform Comput Biol,4(5):pp. 935-57, 2006.


Providing a model to estimate the probability of the complexity of software projects

Foad Marzoughi1*, Mohammad Mehdi Farhangian 2* and Alex Tze Hiang Sim3

Faculty of Computer Science and Information Systems,

Universiti Teknologi Malaysia (UTM), Johor, Malaysia, 1 [email protected], [email protected], [email protected]

Abstract: Function Point Analysis (FPA), developed by Allan Albrecht, is the most widely used technique for estimating the size of a computerized business information system. Various studies have proposed extensions to the FPA algorithm, mainly to make it more precise, but they rely on similarity to previous projects; this motivates the present work. This paper presents a statistical simulation method that can be applied to any generic project. The proposed method assesses the size and effort of software projects with a stochastic, Markov chain approach. Based on the Metropolis-Hastings simulation algorithm, we formulate a Probabilistic Function Point Analysis (PFPA). Moreover, a Bayesian belief network approach is used to determine the complexity of the system. The method determines the function weights using Markov chain theory to support estimating the effort of software projects. As a case study, the new method is applied in the online publication domain. The method can increase the chance of implementing generic projects on time.

Keywords: Bayesian Probability, Markov chain Monte Carlo Simulation, Function Point Analysis.

1. Introduction and Background
Estimating the effort and cost of a project is a crucial task for project managers [1]. A majority of projects fail in their development phase because of poor estimation [2]. Nowadays there is great demand for software projects, and a precise and reliable method for predicting different aspects of projects is needed more than ever [3]. Estimating the time and cost of software projects is essential in the first phase of the project development life cycle for determining the feasibility of a project [4]. During the project, these estimates can be used for validation and for controlling project progress [5]. For estimating the cost of software projects, many methods have been introduced, including Halstead's software metrics, COCOMO II, PUTNAM-SLIM, and FPA. FPA determines the size of system functionality and measures the performance of project teams, and it is the most widely used method for project estimation worldwide [6]. As every project is unique and independent from previous ones, the mentioned methods do not satisfy managers' expectations. In FPA, two components of the system are computed: the information processing size and the technical complexity factor. The information processing size is classified into five components, namely inputs, outputs, inquiries, files and program interfaces, and each of these is classified as low, medium or high, representing the complexity of each factor. The estimate of the unadjusted function points (UAFP) depends most importantly on the complexity weights. Since the classification of the system is based on experts' judgment, FPA classifies functions as simple, average or complex in an imprecise manner: sometimes functions with different values are placed in the same category, and sometimes functions with the same values are placed in different categories. These problems are more noticeable in large projects, especially governmental ones. In practice, many of the weights assigned to the information processing size and the technical complexity are not valid. Many studies have reviewed FPA critically and pointed out its weaknesses; for instance, Symons assessed the functionality and validity of FPA in 1988 [7]. Other studies have proposed extensions of FPA: Full Function Points (FFP), developed by Abran and his colleagues in 1997, is applied to real-time software [8], and in 2003 Junior, Farias and Belchior suggested a fuzzy-based function point analysis method [9]. This paper presents a statistical simulation method that can be applied to any generic project and provides accurate estimates for generic software projects. The paper is organized in three main sections: the first introduces the concepts applied in the method, such as Markov chains and function point analysis; the second presents the proposed method; and the last applies it in a case study.

2. Function Point Analysis
The first definitions of FPA were refined and extended in IBM CIS Guideline 313, AD/M Productivity Measurement and Estimate Validation [10]. In 1986, a group of FPA users formed the International Function Point Users Group (IFPUG), which is responsible for keeping its associates informed of any updates in the technique [11]. The five function elements, namely outputs, inputs, queries, files and program interfaces, are assessed according to their complexity in one of three categories: Low, Medium and High. The final function point calculation yields a single number that represents the total complexity. The total unadjusted function points (UFP) are calculated from the element counts and the corresponding weighting factors, as in equation (1).


UFP = Σi Σj (wij × nij)   (1)

where nij is the number of function elements of type i classified at complexity level j and wij is the corresponding weight. The choice of weight factors was justified by Albrecht [4]. It is doubtful whether the weights are appropriate for all users in all circumstances: they were derived from a study of IBM projects and cannot simply be applied to other projects. To overcome this problem, this paper extends traditional FPA to a Bayesian function point analysis by using Bayesian theory and Markov chain Monte Carlo simulation.
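To make the calculation in equation (1) concrete, the following minimal Python sketch computes unadjusted function points from hypothetical element counts using a conventional Albrecht-style weight table; the particular weight values, function-type names and counts are illustrative assumptions, not figures taken from this paper.

# Hypothetical Albrecht-style weights (low, average, high) per function type.
# These numbers are illustrative assumptions, not the paper's calibrated weights.
WEIGHTS = {
    "external_input":     (3, 4, 6),
    "external_output":    (4, 5, 7),
    "external_inquiry":   (3, 4, 6),
    "internal_file":      (7, 10, 15),
    "external_interface": (5, 7, 10),
}

def unadjusted_function_points(counts):
    """counts: {function_type: (n_low, n_avg, n_high)} -> total UFP."""
    total = 0
    for ftype, (n_low, n_avg, n_high) in counts.items():
        w_low, w_avg, w_high = WEIGHTS[ftype]
        total += n_low * w_low + n_avg * w_avg + n_high * w_high
    return total

if __name__ == "__main__":
    example_counts = {
        "external_input":     (5, 2, 1),
        "external_output":    (3, 1, 0),
        "external_inquiry":   (4, 0, 0),
        "internal_file":      (2, 1, 0),
        "external_interface": (1, 0, 0),
    }
    print("UFP =", unadjusted_function_points(example_counts))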

3. Bayesian Belief Network
Bayesian belief networks are graphical tools for modeling causes and effects in various domains [12]. When input data are uncertain, Bayesian belief network modeling is effective. Bayesian belief networks are based on stochastic variables, represented by nodes, and dependencies among variables, represented by directed arcs [13].

4. Proposed Method
The proposed method has three main steps, explained in Marzoughi and Farhangian (2010) [14]: fitting a distribution, estimating the unknown parameter, and optimizing the weights by Markov chain Monte Carlo simulation. In a fourth step, the complexity of the software project is computed with a Bayesian belief network approach. Fourteen complexity factors are listed in Table 1; this table is used for determining the complexity of the project. By using the Bayesian belief network and omitting the factors with no influence, the adjusted processing complexity is calculated. The total processing complexity is calculated by the following formula:

(2)

5. Case Study
The proposed method is validated as follows. We simulate the estimations of 20 experts. By fitting the data, a Poisson distribution with unknown rate parameter λ is identified. All the functions, including inputs, outputs, queries, files and program interfaces, are classified into different bounds, and the probability of each state is estimated conditional on the prior state. The input function is presented as an example; the other functions follow the same procedure, and the model can be applied in other case studies under the same conditions. Input is divided into nine main groups: low1, low2, low3, medium4, medium5, medium6, high7, high8, high9. From the fitted data, the distribution of λ is identified as follows:

(3) The following process is applied to the search function of the online publication system; the other functions can be treated with the same method.

The posterior is calculated by the following formula (4):

where λ is the Poisson rate parameter and the data set consists of the experts' estimates. In this stage, the discrete distribution is converted to a continuous distribution, the Gamma distribution. It is expressed in the following formula in terms of a rate parameter rather than a scale parameter, where rate = 1/scale:

(5)

Constants not involving λ can be ignored, so the expression in the following equation is used instead of the entire Gamma density.

(6) According to the procedure described in the proposed model, the fitted model is distributed as:

(7) To draw a sequence of random samples from this distribution, the Metropolis-Hastings method is selected, using the raw data obtained from the experts' estimations. In the first state, the experts' estimate of the low input is equal to 2, which can be presented as follows:

(8) Each state represents an interval. The next state of the low input is calculated as follows:

(9)

The steady-state factors are then computed as follows:

(10)

The state with the highest steady-state probability (0.39) is selected as the weight of the low input for the next stage; a minimal illustrative sketch of this sampling step is given after this paragraph. After determining the Total Unadjusted Function Points (TUFP), the next step is computing the complexity of the system with a Bayesian belief network approach. First, the factors with no influence are omitted according to Table 1 below.
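As a sketch of the sampling step just described, the code below draws samples of a Poisson rate parameter from its Gamma-form posterior with a random-walk Metropolis-Hastings chain; the prior hyperparameters, proposal width and expert counts are illustrative assumptions rather than values reported in the paper.

import math
import random

def log_posterior(lam, data, a0=1.0, b0=1.0):
    """Log of the unnormalized Gamma-form posterior for a Poisson rate lam.

    Poisson likelihood with an assumed Gamma(a0, b0) prior (rate parameterization);
    constants not involving lam are dropped, as in the text.
    """
    if lam <= 0:
        return float("-inf")
    return (sum(data) + a0 - 1) * math.log(lam) - (len(data) + b0) * lam

def metropolis_hastings(data, n_samples=5000, step=0.5, lam0=1.0, seed=0):
    """Random-walk Metropolis-Hastings over the rate parameter."""
    random.seed(seed)
    lam, samples = lam0, []
    for _ in range(n_samples):
        proposal = lam + random.gauss(0.0, step)            # symmetric proposal
        log_ratio = log_posterior(proposal, data) - log_posterior(lam, data)
        if math.log(random.random() + 1e-300) < log_ratio:  # accept/reject step
            lam = proposal
        samples.append(lam)
    return samples

if __name__ == "__main__":
    expert_counts = [2, 3, 2, 4, 2, 3]   # hypothetical expert estimates of "input low"
    chain = metropolis_hastings(expert_counts)
    print("posterior mean of lambda ~", sum(chain[1000:]) / len(chain[1000:]))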

Table 1. Influence ratings of the general system characteristics

Data Communication       2
Heavy use configuration  0
Transaction rate         0
End-user efficiency      0
Complex Proceeding       0
Installation ease        0
Multiple sites           0
Performance              0
Distributed functions    1
Online data entry        0


Online update            0
Reusability              0
Operational ease         0
Extensibility            0

In this table, factors with no influence are marked 0, factors with low influence 1, factors with medium influence 2, and factors with high influence 3. The factors with no influence are omitted, and the probabilities of the remaining factors are computed with the Bayesian belief network approach. The probabilities below, the results of the Markov chain computation, are presented for each of the two remaining factors (Data Communication and Distributed Functions):

P(low)  P(medium)  P(high)
0.3     0.6        0.1

P(low)  P(medium)  P(high)
0.5     0.3        0.2

Figure 1. Bayesian network of the system (nodes: Data Communication, Distributed Functions, Complexity)

Table 2. Conditional probabilities of Complexity given DC and DF

DC      DF      P(Low)  P(Medium)  P(High)
Low     Low     0.7     0.2        0.1
Low     Medium  0.8     0.1        0.1
Low     High    0.6     0.2        0.2
Medium  Low     0.4     0.5        0.1
Medium  Medium  0.2     0.7        0.1
Medium  High    0.1     0.5        0.4
High    Low     0.2     0.3        0.5
High    Medium  0.1     0.4        0.5
High    High    0.1     0.2        0.7

The probability of each state is computed with the Bayesian formula:

(11)

(12)

According to the preceding formula, the probability of the complexity being "low" is:

0.1391 (15)

Similarly, for complexity of “medium” and “high” the probability would be as following, respectively:

= 0.06496 (16)

0.0721 (17)
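The sketch below shows, in general form, how a distribution over the complexity node can be obtained by combining the two parent marginals with the conditional table (Table 2). Treating the two marginal tables as Data Communication and Distributed Functions is an assumption made for illustration, and a plain marginalization of this kind need not reproduce the specific values in equations (15)-(17), which follow the authors' own computation.

# Marginal distributions of the two parent factors (from the text); assigning the
# first to Data Communication (DC) and the second to Distributed Functions (DF)
# is an assumption made for illustration.
P_DC = {"Low": 0.3, "Medium": 0.6, "High": 0.1}
P_DF = {"Low": 0.5, "Medium": 0.3, "High": 0.2}

# Conditional probability table P(Complexity | DC, DF), from Table 2.
CPT = {
    ("Low", "Low"):       {"Low": 0.7, "Medium": 0.2, "High": 0.1},
    ("Low", "Medium"):    {"Low": 0.8, "Medium": 0.1, "High": 0.1},
    ("Low", "High"):      {"Low": 0.6, "Medium": 0.2, "High": 0.2},
    ("Medium", "Low"):    {"Low": 0.4, "Medium": 0.5, "High": 0.1},
    ("Medium", "Medium"): {"Low": 0.2, "Medium": 0.7, "High": 0.1},
    ("Medium", "High"):   {"Low": 0.1, "Medium": 0.5, "High": 0.4},
    ("High", "Low"):      {"Low": 0.2, "Medium": 0.3, "High": 0.5},
    ("High", "Medium"):   {"Low": 0.1, "Medium": 0.4, "High": 0.5},
    ("High", "High"):     {"Low": 0.1, "Medium": 0.2, "High": 0.7},
}

def complexity_distribution():
    """Marginalize over DC and DF to get P(Complexity)."""
    dist = {"Low": 0.0, "Medium": 0.0, "High": 0.0}
    for dc, p_dc in P_DC.items():
        for df, p_df in P_DF.items():
            for level, p in CPT[(dc, df)].items():
                dist[level] += p * p_dc * p_df
    return dist

if __name__ == "__main__":
    print(complexity_distribution())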

6. Conclusion and Future Work
This paper extends function point analysis and provides a method for estimating the weights of functions. We use the Metropolis-Hastings algorithm within a Markov chain Monte Carlo method to estimate the effort of software projects by optimizing the FPA weights. Moreover, the complexity of the system is determined with a Bayesian belief network. Based on our proposed method (PFPA) and survey data gathered from experts in an organization, a real-time decision-making system can be developed.

References

[1] Abran, A., and Robillard, P.N., Function point analysis: an empirical study of its measurement processes. IEEE Transactions on Software Engineering, 22, 1996.

[2] Halstead, M.H., Elements of Software Science. Elsevier Science Inc., 1977.

[3] Boehm, B., A spiral model of software development and enhancement. Computer, 21(5), 1988.

[4] Albrecht, A., Measuring Application Development Productivity. SHARE/GUIDE/IBM Application Development Symposium, 1979.

[5] Symons, C.R., Function Point Analysis: Difficulties and Improvements. IEEE Transactions on Software Engineering, 1988.

[6] Desharnais, J.-M., St-Pierre, D., Maya, M., and Abran, A., Full Function Points: counting practices manual, procedure and counting rules. Université du Québec à Montréal, 1997.

[7] Symons, C.R., Function point analysis: Difficulties and improvements. IEEE Trans. Software Eng, 1988. 14.

[8] Abran, A., Maya, M., St-Pierre, D., and Desharnais, J.-M., Adapting Function Points to Real Time Software. Université du Québec à Montréal, 1997.

[9] Junior, Farias, and Belchior, Fuzzy Modeling for Function Points Analysis. Software Quality Journal, 2003.

[10] Smith, L., Function Point Analysis and Its Uses. Predicate Logic, INC, 1997.

[11] Desharnais, J.M., Adjustment model for function point scope factors. In Proceedings of IFPUG Spring Conference, 1990.

[12] Besag, J., Green, P.J., Higdon, D., and Mengersen, K.L.M., Bayesian computation and stochastic systems. Statistical Science, 1995.

[13] Geman, S., and Geman, D., Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1984.

[14] Marzoughi, F., Farhangian, M.M., Marzoughi, A., and Sim, A.T.H., A Decision Model for Estimating the Effort of Software Projects using Bayesian Theory. In Proc. ICSTE, USA, 2010.


Modeling a Balanced Score Card system in IT departments using a fuzzy AHP and nonlinear regression approach regarding the mindset of evaluators

Mohammad Mehdi Farhangian 1* , Foad Marzoughi2*, and Alex Tze Hiang Sim3

Faculty of Computer Science and Information Systems,

Universiti Teknologi Malaysia (UTM), Johor, Malaysia, [email protected], [email protected] 2, [email protected]

Abstract: Nowadays, the key role of Balanced Score Cards (BSC) in performance management and performance measurement is undeniable. Current BSC systems are not able to deal with the vagueness and imprecision of evaluators' mental models. In order to overcome these constraints, a general framework is proposed for Balanced Score Cards. The model provides a quantitative BSC that decreases the imprecision of evaluations, and the evaluation process is integrated by considering the proportion of each indicator and each group of indicators in the BSC. A comprehensive model is developed with a fuzzy logic and fuzzy AHP approach for weighting the items in each perspective and the weight of each perspective for achieving the goals. Also, a nonlinear regression approach is selected for determining the fuzzy membership function of the evaluators' mental patterns. In addition, the proposed model is applied to the IT department of a governmental organization in Iran as a case study.

Keywords: Balanced Score Card, Fuzzy logic, performance measurement, Fuzzy AHP, nonlinear regression

1. Introduction and Background
In recent decades, performance measurement has been taken into consideration by both researchers and practitioners. One importance of performance measurement is that it drives organizational actions. It is always emphasized that metrics should be aligned with strategy (Kaplan and Norton, 1992; Kaplan and Norton, 2000); besides, it provides a framework to drive decision making [1]. For example, a shortest-processing-time policy appears to be the policy of choice when considering time-in-system or waiting-time measures, whereas an earliest-due-date policy is more favorable when considering order lateness as the measure of interest (Chan et al., 2003). Another advantage of performance measurement is providing closed-loop control, that is, feedback on any process. Kaplan and Norton (1992) propose four basic perspectives that managers should monitor: the financial perspective, the customer perspective, the internal perspective, and innovation and learning [1]. Despite its popularity, many researchers and practitioners criticize the Balanced Score Card for its weaknesses. For example, Kanji (2000) and Malina and Selto (2001) criticize it as a top-down approach [2]. Nørreklit (2003) suggested that the balanced scorecard is a result of persuasive rhetoric and not a convincing theory [3]. Lohman et al. (2004) find that the BSC does not provide any chance to develop in organizations [4]. DeBusk et al. (2003) analyzed survey response data and estimated relative performance [5]. The Analytical Hierarchy Process is widely used to identify weights for performance measurement, but current methods are highly influenced by managerial judgment. In this paper, a model for BSC systems is proposed to overcome the problems caused by the behavioral memory of evaluators. This framework considers not only indicator measurement but also handles three important characteristics.

2. Methodology
To improve the traditional Balanced Score Card system, two steps are taken. In the first step, the indicators of each perspective are identified and prioritized; the importance of each indicator is determined by a fuzzy AHP method, following the steps of the fuzzy AHP proposed by Chang (1996) [6]. After determining the weights of the indicators, the next step is calculating the numerical value of each indicator. In this step, two marks are indicated by evaluators: Real Performance (RP) and Expected Performance (EP). Based on previous research, three models are common for evaluators when determining the value of real and expected performance: the optimistic, the neutral, and the pessimistic.


3. Case Study
The proposed model is applied to the IT department of a governmental organization in Iran. The current balanced score card does not satisfy managers' expectations, so the following procedure is proposed to modify it. A fuzzy AHP approach is selected for determining the weights of each item. The fuzzy synthetic extent values for the customer perspective are:

Customer(1) = (7.66, 9, 10.5) ⊗ (1/22.32, 1/19.58, 1/11.32) = (0.343, 0.429, 0.927)   (1)
Customer(2) = (3.18, 3.58, 4.56) ⊗ (1/22.32, 1/19.58, 1/11.32) = (0.142, 0.182, 0.402)   (2)
Customer(3) = (2.46, 3, 3.82) ⊗ (1/22.32, 1/19.58, 1/11.32) = (0.110, 0.153, 0.337)   (3)
Customer(4) = (3.12, 3.75, 4.44) ⊗ (1/22.32, 1/19.58, 1/11.32) = (0.139, 0.191, 0.392)   (4)

The degrees of possibility between these fuzzy values are evaluated as follows:

V(customer1 ≥ customer2) = 0.192,  V(customer1 ≥ customer3) = 0,  V(customer1 ≥ customer4) = 0.17   (5)
V(customer2 ≥ customer1) = 0.807,  V(customer2 ≥ customer3) = 0.770,  V(customer2 ≥ customer4) = 0.912   (6)
V(customer3 ≥ customer1) = 1,  V(customer3 ≥ customer2) = 1,  V(customer3 ≥ customer4) = 1   (7)
V(customer4 ≥ customer1) = 1,  V(customer4 ≥ customer2) = 0.9,  V(customer4 ≥ customer3) = 0.838

The priority weights are then calculated as follows:

d'(customer1) = min(0.192, 0, 0.17) = 0   (8)
d'(customer2) = min(0.807, 0.770, 0.912) = 0.770   (9)
d'(customer3) = min(1, 1, 1) = 1   (10)
d'(customer4) = min(1, 0.9, 0.838) = 0.838   (11)

The weight vector is calculated as follows:

W'(Customer) = (0, 0.770, 1, 0.838)   (12)

After normalization, the priority weights are calculated as follows:

(13)

The same procedure is iterated to calculate the weights of the other perspectives, and the final weights from the evaluation of all perspectives are presented in a table. The next step, after evaluating the weights, is finding the mental pattern of the evaluators. A nonlinear regression approach is used for this purpose, and is explained below for the best, fitted and exceeding patterns. This behavior is usually represented by an S-shaped curve. Under this assumption, the procedure for determining the membership function is explained for each mental pattern of the evaluators, namely the best, fitted and exceeding patterns, respectively. The parameter estimates for the best model are presented in Table 9.
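To illustrate the extent-analysis weighting used in equations (1)-(12) above, the sketch below computes Chang-style degrees of possibility and normalized priority weights from triangular fuzzy synthetic extents. The four triples are taken from equations (1)-(4); the helper names are assumptions, and the weights a plain application of this formula produces need not coincide with the values reported in equations (8)-(12), which follow the authors' own calculation.

def degree_of_possibility(m2, m1):
    """Chang's degree of possibility V(M2 >= M1) for triangular fuzzy numbers (l, m, u)."""
    l1, mid1, u1 = m1
    l2, mid2, u2 = m2
    if mid2 >= mid1:
        return 1.0
    if l1 >= u2:
        return 0.0
    return (l1 - u2) / ((mid2 - u2) - (mid1 - l1))

def priority_weights(extents):
    """Minimum degree of possibility of each extent over the others, then normalization."""
    d = []
    for i, si in enumerate(extents):
        d.append(min(degree_of_possibility(si, sj) for j, sj in enumerate(extents) if j != i))
    total = sum(d)
    return [x / total if total > 0 else 0.0 for x in d]

if __name__ == "__main__":
    # Fuzzy synthetic extents of the four customer indicators, from equations (1)-(4).
    customer = [
        (0.343, 0.429, 0.927),
        (0.142, 0.182, 0.402),
        (0.110, 0.153, 0.337),
        (0.139, 0.191, 0.392),
    ]
    print(priority_weights(customer))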

Figure 1. S-shaped fuzzy membership function fitted to the data set

The following equation is obtained for the optimistic model:

With

(14) As a result, the following equation is suggested for the neutral model:

With (15) The parameter estimates for the pessimistic model are presented in the following model:

(16)

Figure 2. Logistic fuzzy membership function fitted to the data set

Depending on the pattern used by the evaluators, the value assigned to each performance is converted from a scale of 1 to 5 to a scale from 0 to 1 [8]. The final step is integrating the scores for each perspective, calculated by the following formula:

(17) In this case study, 14 indicators are evaluated and then integrated within 4 different perspectives. The following form shows the procedure that evaluators are suggested to follow and the performance of the organization based on the proposed procedure.


Table 12. Final performance results form

Regarding the evaluators' scores for each perspective, the final score according to formula (14) is equal to 2.8518.

4. Conclusion
In this paper, a quantitative model for evaluating Balanced Score Card systems is proposed. In this model, the mental pattern of the evaluators is considered through a nonlinear regression approach. By involving behavioral memory in balanced score cards, the accuracy and validity of the evaluation system are improved. In addition, as evaluation terms are linguistic, a fuzzy AHP model is selected for estimating the weights of the items in each perspective and the weight of each perspective for reaching the goals. The BSC is not only a model for evaluation but also a model for determining the strategic goals of the organization; in this respect, the model bridges the evaluation system to strategic management. Although this model is applied in a single case study, more evaluation in other organizations is needed.

References

[1] Kaplan, R.S., and Norton, D.P., The balanced scorecard: translating strategy into action. Harvard Business School Press, 1996.

[2] Malina, M.A., and Selto, F.H., Communicating and controlling strategy: An empirical study of the effectiveness of the balanced scorecard. Journal of Management Accounting Research, 2001.

[3] Nørreklit, H., The balanced scorecard: what is the score? A rhetorical analysis of the balanced scorecard. Accounting, Organizations and Society, 2003.

[4] Lohman, C., Fortuin, L. & Wouters, M, Designing a performance measurement system: A case study. European Journal of Operational Research, 2004.

[5] DeBusk, G.K., Brown, R.M., and Killough, L.N., Components and relative weights in utilization of dashboard measurement systems like the Balanced Scorecard. The British Accounting Review, 2003.

[6] Chang, D.-Y., Applications of the extent analysis method on fuzzy AHP. European Journal of Operational Research, 1996.

[7] Papalexandris, A., Ioannou, G., Prastacos, G., and Soderquist, K.E., An integrated methodology for putting the balanced scorecard into action. European Management Journal, 2005.

[8] Azar, A., and A.D.Z., Improving balanced scorecard systems based on a fuzzy approach. The Third National Conference on Performance Management, 2007.


Comparison of Congestion Control Schemes in Wireless LAN using Satellite Link and ATM Networks

Dr. R. Seshadri1 and Prof. N. Penchalaiah2

1 Prof. & Director of University Computer Centre, S.V. University, Tirupati-517502, A.P. E-mail: [email protected]

2 Dept. of CSE, ASCET, Gudur-524101, A.P. E-mail: [email protected]

Abstract: The number of wireless Internet service users using wireless LANs or satellite links has increased. Broadband satellite Internet services have especially attracted attention because of their characteristics, such as global coverage and robustness. TCP (Transmission Control Protocol) is used by many typical applications over the satellite Internet. However, a typical TCP (such as TCP-New Reno), which was developed for wired networks, performs poorly in wireless networks. TCP-STAR tends to reduce the throughput of the typical TCP, and TCP-Fusion was developed for wired high-speed links. In ATM networks, the ATM Forum has chosen a rate-based scheme as an approach to congestion control for Available Bit Rate (ABR) services. We propose a new adaptive congestion control scheme, called the Self Detective Congestion Control (SDCC) scheme, for ATM networks. In this paper, we propose a TCP congestion control method for improving friendliness over satellite links by combining TCP-Fusion's and TCP-STAR's congestion control methods, together with the Self Detective Congestion Control (SDCC) scheme for ATM networks, and we evaluate the performance of the congestion control schemes in various networks.

1. Introduction
The long propagation delay of satellite links decreases the performance of the typical TCP. TCP-STAR [1][2][3] has been proposed to solve this problem and to improve throughput over the satellite Internet by modifying the TCP congestion control method. TCP-STAR achieves high-speed communication by using the estimated bandwidth; however, if TCP-STAR coexists with the typical TCP, it tends to reduce the throughput of the typical TCP. On the other hand, TCP-Fusion has been proposed for wired high-speed networks. TCP-Fusion, which uses delay-based and loss-based congestion control, achieves scalability and friendliness to the typical TCP, but it cannot obtain high performance over the satellite Internet, since it was developed for wired high-speed links. TCP-STAR consists of three mechanisms: Congestion Window Setting (CWS) based on available bandwidth, Lift Window Control (LWC), and Acknowledgment Error Notification (AEN). In CWS and LWC, TCP-STAR uses ABE (Available Bandwidth Estimation) of TCP-J [4][5] as the available bandwidth estimation method. In order to satisfy the efficiency and friendliness tradeoff of TCP, TCP-Fusion combines a loss-based protocol and a delay-based protocol. The key concept of TCP-Fusion is that the congestion window is increased aggressively whenever the network is estimated to be underutilized; when the network is fully utilized, TCP-Fusion tends to perform like the typical TCP. Therefore, TCP-Fusion tries to utilize the residual capacity effectively without impact on coexisting typical TCP flows.

The Asynchronous Transfer Mode (ATM) is recommended as a transfer mode for the future B-ISDN. In ATM networks, data from all types of communications services is treated in the same way: all data packets are segmented into fixed-length cells. Data from different sources require different transmission characteristics, so two classes of traffic services, guaranteed and ABR services, are required in ATM networks. To cope with congestion control for the ABR services, a rate-based scheme is considered to be the best approach [8]. Several rate-based schemes have been proposed by the ATM Forum, such as FECN [9], BECN [10], and PRCA [11]. The FECN scheme uses Explicit Forward Congestion Indication (EFCI) as a single bit to indicate congestion in the forward direction of the VC. In the BECN scheme, the notification cell is sent directly from the congested point to the source. Both the FECN and BECN schemes are based on a negative feedback rate control paradigm: a source reduces the cell transmission rate when it receives congestion notification cells, and if it does not receive congestion notification cells within a predetermined period of time, it increases the current transmission rate until it reaches the peak cell rate. However, if all notification cells in the backward direction experience extreme congestion, all the sources will increase their rate to the peak cell rate, so an overall network congestion collapse may occur. In order to deal with this problem, PRCA uses a positive feedback rate control paradigm instead of the negative one. However, unfair distribution of the available bandwidth among VCs may occur, because data cells from a VC passing through more congested links will be marked more often than those from VCs passing through fewer congested links. Thus, VCs with more congested links in their path will suffer from starvation, because their Allowed Cell Rate (ACR) is lower than that of others. To resolve the


problems of the existing rate-based schemes, we propose a new adaptive scheme, called SDCC scheme, based on the following two basic concepts:

1. Positive feedback rate control which resolves the problems of the FECN and BECN schemes.

2. Intelligent holding or selectively holding Resource Management (RM) cells which resolves the problem of the PRCA.

2. Congestion Control in Satellite Internet
In order to overcome the problems of existing methods, we propose a new TCP congestion control method that can obtain good friendliness toward the typical TCP. The proposed method combines TCP-Fusion with the CWS and LWC mechanisms of TCP-STAR. It is expected that the proposal will obtain good friendliness through TCP-Fusion and higher throughput through CWS and LWC over the satellite Internet. The following subsections present the details of the proposed method.

2.1. Congestion Window Decrement

When the proposal detects packet losses, the congestion window (cwnd) and slow start threshold (ssthresh) are set by using CWS.

2.1.1 Detection of Packet Losses by Duplicate ACKs: If the proposed method detects packet losses by duplicate ACKs, it sets cwnd and ssthresh by Eq. (1). In Eq. (1), the quantities involved are the updated congestion window, the previous congestion window, the estimated available bandwidth BWRE, the minimum round-trip time, and the packet size packet_size, respectively. BWRE is obtained by using Rate Estimation (RE), which is one of the mechanisms of TCP-Westwood [6][7].

2.1.2 Detection of Packet Losses by Retransmission Timeout: If a retransmission timeout occurs, the proposed method sets cwnd and ssthresh by Eq. (2).

2.2. Congestion Window Increment
The original TCP-Fusion has three phases (an increment phase, a decrement phase, and a steady phase) when updating the congestion window. In the proposed method, we apply LWC of TCP-STAR in the increment phase of TCP-Fusion; the window control of the decrement phase and the steady phase is the same as in the original TCP-Fusion. Eq. (3) shows the congestion window behavior of the proposed method, which uses LWC.

In Eq. (3), diff, the lower-bound threshold, and target_win indicate the estimated number of packets in the bottleneck router queue, the threshold used to switch between the three phases, and the additional window size given by LWC, respectively. target_win is calculated by Eq. (4).

In Eq. (4), BWABE is the bottleneck bandwidth, obtained by using ABE of TCP-STAR.
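Since the paper's equations (1)-(4) are not reproduced here, the following schematic sketch only illustrates the general shape of such a hybrid delay-based/loss-based window update; the specific update rules, thresholds and constants are assumptions made for illustration, not the proposed method itself.

# Schematic sketch of a hybrid (delay-based / loss-based) congestion window update
# in the spirit of the proposal; the update rules below are illustrative assumptions.

class HybridCongestionControl:
    def __init__(self, packet_size=1460, alpha=3):
        self.cwnd = 10.0          # congestion window (packets)
        self.packet_size = packet_size
        self.alpha = alpha        # assumed lower-bound threshold on queued packets

    def on_duplicate_acks(self, bw_re, rtt_min):
        # CWS-style setting: size the window to the estimated bandwidth-delay product.
        self.cwnd = max(2.0, bw_re * rtt_min / self.packet_size)

    def on_timeout(self):
        # Conservative reset after a retransmission timeout.
        self.cwnd = 2.0

    def on_ack(self, bw_abe, rtt, rtt_min):
        # diff: estimated number of packets queued at the bottleneck.
        diff = self.cwnd * (rtt - rtt_min) / rtt
        if diff < self.alpha:
            # Increment phase: grow toward the bandwidth-delay product (LWC-like lift).
            target_win = bw_abe * rtt_min / self.packet_size
            self.cwnd += max(1.0, (target_win - self.cwnd) / self.cwnd)
        else:
            # Steady phase: behave like the typical TCP (roughly one packet per RTT).
            self.cwnd += 1.0 / self.cwnd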

3. Congestion control in ATM networks (SDCC Scheme) In this section, we describe the basic operation of the SDCC scheme.

3.1 Network Elements
The network elements of a VC implementing a rate-based end-to-end feedback control loop, namely the Virtual Connection Source (VCS), the Virtual Connection Destination (VCD) and the ATM Switch (SW), are shown in Fig. 1. The VCS and VCD generate and receive ATM cells and are the end points of a VC. The VC is a bidirectional connection; the forward and backward connections have the same virtual connection identifiers and pass through identical transmission facilities. The VCS must have the ability to transmit cells into the network at a variable and controlled rate, from a predetermined minimum cell rate up to the peak cell rate. On the other hand, the VCD must return every received RM cell to the VCS on the backward connection in order to support the closed-loop control scheme.


Figure 1. Network elements of a VC

ATM switches route ATM cells from the VCS to the

VCD, and each ATM switch has an identifier in the form of an address number. We assume that an ATM switch has output buffers divided into two parts: one for data cell use and the other one for RM cell use. Each part of the buffer implements FIFO cell queuing discipline. The buffer service algorithm always serves the RM cells in preference to the data cells. That is, RM cells have a higher service priority than data cells.

3.2 Basic Operation
The model of the basic SDCC scheme, in which every VC has one RM cell for congestion control purposes, is illustrated in Fig. 2. We refer to this scheme as the self detective congestion control scheme because the VCS itself generates and sends out the RM cell to detect congestion that may occur at its bottleneck switch. The bottleneck switch is the switch with the narrowest bandwidth among the switches in a VC. The bottleneck switch is considered to be known from the initial routine, and its address is written in the SWI field of the RM cell. The VCS starts a timer every time it transmits an RM cell. The timer counts down from a predetermined value, CD_Time (Congestion Detection Time), until it reaches 0. CD_Time is a variable of the VCS, determined before the RM cell is sent out for the first time. The VCD returns the received RM cell to the VCS on the backward connection. If the output buffer of the bottleneck switch is congested, the bottleneck switch will hold RM cells flowing in the backward direction until it recovers from the congestion state. The bottleneck switch passes all received RM cells without considerable delay if no congestion is detected, since the RM cells have higher priority.

Figure 2. VCs with their RM cells

When the timer expires and the RM cell has not yet returned, the VCS considers its bottleneck switch to be congested and decreases the cell rate at regular predetermined intervals until it receives the RM cell. On receiving the RM cell, the VCS considers its bottleneck switch decongested if the timer is still running, or recovered from the congestion state otherwise. The VCS will then increase its cell rate in proportion to the current rate. The RM cell is sent out to the network again once the VCS has transmitted a Number of Cells (NC) or more data cells since the RM cell was sent out in the previous round. This procedure reduces the amount of RM cell traffic when the cell rate of the VC is low or when the VC enters an idle state. The VCS is considered idle if it has not transmitted any data cell in an interval of length corresponding to the NC interval at the current rate.

By implementing intelligent holding, the VCs sharing the same bottleneck link will share the available bandwidth fairly. Intelligent holding means that the RM cells belonging to the VCs are not held by every congested switch, but only by the bottleneck switch when it is congested. Fair sharing occurs because, in the equilibrium state, the VCs increase and decrease their rates to approximately the same level.
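The source-side behavior described above can be summarized in the following sketch, which models the timer-driven rate decrease and the increase on RM-cell return; the numerical factors, the CD_Time value and the NC threshold are illustrative assumptions rather than parameters specified in the paper.

# Schematic sketch of the SDCC source (VCS) behavior described above.
# Numerical parameters (CD_Time, rate factors, NC) are illustrative assumptions.

class SDCCSource:
    def __init__(self, peak_cell_rate, min_cell_rate, cd_time=0.05, nc=32):
        self.acr = min_cell_rate          # allowed cell rate (cells/s)
        self.pcr = peak_cell_rate
        self.mcr = min_cell_rate
        self.cd_time = cd_time            # congestion detection timer (s)
        self.nc = nc                      # data cells to send before next RM cell
        self.timer = None
        self.cells_since_rm = 0

    def send_rm_cell(self, now):
        """Send the RM cell toward the bottleneck switch and start the timer."""
        self.timer = now + self.cd_time
        self.cells_since_rm = 0

    def on_tick(self, now):
        """Called periodically; decrease the rate while the timer has expired."""
        if self.timer is not None and now > self.timer:
            self.acr = max(self.mcr, self.acr * 0.95)   # timer expired: assume congestion

    def on_rm_cell_returned(self, now):
        """RM cell came back: bottleneck considered decongested, increase the rate."""
        self.timer = None
        self.acr = min(self.pcr, self.acr * 1.05)       # increase proportional to current rate

    def on_data_cell_sent(self, now):
        self.cells_since_rm += 1
        if self.cells_since_rm >= self.nc:
            self.send_rm_cell(now)                      # next RM round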

4. Comparisons between Satellite Internet and ATM Networks
Satellite Internet access is Internet access provided through satellites. The service can be provided to users worldwide through Low Earth Orbit (LEO) satellites. Geostationary satellites can offer higher data speeds, but their signals cannot reach some polar regions of the world. Different types of satellite systems have a wide range of different features and technical limitations, which can greatly affect their usefulness and performance in specific applications. Satellite Internet customers range from individual home users with one PC to large remote business sites with several hundred PCs. Home users tend to make use of shared satellite capacity to reduce the cost, while still allowing high peak bit rates when congestion is absent. There are usually restrictive time-based bandwidth allowances so that each user gets a fair share, according to their payment. When a user exceeds their megabyte allowance, the company may slow down their access, deprioritise their traffic, or charge for the excess bandwidth used.

Asynchronous Transfer Mode is a cell-based switching technique that uses asynchronous time division multiplexing. It encodes data into small fixed-sized cells (cell relay) and provides data link layer services that run over OSI Layer 1 physical links. This differs from other technologies based on packet-switched networks (such as the Internet Protocol or Ethernet), in which variable-sized packets (known as frames when referencing Layer 2) are used. ATM exhibits properties of both circuit-switched and small-packet-switched networking, making it suitable for wide area data networking as well as real-time media transport. ATM uses a connection-oriented model and establishes a virtual circuit between two endpoints before the actual data exchange begins. ATM is a core protocol used


over the SONET/SDH backbone of the Integrated Services Digital Network.

Table 1: Comparison between satellite Internet and ATM networks

Property      Satellite internet   ATM network
Packet size   Variable length      Fixed length
Connection    Wireless             Connection oriented
Mechanism     Traffic shaping      Traffic shaping
Bandwidth     High                 Low

5. Conclusion
This paper proposed a new TCP congestion control method for improving TCP friendliness over the satellite Internet, and a new adaptive rate-based scheme, called SDCC, for congestion control in ATM networks. The SDCC scheme uses positive feedback rate control and intelligent holding of RM cells in order to resolve the problems of FECN, BECN and PRCA. Each proposed method performs well in its respective network, and congestion control in the satellite Internet is carried out more effectively than in ATM networks when traffic conditions are good.

References

[1] H. Obata, S. Takeuchi, and K. Ishida, “A New TCP Congestion Control Method Considering Adaptability over Satellite Internet,” Proc. 4th International Workshop on Assurance in Distributed Systems and Networks, pp. 75-81, 2005.

[2] H. Obata, K. Ishida, S. Takeuchi, and S. Hanasaki, “TCP-STAR: TCP Congestion Control Method for Satellite Internet, ” IEICE Transactions on Communications, Vol.E89-B, No.6, pp. 1766-1773, 2006.

[3] H. Obata and K. Ishida, “Performance Evaluation of TCP Variants over High Speed Satellite Links,” Proc. 25th AIAA International Communications Satellite Systems Conference, no. AIAA2007-3156 (14 pages), 2007.

[4] N. Sato, M. Kunishi, and F. Teraoka, “TCP-J: New Transport Protocol for Wireless Network Environments,” IPSJ Journal, Vol.43, No.12, pp.3848-3858, 2002 (in Japanese).

[5] S. Saito and F. Teraoka, “Implementation, Analysis and Evaluation of TCP-J: A New Version of TCP for Wireless Networks,” IEICE Transactions on Communications, Vol.J87-D-1, No.5, pp.508-515, 2004 (in Japanese).

[6] C. Casetti, M. Gerla, S. Mascolo, M. Y. Sanadidi, and R. Wang, “TCP Westwood: Bandwidth Estimation for Enhanced Transport over Wireless Links,” Proc. ACM Mobicom 2001, pp.287-297, 2001.

[7] R. Wang, M. Valla, M. Y. Sanadidi, B. K. F. Ng, and M. Gerla, “Efficiency/Friendliness Tradeoffs in TCP Westwood,” Proc. IEEE SCC 2002, pp.304-311, 2002.

[8] H. T. Kung and R. Morris, "Credit-based Flow Control for ATM Networks", IEEE Network, pp. 40-48, March/April 1995.

[9] M. Hluchyj and N. Yin, "On Closed-loop Rate Control for ATM Networks", Proc. INFOCOM’94, pp. 99-108, 1994.

[10] P. Newman, "Backward Explicit Congestion Notification for ATM Local Area Networks", Proc. IEEE GLOBECOM’93, Vol. 2, pp. 719-723, December 1993.

[11] K. Y. Siu and H. Y. Tzeng, "Adaptive Proportional Rate Control for ABR Service in ATM Networks", Proc. INFOCOM’95, pp. 529-535, 1995.

[12] ATM Forum, "ATM User-Network Interface Specification", Ver. 3.0, Prentice Hall, 1993.

[13] H. T. Kung and R. Morris, "Credit-based Flow Control for ATM Networks", IEEE Network, pp. 40-48, March/April 1995.

[14] P. Newman, "Traffic Management for ATM Local Area Networks", IEEE Commun. Mag., Vol. 32, No. 8, pp. 44-50, August 1994.

Author’s Profile

Dr. R. Seshadri is working as Professor & Director, University Computer Centre, Sri Venkateswara University, Tirupati. He completed his PhD at S.V. University in 1998 in the field of “Simulation Modeling & Compression of E.C.G. Data Signals (Data Compression Techniques), Electronics & Communication Engg.”. He has rich knowledge in the research field and is guiding 10 PhD students, both full-time and part-time. He has vast teaching experience of 26 years. He has published 10 papers in national and international conferences and 8 papers in different journals.

Prof. N. Penchalaiah is a Research Scholar at S.V. University, Tirupati, and is working as Professor in the CSE Dept., ASCET, Gudur. He completed his M.Tech at Sathyabama University in 2006. He has 10 years of teaching experience and has guided PG & UG projects. He has published 2 international journal papers and 2 national conference papers.


Efficient Coverage and Connectivity with Minimized Power Consumption in Wireless Sensor Networks

A. Balamurugan1, T. Purushothaman2, S. UmaMaheswari3

1 Department of Information Technology , V.L.B.Janakiammal College of Engineering and Technology, Coimbatore, Tamilnadu, India

2 Government College of Technology, Coimbatore, Tamilnadu, India

3Department of Information Technology , V.L.B.Janakiammal College of Engineering and Technology, Coimbatore, Tamilnadu, India

Abstract: Wireless Sensor Networks use sensor nodes to sense data from a specified region of interest and collect it at a centralized location for further processing and decision-making. Due to their extremely small dimensions, sensor nodes have a very limited energy supply. Further, it is hard to recharge the battery after deployment, either because the number of sensor nodes is too large or because the deployment area is hostile. Therefore, conserving energy and prolonging system lifetime is an important challenge in Wireless Sensor Networks. Maintaining coverage and connectivity also becomes an important requirement, so that the network can guarantee the quality of its monitoring service. This paper addresses the problem of minimizing power consumption in each sensor node locally while ensuring two global properties: (i) connectivity and (ii) coverage. A sensor node saves energy by suspending its sensing and communication activities according to a Markovian stochastic process. This paper presents a Markov model and its solution for steady-state distributions to determine the operation of a single node. Given the steady-state probabilities, a non-linear optimization problem is constructed to minimize the power consumption. Increasing the lifetime of the network by minimizing the power consumed in each node is achieved through the scheduling of nodes according to the Markov model.

Keywords: Wireless Sensor Network, Coverage, Connectivity, Markov model, ns-2

1. Introduction

The Wireless Sensor Networks of the near future are envisioned to consist of hundreds to thousands of inexpensive wireless nodes, each with some computational power and sensing capability, operating in an unattended mode. They are intended for a broad range of environmental sensing applications from vehicle tracking to habitat monitoring. The hardware technologies for these networks – low cost processors, miniature sensing and radio modules – are available today, with further improvements in cost and capabilities expected within the next decade. The applications, networking principles and protocols for these systems are just beginning to be developed.

Figure 1: The Structure of Wireless Sensor Networks

The basic structure of a wireless sensor network is shown in Figure 1. A sensor network consists of a sink node, which subscribes to specific data streams by expressing interests or queries. The sensors in the network act as “sources” which detect environmental events and push relevant data to the appropriate subscriber sinks [2]. Because of the requirement of unattended operation in remote or even potentially hostile locations, sensor networks are extremely power-limited. However, since various sensor nodes often detect common phenomena, there is likely to be some redundancy in the data the various sources communicate to a particular sink. In-network filtering and processing techniques can help to conserve the scarce power resources.

Power is a paramount concern in wireless sensor network applications that need to operate for a long time on battery power. For example, habitat monitoring may require continuous operation for months, and monitoring civil structures (e.g., bridges) requires an operational lifetime of several years. Recent developments have found that significant power savings can be achieved by dynamic management of node duty cycles in sensor networks with high node density [6]. In this approach, some nodes are scheduled to sleep (or enter a power saving mode) while the remaining active nodes provide continuous service. A fundamental problem is to minimize the number of nodes that remain active, while still achieving acceptable quality


of service for applications. In particular, maintaining sufficient sensing coverage and network connectivity with the active nodes are critical requirements in sensor networks [8, 10, 11].

To accomplish complete data collection, the sensor nodes need to actively sense or cover all ‘points of interest’ in the region. The ‘Coverage’ requirement ensures that all the target points in the network are covered by at least one active sensor node at all times. Once an event has been detected by one of the sensor nodes, the information needs to be propagated to the base station [14, 15, 16]. The ‘Connectivity’ requirement ensures that any active sensor is able to transmit or communicate to the monitoring station at all times [17].

Minimizing power consumption and prolonging the system lifetime is an important issue for Wireless Sensor Networks. Connectivity and Coverage also has an important role in Wireless Sensor Networks. There are several ways to achieve these. Power aware routing protocols [7] are used most often in sensor nodes. According to these protocols, nodes spend their time in sense state, which consumes as much power as reception. The nodes never go to off state. So, most of the power is spent while sensing, and in order to decrease the power consumption the node should be turned off.

There are two routing protocols, BECA (Basic Energy Conservation Algorithm) and AFECA (Adaptive Fidelity Energy Conservation Algorithm) [9], which have a Markov model with sleeping, listening and active states. In BECA, the time spent by nodes in particular states is deterministic. In AFECA it is adaptive, the sleeping time being a random variable that depends on the number of neighbors the node has. Using these protocols, 55% of the power can be saved.

The GAF (Geographic Adaptive Fidelity) routing protocol [5] aims to extend the lifetime of the network by minimizing the power consumption and preserving connectivity at the same time. It uses a 3-state transition diagram. GAF simply imposes a virtual grid on the network. If in any of the grid squares there are more than one node, the redundant nodes are turned off. In addition, a protocol called CEC (Cluster-based Energy Conservation) is used, which further eliminates redundant nodes by clustering them. About 40-60% of power can be saved.

There are some results relating the power level to connectivity. They show, using percolation theory, that in order to have connectivity in a network with randomly placed nodes, the ratio of the number of neighbors to the total number of nodes should be (log n + c)/n, where c should go to infinity asymptotically.

Coverage problem can be overcome by using Voronoi diagrams [12], generated with Delaunay triangulation to calculate the coverage of the network.

Coverage and connectivity are jointly considered using a grid of sensors, each of which can fail probabilistically. Here, for the network to have connectivity and coverage, the number of active nodes within the transmission radius should be of the order of the logarithm of the total number of nodes, the diameter of the network is of order √(n / log n), and there should be at least one active node within range for coverage and connectivity.

This paper presents a rigorous analysis and optimization of the local decisions governing the operation of a sensor node. The objective is to ensure both connectivity and coverage in the network while minimizing power usage at each node. A randomized algorithm is run locally at a sensor node to govern its operation. Each node conserves energy by asynchronously and probabilistically turning itself off. The probabilities of staying in the off, sense/receive, and transmit states ensure connectivity and coverage in the network. The problem of finding probabilities that maximize energy saving while ensuring both connectivity and coverage is expressed as an optimization problem defined by node parameters.

2. Principles and Design Issues of Wireless Sensor Networks

Sensing coverage characterizes the monitoring quality provided by a sensor network in a designated region [6]. Different applications require different degrees of sensing coverage. While some applications may only require that every location in a region be monitored by one node, other applications require significantly higher degrees of coverage. For example, distributed event detection requires every location be monitored by multiple nodes, and distributed tracking and classification requires even higher degrees of coverage. The coverage requirement also depends on the number of faults that must be tolerated. The coverage requirement may also change after a network has been deployed due to changes in application modes or environmental conditions. For example, a surveillance sensor network may initially maintain a low degree of coverage required for distributed detection. After an intruder is detected, however, the region in the vicinity of the intruder must reconfigure itself to achieve a higher degree of coverage required for distributed tracking. Sensing is only one responsibility of a sensor network.

To operate successfully a sensor network must also provide satisfactory connectivity so that nodes can communicate for data fusion and reporting to base stations. The active nodes of a sensor network define a graph with links between nodes that can communicate the event information to the base station. Connectivity affects the robustness and achievable throughput of communication in a sensor network. Most sensor networks must remain connected, i.e., the active nodes should not be partitioned by dynamic scheduling of node states. However, single connectivity is not sufficient for many sensor networks because a single failure could disconnect the network. At a minimum, redundant potential connectivity through the inactive nodes can allow a sensor network to heal after a fault that reduces its connectivity, by activating particular inactive nodes. Greater connectivity may also be necessary to maintain good throughput by avoiding communication bottlenecks [13].

Although achieving power conservation by scheduling nodes to sleep is not a new approach, none of the


existing protocols satisfy the complete set of requirements in sensor networks. Most existing solutions have treated the problems of sensing coverage and network connectivity separately. The combination of coverage and connectivity is a special requirement introduced by Wireless Sensor Networks. This paper explores the problem of power conservation while maintaining both coverage and connectivity in Wireless Sensor Networks.

Apart from the above-mentioned principles and concepts, a Wireless Sensor Network design is influenced by many factors, which include fault tolerance, scalability, production costs, operating environment, sensor network topology, hardware constraints and transmission media. These factors are addressed by many previous works; however, none of these has a fully integrated view of all the factors driving the design of Wireless Sensor Networks and sensor nodes. These factors are important because they serve as a guideline for designing a protocol or an algorithm for Wireless Sensor Networks. Although the design of Wireless Sensor Networks is influenced by various factors, power consumption has the highest priority; along with power consumption, coverage and connectivity also play an important role.

3. Introduction to the Markov Model

A Markov chain governs the behavior of an individual node. Using this Markov model [1], each sensor node makes an independent decision regarding which state it should be in at a given time. A node transitions between states depending on the events that occur in its vicinity. The transitions are governed by a set of parameters. The issue is to determine the optimal parameters governing the probabilistic transitions of a sensor node so as to minimize power consumption locally while ensuring connectivity and coverage globally.

3.1 The Markov Model. In the Markov model each node is considered to be a three-state Markov chain. The three states are the off state O, the sense/receive state S, and the transmit state T. Considering a particular node, its transition matrix depends on the state of its environment. The environment of a node can be in one of two states: either a sense/receive event is occurring or no such event is occurring. Figure 2 shows the Markov state diagram in each of these cases, along with the Markov transition probability matrices, M when there is an event and M̄ when there is no event. Notice that when a sensing event occurs, the node will always transition to the transmit state; this requirement can be relaxed. There is also an ambiguity if both sensing and receiving events occur, in which case it is assumed that the node always attempts to transmit the sensed event rather than the received event. At time t, there is some probability that the node is in each of its three states. Denote pO, pS, pT as the respective probabilities of finding the node in the off, sensing/receiving and transmit states, and collect these three probabilities into the vector

p(t) = [pO(t), pS(t), pT(t)]   …..(1)

Let p E be the probability that there is an event. Then the state probabilities for the node at time t +1 are given by

p(t + 1) = p(t)[pE M + (1 - pE) M̄]   …..(2)

where the transition probabilities α, β, γ and δ of the matrices lie between 0 and 1 and each row of the matrices sums to 1.

Figure 2 : Markov State Diagram and Transition Matrix

Since an event can be either sensing or receiving, the probability of an event will depend on the probability that a single neighbor is transmitting. Suppose that the system has equilibrated to a steady state, in which

p(t + 1) = p(t) = ps   …..(3)
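A minimal sketch of this fixed-point computation is given below: it iterates equation (2) until the state vector stops changing. The two 3x3 transition matrices are illustrative assumptions standing in for the matrices of Figure 2, and the event probability is fixed for simplicity.

# Iterate p(t+1) = p(t) [ pE * M_event + (1 - pE) * M_no_event ] to a steady state.
# The transition matrices below (states ordered O, S, T) are illustrative assumptions.

def steady_state(M_event, M_no_event, p_event, p0=(1.0, 0.0, 0.0), tol=1e-10):
    """Iterate the state vector under the combined transition matrix until convergence."""
    M = [[p_event * M_event[i][j] + (1 - p_event) * M_no_event[i][j]
          for j in range(3)] for i in range(3)]
    p = list(p0)
    while True:
        nxt = [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]
        if max(abs(nxt[j] - p[j]) for j in range(3)) < tol:
            return nxt
        p = nxt

if __name__ == "__main__":
    M_event =    [[0.7, 0.3, 0.0],   # off row: illustrative values
                  [0.0, 0.0, 1.0],   # sense/receive -> transmit when an event occurs (per the text)
                  [0.2, 0.6, 0.2]]   # transmit row: illustrative values
    M_no_event = [[0.7, 0.3, 0.0],   # illustrative values throughout
                  [0.4, 0.5, 0.1],
                  [0.3, 0.7, 0.0]]
    print(steady_state(M_event, M_no_event, p_event=0.1))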

3.2 Determination of the Probability for an Event to Occur. A mean-field approximation [1] is made such that all the neighbors of the node are in the same steady state and can be treated as independent, in which case

pE, the probability for an event to occur, can be computed as follows. Let pSE be the probability of a sensing event and pRE be the probability of a receiving event. Then pE is given by the following equation:

p[Sense or Receive] = pSE + pRE - pSE pRE   ……(4)

Determination of p SE(Probability of the sensing event):

p SE will be related to the sensing radius and the sensing event density. Probability of a node to sense an event :

ss pr 2π


Probability of a node not to sense an event:

ss pr 21 π− For m nodes, Probability for all nodes not to sense an event :

mss pr )1( 2π−

Probability to sense an event :

m

ss pr )1(1 2π−− m

ssSE pr )1(1p 2π−−= ……(5)

Determination of $p_{RE}$ (probability of a receiving event): $p_{RE}$ is the probability that exactly one of the node's neighbors is transmitting. The assumption is that, to a first-order approximation, the state probabilities of the neighbors are independent. If the transmit radius is $r_T$, then, assuming that the disks lie in the unit torus, the probability that a node is within transmitting range of our node is $\pi r_T^2$, and the number of neighbors K has a binomial distribution

$p[K] = B(K;\, n-1,\, \pi r_T^2)$ …..(6)

where $B(K; N, p) = \binom{N}{K} p^{K} (1-p)^{N-K}$. Multiplying the probability that exactly one of the K neighbors is transmitting by $p[K]$ and summing over K, the expression for $p_{RE}$ is obtained:

$p_{RE} = (n-1)\,\pi r_T^2 p_T\,(1 - \pi r_T^2 p_T)^{n-2}$ …..(7)

Substituting equations (5) and (7) into equation (4), $p_E$ is obtained as

$p_E = p_{SE} + (n-1)(1 - p_{SE})\,c\,(1 - c)^{n-2}$ …..(8)

where $c = \pi r_T^2 p_T$. Using the value obtained from equation (8), the future state of the node can be calculated using equation (2).
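The derivation above can be turned into a small fixed-point computation: starting from an arbitrary state distribution, Eq. (2) is iterated, with $p_E$ recomputed from Eqs. (5), (7) and (8) at every step, until the distribution stops changing. The following Python sketch illustrates this; the numeric transition matrices, the event density and the radii are illustrative assumptions, not values taken from the paper.

    import numpy as np

    # Minimal sketch of the mean-field steady state (Eqs. 2, 5, 7, 8).
    # All numeric values below are assumed for illustration only.
    n, m = 50, 50              # number of nodes (Eq. 7) and nodes in Eq. (5)
    r_s, r_t = 0.05, 0.10      # sensing and transmission radii (unit torus)
    event_density = 0.01       # assumed sensing-event density p_s in Eq. (5)

    # State order: [Off, Sense/Receive, Transmit].  Assumed matrices:
    M_event = np.array([[0.5, 0.5, 0.0],    # an off node cannot detect the event (assumed)
                        [0.0, 0.0, 1.0],    # a sensing/receiving node moves to Transmit
                        [0.5, 0.5, 0.0]])   # after transmitting, return to Off/Sense (assumed)
    M_noevent = np.array([[0.7, 0.3, 0.0],
                          [0.4, 0.6, 0.0],
                          [0.5, 0.5, 0.0]])

    def p_event(p):
        """Eq. (8): probability that a sensing or receiving event occurs."""
        p_se = 1.0 - (1.0 - np.pi * r_s**2 * event_density) ** m   # Eq. (5)
        c = np.pi * r_t**2 * p[2]                                  # pi * r_T^2 * p_T
        p_re = (n - 1) * c * (1.0 - c) ** (n - 2)                  # Eq. (7)
        return p_se + (1.0 - p_se) * p_re

    p = np.array([1/3, 1/3, 1/3])            # initial state distribution
    for _ in range(1000):                    # fixed-point iteration of Eq. (2)
        pe = p_event(p)
        p_next = p @ (pe * M_event + (1.0 - pe) * M_noevent)
        if np.allclose(p_next, p, atol=1e-12):
            break
        p = p_next

    print("steady state [p_O, p_S, p_T]:", np.round(p, 4), " p_E:", round(p_event(p), 4))

The resulting steady-state vector is what the coverage, connectivity and power expressions in Section 4 operate on.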

4. Analysis and Optimization

One important issue in Wireless Sensor Networks is optimizing power consumption, which is done by allowing only a subset of sensor nodes to operate in active mode while fulfilling two requirements, coverage and connectivity. The coverage requirement ensures that the monitored area is not smaller than the area that could be monitored by the full set of sensors [14, 15, 16]. The connectivity requirement ensures that the sensor network remains connected, so that the information collected by the sensor nodes can be relayed to the sink node [17].

4.1 Coverage. Assume that n sensors are deployed in a region T. Let $r_S$ be the sensing radius and $r_T$ the transmission radius. A point $x \in T$ is covered if there is a node in the sensing state within $r_S$ of x; in this case, an event that occurs at x will be detected. The probability that a given node is sensing and within $r_S$ of x is $\pi r_S^2 p_S$. Under the independence assumption, the probability that no node can sense an event at x, i.e., the probability that x is not covered, is $(1 - \pi r_S^2 p_S)^n$. The coverage function is defined as

$f(x) = 1$ if x is not covered, and $f(x) = 0$ otherwise …..(9)

so that

$p[f(x) = 1] = (1 - \pi r_S^2 p_S)^n$ …..(10)

Let A be the area that is not covered. Then

$A = \int f(x)\,dx$ …..(11)

and therefore

$E[A] = \int p[f(x) = 1]\,dx = (1 - \pi r_S^2 p_S)^n$ …..(12)

Thus, the expected area covered is

$1 - E[A] = 1 - (1 - \pi r_S^2 p_S)^n$ …..(13)

and the expected coverage satisfies

$1 - (1 - \pi r_S^2 p_S)^n \ge 1 - e^{-n \pi r_S^2 p_S}$ …..(14)

since $\pi r_S^2 p_S \le 1$.
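As a quick numerical check of Eqs. (13) and (14), the following Python sketch computes the expected coverage and the exponential lower bound; the values of n, $r_S$ and $p_S$ are illustrative assumptions.

    import numpy as np

    # Sketch of the expected-coverage expressions in Eqs. (13)-(14), unit deployment area.
    n = 100      # number of deployed sensors (assumed)
    r_s = 0.06   # sensing radius (assumed)
    p_S = 0.3    # steady-state probability that a node is in the sensing state (assumed)

    a = np.pi * r_s**2 * p_S                 # prob. a given node senses a point x
    exp_coverage = 1.0 - (1.0 - a) ** n      # Eq. (13): expected fraction covered
    lower_bound = 1.0 - np.exp(-n * a)       # Eq. (14): 1 - e^(-n*pi*r_S^2*p_S)

    print(f"expected coverage = {exp_coverage:.4f}  (>= bound {lower_bound:.4f})")
    assert exp_coverage >= lower_bound       # holds since 0 <= a <= 1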

4.2 Connectivity. There are two possible notions of connectivity for a sensor network. The first considers only the topology of the connectivity graph that can be derived from the sensor network. The second is a more stringent condition that also considers contention issues in the network. The existing results use the first definition, and that convention is followed here; however, some heuristics for addressing the second notion of connectivity are also presented. The goal of connectivity can be summarized as follows; the situation is illustrated in Figure 3. Assume that n sensors are deployed in a region T. Let $r_S$ be the sensing radius and $r_T$ the transmission radius. Suppose a sensing event fires at some position $x \in T$ and is to be transmitted to $y \in T$. For any x and y, the occurrence of the event must be transmitted successfully with high probability.


Figure 3: Connectivity, transmission of event information from x to y. Legend: x, source; y, destination; $s_0, s_1, s_2, s_3$, sensor nodes; $r_T$, transmission radius; $r_S$, sensing radius.

A path exists from x to y if there is a sequence of nodes in the receiving state (which is the same as the sensing state) at locations $s_0, s_1, \ldots, s_k$ such that the following conditions are satisfied:

p1: $\|x - s_0\| \le r_S$ (the event at x can be sensed by $s_0$);
p2: $\|s_i - s_{i-1}\| \le r_T$ for $i = 1, \ldots, k$ (the event can be transmitted from $s_{i-1}$ to $s_i$, and it will be received since $s_i$ is in the receiving state);
p3: $\|s_k - y\| \le r_T$ ($s_k$ can transmit to y).

A sketch of this path-existence check is given below.
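The conditions p1, p2 and p3 amount to a reachability check over the nodes that are currently in the sense/receive state. A minimal Python sketch of such a check follows; the node positions, awake set and radii are randomly generated illustrative assumptions, not part of the original scheme.

    import numpy as np
    from collections import deque

    # Sketch of the path-existence check behind conditions p1-p3 (unit square, assumed data).
    rng = np.random.default_rng(0)
    nodes = rng.random((50, 2))                    # node positions (assumed)
    awake = rng.random(50) < 0.4                   # True if node is in the sense/receive state (assumed)
    r_s, r_t = 0.15, 0.25                          # sensing and transmission radii (assumed)
    x, y = np.array([0.1, 0.1]), np.array([0.9, 0.9])   # event source and destination (assumed)

    def path_exists(x, y):
        dist = lambda a, b: np.linalg.norm(a - b)
        # p1: awake nodes that can sense the event at x
        start = [i for i in range(len(nodes)) if awake[i] and dist(x, nodes[i]) <= r_s]
        queue, seen = deque(start), set(start)
        while queue:
            i = queue.popleft()
            if dist(nodes[i], y) <= r_t:           # p3: current node can transmit to y
                return True
            for j in range(len(nodes)):            # p2: forward to awake nodes within r_T
                if j not in seen and awake[j] and dist(nodes[i], nodes[j]) <= r_t:
                    seen.add(j)
                    queue.append(j)
        return False

    print("event at x can be delivered to y:", path_exists(x, y))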

4.3 Minimizing Power Consumption. The main goal of this paper is to develop a systematic approach for power conservation in sensor networks. The idea is to select the available parameters in the Markov model so as to minimize power consumption, while at the same time guaranteeing coverage and connectivity. The assumption is that the power consumption in the off, sense/receive and transmit states is given by $\lambda_O$, $\lambda_S$ and $\lambda_T$ respectively. These are externally supplied parameters, or functional forms that depend on $r_T$ and $r_S$. The expected power consumption per node in steady state is then given by

$E = \lambda_O\,p_O + \lambda_S\,p_S + \lambda_T\,p_T$ …..(15)

Therefore, in order to minimize power consumption, the value of E should be minimized.
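Conceptually, the optimization selects, among candidate parameter settings, the one whose steady state minimizes Eq. (15) while still meeting the coverage requirement of Eq. (13). The following Python sketch illustrates this selection; the per-state power costs, the candidate steady-state distributions, the radius and the coverage target are illustrative assumptions (in practice each candidate steady state would come from the Markov model, e.g. the fixed-point sketch in Section 3.2).

    import numpy as np

    # Sketch of minimizing Eq. (15) subject to the coverage constraint of Eq. (13).
    lam = np.array([0.01, 0.10, 1.00])   # assumed power costs in Off, Sense, Transmit
    n, r_s = 50, 0.2                     # assumed number of nodes and sensing radius (unit area)
    required_coverage = 0.9              # assumed coverage target

    # Candidate steady-state distributions [p_O, p_S, p_T] (assumed for illustration)
    candidates = [np.array([0.80, 0.18, 0.02]),
                  np.array([0.60, 0.38, 0.02]),
                  np.array([0.30, 0.68, 0.02])]

    def expected_power(p):
        return float(lam @ p)                                  # Eq. (15)

    def expected_coverage(p):
        return 1.0 - (1.0 - np.pi * r_s**2 * p[1]) ** n        # Eq. (13)

    feasible = [p for p in candidates if expected_coverage(p) >= required_coverage]
    best = min(feasible, key=expected_power) if feasible else None
    print("best steady state:", best,
          " E =", expected_power(best) if best is not None else "no feasible candidate")

The sketch picks the cheapest candidate that still meets the coverage target, which is the essence of the trade-off described in this section.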

5. Simulation Study and Results

The simulation is carried out for both stationary and mobile events, and it shows how the nodes are deployed and scheduled to sense events occurring at various time intervals and transmit them to the base station. NS-2 [18] is chosen as the simulation environment because it is the most widely used network simulator.

Performance analysis is carried out based on the percentage of power conserved, i.e., the total power consumption per node and the amount of power conserved per node as the number of nodes increases. The results obtained from the analysis are plotted using XGRAPH [15].

5.1 Results for Static Events. This section presents the determination and comparison of the percentage of energy conserved in this work and in previous work. The determination is based on the time spent by each node in the sense, transmit and off states [1].

The total power consumption per node and the amount of power minimized per node are presented and compared with the two previous schemes, AFECA & BECA [9] and PAM [7]. The results are discussed below.

Total number of nodes vs. total power consumption per node:

Table 1 and Figure 4 show the comparison of this work with the two previous schemes. In PAM, the node is scheduled only to the sense state; hence, even as the number of nodes increases, the total power per node remains 100 mW. In the case of AFECA and BECA, the node is scheduled among the sense, transmit and off states.

Table 1: Number of sensor nodes vs. total power consumption per node (mW)

Nodes   Power Aware Routing Protocol   AFECA & BECA   Markov
10      100                            67             50
20      100                            67             50
30      100                            67             33.33
40      100                            50             25
50      100                            40             20
60      100                            33.33          17
70      100                            28.56          14
80      100                            25             12
90      100                            22.22          11
100     100                            19.99          10



Figure 4: Total number of nodes Vs Total power consumption per node

Total number of nodes vs. power minimized per node:

Here, the power minimized per node as the number of nodes increases is compared with the previous schemes. The comparison is presented in tabular (Table 2) and graphical (Figure 5) form.

Table 2: Number of sensor nodes vs. power minimized per node (mW)

Nodes   Power Aware Routing Protocol   AFECA & BECA   Markov
10      0                              0              0
20      0                              0              0
30      0                              0              1.6777
40      0                              1.7            2.5
50      0                              2.7            3.0
60      0                              3.37           3.3
70      0                              3.84           3.6
80      0                              4.2            3.8
90      0                              4.48           3.9
100     0                              4.701          4.0

[Figure 5 plot: amount of power minimized per node (mW) against the number of sensor nodes (x10), for PAM, AFECA & BECA and the Markov scheme.]

Figure 5 : Total number of nodes Vs Power minimized per node

5.2 Results for Mobile Events. This section presents the determination and comparison of the percentage of energy conserved in this work and in the previous work. The total power consumption per node, the amount of power minimized per node, and the coverage and connectivity are plotted using XGRAPH and compared with the previous schemes, AFECA & BECA [9] and PAM [7]. The results are discussed below.

Total number of nodes vs. total power consumption per node:

Table 3 and Figure 6 show the comparison of this work with the two previous schemes. In PAM, the node is scheduled only to the sense state; hence, even as the number of nodes increases, the total power per node remains 100 mW. In the case of AFECA and BECA, the node is scheduled among the sense, transmit and off states.

Table 3: Number of sensor nodes vs. total power consumption per node (mW)

Nodes   Power Aware Routing Protocol   AFECA & BECA   Markov
13      100                            67             50
26      100                            67             50
39      100                            67             33.4
45      100                            58             29
52      100                            50             25
65      100                            40             20
78      100                            34             17
85      100                            31             15
91      100                            29             14
104     100                            25             12




[Figure 6 plot: total power consumption per node (mW) against the number of sensor nodes (x10), for PAM, AFECA & BECA and the Markov scheme.]

Figure 6 : Total number of nodes Vs Total power consumption per node

Total number of nodes vs. power minimized per node:

Here, the power minimized per node as the number of nodes increases is compared with the previous schemes. The comparison is presented in tabular and graphical form: Table 4 shows the comparison of this work with the two previous schemes, and Figure 7 shows the total number of nodes vs. the power minimized per node.

Table 4: Number of sensor nodes vs. power minimized per node (mW)

Nodes   Power Aware Routing Protocol   AFECA & BECA   Markov
13      0                              0              0
26      0                              0              0
39      0                              0              1.2769
52      0                              1.3076         1.9230
65      0                              2.0769         2.3076
78      0                              2.5384         2.5385
91      0                              2.9230         2.792
104     0                              3.2307         2.9230

Figure 7 : Total number of nodes Vs Power minimized per node

Total number of nodes vs. coverage and connectivity:

Figure 8 shows the coverage and connectivity as the number of nodes increases. At n = 13, this work provides 50% coverage and 100% connectivity. As the number of nodes increases, the coverage and connectivity percentages also grow.

[Figure 8 plot: coverage and connectivity (%) against the number of sensor nodes (x10).]

Figure 8: Total number of nodes Vs Coverage and Connectivity

From the above analysis, and also from Figure 8, it is found that about 73% of the power may be saved using this method.

6. Conclusion

Preserving coverage and connectivity in a sensor network is a problem that has been addressed in the past. Moreover, sensors are envisioned to be small, light-weight devices, and it may not be desirable to equip them with additions such as large rechargeable batteries. This work considers a scheme that ensures coverage and connectivity in a sensor network without depending on external infrastructure or complex hardware. In addition, by taking advantage of the redundancy of nodes, the scheme offers energy savings by turning off nodes that are not required to maintain coverage. Significant energy is saved, along with a uniform decay of battery life at most of the nodes.




This work considers a Markov process that runs locally at each sensor node in order to govern its operation. Each sensor node conserves its energy by switching between the sense/receive and off states until it senses an event in its proximity, after which it enters the transmit state to transmit the event information. This work shows that the power saved at each node outperforms the power saved by previously known protocols; it also shows that it is possible to minimize about 73% of the power while maintaining 100% coverage and connectivity. Further, the simulation study shows that it is possible to increase the lifetime of the sensor network by increasing the number of sensor nodes.

References

[1] Malik Magdon-Ismail, Fikret Sivrikaya, and Bulent Yener, "Joint Problem of Power Optimal Connectivity and Coverage in Wireless Sensor Networks," in Proc. ACM, Aug. 2007.
[2] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "Wireless sensor networks: a survey," Computer Networks, vol. 38, no. 4, pp. 393-422, 2002.
[3] M. Stemm and R. H. Katz, "Measuring and reducing energy consumption of network interfaces in hand-held devices," IEICE Transactions on Communications, vol. E80-B, no. 8, pp. 1125-1131, August 1997.
[4] B. Krishnamachari, D. Estrin, and S. Wicker, "Impact of data aggregation in wireless sensor networks," in Int. Workshop on Distributed Event-Based Systems, Vienna, Austria, July 2002.
[5] Y. Xu, J. Heidemann, and D. Estrin, "Geography-informed energy conservation for ad hoc networks," in Proceedings of MOBICOM'01, 2001.
[6] Leslie Lamport, "Time, clocks, and the ordering of events in a distributed system," Communications of the ACM, vol. 21, no. 7, pp. 558-565, July 1978.
[7] S. Singh and C. S. Raghavendra, "Power efficient MAC protocol for multihop radio networks," in Ninth IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'98), 1998, pp. 153-157.
[8] B. Chen, K. Jamieson, H. Balakrishnan, and R. Morris, "An energy-efficient coordination algorithm for topology maintenance in ad-hoc wireless networks," in Proceedings of MOBICOM'01, 2001.
[9] Y. Xu, J. Heidemann, and D. Estrin, "Adaptive energy conserving routing for multihop ad hoc networks," Tech. Rep. 527, USC/ISI, Los Angeles, CA, October 12, 2000. http://www.isi.edu/johnh/PAPERS/Xu00a.pdf
[10] Y. Wei, J. Heidemann, and D. Estrin, "An energy-efficient MAC protocol for wireless sensor networks," in Proceedings of INFOCOM 2002, 2002, vol. 3, pp. 1567-1576.
[11] R. Ramanathan and R. Rosales-Hain, "Topology control of multihop wireless networks using transmit power adjustments," in Proc. IEEE INFOCOM'00, 2000.
[12] S. Meguerdichian, F. Koushanfar, M. Potkonjak, and M. Srivastava, "Coverage problems in wireless sensor networks," in Proceedings of IEEE INFOCOM'01, 2001.
[13] S. Shakkottai, R. Srikant, and N. Shroff, "Unreliable sensor grids: coverage, connectivity and diameter," in Proceedings of IEEE INFOCOM'03, 2003.
[14] P. Hall, Introduction to the Theory of Coverage Processes, John Wiley & Sons, New York, 1988.
[15] Nor Azlina Ab. Aziz, Kamarulzaman Ab. Aziz, and Wan Zakiah Wan Ismail, "Coverage Strategies for Wireless Sensor Networks," in Proc. of World Academy of Science, Engineering and Technology, vol. 38, Feb. 2009, ISSN 2070-3740.
[16] Jun Lu, Jinsu Wang, and Tatsuya Suda, "Scalable Coverage Maintenance for Dense Wireless Sensor Networks," EURASIP Journal on Wireless Communications and Networking, vol. 2007, Article ID 34758, 13 pages.
[17] Benahmed Khelifa, H. Haffaf, Madjid Merabti, and David Llewellyn-Jones, "Monitoring Connectivity in Wireless Sensor Networks," International Journal of Future Generation Communication and Networking, vol. 2, no. 2, June 2009.
[18] http://www.isi.edu/nsnam/ns/ns_doc.pdf.

Authors' Profiles

A. Balamurugan received his B.E. and M.E. degrees from Bharathiar University, Tamilnadu, India, in 1991 and 2001 respectively. After working as a Lecturer (from 1991) in the Department of Electronics and Communication Engineering and as an Assistant Professor (from 2004) in the Department of Computer Science and Engineering, he has been a Professor in the Department of Information Technology at V.L.B. Janakiammal College of Engineering and Technology, Coimbatore, India. He was instrumental in organizing various government-funded seminars and workshops. He has travelled to Jordan to present his work at an international conference at Jordan University of Science and Technology. His research interests include Mobile Computing and Wireless Sensor Networks. He is a member of ACM (USA), ISTE, CSI, and ACS.

Dr. T. Purusothaman is an Assistant Professor in the Department of Computer Science and Information Technology, Government College of Technology, Coimbatore, India. He obtained his Ph.D. from Anna University, Chennai. With 20 years of teaching experience, he has 6 journal publications and 18 papers presented at international conferences. His research interests are Network Security and Grid Computing. He was instrumental in completing, as an investigator, a cryptanalysis project funded by the Department of Information Technology, New Delhi, India.


S. UmaMaheswari is a Senior Lecturer in the Department of Information Technology, V.L.B. Janakiammal College of Engineering and Technology, Coimbatore, India. Previously she worked as a Lecturer in the School of Creative Computing and Engineering, Inti College Sarawak, Inti International University, Malaysia. She obtained her B.E. degree in Electronics and Communication Engineering from Bharathidasan University, Tamilnadu, India, and her M.E. in Computer Science and Engineering from Anna University, Tamilnadu, India. Her research interests are Network Security and Cloud Computing.