Code and Pattern Mining in C/C++
Aditya S. Deshpande
Namratha Nayak
Guides:
Dr. A. Serebrenik (TU/e)
P. Kourzanov, ir (NXP)
Y. Dajsuren, PDEng (Virage Logic)
Agenda
• Introduction
• Problem Definition
• Data flow
• Design patterns
• Summary
/ Faculteit Wiskunde en Informatica, 11-04-23
Introduction
• Code mining – Process of extracting patterns from source code.
• Design Patterns – A design pattern is a general reusable solution to a commonly occurring problem in software design.
• Streaming Data - Data streaming is the transfer of data at a steady high-speed rate sufficient to support such applications as high-definition television or a radio signal.
Problem Definition
• Lack of synchronisation between models and source code.
• Significant amount of repetitive code in different modules.
• Identifying patterns and integrating them in the framework.
• Objective
Approach
• Study the available design flow models.
• Study the various design pattern matching methods and tools.
Data flow models
• Kahn Process Networks.
• Synchronous Data Flow.
Kahn Process Network - Introduction
• Processes communicate via FIFO channels.
• Parallel computation is organized as follows:
  • Autonomous computing stations are connected to each other in a network by communication lines.
  • A station computes on data coming in on its input lines to produce output on some or all of its output lines.
• Assumptions:
  • Communication lines are the only means of communication.
  • Communication lines transmit information within a finite time.
Kahn Process Network - Introduction
• Restrictions:
  • At any given time a computing station is either computing or waiting for information on one of its input lines.
  • Each computing station follows a sequential program.
Kahn Process Network - Example
• From Kahn’s original 1974 paper

    process f(in int u, in int v, out int w)
    {
        int i;
        bool b = true;
        for (;;) {
            i = b ? wait(u) : wait(v);
            printf("%i\n", i);
            send(i, w);
            b = !b;
        }
    }

Process alternately reads from u and v, prints the data value, and writes it to w.
[Figure: process f with input channels u and v and output channel w]
Kahn Process Network - Example
The process interface includes FIFOs.
wait() returns the next token in an input FIFO, blocking if it is empty.
send() writes a data value to an output FIFO.
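
The blocking read can be illustrated with a small single-threaded C++ sketch. It is not Kahn's notation: `Channel`, `wait_on`, `send_to`, `ProcessF` and `run_until_blocked` are hypothetical names, and "blocking" is modelled by `step()` returning false when the input that must be read next is empty.

```cpp
#include <cassert>
#include <cstdio>
#include <deque>
#include <vector>

// Unbounded FIFO channel carrying int tokens (hypothetical helper type).
using Channel = std::deque<int>;

// Stand-in for wait(): remove and return the next token.
// The caller must have checked that the channel is not empty.
int wait_on(Channel& c) {
    assert(!c.empty());
    int token = c.front();
    c.pop_front();
    return token;
}

// Stand-in for send(): append a token to an output FIFO.
void send_to(int token, Channel& c) { c.push_back(token); }

// State of process f: it alternates between inputs u and v.
struct ProcessF {
    bool read_from_u = true;  // plays the role of `b` in the slide's code

    // Perform one loop iteration. Returns false when the process is
    // blocked because the input it must read next is empty.
    bool step(Channel& u, Channel& v, Channel& w) {
        Channel& in = read_from_u ? u : v;
        if (in.empty()) return false;  // a real process would block here
        int i = wait_on(in);
        std::printf("%i\n", i);
        send_to(i, w);
        read_from_u = !read_from_u;
        return true;
    }
};

// Run f until it blocks and return everything it wrote to w.
std::vector<int> run_until_blocked(Channel u, Channel v) {
    Channel w;
    ProcessF f;
    while (f.step(u, v, w)) {
    }
    return std::vector<int>(w.begin(), w.end());
}
```

With u = {1, 2, 3} and an empty v, the sketch emits only 1: after reading from u the process must wait on v and cannot test whether u has more data, which is what keeps a Kahn network deterministic.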
SDF - Introduction
• A synchronous data flow (SDF) graph is a network of synchronous nodes (also called blocks).
• For a synchronous node, the number of samples consumed and produced per firing is known a priori.
• Homogeneous SDF
SDF - Delay
• Delay as in signal processing
• A unit delay on the arc between A and B means:
  • The nth sample consumed by B is the (n-1)th sample produced by A.
• An arc with delay d is initialized with d zero samples.
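
A minimal C++ sketch of the delay semantics, assuming A and B each handle one sample per firing. The function name `consume_through_delay` is hypothetical and only serves the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Models an arc from A to B that starts with `delay` zero samples.
// A produces the values in `produced` one per firing; B consumes one
// sample after each firing of A. Returns what B consumed, in order.
std::vector<int> consume_through_delay(const std::vector<int>& produced,
                                       std::size_t delay) {
    std::deque<int> arc(delay, 0);  // the arc is initialized with d zero samples
    std::vector<int> consumed;
    for (int sample : produced) {
        arc.push_back(sample);  // A fires and produces one sample
        assert(!arc.empty());   // B always finds a sample to consume
        consumed.push_back(arc.front());
        arc.pop_front();
    }
    return consumed;
}
```

With a unit delay, the inputs 10, 20, 30 reach B as 0, 10, 20: the nth sample B consumes is the (n-1)th sample A produced.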
[Figure: arc from node A to node B annotated with delay d]
SDF - Implementation
• Implementation requires:
  • Buffering of the data samples passing between nodes
  • Scheduling nodes when inputs are available
• Dynamic implementation (= runtime) requires:
  • A runtime scheduler that checks when inputs are available and schedules nodes when a processor is free.
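
A dynamic scheduler of this kind can be sketched in C++ as follows. This is an illustration for a single processor, not an implementation from the slides: `Arc`, `can_fire`, `fire` and `run_dynamic` are hypothetical names, only token counts are tracked, and the round-robin choice of the next node is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Arc {
    std::size_t src, dst;  // node indices
    std::size_t produce;   // samples produced per firing of src
    std::size_t consume;   // samples consumed per firing of dst
    std::size_t tokens;    // samples currently buffered (initial value = delay)
};

// A node may fire when every incoming arc buffers enough samples.
bool can_fire(std::size_t node, const std::vector<Arc>& arcs) {
    for (const Arc& a : arcs)
        if (a.dst == node && a.tokens < a.consume) return false;
    return true;
}

// Firing consumes from all input arcs and produces on all output arcs.
void fire(std::size_t node, std::vector<Arc>& arcs) {
    for (Arc& a : arcs) {
        if (a.dst == node) {
            assert(a.tokens >= a.consume);
            a.tokens -= a.consume;
        }
        if (a.src == node) a.tokens += a.produce;
    }
}

// Runtime scheduling: check which node has its inputs available, fire it,
// and continue the scan with the following node (round-robin).
// Stops after max_firings or when no node can fire (deadlock).
std::vector<std::size_t> run_dynamic(std::size_t node_count,
                                     std::vector<Arc> arcs,
                                     std::size_t max_firings) {
    std::vector<std::size_t> order;
    std::size_t next = 0;
    while (order.size() < max_firings) {
        bool fired = false;
        for (std::size_t k = 0; k < node_count; ++k) {
            std::size_t n = (next + k) % node_count;
            if (can_fire(n, arcs)) {
                fire(n, arcs);
                order.push_back(n);
                next = (n + 1) % node_count;
                fired = true;
                break;
            }
        }
        if (!fired) break;
    }
    return order;
}
```

For an arc A→B where A produces 2 samples and B consumes 3, the first five firings are A, A, B, A, B. The availability checks happen on every step at run time, which is the overhead a static schedule avoids.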
SDF - Implementation
• Contribution of Lee and Messerschmitt (1987):
  • SDF graphs can be scheduled at compile time
  • No runtime scheduling overhead
• The compiler will:
  • Determine the execution order of the nodes on one or multiple processors or data path units
  • Determine the communication buffers between nodes.
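
Compile-time scheduling starts from the balance equations: for every arc, q[src] × produced = q[dst] × consumed, where q[i] is the number of firings of node i per schedule period. The C++ sketch below computes the smallest positive integer solution for a connected graph. `Edge`, `gcd_of` and `repetition_vector` are hypothetical names, and the fraction-propagation approach is one possible way to solve the equations, not necessarily the procedure used in the paper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Edge {
    std::size_t src, dst;
    long produce;  // samples produced per firing of src
    long consume;  // samples consumed per firing of dst
};

long gcd_of(long a, long b) {
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

// Returns the firing counts q, or an empty vector if the rates are
// inconsistent or some node is not connected to node 0.
std::vector<long> repetition_vector(std::size_t node_count,
                                    const std::vector<Edge>& edges) {
    if (node_count == 0) return {};
    // q[i] is kept as the fraction num[i]/den[i]; num[i] == 0 means unknown.
    std::vector<long> num(node_count, 0), den(node_count, 1);
    num[0] = 1;
    bool changed = true;
    while (changed) {
        changed = false;
        for (const Edge& e : edges) {
            assert(e.produce > 0 && e.consume > 0);
            bool s = num[e.src] != 0, d = num[e.dst] != 0;
            if (s && d) {
                // Both known: the balance equation must hold.
                if (num[e.src] * e.produce * den[e.dst] !=
                    num[e.dst] * e.consume * den[e.src])
                    return {};
            } else if (s || d) {
                // Propagate from the known end to the unknown end.
                std::size_t from = s ? e.src : e.dst, to = s ? e.dst : e.src;
                long n = num[from] * (s ? e.produce : e.consume);
                long m = den[from] * (s ? e.consume : e.produce);
                long g = gcd_of(n, m);
                num[to] = n / g;
                den[to] = m / g;
                changed = true;
            }
        }
    }
    long lcm = 1;
    for (std::size_t i = 0; i < node_count; ++i) {
        if (num[i] == 0) return {};
        lcm = lcm / gcd_of(lcm, den[i]) * den[i];
    }
    std::vector<long> q(node_count);
    long common = 0;
    for (std::size_t i = 0; i < node_count; ++i) {
        q[i] = num[i] * (lcm / den[i]);
        common = gcd_of(common, q[i]);
    }
    for (long& x : q) x /= common;
    return q;
}
```

For A→B with rates 2 and 3 the result is (3, 2): three firings of A and two of B return the buffer to its initial state, so a fixed firing order and a bounded buffer size can be chosen before the program runs.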
Design Patterns
• Describe solutions for common recurring problems
• Can be used in a wider context as they are defined informally
• Documenting them in a software system simplifies maintenance and program understanding
• Usually they are not documented, so there is a need to discover design patterns from source code
Pattern Mining
• The structure of a design pattern is searched for in the source code.
• The pattern description should include the main properties of the design pattern.
• It should be flexible enough to describe slightly distorted pattern occurrences.
• Helps to understand the relationships between the different parts of a large system
Pattern Mining
• Reverse Engineering
  • Analysis of a system to
    − Identify the components and their interrelationships
    − Create representations of the system in another form
• Why tools for Reverse Engineering?
  • Existing legacy code
  • High number of participants in code development
• Tools developed to mine the patterns from the source code
Pattern Mining Tools
• Aspects in the different mining tools
  • Programming language: tools for Java and C++
  • Method used to discover design patterns: graph matching, constraint satisfaction problem, pattern inference
  • Intermediate representation: abstract semantic graph, abstract syntax tree, matrix and vector
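
To give a feel for the graph matching approach, the C++ sketch below searches a class-relationship graph for a purely structural signature: a class that both inherits from and aggregates the same class, which is a candidate for the Composite pattern. It is a toy illustration, not the algorithm of any of the tools discussed here. `Relation`, `ClassEdge` and `find_composite_candidates` are hypothetical names, and the class graph is assumed to have been extracted from the source code already.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Relation { Inherits, Aggregates };

// One edge of the class graph: `from` inherits from / aggregates `to`.
struct ClassEdge {
    std::string from;
    std::string to;
    Relation kind;
};

using ClassPair = std::pair<std::string, std::string>;

// Returns (component, composite) pairs where `composite` both inherits
// from `component` and aggregates it. Structural match only: behaviour,
// such as forwarding operations to the children, is not checked.
std::vector<ClassPair> find_composite_candidates(
        const std::vector<ClassEdge>& edges) {
    std::set<ClassPair> inherits, aggregates;
    for (const ClassEdge& e : edges) {
        assert(!e.from.empty() && !e.to.empty());
        if (e.kind == Relation::Inherits)
            inherits.emplace(e.from, e.to);
        else
            aggregates.emplace(e.from, e.to);
    }
    std::vector<ClassPair> matches;
    for (const ClassPair& edge : inherits) {
        if (aggregates.count(edge) != 0)
            matches.push_back({edge.second, edge.first});
    }
    return matches;
}
```

For a graph in which Leaf and Group inherit from Shape and Group also aggregates Shape, the only candidate is (Shape, Group). Matching on structure alone can report false positives, and distorted occurrences need a more tolerant pattern description.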
Columbus – Design Pattern Mining Tool
• Reverse engineering framework
• Developed in cooperation between the Research Group on Artificial Intelligence in Szeged, the Software Technology Laboratory of the Nokia Research Center and FrontEndART Ltd.
• Analyzes large C/C++ projects and extracts data according to the Columbus Schema
• Supports project handling, data extraction, data representation, data storage, filtering and visualization
Columbus - Design Pattern Mining Tool
• Has a C/C++ extractor plug-in that performs the parsing of the source code
• Information collected by the plug-in corresponds to the Columbus Schema
• The schema captures the C++ language at a low level of detail (i.e., the abstract syntax tree) and also contains higher-level elements (i.e., semantics of types)
• Supports various file formats for exporting the extracted data
Other Pattern Mining Tools
• Other tools to be studied
  • CPP2XMI
  • Maisa
  • CrocoPat
Issues to be considered
• Can the tools support NXP source code?
• Would it be possible to add proprietary patterns to these tools?
• Can these tools be extended to support other languages like C?
Summary
• Overview of the data flow models
• Introduced the design pattern mining tool Columbus
• Find the patterns present in the NXP source code and check whether these can be mined using the available tools
References
• E. A. Lee and D. G. Messerschmitt. Synchronous Data Flow. Proceedings of the IEEE, vol. 75, pages 1235–1245, Sept. 1987.
• G. Kahn. The Semantics of a Simple Language for Parallel Programming. In Proceedings of the IFIP Congress, Stockholm, Sweden, pages 471–475, Aug. 1974.
• E. Gamma, R. Helm, R. Johnson, and J. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, 1995.
References
• R. Ferenc, A. Beszedes, M. Tarkiainen, and T. Gyimothy. Columbus – Reverse Engineering Tool and Schema for C++. In Proceedings of the 18th International Conference on Software Maintenance (ICSM 2002), pages 172–181. IEEE Computer Society, Oct. 2002.
• R. Ferenc and A. Beszedes. Data Exchange with the Columbus Schema for C++. In Proceedings of the 6th European Conference on Software Maintenance and Reengineering (CSMR 2002), pages 59–66. IEEE Computer Society, Mar. 2002.
References
• Z. Balanyi and R. Ferenc. Mining Design Patterns from C++ Source Code. In Proceedings of the 19th International Conference on Software Maintenance (ICSM 2003), pages 305–314. IEEE Computer Society, Sept. 2003.
Questions