Link Anomaly Detection
Table of contents
1 ABSTRACT
2 OVERVIEW
2.1 Purpose of the project
2.2 Existing system
2.3 Proposed system
3 REQUIREMENT SPECIFICATION
3.1 Hardware requirements
3.2 Software requirements
4 FEASIBILITY STUDY
4.1 Technical feasibility
4.2 Operational feasibility
4.3 Economic feasibility
5 LANGUAGE SPECIFICATION
5.1 Introduction to Java
5.2 JavaScript
5.3 JSP
5.4 Servlet
5.5 MySQL Database
5.6 NetBeans
5.7 Apache Tomcat
5.8 GlassFish
5.9 Web application
6 SYSTEM DESIGN
6.1 System Architecture
6.2 Data flow diagrams
6.3 E-R Diagrams
6.4 UML Diagrams
7 SYSTEM DESCRIPTION
8 CODING
9 SYSTEM TESTING
9.1 Introduction to Testing
9.2 Test Cases
10 OUTPUT SCREENS
11 CONCLUSION
12 BIBLIOGRAPHY
1. ABSTRACT
Detection of emerging topics is now receiving renewed interest,
motivated by the rapid growth of social networks. Conventional
term-frequency-based approaches may not be appropriate in this context,
because the information exchanged in social-network posts includes not
only text but also images, URLs, and videos. This project focuses on the
emergence of topics signalled by the social aspects of these networks.
Specifically, it focuses on mentions of users: links between users that
are generated dynamically through replies, mentions, and retweets.
The project proposes a probability model of the mentioning
behaviour of a social network user, and proposes detecting the
emergence of a new topic from the anomalies measured through the
model. By aggregating anomaly scores from hundreds of users, the
project can detect emerging topics based only on the reply/mention
relationships in social-network posts. The results show that
the proposed mention-anomaly-based approaches can detect new
topics at least as early as text-anomaly-based approaches, and in some
cases much earlier when the topic is poorly identified by the textual
contents of posts.
2. OVERVIEW
This overview gives the background for the project's requirements. It describes the actors involved, explains the architecture diagram, and states the assumptions and dependencies. It also covers the functional and supplementary requirements.
2.1 PURPOSE OF THE PROJECT
Communication over social networks, such as Facebook and Twitter, is gaining importance in our daily life. Since the information exchanged over social networks includes not only text but also URLs, images, and videos, these networks are challenging test beds for the study of data mining. In particular, we are interested in the problem of detecting emerging topics from social streams, which can be used to create automated “breaking news”, or to discover hidden market needs or underground political movements. Compared to conventional media, social media are able to capture the earliest, unedited voice of ordinary people. Therefore, the challenge is to detect the emergence of a topic as early as possible with a moderate number of false positives.
2.2 EXISTING SYSTEM
An emerging topic is something people feel like discussing, commenting on,
or forwarding to their friends. Conventional
approaches for topic detection have mainly been concerned with the
frequencies of textual words.
DISADVANTAGES OF EXISTING SYSTEM:
A term-frequency-based approach could suffer from the ambiguity
caused by synonyms or homonyms.
It may also require complicated pre-processing depending on the
target language.
Moreover, it cannot be applied when the contents of the messages
are mostly non-textual information.
On the other hand, the “words” formed by mentions are unique,
require little pre-processing to obtain and are available regardless
of the nature of the contents.
2.3 PROPOSED SYSTEM
The proposed system takes a new approach to detecting the
emergence of topics in a social network stream.
The basic idea of this project is to focus on the social aspect of the
posts, reflected in the mentioning behaviour of users, instead of the
textual contents.
It uses a probability model that captures both the number of
mentions per post and the frequency with which each mentionee appears.
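The probability model described above can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the project's own code: it assumes a geometric form for the number of mentions per post and smoothed relative frequencies for mentionees. The class name MentionModel, its methods, and the smoothing constants are hypothetical. The anomaly score of a post is its negative log-likelihood under the user's history, so unusual mentioning behaviour receives a higher score.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical per-user model of mentioning behaviour (illustrative only). */
class MentionModel {
    private int posts = 0;          // posts observed so far
    private int totalMentions = 0;  // mentions observed so far
    private final Map<String, Integer> mentioneeCounts = new HashMap<>();

    /** Adds one observed post (its list of mentioned users) to the history. */
    void observe(List<String> mentions) {
        posts++;
        for (String m : mentions) {
            totalMentions++;
            mentioneeCounts.merge(m, 1, Integer::sum);
        }
    }

    /** Probability that a post contains exactly k mentions (assumed geometric form). */
    double mentionCountProbability(int k) {
        // Plug-in estimate of the geometric parameter, smoothed as with a uniform prior.
        double p = (posts + 1.0) / (posts + totalMentions + 2.0);
        return p * Math.pow(1.0 - p, k);
    }

    /** Smoothed probability of mentioning a user; an unseen user gets the leftover mass. */
    double mentioneeProbability(String user) {
        int seen = mentioneeCounts.getOrDefault(user, 0);
        int distinct = mentioneeCounts.size();
        return (seen + 1.0) / (totalMentions + distinct + 1.0);
    }

    /** Anomaly score: negative log-likelihood of the post under the history. */
    double anomalyScore(List<String> mentions) {
        double score = -Math.log(mentionCountProbability(mentions.size()));
        for (String m : mentions) {
            score -= Math.log(mentioneeProbability(m));
        }
        return score;
    }

    public static void main(String[] args) {
        MentionModel model = new MentionModel();
        for (int i = 0; i < 10; i++) {
            model.observe(Arrays.asList("alice")); // habitual behaviour
        }
        // A familiar post scores lower than one mentioning three new users.
        System.out.println(model.anomalyScore(Arrays.asList("alice")));
        System.out.println(model.anomalyScore(Arrays.asList("x", "y", "z")));
    }
}
```

Scores computed this way for many users could then be aggregated per time window, in the spirit of the aggregation step mentioned in the abstract.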
ADVANTAGES OF PROPOSED SYSTEM:
The proposed method does not rely on the textual contents of
social network posts. It is therefore robust to rephrasing, and it can be
applied where topics are concerned with information other than
text, such as images, video, and audio.
The link-anomaly-based methods performed even better than the
keyword-based methods on the “NASA” and “BBC” data sets.
3. REQUIREMENT SPECIFICATION
3.1 HARDWARE REQUIREMENTS
The hardware used for the development of the project is:
System : Pentium IV 2.4 GHz
Hard Disk : 40 GB
Floppy Drive : 1.44 MB
Monitor : 15" VGA Colour
Mouse : Logitech
RAM : 512 MB
3.2 SOFTWARE REQUIREMENTS
The software used for the development of the project is:
Operating System : Windows XP/7
Language : Java
Front End : JSP, Servlet, JavaScript
IDE : NetBeans 7.0
Application Server : Apache Tomcat 7.0/GlassFish
Back End : MySQL 5.5
4. FEASIBILITY STUDY
A feasibility study is a process that defines exactly what a project is
and what strategic issues need to be considered to assess its feasibility,
or likelihood of succeeding. Feasibility studies are useful both when
starting a new business and when identifying a new opportunity for an
existing business. Ideally, the feasibility study process involves making
rational decisions about a number of enduring characteristics of a
project, including:
Technical feasibility: do we have the technology? If not, can we
get it?
Operational feasibility: do we have the resources to build the
system? Will the system be acceptable? Will people use it?
Economic feasibility: are the benefits greater than the costs?
4.1 TECHNICAL FEASIBILITY
Technical feasibility is concerned with the existing computer
system (hardware, software, etc.) and the extent to which it can support the
proposed addition. For example, if particular software will work only on
a computer with a higher configuration, additional hardware is
required. This involves financial considerations, and if the budget is a
serious constraint, the proposal will be considered not feasible.
4.2 OPERATIONAL FEASIBILITY
Operational feasibility is a measure of how well a proposed system
solves the problems and takes advantage of the opportunities identified
during scope definition, and how well it satisfies the requirements
identified in the requirements analysis phase of system development.
4.3 ECONOMIC FEASIBILITY
Economic analysis is the most frequently used method for
evaluating the effectiveness of a candidate system. More commonly
known as cost/benefit analysis, the procedure is to determine the
benefits and savings that are expected from a candidate system and
compare them with costs. If benefits outweigh costs, then the decision
is made to design and implement the system.
5. LANGUAGE SPECIFICATIONS
5.1 INTRODUCTION TO JAVA:
Java is a general-purpose computer programming language that is
concurrent, class-based, object-oriented, and specifically designed to
have as few implementation dependencies as possible. It is intended to
let application developers "write once, run anywhere", meaning that
code that runs on one platform does not need to be recompiled to run on
another. Java applications are typically compiled to byte code that can
run on any Java virtual machine (JVM) regardless of computer
architecture. Java is, as of 2014, one of the most popular programming
languages in use, particularly for client-server web applications, with a
reported 9 million developers. Java was originally developed by James
Gosling at Sun Microsystems and released in 1995 as a core component
of Sun Microsystems' Java platform. The language derives much of its
syntax from C and C++, but it has fewer low-level facilities than either
of them.
The original and reference implementation Java compilers, virtual
machines, and class libraries were originally released by Sun under
proprietary licences. As of May 2007, in compliance with the
specifications of the Java Community Process, Sun relicensed most of its
Java technologies under the GNU General Public License.
The Java compiler
When you program for the Java platform, you write source code
in .java files and then compile them. The compiler checks your code
against the language's syntax rules, then writes out byte codes in .class
files. Byte codes are standard instructions targeted to run on a Java
virtual machine. In adding this level of abstraction, the Java compiler
differs from other language compilers, which write out instructions
suitable for the CPU chipset the program will run on.
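As a small illustration of the compile step described above, the following sketch shows a source file and, in comments, the standard JDK commands that turn it into a .class file and run it. The class name Greeter and its method are hypothetical and chosen only for this example.

```java
// File: Greeter.java
//
// Compile:  javac Greeter.java   (writes Greeter.class, which contains byte codes)
// Run:      java Greeter         (the JVM loads and executes Greeter.class)
// Inspect:  javap -c Greeter     (prints the byte code instructions of the class)
class Greeter {
    /** Builds a greeting for the given name. */
    static String greeting(String name) {
        return "Hello, " + name + "!";
    }

    public static void main(String[] args) {
        // Uses the first command-line argument if present, otherwise "world".
        String name = args.length > 0 ? args[0] : "world";
        System.out.println(greeting(name));
    }
}
```

The same Greeter.class file can be copied to another operating system and run there unchanged, provided a suitable JVM is installed.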
The JVM
At run time, the JVM reads and interprets .class files and executes
the program's instructions on the native hardware platform for which the
JVM was written. The JVM interprets the byte codes just as a CPU
would interpret assembly-language instructions. The difference is that
the JVM is a piece of software written specifically for a particular
platform. The JVM is the heart of the Java language's "write-once, run-
anywhere" principle. Your code can run on any chipset for which a
suitable JVM implementation is available. JVMs are available for major
platforms like Linux and Windows, and subsets of the Java language
have been implemented in JVMs for mobile phones and hobbyist chips.
The Garbage Collector
Rather than forcing you to keep up with memory allocation, the
Java platform provides memory management out of the box. When your
Java application creates an object instance at run time, the JVM
automatically allocates memory space for that object from the heap,
which is a pool of memory set aside for your program to use. The Java
garbage collector runs in the background, keeping track of which objects
the application no longer needs and reclaiming memory from them. This
approach to memory handling is called implicit memory management
because it doesn't require you to write any memory-handling code.
Garbage collection is one of the essential features of Java platform
performance.
The Java Development Kit
When you download a Java Development Kit (JDK), you get, in addition
to the compiler and other tools, a complete class library of prebuilt
utilities that help you accomplish just about any task common to
application development. The best way to get an idea of the scope of the
JDK packages and libraries is to check out the JDK API documentation.
The Java Runtime Environment
The Java Runtime Environment includes the JVM, code libraries,
and components that are necessary for running programs written in the
Java language. It is available for multiple platforms. You can freely
redistribute the JRE with your applications, according to the terms of the
JRE license, to give the application's users a platform on which to run
your software. The JRE is included in the JDK.
Features Of Java Language
Java has many features, including the following:
Java is Simple
Several features make Java a simple language. It is easy to learn
and was developed by taking the best features from other languages,
mainly C and C++. Java is especially easy to learn for those who
already know object-oriented programming concepts. Java removes a
whole class of programming errors because it provides automatic
memory management and eliminates explicit pointers.
Java is Platform Independent
Java provides the facility to "write once, run anywhere". No
language achieves this ideal completely, but Java comes close.
Java supports cross-platform programs by compiling source code
to an intermediate form known as byte code. This byte code can
be executed on any system that has a Java Virtual Machine.
Java is Object-oriented
An object-oriented language must support the core characteristics of
OOP, commonly listed as encapsulation, inheritance, polymorphism, and
abstraction. Java supports all of them. Languages such as Objective-C
and C++ also support these characteristics, but they are hybrid
languages that allow procedural as well as object-oriented code.
In Java, the class is the outermost unit of program structure: there
are no stand-alone methods, constants, or variables, because each one
belongs to a class. Primitive data types such as int and double are not
objects themselves, but they can be converted into objects by using the
corresponding wrapper classes.
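The conversion between primitive values and wrapper objects mentioned above can be illustrated with a short sketch. The class name WrapperDemo and its sum method are hypothetical; Integer, List, and ArrayList are standard Java library classes.

```java
import java.util.ArrayList;
import java.util.List;

class WrapperDemo {
    /** Sums a list of Integer objects; each element is unboxed to int for the addition. */
    static int sum(List<Integer> values) {
        int total = 0;
        for (Integer v : values) {
            total += v; // auto-unboxing: Integer -> int
        }
        return total;
    }

    public static void main(String[] args) {
        int primitive = 42;                          // a primitive value, not an object
        Integer boxed = Integer.valueOf(primitive);  // explicit conversion to an object

        List<Integer> numbers = new ArrayList<>();   // collections hold objects only
        numbers.add(primitive);                      // autoboxing: int -> Integer
        numbers.add(8);

        System.out.println(boxed.equals(numbers.get(0))); // true
        System.out.println(sum(numbers));                 // 50
    }
}
```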
Java is Distributed
Java's standard library includes support for widely used protocols
such as HTTP and FTP. Programmers can call library functions for these
protocols to access files on any remote machine on the
internet, much as they would access files on their local system.
Java is Secure
Java does not expose memory pointers to the programmer. Java programs
can be run in a restricted area known as the sandbox. The security manager
determines the access rights of a class, such as reading and writing a
file on the local disk. Java supports public-key encryption, which allows
Java applications to transmit data over the internet in secure, encrypted
form. The bytecode verifier checks classes after loading.
1. No memory pointers
2. Programs run inside the virtual machine sandbox.
3. Array index limit checking
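The array index limit checking listed above can be demonstrated with a minimal sketch. The class BoundsCheckDemo and the method readOrDefault are hypothetical names; ArrayIndexOutOfBoundsException is the standard exception the JVM throws for an invalid index.

```java
class BoundsCheckDemo {
    /** Returns data[index], or fallback when the JVM rejects the index. */
    static int readOrDefault(int[] data, int index, int fallback) {
        try {
            return data[index]; // the JVM checks 0 <= index < data.length on every access
        } catch (ArrayIndexOutOfBoundsException e) {
            // An invalid index raises an exception instead of reading stray memory.
            return fallback;
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(readOrDefault(data, 1, -1)); // 20
        System.out.println(readOrDefault(data, 5, -1)); // -1
    }
}
```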
Java is compiled and interpreted
Java source code is compiled to byte codes, which are then interpreted
by a Java virtual machine (JVM). This provides portability to any machine for
which a virtual machine has been written. The interpreter reads the byte
codes and translates them on the fly into computations. The two steps of compilation
and interpretation allow for extensive code checking and improved security.
Java is Robust
Java has strong memory allocation and an automatic garbage
collection mechanism. It carries out type checking at both compile time and
run time, making sure that every data structure has been clearly defined
and typed: the compiler checks the program for errors, and the interpreter
checks for run-time errors. Java manages memory automatically by using a
garbage collector. All of these features make the Java language robust.
Java is Portable
The "write once, run anywhere" feature makes Java
portable. Many types of computers and operating systems are used to run
programs. By porting an interpreter for the Java Virtual Machine to any
computer hardware/operating system, one is assured that all code
compiled for it will run on that system. This forms the basis for Java's
portability.
5.2 JavaScript:
JavaScript is a dynamic computer programming language. It is most commonly
used as part of web browsers, whose implementations allow client-side
scripts to interact with the user, control the browser, communicate
asynchronously, and alter the document content that is displayed. It is
also used in server-side network programming with runtimes such as
Node.js, in game development, and in the creation of desktop and mobile
applications.
JavaScript is classified as a prototype-based scripting language with
dynamic typing and first-class functions. This mix of features makes it a
multi-paradigm language, supporting object-oriented, imperative, and
functional programming styles.
Despite some naming, syntactic, and standard library similarities,
JavaScript and Java are otherwise unrelated and have very different
semantics. The syntax of JavaScript is actually derived from C, while the
semantics and design are influenced by Self and Scheme programming
languages. JavaScript is also used in environments that aren't web-based,
such as PDF documents, site-specific browsers, and desktop widgets.
Newer and faster JavaScript virtual machines and platforms built upon
them have also increased the popularity of JavaScript for server-side
web applications. On the client side, JavaScript has been traditionally
implemented as an interpreted language, but more recent browsers
perform just-in-time compilation.
5.3 JSP
JavaServer Pages (JSP) is a server-side programming technology
that provides a dynamic, platform-independent method for
building Web-based applications. JSPs have access to the entire family of
Java APIs, including the JDBC API to access enterprise databases.
JSP may be viewed as a high-level abstraction of Java servlets. JSPs are
translated into servlets at runtime; each JSP servlet is cached and re-used
until the original JSP is modified. JSP can be used independently or as
the view component of a server-side model–view–controller design,
normally with JavaBeans as the model and Java servlets as the controller.
This is a type of Model 2 architecture.
JSP allows Java code and certain pre-defined actions to be interleaved
with static web markup content, with
the resulting page being compiled and executed on the server to deliver a
document. The compiled pages, as well as any dependent Java libraries,
use Java byte code rather than a native software format. Like any other
Java program, they must be executed within a Java virtual machine that
integrates with the server's host operating system to provide an abstract
platform-neutral environment.
JSPs are usually used to deliver HTML and XML documents, but
through the use of OutputStream, they can deliver other types of data as
well.
The Web container creates JSP implicit objects such as pageContext,
application (the ServletContext), session, request, and response.
A JavaServer Pages compiler is a program that parses JSPs, and
transforms them into executable Java Servlets. A program of this type is
usually embedded into the application server and run automatically the
first time a JSP is accessed, but pages may also be precompiled for
better performance, or compiled as a part of the build process to test for
errors.
5.4 Servlet:
A Servlet is a Java program that executes within a Web server
or an application server, acting as a middle layer between requests sent
from a web client and databases or applications on the HTTP server. Using
Servlets, you can generate web pages dynamically, obtain
information from users through web forms, and display records from a
database.
Servlets are most often used to:
1. Process or store data that was submitted from an HTML form.
2. Provide dynamic content such as the results of a database query
3. Manage state information that does not exist in the stateless HTTP
protocol, such as filling the articles into the shopping cart of the
appropriate customer.
With that in mind, a Servlet is a Java class that conforms to the Java
Servlet API. This API is the standard for executing Java classes that
respond to requests. The javax.servlet.http package defines HTTP-specific
subclasses for communication between the Servlet and the Servlet
container. You can therefore use a Servlet to add dynamic content
to a web server through the Java platform. The dynamic content
generated is usually HTML, but it may be in other forms such as XML.
Servlets can also maintain state in session variables through
the use of HTTP cookies or URL rewriting. Servlets are usually
packaged in a WAR file.
5.5 MySQL Database
MySQL is the most popular Open Source Relational SQL
database management system. MySQL is one of the best RDBMS being
used for developing web-based software applications.
A Relational Database Management System is software that:
Enables you to implement a database with tables, columns and
indexes.
Guarantees the Referential Integrity between rows of various
tables.
Updates the indexes automatically.
Interprets an SQL query and combines information from various
tables.
RDBMS Terminology:
Before we proceed to explain the MySQL database system, let's review a few
definitions related to databases.
Database: A database is a collection of tables, with related data.
Table: A table is a matrix with data. A table in a database looks
like a simple spreadsheet.
Column: One column (data element) contains data of one and the
same kind, for example the column postcode.
Row: A row (= tuple, entry or record) is a group of related data, for
example the data of one subscription.
Redundancy: Storing data twice, redundantly to make the system
faster.
Primary Key: A primary key is unique. A key value cannot occur
twice in one table. With a key, you can find at most one row.
Foreign Key: A foreign key is the linking pin between two tables.
Compound Key: A compound key (composite key) is a key that
consists of multiple columns, because one column is not
sufficiently unique.
Index: An index in a database resembles an index at the back of a
book.
Referential Integrity: Referential Integrity makes sure that a
foreign key value always points to an existing row.
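The definitions of primary key, foreign key, and referential integrity above can be made concrete with a small in-memory sketch. This is plain Java rather than SQL or JDBC, and it is not how MySQL is implemented; the class MiniDatabase, its two tables, and their columns are hypothetical, loosely named after the User and Post entities of this project.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory "database" illustrating key constraints (illustrative only). */
class MiniDatabase {
    // "users" table: primary key = username, value = password
    private final Map<String, String> users = new HashMap<>();
    // "posts" table: primary key = post id, value = {username (foreign key), content}
    private final Map<Integer, String[]> posts = new HashMap<>();

    /** Inserts a user; returns false if the primary key already exists. */
    boolean insertUser(String username, String password) {
        return users.putIfAbsent(username, password) == null;
    }

    /** Inserts a post; rejects duplicate primary keys and dangling foreign keys. */
    boolean insertPost(int postId, String username, String content) {
        if (posts.containsKey(postId)) {
            return false; // primary key: a key value cannot occur twice in one table
        }
        if (!users.containsKey(username)) {
            return false; // referential integrity: foreign key must point to an existing row
        }
        posts.put(postId, new String[] {username, content});
        return true;
    }

    public static void main(String[] args) {
        MiniDatabase db = new MiniDatabase();
        System.out.println(db.insertUser("asha", "secret"));     // true
        System.out.println(db.insertPost(1, "asha", "Hello"));    // true
        System.out.println(db.insertPost(1, "asha", "Again"));    // false: duplicate key
        System.out.println(db.insertPost(2, "nobody", "Hi"));     // false: unknown user
    }
}
```

In a real RDBMS these checks are declared once in the table definitions and enforced by the database engine on every insert and update.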
MySQL is a fast, easy-to-use RDBMS used by many small
and large businesses. MySQL was originally developed, marketed, and
supported by MySQL AB, a Swedish company, and is now developed by
Oracle Corporation. MySQL is popular for many good reasons:
MySQL is released under an open-source license. So you have
nothing to pay to use it.
MySQL is a very powerful program in its own right. It handles a
large subset of the functionality of the most expensive and
powerful database packages.
MySQL uses a standard form of the well-known SQL data
language.
MySQL works on many operating systems and with many
languages including JAVA, etc.
MySQL works very quickly and works well even with large data
sets.
MySQL works well with PHP, one of the most widely used languages for
web development.
MySQL supports large databases, up to 50 million rows or more in
a table. The default file size limit for a table is 4GB, but you can
increase this.
MySQL is customizable. The open-source GPL license allows
programmers to modify the MySQL software to fit their own
specific environments.
5.6 NetBeans:
NetBeans is an integrated development environment for
developing primarily with Java, but also with other languages, in
particular PHP, C/C++, and HTML5. It is also an application
platform framework for Java desktop applications and others.
The NetBeans IDE is written in Java and can run on Windows, OS
X, Linux, Solaris and other platforms supporting a compatible
JVM. The NetBeans Platform allows applications to be developed
from a set of modular software components called modules.
Applications based on the NetBeans Platform can be extended by third-party
developers. The NetBeans team actively supports the product and
seeks feature suggestions from the wider community.
5.7 Apache Tomcat:
Apache Tomcat is an open source web server and servlet container
developed by the Apache Software Foundation. Tomcat
implements several Java EE specifications including Java Servlet,
JavaServer Pages (JSP), Java EL, and WebSocket, and provides a
"pure Java" HTTP web server environment for Java code to run in.
5.8 GlassFish:
GlassFish is an open-source application server project started by
Sun Microsystems for the Java EE platform and now sponsored by
Oracle Corporation. The supported version is called Oracle GlassFish
Server.
GlassFish is the reference implementation of Java EE and as such
supports Enterprise JavaBeans, JPA, JavaServer Faces, JMS, RMI,
JavaServer Pages, servlets, etc. This allows developers to create
enterprise applications that are portable and scalable, and that integrate
with legacy technologies. Optional components can also be installed for
additional services.
5.9 Web Application:
Tomcat has added user- and system-based web application
enhancements that support deployment across a variety of
environments. It also manages sessions and applications
across the network.
A number of additional components may be used with Apache Tomcat.
These components may be built by users should they need them, or they
can be downloaded from one of the mirrors.
6. SYSTEM DESIGN
6.1 SYSTEM ARCHITECTURE
[Figure: system architecture diagram]
6.2 DATA FLOW DIAGRAMS
[Figures: data flow diagrams for Level 0, Level 1, and Level 2]
6.3 ER DIAGRAM
[Figure: ER diagram. Labels recoverable from the figure:]
Entities: Admin, Post, Comments, Friend
Attributes: Username, Password, Post ID, Post Content, Post Date, Comment ID, Comment, Friend ID, Friend name
Relationships and actions: Make Friend, Write Post, Find Emerging Topic, Detect anomaly
6.4 UML DIAGRAMS
[Figures: use case diagram, class diagram, sequence diagram, activity diagram]
7. SYSTEM DESCRIPTION
The system description covers the following modules:
1. Event Detection Streams
2. Event Description Module
3. User Profiling in Social Media
4. Kleinberg’s Burst-Detection Method
5. Data Set
1. Event Detection Streams
Microblogs have become an important source for reporting real-world
events. A real-world occurrence reported in microblogs is also called a
social event. Social events may hold critical materials that describe the
situations during a crisis. In real applications, such as crisis management
and decision making, monitoring the critical events over social streams
will enable watch officers to analyze a whole situation that is a
composite event, and make the right decision based on the detailed
contexts such as what is happening, where an event is happening, and
who is involved. Although there has been significant research effort on
detecting a target event in social networks based on a single source, in
crisis, we often want to analyze the composite events contributed by
different social users. So far, the problem of integrating ambiguous
views from different users is not well investigated. To address this issue,
we propose a novel framework to detect composite social events over
streams, which fully exploits the information of social data over multiple
dimensions. Specifically, we first propose a graphical model called
location-time constrained topic (LTT) to capture the content, time, and
location of social messages. Using LTT, a social message is represented
as a probability distribution over a set of topics by inference, and the
similarity between two messages is measured by the distance between
their distributions. Then, the events are identified by conducting efficient
similarity joins over social media streams. To accelerate the similarity
join, we also propose a variable dimensional extendible hash over social
streams. We have conducted extensive experiments to demonstrate the
effectiveness and efficiency of the proposed approach.
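The similarity step described above can be illustrated with a minimal sketch. It assumes each message has already been reduced to a topic distribution (an array of probabilities summing to 1). The text does not say which distance is used, so the Jensen-Shannon divergence is an assumption here, and the class name TopicDistance, the method names, and the threshold value are hypothetical.

```java
// Hypothetical helper for illustration; not part of the original project code.
class TopicDistance {

    // Kullback-Leibler divergence KL(p || q) in nats.
    // Terms with p[i] == 0 contribute nothing to the sum.
    static double kl(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / q[i]);
            }
        }
        return sum;
    }

    // Jensen-Shannon divergence between two topic distributions.
    // It is symmetric, 0 for identical inputs, and at most ln 2.
    static double jensenShannon(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same number of topics");
        }
        double[] m = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            m[i] = 0.5 * (p[i] + q[i]); // mixture of the two distributions
        }
        return 0.5 * kl(p, m) + 0.5 * kl(q, m);
    }

    // Assumed join rule: two messages are candidates for the same event
    // when their distance does not exceed a chosen threshold.
    static boolean sameEvent(double[] p, double[] q, double threshold) {
        return jensenShannon(p, q) <= threshold;
    }

    public static void main(String[] args) {
        double[] msgA = {0.7, 0.2, 0.1};
        double[] msgB = {0.6, 0.3, 0.1};
        System.out.println("distance = " + jensenShannon(msgA, msgB));
        System.out.println("same event = " + sameEvent(msgA, msgB, 0.05));
    }
}
```

A similarity join over a stream would apply such a test only to candidate pairs selected by an index (the hash structure mentioned above), rather than to every pair of messages.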
2. Event Description Module
The rise of Social Media services in recent years has created huge
streams of information that can be very valuable in a variety of
scenarios. What precisely these scenarios are and how the data streams
can efficiently be analyzed for each scenario is still largely unclear at
this point in time and has therefore created significant interest in
industry and academia. In this paper, we describe a novel algorithm for
geo-spatial event detection on Social Media streams. We monitor all
posts on Twitter issued in a given geographic region and identify places
that show a high amount of activity. In a second processing step, we
analyze the resulting spatio-temporal clusters of posts with a
machine-learning component in order to detect whether they constitute
real-world events or not. We show that this can be done with high
precision and recall. The detected events are finally displayed to a user
on a map, at the location where they happen and while they happen.
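The first processing step (finding places with a high amount of activity) can be sketched as follows. This is only an illustration under stated assumptions: posts are given as latitude/longitude pairs for one time window, places are approximated by fixed-size grid cells, and a cell counts as active when it holds at least a chosen number of posts. The class name GeoActivity, the cell size, and the threshold are hypothetical, and the machine-learning step that classifies clusters is not shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original project code.
class GeoActivity {

    // Maps a coordinate to the key of the grid cell that contains it.
    static String cellKey(double lat, double lon, double cellSizeDeg) {
        long row = (long) Math.floor(lat / cellSizeDeg);
        long col = (long) Math.floor(lon / cellSizeDeg);
        return row + ":" + col;
    }

    // Counts the posts of one time window per grid cell.
    // Each post is an array {latitude, longitude}.
    static Map<String, Integer> countPerCell(double[][] posts, double cellSizeDeg) {
        Map<String, Integer> counts = new HashMap<>();
        for (double[] post : posts) {
            counts.merge(cellKey(post[0], post[1], cellSizeDeg), 1, Integer::sum);
        }
        return counts;
    }

    // Returns the cells whose post count reaches the assumed activity threshold.
    static List<String> activeCells(Map<String, Integer> counts, int minPosts) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= minPosts) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] posts = {
            {35.65, 139.75}, {35.66, 139.76}, {35.67, 139.77}, // three posts close together
            {34.65, 135.55}                                    // one isolated post
        };
        Map<String, Integer> counts = countPerCell(posts, 0.1);
        System.out.println("active cells: " + activeCells(counts, 3));
    }
}
```

In a full pipeline, the posts in each active cell would then be passed to the classifier that decides whether the cluster corresponds to a real-world event.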
3. User Profiling in Social Media
A user profile is a visual display of personal data associated with a
specific user, or a customized desktop environment. A profile therefore
refers to the explicit digital representation of a person's identity. A
user profile can also be considered as the computer representation of a
user. A profile can be used to store a description of the characteristics
of a person. This information can be exploited by systems that take into
account the person's characteristics and preferences. Profiling is the
process of constructing a profile by extracting information from a
set of data. User profiles can be found on operating systems, computer
programs, recommender systems, or dynamic websites (such as online
social networking sites or bulletin boards).
A social networking service is a platform to build social
networks or social relations among people who share interests, activities,
backgrounds or real-life connections. A social network service consists
of a representation of each user (often a profile), his or her social links,
and a variety of additional services. Social networks are web-based
services that allow individuals to create a public profile, to create a list
of users with whom to share connections, and to view and traverse the
connections within the system. Most social network services are web-based
and provide means for users to interact over the Internet, such
as e-mail and instant messaging. Social network sites are varied, and they
incorporate new information and communication tools such as mobile
connectivity, photo and video sharing, and blogging. Online
community services are sometimes considered social network
services, though in a broader sense a social network service usually means
an individual-centered service whereas online community services are
group-centered. Social networking sites allow users to share ideas,
pictures, posts, activities, events, and interests with people in their network.
A social network is a social structure made up of a set
of social actors (such as individuals or organizations) and a set of
the dyadic ties between these actors. The social network perspective
provides a set of methods for analyzing the structure of whole social
entities as well as a variety of theories explaining the patterns observed
in these structures.[1] The study of these structures uses social network
analysis to identify local and global patterns, locate influential entities,
and examine network dynamics.
4. Kleinberg’s Burst-Detection Method
In addition to the change-point detection based on SDNML
followed by DTO described in previous sections, we also test the
combination of our method with Kleinberg’s burst-detection method.
More specifically, we implemented a two-state version of Kleinberg’s
burst-detection model. We chose the two-state version
because in this experiment we expect no [...]
The proposed link-anomaly-based change-point detection is highly
scalable. Every step described in the previous subsections requires only
linear time in the length of the analyzed time period. The predictive
distribution for the number of mentions can be computed in linear time
in the number of mentions. The predictive distribution for the mention
probability can be computed efficiently using a hash table.
Aggregation of the anomaly scores from different users takes linear time
in the number of users, which could be a computational bottleneck
but can be easily parallelized. SDNML-based change-point detection
requires two sweeps over the analyzed time period. Kleinberg’s
burst-detection method can be efficiently implemented with dynamic
programming.
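The dynamic-programming implementation mentioned above can be sketched as follows. This is a minimal sketch of the standard two-state formulation of Kleinberg's model, not the project's actual code. It makes these assumptions: the input is the sequence of positive gaps between consecutive events, the baseline state emits gaps at the average rate, the burst state emits them at s times that rate, and entering the burst state costs gamma times ln n. The class name TwoStateBurstDetector and the parameter values s = 2 and gamma = 1 are illustrative assumptions.

```java
// Hypothetical helper for illustration; not part of the original project code.
class TwoStateBurstDetector {

    // Returns one state per gap: 0 = baseline, 1 = burst.
    // gaps  : positive inter-arrival times between consecutive events
    // s     : rate multiplier of the burst state (must be > 1)
    // gamma : scales the cost of entering the burst state
    static int[] detect(double[] gaps, double s, double gamma) {
        int n = gaps.length;
        if (n == 0) {
            return new int[0];
        }
        if (s <= 1.0) {
            throw new IllegalArgumentException("s must be greater than 1");
        }
        double total = 0.0;
        for (double g : gaps) {
            if (g <= 0.0) {
                throw new IllegalArgumentException("gaps must be positive");
            }
            total += g;
        }
        double[] rate = {n / total, s * n / total}; // exponential rates per state
        double upCost = gamma * Math.log(n);        // cost of moving from state 0 to state 1

        // Viterbi-style minimisation of total cost; the automaton starts in state 0.
        double[] prev = {0.0, Double.POSITIVE_INFINITY};
        int[][] back = new int[n][2];
        for (int t = 0; t < n; t++) {
            double[] cur = new double[2];
            for (int j = 0; j < 2; j++) {
                double fromBase = prev[0] + (j == 1 ? upCost : 0.0);
                double fromBurst = prev[1]; // leaving or staying in the burst state is free
                back[t][j] = fromBase <= fromBurst ? 0 : 1;
                // Negative log-density of an exponential gap in state j.
                double emit = rate[j] * gaps[t] - Math.log(rate[j]);
                cur[j] = Math.min(fromBase, fromBurst) + emit;
            }
            prev = cur;
        }

        // Backtrack from the cheaper final state.
        int[] states = new int[n];
        int state = prev[0] <= prev[1] ? 0 : 1;
        for (int t = n - 1; t >= 0; t--) {
            states[t] = state;
            state = back[t][state];
        }
        return states;
    }

    public static void main(String[] args) {
        double[] gaps = {10, 10, 10, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 10, 10, 10};
        System.out.println(java.util.Arrays.toString(detect(gaps, 2.0, 1.0)));
    }
}
```

The loop visits each gap once and considers two states per gap, so the running time is linear in the number of events, which is consistent with the scalability claims above.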
5. Data Set
The four data sets we collected are called “Job hunting”, “Youtube”,
“NASA”, and “BBC”; each of them corresponds to a user-organized list in
Togetter. For each list, we extracted the Twitter users that appeared in
the list and collected Twitter posts from those users. Note that we
collected Twitter posts up to 30 days before the time period of interest
for each user; thus, the number of posts we analyzed was much larger
than the number of posts listed in Togetter.
[Table: Number of participants and number of posts collected for each data set.]
“Job hunting”: This data set is related to a controversial post by a
famous person in Japan that “the reason students having difficulty
finding jobs is, because they are stupid” and various replies to that
post. The keyword used in the keyword-based methods was “Job hunting.”
“Youtube”: This data set is related to the recent leakage of some
confidential video by a Japan Coast Guard officer. The keyword used in
the keyword-based methods was “Senkaku.”
[Figures: results of link-anomaly-based change detection and burst detection, and of text-anomaly-based change detection and burst detection.]
“NASA”: This data set is related to the discussion among Twitter users
interested in astronomy that preceded NASA’s press conference about the
discovery of an arsenic-eating organism.
“BBC”: This data set is related to angry reactions among Japanese
Twitter users against a BBC comedy show that asked “who is the
unluckiest person in the world” (the answer is a Japanese man who got
hit by nuclear bombs in both Hiroshima and Nagasaki but survived).