![Page 1: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/1.jpg)
Data Mining in Spatial Data Sets
Hemant Kumar Jerath, B.Tech.
MS Project Student
Mangalore University
Advisors: Dr. B.K. Mohan & Dr. (Mrs.) P. Venkatachalam
CSRE, IIT Bombay
![Page 2: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/2.jpg)
Contents
• Data Management System
• Data Mining – Concepts, Algorithms & Tasks
• Data Warehouse
• OLAP (On-line Analytical Processing)
• Knowledge Discovery Process
• Spatial Data Warehouse & OLAP
• Spatial Data Mining – Concept & Definition
• Case Studies – Data Mining Software
• Spatial Data Mining – Software Architecture
![Page 3: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/3.jpg)
Data Base Management System
Data warehouse
OLAP
SQL QUERY INTERFACE
OUTPUT/Knowledge Explicit/Trivial Knowledge
![Page 4: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/4.jpg)
Data mining techniques offer a way to explore implicit knowledge.
DBMS vs. Data Mining
DBMS: SQL-driven exploration
Data Mining: automatic exploration
![Page 5: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/5.jpg)
Data Mining
Definition:
Data mining is the analysis of (often large) observational data sets to find implicit relationships and to summarize the data in novel ways that are both understandable and useful to the data owner. [Hand et al.]
![Page 6: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/6.jpg)
Keywords in Definition
• large data sets
• observational data: as opposed to experimental data
• relationships and summaries: referred to as models and patterns
  – e.g. linear equations, rules, clusters, graphs, tree structures and recurrent patterns in time series
![Page 7: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/7.jpg)
Data Tombs
Golden Nuggets
DATA MINING
Implicit Knowledge
Transform your data to critical knowledge
![Page 8: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/8.jpg)
Data Mining – A confluence of multiple disciplines
• Information Theory
• Machine Learning
• Artificial Intelligence
• Statistics
![Page 9: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/9.jpg)
Knowledge Discovery Process(KDD)
Phase of real discovery
![Page 10: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/10.jpg)
Data Preprocessing
• Data Cleaning
  – Missing values
  – Noisy data
    • Binning
    • Clustering
    • Combined computer and human interaction
    • Regression
  – Inconsistent data
• Data Integration and Transformation
  – Data Integration
  – Data Transformation
![Page 11: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/11.jpg)
Data Preprocessing (continued)
• Data Transformation
  – Smoothing
  – Aggregation
  – Generalization
  – Normalization
  – Attribute construction
• Data Reduction
  – Data cube aggregation
  – Dimension reduction
  – Data compression
  – Numerosity reduction
  – Discretization and concept hierarchy generation
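Two of the preprocessing steps named on these slides, normalization and smoothing noisy data by binning, can be sketched in a few lines of Python. The function names and sample values below are illustrative assumptions, not part of the original slides; the binning variant shown is equal-depth binning with smoothing by bin means.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to the lower bound.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort values, split them into bins of bin_size, replace each value by its bin mean."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


# Hypothetical sample data.
prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
normalized = min_max_normalize([10, 20, 30])
smoothed = smooth_by_bin_means(prices, 3)
print(normalized)
print(smoothed)
```

Other smoothing choices (bin medians, bin boundaries) would follow the same pattern with a different per-bin summary.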
![Page 12: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/12.jpg)
Data Warehouse
Definition:
A data warehouse is a subject-oriented, integrated (from heterogeneous sources), time-variant and non-volatile collection of data in support of management's decision-making process. [W.H. Inmon]
![Page 13: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/13.jpg)
STAR SCHEMA
![Page 14: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/14.jpg)
SNOWFLAKE SCHEMA
![Page 15: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/15.jpg)
Data Cube Technology
Dimensions: [address, time, item]
Example cell: <Canada, Q1, TV>
![Page 16: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/16.jpg)
OLAP Operations
• Roll-up (drill-up): summarize data by climbing up a concept hierarchy or by dimension reduction
• Drill-down (roll-down): the reverse of roll-up; moves from a higher-level summary to a lower-level summary or detailed data, or introduces new dimensions
• Slice and dice: project and select
• Pivot (rotate): reorient the cube for visualization, e.g. turn a 3D cube into a series of 2D planes
• Other operations
  – drill-across: involves (across) more than one fact table
  – drill-through: goes through the bottom level of the cube to its back-end relational tables (using SQL)
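As a rough illustration of roll-up and slice, the sketch below treats a tiny fact table as a cube over the dimensions [address, time, item] used on the data cube slide. The rows, the `units_sold` measure and the function names are made-up assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical fact rows: (address, time, item, units_sold).
DIMENSIONS = ("address", "time", "item")
FACTS = [
    ("Canada", "Q1", "TV", 10),
    ("Canada", "Q1", "Phone", 5),
    ("Canada", "Q2", "TV", 7),
    ("USA", "Q1", "TV", 20),
    ("USA", "Q2", "Phone", 8),
]


def roll_up(facts, keep):
    """Roll up by dimension reduction: sum the measure over every dimension not in keep."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[3]
    return dict(totals)


def slice_cube(facts, dimension, value):
    """Slice: select the sub-cube where one dimension has a fixed value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


by_address = roll_up(FACTS, ("address",))
by_address_time = roll_up(FACTS, ("address", "time"))  # drilling down adds 'time' back
tv_only = slice_cube(FACTS, "item", "TV")
print(by_address)
print(by_address_time)
print(tv_only)
```

Going from `by_address` to `by_address_time` corresponds to a drill-down, since it reintroduces a dimension that the coarser summary had aggregated away.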
![Page 17: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/17.jpg)
Drill Down Operation
Roll Up Operation
![Page 18: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/18.jpg)
Mining technology today
• Data warehouse
  – Extract data via ODBC
• Preprocessing utilities
  – Sampling
  – Attribute transformation
• Mining operations
  – Scalable algorithms: association, classification, clustering, sequence mining
• Visualization tools
• Vendors (IDC 1999)
  – SAS: 29%
  – SPSS: 13.5%
  – IBM: 6%
![Page 19: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/19.jpg)
Data Mining Algorithms
Definition:
A data mining algorithm is a well-defined procedure that takes data as input and produces output in the form of models or patterns.
![Page 20: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/20.jpg)
Data Mining Algorithms
Reductionist approach:
A data mining algorithm can be thought of as a 'tuple' consisting of:
{model structure, score function, search method, data management techniques}
![Page 21: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/21.jpg)
| | CART | B.P. (backpropagation) | A Priori |
|---|---|---|---|
| Task | Classification & regression | Regression | Rule pattern discovery |
| Structure | Decision tree | Neural network | Association rules |
| Score function | Cross-validated loss function | Squared error | Support/accuracy |
| Search method | Greedy search over structures | Gradient descent on parameters | Breadth-first with pruning |
| DMT* | unspecified | unspecified | Linear scans |

* Data Management Technique
![Page 22: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/22.jpg)
• So, in principle, a potentially infinite number of algorithms can be generated by combining different model structures, score functions, search methods and data management techniques.
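The component view can be made concrete with a toy example. The sketch below is only an illustration under assumed choices: a straight line as the model structure, squared error as the score function, gradient descent as the search method, and a plain in-memory list as the (trivial) data management technique. None of these names come from the slides.

```python
def linear_model(params, x):
    """Model structure: a straight line y = slope * x + intercept."""
    slope, intercept = params
    return slope * x + intercept


def squared_error(params, data):
    """Score function: sum of squared residuals over the data."""
    return sum((linear_model(params, x) - y) ** 2 for x, y in data)


def gradient_descent(data, steps=1000, rate=0.05):
    """Search method: gradient descent on the parameters of the mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_slope = sum(2 * (slope * x + intercept - y) * x for x, y in data) / n
        grad_intercept = sum(2 * (slope * x + intercept - y) for x, y in data) / n
        slope -= rate * grad_slope
        intercept -= rate * grad_intercept
    return slope, intercept


# Data management: the whole (hypothetical) data set is held in memory.
data = [(0, 1), (1, 3), (2, 5), (3, 7)]  # points on y = 2x + 1
params = gradient_descent(data)
print(params, squared_error(params, data))
```

Swapping any one component (for example a tree for the line, or a greedy search for gradient descent) yields a different algorithm, which is the point the slide makes.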
![Page 23: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/23.jpg)
Data Mining Task-Taxonomy
• Prediction: use some variables to predict unknown or future values of other variables
  – Classification, regression and deviation detection
• Description: find human-interpretable patterns that describe the data
  – Clustering, association rule discovery, sequential rule discovery
![Page 24: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/24.jpg)
Data Mining Task-Taxonomy
• Verification model: affirm or negate a hypothesis (an iterative process of progressively refining the hypothesis)
• Discovery-driven model: the system automatically finds the information
![Page 25: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/25.jpg)
Mining operations
Classification
• Regression
• Classification trees
• Neural networks
• Bayesian learning
• Nearest neighbor
• Radial basis functions
• Support vector machines
• Meta-learning methods
  – Bagging, boosting
Clustering
• Hierarchical
• EM
• Density based
Sequence mining
• Time series similarity
• Temporal patterns
Item set mining
• Association rules
• Causality
Sequential classification
• Graphical models
  – Hidden Markov Models
![Page 26: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/26.jpg)
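Mining Tasks
• Discovery of association rules
X => Y (s%, c%)
s: support, the percentage of records that contain both X and Y
c: confidence, the percentage of records containing X that also contain Y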
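Support and confidence for a rule X => Y can be computed directly from a list of transactions. The following is a minimal sketch; the basket contents and function names are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of X and Y together, divided by the support of X alone."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical market baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(baskets, {"bread", "milk"})
c = confidence(baskets, {"bread"}, {"milk"})
print(f"bread => milk (s={s:.0%}, c={c:.0%})")
```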
![Page 27: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/27.jpg)
Mining Tasks (continued)
• Clustering
Criteria:
  i. Available similarity
  ii. Set function (optimizing technique)
Land use: finding areas of similar land use in an earth observation database
City planning: identifying groups of houses according to their house type, value and geographic location
![Page 28: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/28.jpg)
![Page 29: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/29.jpg)
Mining Tasks (continued)
• Classification
  – Finding rules to partition data into disjoint groups
![Page 30: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/30.jpg)
Classification
• Given old data about customers and payments, predict a new applicant's loan eligibility.
Attributes: Age, Salary, Profession, Location, Customer type
Previous customers → Classifier → Decision rules (e.g. Salary > 5 L, Prof. = Exec)
New applicant's data → Decision rules → Good/bad
![Page 31: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/31.jpg)
Classification Vs Clustering
• Clustering: the method itself generates the class labels. [descriptive]
• Classification: class labels are allocated to the data through a classifier. [predictive]
![Page 32: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/32.jpg)
Frequent Episodes
• Sequences of events that occur frequently
• These are mainly used for temporal data.
![Page 33: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/33.jpg)
Deviation detection
• Identification of outliers
![Page 34: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/34.jpg)
Sequence Mining
• Sequence of occurrence of the associative rules.
![Page 35: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/35.jpg)
Spatial Data Mining
![Page 36: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/36.jpg)
Spatial Data Mining
Definition:
Spatial data mining is the extraction of implicit knowledge, spatial relationships, or other interesting patterns not explicitly stored in the databases.
![Page 37: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/37.jpg)
What is the difference between data mining and spatial data mining?
• Data Mining:
  – Non-spatial attributes
• Spatial Data Mining:
  – Integration of both spatial and non-spatial dimensions in various KDD algorithms
    • Spatial attributes (use of thematic maps)
    • Non-spatial attributes (relational database)
![Page 38: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/38.jpg)
Spatial Data Models
• Raster Model: pixel data sets
• Vector Model: point, line, polygon objects
![Page 39: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/39.jpg)
Fundamental operations on vector data sets
• Spatial relations with neighbors are an important aspect of spatial data mining
  – distance between points
  – area of an object (a polygon)
  – length of a chain or polygon
  – intersection or union of objects
  – mutual position of objects (they can intersect, overlap or touch)
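Three of the operations listed above (distance, chain length and polygon area) reduce to short formulas on coordinate lists. The sketch below assumes planar (x, y) coordinates and uses the shoelace formula for the area; the function names and the sample rectangle are illustrative assumptions.

```python
import math


def distance(p, q):
    """Euclidean distance between two points given as (x, y) tuples."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def chain_length(points, closed=False):
    """Length of a polyline; with closed=True, the perimeter of the polygon."""
    pts = list(points)
    if closed:
        pts.append(pts[0])
    return sum(distance(a, b) for a, b in zip(pts, pts[1:]))


def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Hypothetical 4 x 3 rectangle.
rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
print(distance(rect[0], rect[2]))
print(chain_length(rect, closed=True))
print(polygon_area(rect))
```

Intersection, union and mutual-position tests need a proper geometry library or careful case analysis and are left out of this sketch.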
![Page 40: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/40.jpg)
SOLAP
ARC SDE
DATA MINING
SPATIAL AND NON-SPATIAL DATA WAREHOUSE
Attribute data
Shape files
![Page 41: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/41.jpg)
Spatial Warehouse and OLAP
Definition:
The spatial data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of both spatial and non-spatial data in support of management's decision-making process.
![Page 42: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/42.jpg)
SOLAP and SDW – Issues
• Spatial data format
  – Structure specific
  – Vendor specific
• OLAP processing
  – Spatial indexing
  – Access methods
![Page 43: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/43.jpg)
Construction of Spatial Warehouse and OLAP
• Spatial data cube model
  – Use of spatial dimensions in the cube
• Star/snowflake model
![Page 44: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/44.jpg)
Star Model of a spatial data warehouse: BC_weather
![Page 45: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/45.jpg)
Concept Hierarchies
Agriculture
  Cash Crop
    Fruits: mango, kiwi
    Vegetation: kale, tomato
  Grains
    Rice: jasmine, basmati
    Wheat
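A concept hierarchy such as the agriculture tree can be stored as a child-to-parent map, and generalization (roll-up) then means replacing each concept by its ancestor at a chosen depth. This is a minimal sketch; the map follows the hierarchy on the slide, while the function names and the harvest counts are made up for illustration.

```python
# Child -> parent links of the agriculture hierarchy.
PARENT = {
    "mango": "Fruits", "kiwi": "Fruits",
    "kale": "Vegetation", "tomato": "Vegetation",
    "jasmine": "Rice", "basmati": "Rice",
    "Fruits": "Cash Crop", "Vegetation": "Cash Crop",
    "Rice": "Grains", "Wheat": "Grains",
    "Cash Crop": "Agriculture", "Grains": "Agriculture",
}


def path_from_root(concept):
    """Return the list of concepts from the root down to the given concept."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path[::-1]


def generalize(concept, depth):
    """Ancestor of concept at the given depth (root = 0); shallower concepts stay as they are."""
    path = path_from_root(concept)
    return path[min(depth, len(path) - 1)]


def roll_up_counts(counts, depth):
    """Aggregate counts after generalizing every concept to the given depth."""
    rolled = {}
    for concept, n in counts.items():
        key = generalize(concept, depth)
        rolled[key] = rolled.get(key, 0) + n
    return rolled


# Hypothetical counts per leaf concept.
harvest = {"mango": 3, "kiwi": 2, "basmati": 5, "Wheat": 4}
print(roll_up_counts(harvest, depth=1))
print(roll_up_counts(harvest, depth=2))
```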
![Page 46: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/46.jpg)
The hierarchy of topological relations (coarsest level first)
Level 1: G_close_to
Level 2: Not_disjoint, Close_to
Level 3: Intersects, Inside, Contains, Equal
Level 4: Adjacent_to, intersects, covers, contains
![Page 47: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/47.jpg)
Modeling dimensions – Spatial Data Cube
• Non-spatial dimension
  – e.g. temperature and precipitation, generalized to hot and wet
• Spatial-to-non-spatial dimension
  – e.g. pacific_northwest, big_state
• Spatial-to-spatial dimension
![Page 48: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/48.jpg)
What can we measure in a spatial data cube?
• Numerical measure
  – e.g. monthly revenue of a region; a roll-up may give the total revenue of the region
• Spatial measure
  – a collection of pointers to the spatial objects
  – generalization (roll-up): regions with the same temperature and precipitation are grouped together
![Page 49: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/49.jpg)
Spatial Data Mining: A Database Approach
Martin Ester, Hans-Peter Kriegel, Jorg Sander
• Step I: discover centers based on some non-spatial attribute [clustering, descriptive mining]
• Step II: determine the (theoretical) trend of some non-spatial attribute
• Step III: discover the deviations from the theoretical trends
• Step IV: explain the deviations by other spatial objects, e.g. the presence of some infrastructure
![Page 50: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/50.jpg)
![Page 51: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/51.jpg)
![Page 52: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/52.jpg)
Associations look like this:
• Spatial association rules
is_a(X, school) ^ close_to(X, sports centre) => close_to(X, parks) [0.5%, 80%]
is_a(X, city) ^ within(X, bc) ^ adjacent_to(X, water) => close_to(X, border) (0.5, 92%)
Predicates like:
close_to, far_away
intersect, overlap and disjoint
left_of, west_of
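To show how such a rule could be evaluated, the sketch below computes support and confidence for is_a(X, school) ^ close_to(X, sports centre) => close_to(X, park) on a handful of made-up point objects. Everything here is an assumption for illustration: the coordinates, the 1 km threshold used to define close_to, and the choice to measure support over all objects.

```python
import math

# Hypothetical objects: (name, type, x_km, y_km).
OBJECTS = [
    ("s1", "school", 0.0, 0.0),
    ("s2", "school", 10.0, 0.0),
    ("s3", "school", 20.0, 0.0),
    ("c1", "sports_centre", 0.5, 0.0),
    ("c2", "sports_centre", 10.0, 0.5),
    ("p1", "park", 0.0, 0.8),
    ("p2", "park", 20.0, 0.5),
]


def close_to(obj, kind, objects, threshold_km=1.0):
    """Assumed predicate: some other object of the given type lies within threshold_km."""
    return any(
        other is not obj
        and other[1] == kind
        and math.hypot(other[2] - obj[2], other[3] - obj[3]) <= threshold_km
        for other in objects
    )


def rule_stats(objects):
    """Support and confidence of school ^ close_to(sports_centre) => close_to(park)."""
    schools = [o for o in objects if o[1] == "school"]
    antecedent = [s for s in schools if close_to(s, "sports_centre", objects)]
    both = [s for s in antecedent if close_to(s, "park", objects)]
    support = len(both) / len(objects)
    confidence = len(both) / len(antecedent) if antecedent else 0.0
    return support, confidence


support, confidence = rule_stats(OBJECTS)
print(f"support={support:.1%}, confidence={confidence:.0%}")
```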
![Page 53: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/53.jpg)
Spatial association rules (continued)
• Introduction to:
  – neighborhood graphs
  – neighborhood paths
• The predicate neighbor may be one of the neighborhood relations:
![Page 54: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/54.jpg)
Top-Down Deepening Approach
• Search first for large patterns and strong implications at a coarse level of detail
• Progressive deepening: continue the search at lower levels of detail
• The search continues until no large patterns are found
![Page 55: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/55.jpg)
Top-Down Deepening Approach
• Optimization technique: search for large patterns at a high concept level first
  – R-tree or plane-sweep techniques operating on MBRs (minimum bounding rectangles)
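The idea of a cheap coarse pass on minimum bounding rectangles can be sketched as a filter that keeps only object pairs whose MBRs intersect; exact (and costlier) geometric tests would then run on the surviving pairs only. The naive pairwise loop below stands in for the R-tree or plane-sweep step, and the object names and coordinates are invented for illustration.

```python
from itertools import combinations


def mbr(points):
    """Minimum bounding rectangle as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_intersect(a, b):
    """True if two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def coarse_candidates(objects):
    """Pairs of object names whose MBRs intersect (naive O(n^2) stand-in for an R-tree)."""
    boxes = {name: mbr(points) for name, points in objects.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(boxes), 2)
        if mbrs_intersect(boxes[a], boxes[b])
    ]


# Hypothetical spatial objects given as vertex lists.
objects = {
    "lake": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "road": [(3, 3), (8, 5)],
    "town": [(10, 10), (12, 10), (12, 12)],
}
print(coarse_candidates(objects))
```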
![Page 56: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/56.jpg)
Generalization-based Spatial Data Mining
• Non-spatial-dominant generalization
  – e.g. -9 °C → (-10 to 0 °C) → COLD (attribute induction)
• Spatial-dominant generalization
  – Quad-trees and R-trees (attribute induction)
![Page 57: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/57.jpg)
Spatial Clustering
• Clustering algorithms can be applied, for example, to discover centers of high economic power.
  – DBSCAN
  – PAM, CLARA, CLARANS (spatial data dominant clustering and non-spatial data dominant clustering)
  – CLARANS (neighbor graphs)
  – DBLEARN on non-spatial data
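Of the algorithms listed, DBSCAN is compact enough to sketch in plain Python. The version below is a simplified, quadratic-time illustration of density-based clustering with the usual `eps` and `min_pts` parameters; a production implementation would use a spatial index for the neighborhood queries. The sample points are made up.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if math.hypot(qx - px, qy - py) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means not yet visited
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical points: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```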
![Page 58: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/58.jpg)
Spatial Classification
• Non-spatial attribute e.g. no. of salespersons in a store
• Spatially related attribute with non-spatial values, like population living within 1km from store
• Spatial predicates, like distance_less_than_10km(X,a)
• Spatial function, like driving_distance(X,beach)
![Page 59: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/59.jpg)
Decision Tree
Description of classified objects
Description of census block group
Buffers are defined for the trade area
![Page 60: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/60.jpg)
High_profit=N
High_profit=Y
![Page 61: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/61.jpg)
The classifier is developed using the ID3 algorithm.
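ID3 chooses, at each node, the attribute with the highest information gain. The sketch below shows only that selection step (entropy and information gain), not a full tree builder; the store records and attribute names are hypothetical, with `high_profit` borrowed from the class labels on the preceding slide.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of target after splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """The attribute ID3 would split on first."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical store records.
stores = [
    {"near_highway": "Y", "big_population": "Y", "high_profit": "Y"},
    {"near_highway": "Y", "big_population": "N", "high_profit": "Y"},
    {"near_highway": "N", "big_population": "Y", "high_profit": "N"},
    {"near_highway": "N", "big_population": "N", "high_profit": "N"},
]
split = best_attribute(stores, ["near_highway", "big_population"], "high_profit")
print(split)
```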
![Page 62: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/62.jpg)
Spatial Trend Detection
• Trend: a temporal pattern
  – e.g. network alarms, recurrent illness
• The algorithm computes the local changes of the specified attribute when moving to the neighbors, as well as the distance to the neighbors
  – Linear regression is used to generate the trend
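The regression step can be sketched as an ordinary least-squares fit of the attribute change against the distance from a start object. This is only an illustration of the idea; the tuple layout (x, y, value), the function name and the sample values are assumptions.

```python
import math


def linear_trend(start, neighbors):
    """Fit value_change = slope * distance + intercept by least squares.

    start and each neighbor are (x, y, value) tuples.
    """
    dists = [math.hypot(n[0] - start[0], n[1] - start[1]) for n in neighbors]
    diffs = [n[2] - start[2] for n in neighbors]
    count = len(dists)
    mean_d = sum(dists) / count
    mean_v = sum(diffs) / count
    sxx = sum((d - mean_d) ** 2 for d in dists)
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in zip(dists, diffs))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_d
    return slope, intercept


# Hypothetical attribute (e.g. an economic index) falling with distance from a center.
center = (0, 0, 100)
around = [(1, 0, 90), (0, 2, 80), (3, 0, 70), (0, 4, 60)]
slope, intercept = linear_trend(center, around)
print(slope, intercept)
```

A clearly negative slope would indicate that the attribute decreases when moving away from the start object.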
![Page 63: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/63.jpg)
Spatial Predictions
• Location predictions
  – Logistic Spatial Autoregressive Model (SAR)
    • y = dWy + Xb + e
    • Contiguity matrix (W)
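The structure of y = dWy + Xb + e can be illustrated numerically. The sketch below makes several simplifying assumptions that go beyond the slide: it drops the noise term e, omits the logistic link, assumes a row-normalized contiguity matrix W with |d| < 1, and solves for y by simple fixed-point iteration instead of a matrix inverse.

```python
def sar_predict(d, W, X, b, iterations=200):
    """Iterate y <- d * W y + X b until it settles (noise term e omitted)."""
    n = len(W)
    xb = [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]
    y = xb[:]
    for _ in range(iterations):
        y = [d * sum(W[i][j] * y[j] for j in range(n)) + xb[i] for i in range(n)]
    return y


# Hypothetical example: three locations in a row; each row of W sums to 1.
W = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
X = [[1.0], [1.0], [1.0]]  # a single explanatory variable
b = [2.0]
y = sar_predict(0.5, W, X, b)
print(y)
```

With d = 0 the model reduces to the ordinary regression y = Xb, so d controls how strongly each location is pulled toward the values of its neighbors.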
![Page 64: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/64.jpg)
Spatial Predictions (continued)
• Spatial outlier detection techniques
  – use of neighborhood graphs, paths and indices
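One simple way to use a neighborhood graph for outlier detection is to compare each object's attribute value with the average over its neighbors and flag objects whose difference is unusually large. The sketch below follows that idea; it is an assumed illustration, not necessarily the technique the slides refer to, and the radius, threshold and sample data are made up.

```python
import math
import statistics


def neighborhood_graph(points, radius):
    """Map each point index to the indices of other points within radius."""
    return {
        i: [j for j, q in enumerate(points)
            if j != i and math.hypot(q[0] - p[0], q[1] - p[1]) <= radius]
        for i, p in enumerate(points)
    }


def spatial_outliers(points, values, radius, threshold=2.0):
    """Indices whose value differs unusually from the mean of their neighbors."""
    graph = neighborhood_graph(points, radius)
    diffs = {
        i: values[i] - sum(values[j] for j in nbrs) / len(nbrs)
        for i, nbrs in graph.items() if nbrs
    }
    spread = statistics.pstdev(list(diffs.values()))
    if spread == 0:
        return []
    center = statistics.mean(list(diffs.values()))
    return [i for i, dv in diffs.items() if abs(dv - center) / spread > threshold]


# Hypothetical sensors along a line; the reading at index 3 stands out locally.
points = [(x, 0) for x in range(7)]
values = [10, 10, 10, 50, 10, 10, 10]
outliers = spatial_outliers(points, values, radius=1.0)
print(outliers)
```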
![Page 65: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/65.jpg)
GeoMiner Architecture
![Page 66: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/66.jpg)
SPIN ARCHITECTURE
![Page 67: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/67.jpg)
Weka 3: Machine Learning Software in Java
A collection of machine learning algorithms for data mining problems
Weka contains tools for
• data pre-processing
• classification
• regression
• clustering
• association rules
• visualization
![Page 68: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/68.jpg)
SDAM Architecture
Use of the MLC++ library for implementing DM techniques
![Page 69: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/69.jpg)
MLC++ extends
• supervised machine learning
• classification
• accuracy estimation
• cross-validation
• bootstrap
• decision trees
• ID3
• decision graphs
• naive Bayes
• decision tables
• majority
• induction algorithms
• classifiers
• categorizers
• general logic diagrams
• instance-based algorithms
• discretization
• lazy learning
![Page 70: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/70.jpg)
Bibliography
[1] Hand, D., Mannila, H., and Smyth, P.: 'Principles of Data Mining', MIT Press, London, 2001, ISBN 0-262-08290-X
[2] Adriaans, P., and Zantinge, D.: 'Data Mining', Addison-Wesley, Harlow, UK, 1996
[3] Berry, M.J.A., and Linoff, G.: 'Mastering Data Mining', Wiley, New York, 2000
[4] Witten, I.H., and Frank, E.: 'Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations', Morgan Kaufmann Pub., San Francisco, CA, 2000, ISBN 1-55860-552-5
[5] Güting, R.: 'An Introduction to Spatial Database Systems', In Very Large Data Bases Journal, Springer Verlag, 1994
[6] Han, J., and Koperski, K.: 'Discovery of Spatial Association Rules in Geographic Information Databases', In Proc. Fourth International Symposium on Large Spatial Databases, Maine, 47-66, 1995
[7] Shekhar, S., and Huang, Y.: 'Co-location Rules Mining: A Summary of Results', Proc. Spatio-Temporal Symposium on Databases, http://www.cs.umn.edu/research/shashi-group/paper_list.html
[8] Barnett, V., and Lewis, T.: 'Outliers in Statistical Data', John Wiley (3rd Ed.), 1994
[9] Hawkins, D.: 'Identification of Outliers', Chapman and Hall, 1980
![Page 71: Data Mining in Spatial Data Sets Hemant Kumar Jerath,B.Tech. MS Project Student Mangalore University Advisors: Dr. B.K Mohan & Dr.(Mrs.).P. Venkatachalam.](https://reader038.fdocuments.in/reader038/viewer/2022110320/56649cac5503460f9496dec9/html5/thumbnails/71.jpg)
Issues in Building a Spatial Data Mining Environment
• Size of the database
• Static or dynamic database
• Testing whether the present spatial data structures support finding the implicit relationships between spatial objects that the mining tasks need
• Building the spatial data warehouse and spatial OLAP
• Which data mining task?
• Choosing the mining algorithm for a specific task, e.g. in the roughly 10 years since the concept of association mining was introduced, various algorithms have been developed
• Which platform for implementing the mining algorithms: MLC++ on VC++ or Weka on Java?