OpenKE Text Analytics


Information Sources

Overview

Result Examples

Analytical Methods

© 2016 SAS Institute Inc. All rights reserved. 41749US.1216

SAS ingested the data and entered the taxonomy as concepts in SAS Contextual Analysis for automated scoring. SAS then scored a subset of the web-scraped records against this LAS-provided taxonomy. Scoring also used some predefined concepts that SAS provides out of the box, including concepts that identify organizations and measures. For each concept match in the scored data, SAS also extracted the sentences in which the concept (for example, a UAV/UAS property) occurred. SAS then used Visual Analytics on the scored data to enable interactive exploration.

OpenKE Text Analytics: SAS® Categorical UAV/UAS Taxonomy

Methodology and the step-by-step process of imputing a categorical taxonomy to classify unstructured UAV/UAS data into groups and visualize the results.

LAS provided a basic taxonomy describing UAV/UAS options, properties, and components, as well as a document describing certain questions LAS is seeking to answer. LAS also provided ~90K web-scraped documents from UAV/UAS-related sites, along with a list of these sites.

LAS provided 3 main concepts related to drones. Imported into SAS Contextual Analysis, they were used to design the first models.

The scored data set presented new variables that are important for visualization purposes.

LAS provided Knowledge Graphs with questions that were to be answered during the process, which helped shape the workflow.

Diagram of one model created to identify the physical location of the contents on the server

Here you can see what information is identified by the drone components concept and rules when scored.

We were able to use interactive heat maps to filter our data and identify new rules that might help us refine our basic concept rules.

The scored data set includes an array of new variables that identify the terms that were matched and the concept that each matched term or phrase falls under, and that contain the entire sentence in which the match was found.
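The poster does not say which dimensions the heat maps used, so the following is only a rough Python sketch of the idea, not the Visual Analytics implementation. It counts scored rows per pair of categories (the cells of a heat map) and then filters the rows behind one cell for manual review. The field names `concept`, `site`, and `sentence`, and the helper names, are hypothetical.

```python
from collections import Counter

def heatmap_counts(rows, row_key="concept", col_key="site"):
    """Count scored rows for each (row_key, col_key) pair, i.e. each heat-map cell."""
    return Counter((r[row_key], r[col_key]) for r in rows)

def rows_in_cell(rows, row_value, col_value, row_key="concept", col_key="site"):
    """Return the scored rows behind one heat-map cell, for manual review."""
    return [r for r in rows if r[row_key] == row_value and r[col_key] == col_value]

# Hypothetical toy rows; the real scored data set is not reproduced here.
toy_rows = [
    {"concept": "drone_component", "site": "site_a", "sentence": "The gimbal is new."},
    {"concept": "drone_component", "site": "site_a", "sentence": "Spare propeller included."},
    {"concept": "drone_property", "site": "site_b", "sentence": "Flight time is short."},
]

counts = heatmap_counts(toy_rows)
dense_cell = rows_in_cell(toy_rows, "drone_component", "site_a")
```

Reading the sentences behind a dense cell is one way an analyst might spot terms worth adding to a concept rule.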

This enrichment code allowed us to extract full sentences around any matched concept, cutting out much of the original noise.
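The enrichment code itself is not shown on the poster. As a minimal sketch of the same idea in Python (not the SAS code the team used), the function below splits a document into sentences and keeps only those that contain a taxonomy term, recording the term, its concept, and the sentence. The toy taxonomy, the naive sentence splitter, and all names are assumptions made for illustration.

```python
import re

# Hypothetical toy taxonomy (concept -> terms); this is not the LAS taxonomy.
TAXONOMY = {
    "drone_component": ["propeller", "gimbal", "battery"],
    "drone_property": ["flight time", "payload"],
}

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def score_document(doc_id, text, taxonomy=TAXONOMY):
    """Return one row per (sentence, matched term) with the concept it belongs to."""
    rows = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for concept, terms in taxonomy.items():
            for term in terms:
                # Whole-word, case-insensitive match of the term in the sentence.
                if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                    rows.append({
                        "doc_id": doc_id,
                        "concept": concept,
                        "term": term,
                        "sentence": sentence,
                    })
    return rows

sample = ("Buy now! The gimbal keeps the camera steady. "
          "Flight time is about 20 minutes. Sign up for our newsletter.")
scored = score_document(1, sample)
```

In this toy input, the two promotional sentences match no term and are dropped, which is the noise reduction the text describes.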

These custom concepts were built in SAS Contextual Analysis using the original taxonomies created by the LAS team.

SAS Software Used:
1. Base SAS and SAS Enterprise Guide (EG)
2. SAS Contextual Analysis (SCA)
3. Visual Analytics