Why Human Annotated Data Matters for Search - Grant Ingersoll, Lucidworks & Kevin Vondemkamp, Appen
Why Human Annotated Data Matters for Search
Grant Ingersoll, CTO, Lucidworks
Kevin Vondemkamp, VP, Appen
Importance of eCommerce site search
Customers who find what they want buy more
• Up to 30% of visitors will use site search
• Shoppers using site search showed
  • 216% increase in conversion rate
  • 21% increase in average order value
Source: WebLinc, Nov 2016
K
Challenges with Enterprise Search
• 52%: Over half of U.S. and European businesses cannot find the information they seek
• 3 out of 4 people agree that information is easier to find outside of the organization than within
Source: BA Link, Aug 2016
K
Key Questions
• How can eCommerce retailers provide accurate search results with growing inventories and changes in brands, trends, seasons?
• How can enterprises provide users with accurate search when the volume of information continues to increase?
? K
Improving Search
• Capture all user interaction
• Adopt experimentation mindset
• Use rules purposefully
• Always be learning
CONTEXT: Where the user is, who the user is, user’s past behavior
CONTENT: Data and documents in the index
CONSUMER: Insights from similar users’ behavior
CROWD
G
Machine Learning
The search function understands what the shopper means rather than just what they typed
G
Structured Data
Unstructured Data
Source: https://www.ibm.com/blogs/watson/2016/05/biggest-data-challenges-might-not-even-know/
80% of all digital data is unstructured
Growing at 60% CAGR
Difficult to Mine
K
Why structured data is important for machine learning
• Raw data is often noisy and unreliable, and may be missing values
• Typical data quality issues:
  • Incomplete: Data lacks attributes or contains missing values
  • Noisy: Data contains erroneous records or outliers
  • Inconsistent: Data contains conflicting records or discrepancies
• Using such data for modeling can produce misleading results. Avoid "garbage in, garbage out"
Source: https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-data-science-prepare-data
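The three data quality issues named on this slide (incomplete, noisy, inconsistent) can be illustrated with a small check over product records. This is only a sketch under stated assumptions: the field names (`sku`, `title`, `price`), the plausible price range, and the rule that one SKU should have one title are all hypothetical and do not come from the slides.

```python
from collections import defaultdict


def audit_records(records, required, price_range=(0.01, 10_000.0)):
    """Flag incomplete, noisy, and inconsistent product records.

    Field names ("sku", "title", "price") and the price range are
    illustrative assumptions, not taken from the presentation.
    """
    low, high = price_range
    issues = {"incomplete": [], "noisy": [], "inconsistent": []}
    titles_by_sku = defaultdict(set)

    for i, rec in enumerate(records):
        # Incomplete: a required attribute is absent or empty.
        if any(rec.get(field) in (None, "") for field in required):
            issues["incomplete"].append(i)
        # Noisy: a price outside the plausible range is treated as erroneous.
        price = rec.get("price")
        if isinstance(price, (int, float)) and not (low <= price <= high):
            issues["noisy"].append(i)
        # Collect titles per SKU so conflicts can be detected afterwards.
        if rec.get("sku") and rec.get("title"):
            titles_by_sku[rec["sku"]].add(rec["title"].strip().lower())

    # Inconsistent: the same SKU appears with different titles.
    issues["inconsistent"] = sorted(
        sku for sku, titles in titles_by_sku.items() if len(titles) > 1
    )
    return issues


records = [
    {"sku": "A1", "title": "Flat screen TV", "price": 499.0},
    {"sku": "A1", "title": "Toaster", "price": 499.0},    # conflicts with row 0
    {"sku": "B2", "title": "", "price": 25.0},            # missing title
    {"sku": "C3", "title": "HDMI cable", "price": -5.0},  # implausible price
]
print(audit_records(records, required=("sku", "title", "price")))
```

Rows flagged this way would typically be repaired or excluded before training, which is the "garbage in, garbage out" point the slide makes.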
K
Appen’s Suite of Services
Global Speech & Search Services
appen.com
• Speech: Appen helps the world’s leading technology companies develop voice-activated systems for automotive, search and entertainment
• eCommerce: Appen helps major eCommerce vendors improve search accuracy to make shopping easier and improve conversion rates
• Natural Language: Appen’s work underpins natural language understanding technologies for government and commercial technologies that connect the globe
• Content Relevance: Appen helps leading search and social media companies deliver relevant content and news to their users
PURCHASE CONVERSION
“I want to buy a flat screen TV…”
Search, Browse, Decision, Cart, Purchase
Categorization, Whole page relevance, Ad relevance
User behavior, Query intent, Consumer insight, Purchase conversion
K
CUSTOMER 360° VIEW
“I want a holistic view of my customers”
Search, Browse
Find, Analyze, Action
Tagging, Query intent, Better ROI
K
eBay’s Structured Data Initiative
3 Key Efforts
Collect Data
• Requires sellers to provide product data
• Phase I and II requirements cover
  • B2C sellers
  • Manufactured inventory
Process and Enrich Data
• Create product in catalog where absent
• Link relevant items
• Add pricing & other product attributes
• Enhance content with images & descriptions
Create Product Experiences
• Simplify vast inventory
• Improve discoverability on and off site
• Enable value-added data… reviews & buying guides
• Better targeting & merchandising
Source: https://www.listsmart.io/article/why-structured-data-essential-ebay
K
How eCommerce Leaders Use Machine Learning
• Product recommendations
• Personalization
• Categorization
• Product search
K
Why Human Annotated Data is Better for Machine Learning
“Humans are essential for establishing a baseline for search queries. Technology just hasn’t caught up, which is why human annotation is critical to ensure that the search algorithm gets trained properly to assess intent.”
— Grant Ingersoll, CTO, Lucidworks
Humans are better than computers at
• Managing subjectivity
• Understanding intent
• Coping with ambiguity
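One way subjectivity and ambiguity might be handled in practice is to collect several human judgments per query/result pair and look at how strongly the annotators agree. The sketch below uses simple majority voting; that choice, the label names, the `min_agreement` threshold, and the function name are all assumptions made for illustration, and the slides do not describe how Appen or Lucidworks actually aggregate annotations.

```python
from collections import Counter


def aggregate_judgments(judgments, min_agreement=0.6):
    """Combine several annotators' labels for each (query, document) pair.

    Majority voting and the 0.6 threshold are illustrative assumptions.
    Each pair is expected to have at least one label.
    """
    results = {}
    for pair, labels in judgments.items():
        # Most frequent label and the number of annotators who chose it.
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        results[pair] = {
            "label": label,
            "agreement": agreement,
            # Low agreement suggests an ambiguous query or result.
            "needs_review": agreement < min_agreement,
        }
    return results


judgments = {
    ("flat screen tv", "doc-tv-55in"): ["relevant", "relevant", "relevant"],
    ("flat screen tv", "doc-tv-stand"): ["relevant", "not_relevant", "somewhat"],
}
for pair, summary in aggregate_judgments(judgments).items():
    print(pair, summary)
```

Pairs with high agreement could serve as a baseline for training or evaluating a ranking model, while pairs flagged for review are the ambiguous cases where human judgment matters most.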
K
Lucidworks & Appen Integration
Client Management Search Health
Home Office Crowd Annotation
Customer Search
Appen Global Platform
Data Calibrate
G
Where do you go from here?
1. Do I have enough data?
2. Is my company willing to invest?
3. Do I have a team available to manage and analyze the data?
4. Are you able to action against the dashboard metrics?
K