WHITEPAPER The BloomReach Web Relevance...
[email protected] +31 20 522 44 66 US +1 888 263 3917
WWW.BLOOMREACH.COM
The BloomReach Web Relevance Engine
WEB RELEVANCE ENGINE
BLOOMREACH.COM 1
The continuously learning Web Relevance Engine (WRE)
uses natural language processing and machine-learning
to take the data from your site, your customers’ behavior, your
content & products, and BloomReach’s understanding of web-
wide demand, and weave it into the most valuable solution for
your business. We’ve forged and sharpened the sword; all you
have to do is pick it up and swing.
The process of taming this data and turning it into revenue-
driving action is one that BloomReach has been perfecting
for nearly a decade, and our deep understanding of language
and user intent is available right out of the gate for our
customers. Of course, once you plug in the BloomReach WRE
it immediately gets to the task of learning the ins-and-outs of
your particular business - and then learns and learns and learns.
The more data fed into the WRE, the further our powerful
BloomReach applications can take you.
BloomReach gathers the data from your site, pairs it with
the data from relevant web-wide sources, lets the machine
get to work, and provides you with insights to optimize your
business’s digital experience. Sounds great in theory, but how
does BloomReach’s continuously-learning WRE actually do this?
Gathering Site Content:
The BloomReach WRE starts by identifying and understanding
the content and products on your site. This is done through two
methods:
Customer Product Feed - This is the data that you, as a
BloomReach customer, provide us. This information includes
titles, product attributes, prices, inventory, out-of-stock items,
and language usage most important and unique to your
business. The majority of BloomReach customers provide an
updated product feed daily.
Today’s ever-connected world
produces a never-ending stream
of data. For companies that serve
millions of visitors a day, have
dozens of different channels and touchpoints,
and especially for commerce companies with
thousands - or millions - of products, gathering
this data can be overwhelming. And once you
have it, you have to actually do something with
it. You have to gather the right data, understand
it, and apply it in the right way to drive a more
successful online experience.
The BloomReach Web Relevance Engine turns data into value.
Site Crawl - BloomReach uses every tool in our toolkit to deeply
understand your site. We look across your site for updates,
crawling it much as a search engine would. We also
observe through a BloomReach listening pixel on your site,
which gives a clear picture of how your visitors interact across
the digital experience. Once we’ve fetched all the data, we
parse it - looking at the title, heading, content, products, and
components to create a module for each piece that we fetch.
This module is what our machine-learning algorithms use
to organize, canonicalize, and build relationships within your
site. Parsing also flags pages that return a 404 error and sends
them to a blacklist - making it easy to keep large, multi-page
and multi-product environments clean.
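The parse step can be pictured with a minimal sketch. It is not BloomReach's implementation: the class `PageModuleParser`, the function `build_modules`, the module fields, and the sample pages are all hypothetical names chosen for illustration, and only the title, headings, and paragraph text are collected.

```python
from html.parser import HTMLParser


class PageModuleParser(HTMLParser):
    """Collects the title, headings, and paragraph text of one fetched page."""

    def __init__(self):
        super().__init__()
        self.module = {"title": "", "headings": [], "content": []}
        self._current = None  # tag whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1", "h2", "p"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._current is None:
            return
        if self._current == "title":
            self.module["title"] = text
        elif self._current in ("h1", "h2"):
            self.module["headings"].append(text)
        else:
            self.module["content"].append(text)


def build_modules(fetched):
    """fetched: iterable of (url, status_code, html). Returns (modules, blacklist)."""
    modules, blacklist = {}, []
    for url, status, html in fetched:
        if status == 404:
            blacklist.append(url)  # dead page: flag it instead of parsing it
            continue
        parser = PageModuleParser()
        parser.feed(html)
        parser.close()
        modules[url] = parser.module
    return modules, blacklist


# Hypothetical fetched pages.
pages = [
    ("/vases", 200, "<title>Vases</title><h1>Recycled Glass Vases</h1><p>Handmade.</p>"),
    ("/old-sale", 404, ""),
]
modules, blacklist = build_modules(pages)
```

Each live page ends up as one module keyed by URL, while the 404 page lands on the blacklist rather than in the data passed downstream.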
Merge Pipeline:
The next step is to merge the Customer Product Feed and Site
Crawl data together and then clean it. As part of the cleaning,
we drop any duplicates and merge multiple versions
of content. Once we have the merged, clean data for your
environment, we pass it on for processing.
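As a rough illustration of the merge-and-clean idea, the sketch below combines feed rows and crawl rows by a shared key. The function `merge_records`, the `id` key, and the rule that the feed wins over the crawl are assumptions made for this example, not documented WRE behavior.

```python
def merge_records(feed_rows, crawl_rows):
    """Merge feed and crawl records by product id, dropping duplicates."""
    merged = {}
    for row in list(feed_rows) + list(crawl_rows):
        record = merged.setdefault(row["id"], {})
        for field, value in row.items():
            # Keep the first non-empty value seen, so the feed wins over the crawl.
            if value not in (None, "") and field not in record:
                record[field] = value
    return list(merged.values())


# Hypothetical input data.
feed = [
    {"id": "sku-1", "title": "Bamboo Coasters", "price": 12.0},
    {"id": "sku-1", "title": "Bamboo Coasters", "price": 12.0},  # duplicate row
]
crawl = [{"id": "sku-1", "title": "bamboo coasters (set of 4)", "url": "/coasters"}]
clean = merge_records(feed, crawl)
```

The duplicate feed row collapses into one record, and the crawl contributes only the field the feed lacked (here, the URL).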
Data Processing:
This is where BloomReach’s natural language processing (NLP)
and core machine-learning come into play. Our algorithms take
in the gathered data, determine the optimal logic and order,
and produce actionable insights to increase your environment’s
performance. Key ways this occurs are through:
Link Generation - The WRE simulates a process akin to a search
engine that determines what URLs have a strong relationship
and should link to each other. For example, if a page has a
generated tag of “Eco Friendly” (URL 1), and our algorithms run
it through our search and find that a page with the tag
“Recycled” (URL 2) also performs well for “Eco Friendly,” we
know we can link URLs 1 & 2 to each other. The machine-learning
algorithms create these relationships for each URL, creating a
relationship map.
[Figure: The BloomReach Web Relevance Engine. Gather (Customer Site, Site Crawl, Feed, Pull & Parse Data), Merge (Merge Data, Clean Data), Process (Link Generation, Link Optimization), with the BloomReach Data Store and Quality Review.]
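The “Eco Friendly” and “Recycled” example can be sketched as follows. The search results, page tags, and the function `build_relationship_map` are hypothetical stand-ins for the internal search the text describes, and treating every relationship as two-way is an assumption of this sketch.

```python
from collections import defaultdict

# Hypothetical search results: query -> URLs that perform well for it.
SEARCH_RESULTS = {
    "eco friendly": ["/eco-friendly", "/recycled"],
    "recycled": ["/recycled", "/recycled-glass-vase"],
}
# Hypothetical generated tags: URL -> tag.
PAGE_TAGS = {"/eco-friendly": "eco friendly", "/recycled": "recycled"}


def build_relationship_map(page_tags, search_results):
    """Link each tagged page to the other pages that perform well for its tag."""
    links = defaultdict(set)
    for url, tag in page_tags.items():
        for hit in search_results.get(tag, []):
            if hit != url:
                links[url].add(hit)
                links[hit].add(url)  # relationships are treated as two-way here
    return {url: sorted(targets) for url, targets in links.items()}


relationship_map = build_relationship_map(PAGE_TAGS, SEARCH_RESULTS)
```

Here the “Eco Friendly” page (URL 1) and the “Recycled” page (URL 2) end up linked to each other, as in the example above.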
Link Optimization - We narrow down the
best relationships to power elements of
the page such as Related Searches and
Related Products/Content. First, we use
these relationships to link pages that are
most relevant to each other. Additionally,
we maximize traffic to your site by linking
URLs that are crawled every day by
search engines to URLs that don’t have as
much traffic - boosting them up. We also
link thematic pages to drive traffic and
improve efficiency.
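One way to picture the traffic-boosting idea is to pick, for each low-traffic page, the related pages that search engines crawl most often and link from those. This is a sketch only: the function `pick_boost_links`, the traffic threshold, and all the numbers are hypothetical.

```python
def pick_boost_links(relationships, crawls_per_week, visits,
                     low_traffic_threshold=100, max_links=2):
    """For each low-traffic page, choose frequently crawled related pages to link from."""
    plan = {}
    for target, related in relationships.items():
        if visits[target] >= low_traffic_threshold:
            continue  # this page already receives enough traffic
        sources = sorted(related, key=lambda url: crawls_per_week[url], reverse=True)
        plan[target] = sources[:max_links]
    return plan


# Hypothetical relationship map and traffic figures.
relationships = {
    "/recycled-glass-vase": ["/eco-friendly", "/recycled", "/solar"],
    "/eco-friendly": ["/recycled"],
}
crawls_per_week = {"/eco-friendly": 7, "/recycled": 5, "/solar": 1, "/recycled-glass-vase": 0}
visits = {"/eco-friendly": 900, "/recycled": 400, "/solar": 50, "/recycled-glass-vase": 20}

plan = pick_boost_links(relationships, crawls_per_week, visits)
```

Only the low-traffic vase page receives new inbound links, and they come from its two most frequently crawled relatives.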
Continuous Self-Learning
The BloomReach data pipeline runs automatically in a continuous loop. The only manual action needed is to turn it on or off, then sit back and let it learn and learn and learn. Every day - and every night - our machine-learning algorithms get to know your business and your visitors more in-depth.
The WRE’s intelligence means that it is able to adapt to context. For instance, when a user types the letter ‘s’ on a department store site, the WRE might offer “shoes” as a site search query. Type in ‘s’ on an athletic team gear site and “Seattle Seahawks” is a more relevant query.
Similarly, machine learning provides the engine with the ability to adapt to different seasons. Typing the letter ‘v’ on a florist’s site in February is likely to produce “Valentine’s Day” as a query. Type ‘v’ a few months later and the auto suggestion is more likely to be “violets.”
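The seasonal behavior can be illustrated with a toy suggestion function that ranks matching queries by how often they were used on a given site in a given month. The query counts, the `QUERY_COUNTS` table, and the `suggest` function are invented for this sketch; the real engine learns such signals rather than reading them from a fixed table.

```python
# Hypothetical query logs: (site, month) -> {query: count}.
QUERY_COUNTS = {
    ("florist", 2): {"valentine's day": 950, "violets": 40, "vases": 120},
    ("florist", 5): {"valentine's day": 5, "violets": 310, "vases": 130},
}


def suggest(prefix, site, month, limit=1):
    """Return the most-used queries on this site and month that start with prefix."""
    counts = QUERY_COUNTS.get((site, month), {})
    matches = [query for query in counts if query.startswith(prefix.lower())]
    return sorted(matches, key=lambda query: counts[query], reverse=True)[:limit]


february = suggest("v", "florist", 2)
may = suggest("v", "florist", 5)
```

With these made-up counts, the same prefix yields “valentine's day” in February and “violets” in May.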
[Figure: relationship map of product tags and non-product tags around a node of interest. Example tags include “reusable,” “reclaimed,” “organic,” “recycled,” “solar,” “energy saving,” “bamboo coasters,” “reclaimed wood bowl,” “reusable shopping bags,” “ceramic travel mug,” “energy saver appliances,” “home composting kit,” “recycled glass vase,” “recycle bin,” “organic robe,” “organic skinny jeans,” “rechargeable batteries,” “solar radio,” “organic gift basket,” and “energy monitor.”]
THE BLOOMREACH TOOLKIT
Our capacity to link your content is accomplished by a deep
understanding of content and user behavior, which combine
to give a holistic view of the user experience. We reach
this understanding in a multitude of ways, with the leading
strategies including:
Synonym Database
The heart of the WRE is a deep understanding of language.
Our algorithms are continuously refining and increasing our
knowledge of language and intent, and the foundation of
that understanding is the unique way BloomReach handles
synonyms.
To show why the way we use synonyms is so
critical, we need to look at how BloomReach
uses natural language processing. Outside
of BloomReach (mainly in academic settings)
when people talk about NLP they refer to
breaking down text by tagging the noun, verb,
adjective, etc. However, understanding a
search query is different, as people don’t tend
to type complex sentences in a search box.
But they do describe attributes of products
and services in many different ways and
combinations - which is where our synonym
dictionary comes into play.
Building this dictionary is not trivial. It has to
be both large and clean, meaning you have
to weed out a lot of noise caused by creative
marketing terms (e.g., “thundercloud” is not a
color). Of course, you could perform manual quality control of
your entire dictionary, but doing so efficiently and at scale is
expensive and difficult. Leaving a dictionary completely
up to computers, on the other hand, means you may miss some
of the subtle nuances or the terms unique to specific
industries. To handle this, BloomReach uses algorithms to
extract likely synonym candidates and has human filters on top
of that to ensure the quality of our library. We have years’
worth of synonyms - over 15 million pairs - that are immediately
available to you when you flip the WRE on. Through crawling,
pixel data, product feeds, and queries, BloomReach continuously
develops this dictionary with the language your customers are
specifically using. Of course, while customer-specific site data
and product data are used to optimize your environment, they
are not shared.
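The two-stage approach (algorithmic candidate extraction followed by a human filter) might look like the sketch below. The idea of mining candidates from query refinements, the function `extract_candidates`, the sample data, and the rejection list are all assumptions made for illustration.

```python
from collections import Counter


def extract_candidates(query_refinements, min_count=2):
    """Pairs of terms that shoppers swap for one another at least min_count times."""
    counts = Counter(tuple(sorted(pair)) for pair in query_refinements)
    return {pair for pair, n in counts.items() if n >= min_count}


# Hypothetical (original term, refined term) pairs taken from search sessions.
refinements = [
    ("sofa", "couch"),
    ("couch", "sofa"),
    ("thundercloud", "grey"),
    ("grey", "thundercloud"),
    ("sofa", "table"),
]
candidates = extract_candidates(refinements)

# Human review step: a reviewer rejects the noisy marketing-term pair.
rejected = {("grey", "thundercloud")}
synonyms = candidates - rejected
```

The algorithm proposes both frequent pairs, and the human filter removes the one produced by a creative marketing term.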
Web-Wide Data
With 100 million pages and up to 10 terabytes processed daily, all of the BloomReach algorithms and applications are built on a strong foundation of data. In addition to BloomReach customer sites, the WRE also crawls a myriad of other public sites such as Wikipedia, industry blogs, and competitors’ sites to gain further understanding of language, context, and consumer intent.
For example, if we have a client in the home goods sector, we will crawl several other sites in that vertical and target customer market to understand the language used by that audience and the marketers who speak to them.
The Red Dress
For an idea of just how much there is to learn about consumer language and behavior, consider an exercise BloomReach conducted, asking participants to describe a red dress. The first 500 users who took the quiz came up with 129 ways to describe the dress’s color, 194 ways to describe its neckline and 275 ways to describe its belt, which some might say defies description. Luckily, with BloomReach’s WRE synonym database, including the 1,077 different colors it understands, our algorithms can find the relationship between these variations to ensure your users get what they’re looking for - no matter how they say it.
Interpreting Search
When a user types in a search, the WRE scans the query to
break it up into attributes and the product. If a visitor enters
“black office chair mesh back” we scan the phrase, match it to
our site data and dictionary, and determine the core product (or
content). Then we determine the surrounding attributes. In the
case here, this would look like <color><style><product><style>.
We run the product and attributes against our synonym
dictionary, compare that with your product & content feed, and
use performance data, ranking weighting, and user preference
to return the most relevant results.
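A minimal sketch of the attribute/product breakdown for “black office chair mesh back” follows. The tiny `ATTRIBUTES` and `PRODUCTS` lookups and the function `interpret` are hypothetical; the real engine matches against site data and a large synonym dictionary rather than two hard-coded tables.

```python
# Hypothetical lookups standing in for site data and the synonym dictionary.
ATTRIBUTES = {"black": "color", "office": "style", "mesh back": "style"}
PRODUCTS = {"chair", "desk"}


def interpret(query):
    """Label each word or two-word phrase of the query as a product or attribute."""
    words = query.lower().split()
    labeled, i = [], 0
    while i < len(words):
        label, step = ("unknown", words[i]), 1
        for size in (2, 1):  # prefer the longest phrase match
            if i + size > len(words):
                continue
            phrase = " ".join(words[i:i + size])
            if phrase in PRODUCTS:
                label, step = ("product", phrase), size
                break
            if phrase in ATTRIBUTES:
                label, step = (ATTRIBUTES[phrase], phrase), size
                break
        labeled.append(label)
        i += step
    return labeled


parsed = interpret("black office chair mesh back")
```

The labels come out in the order color, style, product, style, matching the pattern described above.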
Collaborative Filtering
You can learn a great deal about user intent with aggregate
behavioral data. For example, if a site search query has a
high volume yet few conversions, it can be valuable to look
at what those shoppers did next. Did a significant number of
them navigate to a particular category or product page? Did
they refine their search query and try again? This type of
information is a key way the WRE determines complementary
content and products.
One of the techniques the WRE uses is a process called
“collaborative filtering,” which utilizes data to cluster
recommended products. Collaborative filtering clusters
visitors around shared preferences using vector mathematics
and identifies specific users who are within a natural cluster,
but who have not yet seen recommendations related to the
cluster’s shared preferences. The WRE uses NLP to interpret
the attributes (and their synonyms) for the products and
content in the cluster and identifies other pieces that should
be in the cluster based on their attributes. The system then
offers the recommendations. The visitor’s interaction with the
recommendation helps refine the statistical clustering.
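The vector mathematics behind this can be sketched with cosine similarity between visitors' engagement vectors. This is a simplified nearest-neighbor variant, not the WRE's clustering: the visitors, items, the similarity cutoff, and the functions `cosine` and `recommend` are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical engagement data: 1 means the visitor engaged with the item.
ITEMS = ["organic robe", "organic gift basket", "solar radio", "energy monitor"]
VISITS = {
    "user_a": [1, 1, 0, 0],
    "user_b": [1, 0, 0, 0],
    "user_c": [0, 0, 1, 1],
}


def recommend(user, visits, items, min_similarity=0.5):
    """Suggest items that similar visitors engaged with but this visitor has not seen."""
    mine = visits[user]
    scores = [0.0] * len(items)
    for other, theirs in visits.items():
        if other == user:
            continue
        similarity = cosine(mine, theirs)
        if similarity < min_similarity:
            continue  # too dissimilar to share this visitor's preferences
        for i, seen in enumerate(theirs):
            if seen and not mine[i]:
                scores[i] += similarity
    ranked = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in ranked if scores[i] > 0]


suggestions = recommend("user_b", VISITS, ITEMS)
```

In this toy data, user_b resembles user_a, so the item user_a engaged with and user_b has not yet seen is recommended; user_c's unrelated interests are ignored.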
Behavioral modeling is also used to identify areas on a site
where demand is going unmet. For example, if product
information does not contain a commonly used synonym for
that product, yet that synonym is used by shoppers as a site
search query, the WRE can learn that new term and use it
for recommendations and results.
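A simple way to picture this is to look at queries that are missing from the product information and see which product shoppers go to next. The session format, the function `learn_terms`, and the support threshold are assumptions for this sketch.

```python
from collections import Counter, defaultdict


def learn_terms(sessions, catalog_terms, min_support=2):
    """sessions: (query, product clicked next or None). Returns {new_term: product}."""
    follow_ups = defaultdict(Counter)
    for query, product in sessions:
        if query not in catalog_terms and product:
            follow_ups[query][product] += 1
    learned = {}
    for query, products in follow_ups.items():
        product, count = products.most_common(1)[0]
        if count >= min_support:  # require repeated evidence before learning
            learned[query] = product
    return learned


# Hypothetical search sessions and catalog vocabulary.
sessions = [
    ("couch", "sofa-123"),
    ("couch", "sofa-123"),
    ("couch", None),
    ("sofa", "sofa-123"),
    ("settee", "sofa-123"),
]
catalog_terms = {"sofa"}
learned = learn_terms(sessions, catalog_terms)
```

Here “couch” is learned as a term for the sofa because shoppers repeatedly reach that product after searching for it, while a single “settee” session is not yet enough evidence.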
Personalization
Every individual visitor to your digital experience has their
own personality, likes, dislikes, needs, and language style - and
they expect you to deliver recommendations relevant to them,
whether they are just starting their research or returning to a
site on a different device altogether.
Luckily, a visitor’s engagement across your digital touch points
reveals a great deal about their evolving intent. They signal
their intent by typing queries, sure, but also through the email
links they click, particular pages they open on particular devices,
shopping cart additions, and their click-through behavior.
Given a small amount of engagement - say, three page views -
the WRE can begin to learn the affinities of an individual user.
And not only is each person different, their circumstances and
needs change minute-by-minute. A user visiting an insurance
site when there is bad weather in their area may like to see a
fact sheet of weather damage covered by home insurance; the
same user visiting the same site on their mobile phone abroad
might be more interested in a direct link to file a traveler’s
insurance claim.
[Figure: series of clicks by Users 1-3 for the same keyword query (K1) from a search engine, across category pages (C1, C2) and product pages (P1-P4), with pages associated with concept tag T marked. Product page P1 was already tagged with a concept. We propagate the same tag to category page C1, since it occurs with P1 often and within a few clicks.]
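The tag propagation shown in the figure can be sketched as a co-occurrence count over click paths. The paths, the thresholds, and the function `propagate_tags` are hypothetical, and the rule used here (a page inherits a tag if it appears within a few clicks of the tagged page in enough sessions) is only one plausible reading of the figure.

```python
from collections import Counter


def propagate_tags(click_paths, tags, max_gap=2, min_sessions=2):
    """Copy a page's concept tag to pages that often appear within max_gap clicks of it."""
    co_counts = Counter()
    for path in click_paths:
        pairs = set()
        for i, page in enumerate(path):
            if page not in tags:
                continue
            start, stop = max(0, i - max_gap), min(len(path), i + max_gap + 1)
            for j in range(start, stop):
                if path[j] != page and path[j] not in tags:
                    pairs.add((page, path[j]))
        co_counts.update(pairs)  # count each pair once per session
    propagated = dict(tags)
    for (tagged, neighbor), sessions in co_counts.items():
        if sessions >= min_sessions:
            propagated.setdefault(neighbor, tags[tagged])
    return propagated


# Hypothetical click paths for the same keyword query, one per user.
paths = [["C1", "P1"], ["C2", "C1", "P1"], ["C1", "P2", "P1"]]
tags = {"P1": "T"}
result = propagate_tags(paths, tags)
```

Category page C1 appears near P1 in all three sessions and inherits tag T, while C2 and P2 each co-occur only once and stay untagged.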
This combination of circumstantial data, aggregate and
individual user behavior, and the WRE’s understanding of
product and content relationships, can provide a contextually
relevant experience to each individual - whether they are
known or anonymous.
BloomReach’s understanding of language and search intent,
combined with our algorithmic knowledge of user behavior
and performance data, is how the WRE data pipeline returns
contextually relevant related searches, products, and content to
your visitors.
Our Applications
The Web Relevance Engine is the beating heart behind the
data-driven personalization, relevance, and ranking within all
BloomReach applications. Each application weaves together
the data-based intelligence from our algorithms to deliver the
insights and automated actions that are the most impactful to
your business.
BloomReach Experience leverages the WRE to bring together
a deep understanding of an organization’s content with a
contextual understanding of an individual’s intent. Provide
relevant experiences to your visitors - whether they are logged-
in or anonymous - across every device, channel, and touchpoint
they interact with. BloomReach Experience’s separation of that
content from its presentation means you can go beyond simply
offering visitors the relevant pages and personalize each
individual component for a truly agile experience.
[Figure: 2 individuals, same demographic: Female, 28, College grad, Mountain View, CA, Income >$80k]
Cross-Channel Data
Your audience uses multiple devices, and they expect their experience to be consistent across those devices. In an ideal world, your visitors would be logged in across devices, but we don’t live in that world (in fact, our research found that only around 1% of eCommerce customers are logged in across desktop, tablet, and smartphone).
To overcome this challenge, the WRE uses pattern detection to connect anonymous users across multiple devices using a number of behavioral and technical signals.
This cross-device connection can be useful for subtly personalizing the experience of a shopper and for proving the “mobile influence” for visitors who browse on the smartphone, yet convert on the desktop.
[Figure: BloomReach Digital Experience Marketecture. Labels include: BloomReach Personalization (Search, Merchandising, Testing & Targeting, Insights, Open Integrations); BloomReach Experience (Content Management, Omnichannel Experience, Forms, Languages, Trends, Relevance, Experiments, Open Integrations); BloomReach Organic (SEO, Thematic Pages, Tools); BloomReach Open APIs (Delivery, Capture, Syndication); BloomReach Marketplace (Extensions); BloomReach AI: Algorithmic Optimization (SEO, Channel, Device), Predictive Analytics (Behavioral, Transactional), Machine Learning (Relevance, Testing, Performance), Natural Language Processing (Word Sense Disambiguation, Semantic Interpretation); Applications, Touchpoints, Devices, Channels; Catalog, Content, Customer Data, WWW; data types: Aggregated, Search, Owned, Purchased, Feeds, UGC, Social, Transactional, Preference, Profile.]
Using the WRE, BloomReach Personalization optimizes and
personalizes product discovery for every visitor. A shopper’s
path to purchase may include site search, navigating to category
pages, and engaging with product recommendations, and that
journey may take place across multiple devices. BloomReach
Personalization learns from every interaction with a shopper to
better tailor the product mix, ranking, and recommendations
to suit her expressed affinities. The result is an improved,
frictionless customer experience that measurably impacts
revenue.
BloomReach Organic uses the WRE to increase crawlability,
improve site content and cluster products on high quality
thematic pages. Each of these helps our customers capture and
convert consumer demand from search engines. The WRE also
helps identify additional opportunities, facilitates creating new
category pages (which can also be used for other marketing
channels, like email, paid search or social), and continuously
monitors the health of those pages - both from a technical
and consumer perspective. Achieving quality coverage for
organic search at scale is a challenge that necessitates having
technology like the WRE.
Experience is the battleground for today’s digital businesses.
The bar for this experience continues to rise: not only does your
audience expect every interaction to be as intuitive as flipping
a switch, they also want it all tailored to them in the moment -
getting exactly what they want right now.
The Web Relevance Engine gives you the power to do just
that, harnessing the power of big data, machine learning, and
natural language processing to deliver the relevant results
consumers have come to expect in the era of always-on search
and discovery. The WRE is driving business success, while
delivering the promise of a relevant and reliable web.
About BloomReach
BloomReach is a Silicon Valley firm that brings businesses the first open and intelligent Digital Experience Platform (DXP). BloomReach drives customer experience to accelerate the path to conversion, increase revenue, and generate customer loyalty.
With applications for content management, site search, page management, SEO optimization and role-based analytics, BloomReach is a central location for all players who manage customer experience to come together and intelligently drive business outcomes. BloomReach’s Web Relevance Engine (WRE) algorithmically understands content and users, matching demand and intent data from across the web. BloomReach’s industry-leading tools unlock the powerful creativity of humans to improve omnichannel customer experiences at scale. Together, our users and our intelligent tools generate millions of dollars of proven incremental sales.
BloomReach’s portfolio of customers includes Neiman Marcus, Staples, REI, Mailchimp, and Autodesk. Created in 2009, BloomReach is headquartered in Mountain View, CA with offices worldwide and is backed by investment firms Bain Capital Ventures, Battery Ventures, NEA, Salesforce Ventures and Lightspeed Ventures.
CONTACT US
MOUNTAIN VIEW
+1 888 263 3917
AMSTERDAM
+31 20 522 4466
LONDON
+44 20 35 14 99 60
BOSTON
+1 877 414 4776
BANGALORE
+91 80 42 278 526