Interana-Understanding Event in Event Data


  • 7/24/2019 Interana-Understanding Event in Event Data

    1/12

    Understanding Event in

    Event Data

    eBook


    Table of Contents

    What is Event Data?
    Breaking Down Event Data
    What Makes Event Data Different?
    Where Does Event Data Come From?
    Analysis Perfect for Event Data
    Challenges of Event Data
    Summary

    What is Event Data?

    By definition, event data is data from any identifiable occurrence that has
    significance for system hardware or software. User-generated events include
    keystrokes and mouse clicks, among a wide variety of other possibilities. An
    event describes an action performed by or associated with an entity at a
    certain time. Event data is a continuous stream of actions that reveals the
    patterns that people, products, and machines make over time. It helps
    describe when and how things happen. Event data is the foundation for
    behavioral analytics, enabling an understanding of how customers behave and
    how products are used.

    Event data is simply any data point that has a timestamp, an entity, and the
    attributes of an action. As simple as that sounds, events are at the heart
    of many businesses. Clickstreams, logs, data from IoT devices, sensor data,
    and more are all event data. A mouse click is an event; it happens at a
    point in time, and its context includes attributes such as where the entity
    clicked and what was clicked.

    Analysis of event data is based on key concepts about chronologically
    ordered data and its relationship to the world. For example, event data is
    generated by an entity that follows a path through a conversion flow, taking
    action at certain points along the way. If we examine the events of all
    entities that went through the conversion flow, we can understand their
    behavior and start to answer questions such as:

    What are the characteristics of entities that converted or dropped off?
    Which entities took longer to convert, and why?
    What happened between each step of the conversion flow?

    Breaking Down Event Data

    So what does event data look like? Each piece of event data has three key pieces

    of information: a timestamp, one or more entities, and attributes.

    Timestamp: Just like it sounds, it records at what point in time the action took

    place.

    Entity: Who took the action. This could be a person, machine, sensor, etc.

    Attributes: These are inherent characteristics that describe what happened,

    like a click or a call. The more properties and information captured here, the

    richer the data.

    Here is a simple example of an event captured on a website in JSON:

    {"timestamp": "2015-06-30T13:50:00-0600", "id": "05632",
     "attributes": {"type": "click", "page": "request_demo",
       "previous_page": "product_tour", "session_length": 1060,
       "browser": "chrome", "ip_address": "10.0.0.1",
       "ip_region": "united states", "ip_state": "california",
       "ip_city": "san francisco"}}
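    As a rough illustration, an event like this one can be split into its
    three parts in a few lines of Python. This is a minimal sketch: the sample
    record is a shortened, hypothetical copy of the event shown here, and the
    function name parse_event is an assumption made for this example.

```python
import json
from datetime import datetime

# Hypothetical sample event, shortened from the website example.
# The offset is written as -06:00 so older Python versions can parse it.
RAW_EVENT = """
{"timestamp": "2015-06-30T13:50:00-06:00", "id": "05632",
 "attributes": {"type": "click", "page": "request_demo",
                "previous_page": "product_tour", "session_length": 1060}}
"""


def parse_event(raw):
    """Split a JSON event into timestamp, entity, and attributes."""
    record = json.loads(raw)
    timestamp = datetime.fromisoformat(record["timestamp"])  # when
    entity = record["id"]                                    # who
    attributes = record["attributes"]                        # what and how
    return timestamp, entity, attributes


timestamp, entity, attributes = parse_event(RAW_EVENT)
print(timestamp.isoformat(), entity, attributes["type"])
```

    Keeping the entity identifier as a string preserves the leading zero in
    "05632", which a numeric type would drop.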

    Let's take this one step further and explore a conversion flow for an
    e-commerce site. Let's look at some high-level events in the flow:

    Event #1: Shopper D (the entity) follows a link from your advertisement on
    a third-party website
    Event #2: Views a suggested item on your site using the quick-view feature
    Event #3: Views your sizing guide
    Event #4: Selects the sweater shown in the advertisement
    Event #5: Selects size large
    Event #6: Checks out with a credit card

    Each of these events can be represented by a differently shaped marker on a
    timeline.
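    The six events in this flow can also be written down as data. The sketch
    below is illustrative only: the start time, the offsets in seconds, and
    the event type names are invented for the example, since the text gives no
    actual timestamps. It orders the events chronologically and measures the
    time between consecutive steps.

```python
from datetime import datetime, timedelta

# Hypothetical start time and offsets (seconds); the source gives no real values.
start = datetime(2015, 6, 30, 13, 50, 0)
steps = [
    ("ad_click", 0),
    ("quick_view", 40),
    ("view_sizing_guide", 95),
    ("select_item", 150),
    ("select_size", 160),
    ("checkout", 300),
]

# Each event carries a timestamp, an entity, and an attribute (its type).
events = [
    {"timestamp": start + timedelta(seconds=offset),
     "entity": "shopper_d",
     "type": event_type}
    for event_type, offset in steps
]


def gaps_between_steps(events):
    """Return (from_type, to_type, seconds) for each consecutive pair of events."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    return [
        (a["type"], b["type"], (b["timestamp"] - a["timestamp"]).total_seconds())
        for a, b in zip(ordered, ordered[1:])
    ]


for from_type, to_type, seconds in gaps_between_steps(events):
    print(f"{from_type} -> {to_type}: {seconds:.0f}s")
```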


    Each event above has several important attributes. Some attributes of
    Event #1 above are:

    The timestamp: exactly when the shopper clicked through to the site (when)
    The entity (Shopper D)
    The session ID (this is context, or the how: the event happened within a
    defined session)
    The advertisement location (more about how the event happened)
    The item pictured in the ad (another attribute that provides context)

    Attributes of Event #2 (views a suggested item) include:

    The timestamp: exactly when the shopper viewed the suggested item (when)
    The entity (again, Shopper D)
    The session ID (how)
    The item viewed (context)

    What Makes Event Data Different?

    Event Data is Attribute-Rich

    Event data can have hundreds of attributes that describe each event.
    Because we use event data to discover behavior patterns, we want to have
    the full context for every event. Every attribute we store is context we
    can analyze; this makes event data rich. For Shopper D in the example
    above, we can store attributes like first and last names, birth date,
    gender, favorite color, hometown, and preferred payment method. Then we
    could define a cohort of shoppers who are over 50 and whose hometown is New
    York, and follow their behavior over time.

    Another reason events can have hundreds of properties is that they may
    describe not just one entity, but multiple entities involved in a single
    event. The attributes of each entity become part of the event data. For
    every transaction on an e-commerce site there may be a supplier, a vendor,
    a shopper, and a third-party payer (credit card company, PayPal), any of
    whom may participate in a given event during the transaction.

    Event Data is Massive

    For most companies, event data is their fastest-growing type of data. But
    why is it so big? Event data captures the actions that an entity takes
    over time, so for every one entity, you could have tens of thousands of
    actions. Imagine a popular wearables company with hundreds of thousands of
    devices in the market. Each wearable device could generate thousands of
    rows of event data daily, quickly adding up to billions of events in just
    a short period of time.
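    A quick back-of-the-envelope calculation shows how fast this adds up. The
    two input figures below are assumptions picked to match the phrases
    "hundreds of thousands of devices" and "thousands of rows" per day; they
    are not measurements from any real company.

```python
# Assumed inputs, chosen only to match the orders of magnitude in the text.
devices = 300_000                  # "hundreds of thousands of devices"
events_per_device_per_day = 2_000  # "thousands of rows of event data daily"

events_per_day = devices * events_per_device_per_day

# Ceiling division: whole days needed to pass one billion events.
days_to_one_billion = -(-1_000_000_000 // events_per_day)

print(f"{events_per_day:,} events per day")
print(f"{days_to_one_billion} days to exceed one billion events")
```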

    Event Data is Denormalized

    In an event data store, data is structured but never normalized. This is
    unlike a relational database, in which redundant data is normalized and
    referenced from a single location in a single table. Every time a value
    changes, the previous value is overwritten and only the last update is
    available. But when we analyze event data, we want to know the state of
    the world at the moment of the event.

    For example, imagine storing data from an anemometer, which measures wind
    speed. The meter takes a reading every 30 seconds, and the wind speed value
    is automatically updated in the weather database. In this case, we will
    always know how fast the wind was blowing in the last 30 seconds, but we
    will never know how the wind speed has changed over the last hour. This is
    why, in an event data store, data is always appended and never updated.
    Every wind speed event is stored permanently. For a weather station that
    measures not just wind speed but also temperature, humidity, barometric
    pressure, and precipitation, every attribute is stored for every sensor
    reading. Only when event data is denormalized can we use it to find
    patterns and gain insight into change over time.
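    The difference between overwriting and appending can be sketched with two
    tiny in-memory stores. Both classes and the sample readings are
    hypothetical and exist only to illustrate the anemometer example; they do
    not represent any particular database.

```python
class OverwritingStore:
    """Keeps only the latest value per sensor, like an updated table row."""

    def __init__(self):
        self.current = {}

    def write(self, timestamp, sensor, value):
        self.current[sensor] = value  # the previous value is lost


class AppendOnlyStore:
    """Keeps every reading as its own event; nothing is ever updated."""

    def __init__(self):
        self.events = []

    def write(self, timestamp, sensor, value):
        self.events.append({"timestamp": timestamp, "entity": sensor, "value": value})

    def history(self, sensor):
        return [e["value"] for e in self.events if e["entity"] == sensor]


# Hypothetical wind speed readings taken every 30 seconds.
readings = [(0, 12.0), (30, 14.5), (60, 9.0)]

overwriting, append_only = OverwritingStore(), AppendOnlyStore()
for seconds, wind_speed in readings:
    overwriting.write(seconds, "anemometer_1", wind_speed)
    append_only.write(seconds, "anemometer_1", wind_speed)

print(overwriting.current["anemometer_1"])  # only the last reading survives
print(append_only.history("anemometer_1"))  # the full series is available
```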

    Event Data can be Schemaless

    As mentioned earlier, different types of events and even individual events
    of the same type may have different numbers of attributes. In other words,
    the data does not necessarily follow a particular schema. Since event data
    may be schemaless or adhere loosely to a schema, storing event data does
    not require a declared schema and accepts any number of attributes per
    event. A time attribute and an entity attribute are required for each
    event; any other attributes can be arbitrary. For example, while a group
    is running, their activity trackers could record four attributes: distance,
    stride length, heart rate, and speed. But when they start to walk, their
    activity trackers may only capture two attributes: heart rate and stride
    length.
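    One way to picture a loose schema is a check that enforces only the time
    and entity attributes and lets everything else vary. This is a sketch
    under assumptions: the field names, sample values, and the validate_event
    helper are all invented for illustration.

```python
REQUIRED_FIELDS = ("timestamp", "entity")  # the only attributes every event must have


def validate_event(event):
    """Accept any event that has a timestamp and an entity; other keys are free-form."""
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError(f"event is missing required fields: {missing}")
    return event


# Hypothetical activity tracker events with different attribute sets.
running = {
    "timestamp": "2015-06-30T07:00:00-06:00", "entity": "runner_1",
    "distance_m": 1200, "stride_length_m": 1.1, "heart_rate": 150, "speed_mps": 3.2,
}
walking = {
    "timestamp": "2015-06-30T07:20:00-06:00", "entity": "runner_1",
    "heart_rate": 105, "stride_length_m": 0.7,
}

for event in (running, walking):
    validate_event(event)
    print(sorted(k for k in event if k not in REQUIRED_FIELDS))
```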

    Event Data is Connected by Time

    Event data has a native concept of time and illustrates the connections
    between related events in a specified time period. This makes it easy to
    combine multiple data streams, because they all have time in common. For
    example, three separate data streams from mobile logs, web logs, and
    purchase history have time as a common reference and can thus be merged
    into a single source for even richer insights.
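    Because every stream shares a time axis, merging streams that are each
    already in time order is a standard k-way merge. The sketch below uses
    Python's heapq.merge; the three sample streams, their integer timestamps,
    and the event names are made up for the example.

```python
import heapq

# Hypothetical streams, each already sorted by timestamp (first tuple field).
mobile_logs = [(1, "mobile", "app_open"), (7, "mobile", "add_to_cart")]
web_logs = [(3, "web", "page_view"), (5, "web", "search")]
purchases = [(9, "purchase", "order_placed")]

# heapq.merge lazily interleaves sorted inputs into one sorted stream.
merged = list(heapq.merge(mobile_logs, web_logs, purchases, key=lambda e: e[0]))

for timestamp, source, event_type in merged:
    print(timestamp, source, event_type)
```

    heapq.merge assumes each input is already sorted; unsorted streams would
    need to be sorted first.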

    Where Does Event Data Come From?

    Event data is everywhere and produced in just about every company today.

    Remember, it is produced from the actions and interactions people or machines

    have with applications and products such as:

    Websites

    Servers

    Sensors

    Automobiles

    Home/Building Automation

    Wearables

    Smart Appliances

    Connected Electronics

    Call Detail Records

    Engineers and developers can capture just about any action or interaction
    that is made by an application, product, or machine. These events are
    stored in files such as clickstreams and logs.

    Analysis Perfect for Event Data

    Root Cause: Examines what precipitates an event and is often used to solve
    problems or identify catalysts. Focuses on why an event happened.

    A/B Testing: A form of hypothesis testing with two variants to show how
    they are similar or how they differ. Experiment results frequently inform
    product direction.

    Growth: Uncovers how entities communicate and interact with products and
    services so that businesses can use this information to develop ways to
    foster growth of the business.

    Retention: Reveals how often something is used and how often the entity
    returns over time. Often, this is explored by tracking a rate across
    different entity groups.

    Conversion: Tracks how one or more entities move through a predetermined
    path and locates where along the path the entity takes an action. Typical
    tools used in this process are funnels.

    Engagement: A method for looking at how much an entity is using a product
    or service. Typical metrics are average session length and
    daily/weekly/monthly active use.

    Churn: Commonly known as attrition, turnover, or defection, churn is the
    measurement of the likelihood of an entity disengaging. In addition to
    this probability, churn analysis looks for the exact point where (in the
    usage flow) and when (in time) the disengagement happens.
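    To make one of these analyses concrete, here is a minimal funnel sketch
    for conversion. It is not a description of any product's implementation:
    the event tuples, step names, and the funnel_counts function are
    assumptions for illustration. It walks the events once in time order and
    counts how many entities reached each step of a predetermined path.

```python
def funnel_counts(events, steps):
    """Count entities that reached each funnel step, in order.

    events: iterable of (timestamp, entity, event_type) tuples.
    steps:  ordered list of event types that define the funnel.
    """
    progress = {}  # entity -> number of funnel steps completed so far
    for _, entity, event_type in sorted(events):
        done = progress.get(entity, 0)
        if done < len(steps) and event_type == steps[done]:
            progress[entity] = done + 1
    return [sum(1 for done in progress.values() if done > n)
            for n in range(len(steps))]


# Hypothetical events for four shoppers.
events = [
    (1, "a", "view_item"), (2, "a", "select_size"), (3, "a", "checkout"),
    (1, "b", "view_item"), (4, "b", "select_size"),
    (2, "c", "view_item"),
    (5, "d", "checkout"),  # skipped the earlier steps, so it does not count
]
steps = ["view_item", "select_size", "checkout"]

print(funnel_counts(events, steps))
```

    The drop between adjacent counts shows where entities leave the path,
    which is the kind of question a funnel is meant to answer.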

    Challenges of Event Data

    Most companies struggle with event data because they are using
    technologies meant for relational data. Traditional RDBMSs (relational
    database management systems) are based on indexes to make point lookups
    fast, always trying to minimize the number of rows that need to be
    scanned. This works great when an index matches the workload, but for the
    most part, scanning indexes is slow. This is especially apparent when we
    consider the massive volumes of event data that need to be analyzed. Query
    times can range from a few hours to days, depending on the complexity of
    the query and the length of time being scanned.

    Remember, with event data, time is a first-order principle. You need to be
    able to scan all rows within a specific time period. A solution built for
    event data should assume massive scanning workloads to make queries
    efficient.
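    A time-range scan is simple when events are kept in time order: find the
    two boundaries, then read everything between them. The sketch below is a
    toy illustration of that idea with Python's bisect module, not a
    description of how any particular event store works; the sample events
    and the scan_time_range function are assumptions.

```python
from bisect import bisect_left


def scan_time_range(events, start, end):
    """Return events with start <= timestamp < end.

    events must be sorted by timestamp (the first tuple field). The timestamp
    list is rebuilt on every call to keep the sketch short; a real store would
    keep it alongside the data.
    """
    timestamps = [event[0] for event in events]
    low = bisect_left(timestamps, start)
    high = bisect_left(timestamps, end)
    return events[low:high]  # one contiguous slice, no per-row index lookups


# Hypothetical events: (timestamp in seconds, entity, event type).
events = [
    (10, "sensor_1", "reading"),
    (20, "sensor_2", "reading"),
    (30, "sensor_1", "reading"),
    (40, "sensor_2", "reading"),
]

print(scan_time_range(events, 20, 40))
```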

    Additionally, RDBMSs are usually queried with SQL or another query
    language designed for relational data. Again, these query languages are
    great for a point lookup, but struggle when asking questions about events
    over periods of time. Such questions almost always require multiple scans
    and computations that can make the queries slow and inefficient, not to
    mention the complexity of writing them.

    When performing analytics on event data, the query language should have
    primitives that turn many-step processes into a single pass to allow for
    maximum efficiency.

    Using an RDBMS to analyze event data brings two predominant challenges to
    the business. The first has to do with scale. Event data is massive in
    scale, and traditional relational databases do not store and analyze this
    data efficiently. There should be no disincentive to log as many events as
    possible; instead, businesses sample from the event data, potentially
    losing valuable attributes, and then wait hours or days for results. The
    second challenge is that the complexities inherent in query languages
    often prevent business teams, such as Product or Marketing, from accessing
    data to generate needed insights. Instead, business teams rely on data
    teams to query event data, which often yields incomplete answers because
    the process is not iterative; it is one question at a time.

    Using an RDBMS to store and analyze event data is a little like using a
    screwdriver to pound in a nail. You can get it done, but it isn't the best
    idea.

    Summary

    Investments in big data technologies are expected to top 60% in 2014. The
    question is not whether big data is here; it's how big this data will get.
    Much of this data is event data, growing by the millions daily and
    overwhelming businesses.

    Interana is a purpose-built solution for event data at scale. The
    full-stack configuration consists of a highly scalable backend combined
    with a visual and interactive frontend to deliver comprehensive analytics
    on event data. Consequently, Interana scales to trillions of events, while
    keeping query times to just seconds.

    Questions about conversion, retention, root cause analysis, and more
    across endless dimensions are a few short clicks away with behavior-based
    tools such as cohorts, funnels, and sessions. With event data at the core
    of the solution, Interana provides behavioral analytics to help companies
    unlock the insights they need to create new opportunities to grow their
    customer base, deepen engagement, and maximize retention in their products
    and services. Redefining self-service, Interana has done the hard work by
    eliminating the need to generate long and complicated queries that take
    hours to write and even longer to run. We aim to make data part of
    everyone's day.


    68 Willow Road

    Menlo Park, CA 94025

    www.interana.com