Improving Data Modeling Workflow
Uploaded by: looker
Category: Data & Analytics
Improving the Data Modeling Workflow
October 28, 2014
LendingHome is the most advanced
mortgage marketplace platform in the
world
What Is LendingHome?
Simple, efficient borrower experience
Investors are matched to safe, high-yield loans
World-class mortgage ops process driven by
transparent analytics
Statistical models for credit, underwriting, pricing,
sales, marketing
LendingHome: Mortgage Marketplace
• Fastest loan funded in 72 hours
• Borrowers prequalify in 3 minutes
Investors are matched to safe, high-yield loans
• Line-item accounting for payout tracking
• Proceeds from loans wired in under a second
Looker is a data exploration solution
that operates in the database
to enable organizations to explore data
in all its detail.
What Is Looker?
Some of Looker’s Customers
LendingHome: Operations
World-class operational process driven by
transparent analytics
• Integrations with over 20 vendors
• Rigorous 96-item checklist
• Looker-driven workflow measures quality and
timing of entire process
• Highly complex reporting, dashboards are
easy in Looker
LendingHome: Ops Analytics
The Challenge
Data scientists create value by creating
actionable models
More time is spent preparing data and evaluating
results than applying heavy data science
techniques
How to speed up analytical cycles
Focusing on Expertise
Workflow stages: Define Goals, Data Cleaning and Shaping, Model Dev, Model Eval
(Diagram labels: Expertise, “Work”)
Our Example
38 million flights between 2000 and 2005
Carrier information, departure and arrival
location, manifest data, aircraft data
Modeling on-time rates
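As a rough illustration of what "on-time rate" means as a modeling target, the sketch below computes it per carrier. The sample rows, the column layout, and the 15-minute cutoff are assumptions made for illustration; the slides do not state how on-time is defined.

```python
from collections import defaultdict

# Hypothetical sample rows: (carrier, arrival_delay_minutes). Not real flight data.
flights = [
    ("AA", 5), ("AA", 40), ("AA", -3),
    ("UA", 0), ("UA", 22),
]

# Assumed definition of "on time": arriving less than 15 minutes late.
ON_TIME_THRESHOLD_MIN = 15


def on_time_rate_by_carrier(rows, threshold=ON_TIME_THRESHOLD_MIN):
    """Return {carrier: fraction of flights arriving under `threshold` minutes late}."""
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for carrier, delay in rows:
        totals[carrier] += 1
        if delay < threshold:
            on_time[carrier] += 1
    return {carrier: on_time[carrier] / totals[carrier] for carrier in totals}


rates = on_time_rate_by_carrier(flights)
```

The same aggregation could be grouped by departure airport or aircraft type instead of carrier, which is one way to screen candidate variables before modeling.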
Getting Started
Analytics as the shortcut to variable selection
Data Cleaning
Analytics can provide a distinct advantage in the
constant struggle for clean data
Data Cleaning Cont.
Selecting Variables
Pulling Data
Analytics allow data scientists to leverage analytical
modeling to easily grab reshaped data
- Time-zone functions
- Sub-select functions
- Cleaned/filtered data
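The three items above can be sketched in plain Python. This is not how Looker does it (there the reshaping happens in the database); it is a minimal stand-in, with hypothetical field names (`dep_local`, `utc_offset`, `arr_delay`) and made-up rows, showing a time-zone conversion and a filter applied while pulling data.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical raw rows: local departure time, UTC offset in hours,
# and arrival delay in minutes (None means the value is missing).
raw = [
    {"dep_local": "2004-03-01 08:30", "utc_offset": -8, "arr_delay": 12},
    {"dep_local": "2004-03-01 09:15", "utc_offset": -5, "arr_delay": None},
    {"dep_local": "2004-03-01 23:45", "utc_offset": -6, "arr_delay": -4},
]


def to_utc(local_str, offset_hours):
    """Interpret a local timestamp at a fixed UTC offset and convert it to UTC."""
    tz = timezone(timedelta(hours=offset_hours))
    local = datetime.strptime(local_str, "%Y-%m-%d %H:%M").replace(tzinfo=tz)
    return local.astimezone(timezone.utc)


def pull_clean(rows):
    """Drop rows with a missing delay and normalize departure times to UTC."""
    cleaned = []
    for row in rows:
        if row["arr_delay"] is None:
            continue  # filtered out: no delay recorded
        cleaned.append({
            "dep_utc": to_utc(row["dep_local"], row["utc_offset"]),
            "arr_delay": row["arr_delay"],
        })
    return cleaned


clean = pull_clean(raw)
```

Fixed offsets keep the sketch free of time-zone database dependencies; real schedules would need named zones to handle daylight saving time.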
LendingHome: Quant Modeling
• Credit models learned over 25M loans, 4B
payments, macroeconomic factors
• Scoring transparent to borrowers, investors
• Finds and predicts borrower conversion from
180M RE transactions
• Feature extraction, data exploration, train/test
splits, model analysis in Looker
LendingHome: Factor Analysis
LendingHome: Noisy Data
Building The Data Model
Let’s try a simple model:
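The model shown on this slide is not preserved in the transcript. As a stand-in, here is a minimal sketch of one possible simple model: a one-variable least-squares line relating on-time rate to departure hour. The choice of variable and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical aggregates: departure hour vs. observed on-time rate.
dep_hour = [6, 10, 14, 18]
on_time_rate = [0.90, 0.85, 0.80, 0.75]

slope, intercept = fit_line(dep_hour, on_time_rate)


def predict(hour):
    """Predicted on-time rate for a given departure hour."""
    return slope * hour + intercept
```

A rate bounded between 0 and 1 is often better served by logistic regression; the straight line is used here only because it is the simplest thing to inspect.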
Evaluating The Model
Cleaning
Analytics
Modeling
Continuing Our Example
Re-using that same analytics process, we can
explore our data more seamlessly
Examining Residuals
In-Sample vs Out-of-Sample
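A minimal sketch of the comparison, assuming a one-variable least-squares model and hypothetical data: fit on a training subset, then compute residuals and root-mean-square error on both the training rows (in-sample) and the held-out rows (out-of-sample).

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(model, xs, ys):
    """Observed minus predicted value for each row."""
    slope, intercept = model
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]


def rmse(res):
    """Root-mean-square of a list of residuals."""
    return math.sqrt(sum(r * r for r in res) / len(res))


# Hypothetical (departure_hour, on_time_rate) pairs. Not real data.
data = [(6, 0.91), (8, 0.88), (10, 0.86), (12, 0.82),
        (14, 0.81), (16, 0.77), (18, 0.76), (20, 0.71)]

# Deterministic holdout: every 4th row is kept out of the fit.
test = data[3::4]
train = [row for i, row in enumerate(data) if i % 4 != 3]

train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

model = fit_line(train_x, train_y)
in_res = residuals(model, train_x, train_y)
out_res = residuals(model, test_x, test_y)
in_rmse = rmse(in_res)
out_rmse = rmse(out_res)
```

An out-of-sample error much larger than the in-sample error suggests the model is fitting noise; residuals plotted against a variable left out of the model can hint at what to add next.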
LendingHome: Evaluate Models
Layering Entirely New Variables
Q & A
Try Looker on your own data.
Looker.com/trial