See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/326838388
LSTM network time series predicts high-risk tenants
Presentation · July 2018
DOI: 10.13140/RG.2.2.19849.95849
CITATIONS
0READS
5
6 authors, including:
Some of the authors of this publication are also working on these related projects:
One Small Step for Machine, One Giant Leap for the Market: Lessons Learnt from a Real World Machine Learning Project View project
Bus Service Optimisations View project
Wolfgang Garn
University of Surrey
17 PUBLICATIONS 22 CITATIONS
SEE PROFILE
All content following this page was uploaded by Wolfgang Garn on 06 August 2018.
The user has requested enhancement of the downloaded file.
10/07/2018
1
LSTM network time series
predicts high-risk tenants
WOLFGANG GARN, Y IN HU, PAUL NICHOLSON, BEVAN JONES,
HONGYING TANG, WILLIAM WRIGHT
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
OverviewBenefits to the real estate sector decreasing lost revenue, increasing efficiency providing more help to tenants before their debt becomes
unmanageable
How? Measure arrears risk for each individual tenant differentiate between short-term and long-term arrears risk predict trajectory of arrears Identify factors that can be operationally used to assist tenants
arrears management
Structure Background Data - arrears Risk & Factors Method - LSTM – network Results & Insights
AbstractIn the United Kingdom, local councils and housingassociations provide social housing at secure, low-renthousing options to those most in need. Occasionally manytenants have difficulties in paying their rent on time and fallinto arrears. The lost revenue can cause substantial financialburden for these agents, while falling into arrears can causestress to tenants. An efficient arrears management scheme isto target those who are more at risk of falling into long-termarrears so that interventions can be used for persistent loss ofrevenue. In our research, a Long Short-Term Memory Network(LSTM) based time series prediction model is implemented todifferentiate the high-risk tenants from relatively temporaryones. This model measures the arrears risk to differentiatebetween short-term and long-term arrears risk and predictsthe trajectory of arrears for each individual tenant. Moreover,further arrears analysis is conducted to investigate whichfactors could provide more assistance for tenants before theirdebt becomes unmanageable. A five-year rent arrears datasetis used to train and evaluate the proposed model. The rootmean squared error (RMSE) is used to punish large errors bymeasuring of the differences between the arrears actuallyobserved and arrears predicted by a model. Hence, this modelbrings benefits to the real estate sector by decreasing lostrevenue, increasing efficiency and providing more help totenants before their debt becomes unmanageable.
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
BackgroundHastoe Housing Association Creating opportunities to innovate
Sustainable Homes Ltd Influencing policy and practice
University of Surrey - Computing and Business School
The associate
Innovate UK funding for 3 years
Knowledge Transfer Partnership
10/07/2018
2
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Content
BackgroundData – rent arrears & factorsMethod - LSTM – network Results & Insights
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Data sourcesHeterogeneous sets Excel maps, databases, …
Mosaic UK Geodemographic classification of households
Geographic's of vacant homes Deprived areas more likely to be in rent arrears
Finances The rich, the poor and the rest
Rent arrears (2555 tenants)
Data: five years John Naisbitt wrote “We are drowning in information but starved for knowledge.”
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Scatter plots & Correlations
Correlation analysis of housing beneficiaries does not support associations between factors
Factors◦ Energy Efficiency (ree)
◦ Housing Benefits (rhb)
◦ Rent in arrears (rma)
ree -0.014 -0.004
-0.014 rhb -0.122
-0.004 -0.122 rma
r values
10/07/2018 6
SA
P r
ating
Housin
g b
enefit
Month
in a
rrears
SAP rating Housing benefit Month in arrears
0.00
0.02
0.04
0.06
Corr:
-0.0139
Corr:
-0.00416
0
50
100
150
200
Corr:
-0.122
-10
0
10
40 60 80 0 50 100 150 200 -10 0 10
10/07/2018
3
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Indirect factorsBenefits >> Energy efficiency >> rent arrears
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Rent arrears & Energy efficiencyA detour … the whole housing market
SAP = Standard Assessment Procedure for energy efficiency
Surprisingly much rent arrears
Lowest SAP rating –> highest rent arrears
Highest SAP rating -> lowest rent arrears
Other SAP ratings debatable
F E D C B A
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Chase for featuresRent arrears and average SAP rating
Average at a certain level>>
potential for systematic feature engineering?
From a subject matter expert’s view:Rent arrears drops as the SAP rating increases
s
10/07/2018
4
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Averages are better!?Correlations become visible
More importantly patterns become visibleFeature engineering “can start”
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Rent arrears & SAP “again”The better the property the less rent arrears
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Feature list78 features were considered
Feature List (ranked by importance to management cost)
SAP 2015 rating
Year of Built
Max Occupants
Number of Bedrooms
Urban (Urban to Rural)
Average of Outdoors Sub-domain Decile (where 1 is most deprived 10% of LSOAs)
Bedrooms (Many to Few)
Environment (Care to Don't Care)
Age (Young to Old)
Technology (Adopt to Reject)
Residency (Long to Short)
Facebook Usage (High to Low)
Income (High to Low)
Children (High to Low)
Smartphone (Yes to No)
Rented (High to Low)
Property (???? to ?)
Mortgage Debt (High to Low)
Twitter Usage (High to Low)
Health (Good to Bad)
10/07/2018
5
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Content
BackgroundData - arrears Risk & FactorsMethod - LSTM – network Results & Insights
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
LSTM network - intro
Source: Cloah’s blog: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
memory adaptation
forg
et
Forget gate
input gate
candidates
New “memory” cell state
input
output
output
memory
output gate
outputLong Short-Term Memory (see Hochreiter & Schmidhuber, 1997)
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Content
BackgroundData - arrears Risk & FactorsMethod - LSTM – network Results & Insights
10/07/2018
6
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Arrears prediction
Time-series forecasting
85.4% accuracy over 18mths
Tenant Clusters – for policies
Towards automated processes
Human intervention where most needed
Time [month]
Arr
ears
[m
on
th] f
or
ten
ant
x
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Quality - Confusion matrixMajority of arrears and non-arrears are identified correctly.
Assuming difference between actual arrears and predicted arrears <£50 is acceptable for certain month
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Arrears per clusterSix types of tenants were identified
All levels are “pretty” stable Four distinctive levels of avg. rent arrears
Three types of tenants with similar levels oscillate around the zero level
One type has “negative” rent arrears i.e. pay rent in advance
~£400/month in arrears tenants Tenant’s benefit: comparable to a £400 free credit
Association’s cost: £146.8k x 30% = £44.0k per year (for cluster) 30% = administrative cost related to reminders & help strategies
~£1,000/month in arrears tenants Tenant’s benefit: comparable to a £1,000 free credit
Association’s cost: £159k x 30% = £47.4k per year (for cluster)Time [month]av
g. r
ent
arre
ars
per
clu
ster
10/07/2018
7
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Arrears – regional distributionGood predictions for east & west
Prediction for south “interesting”
Geography matters ->
model can be improved using this
East (201x) arrears: 34.8%• Actual - Predicted: 4.93%
South (201x) arrears: 35.7%• Actual - Predicted: -23.8%
West (201x) arrears: 29.5%• Actual – Predicted: 4.37%
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Results & Insights - summaryBenefits to the real estate sector decreasing lost revenue,
“initial customer selection”, Tackle lowest SAP properties
increasing efficiency “credit” acceptance rather than “chasing” (or automatization)
providing more help to tenants before their debt becomes unmanageable
How? Measure arrears risk for each individual tenant
85.4% accuracy
differentiate between short-term and long-term arrears risk
predict trajectory of arrears
Identify factors that can be operationally used to assist tenants arrears management £rent arrears, rel. rent arrears, region, SAP rating, benefits
Time [month]Arr
ears
[m
on
th]
for
ten
ant
x
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
AppendixoBlogs: Sustainable Homes (Hastoe) - KTP project - machine learning techniques to gain insights into the sustainability of homes o The rise of the machines - learning for environmental good and cost savings (Blog, May 24, 2017)
o Are we “boiled” for choice? (Blog, August 3, 2017)
o AI detects roof status using Drones (BA MSc dissertation summary, October 10, 2017)
oHastoe Analytics, OR & Data Science applications
oNeural Networks
oReal estate valuation using regression models and artificial neural networks: An applied study in Thessaloniki by Alexios Georgiadis
10/07/2018
8
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Five years – long term
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Machine Learning in HastoeAn overview
Machine Learning
in Hastoe
Improve service
Reduce cost
Optimise decision
Increase income
Arrear collection
Warning prediction
Investment decision
Arrear collection
startgy
Trickle sales
Proactive and targeted
service
Tenant classification
Repairs analysis
Component
replacement prediction
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
Data Science in HastoeOverview
10/07/2018
9
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
http://www.asimovinstitute.org/neural-network-zoo/
LSTM network time series predicts high-risk tenantsWolfgang Garn, Yin Hu, Paul Nicholson, Bevan Jones, Hongying Tang
View publication statsView publication stats
Top Related