Finding trends in large scale document sets

Treparel Delftechpark 26 2628 XH Delft The Netherlands www.treparel.com Dr. Anton Heijs CEO/CTO [email protected] September 24, 2012

description

Finding trends in large scale document sets using KMX Patent Analytics. Incl. use case example of eBook patent landscape (incl. Apple iPad, Samsung Galaxy and Amazon Kindle)

Transcript of Finding trends in large scale document sets

Page 1: Finding trends in large scale document sets

Treparel

Delftechpark 26

2628 XH Delft

The Netherlands

www.treparel.com

Dr. Anton Heijs CEO/CTO

[email protected]

September 24, 2012

Page 2: Finding trends in large scale document sets

Analysing large patent portfolios: Nothing remains uncovered

Agenda

• Introduction to Treparel & KMX Patent Analytics

• How to deal with Big Data in patents?

• Landscape analysis of large patent sets

• Use Cases: SWOT analysis

Fig 1: patent landscape of ebook technologies

Treparel KMX – All rights reserved 2012 2 www.treparel.com

Page 3: Finding trends in large scale document sets

About Treparel

• Treparel is an innovative technology solution provider of

– Big Data Text Analytics and Visualization technology

– Patent Analytics solutions

• KMX is an integrated data analysis toolset which provides

– Fast and accurate insights into large unstructured document sets to allow companies to make better informed decisions.

• KMX software platform

– Strong focus on R&D with university ecosystem

– Over 30 man-years of software development

– Used by knowledge-driven organizations in technology, chemical, and life sciences

• Based in Delft, The Netherlands since 2006.

Page 4: Finding trends in large scale document sets

IP landscape trend example: 1970-2007

Companies shown in the landscape: Alstom, GE, Siemens, Dong Feng

Page 5: Finding trends in large scale document sets

The importance of analytics in IP

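Figure: investment and return ($) plotted against time (years) over the product life cycle: Research, Discovery, Development, Market Launch, Life Cycle Management. The curve is labelled Investment, Increasing Return and Decreasing Return.

Analyses supporting the life cycle: Research Analysis; Patent Analysis; Market, Legal and Competitive Analysis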

Page 6: Finding trends in large scale document sets

Economic changes in the last decades

• Globalization :

– More companies, increased competition, lower margins

– Innovation for shorter product life cycles

– Manufacturing has become a commodity by outsourcing to low wage countries

• The cost of R&D increased but competition leads to price erosion

• The need for a return on R&D investment increases the importance of value creation from IP

• Competitive edge of companies shifts from production-based to knowledge-based

Page 7: Finding trends in large scale document sets

The Intellectual Economy

Past: selling products finances R&D

Cycle: Research, Discovery; Development, Manufacturing; Marketing, Sales; Life Cycle Management; Revenues; Return on investments

Today: value creation and revenue generation from IP (to finance R&D)

Cycle: Research, Discovery; Development, Manufacturing; Marketing, Sales; Life Cycle Management; Revenues; Return on investments

Additional diagram labels: Revenues, Innovations, IP Licensing

Page 8: Finding trends in large scale document sets

Adapting to economic developments

The New Reality

• IP strategy over time

– Increase the value of the IP portfolio

– Maximize the ROI from R&D

• License management

– Analyze the impact of economic developments and their effect on the IP strategy

Drivers

• Number of patent filings is growing

• Growing need to drive revenues from licensing to fund R&D investment

Value from IP portfolio

• Protect market share

• Save cost for royalties via cross-licensing

• Generate income from royalties

• Benefit from License Out from joint ventures or spin-offs

Page 9: Finding trends in large scale document sets

Trends and implications for the future

• Economies depend more strongly on each other

• Innovation is a driver of economic growth

• Globalization generates many patents from China and South Korea

• The number and complexity of patent filings is growing

Growing need for fast and accurate analysis of large sets of patents

Page 10: Finding trends in large scale document sets

Getting more insight using fewer experts

Big Data Paradox:

• Limited (human) resources available for in-depth analysis

• Growing need for data driven decisions

Growing Data, Faster Insights: Big Data Analytics

Dimensions shown: Variety, Complexity, Velocity, Volume

Page 11: Finding trends in large scale document sets

The big data paradox: more data but less knowledge (the ‘Information Gap’)

Figure: timeline from 1990 to 2020 (vertical scale 0 to 10) showing available data, available experts for supporting decision making, the Information Gap between them, and data driven decisions

Page 12: Finding trends in large scale document sets

Information democracy: Information Creators and Consumers

• Creators

– Define and prepopulate analysis pipelines and test them on the data

– Deploy these pipelines using cloud computing, so that the required computing capacity can scale up with the business / analytical needs

• Consumers

– Predefined analytical reports

– Sharing feedback and input on the results to optimize the analysis

Sharing information empowers collaboration

Page 13: Finding trends in large scale document sets

The traditional IP search and analysis

Diagram labels: Information Consumer, Information Creator, Data, Database, Tools; flows: Request, Results, Information

The traditional approach:

• Each search/analysis request is focussed on a specific question from one user

• When the number of requests increases, this requires more human searchers

• When the searches involve analysis of more patents, this requires more time

• Very specific searches cannot be automated

• Analysis of large document sets can be automated, which is an opportunity to analyse more and become more competitive

Page 14: Finding trends in large scale document sets

Information democracy: a proactive approach to search and analysis

Diagram labels: Information Consumers, Information Creators, Business User, Analyst, Data (Patent Database, Research Database, Marketing Database), Running Analytics Pipeline; flow: Push Information

Liberate the information search, facilitate the discovery process:

• The knowledge creator:

– defines the analysis pipeline and tests it on the data

– deploys the analyses using cloud computing resources

• Direct access for information consumers for in-depth analyses

Page 15: Finding trends in large scale document sets

Liberate more information using new technologies

• Enable a small group of experts with tools to set up IP analysis pipelines

– Extend the search-on-request approach with a proactive analysis approach on large document sets

– Use analysis pipelines to auto-generate visualizations in a browser

– Invest in new technology coming from Big Data Analytics

• Give a large group of users access to internal web pages that provide rich statistical information and interactive visualizations

Page 16: Finding trends in large scale document sets

Combining proactive analysis with traditional IP search and discovery

Diagram labels: Information Consumers, Information Creators, Business User, Analyst, Data (Database, Patent Database, Research Database, Marketing Database), Tools, Running Analytics Pipeline; flows: Personal Request, Tailor-made Results, Exchange Information

Page 17: Finding trends in large scale document sets

Performing small to large scale SWOT analysis

Diagram: Patent Database, narrowed from 5000 patents to 1000 patents to 500 patents using Queries, Ranking and Filtering; the Business User receives overview and details

SWOT analysis example

• What are the most important patents?

• Who owns them?

• What is the growth of patents by:

– Technology?

– Owner?

– Country?

– Year?

Page 18: Finding trends in large scale document sets

Auto reporting & analysis for multiple users

• Reporting of aggregated results:

– Pie & bar charts

• Providing overview of the subject:

– landscape visualization

• Enabling rich interaction

Page 19: Finding trends in large scale document sets

Use Case 1: SWOT analysis of ebooks

Fig 2: Overview landscape visualization of 4257 patents

• Perform proactive SWOT analysis of the ebook market

Amazon Kindle – Apple – Samsung/Google and other players

• Who owns what?

• What can we learn from the competitive technology landscape?

• Why?

• Determine a company/technology position and opportunities

• We do this in KMX by:

1. Query to get patents on electronic paper technology

2. Landscape analysis

3. Classification/Ranking

4. Filter and select subset

5. Iterate steps 1 to 4
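
The five steps above can be sketched as a loop. This is a minimal illustration and not the KMX implementation: the record fields, the keyword-overlap score that stands in for the classifier, and the `keep_fraction` and `rounds` parameters are all assumptions made for the example. The landscape visualization step is omitted.

```python
# Hypothetical sketch of the query -> rank -> filter -> iterate loop.
# Field names, scoring and thresholds are illustrative assumptions.

def query(patents, terms):
    """Step 1: keep patents whose text mentions at least one query term."""
    return [p for p in patents if any(t in p["text"].lower() for t in terms)]

def rank(patents, focus_terms):
    """Step 3: order by the number of focus terms found (stand-in for a classifier)."""
    def score(p):
        text = p["text"].lower()
        return sum(1 for t in focus_terms if t in text)
    return sorted(patents, key=score, reverse=True)

def keep_top(ranked, keep_fraction):
    """Step 4: keep the top-ranked fraction, but never fewer than one patent."""
    n = max(1, int(len(ranked) * keep_fraction))
    return ranked[:n]

def funnel(patents, terms, focus_terms, keep_fraction=0.5, rounds=2):
    """Step 5: repeat ranking and filtering to narrow the set."""
    subset = query(patents, terms)
    sizes = [len(subset)]
    for _ in range(rounds):
        subset = keep_top(rank(subset, focus_terms), keep_fraction)
        sizes.append(len(subset))
    return subset, sizes

# Invented sample records.
patents = [
    {"id": "P1", "text": "Electronic paper display with electrophoretic ink"},
    {"id": "P2", "text": "Electronic paper reader with page turn gesture"},
    {"id": "P3", "text": "Combustion engine valve"},
    {"id": "P4", "text": "Electrophoretic display driving method for electronic paper reader"},
    {"id": "P5", "text": "Electronic paper label"},
]

subset, sizes = funnel(
    patents,
    terms=["electronic paper"],
    focus_terms=["electrophoretic", "reader", "display"],
)
print(sizes, [p["id"] for p in subset])
```

In practice the analyst would inspect the landscape between rounds and adjust the query or the focus terms, rather than rerunning with fixed settings as this sketch does.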

Page 20: Finding trends in large scale document sets

Analysis of ebook technology

Fig 3: Overview landscape visualization of 4257 patents

Page 21: Finding trends in large scale document sets

Use document classification to rank the patents

Fig 4: Ranked patents using a classifier for ebook technology (in purple the selection of relevant patents for deeper analysis)

Purple = most important patents

Red = least relevant patents
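
The slides do not say which classifier KMX uses. As an illustration only, the sketch below ranks documents with a small naive Bayes model written from scratch; the training examples, the labels and the tokenizer are invented for the example, and a real analysis would use many more labelled patents.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def train(labelled):
    """labelled: list of (text, is_relevant) pairs, with both classes present."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labelled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab

def relevance(text, model):
    """Log-odds that the text is relevant, using add-one smoothing."""
    counts, docs, vocab = model
    totals = {c: sum(counts[c].values()) for c in (True, False)}
    score = math.log(docs[True] / docs[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_rel = (counts[True][word] + 1) / (totals[True] + len(vocab))
        p_irr = (counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_irr)
    return score

def rank(texts, model):
    """Most relevant first (the purple end of the ranking in the figure)."""
    return sorted(texts, key=lambda t: relevance(t, model), reverse=True)

# Invented training examples: True = relevant to ebook technology.
model = train([
    ("electronic paper display for ebook reader", True),
    ("ebook reader page turning interface", True),
    ("engine fuel injection valve", False),
    ("fuel pump for engine", False),
])

ranked = rank(
    ["fuel valve assembly", "ebook display interface", "reader with engine"],
    model,
)
for text in ranked:
    print(round(relevance(text, model), 2), text)
```

Cutting the ranked list at a score threshold then gives the subset of relevant patents selected for deeper analysis.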

Page 22: Finding trends in large scale document sets

Drill deeper in the data to learn more

Fig 5: Landscape visualization going from 4257 to 1049 to 369 patents

After removing the irrelevant patents we use filtering to determine:

• Who are the important players (assignees, inventors)?

• Where are the important patents filed (countries)?

• What is the trend over time (growth of patents over the years)?
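
The three questions above amount to counting the filtered patents by one field at a time. A minimal sketch, assuming each patent is a record with `assignee`, `country` and `year` fields; the records and company names below are invented placeholders, not data from the ebook landscape.

```python
from collections import Counter

# Invented sample records; assignees, countries and years are placeholders.
patents = [
    {"assignee": "Company A", "country": "US", "year": 2008},
    {"assignee": "Company A", "country": "US", "year": 2009},
    {"assignee": "Company B", "country": "KR", "year": 2009},
    {"assignee": "Company A", "country": "JP", "year": 2010},
    {"assignee": "Company B", "country": "KR", "year": 2010},
    {"assignee": "Company C", "country": "US", "year": 2010},
]

def count_by(records, field):
    """Count how many records share each value of the given field."""
    return Counter(r[field] for r in records)

players = count_by(patents, "assignee").most_common()   # who are the important players?
countries = count_by(patents, "country").most_common()  # where are the patents filed?
trend = sorted(count_by(patents, "year").items())        # growth of patents over the years

print(players)
print(countries)
print(trend)
```

The same counts can feed the pie and bar charts used for reporting aggregated results.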

Page 23: Finding trends in large scale document sets

Define the relevant set of patents to identify your strengths & opportunities

Fig 6: Landscape visualization of the 369 most important patents in ebook technology

Purple = most important patents

Red = least relevant patents

Page 24: Finding trends in large scale document sets

The role of language: Clustering of Patents in Chinese text

Fig 7: Patent landscape visualization using the Chinese or English text

Page 25: Finding trends in large scale document sets

Use Case 2: SWOT analysis of technical and biological patents

Fig 8: Landscape visualization of 10920 patents

• Perform SWOT analysis in a converging market: analyze claims with mixed technologies

• Who owns what?

• What does the technology landscape look like?

• Why?

• Determine a company/technology position and opportunities

• We do this in KMX by:

1. Query to get patents on mechanical/electronic/optics mixed with biological technology

2. Landscape analysis

3. Classification/Ranking

4. Filter and select subset

5. Iterate steps 1 to 4

Page 26: Finding trends in large scale document sets

Use Case: SWOT analysis of patents covering multiple technology areas (engineering & biology)

• Patents become more complex to analyse

• Examples:

– More detailed claims

– Mixed technologies in the claims

• Obtaining a landscape overview is then key

• Analysis from the user's perspective is essential (classification/ranking)

Fig 9: Landscape visualization of 10920 patents

Page 27: Finding trends in large scale document sets

Patents from total set with biological focus

• Using the text from the title/abstract and claims

• Landscape analysis provides an overview

• Using document classification to determine sub-clusters

• Using classification and ranking to determine the most relevant documents from the user's perspective

Fig 10: Landscape visualization of 134 patents

Page 28: Finding trends in large scale document sets

Key takeaways

Global economy demands value generation from an IP strategy

Big Data Paradox:

• Limited (human) resources available for in-depth analysis

• Growing need for data driven decisions

The information creators (patent searchers) focus on

• Providing proactive information on generic analysis tasks

• Performing specific analyses for single-user requests

The information consumers (patent counsel)

• Get knowledge from automated analysis with interactive capabilities

• Obtain SWOT analysis knowledge of competitors

• Build and optimize patent strategies
