The Fung Institute Patent Lab: Products and Future Plans
![Page 1: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/1.jpg)
The Fung Institute Patent Lab: Products and Future Plans
Lee Fleming, Director of the Coleman Fung Institute for Engineering Leadership
May 2015
With Gabe Fierro, Ben Balsmeier, Guan-Cheng Li, Kevin Johnson, Aditya Kaulagi, Douglas O'Reagan, Bill Yeh
We gratefully acknowledge support from the National Science Foundation Grant #1064182, the US Patent and Trademark Office, and the American Institutes for Research
![Page 2: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/2.jpg)
My objectives for today’s chat
• Give you an understanding of our work
  – Disambiguation (upcoming JEMS paper)
  – Visualization and tools
  – Future plans (PAIR)
• Get your feedback on our research
• Help me understand bigger picture of data efforts in innovation and entrepreneurship
  – I want to get our stuff used and, at the same time, aid replication and help our field to stop re-inventing inferior wheels
![Page 3: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/3.jpg)
Continuing opportunity w/ patent data
• Despite many papers, basic data remain inaccessible
  – Unstructured and dirty text difficult to aggregate across entities
  – (Semi) manual and uncoordinated efforts to date for granted patents
• We provide parsing, dbase, auto disambig of grants + apps:
  • inventors
  • assignees
  • patent lawyers’ firms
  • location
• Everything made public and supportive of complementary efforts (mainly AIR and USPTO)
![Page 4: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/4.jpg)
Basic data flow (~2-3 weeks)
![Page 5: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/5.jpg)
Conceptual database schema
[Figure: database-simplified.svg, dated 10/18/13. Entities shown: Patent, Application, Inventor, RawInventor, Assignee, RawAssignee, Lawyer, RawLawyer, Location, RawLocation, USPC, MainClass, SubClass, IPCR, Citation, USRelDoc, OtherReference. Each Raw* entity links to Patent and to its disambiguated counterpart; Location links to Inventor and Assignee; USPC links to MainClass and SubClass; Application, Citation, IPCR, USRelDoc and OtherReference link to Patent.]
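The raw-versus-disambiguated split in the schema can be sketched with an in-memory SQLite database. This is only an illustration: the table and column names (`patent`, `inventor`, `rawinventor`, `name_as_printed`) and the sample rows are assumptions made for the example, not the lab's actual schema or data.

```python
import sqlite3

# Hypothetical, heavily simplified tables; names and columns are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patent (id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE inventor (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE rawinventor (
    id INTEGER PRIMARY KEY,
    patent_id TEXT REFERENCES patent(id),
    inventor_id TEXT REFERENCES inventor(id),  -- assigned by disambiguation
    name_as_printed TEXT                       -- text exactly as on the patent
);
""")

# Made-up sample rows: two spellings of one person on two patents.
conn.executemany("INSERT INTO patent VALUES (?, ?)",
                 [("P1", "Example patent one"), ("P2", "Example patent two")])
conn.execute("INSERT INTO inventor VALUES (?, ?)", ("I1", "J. Smith"))
conn.executemany(
    "INSERT INTO rawinventor (patent_id, inventor_id, name_as_printed) "
    "VALUES (?, ?, ?)",
    [("P1", "I1", "J. Smith"), ("P2", "I1", "John Q. Smith")],
)


def patents_for_inventor(connection, inventor_id):
    """Return patent ids linked to one disambiguated inventor id."""
    rows = connection.execute(
        "SELECT p.id FROM patent p "
        "JOIN rawinventor r ON r.patent_id = p.id "
        "WHERE r.inventor_id = ? ORDER BY p.id",
        (inventor_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The point of the design is that the raw record keeps the name as printed, while the disambiguated identifier lets one query gather all patents of the same person.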
![Page 6: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/6.jpg)
Accessible data: monthly disambiguated grant, app data Jan ‘75 – Dec ‘14: http://funglab.berkeley.edu/database
• Parse, clean, disambiguate:
  – inventors
  – geography (Google lookup)
  – assignee (crude Jaro-Winkler)
  – lawyer (crude Jaro-Winkler)
  – consistent inventor identifiers
  – cites, claims, non-pat refs…
  – .csv download or SQL query
  – future: blocking, tech control
  – > 300M observations (not all characterized yet); ~50GB
![Page 7: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/7.jpg)
Will the real Matt Marx please stand up?
Plainview NY Everett MA Mt View CA
Class 704
![Page 8: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/8.jpg)
Disambiguation: a classifier problem
• Popular methods (we currently use the last three):
  – Manual
  – Linear weighting + manual tuning
  – Naïve Bayes, supervised and semi-supervised
  – String matching
  – K-means intra and inter cluster optimization
  – Look up (Google provided access to library)
• Active research topic in machine learning
• Julia Lane is planning a contest
• Had more complex approach (Li et al. 2014)
  – latest is simpler, faster, supportable, improvable
  – though not as accurate yet: tends to oversplit
![Page 9: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/9.jpg)
Inventor disambiguation
• Start with (block on) exact name matches
• Euclidean distance for exact attribute matches
• Balance min intra cluster and max inter cluster distances
![Page 10: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/10.jpg)
• Look for no further improvement
– 4 in this case
![Page 11: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/11.jpg)
• Re-label each column with a cluster
• Relax exact name match and merge
• Use correlation of co-authors as well
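The block-then-cluster idea on these slides can be sketched as follows. This is a deliberately simplified stand-in, not the lab's algorithm: it blocks on exact name, then groups records in a block whenever they share at least one attribute value (single-link grouping), in place of the k-means-style intra/inter cluster optimization described above. The field names (`city`, `uspc_class`, `assignee`), the `min_shared` threshold and the sample records are all assumptions for illustration.

```python
from collections import defaultdict

ATTRIBUTES = ("city", "uspc_class", "assignee")  # hypothetical field names


def block_by_name(records):
    """Group records whose inventor name matches exactly."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[rec["name"]].append(rec)
    return blocks


def shared_attributes(a, b):
    """Count attributes that are present in both records and identical."""
    return sum(1 for key in ATTRIBUTES if a.get(key) and a.get(key) == b.get(key))


def cluster_block(block, min_shared=1):
    """Single-link grouping: a record joins (and merges) every cluster
    containing at least one record it shares min_shared attributes with."""
    clusters = []
    for rec in block:
        linked, rest = [], []
        for cluster in clusters:
            hit = any(shared_attributes(rec, other) >= min_shared
                      for other in cluster)
            (linked if hit else rest).append(cluster)
        merged = [rec] + [r for cluster in linked for r in cluster]
        clusters = rest + [merged]
    return clusters


def disambiguate(records):
    """Map each record id to a label of the form '<name>#<cluster index>'."""
    labels = {}
    for name, block in block_by_name(records).items():
        for k, cluster in enumerate(cluster_block(block)):
            for rec in cluster:
                labels[rec["id"]] = f"{name}#{k}"
    return labels


# Made-up records: three patents under one name, one under another.
records = [
    {"id": "r1", "name": "J. Smith", "city": "City A", "uspc_class": "704"},
    {"id": "r2", "name": "J. Smith", "city": "City B", "uspc_class": "704"},
    {"id": "r3", "name": "J. Smith", "city": "City C", "uspc_class": "123"},
    {"id": "r4", "name": "A. Jones", "city": "City A", "uspc_class": "704"},
]
labels = disambiguate(records)
```

Here `r1` and `r2` end up with the same label because they share a technology class, while `r3` stays separate. Blocking on exact names keeps comparisons cheap but is also one reason such an approach tends to oversplit.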
![Page 12: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/12.jpg)
Future of inventor disambiguation
• Relax strict matching
• Bring in additional data
  – All tech fields
  – Lexical overlap
  – Law firms
  – Prior art citations and non patent references
• New algorithms
• Make everything public and support AIR tournament
![Page 13: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/13.jpg)
Assignee disambiguation
• Jaro-Winkler after simple string cleaning
• Unique assignees from 6,700,000 to 507,000
• Identifier, raw and cleaned name available
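A minimal sketch of Jaro-Winkler matching after simple string cleaning, using only the standard library. The cleaning rule (uppercase, strip punctuation, collapse spaces), the 0.1 prefix scale, the 4-character prefix cap and the 0.9 threshold in `same_assignee` are conventional or assumed values, not settings taken from the lab's pipeline.

```python
import re


def clean_name(name):
    """Simple cleaning: uppercase, drop punctuation, collapse whitespace."""
    name = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    return " ".join(name.split())


def jaro(s1, s2):
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a common prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def same_assignee(a, b, threshold=0.9):
    """Hypothetical decision rule: treat names above the threshold as one firm."""
    return jaro_winkler(clean_name(a), clean_name(b)) >= threshold
```

For example, `same_assignee("Intl. Business Machines", "Intl Business Machines")` is true because both names clean to the same string. Pairwise comparison over millions of raw names is expensive, so in practice some form of blocking would be needed first.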
![Page 14: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/14.jpg)
Future of assignee disambiguation
• Coordinate with NBER and HBS efforts
  – The field needs to curate and maintain cumulative progress
• CONAME data from USPTO
• Normalize common affixes
• Train with manually developed NBER disambiguation
• Apply inventor algorithm
• Provide Compustat identifier
• Add subsidiary information
  – BvD sample of 6,000 major U.S. firms revealed 50,000 subsidiaries under parental control (>50% in 2012)
  – GE: 250 subsidiaries, ~98% patents filed under GE
![Page 15: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/15.jpg)
Law firms
• Similar algorithms to assignees
• Not aware of any applications yet
![Page 16: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/16.jpg)
Locations
• Use Google’s geocoding API
• Unique cities from 333K to 66K
• City, region, country
  – Lat and Long being developed
  – Do not provide street level data
![Page 17: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/17.jpg)
If you’re allergic to SQL: http://rosencrantz.berkeley.edu
![Page 18: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/18.jpg)
Approximate results (full 2014 data in process)
http://funglab.berkeley.edu/database
![Page 19: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/19.jpg)
Tools and applications
• Look for this stuff and high level explanations at:
  – http://www.funginstitute.berkeley.edu/blog-categories/faculty-directors-blog#
![Page 20: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/20.jpg)
Visualizations
• Clean tech inventions mapped by type and source
• Inventor mobility movies
• Patent location in technology “space”
• The convergence and divergence, the coalescence and reconfiguration of components (the flow of technology) over time
• Visualizing the patent application process
![Page 21: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/21.jpg)
Clean Tech Patent Mapper
• Li, G., K. Paisner, “A List of Clean Tech Patents.”
• http://funglab.berkeley.edu/cleantechx/
• Energy: wind, solar, bio, hydro, geo, nuclear
• Assignee: VC backed, university, government, large and small incumbents, no assignee
![Page 22: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/22.jpg)
VC patents 1990-1999
Innovation and Entrepreneurship in Clean Energy: Nanda, Younge, Fleming
Note scale of funding activity 1990-1999
![Page 23: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/23.jpg)
VC patents 2000-2009
See Nanda, R., K. Younge, and L. Fleming. “Innovation and Entrepreneurship in Clean Energy,” forthcoming in Rethinking Science and Innovation Policy, NBER.
Much greater funding activity 2000-2009
![Page 24: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/24.jpg)
Midwest clean tech
![Page 25: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/25.jpg)
Kansas City clean tech
![Page 26: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/26.jpg)
Mobility mapper: http://funglab.berkeley.edu/mobility/
• Larger states
• Example: 1987 immigration to MI (note one IL inventor):
![Page 27: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/27.jpg)
[Figure panels: 1987 and 1982]
Illustrates causal impact of noncompetes on brain drain (Marx, Singh, Fleming, forthcoming in Research Policy)
![Page 28: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/28.jpg)
Variety of states
![Page 29: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/29.jpg)
Visualizing an acquisition
![Page 30: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/30.jpg)
Acknowledgment of government support – Hillary Greene, Dennis Yao, Guan Cheng
• What proportion of 2015 patents can be traced to govt?
![Page 31: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/31.jpg)
5M patent applications as a Markov process? Starting with an analysis of Bilski vs. Kappos
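Treating prosecution histories as a Markov process amounts to estimating transition probabilities between states from observed event sequences. A minimal sketch is below; the state names (`filed`, `rejected`, `amended`, `granted`, `abandoned`) and the four sample histories are invented for illustration and are not drawn from PAIR data.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate first-order transition probabilities from state sequences.

    histories: iterable of lists of state names, each in time order.
    Returns {state: {next_state: probability}}.
    """
    counts = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            counts[current][following] += 1
    probabilities = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        probabilities[state] = {nxt: n / total for nxt, n in next_counts.items()}
    return probabilities


# Made-up prosecution histories (hypothetical state names).
histories = [
    ["filed", "rejected", "amended", "granted"],
    ["filed", "granted"],
    ["filed", "rejected", "abandoned"],
    ["filed", "rejected", "amended", "rejected", "abandoned"],
]
probs = transition_probabilities(histories)
```

Comparing matrices estimated on applications before and after a decision such as Bilski would be one way to see whether the process shifted, under the strong assumption that the next step depends only on the current state.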
![Page 32: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/32.jpg)
Network Interface – http://douglasoreagan.com/socialnetwork/
![Page 33: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/33.jpg)
Semiconductor patents in 438/283 from 1998-2000
![Page 34: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/34.jpg)
Method to illustrate network around seed inventors
![Page 35: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/35.jpg)
Cool pics – but what do they mean?
– Need to validate visualizations with ground truth
– Mixed visualization and historical study of the biggest semiconductor breakthrough of the last decade: the FinFET
![Page 36: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/36.jpg)
Why FinFET?
• Study intended to explore/develop breakthrough visualization tools
  – tie to reality w/o conflating variables
• All patents Northern CA 1995-2000
• Ranked by future citations
• Tech distance
  – from our brains, close but moldy
• Geographic distance
  – about 40 yards
• Social distance
  – head of search committee that hired me
  – neighbor
![Page 37: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/37.jpg)
Quintessential architectural BT
Source: King 2012
![Page 38: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/38.jpg)
Inventors brokered social and academic/industry networks
![Page 39: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/39.jpg)
But they also integrated outsiders
![Page 40: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/40.jpg)
The flow of technology
1) Words are components -> little differentiation, this is so incremental
2) No geographic localization of trajectories
3) How did university plop in and do this?
4) FinFET may have been only govt supported patent
![Page 41: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/41.jpg)
Coming attractions
• Blocking actions – better than citations as a measure of patent impact?
• Lexical novelty
  – First appearance of new word in corpus
  – First pair-wise combination of words
• Lexical distance between classes
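The two lexical novelty measures listed above (first appearance of a word, first pair-wise combination of words) can be sketched as a single pass over documents in chronological order. The tokenizer (lowercase alphabetic runs) and the three toy documents are assumptions for illustration; a real corpus would need stop-word handling and a more careful treatment of pair counts, which grow quadratically with document length.

```python
import re
from itertools import combinations


def tokenize(text):
    """Sorted unique lowercase alphabetic tokens (an assumed, crude tokenizer)."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def lexical_novelty(documents):
    """documents: iterable of (doc_id, text), already in chronological order.

    Returns {doc_id: {"new_words": [...], "new_pairs": int}}, where a word or
    word pair is "new" if no earlier document contained it.
    """
    seen_words, seen_pairs = set(), set()
    results = {}
    for doc_id, text in documents:
        words = tokenize(text)
        # Words are sorted, so each unordered pair has one canonical tuple.
        pairs = set(combinations(words, 2))
        results[doc_id] = {
            "new_words": [w for w in words if w not in seen_words],
            "new_pairs": len(pairs - seen_pairs),
        }
        seen_words.update(words)
        seen_pairs.update(pairs)
    return results


# Toy corpus in time order (made-up text).
novelty = lexical_novelty([
    ("A", "gate fin"),
    ("B", "fin transistor"),
    ("C", "gate transistor"),
])
```

Document C introduces no new word, yet it is the first to combine "gate" and "transistor", which is the kind of recombination the pair-wise measure is meant to catch.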
![Page 42: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/42.jpg)
Identification of blocking patents – pdf challenges: OCR 101,195 PDF files…
![Page 43: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/43.jpg)
Claim Rejections – 35 USC 103 3. The folowing is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set forth …
Detail Enhancement
Noise Reduction
OCR
![Page 44: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/44.jpg)
OCRed blocking data
![Page 45: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/45.jpg)
First results from 2012
• 2011 now complete as well
• Need to characterize each type of action
![Page 46: The Fung Institute Patent Lab: Products and Future Plans](https://reader038.fdocuments.in/reader038/viewer/2022110313/55c35c78bb61eb01598b4744/html5/thumbnails/46.jpg)
I may come to you tin cup in hand…
• Download, parse, clean, disambiguate, store and serve up > 300M data (and weekly updates)
  – Julia Lane taking over part of this
• Blocking data: must OCR ~400M documents
• Disambiguation takes weeks, PAIR years
  – ~$150K hardware alone past year
  – database person in Si Valley (~$140K + Cal tax)
• Mention maintenance in NSF proposal => ding
• Public good (~50,000 downloads)
• Talking with firms and private philanthropy