Big Data Workshop: INEGI, SEDESOL
Uploaded by: abel-alejandro-coronado-iruegas
Category: Data & Analytics
Transcript of Big Data Workshop: INEGI, SEDESOL
![Page 1: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/1.jpg)
Big Data & Data Science
![Page 2: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/2.jpg)
![Page 3: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/3.jpg)
What is Big Data?
@abxda
![Page 4: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/4.jpg)
What is Big Data?
http://datascience.berkeley.edu/what-is-big-data/
![Page 5: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/5.jpg)
What is Big Data?
![Page 6: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/6.jpg)
What is Big Data?
![Page 7: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/7.jpg)
What is Big Data?
December 2004
October 2003
![Page 8: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/8.jpg)
What is Big Data?
2006
> 100,000 articles
2007
![Page 9: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/9.jpg)
Hadoop (2006 - 2008)
![Page 10: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/10.jpg)
Hadoop (2006 - 2008)
![Page 11: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/11.jpg)
What is Big Data? (2009 – 2016…)
![Page 12: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/12.jpg)
Matei Zaharia, Ion Stoica
(2009 – 2016…)
![Page 13: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/13.jpg)
Big Money 2014
![Page 14: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/14.jpg)
(2013)
![Page 15: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/15.jpg)
Big Data in National Statistical Offices
http://www1.unece.org/stat/platform/download/attachments/58492100/Big+Data+HLG+Final.docx?version=1&modificationDate=1362939424184
United Nations Economic Commission for Europe
![Page 16: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/16.jpg)
Big Data in National Statistical Offices
• It is clear that during the next two years there is a need to identify a few pilot projects that will serve as proof of concept.
• Statistical organisations are, therefore, encouraged to address formally Big data issues in their annual and multi-annual work programmes by undertaking research and pilot projects in selected areas and by allocating appropriate resources for that purpose.
![Page 17: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/17.jpg)
Big Data in National Statistical Offices
• 'new' exploration and analysis methods are required: Visualization methods, Text mining, and High Performance Computing.
• To use Big data, statisticians are needed with a different mind-set and new skills. The processing of more and more data for official statistics requires statistically aware people with an analytical mind-set, an affinity for IT (e.g. programming skills)
![Page 18: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/18.jpg)
![Page 19: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/19.jpg)
Expert in advanced computing and development (Big Data)
Expert in statistical modeling
Expert in the data domain
Unicorn
Danger zone!
Traditional research
Machine learning
DATA SCIENCE
http://www.anlytcs.com/2014/01/data-science-venn-diagram-v20.html
![Page 20: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/20.jpg)
Big Data & Data Science
Internet of things
Internet of people
Internet of ideas
Internet of everything
Raw data (hdfs://)
Information (meaning)
Make decisions
Act
Who? How many? Why? What? Where?
Data analysis: statistics, machine learning
Stratification
Regression analysis
Sampling
Network (graph) analysis
Data mining
Much more…
Velocity, variety, volume
Data science (transforms/models)
Distributed and parallel computing
Architecture
![Page 21: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/21.jpg)
![Page 22: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/22.jpg)
What kind of #BigData is this?
In machine learning operations, a single video card is 45 times more powerful than the fastest Xeon.
2560 CUDA cores
![Page 23: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/23.jpg)
Data Product 2012
INEGI Stratifier
![Page 24: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/24.jpg)
% Internet access, % PC, % cell phone, % automobile
https://spark.apache.org/
2013
![Page 25: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/25.jpg)
Twitter as a Big Data source (first pilot project)
To measure the emotional pulse of Mexico… and much more…
![Page 26: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/26.jpg)
Hydra
October 2013, INEGI
![Page 27: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/27.jpg)
Geographic Query
![Page 28: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/28.jpg)
Database Visualization
200 million tweets, 400 GB, 800 MB daily
![Page 29: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/29.jpg)
Database Visualization
~100 million tweets
![Page 30: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/30.jpg)
Tweeting Frequency
# Tweets
Frequency by hour of day
~1,000,000 Twitter users generated ~100 million tweets
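A frequency-by-hour chart like the one on this slide can be built by bucketing tweet timestamps by hour. The following is a minimal sketch, assuming timestamps arrive as ISO 8601 strings; the function name `tweets_per_hour` and the sample values are hypothetical and do not come from the original project.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Count tweets for each hour of the day (0-23)."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    # Return a dense list so hours without tweets appear as zero.
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample; the real input would be the tweet database.
sample = [
    "2014-03-01T08:15:00",
    "2014-03-01T08:47:10",
    "2014-03-01T21:05:33",
    "2014-03-02T21:59:59",
]
hist = tweets_per_hour(sample)
```

At the scale quoted on the slide, the same per-hour count would more likely run as a distributed aggregation than as a single Python loop.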
![Page 31: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/31.jpg)
Mobility of Twitter Users
4,469,550 inter-municipal trips; 347,157 Twitter users
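One way to obtain counts of this kind is to order each user's geolocated tweets in time and count every change of municipality between consecutive tweets. The slide does not say how trips were defined, so this definition is an assumption, as are the function name, the tuple layout, and the municipality codes in the sample.

```python
from collections import defaultdict


def count_intermunicipal_trips(tweets):
    """tweets: iterable of (user_id, timestamp, municipality_code).

    Returns (total_trips, users_with_trips). A trip is assumed to be a pair
    of consecutive tweets by one user located in different municipalities.
    """
    by_user = defaultdict(list)
    for user, ts, municipality in tweets:
        by_user[user].append((ts, municipality))

    total_trips = 0
    users_with_trips = 0
    for events in by_user.values():
        events.sort()  # chronological order per user
        trips = sum(1 for (_, a), (_, b) in zip(events, events[1:]) if a != b)
        total_trips += trips
        if trips:
            users_with_trips += 1
    return total_trips, users_with_trips


# Hypothetical sample with made-up users and municipality codes.
sample = [
    ("u1", 1, "01001"), ("u1", 2, "01001"), ("u1", 3, "01005"), ("u1", 4, "01001"),
    ("u2", 1, "09015"), ("u2", 5, "09015"),
    ("u3", 2, "14039"), ("u3", 9, "14120"),
]
trips, movers = count_intermunicipal_trips(sample)
```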
![Page 32: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/32.jpg)
Team
Dr. Oscar S. Siordia
Dr. Mario
Dra. Daniela Moctezuma
Dr. Elio Villaseñor
Dr. Eric
Dr. Sabino
Dr. Gerardo
Dr. Alfredo
Mtro. Abel
Ing. Silvia
With the support of:
Dr. Juan Muñoz Ló…
Ing. Ricardo
And for visualization:
Lic. Marco
![Page 33: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/33.jpg)
http://cienciadedatos.inegi.org.mx/pioanalisis
@hbcolectivo @ricardoaolvera
![Page 34: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/34.jpg)
Machine Learning Process
Training: sample of tweets, manual labeling, numerical representation, machine learning
Tools: http://scikit-learn.org/ http://www.r-project.org/
Production: real-time tweets, classifier, sentiment indicator
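The training and production stages above can be sketched end to end in a few lines. The slide names scikit-learn and R as tools but not the model, so the bag-of-words naive Bayes classifier below is a swapped-in stand-in written with the standard library only. The class name, the `pos`/`neg` labels, the indicator definition, and the sample tweets are all assumptions.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Numeric representation step: a bag of lowercase word counts."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            bag = tokenize(text)
            self.word_counts[label].update(bag)
            self.vocab.update(bag)
        return self

    def predict(self, text):
        bag = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)
            for word, n in bag.items():
                # Smoothing keeps unseen words from zeroing out a class.
                p = (self.word_counts[label][word] + 1) / (total_words + len(self.vocab))
                score += n * math.log(p)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def sentiment_indicator(model, stream):
    """Share of tweets in the stream classified as positive."""
    predictions = [model.predict(tweet) for tweet in stream]
    return predictions.count("pos") / len(predictions)


# Training: a hypothetical hand-labeled sample.
train_texts = ["que gran dia feliz", "me encanta este lugar",
               "odio el trafico", "que mal servicio triste"]
train_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(train_texts, train_labels)

# Production: classify incoming tweets and aggregate into an indicator.
stream = ["feliz con este dia", "triste por el trafico"]
indicator = sentiment_indicator(model, stream)
```

The point of the split is that manual labeling happens once on a sample, while the fitted classifier is then applied to the real-time stream without further human input.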
![Page 35: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/35.jpg)
![Page 36: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/36.jpg)
Sentiment Analysis (Daily)
C#
{RESTful:API}
{NoSQL}
![Page 37: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/37.jpg)
![Page 38: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/38.jpg)
DENUE & Twitter
![Page 39: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/39.jpg)
DENUE & Twitter
![Page 40: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/40.jpg)
DENUE & Twitter
![Page 41: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/41.jpg)
Tweeting times near a given sector
![Page 42: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/42.jpg)
4.9 M Voronoi polygons (DENUE)
![Page 43: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/43.jpg)
Big Spatial Join (4.9 M DENUE + 60 M tweets)
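Joining points against Voronoi polygons is equivalent to assigning each point to its nearest generating site, since a Voronoi cell is by definition the region closest to its site. The sketch below uses that equivalence with a uniform grid index on a single machine. It is not SpatialSpark's API and not the distributed job from the talk. It assumes planar coordinates (real latitude/longitude data would need projecting first), and the class, function, and sample names are hypothetical.

```python
import math
from collections import defaultdict


class GridIndex:
    """Uniform grid over site coordinates for nearest-site lookup."""

    def __init__(self, sites, cell_size):
        # sites: dict of site_id -> (x, y) in planar coordinates
        if not sites:
            raise ValueError("at least one site is required")
        self.sites = sites
        self.cell = cell_size
        self.buckets = defaultdict(list)
        for site_id, (x, y) in sites.items():
            self.buckets[self._key(x, y)].append(site_id)

    def _key(self, x, y):
        return (math.floor(x / self.cell), math.floor(y / self.cell))

    def nearest(self, x, y):
        cx, cy = self._key(x, y)
        best_id, best_d = None, math.inf
        ring = 0
        while True:
            # Scan the square of cells around the query, skipping inner
            # cells that earlier rings already covered.
            for i in range(cx - ring, cx + ring + 1):
                for j in range(cy - ring, cy + ring + 1):
                    if max(abs(i - cx), abs(j - cy)) != ring:
                        continue
                    for site_id in self.buckets.get((i, j), ()):
                        sx, sy = self.sites[site_id]
                        d = math.hypot(sx - x, sy - y)
                        if d < best_d:
                            best_id, best_d = site_id, d
            # Sites in cells not yet scanned are at least ring * cell away.
            if best_d <= ring * self.cell:
                return best_id
            ring += 1


def spatial_join(sites, points, cell_size=1.0):
    """Map each point id to the id of the site whose Voronoi cell contains it."""
    index = GridIndex(sites, cell_size)
    return {pid: index.nearest(x, y) for pid, (x, y) in points.items()}


# Hypothetical businesses (Voronoi sites) and tweets (points).
businesses = {"bakery": (0.0, 0.0), "pharmacy": (5.0, 5.0), "cafe": (10.0, 0.0)}
tweets = {"t1": (1.0, 1.0), "t2": (6.0, 4.0), "t3": (9.0, 1.0)}
joined = spatial_join(businesses, tweets, cell_size=2.0)
```

The grid keeps each lookup local instead of comparing every tweet with every business, which is the same reason a join of this size needs a spatial index at all.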
![Page 44: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/44.jpg)
SpatialSpark (Nov. 2015)
![Page 45: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/45.jpg)
SpatialSpark: Open Source
![Page 46: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/46.jpg)
Running Code in Local Apache Spark
![Page 47: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/47.jpg)
![Page 48: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/48.jpg)
DENUE - Twitter
![Page 49: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/49.jpg)
Machine Learning Process: Satellite Images
Training: images, labeling by experts, numerical representation, machine learning (statistical learning)
Tools: http://scikit-learn.org/ http://www.r-project.org/
Production: continuous imagery, automatic classifier, land cover classes
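For imagery, the numeric representation is typically a vector of band values per pixel, and the classifier maps each vector to a land cover class. The slide does not name the model, so the nearest-centroid classifier below is only an illustrative stand-in. The two-band (red, near-infrared) layout, the class names, and the reflectance values are assumptions.

```python
import math
from collections import defaultdict


def train_centroids(pixels, labels):
    """Average the band values of the expert-labeled pixels of each class."""
    sums = {}
    counts = defaultdict(int)
    for bands, label in zip(pixels, labels):
        if label not in sums:
            sums[label] = [0.0] * len(bands)
        sums[label] = [s + b for s, b in zip(sums[label], bands)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(centroids, bands):
    """Assign the class whose centroid is closest in band space."""
    return min(centroids, key=lambda label: math.dist(centroids[label], bands))


# Training: hypothetical (red, near-infrared) reflectances labeled by experts.
train_pixels = [(0.05, 0.02), (0.07, 0.03),
                (0.05, 0.50), (0.08, 0.45),
                (0.30, 0.30), (0.35, 0.32)]
train_labels = ["water", "water", "vegetation", "vegetation", "urban", "urban"]
centroids = train_centroids(train_pixels, train_labels)

# Production: classify pixels from newly acquired images.
new_pixels = [(0.06, 0.48), (0.32, 0.29)]
cover = [classify(centroids, p) for p in new_pixels]
```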
![Page 50: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/50.jpg)
Machine Learning Process: Satellite Images
![Page 51: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/51.jpg)
Next Steps
• International collaborations with the UN to explore the use of Big Data in calculating the Sustainable Development Indicators.
• Extend the work to more Big Data sources: mobile phone data, satellite images, etc.
• Mental health in adolescents, with Data2x and the Instituto Nacional de Psiquiatría
![Page 52: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/52.jpg)
Questions
![Page 53: Big data taller inegi sedesol](https://reader035.fdocuments.in/reader035/viewer/2022070600/589b46bb1a28ab4a398b4d87/html5/thumbnails/53.jpg)
@abxda