DataScience and BigData Cebu 1st meetup
Uploaded by: francisco-liwa
Transcript of DataScience and BigData Cebu 1st meetup
1st Meetup Event: Meet and Greet
DataScience & BigData Cebu Meetup
Friday, May 13, 2016 at 7:00PM
A SPACE Cebu, Unit KLM Crossroads Banilad, Cebu City, 6000 Cebu Philippines
Profile
❖ Data Engineer @ nanu
❖ Worked at IBM, Toshiba, Lexmark, NEC
❖ Co-founder, Jaga-me Pte. Ltd
❖ Founder, HandyNanay.co
❖ Master of Technology in Knowledge Engineering @ National University of Singapore (NUS)
❖ Organizer, IoTCebu Meetup
❖ Node.js, Python, C/C++
DataScience and BigData Cebu Meetup
❖ About
It is an avenue for students, tech entrepreneurs, professionals, businessmen, hobbyists, designers, developers and academics to collaborate, share skills and knowledge, and improve overall understanding of Big Data, Data Analytics, Machine Learning, Hadoop and Data Science through meetups, clinics, trainings, hackathons and ideation.
❖ Mission
Train, mentor and educate members on current trends and best practices for DS and Big Data through clinics, demos, presentations, ideation, workshops and competitions (Kaggle, etc.).
❖ Vision
- Become the largest pool of Big Data and Data Science practitioners in Cebu
- Produce more experts and evangelists of Data Science and Big Data
❖ Goal
Develop more talents/members in the field of Big Data, Data Analytics and Data Science
What is Data Science
❖ Science - “the intellectual and practical activity encompassing the systematic study of the structure and behavior of the physical and natural world through observation and experiment.”
❖ Data Science - the intellectual and practical activity of systematically studying raw and unprocessed facts through exploration, observation and experimentation.
❖ Data - raw, unprocessed, unorganised facts
Science behind DS:
• Scientific method
• Math
• Statistics
• Data Mining
• Machine Learning
Data Science Process
0. Ask important/interesting questions
1. Data Collection / Elicitation
2. Data Preparation (cleansing, cleaning, munging, transformation)
3. Data Exploration
4. Data Analysis
5. Data Modelling
6. Data Visualization (Results)
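The process steps above can be sketched end to end on a toy data set. This is only an illustration: the question, the `RAW` sample and every function name are made up for this sketch, and step 5 (modelling) is left out to keep it short.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# 0. Question (hypothetical): which city has the higher average reading?
# The sample below is invented and deliberately messy.
RAW = """city,reading
Cebu, 30
Cebu,32
Manila,
Manila,28
 Manila ,26
"""


def collect(text):
    """1. Data collection: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def prepare(rows):
    """2. Data preparation: trim whitespace, drop incomplete rows, cast types."""
    clean = []
    for row in rows:
        city = row["city"].strip()
        reading = row["reading"].strip()
        if city and reading:
            clean.append({"city": city, "reading": float(reading)})
    return clean


def explore(rows):
    """3. Data exploration: row count and value range."""
    values = [r["reading"] for r in rows]
    return {"rows": len(rows), "min": min(values), "max": max(values)}


def analyze(rows):
    """4. Data analysis: average reading per city."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["city"]].append(r["reading"])
    return {city: mean(vals) for city, vals in groups.items()}


def visualize(averages):
    """6. Data visualization: a plain-text bar chart, one bar per city."""
    return "\n".join(
        f"{city:<8}{'#' * round(avg)} {avg:.1f}"
        for city, avg in sorted(averages.items())
    )


rows = prepare(collect(RAW))
summary = explore(rows)
averages = analyze(rows)
print(summary)
print(visualize(averages))
```

In practice each step would likely use dedicated tools (for example the ones listed under The Data Tools); the point here is only the order of the steps and what each one hands to the next.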
Standard extended to DS: CRISP-DM (Cross Industry Standard Process for Data Mining)
What is Big Data
❖ complex, large data sets
❖ data that cannot fit in ordinary desktop or server storage
❖ 4 Vs (Volume, Velocity, Variety, Veracity)
The Rise of Data
• Social Media
• Banking
• Telecommunications
• IoT (Internet of Things)
• Web
• Mobile
• Government
• By 2017 global mobile data traffic will reach 11.2 exabytes per month
1 EB = 1000^6 bytes = 10^18 bytes = 1000 petabytes = 1 million terabytes = 1 billion gigabytes.
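The unit arithmetic above can be checked with a few lines of Python. This is a minimal sketch using decimal (SI) units only; the `convert` helper and the `BYTES_PER_UNIT` table are names invented for this example.

```python
from fractions import Fraction

# Decimal (SI) units, matching the slide: 1 EB = 10^18 bytes.
BYTES_PER_UNIT = {
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
}


def convert(value, from_unit, to_unit):
    """Convert between units exactly, avoiding floating-point rounding."""
    return Fraction(str(value)) * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


print(convert(1, "EB", "PB"))       # exabytes to petabytes
print(convert(1, "EB", "TB"))       # exabytes to terabytes
print(convert("11.2", "EB", "PB"))  # the monthly traffic figure, in petabytes
```

Passing the 11.2 figure as a string keeps the conversion exact, since `Fraction` parses the decimal text directly rather than a binary float.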
The Data Workers
• Data Scientist
• Data Engineer
• Data Analyst
• Business Analyst
The Data Tools
• RStudio, SAS, SPSS, Excel, Python, R
• Tableau, QlikView, D3.js, Highcharts, Kibana, Zeppelin
• Hadoop, YARN, Apache Spark
• Cloud Computing - PaaS, IaaS, SaaS
• Hortonworks, Cloudera, MapR, DigitalOcean
• IBM, Microsoft, Google, AWS
• NoSQL, NewSQL Databases
• In-memory Databases - Couchbase, Aerospike, Cassandra, Redis, VoltDB, MemSQL
The Data Products
• Actionable Insights (Data Analysis reports)
• Data Visualization
  - Interactive
  - Static reports
• Data Analytics
  - Descriptive Analytics Model
  - Predictive Analytics Model
• Machine Learning Model
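To make the difference between the descriptive and predictive models listed above concrete, here is a small sketch. The `monthly_signups` figures are invented for illustration, and a straight-line least-squares fit is just one simple stand-in for a predictive model, not a recommendation.

```python
from statistics import mean, median

# Hypothetical monthly counts, invented purely for illustration.
monthly_signups = [10, 12, 14, 16, 18]


def describe(values):
    """Descriptive analytics: summarize what has already happened."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def fit_line(values):
    """Least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def predict(values, steps_ahead=1):
    """Predictive analytics: extrapolate the fitted line past the last point."""
    slope, intercept = fit_line(values)
    return slope * (len(values) - 1 + steps_ahead) + intercept


print(describe(monthly_signups))
print(predict(monthly_signups))
```

The descriptive summary only restates the observed data, while the prediction depends on the assumption that the linear trend continues.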
Data Science and Big Data Landscape in Cebu (Philippines)
• IBM, HP, Cisco, Microsoft, Accenture, etc.
• DataSeer
• Exists Global
• SavvySherpa
• ANALITIKA - DTI , DOST, PLDT
• Big Data Analytics Summit Cebu
http://www.bigdatasummit.com.ph/cebu/
The Big GAP
• Not Enough Startups or Local Companies offering Data Science or Big Data Analytics jobs
• Shortage of Math, Engineering and IT graduates with Data Science / Big Data skills
• Less support from the Government
• Not enough Local experts
Opportunities
• Grassroots and local BigData / Data Science companies
• Local Data Analytics Startup
• BigData / Data Science Institutes or Learning Centers offered by Governments or Academe
• International DataScience Competitions ( Kaggle, Google, AWS,etc)
• Train the younger generation in DS and BigData skills and tools.
Future Plans
• Workshops
• Clinics
• Speakers from Industry
• Trainings
• More meetup events
• Community sharing
• Kaggle Competitions
DEMO
A. Quick Introduction to Apache Zeppelin for the Data Science Life Cycle
1. Download here - https://zeppelin.incubator.apache.org/
2. Author - https://spark-summit.org/eu-2015/speakers/moon-soo-lee/
3. Mac Os Installation - http://www.makedatauseful.com/apache-zeppelin-on-osx-ultra-quick-start/
4. Sample notebooks - https://github.com/hortonworks-gallery/zeppelin-notebooks