Data mining tools overall
Uploaded by: mohamed-sharique
Category: Data & Analytics
Data Mining Tools
Kowshik
Madhumati
Mayur
Mohamed Sharique
Vidyashankar
Orange
Product Details
• Open source
• Data visualization and analysis
• For novices and experts
• Usable through Python scripting
• Available for all popular platforms, including Windows, Mac OS X and variants of Linux
Company Details
• Founded in 1996
• Orange is distributed free under the GPL
• Maintained and developed at the Bioinformatics Laboratory of the Faculty of Computer and Information Science, University of Ljubljana, Slovenia
Python is a widely used general-purpose, high-level programming language. The GNU General Public License (GPL) is one of the most widely used free software licenses.
Features
• Visual Programming
• Visualization
• Interaction and Data Analytics
• Large Toolbox
• Scripting Interface
• Extendable
• Documentation
• Open Source
• Platform Independence
Success Stories
• AstraZeneca, a pharmaceutical giant, uses Orange in drug development and sponsors the development of several related parts of Orange.
• At the Jožef Stefan Institute, the visual programming interface has been upgraded in Orange4WS to support service-oriented architectures.
Screenshot
R
Product Details
• Latest R language engine for statistical computing
• Open source, R-Enterprise, R-Cloud (paid version)
• Data visualization and analysis up to 16 TB
• Extended capabilities with reproducible R toolkits
• Runs on Windows, Mac OS and variants of Linux
Company Details
• Created in 1993 in New Zealand
• Ross Ihaka and Robert Gentleman pioneered the development of the R language
• R is licensed under the GNU General Public License
• Many large multinational companies use R
Useful Functions
• Graphics Visualization
• Spatial Data Analysis
• Clustering
• Text Mining
• Social Network Analysis and Graph Mining
• Statistics
• Data Manipulation
Success Stories
• Bank of America
• Bing
• Facebook
• Ford
• Google
Screenshot
WEKA
Product Details
• Open source
• A collection of machine learning algorithms
• Data visualization and analysis
• Java-based platform
• Used by many researchers and practitioners
Company Details
• Founded in 1997
• Developed at the University of Waikato
The GNU General Public License is one of the most widely used free software licenses.
Features
• Distributed under the General Public License
• GUI for interaction
• Explorer is the main user interface of WEKA
• Supports core tasks including data pre-processing, classification, regression, clustering, association rules and visualization
• Reads data files in multiple formats
• One exceptional feature of WEKA is database connectivity through JDBC with any RDBMS package that provides a JDBC driver
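Of the tasks listed above, association rules are perhaps the quickest to illustrate. The sketch below is not WEKA code. It is plain Python with an invented basket dataset and hypothetical helper names (`support`, `confidence`), and it only shows how the two basic measures behind an association rule could be computed.

```python
# Toy transactions, invented for illustration (not taken from the slides).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # rule never applies, so report zero confidence
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


if __name__ == "__main__":
    # Rule "bread -> milk": how often milk appears when bread does.
    print(support({"bread", "milk"}, TRANSACTIONS))
    print(confidence({"bread"}, {"milk"}, TRANSACTIONS))
```

A rule is usually kept only if both measures exceed user-chosen thresholds; tools such as WEKA wrap this search in ready-made algorithms.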
Success Stories
• The WEKA mailing list has over 1,100 subscribers in 50 countries, including subscribers from many major companies such as Rechtsportal.
Screenshot
RapidMiner
Product Details
• Open source
• Data visualization and analysis
• Machine learning
• Data mining and text mining
• Business intelligence
• Runs on the Java runtime
• Available on all major operating systems and platforms
Company Details
• Started as YALE in 2001 by Ralf Klinkenberg, Ingo Mierswa and Simon Fischer
• In 2006, development moved to Rapid-I, a company founded by Ralf Klinkenberg and Ingo Mierswa, and the tool was subsequently renamed RapidMiner
• Licensed under the AGPL
Features
• A visual, code-free environment, so no programming is needed
• Design of analysis processes
• Predictive analytics (with pre-made templates)
• Data loading
• Data transformation
• Data modelling
• Data visualization (with many visualizations)
• Works with different types and sizes of data sources
• Platform independence
• Acts as a powerful scripting language engine along with a graphical user interface
• Modular operator concept
Success Stories
• Cisco
• PayPal
• eBay
• Miele
• Volkswagen
Screenshot
COMPARISON OF ALL TOOLS
FORMATS SUPPORTED
• WEKA: only 4 file formats are supported
• RapidMiner: supports more file formats (approx. 22)
• R: supports more file formats
• Orange: supports more file formats
USER INTERFACE
• WEKA: easy user interface
• RapidMiner: difficult user interface
• R: simple on Unix, difficult on Windows and Mac
• Orange: easy
CONNECTIVITY
• WEKA: poor connectivity with Excel and non-Java databases
• RapidMiner: easily connected with Excel
• R: easy connectivity with Excel and other databases
• Orange: better than WEKA
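The comparison above can also be held as a small lookup structure, which makes it easy to ask questions such as "which tools have an easy interface?". This is only a sketch: the ratings are transcribed from the slide, while the dictionary layout, the key names and the helper `tools_matching` are assumptions made for illustration.

```python
# Ratings transcribed from the comparison slide; the dictionary layout and
# key names are illustrative assumptions, not part of any tool's API.
COMPARISON = {
    "WEKA": {
        "formats": "only 4 file formats",
        "user_interface": "easy",
        "connectivity": "poor with Excel and non-Java databases",
    },
    "RapidMiner": {
        "formats": "more file formats (approx. 22)",
        "user_interface": "difficult",
        "connectivity": "easily connected with Excel",
    },
    "R": {
        "formats": "more file formats",
        "user_interface": "simple on Unix, difficult on Windows and Mac",
        "connectivity": "easy with Excel and other databases",
    },
    "Orange": {
        "formats": "more file formats",
        "user_interface": "easy",
        "connectivity": "better than WEKA",
    },
}


def tools_matching(criterion, value):
    """Return the tools whose rating for `criterion` equals `value` exactly."""
    return [
        tool
        for tool, ratings in COMPARISON.items()
        if ratings[criterion] == value
    ]


if __name__ == "__main__":
    print(tools_matching("user_interface", "easy"))
```

Because the slide's ratings are free text rather than a fixed scale, the helper matches on exact strings; a real comparison would probably normalise them first.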
Orange has elegant and concise scripting and can also be run in an ETL GUI mode.
R has elegant and concise scripting integrated with a vast statistical library.
RapidMiner has a lot of functionality, is polished and has good connectivity.
WEKA is the easiest GUI to learn and use.