A Step By Step Guide with Visual Illustrations and Examples

The Data Science field is expected to continue growing rapidly over the next several years, and Data Scientist is consistently rated as a top career.

Data Science with R gives you the theoretical background you need to start your Data Science journey and shows you, through practical examples, how to apply the R programming language to extract valuable knowledge from data. Professor Andrew Oleksy guides you through the important concepts of data science, including the R programming language, Data Mining, Clustering, Classification and Prediction, the Hadoop framework and more.
Table of Contents
- Introduction to Data Mining
  - Data Science
  - Knowledge Discovery in Databases (KDD)
  - Model Types
  - Examples and Counterexamples
  - Classification of Data Mining methods
  - Applications
  - Challenges
- The R Programming Language
  - Basic Concepts, Definitions and Notations
  - Tool Installation
  - Introduction to R
  - Data Types
  - Basic Tasks
  - Control Structures
  - Functions
  - Scoping Rules
  - Iterated Functions
  - Help from the console and Package Installation
- Types, Quality and Data Preprocessing
  - Categories and Types of Variables
  - Preprocessing processes
  - dplyr and tidyr packages
- Summary Statistics and Visualization
  - Measures of Position
  - Measures of Dispersion
  - Visualization of Qualitative Data
  - Visualization of Quantitative Data
- Classification and Prediction
  - Classification
  - Prediction
  - Overfitting and Regularization
- Clustering
  - Unsupervised Learning
  - Concept of Cluster
  - K-means algorithm
  - Hierarchical Clustering Algorithms
  - DBSCAN Algorithm
- Mining of Frequent Itemsets and Association Rules
  - Introduction
  - Theoretical Background
  - Apriori Algorithm
  - Frequent Itemsets Types
  - Positive and Negative Border of Frequent Itemsets
  - Association Rules Mining
  - Alternative Methods for Large Itemsets generation
  - FP-Growth Algorithm
  - Arules Package
- Computational Methods for Big Data Analysis (Hadoop and MapReduce)
  - Introduction
  - Advantages of Hadoop's Distributed File System
  - Hadoop Users
  - Hadoop Architecture
  - The Hadoop Cluster Architecture
  - Hadoop Java API
  - List Loops & Generic Classes and Methods