Introduction To Data Science

This module introduces Data Science: why it matters, how Big Data is analysed, the architecture and methods used to solve Big Data problems, data visualization, and more.

  • Introduction to Big Data
  • Roles played by a Data Scientist
  • Analysing Big Data using Hadoop and R
  • Different Methodologies used for analysis in Data Science
  • The Architecture and Methodologies used to solve the Big Data problems
  • Data Acquisition from various sources
  • Data preparation
  • Data transformation using MapReduce (RMR)
  • Application of Machine Learning Techniques, Data Visualization, etc.
  • Problem statements for a few data science problems that we will solve during the course
Basic Data Manipulation Using R In Data Science

This module teaches how to manipulate data in R, covering the data conversion and restructuring tasks that are frequently encountered in the initial stages of data analysis.

  • Understanding vectors in R
  • Reading Data
  • Combining Data
  • Sub-setting data
  • Sorting data and some basic data generation functions
Machine Learning Techniques Using R Part-1

The goal of machine learning is to create a predictive model that is indistinguishable from a correct model. This module starts with an overview of machine learning in Data Science.

  • Machine Learning Overview
  • ML Common Use Cases and techniques
  • Clustering and Similarity Metrics
  • Distance Measure Types: Euclidean and Cosine Measures
  • Creating predictive models
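
The two distance measures named above can be sketched in a few lines. This is a minimal illustration in plain Python rather than R, and the function names are hypothetical choices for this sketch, not part of any course material or library.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Cosine distance, defined here as one minus the cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
```

Euclidean distance is sensitive to vector magnitude, while the cosine measure compares direction only, which is one reason the latter is often preferred for text vectors of differing lengths.
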
Machine Learning Techniques Using R Part-2

This module is designed to teach you K-means clustering, association rule mining, and much more.

  • Understanding K-Means Clustering in Data Science
  • Understanding TF-IDF and Cosine Similarity and their application to Vector Space Model
  • Implementing Association rule mining in R
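
As a rough illustration of the K-means idea covered in this module, the sketch below alternates between assigning points to their nearest centroid and moving each centroid to the mean of its points. It is written in plain Python rather than R, takes the starting centroids as an explicit argument (an assumption made so the result is reproducible), and all function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance; enough for nearest-centroid comparisons."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [squared_distance(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    new_centroids = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            # An empty cluster keeps its previous centroid.
            new_centroids.append(old)
            continue
        new_centroids.append(
            tuple(sum(m[d] for m in members) / len(members)
                  for d in range(len(old)))
        )
    return new_centroids


def k_means(points, initial_centroids, max_iterations=100):
    """Repeat assign/update until the assignments stop changing."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = assign_clusters(points, centroids)
    for _ in range(max_iterations):
        centroids = update_centroids(points, labels, centroids)
        new_labels = assign_clusters(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids
```

For example, `k_means([(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)], [(0.0, 0.0), (10.0, 10.0)])` groups the first two points into one cluster and the last two into the other.
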
Data Science Machine Learning Techniques Using R Part-3

The last part of the machine learning module of the Data Science course covers Decision Trees and the Random Forest concept in Data Science.

  • Understanding Process flow of Supervised Learning Techniques
  • Decision Tree Classifier
  • How to build Decision trees
  • Random Forest Classifier
  • The Random Forest concept in data science
  • Features of Random Forest
  • Out-of-Bag Error Estimate and Variable Importance
  • Naive Bayes Classifier
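
The error estimate listed for Random Forest relies on the rows that a tree's bootstrap sample happens to leave out, commonly called out-of-bag rows. The sketch below, in plain Python with hypothetical function names, only illustrates how such a sample splits the data; it does not train any trees.

```python
import random


def bootstrap_split(n_rows, seed=None):
    """Draw n_rows row indices with replacement.

    Returns (in_bag, out_of_bag): the sampled indices, and the indices
    that were never drawn and so can serve as a held-out set for one tree.
    """
    rng = random.Random(seed)
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


def average_oob_fraction(n_rows, n_trees, seed=0):
    """Average share of rows left out of the bag across n_trees samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        _, oob = bootstrap_split(n_rows, seed=rng.randrange(2 ** 32))
        total += len(oob) / n_rows
    return total / n_trees
```

For large data sets the left-out share approaches 1/e, roughly 37% of the rows, so each tree can be evaluated on data it never saw without setting aside a separate test set.
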
Introduction To Hadoop Architecture

Understand the Hadoop architecture, its commands, SQOOP and other data loading techniques in this module.

  • Hadoop Architecture
  • Common Hadoop commands
  • MapReduce and Data loading techniques (Directly in R and in Hadoop using SQOOP, FLUME, and other data Loading Techniques)
  • Removing anomalies from the data
Integrating R With Hadoop

This module of the Data Science course covers how R is integrated with Hadoop, the R Hadoop Integrated Programming Environment (RHIPE), and writing MapReduce jobs.

  • Integrating R with Hadoop using RHadoop and the RMR package
  • Exploring RHIPE (R Hadoop Integrated Programming Environment)
  • Writing MapReduce Jobs in R and executing them on Hadoop
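
The shape of a MapReduce job can be sketched locally before running anything on a cluster. The word-count example below is plain Python, not R, and it does not use the RMR or RHIPE APIs; the function names are hypothetical and the shuffle step only imitates what Hadoop does between the map and reduce phases.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory input."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

On a real cluster the map and reduce functions run in parallel on different nodes; only the two functions themselves are written by the job author, and the framework handles the grouping.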

Data Science Mahout Introduction And Algorithm Implementation

By the end of this module, you will be able to implement machine learning algorithms with Mahout.

  • Implementing Machine Learning Algorithms on larger Data Sets with Apache Mahout

Additional Mahout Algorithms And Parallel Processing Using R

In this module, you will learn how to implement a Random Forest Classifier with a parallel processing library in R.

  • Implementation of different Mahout algorithms
  • Random Forest Classifier with parallel processing Library in R


The aim of the project module is to give you an idea of what a project involves: its problem statement, the various approaches, and the algorithms used to solve it.

  • Project Discussion
  • Problem Statement and Analysis
  • Various approaches to solve a Data Science Problem
  • Pros and Cons of different approaches and algorithms
Resume & Interview Questions

Spiritsofts offers advanced Data Science interview questions and answers along with Data Science resume samples.

