Microsoft R Open – Programming and Data Analysis

Public courses

£2300

- Anyone can join the training
- Course outline as presented on the website
- Small groups, 3-10 people

Private courses

Price set individually

- Training workshop just for your team
- You choose date and location of the training
- Course outline tailored to your needs

About the training

Microsoft R Open, previously known as Revolution R, is an enhanced distribution of R, the ninth most popular programming language in the world. R's growing popularity and its potential for data science prompted Microsoft to release its own distribution with additional features, such as parallel code execution through new computing packages and tools for producing reproducible results.

The training introduces programming in R and also covers data analysis, visualization and the new features of Microsoft R Open.

Who is this training for?

The training is aimed at people who are interested in programming with R or who want to work as data analysts and learn more about new features and packages offered by the Microsoft R Open distribution.

Data Analysts, Consultants, Statisticians, Engineers, Data Scientists

What will I learn?

  • Interpret and modify R code
  • Write R scripts for data analysis
  • Develop new functions
  • Import data from files and databases
  • Wrangle data using advanced functions
  • Forecast, classify and detect outliers
  • Build machine learning models and tune hyperparameters
  • Create dynamic reports with R Markdown
  • Use packages from Microsoft R Open

Course outline

  1. Introduction
    • R program and evaluation
    • First R session, interactive and batch mode
    • RStudio IDE
    • Overview of objects and data types
    • Developing data analysis R script
    • Getting help, functions, packages
    • R libraries – install, build and load
  2. R syntax basics
    • Operators and calculations
    • Names – rules for names, namespaces, conflicts
    • Special names
    • Classes – testing, conversion
  3. R programming
    • Scripts – modifications and team projects
    • Functions and functional programming paradigm
    • Libraries – installation, version management, dependencies
    • Debugging R code
    • Errors and warnings – error handling and tracking
  4. Data objects in R
    • Vector
    • Matrix, Table
    • List
    • Data Frame
    • Factor
  5. R programming constructs
    • Conditional execution – if else, vectorized if
    • Loops – repeat, while, for, replicate
    • Apply functions family
    • Vectorization and code optimization
    • Simulations with R, random numbers
    • Functions – definition, parameters, call
    • Function environment – scope, hierarchy, namespaces, memory
  6. Strings, regular expressions
    • Operations on strings, functions
    • Regular expressions and manipulations
    • Numerical representation of text with TF, TF-IDF

Data Analysis with R

  1. Importing data
    • Data sources overview
    • Importing flat files – TAB, CSV, XML, HTML
    • Importing binary files – XLSX, SAS, Stata, SPSS, MATLAB
    • Scraping data from Web sources
    • Connecting to databases and SQL sources
    • Exporting data to different formats
  2. Data preprocessing
    • Combining data from different sources, removing duplicates
    • Data cleansing, managing data types and shape of data
    • Data wrangling, recoding, renaming
    • Handling missing values and outliers – imputation
    • Data manipulation functions from dplyr, data.table, aggregate and reshape2 packages
    • Standardization, normalization, binning, one-hot encoding
  3. Exploratory data analysis, distributions, statistical data modelling
    • Descriptive statistics, correlation, covariance
    • Probability distributions, generating random numbers
    • Statistical hypothesis testing
    • Linear models, multiple linear regression
    • Generalized linear models, logistic regression
    • Model diagnostics, residuals, model comparison
    • Time series in R, ARIMA, VAR
  4. Introduction to Machine Learning
    • Unsupervised learning – clustering
    • Classification – decision trees, random forest
    • Neural networks
    • Evaluation and tuning

Reproducible research and data visualization

  1. Visualization and reporting with R
    • Different tools for data visualization in R
    • Libraries for data visualization – ggplot2, lattice, grid
    • Graphical parameters, formatting plots
    • Line plot, histogram, scatterplot
    • Reproducible research with LaTeX, knitr and R Markdown
    • Dynamic reports and presentations with R Markdown

Microsoft R distribution

  1. Microsoft R Open libraries
    • R Open, R Server compatibility with R
    • Parallel computing libraries
    • Reproducibility with R Open
    • Library dependency management
  2. Microsoft R products
    • Microsoft R Server
    • Microsoft R Client
    • SQL Server R Services
