Machine Learning with R

Public courses


- Anyone can join the training
- Course outline as presented on the website
- Small groups, 3-10 people

Private courses

Price set individually

- Training workshop just for your team
- You choose date and location of the training
- Course outline tailored to your needs

About the training

Machine Learning is a fast-growing field that gives us great possibilities to explore data, make predictions, and recognize patterns. It is a combination of algorithms used in IT, statistics, and automation that can, based on existing data, forecast and classify new data, recognize text, images, and speech, or create recommendation systems. Such algorithms are used, for example, in spam filtering, web ads, marketing, and fraud detection. The ability to use machine learning techniques is a great asset for every data scientist.

During the training, you will learn the basics of machine learning. Numerous exercises and examples throughout the training will help you practice the material and gain practical knowledge necessary in daily work.

Who is this training for?

This training is aimed at people who would like to learn machine learning techniques and gain practical knowledge useful in daily work. Basic R programming knowledge is required to take part in this training.

Machine Learning with R is addressed to:

  • Analysts
  • Statisticians
  • Data scientists
  • Programmers
  • Engineers

Training participants come from different backgrounds (finance, banking, IT, FMCG, biology, medicine…). Before the course, we survey participants to learn what their training expectations and needs are. We always make sure the training is as practical and useful in your daily work as possible!

What will I learn?

During the training, you will learn the basics of machine learning and how to build models for data classification and numerical prediction.

After completing the training, you will be able to:

  • Select the right model for your problem – We will discuss various machine learning algorithms and suggest the best one for a given problem type.
  • Test a model’s predictive abilities – You will learn how to judge model quality and its predictive abilities. What is overfitting and why is it so bad? How do I know whether a model gives good forecasts?
  • Build a linear regression model – We will discuss what linear regression is and what its applications are. During exercises, you will build different models and choose the one with the best predictive abilities. Regression models can be used to find factors influencing a price or to forecast sales.
  • Build a classification model – We will use decision trees, neural networks, and SVM algorithms to build classification models. Such models are used in recommendation systems, i.e. systems that predict which product a customer will be willing to buy or which movie a person would enjoy.
  • Build an unsupervised learning model – We will teach you what to do when there is no labeled data available.
  • Enhance a model’s predictive abilities – You will learn how to further enhance your model’s performance by using methods like boosting or random forests.
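To give a flavor of the workflow behind these points, here is a minimal base-R sketch of the first steps covered in the course: fitting a linear regression on a training set and judging its forecasts on a held-out test set. The built-in mtcars data and the wt + hp predictors are illustrative choices, not the course's own materials:

```r
# Minimal sketch: fit a linear regression on the built-in mtcars data
# and judge its predictive ability on a held-out test set (base R only).
set.seed(42)
n     <- nrow(mtcars)
train <- sample(n, size = round(0.7 * n))            # ~70% training rows

model <- lm(mpg ~ wt + hp, data = mtcars[train, ])   # fit on training set
pred  <- predict(model, newdata = mtcars[-train, ])  # forecast the test rows

rmse <- sqrt(mean((pred - mtcars$mpg[-train])^2))    # root mean squared error
print(rmse)
```

Comparing this test-set error with the error on the training rows is a quick way to spot overfitting: a model that fits the training data much better than the held-out data will not forecast well.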

Course outline

  1. Introduction to Machine Learning
    • Possibilities and limitations of machine learning
    • Supervised learning
    • Unsupervised learning
    • Reinforcement learning
    • Evolutionary learning
    • Machine learning step by step
    • Statistical learning vs Machine learning
    • Examples of Machine Learning usage
  2. Programming in R
    • Basic data structures
    • If, for, while
    • Importing data
    • R packages for Machine Learning
    • Writing scripts
    • Dynamic reports using knitr and R markdown
  3. Testing the predictive abilities of Machine Learning algorithms
    • Trade-off between forecast accuracy and model interpretability
    • Overfitting
    • Training set, test set, validation set
    • Accuracy measures in classification and quantitative forecasting
    • Confusion matrix, ROC curve, AUC measure
    • Cross-validation – k-fold, LOOCV
    • Bootstrap
  4. Linear regression – numerical data forecasting problem
    • Correlation coefficient
    • Multivariate linear regression
    • Graphical interpretation of the regression model
    • Methods of selecting variables
    • Selecting best model, forecasting
    • Examples and exercises in R
  5. Logistic regression – classification problem
    • Binary logistic regression
    • Multinomial logistic regression
    • Model estimation and forecasting
    • Examples and exercises in R
  6. Naive Bayes – classification problem
    • Bayes theorem and conditional probability
    • Naive Bayes classification algorithm
    • Examples and exercises in R
  7. k-Nearest Neighbors – classification problem
    • k-NN lazy learning algorithm
    • Distance measures
    • Choosing k-nearest neighbors
    • Examples and exercises in R
  8. Decision trees – classification problem
    • Building trees using divide and conquer rule
    • C5.0 algorithm
    • Entropy measures, Gini coefficient
    • Pruning trees
    • Decision rules
    • Examples and exercises in R
    • Additional content: Regression decision trees
  9. Neural networks – classification problem and numerical data forecasting
    • Biological background of neural networks
    • Activation functions
    • Neural networks topology
    • Estimating neural networks using backpropagation method
    • Examples and exercises in R
  10. Support Vector Machines
    • Graphical interpretation of SVM estimation
    • Linearly non-separable data problem
    • SVM algorithm
    • Kernels
    • Additional content: SVM regression
    • Examples and exercises in R
  11. Association rules
    • Apriori algorithm
    • Building rules using the Apriori algorithm
    • Examples and exercises in R
  12. Dimensionality reduction
    • Discriminant analysis – LDA
    • Principal component method – PCA
    • Factor analysis – FA
    • ISOMAP – MDS
    • Examples and exercises in R
  13. Tuning parameters – increasing a model’s predictive abilities
    • Parameter tuning using caret package
    • Examples and exercises in R
  14. Ensemble learning – increasing a model’s predictive abilities
    • Boosting – AdaBoost, decision stumps
    • Gradient boosting machines
    • Bagging
    • Randomization
    • Random forests
    • Additive regression
    • Examples and exercises in R
  15. Unsupervised learning
    • k-Means
    • k-Medoids
    • Hierarchical segmentation
    • Density-based segmentation
    • SOM – Self-organising feature map
    • Examples and exercises in R
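As a rough illustration of the validation techniques from item 3 (and the kind of exercise done in R throughout the course), here is a minimal base-R sketch of k-fold cross-validation for a logistic regression classifier. The mtcars data and the wt + hp predictors are illustrative assumptions, not the course's own examples:

```r
# Minimal sketch of k-fold cross-validation (base R only): estimate the
# out-of-sample accuracy of a logistic regression that predicts the
# transmission type (am) in the built-in mtcars data.
set.seed(1)
k     <- 5
folds <- sample(rep(1:k, length.out = nrow(mtcars)))  # random fold labels

acc <- sapply(1:k, function(i) {
  fit <- glm(am ~ wt + hp, data = mtcars[folds != i, ], family = binomial)
  p   <- predict(fit, mtcars[folds == i, ], type = "response")
  mean((p > 0.5) == mtcars$am[folds == i])            # accuracy on fold i
})
mean(acc)   # cross-validated accuracy estimate
```

In practice, the caret package (item 13) automates this resampling loop while tuning model parameters at the same time.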

