R is one of the most popular environments and languages for statistical analysis, data mining, and machine learning. A managed, scalable version of R runs in SQL Server, Power BI, and Azure ML. The main topic of this 4-day course is the R language. However, the course also shows how to use the other languages and tools available in the Microsoft BI suite for data science applications, including Python, T-SQL, Power BI, Azure ML, and Excel. The labs focus on R; the demos also show the code in other languages.
Attendees of this course learn to program with R from scratch. Basic R code is introduced using the free R engine and the RStudio IDE. The lifecycle of a data science project is explained in detail. The attendees learn how to perform a data overview and how to handle the most tedious task in a project, data preparation. After data overview and preparation, the analytical part begins with intermediate statistics for analyzing associations between pairs of variables. Then the course introduces more advanced methods for researching linear dependencies.
Too many variables in a model can be a problem in itself. The course shows how to do feature selection, starting with the basics of matrix calculations. Then the course switches to more advanced data mining and machine learning analyses, including supervised and unsupervised learning. The course also introduces currently popular topics, including forecasting, text mining, and reinforcement learning.
Finally, the attendees learn through labs how to use R code in SQL Server, Azure ML, and Power BI, and through demos how to use Python inside all of the tools mentioned.
Following an introduction, the modules will be as follows:
Discussion: R vs Python
Attendees should have a basic understanding of data analysis and basic familiarity with SQL Server tools.
This seminar consists of instructor presentations and individual work during labs. During labs, the attendees mainly use the R language.
Every attendee gets a PDF printout of all slides, along with all code and solutions for the demos presented and for the lab exercises.
In addition, every attendee gets an electronic version of the book Data Science with SQL Server Quick Start Guide by Dejan Sarka (Packt, 2018).
Each attendee works on a prepared computer, using a virtual machine with the following software pre-installed: