MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English + srt | Duration: 28 lectures (5h 17m) | Size: 1.85 GB
Four graphical techniques you can use to quickly explore your data
What you’ll learn
Develop a fundamental framework to carry out your own Exploratory Data Analysis
The use of scatter plots and how to incorporate linear and non-linear models into your graphics
How to evaluate if your data is "normal" using histograms and probability plots
The power of box plots to compare groups
You will need to have R and RStudio Desktop installed on your computer (Mac or PC) as well as an internet connection to download and install packages within RStudio Desktop. A basic understanding of the RStudio environment is assumed.
This example-based course introduces exploratory data analysis (EDA) using R. A primary objective is to apply graphical EDA techniques to representative data sets using the RStudio platform.
I have incorporated datasets from the NIST/SEMATECH e-Handbook of Statistical Methods into this course and adopted their fundamental approach of Exploratory Data Analysis.
We use scatter plots to examine relationships between two variables, determine if there is a linear or non-linear relationship, analyze variations of the dependent variable, and determine if there are outliers in the dataset.
Of course, we need to remember that causality implies association and that association does NOT imply causality.
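As a rough illustration of the scatter plot approach described above, the sketch below overlays a linear fit and a non-linear (loess) smoother on the same points. It assumes ggplot2 is installed and uses R's built-in mtcars data as a stand-in; it is not one of the NIST/SEMATECH datasets used in the course, and the variable choices are only for illustration.

```r
library(ggplot2)

# Assumption: mtcars (shipped with R) stands in for the course data sets.
# wt = car weight, mpg = fuel economy.
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Straight-line (linear model) fit
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "blue") +
  # Local regression (loess) to reveal possible non-linearity
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE, colour = "red") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

print(p_scatter)
```

If the two curves diverge noticeably, that is a hint that a straight line may not describe the relationship well. Points far from both curves are candidates for closer inspection as outliers.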
We will summarise the distribution of a dataset graphically using histograms. This tool can quickly show us the location and spread of the data, and indicate whether the data follows a normal distribution, is skewed, or has multiple modes or outliers.
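A histogram of this kind might be drawn as in the sketch below. It assumes ggplot2 is installed and uses R's built-in faithful data (Old Faithful eruption durations) purely as an example, because its two modes are easy to see. The bin count of 25 is an arbitrary choice, not a recommendation from the course.

```r
library(ggplot2)

# Assumption: the built-in faithful data set is used only as an example.
p_hist <- ggplot(faithful, aes(x = eruptions)) +
  # bins controls the resolution; try several values when exploring
  geom_histogram(bins = 25, colour = "white") +
  labs(x = "Eruption duration (minutes)", y = "Count")

print(p_hist)
```

Changing the number of bins can alter the apparent shape, so it is worth trying a few values before drawing conclusions about skewness or modes.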
An underused, complementary technique to histograms is the probability plot. We will construct probability plots by plotting the data against a theoretical normal distribution. If the data follows a normal distribution, the plot will form a straight line. We will use the normal probability plot to assess whether or not our examples follow a normal distribution.
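The normal probability plot described above can be sketched with ggplot2's stat_qq() and stat_qq_line(). The data here is simulated (hypothetical values drawn from a normal distribution with an arbitrary mean and standard deviation), so the points should fall close to the reference line. The correlation at the end is one possible numeric summary of straightness; it is added here for illustration and is not taken from the course.

```r
library(ggplot2)

set.seed(42)
# Hypothetical data: 200 draws from a normal distribution
df <- data.frame(value = rnorm(200, mean = 10, sd = 2))

p_qq <- ggplot(df, aes(sample = value)) +
  stat_qq() +        # sample quantiles vs. theoretical normal quantiles
  stat_qq_line() +   # reference line through the first and third quartiles
  labs(x = "Theoretical normal quantiles", y = "Sample quantiles")

print(p_qq)

# Correlation between theoretical and sample quantiles;
# values near 1 correspond to a nearly straight plot
qq <- qqnorm(df$value, plot.it = FALSE)
qq_cor <- cor(qq$x, qq$y)
```

Replacing rnorm() with a skewed generator such as rexp() bends the points away from the line, which is the pattern to look for in non-normal data.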
Finally, we will use box plots to view the variation between different groups within the data.
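A grouped box plot could look like the sketch below, again assuming ggplot2 is installed. R's built-in iris data is used only as a convenient example with a grouping variable (Species); it is not one of the course datasets.

```r
library(ggplot2)

# Assumption: iris (shipped with R) stands in for the course data sets.
# One box per species, showing median, quartiles, and flagged outliers.
p_box <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  labs(x = "Species", y = "Sepal length (cm)")

print(p_box)
```

Placing the boxes side by side makes differences in location and spread between groups visible at a glance.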
Aside from scatterplots, most spreadsheet programs do not support these methods, so learning how to do this fundamental analysis in R can improve your ability to explore your data.
Who this course is for
If you currently create multiple data visualizations in spreadsheets, you’ve probably wondered how you could improve your work or how you could work more efficiently. Or, if you have to recreate graphics repeatedly, you might be looking for a tool to make your work more reproducible. This course focuses on the basic techniques used in Exploratory Data Analysis: scatterplots, histograms, probability plots, and box plots. Learning R and ggplot2 will allow you to move beyond spreadsheets and use a professional tool to explore your data effectively.