Tidying Data in R
Estimated Time Commitment: 3 Hours
This skill starts with a discussion of the importance of data quality in business analytics. It then shifts to data transformation using the tidyverse, a collection of R packages.
Data quality is an important concern: when organizations do not invest in creating high-quality data, they accumulate data debt. Investing in data quality therefore helps organizations maximize the return on investment (ROI) from their data.
To develop an appreciation for data transformation, we discuss the relationship between managerial decisions, analysis, and data transformation.
In this skill, you will be introduced to useful R packages such as dplyr, tidyr, and stringr for common data manipulation tasks.
Upon successful completion, you will be able to:
Understand why and how Data Quality affects Business Analytics
Appreciate the relationship between managerial decisions, analysis, and data transformation
Perform basic data manipulation tasks
Use basic functions of the dplyr, tidyr, and stringr packages for data transformation
Introduction Video
Introduction to the Skill
Glossary
Data Quality
Data Structure Based on the Business Problem
Data Structure Based on the Business Problem (Part 2)
Knowledge Check 1
Subset Data Using Filter and Select Functions
Useful Operators for Data Manipulation
Creating New Variables Using Mutate Function
Knowledge Check 2
Data Aggregation Using Summarise and Group_By Functions
Handling Missing Values
Knowledge Check 3
Data Join
Long vs. Wide Format for Data
Manipulating Strings
Knowledge Check 4
Instructions
Exercise Files
Debriefing
Concluding Video
Final Quiz
Survey Instructions
Feedback Survey
Survey Verification
Next Steps