Course Introduction: Data Science

This course provides a solid foundation in the techniques and tools of modern data science, with a special focus on R, a programming language widely used for statistics and data analysis. You will learn how to turn raw data into insights through statistical methods, programming, and effective visualization.

Course Overview

Through a combination of lectures, hands-on labs, and a final data science challenge, this course introduces key concepts and practical skills used by data scientists:

Introduction to R
We begin with the fundamentals of R programming. You will learn the core syntax, data structures, and workflows needed to manipulate and analyze data efficiently.

Data Wrangling
Real-world data is rarely clean. You will learn how to clean, transform, and organize datasets using modern R tools, preparing them for analysis and modeling.

Data Visualization
You will learn how to explore and communicate data effectively through visualization. Using R, you will create clear, compelling graphics that reveal patterns and trends.

Dimensionality Reduction
High-dimensional datasets, those with many variables per observation, can be difficult to analyze and visualize. You will learn techniques such as Principal Component Analysis (PCA), which represent a dataset in fewer dimensions while retaining most of its variation.

Clustering
Discover methods for identifying natural groupings within data. This module introduces unsupervised learning techniques, such as k-means and hierarchical clustering, that group observations without predefined labels.

Classification
Classification is a core supervised learning task that involves predicting categorical outcomes. You will learn how models can be trained to assign data points to specific classes.

Regression
Explore methods for predicting continuous outcomes using linear and non-linear regression models. These techniques help uncover relationships in data and support forecasting and decision-making.

Generative AI
The course also introduces the rapidly evolving field of generative AI. You will gain a high-level understanding of generative models and their growing role in modern data science.

Learning Goals

By the end of this course, you will:

  • Develop a strong foundation in data science using R.

  • Gain hands-on experience analyzing real-world datasets.

  • Apply statistical and machine learning techniques to solve practical problems.

  • Communicate insights effectively through data visualization and storytelling.