The Coursera “Data Science Specialization” is a ten-course introduction to data science developed and taught by leading professors at Johns Hopkins University. It covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results. In the final Capstone Project, you’ll apply the skills you have learned by building a data product using real-world data. On completion, you’ll have a portfolio demonstrating your mastery of the material.
About The Data Science Specialization
In this Specialization, you will learn how to ask the right questions, manipulate data sets, and create visualizations to communicate results. The Specialization comprises the following ten courses:
- COURSE 1: The Data Scientist’s Toolbox
In this course, you will get an introduction to the main tools and ideas in the data scientist’s toolbox. The course gives an overview of the data, questions, and tools that data analysts and data scientists work with. There are two components to this course: the ideas behind turning data into actionable knowledge, and the tools used in the program, such as version control, Markdown, Git, GitHub, R, and RStudio.
- COURSE 2: R Programming
In this course, you will learn how to program in R and how to use R for effective data analysis. The course covers practical issues in statistical computing, including programming in R, reading data into R, accessing R packages, writing R functions, debugging, profiling R code, and organizing and commenting R code. Topics in statistical data analysis will provide working examples.
- COURSE 3: Getting and Cleaning Data
This course will cover the basic ways that data can be obtained – from the web, from APIs, from databases and from colleagues in various formats. It will also cover the basics of data cleaning and how to make data “tidy”.
- COURSE 4: Exploratory Data Analysis
This course covers the essential exploratory techniques for summarizing data. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models.
- COURSE 5: Reproducible Research
This course focuses on the concepts and tools behind reporting modern data analyses in a reproducible manner. It covers literate statistical analysis tools, which let you publish a data analysis in a single document so that others can easily run the same analysis and obtain the same results.
- COURSE 6: Statistical Inference
Statistical inference is the process of drawing conclusions about populations or scientific truths from data. After taking this course, students will understand the broad directions of statistical inference and use this information for making informed choices in analyzing data.
- COURSE 7: Regression Models
Linear models, as their name implies, relate an outcome to a set of predictors of interest using linear assumptions. This course will cover modern thinking on model selection and novel uses of regression models, including scatterplot smoothing.
- COURSE 8: Practical Machine Learning
This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.
- COURSE 9: Developing Data Products
Data products automate complex analysis tasks or use technology to expand the utility of a data-informed model, algorithm or inference. This course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.
- COURSE 10: Data Science Capstone Project
The capstone project class allows students to create a usable, public data product that they can use to show their skills to potential employers. Projects will be drawn from real-world problems and will be conducted with industry, government, and academic partners.
Summary of Main Course Features
- Creators:
  - Johns Hopkins University – recognized as a destination for excellent, ambitious scholars and a world leader in teaching and research.
  - Roger D. Peng, Ph.D., Associate Professor, Biostatistics.
  - Brian Caffo, Ph.D., Professor, Biostatistics.
  - Jeff Leek, Ph.D., Associate Professor, Biostatistics.
- Comprises: 10 courses.
- Commences: January 26, 2020.
- Projects to help you practice and apply the skills you learn.
- Certificate available.
- Beginner Specialization: No prior experience required.