CAP394 - Introduction to Data Science


With Gilberto Ribeiro de Queiroz

These are the lecture notes and additional material for the course CAP394 - Introduction to Data Science, part of the Graduate Program in Applied Computing offered by the Brazilian National Institute for Space Research.
This course is offered in the second term of each year.

In this course students will learn the basic concepts of Data Science with a practical approach. Each student must complete the assigned exercises and present a complete project, related to his or her research field, that collects and processes data and creates, as a result, a data product.

Course material and additional notes are in English. Lectures may be presented in Portuguese. Notes are frequently updated!

See below for the course schedule, references, and additional material.

Course Schedule for 2018

Lectures will be held in the second term (June 22nd - August 31st), on Fridays, from 8:00 to 12:00, in Meeting Room #27, ASA Building.

June 22nd There will be no class on this day (Brazil plays a World Cup game).
The Lecture Notes for the first lecture were recorded; students must read the slides and watch the videos on the YouTube channel.
Students must also read and follow instructions on the Exercise Sheet (see also links).
June 29th There will be no class on this day, but students will use the time to complete the first assignment (define your project!) -- see the First Exercise Sheet.
July 6th Gilberto Ribeiro: Jupyter environment and other topics.
July 13th Lightning-fast review of lecture 1 (see the Lecture Notes, the videos and the Exercise Sheet).
Introduction to R: Lecture Notes, GitHub Repo, and Exercise Sheet
July 20th More on R: exploratory data analysis, creating data products. EDA in R: Lecture Notes.
July 27th Yet another holiday... we will try to make up this lecture on another day.
August 3rd Data Science Experiment (Gilberto Queiroz and Rolf Simões). Link to the repo.
August 10th Introduction to Visual Programming - Orange - Full Version (José Alberto Sá -- in Portuguese)
August 17th Introduction to Visual Programming - Orange - Full Version (José Alberto Sá -- in Portuguese)
August 24th TBD.
August 31st TBD.

See also the official schedule for the graduate programs at INPE.

Course Material used in 2017

This material will be deprecated for 2018!



See R code for this course (and more).

Software and Tools

Most examples and exercises can be developed with R and RStudio.

Data used in this Course


Papers, Articles, etc.

Video Lectures

Data from Elsewhere

Project Ideas from Elsewhere