CAP359 - Principles and Applications of Data Mining

These are the lecture notes and additional material for the course CAP359 - Principles and Applications of Data Mining, part of the Graduate Program in Applied Computing offered by the Brazilian National Institute for Space Research. The course is offered in the third term of each year.
In this course, students will learn algorithms, techniques and applications of Data Mining through practical examples. Students must complete the assigned exercises and present a complete project that applies the concepts and algorithms to data related to their research field.
Course material and additional notes are in English. Lectures may be presented in Portuguese. Notes are frequently updated!
See below for the course schedule, additional material, and references.
Course Schedule for 2019 (Third Term)
Important: Due to many meetings and conferences during the third term, part of the time for the lectures will be used to develop projects. Students must allocate time to work on these projects, which will be discussed with the lecturer(s).
Lectures will be held on Fridays, from 8:30 AM to noon, in Rotunda's room #8.
September 27th | Introduction to the course, discussion on the projects. Lecture Notes. Your project: start with the data preparation task. |
October 4th | Introduction to Data Analysis with Python and R (Felipe Souza, Felipe Menino). Links to contents, presentation. |
October 11th | No lectures this day. Please work on the data preparation tasks. |
October 18th | What is data and what is minable data (Lecture Notes). Naive introduction to classification algorithms (Lecture Notes). How's your project? |
October 25th | |
November 1st | TBC: Data Mining with Orange (Olga Bittencourt) |
November 8th | |
November 15th | No lectures this day -- federal holiday. |
November 22nd | |
November 29th | |
December 6th | |
December 13th | |
Additional Material
An R package used in the lectures: cap359r
Material used by José Roberto Motta Garcia for a Data Science in R talk (warning: 230MB file!)
References
Books
- Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Pearson - Addison-Wesley, 2006.
- Leandro Augusto da Silva, Sarajane Marques Peres, Clodis Boscarioli, Introdução à Mineração de Dados com Aplicações em R, Elsevier (Coleção SBC), 2016. (In Portuguese)
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth and Ramasamy Uthurusamy, Advances in Knowledge Discovery and Data Mining, MIT Press, 1996.
- Dorian Pyle, Data Preparation for Data Mining, Academic Press, 1999.
- Ian H. Witten and Eibe Frank, Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann Publishers, 2000.
- Manu Konchady, Text Mining Application Programming, Charles River Media, 2006.
- Alex Berson and Stephen J. Smith, Data Warehousing, Data Mining and OLAP, McGraw-Hill, 1997.
- J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
- Carl G. Looney, Pattern Recognition Using Neural Networks, Oxford University Press, 1997.
- M. Tim Jones, AI Application Programming, Second Edition, Charles River Media, 2005.
- Eric Backer, Computer-Assisted Reasoning in Cluster Analysis, Prentice-Hall, 1995.
- James C. Bezdek and Sankar K. Pal, Fuzzy Models for Pattern Recognition, IEEE Press, 1992.
- Zheru Chi, Hong Yan and Tuan Pham, Fuzzy Algorithms with Applications to Image Processing and Pattern Recognition, World Scientific Publishing, 1996.
- Mika Sato, Yoshiharu Sato and Lakhmi C. Jain, Fuzzy Clustering Models and Applications, Physica-Verlag Heidelberg, 1997.
- Nello Cristianini and John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2003.
- Eric Bonabeau, Marco Dorigo and Guy Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.
- David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
- Edward R. Tufte, Envisioning Information, Graphics Press, 1990.
- Edward R. Tufte, Visual Explanations -- Images and Quantities, Evidence and Narrative, Graphics Press, 1997.
- Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, 2001.
- Edward R. Tufte, Beautiful Evidence, Graphics Press, 2006.
- B. Tso and P. M. Mather, Classification Methods for Remotely Sensed Data, Taylor and Francis, London, 2000.
- Luiz Alexandre Peternelli, Marcio Pupin Mello, Conhecendo o R - Uma visão estatística, Série Didática - Editora UFV, 2012. (In Portuguese)
Journals
- Data Mining and Knowledge Discovery
- Knowledge and Information Systems
- IEEE Transactions on Knowledge and Data Engineering
- ACM Transactions on Knowledge Discovery from Data
- Artificial Intelligence Review
- Progress in Artificial Intelligence
- Machine Learning
- Pattern Recognition and Image Analysis
- Pattern Analysis & Applications
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IEEE Computational Intelligence Magazine
- IEEE Intelligent Systems
- ACM Transactions on Intelligent Systems and Technology
- ACM Transactions on Interactive Intelligent Systems
- IEEE Transactions on Neural Networks
- Neural Computing & Applications
- Natural Computing
- IEEE Transactions on Evolutionary Computation
- ACM Transactions on Database Systems
- ACM Transactions on Graphics
- IEEE Transactions on Visualization and Computer Graphics
- Computing in Science & Engineering
- Social Network Analysis and Mining
- World Wide Web
- Journal of Intelligent Information Systems
- IEEE Transactions on Services Computing
- IEEE Transactions on Software Engineering
- IEEE Transactions on Image Processing
- IEEE Transactions on Fuzzy Systems
- IEEE Transactions on Computational Intelligence and AI in Games
- IEEE Internet Computing Magazine
- IEEE Transactions on Computers
- ACM Journal of Data and Information Quality
- ACM Transactions on Information Systems
- ACM Transactions on Sensor Networks
- ACM Transactions on the Web
- Journal of the Brazilian Computer Society
- IEEE Software
- Journal of the ACM
Contact the professor if you can't find a specific article!