CAP359 - Principles and Applications of Data Mining
These are the lecture notes and additional material for the course
CAP359 - Principles and Applications of Data Mining,
part of the Graduate Program in Applied Computing offered by the
Brazilian National Institute for Space Research.
This course is offered in the third term of each year.
In this course, students will learn algorithms, techniques, and applications of Data Mining through practical examples. Students must complete the assigned exercises and present a complete project that applies the concepts and algorithms to data related to their research field.
Course material and additional notes are in English. Lectures may be presented in Portuguese. Notes are frequently updated!
Course Schedule for 2017 (Third Term)
Lectures will be held on Mondays, 8:30AM - noon, at Rotunda's classroom #8.
|September 25th||Introduction to the course, discussion on the projects.|
|October 2nd||What is Data?|
|October 9th||No lectures this day.|
|October 16th||Classification (part 1)|
|October 23rd||No lectures this day -- please work on the projects!|
|October 30th||No lectures this day -- please work on the projects!|
|November 6th||Clustering (part 1)|
|November 13th||Meetings about the projects (can be at any time, e-mail me to schedule!)|
|November 20th||No meetings this day (WorCAP!) -- please work on the projects!|
|November 27th||Meetings about the projects (can be at any time, e-mail me to schedule!)|
|December 4th||Meetings about the projects (can be at any time, e-mail me to schedule!)|
|December 11th||Meetings about the projects (can be at any time, e-mail me to schedule!)|
An R package used in the lectures: cap359r
Material used by José Roberto Motta Garcia for a Data Science in R talk (warning: 230MB file!)
- Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Pearson - Addison-Wesley, 2006.
- Leandro Augusto da Silva, Sarajane Marques Peres, Clodis Boscarioli, Introdução à Mineração de Dados com Aplicações em R, Elsevier (Coleção SBC), 2016. (In Portuguese)
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth and Ramasamy Uthurusamy, Advances in Knowledge Discovery and Data Mining, MIT Press, 1996.
- Dorian Pyle, Data Preparation for Data Mining, Academic Press, 1999.
- Ian H. Witten and Eibe Frank, Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann Publishers, 2000.
- Manu Konchady, Text Mining Application Programming, Charles River Media, 2006.
- Alex Berson and Stephen J. Smith, Data Warehousing, Data Mining and OLAP, McGraw-Hill, 1997.
- J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
- Carl G. Looney, Pattern Recognition Using Neural Networks, Oxford University Press, 1997.
- M. Tim Jones, AI Application Programming, Second Edition, Charles River Media, 2005.
- Eric Backer, Computer-Assisted Reasoning in Cluster Analysis, Prentice-Hall, 1995.
- James C. Bezdek and Sankar K. Pal, Fuzzy Models for Pattern Recognition, IEEE Press, 1992.
- Zheru Chi, Hong Yan and Tuan Pham, Fuzzy Algorithms with Applications to Image Processing and Pattern Recognition, World Scientific Publishing, 1996.
- Mika Sato, Yoshiharu Sato and Lakhmi C. Jain, Fuzzy Clustering Models and Applications, Physica-Verlag Heidelberg, 1997.
- Nello Cristianini and John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2003.
- Eric Bonabeau, Marco Dorigo and Guy Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.
- David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
- Edward R. Tufte, Envisioning Information, Graphics Press, 1990.
- Edward R. Tufte, Visual Explanations: Images and Quantities, Evidence and Narrative, Graphics Press, 1997.
- Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, 2001.
- Edward R. Tufte, Beautiful Evidence, Graphics Press, 2006.
- B. Tso and P. M. Mather, Classification Methods for Remotely Sensed Data, Taylor and Francis, London, 2000.
- Luiz Alexandre Peternelli, Marcio Pupin Mello, Conhecendo o R - Uma visão estatística, Série Didática - Editora UFV, 2012. (In Portuguese)
- Data Mining and Knowledge Discovery
- Knowledge and Information Systems
- IEEE Transactions on Knowledge and Data Engineering
- ACM Transactions on Knowledge Discovery from Data
- Artificial Intelligence Review
- Progress in Artificial Intelligence
- Machine Learning
- Pattern Recognition and Image Analysis
- Pattern Analysis & Applications
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IEEE Computational Intelligence Magazine
- IEEE Intelligent Systems
- ACM Transactions on Intelligent Systems and Technology
- ACM Transactions on Interactive Intelligent Systems
- IEEE Transactions on Neural Networks
- Neural Computing & Applications
- Natural Computing
- IEEE Transactions on Evolutionary Computation
- ACM Transactions on Database Systems
- ACM Transactions on Graphics
- IEEE Transactions on Visualization and Computer Graphics
- Computing in Science & Engineering
- Social Network Analysis and Mining
- World Wide Web
- Journal of Intelligent Information Systems
- IEEE Transactions on Services Computing
- IEEE Transactions on Software Engineering
- IEEE Transactions on Image Processing
- IEEE Transactions on Fuzzy Systems
- IEEE Transactions on Computational Intelligence and AI in Games
- IEEE Internet Computing Magazine
- IEEE Transactions on Computers
- ACM Journal of Data and Information Quality
- ACM Transactions on Information Systems
- ACM Transactions on Sensor Networks
- ACM Transactions on the Web
- Journal of the Brazilian Computer Society
- IEEE Software
- Journal of the ACM
Contact the professor if you can't find a specific article!