CAP359 - Principles and Applications of Data Mining
These are the lecture notes and additional material for the course
CAP359 - Principles and Applications of Data Mining,
part of the Graduate Program in Applied Computing offered by the
Brazilian National Institute for Space Research.
This course is offered in the third term of each year.
In this course students will learn algorithms, techniques and applications of Data Mining through practical examples. Students must complete the assigned exercises and present a complete project that applies the concepts and algorithms to data related to their research field.
Course material and additional notes are in English. Lectures may be presented in Portuguese. Notes are frequently updated!
Course Schedule for 2018 (Third Term)
Lectures will be held on Mondays, 8:30 AM to noon, in CTE's meeting room (second floor).
|September 24th||Introduction to the course, discussion on the projects.|
|October 1st||From Data Science to Data Mining -- what do we need? Tidy Data and how to get it.|
|October 8th||No lecture this day. Work on the project proposals! Please send me a one-page report about the problem you will try to solve and the data that is available for it.|
|October 15th||Classification Algorithms|
|October 22nd||Data Mining with Orange -- José Alberto Sá|
|October 29th||Clustering Algorithms|
|November 19th||No lecture this day. Work on the project proposals! From this week on we will work on the projects. We may have two or more meetings during the next weeks, at any time (within reason); e-mail me to schedule.|
|November 26th||Meetings about the projects, at any time (within reason); e-mail me to schedule!|
|December 3rd||Meetings about the projects, at any time (within reason); e-mail me to schedule!|
|December 10th||Results and Discussions|
An R package used in the lectures: cap359r
Material used by José Roberto Motta Garcia for a Data Science in R talk (warning: 230MB file!)
- Pang-Ning Tan, Michael Steinbach and Vipin Kumar, Introduction to Data Mining, Pearson - Addison-Wesley, 2006.
- Leandro Augusto da Silva, Sarajane Marques Peres, Clodis Boscarioli, Introdução à Mineração de Dados com Aplicações em R, Elsevier (Coleção SBC), 2016. (In Portuguese)
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth and Ramasamy Uthurusamy, Advances in Knowledge Discovery and Data Mining, MIT Press, 1996.
- Dorian Pyle, Data Preparation for Data Mining, Academic Press, 1999.
- Ian H. Witten and Eibe Frank, Data Mining - Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann Publishers, 2000.
- Manu Konchady, Text Mining Application Programming, Charles River Media, 2006.
- Alex Berson and Stephen J. Smith, Data Warehousing, Data Mining and OLAP, McGraw-Hill, 1997.
- J. Ross Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
- Carl G. Looney, Pattern Recognition Using Neural Networks, Oxford University Press, 1997.
- M. Tim Jones, AI Application Programming, Second Edition, Charles River Media, 2005.
- Eric Backer, Computer-Assisted Reasoning in Cluster Analysis, Prentice-Hall, 1995.
- James C. Bezdek and Sankar K. Pal, Fuzzy Models for Pattern Recognition, IEEE Press, 1992.
- Zheru Chi, Hong Yan and Tuan Pham, Fuzzy Algorithms with Applications to Image Processing and Pattern Recognition, World Scientific Publishing, 1996.
- Mika Sato, Yoshiharu Sato and Lakhmi C. Jain, Fuzzy Clustering Models and Applications, Physica-Verlag Heidelberg, 1997.
- Nello Cristianini and John Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, 2003.
- Eric Bonabeau, Marco Dorigo and Guy Theraulaz, Swarm Intelligence: From Natural to Artificial Systems, Oxford University Press, 1999.
- David E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Addison-Wesley, 1989.
- Edward R. Tufte, Envisioning Information, Graphics Press, 1990.
- Edward R. Tufte, Visual Explanations -- Images and Quantities, Evidence and Narrative, Graphics Press, 1997.
- Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, 2001.
- Edward R. Tufte, Beautiful Evidence, Graphics Press, 2006.
- B. Tso and P. M. Mather, Classification Methods for Remotely Sensed Data, Taylor and Francis, London, 2000.
- Luiz Alexandre Peternelli, Marcio Pupin Mello, Conhecendo o R - Uma visão estatística, Série Didática - Editora UFV, 2012. (In Portuguese)
- Data Mining and Knowledge Discovery
- Knowledge and Information Systems
- IEEE Transactions on Knowledge and Data Engineering
- ACM Transactions on Knowledge Discovery from Data
- Artificial Intelligence Review
- Progress in Artificial Intelligence
- Machine Learning
- Pattern Recognition and Image Analysis
- Pattern Analysis & Applications
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- IEEE Computational Intelligence Magazine
- IEEE Intelligent Systems
- ACM Transactions on Intelligent Systems and Technology
- ACM Transactions on Interactive Intelligent Systems
- IEEE Transactions on Neural Networks
- Neural Computing & Applications
- Natural Computing
- IEEE Transactions on Evolutionary Computation
- ACM Transactions on Database Systems
- ACM Transactions on Graphics
- IEEE Transactions on Visualization and Computer Graphics
- Computing in Science & Engineering
- Social Network Analysis and Mining
- World Wide Web
- Journal of Intelligent Information Systems
- IEEE Transactions on Services Computing
- IEEE Transactions on Software Engineering
- IEEE Transactions on Image Processing
- IEEE Transactions on Fuzzy Systems
- IEEE Transactions on Computational Intelligence and AI in Games
- IEEE Internet Computing Magazine
- IEEE Transactions on Computers
- ACM Journal of Data and Information Quality
- ACM Transactions on Information Systems
- ACM Transactions on Sensor Networks
- ACM Transactions on the Web
- Journal of the Brazilian Computer Society
- IEEE Software
- Journal of the ACM
Contact the professor if you can't find a specific article!