#SC-04-08 More Data Mining with Weka
About Course
The University of Waikato
Description
Learn how to process, analyse, and model large data sets
On this course, led by the University of Waikato where Weka originated, you’ll be introduced to advanced data mining techniques and skills.
Following on from the first Data Mining with Weka course, you’ll learn to process a dataset with 10 million instances and mine a 250,000-word text dataset.
You’ll analyse a supermarket dataset representing 5,000 shopping baskets and learn about filters for preprocessing data, selecting attributes, classification, clustering, association rules, and cost-sensitive evaluation.
You’ll also explore learning curves and how to automatically optimize learning parameters.
What topics will you cover?
- Running large-scale data mining experiments
- Constructing and executing knowledge flows
- Processing very large datasets
- Analyzing collections of textual documents
- Mining association rules
- Preprocessing data using a range of filters
- Automatic methods of attribute selection
- Clustering data
- Taking account of different decision costs
- Producing learning curves
- Optimizing learning parameters in data mining
Who will you learn with?
I grew up in Ireland, studied at Cambridge, and taught computer science at the Universities of Essex in England and Calgary in Canada before moving to paradise (aka New Zealand) 25 years ago.
Who developed the course?
Sitting among the top 3% of universities world-wide, The University of Waikato prepares students to think critically and to show initiative in their learning.
What Will I Learn?
- Compare the performance of different mining methods on a wide range of datasets
- Demonstrate how to set up learning tasks as a knowledge flow
- Solve data mining problems on huge datasets
- Apply equal-width and equal-frequency binning for discretizing numeric attributes
- Identify the advantages of supervised vs unsupervised discretization
- Evaluate different trade-offs between error rates in 2-class classification
- Classify documents using various techniques
- Debate the correspondence between decision trees and decision rules
- Explain how association rules can be generated and used
- Discuss techniques for representing, generating, and evaluating clusters
- Perform attribute selection by wrapping a classifier inside a cross-validation loop
- Describe different techniques for searching through subsets of attributes
- Develop effective sets of attributes for text classification problems
- Explain cost-sensitive evaluation, cost-sensitive classification, and cost-sensitive learning
- Design and evaluate multi-layer neural networks
- Assess the volume of training data needed for mining tasks
- Calculate optimal parameter values for a given learning system
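
To make the binning outcome in the list above concrete, here is a minimal Python sketch of equal-width and equal-frequency discretization. It is an illustration only, not Weka's own filter code; the function names `equal_width_bins` and `equal_frequency_bins` are hypothetical, and the tie handling in the equal-frequency version is a simplifying assumption.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k bins of equal width over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / k
    # The maximum value would index bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign each value to one of k bins holding roughly equal counts."""
    n = len(values)
    # Rank the values; equal values may be split across adjacent bins
    # (an assumption made here for simplicity).
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    for rank, i in enumerate(order):
        bins[i] = rank * k // n
    return bins


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100, 200]
    print(equal_width_bins(data, 3))
    print(equal_frequency_bins(data, 3))
```

On skewed data such as `[1, 2, 3, 4, 100, 200]`, equal-width binning crowds the four small values into the first bin, while equal-frequency binning places two values in each bin.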
Topics for this course
Hello again
What will you learn? (00:03:05)
About this course
Welcome! Please introduce yourself
First, install Weka
Well, are you ready for this?
What are Weka’s other interfaces for?
Exploring the Experimenter
Comparing classifiers
The Knowledge Flow interface
Using the Command Line
Can Weka process big data?
Working with big data
The three parts of the course I liked most were running large-scale data mining experiments, constructing and executing knowledge flows, and processing very large datasets. I think they are all very good.
This course analyses a supermarket dataset representing 5,000 shopping baskets and teaches filters for preprocessing data, attribute selection, classification, clustering, association rules, and cost-sensitive evaluation. It's a great course.
This course follows on from the first Data Mining with Weka course. We are now supported to process a dataset with 10 million instances and mine a 250,000-word text dataset. It's amazing how far the technology has developed.
This course, led by the University of Waikato, the birthplace of Weka, introduces advanced data mining techniques and skills.