
#SC-04-08 More Data Mining with Weka

  • Categories: Science
  • Duration: 18h
  • Total Enrolled: 4
  • Last Update: September 27, 2021

About Course

The University of Waikato

Description

Learn how to process, analyse, and model large data sets

On this course, led by the University of Waikato where Weka originated, you’ll be introduced to advanced data mining techniques and skills.

Following on from the first Data Mining with Weka course, you’ll learn to process a dataset with 10 million instances and mine a 250,000-word text dataset.

You’ll analyse a supermarket dataset representing 5000 shopping baskets and learn about filters for preprocessing data, selecting attributes, classification, clustering, association rules, and cost-sensitive evaluation.

You’ll also explore learning curves and how to automatically optimize learning parameters.

What topics will you cover?

  • Running large-scale data mining experiments
  • Constructing and executing knowledge flows
  • Processing very large datasets
  • Analyzing collections of textual documents
  • Mining association rules
  • Preprocessing data using a range of filters
  • Automatic methods of attribute selection
  • Clustering data
  • Taking account of different decision costs
  • Producing learning curves
  • Optimizing learning parameters in data mining

Who will you learn with?

Ian Witten

I grew up in Ireland, studied at Cambridge, and taught computer science at the Universities of Essex in England and Calgary in Canada before moving to paradise (aka New Zealand) 25 years ago.

Who developed the course?

Waikato

Sitting among the top 3% of universities worldwide, the University of Waikato prepares students to think critically and to show initiative in their learning.

What Will I Learn?

  • Compare the performance of different mining methods on a wide range of datasets
  • Demonstrate how to set up learning tasks as a knowledge flow
  • Solve data mining problems on huge datasets
  • Apply equal-width and equal-frequency binning for discretizing numeric attributes
  • Identify the advantages of supervised vs unsupervised discretization
  • Evaluate different trade-offs between error rates in 2-class classification
  • Classify documents using various techniques
  • Debate the correspondence between decision trees and decision rules
  • Explain how association rules can be generated and used
  • Discuss techniques for representing, generating, and evaluating clusters
  • Perform attribute selection by wrapping a classifier inside a cross-validation loop
  • Describe different techniques for searching through subsets of attributes
  • Develop effective sets of attributes for text classification problems
  • Explain cost-sensitive evaluation, cost-sensitive classification, and cost-sensitive learning
  • Design and evaluate multi-layer neural networks
  • Assess the volume of training data needed for mining tasks
  • Calculate optimal parameter values for a given learning system
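
The equal-width and equal-frequency binning named in the list above can be illustrated with a short sketch. This is plain Python, not Weka's own discretization filter; the function names and the `ages` values are hypothetical and chosen only for illustration. Equal-width binning splits the value range into intervals of the same size, while equal-frequency binning puts roughly the same number of instances in each bin.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls in one bin.
        return [0] * len(values)
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]


def equal_frequency_bins(values, k):
    """Assign values to k bins holding (as nearly as possible) equal counts.

    Simplification: tied values may be split across neighbouring bins.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * k // len(values)
    return bins


# Hypothetical numeric attribute with one large outlier.
ages = [18, 19, 20, 21, 22, 23, 40, 90]
width_bins = equal_width_bins(ages, 4)
freq_bins = equal_frequency_bins(ages, 4)
print(width_bins)  # [0, 0, 0, 0, 0, 0, 1, 3]
print(freq_bins)   # [0, 0, 1, 1, 2, 2, 3, 3]
```

With the outlier present, equal-width binning crowds most values into the first bin, whereas equal-frequency binning spreads them evenly. Both are unsupervised: neither looks at the class attribute.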

Topics for this course

14 Lessons, 18h

Hello again

This practical course on more advanced data mining follows on from Data Mining with Weka. You'll become an expert Weka user, and pick up many new techniques and principles of data mining along the way.
What will you learn? (00:03:05)
About this course
Welcome! Please introduce yourself
First, install Weka
Well, are you ready for this?

What are Weka’s other interfaces for?

Each week we’ll focus on a couple of “Big Questions” relating to data mining. This is the first Big Question for this week.

Exploring the Experimenter

You can use the Experimenter to find the performance of classification algorithms on datasets, or to determine whether one classifier performs better (or runs faster) than another. In the Explorer, such things can be tedious.

Comparing classifiers

The Experimenter can be used to compare classifiers. The "null hypothesis" is that they perform the same. To show that one is better than the other, we must *reject* this hypothesis at a given level of statistical significance.
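
The idea of rejecting the null hypothesis can be sketched with a textbook paired t-test on per-fold accuracies. This is a minimal illustration in plain Python, not the Experimenter's own code, and the Experimenter's test may differ in detail. The accuracy figures and the names `acc_a`, `acc_b`, and `paired_t_statistic` are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples (same folds, two classifiers).

    Assumes the per-fold differences are not all identical; otherwise the
    standard deviation is zero and the statistic is undefined.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical accuracies (%) of two classifiers on the same 10 folds.
acc_a = [81, 79, 84, 80, 82, 83, 78, 85, 80, 82]
acc_b = [78, 77, 80, 79, 78, 81, 77, 80, 79, 78]

t = paired_t_statistic(acc_a, acc_b)

# Two-sided 5% critical value of Student's t with 9 degrees of freedom.
T_CRIT = 2.262
reject_null = abs(t) > T_CRIT
print(f"t = {t:.2f}, reject null hypothesis: {reject_null}")
```

If `reject_null` is true, the difference between the classifiers is significant at the 5% level; otherwise the data do not show that either one is better.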

The Knowledge Flow interface

The Knowledge Flow interface is an alternative to the Explorer. You can lay out filters, classifiers, and evaluators on a 2D canvas and connect them up in different ways. Data and classification models flow through the diagram!

Using the Command Line

You can do everything the Explorer does (and more) from the command line. One advantage is that you get more control over memory usage. To access the definitive source of Weka documentation you need to learn to use JavaDoc.

Can Weka process big data?

This week's second Big Question!

Working with big data

The Explorer can handle pretty big datasets, but it has limits. However, the Command Line Interface does not: it works incrementally whenever it can. Some classifiers can handle arbitrarily large datasets.
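
To show what working incrementally means, here is a minimal sketch of a classifier that learns one instance at a time and keeps only counts in memory, so the dataset itself never has to fit in RAM. It is a simple Naive Bayes for nominal attributes written in plain Python, not Weka's implementation; the class name, the `instance_stream` helper, and the weather-style rows are hypothetical.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Naive Bayes for nominal attributes, updated one instance at a time."""

    def __init__(self):
        self.class_counts = defaultdict(int)
        self.value_counts = defaultdict(int)  # (class, attr index, value) -> count
        self.values_seen = defaultdict(set)   # attr index -> distinct values

    def update(self, attributes, label):
        # Only the counts change; the instance itself is then discarded.
        self.class_counts[label] += 1
        for i, v in enumerate(attributes):
            self.value_counts[(label, i, v)] += 1
            self.values_seen[i].add(v)

    def predict(self, attributes):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)
            for i, v in enumerate(attributes):
                # Laplace smoothing so unseen values do not zero the product.
                num = self.value_counts.get((label, i, v), 0) + 1
                den = n + len(self.values_seen[i])
                score += math.log(num / den)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def instance_stream():
    """Stand-in for reading a huge file line by line (hypothetical data)."""
    yield ("sunny", "high"), "no"
    yield ("sunny", "normal"), "yes"
    yield ("rainy", "high"), "no"
    yield ("overcast", "high"), "yes"
    yield ("overcast", "normal"), "yes"
    yield ("rainy", "normal"), "yes"


model = IncrementalNaiveBayes()
for attrs, label in instance_stream():
    model.update(attrs, label)

prediction = model.predict(("overcast", "normal"))
print(prediction)  # yes
```

Memory use grows with the number of distinct attribute values and classes, not with the number of instances, which is why this style of learner can cope with arbitrarily long streams.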

Student Feedback

4.8

Total 4 Ratings

  • 5 stars: 3 ratings
  • 4 stars: 1 rating
  • 3 stars: 0 ratings
  • 2 stars: 0 ratings
  • 1 star: 0 ratings

The three parts of the course I like most are running large-scale data mining experiments, constructing and executing knowledge flows, and processing very large datasets. I think they are all very good.

In this course you analyse a supermarket dataset representing 5,000 shopping baskets and learn about filters for preprocessing data, attribute selection, classification, clustering, association rules, and cost-sensitive evaluation. It's a great course.

This course follows on from the first Data Mining with Weka course. We now get to process a dataset with 10 million instances and mine a 250,000-word text dataset. It's amazing how technology has developed.

This course, led by the University of Waikato, the birthplace of Weka, introduces advanced data mining techniques and skills.

$49

Material Includes

  • Official Certificate

Requirements

  • Before the course starts, download the free Weka software. It runs on any computer, under Windows, Linux, or Mac. It has been downloaded millions of times and is being used all around the world.

Target Audience

  • This course is aimed at anyone who deals in data professionally or is interested in furthering their professional or academic skills in data science.
  • This course follows on from Data Mining with Weka and it’s recommended that you complete that course first unless you already have a rudimentary knowledge of Weka.
  • As with the previous course, it involves no computer programming, although you need some experience with using computers for everyday tasks.
  • High school maths is more than enough; some elementary statistics concepts (means and variances) are assumed.