1. Consider the training examples shown in Table 3.5 on page 185 of the second edition of the textbook.
   (a) Compute the Gini index for the overall collection of training examples.
   (b) Compute the Gini index for the Customer ID attribute.
   (c) Compute the Gini index for the Gender attribute.
   (d) Compute the Gini index for the Car Type attribute.
   (e) Compute the Gini index for the Shirt Size attribute.
   (f) Which attribute is better: Gender, Car Type, or Shirt Size?
   (g) Explain why Customer ID should not be used as the attribute test condition even though it has the lowest Gini index.
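   For reference, the standard definitions of the Gini index of a node t with class proportions p(i | t), and of the weighted Gini index of a split of n records into partitions v_1, ..., v_k of sizes n_1, ..., n_k, are

       \mathrm{Gini}(t) = 1 - \sum_{i} p(i \mid t)^{2},
       \qquad
       \mathrm{Gini}_{\mathrm{split}} = \sum_{j=1}^{k} \frac{n_j}{n}\, \mathrm{Gini}(v_j).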
2. Repeat exercise (1) using entropy instead of the Gini index.
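   Similarly, the entropy of a node t and the weighted entropy of a split are

       \mathrm{Entropy}(t) = -\sum_{i} p(i \mid t) \log_{2} p(i \mid t),
       \qquad
       \mathrm{Entropy}_{\mathrm{split}} = \sum_{j=1}^{k} \frac{n_j}{n}\, \mathrm{Entropy}(v_j),

   with the convention 0 \log_2 0 = 0.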
3. Use the outline of code we discussed in class to create a decision tree for the IrisDataSet that predicts the Type column from the other attributes. Create three versions of this tree: one using entropy, one using the Gini index, and one using the classification error as the splitting criterion. Use the first half of the data set as the training data and the second half as the test data. Provide the error rate for each tree.
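   Since the code outline from class is not reproduced here, the following is only a minimal sketch of one way to structure the solution. It assumes the data sits in a file named IrisDataSet.csv with a Type column (the file name and column layout are assumptions), and the impurity functions, tree builder, and helper names are illustrative rather than the course's code.

       import math
       from collections import Counter
       import pandas as pd

       def entropy(labels):
           n = len(labels)
           return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

       def gini(labels):
           n = len(labels)
           return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

       def classification_error(labels):
           n = len(labels)
           return 1.0 - max(Counter(labels).values()) / n

       def best_split(X, y, impurity):
           # Find the (feature, threshold) pair that minimizes the weighted impurity.
           best, best_score = None, impurity(y)
           for col in X.columns:
               for t in X[col].unique():
                   left, right = y[X[col] <= t], y[X[col] > t]
                   if len(left) == 0 or len(right) == 0:
                       continue
                   score = (len(left) * impurity(left) + len(right) * impurity(right)) / len(y)
                   if score < best_score:
                       best, best_score = (col, t), score
           return best

       def build_tree(X, y, impurity, depth=0, max_depth=5):
           # Leaf: pure node, depth limit reached, or no impurity-reducing split found.
           if len(set(y)) == 1 or depth == max_depth:
               return Counter(y).most_common(1)[0][0]
           split = best_split(X, y, impurity)
           if split is None:
               return Counter(y).most_common(1)[0][0]
           col, t = split
           mask = X[col] <= t
           return (col, t,
                   build_tree(X[mask], y[mask], impurity, depth + 1, max_depth),
                   build_tree(X[~mask], y[~mask], impurity, depth + 1, max_depth))

       def predict(tree, row):
           # Walk down the tree until a leaf (a class label) is reached.
           while isinstance(tree, tuple):
               col, t, left, right = tree
               tree = left if row[col] <= t else right
           return tree

       data = pd.read_csv("IrisDataSet.csv")             # assumed file name
       half = len(data) // 2
       train, test = data.iloc[:half], data.iloc[half:]  # first half train, second half test
       X_tr, y_tr = train.drop(columns="Type"), train["Type"]
       X_te, y_te = test.drop(columns="Type"), test["Type"]

       for name, crit in [("entropy", entropy), ("gini", gini), ("class. error", classification_error)]:
           tree = build_tree(X_tr, y_tr, crit)
           errors = sum(predict(tree, row) != label for (_, row), label in zip(X_te.iterrows(), y_te))
           print(f"{name}: error rate = {errors / len(y_te):.3f}")

   The same build_tree function is reused for all three trees; the only thing that changes between the three versions is the impurity function passed in as the splitting criterion, which is the point of the exercise.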