Decision Tree Analysis

Author: OriginLab Technical Support
Date Added: 10/30/2024
Last Update: 12/11/2024
Downloads (90 Days): 335
Total Ratings: 8
File Size: 674 KB
Average Rating:
File Name: Decision_T...is.opx
File Version: 1.00
Minimum Versions:
License:
Type: App
Summary:

Fit a binary decision tree for classification.

Description:

PURPOSE
This app fits a binary decision tree for classification.

INSTALLATION
Download the file Decision_Tree_Analysis.opx, then drag and drop it onto the Origin workspace. An icon will appear in the Apps Gallery window.

OPERATION
Activate the worksheet that contains the input data, then click the Decision Tree Analysis icon in the Apps Gallery window. A dialog will open. Dialog settings include:

Setting: Description
Response: Specify the data for the response variable. The response must be categorical, with a finite number of categories, and its values can be text or numeric.
Binary or Multinomial: Options include:
  • Binary Response
    When this option is selected, the response should contain only two categories.
  • Multinomial Response
    When this option is selected, the response should contain more than two categories.
Response Event: Specify the event for the response variable. This setting is available only for a binary response. The options are the categories found in the actual response variable.
Continuous Predictors: Specify the continuous variables that may explain or predict changes in the response. The values of continuous predictors must be numeric.
Categorical Predictors: Specify the categorical variables that may explain or predict changes in the response. The values of categorical predictors can be text or numeric.
Prior Probabilities: Specify how to calculate the prior probability of each response category. Options include:
  • Equal Probability
    Each response category has the same prior probability.
  • Computed From Sample Frequencies
    The prior probabilities are computed from the sample proportions.
  • Specified
    The prior probabilities are specified by the user.
Prior Probabilities (Separated by Space): Available when Prior Probabilities is set to Specified. Specify the prior probabilities for all response categories. The values must sum to 1.
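The three prior-probability options can be illustrated with a short Python sketch. It is not the app's implementation: the function name, the method keywords, and the rule that specified values are matched to categories in sorted order are all assumptions made for illustration.

```python
from collections import Counter

def prior_probabilities(responses, method="frequencies", specified=None):
    """Return {category: prior} for the response categories.

    The method names "equal", "frequencies", and "specified" are
    hypothetical stand-ins for the three dialog options.
    """
    categories = sorted(set(responses))
    if method == "equal":
        return {c: 1.0 / len(categories) for c in categories}
    if method == "frequencies":
        counts = Counter(responses)
        return {c: counts[c] / len(responses) for c in categories}
    if method == "specified":
        # Assumption: a space-separated string with one value per
        # category, matched to the categories in sorted order.
        values = [float(v) for v in specified.split()]
        if len(values) != len(categories):
            raise ValueError("need one prior per response category")
        if abs(sum(values) - 1.0) > 1e-9:
            raise ValueError("priors must sum to 1")
        return dict(zip(categories, values))
    raise ValueError(f"unknown method: {method}")

y = ["yes", "no", "no", "no"]
print(prior_probabilities(y, "equal"))
print(prior_probabilities(y, "frequencies"))
print(prior_probabilities(y, "specified", "0.3 0.7"))
```

With this sample, the frequency-based priors are 0.75 for "no" and 0.25 for "yes", while the equal option assigns 0.5 to each.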
Misclassification Costs (Separated by Space): Specify the misclassification costs for the response categories. The costs form a square matrix in which both dimensions equal the number of response categories, and the diagonal entries are zero.
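As a rough illustration of how a cost matrix affects classification, the Python sketch below parses a space-separated list into a square matrix and computes the expected cost of each possible prediction at a node. The row-by-row ordering and the convention that entry [i][j] is the cost of classifying true category i as category j are assumptions, not documented behavior of the app.

```python
def parse_cost_matrix(text, n_categories):
    """Parse space-separated costs into an n x n matrix (list of rows).

    Assumption: values are listed row by row, and entry [i][j] is the
    cost of classifying a sample of true category i as category j.
    """
    values = [float(v) for v in text.split()]
    if len(values) != n_categories ** 2:
        raise ValueError("expected n_categories * n_categories values")
    matrix = [values[i * n_categories:(i + 1) * n_categories]
              for i in range(n_categories)]
    if any(matrix[i][i] != 0 for i in range(n_categories)):
        raise ValueError("diagonal entries must be zero")
    return matrix

def expected_cost(class_probs, cost, predicted):
    """Expected cost of predicting `predicted` given class probabilities."""
    return sum(p * cost[i][predicted] for i, p in enumerate(class_probs))

cost = parse_cost_matrix("0 1 5 0", 2)
node_probs = [0.7, 0.3]  # hypothetical class probabilities at one node
for j in range(2):
    print(j, expected_cost(node_probs, cost, predicted=j))
```

In this example, missing category 1 costs five times as much as missing category 0, so predicting category 1 has the lower expected cost even though category 0 is more probable at the node.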
Validation Method: Specify the validation method used to test the model. Options include:
  • K-Fold Validation
    Validate the model using K-fold cross-validation.
  • Test Set
    Split the data set into two parts, one for training and the other for testing.
  • None
    No validation is performed.
Number of Fold (K): Available when Validation Method is K-Fold Validation. Specify the number of folds.
Fraction of Rows as Test Set: Available when Validation Method is Test Set. Specify the fraction of rows to use as test data.
Random Number Seed: For K-Fold Validation or Test Set, rows are randomly assigned to each fold or to the test data. This is the seed for the random number generator.
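The role of the seed in both validation methods can be sketched in Python as follows. This is a generic illustration using the standard `random` module, with hypothetical function names. The app's own random number generator and assignment scheme are not documented here and will likely differ.

```python
import random

def split_test_set(n_rows, test_fraction, seed):
    """Return (train_indices, test_indices) from a seeded random split."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return sorted(indices[n_test:]), sorted(indices[:n_test])

def k_fold_indices(n_rows, k, seed):
    """Return k folds; each fold is a sorted list of row indices."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    # Deal shuffled rows round-robin so fold sizes differ by at most one.
    return [sorted(indices[i::k]) for i in range(k)]

train, test = split_test_set(10, 0.3, seed=1)
print(len(train), len(test))

# Each fold serves once as test data; the remaining folds form the training data.
for fold in k_fold_indices(10, k=5, seed=1):
    print(fold)
```

Reusing the same seed reproduces the same split, which is what makes validation results repeatable between runs.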
Method to Split Node: Specify the splitting method used to build the decision tree. Options include:
  • Gini
  • Entropy
  • Twoing
    This method is available when the response is multinomial.
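For reference, the three splitting criteria can be written out using their textbook CART definitions. The Python sketch below assumes unweighted samples, equal costs, and non-empty child nodes. It ignores priors and weights, so it should be read as an illustration of the formulas and not as the app's computation.

```python
import math
from collections import Counter

def class_proportions(labels, categories):
    """Proportion of each category (in the given order) among labels."""
    counts = Counter(labels)
    return [counts[c] / len(labels) for c in categories]

def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    props = class_proportions(labels, sorted(set(labels)))
    return 1.0 - sum(p * p for p in props)

def entropy(labels):
    """Entropy in bits: -sum of p * log2(p) over the classes present."""
    props = class_proportions(labels, sorted(set(labels)))
    return -sum(p * math.log2(p) for p in props)

def impurity_decrease(left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the children."""
    parent = left + right
    w_left, w_right = len(left) / len(parent), len(right) / len(parent)
    return impurity(parent) - w_left * impurity(left) - w_right * impurity(right)

def twoing(left, right):
    """Twoing value: (pL * pR / 4) * (sum_j |p(j|L) - p(j|R)|) ** 2."""
    categories = sorted(set(left) | set(right))
    n = len(left) + len(right)
    p_left, p_right = len(left) / n, len(right) / n
    diff = sum(abs(a - b) for a, b in zip(class_proportions(left, categories),
                                          class_proportions(right, categories)))
    return p_left * p_right / 4.0 * diff ** 2

# A hypothetical candidate split of six samples from three categories.
left, right = ["a", "a", "b"], ["b", "b", "c"]
print(impurity_decrease(left, right, gini))
print(impurity_decrease(left, right, entropy))
print(twoing(left, right))
```

For Gini and entropy, the candidate split with the largest impurity decrease is preferred. For twoing, the split with the largest twoing value is preferred.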
Optimal Tree Selection Criterion: Select the criterion for choosing the optimal tree. Options include:
  • Minimum Misclassification Cost
    The optimal tree is the one with the minimum misclassification cost.
  • Within K Standard Errors of Minimum Misclassification Cost
    The optimal tree is the one with the fewest leaf nodes among the trees whose misclassification cost is within K standard errors of the minimum misclassification cost.
K =: Specify K, the number of standard errors.
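The two selection criteria can be compared with a small Python sketch. It assumes each candidate tree is summarized as a tuple of leaf count, misclassification cost, and standard error, and that the standard error of the minimum-cost tree defines the threshold, as in the usual CART "one standard error" rule. Both the data layout and that choice are assumptions. The numbers are made up for illustration.

```python
def select_optimal_tree(candidates, k=0.0):
    """Pick a tree from (n_leaves, cost, std_error) tuples.

    With k = 0 this returns the minimum-cost tree. With k > 0 it returns
    the tree with the fewest leaves whose cost is within k standard
    errors of the minimum cost.
    """
    best = min(candidates, key=lambda t: t[1])
    # Assumption: the standard error of the minimum-cost tree is used.
    threshold = best[1] + k * best[2]
    eligible = [t for t in candidates if t[1] <= threshold]
    return min(eligible, key=lambda t: (t[0], t[1]))

# Hypothetical candidate trees: (leaf nodes, cost, standard error).
candidates = [(2, 0.30, 0.03), (4, 0.215, 0.02), (7, 0.20, 0.02), (12, 0.21, 0.02)]
print(select_optimal_tree(candidates, k=0))
print(select_optimal_tree(candidates, k=1))
```

Here the minimum-cost criterion picks the 7-leaf tree, while K = 1 picks the simpler 4-leaf tree because its cost of 0.215 falls within one standard error (0.02) of the minimum cost of 0.20.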
Number of Surrogate Splits: Specify how many surrogate splits to search for when a predictor has missing values.
Minimum Samples to Split Branch Node: One of the conditions that determine whether a branch node is split. If a node contains fewer samples than this value, it becomes a leaf node.
Minimum Samples Allowed for Leaf Node: Specify the minimum number of samples that a leaf node can contain. If splitting a branch node would produce a leaf node with fewer samples than this value, the split is not allowed.
Maximum Tree Depth (Root Node is 1): Specify the maximum depth of the tree. The root node has a depth of 1.
Maximum Number of Leaf Nodes: Specify the maximum number of leaf nodes allowed in the tree.
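The four growth limits above can be combined into two checks, sketched here in Python. The function and parameter names are hypothetical, and the order in which the app applies these conditions is an assumption.

```python
def can_split_node(n_samples, depth, n_leaves, min_split, max_depth, max_leaves):
    """Decide whether a node may be considered for splitting.

    depth counts the root node as 1; n_leaves is the current number of
    leaf nodes in the tree.
    """
    if n_samples < min_split:
        return False  # too few samples: the node becomes a leaf
    if depth >= max_depth:
        return False  # children would exceed the maximum depth
    if n_leaves + 1 > max_leaves:
        return False  # a split turns one leaf into two (net +1 leaf)
    return True

def split_allowed(n_left, n_right, min_leaf):
    """Reject a split that would create a child smaller than min_leaf."""
    return n_left >= min_leaf and n_right >= min_leaf

# Hypothetical limits: split nodes with >= 10 samples, leaves with >= 3.
print(can_split_node(25, depth=2, n_leaves=3, min_split=10, max_depth=5, max_leaves=8))
print(split_allowed(23, 2, min_leaf=3))
```

In this example the node is eligible for splitting, but a candidate split that leaves only 2 samples on one side is rejected.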
Weights: Specify a weight for each response sample. If not specified, every response sample has a weight of 1.
Predict: Use this tab to specify the data for prediction. The continuous and categorical predictors must have the same structure as the continuous and categorical predictors used for training.
Output: Specify where to output the report table and result data.

Reviews and Comments:
11/29/2024 abdullaham: excellent

11/28/2024 abdullaham: EXCELLENT