House Quality Prediction Dashboard


Multiclass Classification Dashboard

Data classification segregates data into two or more classes. It gives a holistic view of the data by categorizing observations under multiple predefined labels. Predicting the class also lets us estimate the probability of each label. For example, multiclass classification techniques can predict the probability that a sales proposal ends up won, lost, or open.

Classification predictive modelling is the process of assigning predefined class labels to input data. To make predictions, a classifier requires a training dataset with pre-established labels.

Below is a quality class prediction dashboard that demonstrates a multiclass classification model built with a decision tree. Here we predict the quality of houses across 5 classes: poor, below average, average, good, and excellent quality.

Our model was trained on a dataset containing data for 1168 houses and then evaluated on test data to gather classification metrics such as accuracy, precision, recall, and F1-score. The built model predicts the quality of a house on the basis of the input parameter values that the user selects in the dropdowns.

The dashboard above was implemented using Tableau and Python, connected through TabPy.

Tableau's advanced analytics capability is used through calculated fields, which execute Python scripts on the TabPy server.
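As a rough illustration of how a calculated field can reach a Python model, the sketch below defines a function in the shape TabPy expects: each argument arrives as a list, and a list comes back. The feature names, the toy training rows, the server URL, and the endpoint name `predict_quality` are all assumptions made for this example, not details of the actual dashboard.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy training data: [living_area_sqft, year_built] per house.
train_rows = [[600, 1920], [900, 1950], [1500, 1975], [2200, 1995], [3500, 2008]]
train_labels = ["poor", "below average", "average", "good", "excellent"]

model = DecisionTreeClassifier(random_state=0).fit(train_rows, train_labels)


def predict_quality(living_area, year_built):
    """Return one quality label per row.

    Tableau sends each script argument (_arg1, _arg2, ...) as a list,
    so the inputs are zipped back into one feature row per house.
    """
    rows = [[area, year] for area, year in zip(living_area, year_built)]
    return [str(label) for label in model.predict(rows)]


# Local check that does not need a TabPy server.
print(predict_quality([2200], [1995]))

# Deployment sketch (needs a running TabPy server; the URL is an assumption):
# from tabpy.tabpy_tools.client import Client
# client = Client("http://localhost:9004/")
# client.deploy("predict_quality", predict_quality,
#               "Predicts the house quality class", override=True)
```

Once deployed, a Tableau calculated field along the lines of `SCRIPT_STR("return tabpy.query('predict_quality', _arg1, _arg2)['response']", [Living Area], [Year Built])` could call it, where the two bracketed names stand for hypothetical parameters bound to the dropdowns.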

Features of the Quality Class Prediction Dashboard

  • Predicted Classification  

The model is trained on the historical dataset using a decision tree classifier, whose goal is to predict the class of a categorical target variable. Here the target variable is the quality column.
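A minimal sketch of this training step, assuming scikit-learn and a synthetic stand-in dataset, could look like the following. The two features, the rule that generates the labels, and the 80/20 split are invented for illustration only; the real dashboard uses its own house data.

```python
import random

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

QUALITY_LABELS = ["poor", "below average", "average", "good", "excellent"]

# Synthetic stand-in data: [living_area_sqft, year_built] per house.
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    area = rng.randint(500, 4000)
    year = rng.randint(1900, 2010)
    # Toy rule that maps the two features to one of the 5 classes.
    score = (area - 500) / 3500 * 0.6 + (year - 1900) / 110 * 0.4
    labels.append(QUALITY_LABELS[min(int(score * 5), 4)])
    rows.append([area, year])

# Hold out part of the data for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    rows, labels, test_size=0.2, random_state=0
)

# Fit the decision tree on the training portion only.
model = DecisionTreeClassifier(random_state=0)
model.fit(X_train, y_train)

# Predict the quality class for the held-out houses and for one new house.
predicted = model.predict(X_test)
print(model.predict([[2500, 1995]]))
```

The held-out predictions in `predicted` are what the evaluation metrics would then be computed on.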

  • Feature Importance 

The independent variable with the greatest impact on quality is identified using the correlation coefficient. The variable with the highest coefficient is treated as the top variable affecting quality.
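The ranking described here can be sketched with a plain Pearson correlation. The feature names and values below are made up, quality is assumed to be encoded numerically from 1 (poor) to 5 (excellent), and ranking by absolute value (so that strong negative correlations also count) is a choice made for this example rather than something the dashboard is known to do.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical feature columns and a numeric quality encoding (1 to 5).
features = {
    "living_area": [900, 1200, 1500, 2100, 3000],
    "year_built": [1950, 1990, 1960, 2005, 1980],
}
quality = [1, 2, 3, 4, 5]

# Correlate every feature with quality and pick the strongest one.
scores = {name: pearson(values, quality) for name, values in features.items()}
top_feature = max(scores, key=lambda name: abs(scores[name]))
print(top_feature, round(scores[top_feature], 3))
```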

  • Evaluation Metrics

The evaluation metrics used in the above dashboard are as follows:

  1. Accuracy: The number of correct predictions divided by the total number of predictions. It should be as high as possible.
  2. Precision: The number of correctly identified positive results divided by the number of all results predicted as positive, including those predicted incorrectly. Since this is a multiclass classification, we take the average of the precision for each class.
  3. Recall: The number of correctly identified positive results divided by the number of all samples that should have been identified as positive. Since this is a multiclass classification, we take the average of the recall for each class.
  4. F1 Score: Combines recall and precision into a single figure. It is calculated from the overall recall and overall precision, as their harmonic mean.
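The four definitions above can be written out directly. The sketch below uses made-up labels and follows the description in this post: precision and recall are averaged over the classes, and F1 is the harmonic mean of those two averages. Note that some libraries instead average the per-class F1 scores, which can give a slightly different number.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus class-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls = [], []
    for label in labels:
        true_pos = sum(t == label and p == label for t, p in pairs)
        predicted_pos = sum(p == label for p in y_pred)
        actual_pos = sum(t == label for t in y_true)
        # A class that was never predicted (or never occurs) contributes 0.
        precisions.append(true_pos / predicted_pos if predicted_pos else 0.0)
        recalls.append(true_pos / actual_pos if actual_pos else 0.0)

    precision = sum(precisions) / len(labels)
    recall = sum(recalls) / len(labels)
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up example: one "good" house is misclassified as "average".
y_true = ["poor", "average", "good", "good", "excellent", "average"]
y_pred = ["poor", "average", "good", "average", "excellent", "average"]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```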

Classification dashboards can be valuable for predicting classes in company data on the basis of the business's previous performance.
