Introduction

Resource-efficient decision tree-based ensemble classifiers with reduced memory footprint, low training time, and low classification latency.

Summary

This project offers a framework that augments standard decision-tree-based ensemble classifiers to reduce memory footprint, training time, and classification latency. The key idea is a two-step strategy: first, train a small model that is sufficient to classify the majority of queries correctly; second, identify the specific subsets of the training data on which the small model is at high risk of making a classification mistake, and train secondary expert models for these fewer, harder cases (a minimal sketch of the idea follows the list below).
The project includes two different classifiers:
(1) RADE: Resource-efficient classifier for supervised anomaly detection.
(2) Duet: Resource-efficient multiclass classifier.
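For intuition, here is a minimal sketch of the two-step strategy using plain scikit-learn building blocks. The function names, the confidence-threshold heuristic, and the use of a single expert model are illustrative assumptions; they do not reproduce the actual RADE or Duet algorithms (see the papers and repositories below for those).

    # Illustrative sketch only: a small coarse model plus one expert model for
    # low-confidence queries. This is NOT the actual RADE/Duet implementation.
    from sklearn.ensemble import RandomForestClassifier

    def train_two_step(X, y, threshold=0.9):
        # Step 1: a small model that is enough for the majority of queries.
        small = RandomForestClassifier(n_estimators=10, max_depth=4, random_state=0)
        small.fit(X, y)

        # Find the training samples the small model is unsure about
        # (low predicted-class probability) -- the harder cases.
        hard = small.predict_proba(X).max(axis=1) < threshold

        # Step 2: train a larger expert model only on the harder subset
        # (assumes at least one hard sample exists).
        expert = RandomForestClassifier(n_estimators=100, random_state=0)
        expert.fit(X[hard], y[hard])
        return small, expert

    def predict_two_step(small, expert, X, threshold=0.9):
        proba = small.predict_proba(X)
        pred = small.classes_[proba.argmax(axis=1)]
        # Route only the low-confidence queries to the expert model.
        unsure = proba.max(axis=1) < threshold
        if unsure.any():
            pred[unsure] = expert.predict(X[unsure])
        return pred

The point of the split is that the small model answers most queries cheaply, so the more expensive expert model is consulted only for the minority of harder cases.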

Details

Code:
Both RADE and Duet are implemented as scikit-learn compatible classifiers.
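Because they follow the scikit-learn estimator interface, they can be used as drop-in replacements in standard pipelines. The usage sketch below assumes a class named RadeClassifier with a default constructor; the real import path, class name, and constructor parameters are documented in the repositories linked below.

    # Hypothetical usage; the import path and class name RadeClassifier are
    # assumptions -- see the GitHub repositories for the actual interface.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score
    from rade import RadeClassifier  # assumed import path and class name

    X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = RadeClassifier()            # scikit-learn-style estimator
    clf.fit(X_train, y_train)         # standard fit ...
    y_pred = clf.predict(X_test)      # ... and predict
    print(accuracy_score(y_test, y_pred))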

RADE GitLab repository (Available for VMware employees only)
RADE Fling
RADE scikit classifier (v1.0) in GitHub

Duet GitLab repository (Available for VMware employees only)
Duet scikit classifier (v1.0) in GitHub

Papers:
RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods (Springer ML)
Efficient Multiclass Classification with Duet (EuroMLSys '22)

Category

  • Graduated VMW Software Systems Projects

Research Areas

  • Machine Learning