WINDTUNNEL: Towards Differentiable ML Pipelines Beyond a Single Model

Cited 2 times in Web of Science; cited 3 times in Scopus
Authors

Yu, Gyeong-In; Amizadeh, Saeed; Kim, Sehoon; Pagnoni, Artidoro; Zhang, Ce; Chun, Byung-Gon; Weimer, Markus; Interlandi, Matteo

Issue Date
2021-09
Publisher
Association for Computing Machinery
Citation
Proceedings of the VLDB Endowment, Vol. 15, No. 1, pp. 11-20
Abstract
While deep neural networks (DNNs) have been shown to be successful in several domains such as computer vision, non-DNN models such as linear models and gradient boosting trees are still considered state-of-the-art for tabular data. When using these models, data scientists often author machine learning (ML) pipelines: DAGs of ML operators comprising data transforms and ML models, in which each operator is trained sequentially, one at a time. Conversely, when training DNNs, the layers composing the network are trained simultaneously using backpropagation. In this paper, we argue that the training scheme of ML pipelines is sub-optimal because it optimizes a single operator at a time, thereby forgoing the opportunity for global optimization. We therefore propose WindTunnel: a system that translates a trained ML pipeline into a pipeline of neural network modules and jointly optimizes the modules using backpropagation. We also suggest translation methodologies for several non-differentiable operators, such as gradient boosting trees and categorical feature encoders. Our experiments show that fine-tuning the translated WindTunnel pipelines is a promising technique for increasing final accuracy.
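The joint fine-tuning idea from the abstract can be illustrated with a minimal PyTorch sketch: a categorical encoder is stood in for by a trainable embedding, and a boosted-tree ensemble by a sum of "soft" (sigmoid-gated) trees, so gradients flow through the entire pipeline. The module names and the soft-tree formulation below are illustrative assumptions, not WindTunnel's actual translation rules.

# A minimal sketch of end-to-end fine-tuning of a translated pipeline.
# The soft-tree surrogate and module names are assumptions for illustration.
import torch
import torch.nn as nn

class DifferentiableEncoder(nn.Module):
    """Stands in for a categorical feature encoder as a trainable
    embedding, so gradients can flow into the encoding."""
    def __init__(self, num_categories: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(num_categories, dim)

    def forward(self, cat_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(cat_ids)

class SoftTree(nn.Module):
    """A depth-1 'soft' decision stump: a sigmoid gate blends two leaf
    values, giving a differentiable surrogate for a hard split."""
    def __init__(self, in_dim: int):
        super().__init__()
        self.split = nn.Linear(in_dim, 1)          # learned split hyperplane
        self.leaves = nn.Parameter(torch.zeros(2)) # two leaf values

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = torch.sigmoid(self.split(x)).squeeze(-1)  # routing probability
        return p * self.leaves[1] + (1 - p) * self.leaves[0]

class SoftEnsemble(nn.Module):
    """Sums soft trees, mimicking an additive boosted-tree ensemble."""
    def __init__(self, in_dim: int, num_trees: int = 8):
        super().__init__()
        self.trees = nn.ModuleList([SoftTree(in_dim) for _ in range(num_trees)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.stack([t(x) for t in self.trees], dim=0).sum(dim=0)

# Joint fine-tuning: both the encoder and the ensemble receive gradients,
# unlike the original one-operator-at-a-time training scheme.
enc = DifferentiableEncoder(num_categories=10, dim=4)
model = SoftEnsemble(in_dim=4)
opt = torch.optim.Adam(list(enc.parameters()) + list(model.parameters()), lr=1e-2)

cats = torch.randint(0, 10, (32,))  # toy categorical column
y = torch.randn(32)                 # toy regression target
for _ in range(100):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(enc(cats)), y)
    loss.backward()
    opt.step()

Because both modules share one optimizer, the encoder adapts to the downstream ensemble during fine-tuning, which is precisely the cross-operator optimization that per-operator training forgoes.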
ISSN
2150-8097
URI
https://hdl.handle.net/10371/186726
DOI
https://doi.org/10.14778/3485450.3485452
Files in This Item:
There are no files associated with this item.
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
