Debugging Machine Learning Tasks

Chakarov, Aleksandar; Nori, Aditya; Rajamani, Sriram; Sen, Shayak; Vijaykeerthy, Deepak

arXiv:1603.07292 (cs)

[Submitted on 23 Mar 2016]

Abstract: Unlike traditional programs (such as operating systems or word processors), which have large amounts of code, machine learning tasks use programs with relatively small amounts of code (written using machine learning libraries) but voluminous amounts of data. Just as developers of traditional programs debug errors in their code, developers of machine learning tasks debug and fix errors in their data. However, algorithms and tools for debugging and fixing errors in data are less common than their counterparts for detecting and fixing errors in code. In this paper, we consider classification tasks where errors in training data lead to misclassifications of test points, and propose an automated method to find the root causes of such misclassifications. Our root cause analysis is based on Pearl's theory of causation and uses Pearl's PS (Probability of Sufficiency) as a scoring metric. Our implementation, Psi, encodes the computation of PS as a probabilistic program, and uses recent work on probabilistic programs and transformations on probabilistic programs (along with gray-box models of machine learning algorithms) to compute PS efficiently. Psi is able to identify root causes of data errors in interesting data sets.
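
For readers unfamiliar with the scoring metric named in the abstract, Pearl's Probability of Sufficiency for binary events can be written as below. This is the standard textbook form, not a formula quoted from this paper. The reading of X and Y in terms of training data and test points is an assumption made here for illustration only.

```latex
% Standard definition of Pearl's Probability of Sufficiency (PS) for binary X, Y.
% Assumed reading for this setting (not taken from the paper):
%   x  = "a suspected training-data error is corrected",  x' = "it is left as is"
%   y  = "the test point is classified correctly",        y' = "it is misclassified"
\[
  \mathrm{PS} \;=\; P\bigl(Y_{X = x} = y \;\bigm|\; X = x',\; Y = y'\bigr)
\]
% Y_{X = x} is the counterfactual value Y would take had X been set to x.
```

Under that assumed reading, a high PS for a candidate set of training points would mean that, given the test point is currently misclassified, correcting those points would probably have produced the correct classification.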

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Programming Languages (cs.PL); Machine Learning (stat.ML)
ACM classes:	D.2.5; I.2.3
Cite as:	arXiv:1603.07292 [cs.LG]
	(or arXiv:1603.07292v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1603.07292

Submission history

From: Shayak Sen
[v1] Wed, 23 Mar 2016 18:30:37 UTC (902 KB)
