Random Forests Can Hash

Qiu, Qiang; Sapiro, Guillermo; Bronstein, Alex

arXiv:1412.5083 (cs)

[Submitted on 16 Dec 2014 (v1), last revised 17 Apr 2015 (this version, v3)]

Abstract: Hash codes are a very efficient data representation, needed to cope with ever-growing amounts of data. We introduce a random forest semantic hashing scheme with information-theoretic code aggregation, showing for the first time how random forests, a technique that together with deep learning has shown spectacular results in classification, can also be extended to large-scale retrieval. A traditional random forest fails to enforce the consistency of the hashes that each tree generates for data of the same class, i.e., to preserve the underlying similarity, and it also lacks a principled way to aggregate codes across trees. We start with a simple hashing scheme in which independently trained random trees in a forest act as hashing functions. We then propose a subspace model as the splitting function and show that it enforces hash consistency within a tree for data from the same class. We also introduce an information-theoretic approach for aggregating the codes of individual trees into a single hash code, producing a near-optimal unique hash for each class. Experiments on large-scale public datasets show that the proposed approach significantly outperforms state-of-the-art hashing methods on retrieval tasks.
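To make the "simple hashing scheme" mentioned in the abstract concrete, the following is a minimal sketch of trees acting as hash functions. It rests on assumptions that are not taken from the paper: the trees here are untrained and use random axis-aligned threshold splits (not the subspace splitting function), each tree's code is simply its sequence of left/right decisions, and codes are concatenated across trees (not aggregated information-theoretically). All names and constants (`build_random_tree`, `tree_hash`, `forest_hash`, `DEPTH`, `N_TREES`, `DIM`) are hypothetical.

```python
import random

DEPTH = 3     # hypothetical tree depth: each tree contributes DEPTH bits
N_TREES = 8   # hypothetical forest size
DIM = 4       # hypothetical feature dimension; features assumed to lie in [0, 1]


def build_random_tree(depth, dim, rng):
    """Return the internal nodes of a complete binary tree in heap order.

    Each node is a (feature_index, threshold) pair drawn at random.
    This is an untrained stand-in for a learned splitting function.
    """
    n_internal = 2 ** depth - 1
    return [(rng.randrange(dim), rng.uniform(0.0, 1.0)) for _ in range(n_internal)]


def tree_hash(tree, depth, x):
    """Route x from the root down `depth` levels; record each decision as a bit."""
    node = 0
    bits = []
    for _ in range(depth):
        feature, threshold = tree[node]
        bit = 1 if x[feature] > threshold else 0
        bits.append(bit)
        node = 2 * node + 1 + bit  # heap layout: left child 2n+1, right child 2n+2
    return bits


def forest_hash(forest, depth, x):
    """Concatenate the per-tree codes (naive aggregation, an assumption)."""
    code = []
    for tree in forest:
        code.extend(tree_hash(tree, depth, x))
    return code


def hamming(a, b):
    """Number of positions where two equal-length codes differ."""
    return sum(u != v for u, v in zip(a, b))


rng = random.Random(0)
forest = [build_random_tree(DEPTH, DIM, rng) for _ in range(N_TREES)]

x = [0.1, 0.9, 0.4, 0.7]
z = [0.9, 0.1, 0.6, 0.2]
code_x = forest_hash(forest, DEPTH, x)
code_z = forest_hash(forest, DEPTH, z)
print(len(code_x), hamming(code_x, code_z))
```

Nothing in this sketch encourages two samples of the same class to receive the same code in a given tree, which is the consistency problem the abstract attributes to traditional random forests.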

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1412.5083 [cs.CV]
	(or arXiv:1412.5083v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1412.5083

Submission history

From: Qiang Qiu
[v1] Tue, 16 Dec 2014 17:02:18 UTC (669 KB)
[v2] Tue, 24 Feb 2015 18:26:12 UTC (697 KB)
[v3] Fri, 17 Apr 2015 01:00:24 UTC (753 KB)
