Incremental web-site boundary detection using random walks

A Alshukri, F Coenen, M Zito - Machine Learning and Data Mining in …, 2011 - Springer

Machine Learning and Data Mining in Pattern Recognition: 7th International …, 2011•Springer

The paper describes variations of the classical k-means clustering algorithm that can be
used effectively to address the so-called Web-site Boundary Detection (WBD) problem. The
suggested advantages of these techniques are that they can quickly identify most of the
pages belonging to a web-site and, in the long run, return a solution whose accuracy is
comparable to (if not better than) that of other clustering methods. We analyze our
techniques on artificial clones of the web generated using a well-known preferential
attachment method.
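
The abstract does not say which preferential attachment method was used to build the artificial web clones. As a rough illustration only, the sketch below assumes a Barabási–Albert-style growth process, in which each new page links to existing pages with probability proportional to their current degree. The function name `preferential_attachment_graph` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def preferential_attachment_graph(n_pages, links_per_page, seed=0):
    """Return an edge list for a graph grown by preferential attachment.

    Assumption: Barabasi-Albert-style growth; the paper's actual
    generator may differ.
    """
    if not 1 <= links_per_page < n_pages:
        raise ValueError("need 1 <= links_per_page < n_pages")
    rng = random.Random(seed)

    # Start from a small fully connected core of links_per_page + 1 pages.
    core = links_per_page + 1
    edges = [(u, v) for u in range(core) for v in range(u + 1, core)]

    # Each page appears in `endpoints` once per incident edge, so a uniform
    # draw from this list picks a page with probability proportional to
    # its degree.
    endpoints = [page for edge in edges for page in edge]

    for new_page in range(core, n_pages):
        targets = set()
        while len(targets) < links_per_page:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new_page, target))
            endpoints.extend((new_page, target))
    return edges


if __name__ == "__main__":
    graph = preferential_attachment_graph(n_pages=50, links_per_page=2, seed=1)
    print(len(graph), "links generated")
```

A graph produced this way could serve as a synthetic test bed on which a clustering method is asked to recover a chosen subset of pages, although the paper's own experimental setup is not described in the abstract.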
