14th KDD 2008: Las Vegas, Nevada, USA
- Ying Li, Bing Liu, Sunita Sarawagi:
Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Las Vegas, Nevada, USA, August 24-27, 2008. ACM 2008, ISBN 978-1-60558-193-4
- Benjamin Edelman, Michael Schwarz:
Internet advertising and optimal auction design. 1
- Thore Graepel, Ralf Herbrich:
Large scale data analysis and modelling in online services and advertising. 2
- Trevor Hastie, Jerome H. Friedman, Robert Tibshirani:
Regularization paths and coordinate descent. 3
- Jitendra Malik:
The future of image search. 4
- Udo Miletzki:
Genesis of postal address reading, current state and future prospects: thirty years of pattern recognition on duty of postal services. 5-6
Research papers
- Aris Anagnostopoulos, Ravi Kumar, Mohammad Mahdian:
Influence and correlation in social networks. 7-15
- Luca Becchetti, Paolo Boldi, Carlos Castillo, Aristides Gionis:
Efficient semi-streaming algorithms for local triangle counting in massive graphs. 16-24
- Indrajit Bhattacharya, Shantanu Godbole, Sachindra Joshi:
Structured entity identification and document categorization: two tasks with one joint model. 25-33
- Albert Bifet, Ricard Gavaldà:
Mining adaptively frequent closed unlabeled rooted trees in data streams. 34-42
- Mustafa Bilgic, Lise Getoor:
Effective label acquisition for collective classification. 43-51
- Francesco Bonchi, Carlos Castillo, Debora Donato, Aristides Gionis:
Topical query decomposition. 52-60
- Christos Boutsidis, Michael W. Mahoney, Petros Drineas:
Unsupervised feature selection for principal components analysis. 61-69
- Justin Brickell, Vitaly Shmatikov:
The cost of privacy: destruction of data-mining utility in anonymized data publishing. 70-78
- Deepayan Chakrabarti, Ravi Kumar, Kunal Punera:
Generating succinct titles for web URLs. 79-87
- Soumen Chakrabarti, Rajiv Khanna, Uma Sawant, Chiru Bhattacharyya:
Structured learning for non-smooth ranking losses. 88-96
- Ming-Wei Chang, Wen-tau Yih, Christopher Meek:
Partitioned logistic regression for spam filtering. 97-105
- Jianhui Chen, Shuiwang Ji, Betul Ceran, Qi Li, Mingrui Wu, Jieping Ye:
Learning subspace kernels for classification. 106-114
- WenYen Chen, Dong Zhang, Edward Y. Chang:
Combinational collaborative filtering for personalized community recommendation. 115-123
- Xue-wen Chen, Michael Wasikowski:
FAST: a ROC-based feature selection metric for small samples and imbalanced data classification problems. 124-132
- Haibin Cheng, Pang-Ning Tan:
Semi-supervised learning with data calibration for long-term time series forecasting. 133-141
- Yong Ju Cho, Naren Ramakrishnan, Yang Cao:
Reconstructing chemical reaction networks: data mining meets system identification. 142-150
- Peter Christen:
Automatic record linkage using seeded nearest neighbour and support vector machine classification. 151-159
- David J. Crandall, Dan Cosley, Daniel P. Huttenlocher, Jon M. Kleinberg, Siddharth Suri:
Feedback effects between similarity and social influence in online communities. 160-168
- Kaustav Das, Jeff G. Schneider, Daniel B. Neill:
Anomaly pattern detection in categorical datasets. 169-176
- Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong:
Bypass rates: reducing query abandonment using negative inferences. 177-185
- Anirban Dasgupta, Ravi Kumar, Amit Sasturkar:
De-duping URLs via rewrite rules. 186-194
- Jason V. Davis, Inderjit S. Dhillon:
Structured metric learning for high dimensional problems. 195-203
- Luc De Raedt, Tias Guns, Siegfried Nijssen:
Constraint programming for itemset mining. 204-212
- Charles Elkan, Keith Noto:
Learning classifiers from only positive and unlabeled data. 213-220
- Kave Eshghi, Shyamsundar Rajaram:
Locality sensitive hash functions based on concomitant rank order statistics. 221-229
- Wei Fan, Kun Zhang, Hong Cheng, Jing Gao, Xifeng Yan, Jiawei Han, Philip S. Yu, Olivier Verscheure:
Direct mining of discriminative and essential frequent patterns via model-based search tree. 230-238
- George Forman, Shyamsundar Rajaram:
Scaling up text classification for large file systems. 239-246
- Yasuhiro Fujiwara, Yasushi Sakurai, Masashi Yamamuro:
SPIRAL: efficient and exact model identification for hidden Markov models. 247-255
- Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos:
Using ghost edges for classification in sparsely labeled networks. 256-264
- Srivatsava Ranjit Ganta, Shiva Prasad Kasiviswanathan, Adam D. Smith:
Composition attacks and auxiliary information in data privacy. 265-273
- Venkatesh Ganti, Arnd Christian König, Rares Vernica:
Entity categorization over large document collections. 274-282
- Jing Gao, Wei Fan, Jing Jiang, Jiawei Han:
Knowledge transfer via multiple model local structure mapping. 283-291
- Gemma C. Garriga, Esa Junttila, Heikki Mannila:
Banded structure in binary matrices. 292-300
- Rohit Gupta, Gang Fang, Blayne Field, Michael S. Steinbach, Vipin Kumar:
Quantitative evaluation of approximate frequent pattern mining algorithms. 301-309
- Robert J. Hall, Charles Sutton, Andrew McCallum:
Unsupervised deduplication using cross-field dependencies. 310-317
- Meng Hu, Jiong Yang, Wei Su:
Permu-pattern: discovery of mutable permutation patterns with proximity constraint. 318-326
- Heng Huang, Chris H. Q. Ding, Dijun Luo, Tao Li:
Simultaneous tensor subspace selection and clustering: the equivalence of high order SVD and k-means clustering. 327-335
- Woochang Hwang, Taehyong Kim, Murali Ramanathan, Aidong Zhang:
Bridging centrality: graph mining from element level to group level. 336-344
- Saara Hyvönen, Pauli Miettinen, Evimaria Terzi:
Interpretable nonnegative matrix decompositions. 345-353
- Georgiana Ifrim, Gökhan H. Bakir, Gerhard Weikum:
Fast logistic regression for text categorization with variable-length n-grams. 354-362
- Tomoharu Iwata, Takeshi Yamada, Naonori Ueda:
Probabilistic latent semantic visualization: topic model for visualizing documents. 363-371
- David D. Jensen, Andrew S. Fast, Brian J. Taylor, Marc E. Maier:
Automatic identification of quasi-experimental designs for discovering causal knowledge. 372-380
- Shuiwang Ji, Lei Tang, Shipeng Yu, Jieping Ye:
Extracting shared subspace for multi-label classification. 381-389
- Bin Jiang, Jian Pei, Xuemin Lin, David W. Cheung, Jiawei Han:
Mining preferences from superior and inferior examples. 390-398
- Ruoming Jin, Muad Abu-Ata, Yang Xiang, Ning Ruan:
Effective and efficient itemset pattern summarization: regression-based approaches. 399-407
- S. Sathiya Keerthi, S. Sundararajan, Kai-Wei Chang, Cho-Jui Hsieh, Chih-Jen Lin:
A sequential dual method for large scale multi-class linear SVMs. 408-416
- Jerry Kiernan, Evimaria Terzi:
Constructing comprehensive summaries of large event sequences. 417-425
- Yehuda Koren:
Factorization meets the neighborhood: a multifaceted collaborative filtering model. 426-434
- Gueorgi Kossinets, Jon M. Kleinberg, Duncan J. Watts:
The structure of information pathways in a social communication network. 435-443
- Hans-Peter Kriegel, Matthias Schubert, Arthur Zimek:
Angle-based outlier detection in high-dimensional data. 444-452
- Srivatsan Laxman, Vikram Tankasali, Ryen W. White:
Stream prediction using a generative model based on frequent episodes in event sequences. 453-461
- Jure Leskovec, Lars Backstrom, Ravi Kumar, Andrew Tomkins:
Microscopic evolution of social networks. 462-470
- Lei Li, Wenjie Fu, Fan Guo, Todd C. Mowry, Christos Faloutsos:
Cut-and-stitch: efficient parallel learning of linear dynamical systems on SMPs. 471-479
- Charles X. Ling, Jun Du:
Active learning with direct query construction. 480-487
- Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu:
Spectral domain-transfer learning. 488-496
- Xu Ling, Qiaozhu Mei, ChengXiang Zhai, Bruce R. Schatz:
Mining multi-faceted overviews of arbitrary topics in a text collection. 497-505
- Aurélie C. Lozano, Naoki Abe:
Multi-class cost-sensitive boosting with p-norm loss functions. 506-514
- Omid Madani, Jian Huang:
On updates that constrain the features' connections during learning. 515-523
- Mary McGlohon, Leman Akoglu, Christos Faloutsos:
Weighted graphs and disconnected components: patterns and a generator. 524-532
- Gabriela Moise, Jörg Sander:
Finding non-redundant, statistically significant regions in high dimensional data: a novel approach to projected and subspace clustering. 533-541
- Ramesh Nallapati, Amr Ahmed, Eric P. Xing, William W. Cohen:
Joint latent topic models for text and citations. 542-550
- Nam Nguyen, Rich Caruana:
Classification with partial labels. 551-559
- Dino Pedreschi, Salvatore Ruggieri, Franco Turini:
Discrimination-aware data mining. 560-568
- Ian Porteous, David Newman, Alexander Ihler, Arthur U. Asuncion, Padhraic Smyth, Max Welling:
Fast collapsed Gibbs sampling for latent Dirichlet allocation. 569-577
- Hiroto Saigo, Nicole Krämer, Koji Tsuda:
Partial least squares regression for graph mining. 578-586
- Issei Sato, Minoru Yoshida, Hiroshi Nakagawa:
Knowledge discovery of semantic relationships between words using nonparametric Bayesian graph model. 587-595
- Mukund Seshadri, Sridhar Machiraju, Ashwin Sridharan, Jean Bolot, Christos Faloutsos, Jure Leskovec:
Mobile call graphs: beyond power-law and lognormal distributions. 596-604
- Qihong Shao, Yi Chen, Shu Tao, Xifeng Yan, Nikos Anerousis:
Efficient ticket routing by resolution sequence mining. 605-613
- Victor S. Sheng, Foster J. Provost, Panagiotis G. Ipeirotis:
Get another label? improving data quality and data mining using multiple, noisy labelers. 614-622
- Jin Shieh, Eamonn J. Keogh:
iSAX: indexing and mining terabyte sized time series. 623-631
- Ka Cheung Sia, Junghoo Cho, Yun Chi, Belle L. Tseng:
Efficient computation of personal aggregate queries on blogs. 632-640
- György J. Simon, Vipin Kumar, Zhi-Li Zhang:
Semi-supervised approach to rapid and reliable labeling of large data sets. 641-649
- Ajit Paul Singh, Geoffrey J. Gordon:
Relational learning via collective matrix factorization. 650-658
- Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums:
A Bayesian mixture model with linear regression mixing proportions. 659-667
- Liang Sun, Shuiwang Ji, Jieping Ye:
Hypergraph spectral learning for multi-label classification. 668-676
- Lei Tang, Huan Liu, Jianping Zhang, Zohreh Nazeri:
Community evolution in dynamic multi-mode networks. 677-685
- Hanghang Tong, Spiros Papadimitriou, Jimeng Sun, Philip S. Yu, Christos Faloutsos:
Colibri: fast mining of large static and dynamic graphs. 686-694
- Pedro O. S. Vaz de Melo, Virgílio A. F. Almeida, Antonio Alfredo Ferreira Loureiro:
Can complex network metrics predict the behavior of NBA teams? 695-703
- Daniel David Walker, Eric K. Ringger:
Model-based document clustering with a collapsed Gibbs sampler. 704-712
- Pu Wang, Carlotta Domeniconi:
Building semantic kernels for text classification using Wikipedia. 713-721
- Michael L. Wick, Khashayar Rohanimanesh, Karl Schultz, Andrew McCallum:
A unified approach for schema matching, coreference and canonicalization. 722-730
- Fei Wu, Raphael Hoffmann, Daniel S. Weld:
Information extraction from Wikipedia: moving down the long tail. 731-739
- Junjie Wu, Hui Xiong, Jian Chen:
SAIL: summation-based incremental learning for information-theoretic clustering. 740-748
- Shan-Hung Wu, Keng-Pei Lin, Chung-Min Chen, Ming-Syan Chen:
Asymmetric support vector machines: low false-positive learning under the user tolerance. 749-757
- Yang Xiang, Ruoming Jin, David Fuhry, Feodor F. Dragan:
Succinct summarization of transactional databases: an overlapped hyperrectangle scheme. 758-766
- Yabo Xu, Ke Wang, Ada Wai-Chee Fu, Philip S. Yu:
Anonymizing transaction databases for publication. 767-775
- Jian Yang, Ning Zhong, Yiyu Yao, Jue Wang:
Local peculiarity factor and its application in outlier detection. 776-784
- Luh Yen, Marco Saerens, Amin Mantrach, Masashi Shimbo:
A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances. 785-793
- Chun-Nam John Yu, Thorsten Joachims:
Training structural SVMs with kernels using sampled cuts. 794-802
- Lei Yu, Chris H. Q. Ding, Steven Loscalzo:
Stable feature selection via dense feature groups. 803-811
- Peng Zhang, Xingquan Zhu, Yong Shi:
Categorizing and mining concept drifting data streams. 812-820
- Xiang Zhang, Fei Zou, Wei Wang:
FastANOVA: an efficient algorithm for genome-wide association study. 821-829
- Bin Zhao, Fei Wang, Changshui Zhang:
CutS3VM: a fast semi-supervised SVM algorithm. 830-838
- Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Yung Chang:
Identifying biologically relevant genes via multiple heterogeneous data sources. 839-847
- Wenjun Zhou, Hui Xiong:
Volatile correlation computation: a checkpoint view. 848-856
Industrial papers
- Shyam Boriah, Vipin Kumar, Michael S. Steinbach, Christopher Potter, Steven A. Klooster:
Land cover change detection: a case study. 857-865
- Mohamed Bouguessa, Benoît Dumoulin, Shengrui Wang:
Identifying authoritative actors in question-answering forums: the case of Yahoo! Answers. 866-874
- Huanhuan Cao, Daxin Jiang, Jian Pei, Qi He, Zhen Liao, Enhong Chen, Hang Li:
Context-aware query suggestion by mining click-through and session data. 875-883
- Christine H. Chih, Douglas Stott Parker Jr.:
The persuasive phase of visualization. 884-892
- Richard Chow, Philippe Golle, Jessica Staddon:
Detecting privacy leaks using corpus-based association rules. 893-901
- Ying Cui, Jennifer G. Dy, Gregory C. Sharp, Brian M. Alexander, Steve B. Jiang:
Learning methods for lung tumor markerless gating in image-guided radiotherapy. 902-910
- Shantanu Godbole, Shourya Roy:
Text classification, business intelligence, and interactivity: automating C-Sat analysis for services industry. 911-919
- Robert L. Grossman, Yunhong Gu:
Data mining using high performance data clouds: experimental studies using Sector and Sphere. 920-927
- Shen-Shyang Ho, Ashit Talukder:
Automated cyclone discovery and tracking using knowledge sharing in multiple heterogeneous satellite data. 928-936
- Noam Koenigstein, Yuval Shavitt, Tomer Tankel:
Spotting out emerging artists using geo-aware analysis of P2P query strings. 937-945
- Prem Melville, Saharon Rosset, Richard D. Lawrence:
Customer targeting models using actively-selected web content. 946-953
- Fabian Mörchen, Mathäus Dejori, Dmitriy Fradkin, Julien Etienne, Bernd Wachmann, Markus Bundschus:
Anticipating annotations and emerging trends in biomedical literature. 954-962
- G. Niklas Norén, Andrew Bate, Johan Hopstadius, Kristina Star, I. Ralph Edwards:
Temporal pattern discovery for trends and transient effects: its application to patient records. 963-971
- Nish Parikh, Neel Sundaresan:
Scalable and near real-time burst detection from eCommerce queries. 972-980
- Renuka Sindhgatta:
Identifying domain expertise of developers from source code. 981-989
- Jie Tang, Jing Zhang, Limin Yao, Juanzi Li, Li Zhang, Zhong Su:
ArnetMiner: extraction and mining of academic social networks. 990-998
- Leonardo Weiss Ferreira Chaves, Erik Buchmann, Klemens Böhm:
Tagmark: reliable estimations of RFID tags for business processes. 999-1007
- Gang Wu, Brendan Kitts:
Experimental comparison of scalable online ad serving. 1008-1015
- Xintian Yang, Sitaram Asur, Srinivasan Parthasarathy, Sameep Mehta:
A visual-analytic toolkit for dynamic interaction graphs. 1016-1024
- Jieping Ye, Kewei Chen, Teresa Wu, Jing Li, Zheng Zhao, Rinkal Patel, Min Bae, Ravi Janardan, Huan Liu, Gene E. Alexander, Eric Reiman:
Heterogeneous data fusion for Alzheimer's disease study. 1025-1033
- Shipeng Yu, Glenn Fung, Rómer Rosales, Sriram Krishnan, R. Bharat Rao, Cary Dehing-Oberije, Philippe Lambin:
Privacy-preserving Cox regression for survival analysis. 1034-1042
- Sai Zeng, Prem Melville, Christian A. Lang, Ioana M. Boier-Martin, Conrad Murphy:
Using predictive analysis to improve invoice-to-cash collection. 1043-1050
- Yi Zhang, Arun C. Surendran, John C. Platt, Mukund Narasimhan:
Learning from multi-topic web documents for contextual advertisement. 1051-1059
Panel
- Ravi Kumar, Alexander Tuzhilin, Christos Faloutsos, David D. Jensen, Gueorgi Kossinets, Jure Leskovec, Andrew Tomkins:
Social networks: looking ahead. 1060
Demonstrations
- Hendrik Blockeel, Toon Calders, Élisa Fromont, Bart Goethals, Adriana Prado, Céline Robardet:
An inductive database prototype based on virtual mining views. 1061-1064
- Peter Christen:
Febrl: an open source data cleaning, deduplication and record linkage system with a graphical user interface. 1065-1068
- Luigi Di Caro, K. Selçuk Candan, Maria Luisa Sapino:
Using tagflake for condensing navigable tag hierarchies from tag clouds. 1069-1072
- Shantanu Godbole, Shourya Roy:
An integrated system for automatic customer satisfaction analysis in the services industry. 1073-1076
- Ming Hua, Jian Pei:
DiMaC: a disguised missing data cleaning tool. 1077-1080
- Evangelos E. Kotsifakos, Irene Ntoutsi, Yannis Vrahoritis, Yannis Theodoridis:
Pattern-Miner: integrated management and mining over data mining models. 1081-1084
- Hongyan Liu, Hui Yang, Wenbo Li, Wei Wei, Jun He, Xiaoyong Du:
CRO: a system for online review structurization. 1085-1088
- Emmanuel Müller, Ira Assent, Ralph Krieger, Timm Jansen, Thomas Seidl:
Morpheus: interactive exploration of subspace clustering. 1089-1092
- Hill Nguyen, Nish Parikh, Neel Sundaresan:
A software system for buzz-based recommendations. 1093-1096
- Shuyi Zheng, Matthew R. Scott, Ruihua Song, Ji-Rong Wen:
Pictor: an interactive system for importing data from a website. 1097-1100