Semantic Smoothing for Model-based Document Clustering

Xiaodan Zhang, Xiaohua Zhou, Xiaohua Hu
2006 IEEE International Conference on Data Mining. Proceedings  
Inspired by a series of statistical translation language model for text retrieval, we propose in this paper a novel smoothing method referred to as context-sensitive semantic smoothing for document clustering  ...  The comparative experiment on three datasets shows that model-based clustering approaches with semantic smoothing is effective in improving cluster quality. [13] Zhou, X.  ...  of model-based document clustering.  ... 
doi:10.1109/icdm.2006.142 dblp:conf/icdm/ZhangZH06 fatcat:gjstvfvc5bakbpnu7h25qjiwfi

Ontology-based semantic smoothing model for biomedical document clustering

S. Logeswari, K. Premalatha
2015 International Journal of Telemedicine and Clinical Practices  
In this work ontology-based semantic smoothing model is proposed which uses the domain ontology for concept extraction.  ...  Recent researches focus on the clustering of text documents based on the semantic smoothing technique, which resolves the conflicts by general words and the sparsity of class-specific core words.  ...  It comprises of two context-sensitive semantic smoothing models named document model-based and cluster model-based techniques.  ... 
doi:10.1504/ijtmcp.2015.069475 fatcat:mnmnhfgs6rec3jk3tdbhy5thje

Dragon Toolkit: Incorporating Auto-Learned Semantic Knowledge into Large-Scale Text Retrieval and Mining

Xiaohua Zhou, Xiaodan Zhang, Xiaohua Hu
2007 19th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2007)  
The incorporation of semantic knowledge then reduces to the smoothing of unigram language models using semantic knowledge.  ...  The majority of text retrieval and mining techniques are still based on exact feature (e.g. words) matching and unable to incorporate text semantics.  ...  When the dataset to cluster is small, model-based kmeans with semantic smoothing (both CSSS and CISS) not only outperform model-based k-means with Laplacian smoothing and background smoothing, but also  ... 
doi:10.1109/ictai.2007.117 dblp:conf/ictai/ZhouZH07 fatcat:n7oe6nms3vhphpa4f26zzoo2d4

Exploiting Image Contents in Web Search

Zhi-Hua Zhou, Hong-Bin Dai
2007 International Joint Conference on Artificial Intelligence  
We evaluate the new model-based similarity measure on three datasets using complete linkage criterion for agglomerative clustering and find out it significantly improves the clustering quality over the  ...  Both problems can be resolved by suitable smoothing of document model and using Kullback-Leibler divergence of two smoothed models as pairwise document distances.  ...  We also thank three anonymous reviewers for their instructive comments on the paper.  ... 
dblp:conf/ijcai/ZhouD07 fatcat:vemop4hxsfdufafkqladgjjzz4

Semantic v.s. Positions: Utilizing Balanced Proximity in Language Model Smoothing for Information Retrieval

Rui Yan, Han Jiang, Mirella Lapata, Shou-De Lin, Xueqiang Lv, Xiaoming Li
2013 International Joint Conference on Natural Language Processing  
We balance the effects of semantic and positional smoothing, and score a document based on the smoothed language model.  ...  Work on information retrieval has shown that language model smoothing leads to more accurate estimation of document models and hence is crucial for achieving good retrieval performance.  ...  Acknowledgments This work was supported by "III Innovative and Prospective Technologies Project" of the Institute for Information Industry which is subsidized by the Ministry of Economy Affairs of the  ... 
dblp:conf/ijcnlp/YanJLLLL13 fatcat:fz3lz4zxgfenlcrxfrs3ao4er4

Clustering Massive Text Data Streams by Semantic Smoothing Model [chapter]

Yubao Liu, Jiarong Cai, Jian Yin, Ada Wai-Chee Fu
2007 Lecture Notes in Computer Science  
In this paper, we firstly give an improved semantic smoothing model for text data stream environment.  ...  Then we use the improved semantic model to improve the clustering quality and present an online clustering algorithm for clustering massive text data streams.  ...  In [7], model-based clustering approaches based on semantic smoothing that is widely used in information retrieval (IR) [8] is presented for efficient text data clustering.  ... 
doi:10.1007/978-3-540-73871-8_36 fatcat:5okdburf5zfybid4i4ezjvfpua

Context-sensitive semantic smoothing for the language modeling approach to genomic IR

Xiaohua Zhou, Xiaohua Hu, Xiaodan Zhang, Xia Lin, Il-Yeol Song
2006 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '06  
signature using the EM algorithm; and (3) expanding document and query models based on topic signature translations.  ...  The implemented semantic smoothing models, such as the translation model which statistically maps document terms to query terms, and a number of works that have followed have shown good experimental results  ...  We also thank four anonymous reviewers for their comments on the paper.  ... 
doi:10.1145/1148170.1148203 dblp:conf/sigir/ZhouHZLS06 fatcat:utt4jhemlvgfvnwxa5motw5xom

Clustering Text Data Streams

Yu-Bao Liu, Jia-Rong Cai, Jian Yin, Ada Wai-Chee Fu
2008 Journal of Computer Science and Technology  
However, the existing semantic smoothing model is not suitable for dynamic text data context. In this paper, we extend the semantic smoothing model into text data streams context firstly.  ...  Recently, researchers argue that semantic smoothing model is more efficient than the existing TF * IDF scheme for improving text clustering quality.  ...  Acknowledgements We would like to thanks the anonymous reviewers for their helpful comments on the early version of this paper.  ... 
doi:10.1007/s11390-008-9115-1 fatcat:dmrzfq6nxvflhkbyzle2hl3cui

Semantic smoothing for text clustering

Jamal A. Nasir, Iraklis Varlamis, Asim Karim, George Tsatsaronis
2013 Knowledge-Based Systems  
In this paper we present a new semantic smoothing vector space kernel (S-VSM) for text documents clustering.  ...  To the best of our knowledge, the current study is the first to systematically compare, analyze and evaluate the impact of semantic smoothing in text clustering based on 'wisdom of linguists', e.g., WordNets  ...  Hence, the need of smooth semantically the VSM model, i.e., by employing semantic smoothing VSM kernels, arises.  ... 
doi:10.1016/j.knosys.2013.09.012 fatcat:ssupj4c7engi7mhwldmpuniv7a

Tackling Sparsity, the Achilles Heel of Social Networks: Language Model Smoothing via Social Regularization

Rui Yan, Xiang Li, Mengwen Liu, Xiaohua Hu
2015 Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)  
We propose to tackle this specific weakness of social networks by smoothing the posting document language model based on social regularization.  ...  Online social networks nowadays have the worldwide prosperity, as they have revolutionized the way for people to discover, to share, and to diffuse information.  ...  We thank all the anonymous reviewers for their valuable and constructive comments in ACL short paper track 1 . References  ... 
doi:10.3115/v1/p15-2103 dblp:conf/acl/YanLLH15 fatcat:rqec3okenzdjpcsexdvjl5jtcu

Topic Signature Language Models for Ad hoc Retrieval

Xiaohua Zhou, Xiaohua Hu, Xiaodan Zhang
2007 IEEE Transactions on Knowledge and Data Engineering  
Document models based on topic signature translation are then derived.  ...  The previously implemented semantic smoothing models, such as the translation model which statistically maps document terms to query terms, and a number of other works that have followed have shown good  ...  The smoothed document models can be used not only for text retrieval, but also for many other text mining applications such as text clustering.  ... 
doi:10.1109/tkde.2007.1058 fatcat:maftbty6hfdorash3mu3dmetqa

A Complex Network Approach to Distributional Semantic Models

Akira Utsumi, Zi-Ke Zhang
2015 PLoS ONE  
Furthermore, to simulate a semantic network with the observed network properties, we propose a new growing network model based on the model of Steyvers and Tenenbaum.  ...  features of a word-context matrix and the functions of matrix weighting and smoothing.  ...  As shown in Results for Hierarchical Property SVD smoothing works differently for word-document-based and word-word-based networks.  ... 
doi:10.1371/journal.pone.0136277 pmid:26295940 pmcid:PMC4546414 fatcat:pfz5x3gbdrb5tkgaypz5ymif4e

More Discriminative Sentence Embeddings via Semantic Graph Smoothing [article]

Chakib Fettal, Lazhar Labiod, Mohamed Nadif
2024 arXiv   pre-print
Leveraging semantic graph smoothing, we enhance sentence embeddings obtained from pretrained models to improve results for the text clustering and classification tasks.  ...  Our method, validated on eight benchmarks, demonstrates consistent improvements, showcasing the potential of semantic graph smoothing in improving sentence embeddings for the supervised and unsupervised  ...  Proposed Methodology: Smoothing Sentence Embeddings In this paper, we theorize that smoothing sentence embeddings with a semantic similarity graph can help supervised and unsupervised categorization models  ... 
arXiv:2402.12890v1 fatcat:kzqhngzgiva7bjnczi4uxmxxhy

A multispan language modeling framework for large vocabulary speech recognition

D.D. O'Shaughnessy
1998 IEEE Transactions on Speech and Audio Processing  
Since in this space familiar clustering techniques can be applied, it becomes possible to derive several families of large-span language models, with various smoothing properties.  ...  This paradigm seeks to automatically uncover the salient semantic relationships between words and documents in a given corpus.  ...  Landauer, from the same institution, for pointing out a number of insightful references on the latent semantic model of knowledge acquisition, and to the anonymous reviewers for their constructive comments  ... 
doi:10.1109/89.709671 fatcat:wkkgclrrinb7vjs7pw5aphrk3u

Investigating task performance of probabilistic topic models: an empirical study of PLSA and LDA

Yue Lu, Qiaozhu Mei, ChengXiang Zhai
2010 Information retrieval (Boston)  
(LDA), using three representative text mining tasks, including document clustering, text categorization, and ad-hoc retrieval.  ...  The task-based evaluation framework is generalizable to other topic models in the family of either PLSA or LDA.  ...  Document clustering The first task we have chosen is document clustering. Generally, there are two ways of using topic models for document clustering.  ... 
doi:10.1007/s10791-010-9141-9 fatcat:v3vnphcmdrgblgziaudyddrtfu