13,894 Hits in 3.7 sec

Modeling Unified Semantic Discourse Structure for High-quality Headline Generation [article]

Minghui Xu, Hao Fei, Fei Li, Shengqiong Wu, Rui Sun, Chong Teng, Donghong Ji
2024 arXiv   pre-print
In this work, we propose using a unified semantic discourse structure (S3) to represent document semantics, achieved by combining document-level rhetorical structure theory (RST) trees with sentence-level  ...  Headline generation aims to summarize a long document with a short, catchy title that reflects the main idea.  ...  Actually, documents come with hierarchical structures in two granularities, i.e., document level and sentence level.  ... 
arXiv:2403.15776v1 fatcat:ie22mypihfcchkd44gz6xdkcni

Learning domain ontologies for semantic Web service descriptions

Marta Sabou, Chris Wroe, Carole Goble, Heiner Stuckenschmidt
2005 Journal of Web Semantics  
To this end, we developed a framework for (semi-)automatic ontology learning from textual sources attached to Web services.  ...  The framework exploits the fact that these sources are expressed in a specific sublanguage, making them amenable to automatic analysis.  ...  A second impediment is the dynamic nature of the field.  ... 
doi:10.1016/j.websem.2005.09.008 fatcat:vieeqt3wirg3feoe23vu5ndfda

Learning Domain Ontologies for Semantic Web Service Descriptions

Marta Sabou, Chris Wroe, Carole Goble, Heiner Stuckenschmidt
2005 Social Science Research Network  
To this end, we developed a framework for (semi-)automatic ontology learning from textual sources attached to Web services.  ...  The framework exploits the fact that these sources are expressed in a specific sublanguage, making them amenable to automatic analysis.  ...  A second impediment is the dynamic nature of the field.  ... 
doi:10.2139/ssrn.3199264 fatcat:6eamwvic3rce3dvfxh24s3sja4

Efficient and effective retrieval using selective pruning

Nicola Tonellotto, Craig Macdonald, Iadh Ounis
2013 Proceedings of the sixth ACM international conference on Web search and data mining - WSDM '13  
Retrieval can be made more efficient by deploying dynamic pruning strategies such as Wand, which do not degrade effectiveness up to a given rank.  ...  In this work, we propose a novel selective framework that determines the appropriate amount of pruning aggressiveness on a per-query basis, thereby increasing overall efficiency without significantly reducing  ...  , we propose a selective pruning framework for identifying the appropriate pruning setting of a dynamic pruning strategy on a per-query basis, to ensure efficient yet effective retrieval.  ... 
doi:10.1145/2433396.2433407 dblp:conf/wsdm/TonellottoMO13 fatcat:nv2qvk3ihng4hltyqbei2i4jsi

Hidden Web Indexing Using HDDI Framework

Shashank Agarwal
2012 IOSR Journal of Engineering  
This research uses Hierarchical Distributed Dynamic Indexing (HDDI) Framework for indexing the Data downloaded by the Siphone++ crawler.  ...  There are various methods of indexing the hidden web database like novel indexing, distributed indexing or indexing using map reduce framework.  ...  A related research initiative involves the design of a multimedia framework for constructive, inquiry-based learning for introductory and upper level computer science courses.  ... 
doi:10.9790/3021-0204858861 fatcat:lob7urrkubcwleyigplsbwfei4

Applying Dynamic Causal Mining in Retailing

Yi Wang
2008 POLIBITS Research Journal on Computer Science and Computer Engineering With Applications  
With the fast development of information technology, retailers are suffering from the excess of information. Too much information can be a problem. However, more information creates more opportunity.  ...  The Semantic Web will provide a foundation for such a solution. However, semantics only provide a way of mapping the content of a web to user defined annotations.  ...  The number of rules has decreased by applying the support level. Table 2 shows the extracted strong rules with support level equal to average value and support larger than 0.08.  ... 
doi:10.17562/pb-37-7 fatcat:ytokixu4uvfgfa64yerpfsdx7y

Automated learning of domain taxonomies from text using background knowledge

Julia Hoxha, Guoqian Jiang, Chunhua Weng
2016 Journal of Biomedical Informatics  
We introduce a novel, unsupervised method for cluster detection based on automated dendrogram pruning, which is dynamic to each partition.  ...  The results of several experiments indicate that our method is superior to existing dynamic pruning and the state-of-art taxonomy learning methods.  ...  Pruning Performance- The results of the evaluation of dendrogram pruning performance and its comparison with two state-of-the-art dynamic tree cut techniques (dynamic tree and dynamic hybrid [38] ) are  ... 
doi:10.1016/j.jbi.2016.09.002 pmid:27597572 pmcid:PMC5077645 fatcat:dvgq7xaj4jblfk6whca2ns3gzy

Semantic data mining: A survey of ontology-based approaches

Dejing Dou, Hao Wang, Haishan Liu
2015 Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015)  
Ontology is an explicit specification of conceptualization and a formal way to define the semantics of knowledge and data.  ...  The formal structure of ontology makes it a natural way to encode domain knowledge for the data mining use. In this survey paper, we introduce general concepts of semantic data mining.  ...  Specifically, in ontology-based information extraction (OBIE) [55] , [77] , the extracted information are a set of annotated terms from the document with the relations defined in the ontology.  ... 
doi:10.1109/icosc.2015.7050814 dblp:conf/semco/DouWL15 fatcat:uck4bd3gpnf5bgb4jike6xm5ne

HDDI™: Hierarchical Distributed Dynamic Indexing [chapter]

William M. Pottenger, Yong-Bin Kim, Daryl D. Meling
2001 Data Mining for Scientific and Engineering Applications  
Hierarchical Distributed Dynamic Indexing (HDDI) is an approach that dynamically creates a hierarchical index from distributed document collections.  ...  We conclude with several example applications of HDDI in the textual data mining and information retrieval fields.  ...  We also gratefully acknowledge the assistance of the many who worked together with us at the National Center for Supercomputing Applications and at Lehigh University to make this a reality.  ... 
doi:10.1007/978-1-4615-1733-7_18 fatcat:jc4omkimvrgvroatlbeyg35eyu

From keywords to keyqueries

Tim Gollub, Matthias Hagen, Maximilian Michel, Benno Stein
2013 Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval - SIGIR '13  
To determine the keyqueries for a document, we present an exhaustive search algorithm along with effective pruning strategies.  ...  We introduce the concept of keyqueries as dynamic content descriptors for documents.  ...  We conclude with an outlook on future work. RELATED WORK The state-of-the-art technique for the automated generation of content descriptors is keyword or keyphrase extraction.  ... 
doi:10.1145/2484028.2484181 dblp:conf/sigir/GollubHMS13 fatcat:4kefpe37brcwdnd7trktwjdg54

Ranking objects based on relationships and fixed associations

Albert Angel, Surajit Chaudhuri, Gautam Das, Nick Koudas
2009 Proceedings of the 12th International Conference on Extending Database Technology Advances in Database Technology - EDBT '09  
Text corpora are often enhanced by additional metadata which relate real-world entities, with each document in which such entities are discussed.  ...  We devise early pruning and termination strategies, in the presence of joins and aggregations (executed on entities extracted from text), that do not depend on any estimates.  ...  related to these documents using M T (In-memory join of document ids with M T ) 3: for each entity e found, whose related document's score is s do 4: if an entry for e exists in SeenEnt T , with NumSeen  ... 
doi:10.1145/1516360.1516464 dblp:conf/edbt/AngelCDK09 fatcat:psaqn62o65fmnliptptbb72dkq

SES-based ontological process for high level information fusion

Hojun Lee, Bernard P. Zeigler
2010 Proceedings of the 2010 Spring Simulation Multiconference on - SpringSim '10  
Pruning and transformation processes of the SES ontology generate various levels of information in accordance with C2 systems' needs, which support decision-making process in an automated way.  ...  The System Entity Structure (SES) is an ontology framework that can facilitate information exchange and represent knowledge in a network-centric environment.  ...  It is, therefore, a study about Level 2 and for partially related with Level 3 based on results of Level 1 for high-level fusion process [4] .  ... 
doi:10.1145/1878537.1878672 fatcat:agaad6eisbcdfgw3oojw547cd4

Discovering topic structures of a temporally evolving document corpus [article]

Adham Beykikhoshk, Ognjen Arandjelovic, Dinh Phung, Svetha Venkatesh
2015 arXiv   pre-print
In this paper we describe a novel framework for the discovery of the topical content of a data corpus, and the tracking of its complex structural changes across the temporal dimension.  ...  Our key technical contribution is a framework based on (i) discretization of time into epochs, (ii) epoch-wise topic discovery using a hierarchical Dirichlet process-based model, and (iii) a temporal similarity  ...  The recently proposed latent Dirichlet allocation (LDA) method [15] overcomes the overfitting problem by adopting a Bayesian framework and a generative process at the document level.  ... 
arXiv:1512.08008v1 fatcat:ir2u5bukjfbyph5vclraxjuzry

Mining Generalized Associations of Semantic Relations from Textual Web Content

Tao Jiang, Ah-hwee Tan, Ke Wang
2007 IEEE Transactions on Knowledge and Data Engineering  
First, RDF (Resource Description Framework) metadata representing semantic relations are extracted from raw text using a myriad of natural language processing techniques.  ...  The relation extraction process also creates a term taxonomy in the form of a sense hierarchy inferred from WordNet.  ...  We apply semantic relation extraction to extract semantic relations from the ICT suicide bombing (ICT-SB) documents and the ICT car bombing (ICT-CB) documents with a WordNet search depth (WNSD) of 2 and  ... 
doi:10.1109/tkde.2007.36 fatcat:jkxp3oiotbe2vmmegagu7ekpuu

CITOM: An incremental construction of multilingual topic maps

Nebrasse Ellouze, Nadira Lammari, Elisabeth Métais
2012 Data & Knowledge Engineering  
We validate our approach with a real corpus from the sustainable construction domain.  ...  Our approach takes into account three types of information sources: (a) a set of multilingual documents, (b) a domain thesaurus and (c) all the possible questioning sources such as FAQ and user's or expert's  ...  of questions related to the document and extracted from the questioning sources.  ... 
doi:10.1016/j.datak.2012.02.002 fatcat:hbmwnay73vaxpltxazbyp2rf2m