A multimedia interactive search engine based on graph-based and non-linear multimodal fusion
2016
2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)
This paper presents an interactive multimedia search engine, which is capable of searching into multimedia collections by fusing textual and visual information. ...
Apart from multimedia search, the engine is able to perform text search and image retrieval independently using both high-level and low-level information. ...
ACKNOWLEDGMENTS This work was partially supported by the European Commission by the projects MULTISENSOR (FP7-610411), HOMER (FP7-312883) and KRISTINA (H2020-645012). ...
doi:10.1109/cbmi.2016.7500276
dblp:conf/cbmi/MoumtzidouGMLVK16
fatcat:zx3tgnl6dbcljaaezmakgmc72y
Verifying the Proximity and Size Hypothesis for Self-Organizing Maps
1999
Journal of Management Information Systems
summaries for large textual data collection. ...
advanced information indexing, searching, and classification techniques. ...
doi:10.1080/07421222.1999.11518256
fatcat:o7qr3xblcrfqtoxv6quhxs3lzq
Text Clustering in Distributed Networks with Enhanced File Security
2014
IOSR Journal of Computer Engineering
In such approaches, clustering is performed on a dedicated node and also they are not suitable for deployment in large distributed networks. ...
This centralized approach requires high processing time and retrieval time during searching as the number of users grows. ...
As a result, a distributed version of K-Means is used. The K-Means algorithm can be summarized as: (1) Select k random starting points as initial centroids for the k clusters. ...
doi:10.9790/0661-16582430
fatcat:wp5vigciebhwfi7egftsawxuq4
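The K-Means summary quoted in the entry above is cut off after its first step. As a rough illustration only, here is a minimal sketch of the standard single-machine (Lloyd-style) K-Means loop, not the distributed variant the paper describes. The function name `kmeans`, its parameters, and the fixed iteration count are assumptions made for this example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Textbook K-Means on a list of equal-length numeric tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 1 (from the quoted summary): pick k random points as initial centroids.
    centroids = rng.sample(points, k)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        assignments = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [pt for pt, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments
```

Calling `kmeans(points, 2)` on two well-separated groups of 2-D points returns one centroid near each group together with a per-point cluster index. A production version would normally stop once assignments no longer change rather than run a fixed number of iterations.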
Grappling with the Scale of Born-Digital Government Publications: Toward Pipelines for Processing and Searching Millions of PDFs
[article]
2021
arXiv
pre-print
Yet, these PDFs remain largely unutilized and understudied in part due to the challenges surrounding the development of scalable pipelines for searching and analyzing them. ...
In addition to demonstrating the utility of PDF metadata, this paper offers computationally-efficient machine learning approaches to search and discovery that utilize the PDFs' textual and visual features ...
Even webpage indexing relies primarily on textual content, and this flat representation restricts our view on what searching the web and born-digital content could be. ...
arXiv:2112.02471v1
fatcat:yg2xrmgnwva2lpoc334ptiwpoa
Assigning document identifiers to enhance compressibility of Web Search Engines indexes
2004
Proceedings of the 2004 ACM symposium on Applied computing - SAC '04
The simulations performed on a real dataset, i.e. the Google contest collection, show that our approach yields an IF index which is, depending on the d-gap encoding chosen, up to 23% smaller ...
Granting efficient access to the index is a key issue for the performance of Web Search Engines (WSE). ...
In [12] the authors investigate the performance of different index compression schemes through experiments on large query sets and collections of Web Documents. ...
doi:10.1145/967900.968024
dblp:conf/sac/SilvestriPO04
fatcat:xyemltmfynaobay5xlpgzyllwe
Information retrieval in a peer-to-peer environment
2006
Proceedings of the 1st international conference on Scalable information systems - InfoScale '06
This paper focuses on peer-to-peer information retrieval (P2PIR), which aims to retrieve textual documents based on their contents and ranks them based on some relevance measures against the query. ...
The "open nature" of P2P systems and their lack of centralized control pose difficult challenges to the search capability and performance of P2PIR systems. ...
Within the scope of this paper, P2PIR deals with textual documents and retrieval is based on some ranking measures computed between the query and the document texts. ...
doi:10.1145/1146847.1146896
dblp:conf/infoscale/LeeZL06
fatcat:wajwt4gkj5aqhgtmgipaumhw5m
Spatiotemporal Keyword Query Suggestion Based On Document Proximity and K-Means Method– A Review
2017
IJARCCE
The K-Means method is used for retrieving the highest-ranked top-k objects nearest to the current location of the user, and time-aware query suggestion brings out the most relevant documents based on the temporal ...
the location of the user and the documents retrieved. ...
Yuan Hung et al. [14] proposed Top-K search results for query clustering based on the similarity of the ranked URL results returned by the search engine and the query is used for the clustering ...
doi:10.17148/ijarcce.2017.63157
fatcat:v7a3huyozzfh7hizniut5wuhiy
Efficient logo retrieval through hashing shape context descriptors
2010
Proceedings of the 8th IAPR International Workshop on Document Analysis Systems - DAS '10
In this paper we present a method for organizing and indexing logo digital libraries like the ones of the patent and trademark offices. ...
These descriptors are then indexed by a locality-sensitive hashing data structure aiming to perform approximate k-NN search in high dimensional spaces in sub-linear time. ...
ACKNOWLEDGMENTS This work has been partially supported by the Spanish projects TIN2006-15694-C02-02, TIN2009-14633-C03-03 and CONSOLIDER-INGENIO 2010 (CSD2007-00018). We would also thank Dr. R. ...
doi:10.1145/1815330.1815358
dblp:conf/das/RusinolL10
fatcat:6spwkxswfzgf7cvvb6uakfiola
Selective Search
2015
ACM Transactions on Information Systems
This search technique first partitions the corpus, based on documents' similarity, into topic-based shards. ...
This article investigates and extends an alternative: selective search, an approach that partitions the dataset based on document similarity to obtain topic-based shards, and searches only a few shards ...
Sample-based K-means (SB K-means). We employ a variant of the time-tested K-means clustering algorithm [Lloyd 2006] to partition the documents based on their similarity. ...
doi:10.1145/2738035
fatcat:fistpgm5abemdeecnpiqmt4szi
Learning to reduce the semantic gap in web image retrieval and annotation
2008
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08
retrieval performance on the training data. 2) To be scalable, millions of images together with rich textual information have been crawled from the Web to learn the similarity measure, and the learning ...
framework particularly considers the indexing problem to ensure the retrieval efficiency. 3) To alleviate the noises in the unbalanced labels of images and fully utilize the textual information, a Latent ...
This image collection is indexed based on K-means-based indexing method [8] using visual features. ...
doi:10.1145/1390334.1390396
dblp:conf/sigir/WangZZ08
fatcat:ihmuowpyk5gz3fq6bpmxayunqm
Image Retrieval based on Bag-of-Words model
[article]
2013
arXiv
pre-print
A common way to achieve this is first quantizing local descriptors into visual words, and then applying scalable textual indexing and retrieval schemes. ...
In recent years, large-scale image retrieval shows significant potential in both industry applications and research problems. ...
We then sequentially discuss the key point detection, local description, vocabulary generation, vector quantization, indexing and search. ...
arXiv:1304.5168v1
fatcat:6jycm42bg5guzgzlzdbc2pydby
Visual diversification of image search results
2009
Proceedings of the 18th international conference on World wide web - WWW '09
Due to the reliance on the textual information associated with an image, image search engines on the Web lack the discriminative power to deliver visually diverse search results. ...
Based on a performance evaluation we find that the outcome of the methods closely resembles human perception of diversity, which was established in an extensive clustering experiment carried out by human ...
In [19], we have presented a method for detecting and resolving the ambiguity of a query based on the textual features of the image collection. ...
doi:10.1145/1526709.1526756
dblp:conf/www/LeukenPOZ09
fatcat:h2b62otiunenvcpkfiypv2dmne
Interactive Learning for Multimedia at Large
[chapter]
2020
Lecture Notes in Computer Science
We propose an interactive learning approach that builds on and extends the state of the art in user relevance feedback systems and high-dimensional indexing for multimedia. ...
We report on a detailed experimental study using the ImageNet and YFCC100M collections, containing 14 million and 100 million images respectively. ...
This work was supported by a PhD grant from the IT University of Copenhagen and by the European Regional Development Fund (project Robotics for Industry 4.0, CZ.02.1.01/0.0/0.0/15 003/0000470). ...
doi:10.1007/978-3-030-45439-5_33
fatcat:f7ai4wg7evfjnpa7k3pqsedary
Scalable search-based image annotation
2008
Multimedia Systems
Finally, the candidate annotations are re-ranked using Random Walk with Restarts and only the top ones are reserved as the final annotations. ...
First, content-based image retrieval technology is used to retrieve a set of visually similar images from a large-scale Web image set. ...
To speed up the similarity search process, a K-means-based indexing algorithm is used [13]. ...
doi:10.1007/s00530-008-0128-y
fatcat:vtmczcd5qva5tcqcufsbdg7csa
Text Mining Through Label Induction Grouping Algorithm Based Method
[article]
2021
arXiv
pre-print
LINGO works in two main steps: Cluster Label Induction by using the Latent Semantic Indexing technique (LSI) and Cluster content discovery by using the Vector Space Model (VSM). ...
From theoretical evidence, using Okapi BM25 as the scoring method in LSI (LSI+Okapi BM25) for cluster content discovery instead of VSM also results in better cluster generation in terms of scalability and ...
Due to the large size of data which has some sort of exponential property, it is necessary to provide a scalable framework which is further capable of indexing and searching web contents i.e. ...
arXiv:2112.08486v1
fatcat:q4jr2zlfhbh4lcqxdosto6jrxu