
Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text

Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala
2020 Remote Sensing  
In this paper, we propose an extensible geoparsing approach including geographic entity recognition based on a neural network model and disambiguation based on what we have called dynamic context disambiguation  ...  The first task could be approached through a machine learning approach, in which case a model is trained to recognize a sequence of characters (words) corresponding to geographic entities.  ...  Geographic-Named Entity Recognition We have obtained the semantic features based on word embeddings obtained with word2vec [29].  ... 
doi:10.3390/rs12183041 doaj:2a94e8c05d16492f856aa3ed81fb4916 fatcat:odfrdlic2fa37ahqtpx624c7wa

Recognition of Named Entities in Spanish Texts [chapter]

Sofía N. Galicia-Haro, Alexander Gelbukh, Igor A. Bolshakov
2004 Lecture Notes in Computer Science  
Proper name recognition is a subtask of Named Entity Recognition in Message Understanding Conference.  ...  For our corpus annotation proper name recognition is a crucial task since proper names appear approximately in more than 50% of total sentences of the electronic texts that we collected for such purpose  ...  The preliminary results show the possibilities of the method and the required information for better results.  ... 
doi:10.1007/978-3-540-24694-7_43 fatcat:broitu57rzajtflicbfo3rsh24

Deep Learning for Toponym Resolution: Geocoding Based on Pairs of Toponyms

Jacques Fize, Ludovic Moncla, Bruno Martins
2021 ISPRS International Journal of Geo-Information  
To train our model, we use toponym co-occurrences collected from different contexts, namely textual (i.e., co-occurrences of toponyms in Wikipedia articles) and geographical (i.e., inclusion and proximity  ...  in geographical areas with fewer places in the data sources.  ...  The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.  ... 
doi:10.3390/ijgi10120818 fatcat:sttddeumfbg4lnkjm4tptfrzr4

Resolving Ambiguities in Toponym Recognition in Cartographic Maps [chapter]

Alexander Gelbukh, Serguei Levachkine, Sang-Yong Han
2004 Lecture Notes in Computer Science  
Our goal is to form in the vector thematic layers geographically meaningful words correctly attached to the cartographic objects.  ...  In this work, we propose a method that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve the recognition ambiguities using, among  ...  Acknowledgments The work was partially supported by Mexican Government (CONACYT, SNI, CGPI-IPN) and the ITRI of the Chung-Ang University.  ... 
doi:10.1007/978-3-540-25977-0_7 fatcat:5kowthkdzjcdxl7zheiuyitdke

A Pragmatic Guide to Geoparsing Evaluation [article]

Milan Gritta, Mohammad Taher Pilehvar, Nigel Collier
2019 arXiv   pre-print
of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond.  ...  Part 3) Evaluation Data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions.  ...  Authors agree that standard Named Entity Recognition is inadequate for geographic NLP tasks.  ... 
arXiv:1810.12368v5 fatcat:omtwa7xnvrgxvgipn6pddc6l44

Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks

Li Lucy, Dorottya Demszky, Patricia Bromley, Dan Jurafsky
2020 AERA Open  
We apply techniques from natural language processing (lexicons, word embeddings, topic models) to 15 U.S. history textbooks widely used in Texas between 2015 and 2017, studying their depiction of historically  ...  Word embeddings reveal that women tend to be discussed in the contexts of work and the home. Topic modeling highlights the higher prominence of political topics compared with social ones.  ...  Acknowledgments We would like to thank the following individuals for helpful conversations, feedback, and ideas: Noah Smith, Sebastian Munoz-Najar Galvez, Lily  ... 
doi:10.1177/2332858420940312 fatcat:l5antrdnc5d5dbi4goob6k5lou

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases [article]

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek
2021 arXiv   pre-print
This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics  ...  It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies.  ...  It is a great pleasure and honor to have such wonderful colleagues in our research community.  ... 
arXiv:2009.11564v2 fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy

Automated Travel History Extraction from Clinical Notes: Algorithm Development and Validation for Emergent Infectious Disease Events (Preprint)

Kelly S Peterson, Julia Lewis, Olga V Patterson, Alec B Chapman, Daniel Denhalter, Patricia A Lye, Vanessa W Stevens, Shantini D Gamage, Gary A Roselle, Katherine S Wallace, Makoto Jones
2020 JMIR Public Health and Surveillance  
More recently, this system was used in the early phases of response to COVID-19 in the United States, although its utility was limited to a relatively brief window due to the rapid domestic spread of the  ...  This study aims to assess the feasibility of annotating and automatically extracting travel history mentions from unstructured clinical documents in the Department of Veterans Affairs across disparate  ...  We thank the editor and anonymous reviewers for their feedback in ameliorating the reporting of this study.  ... 
doi:10.2196/26719 pmid:33759790 fatcat:bmji77r2nvgk7lkpvpx245eajm

Text-Based Twitter User Geolocation Prediction

B. Han, P. Cook, T. Baldwin
2014 The Journal of Artificial Intelligence Research  
Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its author's location.  ...  Geographical location is vital to geospatial applications like local search and event detection.  ...  Acknowledgments The authors wish to thank Stephen Roller and Jason Baldridge for making their data and tools available to replicate their NA experiments.  ... 
doi:10.1613/jair.4200 fatcat:jvdb3fdb4ngoxmsjo4fm2jbqve

Microblog topic identification using Linked Open Data

Ahmet Yıldırım, Suzan Uskudarli, Syed Ahmad Chan Bukhari
2020 PLoS ONE  
Such topics are used in tasks like classification and recommendations.  ...  Numerous approaches have been proposed to detect topics from collections of microposts, where the topics are represented by lists of terms such as words, phrases, or word embeddings.  ...  Dinesh and Dr. Jayant Venkatanatha for valuable contributions during the preparation of this work.  ... 
doi:10.1371/journal.pone.0236863 pmid:32780736 fatcat:rblunq2kqffcvpfg362olx5rhq

Book of Abstracts of the Digital Humanities in the Nordic Countries 5th conference. Riga, 20–23 October 2020 [article]

Sanita Reinsone, Anda Baklāne, Jānis Daugavietis
2020 Zenodo  
Book of Abstracts DHN, Rīga 2020 Book of Abstracts of the Digital Humanities in the Nordic Countries 5th conference.  ...  conferences/dhn2020 Editors: Sanita Reinsone, Anda Baklāne, Jānis Daugavietis Editorial assistants: Justīne Jaudzema, Ilze Ļaksa-Timinska Cover: Anete Krūmiņa Publisher: Institute of Literature, Folklore and  ...  We also thank the library of the Technische Acknowledgements This work has been supported by the European Union's Horizon 2020 research and innovation programme under grant 770299 (NewsEye).  ... 
doi:10.5281/zenodo.4107117 fatcat:6ongky6p5rab7gvtawnjmp2ofm

A Vector Semantics Approach to the Geoparsing Disambiguation Task for Texts in Spanish

Filomeno Alcántara, Alejandro Molina, Victor Muñiz
unpublished
In this paper, we present a work in progress for location disambiguation in news documents that uses a vector-semantic representation learned from information sources that include events and geographic  ...  Linking these locations to coordinates in a map usually requires two steps involving the named entity: extraction and disambiguation.  ...  The locations were provided by CentroGeo based on OpenNLP's Named Entity Recognition (NER) module [1].  ... 
doi:10.29007/pl5h fatcat:3trhftivlbcdbaubwiw5vbhaiy

Natural Language Processing for Dialects of a Language: A Survey [article]

Aditya Joshi, Raj Dabre, Diptesh Kanojia, Zhuang Li, Haolan Zhan, Gholamreza Haffari, Doris Dippold
2024 arXiv   pre-print
We expect that this survey will be useful to NLP researchers interested in building equitable language technologies by rethinking LLM benchmarks and model architectures.  ...  We observe that past work in NLP concerning dialects goes deeper than mere dialect classification, and .  ...  This approach uses two neural parsers, which are modified with the word embeddings used for initialisation. The word embeddings are trained on the standard and the dialect-specific datasets.  ... 
arXiv:2401.05632v2 fatcat:qkpfmywh2bar7capolzkfx6guu

Bias and Fairness in Large Language Models: A Survey [article]

Isabel O. Gallegos, Ryan A. Rossi, Joe Barrow, Md Mehrab Tanjim, Sungchul Kim, Franck Dernoncourt, Tong Yu, Ruiyi Zhang, Nesreen K. Ahmed
2024 arXiv   pre-print
embeddings, probabilities, and generated text.  ...  Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model:  ...  Sentence Embedding Metrics Instead of using static word embeddings, LLMs use embeddings learned in the context of a sentence, and are more appropriately paired with embedding metrics for sentence-level  ... 
arXiv:2309.00770v2 fatcat:idqaltzjdndnhnwcgd7dsqcejm

Modeling words for online sexual behavior surveillance and clinical text information extraction

Jason Fries, Philip Polgreen, Ted Herman, Padmini Srinivasan
unpublished
In domains like information retrieval, words have classically been modeled as discrete entities using 1-of-n encoding, a representation that elides most of a word's syntactic and semantic structure.  ...  Recent research, however, has begun exploring more robust representations called word embeddings.  ...  named entity recognition and linking tasks.  ... 
fatcat:2doedhxumbf2vcq4mpm47wmm6a
Showing results 1 to 15 out of 151 results