Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text
2020
Remote Sensing
In this paper, we propose an extensible geoparsing approach including geographic entity recognition based on a neural network model and disambiguation based on what we have called dynamic context disambiguation ...
The first task could be approached through a machine learning approach, in which case a model is trained to recognize a sequence of characters (words) corresponding to geographic entities. ...
Geographic-Named Entity Recognition: We have obtained the semantic features based on word embeddings obtained with word2vec [29]. ...
doi:10.3390/rs12183041
doaj:2a94e8c05d16492f856aa3ed81fb4916
fatcat:odfrdlic2fa37ahqtpx624c7wa
Recognition of Named Entities in Spanish Texts
[chapter]
2004
Lecture Notes in Computer Science
Proper name recognition is a subtask of Named Entity Recognition in the Message Understanding Conference. ...
For our corpus annotation, proper name recognition is a crucial task, since proper names appear in more than 50% of the sentences in the electronic texts that we collected for this purpose ...
The preliminary results show the possibilities of the method and the information required for better results. ...
doi:10.1007/978-3-540-24694-7_43
fatcat:broitu57rzajtflicbfo3rsh24
Deep Learning for Toponym Resolution: Geocoding Based on Pairs of Toponyms
2021
ISPRS International Journal of Geo-Information
To train our model, we use toponym co-occurrences collected from different contexts, namely textual (i.e., co-occurrences of toponyms in Wikipedia articles) and geographical (i.e., inclusion and proximity ...
in geographical areas with fewer places in the data sources. ...
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results. ...
doi:10.3390/ijgi10120818
fatcat:sttddeumfbg4lnkjm4tptfrzr4
Resolving Ambiguities in Toponym Recognition in Cartographic Maps
[chapter]
2004
Lecture Notes in Computer Science
Our goal is to form in the vector thematic layers geographically meaningful words correctly attached to the cartographic objects. ...
In this work, we propose a method that combines OCR-based text recognition in raster-scanned maps with heuristics specially adapted for cartographic data to resolve the recognition ambiguities using, among ...
Acknowledgments: The work was partially supported by the Mexican Government (CONACYT, SNI, CGPI-IPN) and the ITRI of the Chung-Ang University. ...
doi:10.1007/978-3-540-25977-0_7
fatcat:5kowthkdzjcdxl7zheiuyitdke
A Pragmatic Guide to Geoparsing Evaluation
[article]
2019
arXiv
pre-print
of metrics and a detailed toponym taxonomy with implications for Named Entity Recognition (NER) and beyond. ...
Part 3) Evaluation Data: shared via a new dataset called GeoWebNews to provide test/train examples and enable immediate use of our contributions. ...
Authors agree that standard Named Entity Recognition is inadequate for geographic NLP tasks. ...
arXiv:1810.12368v5
fatcat:omtwa7xnvrgxvgipn6pddc6l44
Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks
2020
AERA Open
We apply techniques from natural language processing (lexicons, word embeddings, topic models) to 15 U.S. history textbooks widely used in Texas between 2015 and 2017, studying their depiction of historically ...
Word embeddings reveal that women tend to be discussed in the contexts of work and the home. Topic modeling highlights the higher prominence of political topics compared with social ones. ...
Acknowledgments We would like to thank the following individuals for helpful conversations, feedback, and ideas: Noah Smith, Sebastian Munoz-Najar Galvez, Lily ...
doi:10.1177/2332858420940312
fatcat:l5antrdnc5d5dbi4goob6k5lou
Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
[article]
2021
arXiv
pre-print
This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics ...
It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. ...
It is a great pleasure and honor to have such wonderful colleagues in our research community. ...
arXiv:2009.11564v2
fatcat:vh2lqfmhhbcwpf6dcsej3hhvgy
Automated Travel History Extraction from Clinical Notes: Algorithm Development and Validation for Emergent Infectious Disease Events (Preprint)
2020
JMIR Public Health and Surveillance
More recently, this system was used in the early phases of response to COVID-19 in the United States, although its utility was limited to a relatively brief window due to the rapid domestic spread of the ...
This study aims to assess the feasibility of annotating and automatically extracting travel history mentions from unstructured clinical documents in the Department of Veterans Affairs across disparate ...
We thank the editor and anonymous reviewers for their feedback in ameliorating the reporting of this study. ...
doi:10.2196/26719
pmid:33759790
fatcat:bmji77r2nvgk7lkpvpx245eajm
Text-Based Twitter User Geolocation Prediction
2014
The Journal of Artificial Intelligence Research
Previous studies on this topic have typically assumed that geographical references (e.g., gazetteer terms, dialectal words) in a text are indicative of its author's location. ...
Geographical location is vital to geospatial applications like local search and event detection. ...
Acknowledgments: The authors wish to thank Stephen Roller and Jason Baldridge for making their data and tools available to replicate their NA experiments. ...
doi:10.1613/jair.4200
fatcat:jvdb3fdb4ngoxmsjo4fm2jbqve
Microblog topic identification using Linked Open Data
2020
PLoS ONE
Such topics are used in tasks like classification and recommendations. ...
Numerous approaches have been proposed to detect topics from collections of microposts, where the topics are represented by lists of terms such as words, phrases, or word embeddings. ...
Dinesh and Dr. Jayant Venkatanatha for valuable contributions during the preparation of this work. ...
doi:10.1371/journal.pone.0236863
pmid:32780736
fatcat:rblunq2kqffcvpfg362olx5rhq
Book of Abstracts of the Digital Humanities in the Nordic Countries 5th conference. Riga, 20–23 October 2020
[article]
2020
Zenodo
Book of Abstracts DHN, Rīga 2020 Book of Abstracts of the Digital Humanities in the Nordic Countries 5th conference. ...
conferences/dhn2020 Editors: Sanita Reinsone, Anda Baklāne, Jānis Daugavietis Editorial assistants: Justīne Jaudzema, Ilze Ļaksa-Timinska Cover: Anete Krūmiņa Publisher: Institute of Literature, Folklore and ...
We also thank the library of the Technische ... Acknowledgements: This work has been supported by the European Union's Horizon 2020 research and innovation programme under grant 770299 (NewsEye). ...
doi:10.5281/zenodo.4107117
fatcat:6ongky6p5rab7gvtawnjmp2ofm
A Vector Semantics Approach to the Geoparsing Disambiguation Task for Texts in Spanish
unpublished
In this paper, we present a work in progress for location disambiguation in news documents that uses a vector-semantic representation learned from information sources that include events and geographic ...
Linking these locations to coordinates in a map usually requires two steps involving the named entity: extraction and disambiguation. ...
The locations were provided by CentroGeo based on OpenNLP's Named Entity Recognition (NER) module [1]. ...
doi:10.29007/pl5h
fatcat:3trhftivlbcdbaubwiw5vbhaiy
Natural Language Processing for Dialects of a Language: A Survey
[article]
2024
arXiv
pre-print
We expect that this survey will be useful to NLP researchers interested in building equitable language technologies by rethinking LLM benchmarks and model architectures. ...
We observe that past work in NLP concerning dialects goes deeper than mere dialect classification, and ...
This approach uses two neural parsers, which are modified with the word embeddings used for initialisation. The word embeddings are trained on the standard and the dialect-specific datasets. ...
arXiv:2401.05632v2
fatcat:qkpfmywh2bar7capolzkfx6guu
Bias and Fairness in Large Language Models: A Survey
[article]
2024
arXiv
pre-print
embeddings, probabilities, and generated text. ...
Our first taxonomy of metrics for bias evaluation disambiguates the relationship between metrics and evaluation datasets, and organizes metrics by the different levels at which they operate in a model: ...
Sentence Embedding Metrics Instead of using static word embeddings, LLMs use embeddings learned in the context of a sentence, and are more appropriately paired with embedding metrics for sentence-level ...
arXiv:2309.00770v2
fatcat:idqaltzjdndnhnwcgd7dsqcejm
Modeling words for online sexual behavior surveillance and clinical text information extraction
unpublished
In domains like information retrieval, words have classically been modeled as discrete entities using 1-of-n encoding, a representation that elides most of a word's syntactic and semantic structure. ...
Recent research, however, has begun exploring more robust representations called word embeddings. ...
named entity recognition and linking tasks. ...
fatcat:2doedhxumbf2vcq4mpm47wmm6a