Cascade word embedding to sentence embedding: A class label enhanced approach to phenotype extraction.

Combined with abbreviation revision and sentence template extraction, we improved the unsupervised word-embedding-to-sentence-embedding cascaded approach as representation learning to recognize the various ... Results: We have proposed a pipeline for extracting phenotype, gene and their relations from biomedical literature. ... Due to the lack of training of positive and negative samples, we use the Negative Class Label Enhanced (NCLE) algorithm (Xing et al., 2017) to label negative samples and train the sentence-embedding ...

doi:10.1093/bioinformatics/bty263 pmid:29950017 pmcid:PMC6022650 fatcat:43lehrpjpfdlblvcbsuvrxebty

Open Access

Unlike existing reviews covering a holistic view on BioIE, this review focuses mainly on recent advances in learning-based approaches, by systematically summarizing them into different aspects of methodological ... Biomedical information extraction (BioIE) is important to many applications, including clinical decision support, integrative biology, and pharmacovigilance, and therefore it has been an active research ... To overcome cascading errors in a multi-step pipeline framework, joint models (e.g. a Markov Logic Network (MLN) based approach [111]) have shown improved performance. ...

arXiv:1606.07993v1 fatcat:7d5om7zxxzhoviiriasrfwg3xi

provided sentences could be seen. ... Unfortunately, this knowledge is mostly embedded in the literature in such a way that it is unavailable for automated data analysis procedures. ... Also our special thanks to Jens Dörpinghaus who kindly helped to create the curation interface. Furthermore, we would like to thank all participants of the BioCreative BEL track. ...

doi:10.1093/database/baz084 pmid:31603193 pmcid:PMC6787548 fatcat:zdc2oacdhfgrdn5gjvkq4jyyzm

DOAJ

Lastly, we believe the versatility of the proposed method extends beyond TAGs and holds the potential to enhance other tasks involving graph-text data. ... With the advent of powerful large language models (LLMs) such as GPT or Llama2, which demonstrate an ability to reason and to utilize general knowledge, there is a growing need for techniques which combine ... One approach is to use a cascaded architecture, where the node features are first encoded independently by the LMs, and then fed into GNN models. ...

arXiv:2305.19523v5 fatcat:7vl5s4udabhrbkfictsp5lxtaa

Multiple Versions

We propose a conditional language model following the transformer architecture. This model uses the "encoder stack" to encode concepts that a user wishes to discuss in the generated text. ... from the capacity to select the specific set of concepts that underpin a generated biomedical text. ... tags, and entity class labels associated with each textual element. ...

doi:10.1371/journal.pone.0253905 pmid:34228754 pmcid:PMC8259990 fatcat:xig3mofcaza6tpnda2xircvrxu

DOAJ

Results: A total of 2125 publications were identified for the title and abstract screening. 69 studies were found to be relevant. ... Machine learning (37 studies) and hybrid (26 studies) approaches are predominant, while six studies relied only on rules. The majority of the approaches were trained and evaluated on public corpora. ... While some approaches have been trying to extract sentences that do not contain identifiable information by measuring frequencies of sentences and terms [6], [9], or by creating representations of clinical ...

arXiv:2312.03736v1 fatcat:gd5oci3z7nbd3bpmvmxt6unbry

Open Access

Next, a cutting-edge deep learning model is trained to classify each candidate phrase (n-gram from input sentence) into a corresponding concept label. ... Automatic phenotype concept recognition from unstructured text remains a challenging task in biomedical text mining research. ... Thanks to Chih-Hsuan Wei for his help with Web APIs. Conflict of Interest: none declared. ...

doi:10.1093/bioinformatics/btab019 pmid:33471061 pmcid:PMC11025364 fatcat:sxba74g5azgfno2lz2yrwnm5eu

These are used extensively by many researchers to perform new as well as former experiments. ... Recent escalation in the field of computer vision underpins a huddle of algorithms with the magnificent potential to unravel the information contained within images. ... Finally, the learnable class embedding is fed to a softmax layer for Emphysema classification. ...

arXiv:2203.15269v1 fatcat:wecjpoikbvfz5cygytqpktoxdq

Open Access

Specifically, to solve these hurdles, there has been a notable increase in research and practices conducted in recent years on the domain specialization of LLMs. ... This emerging field of study, with its substantial potential for impact, necessitates a comprehensive and systematic review to better summarize and guide ongoing work in this area. ... Verbalizers are only used for classification tasks, where a mapping from class labels to label words is required, which can be a one-to-one mapping, trainable tokens [43], or enhanced with extra knowledge [53 ...

arXiv:2305.18703v7 fatcat:6vnz3xnvdfb7pkburxb3i6js5y

Open Access Multiple Versions

Citation

Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Dhagash Mehta, Stefano Pasquali, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Jian Pei, Carl Yang, Liang Zhao. "Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey." arXiv (2024)

context for capturing individual genotype variation related to disease. We present the READ-BioMed team's approach to identifying PPIm-related publications and to extracting specific PPIm information from ... of representative training data and the cascading impact of tool limitations in a modular system. ... In the context of biocuration, entity embeddings (embeddings over genes and mutations, for example) could be more effective than raw word embeddings and may have the potential to improve the performance ...

doi:10.1093/database/bay122 pmid:30576491 pmcid:PMC6301335 fatcat:gysdyhqrmjdevhzxcomcbu5slu

DOAJ

We show how the use of syntactic structure enables the resolution of hedge scope in a hybrid, two-stage approach to uncertainty analysis. ... In the first stage, a Maximum Entropy classifier, combining surface-oriented and syntactic features, identifies cue words. ... Acknowledgements We are grateful to the organizers of the 2010 CoNLL Shared Task and creators of the BioScope resource; first, for engaging in these kinds of community service, and second for many in-depth ...

dblp:conf/coling/OvrelidVO10 fatcat:4qcdfujfgrcend6e2ruzz2tg7q

A novel hierarchical three-level annotation scheme was proposed and implemented to tag key terms, drug interaction sentences, and drug interaction pairs. ... Using our pharmacokinetics ontology, a PK corpus was constructed to present four classes of pharmacokinetics abstracts: in vivo pharmacokinetics studies, in vivo pharmacogenetic studies, in vivo drug interaction ... Vague DDI Sentence Problem In most DDI extraction approaches, CDDIS are considered to be candidates for the analysis of DDI extraction. ...

doi:10.1007/978-1-4939-0709-0_4 pmid:24788261 pmcid:PMC4636907 fatcat:jdxhh37g2zer3n4gikt34ewkry

Attempts have been made to overcome the challenges in neural network computing by representing and embedding domain knowledge in terms of symbolic representations. ... This review presents a comprehensive survey on the state-of-the-art NeSyL approaches, their principles, advances in machine and deep learning algorithms, applications such as ophthalmology, and most importantly ... For natural language processing, the embedding can be carried out at the word, sentence, and structural levels. The AM can be employed both at global and local levels. ...

arXiv:2208.00374v1 fatcat:pktmnomj3bbwpjyj7lmu37rl7i

Open Access

Citation

Muhammad Hassan, Haifei Guan, Aikaterini Melliou, Yuqi Wang, Qianhui Sun, Sen Zeng, Wen Liang, Yiwei Zhang, Ziheng Zhang, Qiuyue Hu, Yang Liu, Shunkai Shi, Lin An, Shuyue Ma, Ijaz Gul, Muhammad Akmal Rahee, Zhou You, Canyang Zhang, Vijay Kumar Pandey, Yuxing Han, Yongbing Zhang, Ming Xu, Qiming Huang, Jiefu Tan, Qi Xing, Peiwu Qin, Dongmei Yu. "Neuro-Symbolic Learning: Principles and Applications in Ophthalmology." arXiv (2022)

Traditionally, the amalgamation of diverse medical data modalities (e.g., image, text, speech, genetic data, physiological signals) is imperative to facilitate a comprehensive disease analysis, a topic ... Hence, there exists a pressing need to synthesize the latest strides in multi-modal data and AI technologies in the realm of medical diagnosis. ... [105] explored the benefits of pre-training BERT on a cancer-specific dataset, which aimed to enhance the model's ability to extract breast cancer phenotypes from pathology reports and clinical records ...

doi:10.3390/bioengineering11030219 pmid:38534493 pmcid:PMC10967767 fatcat:tdrqch5tinhhbmnk7bkf4ilsv4

DOAJ

Developing a method of information extraction (IE) from these sources to generate a cohesive data set would be a great contribution to the medical field. ... In the future, IE work should be promoted by extracting more existing entities and relations, constructing gold standard data sets, and exploring IE methods based on a small amount of labeled data. ... [49] proposed a novel cascade-type Chinese medication entity recognition approach, which integrated the sentence category classifier from an SVM and the CRF-based medication entity recognition. ...

doi:10.1155/2022/1679589 pmid:35600940 pmcid:PMC9122692 fatcat:r7sj7sdoubhwfhcscj227neoiy

DOAJ

A gene–phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach

Learning for Biomedical Information Extraction: Methodological Review of Recent Advances [article]

The extraction of complex relationships and their conversion to biological expression language (BEL): overview of the BioCreative VI (2017) BEL track

Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [article]

CBAG: Conditional biomedical abstract generation

De-identification of clinical free text using natural language processing: A systematic review of current approaches [article]

PhenoTagger: A Hybrid Method for Phenotype Concept Recognition using Human Phenotype Ontology

Vision Transformers in Medical Computer Vision – A Contemplative Retrospection [article]

Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey [article]

BioCreative VI Precision Medicine Track system performance is constrained by entity recognition and variations in corpus characteristics

Syntactic Scope Resolution in Uncertainty Analysis

Text Mining for Drug–Drug Interaction [chapter]

Neuro-Symbolic Learning: Principles and Applications in Ophthalmology [article]

A Comprehensive Review on Synergy of Multi-Modal Data and AI Technologies in Medical Diagnosis

Information Extraction from the Text Data on Traditional Chinese Medicine: A Review on Tasks, Challenges, and Methods from 2010 to 2021
