22,318 Hits in 1.9 sec

Representation Deficiency in Masked Language Modeling [article]

Yu Meng, Jitin Krishnan, Sinong Wang, Qifan Wang, Yuning Mao, Han Fang, Marjan Ghazvininejad, Jiawei Han, Luke Zettlemoyer
2024 arXiv   pre-print
Masked Language Modeling (MLM) has been one of the most prominent approaches for pretraining bidirectional text encoders due to its simplicity and effectiveness.  ...  tokens, resulting in a representation deficiency for real tokens and limiting the pretrained model's expressiveness when it is adapted to downstream data without [MASK] tokens.  ...  Figure 1: In an MLM-pretrained model, (a) some model dimensions are exclusively used for representing [MASK] tokens, resulting in a representation deficiency for modeling inputs without [MASK  ... 
arXiv:2302.02060v2 fatcat:nsmuwbz5qvcabkoif6rxh6cfqq
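
The abstract above concerns how MLM pretraining reserves model capacity for [MASK] tokens that never appear in downstream data. For context on how those tokens enter the pretraining input in the first place, here is a minimal, self-contained sketch of the standard BERT-style masking recipe (15% of positions selected; of those, 80% replaced by a mask id, 10% by a random token, 10% left unchanged). The vocabulary size, mask id, and example token ids are illustrative assumptions, not values from the paper.

```python
import random

def bert_style_mask(token_ids, mask_id, vocab_size, mask_prob=0.15, seed=0):
    """Return (corrupted_ids, labels), with labels = -100 at unmasked positions.

    Illustrative re-implementation of the standard BERT masking recipe:
    of the selected 15% of positions, 80% become the mask id, 10% become a
    random token, and 10% keep the original token.
    """
    rng = random.Random(seed)
    corrupted = list(token_ids)
    labels = [-100] * len(token_ids)          # -100 = ignore in the loss
    for i, tok in enumerate(token_ids):
        if rng.random() >= mask_prob:
            continue
        labels[i] = tok                        # predict the original token here
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = mask_id             # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.randrange(vocab_size)  # 10%: random token
        # else 10%: keep the original token unchanged
    return corrupted, labels

# Toy usage with made-up ids (vocab_size=30522 and mask_id=103 mirror BERT's
# uncased vocabulary but are assumptions here, not values from the paper).
ids = [101, 7592, 2088, 2003, 2307, 102]
print(bert_style_mask(ids, mask_id=103, vocab_size=30522))
```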

End-to-End Code Switching Language Models for Automatic Speech Recognition [article]

Ahan M. R., Shreyas Sunil Kulkarni
2020 arXiv   pre-print
approach for extracting monolingual text using Deep Bi-directional Language Models (LM) such as BERT and other Machine Translation models, and also explore different ways of extracting code-switched text  ...  from the ASR model.  ...  Based on Masked LM using BERT, given a code-switched sentence, we particularly mask the unwanted words from the second language with <MASK>, and use BERT to recover these words in terms of monolingual text  ... 
arXiv:2006.08870v1 fatcat:lz6q5ke3rjehzexj4aj7izmr5y
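
As a rough illustration of the mask-and-recover idea in this snippet (mask a second-language word in a code-switched sentence and let BERT propose monolingual replacements), the sketch below uses the Hugging Face fill-mask pipeline with multilingual BERT. The model name, example sentence, and the hand-picked word to mask are assumptions for demonstration; the paper's actual extraction pipeline is more involved.

```python
# Hedged sketch of mask-and-recover for code-switched text using the Hugging
# Face fill-mask pipeline. Requires: pip install transformers torch
from transformers import pipeline

unmasker = pipeline("fill-mask", model="bert-base-multilingual-cased")

# Hypothetical code-switched sentence: English matrix with one Hindi word.
sentence = "I will meet you कल at the station."
# Replace the second-language word with the model's mask token and let BERT
# suggest monolingual fillers; in the paper the masked position would come
# from language identification rather than being chosen by hand.
masked = sentence.replace("कल", unmasker.tokenizer.mask_token)

for candidate in unmasker(masked, top_k=3):
    print(candidate["token_str"], round(candidate["score"], 3))
```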

Combining pre-trained language models and structured knowledge [article]

Pedro Colon-Hernandez, Catherine Havasi, Jason Alonso, Matthew Huggins, Cynthia Breazeal
2021 arXiv   pre-print
In recent years, transformer-based language models have achieved state-of-the-art performance in various NLP benchmarks.  ...  We examine a variety of approaches to integrate structured knowledge into current language models and determine challenges and possible opportunities to leverage both structured and unstructured information  ...  compensate for deficiencies in these areas.  ... 
arXiv:2101.12294v2 fatcat:3npdpudyzzhm3bmevv5qcvxuwe

Page 1456 of Linguistics and Language Behavior Abstracts: LLBA Vol. 29, Issue 3 [page]

1995 Linguistics and Language Behavior Abstracts: LLBA  
proposal, inhibitor process-based interference, long stimulus onset asynchrony, target masking; experiments; undergraduates; 9504239 speech recognition; background noise masking; empirical data; hear  ...  Johnson-Laird’s mental models representations; 9504186 resonance frequency/vocal tract cross-section relation, equal/unequal tubelet models; 9506611 set-valued feature structures’ domain properties; 9506012  ... 

Improving Label-Deficient Keyword Spotting Using Self-Supervised Pretraining [article]

Holger Severin Bovbjerg, Zheng-Hua Tan
2022 arXiv   pre-print
In this paper, we investigate the use of self-supervised pretraining for the smaller KWS models in a label-deficient scenario.  ...  It is found that the pretrained models greatly outperform the models without pretraining, showing that Data2Vec pretraining can increase the performance of KWS models in label-deficient scenarios.  ...  As a result, self-supervised learning can be used to improve model performance in the case of label-deficiency.  ... 
arXiv:2210.01703v2 fatcat:bqonkph7wzckxhc3ijxxes2xqm
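
The abstract mentions Data2Vec pretraining for label-deficient keyword spotting. Below is a deliberately simplified, self-contained sketch of the Data2Vec-style objective on random stand-in audio features: a teacher (an EMA copy of the student) produces targets from the unmasked input by averaging its top layers, and the student predicts those targets at masked positions from the masked input. The encoder size, masking scheme, EMA decay, and omission of target normalization are all illustrative simplifications, not the paper's configuration.

```python
import copy
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    def __init__(self, dim=64, layers=4, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.layers = nn.ModuleList(copy.deepcopy(layer) for _ in range(layers))

    def forward(self, x):
        hidden = []
        for layer in self.layers:
            x = layer(x)
            hidden.append(x)
        return hidden                       # per-layer outputs

student = TinyEncoder()
teacher = copy.deepcopy(student)            # teacher = EMA copy of the student
for p in teacher.parameters():
    p.requires_grad_(False)

mask_emb = nn.Parameter(torch.zeros(64))    # learned embedding for masked frames
opt = torch.optim.Adam(list(student.parameters()) + [mask_emb], lr=1e-4)

def ema_update(decay=0.999):
    with torch.no_grad():
        for ps, pt in zip(student.parameters(), teacher.parameters()):
            pt.mul_(decay).add_(ps, alpha=1 - decay)

# One illustrative training step on random "audio feature" frames.
x = torch.randn(8, 100, 64)                 # (batch, frames, feature dim)
mask = torch.rand(8, 100) < 0.5             # positions the student must predict

with torch.no_grad():                       # teacher sees the unmasked input
    targets = torch.stack(teacher(x)[-3:]).mean(0)   # average of top-3 layers

x_masked = torch.where(mask.unsqueeze(-1), mask_emb.expand_as(x), x)
pred = student(x_masked)[-1]                # student's final-layer output

loss = nn.functional.mse_loss(pred[mask], targets[mask])
loss.backward()
opt.step(); opt.zero_grad()
ema_update()                                # keep the teacher trailing the student
print(float(loss))
```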

Zero-Shot Cross-Lingual Phonetic Recognition with External Language Embedding

Heting Gao, Junrui Ni, Yang Zhang, Kaizhi Qian, Shiyu Chang, Mark Hasegawa-Johnson
2021 Conference of the International Speech Communication Association  
Multilingual phonetic recognition systems mitigate data sparsity issues by training models on data from multiple languages and learning a speech-to-phone or speech-to-text model universal to all languages  ...  This paper argues that in the real world, even an unseen language has metadata: linguists can tell us the language name, its language family and, usually, its phoneme inventory.  ...  The performance is shown in Table 2, where both proposed models ("w2v+linear+mask" and "w2v+gcn+mask") outperform the "base" model; the "w2v+gcn+mask" model achieves the lowest multilingual error rate, while  ... 
doi:10.21437/interspeech.2021-1843 dblp:conf/interspeech/GaoNZQCH21 fatcat:h36lrbx54bbjtkplt67cgrkyeq

CLOWER: A Pre-trained Language Model with Contrastive Learning over Word and Character Representations [article]

Borun Chen, Hongyin Tang, Jiahao Bu, Kai Zhang, Jingang Wang, Qifan Wang, Hai-Tao Zheng, Wei Wu, Liqian Yu
2022 arXiv   pre-print
Pre-trained Language Models (PLMs) have achieved remarkable performance gains across numerous downstream tasks in natural language understanding.  ...  Various Chinese PLMs have been successively proposed for learning better Chinese language representation.  ...  The coarse-grained information is only implicitly explored in masked language modeling by designing the masking strategies, and the coarse-grained representations are absent.  ... 
arXiv:2208.10844v2 fatcat:vl2yls4g7factbjwovynzakl5q
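
The abstract describes contrastive learning over word-level and character-level representations. As a generic illustration of that kind of objective, the sketch below computes a symmetric in-batch InfoNCE loss that pulls together the character-level and word-level encodings of the same sentence and pushes apart other sentences in the batch. Encoder internals are omitted, and the dimensions and temperature are assumptions; the paper's actual objective and granularity alignment are more detailed.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(char_repr, word_repr, temperature=0.05):
    """char_repr, word_repr: (batch, dim) encodings of the same sentences."""
    char_repr = F.normalize(char_repr, dim=-1)
    word_repr = F.normalize(word_repr, dim=-1)
    logits = char_repr @ word_repr.t() / temperature   # scaled cosine similarities
    labels = torch.arange(char_repr.size(0))           # positives on the diagonal
    # Symmetric InfoNCE: char->word and word->char directions.
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2

# Toy usage with random vectors standing in for encoder outputs.
print(float(contrastive_loss(torch.randn(16, 128), torch.randn(16, 128))))
```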

Mask Attention Networks: Rethinking and Strengthen Transformer [article]

Zhihao Fan, Yeyun Gong, Dayiheng Liu, Zhongyu Wei, Siyuan Wang, Jian Jiao, Nan Duan, Ruofei Zhang, Xuanjing Huang
2021 arXiv   pre-print
However, their static mask matrices limit the capability for localness modeling in text representation learning.  ...  In this paper, we present a novel understanding of SAN and FFN as Mask Attention Networks (MANs) and show that they are two special cases of MANs with static mask matrices.  ...  We argue that the deficiency of the Transformer in local structure modeling is caused by the attention computation with a static mask matrix.  ... 
arXiv:2103.13597v1 fatcat:qistwvn3sfa6xbuectdsg2x5my
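
The snippet frames self-attention (SAN) and the feed-forward layer (FFN) as two special cases of attention with a static mask matrix. The numpy sketch below illustrates only that framing: with an all-ones mask the computation is ordinary softmax attention, while with an identity mask each token attends solely to itself, which is FFN-like in the sense of being purely token-wise. It is an interpretation of the idea, not the paper's implementation.

```python
# Illustrative numpy sketch of attention reweighted by a static mask matrix M.
import numpy as np

def masked_attention(Q, K, V, M):
    """Q, K, V: (n, d); M: (n, n) with entries in [0, 1]."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights * M                               # static mask reweights attention
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 5, 8
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))

san_out = masked_attention(Q, K, V, np.ones((n, n)))    # standard self-attention
ffn_like = masked_attention(Q, K, V, np.eye(n))         # each token sees only itself
print(np.allclose(ffn_like, V))                         # True: identity mask passes V through
```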

Vision-Language Intelligence: Tasks, Representation Learning, and Large Models [article]

Feng Li, Hao Zhang, Yi-Fan Zhang, Shilong Liu, Jian Guo, Lionel M. Ni, PengChuan Zhang, Lei Zhang
2022 arXiv   pre-print
We summarize the development in this field into three time periods, namely task-specific methods, vision-language pre-training (VLP) methods, and larger models empowered by large-scale weakly-labeled data  ...  After that, we show how recent work utilizes large-scale raw image-text data to learn language-aligned visual representations that generalize better on zero or few shot learning tasks.  ...  (a) Original BERT with single-modality, where some language tokens are masked for prediction to train language representation.  ... 
arXiv:2203.01922v1 fatcat:vnjfetgkpzedpfhklufooqet7y

Adaptive Transformers for Learning Multimodal Representations [article]

Prajjwal Bhargava
2020 arXiv   pre-print
In this work, we extend adaptive approaches to learn more about model interpretability and computational efficiency.  ...  The usage of transformers has grown from learning about language semantics to forming meaningful visiolinguistic representations.  ...  In this example, the right answer is assigned a deficient score. The network does not seem to learn distinguishing features from similar classes properly.  ... 
arXiv:2005.07486v3 fatcat:mtiihd5y6vgjjkztnxpvxbsp7u

Zero-shot Aspect-level Sentiment Classification via Explicit Utilization of Aspect-to-Document Sentiment Composition [article]

Pengfei Deng, Jianhua Yuan, Yanyan Zhao, Bing Qin
2022 arXiv   pre-print
Based on this, we propose the AF-DSC method to explicitly model such sentiment composition in reviews.  ...  Our key intuition is that the sentiment representation of a document is composed of the sentiment representations of all the aspects of that document.  ...  mechanism in the pre-trained language model.  ... 
arXiv:2209.02276v1 fatcat:xfjt5xz2xnb35mpqidwgfrknvy
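
The snippet's key intuition is that a document's sentiment representation is composed of the sentiment representations of its aspects. Below is a minimal sketch of that composition with summation as the (assumed) pooling choice: a single sentiment head trained on the composed document representation can then score each aspect representation directly. Dimensions, the number of polarities, and the pooling are illustrative assumptions, not the paper's AF-DSC architecture.

```python
import torch
import torch.nn as nn

dim, n_polarities = 128, 3                  # negative / neutral / positive (assumed)
classifier = nn.Linear(dim, n_polarities)

aspect_reprs = torch.randn(4, dim)          # stand-ins for 4 aspect representations
doc_repr = aspect_reprs.sum(dim=0)          # document sentiment = composition of aspects

doc_logits = classifier(doc_repr)           # document-level supervision trains this head,
aspect_logits = classifier(aspect_reprs)    # which can then score each aspect zero-shot
print(doc_logits.shape, aspect_logits.shape)
```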

Page 2 of Classical Antiquity Vol. 17, Issue 1 [page]

1998 Classical Antiquity  
These vases have in common that they show a cult-image of Dionysos, consisting of a mask or masks on a column, in combination with the conventional Attic imagery of the revelling ecstatic female worshippers  ...  Sophocles thus finds in this exercise in self-representation a way to frame critical questions on dramatic theory and to define his own dramatic practice.  ... 

Global and Local Semantic Completion Learning for Vision-Language Pre-training [article]

Rong-Cheng Tu, Yatai Ji, Jie Jiang, Weijie Kong, Chengfei Cai, Wenzhe Zhao, Hongfa Wang, Yujiu Yang, Wei Liu
2023 arXiv   pre-print
Cross-modal alignment plays a crucial role in vision-language pre-training (VLP) models, enabling them to capture meaningful associations across different modalities.  ...  However, most of them pay little attention to the global semantic features generated for the masked data, resulting in a limited cross-modal alignment ability of global representations to local features  ...  The absence of MLM would lead to deficient textual representations, which will impair the reconstruction goals of MLTC, thereby limiting the effectiveness of MLTC in understanding.  ... 
arXiv:2306.07096v2 fatcat:wmz3kf4b7vblbpv2z2le2b542e

Generative Sentiment Transfer via Adaptive Masking [article]

Yingze Xie, Jie Xu, LiQiang Qiao, Yun Liu, Feiren Huang, Chaozhuo Li
2023 arXiv   pre-print
Moreover, a sentiment-aware masked language model is further proposed to fill in the blanks in the masked positions by incorporating both context and sentiment polarity to capture the multi-grained semantics  ...  In this paper, we view the positions to be masked as learnable parameters, and further propose a novel AM-ST model to learn adaptive task-relevant masks based on the attention mechanism.  ...  Infilling Blanks: In this stage, our model infills tokens at masked positions using a sentiment-aware masked language model (Senti-MLM), as shown in Figure 2(b).  ... 
arXiv:2302.12045v1 fatcat:s7doxemsgjagfexxhgyurw3cke
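
The snippet describes a two-stage scheme: choose the positions to mask adaptively, then infill them with a sentiment-aware masked LM. The sketch below illustrates only the first stage, approximating the learned, attention-based selection with fixed per-token attention scores supplied by hand; the scores, masking ratio, and mask token are assumptions, and the paper instead learns the mask positions end-to-end as parameters.

```python
import torch

def select_masks(tokens, attention, mask_ratio=0.3, mask_token="[MASK]"):
    """tokens: list[str]; attention: (len(tokens),) tensor of sentiment attention."""
    k = max(1, round(len(tokens) * mask_ratio))
    top = torch.topk(attention, k).indices.tolist()   # most sentiment-bearing positions
    return [mask_token if i in top else t for i, t in enumerate(tokens)]

tokens = ["the", "food", "was", "terrible", "and", "overpriced"]
attention = torch.tensor([0.02, 0.08, 0.03, 0.55, 0.02, 0.30])  # made-up scores
print(select_masks(tokens, attention))
# ['the', 'food', 'was', '[MASK]', 'and', '[MASK]'] -- the two most attended words
```

The masked positions would then be filled by the sentiment-aware masked LM (Senti-MLM) described in the snippet.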

Extract Aspect-based Financial Opinion Using Natural Language Inference

Raymond So, Chun Fai Carlin Chu, Cheuk Wing Jessie Lee
2022 2022 8th International Conference on E-business and Mobile Commerce  
Traditional rule-based NLP, for instance, is known for its deficiency in creating context-aware representations of words and sentences.  ...  The emergence of transformer-based pre-trained language models (PTLMs) has brought new and improved techniques to natural language processing (NLP).  ...  ACKNOWLEDGMENTS The work reported in this paper was supported by the Research Matching Grant Scheme administered by the University Grants Committee in Hong Kong.  ... 
doi:10.1145/3543106.3543120 fatcat:vb7aofzscfglznjacjr4nxxury
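
The title and snippet describe framing aspect-based financial opinion extraction as natural language inference with a pre-trained LM. A common way to prototype that idea is the zero-shot classification pipeline, which scores candidate aspect-sentiment hypotheses against a premise sentence via an NLI model; the model name, example sentence, and candidate labels below are assumptions for illustration, not the paper's setup.

```python
# Hedged sketch: scoring aspect-sentiment hypotheses against a financial
# sentence with an NLI-based zero-shot classification pipeline.
# Requires: pip install transformers torch
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

sentence = "Revenue beat expectations, but management warned of rising costs next quarter."
candidate_labels = [
    "positive opinion about revenue",
    "negative opinion about revenue",
    "positive opinion about costs",
    "negative opinion about costs",
]

result = classifier(sentence, candidate_labels, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{score:.2f}  {label}")
```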
Showing results 1 — 15 out of 22,318 results