ABSTRACT
Objectives: Determining significant prognostic risk factors requires that patient information be quantified accurately according to the extent of disease. An essential step in predicting prognostic risk factors is determining patient features, which are typically hidden in electronic medical records (EMRs). The goal of this study is to extract clinical entities from Chinese clinical reports, enabling automated hepatocellular carcinoma knowledge extraction.
Materials and Methods: We annotated hepatocellular carcinoma corpora using patient records from an EMR database and present an information extraction solution based on an assembled method. Our evaluation dataset contains 3996 training sentences and 1570 test sentences. The evaluation metrics are precision, recall, and F1 under exact matching.
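As an illustration of the evaluation metrics, the following is a minimal sketch of entity-level scoring. It assumes the common convention that a predicted entity counts as correct only when its boundaries and its type both agree with a gold annotation. The tuple layout, the function name `exact_match_prf`, and the entity type labels are hypothetical and are not taken from the study.

```python
from typing import Iterable, Set, Tuple

# Hypothetical entity representation: (start_offset, end_offset, entity_type).
Entity = Tuple[int, int, str]


def exact_match_prf(gold: Iterable[Entity],
                    pred: Iterable[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span-and-type matches."""
    gold_set: Set[Entity] = set(gold)
    pred_set: Set[Entity] = set(pred)

    # A true positive is a prediction identical to a gold entity.
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Illustrative data only; the type labels are assumptions.
gold_entities = [(0, 2, "Disease"), (5, 8, "Symptom"), (10, 12, "Test")]
pred_entities = [(0, 2, "Disease"), (5, 7, "Symptom"),
                 (10, 12, "Test"), (14, 16, "Drug")]

precision, recall, f1 = exact_match_prf(gold_entities, pred_entities)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

In this toy example the second prediction overlaps a gold entity but has a different end offset, so it earns no credit under exact matching. Scoring schemes that reward partial overlap would report higher numbers on the same output.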
Results and Conclusions: Named entity recognition (NER) on admission reports, radiology reports, and discharge summaries achieved F1 scores of 0.8449, 0.5935, and 0.7320, respectively. Relation extraction (RE) achieved an overall F1 of 0.9129. This study lays a foundation for larger population studies to identify clinical features of hepatocellular carcinoma.
Index Terms
- Towards Automated Knowledge Discovery of Hepatocellular Carcinoma: Extract Patient Information from Chinese Clinical Reports