A Survey of Deep Learning for Electronic Health Records

Abstract

Medical data is an important part of modern medicine. However, with the rapid growth in the volume of data, it has become difficult to use these data effectively. Advances in machine learning, including feature engineering, enable researchers to capture and extract valuable information from medical data. Many deep learning methods have been proposed to handle various subtasks of electronic health records (EHR) from the perspectives of information extraction and representation learning. This survey designs a taxonomy to summarize and introduce existing deep learning-based methods for EHR, which can be divided into four types: Information Extraction, Representation Learning, Medical Prediction, and Privacy Protection. Furthermore, we summarize the most widely recognized EHR datasets (MIMIC, eICU, PCORnet, Open NHS, NCBI-disease, and the i2b2/n2c2 NLP Research Data Sets) and introduce their labeling schemes. We also provide an overview of deep learning models in various EHR applications. Finally, we summarize the challenges that EHR tasks face and identify avenues for future deep EHR research.

Description

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Keywords

electronic health records (EHR), machine learning (ML), deep learning, de-identification, privacy preservation, deep EHR, natural language processing (NLP)

Citation

Xu J, Xi X, Chen J, Sheng VS, Ma J, Cui Z. A Survey of Deep Learning for Electronic Health Records. Applied Sciences. 2022; 12(22):11709. https://doi.org/10.3390/app122211709