Heart disease has long been one of the major causes of morbidity and mortality among the world’s population. Diabetic patients, in particular, are more likely to be diagnosed with heart disease than patients without diabetes. Although the symptoms of heart disease may range from slight chest discomfort to palpitations, or may even go unnoticed in some cases, chest pain (angina) is one of the common presentations of the disease at the Emergency Department (ED). Because failure to recognise atypical presentations of serious illnesses may lead to worse outcomes, accurate diagnosis is crucial to ensure that each patient is placed on the appropriate treatment pathway.
Electronic medical records (EMRs) are highly informative resources that help clinicians diagnose patients effectively and provide safer care. Although much epicrisis information in EMRs is organised in structured form, still more critical information useful for medical investigations remains within the unstructured narrative portions. Gaining a full picture of a patient’s medical history by reading through EMRs is time-consuming, especially when only a specific piece of information is needed in an emergency. The difficulty of this process increases in the case of heart disease (especially coronary artery disease, or CAD) because of its complex progression, which typically involves various factors, including lifestyle and social factors such as family history and smoking status, as well as specific medical conditions such as hyperlipidaemia and hypertension.
Clinicians therefore face the problem of obtaining a complete view of a patient’s disease history, its progression and its associated risk factors from an EMR while under pressure or with limited time. To address this issue, many researchers have proposed solutions based on natural language processing (NLP), specifically information extraction (IE), motivated by the need to improve the quality of health care and reduce medical errors. A variety of IE techniques have been applied to clinical text mining, ranging from simple pattern matching to full processing approaches based on symbolic or statistical information and machine learning. More recently, deep learning, a class of machine learning techniques, has shown a remarkable capability for processing unstructured narrative text, with exceptional performance on NLP tasks.
The primary objective of this research is to investigate the applications of deep learning for identifying risk factors for heart disease in EMRs and to develop a clinical information extraction model based on the results of that investigation. The main task is to determine the clinical evidence contained in each patient’s medical record that indicates the presence of diseases (CAD and diabetes) and relevant risk factors (hyperlipidaemia, hypertension, obesity status, smoking status and family history). As EMRs contain critical clinical information in the form of unstructured narrative text and are regarded as the primary means of communication in the medical domain, knowledge derived from such a source will be beneficial to various medical applications, such as clinical decision support or analysis of potential disease treatment pathways. The data collected in this research will also assist the ED in further refining the chest pain pathway protocols and in developing a clinical data registry for future study.