Abstract
Health information quality is a significant public health issue. Recent concerns have focused on online information, because the inherently unregulated nature of the Internet allows anyone to post incorrect information that can harm users. This has led to the development of several instruments for assessing the Health Information Quality (HIQ) of websites. Evaluation with these instruments is generally done manually, which requires time and effort. Moreover, a recent study found that HIQ varied across medical domains and websites, and stated that the overall quality of online health information is still problematic. Consequently, automatic evaluation of the quality of online health information websites is highly desirable. The aim of this thesis is to demonstrate that Natural Language Processing (NLP) and machine learning techniques can be used to assess information quality in online health documents automatically. The thesis explores several aspects of this problem. Firstly, people’s understanding of HIQ criteria was analysed: a questionnaire covering all the quality criteria available in the literature was designed and disseminated online to identify the most crucial criteria. The criteria were ranked according to users’ feedback and organised into quality dimensions. Secondly, several annotated datasets were collected for the purpose of assessing HIQ, as obtaining such data is one of the most challenging issues in studies of this kind. Thirdly, novel methods to assess the quality of health information were developed. These are based on NLP and machine learning, range from rule-based methods to machine learning classifiers, and were evaluated on the criteria for which annotated datasets were available. Furthermore, another method was developed to predict the performance of a sophisticated set of criteria used to evaluate new interventions in online health news articles.
The results were generally promising and showed that automatic assessment is viable. Finally, based on these results, a framework was proposed for assessing, more generally, the importance of quality metrics and their suitability for automatic assessment. This is the basis of a general framework for assessing HIQ automatically, in which NLP-based features are used to evaluate HIQ criteria with the help of machine learning algorithms.
To sum up, the main contributions to knowledge of this thesis are a detailed analysis of a range of HIQ criteria in terms of importance to consumers and amenability to automatic
|Date of Award|2019|
|Supervisor|Roger Evans (Supervisor), Pietro Ghezzi (Supervisor) & Gulden Uchyigit (Supervisor)|