Implementation of BERT based machine learning model to extract cancer-miRNA relationship from research literature / by Arunprasad Sundharam.

Author/creator Sundharam, Arunprasad author.
Other author Ding, Qin, degree supervisor.
Other author East Carolina University. Department of Computer Science.
Format Theses and dissertations
Publication [Greenville, N.C.] : [East Carolina University], 2021.
Description 50 pages : illustrations
Supplemental Content Access via ScholarShip

Summary Text mining, the extraction of useful information from large collections of text, is a widely used and growing branch of information technology. Thousands of research papers in medical science study how microRNAs (miRNAs) can assist or impede the development of various types of cancer. miRCancer is a repository that records these cancer-miRNA associations, obtained by analyzing more than 6,500 research papers with text mining techniques. It would be helpful to have a machine learning model that can analyze the title and abstract of a research paper and extract the cancer-miRNA association details when they are present in the text. This thesis proposes such a model, built on BERT, the open-source NLP framework provided by Google, to identify cancer-miRNA relationships in abstract text. BERT is a deep learning model that is pretrained on a Wikipedia text corpus and has built-in knowledge of English language usage. As part of this work, we designed and implemented a machine learning model using the BERT framework and prepared the dataset required to train it to identify cancer-miRNA relationships in text. The model achieved an overall accuracy of 90.3% in retrieving the required information from the research papers in the test dataset, so it can be used to review the results of the existing miRCancer text mining implementation.
General note Presented to the faculty of the Department of Computer Science.
General note Advisor: Qin Ding
General note Title from PDF t.p. (viewed October 19, 2021).
Dissertation note M.S. East Carolina University 2021.
Bibliography note Includes bibliographical references.
Technical details System requirements: Adobe Reader.
Technical details Mode of access: World Wide Web.

Availability

Library Location Call Number Status Item Actions
Electronic Resources Access Content Online ✔ Available