A hybrid Bayesian network and tensor factorization approach for missing value imputation to improve breast cancer recurrence prediction

File type: Book
Language: English
Publisher: Elsevier
Edition and year / country: 2018

Details

Related fields: Computer Engineering
Related specializations: Software
Journal: Journal of King Saud University – Computer and Information Sciences
University: Department of Software Engineering – Islamic Azad University – Mashhad – Iran

Published in an Elsevier journal
Keywords: Breast cancer recurrence, Missing value imputation, Classification, Tensor factorization, Bayesian network

Description

1. Introduction
Nowadays, breast cancer is the second deadliest cancer in Iran. After years of study and research, researchers still face many unanswered questions in domains such as prediction, diagnosis and treatment. According to the latest statistics, the mean annual number of new breast cancer cases in Iran is approximately 10,000, and approximately 2500 of these patients lose their lives (Sharfian et al., 2015). Women comprise approximately 98% of breast cancer patients, and the average age of breast cancer diagnosis in Iranian women is a decade lower than the world average (Sharfian et al., 2015). Recurrence, one of the major problems in breast cancer, is the regrowth of cancer cells in the surgical area or related areas. The likelihood of post-surgery recurrence affects breast cancer patients' lives at any time. Therefore, recurrence prediction is the main factor for successful treatment of this disease (Kim, 2012). A large amount of patient information is collected in medical datasets. To benefit from this collected data and increase the accuracy of prediction, a number of researchers have used data mining and machine learning approaches for predicting breast cancer (Choi and Jiang, 2010). Classification algorithms are widely used to discover valuable information in datasets that can be applied in the real world. The aim of classification is to predict a class label for each sample in the dataset (Zheng et al., 2014). The results of classification approaches vary with the number of features, the number of instances, the number of classes and the degree of class imbalance. However, datasets are not always complete; they often include missing values in some samples. This is a major challenge in using data mining approaches for breast cancer prediction.
Missing values may occur for various reasons, such as lack of response from patients, human error or system faults during data collection. Although some learning algorithms can work with incomplete data, most of them cannot handle missing values. They either discard the samples that contain at least one missing value or assign a valid value to the corresponding attribute (Zheng et al., 2014; García-Laencina, 2015; Tutz and Ramzan, 2015; Little and Rubin, 2002). Removing incomplete data is acceptable only when the proportion of missing values is small, e.g., 5%. As the missing ratio increases, this method leads to the loss of valuable information. Imputation of missing values is thus necessary for making efficient predictions using data mining tools (García-Laencina, 2015).
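The trade-off between discarding incomplete samples and imputing them can be illustrated with a small sketch. Everything here is an assumption made for illustration: the toy data, the function names (`make_incomplete`, `drop_incomplete`, `mean_impute`) and the choice of column-mean imputation, which is only a simple baseline and not the hybrid Bayesian network and tensor factorization method that the paper proposes. Cells are assumed to go missing independently of each other.

```python
import random


def make_incomplete(n_rows, n_cols, missing_ratio, seed=0):
    """Build a toy numeric table; each cell is blanked (None) independently
    with probability `missing_ratio`. Purely synthetic, not patient data."""
    rng = random.Random(seed)
    return [
        [None if rng.random() < missing_ratio else rng.gauss(0.0, 1.0)
         for _ in range(n_cols)]
        for _ in range(n_rows)
    ]


def drop_incomplete(rows):
    """Listwise deletion: keep only rows that have no missing cell."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Baseline imputation: replace each missing cell with the mean of the
    observed values in its column (0.0 if the column has no observed value)."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]


if __name__ == "__main__":
    for ratio in (0.05, 0.20, 0.40):
        data = make_incomplete(1000, 10, ratio)
        kept = len(drop_incomplete(data))
        filled = len(mean_impute(data))
        print(f"missing ratio {ratio:.0%}: deletion keeps {kept} rows, "
              f"imputation keeps {filled} rows")
```

Under the independence assumption, a row with d attributes survives deletion with probability (1 - p)^d for a per-cell missing ratio p, so the number of usable samples shrinks quickly as p grows, whereas imputation keeps every sample at the cost of introducing estimated values.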
