Page 67 - Big Data Analytics for Intelligent Healthcare Management

60      CHAPTER 4 TRANSFER LEARNING AND SUPERVISED CLASSIFIER




             models, the dimensionality reduction algorithm PCA was applied. Then, to perform classification, three
             different classifiers were used: logistic regression (LR), support vector machine (SVM), and K-nearest
             neighbor (K-NN). This model could give the pathologist a preliminary idea of the class to which a
             breast tissue image belongs, for example, benign or malignant. After that step, it will be easier for
             pathologists to confirm or reject the predicted class when diagnosing the image.
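
The pipeline described above (extracted features, then PCA, then a supervised classifier) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature matrix is random synthetic data standing in for ConvNet features, PCA is computed with NumPy's SVD, and only a plain K-NN classifier is shown. The helper names `pca_fit`, `pca_transform`, and `knn_predict`, as well as the feature size, number of components, and K value, are hypothetical choices made for the example.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the training mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project X onto the principal directions."""
    return (X - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=5):
    """Label each test row by majority vote among its k nearest training rows."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    dists = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Synthetic stand-in for ConvNet features (an assumption of this sketch):
# two well-separated classes (0 = benign, 1 = malignant) in 64 dimensions.
rng = np.random.default_rng(0)
benign = rng.normal(loc=0.0, scale=1.0, size=(100, 64))
malignant = rng.normal(loc=3.0, scale=1.0, size=(100, 64))
X = np.vstack([benign, malignant])
y = np.array([0] * 100 + [1] * 100)

# Shuffle, then hold out the last 50 samples for testing.
order = rng.permutation(len(X))
X, y = X[order], y[order]
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

# Fit PCA on the training split only, then project both splits.
mean, components = pca_fit(X_train, n_components=10)
Z_train = pca_transform(X_train, mean, components)
Z_test = pca_transform(X_test, mean, components)

y_pred = knn_predict(Z_train, y_train, Z_test, k=5)
accuracy = float((y_pred == y_test).mean())
print(f"reduced shape: {Z_train.shape}, test accuracy: {accuracy:.2f}")
```

Fitting PCA on the training split alone keeps information from the test images out of the projection, which matters when the reported accuracy is meant to reflect unseen data.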




             4.2 RELATED WORK
             A large body of research addresses the automatic prediction of the presence of breast cancer using
             different datasets. In paper [3], using the BreaKHis image dataset to classify the images into two
             classes (benign and malignant), the authors used six different feature extractors with four different
             classifiers and reported accuracies ranging from 80% to 85%. The authors of paper [4], using different
             pretrained ConvNet models trained for thousands of epochs, reported a highest accuracy of 99.8%. In
             paper [5], the authors reported accuracies ranging from 81% to 90% using DECAF features with other
             classifiers and compared their results with other works. In paper [6], the authors presented a
             comparative study of different machine learning techniques on breast cancer FNA biopsy data; K-NN
             with Euclidean distance showed a prediction accuracy of 100% for K values of 5, 10, and 11, and the
             same accuracy was obtained using cityblock distance with a K value of 13. In paper [7], the authors
             applied different machine learning algorithms to the Wisconsin breast cancer dataset and analyzed
             their performance. They reported accuracies close to 100%, with SVM giving an accuracy of 100%. In
             paper [8], the authors used ResNet-50 and VGG16 and reported accuracies of 89% and 84%,
             respectively. In paper [9], the author used different machine learning algorithms and reported
             accuracies of 98.8% and 96.33%, respectively, using SVM on two different datasets. In paper [10], the
             authors reported their best accuracy of 99.038% using MLP on the Wisconsin breast cancer dataset.
             Much research in the area of cancer has also applied supervised and semisupervised classification,
             clustering, and feature detection methods to biomedical images [11–20], as well as optimization and
             information security techniques to medical data [21–25], helping to build computer-aided medical
             systems.
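
The K-NN study cited above [6] compares Euclidean and cityblock distances. As a small illustration of how the two metrics differ (a sketch with hypothetical function names and made-up points, not code from any of the cited papers), both can be computed directly:

```python
import math

def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cityblock(a, b):
    """Cityblock (Manhattan) distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

p, q = (0.0, 0.0), (3.0, 4.0)
print(euclidean(p, q))  # 5.0
print(cityblock(p, q))  # 7.0
```

Because the two metrics can rank neighbors differently, the best-performing K value may change with the metric.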




             4.3 DATASET AND METHODOLOGIES
             Dataset: In this work, we used the BreaKHis [3] breast cancer histopathological image dataset.
             It contains 7909 RGB images of benign and malignant tissue at four magnification factors
             (40×, 100×, 200×, and 400×) (Fig. 4.1, Table 4.1).



             4.3.1 CONVOLUTIONAL NEURAL NETWORKS (CNNS/CONVNETS)
             Convolutional neural networks are state-of-the-art models for image classification. They are very
             similar to ordinary neural networks, except that they include some additional types of layers. They
             have three main building blocks: the convolution layer, the pooling layer, and the fully connected
             layer. These terms are described further below.
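
To make the three building blocks concrete, the following is a minimal NumPy sketch of one forward pass through a convolution layer, a pooling layer, and a fully connected layer. It is an illustration under stated assumptions rather than the architecture used in this chapter: the input is a random single-channel 6×6 array, the weights are random, no activation function is applied, and the function names `conv2d`, `max_pool`, and `fully_connected` are hypothetical.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size * size
    w = feature_map.shape[1] // size * size
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

def fully_connected(x, weights, bias):
    """Every output is a weighted sum of all inputs plus a bias."""
    return weights @ x + bias

# Random input and parameters, used only to show how the shapes change.
rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.normal(size=(3, 3))
weights = rng.normal(size=(2, 4))
bias = np.zeros(2)

feature_map = conv2d(image, kernel)                      # 6x6 -> 4x4
pooled = max_pool(feature_map)                           # 4x4 -> 2x2
scores = fully_connected(pooled.ravel(), weights, bias)  # 4 values -> 2 scores
print(feature_map.shape, pooled.shape, scores.shape)     # (4, 4) (2, 2) (2,)
```

The convolution and pooling layers shrink the spatial size while keeping local structure, and the fully connected layer maps the flattened result to one score per class (two here, for example benign and malignant).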