Page 215 - Big Data Analytics for Intelligent Healthcare Management

208     CHAPTER 8 BLOCKCHAIN IN HEALTHCARE: CHALLENGES AND SOLUTIONS






              Table 8.2 A List of Personally Identifiable Information (PII) and Potential PII (PPII)

              Personally Identifiable Information (PII)       | Potential Personally Identifiable Information (PPII)
              ------------------------------------------------+-----------------------------------------------------
              Social security number (SSN), full name, credit | Living area, partial name, few digits of SSN, food
              card information, bank information, car number, | place, medical information, workplace information,
              passport information, National Identification   | partial email, race and sex information, educational
              (NID), login information, handwriting, image,   | institute, IP address, supported information, blood
              location, health insurance information.         | pressure, height, weight, partial phone number.


             weight, race, and living area cannot identify a person on their own. However, a combination of such
             information can narrow the possibilities down to a particular individual. Onik et al. [35–39] proposed
             a sequential question-based attack classification. A similar concept can be applied to classify
             healthcare information security issues. Healthcare data are also being breached through mobile
             applications linked with healthcare. A study by Onik et al. [35–39] described how easily a
             collaborative application manufacturer could obtain patients’ health-related identity information.
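                The effect of combining PPII can be illustrated with a minimal sketch. The records, attribute
             names, and the helper unique_fraction below are hypothetical and invented purely for illustration;
             they are not taken from the cited studies. The sketch measures what fraction of records is unique
             on a given set of attributes.

```python
from collections import Counter

# Hypothetical toy records; every value is invented for illustration only.
RECORDS = [
    {"weight": "70-79", "race": "A", "living_area": "North"},
    {"weight": "70-79", "race": "A", "living_area": "South"},
    {"weight": "70-79", "race": "B", "living_area": "North"},
    {"weight": "80-89", "race": "A", "living_area": "North"},
    {"weight": "80-89", "race": "B", "living_area": "South"},
    {"weight": "80-89", "race": "B", "living_area": "North"},
]


def unique_fraction(records, attributes):
    """Return the fraction of records whose values on `attributes` are unique."""
    keys = [tuple(record[a] for a in attributes) for record in records]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(records)


if __name__ == "__main__":
    # Each attribute alone singles out nobody in this toy dataset ...
    for attribute in ("weight", "race", "living_area"):
        print(attribute, unique_fraction(RECORDS, [attribute]))
    # ... but the three attributes together single out every record.
    print("combined", unique_fraction(RECORDS, ["weight", "race", "living_area"]))
```

                In this toy dataset no single attribute isolates a record, while the three attributes taken
             together isolate all of them. Real datasets behave less cleanly, but the same counting approach
             can be used to estimate how identifying a combination of attributes is.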
                Protected health information (PHI): This refers to any individually identifiable health
             information generated by a clinic, health plan, or health payer (e.g., clinic, doctor, pathology
             laboratory, government health department, or insurance company). More specifically, any physical or
             mental health information linked to an individual’s past, present, or future is known as PHI. PHI
             can be maintained or transmitted in any form, such as speech, paper, or electronic documents. The
             term PHI was first used by HIPAA (Health Insurance Portability and Accountability Act) in 1996.
             Some examples of sensitive PHI are given below:
                Hospital information: First date of hospital visit, patient registration ID, hospital bed number,
             doctor’s ID, hospital address, etc.
                Images: Images related to individuals and their items of interest. This includes every kind of
             X-ray image, medical image, clinical report picture, and MRI scan.
                Biometric data: The main purpose of biometric data is to carry an individual’s biological
             information. This information is very sensitive and includes special body marks, fingerprints,
             handwriting, retina color, weight, blood type, voice type, DNA, race, body color, etc.
                Payment and contact information: Every kind of payment method and associated numbers, patient
             phone number, the address where therapy is received, contact email, etc.
                Mehmet Kayaalp [48] described several relationships among different kinds of personal identity
             information. That study treated PHI as the intersection of PII and medical records, and classified
             three kinds of medical data. The idea is elaborated in Fig. 8.11.
                Anonymization techniques: Anonymization of information is the alteration of PII, PPII, or PHI into
             an anonymous state. Berinato [49] argued in the Harvard Business Review that “there is no such thing
             called anonymous data,” noting that several MIT scientists experimented on a dataset of 1.1 million
             credit card information entries and found that 90%–94% of these entries could be used to obtain
             personal information through reverse engineering. Nevertheless, several researchers have proposed data
             anonymization techniques, and we now discuss a few methods for high-dimensional data or big data.
             Abdalaal et al. [50] used a statistical learning method together with Hilbert curve anonymization to
             increase the utility of the dataset. This MSA-diversity technique converts multidimensional
             identifiers into single-dimensional data. Sweeney [51] first used the k-anonymity technique for
             database anonymization. Several later studies have used this method to de-identify personal data.
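                As a minimal sketch of the k-anonymity idea: a table is k-anonymous when every record shares its
             quasi-identifier values with at least k - 1 other records. The records, field names, and the helpers
             k_of and generalize_age below are hypothetical and chosen for illustration; generalizing ages into
             ranges is one common way to raise k and is not claimed to be the exact procedure of [51].

```python
from collections import Counter

# Hypothetical toy records; ZIP codes are already partially masked.
RAW = [
    {"age": 31, "zip": "130**", "diagnosis": "flu"},
    {"age": 34, "zip": "130**", "diagnosis": "asthma"},
    {"age": 38, "zip": "130**", "diagnosis": "flu"},
    {"age": 42, "zip": "148**", "diagnosis": "diabetes"},
    {"age": 47, "zip": "148**", "diagnosis": "flu"},
]

QUASI_IDENTIFIERS = ["age", "zip"]  # assumed quasi-identifiers for this sketch


def k_of(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    quasi-identifier values. `records` must be non-empty."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize_age(record, width=10):
    """Return a copy of `record` with the exact age replaced by a range label."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


if __name__ == "__main__":
    print("k before generalization:", k_of(RAW, QUASI_IDENTIFIERS))
    generalized = [generalize_age(r) for r in RAW]
    print("k after generalization:", k_of(generalized, QUASI_IDENTIFIERS))
```

                With exact ages every record forms its own group, so the toy table is only 1-anonymous. After
             ages are generalized into ten-year ranges, the smallest group contains two records, so the table
             becomes 2-anonymous at the cost of less precise age values.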