Page 215 - Big Data Analytics for Intelligent Healthcare Management
208 CHAPTER 8 BLOCKCHAIN IN HEALTHCARE: CHALLENGES AND SOLUTIONS
Table 8.2 A List of Personally Identifiable Information (PII) and Potential PII (PPII)
Personally Identifiable Information (PII): Social security number (SSN), full name, credit card information, bank information, car number, passport information, National Identification (NID), login information, handwriting, image, location, health insurance information.
Potential Personally Identifiable Information (PPII): Living area, partial name, few digits of SSN, food place, medical information, workplace information, partial email, race and sex information, educational institute, IP address, supported information, blood pressure, height, weight, partial phone number.
weight, race, and living area cannot identify a person on their own. However, a combination of
these attributes can narrow the data down to a particular individual. Onik et al. [35–39] proposed a
sequential question-based attack classification, and a similar concept could be applied to classify
healthcare information security issues. Healthcare data are also being breached through mobile
applications that handle health-related information: a study by Onik et al. [35–39] described how
easily a collaborative application manufacturer could obtain patients' health-related identity information.
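The point that attributes such as weight, race, and living area do not identify anyone alone but can do so in combination can be illustrated with a minimal sketch. The records, field names, and the helper `unique_fraction` below are hypothetical and invented purely for illustration; they are not taken from the cited studies.

```python
from collections import Counter

# Hypothetical toy records; every value here is invented for illustration.
records = [
    {"weight": 70, "race": "A", "living_area": "North"},
    {"weight": 70, "race": "B", "living_area": "North"},
    {"weight": 82, "race": "A", "living_area": "North"},
    {"weight": 70, "race": "A", "living_area": "South"},
    {"weight": 82, "race": "B", "living_area": "South"},
    {"weight": 82, "race": "A", "living_area": "South"},
]


def unique_fraction(rows, attributes):
    """Fraction of rows whose values on `attributes` occur exactly once."""
    keys = [tuple(row[a] for a in attributes) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# Each attribute on its own: no record is singled out.
single = {a: unique_fraction(records, [a]) for a in ("weight", "race", "living_area")}

# All three attributes together: every record becomes unique.
combined = unique_fraction(records, ["weight", "race", "living_area"])

print(single)
print(combined)
```

In this toy dataset no single attribute isolates a record, while the three attributes taken together isolate every one of them. Real datasets will behave less cleanly, but the same counting approach gives a rough measure of re-identification risk.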
Protected health information (PHI): This refers to any individually identifiable health information
generated by a clinic, health plan, or health payer (e.g., a clinic, doctor, pathology laboratory,
government health department, or insurance company). More specifically, any physical or mental health
information linked to an individual's past, present, or future is known as PHI. PHI can be maintained
or transmitted in any form, including speech, paper, or electronic documents. The term was first used
in the Health Insurance Portability and Accountability Act (HIPAA) of 1996. Some examples of sensitive
PHI are given below:
Hospital information: First date of hospital visit, patient registration ID, hospital bed number,
doctor's ID, hospital address, etc.
Images: Images related to individuals and their items of interest. This includes every kind of X-ray
image, medical image, clinical report picture, and MRI scan.
Biometric data: Biometric data carry an individual's biological information. This information is very
sensitive and includes distinguishing body marks, fingerprints, handwriting, retina color, weight,
blood type, voice type, DNA, race, skin color, etc.
Payment and contact information: Every kind of payment method and the associated numbers, patient
phone number, the address where therapy is received, contact email, etc.
Kayaalp [48] described several relationships among the different categories of personal information,
treating PHI as the common set (intersection) of PII and medical records. That study classified
medical data into three kinds. The idea is elaborated in Fig. 8.11.
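The reading of PHI as the common set of PII and medical records can be expressed as a set intersection. The sketch below is only an illustration of that reading: the field names are hypothetical, and it does not reproduce the three-way classification of the cited study.

```python
# Hypothetical field names, chosen only for illustration.
pii_fields = {
    "full_name", "ssn", "passport_number",
    "patient_registration_id", "fingerprint",
}
medical_record_fields = {
    "patient_registration_id", "fingerprint",
    "blood_pressure", "diagnosis_code",
}

# PHI read as the common set: fields that are both personally
# identifying and part of a medical record.
phi_fields = pii_fields & medical_record_fields

# Identifying fields that are not part of the medical record.
non_medical_pii = pii_fields - medical_record_fields

# Medical fields that do not identify a person on their own.
non_identifying_medical = medical_record_fields - pii_fields

print(sorted(phi_fields))
```

Under these assumptions, a field such as `blood_pressure` belongs to the medical record but is not PHI by itself, which matches its placement among potential PII rather than PII.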
Anonymization techniques: Anonymization is the alteration of PII, PPII, or PHI into an anonymous
state. Although Berinato [49] argued in the Harvard Business Review that “there's no such thing as
anonymous data,” several researchers have proposed data anonymization techniques. Berinato reported
that MIT scientists experimented on a dataset of 1.1 million credit card records and found that
90%–94% of the entries could be traced back to personal information through reverse engineering.
Nevertheless, we now discuss a few anonymization methods for high-dimensional data or big data.
Abdalaal et al. [50] combined a statistical learning method with Hilbert curve anonymization to
increase the utility of the dataset. This MSA-diversity technique converts multidimensional
identifiers into one-dimensional data. Sweeney [51] first used k-anonymity for database
anonymization, and several later studies have adopted the method to de-identify personal data.
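As a minimal sketch of the k-anonymity idea: a released table is k-anonymous when every record shares its quasi-identifier values with at least k − 1 other records. The toy records, the choice of `age` and `zip` as quasi-identifiers, and the helpers `k_of` and `generalize` below are assumptions made for illustration; they are not the procedure of any cited study.

```python
from collections import Counter


def k_of(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


def generalize(row):
    """Coarsen age to a 10-year band and keep only the first three ZIP digits."""
    low = (row["age"] // 10) * 10
    return {**row, "age": f"{low}-{low + 9}", "zip": row["zip"][:3] + "**"}


# Hypothetical toy records; all values are invented.
raw = [
    {"age": 23, "zip": "47677", "condition": "flu"},
    {"age": 27, "zip": "47602", "condition": "asthma"},
    {"age": 34, "zip": "47905", "condition": "flu"},
    {"age": 38, "zip": "47909", "condition": "diabetes"},
]

qi = ["age", "zip"]
released = [generalize(r) for r in raw]

k_raw = k_of(raw, qi)            # every raw record is unique on (age, zip)
k_released = k_of(released, qi)  # generalized records fall into groups of two

print(k_raw, k_released)
```

Generalization trades precision for privacy: coarser bands raise k but reduce the analytical utility of the data, which is the trade-off that utility-preserving methods try to improve.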