Identifying PII

Personally Identifiable Information (PII), also called Personal Data or Personal Information, is any information that relates to a natural person (an individual). In data protection terms, that person is referred to as a "data subject".

The scope of the definition is broad. It includes directly identifying data (for example, first and last name) as well as indirectly identifying data, which can point to a person when several independent data elements are combined (for example, telephone number, license plate, or terminal identifier).

For example, when relating to natural persons, the following data is personal data:

  • Surname, first name, pseudonym, date of birth
  • Photos, sound recordings of voices
  • Fixed or mobile telephone number, postal address, email address
  • IP address, computer connection identifier, or cookie identifier
  • Fingerprint, palm print or vein pattern of the hand, retinal print
  • License plate number, social security number, ID number
  • Application usage data, comments, etc.
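
As a rough illustration of how some of these data types might be spotted in free text, the following Python sketch scans a string for email addresses and IPv4 addresses. The function name `find_pii_candidates` and both patterns are assumptions made for this example. The patterns are deliberately simple, cover only two of the categories listed above, and will miss or over-match real-world input, so treat the result as candidates for review rather than a complete PII inventory.

```python
import ipaddress
import re

# Deliberately simple, illustrative patterns (assumptions for this sketch).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_CANDIDATE_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_pii_candidates(text: str) -> dict[str, list[str]]:
    """Return email addresses and valid IPv4 addresses found in text."""
    emails = EMAIL_RE.findall(text)
    ips = []
    for candidate in IPV4_CANDIDATE_RE.findall(text):
        try:
            # Reject dotted numbers that are not valid addresses (e.g. 999.1.1.1).
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue
        ips.append(candidate)
    return {"email": emails, "ipv4": ips}


if __name__ == "__main__":
    line = "User jane.doe@example.com logged in from 192.0.2.17"
    print(find_pii_candidates(line))
```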

Any operation on PII constitutes processing under data protection law and must, therefore, meet the applicable requirements. For example, accessing, manipulating, storing, and transferring PII are all processing activities that need to comply with data protection law.

You should implement privacy by design and consider whether you have a justification for processing PII, even if your app never stores it and only proxies it or queries it from elsewhere.

Anonymization and pseudonymization

Anonymizing personal data is an irreversible process that makes identifying individuals within data sets impossible. Anonymous data doesn't contain any PII and, therefore, falls outside the scope of data protection law.

You need to decide whether to anonymize data, and which anonymization technique to use, on a case-by-case basis according to the context of use. By default, you should never consider a raw dataset anonymous. However, once you have implemented an appropriate anonymization process, you can share the data with third parties and keep it indefinitely.

With pseudonymization, by contrast, data relating to an individual can no longer be attributed to that individual without additional information. For example, to pseudonymize PII, you replace personal identifiers in a dataset with random numbers or codes. Unlike anonymization, pseudonymization can be a reversible process.
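
The replacement of identifiers with random codes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a recommended implementation: the function name `pseudonymize`, the record layout, and the field name `email` are hypothetical. The sketch returns a lookup table alongside the data, and it is that table which makes the process reversible.

```python
import secrets


def pseudonymize(records, key_field):
    """Replace key_field in each record with a random token.

    Returns (pseudonymized_records, lookup), where lookup maps each
    token back to the original value.
    """
    token_for = {}  # original value -> token
    lookup = {}     # token -> original value; keep separately, restrict access
    out = []
    for record in records:
        original = record[key_field]
        if original not in token_for:
            token = secrets.token_hex(8)
            while token in lookup:  # extremely unlikely collision
                token = secrets.token_hex(8)
            token_for[original] = token
            lookup[token] = original
        # Copy the record so the input data is left unchanged.
        out.append({**record, key_field: token_for[original]})
    return out, lookup


if __name__ == "__main__":
    rows = [
        {"email": "a@example.com", "plan": "pro"},
        {"email": "b@example.com", "plan": "free"},
        {"email": "a@example.com", "plan": "pro"},
    ]
    pseudonymized, table = pseudonymize(rows, "email")
    print(pseudonymized)
```

The same person receives the same token in every record, so records remain linkable. Anyone holding the lookup table can restore the original identifiers.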

Pseudonymization reduces risk, but unlike anonymized data, pseudonymized data is still considered PII and is therefore subject to data protection law.

To better understand the difference between anonymization and pseudonymization, consider these examples:

  • Anonymization
    • Data Masking (+ data deletion)
    • Randomization
    • Aggregation (+ identifiers deletion)
    • Generalization (+ identifiers deletion)
  • Pseudonymization
    • Data Masking (+ masked data is kept)
    • Encryption
    • Hashing (+ table is kept)
    • Tokenization
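
To make the anonymization side of the list above more concrete, here is a minimal Python sketch that combines generalization and aggregation with deletion of identifiers. The function names `age_band` and `aggregate_by_age_band`, the record layout, and the 10-year band width are assumptions chosen for this example. Whether such output is actually anonymous still has to be assessed case by case, for example because a band containing very few people may allow re-identification.

```python
from collections import Counter


def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age to a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def aggregate_by_age_band(records):
    """Drop identifiers, generalize age, and return counts per band."""
    # Only the generalized age is read; names and other fields are discarded.
    return dict(Counter(age_band(r["age"]) for r in records))


if __name__ == "__main__":
    people = [
        {"name": "Ann", "age": 34},
        {"name": "Bob", "age": 37},
        {"name": "Cy", "age": 52},
    ]
    print(aggregate_by_age_band(people))
```

No lookup table is kept, so the original records cannot be restored from the result.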