“Hashing” refers to the process of using an algorithm to transform data of any size into a fixed-size output (e.g., a combination of numbers and letters) that is, for practical purposes, unique to the input. To put it in layman’s terms, some piece of information (e.g., a name) is run through an equation that creates a string of characters. Any time the exact same name is run through the equation, the same string of characters is created. If a different name (or even the same name spelled differently) is run through the equation, an entirely different string of characters will emerge.
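The behavior described above can be illustrated with a short sketch. It assumes SHA-256 (via Python's standard `hashlib` module) as the hashing algorithm; the names and the helper `hash_name` are made up for illustration and are not drawn from any particular system.

```python
import hashlib

def hash_name(name: str) -> str:
    """Return the SHA-256 digest of a name as 64 hexadecimal characters."""
    return hashlib.sha256(name.encode("utf-8")).hexdigest()

# The same input always produces the same output.
first = hash_name("Jane Smith")
again = hash_name("Jane Smith")

# A slightly different spelling produces a different output.
variant = hash_name("Jane Smyth")

# The output length is fixed regardless of how long the input is.
long_input = hash_name("Jane Smith " * 1000)

print(first == again)                 # True
print(first == variant)               # False
print(len(first), len(long_input))    # 64 64
```

Other hashing algorithms would produce outputs of a different length, but the same three properties would be expected to hold.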
While the output of a hash cannot be immediately reversed to “decode” the information, if the range of possible input values is known, each value could be run through the algorithm until one produces a matching output. The matching output would then indicate what the initial input had been. For instance, if a Social Security Number was hashed, the number might be reverse engineered by hashing all possible Social Security Numbers and comparing the resulting values. When a match to the initial hash is found, the initial Social Security Number that created the hash string would be identified. The net result is that while hash functions can be used to mask personal data, the resulting values can be subject to brute-force attacks.
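A minimal sketch of such a brute-force attack follows. To keep it fast, it assumes the attacker already knows the first five digits and only has to try the 10,000 possible final four digits; searching every nine-digit number works the same way with a larger loop. The SHA-256 algorithm, the helper names, and the number `000123456` are all assumptions made for illustration (the number is fabricated and is not a real Social Security Number).

```python
import hashlib
from typing import Optional

def hash_ssn(ssn: str) -> str:
    """Hash a nine-digit number (digits only, no dashes) with SHA-256."""
    return hashlib.sha256(ssn.encode("utf-8")).hexdigest()

def brute_force_ssn(target_hash: str, known_prefix: str) -> Optional[str]:
    """Try every four-digit ending for a known five-digit prefix.

    Returns the matching nine-digit input, or None if no candidate matches.
    """
    for serial in range(10_000):
        candidate = f"{known_prefix}{serial:04d}"
        if hash_ssn(candidate) == target_hash:
            return candidate
    return None

# A hash value that has been disclosed without the underlying number.
disclosed_hash = hash_ssn("000123456")

# Replaying the candidate inputs recovers the original value.
recovered = brute_force_ssn(disclosed_hash, "00012")
print(recovered)  # 000123456
```

The attack succeeds because the set of possible inputs is small and well known, not because of any weakness in the hashing algorithm itself.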
Whether a hash value in and of itself is considered “personal information” depends upon the particular law or regulation at issue.
Information is not considered “personal information” under the CCPA if it has been “deidentified.” Deidentification means that the data “cannot reasonably identify, relate to, describe, be capable of being associated with, or be linked, directly or indirectly, to a particular consumer.” An argument could be made that, once hashed, information cannot reasonably be associated with an individual. That argument is strengthened under the CCPA if a business takes the following four steps to help ensure that the hashed data will not be re-identified:
- Implement technical safeguards that prohibit reidentification. Technical safeguards may include the process or techniques by which data has been deidentified. For example, this might include the strength of the hashing algorithm used or combining hashing with other techniques to further obfuscate information (e.g., salting).
- Implement business processes that specifically prohibit reidentification. This might include an internal policy or procedure that prevents employees or vendors from attempting to reidentify data or reverse hashed values.
- Implement business processes to prevent inadvertent release of deidentified information. This might include a policy against disclosing hashed values to the public.
- Make no attempt to reidentify the information. As a functional matter, this entails taking steps to prohibit reidentification by the business’s employees.
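As an illustration of the salting technique mentioned in the first step, the sketch below prepends a randomly generated value to the data before hashing. The choice of SHA-256, the 16-byte salt length, the helper `salted_hash`, and the fabricated number `000123456` are assumptions for illustration only; this is one possible approach, not a statement of what any law requires.

```python
import hashlib
import secrets

def salted_hash(value: str, salt: bytes) -> str:
    """Hash a value with SHA-256 after prepending a salt to it."""
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()

# A random 16-byte salt (assumption: the business generates and stores
# this value separately from the hashed data).
salt = secrets.token_bytes(16)

unsalted = hashlib.sha256("000123456".encode("utf-8")).hexdigest()
salted = salted_hash("000123456", salt)

# The salted value does not match a plain hash of the same number, so a
# list of plain hashes of every possible number would not reveal it.
print(salted == unsalted)                          # False

# With the same salt, the result is still repeatable.
print(salted == salted_hash("000123456", salt))    # True
```

Salting makes brute-force reidentification more difficult only for someone who does not have the salt. Anyone who obtains it could include it when replaying candidate values, which is one reason the technical safeguard is paired with the business processes listed above.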
Pursuant to the CPRA, slightly different deidentification factors will apply beginning in 2023.
In comparison, in the context of the European GDPR, the Article 29 Working Party considered hashing to be a technique for pseudonymization that “reduces the linkability of a dataset with the original identity of a data subject” and thus “is a useful security measure,” but is “not a method of anonymisation.” In other words, from the perspective of the Article 29 Working Party, while hashing might be a useful security technique, it is not sufficient to convert personal data into deidentified data.
 Cal. Civ. Code § 1798.145(o)(3) (West 2020).
 Cal. Civ. Code § 1798.140(h) (West 2020).
 See Cal. Civ. Code § 1798.140(h)(1)-(4) (West 2020).
 Salting refers to the insertion of characters into data before it is hashed to make brute force reidentification more difficult.
 Cal. Civ. Code § 1798.140(m) (West 2021).
 The Article 29 Working Party was the predecessor to the European Data Protection Board.
 Opinion of the Data Protection Working Party on Anonymisation Techniques, 0829/14/EN WP 216, at 20 (adopted on April 10, 2014).