Subject identifiers
The following is not legal advice, but simply general suggestions and ideas. The GDPR field is still in flux, and only with time will accepted patterns and common use cases emerge.
As introduced in the overview, dealing with data subject identifiers is an important part of a GDPR strategy. There are multiple ways to implement identifiers correctly, depending on your application:
- If your application already has some userId, you may want to use this identifier as the data subject id. However, take care that the id does not itself carry personal information. For example, if the id includes the “nickname” of a user, you should not use it as a data subject id, because the id itself can be seen as personal data. If you would like to use user ids as your data subject ids, consider using a SHA-1 hash of the user id and some additional seed, as illustrated in the example below.
- A good alternative is to generate UUIDs for data subjects. You can do so using the java.util.UUID class and obtain its String representation to be used in the WithDataSubjectId wrapper provided by akka-gdpr, as described in @ref[].
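The UUID approach can be sketched as follows. This is a minimal illustration, not part of akka-gdpr: the class name DataSubjectIds and the method dataSubjectIdFor are hypothetical, and the in-memory map stands in for whatever durable storage your application would use to remember which UUID belongs to which user.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not part of akka-gdpr): assigns each internal user id
 * a random UUID that is then used as the data subject id.
 */
final class DataSubjectIds {
  // Assumption: a real application stores this mapping durably.
  // An in-memory map is used here only to keep the sketch self-contained.
  private final Map<String, String> byUserId = new ConcurrentHashMap<>();

  /** Returns the data subject id for the user, generating one on first use. */
  String dataSubjectIdFor(String userId) {
    return byUserId.computeIfAbsent(userId, id -> UUID.randomUUID().toString());
  }
}
```

Because the UUID is random, the resulting String carries no information about the user. It is this String that would be handed to the WithDataSubjectId wrapper.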
Also, check whether “a user” has more than one data subject id. For example, different systems may have assigned the same user different ids. When a request to remove data for a given user is issued to your application, it may need to deal with all of the user’s data subject ids.
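One way to keep track of several data subject ids per user is a small registry, sketched below. All names here (UserDataSubjectRegistry, register, forgetUser) and the sample ids are hypothetical, and the removal action is passed in as a callback because how data is actually removed depends on your application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical registry of every data subject id assigned to a user. */
final class UserDataSubjectRegistry {
  private final Map<String, Set<String>> idsByUser = new ConcurrentHashMap<>();

  /** Records that the given data subject id belongs to the user. */
  void register(String user, String dataSubjectId) {
    idsByUser.computeIfAbsent(user, u -> ConcurrentHashMap.newKeySet()).add(dataSubjectId);
  }

  /**
   * Applies the removal action to every data subject id known for the user
   * and returns the ids that were handled.
   */
  List<String> forgetUser(String user, Consumer<String> removeDataFor) {
    Set<String> ids = idsByUser.remove(user);
    List<String> handled = new ArrayList<>();
    if (ids != null) {
      for (String id : ids) {
        removeDataFor.accept(id);
        handled.add(id);
      }
    }
    return handled;
  }
}
```

A removal request for one user then fans out to all of that user's ids, for example `registry.forgetUser("alice", id -> removeDataFor(id))`, where removeDataFor is your application's own removal logic.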
Another point of discussion is whether metadata associated with a particular data subject id should be removed or not. Even when using data shredding, information about when events were stored, combined with other correlated data, could be used to deduce some information about “that specific” data subject. At this point no rulings have established how far one should go with regard to sanitizing such metadata.
Example
The following example illustrates using SHA-1 to hash a user id together with an application-specific seed (SHA-1 is a one-way hash, not encryption):
- Scala

    import java.nio.charset.StandardCharsets
    import java.security.MessageDigest
    import java.util.Base64
    import akka.Done
    import akka.stream.scaladsl.Sink

    // MessageDigest is not thread-safe: only share the instance when using parallelism = 1
    private val sha1 = MessageDigest.getInstance("SHA-1")

    /**
     * Implement your logic for determining a stable data subject id for each event here.
     *
     * For example, it could be based on masking a known user identifier that exists in all
     * events related to a given user. Or it could be *based on* the persistenceId of the event passed in,
     * which is a simple and effective solution.
     */
    private def determineEncryptionKey(event: JournaledEvent): Option[String] = {
      if (event.persistenceId.startsWith("user")) {
        try {
          // seed first, then the id, both with an explicit charset
          sha1.update("my-app-secrets".getBytes(StandardCharsets.UTF_8))
          sha1.update(event.persistenceId.getBytes(StandardCharsets.UTF_8))
          // encode the raw digest as text; new String(bytes) would mangle non-text bytes
          Some(Base64.getEncoder.encodeToString(sha1.digest()))
        } finally {
          sha1.reset()
        }
      } else {
        None
      }
    }
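- Java

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.Base64;
    import java.util.Optional;

    // MessageDigest is not thread-safe: only share the instance when using parallelism = 1.
    // Assign it once, for example in the constructor, via MessageDigest.getInstance("SHA-1").
    final MessageDigest SHA1;

    /**
     * Implement your logic for determining a stable data subject id for each event here.
     *
     * For example, it could be based on masking a known user identifier that exists in all
     * events related to a given user. Or it could be *based on* the persistenceId of the event passed in,
     * which is a simple and effective solution.
     */
    private Optional<String> determineEncryptionKey(JournaledEvent event) {
      if (event.persistenceId().startsWith("user")) {
        try {
          // seed first, then the id, both with an explicit charset
          SHA1.update("my-app-secrets".getBytes(StandardCharsets.UTF_8));
          SHA1.update(event.persistenceId().getBytes(StandardCharsets.UTF_8));
          // encode the raw digest as text; new String(bytes) would mangle non-text bytes
          return Optional.of(Base64.getEncoder().encodeToString(SHA1.digest()));
        } catch (Exception ex) {
          return Optional.empty();
        } finally {
          SHA1.reset();
        }
      } else {
        return Optional.empty();
      }
    }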
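The snippets above depend on a JournaledEvent and on a shared digest instance. As a self-contained variation, the following sketch hashes a plain user id with a seed and creates a fresh MessageDigest per call, so it can be used from several threads. The class name HashedSubjectId and the choice of hex encoding are assumptions made for this illustration, not something prescribed by akka-gdpr.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: derives a data subject id from a seed and a user id. */
final class HashedSubjectId {

  /** SHA-1 over the seed followed by the user id, as 40 lowercase hex characters. */
  static String of(String seed, String userId) {
    try {
      // A fresh instance per call avoids sharing mutable digest state between threads.
      MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
      sha1.update(seed.getBytes(StandardCharsets.UTF_8));
      sha1.update(userId.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : sha1.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-1 is one of the algorithms every Java platform is required to provide.
      throw new IllegalStateException(e);
    }
  }
}
```

The same seed and user id always produce the same data subject id, which is what makes the id stable across events. The seed should be kept secret, since anyone who knows it can recompute the id from a known user id.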