Legal and privacy considerations in healthcare
Key Takeaways
- The reuse of patient data for training an AI model is possible, but requires a clear legal basis and a careful balancing of privacy interests.
- Training an AI model can qualify as scientific research. In that case, the compatibility of the processing is presumed, and a compatibility test does not need to be carried out.
- Focus on transparency and patient autonomy. Clear communication about policies on secondary processing, together with the possibility to opt out, strengthens both the lawfulness of secondary processing and patient trust.
Secondary processing in a hospital
Hospitals process large volumes of personal data on a daily basis in the context of patient care. This data is primarily used to deliver healthcare. At the same time, there is a growing need to reuse this data for secondary purposes such as (scientific) research, quality monitoring, policy evaluation, education and operational management.
The secondary use of personal data in healthcare offers significant opportunities for medical progress and improvements in the quality of care. However, it also entails specific risks for patient privacy. The data collected within a hospital largely consists of special categories of personal data within the meaning of Article 9 GDPR, which are subject to an increased level of protection.
Societal expectations regarding the use of health data have evolved considerably in recent years. There is growing support for using patient data to advance medical research and innovation. At the same time, patients are increasingly aware of their privacy rights and expect clear transparency about how their data is used. This calls for a balanced approach that safeguards both the interests of the hospital and those of the patient.
Scientific research
Under the GDPR, the concept of “scientific research” is generally interpreted broadly. It covers not only fundamental research, but also technological development and demonstration activities, meaning the practical testing and validation of new knowledge or technologies to show that they work in practice. This raises the question of whether training an AI model can also be considered scientific research. This remains a subject of debate.
When assessing whether an activity qualifies as scientific research, the following elements may be taken into account:
- the use of a scientific methodology;
- the added value for society;
- the potential impact on society and/or patients.
Recital 61 of the European Health Data Space Regulation (EHDSR) explicitly addresses this point, stating that:
“Activities related to scientific research include innovation activities such as training of AI algorithms that could be used in healthcare or the care of natural persons, as well as the evaluation and further development of existing algorithms and products for such purposes.”
If, in a specific context, the training of an AI model is qualified as scientific research, reliance can be placed on the principle of compatible processing. In that case, the compatibility of processing personal data for scientific research purposes is presumed, meaning that a compatibility test does not need to be carried out (Recital 50 GDPR).
In addition, the processing of special categories of personal data requires a separate exception. Given that the activity qualifies as scientific research, the most appropriate exception will generally be Article 9(2)(j) GDPR. When relying on this exception, appropriate safeguards within the meaning of Article 89(1) GDPR must be implemented. The exact nature of these safeguards depends on the specific context of the project, but the principle of data minimisation remains central.
Compatible processing and the compatibility test
If the training of an AI model does not qualify as scientific research in the specific situation, another legal basis must be identified. In that case, it may still be possible to rely on the doctrine of compatible processing. This doctrine allows personal data to be further processed without a new legal basis, provided that the secondary purpose is compatible with the original purpose of collection. To assess this, a compatibility test must be carried out.
In accordance with Article 6(4) GDPR, the controller must take into account several factors when assessing whether further processing for another purpose is compatible with the original purpose, including:
- the link between the purposes for which the personal data was originally collected and the purposes of the intended further processing;
- the context in which the personal data was collected, in particular the relationship between the data subjects and the controller;
- the nature of the personal data, especially whether special categories of personal data within the meaning of Article 9 GDPR are processed, or whether data relating to criminal convictions and offences under Article 10 GDPR are involved;
- the possible consequences of the intended further processing for the data subjects;
- the existence of appropriate safeguards, such as encryption or pseudonymisation.
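As an illustration only, and not as legal advice, the five Article 6(4) factors could be recorded in project documentation as a simple structured checklist. All field names and example answers below are hypothetical, loosely based on the cancer-detection example discussed in this text:

```python
from dataclasses import dataclass

@dataclass
class CompatibilityAssessment:
    """Records the outcome of each Article 6(4) GDPR factor for one project.

    Field names are illustrative labels for the five factors, not legal terms of art.
    """
    link_between_purposes: str
    context_of_collection: str
    nature_of_data: str
    consequences_for_data_subjects: str
    safeguards: list

    def documented(self) -> bool:
        # A compatibility test must address every factor; an empty answer
        # means the assessment is incomplete and should not be relied upon.
        return all([
            self.link_between_purposes,
            self.context_of_collection,
            self.nature_of_data,
            self.consequences_for_data_subjects,
            self.safeguards,
        ])

# Hypothetical example, mirroring the cancer-detection AI project.
assessment = CompatibilityAssessment(
    link_between_purposes="Care data reused to improve early cancer detection",
    context_of_collection="Collected during treatment; reuse disclosed in privacy notice",
    nature_of_data="Health data (special category, Article 9 GDPR)",
    consequences_for_data_subjects="Expected benefit for future patients; no direct adverse effect",
    safeguards=["pseudonymisation", "access controls"],
)
print(assessment.documented())  # True: every factor has been addressed
```

A structure like this does not replace the legal analysis itself; it merely helps demonstrate, for accountability purposes, that each factor was considered.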
Applied to the example of training an AI model using cancer data to enable earlier cancer detection in the future, this assessment could look as follows:
Link between the purposes
This concerns the extent to which there is a logical connection between the original purpose of data collection and the secondary purpose. What is the link between healthcare provision as the primary purpose and training an AI model as a secondary purpose? In this example, there is a clear connection, as cancer data initially collected in the context of care provision is reused to improve early cancer detection in the future.
Context of data collection
This factor assesses whether patients can reasonably expect their data to be used for secondary purposes. The initial information provided to patients is crucial in shaping these expectations. Is the training of an AI model within the reasonable expectations of the patient, and how is this communicated? Through a privacy notice, a specific information notice, or information screens? In this example, a cancer patient could be proactively informed about the development of an AI model aimed at early cancer detection. Such communication contributes to patient trust in the healthcare institution.
Nature of the personal data
Particular attention must be paid to the sensitivity of health data and the potential impact of further processing on the patient’s fundamental rights.
Possible consequences of the intended further processing
This involves assessing the potential positive and negative effects for both patients and society. Does the processing create added value for healthcare without causing adverse consequences for the patient? In this example, if the AI model leads to earlier cancer detection, it is likely to have a positive impact on future cancer patients.
Existence of appropriate safeguards
Appropriate technical and organisational measures must be implemented, such as encryption, pseudonymisation, generalisation, anonymisation or aggregation. The key question is whether the data remains identifiable for suppliers or third parties. Only where the AI model cannot be developed using fully anonymous data may pseudonymised data be used.
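To make the pseudonymisation safeguard concrete, the sketch below replaces a direct patient identifier with a keyed hash before a record is exported to a training dataset. This is a minimal illustration, assuming a hospital-held secret key; all identifiers and values are hypothetical:

```python
import hmac
import hashlib

def pseudonymise(patient_id: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (a pseudonym).

    Unlike a plain hash, a keyed HMAC cannot be reversed or re-computed by
    anyone who does not hold the key, which stays with the hospital as
    controller. The data therefore remains pseudonymous, not anonymous.
    """
    return hmac.new(secret_key, patient_id.encode(), hashlib.sha256).hexdigest()

# In practice the key would be managed in a key vault and never exported.
key = b"hospital-held-secret"
record = {"patient_id": "BE-12345", "diagnosis": "C50.9"}

# The exported record carries only the pseudonym, never the raw identifier.
export = {"pid": pseudonymise(record["patient_id"], key), "diagnosis": record["diagnosis"]}
print("BE-12345" in export.values())  # False: the raw identifier is not exported
```

Because the hospital retains the key, re-identification remains possible for the controller; this is precisely why pseudonymised data is still personal data under the GDPR and still requires the safeguards discussed above.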
Consent
As an alternative, data processing for the purpose of training an AI model may be considered a form of primary processing rather than secondary processing. In that case, consent may serve as the legal basis. Such consent must be explicit, specific, informed and unambiguous.
In practice, however, this legal basis is only chosen exceptionally. This is due to hospital policies that promote secondary data use for societal benefit, the practical difficulty of obtaining consent from every patient, and the inherently fragile nature of consent as a legal basis, since it can be withdrawn at any time.
Transparency
Regardless of the chosen legal basis, patients must always be informed about what happens to their personal data. This information can be provided in different ways, including:
- communicating an external policy via the hospital website or brochures explaining the reuse of data for secondary purposes, including the use of data for training AI models;
- providing project-specific information. Depending on its policy choices, a hospital may opt for proactive, general communication about the various secondary processing projects carried out within the organisation. Patients can then request more detailed information about the specific projects in which they are included.
Clear and layered communication helps align patients’ reasonable expectations with the actual use of their data. This, in turn, affects patient trust and the lawfulness of processing, particularly in situations where reliance is placed on the concept of compatible processing.
Patient autonomy
Finally, hospitals can strengthen patient autonomy by offering an opt-out mechanism that allows patients to object to inclusion in future AI projects or other forms of secondary processing. This approach gives patients a voice, respects their preferences and choices, and ultimately contributes to trust in the healthcare institution.
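An opt-out mechanism of this kind can be sketched as a simple registry check applied before any record enters a secondary-use dataset. This is a hypothetical illustration; the registry, identifiers, and function names are assumptions, not an existing system:

```python
# Hypothetical opt-out registry: identifiers of patients who have objected
# to the secondary use of their data for AI projects.
opt_out_registry = {"patient-007"}

records = [
    {"pid": "patient-001", "diagnosis": "C50.9"},
    {"pid": "patient-007", "diagnosis": "C18.2"},
]

def eligible_for_secondary_use(records, registry):
    """Keep only records of patients who have not opted out."""
    return [r for r in records if r["pid"] not in registry]

training_set = eligible_for_secondary_use(records, opt_out_registry)
print([r["pid"] for r in training_set])  # ['patient-001']
```

The essential design point is that the filter runs at the moment a dataset is assembled, so an objection registered today also excludes the patient from future projects.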
Source: VANSWEEVELT, T. and BROECKX, N., Patiëntendossier, beroepsgeheim, GDPR en EHDSR, Larcier Intersentia, 2025.