As private digital health and medtech companies continue to integrate artificial intelligence (AI) into diagnostics, patient monitoring, and engagement platforms, the legal challenges surrounding the use of sensitive health data are becoming increasingly complex. One of the most pressing issues is whether and how AI systems can be lawfully trained on health data, especially when anonymisation is imperfect or where re-identification remains a plausible risk. Health data is subject to heightened legal protection under both the UK and the EU GDPR.
This article explores how healthtech providers can lawfully and effectively structure contracts and data governance frameworks to mitigate legal risk while enabling innovation. Part 1 addresses contractual considerations for AI development in healthtech and Part 2 focuses on data protection considerations for AI training on health data.
Part 1: Contractual considerations for AI development in healthtech
Contracts play a critical role in defining the rights, limitations, and responsibilities of parties developing AI using health data. They must balance the need to enable lawful and effective innovation against patients’ rights, regulatory obligations, and commercial interests, and the enforceability of these rights and responsibilities is only as strong as the underlying contractual terms.
A. Structuring AI training and use rights
The contract must define the authorised scope of data use for AI training activities, with reference to their function and context, such as diagnostic support tools, internal research and development, performance improvement, or commercial AI product development. General-purpose authorisations or catch-all clauses should be avoided: they create legal uncertainty and may not survive regulatory scrutiny.
In relation to AI model ownership, the agreement must set out who owns the resulting trained models, including derivative models or fine-tuned iterations, and the term of any licence to use them.
The agreement should also address access by other parties including subcontractors, AI vendors, or cloud providers. It must regulate both access and onward use, particularly where the AI is being developed collaboratively or cloud hosted. Restrictions on subcontracting, sublicensing, and further commercial exploitation are essential to mitigate both legal and reputational risk.
B. Risk allocation and enforcement
Given the sensitivity of the data and the complexity of the technology involved, the contract should clearly establish which party is liable if things go wrong by including appropriately tailored liability terms and indemnities.
Both parties may seek warranties that the training data has been lawfully collected, is not subject to undisclosed third-party rights, and complies with applicable regulations. This is particularly important where the AI developer is relying on an external source to supply the data.
Audit rights are essential: they enable monitoring throughout the AI development lifecycle and detection of any unauthorised processing or deviation from the agreed scope. Robust termination provisions should permit termination for unauthorised data sharing, training of AI models beyond the agreed purpose, or any data breach involving health information. The agreement should also detail requirements for the return, deletion, or secure anonymisation of data at the end of the contract, and clarify whether trained models may be retained and used after termination or must be deleted, disabled, or dismantled to prevent continued use of the original dataset.
Part 2: Data protection considerations for AI training on health data
In parallel with contractual safeguards, healthtech providers must assess and document the lawful basis for processing health data and ensure that their AI development processes comply with applicable UK and EU GDPR obligations.
A. Legal basis and anonymisation under GDPR
Health data is special category data under the GDPR and may only be processed where one of a narrow set of exceptions applies. In many direct-to-consumer health applications, consent may be relied upon. However, consent is fragile: it may be withdrawn, must be freely given, and may be insufficient where the training use was not transparent at the time of collection.
Where data processing is for scientific research purposes, an exemption may apply, provided appropriate safeguards are implemented. However, this exemption is subject to strict interpretation and may not always cover commercial product development, depending on national implementations.
A common misconception is that data used for AI training can be anonymised and thereby falls outside the scope of the GDPR.
In practice, truly anonymised data is rare, and the threshold for anonymisation is high. Controllers must apply a contextual, risk-based test that assesses the identifiability of data in the hands of all parties who could reasonably access it, in line with guidance issued by the European Data Protection Board (EDPB), which emphasises that anonymisation must be irreversible and assessed against real-world re-identification risks. Regular re-identification testing, combined with data minimisation, masking, aggregation, and contractual prohibitions on re-identification, should be implemented as part of a wider technical and organisational safeguards framework.
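By way of illustration only, the sketch below (in Python, using the pandas library and hypothetical field names) shows how data minimisation, masking and generalisation, and a simple k-anonymity-style re-identification check might be combined on a toy dataset. It is an indicative example of the kinds of technical measures described above, not a substitute for a documented, context-specific anonymisation assessment.

```python
import pandas as pd

# Illustrative patient records with hypothetical field names.
records = pd.DataFrame({
    "patient_id": ["P001", "P002", "P003", "P004"],
    "postcode":   ["SW1A 1AA", "SW1A 2BB", "M1 3CC", "M1 4DD"],
    "birth_year": [1958, 1961, 1990, 1987],
    "diagnosis":  ["hypertension", "hypertension", "asthma", "asthma"],
})

# Data minimisation: drop direct identifiers not needed for model training.
minimised = records.drop(columns=["patient_id"])

# Masking / generalisation: truncate postcodes to the outward code and
# bucket birth years into decades to reduce identifiability.
minimised["postcode"] = minimised["postcode"].str.split().str[0]
minimised["birth_decade"] = (minimised["birth_year"] // 10) * 10
minimised = minimised.drop(columns=["birth_year"])

# Simple re-identification risk check: flag quasi-identifier combinations
# shared by fewer than k records (a k-anonymity-style test).
K = 2
group_sizes = minimised.groupby(["postcode", "birth_decade"]).size()
risky_groups = group_sizes[group_sizes < K]

print(minimised)
print(f"Quasi-identifier groups below k={K}: {len(risky_groups)}")
```

In practice, any such checks would need to reflect the full range of auxiliary data reasonably available to recipients, and would sit alongside the contractual prohibitions on re-identification and the wider organisational safeguards described above.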
B. Data sharing governance and international transfers
Transparent and accountable governance of data sharing and international transfers is essential.
Data Sharing Agreements (DSAs) should define the roles of the parties (whether controller, joint controller, or processor), specify the legal basis for processing, and limit use to agreed purposes. DSAs must also regulate retention periods, data subject rights, onward disclosures, and audit mechanisms. In the AI context, such agreements may need to be tailored to accommodate iterative development and ongoing model refinement.
Where personal data is transferred outside the UK or EU, for example to an AI vendor or cloud host in a third country, appropriate safeguards must be in place. This may involve the use of the EU’s Standard Contractual Clauses (SCCs), the UK’s International Data Transfer Agreement (IDTA), or reliance on an adequacy decision. In addition, Transfer Impact Assessments (TIAs) must be completed, and supplementary technical, organisational, or contractual measures may be required to ensure equivalent protection for data subjects in third countries.
In the UK, the Information Commissioner’s Office (ICO) has also issued guidance on how AI systems can be developed and deployed in a manner that respects data protection principles throughout the AI lifecycle, with a particular focus on explainability and accountability in AI decision-making.
The use of AI in health technology offers transformative benefits but comes with significant contractual, legal, regulatory and ethical obligations.