Data privacy is growing in significance worldwide. Many jurisdictions have enacted laws governing data, both to reduce unnecessary data sharing and to uphold the rights of data principals (the persons whose data is being used). The recent emergence of Federated Learning is one important technological advancement with the potential to protect data privacy in the context of Artificial Intelligence (AI).
Currently, generative AI is built on Machine Learning (ML) models, which are trained on huge datasets so that they can generate content and perform tasks without human intervention. However, the data on which ML systems are trained can pose privacy issues: the data used to train AI applications, such as chatbots or recommendation tools, is traditionally gathered and stored in one single place, a central server. This centralised storage can expose the data to breaches. It is here that Federated Learning comes into play.
In layman’s terms, Federated Learning enables datasets to be used to train AI without the data itself being shared. The data is used in a decentralised manner: it stays where it was originally stored and is never copied to a central server. Federated Learning is a relatively new approach to developing ML models in which each participating device trains the model on its own data and shares only the resulting model parameters, not the underlying dataset. Data held on multiple laptops, phones and servers is therefore never transferred and does not leave the original device. The approach reflects the principle of data minimisation, which seeks to limit access to data at all stages and to ensure minimal retention of data. This principle is enshrined in Article 5(1)(c) of the European Union’s General Data Protection Regulation (GDPR), which states that personal data should be adequate, relevant and limited to what is necessary for the purposes for which it is processed.
Federated Learning has several applications, including within the healthcare industry. Data security concerns have increased in recent years, especially because data generated by the healthcare industry is sensitive and requires legal protection. Health data can be used to conduct research on important medical matters, but its unregulated use can lead to human rights violations. Furthermore, the commercialisation of health data can give certain corporations monopoly power, preventing other entities from accessing and using the data. A real-life example of the privacy concerns arising from health data is the agreement between the Royal Free London NHS Foundation Trust, part of the UK’s National Health Service (NHS), and Google’s AI company DeepMind, under which the health data of over a million patients was shared in return for an app that facilitates the diagnosis of kidney conditions in high-risk patients. This controversial agreement was investigated by the Information Commissioner’s Office (ICO), the UK’s data protection watchdog. The example illustrates the critical need for laws and policies regulating health data privacy.
In such a scenario, recognition by states of the necessity of data minimisation, and the consequent implementation of systems such as Federated Learning, would advance human rights protections. Federated Learning can help to ensure that adequate privacy safeguards are in place. Several states regulate the use of healthcare data, as the Health Insurance Portability and Accountability Act does in the United States. Such laws restrict the use of medical data and can thereby constrain research endeavours. With Federated Learning, patient data does not move beyond the institution in which it is stored, because the entire training process takes place locally within each participating institution. Only certain model characteristics, such as parameters or gradients, are transferred, which protects the privacy of the data and ensures that sensitive data is not shared across multiple platforms.
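To make the mechanism described above more concrete, the following is a minimal sketch of one common way of combining local training results, usually called federated averaging. It is an illustration under simplifying assumptions, not a description of any production system: the three "hospitals", their toy records, the simple linear model and all function names are hypothetical. Each site trains on its own records and reports only its model parameters and a record count; the coordinating server never sees the records themselves.

```python
from typing import List, Tuple

Params = Tuple[float, float]          # (weight, bias) of the model y = w*x + b
Dataset = List[Tuple[float, float]]   # local (x, y) records; these never leave the site


def local_update(params: Params, data: Dataset,
                 lr: float = 0.05, steps: int = 20) -> Params:
    """Run a few gradient-descent steps on one institution's own data."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error over the local records only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Tuple[Params, int]]) -> Params:
    """Server step: average the parameters, weighted by each site's record count."""
    total = sum(n for _, n in updates)
    w = sum(p[0] * n for p, n in updates) / total
    b = sum(p[1] * n for p, n in updates) / total
    return w, b


def train(sites: List[Dataset], rounds: int = 50) -> Params:
    params: Params = (0.0, 0.0)
    for _ in range(rounds):
        # Each site trains locally and reports only (parameters, record count).
        updates = [(local_update(params, data), len(data)) for data in sites]
        params = federated_average(updates)
    return params


# Hypothetical toy data: three sites whose records all follow y = 2x + 1.
hospital_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
hospital_b = [(-1.0, -1.0), (0.5, 2.0)]
hospital_c = [(1.5, 4.0), (-0.5, 0.0), (3.0, 7.0), (-2.0, -3.0)]

w, b = train([hospital_a, hospital_b, hospital_c])
print(f"w={w:.2f}, b={b:.2f}")
```

In this sketch the shared model recovers the relationship underlying all three datasets (w close to 2, b close to 1), even though the only information that crosses institutional boundaries is a pair of numbers and a record count per round.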
The ICO has acknowledged the importance of data minimisation and has recommended the adoption of newer privacy-preserving techniques such as Federated Learning, particularly given their large-scale applications in healthcare. The Global South, too, is realising the importance of data protection and increasingly regards data as a valuable asset that needs legal protection. Given the significant privacy implications of Federated Learning, states need to create regulatory frameworks that encourage it and give it legal backing, with a special emphasis on the healthcare industry.