Application deadline: 31st March 2024
Statistical analysis and machine learning approaches can only be as good as the underlying data, and harder tasks call for larger collections of accurate data. Collectively, health organisations, companies and service providers gather massive amounts of data from patients, employees and users, but individually, a single hospital, company or service provider only has access to its own smaller subset. Data sharing partnerships are challenging due to distrust, varying quality standards, legal obligations and ethical considerations aimed at protecting the privacy of the people whose confidential information is collected.

Naive anonymisation techniques may remove names, dates and identifiers, but overlook quasi-identifiers: rare combinations of facts that, combined with contextual knowledge, point to a particular person. More sophisticated frameworks address this: differential privacy employs randomised algorithms that satisfy formal privacy guarantees (see the sketch below), while federated learning uses privacy-enhancing technologies to safely aggregate data in a decentralised fashion using cryptographic techniques.

The studentship can take many directions, such as mathematical foundations for privacy-preserving algorithms, intuitively explainable formal privacy guarantees, federated learning systems satisfying legal requirements, and other practical or theoretical solutions towards privacy-preserving data sharing. Contact m.shekelyan@qmul.ac.uk for further information or questions about potential directions.
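To give a flavour of the randomised algorithms behind differential privacy, the following minimal Python sketch illustrates the classic Laplace mechanism (an illustrative example, not code associated with the project): it releases a noisy count whose distribution barely changes when any one person's record is added or removed.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float) -> float:
    """Release a statistic with epsilon-differential privacy.

    Laplace noise scaled to sensitivity/epsilon makes the output
    distribution nearly identical whether or not any single person's
    record is present in the underlying data.
    """
    return true_value + np.random.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: privately release how many patients have a given condition.
# Adding or removing one patient changes a count by at most 1,
# so the sensitivity of a counting query is 1.
private_count = laplace_mechanism(true_value=42, sensitivity=1.0, epsilon=0.5)
```

Smaller values of epsilon yield stronger privacy guarantees at the cost of noisier answers; balancing this trade-off is one of the questions the studentship could explore.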
Supervisor: Michael Shekelyan - m.shekelyan@qmul.ac.uk