Large Language Models (LLMs) present an interesting opportunity for Nokia to increase its efficiency and improve the usability of its products (by automating routine tasks, by augmenting human experts in specialized areas, etc.). However, for Nokia, a company renowned as a trusted and reliable partner, data confidentiality and privacy are paramount. Special measures therefore need to be taken when applying such LLMs to sensitive data, especially when the LLMs are deployed remotely as a service, or when the sensitive data is sourced from various stakeholders.
In this PhD internship, you will research techniques to qualify and quantify how much confidential data could be leaked or exposed when it is processed by GenAI-driven applications. For this, you will combine empirical evaluations with theoretical modeling, and investigate the advantages and disadvantages of various mitigation strategies for reducing this leakage.
Duration: flexible, to be agreed (typically 3-4 months); the starting date is also flexible
Location: Antwerp (Belgium)
Profile:
- Student enrolled in a Ph.D. in Computer Science/Engineering, in Machine Learning and/or Information Theory
- Strong programming skills in Python
- Language skills: English
- Experience in AI with respect to confidentiality and privacy attacks and mitigation strategies is a big plus
- Similarly, experience in information theory and/or modelling is a big plus
- A strong publication record is also a big plus

Tasks:
- You will review scientific literature on data confidentiality in the context of LLM applications
- You will evaluate, both empirically and theoretically, how much confidential information can be leaked when it is processed by GenAI-driven applications
- You will develop various mitigation strategies to reduce the amount of leakage and quantify/qualify their impact on the GenAI-driven application
- You will evaluate these techniques in the context of concrete LLM use cases