Extracting Personal Information from Large Language Models Like GPT-2
Researchers have been able to find all sorts of personal information within GPT-2. This information was part of the training data, and can be extracted with the right sorts of queries. Paper: “ Extracting Training Data from Large Language Models.” Abstract: It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model. We demonstrate our attack on GPT-2, a language model trained on scrapes of the public [...]