Topic 4870
Thesis type: Bachelor, Master
Supervisor: Nicholas Popovic
Research group: Web Science
Archive number: 4870
Thesis status: Open
Start: 1 March 2022
Submission: unknown
Topic
Large language models, such as GPT-3 [1], are capable of generating natural language outputs in response to a prompt and have been shown to perform well on few-shot learning tasks. One key issue that makes the use of these models impractical, however, is their size: using models with hundreds of billions of parameters requires specialized and expensive hardware. For example, the weights of GPT-3 alone require hundreds of GB of GPU memory, while current high-end consumer GPUs typically have a maximum of 24 GB.
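As a back-of-the-envelope check of that memory figure (a sketch assuming 16-bit weights; actual precision, optimizer state, and runtime overhead vary):

```python
# Rough memory estimate for storing model weights only,
# assuming 2 bytes (16-bit floats) per parameter.
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights, in GB."""
    return num_params * bytes_per_param / 1e9

gpt3_params = 175_000_000_000  # GPT-3 has 175 billion parameters [1]
print(f"GPT-3 weights: ~{weight_memory_gb(gpt3_params):.0f} GB")  # ~350 GB
```

Even before activations and batching, the weights alone exceed a consumer GPU's 24 GB by more than an order of magnitude.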
The idea of data augmentation is to artificially increase the amount of training data available for a given task by automatically creating new training examples. The larger training corpus can then be used to train a considerably smaller model for the task at hand. The goal of the proposed thesis is to use a large language model, such as GPT-NeoX-20B [2], together with a small set of labeled examples, to perform data augmentation for relation extraction, i.e. the task of detecting relations between entities (such as persons or organizations) mentioned in a text.
An example of GPT-3 being used for data augmentation can be found in [3].
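The augmentation loop described above could be sketched roughly as follows. This is an illustrative assumption of how the prompting might look, not a prescribed method; `query_llm`, the prompt format, and the seed examples are all hypothetical placeholders for whatever model (e.g. GPT-NeoX-20B) and format the thesis settles on:

```python
# Sketch: few-shot prompting an LLM to generate new labeled
# relation-extraction examples from a small seed set.

SEED_EXAMPLES = [
    ("Tim Cook is the CEO of Apple.", ("Tim Cook", "ceo_of", "Apple")),
    ("Sundar Pichai heads Google.", ("Sundar Pichai", "ceo_of", "Google")),
]

def build_prompt(seed_examples):
    """Format labeled seed examples as a few-shot prompt that asks
    the model to continue with a new sentence/relation pair."""
    lines = ["Generate sentences expressing a relation between two entities."]
    for sentence, (head, rel, tail) in seed_examples:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Relation: {head} | {rel} | {tail}")
    lines.append("Sentence:")  # the model continues from here
    return "\n".join(lines)

def parse_completion(completion):
    """Parse a completion of the form '<sentence>\\nRelation: h | r | t'
    back into a labeled training example."""
    sentence_part, relation_part = completion.split("Relation:")
    head, rel, tail = (x.strip() for x in relation_part.split("|"))
    return sentence_part.strip(), (head, rel, tail)

def query_llm(prompt):
    """Stub standing in for a real completion call to a large LM."""
    return ("Satya Nadella leads Microsoft.\n"
            "Relation: Satya Nadella | ceo_of | Microsoft")

new_example = parse_completion(query_llm(build_prompt(SEED_EXAMPLES)))
print(new_example)
```

In practice the generated examples would still need filtering (deduplication, checking that both entities actually appear in the sentence) before being added to the training corpus, as done in [3].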
Prerequisites
Hands-on experience in machine learning; no fear of implementing neural network models (under the guidance of the supervisors).
[1] https://arxiv.org/pdf/2005.14165.pdf
[2] http://eaidata.bmk.sh/data/GPT_NeoX_20B.pdf
[3] https://aclanthology.org/2021.findings-emnlp.192.pdf