Interactive language learning in deep network agents

Dessí, Roberto

dc.contributor

Universitat Pompeu Fabra. Departament de Traducció i Ciències del llenguatge

dc.contributor.author

Dessí, Roberto

dc.date.accessioned

2024-06-26T14:51:25Z

dc.date.available

2024-06-26T14:51:25Z

dc.date.issued

2024-04-12

dc.identifier.uri

http://hdl.handle.net/10803/691516

dc.description.abstract

Recent advances in language modeling have led to remarkable performance in many NLP applications. However, the static and passive learning imposed by the next-token-prediction objective makes models suboptimal when deployed in interactive settings. This shortcoming prevents the widespread use of such models in many cooperative human-machine scenarios. On the other hand, interactive language learning has emerged as an alternative paradigm to bootstrap linguistic knowledge in deep networks, while also giving them the ability to engage in basic interactive and communicative tasks. In this thesis, I first show how interactive language learning can serve as a viable language learning method for tabula-rasa neural network agents. I then present evidence that pretrained language models, if taken as-is, are not able to successfully coordinate with humans in a simple referential task. Finally, I report experimental evidence on how interactive learning can overcome this limitation and help turn pretrained language models into useful cooperative agents. All in all, my thesis contributes to the program of advancing NLP by combining language modeling and interactive language learning.

dc.description.abstract

Los recientes avances en modelos de lenguaje han llevado a un rendimiento notable en muchas aplicaciones de PLN. Sin embargo, el aprendizaje estático y pasivo impuesto por el objetivo de predicción del siguiente token hace modelos subóptimos en entornos interactivos. Esta limitación impide el uso generalizado de dichos modelos en muchos escenarios cooperativos humano-máquina. Por otro lado, el aprendizaje interactivo de lenguaje surge como un paradigma alternativo para iniciar el conocimiento lingüístico en redes profundas, además de permitirles realizar tareas interactivas y comunicativas básicas. En esta tesis, primero demuestro cómo el aprendizaje interactivo de lenguaje puede servir como un método viable para agentes de redes neuronales sin conocimientos previos. Luego presento evidencia de que los modelos de lenguaje preentrenados, tal como están, no pueden coordinarse exitosamente con humanos en una tarea referencial simple. Finalmente, reporto evidencia experimental sobre cómo el aprendizaje interactivo puede superar esta limitación y ayudar a convertir modelos de lenguaje preentrenados en agentes cooperativos útiles. En resumen, mi tesis contribuye al avance de PLN mediante la combinación de modelado de lenguaje y aprendizaje interactivo de lenguaje.

dc.description.abstract

Els recents avenços en modelatge de llenguatge han portat a un rendiment remarcable en moltes aplicacions de PLN. No obstant això, l’aprenentatge estàtic i passiu imposat per l’objectiu de predicció del següent token fa models subòptims en entorns interactius. Aquesta limitació impedeix l’ús generalitzat d’aquests models en molts escenaris cooperatius humà-màquina. D’altra banda, l’aprenentatge de llenguatge interactiu ha emergit com un paradigma alternatiu per iniciar el coneixement lingüístic en xarxes profundes, a més de donar-los la capacitat de participar en tasques interactives i comunicatives bàsiques. En aquesta tesi, primer mostro com l’aprenentatge de llenguatge interactiu pot servir com un mètode viable per a agents de xarxes neuronals tabula rasa. Després presento evidència que els models de llenguatge preentrenats, tal com estan, no poden coordinar-se amb èxit amb humans en una tasca referencial simple. Finalment, reporto evidència experimental sobre com l’aprenentatge interactiu pot superar aquesta limitació i ajudar a convertir models de llenguatge preentrenats en agents cooperatius útils. En resum, la meva tesi contribueix al programa d’avançar en la PLN combinant modelatge de llenguatge i aprenentatge de llenguatge interactiu.

dc.format.extent

125 p.

dc.language.iso

eng

dc.publisher

Universitat Pompeu Fabra

dc.rights.license

Access to the contents of this thesis is subject to acceptance of the terms of use established by the following Creative Commons license: http://creativecommons.org/licenses/by-nc-sa/4.0/

dc.rights.uri

http://creativecommons.org/licenses/by-nc-sa/4.0/

dc.source

TDX (Tesis Doctorals en Xarxa)

dc.title

Interactive language learning in deep network agents

dc.type

info:eu-repo/semantics/doctoralThesis

dc.type

info:eu-repo/semantics/publishedVersion

dc.subject.udc

004

dc.contributor.authoremail

roberto.dessi@upf.edu

dc.contributor.director

Baroni, Marco

dc.embargo.terms

cap

dc.rights.accessLevel

info:eu-repo/semantics/openAccess

dc.description.degree

Programa de Doctorat en Traducció i Ciències del Llenguatge

Documents

trd.pdf

14.97 MB PDF

This item appears in the following Collection(s)

Programa de Doctorat en Traducció i Ciències del Llenguatge