Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ahuja, Sarthak | - |
dc.contributor.author | Kachuee, Mohammad | - |
dc.contributor.author | Sheikholeslami, Fateme | - |
dc.contributor.author | Liu, Weiqing | - |
dc.contributor.author | Do, Jae Young | - |
dc.date.accessioned | 2024-05-09T06:42:03Z | - |
dc.date.available | 2024-05-09T06:42:03Z | - |
dc.date.created | 2024-05-09 | - |
dc.date.issued | 2023-07 | - |
dc.identifier.citation | Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings, Vol.5, pp.361-367 | - |
dc.identifier.issn | 0736-587X | - |
dc.identifier.uri | https://hdl.handle.net/10371/201359 | - |
dc.description.abstract | Off-policy reinforcement learning has been a driving force for state-of-the-art conversational AIs, leading to more natural human-agent interactions and improving user satisfaction for goal-oriented agents. However, in large-scale commercial settings, it is often challenging to balance policy improvements against experience continuity across the broad spectrum of applications handled by such systems. In the literature, off-policy evaluation and guard-railing on aggregate statistics have commonly been used to address this problem. In this paper, we propose a method for curating and leveraging high-precision samples sourced from historical regression incident reports to validate, safeguard, and improve policies prior to online deployment. We conducted extensive experiments using data from a real-world conversational system and actual regression incidents. The proposed method is currently deployed in our production system to protect customers against broken experiences and enable long-term policy improvements. | - |
dc.language | English | - |
dc.publisher | Association for Computational Linguistics (ACL) | - |
dc.title | Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems | - |
dc.type | Article | - |
dc.identifier.doi | 10.48550/arXiv.2305.10528 | - |
dc.citation.journaltitle | Association for Computational Linguistics (ACL). Annual Meeting Conference Proceedings | - |
dc.identifier.scopusid | 2-s2.0-85174238544 | - |
dc.citation.endpage | 367 | - |
dc.citation.startpage | 361 | - |
dc.citation.volume | 5 | - |
dc.description.isOpenAccess | N | - |
dc.contributor.affiliatedAuthor | Do, Jae Young | - |
dc.type.docType | Conference Paper | - |
dc.description.journalClass | 1 | - |
- Files in This Item:
- There are no files associated with this item.
Related Researcher
- College of Engineering
- Department of Electrical and Computer Engineering
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.