An Adversarial Machine Learning Approach on Securing Large Language Model with Vigil, an Open-Source Initiative
Abstract
The expanding use of large language models (LLMs) has brought to light a range of security concerns, including attempts to breach system security and to undermine prompt safety. LLMs show signs of hallucination, repetitive content generation, and bias, which makes them vulnerable to malicious prompts and raises substantial concerns about their dependability and efficiency. Building effective strategies for protecting modern artificial intelligence (AI) systems requires a thorough understanding of the complex behaviours of malicious attackers. This study examines some of these aspects and proposes a method for protecting LLMs from the threats that attackers may pose. Vigil is an open-source LLM prompt security scanner, accessible as a Python library and REST API, designed to address these problems by employing a sophisticated adversarial machine-learning algorithm. The objective of this study is to use Vigil as a security scanner and assess its efficiency. In this case study, we shed light on Vigil, which effectively screens LLM prompts by classifying them into two categories: malicious and benign.
Citations
- CrossRef: 0
- Web of Science: 0
- Scopus: 0
Full text
- Publication version: Accepted or Published Version
- DOI: 10.1016/j.procs.2024.09.486
Details
- Category: Conference activity
- Type: publication in a peer-reviewed collective volume (including conference proceedings)
- Published in: Procedia Computer Science, no. 246, pages 686-695, ISSN: 1877-0509
- Language: English
- Year of publication: 2024
- Bibliographic description: Pokhrel K., Sanín C., Islam M. R., Szczerbicki E.: An Adversarial Machine Learning Approach on Securing Large Language Model with Vigil, an Open-Source Initiative, 2024
- Funding sources: No-cost publication
- Verification: Gdańsk University of Technology