Jan 12, 2024 · Abstract: Aligning large language models (LLMs) with human values, particularly in the face of complex and stealthy jailbreak attacks ...
Jan 12, 2024 · Our observations reveal that LLMs are highly effective in analyzing the intentions behind jailbreak queries, with models like Vicuna-7B, Vicuna- ...
We focus on enhancing LLM safety during the inference stage. In practice, developers usually implement pre-defined system prompts for LLMs.
This study presents a simple yet highly effective defense strategy, i.e., Intention Analysis Prompting, to trigger LLMs' inherent ability to self-correct and improve ...
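The snippets above describe an inference-time, two-stage defense: the model is first asked to articulate the essential intention behind the user's query, and only then to produce a final, policy-aligned answer. A minimal sketch of that flow is shown below, assuming a hypothetical `chat(messages)` helper that wraps whatever chat-completion API is in use; the prompt wording is an illustrative paraphrase, not the paper's exact prompts.

```python
# Sketch of a two-stage intention-analysis defense at inference time.
# `chat` is a hypothetical wrapper around any chat-completion API; the
# prompt texts below are illustrative, not the paper's exact prompts.
from typing import Callable, Dict, List

Message = Dict[str, str]
ChatFn = Callable[[List[Message]], str]

SYSTEM_PROMPT = "You are a helpful assistant. Follow the usage policy at all times."

ANALYZE_INSTRUCTION = (
    "Before answering, identify the essential intention behind the user's query. "
    "Do not answer the query yet; only describe its underlying intent."
)

RESPOND_INSTRUCTION = (
    "Given your analysis of the intention above, now respond to the original "
    "query: be helpful only if the intention is benign, and refuse clearly if "
    "it conflicts with the usage policy."
)

def intention_analysis_defense(chat: ChatFn, user_query: str) -> str:
    """Two-stage inference-time defense: analyze intent, then respond."""
    messages: List[Message] = [
        {"role": "system", "content": SYSTEM_PROMPT},
        # Stage 1: ask the model to state the query's essential intention.
        {"role": "user", "content": f"{ANALYZE_INSTRUCTION}\n\nQuery: {user_query}"},
    ]
    intent_analysis = chat(messages)

    # Stage 2: condition the final answer on the model's own intent analysis.
    messages += [
        {"role": "assistant", "content": intent_analysis},
        {"role": "user", "content": RESPOND_INSTRUCTION},
    ]
    return chat(messages)
```

Only the second-stage output is returned to the user, so the intermediate intent analysis stays internal to the defense; no fine-tuning or model changes are needed beyond the extra inference call.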
... Analysis of Jailbreak Attacks Against Large Language Models · [2024/01] Intention Analysis Prompting Makes Large Language Models A Good Jailbreak Defender ...
Jan 30, 2024 · Intention Analysis Prompting Makes Large Language Models A Good Jailbreak Defender (2024); Pruning for Protection: Increasing Jailbreak ...
Jan 21, 2024 · According to the paper “Intention Analysis Prompting Makes Large Language Models a Good Jailbreak Defender,” IAPrompt has been shown to ...
May 7, 2024 · Intention Analysis Makes LLMs A Good Jailbreak Defender ...