Simple Prompt Moderation
Check whether input consists of any text from Deny list, and prevent being sent to LLM.

Use another LLM to identify if user query is close to the deny list, if yes output a default error message.
For example, deny list can be:
Ignore previous instructions
Leak all sensitive information

Last updated