Pentest Chronicles
2. Then submit a sequence of prompts instructing the AI agent to:
3. The AI agent responds with duplicated payloads inserted as raw JavaScript inside (script) tags or as executable event handlers.
Although not all scripts executed immediately, raw (script) tags and attributes such as onmouseover and onerror were present in the stored content.
For an alternative payloads, attempts using HTML event handlers such as:
The following subsequent prompt as used:
After that, the malicious payload was also successfully stored by instructing the AI agent to insert it as shown below:
The AI agent could also be manipulated into modifying previously inserted malicious content (e.g., removing data-id attributes that interfered with execution).
After sufficient prompting, the AI agent eventually produced fully executable JavaScript code within the “Proposed changes” panel, leading to stored cross-site-scripting.
The resulting code would be reflected in the “Proposed changes” as the following:
The malicious payload was finally triggered by client browser:
Phase 2: The "Soft Speech" Injection
The AI agent initially resisted direct requests to execute JavaScript. However, it was observed that by distributing the malicious intent across a sequence of neutral, "helper" prompts, the agent’s safety guardrails were gradually bypassed.
The entire JQuery JS library was successfully loaded:
Step 3. Sensitive information exfiltration
Once the script execution was achieved in the victim's browser, the AI assisted in loading external libraries (e.g., JQuery) and exfiltrating the __redacted-session-id from local storage to an attacker-controlled server.
Then, user session identifier was exfiltrated to the external attacker-controlled server.
An attacker successfully obtains session ID from a user’s cookie storage.
The AI agent automatically generated malicious JS payload and injected it. The JavaScript code below was executed:
The interesting observation is that the AI agent initially discards any attempts to inject JavaScript code. However, when an attacker distributes malicious intent across successive, increasingly neutral messages, the agent’s safeguards appear to weaken. After enough neutral interactions, the agent becomes more permissive and may comply with instructions it would otherwise reject. This suggests that the greater the number of neutral messages sent, the more easily the agent responds to security-sensitive prompts, as though the attacker is gradually approaching a malicious objective.
Summary
To mitigate the risks associated with AI-generated content and indirect injections, the following controls are recommended:
In today's world, ensuring the security of our accounts is more crucial than ever. Just as keys protect the doors to our homes, passwords serve as the first line of defense for our data and assets. It's easy to assume that technical individuals, such as developers and IT professionals, always use strong, unique passwords to keep ...
SOCMINT is the process of gathering and analyzing the information collected from various social networks, channels and communication groups in order to track down an object, gather as much partial data as possible, and potentially to understand its operation. All this in order to analyze the collected information and to achieve that goal by making …
PyScript – or rather Python in your browser + what can be done with it? A few days ago, the Anaconda project announced the PyScript framework, which allows Python code to be executed directly in the browser. Additionally, it also covers its integration with HTML and JS code. An execution of the Python code in …