CertiK Hands-On Test: How the Vulnerable OpenClaw Skill Evades Audits and Takes Over Computers Without Authorization
- Core Viewpoint: CertiK's research report indicates that the current industry perception, represented by platforms like OpenClaw, which relies on "pre-listing scan audits" as the core security boundary, is fundamentally flawed. This mechanism is easily bypassed in real-world attacks. The true security foundation should lie in runtime-enforced isolation and fine-grained permission control.
- Key Elements:
- Third-party Skills on platforms like OpenClaw run in high-privilege environments, allowing direct access to local files, execution of system commands, and even manipulation of user crypto assets, posing extremely high security risks.
- The existing "pre-listing scan" protection system has inherent flaws: static detection rules are easily bypassed by code modifications, AI audits cannot identify vulnerabilities hidden within normal logic, and Skills can be listed and installed even before the audit is complete.
- Through a proof-of-concept attack, CertiK successfully allowed a Skill implanted with a remote code execution vulnerability to bypass all detection mechanisms, install without warnings, and achieve arbitrary command execution on the host device.
- The core issue is that the industry generally treats "audit scanning" as the security line of defense, overlooking the fact that runtime-enforced isolation and fine-grained permission control, similar to the iOS sandbox mechanism, are the true security foundation.
- Currently, OpenClaw's sandbox mechanism is an optional configuration. Most users disable it to ensure functionality, leaving the agents in an "unprotected" state, which can easily lead to catastrophic consequences.
- The report recommends that developers should set sandbox isolation as the default mandatory configuration for third-party Skills, and users should deploy agents in non-critical, isolated environments, away from sensitive assets.
Recently, the open-source self-hosted AI agent platform OpenClaw (commonly known as "小龙虾" or "Crayfish" in the community) has rapidly gained popularity due to its flexible scalability and self-controlled deployment features, becoming a phenomenal product in the personal AI agent track. Its ecosystem core, Clawhub, serves as an app store, aggregating a vast number of third-party Skill plugins. These plugins enable agents to unlock advanced capabilities with one click, ranging from web search and content creation to crypto wallet operations, on-chain interactions, and system automation, leading to explosive growth in both ecosystem scale and user base.
However, for these third-party Skills running in high-privilege environments, where exactly does the platform's real security boundary lie?
Recently, CertiK, the world's largest Web3 security company, published its latest research on Skill security. The report points out a significant cognitive misalignment in the market regarding the security boundaries of AI agent ecosystems: the industry widely considers "Skill scanning" as the core security boundary, yet this mechanism is almost useless against hacker attacks.
If we compare OpenClaw to an operating system for smart devices, Skills are the various APPs installed on it. Unlike ordinary consumer-grade APPs, some Skills in OpenClaw run in high-privilege environments. They can directly access local files, invoke system tools, connect to external services, execute host environment commands, and even operate users' crypto digital assets. Once a security issue arises, it can directly lead to severe consequences such as sensitive information leakage, remote device takeover, and theft of digital assets.
The current universal security solution for third-party Skills across the industry is "pre-listing scanning and review." OpenClaw's Clawhub has also established a three-layer review and protection system: integrating VirusTotal code scanning, a static code detection engine, and AI logic consistency detection. It pushes security pop-up warnings to users based on risk classification, attempting to secure the ecosystem. However, CertiK's research and proof-of-concept attack tests confirm that this detection system has shortcomings in real-world attack and defense scenarios and cannot bear the core responsibility of security protection.
The research first deconstructs the inherent limitations of the existing detection mechanisms:
Static detection rules are extremely easy to bypass. The core of this engine relies on matching code features to identify risks, such as flagging the combination of "reading sensitive environmental information + making external network requests" as high-risk behavior. However, attackers only need to make slight syntactic modifications to the code while completely preserving the malicious logic to easily bypass feature matching. It's like rephrasing dangerous content with synonyms, rendering the security scanner completely ineffective.
AI review has inherent detection blind spots. The core function of Clawhub's AI review is a "logic consistency detector," which can only catch obvious malicious code where "declared functionality does not match actual behavior." It is powerless against exploitable vulnerabilities hidden within normal business logic, much like how it's difficult to find a fatal trap buried deep within seemingly compliant contract clauses.
More critically, there is a fundamental design flaw in the review process: even when VirusTotal's scan results are still in a "pending" state, Skills that have not completed the full "health check" process can still be listed publicly. Users can install them without any warning, leaving an opening for attackers.
To verify the real harm of these risks, the CertiK research team conducted a complete test. The team developed a Skill named "test-web-searcher." On the surface, it is a fully compliant web search tool, with code logic entirely adhering to standard development practices. In reality, it embeds a remote code execution vulnerability within its normal functional flow.
This Skill bypassed detection by both the static engine and AI review. While its VirusTotal scan was still pending, it was installed normally without any security warnings. Ultimately, by sending a remote command via Telegram, the vulnerability was successfully triggered, achieving arbitrary command execution on the host device (in the demo, it directly caused the system to launch the calculator).
In its research, CertiK clearly states that these issues are not unique bugs of OpenClaw but rather a widespread cognitive misconception across the entire AI agent industry: the industry generally treats "review scanning" as the core security defense line, overlooking that the true security foundation lies in runtime-enforced isolation and granular permission control. This is similar to how the security core of Apple's iOS ecosystem has never been the strict review of the App Store, but rather the system-enforced sandbox mechanism and granular permission management, ensuring each APP runs in its dedicated "isolation pod" without arbitrarily obtaining system permissions. In contrast, OpenClaw's existing sandbox mechanism is optional, not mandatory, and highly reliant on manual user configuration. Most users, to ensure Skill functionality, choose to disable the sandbox, ultimately leaving the agent in a "naked" state. Once a Skill with vulnerabilities or malicious code is installed, it can directly lead to catastrophic consequences.
Regarding the issues discovered, CertiK also provides security guidance:
● For AI agent developers like OpenClaw, sandbox isolation must be set as the default mandatory configuration for third-party Skills. A granular permission control model for Skills must be implemented, and third-party code must never be allowed to inherit the host's high privileges by default.
● For ordinary users, Skills with a "Safe" label in the Skill marketplace merely indicate they haven't been detected as risky, not that they are absolutely safe. Before the official implementation of a default, underlying strong isolation mechanism, it is recommended to deploy OpenClaw on non-critical idle devices or virtual machines. Never let it near sensitive files, password credentials, or high-value crypto assets.
The AI agent track is currently on the eve of an explosion. The speed of ecosystem expansion must not outpace the pace of security construction. Review scanning can only block elementary malicious attacks but can never become the security boundary for high-privilege agents. Only by shifting from "pursuing perfect detection" to "assuming risk exists and focusing on damage containment," and by enforcing isolation boundaries from the runtime layer, can we truly secure the safety baseline for AI agents, allowing this technological revolution to progress steadily and sustainably.
Original Research: https://x.com/hhj4ck/status/2033527312042315816?s=20


