Excel(ent) Obfuscation: Regex Gone Rogue
Microsoft Office-based attacks have long been a favored tactic among cybercriminals, and for good reason. Attackers frequently use Office documents because they are widely trusted. These files, such as Word or Excel documents, are commonly exchanged in business and personal settings. They can also carry hidden malicious code, embedded macros, and external links that execute code when opened, especially if users are tricked into enabling features like macros.
Moreover, Office documents support advanced techniques like remote template injection, obfuscated macros, and legacy features like Excel 4.0 macros. These allow attackers to bypass antivirus detection and trigger multi-stage payloads such as ransomware or information-stealing malware.
Since Office files are familiar to users and often appear legitimate (e.g., invoices, resumes, or reports), they’re also highly effective tools in phishing and social engineering attacks.
This combination of user trust and advanced attack capabilities, together with cross-platform compatibility and integration with scripting languages, makes Office files ideal for initiating sophisticated attacks with minimal user suspicion.
New Excel Regex Functions
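Last year, Microsoft announced the availability of three new functions that use Regular Expressions (regex) to help parse text more easily:
- REGEXTEST: checks whether any part of the supplied text matches a regex pattern
- REGEXEXTRACT: extracts one or more parts of the supplied text that match a regex pattern
- REGEXREPLACE: searches the supplied text for a regex pattern and replaces it with different text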
Regular expressions are sequences of characters that define search patterns, used primarily for string matching and manipulation. They enable efficient text processing by allowing complex searches, replacements, and validations based on specific criteria.
For example, regex can identify email addresses, phone numbers, or specific word patterns within a text. They are widely used in programming languages like Python, JavaScript, and Perl, and are essential for tasks such as data validation, parsing, and text editing.
The example below demonstrates a practical application, using REGEXEXTRACT to isolate only names from a mixed-text column:
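As a rough Python analogue of that kind of extraction, the sketch below pulls a name out of each row of mixed text. The sample rows and the pattern are invented for illustration and are not the data from the original worksheet; the pattern simply assumes a name looks like two capitalized words in a row.

```python
import re
from typing import Optional

# Hypothetical mixed-text rows (not the article's worksheet data).
rows = [
    "Order 1042 - Alice Johnson - alice@example.com",
    "Order 1043 - Bob Lee - +1 555 0100",
    "Order 1044 - Carla Gomez - carla@example.com",
]

# Assumption: a name is two capitalized words separated by one space.
NAME_PATTERN = re.compile(r"[A-Z][a-z]+ [A-Z][a-z]+")


def extract_name(text: str) -> Optional[str]:
    """Return the first name-like match in the text, or None if there is none."""
    match = NAME_PATTERN.search(text)
    return match.group(0) if match else None


names = [extract_name(row) for row in rows]
print(names)
```

In Excel, a comparable formula would pass the cell and the same pattern to REGEXEXTRACT. Real-world names (hyphenated, single-word, or non-Latin) would need a more careful pattern than this one.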
Proof of Concept: Weaponizing Regex Functions
To demonstrate the security implications of these new Excel functions, we developed a proof of concept that leverages regex functions as an obfuscation technique. Our experiment began by establishing a baseline attack scenario using traditional methods.
First, we created a standard macro-enabled Excel document (XLSM) containing unobfuscated VBA code. This macro uses the "WScript.Shell" object to execute PowerShell commands, which in turn download and run a batch file hosted on Pastebin.

The macro below demonstrates the core functionality, a simple downloader that can retrieve and execute arbitrary payloads:
When submitted to VirusTotal, this plain-text sample triggered significant alerts, with 22 different security vendors flagging it as malicious:
Threat actors typically employ various obfuscation techniques to mask malicious code and evade widespread detection. To demonstrate this, we applied the macro_pack obfuscation tool to our test document, producing VBA code that is deliberately difficult for both human analysts and automated security tools to interpret.

When analyzed with VirusTotal, this traditionally obfuscated sample triggered more detections than the plain-text version. This increased detection rate is expected, as security vendors have developed specific heuristics to identify common obfuscation patterns:
Next, we created another document, but this time we used the Excel REGEXEXTRACT function to obfuscate the VBA code.
Unlike traditional VBA obfuscation methods, this approach stores and dynamically reconstructs malicious code components using regular expression pattern matching, creating a significantly more evasive payload.
Our first step was to place a large block of text in cell “A1” and hide our PowerShell command and the other required strings within it, as follows:
Then, we created a function that uses REGEXEXTRACT to retrieve these hidden strings from the text. Combined with the REPLACE function, this allows dynamic reconstruction of the payload at runtime:
The implementation extracts each component using a tailored regex pattern and assigns it to one of several intentionally obscure variable names (getval0 through getval2), making static analysis challenging. When executed, the macro seamlessly reconstructs and runs the PowerShell command that downloads and executes our remote batch file.
The evasion was remarkably effective: VirusTotal detections dropped from 22 vendors for the plain-text sample to just two for our regex-obfuscated version:
We also analyzed both samples using OLEVBA, a specialized tool for VBA macro analysis that’s widely used in security operations. While OLEVBA easily identified high-risk indicators in our original sample (including PowerShell usage, Shell object creation, and suspicious string operations), it failed to detect any of these indicators in our regex-obfuscated version. The tool couldn’t identify critical indicators like PowerShell execution or WScript.Shell object instantiation because these strings never appear directly in the code. Instead, they’re dynamically constructed at runtime from regex pattern matches.
This demonstrates why this technique is particularly concerning: it defeats not just signature-based detection, but also many heuristic analysis methods that security tools rely on.
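The gap described above can be illustrated from the defender's side with a small Python sketch. It compares a naive keyword scan of macro source with an additional heuristic that flags references to the regex worksheet functions. Everything here is an assumption made for illustration: the keyword list, the function names, and the two macro fragments (which are inert placeholders, not the code from our samples, and the `Evaluate` call shape is hypothetical).

```python
import re
from typing import List

# Assumed keyword list for a naive static scan of macro source.
SUSPICIOUS_KEYWORDS = ("powershell", "wscript.shell")

# Matches references to Excel's regex worksheet functions in macro source.
REGEX_FUNCTION = re.compile(r"\bREGEX(?:EXTRACT|REPLACE|TEST)\b", re.IGNORECASE)


def keyword_hits(vba_source: str) -> List[str]:
    """Return the suspicious keywords that appear literally in the source."""
    lowered = vba_source.lower()
    return [kw for kw in SUSPICIOUS_KEYWORDS if kw in lowered]


def uses_regex_functions(vba_source: str) -> bool:
    """Flag macro source that references a regex worksheet function."""
    return REGEX_FUNCTION.search(vba_source) is not None


# Placeholder fragment in the style of a plain-text macro.
plain_fragment = '''
Set sh = CreateObject("WScript.Shell")
sh.Run "powershell ..."
'''

# Placeholder fragment in the style of the regex approach (hypothetical call shape).
indirect_fragment = '''
getval0 = Evaluate("REGEXEXTRACT(A1, ""<pattern 0>"")")
getval1 = Evaluate("REGEXEXTRACT(A1, ""<pattern 1>"")")
'''

print(keyword_hits(plain_fragment), uses_regex_functions(plain_fragment))
print(keyword_hits(indirect_fragment), uses_regex_functions(indirect_fragment))
```

The keyword scan finds nothing in the second fragment, while the regex-function check does. A heuristic like this is only a starting point: legitimate workbooks may also call these functions from macros, so a hit is a reason to look closer rather than a verdict.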


Current Limitations & Deployment Status
While this technique demonstrates significant potential for security evasion, several factors currently limit its immediate threat:
- Since 2022, Microsoft has blocked VBA macros by default in Office files downloaded from the internet, so a user must take explicit action to enable them
- The new regex functions have limited deployment, currently available only to Beta Channel users on:
  - Windows: Version 2406 (Build 17715.20000) or later
  - Mac: Version 16.86 (Build 24051422) or later
As these functions roll out to the general release channels, the potential attack surface will expand significantly.
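For asset inventories, the build thresholds above can be turned into a simple comparison. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, build strings are assumed to be dot-separated integers, and the check looks only at the build number, not at whether the installation is actually on the Beta Channel.

```python
from typing import Tuple

# Minimum builds quoted in this article for the Beta Channel rollout.
MIN_BUILDS = {
    "windows": "17715.20000",
    "mac": "24051422",
}


def parse_build(build: str) -> Tuple[int, ...]:
    """Split a build string such as '17715.20000' into comparable integers."""
    return tuple(int(part) for part in build.split("."))


def meets_minimum_build(platform: str, build: str) -> bool:
    """Return True if the build is at or above the listed minimum for the platform."""
    minimum = MIN_BUILDS[platform.lower()]
    return parse_build(build) >= parse_build(minimum)


print(meets_minimum_build("windows", "17715.20000"))
print(meets_minimum_build("windows", "17628.20110"))
```

Because release channels carry different builds, a result of True here indicates only that the build number is high enough, not that the regex functions are enabled on that machine.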
Prevention
At the time of writing, we have not observed this technique being used in the wild. And while most legacy antivirus tools fail to detect regex-obfuscated malicious files, Deep Instinct’s deep-learning agent detects and prevents all three files presented in this article. Additionally, Deep Instinct’s Artificial Neural Network Assistant (DIANNA) can easily detect the use of regex obfuscation in documents.
Organizations, with or without Deep Instinct, should also implement the following protective measures:
- Maintain strict macro security policies, especially “Block macros from running in Office files from the Internet”
- Deploy advanced endpoint protection with behavioral analysis capabilities
- Consider application control solutions that restrict Excel’s ability to invoke system commands
- Implement network monitoring to detect unusual outbound connections from Office applications
Future Use
The regex-based obfuscation technique demonstrated here represents just the beginning of potential exploitation. While our proof of concept used relatively simple VBA code, this approach could easily be combined with more sophisticated attack techniques:
- Multi-stage execution chains that further obscure malicious intent
- Advanced persistence mechanisms to maintain access after initial compromise
- Privilege escalation techniques hidden behind regex-extracted components
- Data exfiltration methods that leverage the same obfuscation principles
Additionally, Microsoft’s introduction of Python functionality in Excel creates another potential avenue for attack. While this feature runs calculations in Microsoft’s cloud environment and has inherent latency limitations, it introduces yet another powerful scripting language into the Office ecosystem that determined threat actors could weaponize.
Indicators of Compromise
sample1_re_new.xlsm - dedbe856891dd633ce3dd66ecc120ef4f1ae0a61a37dbb4cc6a59f7eae7019d9
sample1.xlsm - 2c99e702609d549440952ef72f2386a74e0da1462df65ab4206f44c94e8dbc72
sample1_mp.xlsm - 5af1bd3d95e6307d95e9973aa4a084ae210f9038cbea2235d14b02d97abd4f2b
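A file can be checked against these indicators by hashing it and looking up the digest. The Python sketch below assumes the values above are SHA-256 digests (they are 64 hexadecimal characters long, though the list does not name the algorithm); the function names are hypothetical.

```python
import hashlib
from typing import Optional

# Digests from the list above, mapped to their sample names.
KNOWN_SAMPLES = {
    "dedbe856891dd633ce3dd66ecc120ef4f1ae0a61a37dbb4cc6a59f7eae7019d9": "sample1_re_new.xlsm",
    "2c99e702609d549440952ef72f2386a74e0da1462df65ab4206f44c94e8dbc72": "sample1.xlsm",
    "5af1bd3d95e6307d95e9973aa4a084ae210f9038cbea2235d14b02d97abd4f2b": "sample1_mp.xlsm",
}


def sha256_of(path: str) -> str:
    """Hash a file in chunks so large documents are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_ioc(path: str) -> Optional[str]:
    """Return the matching sample name, or None if the file's hash is not listed."""
    return KNOWN_SAMPLES.get(sha256_of(path))
```

Calling `match_ioc` with a path to a suspect workbook returns the sample name on a match. Hash matching only catches byte-identical copies, so it complements rather than replaces the behavioral measures listed under Prevention.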
References
https://github.com/sevagas/macro_pack
https://techcommunity.microsoft.com/blog/microsoft365insiderblog/new-regular-expression-regex-functions-in-excel/4226334