Threat Hunting With YARA


Introduction

Learning Objectives

  • Looking for actionable information that can be used to search for threats
  • Installing YARA
  • Creating a YARA rule
  • Deploying a YARA rule

Prerequisites

  • Basic understanding of security concepts, including but not limited to the Cyber Kill Chain, TTPs, Indicators of Compromise, hashes, and APTs.
  • Basic understanding of using the Windows command line and PowerShell.
  • Basic understanding of data types and encoding.

Answer the questions below

Q: What technique does ID T1134 describe?
A: Access Token Manipulation

Q: What does the detection rule M_APT_Dropper_Rootsaw_Obfuscated detect?
A: Detects obfuscated ROOTSAW payloads

Opportunities for Threat Hunting

There are three main types of threat hunting: structured, unstructured, and situational. Which one is employed depends on the situation.

Structured Hunting

This approach starts with a plan or hypothesis based on known attacker behaviors and strategies. You’re looking for specific patterns or signs that show an attack might be starting.

How it works:

  • Searches are powered by tools like YARA rules (to find patterns) or queries in a SIEM (to analyze logs).
  • Relies on external sources, like security blogs or platforms that share attack details (e.g., MISP or AlienVault).

Why it’s useful:

  • It’s good for investigating after a possible compromise or when new threats are reported.

Example:

  • Finding malware on a server after reading a blog about a similar attack.

Situational/Entity-Driven Hunting

This combines structured and unstructured hunting and focuses on what’s relevant right now for your organization. It responds to new threats or specific situations, like industry-specific risks or targeted reports.

How it works:

  • Starts with a hypothesis about who might attack and what they’ll target.
  • Focuses on protecting your most critical assets (e.g., financial data or customer info).
  • Uses tools like the MITRE ATT&CK framework and past attack data.

Why it’s good:

  • It adapts to current threats, making it highly focused and relevant.

Example:

  • If your industry is being targeted by a ransomware group, you look for signs of their activity in your systems.

Threat Hunting Process

Trigger

This is what starts the hunt for a security threat. A trigger could be:

  • A warning sign, like a suspicious file (IOC).
  • A known behavior or method used by attackers (TTPs).
  • An educated guess (hypothesis).
  • A system acting strangely.
  • Information from blogs, news, or reports from other companies.

Investigation

Once you pick a trigger, you use it to start searching for problems.

  • Security experts use tools like malware scanners, network tools (e.g., Wireshark), or rules (like YARA) to find anything unusual in the systems or files.

Resolution

If evidence of a security breach is found:

  • The incident response (IR) team is informed to handle the problem.
  • The threat hunter may help the IR team investigate further, find the root cause, and figure out how bad the damage is.

Opportunities

How to apply the above scenarios:

Example: The received threat intelligence details specific TTPs attributed to APT29, which is known to target political entities. This intelligence enables a structured hunting style using the TTPs included in the report to build a hypothesis.

Example: The received threat intel includes Indicators of Compromise and YARA rules to hunt for malware. This intelligence enables an unstructured hunting style using the IOCs provided.

The two opportunities above can be combined to enable a situational or entity-driven hunting style.

Answer the questions below

Q: Which threat hunting style is proactive and uses indicators of attack and TTPs?
A: structured hunting

Q: In which phase of the threat hunting process are tools like YARA or Volatility used?
A: Investigation

Q: You have received a threat intelligence report consisting only of Indicators of Compromise. What threat hunting style do you recommend to use?
A: unstructured hunting

YARA Rule Introduction

YARA stands for Yet Another Ridiculous Acronym. It is a tool Victor Alvarez of VirusTotal developed to assist malware researchers in detecting and describing malware families. The main functionality of YARA is based on advanced pattern matching, explicitly tailored to malware. It can be best compared to using a supercharged grep with complex regular expressions in Linux. Just like the grep command, the YARA binary will iterate over all files in a designated path, trying to find a match with the information provided in the YARA rule. A YARA rule describes a malware family based on a pattern using a set of strings and Boolean logic.

Structure of a YARA Rule

YARA rules are composed of a name, meta, strings, and a condition.

Rule Name

The rule name is a descriptive identifier that follows the keyword rule. Best practices include setting a name that clarifies what the rule is used for.

Meta

This part defines extra information like description, author, and more. Custom identifiers and value pairs can be freely created. The information defined in meta cannot be used in the condition part. Whether to include this part or not is entirely up to you. The rule will work completely fine without it. It is, however, recommended to include the meta part with some basic information, including the author and the description of what to use the rule for.

Strings

In this part of the rule, matching strings are defined. Multiple types of strings can be defined, which is essential for creating functional rules.

Condition

In this part of the rule, a matching condition is defined using the identifiers defined in the strings part.

Example of a YARA Rule

Rule name: M_APT_Dropper_Rootsaw_Obfuscated. The rule’s title is well-chosen and gives the user a good idea of what to use it for. In this case, it is to detect a dropper called Rootsaw that is obfuscated.

Meta: It is good practice to include relevant data that provides more information about the rule. This helps the user of the YARA rule know what to use the rule for, who wrote it, and where to apply it.

Strings: The strings included in this example help the user find a file containing those strings. How do malware analysts choose those strings? They analyze the malware and determine what uniquely identifies it. The strings used are text strings. The first two lines are straightforward.

Condition: This rule requires that all defined strings be present to have a match. This means all the strings defined in the strings section must match in the same file.

Answer the questions below

Q: Apart from the rule name, which other section is also required in a YARA rule?
A: condition

YARA Strings and Conditions

False positives

An important part of threat hunting is avoiding false alarms. With YARA, this means writing rules that accurately detect the specific threat. However, writing good YARA rules can be tricky and complicated, especially for specific malware. You have two options:

  • Learn to write detailed YARA rules yourself.
  • Use rules made by experts (like those from threat intelligence reports).

You can also combine both approaches. Even if you use pre-made rules, it’s important to understand how they work.

Strings

Text String

Strings are case sensitive.


rule textString
{
strings:
$1 = "This is an ASCII-encoded string" //strings are defined between double quotes
$2 = "This is an ascii-encoded string" //not the same as $1.

condition:
all of them
}


rule noCaseTextString
{
strings:
$1 = "This is an ASCII-encoded string" nocase

condition:
$1
}

Wide-Character Strings

Wide strings encode each character with two bytes, as is common in Windows executables. Adding the wide modifier next to the defined string makes the rule match this two-byte encoding.

rule wideTextString
{
strings:
$1 = "tryhackme" wide // will match with t\x00r\x00y\x00h\x00a\x00c\x00k\x00m\x00e\x00

condition:
$1
}
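
To see what the wide modifier looks for, the short sketch below encodes the same text as UTF-16LE. Python is used here only for illustration and is not part of the original material. For plain ASCII text, UTF-16LE places a zero byte after each character, which matches the byte sequence shown in the rule comment.

```python
# Illustration only: what "tryhackme" looks like as a wide (two bytes per character) string.
text = "tryhackme"

ascii_bytes = text.encode("ascii")      # one byte per character
wide_bytes = text.encode("utf-16-le")   # two bytes per character, low byte first

print(ascii_bytes)
print(wide_bytes)

# A plain ASCII search does not find the text inside the wide-encoded data,
# which is why the rule needs the wide modifier.
found_without_wide = ascii_bytes in wide_bytes
print("ASCII pattern found in wide data:", found_without_wide)
```

A rule that should catch both encodings can combine the modifiers, for example "tryhackme" ascii wide.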

Hexadecimal String

When malware analysts study malware, they use tools like IDA Pro to break down the code, which often appears in hexadecimal format. These hexadecimal strings can be used to create YARA rules because they are harder for attackers to hide or change. This makes them a reliable way to identify specific malicious files.


rule hexString
{
strings:
$1 = { E2 34 B6 C8 A3 FB } // Hexadecimal strings are defined between {}

condition:
$1
}


rule hexStringExpanded
{
strings:
$1 = { E2 34 B6 ?? A3 FB } // Each ? is a wildcard for one hex digit (nibble), so ?? matches any byte.
$2 = { E2 34 B6 ~00 A3 FB } // The ~ is a not operator that precedes the value to exclude from the search. In this case 00.
$3 = { E2 34 [2-4] A3 FB } // The [X-Y] construct defines a jump. Here, any sequence of 2 to 4 arbitrary bytes can occupy this position.
$4 = { E2 34 (C5|B5) A3 FB } // Between (), alternative byte sequences are separated by |, which acts as a Boolean OR. The value can be C5 or B5.

condition:
any of them // Every defined string must be referenced in the condition, otherwise YARA reports an unreferenced string error.
}
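
The semantics of the hex-string constructs above can be sketched by translating them into a bytes regular expression. The helper below is a hypothetical illustration in Python, not YARA's implementation, and it only covers the subset used in this section (full-byte ?? wildcards, ~XX, [X-Y] jumps, and single-byte alternatives).

```python
import re

def hex_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a small subset of YARA hex-string syntax into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":                      # full-byte wildcard
            parts.append(b".")
        elif token.startswith("~"):            # not operator: any byte except this one
            parts.append(b"[^" + re.escape(bytes.fromhex(token[1:])) + b"]")
        elif token.startswith("["):            # jump: between X and Y arbitrary bytes
            low, high = token.strip("[]").split("-")
            parts.append(b".{%d,%d}" % (int(low), int(high)))
        elif token.startswith("("):            # alternatives: (AA|BB)
            options = token.strip("()").split("|")
            joined = b"|".join(re.escape(bytes.fromhex(o)) for o in options)
            parts.append(b"(?:" + joined + b")")
        else:                                  # literal byte
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL lets "." match every byte value, including 0x0A.
    return re.compile(b"".join(parts), re.DOTALL)

sample = bytes.fromhex("E2 34 B6 C8 A3 FB")
for pattern in ("E2 34 B6 ?? A3 FB", "E2 34 B6 ~00 A3 FB",
                "E2 34 [2-4] A3 FB", "E2 34 (C5|B5) A3 FB"):
    print(pattern, "->", bool(hex_pattern_to_regex(pattern).search(sample)))
```

The sample bytes are the ones from the hexString rule, so the loop shows which of the four expanded patterns would still match that exact sequence.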

XOR String

Malware creators use XOR encryption to hide strings in their code, which makes analysis harder and helps the malware avoid detection by antivirus software. The xor modifier makes YARA search for the string XORed with every possible single-byte key.


rule xorString
{
strings:
$1 = "http://maliciousurl.thm" xor // This line will look for all variations possible with a 1-byte XOR key

condition:
$1
}
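
As a rough illustration of what a single-byte XOR search involves, the Python sketch below tries all 256 keys against a buffer. The function names and the sample buffer are hypothetical, and the code is a simplified stand-in rather than how YARA implements the xor modifier.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

def find_xor_string(haystack: bytes, plaintext: bytes):
    """Return (key, offset) pairs where plaintext appears XORed with a 1-byte key."""
    hits = []
    for key in range(256):                     # key 0x00 is the unmodified plaintext
        offset = haystack.find(xor_bytes(plaintext, key))
        if offset != -1:
            hits.append((key, offset))
    return hits

url = b"http://maliciousurl.thm"
# Hypothetical sample data: two filler bytes, the URL XORed with 0x5A, one trailing byte.
blob = b"\x90\x90" + xor_bytes(url, 0x5A) + b"\x00"

for key, offset in find_xor_string(blob, url):
    print(f"key=0x{key:02x} offset={offset}")
```

Because XOR is its own inverse, applying the reported key to the matched bytes recovers the plaintext, which is the information the -X option prints.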

Base64 String

The base64 modifier matches the base64-encoded forms of a string.

rule base64String
{
strings:
$1 = "This is a regular string" base64 // YARA searches for the base64-encoded forms of this string.

condition:
$1
}
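
A string embedded in base64 data can look different depending on where it starts relative to the 3-byte groups that base64 encodes, so more than one encoded form has to be considered. The Python sketch below derives one form per alignment. It is an illustration under that assumption, with a hypothetical helper name, and not YARA's own code.

```python
import base64

def base64_variants(plain: bytes):
    """Return three base64 forms of plain, one per byte alignment (0, 1, 2)."""
    variants = []
    for pad in range(3):
        encoded = base64.b64encode(b"\x00" * pad + plain).decode("ascii")
        # Leading characters that mix in bits from the padding bytes are dropped.
        start = (0, 2, 3)[pad]
        # Trailing characters that would change with whatever follows are dropped.
        end_trim = (0, 3, 2)[(pad + len(plain)) % 3]
        variants.append(encoded[start:len(encoded) - end_trim])
    return variants

plain = b"This is a regular string"
variants = base64_variants(plain)
for variant in variants:
    print(variant)

# Whatever surrounds the string before encoding, one of the variants shows up.
wrapped = base64.b64encode(b"xx" + plain + b"yyyy").decode("ascii")
print(any(variant in wrapped for variant in variants))
```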

Regular Expressions

You can define regular expressions the same way as strings, with the only difference being forward slashes instead of double quotes.

rule regularExpression
{
strings:
$1 = /THM\{[a-zA-Z]{3}\}/ // This regex will match any string that starts with "THM{", ends with "}" and has 3 alphabetic characters (lower-case or upper-case) between the curly brackets.

condition:
$1
}
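
The same expression can be tried out quickly in Python before putting it into a rule. This is only a convenience check: Python's re module and YARA's regex engine agree on this simple pattern, but they are not identical in general, and the candidate values below are made up for the test.

```python
import re

# Same expression as in the YARA rule, applied to bytes because YARA scans raw file content.
flag_pattern = re.compile(rb"THM\{[a-zA-Z]{3}\}")

candidates = (b"THM{abc}", b"THM{ABC}", b"THM{ab1}", b"THM{abcd}", b"thm{abc}")
results = {candidate: bool(flag_pattern.search(candidate)) for candidate in candidates}

for candidate, matched in results.items():
    print(candidate, matched)
```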

Conditions


Boolean operators   Relational operators   Arithmetic operators   Bitwise operators   Keywords
and                 >=                     +                      &                   1 of them
or                  <=                     -                      |                   any of them
not                 <                      *                      <<                  none of them
                    >                      \                      >>                  contains
                    ==                     %                      ~                   icontains
                    !=                                            ^                   startswith
                                                                                      istartswith
                                                                                      endswith
                                                                                      iendswith
                                                                                      iequals
                                                                                      matches
                                                                                      not defined
                                                                                      filesize

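Some examples follow. Each rule is meant to be read on its own: YARA rejects duplicate rule names within one file and reports an error for strings that are defined but never referenced in the condition, so rename the rules and drop unused strings before compiling them.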

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
all of them // Matches when all defined strings are present.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
any of them // Matches when at least one of the defined strings is present.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
1 of ($*) // Identical to "any of them" condition.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
$1 or $2 // Matches when 'Try' or 'Hack' is present.
}

Note that the condition above must not be wrapped in double quotes. A quoted condition is a string literal rather than a Boolean expression over $1 and $2, so the strings would never be evaluated.

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
$1 and $2 // Matches when 'Try' and 'Hack' are present.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
$1 and ($2 or $3) // Matches when 'Try' and 'Hack' or 'Try' and 'Me' combinations are present.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
none of them // Matches only when none of the defined strings are present.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
filesize < 500KB // Matches all files smaller than 500 KiloByte. This can only be used when matching for files.
}

rule differentConditions
{
strings:
$1 = "Try"
$2 = "Hack"
$3 = "Me"
condition:
($1 or $2) and filesize < 200KB // Matches for 'Try' or 'Hack' in files smaller than 200KB.
}
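
The logic behind these conditions can be mirrored in a few lines of Python, which can help when reasoning about a rule before running it. The sketch below is a simplified model with hypothetical names. It only checks whether each string occurs in a buffer and treats the buffer length as filesize.

```python
STRINGS = {"$1": b"Try", "$2": b"Hack", "$3": b"Me"}
KB = 1024  # YARA's KB suffix multiplies the number by 1024

def evaluate_conditions(data: bytes) -> dict:
    """Evaluate several of the example conditions against one buffer."""
    found = {name: value in data for name, value in STRINGS.items()}
    return {
        "all of them": all(found.values()),
        "any of them": any(found.values()),
        "none of them": not any(found.values()),
        "$1 and ($2 or $3)": found["$1"] and (found["$2"] or found["$3"]),
        "($1 or $2) and filesize < 200KB":
            (found["$1"] or found["$2"]) and len(data) < 200 * KB,
    }

for name, value in evaluate_conditions(b"TryHackMe").items():
    print(f"{name}: {value}")
```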

Answer the questions below

Q: What modifier should be used if you want to search for 2-byte encoded characters?
A: wide

Q: What condition should be used if you want to exclude the defined strings from the matching process?
A: none of them

Basic Syntax for YARA

Run YARA rules with the yara64 binary. Running it without arguments prints the usage:

PS C:\TMP> yara64
yara: wrong number of arguments
Usage: yara [OPTION]... [NAMESPACE:]RULES_FILE... FILE | DIR | PID
Try `--help` for more options

Use the argument --help to see the available options.

Short Flag   Long Flag         Description
-r           --recursive       Scan directories recursively
-n           --negate          Print only rules that weren't matched
-S           --print-stats     Print metadata related to the performance and efficiency of the rule
-s           --print-strings   Print the strings that were matched in a file
-X           --print-xor-key   Print xor key and plaintext of matched strings
-v           --version         Show the YARA version
-p           --threads=N       Use N threads to scan a directory

Run a YARA Rule for the First Time


PS C:\TMP> Get-Content C:\TMP\YARARULES\myfirstrule.yar
rule myfirstrule
{
meta:
Description = "Searches for the string tryhackme"
Author = "TryHackMe"

strings:
$s = "tryhackme"

condition:
$s
}

This rule searches for the string tryhackme in the directory given as an argument. For example, to search the target directory C:\TMP, run the following command.

PS C:\TMP> yara64 C:\TMP\YARARULES\myfirstrule.yar C:\TMP\
myfirstrule C:\TMP\test.txt

The result shows that the string was found in the file C:\TMP\test.txt.
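
Conceptually, the command above walks the files in the target directory and reports each one whose content satisfies the rule. The Python sketch below models that behavior for this one rule, including the difference the -r option makes. The function name and the throwaway files are hypothetical, and real YARA does far more than a substring check.

```python
import tempfile
from pathlib import Path

def scan(target: Path, needle: bytes = b"tryhackme", recursive: bool = False):
    """Return paths of files under target whose content contains needle."""
    walker = target.rglob("*") if recursive else target.glob("*")
    matches = []
    for path in sorted(walker):
        if path.is_file() and needle in path.read_bytes():
            matches.append(path)
    return matches

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "test.txt").write_text("welcome to tryhackme")
    (root / "other.txt").write_text("nothing here")
    (root / "sub").mkdir()
    (root / "sub" / "deep.txt").write_text("tryhackme again")

    top_level = [p.name for p in scan(root)]                  # like: yara64 rule dir
    nested = [p.name for p in scan(root, recursive=True)]     # like: yara64 -r rule dir
    for name in nested:
        print("myfirstrule", name)
```

Without the recursive flag, only files directly inside the target directory are examined, so a match in a subdirectory is missed.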

Combine Multiple Rules in One File

Putting multiple YARA rules in one file can make things easier, especially when they focus on the same malware, campaign, or purpose. For example, if you’re tracking a malware family with different versions, having all the rules in one file keeps everything organized and easier to use. The same goes for investigating a phishing campaign—grouping rules for emails, payloads, and domains into one file makes the process smoother.

On the other hand, if the rules are unrelated, it’s better to keep them in separate files. This avoids confusion and makes updates easier. The idea is simple: combine rules that share a common goal to keep things clear and efficient, but don’t lump everything together just for the sake of it.

rule M_APT_Downloader_WINELOADER_1
{
meta:
author = "Mandiant"
disclaimer = "This rule is meant for hunting and is not tested to run in a production environment."
description = "Detects rc4 decryption logic in WINELOADER samples"

strings:
$ = {B9 00 01 00 00 99 F7 F9 8B 44 24 [50-200] 0F B6 00 3D FF 00 00 00} // Key initialization
$ = {0F B6 00 3D FF 00 00 00} // Key size

condition:
all of them
}

rule M_APT_Downloader_WINELOADER_2
{
meta:
author = "Mandiant"
disclaimer = "This rule is meant for hunting and is not tested to run in a production environment."
description = "Detects payload invocation stub in WINELOADER"

strings:

// 48 8D 0D ?? ?? 00 00 lea rcx, module_start (Pointer to encrypted resource)
// 48 C7 C2 ?? ?? 00 00 mov rdx, ???? (size of encrypted source)
// E8 [4] call decryption
// 48 8D 0D [4] lea rcx, ??
// 48 8D 05 [4] lea rax, module_start (decrypted resource)
// 48 89 05 [4] mov ptr_mod, rax

$ = {48 8D 0D ?? ?? 00 00 48 C7 C2 ?? ?? 00 00 E8 [4] 48 8D 0D [4] 48 8D 05 [4] 48 89 05}

condition:
all of them
}

rule M_APT_Dropper_Rootsaw_Obfuscated
{
meta:
author = "Mandiant"
disclaimer = "This rule is meant for hunting and is not tested to run in a production environment."
description = "Detects obfuscated ROOTSAW payloads"

strings:
$ = "function _"
$ = "new XMLHttpRequest();"
$ = "'\\x2e\\x7a\\x69\\x70'"
$ = "'\\x4f\\x70\\x65\\x6e'"
$ = "\\x43\\x3a\\x5c\\x57"
$ = "https://waterforvoiceless.org/util.php"

condition:
2 of them
}

Answer the questions below

Q: What option do you need to pass to ensure you scan all directories recursively?
A: -r

Indicators of Compromise Detected

When you find clear evidence of a security problem (called an Indicator of Compromise) on a system, the first thing to do is follow the steps in your company's incident response (IR) plan. A good company will have a document that explains what to do before, during, and after a security incident.

Your first step is usually to tell the person or team in charge of handling incidents. They will follow the plan, bring the right people together, and start working on fixing the problem.

The plan might include using a system like DAIR (Dynamic Approach to Incident Response) to organize the steps. You might be asked to help with tasks such as looking more closely at the affected computer, saving important evidence, disconnecting the computer from the network, and more.

When you're threat hunting, it’s very important to write down everything you find. This information can be really helpful if there's ever an incident, and it can save time when you start responding to it. Time is very important when dealing with security problems. In terms of the Cyber Kill Chain, documenting things early could make the difference between catching a problem in the Command and Control (C2) phase or later, when the attacker has already started doing damage (Actions on Objectives phase).

YARA: Hands-on Exercise

Exercise 1

Write a YARA rule to find the file that contains the pattern "THM{}". Use the C:\TMP\Exercise1\ path as the target in the YARA command and enter the flag as the answer.

rule exercise1
{
strings:
$1 = "THM{"

condition:
$1
}

Run as:

PS C:\TMP\YARARULES> yara64 .\exercise1.yara C:\TMP\Exercise1

Exercise 2

Write a YARA rule that finds the file that contains the following strings: "Yet another", "Ridiculous acronym". Use the C:\TMP\Exercise2\ path as the target in the YARA command. Enter the name of the file as the answer.

rule exercise2
{
strings:
$1 = "Yet another" wide
$2 = "Ridiculous acronym" wide

condition:
$1 and $2
}

Run as:

PS C:\TMP\YARARULES> yara64 .\exercise2.yara C:\TMP\Exercise2

Exercise 3

Write a YARA rule that searches for the file that contains the base64 encoded string "THM{This was a really fun exercise}". Use the C:\TMP\Exercise3\ path as the target in the YARA command, and enter the name of the file as the answer.

rule exercise3
{
strings:
$1 = "THM{This was a really fun exercise}" base64

condition:
$1
}

Run as:

PS C:\TMP\YARARULES> yara64 .\exercise3.yara C:\TMP\Exercise3

Exercise 4

Write a YARA rule that searches for the XOR encrypted string "THM{FoundSomethingHidden}" in the C:\TMP directory and subdirectories. Fill in the encrypted text and XOR key used.

rule exercise4
{
strings:
$1 = "THM{FoundSomethingHidden}" xor

condition:
$1
}

Run as:

PS C:\TMP\YARARULES> yara64 -r -X .\exercise4.yara C:\TMP\Exercise4

Answer the questions below

Q: What is the flag found in exercise 1?
A: THM{Threathuntingisawesome}

Q: What is the filename found in exercise 2? (Format: filename.extension)
A: file10.txt

Q: What is the filename found in exercise 3? (Format: filename.extension)
A: file13.txt

Q: What was the XOR key used for encryption in exercise 4?
A: 0x01

Q: What encrypted string did you find in exercise 4?
A: UILzGntoeRnlduihofIheedo|