Migrated to a new hosting company… will take a few weeks to get setup.
Today I’m going to be discussing the top APIs imported from a large number of confirmed malware samples. This all started out of curiosity and a lack of published research on the topic. I’m not 100% sure I reached any concrete conclusions after completing this experiment, but here are my results and the conclusions I drew.
The approach: download the largest collections of malware that I could find (making sure all samples were unique and confirmed on VirusTotal), then retrieve the imports of all of the PE files. I ended up with 549,035 PE samples with a final uncompressed size of just over 5TB. Once I had all of my samples (thanks to virusshare.com and my own personal collection), I wrote a multi-threaded Python script (yes, it was terribly slow) that retrieved all the imports and counted how many samples imported each API, counting an API only once per sample. The script then racked and stacked the results to show which APIs were imported the most.
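The counting step can be sketched roughly as follows. This is a minimal sketch, not the original script: it assumes the third-party pefile library for parsing, the function names (extract_imports, tally_imports) are hypothetical, and the sample data is a toy stand-in for real files.

```python
from collections import Counter


def extract_imports(path):
    """Return the set of API names one PE file imports by name.

    Assumes the third-party `pefile` package; it is imported lazily so
    the tallying code below still runs without it installed.
    """
    import pefile

    pe = pefile.PE(path, fast_load=True)
    # Parse only the import directory to keep each sample cheap.
    pe.parse_data_directories(
        directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_IMPORT"]]
    )
    names = set()
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        for imp in entry.imports:
            if imp.name:  # imports by ordinal carry no name
                names.add(imp.name.decode("ascii", "replace"))
    pe.close()
    return names


def tally_imports(per_sample_imports):
    """Count how many samples import each API (once per sample)."""
    counts = Counter()
    no_imports = 0
    for names in per_sample_imports:
        unique = set(names)
        if not unique:
            no_imports += 1
        counts.update(unique)
    return counts, no_imports


# Toy data standing in for real samples (hypothetical, not from the data set).
samples = [
    ["GetProcAddress", "LoadLibraryA", "GetProcAddress"],
    ["GetProcAddress", "ExitProcess"],
    [],
]
counts, no_imports = tally_imports(samples)
for api, n in counts.most_common(3):
    print(api, n)
print("samples with no imports:", no_imports)
```

On a real collection, each entry in samples would come from extract_imports(path), and the per-file parsing is the part worth spreading across worker processes.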
There was a final total of 120,126 uniquely imported APIs, a much larger number than I would have predicted. A total of 21,043 samples had no imports at all, compared to 527,992 samples that imported at least one API. There were a number of interesting findings. I’m attaching a PDF with all of the imports at the end.
The first result that I found interesting was that only 3.8% of the samples (21,043 of 549,035) had no imports at all. That means that less than 5% of the files were packed with no imports, statically included their DLLs, or used their own methods for finding and importing APIs outside of the PE import table. This is fairly interesting and not what I’ve personally seen in the wild.
Top Ten Imported APIs
#1 GetProcAddress 394546
#2 LoadLibraryA 344607
#3 GetModuleHandleA 305054
#4 ExitProcess 301073
#5 VirtualAlloc 244900
#6 WriteFile 223855
#7 GetModuleFileNameA 221006
#8 CloseHandle 220358
#9 RegCloseKey 213748
#10 VirtualFree 211790
The second and most important result was the top ten imported APIs. If you compare the top ten APIs with the remaining imported APIs, there’s a significant drop off. I expected some APIs such as WinExec (one of my personal favorites) to have a much larger import count, but it was only imported 31,943 times, far fewer than the number one import. Even between the number one import (GetProcAddress, 394,546) and the number three import (GetModuleHandleA, 305,054) there is a fairly significant difference. What this tells me is that a significant number of malicious files are dynamically loading their own libraries at run time (which suggests they may be packed), a very interesting result. Attached is a graph showing the large drop off after GetProcAddress and LoadLibraryA (only the top 100 imported APIs are graphed).
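To put the drop off in proportion, the counts above can be turned into a share of all samples. This is only a quick sketch of the arithmetic using the figures reported in this post; the choice of all 549,035 samples as the denominator (rather than only the samples with imports) is my assumption.

```python
TOTAL_SAMPLES = 549_035  # total PE samples reported in this post

# Import counts copied from the top ten list above.
top_ten = {
    "GetProcAddress": 394_546,
    "LoadLibraryA": 344_607,
    "GetModuleHandleA": 305_054,
    "ExitProcess": 301_073,
    "VirtualAlloc": 244_900,
    "WriteFile": 223_855,
    "GetModuleFileNameA": 221_006,
    "CloseHandle": 220_358,
    "RegCloseKey": 213_748,
    "VirtualFree": 211_790,
}
WINEXEC = 31_943  # WinExec count quoted above


def share(count, total=TOTAL_SAMPLES):
    """Percentage of all samples that import a given API."""
    return 100.0 * count / total


for api, count in top_ten.items():
    print(f"{api:<20} {count:>7,} {share(count):5.1f}%")
print(f"{'WinExec':<20} {WINEXEC:>7,} {share(WINEXEC):5.1f}%")
```

On these figures GetProcAddress shows up in roughly 72% of all samples, while WinExec shows up in under 6%.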
One of the most interesting results from this experiment was the large number of APIs imported (120,126). I wasn’t expecting this, so I began looking through some of the imports for any common trends that stuck out. What became clear is that a number of APIs were being imported from third-party DLLs. For example, av_dup_packet was imported from an FFmpeg audio/video DLL (http://ffmpeg.org/doxygen/0.6/avpacket_8c.html). After some discussion with my friend Matt Weeks (scriptjunkie, website linked below), it seems likely that these APIs are being used to break antivirus sandboxes (and potentially malware sandboxes like Cuckoo). Further, there are a number of imports that are just aliases for standard functions, such as vlc_memset (an alias for memset). These are two interesting techniques that would work great for evading a heuristic or signature-based AV product that examines imports. To read more about these techniques, see the link I included in the Resources section at the bottom.
There were a large number of Windows SystemFunction APIs imported (undocumented Windows APIs). Specifically, there were 38 SystemFunction imports, with import counts ranging from 122 down to just 10. While this is not unexpected, I did find some of these imports interesting. I expected the most imported ones to be functions that help with retrieving passwords or hashes from the system, but that doesn’t appear to be the case (at least from my knowledge of the methods used to retrieve passwords or hashes from Windows). The most imported SystemFunction was SystemFunction040, which is an alias for RtlEncryptMemory according to MSDN. More interestingly, SystemFunction006 was the third most imported SystemFunction; it is used in the current version of Mimikatz (Google it if you don’t know what Mimikatz does).
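Pulling the SystemFunction imports out of the overall results is a simple filter on the API name. The sketch below assumes the results are available as a mapping from API name to import count; the helper name and the counts are toy values for illustration, not numbers from my data.

```python
import re

# SystemFunction exports are named SystemFunction001, SystemFunction002, ...
SYSTEM_FUNCTION = re.compile(r"^SystemFunction\d{3}$")


def system_function_counts(counts):
    """Return (api, count) pairs for SystemFunction imports, most imported first."""
    hits = [(api, n) for api, n in counts.items() if SYSTEM_FUNCTION.match(api)]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy counts for illustration only (hypothetical, not the real results).
toy_counts = {
    "GetProcAddress": 9,
    "SystemFunction006": 3,
    "SystemFunction040": 5,
    "vlc_memset": 1,
}
ranked = system_function_counts(toy_counts)
print(ranked)
```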
There were some remaining imports that struck me as interesting, but overall nothing I didn’t expect. For example, one file imported an API from the SKIDROW DLL. SKIDROW is a notorious group that cracks the commercial copy protection in PC games; I can only imagine what this sample was trying to do.
Feel free to draw your own conclusions from these results. I’d love to hear any thoughts on these findings.
Attached here are the results in a PDF. If you’d like the Excel file to perform your own analysis, please email me at firstname.lastname@example.org.
The Windows API is one of the “must know” areas for most reverse engineers and exploit writers. The more I use the APIs, the more I find myself looking up specific APIs and wishing that I had known earlier what I know now about these sometimes vague and/or mysterious functions.
Why should someone who’s in the INFOSEC community care about these APIs? Put simply, they can make your life considerably easier. If you do incident response, are just getting started writing exploits, or do anything related, then you’ve likely seen these APIs mentioned before. They’re a crucial part of everything from shellcode design to malware analysis.
One of the most common places you’ll run into these APIs is in malware analysis. The Windows APIs are crucial to nearly every piece of software that runs on Windows. Without these APIs, malware authors would be left writing considerably more code, which few malware authors want to do. Since these APIs are the malware’s link to Windows itself, just examining them can give you great clues about what the malware is trying to do. (Note: malware authors could statically compile their code, which would not need to import the APIs. This is not common and would leave the malware sample significantly larger.)
There are endless tools that will show you which APIs are being imported. Some of the most common are OllyDbg, Immunity Debugger, IDA Pro, and MASTIFF, along with countless other tools and scripts. Let’s take a look at a malware sample’s imports.
kernel32.dll DeleteCriticalSection 0x4090dc
kernel32.dll LeaveCriticalSection 0x4090e0
kernel32.dll EnterCriticalSection 0x4090e4
kernel32.dll VirtualFree 0x4090e8
kernel32.dll LocalFree 0x4090ec
kernel32.dll GetCurrentThreadId 0x4090f0
kernel32.dll GetStartupInfoA 0x4090f4
kernel32.dll GetCommandLineA 0x4090f8
kernel32.dll FreeLibrary 0x4090fc
kernel32.dll ExitProcess 0x409100
kernel32.dll WriteFile 0x409104
kernel32.dll UnhandledExceptionFilter 0x409108
kernel32.dll RtlUnwind 0x40910c
kernel32.dll RaiseException 0x409110
kernel32.dll GetStdHandle 0x409114
user32.dll GetKeyboardType 0x40911c
user32.dll MessageBoxA 0x409120
advapi32.dll RegQueryValueExA 0x409128
advapi32.dll RegOpenKeyExA 0x40912c
advapi32.dll RegCloseKey 0x409130
kernel32.dll TlsSetValue 0x409138
kernel32.dll TlsGetValue 0x40913c
kernel32.dll TlsFree 0x409140
kernel32.dll TlsAlloc 0x409144
kernel32.dll LocalFree 0x409148
kernel32.dll LocalAlloc 0x40914c
wsock32.dll closesocket 0x409154
wsock32.dll WSACleanup 0x409158
wsock32.dll recv 0x40915c
wsock32.dll send 0x409160
wsock32.dll connect 0x409164
wsock32.dll htons 0x409168
wsock32.dll socket 0x40916c
wsock32.dll WSAStartup 0x409170
wsock32.dll gethostbyname 0x409174
advapi32.dll RegSetValueExA 0x40917c
advapi32.dll RegCreateKeyA 0x409180
advapi32.dll RegCloseKey 0x409184
advapi32.dll AdjustTokenPrivileges 0x409188
advapi32.dll LookupPrivilegeValueA 0x40918c
advapi32.dll OpenProcessToken 0x409190
user32.dll GetForegroundWindow 0x409198
user32.dll wvsprintfA 0x40919c
kernel32.dll CloseHandle 0x4091a4
kernel32.dll RtlMoveMemory 0x4091a8
kernel32.dll RtlZeroMemory 0x4091ac
kernel32.dll WriteProcessMemory 0x4091b0
kernel32.dll ReadProcessMemory 0x4091b4
kernel32.dll VirtualProtect 0x4091b8
kernel32.dll Sleep 0x4091bc
kernel32.dll GetTickCount 0x4091c0
kernel32.dll MoveFileExA 0x4091c4
kernel32.dll ReadFile 0x4091c8
kernel32.dll WriteFile 0x4091cc
kernel32.dll SetFilePointer 0x4091d0
kernel32.dll FindClose 0x4091d4
kernel32.dll FindFirstFileA 0x4091d8
kernel32.dll DeleteFileA 0x4091dc
kernel32.dll CreateFileA 0x4091e0
kernel32.dll GetPrivateProfileIntA 0x4091e4
kernel32.dll GetPrivateProfileStringA 0x4091e8
kernel32.dll WritePrivateProfileStringA 0x4091ec
kernel32.dll SetFileAttributesA 0x4091f0
kernel32.dll GetCurrentProcessId 0x4091f4
kernel32.dll GetCurrentProcess 0x4091f8
kernel32.dll Process32Next 0x4091fc
kernel32.dll Process32First 0x409200
kernel32.dll Module32Next 0x409204
kernel32.dll Module32First 0x409208
kernel32.dll CreateToolhelp32Snapshot 0x40920c
kernel32.dll WinExec 0x409210
kernel32.dll lstrcpyA 0x409214
kernel32.dll lstrcatA 0x409218
kernel32.dll lstrcmpiA 0x40921c
kernel32.dll lstrcmpA 0x409220
kernel32.dll lstrlenA 0x409224
kernel32.dll lstrlenA 0x40922c
kernel32.dll lstrcpyA 0x409230
kernel32.dll lstrcmpiA 0x409234
kernel32.dll lstrcmpA 0x409238
kernel32.dll lstrcatA 0x40923c
kernel32.dll WriteProcessMemory 0x409240
kernel32.dll VirtualProtect 0x409244
kernel32.dll TerminateThread 0x409248
kernel32.dll TerminateProcess 0x40924c
kernel32.dll Sleep 0x409250
kernel32.dll OpenProcess 0x409254
kernel32.dll GetWindowsDirectoryA 0x409258
kernel32.dll GetTickCount 0x40925c
kernel32.dll GetSystemDirectoryA 0x409260
kernel32.dll GetModuleHandleA 0x409264
kernel32.dll GetCurrentProcessId 0x409268
kernel32.dll GetCurrentProcess 0x40926c
kernel32.dll GetComputerNameA 0x409270
kernel32.dll ExitProcess 0x409274
kernel32.dll CreateThread 0x409278
user32.dll wvsprintfA 0x409280
user32.dll UnhookWindowsHookEx 0x409284
user32.dll SetWindowsHookExA 0x409288
user32.dll GetWindowThreadProcessId 0x40928c
user32.dll GetWindowTextA 0x409290
user32.dll GetForegroundWindow 0x409294
user32.dll GetClassNameA 0x409298
user32.dll CallNextHookEx 0x40929c
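A listing like the one above is easier to reason about once it is grouped by DLL. Here is a minimal sketch of that step, assuming the tool output is whitespace-separated "dll api address" triples as shown; the function names are hypothetical, and the input is a short excerpt of the listing.

```python
from collections import defaultdict


def parse_import_listing(text):
    """Parse whitespace-separated 'dll api address' triples into tuples."""
    tokens = text.split()
    if len(tokens) % 3:
        raise ValueError("expected dll/api/address triples")
    rows = []
    for i in range(0, len(tokens), 3):
        dll, api, addr = tokens[i:i + 3]
        rows.append((dll.lower(), api, int(addr, 16)))
    return rows


def group_by_dll(rows):
    """Map each DLL name to the set of APIs imported from it."""
    grouped = defaultdict(set)
    for dll, api, _addr in rows:
        grouped[dll].add(api)
    return grouped


# A short excerpt of the listing above.
listing = """kernel32.dll VirtualFree 0x4090e8 user32.dll MessageBoxA 0x409120
advapi32.dll RegCloseKey 0x409130 wsock32.dll connect 0x409164
kernel32.dll WriteProcessMemory 0x4091b0"""

rows = parse_import_listing(listing)
grouped = group_by_dll(rows)
for dll in sorted(grouped):
    print(dll, sorted(grouped[dll]))
```

Grouping this way makes it obvious at a glance that the sample pulls in networking (wsock32.dll) and registry (advapi32.dll) functionality alongside the usual kernel32.dll imports.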
Looking over these imported API functions may at first seem useless to the untrained analyst. However, if you begin to dissect what some of the APIs can be used for, you can begin to make assumptions about the function of this malware. For example, GetTickCount is a very common API for detecting debuggers. AdjustTokenPrivileges and LookupPrivilegeValueA are both commonly used for accessing Windows security tokens. RegSetValueExA, RegCreateKeyA, and RegCloseKey are used when accessing and altering a registry key. Taking just these APIs into consideration, you could begin to form some interesting hypotheses about the capabilities of this specific sample.
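That kind of reasoning can be written down as a small lookup table. The sketch below only covers the handful of APIs discussed in the paragraph above; the table, the hint wording, and the function name are illustrative assumptions rather than a complete or authoritative mapping.

```python
# Illustrative hints only, limited to the APIs discussed above.
API_HINTS = {
    "GetTickCount": "timing check (possible debugger detection)",
    "AdjustTokenPrivileges": "security token access",
    "LookupPrivilegeValueA": "security token access",
    "RegSetValueExA": "registry access or modification",
    "RegCreateKeyA": "registry access or modification",
    "RegCloseKey": "registry access or modification",
}


def summarize_imports(imports):
    """Group a sample's imports by the behavior they hint at."""
    summary = {}
    for api in imports:
        hint = API_HINTS.get(api)
        if hint is not None:
            summary.setdefault(hint, []).append(api)
    return summary


sample_imports = ["GetTickCount", "Sleep", "RegCreateKeyA", "RegCloseKey"]
for hint, apis in summarize_imports(sample_imports).items():
    print(f"{hint}: {', '.join(apis)}")
```

A hit in this table is only a hint. Plenty of legitimate programs import the same functions, so treat the output as a starting hypothesis rather than a verdict.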
I’ve noticed that analysts who don’t totally understand these API functions will typically ignore them. For that reason I’m creating a “cheat sheet” for the Windows API functions. The “pre-final” release is attached below.
Please don’t forget that Microsoft did not build these APIs for malicious use, and they are very commonly used by Windows programmers (unless it’s an undocumented API). Thus analyzing just the imported APIs may not tell you whether a sample is malicious (but it is very useful if you already know a sample is malicious).
Over the past month I’ve also been working on analyzing what is now over 5TB of malware to gather the most frequently used Windows APIs. This data will likely continue to process for close to another month. Once this is done I’ll work on completing this cheat sheet based on those findings and write another post about my discoveries. Keep in mind that this list is not final. If you have any feedback, comments, questions, or recommendations, please make them!
In the course of developing the current list I used multiple resources, and I’d like to highlight a few. These are also great resources if you’re looking to learn more.
Practical Malware Analysis - great book on reverse engineering malware
MSDN - where to go if you’re curious about a specific Windows API
Windows PE File Details - great article that describes the fundamentals of the PE file and gives more details surrounding PE file imports
Cheat Sheet Version .5 :
Summary : Great Offensive Security Course & Certification — Just make sure you’re prepared!
Before I start diving into this review I should say that I had a background in writing exploits and reverse engineering before I got into Cracking the Perimeter (CTP). While this didn’t necessarily help me in every aspect, my knowledge of x86 Assembly definitely did. This influences my perspective on the course.
This is definitely one of those courses where you need some sort of background before starting. Completing the OSCP is a great start. I highly recommend knowing x86 Assembly at least on some basic level, understanding the Windows API, and having a background in web applications (PHP, etc). The more of a background you have, the easier the course will be. I believe one of the reasons for the high failure rate on the certification is that people don’t have the background required (it’s hard to learn the things I mentioned and the course material at the same time).
This course is focused on the code side of exploitation. You’re working to find bugs at the code level and then take advantage of them (rather than depending on a bug someone else found and using their exploit). You spend your time looking for web vulnerabilities (PHP, etc) and Windows application vulnerabilities (x86). This course picks up where PWB (see this review) left off. The web vulnerabilities you work on in the course range from cross-site scripting to remote and local file inclusions.
The majority of the course is focused on Windows exploit development and discovery (fuzzing). This ranges from simple SEH overwrites to more complicated scenarios (encoding, bad characters, small space for shellcode, etc). Offsec does a good job of keeping the course material interesting by throwing a variety of situations at you.
The only complaint I had about the material is that some of it is dated. For example, there is some dated material on antivirus avoidance. Most of these techniques don’t work today with modern AV. It does, however, challenge you to think about new and innovative ways to evade AV. This course also doesn’t go into any details about the latest exploit techniques (ROP, heap spraying, etc). These techniques are covered in Offsec’s Advanced Windows Exploitation. Personally I think that makes sense: there’s a lot to be covered in this course, and you need to be proficient in intermediate exploits before moving on to more advanced ones (walking before you run).
Of course the exam is hard. Offsec’s courses are some of the only IT security courses that force you to exhibit true mastery of the material. I’ve found that no matter how prepared you are for the course you’ll still end up learning a lot in the exam. I was lucky enough to pass the exam on the first attempt. I would challenge everyone to take the exam and continue to do so until you pass!
Offsec Courses in General:
Aside from this specific course, it’s worth mentioning the overall value of Offsec’s courses and certifications. In today’s oversaturated market of mostly useless certifications there are a few that stand out. Offsec is one of the companies out there that has put together extremely high quality material that translates into real world knowledge and application. I’m not saying you’ll be an expert after one of Offsec’s courses, but you will have a solid foundation in the material.
People are often curious about what certifications they should take to further their abilities in IT security and/or push themselves to learn something new and challenging. My answer is almost always an Offsec certification. One of the best parts about their courses is that they don’t just give you material and expect you to memorize it. The Offsec team specifically forces you to learn how to put the course material into action. They also force you to take it further, much as a security researcher would. Then they give you an exam, which some consider quite challenging, which only continues to force you to “Try Harder”.
If you plan on reciting facts about IT security, take a random certification. If you want to actually practice IT security, take an Offsec course.
In my day to day work I primarily focus on defensive cyber and countering advanced actors. Thus I find myself dreaming about what the next major threats will look like. With today’s defensive posture on most networks it’s pretty clear that getting malicious code on a target machine isn’t normally that hard. Short of a few defensive techniques (e.g. whitelisting), the infections keep happening, and corporations rely on network sensors to detect and block connections back to the malware’s communications channels. For this posting I’m going to assume that this is true (assume PowerShell or some executable not on any antivirus list is used against the host).
Assuming that, let’s look at some of the current public malware C2 methods.
Current State of Malware C2
Web C2 -
Most of the “everyday” malware has been using this for years. It has evolved over time, from data embedded in a website, to HTTPS-only connections, to the newer watering hole attacks. These techniques can be varied, but if the malware is calling out to any sort of hard-coded domain, these connections are usually easy to detect, monitor, and block.
At Defcon 19, Itzik Kotler and Iftach Ian Amit gave a talk about creating a VOIP-based botnet. This idea helped motivate me in my work. However, there are a number of problems with VOIP-based connections. Does the network allow VOIP connections in and out? How are its VOIP services configured? A VOIP-controlled botnet is more complicated and can often be unreliable. This technique hasn’t really been seen in the wild. For more information check out their talk at the link listed.
Over the years there have been multiple implementations of using social networking sites for C2. These are attractive because they are usually not blocked at the network level. They also generate a lot of traffic on a daily basis, so a large number of connections during the work day would not stand out even under more advanced traffic analysis. The problems lie with the social networking sites themselves. It can be difficult for the malware author to transfer a lot of data and to conceal what they are doing from the public (e.g. a Twitter feed being viewable) and from the site’s security admins. If detected, it also becomes very easy for Twitter/Myspace/Facebook to disable the accounts.
DNS has been a classic C2 method for a number of years. There have been numerous implementations of this technique, and they are traditionally fairly complicated. The major problem with DNS C2 is that to avoid detection you have to be very slow. A large amount of DNS activity to a specific domain is very apparent and not normal in a network. The slow speed can cause major issues if you’re trying to upload to or download from a target host.
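From the defender’s side, that volume is straightforward to spot. Here is a minimal sketch that counts queries per domain and flags anything above a threshold. The function names, the threshold, and the example domains are all assumptions for illustration, and taking the last two labels as the "domain" is a deliberately naive shortcut that breaks on suffixes like co.uk.

```python
from collections import Counter


def base_domain(qname):
    """Naive base domain: the last two labels of the query name.

    Assumption for illustration; real tooling would consult the public
    suffix list instead.
    """
    labels = qname.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def noisy_domains(queries, threshold):
    """Return domains queried at least `threshold` times."""
    counts = Counter(base_domain(q) for q in queries)
    return {domain: n for domain, n in counts.items() if n >= threshold}


# Toy query log: 50 distinct subdomains of one domain plus normal traffic.
queries = [f"chunk{i}.c2.example.com" for i in range(50)]
queries += ["www.example.org", "mail.example.org"]

flagged = noisy_domains(queries, threshold=20)
print(flagged)  # {'example.com': 50}
```

A real deployment would count per time window and per client, but even this crude tally shows why fast DNS C2 stands out.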
If you were to design an advanced C2 method that is currently almost impossible to detect, IPv6 is the way to go. The problem for network defenders is that IPv6 is currently turned on out of the box for most hosts and network appliances. The even bigger problem for network defenders is that there currently aren’t very many products that support network monitoring on IPv6 (and nearly no legacy systems support it). If the targeted organization has left IPv6 on, then an attacker has the perfect C2 method. I may research more about IPv6 C2 channels in the future, as it’s a great C2 method.
Gmail C2 –
A few months back when I started investigating advanced malware C2 I found a piece of malware on contagio (link below). I was pointed to this malware by scriptjunkie (www.scriptjunkie.us). After fully reversing the malware (I can share IDA notes if anyone is curious), it became clear that this malware was using Gmail for its C2. What stood out was that it wasn’t trying to hide anything. This is interesting because the malware itself is advanced and the technique had not been seen before. As indicated on contagio, you can just run strings to get the credentials for the Gmail account. This tells me that this malware technique was only being tested or was very targeted, as most advanced malware uses some sort of encoding or encryption. Aside from this specific malware sample, this C2 method poses two major problems. One is managing the email account to use it effectively as a C2 node, and the other is Google itself. Google is very good at finding and tracking advanced threats, so once the email address was identified the account was quickly shut down.
All of these techniques have their plusses and minuses. During the research of these techniques an even better C2 method quickly came to mind. Why not use some of the cloud storage services?
Cloud Based Malware/Botnets
There are a number of benefits to using cloud-based storage for malware C2. The first is that most networks openly allow these services. They’re regularly used and are sometimes a core part of the company’s processes. Next is the open API model most of these services have. If you look at the Dropbox, Evernote, and similar APIs, they are incredibly easy to use. Most include sample scripts and show you how to get started. The third great part about these services is that the APIs already integrate SSL (OAuth2 is pretty standard). Perfect for a covert channel. Lastly and obviously, these services were built for downloading and uploading large amounts of data. This is great for interacting with a target.
While researching I tried to find out whether anyone had written about this previously. I found a few discussions among network admins trying to decide whether or not to allow these services on their networks. Here are some screenshots of what I found.
All this information led me to developing EverRat.
About six months ago I started developing EverRat. I wrote EverRat in Python to make it platform independent. I didn’t want to write anything likely to be used by malware authors against organizations, so I only added basic features to prove the feasibility of this C2 method. It uses Evernote notes to communicate. At the network level it looks no different from a standard Evernote connection.
Here’s a brief video showing the RAT working.
This should clearly illustrate the problem to network defenders.
The most complicated code in the script is the synchronization logic. Once those issues were resolved, it became quite easy to add features. Thanks to the robust APIs, the executable or script is quite small. To make this even better for a pentesting scenario it would be awesome to switch this over to PowerShell and perhaps integrate it with @Obscuresec’s PowerShell bot (http://www.obscuresecurity.blogspot.com/2013/02/shmoocon-firetalks-and-epilogue.html).
If someone wants a copy of the code please email me at email@example.com. I’m not posting the code because I don’t want this used by anyone other than pentesters. I have also stopped work on this code as it was just a research project.
These threats are not likely to go away. As cloud-based storage becomes more and more popular, it’s safe to assume that advanced threats may start moving more and more to techniques like EverRat.
When I began working on this research project I was planning on giving this as a talk at a security conference. Instead I gave it as a brief talk at the San Antonio Hackers Association. This was largely due to the fact that a few weeks ago (months after I started my research) the first sample was published that was utilizing the exact techniques I was researching. The link is below.
This is just the start of a new generation of malware C2, and we need to start planning how best to defend against it!
Note: This is for educational purposes only. This does NOT illustrate any vulnerability with Evernote (or any of the cloud storage services mentioned). If Evernote, Dropbox, or any other cloud-based storage service would like details on how they might be able to prevent malicious users from using their services in future malware, please contact me at firstname.lastname@example.org.