- Work is still in progress on a lot of improvements and goodies, but only in CAPEv2; v1 is dead :)
CAPE extraction demystified, this is based on CAPEv2
- One of my friends recently asked me how CAPE extraction works and how I do it. Yes, I do it differently, why not? :D
CAPE debugger based config extraction
- Only CAPE debugger-based extractors require more than one sandbox run
- The debugger extractor grabs the offsets on the first run, then sets breakpoints and extracts the config on the second run. It can also be done another way, which I will explain at the end of the post
- If you want to understand how this works, read submitCAPE.py
- The new plan (submitCAPE2) is to have an already-ticked checkbox with a combo option; then, if a second job is needed (mainly for the debugger, as you say), submitCAPE2 will submit the sum of all the needed options in one go
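To make "a sum of all the options needed in one go" a bit more concrete, here is a minimal sketch of merging several option strings into one. It assumes options are comma-separated `key=value` pairs; `merge_options` and the option names in the usage note are made up for illustration and are not taken from submitCAPE.py or submitCAPE2.

```python
def merge_options(*option_strings):
    """Merge several 'key=value,key=value' strings into a single string.

    Hypothetical helper: later values win when a key repeats.
    """
    merged = {}
    for options in option_strings:
        for item in options.split(","):
            item = item.strip()
            if not item:
                continue  # skip empty fragments such as "" or trailing commas
            key, _, value = item.partition("=")
            merged[key.strip()] = value.strip()
    return ",".join(f"{key}={value}" for key, value in merged.items())
```

For example, `merge_options("dump=1", "bp0=0x1000,dump=1")` gives one string with each key only once, which could then be passed to a single resubmission instead of several.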
External libraries/external extractors
pip3 install mwcp git+https://github.com/kevthehermit/RATDecoders
DC3-MWCP
- Integration, we only import all plugins once

      #Import All config parsers
      try:
          import mwcp
          mwcp.register_parser_directory(os.path.join(CUCKOO_ROOT, "modules", "processing", "parsers", "mwcp"))
          malware_parsers = {block.name.split(".")[-1]: block.name for block in mwcp.get_parser_descriptions(config_only=False)}
          HAS_MWCP = True
          #disable logging
          #[mwcp.parser] WARNING: Missing identify() function for: a35a622d01f83b53d0407a3960768b29.Emotet.Emotet
      except ImportError as e:
          HAS_MWCP = False
          print("Missed MWCP -> pip3 install git+https://github.com/Defense-Cyber-Crime-Center/DC3-MWCP\nDetails: {}".format(e))
- Please note that the current parsers are in CAPEv2/modules/processing/parsers/mwcp
- You can add your plugins there too, but you need to follow their structure. I strongly suggest looking at DridexLoader as an example, which I rewrote for optimization

      #static_config_parsers - https://github.com/kevoreilly/CAPEv2/blob/master/lib/cuckoo/common/cape_utils.py#L138
      if cape_name and HAS_MWCP and cape_name in malware_parsers:
          try:
              reporter = mwcp.Reporter()
              reporter.run_parser(malware_parsers[cape_name], data=file_data)
- As you might know, CAPE dumps all kinds of payloads/extractions/shellcodes/compressions/unpacking/etc, and then scans them with the YARA rules from the CAPE folder
- If a YARA rule matched, the library is available, and the rule name is in our dict of config extractors, we run the extractor on the matched file and extract the config, pretty simple :)
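The lookup described above boils down to a dictionary dispatch. Here is a minimal sketch of the idea, not the actual cape_utils.py code: `extract_config`, `fake_family_parser` and the `parsers` registry are hypothetical names, and a parser is assumed to be a plain callable that takes the raw bytes.

```python
def extract_config(cape_name, file_data, parsers):
    """Run the parser registered for a YARA-derived family name, if any."""
    parser = parsers.get(cape_name)
    if parser is None:
        return None  # no extractor registered for this family
    try:
        return parser(file_data)
    except Exception as error:
        # A broken parser should not kill the whole processing run.
        print(f"parsing error with {cape_name}: {error}")
        return None


def fake_family_parser(data):
    """Made-up parser: treats the whole payload as a C2 address."""
    return {"c2": [data.decode().strip()]}


# Hypothetical registry: family name (same as the YARA rule name) -> parser.
parsers = {"FakeFamily": fake_family_parser}
```

The point of keying the registry by the YARA rule name is that detection and extraction stay decoupled: adding a family means adding one rule and one entry.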
RATDecoders
- Import

      try:
          from malwareconfig import fileparser
          from malwareconfig.modules import __decoders__, __preprocessors__
          HAS_MALWARECONFIGS = True
      except ImportError:
          HAS_MALWARECONFIGS = False
          print("Missed RATDecoders -> pip3 install git+https://github.com/kevthehermit/RATDecoders")
- Usage

      if not parser_loaded and cape_name in __decoders__:
          try:
              file_info = fileparser.FileParser(rawdata=file_data)
              module = __decoders__[file_info.malware_name]['obj']()
              module.set_file(file_info)
              module.get_config()
              malwareconfig_config = module.config
              #ToDo remove
              if isinstance(malwareconfig_config, list):
                  for (key, value) in malwareconfig_config[0].items():
                      cape_config["cape_config"].update({key: [value]})
              elif isinstance(malwareconfig_config, dict):
                  for (key, value) in malwareconfig_config.items():
                      cape_config["cape_config"].update({key: [value]})
          except Exception as e:
              log.error("CAPE: malwareconfig parsing error with %s: %s", cape_name, e)
- As you can see, if not parser_loaded (we don’t have MWCP/CAPE extractors) and the matched YARA name (cape_name) is in the RATDecoders parsers, we run it
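The list-or-dict handling in the usage snippet can be pulled into one small helper. This is only a sketch under the same assumptions as the snippet (a parser returns either a dict or a list whose first element is a dict); `merge_raw_config` is a hypothetical name, not a CAPE function.

```python
def merge_raw_config(cape_config, raw_config):
    """Copy a parser result into cape_config["cape_config"], one list per key."""
    if isinstance(raw_config, list):
        # Like the snippet above, only the first element of a list is used.
        raw_config = raw_config[0] if raw_config else {}
    if not isinstance(raw_config, dict):
        return cape_config  # unknown result type, leave everything untouched
    target = cape_config.setdefault("cape_config", {})
    for key, value in raw_config.items():
        target[key] = [value]
    return cape_config
```

Keeping the parser output in its own variable and writing into a separate target dict also avoids accidentally overwriting `cape_config` with the raw parser result.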
CAPE extractors
- Import

      cape_decoders = os.path.join(CUCKOO_ROOT, "modules", "processing", "parsers", "CAPE")
      CAPE_DECODERS = [
          os.path.basename(decoder)[:-3] for decoder in glob.glob(cape_decoders + "/[!_]*.py")
      ]
      for name in CAPE_DECODERS:
          try:
              file, pathname, description = imp.find_module(name, [cape_decoders])
              module = imp.load_module(name, file, pathname, description)
              malware_parsers[name] = module
          except (ImportError, IndexError) as e:
              # use % formatting so the name and error are actually substituted
              print("CAPE parser: No module named %s - %s" % (name, e))
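The `imp` module used above is deprecated and was removed in Python 3.12. As a hedged alternative, the same "load every non-underscore .py file in a directory" idea can be sketched with `importlib`; `load_decoders` is a hypothetical helper name, not something CAPE ships under that name.

```python
import glob
import importlib.util
import os


def load_decoders(directory):
    """Load each non-underscore .py file in `directory` as a module, keyed by name."""
    loaded = {}
    for path in sorted(glob.glob(os.path.join(directory, "[!_]*.py"))):
        name = os.path.splitext(os.path.basename(path))[0]
        try:
            spec = importlib.util.spec_from_file_location(name, path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)  # actually executes the file
            loaded[name] = module
        except Exception as error:
            print(f"CAPE parser: could not load {name} - {error}")
    return loaded
```

The returned dict plays the same role as `malware_parsers` in the snippet above: the module name doubles as the family name, and each module is expected to expose a `config(data)` function.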
- Usage, if we don’t have an MWCP extractor but we have CAPE’s

      if not parser_loaded and cape_name in malware_parsers:
          parser_loaded = True
          try:
              # keep the raw parser result in its own variable so cape_config is not overwritten
              cape_configraw = malware_parsers[cape_name].config(file_data)
              if isinstance(cape_configraw, list):
                  for (key, value) in cape_configraw[0].items():
                      cape_config["cape_config"].update({key: [value]})
              elif isinstance(cape_configraw, dict):
                  for (key, value) in cape_configraw.items():
                      cape_config["cape_config"].update({key: [value]})
          except Exception as e:
              log.error("CAPE: parsing error with %s: %s", cape_name, e)
To access the config you can:
- In a signature/reporting module, check self.results["cape_config"]
- With the API: host/configdownload/<task_id>/<cape_name>, where cape_name is the malware family name
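Both access paths can be sketched in a few lines. The helper names `get_cape_config` and `config_download_url` are hypothetical, and the URL layout is simply the `host/configdownload/<task_id>/<cape_name>` pattern quoted above; check your own instance for the exact prefix and authentication.

```python
from urllib.parse import quote


def get_cape_config(results):
    """Return the extracted config from a results dict, or {} when there is none."""
    return results.get("cape_config") or {}


def config_download_url(host, task_id, cape_name):
    """Build host/configdownload/<task_id>/<cape_name> for a given instance."""
    host = host.rstrip("/")
    return f"{host}/configdownload/{int(task_id)}/{quote(cape_name, safe='')}"
```

Inside a signature you would call `get_cape_config(self.results)`; from outside the sandbox you would request the URL built by `config_download_url` with whatever HTTP client you prefer.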
Standalone/Custom extractors
- If you don’t like any of the previous examples, or you want to make your own extractor: I always place mine in CAPEv2/lib/cuckoo/common/decoders/
- Then just import it in your signature and execute it on your matched file.
- I strongly recommend going with signatures, as they allow you to do a lot of different checks to detect the malware family. Once you are sure it is that family, run your extractor; you just import your plugin, let's say: from lib.cuckoo.common.decoders.my_custom_extractor import extractor
- A few utilities:
- You have the yara_detected function, which checks all files (dropped/procdump/procmemory/binary/etc) and returns the path and other details; see abstracts.py for details. So you can run config = extractor(path) and, voilà, you've got your config :P
- Even if you are using Volatility <3, I also recommend running it from signatures and not from memory.py (where you would also have to add it to memory.conf)
- You have Volatility3 <3333333
- I'm a huge fan of Volatility. Vol3 cuts up to 50% of the time out of the box, without tricks, which is just amazing. If someone says vol is slow, that's because they didn’t learn how to tune it for max performance and do some tricks
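To give an idea of what a standalone extractor like the one imported above might look like, here is a minimal sketch of a `my_custom_extractor.py`. Everything about the config format is made up for illustration (a `CFG{key=value;key=value}` blob somewhere in the file); a real family needs its own parsing logic. The only thing taken from the post is the shape of the call: `config = extractor(path)`.

```python
import re

# Made-up marker for illustration: CFG{key=value;key=value}
MARKER = re.compile(rb"CFG\{([^}]*)\}")


def extractor(path):
    """Return a dict with the config found in the file at `path`, or {}."""
    with open(path, "rb") as handle:
        data = handle.read()
    match = MARKER.search(data)
    if not match:
        return {}
    config = {}
    for pair in match.group(1).decode("ascii", errors="ignore").split(";"):
        key, separator, value = pair.partition("=")
        if separator:  # ignore fragments without a "key=value" shape
            config[key.strip()] = value.strip()
    return config
```

In a signature, the `path` argument would be the path of the file your detection logic matched on, so the extractor itself stays free of any sandbox-specific code and can also be run by hand on a dumped payload.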
- So, tricks:
- Vol2: kdbg value ;)
- Inside of the signature:

      from modules.processing.memory import VolatilityAPI

      # later in code, once you are sure that it is your family
      volapi = VolatilityAPI(mem_path, profile, kdbg)
      command = volapi.plugins["<MALWARE_FAMILY>"](volapi.config)
      pids = self.get_pids()
      for rounds in range(1, 3):
          log.info("Executing vol with round: {}".format(rounds))
          for task, config in command.calculate(pids, rounds):
              # only return the first extracted config
              if config:
                  return config
- Note that I’m using the get_pids function. It gets all pids captured by CAPE, and the memdump scan is done in 2 rounds:
- Round 1 scans only the captured pids; this works in 99% of cases and extraction time is extremely short
- Round 2 scans the rest of the pids, excluding the pids from round 1, just in case there is a new injection technique or something else happened
- For me, the first round alone is enough in 99% of cases
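The two-round idea is independent of Volatility itself and can be sketched in plain Python. This is an illustration of the selection logic only: `pids_for_round`, `first_config` and the `scan_pid` callback are hypothetical names, and the real plugin does this inside its `calculate(pids, rounds)` method.

```python
def pids_for_round(all_pids, captured_pids, round_number):
    """Round 1: only pids captured by the sandbox. Round 2: everything else."""
    captured = set(captured_pids)
    if round_number == 1:
        return [pid for pid in all_pids if pid in captured]
    return [pid for pid in all_pids if pid not in captured]


def first_config(all_pids, captured_pids, scan_pid):
    """Scan round by round and stop at the first config found."""
    for round_number in (1, 2):
        for pid in pids_for_round(all_pids, captured_pids, round_number):
            config = scan_pid(pid)  # scan_pid is a stand-in for the real memory scan
            if config:
                return round_number, config
    return None, None
```

Because round 1 usually touches only a handful of processes, the expensive full sweep in round 2 is paid only in the rare case where the captured pids yield nothing.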
- If you want to learn to write Volatility plugins, here are a few examples:
- mine pony vol3 plugin example
- vol2 andromeda
- vol2 zbotscan
I hope you learned something useful. Enjoy, and remember: be friendly, we are doing this in our free time for fun