Consume Mordor Datasets

You can simply download the JSON files available in this repo and start using some grep-fu if you feel like it! However, there are other, more efficient ways to consume the pre-recorded data, and you can even simulate a real data pipeline to ingest it into your own SIEM or data lake.

Kafkacat Style

You can start using a tool named Kafkacat to act as a Kafka producer and send data to Kafka brokers. In producer mode, Kafkacat reads messages from standard input (stdin) or a file. This means that you can send data back to any other Kafka broker that you are using as part of your pipeline. You can just grab the logs from this repo and replay them as if they were being ingested in real time.


  • Kafka Broker : A distributed publish-subscribe messaging system that is designed to be fast, scalable, fault-tolerant, and durable (Installed by HELK).

  • Kafkacat : A generic non-JVM producer and consumer for Apache Kafka >=0.8, think of it as a netcat for Kafka.

  • HELK (Basic Option): An elastic ELK (Elasticsearch, Logstash, Kibana) stack.

  • Docker CE : Docker Community Edition (CE) is ideal for developers and small teams looking to get started with Docker and experimenting with container-based apps (Installed by HELK).

  • Docker Compose : a tool for defining and running multi-container Docker applications (Installed by HELK).
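If you would rather script the replay than use Kafkacat, the same idea can be sketched in Python. This is a minimal sketch, assuming the third-party kafka-python client; the broker address, topic name, and helper functions below are illustrative, not part of HELK:

```python
# Sketch: replay a mordor JSON file to a Kafka broker one event per line,
# mimicking `kafkacat -P -l`. Assumes `pip install kafka-python`.
import json


def read_events(path):
    """Yield one raw JSON event per line, skipping blank lines."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                json.loads(line)  # fail fast on malformed events
                yield line


def replay(path, broker="helk-ip:9092", topic="winlogbeat"):
    """Send every event in the file to the given broker/topic."""
    from kafka import KafkaProducer  # assumed dependency: kafka-python
    producer = KafkaProducer(bootstrap_servers=broker)
    for event in read_events(path):
        producer.send(topic, event.encode("utf-8"))
    producer.flush()
```

Like the Kafkacat approach, this preserves one event per message, which is what the HELK Logstash pipeline expects from the winlogbeat topic.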

Consume Logs

Install Kafkacat following the instructions from the official Kafkacat repo

  • If you are using a Debian-based system, make sure you install the latest Kafkacat deb package.

  • I recommend at least Ubuntu 18.04. You can check its Kafkacat deb package version and compare it with the latest one in the Kafkacat GitHub repo.

  • If you are using Ubuntu 19, you might need to run the following commands (Thank you Jason Yee)

    • wget

    • sudo dpkg -i libssl1.0.0_1.0.2n-1ubuntu6_amd64.deb

  • You can also install it from source following the Quick Build instructions.

Download and run the HELK. Make sure you have enough memory to run the basic build. You can run it with 5-6GB of RAM now (More information here).

$ git clone
$ cd HELK/docker
$ sudo ./helk_install

Use the defaults (Option 1 and Basic license)

**          HELK - THE HUNTING ELK          **
**                                          **
** Author: Roberto Rodriguez (@Cyb3rWard0g) **
** HELK build version: v0.1.7-alpha02262019 **
** HELK ELK version: 6.6.1                  **
** License: GPL-3.0                         **

[HELK-INSTALLATION-INFO] HELK being hosted on a Linux box
[HELK-INSTALLATION-INFO] Available Memory: 12541 MBs
[HELK-INSTALLATION-INFO] You're using ubuntu version xenial

*      HELK - Docker Compose Build Choices          *


Enter build choice [ 1 - 2]: 1
[HELK-INSTALLATION-INFO] Set HELK elastic subscription (basic or trial): basic
[HELK-INSTALLATION-INFO] Set HELK IP. Default value is your current IP:
[HELK-INSTALLATION-INFO] Set HELK Kibana UI Password: hunting
[HELK-INSTALLATION-INFO] Verify HELK Kibana UI Password: hunting
[HELK-INSTALLATION-INFO] Installing htpasswd..
[HELK-INSTALLATION-INFO] Installing docker via convenience script..
[HELK-INSTALLATION-INFO] Installing docker-compose..
[HELK-INSTALLATION-INFO] Checking local vm.max_map_count variable and setting it to 4120294
[HELK-INSTALLATION-INFO] Building & running HELK from helk-kibana-analysis-basic.yml file..

Download the mordor repo and choose your technique:

$ cd ../../
$ git clone
$ cd mordor/small_datasets/windows/credential_access/credential_dumping_T1003/credentials_from_ad/

Decompress the specific mordor log file

$ tar -xzvf empire_dcsync.tar.gz
x empire_dcsync_2019-03-01174830.json

Send the data to HELK via Kafkacat with the following flags:

  • -b: Kafka Broker

  • -t: Topic in the Kafka Broker to send the data to

  • -P: Producer mode

  • -l: Send messages from a file separated by delimiter, as with stdin. (only one file allowed)

$ kafkacat -b <HELK IP>:9092 -t winlogbeat -P -l empire_dcsync_2019-03-01174830.json

If you are using a fresh HELK install, you should now have hundreds of events across a few indices. Without accessing the Kibana interface, you can simply access the helk-elasticsearch Docker container and use its APIs to see the data available in it:

sudo docker exec -ti helk-elasticsearch bash
[root@6f8ff404383a elasticsearch] curl http://localhost:9200/_cat/indices?v

Output Example:

health status index                                         uuid                   pri rep docs.count docs.deleted store.size
green  open   .monitoring-logstash-7-2020.09.04             0wtkIPo-RxCZ1qRzxQTBoA   1   0       1085            0     34.9mb         34.9mb
green  open   .monitoring-kibana-7-2020.09.04               O73EUP4iREu1ZhfKI9iFrQ   1   0          5            0     90.8kb         90.8kb
green  open   logs-endpoint-winevent-additional-2019.03.01  GKlzhBx5RAii6hlHe7aTmQ   1   0         21            6    110.8kb        110.8kb
green  open   .apm-agent-configuration                      HfOnH3kOQgacIKGUNJWq9w   1   0          0            0       283b           283b
yellow open   mitre-attack-2020.09.04                       IIc2OSzqQc6NVZnOfQx-dQ   1   1       8624            0     25.7mb         25.7mb
green  open   .kibana_1                                     TiRgVYzYTByIDTPmh8afNA   1   0        276            2    274.9kb        274.9kb
green  open   logs-endpoint-winevent-sysmon-2019.03.01      qpoKrgjCQHKaNGSyPzGAIQ   1   0        323           70    596.3kb        596.3kb
green  open   logs-endpoint-winevent-sysmon-1990.12.18      QywJ1MqmSxKT4jRK0BiynQ   1   0          1            1     55.1kb         55.1kb
green  open   .monitoring-es-7-2020.09.04                   plh8oVR1QomWHr0QxPTxag   1   0         17            1      1.1mb          1.1mb
green  open   .kibana_task_manager_1                        s5b-0WjHR4OgXNtITmqDJg   1   0          2            0       34kb           34kb
green  open   logs-endpoint-winevent-security-2019.03.01    wK9G1vP5Qiy9HLLn6Q2tIw   1   0        178          265      504kb          504kb
green  open   logs-endpoint-winevent-powershell-2019.03.01  CziOhDQnT3-KhWm1FjGbwg   1   0       4583            0     39.8mb         39.8mb
green  open   logs-endpoint-winevent-system-2019.03.01      kgVBuZiETjCQOZmyOhIlBA   1   0         62            4    201.1kb        201.1kb
green  open   logs-endpoint-winevent-wmiactivity-2019.03.01 Z_umc3lEQHyBfNWIRAcWFw   1   0         14            0     49.5kb         49.5kb

If your HELK instance is only accessible via SSH, you can create port forwarding with the following command to access the Kibana interface:

ssh -N -L 8080: -L 4043: <USER>@<IP-OF-HELK>

Browse to your Kibana Discover view and start going through the data

You could look for potential DCSync activity from a non-Domain-Controller account with the following query in Kibana:

event_id:4662 AND object_properties:("*1131f6aa-9c07-11d1-f79f-00c04fc2dcd2*" OR "*1131f6ad-9c07-11d1-f79f-00c04fc2dcd2*" OR "*89e95b76-444d-4c62-991a-0facbeda640c*")
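The same logic can be applied offline to the decompressed JSON file if you want to check events without Kibana. A minimal sketch; the field names follow the winlogbeat/mordor shape used in the query above, so adjust them to your own mapping:

```python
# Sketch: flag 4662 events whose object properties reference the
# directory-replication control access rights abused by DCSync.
import json

DCSYNC_GUIDS = (
    "1131f6aa-9c07-11d1-f79f-00c04fc2dcd2",  # DS-Replication-Get-Changes
    "1131f6ad-9c07-11d1-f79f-00c04fc2dcd2",  # DS-Replication-Get-Changes-All
    "89e95b76-444d-4c62-991a-0facbeda640c",  # DS-Replication-Get-Changes-In-Filtered-Set
)


def is_dcsync(event):
    """True if the event matches the Kibana query above."""
    props = str(event.get("object_properties", "")).lower()
    return event.get("event_id") == 4662 and any(g in props for g in DCSYNC_GUIDS)


def find_dcsync(path):
    """Return all matching events from a JSON-lines mordor file."""
    hits = []
    with open(path) as f:
        for line in f:
            if line.strip():
                event = json.loads(line)
                if is_dcsync(event):
                    hits.append(event)
    return hits
```

From there you would still want to confirm the account performing the replication is not a Domain Controller, as the Kibana workflow suggests.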

Jupyter Notebook Style

You can consume mordor data directly with a Jupyter notebook and analyze it via python libraries such as Pandas or PySpark.
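A minimal sketch of loading a mordor dataset with Pandas inside a notebook (the commented file name matches the dataset decompressed earlier; `load_mordor` is an illustrative helper):

```python
# Sketch: load a mordor JSON-lines export into a DataFrame, one event per row.
import pandas as pd


def load_mordor(source):
    """Read a JSON-lines file (or file-like object) into a DataFrame."""
    return pd.read_json(source, lines=True)


# With the dataset decompressed earlier:
# df = load_mordor("empire_dcsync_2019-03-01174830.json")
# df["event_id"].value_counts().head(10)  # quick triage of event types
```

Once the events are in a DataFrame, the Kibana-style filters shown earlier translate directly into boolean indexing or `DataFrame.query` expressions.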

Suricata Style

  • Install Suricata (OSX)

  • Download open Emerging Threat rules

tar zxvf emerging.rules.tar.gz
sudo mkdir /var/lib/suricata/
sudo mv rules /var/lib/suricata/
  • Update the Suricata config (/etc/suricata/suricata.yaml) to point to that folder

default-rule-path: /var/lib/suricata/rules

rule-files:
  - emerging*
  • Clone Project and change directories

git clone && cd mordor/datasets/large
  • Decompress every PCAP in the same folder (Password Protected: infected)

find apt29/day*/pcaps -name '*.zip' -execdir unzip -P infected {} \;
  • Run Suricata on every single PCAP and append results from every PCAP to fast.log and eve.json files in their respective directories.

find apt29/day*/pcaps -name '*cap' -execdir suricata -r {} -k none \;
  • Stack count the alerts generated

jq 'select((.event_type == "alert") and .alert.category != "Generic Protocol Command Decode") | .alert.signature' apt29/day1/pcaps/eve.json | sort | uniq -c
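The jq one-liner above can also be expressed in Python, which is handy if you are already working in a notebook. A minimal sketch, assuming the same eve.json layout (one JSON object per line); `alert_counts` is an illustrative helper:

```python
# Sketch: stack count Suricata alert signatures from an eve.json file,
# excluding the noisy "Generic Protocol Command Decode" category,
# mirroring the jq | sort | uniq -c pipeline above.
import json
from collections import Counter


def alert_counts(path):
    counts = Counter()
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            event = json.loads(line)
            if (event.get("event_type") == "alert"
                    and event.get("alert", {}).get("category")
                    != "Generic Protocol Command Decode"):
                counts[event["alert"]["signature"]] += 1
    return counts


# alert_counts("apt29/day1/pcaps/eve.json").most_common(20)
```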