Malcolm

Malcolm is a powerful network traffic analysis tool suite designed with the following goals in mind:

Easy to use – Malcolm accepts network traffic data in the form of full packet capture (PCAP) files and Zeek (formerly Bro) logs. These artifacts can be uploaded via a simple browser-based interface or captured live and forwarded to Malcolm using lightweight forwarders. In either case, the data is automatically normalized, enriched, and correlated for analysis.
Powerful traffic analysis – Visibility into network communications is provided through two intuitive interfaces: Kibana, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime (formerly Moloch), a powerful tool for finding and identifying the network sessions comprising suspected security incidents.
Streamlined deployment – Malcolm operates as a cluster of Docker containers, isolated sandboxes which each serve a dedicated function of the system. This Docker-based deployment model, combined with a few simple scripts for setup and run-time management, makes Malcolm suitable to be deployed quickly across a variety of platforms and use cases, whether it be for long-term deployment on a Linux server in a security operations center (SOC) or for incident response on a MacBook for an individual engagement.
Secure communications – All communications with Malcolm, both from the user interface and from remote log forwarders, are secured with industry standard encryption protocols.
Permissive license – Malcolm is composed of several widely used open source tools, making it an attractive alternative to security solutions requiring paid licenses.
Expanding control systems visibility – While Malcolm is great for general-purpose network traffic analysis, its creators see a particular need in the community for tools providing insight into protocols used in industrial control systems (ICS) environments. Ongoing Malcolm development will aim to provide additional parsers for common ICS protocols.

Although all of the open source tools which make up Malcolm are already available and in general use, Malcolm provides a framework of interconnectivity which makes it greater than the sum of its parts. And while there are many other network traffic analysis solutions out there, ranging from complete Linux distributions like Security Onion to licensed products like Splunk Enterprise Security, the creators of Malcolm feel its easy deployment and robust combination of tools fill a void in the network security space that will make network traffic analysis accessible to many in both the public and private sectors as well as individual enthusiasts.

In short, Malcolm provides an easily deployable network analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs. While Internet access is required to build it, it is not required at runtime.

Quick start
- Getting Malcolm
- User interface
Overview
Components
Supported Protocols
Development
- Building from source
Pre-Packaged installation files
Preparing your system
- Recommended system requirements
- System configuration and tuning
Running Malcolm
- Configure authentication
  - Local account management
  - Lightweight Directory Access Protocol (LDAP) authentication
    - LDAP connection security
- Starting Malcolm
- Stopping and restarting Malcolm
- Clearing Malcolm's data
Capture file and log archive upload
- Tagging
- Processing uploaded PCAPs with Zeek
Live analysis
- Capturing traffic on local network interfaces
- Using a network sensor appliance
- Manually forwarding Zeek logs from an external source
Arkime
- Zeek log integration
  - Correlating Zeek logs and Arkime sessions
- Help
- Sessions
  - PCAP Export
- SPIView
- SPIGraph
- Connections
- Hunt
- Statistics
- Settings
Kibana
- Discover
  - Screenshots
- Visualizations and dashboards
  - Prebuilt visualizations and dashboards
    - Screenshots
  - Building your own visualizations and dashboards
    - Screenshots
Search Queries in Arkime and Kibana
Other Malcolm features
- Automatic file extraction and scanning
- Automatic host and subnet name assignment
- Elasticsearch index management
- Alerting
- "Best Guess" Fingerprinting for ICS Protocols
Using Beats to forward host logs to Malcolm
Malcolm installer ISO
- Installation
- Generating the ISO
- Setup
- Time synchronization
- Hardening
  - STIG compliance exceptions
  - CIS benchmark compliance exceptions
Known issues
Installation example using Ubuntu 20.04 LTS
Upgrading Malcolm
Copyright
Contact

Quick start

Getting Malcolm

For a TL;DR example of downloading, configuring, and running Malcolm on a Linux platform, see Installation example using Ubuntu 20.04 LTS.

The scripts to control Malcolm require Python 3.

Source code

The files required to build and run Malcolm are available on its GitHub page. Malcolm's source code is released under the terms of a permissive open source software license (see License.txt for the terms of its release).

Building Malcolm from scratch

The build.sh script can build Malcolm's Docker images from scratch. See Building from source for more information.

Initial configuration

You must run auth_setup prior to pulling Malcolm's Docker images. You should also ensure your system configuration and docker-compose.yml settings are tuned by running ./scripts/install.py or ./scripts/install.py --configure (see System configuration and tuning).

Pull Malcolm's Docker images

Malcolm's Docker images are periodically built and hosted on Docker Hub. If you already have Docker and Docker Compose, these prebuilt images can be pulled by navigating into the Malcolm directory (containing the docker-compose.yml file) and running docker-compose pull like this:

$ docker-compose pull
Pulling arkime        ... done
Pulling elasticsearch ... done
Pulling file-monitor  ... done
Pulling filebeat      ... done
Pulling freq          ... done
Pulling htadmin       ... done
Pulling kibana        ... done
Pulling logstash      ... done
Pulling name-map-ui   ... done
Pulling nginx-proxy   ... done
Pulling pcap-capture  ... done
Pulling pcap-monitor  ... done
Pulling upload        ... done
Pulling zeek          ... done

You can then observe that the images have been retrieved by running docker images:

$ docker images
REPOSITORY                                          TAG                 IMAGE ID            CREATED             SIZE
malcolmnetsec/arkime                                3.2.1               xxxxxxxxxxxx        39 hours ago        683MB
malcolmnetsec/elasticsearch-od                      3.2.1               xxxxxxxxxxxx        40 hours ago        690MB
malcolmnetsec/file-monitor                          3.2.1               xxxxxxxxxxxx        39 hours ago        470MB
malcolmnetsec/file-upload                           3.2.1               xxxxxxxxxxxx        39 hours ago        199MB
malcolmnetsec/filebeat-oss                          3.2.1               xxxxxxxxxxxx        39 hours ago        555MB
malcolmnetsec/freq                                  3.2.1               xxxxxxxxxxxx        39 hours ago        390MB
malcolmnetsec/htadmin                               3.2.1               xxxxxxxxxxxx        39 hours ago        180MB
malcolmnetsec/kibana-helper                         3.2.1               xxxxxxxxxxxx        40 hours ago        141MB
malcolmnetsec/kibana-od                             3.2.1               xxxxxxxxxxxx        40 hours ago        1.16GB
malcolmnetsec/logstash-oss                          3.2.1               xxxxxxxxxxxx        39 hours ago        1.41GB
malcolmnetsec/name-map-ui                           3.2.1               xxxxxxxxxxxx        39 hours ago        137MB
malcolmnetsec/nginx-proxy                           3.2.1               xxxxxxxxxxxx        39 hours ago        120MB
malcolmnetsec/pcap-capture                          3.2.1               xxxxxxxxxxxx        39 hours ago        111MB
malcolmnetsec/pcap-monitor                          3.2.1               xxxxxxxxxxxx        39 hours ago        157MB
malcolmnetsec/zeek                                  3.2.1               xxxxxxxxxxxx        39 hours ago        887MB
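
As a rough sketch (not part of Malcolm's own scripts), the check above can be automated by parsing the text printed by docker images and reporting which expected repositories are absent. The function name missing_images and the shortened sample output are illustrative assumptions.

```python
def missing_images(docker_images_output, expected_repos):
    """Return the expected repositories that do not appear in `docker images` output."""
    present = set()
    for line in docker_images_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, which starts with REPOSITORY.
        if not fields or fields[0] == "REPOSITORY":
            continue
        # The first whitespace-separated column is the repository name.
        present.add(fields[0])
    return sorted(set(expected_repos) - present)


# Shortened sample of the listing shown above (illustrative only).
sample_output = """\
REPOSITORY                      TAG    IMAGE ID      CREATED       SIZE
malcolmnetsec/arkime            3.2.1  xxxxxxxxxxxx  39 hours ago  683MB
malcolmnetsec/zeek              3.2.1  xxxxxxxxxxxx  39 hours ago  887MB
"""

expected = [
    "malcolmnetsec/arkime",
    "malcolmnetsec/logstash-oss",
    "malcolmnetsec/zeek",
]
print(missing_images(sample_output, expected))
```

On a real system the first argument could come from subprocess.run(["docker", "images"], capture_output=True, text=True).stdout.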

Import from pre-packaged tarballs

Once built, the malcolm_appliance_packager.sh script can be used to create pre-packaged Malcolm tarballs for import on another machine. See Pre-Packaged Installation Files for more information.

Starting and stopping Malcolm

Use the scripts in the scripts/ directory to start and stop Malcolm, view debug logs of a currently running instance, wipe the database and restore Malcolm to a fresh state, etc.

User interface

A few minutes after starting Malcolm (probably 5 to 10 minutes for Logstash to be completely up, depending on the system), the following services will be accessible:

Arkime: https://localhost:443
Kibana: https://localhost/kibana/ or https://localhost:5601
Capture File and Log Archive Upload (Web): https://localhost/upload/
Capture File and Log Archive Upload (SFTP): sftp://<username>@127.0.0.1:8022/files
Host and Subnet Name Mapping Editor: https://localhost/name-map-ui/
Account Management: https://localhost:488
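
Since the services take a few minutes to come up, a small readiness probe can be handy. The following is a minimal sketch, not something shipped with Malcolm: malcolm_urls and is_answering are hypothetical helper names, the SFTP endpoint is left out because it is not HTTPS, and certificate verification is disabled on the assumption that the instance uses self-signed certificates.

```python
import ssl
import urllib.error
import urllib.request


def malcolm_urls(host="localhost"):
    """Build the HTTPS service URLs listed above for a given host."""
    return {
        "arkime": f"https://{host}:443",
        "kibana": f"https://{host}/kibana/",
        "upload": f"https://{host}/upload/",
        "name-map-ui": f"https://{host}/name-map-ui/",
        "account-management": f"https://{host}:488",
    }


def is_answering(url, timeout=5.0):
    """Return True if anything answers HTTPS at url, even with an error status."""
    # Assumption: self-signed certificates, so verification is turned off here.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        urllib.request.urlopen(url, timeout=timeout, context=context)
        return True
    except urllib.error.HTTPError:
        # An HTTP error status (for example 401 before logging in) still
        # means the service is listening.
        return True
    except (urllib.error.URLError, OSError):
        return False


for name, url in malcolm_urls().items():
    print(f"{name}: {url}")
```

Calling is_answering(url) in a loop with a short sleep between attempts would give a simple "wait until ready" check.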

Overview

Malcolm processes network traffic data in the form of packet capture (PCAP) files or Zeek logs. A sensor (packet capture appliance) monitors network traffic mirrored to it over a SPAN port on a network switch or router, or using a network TAP device. Zeek logs and Arkime sessions are generated containing important session metadata from the traffic observed, which are then securely forwarded to a Malcolm instance. Full PCAP files are optionally stored locally on the sensor device for examination later.

Malcolm parses the network session data and enriches it with additional lookups and mappings including GeoIP mapping, hardware manufacturer lookups from organizationally unique identifiers (OUI) in MAC addresses, assigning names to network segments and hosts based on user-defined IP address and MAC mappings, performing TLS fingerprinting, and many others.
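
To illustrate one of these enrichments, the sketch below shows how a manufacturer lookup from a MAC address's OUI could work in principle: the first three octets of the address are the OUI, which is then used as a key into a vendor table. This is not Malcolm's actual Logstash logic; the helper names and the one-entry vendor table are hypothetical.

```python
import re

# Hypothetical sample table; a real lookup would use the full IEEE OUI registry.
SAMPLE_OUI_VENDORS = {
    "AA:BB:CC": "Example Vendor (hypothetical)",
}


def oui_prefix(mac):
    """Return the OUI (first three octets) of a MAC address as 'XX:XX:XX'."""
    # Drop separators such as ':' or '-' and normalize to upper case.
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def lookup_vendor(mac, table=SAMPLE_OUI_VENDORS):
    """Map a MAC address to a vendor name, or 'unknown' if the OUI is not listed."""
    return table.get(oui_prefix(mac), "unknown")


print(oui_prefix("00-1a-2b-3c-4d-5e"))
print(lookup_vendor("aa:bb:cc:00:11:22"))
```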

The enriched data is stored in an Elasticsearch document store in a format suitable for analysis through two intuitive interfaces: Kibana, a flexible data visualization plugin with dozens of prebuilt dashboards providing an at-a-glance overview of network protocols; and Arkime, a powerful tool for finding and identifying the network sessions comprising suspected security incidents. These tools can be accessed through a web browser from analyst workstations or for display in a security operations center (SOC). Logs can also optionally be forwarded on to another instance of Malcolm.

For smaller networks, use at home by network security enthusiasts, or in the field for incident response engagements, Malcolm can also easily be deployed locally on an ordinary consumer workstation or laptop. Malcolm can process local artifacts such as locally-generated Zeek logs, locally-captured PCAP files, and PCAP files collected offline without the use of a dedicated sensor appliance.

Components

Malcolm leverages the following excellent open source tools, among others.

Arkime (formerly Moloch) - for PCAP file processing, browsing, searching, analysis, and carving/exporting; Arkime itself consists of two parts:
- moloch-capture - a tool for traffic capture, as well as offline PCAP parsing and metadata insertion into Elasticsearch
- viewer - a browser-based interface for data visualization
Elasticsearch (Open Distro variant) - a search and analytics engine for indexing and querying network traffic session metadata
Logstash and Filebeat - for ingesting and parsing Zeek log files and storing them in Elasticsearch in a format that Arkime understands in the same way it natively understands PCAP data
Kibana (Open Distro variant) - for creating additional ad-hoc visualizations and dashboards beyond that which is provided by Arkime viewer
Zeek - a network analysis framework and IDS
Yara - a tool used to identify and classify malware samples
Capa - a tool for detecting capabilities in executable files
ClamAV - an antivirus engine for scanning files extracted by Zeek
CyberChef - a "Swiss Army knife" data conversion tool
jQuery File Upload - for uploading PCAP files and Zeek logs for processing
List.js - for the host and subnet name mapping interface
Docker and Docker Compose - for simple, reproducible deployment of the Malcolm appliance across environments and to coordinate communication between its various components
Nginx - for HTTPS and reverse proxying Malcolm components
nginx-auth-ldap - an LDAP authentication module for nginx
Mark Baggett's freq - a tool for calculating entropy of strings
Florian Roth's Signature-Base Yara ruleset
These Zeek plugins:
- some of Amazon.com, Inc.'s ICS protocol analyzers
- Andrew Klaus's Sniffpass plugin for detecting cleartext passwords in HTTP POST requests
- Andrew Klaus's zeek-httpattacks plugin for detecting noncompliant HTTP requests
- ICS protocol analyzers for Zeek published by DHS CISA and Idaho National Lab
- Corelight's bro-xor-exe plugin
- Corelight's "bad neighbor" (CVE-2020-16898) plugin
- Corelight's HTTP protocol stack vulnerability (CVE-2021-31166) plugin
- Corelight's callstranger-detector plugin
- Corelight's community ID flow hashing plugin
- Corelight's pingback plugin
- Corelight's ripple20 plugin
- Corelight's SIGred plugin
- Corelight's Zerologon plugin
- J-Gras' Zeek::AF_Packet plugin
- Johanna Amann's CVE-2020-0601 ECC certificate validation plugin and CVE-2020-13777 GnuTLS unencrypted session ticket detection plugin
- Lexi Brent's EternalSafety plugin
- MITRE Cyber Analytics Repository's Bro/Zeek ATT&CK-Based Analytics (BZAR) script
- Salesforce's gQUIC analyzer
- Salesforce's HASSH SSH fingerprinting plugin
- Salesforce's JA3 TLS fingerprinting plugin
- Zeek's Spicy plugin framework
GeoLite2 - Malcolm includes GeoLite2 data created by MaxMind

Supported Protocols

Malcolm uses Zeek and Arkime to analyze network traffic. These tools provide varying degrees of visibility into traffic transmitted over the following network protocols:

Traffic	Wiki	Organization/Specification	Arkime	Zeek
Internet layer	🔗	🔗	✓	✓
Border Gateway Protocol (BGP)	🔗	🔗	✓
Building Automation and Control (BACnet)	🔗	🔗		✓
Bristol Standard Asynchronous Protocol (BSAP)	🔗	🔗 🔗		✓
Distributed Computing Environment / Remote Procedure Calls (DCE/RPC)	🔗	🔗		✓
Dynamic Host Configuration Protocol (DHCP)	🔗	🔗	✓	✓
Distributed Network Protocol 3 (DNP3)	🔗	🔗		✓✓
Domain Name System (DNS)	🔗	🔗	✓	✓
EtherCAT	🔗	🔗		✓
EtherNet/IP / Common Industrial Protocol (CIP)	🔗 🔗	🔗		✓
FTP (File Transfer Protocol)	🔗	🔗		✓
Google Quick UDP Internet Connections (gQUIC)	🔗	🔗	✓	✓
Hypertext Transfer Protocol (HTTP)	🔗	🔗	✓	✓
IPsec	🔗	🔗		✓
Internet Relay Chat (IRC)	🔗	🔗	✓	✓
Lightweight Directory Access Protocol (LDAP)	🔗	🔗	✓	✓
Kerberos	🔗	🔗	✓	✓
Modbus	🔗	🔗		✓✓
MQ Telemetry Transport (MQTT)	🔗	🔗		✓
MySQL	🔗	🔗	✓	✓
NT LAN Manager (NTLM)	🔗	🔗		✓
Network Time Protocol (NTP)	🔗	🔗		✓
Oracle	🔗	🔗	✓
OpenVPN	🔗	🔗 🔗		✓
PostgreSQL	🔗	🔗	✓
Process Field Net (PROFINET)	🔗	🔗		✓
Remote Authentication Dial-In User Service (RADIUS)	🔗	🔗	✓	✓
Remote Desktop Protocol (RDP)	🔗	🔗		✓
Remote Framebuffer (RFB)	🔗	🔗		✓
S7comm / Connection Oriented Transport Protocol (COTP)	🔗 🔗	🔗 🔗		✓
Session Initiation Protocol (SIP)	🔗	🔗		✓
Server Message Block (SMB) / Common Internet File System (CIFS)	🔗	🔗	✓	✓
Simple Mail Transfer Protocol (SMTP)	🔗	🔗	✓	✓
Simple Network Management Protocol (SNMP)	🔗	🔗	✓	✓
SOCKS	🔗	🔗	✓	✓
Secure Shell (SSH)	🔗	🔗	✓	✓
Secure Sockets Layer (SSL) / Transport Layer Security (TLS)	🔗	🔗	✓	✓
Syslog	🔗	🔗	✓	✓
Tabular Data Stream	🔗	🔗 🔗	✓	✓
Telnet / remote shell (rsh) / remote login (rlogin)	🔗 🔗	🔗 🔗	✓	✓❋
TFTP (Trivial File Transfer Protocol)	🔗	🔗		✓
WireGuard	🔗	🔗 🔗		✓
various tunnel protocols (e.g., GTP, GRE, Teredo, AYIYA, IP-in-IP, etc.)	🔗		✓	✓

Additionally, Zeek is able to detect and, where possible, log the type, vendor and version of various other software protocols.

As part of its network traffic analysis, Zeek can extract and analyze files transferred across the protocols it understands. In addition to generating logs for transferred files, deeper analysis is done into the following file types:

Portable executable files
X.509 certificates

See automatic file extraction and scanning for additional features related to file scanning.

See Zeek log integration for more information on how Malcolm integrates Arkime sessions and Zeek logs for analysis.

Development

Checking out the Malcolm source code results in the following subdirectories in your malcolm/ working copy:

Dockerfiles - a directory containing build instructions for Malcolm's docker images
docs - a directory containing instructions and documentation
elasticsearch - an initially empty directory where the Elasticsearch database instance will reside
elasticsearch-backup - an initially empty directory for storing Elasticsearch index snapshots
filebeat - code and configuration for the filebeat container which ingests Zeek logs and forwards them to the logstash container
file-monitor - code and configuration for the file-monitor container which can scan files extracted by Zeek
file-upload - code and configuration for the upload container which serves a web browser-based upload form for uploading PCAP files and Zeek logs, and which serves an SFTP share as an alternate method for upload
freq-server - code and configuration for the freq container used for calculating entropy of strings
htadmin - configuration for the htadmin user account management container
kibana - code and configuration for the kibana container for creating additional ad-hoc visualizations and dashboards beyond that which is provided by Arkime Viewer
logstash - code and configuration for the logstash container which parses Zeek logs and forwards them to the elasticsearch container
malcolm-iso - code and configuration for building an installer ISO for a minimal Debian-based Linux installation for running Malcolm
moloch - code and configuration for the arkime container which processes PCAP files using moloch-capture and which serves the Viewer application
moloch-logs - an initially empty directory to which the arkime container will write some debug log files
moloch-raw - an initially empty directory to which the arkime container will write captured PCAP files; because Malcolm currently uses Arkime only to process previously captured PCAP files, this directory is currently unused
name-map-ui - code and configuration for the name-map-ui container which provides the host and subnet name mapping interface
nginx - configuration for the nginx reverse proxy container
pcap - an initially empty directory for PCAP files to be uploaded, processed, and stored
pcap-capture - code and configuration for the pcap-capture container which can capture network traffic
pcap-monitor - code and configuration for the pcap-monitor container which watches for new or uploaded PCAP files and notifies the other services to process them
scripts - control scripts for starting, stopping, restarting, etc. Malcolm
sensor-iso - code and configuration for building a Hedgehog Linux ISO
shared - miscellaneous code used by various Malcolm components
zeek - code and configuration for the zeek container which handles PCAP processing using Zeek
zeek-logs - an initially empty directory for Zeek logs to be uploaded, processed, and stored

and the following files of special note:

auth.env - the script ./scripts/auth_setup prompts the user for the administrator credentials used by the Malcolm appliance, and auth.env is the environment file where those values are stored
cidr-map.txt - specify custom IP address to network segment mapping
host-map.txt - specify custom IP and/or MAC address to host mapping
net-map.json - an alternative to cidr-map.txt and host-map.txt, mapping hosts and network segments to their names in a JSON-formatted file
docker-compose.yml - the configuration file used by docker-compose to build, start, and stop an instance of the Malcolm appliance
docker-compose-standalone.yml - similar to docker-compose.yml, only used for the "packaged" installation of Malcolm

Building from source

Building the Malcolm docker images from scratch requires internet access to pull source files for its components. Once internet access is available, execute the following command to build all of the Docker images used by the Malcolm appliance:

$ ./scripts/build.sh

Then, go take a walk or something since it will be a while. When you're done, you can run docker images and see you have fresh images for:

malcolmnetsec/arkime (based on debian:buster-slim)
malcolmnetsec/elasticsearch-od (based on amazon/opendistro-for-elasticsearch)
malcolmnetsec/filebeat-oss (based on docker.elastic.co/beats/filebeat-oss)
malcolmnetsec/file-monitor (based on debian:buster-slim)
malcolmnetsec/file-upload (based on debian:buster-slim)
malcolmnetsec/freq (based on debian:buster-slim)
malcolmnetsec/htadmin (based on debian:buster-slim)
malcolmnetsec/kibana-od (based on amazon/opendistro-for-elasticsearch-kibana)
malcolmnetsec/kibana-helper (based on alpine:3.12)
malcolmnetsec/logstash-oss (based on docker.elastic.co/logstash/logstash-oss)
malcolmnetsec/name-map-ui (based on alpine:3.12)
malcolmnetsec/nginx-proxy (based on alpine:3.12)
malcolmnetsec/pcap-capture (based on debian:buster-slim)
malcolmnetsec/pcap-monitor (based on debian:buster-slim)
malcolmnetsec/zeek (based on debian:buster-slim)

Pre-Packaged installation files

Creating pre-packaged installation files

scripts/malcolm_appliance_packager.sh can be run to package up the configuration files (and, if necessary, the Docker images) which can be copied to a network share or USB drive for distribution to non-networked machines. For example:

$ ./scripts/malcolm_appliance_packager.sh 
You must set a username and password for Malcolm, and self-signed X.509 certificates will be generated

Store administrator username/password for local Malcolm access? (Y/n): 

Administrator username: analyst
analyst password: 
analyst password (again): 

(Re)generate self-signed certificates for HTTPS access (Y/n): 

(Re)generate self-signed certificates for a remote log forwarder (Y/n): 

Store username/password for forwarding Logstash events to a secondary, external Elasticsearch instance (y/N): 

Store username/password for email alert sender account (y/N): 

Packaged Malcolm to "/home/user/tmp/malcolm_20190513_101117_f0d052c.tar.gz"

Do you need to package docker images also [y/N]? y
This might take a few minutes...

Packaged Malcolm docker images to "/home/user/tmp/malcolm_20190513_101117_f0d052c_images.tar.gz"


To install Malcolm:
  1. Run install.py
  2. Follow the prompts

To start, stop, restart, etc. Malcolm:
  Use the control scripts in the "scripts/" directory:
   - start         (start Malcolm)
   - stop          (stop Malcolm)
   - restart       (restart Malcolm)
   - logs          (monitor Malcolm logs)
   - wipe          (stop Malcolm and clear its database)
   - auth_setup    (change authentication-related settings)

A minute or so after starting Malcolm, the following services will be accessible:
  - Arkime: https://localhost/
  - Kibana: https://localhost/kibana/
  - PCAP upload (web): https://localhost/upload/
  - PCAP upload (sftp): sftp://<username>@127.0.0.1:8022/files/
  - Host and subnet name mapping editor: https://localhost/name-map-ui/
  - Account management: https://localhost:488/

The above example will result in the following artifacts for distribution as explained in the script's output:

$ ls -lh
total 2.0G
-rwxr-xr-x 1 user user  61k May 13 11:32 install.py
-rw-r--r-- 1 user user 2.0G May 13 11:37 malcolm_20190513_101117_f0d052c_images.tar.gz
-rw-r--r-- 1 user user  683 May 13 11:37 malcolm_20190513_101117_f0d052c.README.txt
-rw-r--r-- 1 user user 183k May 13 11:32 malcolm_20190513_101117_f0d052c.tar.gz

Installing from pre-packaged installation files

If you have obtained pre-packaged installation files to install Malcolm on a non-networked machine via an internal network share or on a USB key, you likely have the following files:

malcolm_YYYYMMDD_HHNNSS_xxxxxxx.README.txt - This readme file contains minimal setup instructions for extracting the contents of the other tarballs and running the Malcolm appliance.
malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz - This tarball contains the configuration files and directory configuration used by an instance of Malcolm. It can be extracted via tar -xf malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz upon which a directory will be created (named similarly to the tarball) containing the directories and configuration files. Alternatively, install.py can accept this filename as an argument and handle its extraction and initial configuration for you.
malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz - This tarball contains the Docker images used by Malcolm. It can be imported manually via docker load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz
install.py - This install script can load the Docker images and extract Malcolm configuration files from the aforementioned tarballs and do some initial configuration for you.

Run install.py malcolm_XXXXXXXX_XXXXXX_XXXXXXX.tar.gz and follow the prompts. If you do not already have Docker and Docker Compose installed, the install.py script will help you install them.
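
The file naming scheme above can be picked apart programmatically, for instance to sort several packages by creation time. The sketch below is illustrative only and is not how install.py handles these files; it assumes the trailing xxxxxxx component is seven lowercase hexadecimal characters, as in the f0d052c example shown earlier.

```python
import re
from datetime import datetime

# Assumption: the last name component is seven lowercase hex characters.
PACKAGE_NAME = re.compile(
    r"^malcolm_(?P<date>\d{8})_(?P<time>\d{6})_(?P<rev>[0-9a-f]{7})"
    r"(?P<kind>_images\.tar\.gz|\.tar\.gz|\.README\.txt)$"
)

KINDS = {
    "_images.tar.gz": "images",
    ".tar.gz": "config",
    ".README.txt": "readme",
}


def parse_package_name(filename):
    """Split a pre-packaged file name into timestamp, revision, and kind."""
    match = PACKAGE_NAME.match(filename)
    if match is None:
        return None  # not a packaged artifact (for example install.py)
    created = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {
        "created": created,
        "revision": match["rev"],
        "kind": KINDS[match["kind"]],
    }


print(parse_package_name("malcolm_20190513_101117_f0d052c_images.tar.gz"))
```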

Preparing your system

Recommended system requirements

Malcolm runs on top of Docker which runs on recent releases of Linux, Apple macOS and Microsoft Windows 10.

To quote the Elasticsearch documentation, "If there is one resource that you will run out of first, it will likely be memory." The same is true for Malcolm: you will want at least 16 gigabytes of RAM to run Malcolm comfortably. For processing large volumes of traffic, I'd recommend at a bare minimum a dedicated server with 16 cores and 16 gigabytes of RAM. Malcolm can run on less, but more is better. You're going to want as much hard drive space as possible, of course, as the amount of PCAP data you're able to analyze and store will be limited by your hard drive.

Arkime's wiki has several documents and a calculator which may be helpful, although not everything in those documents will apply to a Docker-based setup like Malcolm.

System configuration and tuning

If you already have Docker and Docker Compose installed, the install.py script can still help you tune system configuration and docker-compose.yml parameters for Malcolm. To run it in "configuration only" mode, bypassing the steps to install Docker and Docker Compose, run it like this:

./scripts/install.py --configure

Although install.py will attempt to automate many of the following configuration and tuning parameters, they are nonetheless listed in the following sections for reference:

`docker-compose.yml` parameters

Edit docker-compose.yml and search for the ES_JAVA_OPTS key. Edit the -Xms4g -Xmx4g values, replacing 4g with a number that is half of your total system memory, or just under 32 gigabytes, whichever is less. So, for example, if I had 64 gigabytes of memory I would edit those values to be -Xms31g -Xmx31g. This indicates how much memory can be allocated to the Elasticsearch heaps. For a pleasant experience, I would suggest not using a value under 10 gigabytes. Similar values can be modified for Logstash with LS_JAVA_OPTS, where using 3 or 4 gigabytes is recommended.
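
The heap sizing rule above amounts to a one-line calculation. Here is a minimal sketch of it, assuming whole gigabytes and reading "just under 32 gigabytes" as 31, as in the 64-gigabyte example; the function names are hypothetical and install.py may compute the value differently.

```python
def elasticsearch_heap_gb(total_memory_gb):
    """Half of system memory, capped at 31 GB (just under 32), in whole gigabytes."""
    heap = min(total_memory_gb // 2, 31)
    if heap < 1:
        raise ValueError("not enough memory to size an Elasticsearch heap")
    return heap


def es_java_opts(total_memory_gb):
    """Format the heap size the way it appears in ES_JAVA_OPTS."""
    heap = elasticsearch_heap_gb(total_memory_gb)
    return f"-Xms{heap}g -Xmx{heap}g"


for memory_gb in (16, 32, 64):
    print(memory_gb, "->", es_java_opts(memory_gb))
```

Note that a 16-gigabyte machine yields an 8-gigabyte heap, which is below the 10 gigabytes suggested above for a pleasant experience.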

Various other environment variables inside of docker-compose.yml can be tweaked to control aspects of how Malcolm behaves, particularly with regards to processing PCAP files and Zeek logs. The environment variables of particular interest are located near the top of that file under Commonly tweaked configuration options, which include:

PUID and PGID - Docker runs all of its containers as the privileged root user by default. For better security, Malcolm immediately drops to non-privileged user accounts for executing internal processes wherever possible. The PUID (process user ID) and PGID (process group ID) environment variables allow Malcolm to map internal non-privileged user accounts to a corresponding user account on the host.
NGINX_BASIC_AUTH - if set to true, use TLS-encrypted HTTP basic authentication (default); if set to false, use Lightweight Directory Access Protocol (LDAP) authentication
NGINX_LOG_ACCESS_AND_ERRORS - if set to true, all access to Malcolm via its web interfaces will be logged to Elasticsearch (default false)
MANAGE_PCAP_FILES – if set to true, all PCAP files imported into Malcolm will be marked as available for deletion by Arkime if available storage space becomes too low (default false)
ZEEK_AUTO_ANALYZE_PCAP_FILES – if set to true, all PCAP files imported into Malcolm will automatically be analyzed by Zeek, and the resulting logs will also be imported (default false)
ZEEK_DISABLE_... - if set to any non-blank value, each of these variables can be used to disable a certain Zeek function when it analyzes PCAP files (for example, setting ZEEK_DISABLE_LOG_PASSWORDS to true to disable logging of cleartext passwords)
ZEEK_DISABLE_BEST_GUESS_ICS - see "Best Guess" Fingerprinting for ICS Protocols
MAXMIND_GEOIP_DB_LICENSE_KEY - Malcolm uses MaxMind's free GeoLite2 databases for GeoIP lookups. As of December 30, 2019, these databases are no longer available for download via a public URL. Instead, they must be downloaded using a MaxMind license key (available without charge from MaxMind). The license key can be specified here for GeoIP database downloads during build- and run-time.
ARKIME_ANALYZE_PCAP_THREADS – the number of threads available to Arkime for analyzing PCAP files (default 1)
ZEEK_AUTO_ANALYZE_PCAP_THREADS – the number of threads available to Malcolm for analyzing Zeek logs (default 1)
LOGSTASH_OUI_LOOKUP – if set to true, Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs (default true)
LOGSTASH_REVERSE_DNS – if set to true, Logstash will perform a reverse DNS lookup for all external source and destination IP address values when analyzing Zeek logs (default false)
ES_EXTERNAL_HOSTS – if specified (in the format '10.0.0.123:9200'), logs received by Logstash will be forwarded on to another external Elasticsearch instance in addition to the one maintained locally by Malcolm
ES_EXTERNAL_SSL – if set to true, Logstash will use HTTPS for the connection to external Elasticsearch instances specified in ES_EXTERNAL_HOSTS
ES_EXTERNAL_SSL_CERTIFICATE_VERIFICATION – if set to true, Logstash will require full SSL certificate validation; this may fail if using self-signed certificates (default false)
AUTO_TAG – if set to true, Malcolm will automatically create Arkime sessions and Zeek logs with tags based on the filename, as described in Tagging (default true)
BEATS_SSL – if set to true, Logstash will require encrypted communications for any external Beats-based forwarders from which it will accept logs; if Malcolm is being used as a standalone tool then this can safely be set to false, but if external log feeds are to be accepted then setting it to true is recommended (default false)
ZEEK_EXTRACTOR_MODE – determines the file extraction behavior for file transfers detected by Zeek; see Automatic file extraction and scanning for more details
EXTRACTED_FILE_IGNORE_EXISTING – if set to true, files already present in the ./zeek-logs/extract_files/ directory will be ignored on startup rather than scanned
EXTRACTED_FILE_PRESERVATION – determines behavior for preservation of Zeek-extracted files
VTOT_API2_KEY – used to specify a VirusTotal Public API v2.0 key, which, if specified, will be used to submit hashes of Zeek-extracted files to VirusTotal
EXTRACTED_FILE_ENABLE_YARA – if set to true, Zeek-extracted files will be scanned with Yara
EXTRACTED_FILE_YARA_CUSTOM_ONLY – if set to true, Malcolm will bypass the default Yara ruleset and use only user-defined rules in ./yara/rules
EXTRACTED_FILE_ENABLE_CAPA – if set to true, Zeek-extracted files that are determined to be PE (portable executable) files will be scanned with Capa
EXTRACTED_FILE_CAPA_VERBOSE – if set to true, all Capa rule hits will be logged; otherwise (false) only MITRE ATT&CK technique classifications will be logged
EXTRACTED_FILE_ENABLE_CLAMAV – if set to true, Zeek-extracted files will be scanned with ClamAV
EXTRACTED_FILE_UPDATE_RULES – if set to true, file scanner engines (e.g., ClamAV, Capa, Yara) will periodically update their rule definitions
EXTRACTED_FILE_HTTP_SERVER_ENABLE – if set to true, the directory containing Zeek-extracted files will be served over HTTP at ./extracted-files/ (e.g., https://localhost/extracted-files/ if you are connecting locally)
EXTRACTED_FILE_HTTP_SERVER_ENCRYPT – if set to true, those Zeek-extracted files will be AES-256-CBC-encrypted in an openssl enc-compatible format (e.g., openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe)
EXTRACTED_FILE_HTTP_SERVER_KEY – specifies the AES-256-CBC decryption password for encrypted Zeek-extracted files; used in conjunction with EXTRACTED_FILE_HTTP_SERVER_ENCRYPT
PCAP_ENABLE_NETSNIFF – if set to true, Malcolm will capture network traffic on the local network interface(s) indicated in PCAP_IFACE using netsniff-ng
PCAP_ENABLE_TCPDUMP – if set to true, Malcolm will capture network traffic on the local network interface(s) indicated in PCAP_IFACE using tcpdump; there is no reason to enable both PCAP_ENABLE_NETSNIFF and PCAP_ENABLE_TCPDUMP
PCAP_IFACE – used to specify the network interface(s) for local packet capture if PCAP_ENABLE_NETSNIFF or PCAP_ENABLE_TCPDUMP are enabled; for multiple interfaces, separate the interface names with a comma (e.g., 'enp0s25' or 'enp10s0,enp11s0')
PCAP_ROTATE_MEGABYTES – used to specify how large a locally-captured PCAP file can become (in megabytes) before it is closed for processing and a new PCAP file is created
PCAP_ROTATE_MINUTES – used to specify a time interval (in minutes) after which a locally-captured PCAP file will be closed for processing and a new PCAP file created
PCAP_FILTER – specifies a tcpdump-style filter expression for local packet capture; leave blank to capture all traffic
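
As an illustration of how the capture-related variables above interact, the following minimal Python sketch splits a PCAP_IFACE value into interface names and flags the case where both capture engines are enabled. The helper names are hypothetical and are not part of Malcolm's scripts, and the sketch assumes the values are the strings true and false as they would appear in docker-compose.yml.

```python
def parse_ifaces(pcap_iface):
    """Split a comma-separated PCAP_IFACE value into interface names."""
    return [name.strip() for name in pcap_iface.split(",") if name.strip()]


def is_true(value):
    """Interpret a boolean string (assumed to be 'true' or 'false')."""
    return str(value).strip().lower() == "true"


def check_capture_settings(env):
    """Return a list of warnings for a dict of PCAP_* settings (hypothetical helper)."""
    warnings = []
    netsniff = is_true(env.get("PCAP_ENABLE_NETSNIFF", "false"))
    tcpdump = is_true(env.get("PCAP_ENABLE_TCPDUMP", "false"))
    if netsniff and tcpdump:
        warnings.append("both capture engines enabled; only one is needed")
    if (netsniff or tcpdump) and not parse_ifaces(env.get("PCAP_IFACE", "")):
        warnings.append("capture enabled but PCAP_IFACE is empty")
    return warnings


if __name__ == "__main__":
    example = {
        "PCAP_ENABLE_NETSNIFF": "true",
        "PCAP_ENABLE_TCPDUMP": "true",
        "PCAP_IFACE": "enp10s0,enp11s0",
    }
    print(parse_ifaces(example["PCAP_IFACE"]))
    print(check_capture_settings(example))
```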

Linux host system configuration

Installing Docker

Docker installation instructions vary slightly by distribution. Please follow the links below to docker.com to find the instructions specific to your distribution:

After installing Docker, because Malcolm should be run as a non-root user, add your user to the docker group with something like:

$ sudo usermod -aG docker yourusername

Following this, either reboot or log out then log back in.

Docker starts automatically on DEB-based distributions. On RPM-based distributions, you need to start it manually or enable it using the appropriate systemctl or service command(s).

You can test docker by running docker info, or (assuming you have internet access), docker run --rm hello-world.

Installing docker-compose

Please follow this link on docker.com for instructions on installing docker-compose.

Operating system configuration

The host system (i.e., the one running Docker) will need to be configured for the best possible Elasticsearch performance. Here are a few suggestions for Linux hosts (these may vary from distribution to distribution):

Append the following lines to /etc/sysctl.conf:

# the maximum number of open file handles
fs.file-max=2097152

# increase maximums for inotify watches
fs.inotify.max_user_watches=131072
fs.inotify.max_queued_events=131072
fs.inotify.max_user_instances=512

# the maximum number of memory map areas a process may have
vm.max_map_count=262144

# decrease "swappiness" (swapping out runtime memory vs. dropping pages)
vm.swappiness=1

# the maximum number of incoming connections
net.core.somaxconn=65535

# the % of system memory fillable with "dirty" pages before flushing
vm.dirty_background_ratio=40

# maximum % of dirty system memory before committing everything
vm.dirty_ratio=80
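
To double-check that the suggested values actually made it into the file, a small script along these lines may help. It is only a sketch: the function names are made up for illustration, and it inspects the text of /etc/sysctl.conf rather than the values currently active in the running kernel.

```python
# Values suggested in the sysctl.conf listing above
RECOMMENDED = {
    "fs.file-max": 2097152,
    "fs.inotify.max_user_watches": 131072,
    "fs.inotify.max_queued_events": 131072,
    "fs.inotify.max_user_instances": 512,
    "vm.max_map_count": 262144,
    "vm.swappiness": 1,
    "net.core.somaxconn": 65535,
    "vm.dirty_background_ratio": 40,
    "vm.dirty_ratio": 80,
}


def parse_sysctl_conf(text):
    """Parse key=value lines, ignoring blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def differing_settings(text, recommended=RECOMMENDED):
    """Return {key: (found, wanted)} for keys that are absent or differ."""
    found = parse_sysctl_conf(text)
    return {
        key: (found.get(key), str(wanted))
        for key, wanted in recommended.items()
        if found.get(key) != str(wanted)
    }


if __name__ == "__main__":
    try:
        with open("/etc/sysctl.conf") as conf:
            contents = conf.read()
    except OSError:
        contents = ""
    for key, (found, wanted) in differing_settings(contents).items():
        print("%s: found %s, suggested %s" % (key, found, wanted))
```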

Depending on your distribution, create either the file /etc/security/limits.d/limits.conf containing:

# the maximum number of open file handles
* soft nofile 65535
* hard nofile 65535
# do not limit the size of memory that can be locked
* soft memlock unlimited
* hard memlock unlimited

OR the file /etc/systemd/system.conf.d/limits.conf containing:

[Manager]
# the maximum number of open file handles
DefaultLimitNOFILE=65535:65535
# do not limit the size of memory that can be locked
DefaultLimitMEMLOCK=infinity
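
After logging back in, the limits in effect can be inspected with Python's standard resource module. The sketch below is an assumption-laden convenience rather than part of Malcolm: check_limits is a hypothetical helper, and it reports the limits seen by the process running the script, which may differ from those applied to processes inside Docker containers.

```python
import resource


def limit_ok(limit, minimum):
    """True if an rlimit value is unlimited or at least `minimum`."""
    return limit == resource.RLIM_INFINITY or limit >= minimum


def check_limits(nofile, memlock, min_nofile=65535):
    """Check (soft, hard) tuples as returned by resource.getrlimit()."""
    problems = []
    if not all(limit_ok(value, min_nofile) for value in nofile):
        problems.append("nofile below %d" % min_nofile)
    if not all(value == resource.RLIM_INFINITY for value in memlock):
        problems.append("memlock is not unlimited")
    return problems


if __name__ == "__main__":
    print(check_limits(
        resource.getrlimit(resource.RLIMIT_NOFILE),
        resource.getrlimit(resource.RLIMIT_MEMLOCK),
    ))
```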

Change the readahead value for the disk where the Elasticsearch data will be stored. There are a few ways to do this. For example, you could add this line to /etc/rc.local (replacing /dev/sda with your disk block descriptor):

# change disk read-ahead value (in 512-byte sectors)
blockdev --setra 512 /dev/sda
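
For reference, blockdev --setra counts 512-byte sectors, so the value 512 above corresponds to 256 KiB of read-ahead. The following sketch shows the arithmetic and reads the current value from sysfs where available; the sysfs path is the usual location on Linux, but treat it (and the helper names) as assumptions for your system.

```python
SECTOR_BYTES = 512  # blockdev --setra is expressed in 512-byte sectors


def readahead_kib(sectors):
    """Convert a --setra sector count to KiB."""
    return sectors * SECTOR_BYTES // 1024


def current_readahead_kib(device="sda"):
    """Return the read-ahead (KiB) reported by sysfs, or None if unavailable."""
    path = "/sys/block/%s/queue/read_ahead_kb" % device
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    print("requested:", readahead_kib(512), "KiB")
    print("current:", current_readahead_kib("sda"))
```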

Change the I/O scheduler to deadline or noop. Again, this can be done in a variety of ways. The simplest is to add elevator=deadline to the arguments in GRUB_CMDLINE_LINUX in /etc/default/grub, then run sudo update-grub2.
If you are planning on using very large data sets, consider formatting the drive containing the Elasticsearch volume as XFS.

After making all of these changes, do a reboot for good measure!

macOS host system configuration

Automatic installation using `install.py`

The install.py script will attempt to guide you through the installation of Docker and Docker Compose if they are not present. If that works for you, you can skip ahead to Configure docker daemon option in this section.

Install Homebrew

The easiest way to install and maintain docker on Mac is using the Homebrew cask. Execute the following in a terminal.

$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
$ brew install cask
$ brew tap homebrew/cask-versions

Install docker-edge

$ brew cask install docker-edge

This will install the latest version of docker and docker-compose. It can be upgraded later using brew as well:

$ brew cask upgrade --no-quarantine docker-edge

You can now run docker from the Applications folder.

Configure docker daemon option

Some changes should be made for performance (this link gives a good succinct overview).

Resource allocation - For a good experience, you likely need at least a quad-core MacBook Pro with 16GB RAM and an SSD. I have run Malcolm on an older 2013 MacBook Pro with 8GB of RAM, but the more the better. Go to your system tray and select Docker → Preferences → Advanced. Set the resources available to docker to at least 4 CPUs and 8GB of RAM (>= 16GB is preferable).
Volume mount performance - You can speed up performance of volume mounts by removing unused paths from Docker → Preferences → File Sharing. For example, if you’re only going to be mounting volumes under your home directory, you could share /Users but remove other paths.

After making these changes, right click on the Docker 🐋 icon in the system tray and select Restart.

Windows host system configuration

Installing and configuring Docker Desktop for Windows

Installing and configuring Docker to run under Windows must be done manually, rather than through the install.py script as is done for Linux and macOS.

In order to be able to configure Docker volume mounts correctly, you should be running Windows 10, version 1803 or higher.
The control scripts in the scripts/ directory are written for Python 3. They also rely on a few other utilities such as OpenSSL and htpasswd. The easiest way to run these tools in Windows is using the Windows Subsystem for Linux (WSL) (however, they may also be installed and configured manually: Python 3; OpenSSL; htpasswd, download the httpd….zip file and extract htpasswd.exe from the Apache…\bin\ directory). To install WSL, run the following command in PowerShell as Administrator:
- Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux
Install the Linux distribution of your choice in WSL. These instructions have been tested using Debian, but will probably work with other distributions as well.
Run the following commands in PowerShell as Administrator to enable required Windows features:
- Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All
- Enable-WindowsOptionalFeature -Online -FeatureName Containers -All
If you have not yet done so after enabling the Windows features, reboot.
Install Docker Desktop for Windows either by downloading the installer from the official Docker site or installing it through chocolatey.
Run Docker Desktop, click the Settings option in the Docker system tray menu and make the following adjustments:
- General
  - Ensure Start Docker Desktop when you log in is checked.
- Shared Drives
  - Mark the drive onto which Malcolm is installed as Shared (e.g., check Shared for drive C).
- Advanced
  - Increase CPUs to as many as you're comfortable with (at least 4 is best).
  - Increase Memory to as much as you're comfortable with (at least 16 is recommended, no fewer than 10).
  - Increase Disk image max size to however much space you want Malcolm to have available to it (ideally at least several hundred gigabytes), and change the Disk image location if needed to accommodate it.
Make sure Docker applies/restarts (or just reboot), then go back in and check the Advanced settings to make sure things stick.
To ensure Docker volume mounts work correctly when using WSL, WSL needs to be configured to mount at / instead of at /mnt. Inside your WSL Bash shell, run the following command to write /etc/wsl.conf to specify the WSL mount point:
- echo -e '[automount]\nroot = /\noptions = "metadata"' | sudo tee /etc/wsl.conf
Reboot.
Run docker info in PowerShell to make sure Docker is running.
Open a shell in your WSL distribution and run docker.exe info to make sure Docker is accessible from within WSL.
- Previous versions of WSL required the native Linux docker command-line client to interact with the Windows Desktop Docker server. Recent improvements to WSL allow the Windows executables docker-compose.exe and docker.exe to be run seamlessly in WSL. Malcolm's control scripts detect this scenario.

Finish Malcolm's configuration

Once Docker is installed, configured and running as described in the previous section, run ./scripts/install.py --configure (in WSL it will probably be something like sudo ./scripts/install.py --configure) to finish configuration of the local Malcolm installation.

The control scripts outlined in the Running Malcolm section may not be symlinked correctly under Windows. Rather than running ./scripts/start, ./scripts/stop, etc., you can run ./scripts/control.py --start, ./scripts/control.py --stop, etc. to the same effect.

Running Malcolm

Configure authentication

Malcolm requires authentication to access the user interface. Nginx can authenticate users with either local TLS-encrypted HTTP basic authentication or using a remote Lightweight Directory Access Protocol (LDAP) authentication server.

With the local basic authentication method, user accounts are managed by Malcolm and can be created, modified, and deleted using a user management web interface. This method is suitable in instances where accounts and credentials do not need to be synced across many Malcolm installations.

With LDAP authentication, user accounts are managed on a remote directory service, such as Microsoft Active Directory Domain Services or OpenLDAP.

Malcolm's authentication method is defined in the x-auth-variables section near the top of the docker-compose.yml file with the NGINX_BASIC_AUTH environment variable: true for local TLS-encrypted HTTP basic authentication, false for LDAP authentication.

In either case, you must run ./scripts/auth_setup before starting Malcolm for the first time in order to:

define the local Malcolm administrator account username and password (although these credentials will only be used for basic authentication, not LDAP authentication)
specify whether or not to (re)generate the self-signed certificates used for HTTPS access
- key and certificate files are located in the nginx/certs/ directory
specify whether or not to (re)generate the self-signed certificates used by a remote log forwarder (see the BEATS_SSL environment variable above)
- certificate authority, certificate, and key files for Malcolm’s Logstash instance are located in the logstash/certs/ directory
- certificate authority, certificate, and key files to be copied to and used by the remote log forwarder are located in the filebeat/certs/ directory
specify whether or not to store the username/password for forwarding Logstash events to a secondary, external Elasticsearch instance (see the ES_EXTERNAL_HOSTS, ES_EXTERNAL_SSL, and ES_EXTERNAL_SSL_CERTIFICATE_VERIFICATION environment variables above)
- these parameters are stored securely in the Logstash keystore file logstash/certs/logstash.keystore
specify whether or not to store the username/password for email alert senders
- these parameters are stored securely in the Elasticsearch keystore file elasticsearch/elasticsearch.keystore

Local account management

auth_setup is used to define the username and password for the administrator account. Once Malcolm is running, the administrator account can be used to manage other user accounts via a Malcolm User Management page served over HTTPS on port 488 (e.g., https://localhost:488 if you are connecting locally).

Malcolm user accounts can be used to access the interfaces of all of its components, including Arkime. Arkime uses its own internal database of user accounts, so when a Malcolm user account logs in to Arkime for the first time Malcolm creates a corresponding Arkime user account automatically. This being the case, it is not recommended to use the Arkime Users settings page or change the password via the Password form under the Arkime Settings page, as those settings would not be consistently used across Malcolm.

Users may change their passwords via the Malcolm User Management page by clicking User Self Service. A forgotten password can also be reset via an emailed link, though this requires SMTP server settings to be specified in htadmin/config.ini in the Malcolm installation directory.

Lightweight Directory Access Protocol (LDAP) authentication

The nginx-auth-ldap module serves as the interface between Malcolm's Nginx web server and a remote LDAP server. When you run auth_setup for the first time, a sample LDAP configuration file is created at nginx/nginx_ldap.conf.

# This is a sample configuration for the ldap_server section of nginx.conf.
# Yours will vary depending on how your Active Directory/LDAP server is configured.
# See https://github.com/kvspb/nginx-auth-ldap#available-config-parameters for options.

ldap_server ad_server {
  url "ldap://ds.example.com:3268/DC=ds,DC=example,DC=com?sAMAccountName?sub?(objectClass=person)";

  binddn "bind_dn";
  binddn_passwd "bind_dn_password";

  group_attribute member;
  group_attribute_is_dn on;
  require group "CN=Malcolm,CN=Users,DC=ds,DC=example,DC=com";
  require valid_user;
  satisfy all;
}

auth_ldap_cache_enabled on;
auth_ldap_cache_expiration_time 10000;
auth_ldap_cache_size 1000;

This file is mounted into the nginx container when Malcolm is started to provide connection information for the LDAP server.

The contents of nginx_ldap.conf will vary depending on how the LDAP server is configured. Some of the available parameters in that file include:

url - the ldap:// or ldaps:// connection URL for the remote LDAP server, which has the following syntax: ldap[s]://<hostname>:<port>/<base_dn>?<attributes>?<scope>?<filter>
binddn and binddn_passwd - the account credentials used to query the LDAP directory
group_attribute - the group attribute name which contains the member object (e.g., member or memberUid)
group_attribute_is_dn - whether or not to search for the user's full distinguished name as the value in the group's member attribute
require and satisfy - require user, require group and require valid_user can be used in conjunction with satisfy any or satisfy all to limit the users that are allowed to access the Malcolm instance
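
To make the structure of the url parameter concrete, the sketch below takes the sample url value from nginx_ldap.conf apart using Python's standard urllib.parse. The function name and the dictionary keys are illustrative choices, not names used by nginx-auth-ldap.

```python
from urllib.parse import urlsplit


def parse_ldap_url(url):
    """Split an LDAP URL into host, port, base DN, attributes, scope, and filter."""
    parts = urlsplit(url)
    # Everything after the first "?" ends up in parts.query; the remaining
    # "?" characters separate the attributes, scope, and filter fields.
    fields = parts.query.split("?") if parts.query else []
    fields += [""] * (3 - len(fields))
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "base_dn": parts.path.lstrip("/"),
        "attributes": fields[0],
        "scope": fields[1],
        "filter": fields[2],
    }


if __name__ == "__main__":
    sample = ("ldap://ds.example.com:3268/DC=ds,DC=example,DC=com"
              "?sAMAccountName?sub?(objectClass=person)")
    for key, value in parse_ldap_url(sample).items():
        print(key, "=", value)
```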

Before starting Malcolm, edit nginx/nginx_ldap.conf according to the specifics of your LDAP server and directory tree structure. Using a LDAP search tool such as ldapsearch in Linux or dsquery in Windows may be of help as you formulate the configuration. Your changes should be made within the curly braces of the ldap_server ad_server { … } section. You can troubleshoot configuration file syntax errors and LDAP connection or credentials issues by running ./scripts/logs (or docker-compose logs nginx) and examining the output of the nginx container.

The Malcolm User Management page described above is not available when using LDAP authentication.

LDAP connection security

Authentication over LDAP can be done in one of three ways, two of which offer data confidentiality protection:

StartTLS - the standard extension to the LDAP protocol to establish an encrypted SSL/TLS connection within an already established LDAP connection
LDAPS - a commonly used (though unofficial and considered deprecated) method in which SSL negotiation takes place before any commands are sent from the client to the server
Unencrypted (clear text) (not recommended)

In addition to the NGINX_BASIC_AUTH environment variable being set to false in the x-auth-variables section near the top of the docker-compose.yml file, the NGINX_LDAP_TLS_STUNNEL environment variable is used in conjunction with the values in nginx/nginx_ldap.conf to define the LDAP connection security level. Use the following combinations of values to achieve the connection security methods above, respectively:

StartTLS
- NGINX_LDAP_TLS_STUNNEL set to true in docker-compose.yml
- url should begin with ldap:// and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in nginx/nginx_ldap.conf
LDAPS
- NGINX_LDAP_TLS_STUNNEL set to false in docker-compose.yml
- url should begin with ldaps:// and its port should be either the default LDAPS port (636) or the default LDAPS Global Catalog port (3269) in nginx/nginx_ldap.conf
Unencrypted (clear text) (not recommended)
- NGINX_LDAP_TLS_STUNNEL set to false in docker-compose.yml
- url should begin with ldap:// and its port should be either the default LDAP port (389) or the default Global Catalog port (3268) in nginx/nginx_ldap.conf

For encrypted connections (whether using StartTLS or LDAPS), Malcolm will require and verify certificates when one or more trusted CA certificate files are placed in the nginx/ca-trust/ directory. Otherwise, any certificate presented by the domain server will be accepted.
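
The three combinations listed above can be summarized as a small decision function. This is a sketch for reasoning about a configuration, not code from Malcolm: the function name is hypothetical, and it looks only at the URL scheme and the NGINX_LDAP_TLS_STUNNEL value, ignoring the port.

```python
from urllib.parse import urlsplit


def ldap_security_mode(stunnel_enabled, url):
    """Map NGINX_LDAP_TLS_STUNNEL and the url scheme to a connection method."""
    scheme = urlsplit(url).scheme.lower()
    if scheme == "ldap":
        return "StartTLS" if stunnel_enabled else "Unencrypted (clear text)"
    if scheme == "ldaps" and not stunnel_enabled:
        return "LDAPS"
    # e.g. ldaps:// together with NGINX_LDAP_TLS_STUNNEL=true, or another scheme
    raise ValueError("combination not covered by the list above")


if __name__ == "__main__":
    print(ldap_security_mode(True, "ldap://ds.example.com:3268/DC=ds,DC=example,DC=com"))
    print(ldap_security_mode(False, "ldaps://ds.example.com:636/DC=ds,DC=example,DC=com"))
    print(ldap_security_mode(False, "ldap://ds.example.com:389/DC=ds,DC=example,DC=com"))
```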

Starting Malcolm

Docker compose is used to coordinate running the Docker containers. To start Malcolm, navigate to the directory containing docker-compose.yml and run:

$ ./scripts/start

This will create the containers' virtual network and instantiate them, then leave them running in the background. The Malcolm containers may take several minutes to start up completely. To follow the debug output for an already-running Malcolm instance, run:

$ ./scripts/logs

You can also use docker stats to monitor the resource utilization of running containers.

Stopping and restarting Malcolm

You can run ./scripts/stop to stop the docker containers and remove their virtual network. Alternatively, ./scripts/restart will restart an instance of Malcolm. Because the data on disk is stored on the host in docker volumes, doing these operations will not result in loss of data.

Malcolm can be configured to be automatically restarted when the Docker system daemon restarts (for example, on system reboot). This behavior depends on the value of the restart: setting for each service in the docker-compose.yml file. This value can be set by running ./scripts/install.py --configure and answering "yes" to "Restart Malcolm upon system or Docker daemon restart?."

Clearing Malcolm’s data

Run ./scripts/wipe to stop the Malcolm instance and wipe its Elasticsearch database (including index snapshots and management policies and alerting configuration).

Capture file and log archive upload

Malcolm serves a web browser-based upload form for uploading PCAP files and Zeek logs at https://localhost/upload/ if you are connecting locally.

Additionally, there is a writable files directory on an SFTP server served on port 8022 (e.g., sftp://USERNAME@localhost:8022/files/ if you are connecting locally).

The types of files supported are:

PCAP files (of mime type application/vnd.tcpdump.pcap or application/x-pcapng)
- PCAPNG files are partially supported: Zeek is able to process PCAPNG files, but not all of Arkime's packet examination features work correctly
Zeek logs in archive files (application/gzip, application/x-gzip, application/x-7z-compressed, application/x-bzip2, application/x-cpio, application/x-lzip, application/x-lzma, application/x-rar-compressed, application/x-tar, application/x-xz, or application/zip)
- where the Zeek logs are found in the internal directory structure in the archive file does not matter

Files uploaded via these methods are monitored and moved automatically to other directories for processing to begin, generally within one minute of completion of the upload.
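
As a rough illustration of the file types listed above, the following sketch sorts a MIME type string into one of the two supported categories. How the MIME type of an uploaded file is detected is outside the scope of the sketch, and the function and constant names are hypothetical.

```python
PCAP_MIME_TYPES = frozenset({
    "application/vnd.tcpdump.pcap",
    "application/x-pcapng",
})

ZEEK_ARCHIVE_MIME_TYPES = frozenset({
    "application/gzip",
    "application/x-gzip",
    "application/x-7z-compressed",
    "application/x-bzip2",
    "application/x-cpio",
    "application/x-lzip",
    "application/x-lzma",
    "application/x-rar-compressed",
    "application/x-tar",
    "application/x-xz",
    "application/zip",
})


def classify_upload(mime_type):
    """Return 'pcap', 'zeek-archive', or None for an unsupported type."""
    mime_type = mime_type.strip().lower()
    if mime_type in PCAP_MIME_TYPES:
        return "pcap"
    if mime_type in ZEEK_ARCHIVE_MIME_TYPES:
        return "zeek-archive"
    return None


if __name__ == "__main__":
    for candidate in ("application/x-pcapng", "application/zip", "text/plain"):
        print(candidate, "->", classify_upload(candidate))
```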

Tagging

In addition to being processed on upload, Malcolm events will be tagged according to the components of the filenames of the PCAP files or Zeek log archive files from which the events were parsed. For example, records created from a PCAP file named ACME_Scada_VLAN10.pcap would be tagged with ACME, Scada, and VLAN10. Tags are extracted from filenames by splitting on the characters "," (comma), "-" (dash), and "_" (underscore). These tags are viewable and searchable (via the tags field) in Arkime and Kibana. This behavior can be changed by modifying the AUTO_TAG environment variable in docker-compose.yml.

Tags may also be specified manually with the browser-based upload form.
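
The filename-splitting rule described in this section can be sketched in a few lines of Python. This approximates the described behavior and is not Malcolm's actual implementation; in particular, dropping only the final file extension before splitting is an assumption based on the ACME_Scada_VLAN10.pcap example.

```python
import os
import re


def tags_from_filename(filename):
    """Derive tags by splitting the base filename on ',', '-', and '_'."""
    stem, _extension = os.path.splitext(os.path.basename(filename))
    return [tag for tag in re.split(r"[,\-_]", stem) if tag]


if __name__ == "__main__":
    print(tags_from_filename("ACME_Scada_VLAN10.pcap"))
```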

Processing uploaded PCAPs with Zeek

The browser-based upload interface also provides the ability to specify tags for events extracted from the files uploaded. Additionally, an Analyze with Zeek checkbox may be used when uploading PCAP files to cause them to be analyzed by Zeek, similarly to the ZEEK_AUTO_ANALYZE_PCAP_FILES environment variable described above, only on a per-upload basis. Zeek can also automatically carve out files from file transfers; see Automatic file extraction and scanning for more details.

Live analysis

Capturing traffic on local network interfaces

Malcolm's pcap-capture container can capture traffic on one or more local network interfaces and periodically rotate these files for processing with Arkime and Zeek. The pcap-capture Docker container is started with additional privileges (IPC_LOCK, NET_ADMIN, NET_RAW, and SYS_ADMIN) in order for it to be able to open network interfaces in promiscuous mode for capture.

The environment variables prefixed with PCAP_ in the docker-compose.yml file determine local packet capture behavior. Local capture can also be configured by running ./scripts/install.py --configure and answering "yes" to "Should Malcolm capture network traffic to PCAP files?."

Note that currently Microsoft Windows and Apple macOS platforms run Docker inside of a virtualized environment. This would require additional configuration of virtual interfaces and port forwarding in Docker, the process for which is outside of the scope of this document.

Using a network sensor appliance

A remote network sensor appliance can be used to monitor network traffic, capture PCAP files, and forward Zeek logs, Arkime sessions, or other information to Malcolm. Hedgehog Linux is a Debian-based operating system built to:

monitor network interfaces
capture packets to PCAP files
detect file transfers in network traffic and extract and scan those files for threats
generate and forward Zeek logs, Arkime sessions, and other information to Malcolm

Please see the Hedgehog Linux README for more information.

Manually forwarding Zeek logs from an external source

Malcolm’s Logstash instance can also be configured to accept Zeek logs from a remote forwarder by running ./scripts/install.py --configure and answering "yes" to "Expose Logstash port to external hosts?." Enabling encrypted transport of these log files is discussed in Configure authentication and the description of the BEATS_SSL environment variable in the docker-compose.yml file.

Configuring Filebeat to forward Zeek logs to Malcolm might look something like this example filebeat.yml:

filebeat.inputs:
- type: log
  paths:
    - /var/zeek/*.log
  fields_under_root: true
  fields:
    type: "session"
  compression_level: 0
  exclude_lines: ['^\s*#']
  scan_frequency: 10s
  clean_inactive: 180m
  ignore_older: 120m
  close_inactive: 90m
  close_renamed: true
  close_removed: true
  close_eof: false
  clean_renamed: true
  clean_removed: true

output.logstash:
  hosts: ["192.0.2.123:5044"]
  ssl.enabled: true
  ssl.certificate_authorities: ["/foo/bar/ca.crt"]
  ssl.certificate: "/foo/bar/client.crt"
  ssl.key: "/foo/bar/client.key"
  ssl.supported_protocols: "TLSv1.2"
  ssl.verification_mode: "none"
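
The exclude_lines pattern in the example above is what keeps Zeek's header lines (which begin with #) from being forwarded. The sketch below applies the same regular expression in Python to a few made-up, tab-separated log lines; Filebeat uses its own regular expression engine, but for a pattern this simple the behavior should match.

```python
import re

# Same pattern as exclude_lines in the filebeat.yml example
HEADER_PATTERN = re.compile(r"^\s*#")


def forwarded_lines(lines):
    """Return the lines that would not be dropped by the exclude pattern."""
    return [line for line in lines if not HEADER_PATTERN.search(line)]


if __name__ == "__main__":
    sample = [
        "#separator \\x09",
        "#fields\tts\tuid\tid.orig_h",
        "1600000000.000000\tCQcoro2z6adgtGlk42\t192.0.2.10",
        "#close\t2020-09-13-12-26-40",
    ]
    for line in forwarded_lines(sample):
        print(line)
```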

Arkime

The Arkime interface will be accessible over HTTPS on port 443 at the Docker host's IP address (e.g., https://localhost if you are connecting locally).

Zeek log integration

A stock installation of Arkime extracts all of its network connection ("session") metadata ("SPI" or "Session Profile Information") from full packet capture artifacts (PCAP files). Zeek (formerly Bro) generates similar session metadata, linking network events to sessions via a connection UID. Malcolm aims to facilitate analysis of Zeek logs by mapping values from Zeek logs to the Arkime session database schema for equivalent fields, and by creating new "native" Arkime database fields for all the other Zeek log values for which there is not currently an equivalent in Arkime.

In this way, when full packet capture is an option, analysis of PCAP files can be enhanced by the additional information Zeek provides. When full packet capture is not an option, similar analysis can still be performed using the same interfaces and processes using the Zeek logs alone.

One value of particular mention is Zeek Log Type (zeek.logType in Elasticsearch). This value corresponds to the kind of Zeek .log file from which the record was created. In other words, a search could be restricted to records from conn.log by searching zeek.logType == conn, or restricted to records from weird.log by searching zeek.logType == weird. In this same way, to view only records from Zeek logs (excluding any from PCAP files), use the special Arkime EXISTS filter, as in zeek.logType == EXISTS!. On the other hand, to exclude Zeek logs and only view records from PCAP files, use zeek.logType != EXISTS!.

Click the icon of the owl 🦉 in the upper-left corner to access the Arkime usage documentation (accessible at https://localhost/help if you are connecting locally), click the Fields label in the navigation pane, then search for zeek to see a list of the other Zeek log types and fields available to Malcolm.

The values of records created from Zeek logs can be expanded and viewed like any native Arkime session by clicking the plus ➕ icon to the left of the record in the Sessions view. However, note that when dealing with these Zeek records the full packet contents are not available, so buttons dealing with viewing and exporting PCAP information will not behave as they would for records from PCAP files. Other than that, Zeek records and their values are usable in Malcolm just like native PCAP session records.

Correlating Zeek logs and Arkime sessions

The Arkime interface displays both Zeek logs and Arkime sessions alongside each other. Using fields common to both data sources, one can craft queries to filter results matching desired criteria.

A few fields of particular mention that help limit returned results to those Zeek logs and Arkime session records generated from the same network connection are Community ID (communityId and zeek.community_id in Arkime and Zeek, respectively) and Zeek's connection UID (zeek.uid), which Malcolm maps to Arkime's rootId field.

Community ID is a specification for standard flow hashing published by Corelight with the intent of making it easier to pivot from one dataset (e.g., Arkime sessions) to another (e.g., Zeek conn.log entries). In Malcolm both Arkime and Zeek populate this value, which makes it possible to filter for a specific network connection and see both data sources' results for that connection.

The rootId field is used by Arkime to link session records together when a particular session has too many packets to be represented by a single session. When normalizing Zeek logs to Arkime's schema, Malcolm piggybacks on rootId to store Zeek's connection UID to cross-reference entries across Zeek log types. The connection UID is also stored in zeek.uid.

Filtering on community ID OR'ed with zeek UID (e.g., communityId == "1:r7tGG//fXP1P0+BXH3zXETCtEFI=" || rootId == "CQcoro2z6adgtGlk42") is an effective way to see both the Arkime sessions and Zeek logs generated by a particular network connection.
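
When pivoting between many connections, it can be handy to generate this search expression rather than type it by hand. The helper below is a hypothetical convenience, not part of Malcolm or Arkime; it simply formats the two values into the expression shown above and does not escape quotation marks inside them.

```python
def correlation_query(community_id=None, zeek_uid=None):
    """Build an Arkime search expression OR-ing community ID and Zeek UID."""
    clauses = []
    if community_id:
        clauses.append('communityId == "%s"' % community_id)
    if zeek_uid:
        clauses.append('rootId == "%s"' % zeek_uid)
    if not clauses:
        raise ValueError("at least one identifier is required")
    return " || ".join(clauses)


if __name__ == "__main__":
    print(correlation_query("1:r7tGG//fXP1P0+BXH3zXETCtEFI=", "CQcoro2z6adgtGlk42"))
```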

Help

Click the icon of the owl 🦉 in the upper-left corner to access the Arkime usage documentation (accessible at https://localhost/help if you are connecting locally), which includes such topics as search syntax, the Sessions view, SPIView, SPIGraph, and the Connections graph.

Sessions

The Sessions view provides low-level details of the sessions being investigated, whether they be Arkime sessions created from PCAP files or Zeek logs mapped to the Arkime session database schema.

The Sessions view contains many controls for filtering the sessions displayed from all sessions down to sessions of interest:

search bar: Indicated by the magnifying glass 🔍 icon, the search bar allows defining filters on session/log metadata
time bounding controls: The 🕘 , Start, End, Bounding, and Interval fields, and the date histogram can be used to visually zoom and pan the time range being examined.
search button: The Search button re-runs the sessions query with the filters currently specified.
views button: Indicated by the eyeball 👁 icon, views allow overlaying additional previously-specified filters onto the current sessions filters. For convenience, Malcolm provides several preconfigured Arkime views, including several on the zeek.logType field.

map: A global map can be expanded by clicking the globe 🌎 icon. This allows filtering sessions by IP-based geolocation when possible.

Some of these filter controls are also available on other Arkime pages (such as SPIView, SPIGraph, Connections, and Hunt).

The number of sessions displayed per page, as well as the page currently displayed, can be specified using the paging controls underneath the time bounding controls.

The sessions table is displayed below the filter controls. This table contains the sessions/logs matching the specified filters.

To the left of the column headers are two buttons. The Toggle visible columns button, indicated by a grid ⊞ icon, allows toggling which columns are displayed in the sessions table. The Save or load custom column configuration button, indicated by a columns ◫ icon, allows saving the current displayed columns or loading previously-saved configurations. This is useful for customizing which columns are displayed when investigating different types of traffic. Column headers can also be clicked to sort the results in the table, and column widths may be adjusted by dragging the separators between column headers.

Details for individual sessions/logs can be expanded by clicking the plus ➕ icon on the left of each row. Each row may contain multiple sections and controls, depending on whether the row represents an Arkime session or a Zeek log. Clicking the field names and values in the details sections allows additional filters to be specified or summary lists of unique values to be exported.

When viewing Arkime session details (i.e., a session generated from a PCAP file), an additional packets section will be visible underneath the metadata sections. When the details of a session of this type are expanded, Arkime will read the packet(s) comprising the session for display here. Various controls can be used to adjust how the packet is displayed (enabling natural decoding and enabling Show Images & Files may produce visually pleasing results), and other options (including PCAP download, carving images and files, applying decoding filters, and examining payloads in CyberChef) are available.

See also Arkime's usage documentation for more information on the Sessions view.

PCAP Export

Clicking the down arrow ▼ icon to the far right of the search bar presents a list of actions including PCAP Export (see Arkime's sessions help for information on the other actions). When full PCAP sessions are displayed, the PCAP Export feature allows you to create a new PCAP file from the matching Arkime sessions, including controls for which sessions are included (open items, visible items, or all matching items) and whether or not to include linked segments. Click the Export PCAP button to generate the PCAP, after which you'll be presented with a browser download dialog to save or open the file. Note that depending on the scope of the filters specified this might take a long time (or possibly even time out).

See the issues section of this document for an error that can occur using this feature when Zeek log sessions are displayed.

SPIView

Arkime's SPI (Session Profile Information) View provides a quick and easy-to-use interface for exploring session/log metrics. The SPIView page lists categories for general session metrics (e.g., protocol, source and destination IP addresses, source and destination ports, etc.) as well as for all of the various types of network traffic understood by Arkime and Zeek. These categories can be expanded and the top n values displayed, along with each value's cardinality, for the fields of interest they contain.

Click the plus ➕ icon to the right of a category to expand it. The values for specific fields are displayed by clicking the field description in the field list underneath the category name. The list of field names can be filtered by typing part of the field name in the Search for fields to display in this category text input. The Load All and Unload All buttons can be used to toggle display of all of the fields belonging to that category. Once displayed, a field's name or one of its values may be clicked to provide further actions for filtering or displaying that field or its values. Of particular interest may be the Open [fieldname] SPI Graph option when clicking on a field's name. This will open a new tab with the SPI Graph (see below) populated with the field's top values.

Note that because the SPIView page can potentially run many queries, SPIView limits the search domain to seven days (in other words, seven indices, as each index represents one day's worth of data). When using SPIView, you will have best results if you limit your search time frame to less than or equal to seven days. This limit can be adjusted by editing the spiDataMaxIndices setting in config.ini and rebuilding the malcolmnetsec/arkime docker container.
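As a rough illustration of the one-index-per-day arithmetic described above, the following Python sketch counts how many daily indices a search time frame would touch and compares that count against the limit. The function names and the constant are hypothetical and exist only for this example; the real limit is whatever spiDataMaxIndices is set to in config.ini.

```python
from datetime import date

# Assumption: mirrors the seven-index limit described above. The real value
# is read from spiDataMaxIndices in config.ini, not from this constant.
SPI_DATA_MAX_INDICES = 7


def days_spanned(start: date, end: date) -> int:
    """Count the calendar days (one daily index each) touched by [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    return (end - start).days + 1


def within_spiview_limit(start: date, end: date,
                         limit: int = SPI_DATA_MAX_INDICES) -> bool:
    """Return True if the time frame fits within the index limit."""
    return days_spanned(start, end) <= limit
```

A time frame running from the first through the seventh of a month touches seven daily indices and fits the limit; extending it by one more day does not.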

See also Arkime's usage documentation for more information on SPIView.

SPIGraph

Arkime's SPI (Session Profile Information) Graph visualizes the occurrence of some field's top n values over time, and (optionally) geographically. This is particularly useful for identifying trends in a particular type of communication over time: traffic using a particular protocol when seen sparsely at regular intervals on that protocol's date histogram in the SPIGraph may indicate a connection check, polling, or beaconing (for example, see the llmnr protocol in the screenshot below).

Controls can be found underneath the time bounding controls for selecting the field of interest, the number of elements to be displayed, the sort order, and a periodic refresh of the data.

See also Arkime's usage documentation for more information on SPIGraph.

Connections

The Connections page presents network communications via a force-directed graph, making it easy to visualize logical relationships between network hosts.

Controls are available for specifying the query size (where smaller values will execute more quickly but may only contain an incomplete representation of the top n sessions, and larger values may take longer to execute but will be more complete), which fields to use as the source and destination for node values, a minimum connections threshold, and the method for determining the "weight" of the link between two nodes. As is the case with most other visualizations in Arkime, the graph is interactive: clicking on a node or the link between two nodes can be used to modify query filters, and the nodes themselves may be repositioned by dragging and dropping them. A node's color indicates whether it communicated as a source/originator, a destination/responder, or both.
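To make the ideas of link weight, minimum connections threshold, and node role more concrete, here is a minimal Python sketch. It is not Arkime's implementation: it assumes sessions are simple dictionaries with hypothetical "src" and "dst" keys, and it uses the session count as the link weight, which is only one possible weighting method.

```python
from collections import Counter


def build_links(sessions, min_connections=1):
    """Aggregate (source, destination) pairs into weighted links.

    Assumption: the weight is the number of sessions seen between the pair.
    Links below the minimum connections threshold are dropped.
    """
    counts = Counter((s["src"], s["dst"]) for s in sessions)
    return {pair: n for pair, n in counts.items() if n >= min_connections}


def node_roles(links):
    """Classify each node as 'source', 'destination', or 'both'."""
    roles = {}
    for src, dst in links:
        roles.setdefault(src, set()).add("source")
        roles.setdefault(dst, set()).add("destination")
    return {node: "both" if len(seen) == 2 else next(iter(seen))
            for node, seen in roles.items()}
```

Raising min_connections thins out the graph in the same spirit as the threshold control described above, at the cost of hiding infrequent conversations.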

While the default source and destination fields are Src IP and Dst IP:Dst Port, the Connections view is able to use any combination of any of the fields populated by Arkime and Zeek. For example:

Src OUI and Dst OUI (hardware manufacturers)
Src IP and Protocols
Originating Network Segment and Responding Network Segment (see CIDR subnet to network segment name mapping)
Originating GeoIP City and Responding GeoIP City

or any other combination of these or other fields.

See also Arkime's usage documentation for more information on the Connections graph.

Hunt

Arkime's Hunt feature allows an analyst to search within the packets themselves (including payload data) rather than simply searching the session metadata. The search string may be specified using ASCII (with or without case sensitivity), hex codes, or regular expressions. Once a hunt job is complete, matching sessions can be viewed in the Sessions view.
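The difference between the search string formats can be sketched in a few lines of Python. This is an illustration of the matching semantics only, not Arkime's code: the function name and the format labels ("ascii", "asciicase", "hex", "regex") are hypothetical identifiers chosen for this example, and the hex regex format is left out.

```python
import re


def payload_matches(payload: bytes, pattern: str, fmt: str) -> bool:
    """Return True if the pattern is found in the raw payload bytes."""
    if fmt == "ascii":
        # Case-insensitive ASCII search
        return pattern.lower().encode("ascii") in payload.lower()
    if fmt == "asciicase":
        # Case-sensitive ASCII search
        return pattern.encode("ascii") in payload
    if fmt == "hex":
        # Pattern given as hex codes, e.g. "47 45 54" for the bytes of "GET"
        return bytes.fromhex(pattern) in payload
    if fmt == "regex":
        # Regular expression applied directly to the payload bytes
        return re.search(pattern.encode("ascii"), payload) is not None
    raise ValueError(f"unsupported format: {fmt}")
```

For a payload such as an HTTP request, a case-insensitive search for user-agent would match a User-Agent header, while the case-sensitive variant would not.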

Clicking the Create a packet search job button on the Hunt page will allow you to specify the following parameters for a new hunt job:

a packet search job name
a maximum number of packets to examine per session
the search string and its format (ascii, ascii (case sensitive), hex, regex, or hex regex)
whether to search source packets, destination packets, or both
whether to search raw or reassembled packets

Click the ➕ Create button to begin the search. Arkime will scan the source PCAP files from which the sessions were created according to the search criteria. Note that whatever filters were specified when the hunt job is executed will apply to the hunt job as well; the number of sessions matching the current filters will be displayed above the hunt job parameters with text like "ⓘ Creating a new packet search job will search the packets of # sessions."

Once a hunt job is submitted, it will be assigned a unique hunt ID (a long unique string of characters like yuBHAGsBdljYmwGkbEMm) and its progress will be updated periodically in the Hunt Job Queue with the execution percent complete, the number of matches found so far, and the other parameters with which the job was submitted. More details for the hunt job can be viewed by expanding its row with the plus ➕ icon on the left.

Once the hunt job is complete (and a minute or so has passed, as the huntId must be added to the matching session records in the database), click the folder 📂 icon on the right side of the hunt job row to open a new Sessions tab with the search bar prepopulated to filter to sessions with packets matching the search criteria.

From this list of filtered sessions you can expand session details and explore packet payloads which matched the hunt search criteria.

The hunt feature is available only for sessions created from full packet capture data, not Zeek logs. This being the case, it is a good idea to click the eyeball 👁 icon and select the PCAP Files view to exclude Zeek logs from candidate sessions prior to using the hunt feature.

See also Arkime's usage documentation for more information on the hunt feature.

Statistics

Arkime provides several other reports which show information about the state of Arkime and the underlying Elasticsearch database.

The Files list displays a list of PCAP files processed by Arkime, the date and time of the earliest packet in each file, and the file size:

The ES Indices list (available under the Stats page) lists the Elasticsearch indices within which log data is contained:

The History view provides a historical list of queries issued to Arkime and the details of those queries:

See also Arkime's usage documentation for more information on the Files list, statistics, and history.

Settings

General settings

The Settings page can be used to tweak Arkime preferences, define additional custom views and column configurations, tweak the color theme, and more.

See Arkime's usage documentation for more information on settings.

Kibana

While Arkime provides very nice visualizations, especially for network traffic, Kibana (an open source general-purpose data visualization tool for Elasticsearch) can be used to create custom visualizations (tables, charts, graphs, dashboards, etc.) using the same data.

The Kibana container can be accessed at https://localhost/kibana/ if you are connecting locally. Several preconfigured dashboards for Zeek logs are included in Malcolm's Kibana configuration.

The official Kibana User Guide has excellent tutorials for a variety of topics.

Kibana has several components for data searching and visualization:

Discover

The Discover view enables you to view events on a record-by-record basis (similar to a session record in Arkime or an individual line from a Zeek log). See the official Kibana User Guide for information on using the Discover view:

Screenshots

Visualizations and dashboards

Prebuilt visualizations and dashboards

Malcolm comes with dozens of prebuilt visualizations and dashboards for the network traffic represented by each of the Zeek log types. Click Dashboard to see a list of these dashboards. As is the case with all of Kibana's visualizations, all of the charts, graphs, maps, and tables are interactive and can be clicked on to narrow or expand the scope of the data you are investigating. Similarly, click Visualize to explore the prebuilt visualizations used to build the dashboards.

Many of Malcolm's prebuilt visualizations for Zeek logs are heavily inspired by the excellent Kibana Dashboards that are part of Security Onion.

Screenshots

Building your own visualizations and dashboards

See the official Kibana User Guide for information on creating your own visualizations and dashboards:

Getting Started: Visualizing Your Data
Visualize
Dashboard
Timelion (more advanced time series data visualizer)

Screenshots

Search Queries in Arkime and Kibana

Kibana supports two query syntaxes: the legacy Lucene syntax and the new Kibana Query Language (KQL), both of which are somewhat different from Arkime's query syntax (see the help at https://localhost/help#search if you are connecting locally). The Arkime interface is for searching and visualizing both Arkime sessions and Zeek logs. The prebuilt dashboards in the Kibana interface are for searching and visualizing Zeek logs, but will not include Arkime sessions. Here are some common patterns used in building search query strings for Arkime and Kibana, respectively. See the links provided for further documentation.

	Arkime Search String	Kibana Search String (Lucene)	Kibana Search String (KQL)
Field exists	`zeek.logType == EXISTS!`	`_exists_:zeek.logType`	`zeek.logType:*`
Field does not exist	`zeek.logType != EXISTS!`	`NOT _exists_:zeek.logType`	`NOT zeek.logType:*`
Field matches a value	`port.dst == 22`	`dstPort:22`	`dstPort:22`
Field does not match a value	`port.dst != 22`	`NOT dstPort:22`	`NOT dstPort:22`
Field matches at least one of a list of values	`tags == [external_source, external_destination]`	`tags:(external_source OR external_destination)`	`tags:(external_source or external_destination)`
Field range (inclusive)	`http.statuscode >= 200 && http.statuscode <= 300`	`http.statuscode:[200 TO 300]`	`http.statuscode >= 200 and http.statuscode <= 300`
Field range (exclusive)	`http.statuscode > 200 && http.statuscode < 300`	`http.statuscode:{200 TO 300}`	`http.statuscode > 200 and http.statuscode < 300`
Field range (mixed exclusivity)	`http.statuscode >= 200 && http.statuscode < 300`	`http.statuscode:[200 TO 300}`	`http.statuscode >= 200 and http.statuscode < 300`
Match all search terms (AND)	`(tags == [external_source, external_destination]) && (http.statuscode == 401)`	`tags:(external_source OR external_destination) AND http.statuscode:401`	`tags:(external_source or external_destination) and http.statuscode:401`
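Match any search terms (OR)	`(zeek_ftp.password == EXISTS!) || (zeek_http.password == EXISTS!)`	`_exists_:zeek_ftp.password OR _exists_:zeek_http.password`	`zeek_ftp.password:* or zeek_http.password:*`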
Global string search (anywhere in the document)	all Arkime search expressions are field-based	`microsoft`	`microsoft`
Wildcards	`host.dns == "micro?oft"` (`?` for single character, `*` for any characters)	`dns.host:micro?oft` (`?` for single character, `*` for any characters)	`dns.host:micro*ft` (`*` for any characters)
Regex	`host.http == /.*www\.f.*k\.com.*/`	`zeek_http.host:/.*www\.f.*k\.com.*/`	Kibana Query Language does not currently support regex
IPv4 values	`ip == 0.0.0.0/0`	`srcIp:"0.0.0.0/0" OR dstIp:"0.0.0.0/0"`	`srcIp:"0.0.0.0/0" OR dstIp:"0.0.0.0/0"`
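IPv6 values	`(ip.src == EXISTS! || ip.dst == EXISTS!) && (ip != 0.0.0.0/0)`	`(_exists_:srcIp AND NOT srcIp:"0.0.0.0/0") OR (_exists_:dstIp AND NOT dstIp:"0.0.0.0/0")`	`(srcIp:* and not srcIp:"0.0.0.0/0") or (dstIp:* and not dstIp:"0.0.0.0/0")`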
GeoIP information available	`country == EXISTS!`	`_exists_:zeek.destination_geo OR _exists_:zeek.source_geo`	`zeek.destination_geo:* or zeek.source_geo:*`
Zeek log type	`zeek.logType == notice`	`zeek.logType:notice`	`zeek.logType:notice`
IP CIDR Subnets	`ip.src == 172.16.0.0/12`	`srcIp:"172.16.0.0/12"`	`srcIp:"172.16.0.0/12"`
Search time frame	Use Arkime time bounding controls under the search bar	Use Kibana time range controls in the upper right-hand corner	Use Kibana time range controls in the upper right-hand corner

When building complex queries, it is strongly recommended that you enclose search terms and expressions in parentheses to control order of operations.
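As an illustration of the parenthesization advice, the sketch below builds the same query in all three syntaxes and wraps every clause in parentheses before joining. The helper names (any_of, equals, all_of) are hypothetical and exist only for this example; they are not part of Arkime, Kibana, or Malcolm, and they perform no escaping of field names or values.

```python
def any_of(field, values):
    """Field matches at least one of a list of values (same field name in all syntaxes)."""
    return {
        "arkime": f"{field} == [{', '.join(values)}]",
        "lucene": f"{field}:({' OR '.join(values)})",
        "kql": f"{field}:({' or '.join(values)})",
    }


def equals(arkime_field, kibana_field, value):
    """Field matches a value; Arkime and Kibana may use different field names."""
    return {
        "arkime": f"{arkime_field} == {value}",
        "lucene": f"{kibana_field}:{value}",
        "kql": f"{kibana_field}:{value}",
    }


def all_of(*clauses):
    """AND the clauses together, parenthesizing each to make precedence explicit."""
    return {
        "arkime": " && ".join(f"({c['arkime']})" for c in clauses),
        "lucene": " AND ".join(f"({c['lucene']})" for c in clauses),
        "kql": " and ".join(f"({c['kql']})" for c in clauses),
    }


query = all_of(
    any_of("tags", ["external_source", "external_destination"]),
    equals("http.statuscode", "http.statuscode", 401),
)
```

Here query["arkime"] reproduces the Arkime form of the "Match all search terms (AND)" row in the table above, while the Lucene and KQL forms carry an extra, harmless pair of parentheses around each clause.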

As Zeek logs are ingested, Malcolm parses and normalizes the logs' fields to match Arkime's underlying Elasticsearch schema. A complete list of these fields can be found in the Arkime help (accessible at https://localhost/help#fields if you are connecting locally).

Whenever possible, Zeek fields are mapped to existing corresponding Arkime fields: for example, the orig_h field in Zeek is mapped to Arkime's srcIp field. The original Zeek fields are also left intact. To complicate the issue, the Arkime interface uses its own aliases to reference those fields: the source IP field is referenced as ip.src (Arkime's alias) in Arkime and srcIp or zeek.orig_h in Kibana.

The table below shows the mapping of some of these fields.

Field Description	Arkime Field Alias(es)	Arkime-mapped Zeek Field(s)	Zeek Field(s)
Community ID Flow Hash		`communityId`	`zeek.community_id`
Destination IP	`ip.dst`	`dstIp`	`zeek.resp_h`
Destination MAC	`mac.dst`	`dstMac`	`zeek.resp_l2_addr`
Destination Port	`port.dst`	`dstPort`	`zeek.resp_p`
Duration	`session.length`	`length`	`zeek_conn.duration`
First Packet Time	`starttime`	`firstPacket`	`zeek.ts`, `@timestamp`
IP Protocol	`ip.protocol`	`ipProtocol`	`zeek.proto`
Last Packet Time	`stoptime`	`lastPacket`
MIME Type	`email.bodymagic`, `http.bodymagic`	`http.bodyMagic`	`zeek.filetype`, `zeek_files.mime_type`, `zeek_ftp.mime_type`, `zeek_http.orig_mime_types`, `zeek_http.resp_mime_types`, `zeek_irc.dcc_mime_type`
Protocol/Service	`protocols`	`protocol`	`zeek.proto`, `zeek.service`
Request Bytes	`databytes.src`, `bytes.src`	`srcBytes`, `srcDataBytes`	`zeek_conn.orig_bytes`, `zeek_conn.orig_ip_bytes`
Request Packets	`packets.src`	`srcPackets`	`zeek_conn.orig_pkts`
Response Bytes	`databytes.dst`, `bytes.dst`	`dstBytes`, `dstDataBytes`	`zeek_conn.resp_bytes`, `zeek_conn.resp_ip_bytes`
Response Packets	`packets.dst`	`dstPackets`	`zeek_conn.resp_pkts`
Source IP	`ip.src`	`srcIp`	`zeek.orig_h`
Source MAC	`mac.src`	`srcMac`	`zeek.orig_l2_addr`
Source Port	`port.src`	`srcPort`	`zeek.orig_p`
Total Bytes	`databytes`, `bytes`	`totDataBytes`, `totBytes`
Total Packets	`packets`	`totPackets`
Username	`user`	`user`	`zeek.user`
Zeek Connection UID			`zeek.uid`
Zeek File UID			`zeek.fuid`
Zeek Log Type			`zeek.logType`

In addition to the fields listed above, Arkime provides several special field aliases for matching any field of a particular type. While these aliases do not exist in Kibana per se, they can be approximated as illustrated below.

Matches Any	Arkime Special Field Example	Kibana/Zeek Equivalent Example
IP Address	`ip == 192.168.0.1`	`srcIp:192.168.0.1 OR dstIp:192.168.0.1`
Port	`port == [80, 443, 8080, 8443]`	`srcPort:(80 OR 443 OR 8080 OR 8443) OR dstPort:(80 OR 443 OR 8080 OR 8443)`
Country (code)	`country == [RU,CN]`	`zeek.destination_geo.country_code2:(RU OR CN) OR zeek.source_geo.country_code2:(RU OR CN) OR dns.GEO:(RU OR CN)`
Country (name)		`zeek.destination_geo.country_name:(Russia OR China) OR zeek.source_geo.country_name:(Russia OR China)`
ASN	`asn == "Mozilla"`	`srcASN:Mozilla OR dstASN:Mozilla OR dns.ASN:Mozilla`
Host	`host == www.microsoft.com`	`zeek_http.host:www.microsoft.com (or zeek_dhcp.host_name, zeek_dns.host, zeek_ntlm.host, smb.host, etc.)`
Protocol (layers >= 4)	`protocols == tls`	`protocol:tls`
User	`user == EXISTS! && user != anonymous`	`_exists_:user AND (NOT user:anonymous)`
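The approximation shown in the table amounts to repeating the same condition for the source and destination variants of a field. A minimal Python sketch of that expansion for the IP Address and Port rows follows; the function names are hypothetical, and the output uses Lucene syntax with the srcIp/dstIp and srcPort/dstPort field names from the table.

```python
def expand_any_ip(ip):
    """Approximate Arkime's `ip == X` as a Lucene query over both IP fields."""
    return f"srcIp:{ip} OR dstIp:{ip}"


def expand_any_port(ports):
    """Approximate Arkime's `port == [..]` as a Lucene query over both port fields."""
    group = "(" + " OR ".join(str(p) for p in ports) + ")"
    return f"srcPort:{group} OR dstPort:{group}"
```

When such an expansion is combined with other terms, it is worth wrapping the whole result in parentheses so that the OR does not bind unexpectedly with neighboring clauses.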

For details on how to filter both Zeek logs and Arkime session records for a particular connection, see Correlating Zeek logs and Arkime sessions.

Other Malcolm features

Automatic file extraction and scanning

Malcolm can leverage Zeek's knowledge of network protocols to automatically detect file transfers and extract those files from PCAPs as Zeek processes them. This behavior can be enabled globally by modifying the ZEEK_EXTRACTOR_MODE environment variable in docker-compose.yml, or on a per-upload basis for PCAP files uploaded via the browser-based upload form when Analyze with Zeek is selected.

To specify which files should be extracted, the following values are acceptable in ZEEK_EXTRACTOR_MODE:

none: no file extraction
interesting: extraction of files with mime types of common attack vectors
mapped: extraction of files with recognized mime types
known: extraction of files for which any mime type can be determined
all: extract all files

Extracted files can be examined through any of the following methods:

submitting file hashes to VirusTotal; to enable this method, specify the VTOT_API2_KEY environment variable in docker-compose.yml
scanning files with ClamAV; to enable this method, set the EXTRACTED_FILE_ENABLE_CLAMAV environment variable in docker-compose.yml to true
scanning files with Yara; to enable this method, set the EXTRACTED_FILE_ENABLE_YARA environment variable in docker-compose.yml to true
scanning PE (portable executable) files with Capa; to enable this method, set the EXTRACTED_FILE_ENABLE_CAPA environment variable in docker-compose.yml to true

Files which are flagged via any of these methods will be logged as Zeek signatures.log entries, and can be viewed in the Signatures dashboard in Kibana.

The EXTRACTED_FILE_PRESERVATION environment variable in docker-compose.yml determines the behavior for preservation of Zeek-extracted files:

quarantined: preserve only flagged files in ./zeek-logs/extract_files/quarantine
all: preserve flagged files in ./zeek-logs/extract_files/quarantine and all other extracted files in ./zeek-logs/extract_files/preserved
none: preserve no extracted files
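The three preservation modes can be read as a small decision table. The following Python sketch encodes that table under the assumption that each extracted file is simply either flagged or not; the function name is hypothetical and this is not the code Malcolm itself runs.

```python
from typing import Optional

QUARANTINE_DIR = "./zeek-logs/extract_files/quarantine"
PRESERVED_DIR = "./zeek-logs/extract_files/preserved"


def preservation_target(mode: str, flagged: bool) -> Optional[str]:
    """Return the directory an extracted file is kept in, or None if it is not kept."""
    if mode == "none":
        return None
    if mode == "quarantined":
        # Only flagged files are preserved
        return QUARANTINE_DIR if flagged else None
    if mode == "all":
        # Flagged files are quarantined; everything else is preserved separately
        return QUARANTINE_DIR if flagged else PRESERVED_DIR
    raise ValueError(f"unknown EXTRACTED_FILE_PRESERVATION value: {mode}")
```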

The EXTRACTED_FILE_HTTP_SERVER_... environment variables in docker-compose.yml configure access to the Zeek-extracted files path through the means of a simple HTTPS directory server. Beware that Zeek-extracted files may contain malware. As such, the files may be optionally encrypted upon download.

Automatic host and subnet name assignment

IP/MAC address to hostname mapping via `host-map.txt`

The host-map.txt file in the Malcolm installation directory can be used to define names for network hosts based on IP and/or MAC addresses in Zeek logs. The default empty configuration looks like this:

# IP or MAC address to host name map:
#   address|host name|required tag
#
# where:
#   address: comma-separated list of IPv4, IPv6, or MAC addresses
#          e.g., 172.16.10.41, 02:42:45:dc:a2:96, 2001:0db8:85a3:0000:0000:8a2e:0370:7334
#
#   host name: host name to be assigned when event address(es) match
#
#   required tag (optional): only check match and apply host name if the event
#                            contains this tag
#

Each non-comment line (not beginning with a #) defines an address-to-name mapping for a network host. For example:

127.0.0.1,127.0.1.1,::1|localhost|
192.168.10.10|office-laptop.intranet.lan|
06:46:0b:a6:16:bf|serial-host.intranet.lan|testbed

Each line consists of three |-separated fields: address(es), hostname, and, optionally, a tag which, if specified, must belong to a log for the matching to occur.

As Zeek logs are processed into Malcolm's Elasticsearch instance, the log's source and destination IP and MAC address fields (zeek.orig_h, zeek.resp_h, zeek.orig_l2_addr, and zeek.resp_l2_addr, respectively) are compared against the lists of addresses in host-map.txt. When a match is found, a new field is added to the log: zeek.orig_hostname or zeek.resp_hostname, depending on whether the matching address belongs to the originating or responding host. If the third field (the "required tag" field) is specified, a log must also contain that value in its tags field in addition to matching the IP or MAC address specified in order for the corresponding _hostname field to be added.

zeek.orig_hostname and zeek.resp_hostname may each contain multiple values. For example, if both a host's source IP address and source MAC address were matched by two different lines, zeek.orig_hostname would contain the hostname values from both matching lines.
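The matching rules for host-map.txt can be sketched in Python as follows. This is a simplified illustration rather than Malcolm's actual enrichment code: the function names are hypothetical, and addresses are compared as lowercase strings, so differently written forms of the same IPv6 address would not match here.

```python
def parse_host_map(text):
    """Parse host-map.txt content into (addresses, host name, required tag) entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("|")
        addresses = {a.strip().lower() for a in parts[0].split(",") if a.strip()}
        name = parts[1].strip() if len(parts) > 1 else ""
        tag = parts[2].strip() if len(parts) > 2 else ""
        entries.append((addresses, name, tag or None))
    return entries


def hostnames_for(addresses, tags, entries):
    """Collect every host name whose entry matches one of the given addresses.

    An entry with a required tag only applies if that tag is among the log's tags.
    """
    wanted = {a.lower() for a in addresses}
    names = []
    for entry_addresses, name, required_tag in entries:
        if required_tag is not None and required_tag not in tags:
            continue
        if wanted & entry_addresses and name not in names:
            names.append(name)
    return names
```

Passing both a host's IP address and its MAC address in addresses shows how a single _hostname field can end up holding values from more than one line.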

CIDR subnet to network segment name mapping via `cidr-map.txt`

The cidr-map.txt file in the Malcolm installation directory can be used to define names for network segments based on IP addresses in Zeek logs. The default empty configuration looks like this:

# CIDR to network segment format:
#   IP(s)|segment name|required tag
#
# where:
#   IP(s): comma-separated list of CIDR-formatted network IP addresses
#          e.g., 10.0.0.0/8, 169.254.0.0/16, 172.16.10.41
#
#   segment name: segment name to be assigned when event IP address(es) match
#
#   required tag (optional): only check match and apply segment name if the event
#                            contains this tag
#

Each non-comment line (not beginning with a #) defines a subnet-to-name mapping for a network segment. For example:

192.168.50.0/24,192.168.40.0/24,10.0.0.0/8|corporate|
192.168.100.0/24|control|
192.168.200.0/24|dmz|
172.16.0.0/12|virtualized|testbed

Each line consists of three |-separated fields: CIDR-formatted subnet IP range(s), subnet name, and, optionally, a tag which, if specified, must belong to a log for the matching to occur.

As Zeek logs are processed into Malcolm's Elasticsearch instance, the log's source and destination IP address fields (zeek.orig_h and zeek.resp_h, respectively) are compared against the lists of addresses in cidr-map.txt. When a match is found, a new field is added to the log: zeek.orig_segment or zeek.resp_segment, depending on whether the matching address belongs to the originating or responding host. If the third field (the "required tag" field) is specified, a log must also contain that value in its tags field in addition to its IP address falling within the subnet specified in order for the corresponding _segment field to be added.

zeek.orig_segment and zeek.resp_segment may each contain multiple values. For example, if cidr-map.txt specifies multiple overlapping subnets on different lines, zeek.orig_segment would contain the segment name values from both matching lines if zeek.orig_h belonged to both subnets.
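Subnet matching against cidr-map.txt can be sketched with Python's standard ipaddress module. As with any such sketch, this is an approximation rather than Malcolm's own code: the function names are hypothetical, and is_cross_segment interprets "different values" as the two segment lists not being equal as sets, which is an assumption about the cross_segment rule described in the next paragraph.

```python
import ipaddress


def parse_cidr_map(text):
    """Parse cidr-map.txt content into (networks, segment name, required tag) entries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("|")
        # strict=False also accepts a bare host address such as 172.16.10.41
        networks = [ipaddress.ip_network(c.strip(), strict=False)
                    for c in parts[0].split(",") if c.strip()]
        name = parts[1].strip() if len(parts) > 1 else ""
        tag = parts[2].strip() if len(parts) > 2 else ""
        entries.append((networks, name, tag or None))
    return entries


def segments_for(ip, tags, entries):
    """Collect every segment name whose subnets contain the given IP address."""
    address = ipaddress.ip_address(ip)
    names = []
    for networks, name, required_tag in entries:
        if required_tag is not None and required_tag not in tags:
            continue
        if any(address in network for network in networks) and name not in names:
            names.append(name)
    return names


def is_cross_segment(orig_segments, resp_segments):
    """Assumed rule: both sides have segments and the two sets of names differ."""
    return bool(orig_segments) and bool(resp_segments) and set(orig_segments) != set(resp_segments)
```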

If both zeek.orig_segment and zeek.resp_segment are added to a log, and if they contain different values, the tag cross_segment will be added to the log's tags field for convenient identification of cross-segment traffic. This traffic could be easily visualized using Arkime's Connections graph, by setting the Src: value to Originating Network Segment and the Dst: value to Responding Network Segment:

Defining hostname and CIDR subnet names interface

As an alternative to manually editing cidr-map.txt and host-map.txt, a Host and Subnet Name Mapping editor is available at https://localhost/name-map-ui/ if you are connecting locally. Upon loading, the editor is populated from cidr-map.txt, host-map.txt and net-map.json.

This editor provides the following controls:

🔎 Search mappings - narrow the list of visible items using a search filter
Type, Address, Name and Tag (column headings) - sort the list of items by clicking a column header
📝 (per item) - modify the selected item
🚫 (per item) - remove the selected item
🖳 host / 🖧 segment, Address, Name, Tag (optional) and 💾 - save the item with these values (either adding a new item or updating the item being modified)
📥 Import - clear the list and replace it with the contents of an uploaded net-map.json file
📤 Export - format and download the list as a net-map.json file
💾 Save Mappings - format and store net-map.json in the Malcolm directory (replacing the existing net-map.json file)
🔁 Restart Logstash - restart log ingestion, parsing and enrichment

Applying mapping changes

When changes are made to any of cidr-map.txt, host-map.txt or net-map.json, Malcolm's Logstash container must be restarted. The easiest way to do this is to restart Malcolm via restart (see Stopping and restarting Malcolm) or by clicking the 🔁 Restart Logstash button in the name mapping interface.

Restarting Logstash may take several minutes, after which log ingestion will be resumed.

Elasticsearch index management

See Index State Management in the Open Distro for Elasticsearch documentation for information on Index State Management policies, managed indices, settings and APIs.

Elasticsearch index management only deals with disk space consumed by Elasticsearch indices: it does not have anything to do with PCAP file storage. The MANAGE_PCAP_FILES environment variable in the docker-compose.yml file can be used to allow Arkime to prune old PCAP files based on available disk space.

Alerting

See Alerting in the Open Distro for Elasticsearch documentation.

When using an email account to send alerts, you must authenticate each sender account before you can send an email. The auth_setup script can be used to securely store the email account credentials:

./scripts/auth_setup 

Store administrator username/password for local Malcolm access? (Y/n): n

(Re)generate self-signed certificates for HTTPS access (Y/n): n

(Re)generate self-signed certificates for a remote log forwarder (Y/n): n

Store username/password for forwarding Logstash events to a secondary, external Elasticsearch instance (y/N): n

Store username/password for email alert sender account (y/N): y

Open Distro alerting destination name: destination_alpha

Email account username: [email protected]
[email protected] password: 
[email protected] password (again): 
Email alert sender account variables stored: opendistro.alerting.destination.email.destination_alpha.password, opendistro.alerting.destination.email.destination_alpha.username

This action should only be performed while Malcolm is stopped: otherwise the credentials will not be stored correctly.

"Best Guess" Fingerprinting for ICS Protocols

There are many ICS (industrial control systems) protocols. While Malcolm's collection of protocol parsers includes a number of them, many, particularly those that are proprietary or less common, are unlikely to be supported with a full parser in the foreseeable future.

In an effort to help identify more ICS traffic, Malcolm can use a "best guess" method based on transport protocol (e.g., TCP or UDP) and port(s) to categorize potential traffic communicating over some ICS protocols without full parser support. This feature involves a mapping table and a Zeek script to look up the transport protocol and destination and/or source port to make a best guess at whether a connection belongs to one of those protocols. These potential ICS communications are categorized by vendor where possible.

Naturally, these lookups could produce false positives, so these connections are displayed in their own dashboard (the Best Guess dashboard found under the ICS section of Malcolm's Kibana dashboards' navigation pane). Values such as IP addresses, ports, or UID can be used to pivot to other dashboards to investigate further.
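The lookup itself is conceptually simple, which is also why false positives are possible. The Python sketch below shows the general shape of a protocol-and-port lookup. Everything in it is hypothetical: Malcolm performs this lookup in a Zeek script with its own mapping table, the table entries here are invented placeholders rather than real ICS protocol assignments, and checking the destination port before the source port is an assumption.

```python
# Hypothetical placeholder entries: (transport protocol, port) -> (vendor, protocol).
# These are NOT taken from Malcolm's mapping table.
BEST_GUESS_TABLE = {
    ("tcp", 10001): ("ExampleVendor A", "example-proto-a"),
    ("udp", 20002): ("ExampleVendor B", "example-proto-b"),
}


def best_guess(proto, orig_port, resp_port):
    """Return a (vendor, protocol) guess for a connection, or None if nothing matches."""
    # Assumption: the destination (responder) port is tried before the source port
    for port in (resp_port, orig_port):
        match = BEST_GUESS_TABLE.get((proto.lower(), port))
        if match is not None:
            return match
    return None
```

Because nothing but the transport protocol and port is examined, any unrelated service that happens to use a listed port would be categorized the same way.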

This feature is disabled by default, but it can be enabled by clearing (setting to '') the value of the ZEEK_DISABLE_BEST_GUESS_ICS environment variable in docker-compose.yml.

Using Beats to forward host logs to Malcolm

Because Malcolm uses components of the open source data analysis platform Elastic Stack, it can accept various host logs sent from Beats, Elastic Stack's lightweight data shippers. See ./scripts/beats for more information.

Malcolm installer ISO

Malcolm's Docker-based deployment model makes Malcolm able to run on a variety of platforms. However, in some circumstances (for example, as a long-running appliance as part of a security operations center, or inside of a virtual machine) it may be desirable to install Malcolm as a dedicated standalone installation.

Malcolm can be packaged into an installer ISO based on the current stable release of Debian. This customized Debian installation is preconfigured with the bare minimum software needed to run Malcolm.

Generating the ISO

Official downloads of the Malcolm installer ISO are not provided: however, it can be built easily on an internet-connected Linux host running current versions of VirtualBox and Vagrant (with the vagrant-reload plugin).

To perform a clean build of the Malcolm installer ISO, navigate to your local Malcolm working copy and run:

$ ./malcolm-iso/build_via_vagrant.sh -f
…
Starting build machine...
Bringing machine 'default' up with 'virtualbox' provider...
…

Building the ISO may take 30 minutes or more depending on your system. As the build finishes, you will see the following message indicating success:

…
Finished, created "/malcolm-build/malcolm-iso/malcolm-3.2.1.iso"
…

By default, Malcolm's Docker images are not packaged with the installer ISO, assuming instead that you will pull the latest images with a docker-compose pull command as described in the Quick start section. If you wish to build an ISO with the latest Malcolm images included, follow the directions to create pre-packaged installation files, which include a tarball with a name like malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz. Then, pass that images tarball to the ISO build script with a -d, like this:

$ ./malcolm-iso/build_via_vagrant.sh -f -d malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz
…

A system installed from the resulting ISO will load the Malcolm Docker images upon first boot. This method is desirable when the ISO is to be installed in an "air gapped" environment or for distribution to non-networked machines.

Installation

The installer is designed to require as little user input as possible. For this reason, there are NO user prompts and confirmations about partitioning and reformatting hard disks for use by the operating system. The installer assumes that all non-removable storage media (e.g., SSD, HDD, NVMe, etc.) are available for use and ⛔ 🆘 😭 💀 will partition and format them without warning 💀 😭 🆘 ⛔ .

The installer will ask for several pieces of information prior to installing the Malcolm base operating system:

Hostname
Domain name
Root password (optional) – a password for the privileged root account, which is rarely needed
User name – the name for the non-privileged service account under which Malcolm runs
User password – a password for the non-privileged service account
Encryption password (optional) – if the encrypted installation option was selected at boot time, the encryption password must be entered every time the system boots

At the end of the installation process, you will be prompted with a few self-explanatory yes/no questions:

Disable IPv6?
Automatically login to the GUI session?
Should the GUI session be locked due to inactivity?
Display the Standard Mandatory DoD Notice and Consent Banner? (only applies when installed on U.S. government information systems)

Following these prompts, the installer will reboot and the Malcolm base operating system will boot.

Setup

When the system boots for the first time, the Malcolm Docker images will load if the installer was built with pre-packaged installation files as described above. Wait for this operation to complete (the progress dialog will disappear when the images have finished loading) before continuing the setup.

Open a terminal (click the red terminal 🗔 icon next to the Debian swirl logo 🍥 menu button in the menu bar). At this point, setup is similar to the steps described in the Quick start section. Navigate to the Malcolm directory (cd ~/Malcolm) and run scripts/auth_setup to configure authentication. If the ISO didn't have pre-packaged Malcolm images, or if you'd like to retrieve the latest updates, run docker-compose pull. Finalize your configuration by running scripts/install.py --configure and following the prompts as illustrated in the installation example.

Once Malcolm is configured, you can start Malcolm via the command line or by clicking the circular yellow Malcolm icon in the menu bar.

Time synchronization

If you wish to set up time synchronization via NTP or htpdate, open a terminal and run sudo configure-interfaces.py. Select Continue, then choose Time Sync. Here you can configure the operating system to keep its time synchronized with an NTP server, another Malcolm instance, or another HTTP/HTTPS server. On the next dialog, choose the time synchronization method you wish to configure.

If htpdate is selected, you will be prompted to enter the IP address or hostname and port of an HTTP/HTTPS server (for a Malcolm instance, port 9200 may be used) and the time synchronization check frequency in minutes. A test connection will be made to determine if the time can be retrieved from the server.

If ntpdate is selected, you will be prompted to enter the IP address or hostname of the NTP server.

Upon configuring time synchronization, a "Time synchronization configured successfully!" message will be displayed.
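The htpdate method relies on the Date header that HTTP/HTTPS servers include in their responses. As a rough illustration of the idea only (this is not how Malcolm or htpdate is implemented, and the function name `clock_offset_seconds` is hypothetical), the sketch below computes how far the local clock is from a server's reported time given the value of such a header:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_offset_seconds(date_header, local_now=None):
    """Return (server time - local time) in seconds.

    `date_header` is the value of an HTTP Date header,
    e.g. "Tue, 11 Jun 2019 09:54:10 GMT".
    `local_now` defaults to the current UTC time; pass it explicitly for testing.
    """
    server_time = parsedate_to_datetime(date_header)
    if server_time.tzinfo is None:
        # Some header forms parse as naive datetimes; HTTP dates are in UTC.
        server_time = server_time.replace(tzinfo=timezone.utc)
    if local_now is None:
        local_now = datetime.now(timezone.utc)
    return (server_time - local_now).total_seconds()
```

A positive result means the local clock is behind the server. Because the Date header has one-second resolution, this approach is coarser than NTP, which is a reasonable trade-off when only HTTP/HTTPS access to the time source is available.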

Hardening

The Malcolm aggregator base operating system targets the following guidelines for establishing a secure configuration posture:

DISA STIG (Security Technical Implementation Guides) ported from DISA RHEL 7 STIG v1r1 to a Debian 9 base platform
CIS Debian Linux 9 Benchmark with additional recommendations by the hardenedlinux/harbian-audit project

STIG compliance exceptions

Currently there are 158 compliance checks that can be verified automatically and 23 compliance checks that must be verified manually.

The Malcolm aggregator base operating system claims the following exceptions to STIG compliance:

#	ID	Title	Justification
1	SV-86535r1	When passwords are changed a minimum of eight of the total number of characters must be changed.	Account/password policy exception: As an aggregator running Malcolm is intended to be used as an appliance rather than a general user-facing software platform, some exceptions to password enforcement policies are claimed.
2	SV-86537r1	When passwords are changed a minimum of four character classes must be changed.	Account/password policy exception
3	SV-86549r1	Passwords for new users must be restricted to a 24 hours/1 day minimum lifetime.	Account/password policy exception
4	SV-86551r1	Passwords must be restricted to a 24 hours/1 day minimum lifetime.	Account/password policy exception
5	SV-86553r1	Passwords for new users must be restricted to a 60-day maximum lifetime.	Account/password policy exception
6	SV-86555r1	Existing passwords must be restricted to a 60-day maximum lifetime.	Account/password policy exception
7	SV-86557r1	Passwords must be prohibited from reuse for a minimum of five generations.	Account/password policy exception
8	SV-86565r1	The operating system must disable account identifiers (individuals, groups, roles, and devices) if the password expires.	Account/password policy exception
9	SV-86567r2	Accounts subject to three unsuccessful logon attempts within 15 minutes must be locked for the maximum configurable period.	Account/password policy exception
10	SV-86569r1	If three unsuccessful root logon attempts within 15 minutes occur the associated account must be locked.	Account/password policy exception
11	SV-86603r1	The … operating system must prevent the installation of software, patches, service packs, device drivers, or operating system components of local packages without verification they have been digitally signed using a certificate that is issued by a Certificate Authority (CA) that is recognized and approved by the organization.	As the base distribution is not using embedded signatures, `debsig-verify` would reject all packages (see comment in `/etc/dpkg/dpkg.cfg`). Enabling it after installation would disallow any future updates.
12	SV-86607r1	USB mass storage must be disabled.	The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system.
13	SV-86609r1	File system automounter must be disabled unless required.	The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system.
14	SV-86705r1	The operating system must shut down upon audit processing failure, unless availability is an overriding concern. If availability is a concern, the system must alert the designated staff (System Administrator [SA] and Information System Security Officer [ISSO] at a minimum) in the event of an audit processing failure.	As maximizing availability is a system requirement, audit processing failures will be logged on the device rather than halting the system.
15	SV-86713r1	The operating system must immediately notify the System Administrator (SA) and Information System Security Officer ISSO (at a minimum) when allocated audit record storage volume reaches 75% of the repository maximum audit record storage capacity.	same as above
16	SV-86715r1	The operating system must immediately notify the System Administrator (SA) and Information System Security Officer (ISSO) (at a minimum) when the threshold for the repository maximum audit record storage capacity is reached.	same as above
17	SV-86597r1	A file integrity tool must verify the baseline operating system configuration at least weekly.	This functionality is not configured by default, but it could be configured post-install using Auditbeat or `aide`
18	SV-86697r2	The file integrity tool must use FIPS 140-2 approved cryptographic hashes for validating file contents and directories.	same as above
19	SV-86707r1	The operating system must off-load audit records onto a different system or media from the system being audited.	same as above
20	SV-86709r1	The operating system must encrypt the transfer of audit records off-loaded onto a different system or media from the system being audited.	same as above
21	SV-86833r1	The system must send rsyslog output to a log aggregation server.	same as above
22	SV-87815r2	The audit system must take appropriate action when there is an error sending audit records to a remote system.	same as above
23	SV-86693r2	The file integrity tool must be configured to verify Access Control Lists (ACLs).	As this is not a multi-user system, the ACL check would be irrelevant.
24	SV-86837r1	The system must use and update a DoD-approved virus scan program.	As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary.
25	SV-86839r1	The system must update the virus scan program every seven days or more frequently.	As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary.
26	SV-86847r2	All network connections associated with a communication session must be terminated at the end of the session or after 10 minutes of inactivity from the user at a command prompt, except to fulfill documented and validated mission requirements.	Malcolm may be controlled from the command line in a manual capture scenario, so timing out a session based on command prompt inactivity would be inadvisable.
27	SV-86893r2	The operating system must, for networked systems, synchronize clocks with a server that is synchronized to one of the redundant United States Naval Observatory (USNO) time servers, a time server designated for the appropriate DoD network (NIPRNet/SIPRNet), and/or the Global Positioning System (GPS).	While time synchronization is supported on the Malcolm aggregator base operating system, an exception is claimed for this rule as the device may be configured to sync to servers other than the ones listed in the STIG.
28	SV-86905r1	For systems using DNS resolution, at least two name servers must be configured.	STIG recommendations for DNS servers are not enforced on the Malcolm aggregator base operating system to allow for use in a variety of network scenarios.
29	SV-86919r1	Network interfaces must not be in promiscuous mode.	One purpose of the Malcolm aggregator base operating system is to sniff and capture network traffic.
30	SV-86931r2	An X Windows display manager must not be installed unless approved.	A locked-down X Windows session is required for the sensor's kiosk display.
31	SV-86519r3	The operating system must set the idle delay setting for all connection types.	As this is a network traffic aggregation and analysis appliance rather than an end-user device, timing out displays or connections would not be desirable.
32	SV-86523r1	The operating system must initiate a session lock for the screensaver after a period of inactivity for graphical user interfaces.	This option is configurable during install time. Some installations of the Malcolm aggregator base operating system may be on appliance hardware not equipped with a keyboard by default, in which case it may not be desirable to lock the session.
33	SV-86525r1	The operating system must initiate a session lock for graphical user interfaces when the screensaver is activated.	This option is configurable during install time. Some installations of the Malcolm aggregator base operating system may be on appliance hardware not equipped with a keyboard by default, in which case it may not be desirable to lock the session.
34	SV-86589r1	The operating system must uniquely identify and must authenticate organizational users (or processes acting on behalf of organizational users) using multifactor authentication.	As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable.
35	SV-86921r2	The system must be configured to prevent unrestricted mail relaying.	Does not apply as the Malcolm aggregator base operating system does not run a mail service.
36	SV-86929r1	If the Trivial File Transfer Protocol (TFTP) server is required, the TFTP daemon must be configured to operate in secure mode.	Does not apply as the Malcolm aggregator base operating system does not run a TFTP server.
37	SV-86935r3	The Network File System (NFS) must be configured to use RPCSEC_GSS.	Does not apply as the Malcolm aggregator base operating system does not run an NFS server.
38	SV-87041r2	The operating system must have the required packages for multifactor authentication installed.	As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable.
39	SV-87051r2	The operating system must implement multifactor authentication for access to privileged accounts via pluggable authentication modules (PAM).	As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable.
40	SV-87059r2	The operating system must implement smart card logons for multifactor authentication for access to privileged accounts.	As this is a network traffic capture appliance rather than an end-user device or a multiuser network host, this requirement is not applicable.
41	SV-87829r1	Wireless network adapters must be disabled.	As an appliance intended to capture network traffic in a variety of network environments, wireless adapters may be needed to capture and/or report wireless traffic.
42	SV-86699r1	The system must not allow removable media to be used as the boot loader unless approved.	The Malcolm aggregator base operating system supports a live boot mode that can be booted from removable media.

Please review the notes for these additional rules. While not claiming an exception, they may be implemented or checked in a different way than outlined by the RHEL STIG as the Malcolm aggregator base operating system is not built on RHEL or for other reasons.

#	ID	Title	Note
1	SV-86585r1	Systems with a Basic Input/Output System (BIOS) must require authentication upon booting into single-user and maintenance modes.	Although the compliance check script does not detect it, booting into recovery mode does in fact require the root password.
2	SV-86587r1	Systems using Unified Extensible Firmware Interface (UEFI) must require authentication upon booting into single-user and maintenance modes.	Although the compliance check script does not detect it, booting into recovery mode does in fact require the root password.
3	SV-86651r1	All files and directories contained in local interactive user home directories must have mode 0750 or less permissive.	Depending on when the compliance check script is run, some ephemeral files may exist in the service account's home directory which will cause this check to fail. For practical purposes the Malcolm aggregator base operating system's configuration does, however, comply.
4	SV-86623r3	Vendor packaged system security patches and updates must be installed and up to date.	When the Malcolm aggregator base operating system is built, all of the latest applicable security patches and updates are included in it. How future updates are to be handled is still in design.
5	SV-86691r2	The operating system must implement NIST FIPS-validated cryptography for the following: to provision digital signatures, to generate cryptographic hashes, and to protect data requiring data-at-rest protections in accordance with applicable federal laws, Executive Orders, directives, policies, regulations, and standards.	The Malcolm aggregator base operating system does use FIPS-compatible libraries for cryptographic functions. However, the kernel parameter being checked by the compliance check script is incompatible with some of the system's initialization scripts.

In addition, DISA STIG rules SV-86663r1, SV-86695r2, SV-86759r3, SV-86761r3, SV-86763r3, SV-86765r3, SV-86595r1, and SV-86615r2 relate to the SELinux kernel which is not used in the Malcolm aggregator base operating system, and are thus skipped.

CIS benchmark compliance exceptions

Currently there are 271 checks to determine compliance with the CIS Debian Linux 9 Benchmark.

The Malcolm aggregator base operating system claims exceptions from the recommendations in this benchmark in the following categories:

1.1 Install Updates, Patches and Additional Security Software - When the Malcolm aggregator appliance software is built, all of the latest applicable security patches and updates are included in it. How future updates are to be handled is still in design.

1.3 Enable verify the signature of local packages - As the base distribution is not using embedded signatures, debsig-verify would reject all packages (see comment in /etc/dpkg/dpkg.cfg). Enabling it after installation would disallow any future updates.

2.14 Add nodev option to /run/shm Partition, 2.15 Add nosuid Option to /run/shm Partition, 2.16 Add noexec Option to /run/shm Partition - The Malcolm aggregator base operating system does not mount /run/shm as a separate partition, so these recommendations do not apply.

2.18 Disable Mounting of cramfs Filesystems, 2.19 Disable Mounting of freevxfs Filesystems, 2.20 Disable Mounting of jffs2 Filesystems, 2.21 Disable Mounting of hfs Filesystems, 2.22 Disable Mounting of hfsplus Filesystems, 2.23 Disable Mounting of squashfs Filesystems, 2.24 Disable Mounting of udf Filesystems - The Malcolm aggregator base operating system is not compiling a custom Linux kernel, so these filesystems are inherently supported as they are part of Debian Linux's default kernel.

4.6 Disable USB Devices - The ability to ingest data (such as PCAP files) from a mounted USB mass storage device is a requirement of the system.

6.1 Ensure the X Window system is not installed, 6.2 Ensure Avahi Server is not enabled, 6.3 Ensure print server is not enabled - An X Windows session is provided for displaying dashboards. The library packages libavahi-common-data, libavahi-common3, and libcups2 are dependencies of some of the X components used by the Malcolm aggregator base operating system, but the avahi and cups services themselves are disabled.

6.17 Ensure virus scan Server is enabled, 6.18 Ensure virus scan Server update is enabled - As this is a network traffic analysis appliance rather than an end-user device, regular user files will not be created. A virus scan program would impact device performance and would be unnecessary.

7.2.4 Log Suspicious Packets, 7.2.7 Enable RFC-recommended Source Route Validation, 7.4.1 Install TCP Wrappers - As Malcolm may operate as a network traffic capture appliance sniffing packets on a network interface configured in promiscuous mode, these recommendations do not apply.

8.4.1 Install aide package and 8.4.2 Implement Periodic Execution of File Integrity - This functionality is not configured by default, but it could be configured post-install using Auditbeat or aide.

8.1.1.2 Disable System on Audit Log Full, 8.1.1.3 Keep All Auditing Information, 8.1.1.5 Ensure set remote_server for audit service, 8.1.1.6 Ensure enable_krb5 set to yes for remote audit service, 8.1.1.7 Ensure set action for audit storage volume is fulled, 8.1.1.9 Set space left for auditd service, a few other audit-related items under section 8.1, 8.2.5 Configure rsyslog to Send Logs to a Remote Log Host - As maximizing availability is a system requirement, audit processing failures will be logged on the device rather than halting the system. auditd is set up to syslog when its local storage capacity is reached.

Password-related recommendations under 9.2 and 10.1 - The library package libpam-pwquality is used in favor of libpam-cracklib, which is what the compliance scripts are looking for. Also, as an aggregator running Malcolm is intended to be used as an appliance rather than a general user-facing software platform, some exceptions to password enforcement policies are claimed.

9.3.13 Limit Access via SSH - The Malcolm aggregator base operating system does not create multiple regular user accounts: only root and an aggregator service account are used. SSH access for root is disabled. SSH login with a password is also disallowed: only key-based authentication is accepted. The service account accepts no keys by default. As such, the AllowUsers, AllowGroups, DenyUsers, and DenyGroups values in sshd_config do not apply.
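The SSH posture described in 9.3.13 (no root login, no password login) can be spot-checked by reading the relevant sshd_config keywords. The sketch below is an illustration only and is not one of the compliance scripts; the function names are hypothetical. It relies on `PermitRootLogin` and `PasswordAuthentication` being standard sshd_config keywords and on sshd using the first value obtained for each keyword, and it deliberately ignores Match blocks and `keyword=value` syntax for simplicity.

```python
def sshd_effective_options(config_text):
    """Parse sshd_config-style text into {lowercased keyword: value}.

    The first occurrence of a keyword wins, mirroring sshd's behavior.
    Simplification: Match blocks and "keyword=value" syntax are not handled.
    """
    options = {}
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword without a value
        options.setdefault(parts[0].lower(), parts[1].strip())
    return options


def ssh_policy_matches(config_text):
    """True if root login and password authentication are both explicitly disabled."""
    opts = sshd_effective_options(config_text)
    return (
        opts.get("permitrootlogin", "").lower() == "no"
        and opts.get("passwordauthentication", "").lower() == "no"
    )
```

Calling `ssh_policy_matches(open("/etc/ssh/sshd_config").read())` on an installed system would be one way to confirm the two settings; a missing keyword is treated as not matching because the sshd defaults for these options are more permissive than "no".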

9.5 Restrict Access to the su Command - The Malcolm aggregator base operating system does not create multiple regular user accounts: only root and an aggregator service account are used.

10.1.10 Set maxlogins for all accounts and 10.5 Set Timeout on ttys - The Malcolm aggregator base operating system does not create multiple regular user accounts: only root and an aggregator service account are used.

12.10 Find SUID System Executables, 12.11 Find SGID System Executables - The few files found by these scripts are valid exceptions required by the Malcolm aggregator base operating system's core requirements.

Please review the notes for these additional guidelines. While not claiming an exception, the Malcolm aggregator base operating system may implement them in a manner different than is described by the CIS Debian Linux 9 Benchmark or the hardenedlinux/harbian-audit audit scripts.

4.1 Restrict Core Dumps - The Malcolm aggregator base operating system disables core dumps using a configuration file for ulimit named /etc/security/limits.d/limits.conf. The audit script checking for this does not check the limits.d subdirectory, which is why this is incorrectly flagged as noncompliant.
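To illustrate the point about the limits.d subdirectory, here is a minimal sketch of a check that looks at both /etc/security/limits.conf and the files under /etc/security/limits.d. It is not the audit script referred to above, the function names are hypothetical, and the exact entry Malcolm writes is not shown in this document, so the sketch simply looks for any `core` item whose hard limit is 0 (for example, a line such as `* hard core 0`).

```python
from pathlib import Path


def core_limit_entries(etc_security="/etc/security"):
    """Collect (file, domain, type, value) for every 'core' item found.

    Looks in limits.conf and in limits.d/*.conf under `etc_security`.
    """
    base = Path(etc_security)
    files = [base / "limits.conf"]
    limits_d = base / "limits.d"
    if limits_d.is_dir():
        files.extend(sorted(limits_d.glob("*.conf")))

    entries = []
    for path in files:
        if not path.is_file():
            continue
        for raw_line in path.read_text().splitlines():
            # Entry format: <domain> <type> <item> <value>; '#' starts a comment.
            fields = raw_line.split("#", 1)[0].split()
            if len(fields) == 4 and fields[2] == "core":
                entries.append((str(path), fields[0], fields[1], fields[3]))
    return entries


def core_dumps_disabled(etc_security="/etc/security"):
    """True if any entry sets the hard core-file size limit to 0."""
    return any(
        limit_type in ("hard", "-") and value == "0"
        for _, _, limit_type, value in core_limit_entries(etc_security)
    )
```

The `etc_security` parameter exists only so the check can be pointed at a test directory; on a real system the default path applies.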

5.4 Ensure ctrl-alt-del is disabled - The Malcolm aggregator base operating system disables the ctrl+alt+delete key sequence by executing systemctl disable ctrl-alt-del.target during installation and the command systemctl mask ctrl-alt-del.target at boot time.

6.19 Configure Network Time Protocol (NTP) - While time synchronization is supported on the Malcolm aggregator base operating system, an exception is claimed for this rule as the device may be configured to sync to servers in a different way than specified in the benchmark.

7.4.4 Create /etc/hosts.deny, 7.7.1 Ensure Firewall is active, 7.7.4.1 Ensure default deny firewall policy, 7.7.4.3 Ensure default deny firewall policy, 7.7.4.4 Ensure outbound and established connections are configured - The Malcolm aggregator base operating system is configured with an appropriately locked-down software firewall (managed by "Uncomplicated Firewall" ufw). However, the methods outlined in the CIS benchmark recommendations do not account for this configuration.

8.7 Verifies integrity all packages - The script which verifies package integrity only "fails" because of missing (status ??5?????? displayed by the utility) language ("locale") files, which are removed as part of the Malcolm aggregator base operating system's trimming-down process. All non-locale-related system files pass integrity checks.

Known issues

PCAP file export error when Zeek logs are in Arkime search results

Arkime has a nice feature that allows you to export PCAP files matching the filters currently populating the search field. However, Arkime viewer will raise an exception if records created from Zeek logs are found among the search results to be exported. For this reason, if you are using the export PCAP feature it is recommended that you apply the PCAP Files view to filter your search results prior to doing the export.

Manual Kibana index pattern refresh

Because some fields are created in Elasticsearch dynamically when Zeek logs are ingested by Logstash, they may not have been present when Kibana configures its index pattern field mapping during initialization. As such, those fields will not show up in Kibana visualizations until Kibana’s copy of the field list is refreshed. Malcolm periodically refreshes this list, but if fields are missing from your visualizations you may wish to do it manually.

After Malcolm ingests your data (or, more specifically, after it has ingested a new log type it has not seen before) you may manually refresh Kibana’s field list by clicking Management → Index Patterns, then selecting the sessions2-* index pattern and clicking the reload 🗘 button near the upper-right of the window.

Installation example using Ubuntu 20.04 LTS

Here's a step-by-step example of getting Malcolm from GitHub, configuring your system and your Malcolm instance, and running it on a system running Ubuntu Linux. Your mileage may vary depending on your individual system configuration, but this should be a good starting point.

The commands in this example should be executed as a non-root user.

You can use git to clone Malcolm into a local working copy, or you can download and extract the artifacts from the latest release.

To install Malcolm from the latest Malcolm release, browse to the Malcolm releases page on GitHub and download at a minimum install.py and the malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz file, then navigate to your downloads directory:

user@host:~$ cd Downloads/
user@host:~/Downloads$ ls
malcolm_common.py install.py  malcolm_20190611_095410_ce2d8de.tar.gz

If you are obtaining Malcolm using git instead, run the following command to clone Malcolm into a local working copy:

user@host:~$ git clone https://github.com/cisagov/Malcolm
Cloning into 'Malcolm'...
remote: Enumerating objects: 443, done.
remote: Counting objects: 100% (443/443), done.
remote: Compressing objects: 100% (310/310), done.
remote: Total 443 (delta 81), reused 441 (delta 79), pack-reused 0
Receiving objects: 100% (443/443), 6.87 MiB | 18.86 MiB/s, done.
Resolving deltas: 100% (81/81), done.

user@host:~$ cd Malcolm/

Next, run the install.py script to configure your system. Replace user in this example with your local account username, and follow the prompts. Most questions have an acceptable default you can accept by pressing the Enter key. Depending on whether you are installing Malcolm from the release tarball or inside of a git working copy, the questions below will be slightly different, but for the most part are the same.

user@host:~/Downloads$ sudo ./install.py
Installing required packages: ['apache2-utils', 'make', 'openssl']

"docker info" failed, attempt to install Docker? (Y/n): y

Attempt to install Docker using official repositories? (Y/n): y
Installing required packages: ['apt-transport-https', 'ca-certificates', 'curl', 'gnupg-agent', 'software-properties-common']
Installing docker packages: ['docker-ce', 'docker-ce-cli', 'containerd.io']
Installation of docker packages apparently succeeded

Add a non-root user to the "docker" group? (y/n): y

Enter user account: user

Add another non-root user to the "docker" group? (y/n): n

"docker-compose version" failed, attempt to install docker-compose? (Y/n): y

Install docker-compose directly from docker github? (Y/n): y
Download and installation of docker-compose apparently succeeded


fs.file-max increases allowed maximum for file handles
fs.file-max= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y

fs.inotify.max_user_watches increases allowed maximum for monitored files
fs.inotify.max_user_watches= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y

fs.inotify.max_queued_events increases queue size for monitored files
fs.inotify.max_queued_events= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y

fs.inotify.max_user_instances increases allowed maximum monitor file watchers
fs.inotify.max_user_instances= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


vm.max_map_count increases allowed maximum for memory segments
vm.max_map_count= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


net.core.somaxconn increases allowed maximum for socket connections
net.core.somaxconn= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


vm.swappiness adjusts the preference of the system to swap vs. drop runtime memory pages
vm.swappiness= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


vm.dirty_background_ratio defines the percentage of system memory fillable with "dirty" pages before flushing
vm.dirty_background_ratio= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


vm.dirty_ratio defines the maximum percentage of dirty system memory before committing everything
vm.dirty_ratio= appears to be missing from /etc/sysctl.conf, append it? (Y/n): y


/etc/security/limits.d/limits.conf increases the allowed maximums for file handles and memlocked segments
/etc/security/limits.d/limits.conf does not exist, create it? (Y/n): y
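The prompts above all follow the same pattern: a kernel setting is offered for appending when it does not already appear in /etc/sysctl.conf. The sketch below shows that kind of presence check in isolation. It is not the logic of install.py, the function name `missing_sysctl_keys` is hypothetical, and it reports only which keys are absent without suggesting values, since the values install.py writes are not shown in this example.

```python
# Keys taken from the prompts in the transcript above.
SYSCTL_KEYS = (
    "fs.file-max",
    "fs.inotify.max_user_watches",
    "fs.inotify.max_queued_events",
    "fs.inotify.max_user_instances",
    "vm.max_map_count",
    "net.core.somaxconn",
    "vm.swappiness",
    "vm.dirty_background_ratio",
    "vm.dirty_ratio",
)


def missing_sysctl_keys(sysctl_conf_text, keys=SYSCTL_KEYS):
    """Return the keys from `keys` that have no assignment in the given text."""
    present = set()
    for raw_line in sysctl_conf_text.splitlines():
        line = raw_line.strip()
        # sysctl.conf treats lines starting with '#' or ';' as comments.
        if not line or line[0] in "#;":
            continue
        if "=" in line:
            present.add(line.split("=", 1)[0].strip())
    return [key for key in keys if key not in present]
```

Running `missing_sysctl_keys(open("/etc/sysctl.conf").read())` before and after the installer would show which settings were added; after a reboot, the settings themselves take effect.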

At this point, if you are installing from a release tarball you will be asked if you would like to extract the contents of the tarball and to specify the installation directory:

Extract Malcolm runtime files from /home/user/Downloads/malcolm_20190611_095410_ce2d8de.tar.gz (Y/n): y

Enter installation path for Malcolm [/home/user/Downloads/malcolm]: /home/user/Malcolm
Malcolm runtime files extracted to /home/user/Malcolm

Alternatively, if you are configuring Malcolm from within a git working copy, install.py will now exit. Run install.py again like you did at the beginning of the example, only remove the sudo and add --configure to run install.py in "configuration only" mode.

user@host:~/Malcolm$ ./scripts/install.py --configure

Now that any necessary system configuration changes have been made, the local Malcolm instance will be configured:

Malcolm processes will run as UID 1000 and GID 1000. Is this OK? (Y/n): 

Setting 10g for Elasticsearch and 3g for Logstash. Is this OK? (Y/n): y

Restart Malcolm upon system or Docker daemon restart? (y/N): y

Select Malcolm restart behavior ('no', 'on-failure', 'always', 'unless-stopped'): unless-stopped

Authenticate against Lightweight Directory Access Protocol (LDAP) server? (y/N): n

Configure snapshot repository for Elasticsearch index state management? (y/N): n

Store snapshots locally in /home/user/Malcolm/elasticsearch-backup? (Y/n): y

Automatically analyze all PCAP files with Zeek? (y/N): y

Perform reverse DNS lookup locally for source and destination IP addresses in Zeek logs? (y/N): n

Perform hardware vendor OUI lookups for MAC addresses? (Y/n): y

Expose Logstash port to external hosts? (y/N): n

Forward Logstash logs to external Elasticstack instance? (y/N): n

Enable file extraction with Zeek? (y/N): y

Select file extraction behavior ('none', 'known', 'mapped', 'all', 'interesting'): interesting

Select file preservation behavior ('quarantined', 'all', 'none'): quarantined

Scan extracted files with ClamAV? (y/N): y

Scan extracted files with Yara? (y/N): y

Scan extracted PE files with Capa? (y/N): y

Lookup extracted file hashes with VirusTotal? (y/N): n

Download updated scanner signatures periodically? (Y/n): y

Should Malcolm capture network traffic to PCAP files? (y/N): y

Specify capture interface(s) (comma-separated): eth0

Capture packets using netsniff-ng? (Y/n): y

Capture packets using tcpdump? (y/N): n

Malcolm has been installed to /home/user/Malcolm. See README.md for more information.
Scripts for starting and stopping Malcolm and changing authentication-related settings can be found
in /home/user/Malcolm/scripts.

At this point you should reboot your computer so that the new system settings can be applied. After rebooting, log back in and return to the directory to which Malcolm was installed (or to which the git working copy was cloned).

Now we need to set up authentication and generate some unique self-signed SSL certificates. You can replace analyst in this example with whatever username you wish to use to log in to the Malcolm web interface.

user@host:~/Malcolm$ ./scripts/auth_setup
Store administrator username/password for local Malcolm access? (Y/n): 

Administrator username: analyst
analyst password: 
analyst password (again): 

(Re)generate self-signed certificates for HTTPS access (Y/n): 

(Re)generate self-signed certificates for a remote log forwarder (Y/n): 

Store username/password for forwarding Logstash events to a secondary, external Elasticsearch instance (y/N): 

Store username/password for email alert sender account (y/N):

For now, rather than build Malcolm from scratch, we'll pull images from Docker Hub:

user@host:~/Malcolm$ docker-compose pull
Pulling elasticsearch ... done
Pulling file-monitor  ... done
Pulling filebeat      ... done
Pulling freq          ... done
Pulling htadmin       ... done
Pulling kibana        ... done
Pulling logstash      ... done
Pulling arkime        ... done
Pulling name-map-ui   ... done
Pulling nginx-proxy   ... done
Pulling pcap-capture  ... done
Pulling pcap-monitor  ... done
Pulling upload        ... done
Pulling zeek          ... done

user@host:~/Malcolm$ docker images
REPOSITORY                                          TAG                 IMAGE ID            CREATED             SIZE
malcolmnetsec/arkime                                3.2.1               xxxxxxxxxxxx        39 hours ago        683MB
malcolmnetsec/elasticsearch-od                      3.2.1               xxxxxxxxxxxx        40 hours ago        690MB
malcolmnetsec/file-monitor                          3.2.1               xxxxxxxxxxxx        39 hours ago        470MB
malcolmnetsec/file-upload                           3.2.1               xxxxxxxxxxxx        39 hours ago        199MB
malcolmnetsec/filebeat-oss                          3.2.1               xxxxxxxxxxxx        39 hours ago        555MB
malcolmnetsec/freq                                  3.2.1               xxxxxxxxxxxx        39 hours ago        390MB
malcolmnetsec/htadmin                               3.2.1               xxxxxxxxxxxx        39 hours ago        180MB
malcolmnetsec/kibana-helper                         3.2.1               xxxxxxxxxxxx        40 hours ago        141MB
malcolmnetsec/kibana-od                             3.2.1               xxxxxxxxxxxx        40 hours ago        1.16GB
malcolmnetsec/logstash-oss                          3.2.1               xxxxxxxxxxxx        39 hours ago        1.41GB
malcolmnetsec/name-map-ui                           3.2.1               xxxxxxxxxxxx        39 hours ago        137MB
malcolmnetsec/nginx-proxy                           3.2.1               xxxxxxxxxxxx        39 hours ago        120MB
malcolmnetsec/pcap-capture                          3.2.1               xxxxxxxxxxxx        39 hours ago        111MB
malcolmnetsec/pcap-monitor                          3.2.1               xxxxxxxxxxxx        39 hours ago        157MB
malcolmnetsec/zeek                                  3.2.1               xxxxxxxxxxxx        39 hours ago        887MB

Finally, we can start Malcolm. When Malcolm starts it will stream informational and debug messages to the console. If you wish, you can safely close the console or use Ctrl+C to stop these messages; Malcolm will continue running in the background.

user@host:~/Malcolm$ ./scripts/start
Creating network "malcolm_default" with the default driver
Creating malcolm_elasticsearch_1 ... done
Creating malcolm_file-monitor_1  ... done
Creating malcolm_filebeat_1      ... done
Creating malcolm_freq_1          ... done
Creating malcolm_htadmin_1       ... done
Creating malcolm_kibana_1        ... done
Creating malcolm_logstash_1      ... done
Creating malcolm_name-map-ui_1   ... done
Creating malcolm_arkime_1        ... done
Creating malcolm_nginx-proxy_1   ... done
Creating malcolm_pcap-capture_1  ... done
Creating malcolm_pcap-monitor_1  ... done
Creating malcolm_upload_1        ... done
Creating malcolm_zeek_1          ... done

In a few minutes, Malcolm services will be accessible via the following URLs:
------------------------------------------------------------------------------
  - Arkime: https://localhost/
  - Kibana: https://localhost/kibana/
  - PCAP Upload (web): https://localhost/upload/
  - PCAP Upload (sftp): sftp://[email protected]:8022/files/
  - Host and subnet name mapping editor: https://localhost/name-map-ui/
  - Account management: https://localhost:488/
…
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
…
Attaching to malcolm_elasticsearch_1, malcolm_file-monitor_1, malcolm_filebeat_1, malcolm_freq_1, malcolm_htadmin_1, malcolm_kibana_1, malcolm_logstash_1, malcolm_name-map-ui_1, malcolm_arkime_1, malcolm_nginx-proxy_1, malcolm_pcap-capture_1, malcolm_pcap-monitor_1, malcolm_upload_1, malcolm_zeek_1
…

It will take several minutes for all of Malcolm's components to start up. Logstash takes the longest, probably 3 to 5 minutes. Logstash is fully ready when its startup messages end with lines like these:

…
logstash_1  | [2019-06-11T15:45:42,009][INFO ][logstash.agent    ] Pipelines running {:count=>4, :running_pipelines=>[:"malcolm-output", :"malcolm-input", :"malcolm-zeek", :"malcolm-enrichment"], :non_running_pipelines=>[]}
logstash_1  | [2019-06-11T15:45:42,599][INFO ][logstash.agent    ] Successfully started Logstash API endpoint {:port=>9600}
…

You can now open a web browser and navigate to one of the Malcolm user interfaces.
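
The readiness check described above can also be done programmatically. The following is a minimal sketch, assuming the log lines look like the sample shown above; the function names are illustrative and not part of Malcolm.

```python
import re
from typing import Iterable, Optional

# Marker text taken from the sample Logstash output above.
READY_MARKER = "Successfully started Logstash API endpoint"
# Matches "Pipelines running {:count=>4, ..." and captures the count.
PIPELINES_RE = re.compile(r"Pipelines running \{:count=>(\d+)")


def logstash_ready(lines: Iterable[str]) -> bool:
    """Return True if any Logstash line reports that the API endpoint started."""
    return any("logstash" in line and READY_MARKER in line for line in lines)


def running_pipeline_count(lines: Iterable[str]) -> Optional[int]:
    """Return the count from the last 'Pipelines running' line, or None."""
    count = None
    for line in lines:
        match = PIPELINES_RE.search(line)
        if match:
            count = int(match.group(1))
    return count


sample = [
    'logstash_1  | [2019-06-11T15:45:42,009][INFO ][logstash.agent    ] '
    'Pipelines running {:count=>4, :running_pipelines=>[:"malcolm-output", '
    ':"malcolm-input", :"malcolm-zeek", :"malcolm-enrichment"], '
    ':non_running_pipelines=>[]}',
    'logstash_1  | [2019-06-11T15:45:42,599][INFO ][logstash.agent    ] '
    'Successfully started Logstash API endpoint {:port=>9600}',
]

print(logstash_ready(sample))
print(running_pipeline_count(sample))
```

In practice the lines could come from the output of ./scripts/logs read on standard input rather than from a hard-coded list.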

Upgrading Malcolm

At this time there is not an "official" upgrade procedure to get from one version of Malcolm to the next, as it may vary from platform to platform. However, the process is fairly simple and can be done by following these steps:

Update the underlying system

You may wish to get the official updates for the underlying system's software packages before you proceed. Consult the documentation of your operating system for how to do this.

If you are upgrading a Malcolm instance installed from the Malcolm installation ISO, follow scenario 2 below. Due to the Malcolm base operating system's hardened configuration, when updating the underlying system, temporarily set the umask value to the Debian default (umask 0022 in the root shell in which updates are being performed) instead of the more restrictive Malcolm default. This will allow updates to be applied with the right permissions.
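
To illustrate why the umask matters here, the sketch below computes the permission bits a newly created file or directory ends up with under two umask values. The Debian default of 0022 comes from the text above; the restrictive value 0077 is only an assumed example, since the actual Malcolm default is not stated here.

```python
def mode_after_umask(requested: int, umask: int) -> int:
    """Permission bits left after the umask clears its bits from the request."""
    return requested & ~umask & 0o777


DEBIAN_DEFAULT_UMASK = 0o022
RESTRICTIVE_UMASK = 0o077  # assumed value, for illustration only

results = {}
for label, mask in (("debian", DEBIAN_DEFAULT_UMASK), ("restrictive", RESTRICTIVE_UMASK)):
    # Programs typically request 0666 for files and 0777 for directories.
    results[label] = (
        oct(mode_after_umask(0o666, mask)),
        oct(mode_after_umask(0o777, mask)),
    )

print(results)
```

With the default umask, files installed by a package update stay readable by other users; with a restrictive one they would be readable by root only, which is the situation the temporary umask change avoids.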

Scenario 1: Malcolm is a GitHub clone

If you checked out a working copy of the Malcolm repository from GitHub with a git clone command, here are the basic steps for performing an upgrade:

1. stop Malcolm
   - ./scripts/stop
2. stash changes to docker-compose.yml and other files
   - git stash save "pre-upgrade Malcolm configuration changes"
3. pull changes from GitHub repository
   - git pull --rebase
4. pull new Docker images (this will take a while)
   - docker-compose pull
5. apply saved configuration change stashed earlier
   - git stash pop
6. if you see Merge conflict messages, resolve the conflicts with your favorite text editor
7. you may wish to re-run install.py --configure as described in System configuration and tuning in case there are any new docker-compose.yml parameters for Malcolm that need to be set up
8. start Malcolm
   - ./scripts/start
9. you may be prompted to configure authentication if there are new authentication-related files that need to be generated
   - you probably do not need to re-generate self-signed certificates
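
The command-line part of the steps above can be sketched as a small script. This is only an illustration, not an official upgrade tool: the commands are copied from the list above, while the run_upgrade helper and its dry_run option are hypothetical. Starting Malcolm is deliberately left out, because merge conflicts and configuration should be reviewed by hand first.

```python
import shlex
import subprocess

# Commands copied from the Scenario 1 steps above, in order.
UPGRADE_STEPS = [
    ["./scripts/stop"],
    ["git", "stash", "save", "pre-upgrade Malcolm configuration changes"],
    ["git", "pull", "--rebase"],
    ["docker-compose", "pull"],
    ["git", "stash", "pop"],
]


def run_upgrade(steps=UPGRADE_STEPS, dry_run=True, cwd="."):
    """Run each step in order and stop at the first failure.

    With dry_run=True nothing is executed; the commands are only returned
    as shell-quoted strings so they can be reviewed first.
    """
    planned = []
    for cmd in steps:
        planned.append(shlex.join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a command fails,
            # so later steps never run after an earlier one breaks.
            subprocess.run(cmd, cwd=cwd, check=True)
    return planned


plan = run_upgrade()  # dry run: nothing is executed
for line in plan:
    print(line)
```

Calling run_upgrade(dry_run=False, cwd="/home/user/Malcolm") would execute the steps in the installation directory used in the earlier example.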

Scenario 2: Malcolm was installed from a packaged tarball

If you installed Malcolm from pre-packaged installation files, here are the basic steps to perform an upgrade:

1. stop Malcolm
   - ./scripts/stop
2. uncompress the new pre-packaged installation files (using malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz as an example, the file and/or directory names will be different depending on the release)
   - tar xf malcolm_YYYYMMDD_HHNNSS_xxxxxxx.tar.gz
3. back up current Malcolm scripts, configuration files and certificates
   - mkdir -p ./upgrade_backup_$(date +%Y-%m-%d)
   - cp -r filebeat/ htadmin/ logstash/ nginx/ auth.env cidr-map.txt docker-compose.yml host-map.txt net-map.json ./scripts ./README.md ./upgrade_backup_$(date +%Y-%m-%d)/
4. replace scripts and local documentation in your existing installation with the new ones
   - rm -rf ./scripts ./README.md
   - cp -r ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/scripts ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/README.md ./
5. replace (overwrite) docker-compose.yml file with new version
   - cp ./malcolm_YYYYMMDD_HHNNSS_xxxxxxx/docker-compose.yml ./docker-compose.yml
6. re-run ./scripts/install.py --configure as described in System configuration and tuning
7. using a file comparison tool (e.g., diff, meld, Beyond Compare, etc.), compare docker-compose.yml and the docker-compose.yml file you backed up in step 3, and manually migrate over any customizations you wish to preserve from that file (e.g., PCAP_FILTER, MAXMIND_GEOIP_DB_LICENSE_KEY, MANAGE_PCAP_FILES; anything else you may have edited by hand in docker-compose.yml that's not prompted for in install.py --configure)
8. pull the new Docker images (this will take a while)
   - docker-compose pull to pull them from Docker Hub, or docker load -i malcolm_YYYYMMDD_HHNNSS_xxxxxxx_images.tar.gz if you have an offline tarball of the Malcolm Docker images
9. start Malcolm
   - ./scripts/start
10. you may be prompted to configure authentication if there are new authentication-related files that need to be generated
   - you probably do not need to re-generate self-signed certificates
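
The comparison of the backed-up and the new docker-compose.yml can be done with any diff tool. As one possible sketch, Python's standard difflib module lists the lines that differ. The sample file contents below are made up for illustration and do not reflect the real layout of Malcolm's docker-compose.yml.

```python
import difflib


def compose_diff(old_text: str, new_text: str) -> list:
    """Unified diff between the backed-up and the new docker-compose.yml."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="upgrade_backup/docker-compose.yml",
            tofile="docker-compose.yml",
            lineterm="",
        )
    )


def changed_lines(diff_lines: list) -> list:
    """Keep only added and removed lines, skipping the two file-header lines."""
    body = diff_lines[2:]
    return [line for line in body if line.startswith(("+", "-"))]


# Hypothetical excerpts: the old file carries a hand-edited PCAP_FILTER value.
old_compose = (
    "    environment:\n"
    "      PCAP_FILTER : 'not port 22'\n"
    "      MANAGE_PCAP_FILES : 'true'\n"
)
new_compose = (
    "    environment:\n"
    "      PCAP_FILTER : ''\n"
    "      MANAGE_PCAP_FILES : 'true'\n"
)

customizations = changed_lines(compose_diff(old_compose, new_compose))
for line in customizations:
    print(line)
```

Lines starting with "-" are values from the backup that the new file no longer has; those are the candidates to migrate by hand.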

Post-upgrade

Monitoring Malcolm

If you are technically minded, you may wish to follow the debug output provided by ./scripts/start (or ./scripts/logs if you need to re-open the log stream after you've closed it), although there is a lot of output and it may be hard to tell whether something is wrong.

Running docker-compose ps -a should give you a good idea of whether all of Malcolm's Docker containers started up and, in some cases, whether the containers are "healthy".
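
Reading that table by eye can be tedious, so here is a minimal sketch that picks out containers whose state is anything other than "Up" or "Up (healthy)". It assumes the column layout printed by docker-compose 1.x (columns separated by two or more spaces); the sample rows, including the "Exit 1" state, are made up for illustration.

```python
import re


def containers_needing_attention(ps_output: str) -> dict:
    """Map container name to state for every container not plainly up or healthy."""
    problems = {}
    for line in ps_output.splitlines():
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        # Skip the header, the dashed separator and wrapped port lines.
        if len(fields) < 3 or fields[0] == "Name":
            continue
        name, state = fields[0], fields[2]
        if state not in ("Up", "Up (healthy)"):
            problems[name] = state
    return problems


sample_ps = """\
         Name                        Command                       State                    Ports
---------------------------------------------------------------------------------------------------------
malcolm_filebeat_1        /usr/local/bin/docker-entr ...   Up
malcolm_kibana_1          /usr/local/bin/dumb-init - ...   Up (health: starting)   28991/tcp, 5601/tcp
malcolm_logstash_1        /usr/local/bin/logstash-st ...   Up (healthy)            5000/tcp, 5044/tcp, 9600/tcp
malcolm_zeek_1            /usr/bin/supervisord -c /e ...   Exit 1
"""

attention = containers_needing_attention(sample_ps)
print(attention)
```

A container reported as "Up (health: starting)" is usually just still initializing, so it is worth checking again after a few minutes before investigating further.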

After upgrading following one of the previous outlines, give Malcolm several minutes to get started. Once things are up and running, open one of Malcolm's web interfaces to verify that things are working.

Loading new Kibana dashboards and visualizations

Once the upgraded Malcolm instance has started up, you'll probably want to import the new dashboards and visualizations for Kibana. You can signal Malcolm to load the new visualizations by opening Kibana, clicking Management → Index Patterns, then selecting the sessions2-* index pattern and clicking the delete 🗑 button near the upper-right of the window. Confirm the Delete index pattern? prompt by clicking Delete. Close the Kibana browser window. After a few minutes the missing index pattern will be detected and Kibana will be signalled to load its new dashboards and visualizations.

Copyright

Malcolm is Copyright 2021 Battelle Energy Alliance, LLC, and is developed and released through the cooperation of the Cybersecurity and Infrastructure Security Agency of the U.S. Department of Homeland Security.

See License.txt for the terms of its release.

Contact information of author(s):

Seth Grover

File upload not Working (RHEL7, Malcolm 1.6.x)

Seems the file upload feature isn't working correctly. I've tried using both https:///upload and https://:8443 interfaces. Still can't see pcap files in moloch.

When I upload a 40 MB file to https:/upload/ I receive an nginx 413 error. When I try the :8443 interface, I can see the file makes it to the bind mount, but it doesn't seem to get distributed past there.

opened by joel858 14

Installation doesn't work anymore (old title: auth_setup docker not found)

Hi, when I execute the auth_setup script it says docker was not found and tells me to run install.py, even though Docker is installed. I tested commenting out lines 830-838 of auth_setup and then it worked. This is my first time using GitHub, so I hope this is the right place for it.

Edit: the script now runs but does not work; it seems docker-compose is not found. I tried removing it and installing it in different ways. I installed it a week or two ago and it worked then. I have tried different machines.

opened by Thomislav 10

Many permission issues when run as uid/gid other than 1000/1000

Note: Though I fully intend to run malcolm as a reduced privilege user, I am presently just trying to get it working before getting it working right/securely.

I am guessing that running as root is not a tested/recommended configuration, as there are many breakages. If this is the case, it is likely that all that is needed is documentation, and then this can be closed "won't fix." As such I'm not going to try further with this configuration, but thought I'd add an issue for anyone else who runs into this. If it is a supported configuration, then there are likely a lot of fixes needed, so this issue should probably be broken up into multiple issues.

Note: I am using su and not sudo so there could be potential differences, especially related to environmental variables, from doing the same from sudo or directly logging in as root

I have run the following on a minimal debian buster as root.

git clone
./scripts/install.py
./scripts/install.py --configure
auth_setup
reboot
docker-compose pull
./scripts/start

I did not configure any unprivileged users for docker nor in the install.py interface.

This results in many errors logged to the console and repeated failed starts of services. The basic issue is a combination of the permissions on configuration files, data and data directories being created with only root having access, and in-container services being run as the user I originally su'ed from (schallee in this case). Many of these permissions are likely from the git clone but could also be caused by the Some of these I fixed before the details below

The user running the in-container services is visible in this snippet from pstree -u, where each user privilege transition shows the user name in parentheses after the command (e.g., nginx(schallee)):

 |-containerd-+-containerd-shim-+-supervisor.sh---supervisord
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord(schallee)-+-name-map-save-w-+-inotifywait
        |            |                 |                       |                 `-name-map-save-w
        |            |                 |                       |-nginx---nginx
        |            |                 |                       `-php-fpm7
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord-+-nginx---4*[nginx(schallee)]
        |            |                 |             `-php-fpm7.3---2*[php-fpm7.3(schallee)]
        |            |                 `-9*[{containerd-shim}]
        |            |-2*[containerd-shim-+-supervisord]
        |            |                    `-9*[{containerd-shim}]]
        |            |-containerd-shim-+-supervisord-+-initmoloch.sh---elastic_search_---curl---{curl}
        |            |                 |             |-python3(schallee)
        |            |                 |             |-python3(schallee)---6*[{python3}]
        |            |                 |             |-viewer_service.(schallee)---sleep
        |            |                 |             `-wise_service.sh(schallee)---sleep
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-cron_env_deb.sh---cron---cron---sh---su---sh(schallee)---elastic_search_---curl---{curl}
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord---python3(schallee)---6*[{python3}]
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord---watch-pcap-uplo-+-inotifywait
        |            |                 |                               `-watch-pcap-uplo
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-bash(schallee)---elastic_search_---sleep
        |            |                 `-10*[{containerd-shim}]
        |            |-containerd-shim-+-dumb-init---supervisord-+-cron_env_centos---crond---crond---sh---su---kibana-create-m(schallee)---elastic_search_---curl
        |            |                 |                         |-kibana.sh(schallee)---elastic_search_---sleep
        |            |                 |                         `-node(schallee)---5*[{node}]
        |            |                 `-10*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord---java(schallee)---57*[{java}]
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord-+-nginx---4*[nginx(www-data)]
        |            |                 |             |-php-fpm7.3---2*[php-fpm7.3(www-data)]
        |            |                 |             `-sshd
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord-+-bash---filebeat-watch--+-filebeat-watch-
        |            |                 |             |                        `-inotifywait
        |            |                 |             |-cron_env_centos---crond
        |            |                 |             `-filebeat(schallee)---10*[{filebeat}]
        |            |                 `-9*[{containerd-shim}]
        |            |-containerd-shim-+-supervisord-+-nginx---nginx(systemd-timesync)
        |            |                 |             `-2*[tail]
        |            |                 `-10*[{containerd-shim}]
        |            |-containerd-shim-+-docker-compose---docker-compose---16*[{docker-compose}]
        |            |                 `-10*[{containerd-shim}]
        |            `-30*[{containerd}]

The above functionality was only achieved after chmod a+r of:

filebeat/filebeat.yml
cidr-map.txt
host-map.txt
logstash/config/logstash.yml

Snippets from the log below suggest far more permissions would need to be fixed for this configuration to work:

file-monitor_1   | supervisor: couldn't chdir to /data/zeek/extract_files: EACCES
filebeat_1       | 2020-06-25 15:20:00,386 CRIT Server 'unix_http_server' running without any HTTP authentication checking
filebeat_1       | 2020-06-25T15:20:01.485Z	WARN	beater/filebeat.go:152	Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
filebeat_1       | 2020-06-25T15:20:01.489Z	WARN	beater/filebeat.go:368	Filebeat is unable to load the Ingest Node pipelines for the configured modules because the Elasticsearch output is not configured/enabled. If you have already loaded the Ingest Node pipelines or are using Logstash pipelines, you can ignore this warning.
elasticsearch_1  | "stacktrace": ["org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Unable to access 'path.data' (/usr/share/elasticsearch/data)",
elasticsearch_1  | "Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data",
pcap-capture_1   | 2020-06-25 15:19:57,366 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
nginx-proxy_1    | cp: can't stat '/etc/nginx/ca-trust/*': No such file or directory
pcap-monitor_1   | PermissionError: [Errno 13] Permission denied: '/pcap/processed'

opened by schallee 8

No Data in Kibana or Moloch after upload

I followed the installation guide for Ubuntu 18.04 LTS, I used the git method for grabbing the install files. Everything installed and populated exactly as stated in the guide. When I attempt to upload a pcap through https://localhost:8443 the pcap is accepted. I have validated this by checking the pcap/upload directory and I see it move over to the pcap/processed directory. The mime type for my pcap shows "application/vnd.tcpdump.pcap". I also ran "docker-compose ps" and everything is "up" with elastalert, kibana, elasticsearch, and logstash showing "(healthy)". I have tried running the wipe.sh script and doing reboots with no change. The pcap does not show in the history or sessions tab of moloch even when specifying "all" as a time range. I'm not really sure what to do at this point.

opened by chrswtsn 7

Kibana Elastalert Plugin won't save Rules

When creating a new elastalert rule, I get an internal server error 500 when I save. Tailing the logs for malcolm_kibana_1 doesn't provide much more information.

The kibana container is running in an "unhealthy" state; however, I'm not sure if the health check is accurate. The http redirect from the nginx proxy might not work, but when I run curl -u "username":"password" -kf https://"hostname":5601/api/status it returns the status.

opened by joel858 7

mount destination [/opt/zeek/share/zeek/site/intel] not absolute: unknown

🐛 Summary

Start script fails

To reproduce

Follow the standard first start from the README after cloning from GitHub (all was okay with v5.0.0), then run the start script. Environment: Ubuntu 16.04, Docker 20.10.7.

$ ./scripts/start
Malcolm failed to start

malcolm_file-monitor_1 is up-to-date
malcolm_htadmin_1 is up-to-date
malcolm_pcap-capture_1 is up-to-date
malcolm_freq_1 is up-to-date
malcolm_api_1 is up-to-date
malcolm_opensearch_1 is up-to-date
malcolm_name-map-ui_1 is up-to-date
malcolm_logstash_1 is up-to-date
malcolm_dashboards-helper_1 is up-to-date
malcolm_arkime_1 is up-to-date
malcolm_pcap-monitor_1 is up-to-date
malcolm_filebeat_1 is up-to-date
Creating malcolm_zeek_1 ...
malcolm_dashboards_1 is up-to-date
malcolm_upload_1 is up-to-date
Creating malcolm_zeek_1 ... error

ERROR: for malcolm_zeek_1  Cannot start service zeek: OCI runtime create failed: invalid mount {Destination:[/opt/zeek/share/zeek/site/intel] Type:bind Source:/var/lib/docker/volumes/2a847d65bbc5c7706c73d2e95bf270b82d9314dba6647fe7b9848d3c5c0eaf8d/_data Options:[rbind]}: mount destination [/opt/zeek/share/zeek/site/intel] not absolute: unknown

ERROR: for zeek  Cannot start service zeek: OCI runtime create failed: invalid mount {Destination:[/opt/zeek/share/zeek/site/intel] Type:bind Source:/var/lib/docker/volumes/2a847d65bbc5c7706c73d2e95bf270b82d9314dba6647fe7b9848d3c5c0eaf8d/_data Options:[rbind]}: mount destination [/opt/zeek/share/zeek/site/intel] not absolute: unknown
Encountered errors while bringing up the project.

opened by FrancYescO 6

Zeek Intel Framework

Greetings, so I've been up and down in the docs for Zeek and understand decently well how to create Intel in Zeek, however, that is largely aligned with a normal deployment of Zeek. I have loaded the required policies and pointed zeek (in local.zeek) to my local bro formatted file with an indicator. I tested everything in the TryZeek page with the same pcap, local.zeek changes, as well as the same bro intel file. However, when I replicate I cannot get Malcolm (under the Kibana-Zeek Intel or Notice dashboards) to pick up on my Intel hit. I have noticed some differences in the deployment with Malcolm so I figured it would be best to ask the developer directly. Thanks for your support and contribution!!

opened by cyamal1b4 6

Asset inventory capabilities

Congratulations for the project, it's really useful and easy to setup in just minutes using the scripts and docker compose.

I've just deployed the solution to test it, so I'm still a newbie and need more time to discover all the features, but I have a question that will decide whether I continue using it for now: does it have asset inventory capabilities to list all the devices on the network?

I set the LOGSTASH_OUI_LOOKUP property to true (Logstash will map MAC addresses to vendors for all source and destination MAC addresses when analyzing Zeek logs).

Is there any dashboard or any place where we can obtain a list of the network devices?

Best regards.

opened by robefernandez 6

MacOS: scripts fail due to use of linux features

Affected Versions:

Master & v1.8.1 release

Details:

I notice that some of the scripts in ./scripts don't seem compatible with the default MacOS tools. For instance:

./scripts/auth_setup.sh appears to terminate silently here. If I add set +e before the line, or comment out the line, it works. I assume this is due to different handling of errors on Mac, as it looks like the script wants to ignore any error from failing to source the file?
./scripts/start.sh starts the containers and then dumps out a grep usage error:

Starting malcolm_htadmin_1       ... done
Starting malcolm_pcap-capture_1  ... done
Starting malcolm_elasticsearch_1 ... done
Starting malcolm_file-monitor_1  ... done
Recreating malcolm_moloch_1      ... done
Starting malcolm_logstash_1      ... done
Starting malcolm_pcap-monitor_1  ... done
Starting malcolm_curator_1       ... done
Starting malcolm_kibana_1        ... done
Starting malcolm_elastalert_1    ... done
Starting malcolm_zeek_1          ... done
Starting malcolm_filebeat_1      ... done
Recreating malcolm_upload_1      ... done
Recreating malcolm_nginx-proxy_1 ... done

In a few minutes, Malcolm services will be accessible via the following URLs:
------------------------------------------------------------------------------
  - Moloch: https://localhost/
  - Kibana: https://localhost/kibana/
  - PCAP Upload (web): https://localhost/upload/
  - PCAP Upload (sftp): sftp://[email protected]:8022/files/
  - Account management: https://localhost:488/

         Name                        Command                       State                                                          Ports
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
malcolm_curator_1         /usr/local/bin/cron_env_deb.sh   Up
malcolm_elastalert_1      /usr/local/bin/elastalert- ...   Up (health: starting)   3030/tcp, 3333/tcp
malcolm_elasticsearch_1   /usr/local/bin/docker-entr ...   Up (health: starting)   9200/tcp, 9300/tcp
malcolm_file-monitor_1    /usr/local/bin/supervisord ...   Up                      3310/tcp
malcolm_filebeat_1        /usr/local/bin/docker-entr ...   Up
malcolm_htadmin_1         /usr/bin/supervisord -c /s ...   Up                      80/tcp
malcolm_kibana_1          /usr/local/bin/dumb-init - ...   Up (health: starting)   28991/tcp, 5601/tcp
malcolm_logstash_1        /usr/local/bin/logstash-st ...   Up (health: starting)   5000/tcp, 5044/tcp, 9600/tcp
malcolm_moloch_1          /usr/bin/supervisord -c /e ...   Up                      8000/tcp, 8005/tcp, 8081/tcp
malcolm_nginx-proxy_1     /usr/local/bin/docker_entr ...   Up                      0.0.0.0:28991->28991/tcp, 0.0.0.0:3030->3030/tcp, 0.0.0.0:443->443/tcp, 0.0.0.0:488->488/tcp,
                                                                                   0.0.0.0:5601->5601/tcp, 80/tcp, 0.0.0.0:8443->8443/tcp, 0.0.0.0:9200->9200/tcp,
                                                                                   0.0.0.0:9600->9600/tcp
malcolm_pcap-capture_1    /usr/local/bin/supervisor.sh     Up
malcolm_pcap-monitor_1    /usr/bin/supervisord -c /e ...   Up                      30441/tcp
malcolm_upload_1          /docker-entrypoint.sh /usr ...   Up                      127.0.0.1:8022->22/tcp, 80/tcp
malcolm_zeek_1            /usr/bin/supervisord -c /e ...   Up

usage: grep [-abcDEFGHhIiJLlmnOoqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
        [-e pattern] [-f file] [--binary-files=value] [--color=when]
        [--context[=num]] [--directories=action] [--label] [--line-buffered]
        [--null] [pattern] [file ...]

Looking at the script, it calls /scripts/logs.sh which passes a -P flag to grep, which does not appear in the man page for grep on Mac, but does appear in discussions of grep on linux distros.

I tried following the MacOS set up guide, but that seems to be largely about configuring Docker (which I already have). I also installed coreutils and grep via homebrew under the assumption that these would be more like the linux/GNU versions. So I have 1 question and 1 suggestion:

Question: What do I need to get these scripts running out of the box on Mac?
Suggestion: Is it worth writing these scripts in a more cross-platform language like python rather than bash?
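
As a rough illustration of the suggestion above, log filtering that depends on grep -P could be done with Python's re module, which behaves the same on macOS and Linux. The pattern actually used by logs.sh is not shown here, so the pattern and sample lines below are hypothetical, and Python's regular expressions are similar to but not identical with PCRE.

```python
import re


def filter_lines(lines, pattern, invert=False):
    """Yield lines that match the pattern, or that do not match if invert=True.

    Roughly comparable to `grep -P pattern` and `grep -v -P pattern`.
    """
    regex = re.compile(pattern)
    for line in lines:
        if bool(regex.search(line)) != invert:
            yield line


log_lines = [
    "filebeat_1 | 2020-06-25 15:20:00,386 CRIT Server running without checking",
    "zeek_1     | 2020-06-25 15:20:01,000 INFO started",
]

# Hypothetical noise filter: drop WARN and CRIT lines.
quiet = list(filter_lines(log_lines, r"\b(WARN|CRIT)\b", invert=True))
print(quiet)
```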

Great job on Malcolm by the way - the only reason I'm raising this issue is because I think the project looks awesome and want to use it 👍

opened by Osipion 6

Filebeat certificate issues with .iso installer

With the Malcolm installer from https://malcolm.fyi/docs/download.html, filebeat is having certificate issues and failing to process Zeek logs.

Steps to reproduce the behavior:

Install Malcolm to a virtual disk using the version 6.4.3 .iso from the link above
Start Malcolm using scripts/start
Upload packet capture file via the web UI and elect to process with Zeek

Expected behavior

The packet capture file is uploaded without issue and Zeek runs against the file without issue, as evident by the contents of zeek-logs/. However, filebeat fails to start due to certificate issues and cannot (to my understanding) ingest Zeek logs, send them to Logstash, etc. The error stanza below repeats indefinitely.

Any helpful log output or screenshots

Paste the results here:


malcolm-filebeat-1            | 2022-12-28T16:50:48.077Z Setup Beat: filebeat; Version: 8.5.2
malcolm-filebeat-1            | 2022-12-28T16:50:48.088Z Failed reading certificate file /certs/client.crt: read /certs/client.crt: is a directory
malcolm-filebeat-1            | 2022-12-28T16:50:48.088Z Failed reading CA certificate: read /certs/ca.crt: is a directory
malcolm-filebeat-1            | 2022-12-28T16:50:48.088Z filebeat stopped.
malcolm-filebeat-1            | 2022-12-28T16:50:48.088Z Exiting: error initializing publisher: 2 errors: read /certs/client.crt: is a directory /certs/client.crt; read /certs/ca.crt: is a directory reading /certs/ca.crt
malcolm-filebeat-1            | Exiting: error initializing publisher: 2 errors: read /certs/client.crt: is a directory /certs/client.crt; read /certs/ca.crt: is a directory reading /certs/ca.crt
malcolm-filebeat-1            | 2022-12-28 16:50:48,101 INFO exited: filebeat (exit status 1; not expected)
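
The "is a directory" errors above say that /certs/client.crt and /certs/ca.crt exist inside the container as directories rather than files. One common way this happens with Docker bind mounts is that the host-side file did not exist when the container was created, so Docker created a directory in its place; whether that is the cause here is an assumption. A minimal sketch for checking the host side, where the directory to inspect is passed in by the caller because its location is not given in this report:

```python
import os
import tempfile
from pathlib import Path


def misplaced_directories(cert_dir, names=("client.crt", "ca.crt")):
    """Return expected certificate paths that exist as directories, not files."""
    base = Path(cert_dir)
    return [str(base / name) for name in names if (base / name).is_dir()]


# Demonstration in a temporary directory standing in for the real one.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "client.crt"))      # wrongly a directory
    Path(tmp, "ca.crt").write_text("placeholder")  # a regular file
    found = [Path(p).name for p in misplaced_directories(tmp)]

print(found)
```

Any path reported this way would need to be removed and replaced by the real certificate file before filebeat can read it.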

opened by Dleau 5

Issue with auth_setup, says I need docker installed but it already is
🐛 Summary

What's wrong? I am trying to install Malcolm on my Ubuntu machine. I got through the install.py and install.py --configure. However, when I got to running the auth_setup script I get "Exception: auth_setup requires docker, please run install.py". I ran install.py but nothing really happens. Everything else in the install was going smoothly. I also tried "sudo ./auth_setup", but that just presented another exception that says I shouldn't run that script as root.

To reproduce

Steps to reproduce the behavior:

Run the install script and then run the install script again, then run the auth_setup script.

Expected behavior

I did not expect to get an exception.
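
One way to narrow this down is to distinguish "docker is not on the PATH" from "docker is installed but the current user cannot reach the daemon". The sketch below is an illustration only: it is not the check auth_setup performs, and the function name docker_status is hypothetical. It relies on shutil.which and on the standard `docker info` command, which fails when the daemon is unreachable.

```python
import shutil
import subprocess


def docker_status(executable="docker"):
    """Return 'missing', 'no-access', or 'ok' for the given Docker CLI name."""
    if shutil.which(executable) is None:
        return "missing"
    # `docker info` talks to the daemon, so it fails when the current user
    # cannot reach the Docker socket (e.g., not in the `docker` group).
    try:
        proc = subprocess.run(
            [executable, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "no-access"
    return "ok" if proc.returncode == 0 else "no-access"
```

A result of "no-access" for a non-root user would suggest a permissions problem (for example, group membership that has not taken effect until the next login) rather than a missing installation.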

Any helpful log output or screenshots

Paste the results here: n/a

opened by Jallnutt1 5

v6.4.3(Dec 6, 2022)
Malcolm v6.4.3 is a minor release containing enhancements, component version updates and bug fixes.

https://github.com/cisagov/Malcolm/compare/v6.4.2...v6.4.3

Enhancements

Import the NetBox Device Type Library on NetBox first run to populate manufacturers, device types, models and modules

idaholab/Malcolm#127 have install.py --configure ask about other storage locations for PCAP, Zeek logs and OpenSearch indices

idaholab/Malcolm#128 have install.py --configure prompt for Arkime to manage uploaded PCAP files or not

Component version updates

Alpine Linux to v3.17 for some Docker containers' base images

Filebeat to v8.5.2

NetBox to v3.3.9

Zeek to v5.0.4

opensearch-py to v2.0.1

Fluent Bit to v2.0.6

Fixes

Fix some bad links in the documentation and other minor documentation improvements

Fix idaholab/Malcolm#126, suricata logs show up in Arkime as "notip" for the protocol

Fix idaholab/Malcolm#129, filtering by rootId in Arkime returns no results

Fix Docker health checks for NetBox and supporting containers

Fix "read-only" version of nginx.conf

Tweaks to install.py memory recommendations

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/.
Source code(tar.gz)
Source code(zip)
install.py(118.68 KB)
malcolm_20221206_144043_082bae76.README.txt(896 bytes)
malcolm_20221206_144043_082bae76.tar.gz(59.25 KB)
malcolm_common.py(28.95 KB)
v6.4.2(Nov 17, 2022)
Malcolm v6.4.2 is a minor release containing a few component version updates (some addressing component vulnerabilities) and other improvements.

https://github.com/cisagov/Malcolm/compare/v6.4.1...v6.4.2

Component version updates

Zeek to v5.0.3 (this release fixes several security vulnerabilities in Zeek itself)

OpenSearch and OpenSearch Dashboards to v2.4.0

Logstash to v8.4.0

Filebeat to v8.5.1

NetBox to v3.3.8

Bug Fixes

Fix unhandled exceptions in API when certain API calls are made before data is indexed

Improvements

Added Zeek plugin to detect vulnerability to and exploitation attempts of CVE-2022-3602

Minor documentation fixes

Minor improvements to Docker container debug logging

Implemented caching of entropy calculations for DNS requests and TLS hostnames
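
As an illustration of the entropy caching mentioned above, the following Python sketch memoizes a Shannon entropy calculation so that repeated hostnames are computed only once. It is an assumption-laden example, not Malcolm's implementation: the function name shannon_entropy and the cache size are hypothetical.

```python
import math
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=65536)  # hypothetical cache size
def shannon_entropy(name: str) -> float:
    """Shannon entropy, in bits per character, of a hostname string."""
    if not name:
        return 0.0
    counts = Counter(name)
    total = len(name)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Because DNS queries and TLS server names repeat heavily in real traffic, a cache keyed on the string itself avoids recomputing the same value for every event.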

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/.
Source code(tar.gz)
Source code(zip)
install.py(103.31 KB)
malcolm_20221117_101139_5565a323.README.txt(896 bytes)
malcolm_20221117_101139_5565a323.tar.gz(56.24 KB)
malcolm_common.py(26.16 KB)
v6.4.1(Nov 3, 2022)
Malcolm v6.4.1 is a minor release containing a few bug fixes, component version updates and other improvements.

https://github.com/cisagov/Malcolm/compare/v6.4.0...v6.4.1

Bug fixes

Zeek log files that have been renamed and are in the process of moving not caught correctly by Logstash (idaholab/Malcolm#121)

Hedgehog Arkime viewer node should use TLS (idaholab/Malcolm#122)

Recent changes to Elastic Common Schema needed adjustment (map number data type to long)

Component version updates

Arkime v4.0.2

NetBox v3.3.7

Improvements

On Hedgehog Linux, allow configuration of Arkime capture to use PCAP compression if desired

Changes to GitHub Docker image and ISO workflows, updating deprecated actions and features

Create corresponding net-map.json/Host and Subnet Name Mapping items in NetBox when applicable

Remove unnecessary linux-headers- package from Zeek Docker image

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/.
Source code(tar.gz)
Source code(zip)
install.py(103.26 KB)
malcolm_20221103_113516_136589e1.README.txt(896 bytes)
malcolm_20221103_113516_136589e1.tar.gz(56.22 KB)
malcolm_common.py(26.16 KB)
v6.4.0(Oct 19, 2022)
Malcolm v6.4.0 features refactored documentation, the initial integration of NetBox (a network infrastructure resource modeling tool), several component version updates and other improvements and bug fixes.

Note that some changes involved in this release require some modifications to files used by docker-compose. Please run ./scripts/auth_setup and ./scripts/install.py --configure to ensure the appropriate new environment variables are set.

https://github.com/cisagov/Malcolm/compare/v6.3.0...v6.4.0

New features

initial NetBox integration (development ongoing, see idaholab/Malcolm#17)

Improvements

Documentation reformat/refactor

Use tini for Docker image init

Added support for s7comm_upload_download.log

Surface more options in install.py --configure, as well as minor tweaks

Update documentation report for results of ISO hardening

Component version updates

Arkime v4.0.1

Allow (optional) PCAP compression on Hedgehog

OpenSearch and OpenSearch Dashboards v2.3.0

Fluent Bit v1.9.9

Zeek v5.0.2

Bug fixes

verify capa signature hits are still being parsed/inserted correctly (idaholab/Malcolm#120)

Handle long integers in parsing bacnet_discovery and bacnet_property

Better enrichment of network.direction based on source and destination IP addresses

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/.
Source code(tar.gz)
Source code(zip)
install.py(103.26 KB)
malcolm_20221019_095250_4daf6105.README.txt(896 bytes)
malcolm_20221019_095250_4daf6105.tar.gz(56.30 KB)
malcolm_common.py(26.16 KB)
v6.3.0(Sep 7, 2022)
Malcolm v6.3.0 is a feature release with a number of new features, bug fixes and improvements. Of particular note is Malcolm's ability to now use another OpenSearch instance or cluster in lieu of its own local instance.

Note that the changes involved in idaholab/Malcolm#10 require modifications to files used by docker-compose. Please run ./scripts/auth_setup and ./scripts/install.py --configure to ensure the appropriate new environment variables are set.

https://github.com/cisagov/Malcolm/compare/v6.2.0...v6.3.0

New Features

Support remote OpenSearch instance/cluster as alternative to local containerized instance (idaholab/Malcolm#10)

Documentation and convenience scripts for configuring ingestion of third-party logs, and basic parsing/normalizing of Fluent-Bit's Windows event logs

S7comm Plus support and replaced Amazon S7comm parser with icsnpp-s7comm (idaholab/Malcolm#99)

Version Bumps

OpenSearch and OpenSearch Dashboards to v2.2.1

Zeek to v5.0.1

Spicy to v1.5.1

spicy-plugin to v1.3.17

YARA to v4.2.3

Capa to v4.0.1

Improvements

Major improvements to OPC UA Binary parser and supporting dashboards

Ensure that all containers are provided the same information about trusted CA certificates

changed list of "sensitive countries" to match U.S. Department of Energy Sensitive Country List

Increased maximum fields from 3,000 to 5,000

Standardized configuration and authentication for primary and secondary remote OpenSearch instances, and make sure that index templates are created on secondary remote OpenSearch instances

Expand and fix normalization of network.direction in lieu of using tags

Various tweaks and improvements to the install.py script for enabling/disabling some features

Bugs Fixed

fields could be missing in Arkime due to a large number of concurrent requests (idaholab/Malcolm#115)

mapper_parsing_exception, TCP flag parsing problem (cisagov/Malcolm#214)

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(100.81 KB)
malcolm_20220907_161840_a749dafe.README.html(11.70 MB)
malcolm_20220907_161840_a749dafe.README.txt(813 bytes)
malcolm_20220907_161840_a749dafe.tar.gz(127.00 KB)
malcolm_common.py(25.71 KB)
v6.2.0(Aug 3, 2022)
Malcolm v6.2.0 is a feature release with a number of bug fixes and improvements. Of particular note is a major reworking of how a standalone instance of Malcolm (i.e., when not receiving traffic from a network sensor) analyzes "live" traffic. See the README for more information.

Note that the changes around idaholab/Malcolm#109 and idaholab/Malcolm#110 require changes to the files used by docker-compose. Please run ./scripts/auth_setup and ./scripts/install.py --configure to ensure the appropriate new environment variables are set.

https://github.com/cisagov/Malcolm/compare/v6.1.0...v6.2.0

Improvements

idaholab/Malcolm#109: break Zeek/Suricata into two containers: one for "live" capture and one for uploaded PCAPs

give option to disable capture interface hardware offloading and adjust ring buffer sizes for standalone Malcolm capture

Zeek and Suricata images are now configured to not drop privileges at init (in order to be able to set permissions for network capture), but supervisord will drop privileges for all of its child processes before they execute to maintain security posture

include headers needed to build Zeek af_packet plugin in Zeek docker container

updated README to describe methods for capturing local traffic with standalone Malcolm

same images will be used for zeek and zeek-live containers, as well as for suricata and suricata-live containers, respectively

use the same script (zeekdeploy.sh) to configure and run Zeek on both Hedgehog and in the Malcolm zeek docker images

prevent "live" and "non-live" Zeek containers from both trying to update intel indicators at the same time

Speed up build time by getting official Debian suricata packages from backports rather than building from source

Added Suricata rule update cron jobs

Added documentation (in the form of comments) to all docker-compose file variables

Bugs

Fix idaholab/Malcolm#107: expand action/result meaning in DNP3 (and other?) dashboards

Clean up some Nul values that could appear in Zeek logs

improve mapping of BACnet actions

Fix idaholab/Malcolm#108: export PCAP not working from Arkime sessions without "Arkime Sessions"

Fix idaholab/Malcolm#110: SFTP upload broken due to dollar sign(s) in openssl-encrypted password

prompt in install.py --configure whether or not to expose this port to external hosts

Fix an issue that could prevent some zeek logs from being parsed correctly if they contained non-ASCII characters
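
For context on the dollar sign problem noted for idaholab/Malcolm#110: Docker Compose performs variable interpolation on values in compose files, so a literal `$` must be written as `$$`. Password hashes produced by openssl typically contain several dollar signs. The sketch below shows the escaping step; whether Malcolm's fix works this way is not stated in these notes, and the function name escape_for_compose is hypothetical.

```python
def escape_for_compose(value: str) -> str:
    """Double each dollar sign so Docker Compose treats it as a literal '$'."""
    return value.replace("$", "$$")
```

For example, a hash of the form `$1$salt$hash` would be written as `$$1$$salt$$hash` in a compose file value.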

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(94.29 KB)
malcolm_20220803_140640_28ee931c.README.html(11.81 MB)
malcolm_20220803_140640_28ee931c.README.txt(813 bytes)
malcolm_20220803_140640_28ee931c.tar.gz(122.48 KB)
malcolm_common.py(22.15 KB)
v6.1.0(Jul 13, 2022)
Malcolm v6.1.0 is a feature release with a number of updates and improvements.

https://github.com/cisagov/Malcolm/compare/v6.0.1...v6.1.0

Bugs fixed

Zeek logs get reingested after container restart - (idaholab/Malcolm#101)

Added IPsec fields that were not being parsed

Fixed some dashboards that should have been using ECS field names

Split the STUN attribute type field on comma during stun.log parsing

Improvements

Malcolm's OpenSearch index template is now composed upon initialization with elements from the latest Elastic Common Schema release.

Replaced most instances of beats on Hedgehog Linux (with the exception of the Apache-licensed 7.10.2 filebeat which is compatible with OpenSearch) with Fluent Bit (see idaholab/Malcolm#102) for resource utilization monitoring, etc. and recreated dashboards referencing these metrics

Replaced Auditbeat file integrity checking module with AIDE for Hedgehog Linux

Added an optionally exposed (disabled by default) TCP input endpoint to Malcolm to allow easier ingestion of other third-party logs not natively supported by Malcolm

Improvements to APIs for listing fields and indices

Removed old environment variable-configured Index State Management code as the new OpenSearch v2.1.0 release has nice UIs for both index state management and snapshot management

Version bumps of note

Supercronic to v0.2.1

OpenSearch and OpenSearch Dashboards to v2.1.0 (incorporating changes from v2.0.0, v2.0.1 and v2.1.0)

Zeek to v5.0.0 with built-in Spicy and Spicy Zeek plugin

YARA to v4.2.2

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(90.22 KB)
malcolm_20220713_074058_8ad8b2ed.README.html(11.81 MB)
malcolm_20220713_074058_8ad8b2ed.README.txt(813 bytes)
malcolm_20220713_074058_8ad8b2ed.tar.gz(118.97 KB)
malcolm_common.py(22.15 KB)
v6.0.1(May 25, 2022)
Malcolm v6.0.1 is a minor release updating some of Malcolm's core components and adding a couple of Zeek plugins for detecting recent CVEs.

https://github.com/cisagov/Malcolm/compare/v6.0.0...v6.0.1

Added Zeek plugins

Corelight's DCE/RPC remote code execution vulnerability (CVE-2022-26809) plugin

Corelight's VMware Workspace ONE Access and Identity Manager RCE vulnerability (CVE-2022-22954) plugin

Bugs fixed

Fixed an issue where user-supplied trusted CA certificates might not be added to the OpenSearch container's trust store

Version bumps

OpenSearch to v1.3.2

OpenSearch Dashboards to v1.3.2

Alpine base Docker image to v3.16

Docker Compose to v2.5.0 (as installed by install.py and in the Malcolm ISO installer)

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(91.45 KB)
malcolm_20220525_111909_c3e323b.README.html(11.81 MB)
malcolm_20220525_111909_c3e323b.README.txt(813 bytes)
malcolm_20220525_111909_c3e323b.tar.gz(118.85 KB)
malcolm_common.py(22.15 KB)
v6.0.0(May 13, 2022)
Malcolm v6.0.0 is a major release which incorporates Suricata as a data source for network traffic analysis in Malcolm alongside Zeek and Arkime. A team at BYU (@piercema, @aglad-eng, @Jarscott1, @n8hacks) recently completed their work on Suricata integration for their capstone project. This release includes their changes as well as some additional work by Malcolm's developer in integrating Suricata in other ways not covered in the scope of their project. This release also includes other bug fixes and improvements.

https://github.com/cisagov/Malcolm/compare/v5.2.11...v6.0.0

The Malcolm project uses semantic versioning when choosing version numbers. This release required some pretty extensive remapping of Zeek fields in order for Zeek and Suricata to target the same naming conventions for common fields. This backwards-compatibility breaking change is the reason for bumping the major version number from 5 to 6. It is not recommended to attempt an upgrade from a previous release; a fresh install is strongly encouraged.
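
Under semantic versioning, a change in the first (major) component signals a backwards-incompatible release. A minimal Python sketch of that check, assuming tags of the form vMAJOR.MINOR.PATCH as used on this page (the function names are hypothetical):

```python
def parse_version(tag: str):
    """Parse 'v6.0.0' or '5.2.11' into a (major, minor, patch) tuple of ints."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def is_breaking_upgrade(old: str, new: str) -> bool:
    """True when the major version increases between old and new."""
    return parse_version(new)[0] > parse_version(old)[0]
```

Comparing the parsed tuples rather than the raw strings also orders releases correctly: v5.2.11 sorts after v5.2.9 numerically, although it would sort before it as text.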

Features

Incorporate Suricata as a data source for network traffic analysis in both Malcolm and Hedgehog Linux

Added support for the GENISYS protocol

Improvements

Minor tweaks to the GitHub workflows for building the Malcolm installer ISO

Better fingerprinting of events during Logstash parsing in order to create a unique but reproducible hash for events in the case that duplicate data is indexed into Malcolm

All data sources (Arkime, Zeek and Suricata) now specify the data source (stored as event.provider, arkime, zeek and suricata, respectively) and the log type (stored as event.dataset, e.g., session, conn, alert, etc.) in order to facilitate filtering among various types of network metadata

The Malcolm REST API was improved to support POST operations for all of the calls which can accept a filter argument to allow for easier representation of filters as JSON objects

Reworked several dashboards, including the Overview, Security Overview, Zeek Notices and Signatures dashboards

Leave packages in place on the ISO-installed Malcolm and Hedgehog Linux environments in order to support mounting SMB shares from the Thunar GUI
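
To illustrate how the event.provider and event.dataset fields described above can be used for filtering, the following Python sketch builds a standard OpenSearch query DSL body with two term filters. This is generic OpenSearch syntax rather than the Malcolm REST API's own filter format, and it assumes both fields are mapped as keyword fields; the function name build_filter_query is hypothetical.

```python
import json


def build_filter_query(provider: str, dataset: str) -> dict:
    """Build an OpenSearch bool query matching one provider and one dataset."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"event.provider": provider}},
                    {"term": {"event.dataset": dataset}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Example: Zeek connection logs only.
    print(json.dumps(build_filter_query("zeek", "conn"), indent=2))
```

Representing the filter as a JSON object in this way is the kind of payload that is easier to send in a POST body than to encode in a URL query string.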

Bug fixes

Fix idaholab/Malcolm#94: docker-compose | "function" has no attribute "get" (ubuntu 20.04 install)

Fix idaholab/Malcolm#96: DNP3 dashboard has invalid saved search syntax

Fix idaholab/Malcolm#97: virustotal file scanning broken (AttributeError: 'Namespace' object has no attribute 'vtotReqLimit')

Fix idaholab/Malcolm#98: BSAP RDB data parsed incorrectly

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(91.26 KB)
malcolm_20220513_150329_af2c2790.README.html(11.81 MB)
malcolm_20220513_150329_af2c2790.README.txt(813 bytes)
malcolm_20220513_150329_af2c2790.tar.gz(119.10 KB)
malcolm_common.py(22.15 KB)
v5.2.11(Apr 27, 2022)
Malcolm v5.2.11 is a minor release with a few user experience improvements and component version updates (some of which resolve potential security issues).

https://github.com/cisagov/Malcolm/compare/v5.2.10...v5.2.11

Addressing security vulnerabilities

bump Zeek to v4.2.1 addressing a potential Zeek buffer overflow vulnerability

Deserialization of Untrusted YML data - cisagov/Malcolm#207

Version bumps

Spicy to v1.4.1

OpenSearch to v1.3.1

Yara to v4.2.1

Improvements

Resolve performance degradation introduced by the move to OpenSearch 1.3 by using the G1GC garbage collector - https://github.com/idaholab/Malcolm/issues/91

improve workflow for configuring Malcolm to run behind another reverse proxy (Caddy, Traefik, etc.) - https://github.com/idaholab/Malcolm/issues/92

assign and display both event.provider and event.dataset in Arkime - https://github.com/idaholab/Malcolm/issues/89

only show the controls for PCAP download from session details if there is actually a PCAP backing the session document - https://github.com/idaholab/Malcolm/issues/90

increase timeouts related to filebeat (see https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html) to be a little more forgiving for log files that take a long time to process - mmguero-dev/[email protected]

strip build status badges from deployed copy of README.md

The install.py script will make use of the pythondialog module for user interaction (on Linux) if it is available

added link to Dashboards in the footer of Arkime's interface

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(90.42 KB)
malcolm_20220427_135737_b6f51e0.README.html(10.59 MB)
malcolm_20220427_135737_b6f51e0.README.txt(813 bytes)
malcolm_20220427_135737_b6f51e0.tar.gz(117.59 KB)
malcolm_common.py(22.15 KB)
v5.2.10(Apr 4, 2022)
Malcolm v5.2.10 is a minor release updating some of Malcolm's core components.

https://github.com/cisagov/Malcolm/compare/v5.2.9...v5.2.10

Version bumps

Arkime to v3.4.2

OpenSearch to v1.3.0

OpenSearch Dashboards to v1.3.0

Bug fixes

cisagov/Malcolm#205

ensure timestamp fields are explicitly defined as date type in index template

Improvements

restore zeek.cip_io.io_data field so that it may be reviewed in Dashboards Discover view and Arkime

added malcolmmonitor convenience bash function into Malcolm ISO-installed environment

pointed several zeek plugins' installation source back upstream now that my PRs have been accepted

Cleanup

removed references related to internally-developed INL tool MALASS which is no longer under development and was never released publicly

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(80.36 KB)
malcolm_20220404_081746_07826c4.README.html(10.65 MB)
malcolm_20220404_081746_07826c4.README.txt(813 bytes)
malcolm_20220404_081746_07826c4.tar.gz(114.93 KB)
malcolm_common.py(13.43 KB)
v5.2.9(Mar 18, 2022)
Malcolm v5.2.9 is a release to fix a regression introduced in v5.2.8 (see idaholab/Malcolm#84), affecting the Malcolm REST API and generation of intelligence files for Zeek. If you don't use those features, you may choose to skip this release. My apologies for putting this out so close to the last release.

https://github.com/cisagov/Malcolm/compare/v5.2.8...v5.2.9

Bug fixes

Fix idaholab/Malcolm#84 ("upstream incompatibility between python regex library 2022.3.15 and dateparser breaks API")

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(80.36 KB)
malcolm_20220215_154832_f88ebd6.README.html(10.65 MB)
malcolm_20220215_154832_f88ebd6.README.txt(813 bytes)
malcolm_20220215_154832_f88ebd6.tar.gz(114.39 KB)
malcolm_common.py(13.43 KB)
v5.2.8(Mar 17, 2022)
Malcolm v5.2.8 is a release to patch a major security vulnerability in OpenSSL.

https://github.com/cisagov/Malcolm/compare/v5.2.7...v5.2.8

Version bumps

Arkime to v3.4.1

Spicy to v1.4.0

Update all docker images' system packages to get latest security updates, including updating OpenSSL to fix CVE-2022-0778

CVE-2022-0778 can already be detected in network traffic by Malcolm using 0xxon/cve-2020-0601

Minor improvements

Include gvfs-backends package in ISO-installed environments to allow mounting SMB shares in the Thunar GUI

Bug fixes

Fix an issue with "read-only mode" combined with "no SSL mode" (very unlikely to have affected anybody)

Tweak Logstash pipeline size to make it a little more conservative to avoid Logstash restarts due to running out of heap resources

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(80.36 KB)
malcolm_20220317_072344_fd8adb9.README.html(10.65 MB)
malcolm_20220317_072344_fd8adb9.README.txt(813 bytes)
malcolm_20220317_072344_fd8adb9.tar.gz(114.93 KB)
malcolm_common.py(13.43 KB)
v5.2.7(Mar 14, 2022)
Malcolm v5.2.7 is a patch release with improvements and bug fixes.

https://github.com/cisagov/Malcolm/compare/v5.2.6...v5.2.7

Bugs fixed

fixed instances where spicy_ will sometimes be prepended to network.protocol fields (e.g., spicy_wireguard is now fixed to just be wireguard)
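
The normalization described above amounts to stripping a known prefix from the protocol name. A minimal Python sketch follows; it is an illustration only, since Malcolm's actual fix lives in its parsing pipeline, and the function name normalize_protocol is hypothetical.

```python
def normalize_protocol(name: str, prefix: str = "spicy_") -> str:
    """Remove a leading 'spicy_' from a protocol name, if present."""
    return name[len(prefix):] if name.startswith(prefix) else name
```

Only a leading occurrence is removed, so protocol names that merely contain the text elsewhere are left unchanged.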

Improvements

base GitHub workflow files' docker build step on moby/buildkit:master

added API webhook that can be used as an Alerting destination for alerts to be indexed back into the OpenSearch database as session records

added example Alerting monitor and destination using API webhook

added ability to run Malcolm's nginx-proxy container in non-HTTPS mode (not recommended unless running behind a third-party reverse proxy like Traefik or Caddy, in which case it is very useful)

removed performance-analyzer plugin from OpenSearch container to free up resources

improvements to documentation for Anomaly Detection and Alerting

added example scripts and Vagrantfile for easily configuring and running Malcolm in a read-only or demo mode on Amazon Linux 2 (useful for AWS)

Version bumps

Arkime to v3.4.0

Yara to v4.2.0

Capa to v3.2.0

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(80.38 KB)
malcolm_20220310_122522_fab7680.README.html(10.65 MB)
malcolm_20220310_122522_fab7680.README.txt(813 bytes)
malcolm_20220310_122522_fab7680.tar.gz(114.39 KB)
malcolm_common.py(13.43 KB)
v5.2.6(Feb 24, 2022)
Malcolm v5.2.6 is a patch release with improvements and bug fixes.

https://github.com/cisagov/Malcolm/compare/v5.2.5...v5.2.6

Bugs fixed

Fixed Logstash failing to start idaholab/Malcolm#78

Added tuning options to address Logstash out of memory errors idaholab/Malcolm#79

Incorporated latest bugfixes in BACnet parser

Fixed issue with mapping some field types being incorrect for BSAP and OSPF logs

Improvements

Added http-more-files-names plugin to populate files.log filenames entries for HTTP requests

Normalized bsap_ip_header.type_name to event.action

Removed unnecessary Logstash field conversions for types already defined in the template

Improved logs and status convenience scripts to allow filtering to a particular service

Improved convenience script for working with GitHub workflows during Malcolm development

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(78.56 KB)
malcolm_20220224_093312_2452968.README.html(10.63 MB)
malcolm_20220224_093312_2452968.README.txt(813 bytes)
malcolm_20220224_093312_2452968.tar.gz(112.32 KB)
malcolm_common.py(13.43 KB)
v5.2.5(Feb 15, 2022)
Malcolm v5.2.5 is a patch release with improvements and bug fixes.

https://github.com/cisagov/Malcolm/compare/v5.2.4...v5.2.5

Threat Intelligence

idaholab/Malcolm#77 - automatically generate Zeek intelligence indicators from MISP

autogeneration of Zeek intel files from TAXII/MISP feeds is now multithreaded

allow filtering indicators from TAXII/MISP by date (e.g., "only include those created/modified in the last n days", etc.)

added intelligence hits as a new severity ranked category

highlight intel sources more clearly in dashboard
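
The date-based indicator filtering mentioned above ("only include those created/modified in the last n days") can be sketched in Python as follows. This is a hedged illustration rather than Malcolm's code: the function name recent_indicators and the dictionary layout, with a timezone-aware "modified" timestamp, are assumptions.

```python
from datetime import datetime, timedelta, timezone


def recent_indicators(indicators, days, now=None):
    """Keep indicators whose 'modified' timestamp is within the last `days` days.

    Each indicator is assumed to be a dict with a timezone-aware
    datetime under the (hypothetical) key 'modified'.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [i for i in indicators if i["modified"] >= cutoff]
```

Passing `now` explicitly keeps the function deterministic, which makes the cutoff behavior easy to test.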

Hedgehog Linux (sensor appliance)

added sensormonitor convenience function to monitor services, disk space and logs

Bug fixes

Remove CIP fields no longer supplied by the ICSNPP EtherNet/IP parser and update dashboard accordingly

idaholab/Malcolm#76 - directory creation race condition starting up zeek on sensor which may cause zeekctl to fail

cisagov/Malcolm#189 - mount destination [/opt/zeek/share/zeek/site/intel] not absolute: unknown

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(77.54 KB)
malcolm_20220215_155031_17751f1.README.html(10.63 MB)
malcolm_20220215_155031_17751f1.README.txt(813 bytes)
malcolm_20220215_155031_17751f1.tar.gz(111.25 KB)
malcolm_common.py(13.43 KB)
v5.2.4(Feb 7, 2022)
Malcolm v5.2.4 is a patch release with improvements and bug fixes.

https://github.com/cisagov/Malcolm/compare/v5.2.3...v5.2.4

New features

idaholab/Malcolm#74 (automatically generate Zeek intelligence indicators from STIX/TAXII)

Improvements

group MAC addresses and OUI (vendors) into related.mac and related.oui for easier searching across all fields

improvements to default anomaly detectors

Bug fixes

Fix idaholab/Malcolm#75 (OpenSearch Dashboards loads slowly without network connectivity)

Fix idaholab/Malcolm#76 (directory creation race condition starting up zeek on sensor which may cause zeekctl to fail)

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(77.54 KB)
malcolm_20220207_095131_db122ba.README.html(10.63 MB)
malcolm_20220207_095131_db122ba.README.txt(813 bytes)
malcolm_20220207_095131_db122ba.tar.gz(108.77 KB)
malcolm_common.py(13.43 KB)
v5.2.3(Jan 31, 2022)
Malcolm v5.2.3 is a patch release with component version bumps, bug fixes and improvements.

https://github.com/cisagov/Malcolm/compare/v5.2.2...v5.2.3

Version bumps

Arkime v3.3.1

Zeek v4.2.0

Improvements

Added script and better documentation for putting Malcolm in "read-only" mode

Improved Files dashboard

Bug fixes

Fixed an issue where Logstash wasn't parsing the ftime from files.log correctly (a field added by the Spicy ZIP analyzer)

Fixed idaholab/Malcolm#73 (path for tcpdump changed) for Hedgehog Linux

Fixed idaholab/Malcolm#72 (better file directory/name parsing and normalization in Logstash)

Malcolm and Hedgehog Linux may be obtained by pulling or building the Docker images and/or building the ISO installer images as described in the documentation. Unofficial ISO installer images for Malcolm and Hedgehog Linux are not hosted on GitHub, but may be downloaded from https://malcolm.fyi/download/.
Source code(tar.gz)
Source code(zip)
install.py(77.54 KB)
malcolm_20220131_103441_ba503df.README.html(10.63 MB)
malcolm_20220131_103441_ba503df.README.txt(813 bytes)
malcolm_20220131_103441_ba503df.tar.gz(107.54 KB)
malcolm_common.py(13.43 KB)
v5.2.2(Jan 25, 2022)
Malcolm v5.2.2 is a patch release with some improvements to the API and a fix for using Zeek intelligence files on Hedgehog Linux.

https://github.com/cisagov/Malcolm/compare/v5.2.1...v5.2.2

Added more capabilities to the API

added /document/ API

added filter ability to /agg/ and /document/ API

added more documentation and examples

For Zeek intelligence files, changed the location from /opt/zeek/share/zeek/site/intel to /opt/sensor/sensor_ctl/zeek/intel so that they aren't lost on reboot
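
As a rough illustration of the filter ability added to the /agg/ and /document/ API, the sketch below only builds a request URL and does not contact a server. The base URL, the comma-separated field list, and the query parameter names filter and limit are assumptions made for this example; the API documentation shipped with the release is the authority on the real request format.

```python
import json
from typing import Any, Dict, Optional, Sequence
from urllib.parse import urlencode


def build_api_url(
    base_url: str,
    endpoint: str,
    fields: Optional[Sequence[str]] = None,
    filters: Optional[Dict[str, Any]] = None,
    limit: Optional[int] = None,
) -> str:
    """Compose a URL for an API endpoint such as "agg" or "document".

    Assumptions (not confirmed by the release notes): fields are appended
    to the path as a comma-separated list, and the filter is passed as a
    JSON object in a "filter" query parameter.
    """
    path = "/" + endpoint.strip("/")
    if fields:
        path += "/" + ",".join(fields)
    params: Dict[str, str] = {}
    if filters:
        params["filter"] = json.dumps(filters, sort_keys=True)
    if limit is not None:
        params["limit"] = str(limit)
    query = "?" + urlencode(params) if params else ""
    return base_url.rstrip("/") + path + query
```

For example, build_api_url("https://malcolm.example.org/mapi", "agg", fields=["network.protocol"], filters={"network.transport": "tcp"}) would describe a protocol aggregation restricted to TCP sessions, where the host name and path prefix are placeholders.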

install.py(77.54 KB)
malcolm_20220121_102525_d138f2f.README.html(10.40 MB)
malcolm_20220121_102525_d138f2f.README.txt(813 bytes)
malcolm_20220121_102525_d138f2f.tar.gz(102.40 KB)
malcolm_common.py(13.43 KB)
v5.2.1(Jan 21, 2022)

Malcolm v5.2.1 is a patch release identical to v5.2.0 with the addition of a fix (arkime/[email protected]) for a regression bug introduced in Arkime v3.3.0 which prevented the Arkime viewer from correctly loading some large or XORed packets.

In addition, a minor change was made to the startup scripts for Hedgehog Linux's Zeek configuration to allow Zeek intelligence files to be automatically loaded the same way they are in Malcolm's Zeek container.

https://github.com/cisagov/Malcolm/compare/v5.2.0...v5.2.1

install.py(77.54 KB)
malcolm_20220121_102525_d138f2f.README.html(10.40 MB)
malcolm_20220121_102525_d138f2f.README.txt(813 bytes)
malcolm_20220121_102525_d138f2f.tar.gz(102.40 KB)
malcolm_common.py(13.43 KB)
v5.2.0(Jan 21, 2022)
Malcolm v5.2.0 is a feature release with several new features and improvements, version bumps and bug fixes.

EDIT: As of this morning (1/21/2022) I'm tracking a regression in Arkime v3.3.0 with viewing the packet payload of some large sessions. It's likely a patch release will be put out later today to address this. Apologies.

https://github.com/cisagov/Malcolm/compare/v5.1.0...v5.2.0

New features

Zeek Intelligence Framework (see idaholab/Malcolm#20)

To quote Zeek's Intelligence Framework documentation, "The goals of Zeek’s Intelligence Framework are to consume intelligence data, make it available for matching, and provide infrastructure to improve performance and memory utilization. Data in the Intelligence Framework is an atomic piece of intelligence such as an IP address or an e-mail address. This atomic data will be packed with metadata such as a freeform source field, a freeform descriptive field, and a URL which might lead to more information about the specific item." Zeek intelligence indicator types include IP addresses, URLs, file names, hashes, email addresses, and more.

Malcolm doesn't come bundled with intelligence files from any particular feed, but they can be easily included into your local instance. On startup, Malcolm's malcolmnetsec/zeek docker container enumerates the subdirectories under ./zeek/intel (which is bind mounted into the container's runtime) and configures Zeek so that those intelligence files will be automatically included in its local policy. Subdirectories under ./zeek/intel which contain their own __load__.zeek file will be @load-ed as-is, while subdirectories containing "loose" intelligence files will be loaded automatically with a redef Intel::read_files directive.
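
The startup behavior described above can be sketched as follows. This is a minimal illustration, not the script the container actually runs: the function name and the way the generated text would be handed to Zeek are assumptions, and only the two rules (subdirectories with a __load__.zeek are @load-ed, loose files go into Intel::read_files) come from the description.

```python
from pathlib import Path
from typing import List


def build_intel_policy(intel_root: Path) -> str:
    """Generate Zeek policy text for the subdirectories of intel_root.

    Subdirectories holding a __load__.zeek file are @load-ed as-is; files
    in any other subdirectory are treated as "loose" intelligence files
    and collected into a single redef of Intel::read_files.
    """
    load_lines: List[str] = []
    loose_files: List[str] = []
    for subdir in sorted(p for p in intel_root.iterdir() if p.is_dir()):
        if (subdir / "__load__.zeek").is_file():
            load_lines.append(f"@load {subdir}")
        else:
            loose_files.extend(
                str(f) for f in sorted(subdir.iterdir()) if f.is_file()
            )

    lines = list(load_lines)
    if loose_files:
        quoted = ",\n".join(f'  "{path}"' for path in loose_files)
        lines.append("redef Intel::read_files += {\n" + quoted + "\n};")
    return "\n".join(lines) + ("\n" if lines else "")
```

Calling build_intel_policy(Path("./zeek/intel")) returns text that could be appended to a local policy file; an empty directory yields an empty string, so nothing is loaded.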

New OPCUA Binary protocol parser for Zeek and corresponding dashboard.

Improvements

set ecs.provider to arkime for logs from Arkime's capture to make categorizing logs by source easier

API

allow bucketing multiple fields from /agg/ API

added /fields/ API to list fields

added documentation

ECS normalization to related.hosts field for all applicable protocols

updated documentation, screenshots and slides

spreadsheet mapping STIX v1.2 fields to Zeek fields and Malcolm normalized fields

updated MITRE ATT&CK mappings for Capa hits

added a pseudo-read-only NGINX configuration

Version bumps

Arkime to v3.3.0

OpenSearch to v1.2.4

Capa to v3.1.0

cve-2021-44228 Log4Shell detector plugin for Zeek to v0.5.3 (see corelight/cve-2021-44228#46)

Bug Fixes

fix idaholab/Malcolm#71 (type mismatch for network.vlan.id between Malcolm and Arkime definitions) which prevented vlan traffic from indexing correctly from Arkime's capture with Malcolm's field template

fix for ethernet/IP traffic which could lead to Zeek runaway memory allocation until crash: "Fixed bug with Request Paths containing Port Segments" (cisagov/[email protected])

install.py(77.54 KB)
malcolm_20220120_210106_d3e70f8.README.html(10.40 MB)
malcolm_20220120_210106_d3e70f8.README.txt(813 bytes)
malcolm_20220120_210106_d3e70f8.tar.gz(102.42 KB)
malcolm_common.py(13.43 KB)
v5.1.0(Jan 5, 2022)
Malcolm v5.1.0 is a feature release laying the groundwork for a new REST API for querying Malcolm. It also contains a few component version bumps.

https://github.com/cisagov/Malcolm/compare/v5.0.4...v5.1.0

New features

put framework in for Malcolm REST API (idaholab/Malcolm#70) - not feature complete yet, but minimally usable

Version bumps

OpenSearch to v1.2.3

LogStash Docker image to v7.16.2 with OpenSearch output plugin v1.2.0

Latest releases of Zeek packages

Misc.

Reformatted all Python code with Black with the options --line-length 120 --skip-string-normalization

Updated some deprecated logstash filter parameters in translate filter

install.py(77.54 KB)
malcolm_20220105_092906_3957e25.README.html(10.35 MB)
malcolm_20220105_092906_3957e25.README.txt(813 bytes)
malcolm_20220105_092906_3957e25.tar.gz(99.40 KB)
malcolm_common.py(13.43 KB)
v5.0.4(Dec 20, 2021)
Malcolm v5.0.4 is a patch release with improvements to Zeek detection of CVE-2021-44228 ("Log4Shell" Log4J vulnerability).

https://github.com/cisagov/Malcolm/compare/v5.0.3...v5.0.4

build with latest corelight/cve-2021-44228 release

install.py(65.47 KB)
malcolm_20211220_080355_d8824fd.README.html(10.30 MB)
malcolm_20211220_080355_d8824fd.README.txt(813 bytes)
malcolm_20211220_080355_d8824fd.tar.gz(99.97 KB)
malcolm_common.py(12.57 KB)
v5.0.3(Dec 16, 2021)
Malcolm v5.0.3 is a patch release with a few minor bug fixes and improvements to Zeek detection of CVE-2021-44228 ("Log4Shell" Log4J vulnerability).

https://github.com/cisagov/Malcolm/compare/v5.0.2...v5.0.3

build with latest zeek/spicy-ldap release (dpd-based detection rather than just port-based)

build with latest corelight/cve-2021-44228 release

fix idaholab/Malcolm#69 (zeek resists shutdown on sensor during halt/reboot)

bump OpenSearch to v1.2.2 which has log4j 2.16

added convenience script for working with GitHub workflow-built images

install.py(65.47 KB)
malcolm_20211216_131705_97c18e3.README.html(10.30 MB)
malcolm_20211216_131705_97c18e3.README.txt(813 bytes)
malcolm_20211216_131705_97c18e3.tar.gz(99.69 KB)
malcolm_common.py(12.57 KB)
v5.0.2(Dec 15, 2021)
Malcolm v5.0.2 is a patch release adding HTTP header-based Zeek detection of CVE-2021-44228 ("Log4Shell" Log4J vulnerability).

https://github.com/cisagov/Malcolm/compare/v5.0.1...v5.0.2

Added Corelight's Zeek detection script for CVE-2021-44228 ("Log4Shell" Log4J vulnerability)

move zeek.http.tags field up to top-level tags

Version bumps

Arkime to v3.2.1

Alpine (for dashboards-helper, name-map-ui and nginx-proxy Docker containers) to v3.15.0

NGINX (for nginx-proxy Docker container) to v1.20.2

install.py(65.47 KB)
malcolm_20211215_135958_3f6f71c.README.html(10.30 MB)
malcolm_20211215_135958_3f6f71c.README.txt(813 bytes)
malcolm_20211215_135958_3f6f71c.tar.gz(97.10 KB)
malcolm_common.py(12.57 KB)
v5.0.1(Dec 14, 2021)
Malcolm v5.0.1 is a patch release with minor bug- and security-related fixes.

https://github.com/cisagov/Malcolm/compare/v5.0.0...v5.0.1

Security vulnerabilities addressed:

mitigations for CVE-2021-44228 (log4shell) idaholab/Malcolm#68

Bugs addressed:

Very large pcaps don't get processed idaholab/Malcolm#44

pcap files with colon (:) in the name don't process correctly idaholab/Malcolm#2

turning off AUTO_TAG feature disables tagging altogether idaholab/Malcolm#12

recent debinterfaces release broke configure-interfaces.py idaholab/Malcolm#48

opensearch indexes in yellow state idaholab/Malcolm#67

arkime capture gives mlockall_init() warning on startup idaholab/Malcolm#66

Other

bumped Arkime from v3.1.1 to v3.2.0

bumped OpenSearch to v1.2.1

switched from elasticsearch to opensearch python client libraries

write contributor's guide for source code contributions/modifications idaholab/Malcolm#25

handle new fields in ethernet/IP logs (cisagov/[email protected])

use more recognizable dashboards logo for OpenSearch dashboards launcher in Malcolm ISO

include patches used to build Arkime Dockerfile when building Arkime for hedgehog as well

build Zeek spicy analyzers from their various repos rather than the zeek/spicy-analyzer meta-repo

install.py(65.47 KB)
malcolm_20211214_081008_b59e237.README.html(10.30 MB)
malcolm_20211214_081008_b59e237.README.txt(813 bytes)
malcolm_20211214_081008_b59e237.tar.gz(97.09 KB)
malcolm_common.py(12.57 KB)
v5.0.0(Dec 7, 2021)
Malcolm v5.0.0 is a major release which addresses idaholab/Malcolm#54, the transition from Elasticsearch to OpenSearch.

https://github.com/cisagov/Malcolm/compare/v4.0.1...v5.0.0

Malcolm has switched to the OpenSearch project as the basis of its search and analytics capabilities, mainly for two reasons:

Elastic.co's decision to no longer release Elasticsearch and Kibana under an open source license

Capabilities available under OpenSearch (and previously under Open Distro for Elasticsearch) that are only available with paid "premium" Elastic.co subscriptions (machine learning anomaly detection, alerting, reporting, etc.)

As the Malcolm project uses semantic versioning when choosing version numbers, this backwards-compatibility breaking change is the reason for bumping the major version number from 4 to 5. It is not recommended to attempt an upgrade from a previous release; a fresh install is required.
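
The semantic versioning rule mentioned above lends itself to a small check. The sketch below assumes release tags of the form vMAJOR.MINOR.PATCH, as used throughout these notes; the helper names are hypothetical and not part of Malcolm's scripts.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Split a tag such as "v5.0.0" into (major, minor, patch) integers."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def needs_fresh_install(installed: str, target: str) -> bool:
    """A higher major version signals a backwards-incompatible change."""
    return parse_version(target)[0] > parse_version(installed)[0]
```

Going from v4.0.1 to v5.0.0 crosses a major version boundary, so the check returns True, while v5.0.0 to v5.0.1 is a patch release and returns False.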

Historical context for the events and reasoning behind this change:

Elastic announces license change

Amazon NOT OK

Doubling Down on Open

Doubling Down on Open, Part II

Elastic License v2

FAQ on 2021 License Change

Does this mean that Elasticsearch and Kibana are no longer Open Source? Yes. Neither the Elastic License nor SSPL have been approved by the OSI, so to prevent confusion, we no longer refer to Elasticsearch or Kibana as open source.

old "open source" tier ("Apache 2.0: Now and always" 🙄) goes away

The SSPL is not an open source license

OpenSearch fork:

Stepping up for a truly open source Elasticsearch

Introducing OpenSearch

Truly Doubling Down on Open Source @ logz.io

FAQ

Third-party blogs, etc.

Elasticsearch does not belong to Elastic

Elasticsearch and Kibana are now business risks

Is Elasticsearch no longer open source software?

The Implications of Elasticsearch and Kibana License Change

Let's talk about the Elastic license change

Elastic is going closed-source. Where does that leave MSSPs?

install.py(65.47 KB)
malcolm_20211207_114821_17e34d0.README.html(10.30 MB)
malcolm_20211207_114821_17e34d0.README.txt(813 bytes)
malcolm_20211207_114821_17e34d0.tar.gz(99.54 KB)
malcolm_common.py(12.57 KB)
v4.0.1(Dec 1, 2021)
Malcolm v4.0.1 is a point release with the following updates:

https://github.com/cisagov/Malcolm/compare/v4.0.0...v4.0.1

Incorporate support for OSPF package analyzer and add relevant visualizations

Fix for building Zeek Spicy analyzer plugins as they are being split out into separate repositories rather than just the Zeek spicy-analyzers repo

This may be the final release of Malcolm prior to the completion of the transition from Elasticsearch to OpenSearch.

install.py(65.52 KB)
malcolm_20211201_131509_d667f10.README.html(9.43 MB)
malcolm_20211201_131509_d667f10.README.txt(794 bytes)
malcolm_20211201_131509_d667f10.tar.gz(97.07 KB)
malcolm_common.py(12.57 KB)
v4.0.0(Nov 18, 2021)
Malcolm v4.0.0 consists of a major restructuring of the underlying data schema used to represent Zeek logs (and, going forward, logs from other data sources) in the Elasticsearch data store. As the Malcolm project uses semantic versioning when choosing version numbers, this backwards-compatibility breaking change is the reason for bumping the major version number from 3 to 4 despite no significant new functionality being introduced.

The details of the drivers behind this change can be found at idaholab/Malcolm#64 and idaholab/Malcolm#16. This change, though somewhat painful, will make it easier to integrate more data sources into Malcolm in the future and potentially makes Malcolm's network session data more compatible with other tools that use the Elastic Common Schema.

https://github.com/cisagov/Malcolm/compare/v3.4.0...v4.0.0

BREAKING CHANGES:

as many field names have changed, custom saved dashboards and/or bookmarks to Kibana or Arkime visualizations may need to be adjusted accordingly

old network session data (stored in the sessions2-* indices in Elasticsearch) will not be visible (as the indices are now named arkime-sessions3-*)

A fresh install of Malcolm is recommended with this release. Upgrading from previous versions of Malcolm to v4.0.0+ is not suggested.

Changes:

added GitHub workflow files which contain instructions for GitHub to build the docker images and sensor and Malcolm installer ISOs.

moved many fields with Zeek-specific names to generic ECS-specified (or at least "ECS-inspired") field names, updating related parsing code and dashboard definitions

changed Zeek-specific field naming schema (e.g., zeek_foo.bar becomes zeek.foo.bar)

added Corelight's Microsoft Excel privilege escalation detection (CVE-2021-42292) plugin

integrated updates to the LDAP parser which improve the detail given from observed LDAP searches

improved and genericized the code for mapping MAC addresses to vendor OUIs, replacing the use of logstash-filter-ieee_oui

updated some Dockerfiles to use Debian 11 "bullseye" instead of Debian 10 "buster"
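
For saved queries or bookmarks that still use the old Zeek-specific naming, the zeek_foo.bar to zeek.foo.bar change listed above can be applied mechanically. The sketch below handles only that prefix change; fields that moved to ECS names need a manual mapping that is not covered here. The function name is hypothetical.

```python
import re

# Matches the legacy prefix "zeek_<log>." where <log> is the Zeek log name,
# e.g. "zeek_http." or "zeek_known_certs.".
_LEGACY_ZEEK_FIELD = re.compile(r"\bzeek_([A-Za-z0-9_]+)\.")


def modernize_field_names(text: str) -> str:
    """Rewrite zeek_foo.bar style field names to zeek.foo.bar."""
    return _LEGACY_ZEEK_FIELD.sub(r"zeek.\1.", text)
```

Applied to a query string such as "zeek_http.host:example AND zeek_dns.query:x", this yields "zeek.http.host:example AND zeek.dns.query:x" and leaves already-converted or unrelated field names untouched.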

install.py(65.52 KB)
malcolm_20211118_123707_174600e.README.html(9.43 MB)
malcolm_20211118_123707_174600e.README.txt(794 bytes)
malcolm_20211118_123707_174600e.tar.gz(98.03 KB)
malcolm_common.py(12.57 KB)
v3.4.0(Oct 28, 2021)
Malcolm v3.4.0 is a feature release focused on bringing its major underlying components up-to-date with the latest released versions, increasing stability, improving performance and adding new features.

https://github.com/cisagov/Malcolm/compare/v3.3.1...v3.4.0

Component version updates

Arkime v3.1.1 (from v2.7.1)

Zeek v4.1.1 (from v4.0.4)

Zeek v4.1 Feature Release

Spicy v1.3.0 (from v1.2.1)

Yara v4.1.3 (from v4.1.2)

Capa v3.0.3 (from v3.0.2)

Debian v10 to v11 (for ISO images)

Added GitHub actions for building the Malcolm Docker images on GitHub and pushing them to GHCR

Moved common Logstash Ruby code to file-based scripting

Use standard stunnel package in NGINX proxy container rather than building from source

Switched from CLANG to GCC build toolchain for Zeek and Spicy plugins

Replaced LXDE desktop environment with XFCE (for ISO images)

Renamed various fields to align with Arkime's gradual adoption of the Elastic Common Schema

Added parser support and dashboard for the STUN (Session Traversal Utilities for NAT) protocol

Further improved capabilities for tagging ICS traffic

Logs from known ICS protocols now have ics added to the tags field

Logs identified by "ICS best guess" lookups now have ics_best_guess added to the tags field

"ICS best guess" lookups have been augmented with a MAC address lookup table of ICS hardware vendors

ICS-related overview dashboards have been updated accordingly
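
The tagging rules above can be summarized in a short sketch. It is illustrative only: the protocol set is a small subset chosen for the example, the OUI entry is a placeholder rather than a row from Malcolm's lookup table, and only the MAC-based part of the "ICS best guess" lookup is modeled.

```python
from typing import Iterable, List, Optional

# Illustrative subset of ICS protocol names, not Malcolm's full list.
KNOWN_ICS_PROTOCOLS = frozenset({"modbus", "dnp3", "bacnet"})
# Placeholder vendor prefix standing in for the ICS hardware vendor table.
ICS_VENDOR_OUIS = frozenset({"00:00:BC"})


def _oui(mac: Optional[str]) -> Optional[str]:
    """Return the first three octets of a MAC address in upper case."""
    if not mac:
        return None
    return mac.upper().replace("-", ":")[:8]


def ics_tags(
    protocol: Optional[str],
    macs: Iterable[str],
    existing_tags: Iterable[str],
) -> List[str]:
    """Add "ics" for known ICS protocols, else "ics_best_guess" on an OUI hit."""
    tags = list(existing_tags)
    if protocol and protocol.lower() in KNOWN_ICS_PROTOCOLS:
        new_tag: Optional[str] = "ics"
    elif any(_oui(mac) in ICS_VENDOR_OUIS for mac in macs):
        new_tag = "ics_best_guess"
    else:
        new_tag = None
    if new_tag and new_tag not in tags:
        tags.append(new_tag)
    return tags
```

Keeping the two tags distinct lets a dashboard separate traffic positively identified as an ICS protocol from traffic that is only suspected to be ICS-related because of the hardware involved.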

install.py(65.48 KB)
malcolm_20211028_102341_2fe758a.README.html(9.43 MB)
malcolm_20211028_102341_2fe758a.README.txt(794 bytes)
malcolm_20211028_102341_2fe758a.tar.gz(97.30 KB)
malcolm_common.py(12.57 KB)

Malcolm is a powerful, easily deployable network traffic analysis tool suite for full packet capture artifacts (PCAP files) and Zeek logs.

Overview

Malcolm

Table of Contents

Quick start

Getting Malcolm

Source code

Building Malcolm from scratch

Initial configuration

Pull Malcolm's Docker images

Import from pre-packaged tarballs

Starting and stopping Malcolm

User interface

Overview

Components

Supported Protocols

Development

Building from source

Pre-Packaged installation files

Creating pre-packaged installation files

Installing from pre-packaged installation files

Preparing your system

Recommended system requirements

System configuration and tuning

docker-compose.yml parameters

Linux host system configuration

Installing Docker

Installing docker-compose

Operating system configuration

macOS host system configuration

Automatic installation using install.py

Install Homebrew

Install docker-edge

Configure docker daemon option

Windows host system configuration

Installing and configuring Docker Desktop for Windows

Finish Malcolm's configuration

Running Malcolm

Configure authentication

Local account management

Lightweight Directory Access Protocol (LDAP) authentication

LDAP connection security

Starting Malcolm

Stopping and restarting Malcolm

Clearing Malcolm’s data

Capture file and log archive upload

Tagging

Processing uploaded PCAPs with Zeek

Live analysis

Capturing traffic on local network interfaces

Using a network sensor appliance

Manually forwarding Zeek logs from an external source

Arkime

Zeek log integration

Correlating Zeek logs and Arkime sessions

Help

Sessions

PCAP Export

SPIView

SPIGraph

Connections

Hunt

Statistics

Settings

General settings

Kibana

Discover

Screenshots

Visualizations and dashboards

Prebuilt visualizations and dashboards

Screenshots

Building your own visualizations and dashboards

Screenshots

Search Queries in Arkime and Kibana

Other Malcolm features

Automatic file extraction and scanning

Automatic host and subnet name assignment

IP/MAC address to hostname mapping via host-map.txt

CIDR subnet to network segment name mapping via cidr-map.txt
