
YAWPS

Yet Another Workflow Parser for SecurityHub

"Screaming pepper" by Rum Bucolic Ape is licensed with CC BY-ND 2.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-nd/2.0/

Purpose

Currently SecurityHub has an AWS Chatbot integration that's a bit lacking: all of SecurityHub goes to Chatbot, which means a single, flooded channel of alerts.

With cloud-custodian's recent support for SecurityHub and Organizations, we have a good way to send all alerts for an entire org to Slack. But that still means every account goes to a single channel.

This repo is part of a multi-part talk/demo on how to intelligently route per-account messages to different Slack channels.

In the scenario where a team owns an account, it would be nice to let cloud-custodian generate meaningful SecurityHub notifications that go to that team's specific channel.

For this talk we will simply tag AWS accounts with two tags: account_name (a human-readable name) and slack_channel (the Slack channel to direct those SecurityHub notifications to).
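
As a rough illustration (this is not part of the repo, and the account ID and channel are placeholders), the tags could be applied with boto3's Organizations API:

```python
import boto3

# Hypothetical example: tag a member account so its findings can be routed.
org = boto3.client("organizations")
org.tag_resource(
    ResourceId="111111111111",  # placeholder member account ID
    Tags=[
        {"Key": "account_name", "Value": "payments-prod"},
        {"Key": "slack_channel", "Value": "#payments-alerts"},
    ],
)
```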

A blog post and KubeCon talk will be coming soon with more information.

Prerequisites

The only real prerequisite here is a working multi-account SecurityHub setup.

Configuration

| Environment Variable | Description |
| --- | --- |
| SLACK_FALLBACK_CHANNEL | Channel to fall back to if the slack_channel tag is not set on the account |
| SLACK_TOKEN | The Slack API token |
| SLACK_TOKEN_SSM_PATH | If SLACK_TOKEN is not set, the EC2 Parameter Store path to fetch the token from |
| LOGGING_LEVEL | The logging level to use. Default is INFO |
| ENABLE_FORK_COPY_SEVERITY | Enable forking a copy of some messages to another channel by severity. Value can be True or False. Default is False |
| FORK_COPY_SEVERITY_VALUE | If ENABLE_FORK_COPY_SEVERITY is True, the severity level to fork at. An integer between 0 and 100. Default is 90 |
| ENABLE_FORK_ONLY_SEVERITY | Enable sending some messages only to another channel by severity. Value can be True or False. Default is False |
| FORK_ONLY_SEVERITY_VALUE | If ENABLE_FORK_ONLY_SEVERITY is True, the severity level to fork at. An integer between 0 and 100. Default is 100 |
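
The sketch below is not the project's actual code; it only illustrates, under the assumption that the variables combine this way, how the token settings are expected to interact: use SLACK_TOKEN if it is set, otherwise fetch the token from the Parameter Store path.

```python
import os

import boto3

# Illustrative only: resolve settings the way the table above describes.
slack_fallback_channel = os.environ["SLACK_FALLBACK_CHANNEL"]  # required
logging_level = os.environ.get("LOGGING_LEVEL", "INFO")

slack_token = os.environ.get("SLACK_TOKEN")
if not slack_token:
    # Fall back to the EC2 Parameter Store path when no token is set directly.
    ssm = boto3.client("ssm")
    slack_token = ssm.get_parameter(
        Name=os.environ["SLACK_TOKEN_SSM_PATH"],
        WithDecryption=True,
    )["Parameter"]["Value"]
```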

Forking

There are a few use cases for forking.

With all defaults, YAWPS only sends to the channel found in the slack_channel tag, or to SLACK_FALLBACK_CHANNEL (which is required) when the tag is missing.

This is great until you have rules that you want a second team (let's say security) to also see and follow up on.

Using ENABLE_FORK_COPY_SEVERITY and FORK_COPY_SEVERITY_VALUE lets you also send a copy to that second Slack channel. Let's say you set FORK_COPY_SEVERITY_VALUE to 90: anything rated 90 will be sent to both channels.

Another use case exists: not sending team-specific alerts at all. Let's say an alert is not actionable by the configured team but is purely for security visibility (failed IAM logins, etc.). You can set ENABLE_FORK_ONLY_SEVERITY with FORK_ONLY_SEVERITY_VALUE at, say, 100, so that custom rules setting severity to 100 are sent only to security, bypassing the primary team's channel. This is good for noise filtration and helps keep each channel actionable by a single owner.
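
Pulling those cases together, a minimal sketch of the routing decision might look like this. The security channel name, the greater-than-or-equal comparison, and the function shape are assumptions for illustration, not the project's actual implementation.

```python
import os

# Hypothetical sketch of the routing rules described above; the security
# channel, the >= comparisons, and the return shape are illustrative only.
def route_finding(severity: int, team_channel: str | None) -> list[str]:
    fallback = os.environ["SLACK_FALLBACK_CHANNEL"]
    security_channel = "#security-alerts"  # placeholder fork target

    channels = [team_channel or fallback]

    # Fork-only: at or above this severity, bypass the team and notify only security.
    if os.environ.get("ENABLE_FORK_ONLY_SEVERITY") == "True":
        if severity >= int(os.environ.get("FORK_ONLY_SEVERITY_VALUE", "100")):
            return [security_channel]

    # Fork-copy: at or above this severity, notify the team *and* security.
    if os.environ.get("ENABLE_FORK_COPY_SEVERITY") == "True":
        if severity >= int(os.environ.get("FORK_COPY_SEVERITY_VALUE", "90")):
            channels.append(security_channel)

    return channels
```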

Deploy

Serverless

TODO

Terraform

  1. Download this repository (or a released artifact)
  2. Run make zip to produce a fully deployable S3 artifact
  3. Deploy something similar to this Terraform

Testing

$ poetry install
$ poetry run tox