Overview

AWS Data Engineering Pipeline

This is a repository for the Duke University Cloud Computing course project on a Serverless Data Engineering Pipeline. For this project, I recreated the pipeline below in AWS Cloud9 (reference: https://github.com/noahgift/awslambda):

[Pipeline diagram]

Follow these steps to build the pipeline in AWS:

1️⃣ Create a new AWS Cloud9 environment dedicated to this project.

🤔 Need a refresher? Please check this repo.

⚠️ Make sure to use name as the unique ID (the primary key) for the items in the fang table.

2️⃣ Create a fang table in DynamoDB and an SQS queue.

You can check how to do it here.
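If you would rather script this step than click through the console, here is a minimal sketch using boto3. It assumes boto3 is installed and AWS credentials are configured. The queue name "producer" and the on-demand billing mode are assumptions made for illustration; the steps above do not specify them. The clients are passed in as arguments so the function can be tried with stand-ins.

```python
def create_fang_resources(dynamodb, sqs, table_name="fang", queue_name="producer"):
    """Create the DynamoDB table (keyed on "name") and the SQS queue.

    queue_name is a hypothetical placeholder; pick any name you like.
    Returns the URL of the new queue.
    """
    dynamodb.create_table(
        TableName=table_name,
        # "name" is the unique id (partition key) of every item
        KeySchema=[{"AttributeName": "name", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "name", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",  # assumption: on-demand capacity
    )
    response = sqs.create_queue(QueueName=queue_name)
    return response["QueueUrl"]


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   url = create_fang_resources(boto3.client("dynamodb"), boto3.client("sqs"))
```

Note the queue URL that is returned; you will need the queue again when you add the SQS trigger.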

3️⃣ Build producer Lambda Function

  1. In AWS Cloud9, initialize a serverless application from a SAM template:

    sam init

Inputs: 1, 2, 4, "producer"

  2. Create a virtual environment and activate it:

    # I called my virtual environment "comprehendProducer"
    python3 -m venv ~/.comprehendProducer
    source ~/.comprehendProducer/bin/activate

  3. Add the code for your application to app.py.

  4. Add the packages used in your app to the requirements.txt file.

  5. Install the requirements:

     cd hello_world/
     pip install -r requirements.txt
     cd ..

  6. Create a repository (producer) in Elastic Container Registry (ECR) and copy its URI.

  7. Build and deploy your serverless application:

    sam build
    sam deploy --guided

    When prompted for a URI, paste the URI of the producer repository that you just created.

  8. Create an IAM role granting Administrator Access to the producer Lambda function.

    🤔 Not sure how to create an IAM role? Check out this video (17 min).

  9. Add the execution role that you created to the producer Lambda function.

    In case you forgot how to do it:

    In the AWS console: Lambda ➡️ click on the producer function ➡️ Configuration ➡️ Permissions ➡️ Edit ➡️ select the role under Existing role.

  10. You are all set with the producer function! Now deactivate the virtual environment:

    deactivate
    cd ..
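For orientation, here is a rough sketch of what a producer app.py could look like. It is not the code from the referenced repository: it assumes the producer reads every name from the fang table and sends one SQS message per name, and the queue name "producer" and the JSON message shape are hypothetical. The helpers take the table and queue as arguments so they can be exercised without AWS.

```python
import json


def scan_names(table):
    """Return the "name" of every item in the table, following pagination."""
    names, kwargs = [], {}
    while True:
        page = table.scan(**kwargs)
        names.extend(item["name"] for item in page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return names
        # More pages remain: continue the scan where the last one stopped
        kwargs["ExclusiveStartKey"] = last_key


def send_names(queue, names):
    """Send one SQS message per name and return how many were sent."""
    for name in names:
        queue.send_message(MessageBody=json.dumps({"name": name}))
    return len(names)


def lambda_handler(event, context):
    # Imported here so the helpers above can be tested without boto3;
    # add boto3 to requirements.txt if your image does not provide it.
    import boto3

    table = boto3.resource("dynamodb").Table("fang")
    # "producer" is a hypothetical queue name; use the queue you created
    queue = boto3.resource("sqs").get_queue_by_name(QueueName="producer")
    sent = send_names(queue, scan_names(table))
    return {"statusCode": 200, "body": json.dumps({"sent": sent})}
```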
    

4️⃣ Create an S3 bucket and note its name

5️⃣ Build consumer Lambda Function

Repeat the steps in 3️⃣.

⚠️ When you add the code for the consumer app to app.py, make sure to replace bucket="fangsentiment" with the name of your S3 bucket.
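To show where the bucket name fits, here is a simplified sketch of a consumer app.py. It is an illustration under stated assumptions, not the original application: it assumes each SQS message body is JSON with a "name" field, scores the name itself with Amazon Comprehend (a real app would likely score longer text about each name), and writes one CSV file per invocation. The function `process` and the CSV layout are hypothetical.

```python
import csv
import io
import json


def names_from_sqs_event(event):
    """Extract the "name" field from each SQS record in a Lambda event."""
    return [json.loads(r["body"])["name"] for r in event.get("Records", [])]


def rows_to_csv(rows):
    """Serialize (name, sentiment) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "sentiment"])
    writer.writerows(rows)
    return buffer.getvalue()


def process(event, analyze, s3, bucket, key):
    """Score every name in the event and store the result in S3."""
    rows = [(name, analyze(name)) for name in names_from_sqs_event(event)]
    s3.put_object(Bucket=bucket, Key=key, Body=rows_to_csv(rows).encode("utf-8"))
    return len(rows)


def lambda_handler(event, context):
    import boto3  # imported here so the helpers can be tested without AWS

    comprehend = boto3.client("comprehend")

    def analyze(text):
        # Assumption: English text, sentiment label only
        result = comprehend.detect_sentiment(Text=text, LanguageCode="en")
        return result["Sentiment"]

    count = process(
        event,
        analyze,
        boto3.client("s3"),
        bucket="fangsentiment",  # replace with the name of your S3 bucket
        key=f"{context.aws_request_id}.csv",
    )
    return {"statusCode": 200, "body": json.dumps({"rows": count})}
```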

6️⃣ Add triggers to Lambda Functions

🤔 Not sure how to do it? Check out this video (start times are noted below):

Producer Lambda Function: CloudWatch Events (30 min)

Consumer Lambda Function: SQS (42 min)
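The video sets up both triggers in the console. As an alternative, the sketch below wires them with boto3. Treat it as a starting point under assumptions: the rule name, the five-minute rate, and the function name "consumer" are hypothetical, and you need to supply the real function ARN and queue ARN from your deployment. The clients are passed in so the functions can be tried with stand-ins.

```python
def schedule_producer(events, lambda_client, function_arn,
                      rule_name="producer-schedule", rate="rate(5 minutes)"):
    """Invoke the producer on a schedule via a CloudWatch Events rule."""
    rule_arn = events.put_rule(
        Name=rule_name, ScheduleExpression=rate, State="ENABLED"
    )["RuleArn"]
    # Allow the rule to invoke the function
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # Point the rule at the function
    events.put_targets(
        Rule=rule_name, Targets=[{"Id": "producer", "Arn": function_arn}]
    )
    return rule_arn


def connect_queue_to_consumer(lambda_client, queue_arn,
                              function_name="consumer", batch_size=10):
    """Make the SQS queue an event source of the consumer function."""
    response = lambda_client.create_event_source_mapping(
        EventSourceArn=queue_arn,
        FunctionName=function_name,
        BatchSize=batch_size,
    )
    return response["UUID"]


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   schedule_producer(boto3.client("events"), boto3.client("lambda"), "<producer function ARN>")
#   connect_queue_to_consumer(boto3.client("lambda"), "<queue ARN>")
```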

7️⃣ If all goes well, you will see sentiment results in your S3 bucket:

[Screenshot: sentiment results in the S3 bucket]

💡 Tip: If you've already deployed your Lambda function but need to change your application, edit the app, then build and deploy it again:

sam build && sam deploy 

💡 Tip: If you run out of disk space, you may want to remove a few Docker images that you don't use.

# list images
docker image ls
# remove an image
docker image rm <imageId>