This is a repository for the Duke University Cloud Computing course project on Serveless Data Engineering Pipeline. For this project, I recreated the below pipeline.

Overview

AWS Data Engineering Pipeline

This is a repository for the Duke University Cloud Computing course project on Serverless Data Engineering Pipeline. For this project, I recreated the below pipeline in iCloud9 (reference: https://github.com/noahgift/awslambda):

drawing

Below are the steps of how to build this pipeline in AWS:

1️⃣ Create a new iCloud9 environment dedicated to this project.

🤔 Need a refresher? Please check this repo.

⚠️ Make sure to use name as your unique id for your items in the fang table.

2️⃣ Create a fang table in DynamoDB and SQS queue.

You can check how to do it here.

3️⃣ Build producer Lambda Function

  1. In iCloud9, initialize a serverless application with SAM template:

    sam init 

Inputs: 1, 2, 4, "producer"

  1. Set virtual environment and source it:

    # I called my virtual environment "comprehendProducer"
    python3 -m venv ~/.comprehendProducer
    source ~/.comprehendProducer/bin/activate
  2. Add the code for your application to app.py

  3. Add relevant packages used in your app to requirements.txt file

  4. Install requirements

     cd hello_world/
     pip install -r requirements.txt 
     cd .. 
  5. Create a repository (producer) in Elastic Container Registry (ECR) and copy its URI

  6. Build and deploy your serverless application:

    sam build 
    sam deploy --guided

    When prompted to input URI, paste the URI for the producer repository that you've just created.

  7. Create IAM Role granting Administrator Access to the Producer Lambda function.

    🤔 Not sure how to create IAM Role? Check out this video (17 min ).

  8. Add the execution role that you created to the Producer Lambda function.

    In case you forgot how to do it:

    In AWS console: Lambda ➡️ click on producer function ➡️ configuration ➡️ permissions ➡️ Edit ➡️ Select the role under Existing role.

  9. You are all set with the producer function! Now deactivate virtual environment:

    deactivate 
    cd .. 
    

4️⃣ Create an S3 bucket and note its name

5️⃣ Build consumer Lambda Function

Repeat steps in 3️⃣ .

⚠️ In #3 when you add the code for a consumer app to app.py, make sure to replace bucket="fangsentiment" with the name of your S3 bucket.

6️⃣ Add triggers to Lambda Functions

🤔 Not sure how to do it? Check out this video (start times are noted below):

Producer Lambda Function: CloudWatchEvent(30 min)

Consumer Lambda Function: SQS (42 min)

7️⃣ If all goes well, you will see sentiment results in your S3 bucket:

s3

💡 Tip: If you've already deployed your Lambda function but need to edit your application, you can make the necessary edits to your app and build and deploy the app again:

sam build && sam deploy 

💡 Tip: If you don't have space left on disk, you may want to remove a few docker containers that you don't use.

#list containers 
docker image ls 
# remove a container 
docker image rm <containerId>
Um bot para contar quantas vezes o meu amigo troca de pfp/nick/tag essas coisas ae pq aquele mlk n para quieto

EkiBot Um bot que tem apenas as suas funções de audit log com as PFP's (avatares) dos usuários Pode ser usado para um usuário em específico, ou até me

Samuel 3 Aug 11, 2021
This is a straightforward python implementation to specifically grab basic infos about IPO companies in China from Sina Stock website.

SinaStockBasicInfoCollect This is a straightforward python implementation to specifically grab basic infos about IPO companies in China from Sina Stoc

CrosSea 1 Dec 09, 2021
Python library for Spurwing API to schedule appointments, manage calendars and custom integrations.

Spurwing API Python Library Lightweight Python library for Spurwing's API. Spurwing's API makes it easy to add robust scheduling and booking to your a

Spurwing 1 Jul 14, 2021
🔮 Uncover some followers of a private instagram account

Private Instagram Chaining 🔮 Uncover part of followers of an instagram private account I have this private instagram account julianakhao. I need to g

аэт 69 Dec 17, 2022
An attendance bot that joins google meet automatically according to schedule and marks present in the google meet.

Google-meet-self-attendance-bot An attendance bot which joins google meet automatically according to schedule and marks present in the google meet. I

Sarvesh Wadi 12 Sep 20, 2022
An automated bot for twitter using Tweepy!

Tweeby An automated bot for twitter using Tweepy! About This bot will look for tweets that contain certain hashtags, if found. It'll send them a messa

Ori 1 Dec 06, 2021
A listener for RF >= 4.0 that prints a Stack Trace to console to faster find the code section where the failure appears.

robotframework-stacktrace A listener for RF = 4.0 that prints a Stack Trace to console to faster find the code section where the failure appears. Ins

marketsquare 16 Nov 24, 2022
A Telegram Bot for searching any channel messages from Inline by @AbirHasan2005

Message-Search-Bot A Telegram Bot for searching any channel messages from Inline by @AbirHasan2005. I made this for @AHListBot. You can use this for s

Abir Hasan 44 Dec 27, 2022
Deepl - DeepL Free API For Python

DeepL DeepL Free API Notice Since I don't want to make my AuthKey public, if you

Vincent Young 4 Apr 11, 2022
Bot made by BLACKSTORM[BM] Contact Us - t.me/BLACKSTORM18

ᴡʜᴀᴛ ɪs ᴊᴀʀᴠɪs sᴇᴄᴜʀɪᴛʏ ʙᴏᴛ ᴊᴀʀᴠɪs ʙᴏᴛ ɪs ᴛᴇʟᴇɢʀᴀᴍ ɢʀᴏᴜᴘ ᴍᴀɴᴀɢᴇʀ ʙᴏᴛ ᴡɪᴛʜ ᴍᴀɴʏ ғᴇᴀᴛᴜʀᴇs. ᴛʜɪs ʙᴏᴛ ʜᴇʟᴘs ʏᴏᴜ ᴛᴏ ᴍᴀɴᴀɢᴇ ʏᴏᴜʀ ɢʀᴏᴜᴘs ᴇᴀsɪʟʏ. ᴏʀɪɢɪɴᴀʟʟʏ ᴀ

1 Dec 11, 2021
A tool written in Python used to instalock agents in VALORANT using the local API.

Valorant Instalock Tool v2.1.0 by Mr. SOSA A tool written in Python used to instalock agents in VALORANT using the local API. This is NOT a hotkey pro

Mr. SOSA 3 Nov 18, 2021
SOLSEA-NFT-EXPLORE - Using Streamlit to build a simple UI on top of the Solana API

SOLSEA NFT Explorer Using Streamlit to build a simple UI on top of the Solana AP

Devin Capriola 3 Mar 19, 2022
Migration Manager (MM) is a very small utility that can list source servers in a target account and apply mass launch template modifications.

Migration Manager Migration Manager (MM) is a very small utility that can list source servers in a target account and apply mass launch template modif

Cody 2 Nov 04, 2021
A visualization of people a user follows on Twitter

Twitter-Map This software allows the user to create maps of Twitter accounts. Installation git clone Oliver Greenwood 12 Jul 20, 2022

stories-matiasucker created by GitHub Classroom

Stories do Instagram Este projeto tem como objetivo desenvolver uma pequena aplicação que simule os efeitos e funcionalidades ao estilo Instagram. A a

1 Dec 20, 2021
Unofficial Python wrapper for official Hacker News API

haxor Unofficial Python wrapper for official Hacker News API. Installation pip install haxor Usage Import and initialization: from hackernews import H

147 Sep 18, 2022
股票量化

StockQuant Gary-Hertel 请勿提交issue!可以加入交流群与其他朋友一起自学交流,加微信mzjimmy 一、配置文件的设置 启动框架需要先导入必要的模块,并且载入一次配置文件! 配置文件是一个json格式的文件config.json,在docs文件夹中有模板

218 Dec 25, 2022
A simple telegram Bot, Upload Media File| video To telegram using the direct download link. (youtube, Mediafire, google drive, mega drive, etc)

URL-Uploader (Bot) A Bot Upload file|video To Telegram using given Links. Features: 👉 Only Auth Users (AUTH_USERS) Can Use The Bot 👉 Upload YTDL Sup

Hash Minner 18 Dec 17, 2022
Stinky ID - A stable pluggable Telegram userbot + Voice & Video Call music bot, based on Telethon

Ultroid - UserBot A stable pluggable Telegram userbot + Voice & Video Call music

Riyan.rz 1 Jan 03, 2022
Leakvertise is a Python open-source project which aims to bypass these fucking annoying captchas and ads from linkvertise, easily

Leakvertise Leakvertise is a Python open-source project which aims to bypass these fucking annoying captchas and ads from linkvertise, easily. You can

Quatrecentquatre 9 Oct 06, 2022