This is a practice on Airflow, which is building virtual env, installing Airflow and constructing data pipeline (DAGs)

Overview

airflow-test

This is a practice on Airflow, which is

First of all, build virtual env using virtualbox

This is optional but highly recommended

  • It is very important to make sure that development env is independent and has no conflict with other env setting

Next, make python virtual env on proper directory

python3 -m venv sandbox
source ~/sandbox/bin/activate

Let start to install Airflow and initial setting

Install Airflow

pip3 install apache-airflow==2.1.0 \ ## airflow version depends on when you install and constraint 
  --constraint 'git address' ## constraint is ???. It is optional. refer to https://gist.github.com/marclamberti

Airflow Inital Setting

airflow db init ## airflow metadata initializing
airflow users create -u admin -p admin -f jaeyoung -l jang -r Admin -e [email protected] ## create admin user

airflow webserver ## launch web server. after that we can access airflow web server through localhost:8080
airflow scheduler ## activate airfow scheduler to run DAGs

Construct data pipeline with Airflow

One example of data pipeline (one DAGs) - let follow below steps and make data pipeline

  • creating table -> is-api-available -> extending -> processing-use ->
  • what is it? when new user comes in, sense that and process and store user info to destination (?)
  1. create table
  • make python script user_processing.py in dags folder
  • let create table using sqlite operator in user_processing.py
from airflow.models import DAG
from airflow.providers.sqlite.operators.sqlite import SqliteOperator

from datetime import datetime

default_args = {
    'start_date': datetime(2020, 1, 1)
}

with DAG('user_processing', schedule_interval='@daily', default_args=default_args, catchup=False) as dag:
    # Define tasks/operators

    creating_talbe = SqliteOperator(
        task_id='creating_table',
        sqlite_conn_id='db_sqlite',
        sql='''
            CREATE TABLE users (
                firstname TEXT NOT NULL,
                lastname TEXT NOT NULL,
                country TEXT NOT NULL,
                username TEXT NOT NULL,
                password TEXT NOT NULL,
                email TEXT NOT NULL PRIMARY KEY
            );
            '''
    )
  • set sqlite connection on Airflow Web UI
pip install apache-airflow-providers-sqlite ## install airflow provider of sqlite. refer airflow doc. may already be installed from intial installing Airflow..
# next, make a new sqltie connection named 'db_sqlite' in Airflow
  • after that, do test airflow tasks!

airflow tasks test user_processing creating_table 2020-01-01

  • one more things to do just for check again
sqlite3 airflow.db ## connect to sqlite3 in airflow.db
sqlite>.tables; ## see the tables in airflow.dbb
sqlite>select * form users; ## test query
Owner
Jaeyoung
Jaeyoung
A simple python script where the user inputs the current ingredients they have in their kitchen into ingredients.txt

A simple python script where the user inputs the current ingredients they have in their kitchen into ingredients.txt and then runs the main.py script, and it will output what recipes can be created b

Jordan Leich 3 Nov 02, 2022
A simple but flexible plugin system for Python.

PluginBase PluginBase is a module for Python that enables the development of flexible plugin systems in Python. Step 1: from pluginbase import PluginB

Armin Ronacher 1k Dec 16, 2022
🎅🏻 Helping santa understand ✨ python ✨

☃️ Advent of code 2021 ☃️ Helping santa understand ✨ python ✨

Fluffy 2 Dec 25, 2021
Learning with Peter Norvig's lis.py interpreter

Learning with lis.py This repository contains variations of Peter Norvig's lis.py interpreter for a subset of Scheme, described in (How to Write a (Li

Fluent Python 170 Dec 15, 2022
UUID_ApiGenerator - This an API that will return a key-value pair of randomly generated UUID

This an API that will return a key-value pair of randomly generated UUID. Key will be a timestamp and value will be UUID. While the

1 Jan 28, 2022
Information about a signed UEFI Shell that can be used when Secure Boot is enabled.

SignedUEFIShell During our research of the BootHole vulnerability last year, we tried to find as many signed bootloaders as we could. We searched all

Mickey 61 Jan 03, 2023
A program to generate random numbers b/w 0 to 10 using time

random-num-using-time A program to generate random numbers b/w 0 to 10 using time it uses python's in-built module datetime and an equation which retu

Atul Kushwaha 1 Oct 01, 2022
Minimalistic Gridworld Environment (MiniGrid)

Minimalistic Gridworld Environment (MiniGrid) There are other gridworld Gym environments out there, but this one is designed to be particularly simple

Maxime Chevalier-Boisvert 1.7k Jan 03, 2023
[Cython] Vs [Python] Which one is Faster ?

[Cython] Vs [Python] ? Attractive Contrast :) Mission : Which one is Faster ? Comparing of Execution runtime for [Selection_sort] with Time Complexity

baqer marani 1 Dec 05, 2021
Learning a Little about Containerlab

Learning a Little about Containerlab Hello all. This is the respository based on this blog post. Getting Started Feel free to use this example. You wi

10 Oct 16, 2022
uMap lets you create maps with OpenStreetMap layers in a minute and embed them in your site.

uMap project About uMap lets you create maps with OpenStreetMap layers in a minute and embed them in your site. Because we think that the more OSM wil

771 Dec 29, 2022
Now you'll never be late for your Webinars or Meetings on the GoToWebinar Platform

GoToWebinar Launcher : Now you'll never be late for your Webinars or Meetings on the GoToWebinar Platform About Are you popular for always being late

Jay Thorat 6 Jun 07, 2022
A faster copy of nell's comet nuker

Astro a faster copy of nell's comet nuker also nell uses external libraries like it's cocaine man never learned to use ansi color codes (ily nell) (On

horrid 8 Aug 15, 2022
Experimental proxy for dumping the unencrypted packet data from Brawl Stars (WIP)

Brawl Stars Proxy Experimental proxy for version 39.99 of Brawl Stars. It allows you to capture the packets being sent between the Brawl Stars client

4 Oct 29, 2021
run-js Goal: The Easiest Way to Run JavaScript in Python

run-js Goal: The Easiest Way to Run JavaScript in Python features Stateless Async JS Functions No Intermediary Files Functional Programming CommonJS a

Daniel J. Dufour 9 Aug 16, 2022
Fetch PRs from GitHub and analyze which ones are unmergeable

Set up token Generate a personal access token on GitHub. Add repo permissions. export GH_TOKEN="abcdefg" Pull PR data make Usually, GitHub doesn't h

Stefan van der Walt 1 Nov 05, 2021
WorldsCollide - Final Fantasy VI Randomizer

FFVI Worlds Collide Worlds Collide is an open worlds randomizer for Final Fantas

8 Jun 13, 2022
Projeto para ajudar no aprendizado da linguagem Pyhon

Economize Este projeto tem o intuito de criar desáfios para a codificação em Python, fazendo com que haja um maior entendimento da linguagem em seu to

Lucas Cunha Rodrigues 1 Dec 16, 2021
Meera 2 May 12, 2022