Cloud-based recommendation system

Overview

Cloud-based recommendation system

This project is based on cloud services to create data lake, ETL process, train and deploy learning model to implement a recommendation system.

Purpose

One Web app can return if the consumer will buy the product or not when providing user ID and corresponding product SKU.

Services

This project will use services:

AWS: lambda function, Step functions, Glue (job,notebook,crawler), Athena, SNS, S3, Sagemaker, IAM, Dynamodb, API Gateway.

Confluent cloud (kafka) for streaming data.

Project description

  1. Create a bucket on S3 as the storage location of the data lake, store the raw data in the bucket (raw data zone), and then return the data after ETL to the same bucket (curated zone).

  2. Preview the data, determine the data is useful and meaningful for our project. Use AWS Glue crawler to grab corresponding data catalog (in created database and generated table info). Use Athena to do SQL query. This like Apache Hive, it does not change raw data, but do operations above the raw data.

  3. Create and store stream data. Create a kafka topic on Clonfluent cloud and set schema registry for the corresponding stream data, schema sets as confluent_cloud_kafka-->confluent_kafka_topic_schema.json. Set the kafka producer as confluent_cloud_kafka-->confluent_kafka_producer_lambda.py to push stream data to corresponding kafka topic in different partitions (because this project does not have exact source giving real stream data, we produce stream data manually). Set the consumer (confluent connector with AWS lambda) as confluent_cloud_kafka-->confluent_kafka_consumer_lambda.py to poll the stream data in kafka topic and store them in Dynamodb table.

  4. ETL process. Use lambda function to do data transformation operations based on SQL, corresponding scripts in file lambda_functions(ETL). Create Glue job to integrate new dataset and store in curated zone in data lake, scripts is in glue_job-->glue_job_ETL.py. Use step fuctions to orchestrate ETL workflow based on above lambda functions, ASL script is in step_function(workflow)-->step_functions_for_curated.json.

    This part is based on spark, and it is similar with the project in repo: https://github.com/Yi-Ding111/spark-ETL-based-databricks-aws.

  5. Train learning model (XGBoost). Use sagemaker notebook instance to do some kinds more operations like: EDA and feature engineering, use XGBoost framework to train the data, adjust parameters and try different attributes combinations to find the best one. Scripts is in sagemaker-->xgboost_deploy_sagemaker.ipynb.

  6. Deploy learning model. Get deploy endpoint after machine learning. Create lambda function to invoke the sagemaker endpoint to use the trained model, scripts is in sagemaker-->endpoint_interact_lambda.py. Let the lambda function integrate with API gatway (proxy integration) as the backend. Deploy the API gatewat and use the invoked URL for web applications to do interactions.

  7. Store the application output. Use SNS to publish the output to lambda and update the information into Dynamodb table, scripts is in sagemaker-->prediction_store_dynamodb.py


Acknowledgement

This project is completed with the guidance from Leo Lee (JR academy)


Author: YI DING, Leo Lee

Created at: Dec 2021

Contact: [email protected]

Owner
Yi Ding
Yi Ding
Codes for AAAI'21 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'

DHCN Codes for AAAI 2021 paper 'Self-Supervised Hypergraph Convolutional Networks for Session-based Recommendation'. Please note that the default link

Xin Xia 124 Dec 14, 2022
reXmeX is recommender system evaluation metric library.

A general purpose recommender metrics library for fair evaluation.

AstraZeneca 258 Dec 22, 2022
Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems.

Persine, the Persona Engine Persine is an automated tool to study and reverse-engineer algorithmic recommendation systems. It has a simple interface a

Jonathan Soma 87 Nov 29, 2022
Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'.

COTREC Codes for CIKM'21 paper 'Self-Supervised Graph Co-Training for Session-based Recommendation'. Requirements: Python 3.7, Pytorch 1.6.0 Best Hype

Xin Xia 43 Jan 04, 2023
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Annoy Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given quer

Spotify 10.6k Jan 01, 2023
Elliot is a comprehensive recommendation framework that analyzes the recommendation problem from the researcher's perspective.

Comprehensive and Rigorous Framework for Reproducible Recommender Systems Evaluation

Information Systems Lab @ Polytechnic University of Bari 215 Nov 29, 2022
A movie recommender which recommends the movies belonging to the genre that user has liked the most.

Content-Based-Movie-Recommender-System This model relies on the similarity of the items being recommended. (I have used Pandas and Numpy. However othe

Srinivasan K 0 Mar 31, 2022
Code for ICML2019 Paper "Compositional Invariance Constraints for Graph Embeddings"

Dependencies NOTE: This code has been updated, if you were using this repo earlier and experienced issues that was due to an outaded codebase. Please

Avishek (Joey) Bose 43 Nov 25, 2022
Bert4rec for news Recommendation

News-Recommendation-system-using-Bert4Rec-model Bert4rec for news Recommendation

saran pandian 2 Feb 04, 2022
Mutual Fund Recommender System. Tailor for fund transactions.

Explainable Mutual Fund Recommendation Data Please see 'DATA_DESCRIPTION.md' for mode detail. Recommender System Methods Baseline Collabarative Fiilte

JHJu 2 May 19, 2022
Bundle Graph Convolutional Network

Bundle Graph Convolutional Network This is our Pytorch implementation for the paper: Jianxin Chang, Chen Gao, Xiangnan He, Depeng Jin and Yong Li. Bun

55 Dec 25, 2022
Code for KHGT model, AAAI2021

KHGT Code for KHGT accepted by AAAI2021 Please unzip the data files in Datasets/ first. To run KHGT on Yelp data, use python labcode_yelp.py For Movi

32 Nov 29, 2022
ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms

ToR[e]cSys is a PyTorch Framework to implement recommendation system algorithms, including but not limited to click-through-rate (CTR) prediction, learning-to-ranking (LTR), and Matrix/Tensor Embeddi

LI, Wai Yin 90 Oct 08, 2022
Recommender systems are the systems that are designed to recommend things to the user based on many different factors

Recommender systems are the systems that are designed to recommend things to the user based on many different factors. The recommender system deals with a large volume of information present by filte

Happy N. Monday 3 Feb 15, 2022
A library of metrics for evaluating recommender systems

recmetrics A python library of evalulation metrics and diagnostic tools for recommender systems. **This library is activly maintained. My goal is to c

Claire Longo 458 Jan 06, 2023
Respiratory Health Recommendation System

Respiratory-Health-Recommendation-System Respiratory Health Recommendation System based on Air Quality Index Forecasts This project aims to provide pr

Abhishek Gawabde 1 Jan 29, 2022
A Python scikit for building and analyzing recommender systems

Overview Surprise is a Python scikit for building and analyzing recommender systems that deal with explicit rating data. Surprise was designed with th

Nicolas Hug 5.7k Jan 01, 2023
This is our implementation of GHCF: Graph Heterogeneous Collaborative Filtering (AAAI 2021)

GHCF This is our implementation of the paper: Chong Chen, Weizhi Ma, Min Zhang, Zhaowei Wang, Xiuqiang He, Chenyang Wang, Yiqun Liu and Shaoping Ma. 2

Chong Chen 53 Dec 05, 2022
Movies/TV Recommender

recommender Movies/TV Recommender. Recommends Movies, TV Shows, Actors, Directors, Writers. Setup Create file API_KEY and paste your TMDB API key in i

Aviem Zur 3 Apr 22, 2022
Group-Buying Recommendation for Social E-Commerce

Group-Buying Recommendation for Social E-Commerce This is the official implementation of the paper Group-Buying Recommendation for Social E-Commerce (

Jun Zhang 37 Nov 28, 2022