Credit Fraud Detection

Context: Credit card companies need to recognize fraudulent credit card transactions so that customers are not charged for items they did not purchase.

Dataset location: The dataset (creditcard.csv) is provided by Kaggle and can be found at https://www.kaggle.com/mlg-ulb/creditcardfraud

The dataset contains transactions made by credit cards in September 2013 by European cardholders. It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, the original features and further background information about the data are not available. Features V1, V2, … V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. The feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; it can be used, for example, for example-dependent cost-sensitive learning. The feature 'Class' is the response variable and takes the value 1 in case of fraud and 0 otherwise. The dataset is already preprocessed.

I began by splitting the dataset into train and test sets with a 0.75:0.25 split. A brief analysis showed that 99.8% of the transactions are labeled as not fraud and only 0.2% are labeled as fraud. With so few positives relative to negatives, a model spends most of its training time on negative examples and does not learn enough from the positive ones. I therefore bootstrapped the data, upsampling the training set until it was balanced.

I then applied a Random Forest with 20 trees and determined which features were the most important for the model. I followed with Logistic Regression and finally with Gaussian Naive Bayes. I tested all three models for accuracy, precision, recall and F1 score.
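The split-then-upsample step described above can be sketched as follows. This is a minimal illustration in plain Python, not the notebook's actual code: the helper names `train_test_split_rows` and `upsample_minority` are hypothetical, and the rows are synthetic stand-ins for creditcard.csv with roughly the same 0.2% fraud rate.

```python
import random
from collections import Counter


def train_test_split_rows(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and split them into (train, test)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def upsample_minority(train_rows, label_index=-1, seed=0):
    """Bootstrap (sample with replacement) each smaller class up to the size of the largest."""
    rng = random.Random(seed)
    by_class = {}
    for row in train_rows:
        by_class.setdefault(row[label_index], []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        # Draw the missing rows with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Synthetic stand-in for creditcard.csv (assumption): (amount, class), 0.2% fraud.
rows = [(float(i), 1 if i % 500 == 0 else 0) for i in range(10_000)]

train, test = train_test_split_rows(rows)   # 0.75 : 0.25 split
balanced_train = upsample_minority(train)   # only the training set is resampled

print(Counter(label for _, label in balanced_train))
```

Upsampling is applied after the split and only to the training rows, so duplicated fraud cases cannot leak into the test set. With pandas or scikit-learn available, `sklearn.utils.resample` with `replace=True` would be a more typical way to do the same thing.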
The Random Forest model has better accuracy and precision than the Logistic Regression and Gaussian Naive Bayes models. Logistic Regression has the best recall, but Random Forest has the best F1 score, which is the harmonic mean of precision and recall.
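A comparison of the three models on these four metrics might look like the sketch below. It assumes scikit-learn and uses synthetic imbalanced data from `make_classification` in place of creditcard.csv, so its numbers will not match the results reported here. Apart from the 20 trees, the hyperparameters are illustrative guesses, and the upsampling step is left out for brevity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic imbalanced data standing in for creditcard.csv (assumption).
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.97, 0.03], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

models = {
    "Random Forest": RandomForestClassifier(n_estimators=20, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gaussian Naive Bayes": GaussianNB(),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        # F1 = 2 * precision * recall / (precision + recall)
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, metrics in scores.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})

# Per-feature importances from the fitted forest (they sum to 1).
importances = models["Random Forest"].feature_importances_
print(importances.round(3))
```

On data this imbalanced, accuracy alone says little, because a model that always predicts "not fraud" already scores very high. That is why precision, recall and F1 are reported alongside it.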
Credit fraud detection in Python using a Jupyter Notebook
Overview
VIL-100: A New Dataset and A Baseline Model for Video Instance Lane Detection (ICCV 2021)
Preparation Please see dataset/README.md to get more details about our datasets-VIL100 Please see INSTALL.md to install environment and evaluation too
Context-Sensitive Misspelling Correction of Clinical Text via Conditional Independence, CHIL 2022
cim-misspelling Pytorch implementation of Context-Sensitive Spelling Correction of Clinical Text via Conditional Independence, CHIL 2022. This model (
A testcase generation tool for Persistent Memory Programs.
PMFuzz PMFuzz is a testcase generation tool to generate high-value tests cases for PM testing tools (XFDetector, PMDebugger, PMTest and Pmemcheck) If
The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 2021)
EIGNN: Efficient Infinite-Depth Graph Neural Networks The official implementation of EIGNN: Efficient Infinite-Depth Graph Neural Networks (NeurIPS 20
A PyTorch Implementation of Gated Graph Sequence Neural Networks (GGNN)
A PyTorch Implementation of GGNN This is a PyTorch implementation of the Gated Graph Sequence Neural Networks (GGNN) as described in the paper Gated G
Barbershop: GAN-based Image Compositing using Segmentation Masks (SIGGRAPH Asia 2021)
Barbershop: GAN-based Image Compositing using Segmentation Masks Barbershop: GAN-based Image Compositing using Segmentation Masks Peihao Zhu, Rameen A
Python package for covariance matrices manipulation and Biosignal classification with application in Brain Computer interface
pyRiemann pyRiemann is a python package for covariance matrices manipulation and classification through Riemannian geometry. The primary target is cla
Classic Papers for Beginners and Impact Scope for Authors.
There have been billions of academic papers around the world. However, maybe only 0.0...01% among them are valuable or are worth reading. Since our limited life has never been forever, TopPaper provi
Experiments for distributed optimization algorithms
Network-Distributed Algorithm Experiments -- This repository contains a set of optimization algorithms and objective functions, and all code needed to
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.
The Ultimate PyTorch Source-Build Template Translations: 한국어 TL;DR PyTorch built from source can be x4 faster than a naïve PyTorch install. This repos
CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation
CyTran: Cycle-Consistent Transformers for Non-Contrast to Contrast CT Translation We propose a novel approach to translate unpaired contrast computed
An implementation for the ICCV 2021 paper Deep Permutation Equivariant Structure from Motion.
Deep Permutation Equivariant Structure from Motion Paper | Poster This repository contains an implementation for the ICCV 2021 paper Deep Permutation
PyArmadillo: an alternative approach to linear algebra in Python
PyArmadillo is a linear algebra library for the Python language, with an emphasis on ease of use.
Neural Message Passing for Computer Vision
Neural Message Passing for Quantum Chemistry Implementation of different models of Neural Networks on graphs as explained in the article proposed by G
graph-theoretic framework for robust pairwise data association
CLIPPER: A Graph-Theoretic Framework for Robust Data Association Data association is a fundamental problem in robotics and autonomy. CLIPPER provides
Source code for the GPT-2 story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"
Storium GPT-2 Models This is the official repository for the GPT-2 models described in the EMNLP 2020 paper [STORIUM: A Dataset and Evaluation Platfor
Official repository of the paper Privacy-friendly Synthetic Data for the Development of Face Morphing Attack Detectors
SMDD-Synthetic-Face-Morphing-Attack-Detection-Development-dataset Official repository of the paper Privacy-friendly Synthetic Data for the Development
Gradient representations in ReLU networks as similarity functions
Gradient representations in ReLU networks as similarity functions by Dániel Rácz and Bálint Daróczy. This repo contains the python code related to our
RuDOLPH: One Hyper-Modal Transformer can be creative as DALL-E and smart as CLIP
[Paper] [Хабр] [Model Card] [Colab] [Kaggle] RuDOLPH 🦌 🎄 ☃️ One Hyper-Modal Tr
Official PyTorch implementation of "Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks" (AAAI 2022)
Preemptive Image Robustification for Protecting Users against Man-in-the-Middle Adversarial Attacks This is the code for reproducing the results of th