Python KNN model: Predicting a probability of getting a work visa. Tableau: Non-immigrant visas over the years.

Overview

The value of international students to the United States. Probability of getting a non-immigrant visa.

Project timeline: Jan 2021 - April 2021

Project team:

  • Zinaida Dvoskina (myself)
  • Kirill Ilin
  • Johnathan Conley
  • Cindy Ye Fung

Analyzed publicly available data on the U.S. non-immigrant visa acquisition. To conduct research, used publicly available data from the USCIS (the number of visas issued per country, category, the political party in office, and year) and from the US Department of Labor Office of Foreign Labor Certification (employment-based immigration applications: applicant’s received dates, decision dates, the most recent date a case determination decision was issued, etc.).

Created a Tableau timelapse, showing the world map, where visa numbers can be filtered by region, country, and compared between years. Other visualizations showed no strong trend to justify that the political party in office affects the likelihood of a foreigner obtaining a visa.

Visa Time Lapse Visa by Year and Party Visa Cat Working

Created a KNN model for classification with the following variables as predictors: Received month, Agent representing employer, Annual wage rate, Annual prevailing wage, PW wage level, H-1B dependent status, Support H1B status. Datasets are populated with approved results of visa applications - almost 97%. That resulted in highly biased prediction models towards positive outcomes, which means the model wasn’t very trustworthy, even though it performed very well predicting positive outcomes for visa approval.

To solve the problem, randomly eliminated data points and aligned the number of positive and negative outcomes for a more correct prediction. Due to computing power, had to limit the number of predictors to 3: Full Time Position, PW, and New Employer, and the model was only run for 2020.

A new KNN model run on undersampled data showed results not biased towards a positive outcome. Chosen predictors had an impact on visa decisions, however, only in approximately 60% of cases. Further increase in the number of predictors could improve the model.

An interesting finding was that software engineers are at the top job title to obtain a working visa; however, they have the most denials.


In this repository you can find our code, Tableau workbooks, project report and a presentation with our major findings. The data file is too big to upload here.

Owner
Zinaida Dvoskina
Marketing Data Analyst. Master of Science in Business Analytics.
Zinaida Dvoskina
This repo in the implementation of EMNLP'21 paper "SPARQLing Database Queries from Intermediate Question Decompositions" by Irina Saparina, Anton Osokin

SPARQLing Database Queries from Intermediate Question Decompositions This repo is the implementation of the following paper: SPARQLing Database Querie

Yandex Research 20 Dec 19, 2022
Code repo for EMNLP21 paper "Zero-Shot Information Extraction as a Unified Text-to-Triple Translation"

Zero-Shot Information Extraction as a Unified Text-to-Triple Translation Source code repo for paper Zero-Shot Information Extraction as a Unified Text

cgraywang 88 Dec 31, 2022
A repository for the paper "Improved Adversarial Systems for 3D Object Generation and Reconstruction".

Improved Adversarial Systems for 3D Object Generation and Reconstruction: This is a repository for the paper "Improved Adversarial Systems for 3D Obje

Edward Smith 188 Dec 25, 2022
Reinforcement Learning for Portfolio Management

qtrader Reinforcement Learning for Portfolio Management Why Reinforcement Learning? Learns the optimal action, rather than models the market. Adaptive

Angelos Filos 406 Jan 01, 2023
SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

SpeechNAS Better Trade off between Latency and Accuracy for Large Scale Speaker Verification

Wentao Zhu 24 May 20, 2022
Unofficial JAX implementations of Deep Learning models

JAX Models Table of Contents About The Project Getting Started Prerequisites Installation Usage Contributing License Contact About The Project The JAX

107 Jan 05, 2023
A toy compiler that can convert Python scripts to pickle bytecode πŸ₯’

Pickora 🐰 A small compiler that can convert Python scripts to pickle bytecode. Requirements Python 3.8+ No third-party modules are required. Usage us

κŒ—α–˜κ’’κ€€κ“„κ’’κ€€κˆ€κŸ 68 Jan 04, 2023
El-Gamal on Elliptic Curve (Python)

El-Gamal-on-EC El-Gamal on Elliptic Curve (Python) References: https://docsdrive.com/pdfs/ansinet/itj/2005/299-306.pdf https://arxiv.org/ftp/arxiv/pap

3 May 04, 2022
ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-tree Complex Wavelet Representation and Contradict Channel Loss (HDCWNet)

ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-tree Complex Wavelet Representation and Contradict Channel Loss (HDCWNet) (

Wei-Ting Chen 49 Dec 27, 2022
This is the code for the paper "Contrastive Clustering" (AAAI 2021)

Contrastive Clustering (CC) This is the code for the paper "Contrastive Clustering" (AAAI 2021) Dependency python=3.7 pytorch=1.6.0 torchvision=0.8

Yunfan Li 210 Dec 30, 2022
Code release for "Making a Bird AI Expert Work for You and Me".

Making-a-Bird-AI-Expert-Work-for-You-and-Me Code release for "Making a Bird AI Expert Work for You and Me". arxiv (Coming soon...) Changelog 2021/12/6

PRIS-CV: Computer Vision Group 11 Dec 11, 2022
A dataset for online Arabic calligraphy

Calliar Calliar is a dataset for Arabic calligraphy. The dataset consists of 2500 json files that contain strokes manually annotated for Arabic callig

ARBML 114 Dec 28, 2022
Predicting path with preference based on user demonstration using Maximum Entropy Deep Inverse Reinforcement Learning in a continuous environment

Preference-Planning-Deep-IRL Introduction Check my portfolio post Dependencies Gym stable-baselines3 PyTorch Usage Take Demonstration python3 record.

Tianyu Li 9 Oct 26, 2022
Speeding-Up Back-Propagation in DNN: Approximate Outer Product with Memory

Approximate Outer Product Gradient Descent with Memory Code for the numerical experiment of the paper Speeding-Up Back-Propagation in DNN: Approximate

2 Mar 02, 2022
Shape-Adaptive Selection and Measurement for Oriented Object Detection

Source Code of AAAI22-2171 Introduction The source code includes training and inference procedures for the proposed method of the paper submitted to t

houliping 24 Nov 29, 2022
Author Disambiguation using Knowledge Graph Embeddings with Literals

Author Name Disambiguation with Knowledge Graph Embeddings using Literals This is the repository for the master thesis project on Knowledge Graph Embe

12 Oct 19, 2022
This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper

DeepShift This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper, that aims to replace multiplicati

Mostafa Elhoushi 88 Dec 23, 2022
Hierarchical probabilistic 3D U-Net, with attention mechanisms (β€”π˜ˆπ˜΅π˜΅π˜¦π˜―π˜΅π˜ͺ𝘰𝘯 𝘜-π˜•π˜¦π˜΅, π˜šπ˜Œπ˜™π˜¦π˜΄π˜•π˜¦π˜΅) and a nested decoder structure with deep supervision (β€”π˜œπ˜•π˜¦π˜΅++).

Hierarchical probabilistic 3D U-Net, with attention mechanisms (β€”π˜ˆπ˜΅π˜΅π˜¦π˜―π˜΅π˜ͺ𝘰𝘯 𝘜-π˜•π˜¦π˜΅, π˜šπ˜Œπ˜™π˜¦π˜΄π˜•π˜¦π˜΅) and a nested decoder structure with deep supervision (β€”π˜œπ˜•π˜¦π˜΅++). Built in TensorFlow 2.5. Configured for vox

Diagnostic Image Analysis Group 32 Dec 08, 2022
Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

Step by Step on how to create an vision recognition model using LOBE.ai, export the model and run the model in an Azure Function

El Bruno 3 Mar 30, 2022
A python comtrade load library accelerated by go

Comtrade-GRPC Code for python used is mainly from dparrini/python-comtrade. Just patch the code in BinaryDatReader.parse for parsing a little more eff

Bo 1 Dec 27, 2021