Python KNN model: Predicting the probability of getting a work visa. Tableau: Non-immigrant visas over the years.

Last update: Nov 21, 2021

Overview

The value of international students to the United States. Probability of getting a non-immigrant visa.

Project timeline: Jan 2021 - April 2021

Project team:

Zinaida Dvoskina (myself)
Kirill Ilin
Johnathan Conley
Cindy Ye Fung

Analyzed publicly available data on U.S. non-immigrant visa issuance. The research used data from USCIS (the number of visas issued per country, category, and year, along with the political party in office) and from the U.S. Department of Labor Office of Foreign Labor Certification (employment-based immigration applications: application received dates, decision dates, the most recent date a case determination decision was issued, etc.).

Created a Tableau timelapse on a world map, where visa numbers can be filtered by region and country and compared between years. Other visualizations showed no strong trend to suggest that the political party in office affects the likelihood of a foreigner obtaining a visa.

Created a KNN classification model with the following predictors: Received month, Agent representing employer, Annual wage rate, Annual prevailing wage, PW wage level, H-1B dependent status, Support H1B status. Almost 97% of the visa applications in the datasets were approved. As a result, the model was heavily biased towards positive outcomes: it performed very well at predicting approvals, but it wasn't very trustworthy overall.
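To illustrate why a high score on such data can be misleading, here is a minimal sketch using synthetic labels (not the project data) with a 97/3 split. It assumes a trivial "model" that always predicts the majority class; the helper names `accuracy` and `recall` are illustrative and are not taken from the project code.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, label):
    """Share of cases with the given true label that were predicted as that label."""
    relevant = [p for t, p in zip(y_true, y_pred) if t == label]
    return sum(p == label for p in relevant) / len(relevant)

# Synthetic labels (assumption, not project data): 970 approvals, 30 denials.
y_true = ["approved"] * 970 + ["denied"] * 30

# A trivial "model" that always predicts the most common class.
majority = Counter(y_true).most_common(1)[0][0]
y_pred = [majority] * len(y_true)

baseline_accuracy = accuracy(y_true, y_pred)          # high overall accuracy
denied_recall = recall(y_true, y_pred, "denied")      # but no denial is ever caught
approved_recall = recall(y_true, y_pred, "approved")
```

In this setup the overall accuracy equals the share of approvals, while recall for denials is zero, which is why per-class metrics are more informative than accuracy alone on imbalanced data.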

To address this, randomly removed data points so that the numbers of positive and negative outcomes were equal (undersampling), which gives a more reliable prediction. Because of limited computing power, the number of predictors was reduced to three (Full Time Position, PW, and New Employer), and the model was run only on 2020 data.
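Random undersampling of this kind can be sketched as follows. This is a standard-library illustration under stated assumptions, not the project's actual code: the function `undersample`, the `status` key, and the synthetic rows are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def undersample(rows, label_key, seed=0):
    """Randomly drop rows so every class keeps as many rows as the rarest class."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Size of the smallest class decides how many rows each class keeps.
    target = min(len(group) for group in by_label.values())

    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced

# Synthetic rows (assumption, not project data): 970 approved, 30 denied.
rows = [{"case_id": i, "status": "approved"} for i in range(970)]
rows += [{"case_id": 970 + i, "status": "denied"} for i in range(30)]

balanced = undersample(rows, "status", seed=42)
counts = Counter(row["status"] for row in balanced)
```

The trade-off is that most majority-class rows are discarded, so the balanced set is much smaller than the original one.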

The new KNN model, run on the undersampled data, produced results that were not biased towards a positive outcome. The chosen predictors had an impact on visa decisions, but only in approximately 60% of cases. Increasing the number of predictors could improve the model.
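For readers unfamiliar with KNN, the idea of classifying a case by majority vote among its nearest neighbours can be sketched in a few lines. This is a plain-Python illustration, not the project's implementation: the function `knn_predict`, the numeric encoding of the three predictors, and the toy rows are all assumptions made up so the example runs, and they say nothing about real visa outcomes.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point, nearest first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy, balanced training set (hypothetical encoding):
# (full_time_position, pw_level, new_employer)
train_X = [(1, 3, 0), (1, 4, 0), (1, 3, 1), (0, 1, 1), (0, 1, 0), (0, 2, 1)]
train_y = ["approved", "approved", "approved", "denied", "denied", "denied"]

first = knn_predict(train_X, train_y, (1, 4, 1), k=3)
second = knn_predict(train_X, train_y, (0, 1, 1), k=3)
```

An odd `k` avoids tied votes when there are only two classes. Because distances are computed for every training row at prediction time, the cost grows with both the number of rows and the number of predictors.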

An interesting finding was that software engineer is the top job title for obtaining a work visa; however, it also has the most denials.

In this repository you can find our code, Tableau workbooks, the project report, and a presentation of our major findings. The data file is too big to upload here.
