This entire project encompasses both Data Analysis and Machine Learning. It was carefully structured and compiled for easy understanding.
To run this notebook, you need a Python environment with the dependencies installed:
- Download Anaconda from the Anaconda site; it has almost all dependencies pre-installed. Feel free to use any environment of your choice.
The Home Mortgage Disclosure Act (HMDA) requires many financial institutions to maintain, report, and publicly disclose information about mortgages. These public data are important because:
- They help show whether lenders are serving the housing needs of their communities.
- They help authorities identify and root out predatory lending practices.
- They give public officials information that helps them make decisions and policies.
- They shed light on lending patterns that could be discriminatory, e.g. a reported increase in mortgage borrowing by Black and Hispanic borrowers as of 1993.
See my Kaggle site: My Homepage.
The goal is to show how to perform advanced analytics and machine learning in Python using a full complement of PyData utilities. It is aimed at those looking to get into the field of data science, or those already in the field who want to work through a real-world project with Python.
- Importing Data with Pandas
- Cleaning Data
- Exploring Data through Visualizations with Matplotlib
- Doing predictive Analysis with various Machine Learning Algorithms
- Supervised machine learning techniques:
  + RandomForestClassifier
  + StratifiedKFold (5 folds)
  + ETC
- K-fold cross-validation to evaluate results locally
- Output the results from the IPython Notebook to Kaggle
- Derived expert insights to give professional recommendations to borrowers
- Predicted applicant loan approval with 74% accuracy
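
As a rough illustration of the cross-validation step listed above, here is a minimal sketch that scores a `RandomForestClassifier` with a 5-fold `StratifiedKFold` in scikit-learn. It is not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification`, and the sample size, feature count, `n_estimators`, and `random_state` values are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the cleaned HMDA features and approval labels
# (assumption: the real notebook loads and cleans the data with pandas).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hyperparameters here are illustrative, not the notebook's settings.
model = RandomForestClassifier(n_estimators=100, random_state=0)

# Stratified folds keep the approved/denied ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy score per fold, each computed on that fold's held-out part.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```

The scores printed for this synthetic data say nothing about the 74% figure reported above; the sketch only shows the shape of the evaluation loop.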