This is an open solution to the Home Credit Default Risk challenge 🏡

Last update: Dec 27, 2022

Overview

Home Credit Default Risk: Open Solution

More competitions 🎇

Check our collection of public projects 🎁, where you can find multiple Kaggle competitions with code, experiments and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically:

Learning from the process: updates about new ideas, code and experiments are the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack the appropriate experience.
Encouraging more Kagglers to start working on this competition.
Delivering an open source solution with no strings attached. Code is available on our GitHub repository 💻. This solution should establish a solid benchmark, as well as provide a good base for your custom ideas and experiments. We care about clean code 😃
Opening our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check: Home Credit Default Risk 📈 and screens below.

Train and validation results on folds 📊	LightGBM learning curves 📊

Disclaimer

In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script 🐍.

Note

As of 1.07.2019 we officially discontinued the neptune-cli client project, making neptune-client the only supported way to communicate with Neptune. That means you should run experiments via the python ... command or update loggers to neptune-client. For more information about the new client, go to the neptune-client read-the-docs page.

How to start?

Learn about our solutions

Check Kaggle forum and participate in the discussions.
Check our Wiki pages 🏡 , where we document our work. See solutions below:

link to code	name	CV	LB	link to description
solution 1	chestnut 🌰	?	0.742	LightGBM and basic features
solution 2	seedling 🌱	?	0.747	Sklearn and XGBoost algorithms and groupby features
solution 3	blossom 🌼	0.7840	0.790	LightGBM on selected features
solution 4	tulip 🌷	0.7905	0.801	LightGBM with smarter features
solution 5	sunflower 🌻	0.7950	0.804	LightGBM clean dynamic features
solution 6	four leaf clover 🍀	0.7975	0.806	priv. LB 0.79804, Stacking by feature diversity and model diversity
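
In the table, CV is the cross-validated score and LB is the public leaderboard score. Assuming both are ROC AUC (the competition's evaluation metric), the sketch below shows how a per-fold AUC and its mean and standard deviation could be computed. The toy fold data and the helper names `roc_auc` and `summarize_folds` are made up for illustration and are not taken from the repository.

```python
from statistics import mean, pstdev


def roc_auc(y_true, y_score):
    """ROC AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def summarize_folds(folds):
    """Return (per-fold AUCs, mean, population std) for (y_true, y_score) folds."""
    scores = [roc_auc(y_true, y_score) for y_true, y_score in folds]
    return scores, mean(scores), pstdev(scores)


if __name__ == "__main__":
    # Two tiny hypothetical validation folds: (labels, predicted probabilities).
    toy_folds = [
        ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
        ([0, 1, 0, 1], [0.2, 0.9, 0.3, 0.6]),
    ]
    scores, avg, std = summarize_folds(toy_folds)
    print("fold AUCs:", scores)
    print("mean: {:.4f}, std: {:.4f}".format(avg, std))
```

The pairwise loop is quadratic and only meant for small examples. On the real data set a library routine such as scikit-learn's `roc_auc_score` would be the practical choice.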

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation (fast track)

Clone the repository and install the requirements (use Python 3.5):

pip3 install -r requirements.txt

Register at neptune.ml (if you wish to use it).
Run an experiment based on LightGBM:

🔱

neptune account login
neptune run --config configs/neptune.yaml main.py train_evaluate_predict_cv --pipeline_name lightGBM

🐍

python main.py -- train_evaluate_predict_cv --pipeline_name lightGBM

Installation (step by step)

Step by step installation 🖥️

Hyperparameter Tuning

Various hyperparameter tuning options are available.

Random Search

configs/neptune.yaml

  hyperparameter_search__method: random
  hyperparameter_search__runs: 100

src/pipeline_config.py

    'tuner': {'light_gbm': {'max_depth': ([2, 4, 6], "list"),
                            'num_leaves': ([2, 100], "choice"),
                            'min_child_samples': ([5, 10, 15, 25, 50], "list"),
                            'subsample': ([0.95, 1.0], "uniform"),
                            'colsample_bytree': ([0.3, 1.0], "uniform"),
                            'min_gain_to_split': ([0.0, 1.0], "uniform"),
                            'reg_lambda': ([1e-8, 1000.0], "log-uniform"),
                            },
              }
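
To make the search space above more concrete, here is a minimal sketch of how a random-search tuner might draw one parameter set from it. The meaning of each mode is an assumption, not something this README confirms: `list` picks one of the listed values, `choice` picks an integer from the half-open range [low, high), `uniform` draws a float between the bounds, and `log-uniform` draws uniformly in log10 space. `TUNER_SPACE` and `sample_params` are hypothetical names and are not part of the repository.

```python
import math
import random

# Search space mirroring the 'light_gbm' tuner entry above.
TUNER_SPACE = {
    'max_depth': ([2, 4, 6], "list"),
    'num_leaves': ([2, 100], "choice"),
    'min_child_samples': ([5, 10, 15, 25, 50], "list"),
    'subsample': ([0.95, 1.0], "uniform"),
    'colsample_bytree': ([0.3, 1.0], "uniform"),
    'min_gain_to_split': ([0.0, 1.0], "uniform"),
    'reg_lambda': ([1e-8, 1000.0], "log-uniform"),
}


def sample_params(space, rng):
    """Draw one random configuration from a {name: (values, mode)} space."""
    params = {}
    for name, (values, mode) in space.items():
        if mode == "list":
            # Assumed: pick one of the explicitly listed values.
            params[name] = rng.choice(values)
        elif mode == "choice":
            # Assumed: pick an integer from [low, high).
            low, high = values
            params[name] = rng.randrange(low, high)
        elif mode == "uniform":
            low, high = values
            params[name] = rng.uniform(low, high)
        elif mode == "log-uniform":
            # Sample the exponent uniformly so every decade is equally likely.
            low, high = values
            exponent = rng.uniform(math.log10(low), math.log10(high))
            params[name] = 10 ** exponent
        else:
            raise ValueError("unknown sampling mode: {}".format(mode))
    return params


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the runs are repeatable
    for run in range(3):
        print(run, sample_params(TUNER_SPACE, rng))
```

With `hyperparameter_search__runs: 100`, a tuner along these lines would draw 100 such configurations and keep the one with the best validation score.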

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

Check the competition project on GitHub to see what we are working on right now.
Express your interest in a particular task by writing a comment in that task, or by creating a new one with your fresh idea.
We will get back to you quickly in order to start working together.
Check CONTRIBUTING for more information.

User support

There are several ways to seek help:

Kaggle discussion is our primary way of communication.
Read the project's Wiki, where we publish descriptions of the code, pipelines and supporting tools such as neptune.ml.
Submit an issue directly in this repo.

Comments

ModuleNotFoundError: No module named 'deepsense'
There are two things that will make the processing of your issue faster:

Make sure that you are using the latest version of the code.

In case of a bug, it would be nice to provide more technical details such as the execution command, error message or a script that reproduces your bug.

Thanks!

Kamil & Jakub,

core contributors to the minerva.ml
opened by poteman 9

Using the lightGBM_stacking pipeline raises an error

When I run the script python -W ignore main.py -- train_evaluate_predict_cv --pipeline_name lightGBM_stacking

it raises the following error:

2018-08-10 21:03:04 steppy >>> done: initializing experiment directories
2018-08-10 21:03:04 steppy >>> Step light_gbm_fold_0 initialized
2018-08-10 21-03-04 home-credit >>> Start pipeline fit and transform on train
2018-08-10 21:03:04 steppy >>> cleaning cache...
2018-08-10 21:03:04 steppy >>> cleaning cache done
Traceback (most recent call last):
  File "main.py", line 82, in <module>
    main()
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "main.py", line 78, in train_evaluate_predict_cv
    pipeline_manager.train_evaluate_predict_cv(pipeline_name, model_level, dev_mode, submit_predictions)
  File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 37, in train_evaluate_predict_cv
    train_evaluate_predict_cv(pipeline_name, model_level, dev_mode, submit_predictions)
  File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 173, in train_evaluate_predict_cv
    train_evaluate_predict_cv_first_level(pipeline_name, dev_mode, submit_predictions)
  File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 285, in train_evaluate_predict_cv_first_level
    model_level='first')
  File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 428, in _fold_fit_evaluate_predict_loop
    fold_id, pipeline_name, model_level)
  File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 517, in _fold_fit_evaluate_loop
    pipeline.fit_transform(train_data)
  File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/steppy/base.py", line 310, in fit_transform
    step_inputs[input_data_part] = data[input_data_part]
KeyError: 'input'

opened by ghost 8

How to export the feature correlation?

Dear,

Could you tell me how to export the feature correlation? I saw some features in your wiki with correlation scores, and I would like to know how you computed them. By the way, which metric did you use for scoring? Thanks

opened by OsloAI 4
PermissionError: [WinError 5] Access is denied

Hi. Neptune is a very handy tool and I'm getting to use it. However I encountered some errors.

It seems Neptune successfully read the data, since there are messages telling me the dataset has been initialized. But when the system starts gb training, this permission error occurs.

deepsense.neptune.client_library.threads.channel_values_thread WARNING channel_values_thread.py:389 - _validate() X-coordinate 1278101.8469238281 is not greater than the previous one 1278101.8469238281. Dropping point (x=1278101.8469238281, y=PermissionError: [WinError 5] Access is denied

Besides, I'm wondering whether the following runtime warning is normal.

deepsense.neptune.client_library.threads.channel_values_thread WARNING channel_values_thread.py:389 - _validate() X-coordinate 1278101.8469238281 is not greater than the previous one 1278101.8469238281. Dropping point (x=1278101.8469238281, y= new_handle = steal_handle(parent_pid, pipe_handle) ) for channel stderr. X-coordinates must be strictly increasing for each channel.

This warning appears every second. Is that expected?

Could you please tell me why this error occurs?

opened by MRrollingJerry 4
Notebook Updated ?

Hello,

Thank you for sharing your project, it is interesting! I have a question regarding the notebooks that cover preprocessing (cleaning) and the creation of new features (hand-crafted and aggregated). I think the notebooks are out of date compared to the code. For instance, for the application data there are 5 cleaning steps in the code, but only two are available in the notebook. Are there plans to update them? It is easier to understand all the data engineering from a notebook than from the complete code.

opened by Shiro-LK 2
Add some logger info while reading data

Pull Request template to Home Credit Default Risk Open Solution

Code contributions

The major, and most appreciated, contribution is a pull request with a feature or bug fix. Each pull request initiates a discussion about your code contribution.

Each pull request should include a minimal description of its contents.

Thanks!

Jakub & Kamil,

core contributors to the minerva.ml

opened by pranayaryal 2

KeyError: 'NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min' while running the code

Solution Version: solution 5 | sunflower 🌻
Command used to run the code: neptune run --config configs/neptune.yaml main.py train_evaluate_predict_cv --pipeline_name lightGBM

Error Message: KeyError: 'NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min'. The error pops up in the feature_extraction.py file, in the _add_diff_features method of the GroupbyAggregateDiffs class. While iterating through self.groupby_aggregations with the line for groupby_cols, specs in self.groupby_aggregations:, the contents of self.groupby_aggregations are:

[(['NAME_EDUCATION_TYPE', 'CODE_GENDER'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), ('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), ('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), 
('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['NAME_FAMILY_STATUS', 'NAME_EDUCATION_TYPE'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), ('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), 
('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['NAME_FAMILY_STATUS', 'CODE_GENDER'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), 
('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), ('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['CODE_GENDER', 'ORGANIZATION_TYPE'], [('AMT_ANNUITY', 'mean'), ('AMT_INCOME_TOTAL', 'mean'), ('DAYS_REGISTRATION', 'mean'), ('EXT_SOURCE_1', 'mean')]), (['CODE_GENDER', 'REG_CITY_NOT_WORK_CITY'], [('AMT_ANNUITY', 'mean'), ('CNT_CHILDREN', 'mean'), ('DAYS_ID_PUBLISH', 'mean')]), (['CODE_GENDER', 'NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE', 'REG_CITY_NOT_WORK_CITY'], [('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean')]), (['NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE'], [('AMT_CREDIT', 'mean'), ('AMT_REQ_CREDIT_BUREAU_YEAR', 'mean'), ('APARTMENTS_AVG', 'mean'), ('BASEMENTAREA_AVG', 'mean'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_3', 'mean'), ('NONLIVINGAREA_AVG', 'mean'), ('OWN_CAR_AGE', 'mean'), ('YEARS_BUILD_AVG', 'mean')]), (['NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE', 'REG_CITY_NOT_WORK_CITY'], [('ELEVATORS_AVG', 'mean'), ('EXT_SOURCE_1', 'mean')]), (['OCCUPATION_TYPE'], [('AMT_ANNUITY', 'mean'), ('CNT_CHILDREN', 'mean'), ('CNT_FAM_MEMBERS', 'mean'), ('DAYS_BIRTH', 'mean'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_REGISTRATION', 'mean'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_3', 'mean')])]

I am assuming that this is where it is combining 'NAME_EDUCATION_TYPE', 'CODE_GENDER', 'AMT_CREDIT', 'min' and the key is missing. I tried to isolate the error message, but the code base is quite large. I figured a community based approach to resolve the bug might be ideal.
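
Judging only from the failing key, the column name appears to be built by joining the group-by columns, the aggregated column and the aggregation name with underscores. The sketch below reproduces that naming pattern and adds a defensive check that reports which aggregate columns are missing before they are looked up. It is an illustration under that assumption, with hypothetical helper names (`aggregate_colname`, `missing_aggregates`); it is not the code from feature_extraction.py.

```python
def aggregate_colname(groupby_cols, select, agg):
    """Build a name such as 'NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min'."""
    return "{}_{}_{}".format("_".join(groupby_cols), select, agg)


def missing_aggregates(available_columns, groupby_aggregations):
    """List aggregate column names that the specs expect but the table lacks."""
    available = set(available_columns)
    missing = []
    for groupby_cols, specs in groupby_aggregations:
        for select, agg in specs:
            name = aggregate_colname(groupby_cols, select, agg)
            if name not in available:
                missing.append(name)
    return missing


if __name__ == "__main__":
    # A shortened version of the first entry of self.groupby_aggregations.
    specs = [
        (["NAME_EDUCATION_TYPE", "CODE_GENDER"],
         [("AMT_CREDIT", "min"), ("AMT_CREDIT", "mean")]),
    ]
    # Pretend only the 'mean' aggregate was merged into the feature table.
    columns = ["AMT_CREDIT", "NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_mean"]
    print(missing_aggregates(columns, specs))
```

Running a check like this right before the diff features are computed would show whether the aggregate columns were ever created, or were created under a different name.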

opened by KartikKannapur 2

ValueError: No transformer cached credit_card_balance_cleaning_fold_1

When I run "python main.py -- train_evaluate_predict_cv --pipeline_name XGBoost", I get the error in the title. However, I can run the lightGBM pipeline successfully.

opened by liyinxiao 1
Ming zhang

opened by Bowen-Guo 1
Update models.py - Neural Network framework

Added a general framework for a Keras-based neural network (#136)

opened by yotamco100 1
CV improved LB not

Hi guys,

I have extracted some more features from the bureau file. It improved CV from 0.7950 to 0.7974 (std 0.0024), but LB dropped from 0.804 to 0.802. Has anyone experienced this before? I don't use any TARGET-related features, and I don't think it is overfitting.

opened by davutpolat 1
Best configurations and models used

Hello, I would like to use your pipeline described as 'solution 6,' but I don't get which models were used for the 1st level as oof-predictions.

Also, I would like to know which configurations were used for the 1st level and the 2nd level respectively. In the config file, there are two configuration files (neptune_stacking.yaml and neptune.yaml) and I'm confused which one was used for each level.

Could you let me know the models and the configurations used for the 1st and the 2nd layer respectively?

Thank you!!

opened by kssteven418 0

Releases(solution-6)

solution-6(Aug 30, 2018)

Source code(tar.gz)
Source code(zip)
solution-5(Jul 18, 2018)

Description: https://github.com/neptune-ml/open-solution-home-credit/wiki/LightGBM-clean-dynamic-features
Source code(tar.gz)
Source code(zip)
solution-4(Jul 10, 2018)

Description: https://github.com/neptune-ml/open-solution-home-credit/wiki/LightGBM-with-smarter-features
Source code(tar.gz)
Source code(zip)
solution-3(Jul 3, 2018)

Description: https://github.com/neptune-ml/open-solution-home-credit/wiki/LightGBM-on-selected-features
Source code(tar.gz)
Source code(zip)
solution-2(Jun 19, 2018)

Description: https://github.com/neptune-ml/open-solution-home-credit/wiki/Sklearn-and-XGBoost-algorithms-and-groupby-features
Source code(tar.gz)
Source code(zip)
solution-1(Jul 18, 2018)

Description: https://github.com/neptune-ml/open-solution-home-credit/wiki/LightGBM-and-basic-features
Source code(tar.gz)
Source code(zip)

Owner

minerva.ml

GitHub Repository https://www.kaggle.com/c/home-credit-default-risk

