This is an open solution to the Home Credit Default Risk challenge.

Overview

Home Credit Default Risk: Open Solution

Join the chat at https://gitter.im/minerva-ml/open-solution-home-credit

More competitions

Check the collection of public projects, where you can find multiple Kaggle competitions with code, experiments and outputs.

Our goals

We are building an entirely open solution to this competition. Specifically:

  1. Learning from the process: updates about new ideas, code and experiments are the best way to learn data science. Our activity is especially useful for people who want to enter the competition but lack appropriate experience.
  2. Encouraging more Kagglers to start working on this competition.
  3. Delivering an open source solution with no strings attached. The code is available in our GitHub repository. This solution should establish a solid benchmark, as well as provide a good base for your custom ideas and experiments. We care about clean code.
  4. Opening our experiments as well: everybody can have a live preview of our experiments, parameters, code, etc. Check: Home Credit Default Risk and the screens below.
Screens: train and validation results on folds; LightGBM learning curves.

Disclaimer

In this open source solution you will find references to neptune.ml. It is a free platform for community users, which we use daily to keep track of our experiments. Please note that using neptune.ml is not necessary to proceed with this solution. You may run it as a plain Python script.

Note

As of 1.07.2019 we officially discontinued the neptune-cli client project, making neptune-client the only supported way to communicate with Neptune. That means you should run experiments via the python ... command or update the loggers to neptune-client. For more information about the new client, go to the neptune-client read-the-docs page.

How to start?

Learn about our solutions

  1. Check the Kaggle forum and participate in the discussions.
  2. Check our Wiki pages, where we document our work. See solutions below:
| link to code | name                | CV     | LB    | link to description                                                     |
|--------------|---------------------|--------|-------|-------------------------------------------------------------------------|
| solution 1   | chestnut 🌰         | ?      | 0.742 | LightGBM and basic features                                             |
| solution 2   | seedling 🌱         | ?      | 0.747 | Sklearn and XGBoost algorithms and groupby features                     |
| solution 3   | blossom 🌼          | 0.7840 | 0.790 | LightGBM on selected features                                           |
| solution 4   | tulip 🌷            | 0.7905 | 0.801 | LightGBM with smarter features                                          |
| solution 5   | sunflower 🌻        | 0.7950 | 0.804 | LightGBM clean dynamic features                                         |
| solution 6   | four leaf clover 🍀 | 0.7975 | 0.806 | priv. LB 0.79804, Stacking by feature diversity and model diversity     |

Start experimenting with ready-to-use code

You can jump-start your participation in the competition by using our starter pack. The installation instructions below will guide you through the setup.

Installation (fast track)

  1. Clone the repository and install the requirements (use Python 3.5):

    pip3 install -r requirements.txt

  2. Register at neptune.ml (only if you wish to use it).
  3. Run an experiment based on LightGBM.

    With Neptune:

      neptune account login
      neptune run --config configs/neptune.yaml main.py train_evaluate_predict_cv --pipeline_name lightGBM

    As a plain Python script:

      python main.py -- train_evaluate_predict_cv --pipeline_name lightGBM

Installation (step by step)

Step by step installation

Hyperparameter Tuning

Various hyperparameter tuning options are available:

  1. Random Search

    configs/neptune.yaml

      hyperparameter_search__method: random
      hyperparameter_search__runs: 100

    src/pipeline_config.py

        'tuner': {'light_gbm': {'max_depth': ([2, 4, 6], "list"),
                                'num_leaves': ([2, 100], "choice"),
                                'min_child_samples': ([5, 10, 15, 25, 50], "list"),
                                'subsample': ([0.95, 1.0], "uniform"),
                                'colsample_bytree': ([0.3, 1.0], "uniform"),
                                'min_gain_to_split': ([0.0, 1.0], "uniform"),
                                'reg_lambda': ([1e-8, 1000.0], "log-uniform"),
                                },
                  }
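
As a rough illustration of how a search space written in the format above might be sampled, here is a minimal sketch in plain Python. It is not the project's tuner: the helper names `sample_param` and `sample_config` are hypothetical, and the meaning given to each mode is an assumption ("list" picks one of the listed values, "choice" draws an integer from the inclusive range, "uniform" draws a float from the range, "log-uniform" draws a float uniformly in log space).

```python
import math
import random

# Search space copied from the 'tuner' config above.
PARAM_SPACE = {
    'max_depth': ([2, 4, 6], "list"),
    'num_leaves': ([2, 100], "choice"),
    'min_child_samples': ([5, 10, 15, 25, 50], "list"),
    'subsample': ([0.95, 1.0], "uniform"),
    'colsample_bytree': ([0.3, 1.0], "uniform"),
    'min_gain_to_split': ([0.0, 1.0], "uniform"),
    'reg_lambda': ([1e-8, 1000.0], "log-uniform"),
}


def sample_param(spec, mode, rng):
    """Draw one value for a single hyperparameter (mode semantics are assumed)."""
    if mode == "list":
        # Pick one of the explicitly listed values.
        return rng.choice(spec)
    low, high = spec
    if mode == "choice":
        # Integer from the inclusive range [low, high].
        return rng.randint(low, high)
    if mode == "uniform":
        return rng.uniform(low, high)
    if mode == "log-uniform":
        # Uniform in log space, so every order of magnitude is equally likely.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    raise ValueError("unknown sampling mode: {}".format(mode))


def sample_config(space, rng):
    """Draw one full hyperparameter configuration from the search space."""
    return {name: sample_param(spec, mode, rng)
            for name, (spec, mode) in space.items()}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same configs
    for run in range(3):
        print(run, sample_config(PARAM_SPACE, rng))
```

With `hyperparameter_search__runs: 100`, a random search would repeat such a draw 100 times, train and validate once per draw, and keep the configuration with the best validation score.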

Get involved

You are welcome to contribute your code and ideas to this open solution. To get started:

  1. Check the competition project on GitHub to see what we are working on right now.
  2. Express your interest in a particular task by writing a comment in this task, or by creating a new one with your fresh idea.
  3. We will get back to you quickly in order to start working together.
  4. Check CONTRIBUTING for some more information.

User support

There are several ways to seek help:

  1. Kaggle discussion is our primary way of communication.
  2. Read the project's Wiki, where we publish descriptions of the code, pipelines and supporting tools such as neptune.ml.
  3. Submit an issue directly in this repo.
Comments
  • ModuleNotFoundError: No module named 'deepsense'

    There are two things that will make the processing of your issue faster:

    1. Make sure that you are using the latest version of the code,
    2. In case of a bug issue, it would be nice to provide more technical details, such as the execution command, the error message or a script that reproduces your bug.

    Thanks!

    Kamil & Jakub,

    core contributors to the minerva.ml

    opened by poteman 9
  • use lightGBM_stacking pipeline raise error

    When I run the script python -W ignore main.py -- train_evaluate_predict_cv --pipeline_name lightGBM_stacking

    it raises the following error:

    2018-08-10 21:03:04 steppy >>> done: initializing experiment directories
    2018-08-10 21:03:04 steppy >>> Step light_gbm_fold_0 initialized
    2018-08-10 21-03-04 home-credit >>> Start pipeline fit and transform on train
    2018-08-10 21:03:04 steppy >>> cleaning cache...
    2018-08-10 21:03:04 steppy >>> cleaning cache done
    Traceback (most recent call last):
      File "main.py", line 82, in <module>
        main()
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
        return self.main(*args, **kwargs)
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 697, in main
        rv = self.invoke(ctx)
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
        return callback(*args, **kwargs)
      File "main.py", line 78, in train_evaluate_predict_cv
        pipeline_manager.train_evaluate_predict_cv(pipeline_name, model_level, dev_mode, submit_predictions)
      File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 37, in train_evaluate_predict_cv
        train_evaluate_predict_cv(pipeline_name, model_level, dev_mode, submit_predictions)
      File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 173, in train_evaluate_predict_cv
        train_evaluate_predict_cv_first_level(pipeline_name, dev_mode, submit_predictions)
      File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 285, in train_evaluate_predict_cv_first_level
        model_level='first')
      File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 428, in _fold_fit_evaluate_predict_loop
        fold_id, pipeline_name, model_level)
      File "/data1/huangzp/kaggle/home_risk_lightgbm/src/pipeline_manager.py", line 517, in _fold_fit_evaluate_loop
        pipeline.fit_transform(train_data)
      File "/opt/anaconda2/envs/python3/lib/python3.6/site-packages/steppy/base.py", line 310, in fit_transform
        step_inputs[input_data_part] = data[input_data_part]
    KeyError: 'input'
    
    opened by ghost 8
  • How to export the feature correlation?

    Dear,

    Could you tell me how to export the feature correlation? I saw some features in your wiki with correlation scores. I would like to know how you computed these scores. By the way, which metric did you use for the scoring? Thanks

    opened by OsloAI 4
  • PermissionError: [WinError 5] Access is denied

    Hi. Neptune is a very handy tool and I'm getting to use it. However I encountered some errors.

    It seems Neptune successfully read in the data, there are messages telling me the dataset has been initialized. But when the system wants to do gb training, such permission error happens.

    deepsense.neptune.client_library.threads.channel_values_thread WARNING channel_values_thread.py:389 - _validate() X-coordinate 1278101.8469238281 is not greater than the previous one 1278101.8469238281. Dropping point (x=1278101.8469238281, y=PermissionError: [WinError 5] Access is denied

    Besides, I'm wondering if such runtime warning is normal.

    deepsense.neptune.client_library.threads.channel_values_thread WARNING channel_values_thread.py:389 - _validate() X-coordinate 1278101.8469238281 is not greater than the previous one 1278101.8469238281. Dropping point (x=1278101.8469238281, y= new_handle = steal_handle(parent_pid, pipe_handle) ) for channel stderr. X-coordinates must be strictly increasing for each channel.

    This warning came out every second. Should it be like this?

    Could you please tell me why there is such error?

    opened by MRrollingJerry 4
  • Notebook Updated ?

    Hello,

    Thank you for sharing your project, it is interesting! I have a question regarding the notebooks that contain some preprocessing (cleaning) and creation of new features (hand-crafted and aggregated). I think the notebooks are not up to date compared to the code. For instance, for the application data there are 5 cleaning steps in the code, but only two are available in the notebook. Is it planned to update them? It is easier to understand all the data engineering with a notebook than with the complete code.

    opened by Shiro-LK 2
  • Add some logger info while reading data

    Pull Request template to Home Credit Default Risk Open Solution

    Code contributions

    Major - and most appreciated - contribution is pull request with feature or bug fix. Each pull request initiates discussion about your code contribution.

    Each pull request should be provided with minimal description about its contents.

    Thanks!

    Jakub & Kamil,

    core contributors to the minerva.ml

    opened by pranayaryal 2
  • KeyError: 'NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min' while running the code

    Solution Version: solution 5 | sunflower 🌻

    Command used to run the code: neptune run --config configs/neptune.yaml main.py train_evaluate_predict_cv --pipeline_name lightGBM

    Error Message: KeyError: 'NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min'. The error pops up in the feature_extraction.py file, in the _add_diff_features method of the GroupbyAggregateDiffs class. While iterating through self.groupby_aggregations with the line for groupby_cols, specs in self.groupby_aggregations:, the contents of self.groupby_aggregations are:

    [(['NAME_EDUCATION_TYPE', 'CODE_GENDER'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), ('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), ('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), 
('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['NAME_FAMILY_STATUS', 'NAME_EDUCATION_TYPE'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), ('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), 
('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['NAME_FAMILY_STATUS', 'CODE_GENDER'], [('AMT_CREDIT', 'min'), ('AMT_CREDIT', 'mean'), ('AMT_CREDIT', 'max'), ('AMT_CREDIT', 'sum'), ('AMT_CREDIT', 'var'), ('AMT_ANNUITY', 'min'), ('AMT_ANNUITY', 'mean'), ('AMT_ANNUITY', 'max'), ('AMT_ANNUITY', 'sum'), ('AMT_ANNUITY', 'var'), ('AMT_INCOME_TOTAL', 'min'), ('AMT_INCOME_TOTAL', 'mean'), ('AMT_INCOME_TOTAL', 'max'), ('AMT_INCOME_TOTAL', 'sum'), ('AMT_INCOME_TOTAL', 'var'), ('AMT_GOODS_PRICE', 'min'), ('AMT_GOODS_PRICE', 'mean'), ('AMT_GOODS_PRICE', 'max'), ('AMT_GOODS_PRICE', 'sum'), ('AMT_GOODS_PRICE', 'var'), ('EXT_SOURCE_1', 'min'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_1', 'max'), ('EXT_SOURCE_1', 'sum'), ('EXT_SOURCE_1', 'var'), ('EXT_SOURCE_2', 'min'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_2', 'max'), ('EXT_SOURCE_2', 'sum'), ('EXT_SOURCE_2', 'var'), ('EXT_SOURCE_3', 'min'), ('EXT_SOURCE_3', 'mean'), ('EXT_SOURCE_3', 'max'), ('EXT_SOURCE_3', 'sum'), ('EXT_SOURCE_3', 'var'), ('OWN_CAR_AGE', 'min'), ('OWN_CAR_AGE', 'mean'), ('OWN_CAR_AGE', 'max'), ('OWN_CAR_AGE', 'sum'), ('OWN_CAR_AGE', 'var'), ('REGION_POPULATION_RELATIVE', 'min'), ('REGION_POPULATION_RELATIVE', 'mean'), ('REGION_POPULATION_RELATIVE', 'max'), ('REGION_POPULATION_RELATIVE', 'sum'), ('REGION_POPULATION_RELATIVE', 'var'), ('DAYS_REGISTRATION', 'min'), ('DAYS_REGISTRATION', 'mean'), ('DAYS_REGISTRATION', 'max'), ('DAYS_REGISTRATION', 'sum'), ('DAYS_REGISTRATION', 'var'), ('CNT_CHILDREN', 'min'), ('CNT_CHILDREN', 'mean'), ('CNT_CHILDREN', 'max'), ('CNT_CHILDREN', 'sum'), ('CNT_CHILDREN', 'var'), ('CNT_FAM_MEMBERS', 'min'), ('CNT_FAM_MEMBERS', 'mean'), ('CNT_FAM_MEMBERS', 'max'), ('CNT_FAM_MEMBERS', 'sum'), ('CNT_FAM_MEMBERS', 'var'), ('DAYS_ID_PUBLISH', 'min'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_ID_PUBLISH', 'max'), ('DAYS_ID_PUBLISH', 'sum'), ('DAYS_ID_PUBLISH', 'var'), ('DAYS_BIRTH', 'min'), 
('DAYS_BIRTH', 'mean'), ('DAYS_BIRTH', 'max'), ('DAYS_BIRTH', 'sum'), ('DAYS_BIRTH', 'var'), ('DAYS_EMPLOYED', 'min'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_EMPLOYED', 'max'), ('DAYS_EMPLOYED', 'sum'), ('DAYS_EMPLOYED', 'var')]), (['CODE_GENDER', 'ORGANIZATION_TYPE'], [('AMT_ANNUITY', 'mean'), ('AMT_INCOME_TOTAL', 'mean'), ('DAYS_REGISTRATION', 'mean'), ('EXT_SOURCE_1', 'mean')]), (['CODE_GENDER', 'REG_CITY_NOT_WORK_CITY'], [('AMT_ANNUITY', 'mean'), ('CNT_CHILDREN', 'mean'), ('DAYS_ID_PUBLISH', 'mean')]), (['CODE_GENDER', 'NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE', 'REG_CITY_NOT_WORK_CITY'], [('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean')]), (['NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE'], [('AMT_CREDIT', 'mean'), ('AMT_REQ_CREDIT_BUREAU_YEAR', 'mean'), ('APARTMENTS_AVG', 'mean'), ('BASEMENTAREA_AVG', 'mean'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_3', 'mean'), ('NONLIVINGAREA_AVG', 'mean'), ('OWN_CAR_AGE', 'mean'), ('YEARS_BUILD_AVG', 'mean')]), (['NAME_EDUCATION_TYPE', 'OCCUPATION_TYPE', 'REG_CITY_NOT_WORK_CITY'], [('ELEVATORS_AVG', 'mean'), ('EXT_SOURCE_1', 'mean')]), (['OCCUPATION_TYPE'], [('AMT_ANNUITY', 'mean'), ('CNT_CHILDREN', 'mean'), ('CNT_FAM_MEMBERS', 'mean'), ('DAYS_BIRTH', 'mean'), ('DAYS_EMPLOYED', 'mean'), ('DAYS_ID_PUBLISH', 'mean'), ('DAYS_REGISTRATION', 'mean'), ('EXT_SOURCE_1', 'mean'), ('EXT_SOURCE_2', 'mean'), ('EXT_SOURCE_3', 'mean')])]
    

    I am assuming that this is where it is combining 'NAME_EDUCATION_TYPE', 'CODE_GENDER', 'AMT_CREDIT', 'min' and the key is missing. I tried to isolate the error message, but the code base is quite large. I figured a community based approach to resolve the bug might be ideal.

    opened by KartikKannapur 2
  • ValueError: No transformer cached credit_card_balance_cleaning_fold_1

    When I run "python main.py -- train_evaluate_predict_cv --pipeline_name XGBoost", I got this error in the title. However, I can run the lightGBM pipeline successfully.

    opened by liyinxiao 1
  • Ming zhang

    opened by Bowen-Guo 1
  • Update models.py - Neural Network framework

    Added a general framework for a Keras-based neural network (#136)

    opened by yotamco100 1
  • CV improved LB not

    Hi guys,

    I have extracted some more features from the bureau file. It improved CV from 0.7950 to 0.7974 (std 0.0024), but LB dropped from 0.804 to 0.802. Has anyone experienced this before? I don't use any TARGET-related features and don't think it is overfitting.

    opened by davutpolat 1
  • Best configurations and models used

    Hello, I would like to use your pipeline described as 'solution 6,' but I don't get which models were used for the 1st level as oof-predictions.

    Also, I would like to know which configurations were used for the 1st level and the 2nd level respectively. In the config file, there are two configuration files (neptune_stacking.yaml and neptune.yaml) and I'm confused which one was used for each level.

    Could you let me know the models and the configurations used for the 1st and the 2nd layer respectively?

    Thank you!!

    opened by kssteven418 0
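
For the KeyError report above ('NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_min'), one way to narrow the problem down is to list which aggregate columns are expected but absent from the data. The sketch below is plain Python and does not use the project's code. The naming scheme (group-by columns, then the aggregated column, then the aggregation, joined by underscores) is an assumption inferred from the key in the error message, and both helper names are hypothetical.

```python
def expected_feature_names(groupby_aggregations):
    """Build the column names implied by a groupby_aggregations spec.

    Assumed scheme: <groupby cols joined by '_'>_<selected column>_<aggregation>.
    """
    names = []
    for groupby_cols, specs in groupby_aggregations:
        prefix = "_".join(groupby_cols)
        for select, agg in specs:
            names.append("{}_{}_{}".format(prefix, select, agg))
    return names


def missing_features(groupby_aggregations, available_columns):
    """Return the expected names that are not among the available columns."""
    available = set(available_columns)
    return [name for name in expected_feature_names(groupby_aggregations)
            if name not in available]


if __name__ == "__main__":
    # A shortened spec in the same shape as the one printed in the issue.
    aggregations = [
        (["NAME_EDUCATION_TYPE", "CODE_GENDER"],
         [("AMT_CREDIT", "min"), ("AMT_CREDIT", "mean")]),
        (["OCCUPATION_TYPE"], [("AMT_ANNUITY", "mean")]),
    ]
    # Illustrative column list; in practice this would be the DataFrame's columns.
    columns = [
        "NAME_EDUCATION_TYPE_CODE_GENDER_AMT_CREDIT_mean",
        "OCCUPATION_TYPE_AMT_ANNUITY_mean",
    ]
    print(missing_features(aggregations, columns))
```

If the printed list is not empty, the aggregation step that should create those columns either did not run or names its outputs differently from what the lookup expects.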
Releases(solution-6)