Garbage classification using structure data.

Overview

垃圾分类模型使用说明

1.包含以下数据文件

文件 描述
data/MaterialMapping.csv 物体以及其归类的信息
data/TestRecords 光谱原始测试数据 CSV 文件
data/TestRecordDesc.zip CSV 文件描述文件
data/Boundaries.csv 物体轮廓信息

2.包含以下模型文件

文件夹 描述
output/Category/ 包含预测大类别的分类模型
output/Material/ 包含预测大类别(4类)的分类模型
output/Backgroud/ 包含预测小类别(50类)的分类模型

3.环境配置

  进入garbage路径,在anaconda命令行运行pip install -r requirements.txt

4.数据预处理

  在anaconda命令行运行python data_preprocess.py,即可在data文件夹中生成AllEmbracingDataset.csv。若将来更新数据,按照和原来相同的格式和路径保存在data文件夹中,即可用data_preprocess.py生成更新后的数据集

  • 运行数据预处理Python脚本,将上述数据的信息集合到一个数据文件中
python code/data_preprocess.py -data_dir D:/datasets/garbage \
                        -test \
                        -groupbyObjID

运行脚本生成的数据文件 datasets/AllEmbracingDataset.csv 数据集

5.模型训练Python脚本

python code/train_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess

其他 Python脚本说明:

  • feature_engineering.py 特征工程代码
  • ref.py 数据处理和模型推理所需的配置文件
  • utils.py 数据处理所需的一些函数
  • gbdt_feature.py 用gbdt模型生成特征

6.模型推理Python脚本

python code/predict_gbdt_lr.py -data_dir D:/datasets/garbage/ \
                    -use_groupbyID True \
                    -output_dir output/ \
                    -skip_data_preprocess \
                    -save_dir output/ 

  注1:只要同一个ObjID的多条数据的预测结果有一个不是背景零,最终预测结果就不是背景零。

  注2:预测出的Material只会是在训练数据中出现过的唯一标记号。这次数据中不同的唯一标记号共有148个,具体可参见output/log/log.txt中的LabelEncoder.classes

  • 预测结果文件(predictions.csv)说明:对每个物体(即每个ObjID,通常对应多条测试记录)给出多个预测结果汇总后的预测结果。
# 域名 意义
1 ObjID 被测物体唯一标记。同一物体会对应多条测试记录
2 Category 物体分类,从训练数据中获取
3 Material 物体对应的唯一标识号,从训练数据中获取
4 pred_Category 模型所预测出的物体分类
5 pred_Material 模型所预测出的物体唯一标识号
6 pred_background 模型预测的背景和物体 (背景标记为 0,物体标记为 1)
7 pred_Category_final 模型所预测出的物体分类
8 pred_Material_final 模型所预测出的物体材料分类

7. 模型精度

  对于Category、Material和Background三种场景的预测,我们均使用GBDT+LR模型。尝试过SVM、XGBoost、LightGBM和GBDT+LR模型,对比之下,GBDT+LR模型表现最好。   在测试集上的Accuracy如下:

场景 Accuracy
Category 0.7583130575831306
Material 0.6042173560421735
Background 0.996044825313118
Owner
wenqi
Learning is all you need!
wenqi
Image-Scaling Attacks and Defenses

Image-Scaling Attacks & Defenses This repository belongs to our publication: Erwin Quiring, David Klein, Daniel Arp, Martin Johns and Konrad Rieck. Ad

Erwin Quiring 163 Nov 21, 2022
Reference code for the paper CAMS: Color-Aware Multi-Style Transfer.

CAMS: Color-Aware Multi-Style Transfer Mahmoud Afifi1, Abdullah Abuolaim*1, Mostafa Hussien*2, Marcus A. Brubaker1, Michael S. Brown1 1York University

Mahmoud Afifi 36 Dec 04, 2022
Semantic graph parser based on Categorial grammars

Lambekseq "Everyone who failed Greek or Latin hates it." This package is for proving theorems in Categorial grammars (CG) and constructing semantic gr

10 Aug 19, 2022
PyTorch CZSL framework containing GQA, the open-world setting, and the CGE and CompCos methods.

Compositional Zero-Shot Learning This is the official PyTorch code of the CVPR 2021 works Learning Graph Embeddings for Compositional Zero-shot Learni

EML Tübingen 70 Dec 27, 2022
1st-in-MICCAI2020-CPM - Combined Radiology and Pathology Classification

Combined Radiology and Pathology Classification MICCAI 2020 Combined Radiology a

22 Dec 08, 2022
A library to inspect itermediate layers of PyTorch models.

A library to inspect itermediate layers of PyTorch models. Why? It's often the case that we want to inspect intermediate layers of a model without mod

archinet.ai 380 Dec 28, 2022
Neural network for digit classification powered by cuda

cuda_nn_mnist Neural network library for digit classification powered by cuda Resources The library was built to work with MNIST dataset. python-mnist

Nikita Ardashev 1 Dec 20, 2021
Xintao 1.4k Dec 25, 2022
MMFlow is an open source optical flow toolbox based on PyTorch

Documentation: https://mmflow.readthedocs.io/ Introduction English | 简体中文 MMFlow is an open source optical flow toolbox based on PyTorch. It is a part

OpenMMLab 688 Jan 06, 2023
Agile SVG maker for python

Agile SVG Maker Need to draw hundreds of frames for a GIF? Need to change the style of all pictures in a PPT? Need to draw similar images with differe

SemiWaker 4 Sep 25, 2022
The project of phase's key role in complex and real NN

Phase-in-NN This is the code for our project at Princeton (co-authors: Yuqi Nie, Hui Yuan). The paper title is: "Neural Network is heterogeneous: Phas

YuqiNie-lab 1 Nov 04, 2021
Veri Setinizi Yolov5 Formatına Dönüştürün

Veri Setinizi Yolov5 Formatına Dönüştürün! Bu Repo da Neler Var? Xml Formatındaki Veri Setini .Txt Formatına Çevirme Xml Formatındaki Dosyaları Silme

Kadir Nar 4 Aug 22, 2022
HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep.

HODEmu HODEmu, is both an executable and a python library that is based on Ragagnin 2021 in prep. and emulates satellite abundance as a function of co

Antonio Ragagnin 1 Oct 13, 2021
Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction

GraviCap Official code repository for ICCV 2021 paper: Gravity-Aware Monocular 3D Human Object Reconstruction. Gravity-Aware Monocular 3D Human-Object

Rishabh Dabral 15 Dec 09, 2022
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).

Scalable Incomplete Network Embedding ⠀⠀ A PyTorch implementation of Scalable Incomplete Network Embedding (ICDM 2018). Abstract Attributed network em

Benedek Rozemberczki 69 Sep 22, 2022
This git repo contains the implementation of my ML project on Heart Disease Prediction

Introduction This git repo contains the implementation of my ML project on Heart Disease Prediction. This is a real-world machine learning model/proje

Aryan Dutta 1 Feb 02, 2022
MlTr: Multi-label Classification with Transformer

MlTr: Multi-label Classification with Transformer This is official implement of "MlTr: Multi-label Classification with Transformer". Abstract The task

程星 38 Nov 08, 2022
Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Philipp Erler 329 Jan 06, 2023
Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection PyTorch code release of the paper "Attentive Prototypes for Sour

Deepti Hegde 23 Oct 17, 2022
The dataset of tweets pulling from Twitters with keyword: Hydroxychloroquine, location: US, Time: 2020

HCQ_Tweet_Dataset: FREE to Download. Keywords: HCQ, hydroxychloroquine, tweet, twitter, COVID-19 This dataset is associated with the paper "Understand

2 Mar 16, 2022