SimCSE在中文任务上的简单实验

Related tags

MiscellaneousSimCSE
Overview

SimCSE 中文测试

SimCSE在常见中文数据集上的测试,包含ATECBQLCQMCPAWSXSTS-B共5个任务。

介绍

文件

- utils.py  工具函数
- eval.py  评测主文件

评测

命令格式:

python eval.py [model_type] [pooling] [task_name] [dropout_rate]

使用例子:

python eval.py BERT cls ATEC 0.3

其中四个参数必须传入,含义分别如下:

- model_type: 模型,必须是['BERT', 'RoBERTa', 'WoBERT', 'RoFormer', 'BERT-large', 'RoBERTa-large', 'SimBERT', 'SimBERT-tiny', 'SimBERT-small']之一;
- pooling: 池化方式,必须是['first-last-avg', 'last-avg', 'cls', 'pooler']之一;
- task_name: 评测数据集,必须是['ATEC', 'BQ', 'LCQMC', 'PAWSX', 'STS-B']之一;
- dropout_rate: 浮点数,dropout的比例,如果为0则不dropout;

环境

测试环境:tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.5,如果在其他环境组合下报错,请根据错误信息自行调整代码。

下载

Google官方的两个BERT模型:

关于语义相似度数据集,可以从数据集对应的链接自行下载,也可以从作者提供的百度云链接下载。

其中senteval_cn目录是评测数据集汇总,senteval_cn.zip是senteval目录的打包,两者下其一就好。

相关

交流

QQ交流群:808623966,微信群请加机器人微信号spaces_ac_cn

Owner
苏剑林(Jianlin Su)
科学爱好者
苏剑林(Jianlin Su)
Manipulation OpenAI Gym environments to simulate robots at the STARS lab

liegroups Python implementation of SO2, SE2, SO3, and SE3 matrix Lie groups using numpy or PyTorch. [Documentation] Installation To install, cd into t

STARS Laboratory 259 Dec 11, 2022
Repo created for the purpose of adding any kind of programs and projects

Programs and Project Repository A repository for adding programs and projects of any kind starting from beginners level to expert ones Contributing to

Unicorn Dev Community 3 Nov 02, 2022
📜Generate poetry with gcc diagnostics

gado (gcc awesome diagnostics orchestrator) is a wrapper of gcc that outputs its errors and warnings in a more poetic format.

Dikson Santos 19 Jun 25, 2022
This is the code of Python enthusiasts collection and written.

I am Python's enthusiast, like to collect Python's programs and code.

cnzb 35 Apr 18, 2022
Algo próximo do ARP

ArpPY Algo parecido com o ARP-Scan. Dependencias O script necessita no mínimo ter o Python versão 3.x instalado e ter o sockets instalado. Executando

Feh's 3 Jan 18, 2022
In this project we will implement AirBnB clone using console

AirBnB Clone In this project we will implement AirBnB clone using console. Usage The shell should work like this

Nandweza Allan 1 Feb 07, 2022
We'll be using HTML, CSS and JavaScript for the frontend

We'll be using HTML, CSS and JavaScript for the frontend. Nothing to install in specific. Open your text-editor and start coding a beautiful front-end.

Mugada sai tilak 1 Dec 15, 2021
A compiler for ARM, X86, MSP430, xtensa and more implemented in pure Python

Introduction The PPCI (Pure Python Compiler Infrastructure) project is a compiler written entirely in the Python programming language. It contains fro

Windel Bouwman 277 Dec 26, 2022
KUIZ is a web application quiz where you can create/take a quiz for learning and sharing knowledge from various subjects, questions and answers.

KUIZ KUIZ is a web application quiz where you can create/take a quiz for learning and sharing knowledge from various subjects, questions and answers.

Thanatibordee Sihaboonthong 3 Sep 12, 2022
Irrigation Component V4 providing support for a custom card

Irrigation Component V4 This release sees the delivery of a custom card https://github.com/petergridge/irrigation_card to render the program options s

12 Oct 28, 2022
Taking the fight to the establishment.

Throwdown Taking the fight to the establishment. Wat? I wanted a simple markdown interpreter in python and/or javascript to output html for my website

Trevor van Hoof 1 Feb 01, 2022
HungryBall to prosta gra, w której gracz wciela się w piłkę.

README POLSKI Opis gry HungryBall to prosta gra, w której gracz wciela się w piłkę. Sterowanie odbywa się za pomocą przycisków w, a, s i d lub opcjona

Karol 1 Nov 24, 2021
Parametric Bottle in CADQuery

Parametric Bottle using CADQuery The proposed code makes it possible to generate different types and sizes of 3D bottles in order to train Pixel2mesh

Ayoub EL HOUDRI 1 May 22, 2022
Impf Bot.py 🐍⚡ automation for the German

Impf Bot.py 🐍⚡ automation for the German "ImpfterminService - 116117"

251 Dec 13, 2022
Age of Empires II recorded game parsing and summarization in Python 3.

mgz Age of Empires II recorded game parsing and summarization in Python 3. Supported Versions Age of Kings (.mgl) The Conquerors (.mgx) Userpatch 1.4

148 Dec 11, 2022
p5 is a Python package based on the core ideas of Processing.

p5 p5 is a Python library that provides high level drawing functionality to help you quickly create simulations and interactive art using Python. It c

p5py 645 Jan 04, 2023
Скрипт позволяет заводить задачи в Панель мониторинга YouTrack на основе парсинга сайта safe-surf.ru

Скрипт позволяет заводить задачи в Панель мониторинга YouTrack на основе парсинга сайта safe-surf.ru

Bad_karma 3 Feb 12, 2022
A blazing fast mass certificate generator script for the community ⚡

A simple mass certificate generator script for the community ⚡ Source Code · Docs · Raw Script Docs All you need Certificate Design a simple template

Tushar Nankani 24 Jan 03, 2023
my own python useful functions

PythonToolKit Motivation This Repo should help save time for data scientists' daily work regarding the Time Series regression task by providing functi

Kai 2 Oct 01, 2022
A small script I made that takes any standard Decklist of magic the gathering cards and pulls all card images from scryfall at once!

A small script I made that takes any standard Decklist of magic the gathering cards and pulls all card images from scryfall at once!

15 Aug 26, 2022