SimCSE在中文任务上的简单实验

Related tags

MiscellaneousSimCSE
Overview

SimCSE 中文测试

SimCSE在常见中文数据集上的测试,包含ATECBQLCQMCPAWSXSTS-B共5个任务。

介绍

文件

- utils.py  工具函数
- eval.py  评测主文件

评测

命令格式:

python eval.py [model_type] [pooling] [task_name] [dropout_rate]

使用例子:

python eval.py BERT cls ATEC 0.3

其中四个参数必须传入,含义分别如下:

- model_type: 模型,必须是['BERT', 'RoBERTa', 'WoBERT', 'RoFormer', 'BERT-large', 'RoBERTa-large', 'SimBERT', 'SimBERT-tiny', 'SimBERT-small']之一;
- pooling: 池化方式,必须是['first-last-avg', 'last-avg', 'cls', 'pooler']之一;
- task_name: 评测数据集,必须是['ATEC', 'BQ', 'LCQMC', 'PAWSX', 'STS-B']之一;
- dropout_rate: 浮点数,dropout的比例,如果为0则不dropout;

环境

测试环境:tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.5,如果在其他环境组合下报错,请根据错误信息自行调整代码。

下载

Google官方的两个BERT模型:

关于语义相似度数据集,可以从数据集对应的链接自行下载,也可以从作者提供的百度云链接下载。

其中senteval_cn目录是评测数据集汇总,senteval_cn.zip是senteval目录的打包,两者下其一就好。

相关

交流

QQ交流群:808623966,微信群请加机器人微信号spaces_ac_cn

Owner
苏剑林(Jianlin Su)
科学爱好者
苏剑林(Jianlin Su)
Hy - A dialect of Lisp that's embedded in Python

Hy Lisp and Python should love each other. Let's make it happen. Hy is a Lisp dialect that's embedded in Python. Since Hy transforms its Lisp code int

Hy Society 4.4k Jan 02, 2023
flake8 plugin which forbids match statements (PEP 634)

flake8-match flake8 plugin which forbids match statements (PEP 634)

Anthony Sottile 25 Nov 01, 2022
Pequenos programas variados que estou praticando e implementando, leia o Read.me!

my-small-programs Pequenos programas variados que estou praticando e implementando! Arquivo: automacao Automacao de processos de rotina com código Pyt

Léia Rafaela 43 Nov 22, 2022
p5 is a Python package based on the core ideas of Processing.

p5 p5 is a Python library that provides high level drawing functionality to help you quickly create simulations and interactive art using Python. It c

p5py 645 Jan 04, 2023
Small pip update helpers.

pipdate pipdate is a collection of small pip update helpers. The command pipdate # or python3.9 -m pipdate updates all your pip-installed packages. (O

Nico Schlömer 69 Dec 18, 2022
Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python

Synchrosqueezing is a powerful reassignment method that focuses time-frequency representations, and allows extraction of instantaneous amplitudes and frequencies

John Muradeli 382 Jan 06, 2023
Homed - Light-weight, easily configurable, dockerized homepage

homed GitHub Repo Docker Hub homed is a light-weight customizable portal primari

Matt Walters 12 Dec 15, 2022
An alternative app for core Armoury Crate functions.

NoROG DISCLAIMER: Use at your own risk. This is alpha-quality software. It has not been extensively tested, though I personally run it daily on my lap

12 Nov 29, 2022
Cobalt Strike Sleep Python Bridge

This project is 'bridge' between the sleep and python language. It allows the control of a Cobalt Strike teamserver through python without the need for for the standard GUI client. NOTE: This project

Cobalt Strike 140 Jan 04, 2023
Python - Aprendendo Python na ByLearn

PYTHON Identação Escopo Pai Escopo filho Escopo neto Variaveis

Italo Rafael 3 May 31, 2022
Estimate the Market Size for Electic and Plug-In Hybrid Vehicles In Africa

Estimate the Market Size for Electic and Plug-In Hybrid Vehicles In Africa The goal of this repository is to use open data repositories to answer the

Leonce Nshuti 0 Feb 21, 2022
Taxonomy addition for complete trees

TACT: Taxonomic Addition for Complete Trees TACT is a Python app for stochastic polytomy resolution. It uses birth-death-sampling estimators across an

Jonathan Chang 3 Jun 07, 2022
A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

Marshall Conover 2 Jan 12, 2022
This is a python package to get wards, districts,cities and provinces in Zimbabwe

Zim-Places Features This is a python package that allows you to search for cities, provinces, and districts in Zimbabwe.Zimbabwe is split into eight p

RONALD KANYEPI 2 Mar 01, 2022
Send notifications created in Frappe or ERPNext as push notication via Firebase Cloud Message(FCM)

FCM Notification for ERPNext Send notifications created in Frappe or ERPNext as push notication via Firebase Cloud Message(FCM) Steps to use the app:

Tridz 9 Nov 14, 2022
A webapp that timestamps key moments in a football clip

A look into what we're building Demo.mp4 Prerequisites Python 3 Node v16+ Steps to run Create a virtual environment. Activate the virtual environment.

Pranav 1 Dec 10, 2021
A one place destination to check whatever is trending on the top social and news websites at present.

UpTrend A one place destination to check whatever is trending on the top social and news websites at present. Explore the docs » View Demo · Report Bu

Google Developer Student Clubs - JGEC 10 Oct 03, 2021
Collections of python projects

nppy, mostly contains projects written in Python. Some projects are very simple while some are a bit lenghty and difficult(for beginners) Requirements

ghanteyyy 75 Dec 20, 2022
A stupid obfuscation thing

StupidObfuscation A stupid obfuscation thing How it works The obfuscator takes a string, splits into pieces of one, then, using the table from letter.

Echo 2 May 03, 2022
Given an array of integers, calculate the ratios of its elements that are positive, negative, and zero.

Given an array of integers, calculate the ratios of its elements that are positive, negative, and zero. Print the decimal value of each fraction on a new line with places after the decimal.

Shruti Dhave 2 Nov 29, 2021