CoSENT: a sentence-embedding scheme more effective than Sentence-BERT

Last update: Dec 12, 2022

Introduction

Blog post: https://kexue.fm/archives/8847
Data: https://github.com/bojone/BERT-whitening/tree/main/chn

Results

Trained on each task's train set, evaluated on its test set:

Model	ATEC	BQ	LCQMC	PAWSX	STS-B	Avg
BERT+CoSENT	49.74	72.38	78.69	60.00	80.14	68.19
Sentence-BERT	46.36	70.36	78.72	46.86	66.41	61.74
RoBERTa+CoSENT	50.81	71.45	79.31	61.56	81.13	68.85
Sentence-RoBERTa	48.29	69.99	79.22	44.10	72.42	62.80

Trained on NLI data, evaluated on each task's test set:

Model	ATEC	BQ	LCQMC	PAWSX	STS-B	Avg
BERT+CoSENT	28.93	41.84	66.07	20.49	73.91	46.25
Sentence-BERT	28.19	42.73	64.98	15.38	74.88	45.23
RoBERTa+CoSENT	31.84	46.65	68.43	20.89	74.37	48.43
Sentence-RoBERTa	31.87	45.60	67.89	15.64	73.93	46.99
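
As a quick sanity check on the two tables above, the sketch below recomputes the Avg column from the five per-task scores and prints how far each CoSENT row is ahead of its Sentence-BERT counterpart. The scores are copied from the tables. The names `TRAIN`, `NLI`, `PAIRS`, `average`, and `gain` are illustrative only and are not part of the repository.

```python
# Per-task scores copied from the tables above.
# Column order: ATEC, BQ, LCQMC, PAWSX, STS-B.
TRAIN = {
    "BERT+CoSENT": [49.74, 72.38, 78.69, 60.00, 80.14],
    "Sentence-BERT": [46.36, 70.36, 78.72, 46.86, 66.41],
    "RoBERTa+CoSENT": [50.81, 71.45, 79.31, 61.56, 81.13],
    "Sentence-RoBERTa": [48.29, 69.99, 79.22, 44.10, 72.42],
}
NLI = {
    "BERT+CoSENT": [28.93, 41.84, 66.07, 20.49, 73.91],
    "Sentence-BERT": [28.19, 42.73, 64.98, 15.38, 74.88],
    "RoBERTa+CoSENT": [31.84, 46.65, 68.43, 20.89, 74.37],
    "Sentence-RoBERTa": [31.87, 45.60, 67.89, 15.64, 73.93],
}

# (CoSENT row, baseline row) pairs that share the same encoder.
PAIRS = [
    ("BERT+CoSENT", "Sentence-BERT"),
    ("RoBERTa+CoSENT", "Sentence-RoBERTa"),
]


def average(scores):
    """Arithmetic mean of the per-task scores (the Avg column)."""
    return sum(scores) / len(scores)


def gain(table, cosent, baseline):
    """Difference in average score between a CoSENT row and its baseline."""
    return average(table[cosent]) - average(table[baseline])


for setting, table in (("train", TRAIN), ("NLI", NLI)):
    for cosent, baseline in PAIRS:
        delta = gain(table, cosent, baseline)
        print(f"{setting}: {cosent} vs {baseline}: {delta:+.2f}")
```

The recomputed means agree with the published Avg column to within 0.01, which is consistent with the table having been rounded from unrounded per-task scores. The gap is much larger in the supervised (train) setting than in the NLI setting.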

Environment

Requires bert4keras >= 0.10.8. My own experiments were run with tensorflow 1.15 + keras 2.3.1 + bert4keras 0.10.8.
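
A minimal sketch for checking the bert4keras requirement before running anything. The helpers `parse_version` and `meets_minimum` are hypothetical and not part of bert4keras. The sketch assumes purely numeric dotted version strings and that the installed package exposes a `__version__` attribute.

```python
MIN_BERT4KERAS = "0.10.8"  # minimum version stated in this README


def parse_version(text):
    """Turn '0.10.8' into (0, 10, 8) so versions compare numerically.

    Assumption: the string contains only dot-separated integers.
    """
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum=MIN_BERT4KERAS):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import bert4keras
    except ImportError:
        print("bert4keras is not installed")
    else:
        # Assumption: the package exposes a __version__ string.
        version = getattr(bert4keras, "__version__", None)
        if version is None:
            print("could not determine the bert4keras version")
        elif meets_minimum(version):
            print("bert4keras", version, "satisfies >=", MIN_BERT4KERAS)
        else:
            print("bert4keras", version, "is older than", MIN_BERT4KERAS)
```

Comparing integer tuples rather than raw strings matters here: as plain strings, "0.10.8" sorts before "0.9.0", which would give the wrong answer.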

Contact

QQ discussion group: 808623966. To join the WeChat group, add the bot's WeChat account spaces_ac_cn.

Owner

苏剑林(Jianlin Su)

Science enthusiast
