Code for the ACL2021 paper "Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction"

CSCBLI

Code for our ACL Findings 2021 paper,
"Combining Static Word Embedding and Contextual Representations for Bilingual Lexicon Induction".

Requirements

python >= 3.6
numpy >= 1.9.0
pytorch >= 1.0
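The minimum versions above can be checked before training. The following is only a sketch: the helper names `parse_version`, `meets_minimum`, and `check_requirements` are hypothetical and not part of this repository. It relies on PyTorch being imported under the module name `torch`.

```python
import importlib
import re
import sys


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if the installed version string is at least the minimum tuple."""
    have = parse_version(installed)
    width = max(len(have), len(minimum))
    # Pad with zeros so that "1.9" compares equal to (1, 9, 0).
    padded_have = have + (0,) * (width - len(have))
    padded_min = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_have >= padded_min


def check_requirements():
    """Collect readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < (3, 6):
        problems.append("python >= 3.6 is required")
    # Minimum versions are copied from the requirements list above.
    for module_name, minimum in (("numpy", (1, 9, 0)), ("torch", (1, 0))):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            problems.append(module_name + " is not installed")
            continue
        if not meets_minimum(module.__version__, minimum):
            problems.append(module_name + " " + module.__version__ + " is too old")
    return problems
```

Calling `check_requirements()` returns a list of messages, so an empty list means the environment satisfies the stated minimums.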

Supervised

How to train

CUDA_VISIBLE_DEVICES=0 python train.py --src_lang $lg --tgt_lang en \
        --static_src_emb_path $ssemb --static_tgt_emb_path $stemb \
        --context_src_emb_path $csemb --context_tgt_emb_path $ctemb \
        --train_data_path $data_path --save_path $save_path

--static_src_emb_path   aligned source static embedding path
--static_tgt_emb_path   aligned target static embedding path
--context_src_emb_path  source context embedding path
--context_tgt_emb_path  target context embedding path

How to test

CUDA_VISIBLE_DEVICES=0 python test_on_all_word.py --src_lang $lg \
        --tgt_lang en --model_path $model_path \
        --dict_path $dict_path \
        --vecmap_context_src_emb_path $vcpath \
        --vecmap_context_tgt_emb_path $vspath \
        --vecmap

--vecmap_context_src_emb_path  aligned source context embedding path
--vecmap_context_tgt_emb_path  aligned target context embedding path
--vecmap                       use the interpolation method; without this flag, the unified method is used
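This README does not spell out how the interpolation method combines the two embedding types. The sketch below shows one plausible form, assumed here rather than taken from the repository: the static-embedding similarity plus a weighted contextual-embedding similarity, with the weight playing the role of the `--lambda_w1` option of the unsupervised test script. The function names and the toy vectors are hypothetical, and the actual code may use a different similarity measure or normalisation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def interpolated_similarity(static_src, static_tgt, ctx_src, ctx_tgt, lambda_w1):
    """Assumed form: static similarity + lambda_w1 * contextual similarity."""
    return cosine(static_src, static_tgt) + lambda_w1 * cosine(ctx_src, ctx_tgt)


def best_translation(src_word, src_static, tgt_static, src_ctx, tgt_ctx, lambda_w1):
    """Return the target word with the highest interpolated score."""
    return max(
        tgt_static,
        key=lambda tgt: interpolated_similarity(
            src_static[src_word], tgt_static[tgt],
            src_ctx[src_word], tgt_ctx[tgt],
            lambda_w1,
        ),
    )


# Toy vectors (made up): the static space slightly prefers "dog",
# while the contextual space clearly prefers "cat".
src_static = {"gato": [1.0, 0.0]}
tgt_static = {"cat": [3.0, 4.0], "dog": [4.0, 3.0]}
src_ctx = {"gato": [0.0, 1.0]}
tgt_ctx = {"cat": [0.0, 1.0], "dog": [1.0, 0.0]}

static_only = best_translation("gato", src_static, tgt_static, src_ctx, tgt_ctx, 0.0)
combined = best_translation("gato", src_static, tgt_static, src_ctx, tgt_ctx, 0.5)
```

With a weight of 0 only the static similarity counts; raising the weight lets the contextual similarity change the ranking.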

Unsupervised

How to train

lg=ar
CUDA_VISIBLE_DEVICES=0 python train.py --src_lang en --tgt_lang $lg \
        --static_src_emb_path $ssemb --static_tgt_emb_path $stemb \
        --context_src_emb_path $csemb --context_tgt_emb_path $ctemb \
        --save_path $save_path

--static_src_emb_path   aligned source static embedding path
--static_tgt_emb_path   aligned target static embedding path
--context_src_emb_path  source context embedding path
--context_tgt_emb_path  target context embedding path

How to test

src=ar
tgt=en
model_path=../checkpoints/$src-$tgt-add_orign_nw.pkl_last
CUDA_VISIBLE_DEVICES=0 python test.py --model_path $model_path \
        --dict_path ../$src-$tgt.5000-6500.txt --mode v2 \
        --src_lang $src --tgt_lang $tgt \
        --reload_src_ctx $path1 \
        --reload_tgt_ctx $path2 --lambda_w1 0.11

--mode            v1 for the unified method, v2 for the interpolation method
--lambda_w1       weight used for interpolation
--reload_src_ctx  aligned source context embedding
--reload_tgt_ctx  aligned target context embedding
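The file passed to `--dict_path` is named like a MUSE-style test dictionary (`$src-$tgt.5000-6500.txt`). Assuming it holds one whitespace-separated source/target word pair per line, induced translations could be scored with precision@1 roughly as follows. This is an illustrative sketch only: `parse_dictionary` and `precision_at_1` are hypothetical helpers, and the repository's test script may compute its metric differently.

```python
from collections import defaultdict


def parse_dictionary(lines):
    """Map each source word to the set of its gold translations."""
    gold = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        src, tgt = parts
        gold[src].add(tgt)
    return dict(gold)


def precision_at_1(gold, predictions):
    """Fraction of source words whose top prediction is a gold translation."""
    if not gold:
        return 0.0
    hits = sum(1 for src, tgts in gold.items() if predictions.get(src) in tgts)
    return hits / len(gold)


# Toy data (made up): a source word may have several valid translations.
gold = parse_dictionary(["libro book", "libro books", "gato cat", ""])
score = precision_at_1(gold, {"libro": "books", "gato": "dog"})
```

A source word counts as correct when its top-ranked candidate matches any of its gold translations, so words with several dictionary entries are not penalised.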

Owner

Jinpeng Zhang