A dataset for online Arabic calligraphy

Overview

Calliar

Calliar is a dataset for Arabic calligraphy. The dataset consists of 2500 json files that contain strokes manually annotated for Arabic calligraphy. This repository contains the dataset for the following paper :

Calliar: An Online Handwritten Dataset for Arabic Calligraphy
Zaid Alyafeai, Maged S. Al-shaibani, Mustafa Ghaleb, Yousif Ahmed Al-Wajih
https://arxiv.org/abs/2106.10745

Abstract: Calligraphy is an essential part of the Arabic heritage and culture. It has been used in the past for the decoration of houses and mosques. Usually, such calligraphy is designed manually by experts with aesthetic insights. In the past few years, there has been a considerable effort to digitize such type of art by either taking a photo of decorated buildings or drawing them using digital devices. The latter is considered an online form where the drawing is tracked by recording the apparatus movement, an electronic pen for instance, on a screen. In the literature, there are many offline datasets collected with a diversity of Arabic styles for calligraphy. However, there is no available online dataset for Arabic calligraphy. In this paper, we illustrate our approach for the collection and annotation of an online dataset for Arabic calligraphy called Calliar that consists of 2,500 sentences. Calliar is annotated for stroke, character, word and sentence level prediction.

Stats

Dataset # of Samples # of Words # of Chars # of Strokes
Train 2,000 6,065 24,722 36,561
Valid 250 738 2,946 4,410
Test 250 753 3,052 4,601

Dataset Formats

Mainly, we have two basic formats.

.json

Each .json file contains a list of strokes. Each list is a dictionary of the stroke character and the list of points. Each composite character like ت is mapped into a list of primitive strokes i.e ..ٮ . Refer to the paper and to chars.py for more details on the mapping.

.npz

The compressed format of the dataset dataset.npz is only 8.6 MB and uses the Ramer-Douglas-Peucker Algorithm to decrease the number of points per stroke. The python library rdp was used for such task. The .npz format follows the same approach as QuickDraw.

Visualization

The vis.py file contains a list of python methods for easily visualizing the dataset. Here are two examples for drawing a sample json file and creating an animation.

import glob
import matplotlib.pyplot as plt 
import json 
from IPython.core.display import display, HTML, Video
from vis import *

## show an image of the strokes 
drawing = json.load(open(json_path))
print(get_annotation(json_path))
data = convert_3d(drawing)
draw_strokes(data, stroke_width = 2, crop = True)

## create an animation. 
create_animation(json_path)
Video("tmp/video.mp4")

Samples

sample_calliar_image_3

Animation

video_twitter.mp4
video_twitter_1.mp4
video_twitter_2.mp4
video_twitter_3.mp4

Citation

@misc{alyafeai2021calliar,
      title={Calliar: An Online Handwritten Dataset for Arabic Calligraphy}, 
      author={Zaid Alyafeai and Maged S. Al-shaibani and Mustafa Ghaleb and Yousif Ahmed Al-Wajih},
      year={2021},
      eprint={2106.10745},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
Implements the training, testing and editing tools for "Pluralistic Image Completion"

Pluralistic Image Completion ArXiv | Project Page | Online Demo | Video(demo) This repository implements the training, testing and editing tools for "

Chuanxia Zheng 615 Dec 08, 2022
an Evolutionary Algorithm assisted GAN

EvoGAN an Evolutionary Algorithm assisted GAN ckpts

3 Oct 09, 2022
[CVPR 2022] Deep Equilibrium Optical Flow Estimation

Deep Equilibrium Optical Flow Estimation This is the official repo for the paper Deep Equilibrium Optical Flow Estimation (CVPR 2022), by Shaojie Bai*

CMU Locus Lab 136 Dec 18, 2022
Code for: https://berkeleyautomation.github.io/bags/

DeformableRavens Code for the paper Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks. Here is the

Daniel Seita 121 Dec 30, 2022
Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21. We optimized wind turbine placement in a wind farm, subject to wake effects, using Q-learni

Manasi Sharma 2 Sep 27, 2022
Official code of CVPR 2021's PLOP: Learning without Forgetting for Continual Semantic Segmentation

PLOP: Learning without Forgetting for Continual Semantic Segmentation This repository contains all of our code. It is a modified version of Cermelli e

Arthur Douillard 116 Dec 14, 2022
Dynamic Environments with Deformable Objects (DEDO)

DEDO - Dynamic Environments with Deformable Objects DEDO is a lightweight and customizable suite of environments with deformable objects. It is aimed

Rika 32 Dec 22, 2022
Codes for our IJCAI21 paper: Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization

DDAMS This is the pytorch code for our IJCAI 2021 paper Dialogue Discourse-Aware Graph Model and Data Augmentation for Meeting Summarization [Arxiv Pr

xcfeng 55 Dec 27, 2022
[MICCAI'20] AlignShift: Bridging the Gap of Imaging Thickness in 3D Anisotropic Volumes

AlignShift NEW: Code for our new MICCAI'21 paper "Asymmetric 3D Context Fusion for Universal Lesion Detection" will also be pushed to this repository

Medical 3D Vision 42 Jan 06, 2023
The Adapter-Bot: All-In-One Controllable Conversational Model

The Adapter-Bot: All-In-One Controllable Conversational Model This is the implementation of the paper: The Adapter-Bot: All-In-One Controllable Conver

CAiRE 37 Nov 04, 2022
Download & Install mods for your favorit game with a few simple clicks

Husko's SteamWorkshop Downloader 🔴 IMPORTANT ❗ 🔴 The Tool is currently being rewritten so updates will be slow and only on the dev branch until it i

Husko 67 Nov 25, 2022
ICCV2021 - Mining Contextual Information Beyond Image for Semantic Segmentation

Introduction The official repository for "Mining Contextual Information Beyond Image for Semantic Segmentation". Our full code has been merged into ss

55 Nov 09, 2022
git《USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation》(2020) GitHub: [fig2]

USD-Seg This project is an implement of paper USD-Seg:Learning Universal Shape Dictionary for Realtime Instance Segmentation, based on FCOS detector f

Ruolin Ye 80 Nov 28, 2022
Car Parking Tracker Using OpenCv

Car Parking Vacancy Tracker Using OpenCv I used basic image processing methods i

Adwait Kelkar 30 Dec 03, 2022
Code for the paper 'A High Performance CRF Model for Clothes Parsing'.

Clothes Parsing Overview This code provides an implementation of the research paper: A High Performance CRF Model for Clothes Parsing Edgar Simo-S

Edgar Simo-Serra 119 Nov 21, 2022
Seg-Torch for Image Segmentation with Torch

Seg-Torch for Image Segmentation with Torch This work was sparked by my personal research on simple segmentation methods based on deep learning. It is

Eren Gölge 37 Dec 12, 2022
One Million Scenes for Autonomous Driving

ONCE Benchmark This is a reproduced benchmark for 3D object detection on the ONCE (One Million Scenes) dataset. The code is mainly based on OpenPCDet.

148 Dec 28, 2022
An inofficial PyTorch implementation of PREDATOR based on KPConv.

PREDATOR: Registration of 3D Point Clouds with Low Overlap An inofficial PyTorch implementation of PREDATOR based on KPConv. The code has been tested

ZhuLifa 14 Aug 03, 2022
Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors, CVPR 2021

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors Human POSEitioning System (H

Aymen Mir 66 Dec 21, 2022
FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch. Detectron Detectron is Facebook AI Research's software sy

Facebook Research 25.5k Jan 07, 2023