An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix

Overview

An experimental Fang Song style Chinese font generated with skeleton-tracing and pix2pix, with glyphs based on cwTeXFangSong. The font is optimised for vertical typesetting. Below is a sample:

The font contains roughly 13,000 glyphs, mostly for traditional Chinese.

I created the font for one of my own projects. The font is admittedly not perfect, but nevertheless have many ineteresting features; therefore I am sharing the font file and programs used to generate it.

Download

Download the font directly at dist/tkFangSong.ttf.

The name of the font is 剔骨仿宋 (thek-kwot-fang-song, So named because the algorithm that created it resembles "deboning"). It is licensed under SIL open font license. If you wish to credit the author, you might use my name 黃令東/黄令东, or the romanization "Lingdong Huang".

Features

The font builds on the elegant shapes from cwTeXFangSong to add more hand-made look and feel reminiscent of the aesthetics of old woodblock printed books.

The font has a wider proportion compared to the original cwTeXFangSong, and is further widened towards the bottom, to accentuate the finishing strokes. The "center of mass" is also moved downwards:

Above right is a visualization of the base function used to warp the skeleton.

The height of a glyph is additionally tweaked based on its vertical complexity, computed with Sobel operator and taking the max of each pixel row.

Many fonts are optimized for horizontal typesetting, and as such, when arranged vertically, the center of mass shifts left and right, giving a jagged look. This font attempts to solve the problem by computing centroids (via image moments) and aligning them.

The font has rich textures. Some of them are artifacts produced by pix2pix network; others are fine-tuned noises delibrately added.

It is to be noted that, as an automated process, it doesn't always produce optimal results; some characters might end up looking ugly, or use the wrong caligraphic movement for certain strokes; For some caligraphers, some strokes might appear too "weak" to their tastes.

Process

The medial axis (skeleton) is computed for each raster rendering of the glyphs in the original font. (The resultant hershey font can be found at ./dist/CWFS64J.HF.TXT)

Pairs of images: the original rendering vs the skeletons are sent to pix2pix for training. pix2pix learns the correspondance and becomes capable of turning skeletons to glyphs.

New skeletons are generated by warping the originals according to my (questionable) taste.

All the new skeletons are fed into the trained network to obtain the new glyphs. The new glyphs are warped in structure, but the weight and shape of the strokes still look legit.

Some post-processing is applied, and potrace is used to re-vectorize the glyphs. Finally, fontforge is used to create a TTF file.

Building the font

Note that to use the font, you can simply download it here. This sections is for reproducing the results from scratch.

The scripts used to build the font are included in the workflow/ folder. Note that making the font is a quite involved process (especially the part of training the neural net). You might also need to modify the scripts to fit your system/folder configuration, but here are some rough steps:

  • Get OpenCV, tensorflow<=1.13.1, numpy and friends
  • First install swig version of skeleton-tracing for python, then obtain cwTeXFangSong from here.
  • Run skel.py > CWFS64.HF.TXT, then join.py > CWFS64J.HF.TXT
  • Modify pairs.py to read CWFS64J.HF.TXT and output to a folder you will create.
  • Download pix2pix-tensorflow and train on the images created in the previous step. (Good luck getting a prehistoric tensorflow project running, you'll need it. Once you do it's pretty straightforward, follow their README)
  • I trained 40K steps on 1.3K images, afterwhich the quality didn't seem to improve much, your mileage might vary.
  • Run warp.py > CWFS64W3.HF.TXT. Modify pairs.py to read from it, and create an output folder like before. Run pairs.py.
  • Evaluate the new image pairs with pix2pix. Edit any of the images that have too much defect with Photoshop or software of your choice. Put the edited ones in a separated folder, say retouched/.
  • Modify refine.py to read from the images and retouched images folders, create an output folder fine/ for it, and run the script.
  • Download potrace, make it runnable from commandline, and run trace_all.py.
  • Run forgefont.py to create a TTF from SVGs generated in the previous step.
  • Done! You can also preview the glyphs with preview.py > index.html, or preview the skeleton with preview_hf.py > index.html.

A PDF containing all the glyphs can be found here. If you find this font not bad, you might also enjoy qiji-font, a more authentic reproduction of a historical typeface.

Owner
Lingdong Huang
Lingdong Huang
An online markdown resume template project, based on pywebio

An online markdown resume template project, based on pywebio

极简XksA 5 Nov 10, 2022
TextStatistics - Get a text file wich contains English text

TextStatistics This program get a text file wich contains English text. The program analyses the text, and print some information. For this program I

2 Nov 15, 2021
Word and phrase lists in CSV

Word Lists Word and phrase lists in CSV, collected from different sources. Oxford Word Lists: oxford-5k.csv - Oxford 3000 and 5000 oxford-opal.csv - O

Anton Zhiyanov 14 Oct 14, 2022
Format Covid values to ASCII-Table (Only for Germany and Austria)

Covid-19-Formatter (Only for Germany and Austria) Dieses Script speichert die gemeldeten Daten des RKIs / BMSGPK und formatiert diese zu einer Asci Ta

56 Jan 22, 2022
LazyText is inspired b the idea of lazypredict, a library which helps build a lot of basic models without much code.

LazyText is inspired b the idea of lazypredict, a library which helps build a lot of basic models without much code. LazyText is for text what lazypredict is for numeric data.

Jay Vala 13 Nov 04, 2022
Extract knowledge from raw text

Extract knowledge from raw text This repository is a nearly copy-paste of "From Text to Knowledge: The Information Extraction Pipeline" with some cosm

Raphael Sourty 10 Dec 03, 2022
The project is investigating methods to extract human-marked data from document forms such as surveys and tests.

The project is investigating methods to extract human-marked data from document forms such as surveys and tests. They can read questions, multiple-choice exam papers, and grade.

Harry 5 Mar 27, 2022
A slugifier that works in unicode

Unicode Slugify Unicode Slugify is a slugifier that generates unicode slugs. It was originally used in the Firefox Add-ons web site to generate slugs

Mozilla 315 Nov 21, 2022
Python tool to make adding to your armory spreadsheet armory less of a pain.

Python tool to make adding to your armory spreadsheet armory slightly less of a pain by creating a CSV to simply copy and paste.

1 Oct 20, 2021
You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

You can encode and decode base85, ascii85, base64, base32, and base16 with this tool.

8 Dec 20, 2022
py-trans is a Free Python library for translate text into different languages.

Free Python library to translate text into different languages.

I'm Not A Bot #Left_TG 13 Aug 27, 2022
Tools to extract questionaire of finalexam.eu and provide interactive questionaire with summary

AskMe This script is completely terminal based. No user interface is added. You can get the command line options by using the --help argument. Make su

David Loewe 1 Nov 09, 2021
🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

🐸 Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! 🧙‍♀️

Brandon 5.6k Jan 03, 2023
Split large XML files into smaller ones for easy upload

Split large XML files into smaller ones for easy upload. Works for WordPress Posts Import and other XML files.

Joseph Adediji 1 Jan 30, 2022
"Complexity" of Flags of the countries of the world

"Complexity" of Flags of the countries of the world Flags (png) from: https://flagcdn.com/w2560.zip https://flagpedia.net/download/images run: chmod +

Alexander Lelchuk 1 Feb 10, 2022
A python tool one can extract the "hash" from a WINDOWS HELLO PIN

WINHELLO2hashcat About With this tool one can extract the "hash" from a WINDOWS HELLO PIN. This hash can be cracked with Hashcat, more precisely with

33 Dec 05, 2022
Python Q&A for Network Engineers

Q & A I am often asked questions about how to solve this or that problem, and I decided to post these questions and solutions here, in case it is also

Natasha Samoylenko 30 Nov 15, 2022
🍋 A Python package to process food

Pyfood is a simple Python package to process food, in different languages. Pyfood's ambition is to be the go-to library to deal with food, recipes, on

Local Seasonal 8 Apr 04, 2022
text-to-speach bot - You really do NOT have time for read a newsletter? Now you can listen to it

NewsletterReader You really do NOT have time for read a newsletter? Now you can listen to it The Newsletter of Filipe Deschamps is a great place to re

ItanuRomero 8 Sep 18, 2021
知乎评论区词云分析

zhihu-comment-wordcloud 知乎评论区词云分析 起源于:如何看待知乎问题“男生真的很不能接受彩礼吗?”的一个回答下评论数超8万条,创单个回答下评论数新记录? 项目代码说明 2.download_comment.py 下载全量评论 2.word_cloud_by_dt 生成词云 2

李国宝 10 Sep 26, 2022