This is version 1.4 of Hspell, the free Hebrew spellchecker and morphology engine.

You can get Hspell from:
	http://hspell.ivrix.org.il/

Hspell was written by Nadav Har'El and Dan Kenigsberg:
	nyh @ math.technion.ac.il
	danken @ cs.technion.ac.il

Hspell is free software, released under the GNU Affero General Public License (AGPL) version 3. Note that not only the programs in the distribution, but also the dictionary files and the generated word lists, are licensed under the AGPL. There is no warranty of any kind for the contents of this distribution. See the LICENSE file for more information and the exact license terms.

The rest of this README file explains Hspell's spelling standard (niqqud-less), a bit about the technology behind Hspell, how to use the "hspell" program (but see the manual page for more current information), and lists a few future directions.

See the separate INSTALL file for instructions on how to install Hspell.

About Hspell's spelling standard
--------------------------------

Hspell was designed to comply fully and strictly with the official niqqud-less spelling rules ("Ha-ktiv Khasar Ha-niqqud", colloquially known as "Ktiv Male", or "plene spelling" in English), published by the Academy of the Hebrew Language.

This is both an advantage and a disadvantage, depending on your viewpoint. It is an advantage because it encourages a *correct* and consistent spelling style throughout your writing. It is a disadvantage because a few of the Academy's official spelling decisions are relatively unknown to the general public.

Users of Hspell (and all Hebrew writers, for that matter) are encouraged to read the Academy's official niqqud-less spelling rules (which are printed at the end of most modern Hebrew dictionaries), and to refer to Hebrew dictionaries which use the niqqud-less spelling (such as Millon Ha-hove or Rav Milim).

We also provide in doc/niqqudless.odt a document (in Hebrew) which describes in detail Hspell's spelling standard, and why certain words are spelled the way they are.

The technology behind Hspell
----------------------------

The "hspell" program itself is mostly a simple (but efficient) program that checks input words against a long list of valid words. The real "brains" behind it are the word lists (lexicon) provided by the Hspell project.

In order for it to be completely free of other people's copyright restrictions, the Hspell project is a clean-room implementation, not based on other companies' word lists, on other companies' spell checkers, or on copying of printed dictionaries. The word list is also not based on automatic scanning of available Hebrew documents (such as online newspapers), because there is no way to guarantee that such a list will be correct, complete, or consistent with regard to spelling rules.

Instead, our idea was to write programs which know how to correctly inflect Hebrew nouns and conjugate Hebrew verbs. The inputs to these programs are lists of noun stems and of verb roots, plus hints needed for the correct inflection when these cannot be figured out automatically. These input files are obviously an important part of the Hspell project. The "word list generators" (written in Perl, and also part of the Hspell project) then create the complete word list for use by the spellchecking program, hspell. The generated lists are useful for much more than spellchecking, by the way - see more on that below ("the future").

Although we wrote all of Hspell's code ourselves, we are truly indebted to the old-style "open source" pioneers - people who wrote books about the knowledge they developed, instead of hiding it in proprietary software. For the correct noun inflections, Dr. Shaul Barkali's "The Complete Noun Book" has been a great help. Prof. Uzzi Ornan's booklet "Verb Conjugation in Flow Charts" has been instrumental in the implementation of verb conjugation, and Barkali's "The Complete Verb Book" was used too.

During our work we have extensively used a number of Hebrew dictionaries, including Even Shoshan, Millon Ha-hove and Rav-Milim, to ensure the correctness of certain words. Various Hebrew newspapers and books, both printed and online, were used for inspiration and for finding words we still do not recognize.

We wish to thank Cilla Tuviana and Dr. Zvi Har'El for their assistance with some grammatical questions.

Using hspell
------------

After unpacking the distribution and running "configure", "make" and "make install" (see the INSTALL file for more information), the hspell executable is installed (by default) in /usr/local/bin, and the dictionary files are in /usr/local/share/hspell.

The "hspell" program can be used on any sort of text file containing Hebrew and, potentially, non-Hebrew characters, which it ignores. For example, it works well on Hebrew text files, TeX/LaTeX files, and HTML. Running

	hspell filename

will check the spelling in filename and will output the list of incorrect words (just like the old-fashioned UNIX "spell" program did). If run without a file parameter, hspell reads from its standard input.

In the current release, hspell expects ISO-8859-8-encoded files. If files using a different encoding (e.g., UTF-8) are to be checked, they must first be converted to ISO-8859-8 (e.g., see iconv(1), recode(1)).

If the "-c" option is given, hspell will suggest corrections for misspelled words, whenever it can find such corrections. The correction mechanism in this release is especially good at finding corrections for incorrect niqqud-less spellings, with missing or extra 'immot-qri'a.
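
As a rough illustration of the ISO-8859-8 requirement mentioned above, the following Python sketch converts UTF-8 text before passing it to hspell on standard input, and decodes the word list that comes back. It is not part of the Hspell distribution: the function names are invented for this example, it assumes an "hspell" executable can be found on the PATH, and it assumes hspell writes its output in the same ISO-8859-8 encoding it reads.

```python
import subprocess

# Python's codec name for ISO-8859-8, the encoding hspell expects.
HSPELL_ENCODING = "iso8859_8"


def to_hspell_bytes(text):
    """Encode text as ISO-8859-8 (hypothetical helper).

    Characters with no ISO-8859-8 equivalent, niqqud marks for
    example, are replaced by '?' instead of raising an error.
    """
    return text.encode(HSPELL_ENCODING, errors="replace")


def from_hspell_bytes(data):
    """Decode ISO-8859-8 bytes back into a Python string."""
    return data.decode(HSPELL_ENCODING, errors="replace")


def check_utf8_file(path, hspell_cmd="hspell"):
    """Run hspell on a UTF-8 file and return its output lines.

    hspell_cmd is an assumption: it should name an installed
    hspell executable. The exit status is not inspected here.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()
    result = subprocess.run(
        [hspell_cmd],
        input=to_hspell_bytes(text),   # hspell reads its standard input
        stdout=subprocess.PIPE,
    )
    return from_hspell_bytes(result.stdout).splitlines()
```

With Hspell installed, check_utf8_file("letter.txt") would return the words hspell reported for a (hypothetical) UTF-8 file named letter.txt, already decoded and in logical order.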

The "-l" (verbose) option will explain, for each correct word, why it was recognized, if Hspell was built with the optional "linginfo" feature enabled. A morphological analysis is shown, i.e., a full description of all possible ways to read the given word as an inflected word with optional prefixes.

Because hspell's output (naturally) is "logical-order", it is normally useful to pipe it to bidiv or rev before viewing. For example:

	hspell -c filename | bidiv | less

Another convenient alternative is to run hspell on a BiDi-enabled terminal.

Instead of using the hspell program described above, users can also use Hspell's lexicon through one of the popular multi-lingual spell-checkers, aspell and hunspell. See the INSTALL file for more information on building these dictionaries.

How *you* can help
------------------

By now, Hspell is fairly mature, and its lexicon of over 24,000 base words is fairly comprehensive, similar in breadth to some printed dictionaries. Careful attention has also been given to its accuracy, and to its conformance with the spelling rules of the Academy of the Hebrew Language.

Nevertheless, Hspell does not, and probably never will, cover all of the modern Hebrew language. It may well contain some errors, too. If you find such omissions or errors, please let us know.

Before reporting such omissions or errors, please try to verify that the word you are proposing is indeed correctly spelled: please refer to modern dictionaries. Please also look at doc/niqqudless.odt - the word you are proposing might actually be a known misspelling which we discuss in that document.