मराठी भाषा वाचविण्याचा एक प्रयास. इंग्रजी ते मराठीचा शब्दकोश. An attempt to preserve the Marathi language. A lightweight and ad-free English-to-Marathi thesaurus.

Overview

For English, scroll down

मराठी शब्द

मराठी भाषा वाचवण्यासाठी मी हा ओपन सोर्स प्रोजेक्ट सुरू केला आहे.

माझ्या मते, आपली भाषा हळूहळू आणि कोणाच्याही लक्षात न येता एका मृत भाषेच्या दिशेने वाटचाल करत आहे. या उपक्रमात सगळ्यांचे स्वागत आहे, ज्यांना कोणाला हा एक गंभीर विषय वाटतो व त्यात काही सुधारणा करण्याची गरज आहे असे वाटते.

अगदी सोप्या रीतीने सांगायचं झालं तर खालील उदाहरण पहा -

१. मराठी वाक्यांमधील इंग्रजी शब्दांचा जास्त आणि अनावश्यक वापर.

  • अयोग्य - "फार bore झालंय. चला एखादा picture बघूया."
  • योग्य - "फार कंटाळा आलाय. चला एखादा चित्रपट बघूया. "

२. देवनागरीऐवजी लॅटिन अक्षरे वापरुन मराठी टायपिंग / लिहिणे

  • अयोग्य - "me tujhya sobat marathit bolat ahe."
  • योग्य - "मी तुझ्या सोबत मराठीत बोलत आहे."

अधिक माहितीसाठी खालील इंग्रजी मजकूर वाचा. आपण सॉफ्टवेअर अभियंते जरी नसाल तरीही आपण योगदान करू शकता.

योगदान करण्यासाठी

१. "Github" वर आपले खाते बनवा

२. "Discussions" पृष्ठावरील आपल्या कल्पना, टिप्पण्या इ. वर चर्चा करा.

Marathi shabd

About

This project is being developed as a part of an effort to help save the Marathi language from its gradual and unnoticeable decline into a dying language.

Goal


(This is the goal of the overall idea and not just this project.)

Revive the use of the Marathi language in its original, unadulterated form in day-to-day life, in both its spoken and written forms.

How to do it?


  1. Make people realise that these problems exist
  2. Motivate them to work towards fixing them
  3. Provide them with resources (this project is a part of this step)
  4. Ask them to actually put this into practice in their daily life

This will be done with a combination of videos, blogs and software tools such as this one. (Contributions to all of these are welcome.)

Overview of this project

The idea is to have a static website (ad-free, bloat-free and fast) where people looking to improve their Marathi vocabulary can search for an English word/phrase and quickly find its Marathi equivalent, along with a usage example wherever possible.

Words can also be categorised into various topics (tags), so that words used in the same context can be found together to improve one's vocabulary for those particular topics. More features can be added in the future, if necessary.

So basically it will be an ad-free and fast English-to-Marathi thesaurus for day-to-day words, with some additional features.
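A minimal sketch of how the search and tagging described above might look in Python. Nothing here is the project's actual design: the `Entry` fields, the helper names and the tag values are assumptions made for illustration. The two sample words come from the example sentences in this readme.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entry:
    """One thesaurus row. The field names are illustrative assumptions."""
    english: str
    marathi: str
    example: str = ""            # optional usage example in Marathi
    tags: Tuple[str, ...] = ()   # topics the word belongs to (hypothetical values)


# Sample rows based on the example sentences in this readme.
ENTRIES = [
    Entry("bore", "कंटाळा", "फार कंटाळा आलाय.", ("feelings",)),
    Entry("picture", "चित्रपट", "चला एखादा चित्रपट बघूया.", ("entertainment",)),
]


def build_index(entries: List[Entry]) -> Dict[str, List[Entry]]:
    """Map each lower-cased English word to the entries that define it."""
    index: Dict[str, List[Entry]] = {}
    for entry in entries:
        index.setdefault(entry.english.lower(), []).append(entry)
    return index


def lookup(index: Dict[str, List[Entry]], word: str) -> List[Entry]:
    """Return all entries for an English word, ignoring case and outer spaces."""
    return index.get(word.strip().lower(), [])


def by_tag(entries: List[Entry], tag: str) -> List[Entry]:
    """Return all entries that carry the given topic tag."""
    return [entry for entry in entries if tag in entry.tags]
```

For example, `lookup(build_index(ENTRIES), "Picture")` returns the entry whose Marathi equivalent is चित्रपट, and `by_tag(ENTRIES, "entertainment")` groups words from the same topic.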

Development and contribution

The project is currently at a very early stage: I am still conceptualising it and looking for contributors (developers as well as people well versed in the Marathi language).

Some areas where you can contribute

  • Database update - adding English words with their Marathi equivalents
  • Static website creation - parsing the database and generating an output Markdown file with all the content. This file will be used for the github.io static website.
    • note - I would particularly like help in this area as it is new to me as well.
  • Adding/correcting Marathi content in this project's documentation (readme, website pages, etc.)

(This is the current plan and can be improved.)
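As a starting point for the static website step, here is a rough sketch of turning the database into a Markdown table. The database format has not been decided yet, so the CSV layout, the column names (`english`, `marathi`, `example`) and the function name are all assumptions, not the project's real schema.

```python
import csv
import io

# Hypothetical database content; a real run would read this from a file.
SAMPLE_DB = """english,marathi,example
picture,चित्रपट,चला एखादा चित्रपट बघूया.
bore,कंटाळा,फार कंटाळा आलाय.
"""


def escape_cell(text: str) -> str:
    """Escape pipe characters so they do not break the Markdown table."""
    return text.replace("|", "\\|")


def render_markdown(csv_text: str) -> str:
    """Parse CSV text and return a Markdown table sorted by English word."""
    rows = sorted(
        csv.DictReader(io.StringIO(csv_text)),
        key=lambda row: row["english"].lower(),
    )
    lines = [
        "| English | Marathi | Example |",
        "| --- | --- | --- |",
    ]
    for row in rows:
        cells = (row["english"], row["marathi"], row["example"])
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines) + "\n"
```

Calling `render_markdown(SAMPLE_DB)` returns the table as a string, which could then be written to the Markdown file that the static site serves.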

Please share your ideas, comments etc. on the "Discussions" page.

I also have quite a few other ideas for creating resources in the Marathi language, which I plan to start once this project's website is ready at a usable level.

What is the need to do this?

As I see it, there are two main problems, which are explained below -

  1. Excessive use of English words in Marathi sentences.

Simply stated, this means using many English words in our sentences where we could easily use Marathi words. Example -

  • Not OK - "फार bore झालंय. चला एखादा picture बघूया."
  • OK - "फार कंटाळा आलाय. चला एखादा चित्रपट बघूया. "

The direct consequence of this is that we are losing our grip on the Marathi vocabulary. The problem keeps growing like a snowball, and it needs external force and motivation to fix. It exists in both the spoken and the written form. While it is particularly serious in the urban population, it may also spread to rural areas as the reach of English schools and the internet widens.

This project currently addresses only the above problem.

  2. Typing/writing Marathi using the Latin alphabet instead of Devanagari.

This is basically typing Marathi like this -

  • Not OK - "me tujhya sobat marathit bolat ahe."
  • OK - "मी तुझ्या सोबत मराठीत बोलत आहे."

This is a problem that I feel should not exist today, as we now have good keyboards for typing Marathi in Devanagari on all platforms, be it mobile or computers. However, it continues to exist because people find it easier to type using the Latin alphabet on a QWERTY keyboard.
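Both problems leave the same visible trace in written text: Latin letters where Devanagari would be expected. The sketch below shows one possible way a tool could spot them. It is only an illustration, not part of this project; the function name is hypothetical, and it only checks for basic (ASCII) Latin letters.

```python
def latin_words(text: str) -> list:
    """Return the words in `text` that contain basic Latin letters.

    Punctuation around each word is stripped before it is reported.
    Devanagari letters (Unicode block U+0900 to U+097F) are not ASCII,
    so words written fully in Devanagari are never flagged.
    """
    flagged = []
    for word in text.split():
        cleaned = word.strip(".,!?\"'")
        if any(ch.isascii() and ch.isalpha() for ch in cleaned):
            flagged.append(cleaned)
    return flagged
```

On the mixed sentence "फार bore झालंय. चला एखादा picture बघूया." this flags `bore` and `picture`, which could then be looked up in the thesaurus. On a sentence typed fully in the Latin alphabet it flags every word, and on a sentence typed fully in Devanagari it flags nothing.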

Owner
मुक्त स्त्रोत