WildHack 2021 solution by Nuclear Foxes team (public version).

Overview

WildHack 2021

Nuclear Foxes Team

This repo contains our project for the Wildberries Hackathon 2021.

Task 2: Searching tags

Implement an algorithm of recommendation hints, which reduces the search for the desired products in the system.

Prerequirements

Project structure

  • embeddings_prep.py is used for lemmatizing text words and mapping to their embedding representation. We use Natasha library.
  • popularity.py is used for mapping words to their popularity.
  • bot.py is used for launching bot.
  • metrics.py is our small file to separate the cosine similarity metric we use from the rest of the project.

Launch order

  1. pip install requirements.txt in order to install all the required libraries.
  2. Launch popularity.py to create popularity.pkl containing information about possible tags' relative 'popularity'.
  3. Launch embeddings_prep.py to create embeddings.pkl containing information about possible tags' embeddings.
  4. Launch bot.py and input your bot token to launch Telegram bot with your token which would allow you to get tags for you queries based on results of previous steps.

However, this repo also contains our pretrained version of the solution (the required for bot.py .pkl files) so it can be used out-of-box by installing required dependances, setuping Navec and simply running bot.py. Still, feel free to go into both embeddings_prep.py and popularity.py to try to tune some parameters.

Owner
Sergey Zakharov
Sergey Zakharov
Jarvis Python BOT acts like Google-assistance

Jarvis-Python-BOT Jarvis Python BOT acts like Google-assistance Setup Add Mail ID (Gmail) in the file at line no 82.

Ishan Jogalekar 1 Jan 08, 2022
Webcash is an experimental e-cash (electronic cash)

Webcash Webcash is an experimental new electronic cash ("e-cash") that enables decentralized and instant payments to anyone, anywhere in the world. Us

Mark Friedenbach 0 Feb 26, 2022
Implent of Oracle Base line and Lea-3 Baseline

Oracle-Baseline Implent of Oracle Base line and Lea-3 Baseline Oracle Oracle : This model is used to obtain an oracle with a greedy algorithm similar

Andrew Zeng 2 Nov 12, 2021
Calculate the efficient frontier

关于 代码主要参考Fábio Neves的文章,你可以在他的文章中找到一些细节性的解释

Wyman Lin 104 Nov 11, 2022
A prototype COG-based tile server for sparse Mars datasets

Mars tiler Mars Tiler is a prototype web application that serves tiles from cloud-optimized GeoTIFFs, with an emphasis on supporting planetary dataset

Daven Quinn 3 Mar 23, 2022
MeerKAT radio telescope simulation package. Built to simulate multibeam antenna data.

MeerKATgen MeerKAT radio telescope simulation package. Designed with performance in mind and utilizes Just in time compile (JIT) and XLA backed vectro

Peter Ma 6 Jan 23, 2022
dragmap-meth: Fast and accurate aligner for bisulfite sequencing reads using dragmap

dragmap_meth (dragmap_meth.py) Alignment of BS-Seq reads using dragmap. Intro This works for single-end reads and for paired-end reads from the direct

Shaojun Xie 3 Jul 14, 2022
Schemdule is a tiny tool using script as schema to schedule one day and remind you to do something during a day.

Schemdule is a tiny tool using script as schema to schedule one day and remind you to do something during a day. Platform Python Install Use pip: pip

StardustDL 4 Sep 13, 2021
A Python Based Utility for Processing GST-Return JSON Files to Multiple Formats

GSTR 1/2A Utility by Shan.tk Open Source GSTR 1/GSTR 2A JSON to Excel utility based on Python. Useful for Auditors in Verifying GSTR 1 Return Invoices

Sudharshan TK 1 Oct 08, 2022
Python script to preprocess images of all Pokémon to finetune ruDALL-E

ai-generated-pokemon-rudalle Python script to preprocess images of all Pokémon (the "official artwork" of each Pokémon via PokéAPI) into a format such

Max Woolf 132 Dec 11, 2022
LOC-FLOW is an “hands-free” earthquake location workflow to process continuous seismic records

LOC-FLOW is an “hands-free” earthquake location workflow to process continuous seismic records: from raw waveforms to well located earthquakes with magnitude calculations. The package assembles sever

Miao Zhang 71 Jan 09, 2023
This repository contains Python Projects for Beginners as well as for Intermediate Developers built by Contributors.

Python Projects {Open Source} Introduction The repository was built with a tree-like structure in mind, it contains collections of Python Projects. Mo

Gaurav Pandey 115 Apr 30, 2022
Pyhexdmp - Python hex dump module

Pyhexdmp - Python hex dump module

25 Oct 23, 2022
SEH-Helper - Binary Ninja plugin for exploring Structured Exception Handlers

SEH Helper Author: EliseZeroTwo A Binary Ninja helper for exploring structured e

Elise 74 Dec 26, 2022
Regular Expressions - Use regular expressions to detect date format

A list of all the resources used https://regex101.com/ - To test regex https://w

Ravika Nagpal 1 Jan 04, 2022
Source code for Learn Programming: Python

This repository contains the source code of the game engine behind Learn Programming: Python. The two key files are game.py (the main source of the ga

Niema Moshiri 25 Apr 24, 2022
ChieriBot,词云API版,用于统计群友说过的怪话

wordCloud_API 词云API版,用于统计群友说过的怪话,基于wordCloud 消息储存在mysql数据库中.数据表结构见table.sql 为啥要做成API:这玩意太吃性能了,如果和Bot放在同一个服务器,可能会影响到bot的正常运行 你服务器性能够用的话就当我在放屁 依赖包 pip i

chinosk 7 Mar 20, 2022
Decoupled Smoothing in Probabilistic Soft Logic

Decoupled Smoothing in Probabilistic Soft Logic Experiments for "Decoupled Smoothing in Probabilistic Soft Logic". Probabilistic Soft Logic Probabilis

Kushal Shingote 1 Feb 08, 2022
The LiberaPay archive module for the SeanPM life archive project.

By: Top README.md Read this article in a different language Sorted by: A-Z Sorting options unavailable ( af Afrikaans Afrikaans | sq Shqiptare Albania

Sean P. Myrick V19.1.7.2 1 Aug 26, 2022
An interactive course to git

OperatorEquals' Sandbox Git Course! Preface This Git course is an ongoing project containing use cases that I've met (and still meet) while working in

John Torakis 62 Sep 19, 2022