Genshin Impact Gacha Record Dataset (Genshin Impact gacha data)

Overview

Genshin Impact gacha (wish) records are being collected on an ongoing basis.

You can export your gacha history as JSON with a gacha record export tool and send the JSON file to [email protected]; I will strip personal information and then commit the file here. Either of the two export tools below will do.

One gacha record export tool, from sunfkny (with a demo video showing how to use it)

Another gacha record export tool, an Electron version, from lvlvl

The dataset currently contains 195,917 gacha records.

Data Usage Notes

You are free to use this project's data as an individual for research on gacha mechanics, and you are free to modify and redistribute my analysis code (though the code is messy enough that rewriting it from scratch would probably be easier).

However, please do not republish or merge this gacha dataset into other platforms; otherwise, anyone who later combines gacha data from multiple sources may run into serious duplicate-record problems. Please direct anyone who wants the gacha data to download it from GitHub, or clearly credit this project as the data source.

When drawing any conclusions from this dataset, ask yourself whether your methodology is rigorous and whether the conclusions are credible. Do not publish gacha models that are obviously wrong, or incorrect models that could cause harmful effects; if such effects occur, neither the dataset maintainer nor the players who contributed data bear any responsibility.

After some time spent on research, I have essentially worked out all of Genshin Impact's gacha mechanics:

Complete summary of Genshin Impact's gacha mechanics

Some tools for analyzing the gacha mechanics

Data Format

Subfolders in the dataset_02 folder are numbered sequentially starting from 0001.

Each folder contains the gacha records of a single account.

  • gacha100.csv — records from the Beginners' Wish (novice) banner
  • gacha200.csv — records from the Standard Wish (permanent) banner
  • gacha301.csv — records from the Character Event Wish banner
  • gacha302.csv — records from the Weapon Event Wish banner

The records in each CSV file have the following format:

Pull time            | Name           | Category           | Rarity
YYYY-MM-DD HH:MM:SS  | full item name | Character / Weapon | 3 / 4 / 5
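
As an illustration only, the snippet below shows one way such a file could be loaded with pandas. The English column names, and the assumption that the first row is the header shown above, are mine and not part of the dataset specification.

```python
# Minimal loading sketch (assumes a header row as in the table above;
# adjust header=/names= if the files in dataset_02 differ).
import pandas as pd

COLUMNS = ["time", "name", "category", "rarity"]  # illustrative English names

def load_banner(path: str) -> pd.DataFrame:
    # Assumes rows are already in pull order; re-sort if your export differs.
    df = pd.read_csv(path, header=0, names=COLUMNS)
    df["time"] = pd.to_datetime(df["time"], format="%Y-%m-%d %H:%M:%S")
    df["rarity"] = df["rarity"].astype(int)
    return df.reset_index(drop=True)

# e.g. records = load_banner("dataset_02/0001/gacha301.csv")
```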

Recommended Data Processing

When estimating the consolidated probability, use an unbiased estimator

That is, use the total number of items of the rarity under study, divided by the number of pulls up to (and including) the last time an item of that rarity was obtained, as the estimator.

Do not use total item count divided by total pull count: for a gacha with a pity system like Genshin Impact's, that is not an unbiased estimator of the officially published consolidated probability and will bias the estimate low.

For example, if every account in the dataset pulled only 10 times on the standard banner, then with enough data the measured 5-star frequency would come out around 0.6%, not 1.6%. When counting 5-stars, take the number of pulls up to the last 5-star obtained as the total pull count; the same rule applies to 4-stars.
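
A minimal sketch of this estimator, assuming a pull-ordered DataFrame with a `rarity` column as in the loading example above (the function names are illustrative). When pooling accounts, sum the hit counts and the truncated pull counts separately before dividing, rather than averaging per-account rates.

```python
import pandas as pd

def truncated_counts(df: pd.DataFrame, rarity: int = 5) -> tuple[int, int]:
    """Return (number of items of `rarity`, pulls up to and including the
    last such item) for one account; (0, 0) if none were obtained."""
    hits = df.index[df["rarity"] == rarity]
    if len(hits) == 0:
        return 0, 0
    return len(hits), int(hits[-1]) + 1  # index is the 0-based pull order

def pooled_rate(accounts: list[pd.DataFrame], rarity: int = 5) -> float:
    """Unbiased pooled estimate: total hits / total truncated pulls."""
    pairs = [truncated_counts(df, rarity) for df in accounts]
    total_hits = sum(h for h, _ in pairs)
    total_pulls = sum(p for _, p in pairs)
    return total_hits / total_pulls if total_pulls else float("nan")
```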

For each account, drop the first few 5-stars/4-stars it obtained

When the data were collected, contributors were asked to state whether they had rerolled for a starting account with an early 5-star, etc.; the intention was to remove bias introduced by player behavior.

It later turned out that many contributors did not provide this label. Moreover, even without rerolling, players who happen to pull a 5-star early on are more likely to stay and keep playing, which also introduces bias.

For players who have already played for a while and collected a certain number of 5-stars, whether they pull another 5-star has much less influence on whether they continue playing.

Therefore, dropping the first N 5-stars pulled by each account, with N chosen as appropriate, yields data with lower bias.

The same idea can also be applied to the 4-star statistics.
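
One possible implementation of this truncation, under the same assumed DataFrame layout (the helper name and N are placeholders): discard everything up to and including the N-th item of the given rarity, so the remaining record starts with a fresh pity counter.

```python
import pandas as pd

def drop_first_n(df: pd.DataFrame, rarity: int = 5, n: int = 1) -> pd.DataFrame:
    """Remove all pulls up to and including the n-th item of `rarity`."""
    hits = df.index[df["rarity"] == rarity]
    if len(hits) < n:
        return df.iloc[0:0]  # too few hits: this account contributes nothing
    return df.iloc[int(hits[n - 1]) + 1:].reset_index(drop=True)
```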

When studying the 4-star probability in detail, discard data with too few total pulls

When the total pull count is small, cases such as going nine pulls without a 4-star and then getting a 5-star on the tenth pull almost never show up, which makes the observed 4-star rate come out too high.

Using data from accounts with more pulls allows a more precise study of the 4-star probability.
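
For example, one might simply filter out small accounts before running the 4-star analysis; the threshold below is an arbitrary placeholder, not a recommendation from the dataset.

```python
MIN_PULLS = 500  # placeholder threshold; choose according to your analysis

large_accounts = [df for df in accounts if len(df) >= MIN_PULLS]
```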

Handle the weapon banner with caution

The weapon banner has comparatively little data, so be cautious with any conclusions drawn from it. If someone draws hasty conclusions that cause serious consequences, the responsibility lies with the person who drew them.

Analysis Tools

DataAnalysis.py analyzes the CSV gacha files. The code is still being rewritten and is very awkward to use, so treat it as a reference only. When run, it prints reference statistics and plots distribution charts; the theoretical curves in the charts come from a probability-ramp model I built from the actual data and from inference over some game files.

DistributionMatrix.py analyzes the pull probabilities and distributions of a candidate model with the 4-star and 5-star pity coupled; it is the heavy-duty tool for computing a gacha model's consolidated probability and expectation.
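
For intuition about the kind of quantity such a tool computes, here is an independent minimal sketch (not the project's DistributionMatrix.py, and ignoring the 4-star/5-star coupling): for an assumed 5-star pity model it evaluates the distribution of pulls needed for a 5-star, its expectation, and the implied consolidated probability. The base rate, soft-pity start, ramp step, and hard pity are illustrative parameters to be replaced by whatever model you are testing.

```python
def five_star_distribution(base=0.006, soft_pity=74, step=0.06, hard_pity=90):
    """P(first 5-star exactly on pull k), k = 1..hard_pity, for a simple ramp
    model: flat `base` rate before `soft_pity`, then +`step` per pull,
    guaranteed at `hard_pity`. All parameter values are assumptions."""
    dist, survive = [], 1.0  # survive = P(no 5-star in the first k-1 pulls)
    for k in range(1, hard_pity + 1):
        p = 1.0 if k == hard_pity else min(1.0, base + max(0, k - soft_pity + 1) * step)
        dist.append(survive * p)
        survive *= 1.0 - p
    return dist

dist = five_star_distribution()
expectation = sum(k * p for k, p in enumerate(dist, start=1))
consolidated = 1.0 / expectation  # long-run 5-stars per pull (renewal rate)
print(f"E[pulls per 5-star] = {expectation:.2f}, consolidated rate = {consolidated:.3%}")
```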
