Automatically download and crop key information from the arxiv daily paper. (cpu version)

Related tags

DownloaderFocusAX
Overview

FocusAX

按关键词筛选arxiv每日最新paper或从arxiv搜索。

  • 自动下载、获取摘要、自动截取文中表格和图片。

安装必要的环境

  • 安装 paddle
# GPU安装
python3 -m pip install paddlepaddle-gpu==2.1.1 -i https://mirror.baidu.com/pypi/simple

# CPU安装
 python3 -m pip install paddlepaddle==2.1.1 -i https://mirror.baidu.com/pypi/simple
  • 安装 Layout-Parser
=2.2"">
pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl
pip install "paddleocr>=2.2"
  • 按照其他必要的包
pip3 install -r requirements.txt
  • 下载模型权重
  • PubLayNet 下载解压后放置在paperparse目录下。目录结构如下
FocusAX
    - paperparse
        - ppyolov2_r50vd_dcn_365e_publaynet
            - inference.pdiparams
            - inference.pdiparams.info
            - inference.pdmodel
        - ...
    - downloader
        - ...
    - utils
        - ...
    - configs.py
    - focus_daily.py
    - focus_search.py
    - README.py
    - ...

使用教程

  • configs.py :程序参数配置文件
# =============== 网络代理 ================
# proxy = None # 不使用代理
proxy = {"http": "socks5://127.0.0.1:8080", "https": "socks5://127.0.0.1:8080"}
# =============== 保存文件根目录 ================
root_path = "./arxiv"
# =============== DNN模型推理配置信息 ================
threshold = 0.5
enable_mkldnn = True
enforce_cpu = True
thread_num = 4
  • focus_daily.py :按关键字过滤arxiv daily上的文章(仅当日)
if __name__ == '__main__':
    key_words = ['GAN'] # 要包含的关键词
    subject_words = ['ML', 'CV', 'AI']  # 要包含的类别
    start_parse(key_words, subject_words, needPDF=True, needZip=False)
  • focus_search.py :按关键字在arxiv检索
start_parse('Keyword')
  • root_path 目录中将创建新的文件夹保存结果

效果图

每个文件夹中的abs.md文件保留的是当前pdf的介绍,使用Typora等markdown编辑器打开。 image

image

ps:论文排版不规范会导致截图混乱。

其他

Owner
HeoLis
Interesting in generate methods.
HeoLis
A standalone pytube wrapper for downloading individual videos from YouTube.

pytube-runner This is a Python CLI script for downloading individual videos from YouTube. The pytube project is the core of this runner, so naturally

Shiva 2 Jun 21, 2022
This repository contains code for a youtube-dl GUI written in PyQt.

youtube-dl-GUI This repository contains code for a youtube-dl GUI written in PyQt. It is based on youtube-dl which is a Video downloading script maint

M.Yasoob Ullah Khalid ☺ 191 Jan 02, 2023
A CLI that searches and download Youtube videos in mp3 format.

A CLI that searches and download Youtube videos in mp3 format.

Finhawk 4 Jul 25, 2022
Automatically download and crop key information from the arxiv daily paper. (cpu version)

Automatically download and crop key information from the arxiv daily paper. (cpu version)

HeoLis 4 Jul 30, 2022
YouPlay is a python based tool for downloading YouTube videos through its URL

YouPlay is a python based tool for downloading YouTube videos through its URL. It is capable to download videos from YouTube playlists too and can extract the audio file only from the video. It can r

Nitin Choudhury 10 Sep 15, 2022
Python based YouTube video Downloader GUI Application.

Youtube video Downloader Python based Youtube video Downloader GUI Application. Installation Python Dependencies Import pytube pip install pytube Im

Naem Azam 1 Jan 03, 2022
A股tick下载,自动判断交易日历,获取全市场level1数据

TickDown A股tick下载,自动判断交易日历,获取全市场level1数据 依赖项 func_timeout requests some_tool(仓库里) akshare 使用 定时任务在上午 09:07开始运行 参数调节 max_num 单批次提交的股票数,当前为800,可以自行尝试多个数

Demon Finch 7 Jul 06, 2022
Newsemble is an API that provides easy access to the current news for programmatic analysis

Newsemble is an API that provides easy access to the current news for programmatic analysis. It has been built using Python, BeautifulSoup and MongoDB.

Rishabh 43 Dec 16, 2022
Tool To download Amazon 4k SDR HDR 1080, CDM IS Not Included

WV-AMZN-4K-RIPPER Tool To download Amazon 4k SDR HDR 1080, CDM IS Not Included For CDM You can Mail :- Denis Trunov 179 Dec 17, 2022

𝐴 𝑡𝑒𝑙𝑒𝑔𝑟𝑎𝑚 𝑏𝑜𝑡 𝑡ℎ𝑎𝑡 𝑐𝑎𝑛 𝑑𝑜𝑤𝑛𝑙𝑜𝑎𝑑 𝑣𝑖𝑑𝑒𝑜 𝑎𝑛𝑑 𝑎𝑢𝑑𝑖𝑜 𝑓𝑟𝑜𝑚 𝑦𝑜𝑢𝑡𝑢𝑏𝑒 𝑎𝑛𝑑 𝑣𝑖𝑑𝑒𝑜 𝑤𝑒𝑏𝑠𝑖𝑡𝑒𝑠 𝑞𝑢𝑖𝑐𝑘𝑙𝑦

𝐴 𝑡𝑒𝑙𝑒𝑔𝑟𝑎𝑚 𝑏𝑜𝑡 𝑡ℎ𝑎𝑡 𝑐𝑎𝑛 𝑑𝑜𝑤𝑛𝑙𝑜𝑎𝑑 𝑣𝑖𝑑𝑒𝑜 𝑎𝑛𝑑 𝑎𝑢𝑑𝑖𝑜 𝑓𝑟𝑜𝑚 𝑦𝑜𝑢𝑡𝑢𝑏𝑒 𝑎𝑛𝑑 𝑣𝑖𝑑𝑒𝑜 𝑤𝑒𝑏𝑠𝑖𝑡𝑒𝑠 𝑞𝑢𝑖𝑐𝑘𝑙𝑦

SOCIAL MECHANIC 2 Aug 04, 2022
VK sticker downloader with python

VK Sticker Downloader This repository is used to automate download file from VK Sticker How to use Execute the file ./downloader.py Writedown full url

Hartawan Bahari M. 1 Dec 29, 2021
Simple package for Sublime Text 4; download URL's for local viewing and editing

URLDownloader This is a simple example package that allows you to easily download the contents of any web URL to edit locally. Given a URL, the packag

Terence Martin 3 Mar 05, 2022
A tool to download program information from Bugcrowd, for use by researchers to compare programs they are eligible to participate in

Description bcstats is a tool which allows Bugcrowd researchers to download information about all accessible programs (public and private) into a sing

19 Oct 13, 2022
Download YouTube videos/music and images in MP4, JPG with this tool.

ABOUT THE TOOL Download YouTube videos, music and images in MP4, JPG with this tool, with an easy to understand interface. This tool works with both,

TrollSkull 5 Jan 02, 2023
A python script that discovers hidden YouTube API clients. Just a research project.

YouTube-Internal-Clients A script that discovers hidden internal clients of the YouTube (Innertube) API using bruteforce methods. The script tries cli

David 97 Jan 02, 2023
Download minecraft head or skin, allows TLauncher accounts

Minecraft-skin-downloader Download minecraft head or skin, allows TLauncher accounts by BoBkiNN_ Contact: https://vk.com/bobkinnvk Requirements: Modul

3 Apr 03, 2022
A cross-platform python based utility to download courses from udemy for personal offline use.

udemy-dl A cross-platform python based utility to download courses from udemy for personal offline use. Warning Udemy has started to encrypt many of t

Nasir Khan 4.6k Dec 31, 2022
Making the process of downloading youtube videos faster and more convinient.

Easy-YT Making the process of downloading youtube videos faster and more convinient. What can it do? This python script can be used to download youtub

Meynam 39 Nov 15, 2021
A youtube-dl fork with additional features and fixes

yt-dlp is a youtube-dl fork based on the now inactive youtube-dlc. The main focus of this project is adding new features and patches while also keepin

yt-dlp 37.1k Jan 03, 2023
SABnzbd - The automated Usenet download tool

SABnzbd is an Open Source Binary Newsreader written in Python.

SABnzbd 1.8k Dec 30, 2022