Find papers by keywords and venues, then download them automatically.

Last update: Dec 15, 2022

Related tags

Web Crawling paper-finder

Overview

paper finder

Find papers by keywords and venues, then download them automatically.

How to use this?

Search

CLI

python search.py -k "knowledge tracing,knowledge trace" -v "KDD,IJCAI" -o data/kt_result.csv

min_year : earliest publication year to include (paper year >= min_year)
max_year : latest publication year to include (paper year <= max_year)
-k : keywords, separated by commas
-v : venues, separated by commas. If omitted, the default venues are used.
-o : output file path
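The option semantics above can be sketched in a few lines. This is only an illustration, not the code in search.py: the helper names `split_arg` and `in_year_range` are hypothetical, and the sketch assumes that surrounding whitespace is trimmed and that both year bounds are inclusive, as the `>=` and `<=` in the option list suggest.

```python
from __future__ import annotations


def split_arg(value: str) -> list[str]:
    """Split a comma-separated CLI value into a list of non-empty terms."""
    return [part.strip() for part in value.split(",") if part.strip()]


def in_year_range(year: int, min_year: int, max_year: int) -> bool:
    """Inclusive on both ends: min_year <= year <= max_year."""
    return min_year <= year <= max_year


# The same values as in the CLI example, turned into the lists the Python API takes.
keyword_list = split_arg("knowledge tracing,knowledge trace")
venue_list = split_arg("KDD,IJCAI")
print(keyword_list)
print(venue_list)
print(in_year_range(2016, min_year=2016, max_year=2021))
```

Under these assumptions, a paper published in 2016 or in 2021 is kept when min_year is 2016 and max_year is 2021.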

Python API

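from search import search

# Keywords to match and venues to search.
keyword_list = ['knowledge tracing', 'knowledge trace']
venue_list = ['KDD', 'IJCAI']
# Search 2016 through 2021 and write the matches to a CSV file.
search(keyword_list=keyword_list, venue_list=venue_list,
       min_year=2016, max_year=2021, output='data/kt_result.csv')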

You can find the venue names there.

Download

CLI

python download.py -i data/kt_result.csv  -o pdfs

-i : path to the CSV file produced by the search step
-o : directory in which to save the PDFs. A subfolder is created for each venue, such as pdfs/AIED.
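The per-venue folder layout can be pictured with a short sketch. It is a hedged illustration of the layout described above, not the logic in download.py; `venue_dir` is a hypothetical helper, and a temporary directory stands in for the real output directory.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def venue_dir(save_dir: str | Path, venue: str) -> Path:
    """Return the subfolder for one venue, creating it if it does not exist."""
    target = Path(save_dir) / venue
    target.mkdir(parents=True, exist_ok=True)
    return target


# Use a throwaway directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    folder = venue_dir(Path(tmp) / "pdfs", "AIED")
    print(folder.relative_to(tmp).as_posix())
```

Calling the helper twice for the same venue is harmless because `exist_ok=True` makes the directory creation idempotent.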

Python API

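from utils.download import download_from_df
import pandas as pd

# Load the CSV produced by the search step.
csv_path = "data/kt_result.csv"
df = pd.read_csv(csv_path)
# Download the PDFs into pdfs/, then save the returned DataFrame next to the input CSV.
df = download_from_df(df, save_dir='pdfs')
df.to_csv(csv_path.replace('.csv', '_download_result.csv'), index=False)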
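For reference, the result file name built in the last line can be checked on its own. The sketch below shows what the `str.replace` call produces for the example path, plus a `pathlib` variant that only touches the final file name. The helper `result_path` is hypothetical and is not part of the project.

```python
from pathlib import Path


def result_path(csv_path: str, suffix: str = "_download_result") -> str:
    """Insert a suffix before the extension of the final path component only."""
    p = Path(csv_path)
    return p.with_name(p.stem + suffix + p.suffix).as_posix()


csv_path = "data/kt_result.csv"
# Form used in the snippet above: replaces every ".csv" occurrence in the string.
print(csv_path.replace(".csv", "_download_result.csv"))
# pathlib variant: changes only the file name, not any directory in the path.
print(result_path(csv_path))
```

Both forms give the same name for the example path. They differ only when ".csv" also appears elsewhere in the path, for instance in a directory name.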

Todo

Search papers.
Download papers

Author Warning

This code is intended for academic communication only. The author assumes no liability for copyright issues. DO NOT ENGAGE IN ANY ILLEGAL ACTIVITIES. Please download and read the genuine articles from the publisher.


Releases (v0.1)

v0.1 (Dec 6, 2022)

Added citation lookup and code link lookup.

v0.0.4 (Jun 15, 2022)

Added examples.

Fixed a search API bug.

v0.0.3 (Mar 2, 2022)


Owner

Jiahao Chen (TabChen)
