Find papers by keywords and venues. Then download it automatically

Overview

paper finder

Find papers by keywords and venues. Then download it automatically.

How to use this?

Search

CLI

python search.py -k "knowledge tracing,knowledge trace" -v "KDD,IJCAI" -o data/kt_result.csv
  • min_year : paper >= min_year
  • max_year : paper<=max_year
  • -k : keywords, different keywords split use ,
  • -v : venue, split using ,. If default, will use the default venues.
  • o : output file path

Python api

from search import search
keyword_list=['knowledge tracing','knowledge trace']
venue_list=['KDD','IJCAI']
search(keyword_list=keyword_list,venue_list=venue_list,min_year=2016,max_year=2021,output='data/kt_result.csv')

Your can find venues' name in there.

Download

CLI

python download.py -i data/kt_result.csv  -o pdfs
  • i : the csv path from search
  • o : the dir to save pdfs, we will create sub folder for each venue. Such as pdfs/AIED

Python api

from utils.download import download_from_df
import pandas as pd

csv_path = "data/kt_result.csv"
df = pd.read_csv(csv_path)
df = download_from_df(df,save_dir='pdfs')
df.to_csv(csv_path.replace('.csv','_download_result.csv'),index=False)

Todo

  • Search papers.
  • Download papers

Author Warning

This code is only used for academic communication. The author has no liability for copyright. DO NOT ENGAGE IN ANY ILLEGAL ACTIVITIES. Please download and read the genuine articles from the publisher.

You might also like...
This tool crawls a list of websites and download all PDF and office documents

This tool crawls a list of websites and download all PDF and office documents. Then it analyses the PDF documents and tries to detect accessibility issues.

Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings and results from live.skidor.com Usage: Put the python file in a dedic

Liveskidordownload - Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings

A Telegram crawler to search groups and channels automatically and collect any type of data from them.
A Telegram crawler to search groups and channels automatically and collect any type of data from them.

Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb

This code will be able to scrape movies from a movie website and also provide download links to newly uploaded movies.

Movies-Scraper You are probably tired of navigating through a movie website to get the right movie you'd want to watch during the weekend. There may e

Script used to download data for stocks.

This script is useful for downloading stock market data for a wide range of companies specified by their respective tickers. The script reads in the d

Download images from forum threads

Forum Image Scraper Downloads images from forum threads Only works with forums which doesn't require a login to view and have an incremental paginatio

Command line program to download documents from web portals.

command line document download made easy Highlights list available documents in json format or download them filter documents using string matching re

download NCERT books using scrapy

download_ncert_books download NCERT books using scrapy Downloading Books: You can either use the spider by cloning this repo and following the instruc

Releases(v0.1)
Owner
Jiahao Chen (TabChen)
Fighting!
Jiahao Chen (TabChen)
Scraping and visualising India's real-time COVID-19 data from the MOHFW dataset.

COVID19-WEB-SCRAPER Open Source Tech Lab - Project [SEMESTER IV] OSTL Assignments OSTL Assignments - 1 OSTL Assignments - 2 Project COVID19 India Data

AMEY THAKUR 8 Apr 28, 2022
WebScraping - Scrapes Job website for python developer jobs and exports the data to a csv file

WebScraping Web scraping Pyton program that scrapes Job website for python devel

Michelle 2 Jul 22, 2022
Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)

trafilatura: Web scraping tool for text discovery and retrieval Description Trafilatura is a Python package and command-line tool which seamlessly dow

Adrien Barbaresi 704 Jan 06, 2023
Web scraped S&P 500 Data from Wikipedia using Pandas and performed Exploratory Data Analysis on the data.

Web scraped S&P 500 Data from Wikipedia using Pandas and performed Exploratory Data Analysis on the data. Then used Yahoo Finance to get the related stock data and displayed them in the form of chart

Samrat Mitra 3 Sep 09, 2022
Scraping Thailand COVID-19 data from the DDC's tableau dashboard

Scraping COVID-19 data from DDC Dashboard Scraping Thailand COVID-19 data from the DDC's tableau dashboard. Data is updated at 07:30 and 08:00 daily.

Noppakorn Jiravaranun 5 Jan 04, 2022
A package designed to scrape data from Yahoo Finance.

yahoostock A package designed to scrape data from Yahoo Finance. Installation The most simple installation method is through PIP. pip install yahoosto

Rohan Singh 2 May 28, 2022
Snowflake database loading utility with Scrapy integration

Snowflake Stage Exporter Snowflake database loading utility with Scrapy integration. Meant for streaming ingestion of JSON serializable objects into S

Oleg T. 0 Dec 06, 2021
A python script to extract answers to any question on Quora (Quora+ included)

quora-plus-bypass A python script to extract answers to any question on Quora (Quora+ included) Requirements Python 3.x

Nitin Narayanan 10 Aug 18, 2022
Scrapes Every Email Address of Every Society in Every University

society-email-scrape Site Live at https://kcsoc.github.io/society-email-scrape/ How to automatically generate new data Go to unis.yml Add your uni Cre

Krishna Consciousness Society 18 Dec 14, 2022
fork huanghyw/jd_seckill

Jd_Seckill 特别声明: 本仓库发布的jd_seckill项目中涉及的任何脚本,仅用于测试和学习研究,禁止用于商业用途,不能保证其合法性,准确性,完整性和有效性,请根据情况自行判断。 本项目内所有资源文件,禁止任何公众号、自媒体进行任何形式的转载、发布。

512 Jan 03, 2023
Instagram profile scrapper with python

IG Profile Scrapper Instagram profile Scrapper Just type the username, and boo! :D Instalation clone this repo to your computer git clone https://gith

its Galih 6 Nov 07, 2022
Pseudo API for Google Trends

pytrends Introduction Unofficial API for Google Trends Allows simple interface for automating downloading of reports from Google Trends. Only good unt

General Mills 2.6k Dec 28, 2022
Ebay Webscraper for Getting Average Product Price

Ebay-Webscraper-for-Getting-Average-Product-Price The code in this repo is used to determine the average price of an item on Ebay given a valid search

17 Jan 05, 2023
Jobinja.ir jobs scraper.

Jobinja.ir Dataset Introduction This project is a simple web scraper that scraps pages of jobinja.ir concurrently and writes and update (if file gets

Iman Kermani 3 Apr 15, 2022
Transistor, a Python web scraping framework for intelligent use cases.

Web data collection and storage for intelligent use cases. transistor About The web is full of data. Transistor is a web scraping framework for collec

BOM Quote Manufacturing 212 Nov 05, 2022
Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing

Scrapegoat is a python library that can be used to scrape the websites from internet based on the relevance of the given topic irrespective of language using Natural Language Processing. It can be ma

10 Jul 06, 2022
A powerful annex BUBT, BUBT Soft, and BUBT website scraping script.

Annex Bubt Scraping Script I think this is the first public repository that provides free annex-BUBT, BUBT-Soft, and BUBT website scraping API script

Md Imam Hossain 4 Dec 03, 2022
Luis M. Capdevielle 1 Jan 14, 2022
Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX)

mcc-mnc.com-webscraper Scrapes mcc-mnc.com and outputs 3 files with the data (JSON, CSV & XLSX) A Python script for web scraping mcc-mnc.com Link: mcc

Anton Ivarsson 1 Nov 07, 2021
Get-web-images - A python code that get images from any site

image retrieval This is a python code to retrieve an image from the internet, a

CODE 1 Dec 30, 2021