Console application for downloading images from Reddit in Python

Overview

RedditImageScraper

Console application for downloading images from Reddit in Python

Screenshot

Introduction

This short Python script was created for the mass-downloading of images from Reddit. It will be used later for creating data-sets for several Machine Learning projects.

In order to use the script, you will have to have a Reddit account sign-up to create a developer account. You will be assigned a client_id and client_secret which you have to enter in config.ini before you run the script.

Usage

The -r parameter provides a list of sub-reddits to search.
The -st parameter can specify the maximum number of images to download from each sub-reddit. Defaults to 1000.
The -t parameter can specify the total number of images to download across all sub-reddits before we abort. Defaults to 10000.
The -f parameter can specify the folder into which we download the images. Defaults to download.

For example, to download at-most 20 pictures from the dogpictures, dogswithjobs, GuiltyDogs, and dogs sub-reddits, aborting when we have 50 files in total, and saving the files to a folder titled dog, we would use the following:

python RedditImageScraper.py -r dogpictures dogswithjobs GuiltyDogs dog -st 20 -t 50 -f dogs

This should produce something like the following:

Downloading from dogpictures
3tdfpayjp1k51.jpg
cuz4a5np90k51.jpg
1e1g882z40k51.jpg
xX3OQgP.jpg
fuatv49mizj51.jpg
tk9khtspb1k51.jpg
pgbxxakm63k51.jpg
oeUI8Iy.jpg
vqklutghc1k51.jpg
1ctn7f0390k51.jpg
r7995a1dd3k51.jpg
qbjj06vaa0k51.jpg
q6nl05omyzj51.jpg
bl8bi5tsu2k51.jpg
gxc78spvxxj51.jpg
w0pdsr1hsyj51.jpg
9h19nq1k5vj51.jpg
y67tpittfyj51.jpg
Downloaded 18 from dogpictures

Downloading from dogswithjobs
5xxpn6xs7xj51.png
3mrgwnlum1k51.jpg
rs2uecgnb1k51.jpg
y077mg1974k51.jpg
kci6u8pc02k51.jpg
iho9wex0qrj51.jpg
109eyp6kjyj51.jpg
i86x3o6dutj51.jpg
Downloaded 8 from dogswithjobs

Downloading from GuiltyDogs
8z89s7a89dj51.jpg
c9rf2r516li51.jpg
pbdqr853rsh51.jpg
e9xihfbqdeh51.jpg
53gamygu9ch51.jpg
d3tq02dbbyg51.jpg
ifsmwutou2h51.jpg
Downloaded 7 from GuiltyDogs

Downloading from dog
1kloilrhc1k51.jpg
bwe1go65h1k51.jpg
8118vyqeg1k51.jpg
bajprhddg0k51.jpg
rlc7n4m6q0k51.jpg
z9p8llkuyyj51.jpg
dhdi10myx2k51.jpg
6zflnt9hozj51.jpg
niptrbxzf2k51.jpg
jxi3vrd901k51.jpg
u8eykob35yj51.jpg
5hwj8cce6zj51.png
9nr2t4f0vzj51.jpg
ozs8tuu7mzj51.jpg
0h1fwfqhh3k51.jpg
Downloaded 15 from dog

Downloaded 48 in total
Owner
James
Busily reinventing the wheel
James
A Python package that scrapes Google News article data while remaining undetected by Google.

A Python package that scrapes Google News article data while remaining undetected by Google. Our scraper can scrape page data up until the last page and never trigger a CAPTCHA (download stats: https

Geminid Systems, Inc 6 Aug 10, 2022
Dictionary - Application focused on word search through web scraping

Dictionary - Application focused on word search through web scraping, in addition to other functions such as dictation, spell and conjugation of syllables.

Juan Manuel 2 May 09, 2022
爱奇艺会员,腾讯视频,哔哩哔哩,百度,各类签到

My-Actions 个人收集并适配Github Actions的各类签到大杂烩 不要fork了 ⭐️ star就行 使用方式 新建仓库并同步代码 点击Settings - Secrets - 点击绿色按钮 (如无绿色按钮说明已激活。直接到下一步。) 新增 new secret 并设置 Secr

280 Dec 30, 2022
A Spider for BiliBili comments with a simple API server.

BiliComment A spider for BiliBili comment. Spider Usage Put config.json into config directory, and then python . ./config/config.json. A example confi

Hao 3 Jul 05, 2021
Scrapes proxies and saves them to a text file

Proxy Scraper Scrapes proxies from https://proxyscrape.com and saves them to a file. Also has a customizable theme system Made by nell and Lamp

nell 2 Dec 22, 2021
Command line program to download documents from web portals.

command line document download made easy Highlights list available documents in json format or download them filter documents using string matching re

16 Dec 26, 2022
Meme-videos - Scrapes memes and turn them into a video compilations

Meme Videos Scrapes memes from reddit using praw and request and then converts t

Partho 12 Oct 28, 2022
PS5 bot to find a console in france for chrismas 🎄🎅🏻 NOT FOR SCALPERS

Une PS5 pour Noël Python + Chrome --headless = une PS5 pour noël MacOS Installer chrome Tweaker le .yaml pour la listes sites a scrap et les criteres

Olivier Giniaux 3 Feb 13, 2022
Scrapy, a fast high-level web crawling & scraping framework for Python.

Scrapy Overview Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pag

Scrapy project 45.5k Jan 07, 2023
A spider for Universal Online Judge(UOJ) system, converting problem pages to PDFs.

Universal Online Judge Spider Introduction This is a spider for Universal Online Judge (UOJ) system (https://uoj.ac/). It also works for all other Onl

TriNitroTofu 1 Dec 07, 2021
京东秒杀商品抢购Python脚本

Jd_Seckill 非常感谢原作者 https://github.com/zhou-xiaojun/jd_mask 提供的代码 也非常感谢 https://github.com/wlwwu/jd_maotai 进行的优化 主要功能 登陆京东商城(www.jd.com) cookies登录 (需要自

Andy Zou 1.5k Jan 03, 2023
A Powerful Spider(Web Crawler) System in Python.

pyspider A Powerful Spider(Web Crawler) System in Python. Write script in Python Powerful WebUI with script editor, task monitor, project manager and

Roy Binux 15.7k Jan 04, 2023
Bulk download tool for the MyMedia platform

MyMedia Bulk Content Downloader This is a bulk download tool for the MyMedia platform. USE ONLY WHERE ALLOWED BY THE COPYRIGHT OWNER. NOT AFFILIATED W

Ege Feyzioglu 3 Oct 14, 2022
👁️ Tool for Data Extraction and Web Requests.

httpmapper 👁️ Project • Technologies • Installation • How it works • License Project 🚧 For educational purposes. This is a project that I developed,

15 Dec 05, 2021
A module for CME that spiders hashes across the domain with a given hash.

hash_spider A module for CME that spiders hashes across the domain with a given hash. Installation Simply copy hash_spider.py to your CME module folde

37 Sep 08, 2022
腾讯课堂,模拟登陆,获取课程信息,视频下载,视频解密。

腾讯课堂脚本 要学一些东西,但腾讯课堂不支持自定义变速,播放时有水印,且有些老师的课一遍不够看,于是这个脚本诞生了。 时间比较紧张,只会不定时修复重大bug。多线程下载之类的功能更新短期内不会有,如果你想一起完善这个脚本,欢迎pr 2020.5.22测试可用 使用方法 很简单,三部完成 下载代码,

163 Dec 30, 2022
Automatically scrapes all menu items from the Taco Bell website

Automatically scrapes all menu items from the Taco Bell website. Returns as PANDAS dataframe.

Sasha 2 Jan 15, 2022
Crawler in Python 3.7, 3.8. 3.9. Pypy3

Description Python Crawler written Python 3. (Supports major Python releases Python3.6, Python3.7 and Python 3.8) Installation and Use Setup VirtualEn

Vinit Kumar 2 Mar 12, 2022
一个m3u8视频流下载脚本

一个Python的m3u8流视频下载脚本 介绍 m3u8流视频日益常见,目前好用的下载器也有很多,我把之前自己写的一个小脚本分享出来,供广大网友使用。写此程序的目的在于给视频下载爱好者提供一个下载样例,可直接调用,勿再重复造轮子。 使用方法 在python中直接运行程序或进行外部调用 import

Nchu 0 Oct 10, 2021
Displays market info for the LUNI token on the Terra Blockchain

LuniBot for Discord Displays market info for the LUNI/LUNA token on the Terra Blockchain (Webscrape method currently scraping CoinMarketCap). Will evo

0 Jan 22, 2022