Python script to download all images/webms of a 4chan thread

Overview

4chan-downloader

Python script to download all images/webms of a 4chan thread

Download Script

The main script is called inb4404.py and can be called like this: python inb4404.py [thread/filename]

usage: inb4404.py [-h] [-c] [-d] [-l] [-n] [-r] thread

positional arguments:
  thread              url of the thread (or filename; one url per line)

optional arguments:
  -h, --help          show this help message and exit
  -c, --with-counter  show a counter next the the image that has been
                      downloaded
  -d, --date          show date as well
  -l, --less          show less information (surpresses checking messages)
  -n, --use-names     use thread names instead of the thread ids
                      (...4chan.org/board/thread/thread-id/thread-name)
  -r, --reload        reload the queue file every 5 minutes
  -t, --title         save original filenames

You can parse a file instead of a thread url. In this file you can put as many links as you want, you just have to make sure that there's one url per line. A line is considered to be a url if the first 4 letters of the line start with 'http'.

If you use the --use-names argument, the thread name is used to name the respective thread directory instead of the thread id.

Thread Watcher

This is a work-in-progress script but basic functionality is already given. If you call the script like

python thread-watcher.py -b vg -q mhg -f queue.txt -n "Monster Hunter"

then it looks for all threads that include mhg inside the vg board, stores the thread url into queue.txt and adds /Monster-Hunter at the end of the url so that you can use the --use-names argument from the actual download script.

Legacy

The current scripts are written in python3, in case you still use python2 you can use an old version of the script inside the legacy directory.

Comments
  • "Something went wrong"

    Regardless of thread, board, script location, drive letter, attribute, or host machine the script returns "Something went wrong" and times out after several tries. image Above picture showing different threads and boards and different attribute flags, all three failing almost immediately. image Interestingly however it works fine on an arch install. I've tried it on two machines running Windows 10, reinstalled python 3 as well Not entirely sure why it's dropping almost immediately on Windows. Assume this is a Windows problem, but nothing changed since I used it yesterday. will close if I find a solution to using it on windows. sorry there isn't much information to go off of.

    opened by cherioux 16
  • multiple threads from file error

    multiple threads from file error

    File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() Process Process-1: File "Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names' Traceback (most recent call last): File "Python\Python38-32\lib\multiprocessing\process.py", line 315, in _bootstrap self.run() File "Programs\Python\Python38-32\lib\multiprocessing\process.py", line 108, in run self._target(*self._args, **self._kwargs) File "4chdl\inb4404.py", line 44, in download_thread if args.use_names or os.path.exists(os.path.join(workpath, 'downloads', board, thread_tmp)): AttributeError: 'NoneType' object has no attribute 'use_names'

    opened by th3illu 8
  • Filenames...

    Filenames...

    I've run into a few issues with filenames. One of them isn't entirely Windows specific, and I have an idea of what needs to be done to fix it, but no idea how to. Duplicate filenames within a thread simply overwrite any preceding files. Usually Spoiler_Image or file.png.

    The other issue is Windows specific and I've managed to solve it at least locally by:

    from django.utils.text import get_valid_filename
    ...snip...
                    img_path = ntpath.join(directory, get_valid_filename(img))
    ...snip...
    

    The issue in question, filenames like this used to make the script halt on Windows:

    C:\vtai>inb4404.py -c -d -l -t https://boards.4channel.org/vt/thread/34806758
    [2022-10-11 07:50:46 PM] [  9/278] vt/34806758/[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png
    Traceback (most recent call last):
      File "C:\vtai\inb4404.py", line 171, in <module>
        main()
      File "C:\vtai\inb4404.py", line 31, in main
        download_thread(thread, args)
      File "C:\vtai\inb4404.py", line 112, in download_thread
        with open(img_path, 'wb') as f:
    OSError: [Errno 22] Invalid argument: 'C:\\vtai\\downloads\\vt\\34806758\\[sound=files.catbox.moe%2F7a2f1v.m4a]{{takanashi kiara}}, {{{1girl}}}, {begging},[[[lipstick]]],[[[lip gloss]]],kusogaki,closed eyes,{{pov}}, {{incoming kiss}}, close-up, {{horny}}, blushing, solo, puffy sleeves, orange skirt, aqua choker, orange hair, ba.png'
    

    After importing it appears to work.

    opened by vt-idiot 5
  • Script can't download files from a thread with a other thread quoted (Cross-thread)

    Script can't download files from a thread with a other thread quoted (Cross-thread)

    NSFW

    How to reproduce:

    1° Try to download from this link: https://boards.4chan.org/gif/thread/13521437

    This error appears

    File TTT Part 2 Top Tier Titties \1527053877481.webm from https://i.4cdn.org/gif/1536882732639.webm UNKNOWN ERROR OCCURRED [WinError 183] cannot create an existing file: 'C:\\Users\\mario\\4channer\\TTT Part 2 Top Tier Titties '

    But the "TTT Part 2 Top Tier Tittiles" and the two .webm files doens't exist. Using a custom folder location doens't work.

    opened by ghost 5
  • Corrupt downloads

    Corrupt downloads

    Hi! Just downloaded your script but every file downloaded is corrupt. I picked a random thread https://boards.4chan.org/b/thread/573855146/pink-ids-will-have-pink-eye-for-the-rest-of-their, but also tried with some more with same result.

    Is there any kind of logging i can enable to help you sort this out, if you're not experiencing the same problem?

    Cheers! Luca

    opened by LucaNonato 4
  • auto download/ all images in one folder?

    auto download/ all images in one folder?

    i want all the images of a certian tag to go into just one folder instead of each individual one, is there a way to do that? also is there a feature to autodownload threads in the queue text file when one appears

    question 
    opened by azzerzzzeqwe 3
  • Logo for the Downloader Readme

    Logo for the Downloader Readme

    Hey, i would like to propose a Logo for the readme header and maybe even as icon for a PyQt GUI-version if that one ever gets a go. I have altered the original 4chan Logo.

    opened by ThisLimn0 2
  • invalid syntax?

    invalid syntax?

    I guess I miss something very simple, but cannot figure it out, so here goes nothing

    I followed the instructions to in README.md and typed

    $ python inb4404.py http://boards.4chan.org/w/thread/2096591/lain-thread-anyone-have-any-arisu

    The only result was:

    File "inb4404.py", line 129 print(line.replace(link, '-' + link), end='') ^ SyntaxError: invalid syntax

    And I'm too dumb to know if this is a typo to fix or I should use a different URL (direct one to the first pic in thread maybe?).

    (After hundreds of shameful edits): also the output points at end='' equation, but for love of god I have no idea how to paste it here so markdown does not cut out all spaces.

    opened by tcheerno 2
  • Rewrite/Convert to python3?

    Rewrite/Convert to python3?

    The 2to3 tool should automate this as far as possible. I think the script should be converted to python3 because this is the new standard, support for python2 will be dropped in near future.

    opened by KopfKrieg 2
  • Stuck on checking the first post of the thread

    Stuck on checking the first post of the thread

    The script keeps "checking" the first post of the thread specified as thread_link

    I'm using python 2.7.13, script doesn't work with python 3 because of an error at line 73.

    opened by elleborgo 2
  • Reloading the file doesn't work properly

    Reloading the file doesn't work properly

    Single thread and multiple thread (using a file) works just fine, but the script is unable to reload the file and update the queue properly as of now.

    opened by Exceen 2
  • Configurable threads

    Configurable threads

    I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

    Previously each 4chan thread would get its own process to check and download new images. I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker thread will work on it, then wait until another job is available to pull from the queue.

    Let me know if you have any questions or want me to change anything.

    Thanks!

    opened by Zand3r24 4
Releases(1.0)
  • 1.0(Jul 23, 2018)

    • Supports downloading images and webm-files of single and multiple (multi-threading) 4chan-threads with continuous checks for new posts.
    • Assign names to threads to locate them easier on your hard drive
    • When using a file to download multiple threads at once, 404'd threads will be marked automatically with a "-" in front of the URL.
    • Separates downloads into a "download" directory which serves as archive and a "new" directory. Downloaded files are put into both directories. If a files is deleted inside the "download" directory is will be downloaded again. On the other hand, if a file inside the "new" directory is deleted it won't be downloaded again. This serves as an easy way to keep a whole thread archived and to track new downloads. Therefore, deleting a file inside the "new" directory serves as some kind of a "mark as read" feature.
    • Thread Watcher is more or less an Add-On for the actual download file which checks for threads including a specified text and adds the respective URL into a text file which can be used with the download script.
    Source code(tar.gz)
    Source code(zip)
Owner
Micha Fink
Micha Fink
Implementation of Cross-category Video Highlight Detection via Set-based Learning (ICCV 2021).

Cross-category Video Highlight Detection via Set-based Learning Introduction This project is an implementation of ``Cross-category Video Highlight Det

Minghao (Alan) Xu 49 Dec 17, 2022
📺 YouTube Song Downloader Bot For Telegram 🔮

📺 YouTube Song Downloader Bot For Telegram 🔮 Powerd By TamilBots.

Tamil Bots 146 Dec 31, 2022
Fetch McDonald invoices from mailbox and merge them to one PDF file.

concatenate Fetch McDonald invoices from mailbox and merge them to one PDF file. Description This script will fetch all McDonald invoice pdfs from a p

3 Oct 06, 2022
Python utility to download jobs at seek.com.au

Job Seeker job_seeker is an utility to download data of a job search from seek.com.au into a csv file for data analysis and exploration Install using

PyBites 3 May 14, 2022
music downloader written in python. (Uses jiosaavn API)

music downloader written in python. (Uses jiosaavn API)

Rohn Chatterjee 35 Jul 20, 2022
Discord Nitro Generator + Checker

Discord Nitro Generator + Checker Usage Download the project files and run main.py You will be prompted with 2 questions the first one being the amoun

509 Jan 02, 2023
⚙️ A CLI tool that can download songs from youtube.

⚙️ Music Downloader Music Downloader is a tool that can download songs from Youtube. Installation Base requirements: Python 3.7+ If you have Python 3.

matjs 4 Nov 03, 2021
TikTok downloader video without watermark from Telegram bot

⬇️ How to download video from Tik Tok via telegram bot? Send a link to the video from tik tok to our telegram bot and it will send you a video without

1 Mar 04, 2022
Stremio addon for fetching videos from your google drive.

stremio-gdrive Instructions: There are two ways to go about: Method 1 is hard and long but might give you better performance and you need to make your

72 Dec 31, 2022
Download all your URI Online Judge source codes and upload to GitHub with simple steps.

URI-Code-Downloader Download all your URI Online Judge source codes and upload to GitHub with simple steps. Prerequisites Python 3.x Installing Downlo

Luan Simões 9 Mar 23, 2022
A downloader for the ISIS service of TU Berlin

isis_dl A downloading utility for the ISIS tool of TU-Berlin. Version 0.4 Features Downloads all Material from all courses of your ISIS page. Efficien

1 Nov 06, 2021
Web Downloader With Python

Web Downloader Introduction This module will provide API to download the webpage components : html file, image file, css fil, javascript file, href li

3 Dec 28, 2022
Organize your downloads easily with DownloadOrganizer

DownloadOrganizer Organize your downloads organize your downloads easily with DownloadOrganizer Instilation how to install DownloadOrganizer Method 1:

1 Dec 02, 2021
Download all posts and comments in a subreddit

subreddit downloader This subreddit downloader downloads all posts and comments in a subreddit For a tutorial to use this program please follow this m

Guneet 6 Dec 16, 2022
Let's you download entire YT-playlists.

Youtube MP3 Playlist Downloader Let's you download entire youtube playlists as mp3 files. This application is basically a script that makes it easier

11 Dec 18, 2022
Simple Youtube Video Downloader

Simple Youtube Video Downloader Download Youtube video using link and Will output result in D:/ (You can change the path in main.py file) Installation

Hansen Gianto 1 Oct 28, 2021
Download images where login is required using har python and js

이미지 다운로드(har, python, js 사용) 로그인이 필요한 사이트에서 DevTools로 이미지를 다운받는 방법은 조금 까다로웠다. 가장 쉽게 할 수 있는 방법을 찾아보았다. 사용법 F12를 눌러 DevTools를 실행 Network 탭으로 이동 페이지 새로고침

0 Jul 22, 2022
A growing collection of search plugins for the qBittorrent, an awesome and opensource torrent client

qBittorrent Search Plugins This is a still growing collection of search plugins for qBittorent, an amazing and open source torrent client, maintained

Alessio Tudisco 59 Dec 26, 2022
A Fast as F*** Downloader

FAFD A Fast as F*** Downloader Github Usages You'll want to use a URL like this: https://github.com/RPowell-C/FAFD/raw/main/FAFD.py It's easier DONT F

1 Jan 19, 2022
Download any video from YouTube playlists

youtube-dl Download any videos from YouTube playlists. Requirements Python 3 BeautifulSoup4 PyQt PyQtWebEngine pytube pyyoutube python-decouple Usage

Antonio Fortin 1 Oct 26, 2021