Command-line program to download image galleries and collections from several image hosting sites

Overview

gallery-dl

gallery-dl is a command-line program to download image galleries and collections from several image hosting sites (see Supported Sites). It is a cross-platform tool with many configuration options and powerful filenaming capabilities.


Dependencies

Optional

Installation

Pip

The stable releases of gallery-dl are distributed on PyPI and can be easily installed or upgraded using pip:

$ python3 -m pip install -U gallery-dl

Installing the latest dev version directly from GitHub can be done with pip as well:

$ python3 -m pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz

Note: Windows users should use py -3 instead of python3.

It is advised to use the latest version of pip, including the essential packages setuptools and wheel. To ensure that these packages are up-to-date, run:

$ python3 -m pip install --upgrade pip setuptools wheel
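
To confirm from a script that the pip installation succeeded, one option is to query the version of the installed distribution. This is a minimal sketch, assuming Python 3.8 or newer (for importlib.metadata) and that the distribution is named gallery-dl as in the pip commands above. installed_version is a hypothetical helper name, not part of gallery-dl.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "gallery-dl") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "gallery-dl is not installed")
```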

From Source

Get the code by either

  • Downloading a stable or dev archive and unpacking it
  • Or via git clone https://github.com/mikf/gallery-dl.git

Navigate into the respective directory and run the setup.py file.

$ wget https://github.com/mikf/gallery-dl/archive/master.tar.gz
$ tar -xf master.tar.gz
# or
$ git clone https://github.com/mikf/gallery-dl.git

$ cd gallery-dl*
$ python3 setup.py install

Standalone Executable

Download a standalone executable file, put it into your PATH, and run it inside a command prompt (like cmd.exe).

These executables include a Python interpreter and all required Python packages.

Snap

Linux users on a distribution supported by snapd can install gallery-dl from the Snap Store:

$ snap install gallery-dl

Chocolatey

Windows users that have Chocolatey installed can install gallery-dl from the Chocolatey Community Packages repository:

$ choco install gallery-dl

Scoop

Apart from Chocolatey, gallery-dl is also available in the Scoop "main" bucket for Windows users:

$ scoop install gallery-dl

Usage

To use gallery-dl, call it with the URLs you wish to download images from:

$ gallery-dl [OPTION]... URL...

See also gallery-dl --help.
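
When gallery-dl is driven from another program, the same "[OPTION]... URL..." shape can be assembled as an argument list. The sketch below is an illustration rather than an official API: build_command and run_gallery_dl are hypothetical helper names, and it assumes the gallery-dl executable is on your PATH.

```python
import subprocess
from typing import List, Sequence


def build_command(urls: Sequence[str], options: Sequence[str] = ()) -> List[str]:
    """Assemble the argument list: gallery-dl [OPTION]... URL..."""
    if not urls:
        raise ValueError("at least one URL is required")
    return ["gallery-dl", *options, *urls]


def run_gallery_dl(urls: Sequence[str], options: Sequence[str] = ()) -> int:
    """Run gallery-dl and return its exit code (needs gallery-dl on PATH)."""
    # Passing a list instead of a shell string avoids quoting issues
    # with characters such as '?' and '&' in URLs.
    return subprocess.run(build_command(urls, options)).returncode


if __name__ == "__main__":
    # -g is the option used in this README to get direct URLs.
    raise SystemExit(run_gallery_dl(
        ["https://danbooru.donmai.us/posts?tags=bonocho"], ["-g"]))
```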

Examples

Download images; in this case from danbooru via tag search for 'bonocho':

$ gallery-dl "https://danbooru.donmai.us/posts?tags=bonocho"

Get the direct URL of an image from a site that requires authentication:

$ gallery-dl -g -u "<username>" -p "<password>" "https://seiga.nicovideo.jp/seiga/im3211703"

Filter manga chapters by language and chapter number:

$ gallery-dl --chapter-filter "lang == 'fr' and 10 <= chapter < 20" "https://mangadex.org/title/2354/"

Search a remote resource for URLs and download images from them:
(URLs for which no extractor can be found will be silently ignored)

$ gallery-dl "r:https://pastebin.com/raw/FLwrCYsT"
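
The --chapter-filter value in the manga example above reads like a Python boolean expression over per-chapter metadata fields. The sketch below only illustrates how such an expression selects chapters. It assumes the fields are named lang and chapter as in that example, uses made-up sample data, and is not gallery-dl's actual implementation; select_chapters is a hypothetical name.

```python
from typing import Any, Dict, Iterable, List

FILTER = "lang == 'fr' and 10 <= chapter < 20"


def select_chapters(expr: str,
                    chapters: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the chapters whose metadata makes the expression true."""
    code = compile(expr, "<filter>", "eval")
    # eval() runs arbitrary code: only use it with expressions you wrote yourself.
    return [meta for meta in chapters
            if eval(code, {"__builtins__": {}}, dict(meta))]


if __name__ == "__main__":
    sample = [  # invented metadata, for illustration only
        {"lang": "fr", "chapter": 9},
        {"lang": "fr", "chapter": 10},
        {"lang": "en", "chapter": 12},
        {"lang": "fr", "chapter": 19},
        {"lang": "fr", "chapter": 20},
    ]
    for meta in select_chapters(FILTER, sample):
        print(meta)
```

With this sample data, only the French chapters 10 and 19 pass the filter, since the upper bound is exclusive.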

If a site's address is nonstandard for its extractor, you can prefix the URL with the extractor's name to force the use of a specific extractor:

$ gallery-dl "tumblr:https://sometumblrblog.example"

Configuration

Configuration files for gallery-dl use a JSON-based file format.

For a (more or less) complete example with options set to their default values, see gallery-dl.conf.
For a configuration file example with more involved settings and options, see gallery-dl-example.conf.
A list of all available configuration options and their descriptions can be found in configuration.rst.

gallery-dl searches for configuration files in the following places:

Windows:
  • %APPDATA%\gallery-dl\config.json
  • %USERPROFILE%\gallery-dl\config.json
  • %USERPROFILE%\gallery-dl.conf

(%USERPROFILE% usually refers to the user's home directory, i.e. C:\Users\<username>\)

Linux, macOS, etc.:
  • /etc/gallery-dl.conf
  • ${HOME}/.config/gallery-dl/config.json
  • ${HOME}/.gallery-dl.conf

Values in later configuration files will override previous ones.

Command line options will override all related settings in the configuration file(s), e.g. using --write-metadata will enable writing metadata using the default values for all postprocessors.metadata.* settings, overriding any specific settings in configuration files.
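
The search order and the "later files override earlier ones" rule can be pictured as a sequence of dictionary merges. The following sketch makes two assumptions that the text above does not state: that missing files are skipped, and that nested objects are merged key by key rather than replaced wholesale. candidate_paths, merge and load_config are hypothetical names; gallery-dl's own loader may behave differently.

```python
import json
import os
from typing import Any, Dict, Iterable, List


def candidate_paths() -> List[str]:
    """Config locations for the current platform, in the order listed above."""
    if os.name == "nt":
        raw = [
            r"%APPDATA%\gallery-dl\config.json",
            r"%USERPROFILE%\gallery-dl\config.json",
            r"%USERPROFILE%\gallery-dl.conf",
        ]
    else:
        raw = [
            "/etc/gallery-dl.conf",
            "${HOME}/.config/gallery-dl/config.json",
            "${HOME}/.gallery-dl.conf",
        ]
    return [os.path.expandvars(path) for path in raw]


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Merge override into base in place; override wins on conflicts."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            merge(base[key], value)  # assumption: nested objects merge per key
        else:
            base[key] = value
    return base


def load_config(paths: Iterable[str]) -> Dict[str, Any]:
    """Read each existing file in order so that later files take precedence."""
    config: Dict[str, Any] = {}
    for path in paths:
        try:
            with open(path, encoding="utf-8") as fp:
                merge(config, json.load(fp))
        except FileNotFoundError:
            continue  # assumption: missing files are simply skipped
    return config


if __name__ == "__main__":
    print(json.dumps(load_config(candidate_paths()), indent=4))
```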

Authentication

Username & Password

Some extractors require you to provide valid login credentials in the form of a username & password pair. This is necessary for pixiv, nijie, and seiga and optional for aryion, danbooru, e621, exhentai, idolcomplex, inkbunny, instagram, luscious, pinterest, sankaku, subscribestar, tsumino, and twitter.

You can set the necessary information in your configuration file (cf. gallery-dl.conf)

{
    "extractor": {
        "pixiv": {
            "username": "<username>",
            "password": "<password>"
        }
    }
}

or you can provide them directly via the -u/--username and -p/--password or via the -o/--option command-line options

$ gallery-dl -u <username> -p <password> URL
$ gallery-dl -o username=<username> -o password=<password> URL
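
Both forms carry the same two values, which a short script can make explicit. This sketch generates the configuration fragment and the equivalent -o arguments from one pair of credentials. The helper names and the GDL_USERNAME / GDL_PASSWORD environment variables are hypothetical and exist only for this illustration.

```python
import json
import os
from typing import Any, Dict, List


def credentials_config(site: str, username: str, password: str) -> Dict[str, Any]:
    """Build the configuration fragment shown above for one extractor."""
    return {"extractor": {site: {"username": username, "password": password}}}


def option_args(username: str, password: str) -> List[str]:
    """Build the equivalent -o/--option command-line arguments."""
    return ["-o", f"username={username}", "-o", f"password={password}"]


if __name__ == "__main__":
    # Hypothetical environment variable names; the placeholders are the fallback.
    user = os.environ.get("GDL_USERNAME", "<username>")
    secret = os.environ.get("GDL_PASSWORD", "<password>")
    print(json.dumps(credentials_config("pixiv", user, secret), indent=4))
    print(" ".join(option_args(user, secret)))
```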

Cookies

For sites where login with username & password is not possible due to CAPTCHA or similar, or has not been implemented yet, you can use the cookies from a browser login session and input them into gallery-dl.

This can be done via the cookies option in your configuration file by specifying

  • the path to a Mozilla/Netscape format cookies.txt file exported by a browser addon
    (e.g. Get cookies.txt for Chrome, Export Cookies for Firefox)
  • a list of name-value pairs gathered from your browser's web developer tools
    (in Chrome, in Firefox)

For example:

{
    "extractor": {
        "instagram": {
            "cookies": "$HOME/path/to/cookies.txt"
        },
        "patreon": {
            "cookies": {
                "session_id": "K1T57EKu19TR49C51CDjOJoXNQLF7VbdVOiBrC9ye0a"
            }
        }
    }
}

You can also specify a cookies.txt file with the --cookies command-line option:

$ gallery-dl --cookies "$HOME/path/to/cookies.txt" URL
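
The two ways of supplying cookies are related: a cookies.txt file can be reduced to the name-value pairs used in the second form. The sketch below does this with Python's standard http.cookiejar module. cookies_as_dict is a hypothetical helper, and the file is assumed to start with the usual "# Netscape HTTP Cookie File" header line, which MozillaCookieJar requires.

```python
import json
from http.cookiejar import MozillaCookieJar
from typing import Dict, Optional


def _matches(cookie_domain: str, domain: str) -> bool:
    """True if the cookie belongs to the domain or one of its subdomains."""
    host = cookie_domain.lstrip(".")
    return host == domain or host.endswith("." + domain)


def cookies_as_dict(path: str, domain: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Load a Mozilla/Netscape cookies.txt file and return name-value pairs."""
    jar = MozillaCookieJar()
    # Keep session cookies and expired ones so nothing is silently dropped.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return {
        cookie.name: cookie.value
        for cookie in jar
        if domain is None or _matches(cookie.domain, domain)
    }


if __name__ == "__main__":
    import sys
    # Usage: python cookies_to_json.py cookies.txt [domain]
    pairs = cookies_as_dict(sys.argv[1], sys.argv[2] if len(sys.argv) > 2 else None)
    print(json.dumps({"cookies": pairs}, indent=4))
```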

OAuth

gallery-dl supports user authentication via OAuth for deviantart, flickr, reddit, smugmug and tumblr. This is entirely optional, but grants gallery-dl the ability to issue requests on your account's behalf and enables it to access resources which would otherwise be unavailable to a public user.

To link your account to gallery-dl, start by invoking it with oauth:<sitename> as an argument. For example:

$ gallery-dl oauth:flickr

You will be sent to the site's authorization page and asked to grant read access to gallery-dl. Authorize it and you will be shown one or more "tokens", which should be added to your configuration file.

Comments
  • Questions, Feedback and Suggestions #2

    Continuation of the old issue as a central place for any sort of question or suggestion not deserving their own separate issue.

    #11 had gotten too big, took several seconds to load, and was closed as a result. There is also https://gitter.im/gallery-dl/main if that seems more appropriate.

    Questions Meta 
    opened by mikf 97
  • Patreon - 403 Forbidden after months of working ok, changing cookies or IP doesn't help

    Hi,

    I've used gallery-dl successfully (great app!) for over a year, with an occasional 403 Forbidden error that I fixed by refreshing my cookies and/or changing my VPN. For the last week or more, however, the "403 Forbidden" error stays no matter what I do.

    Here is the --verbose:

    gallery-dl --cookies "cookies.txt" "https://www.patreon.com/xxx/posts"    --verbose
    [gallery-dl][debug] Version 1.19.0
    [gallery-dl][debug] Python 3.7.9 - Windows-7-6.1.7601-SP1
    [gallery-dl][debug] requests 2.25.1 - urllib3 1.25.11
    [gallery-dl][debug] Starting DownloadJob for 'https://www.patreon.com/xxx/posts'
    [patreon][debug] Using PatreonCreatorExtractor for 'https://www.patreon.com/xxx/posts'
    [urllib3.connectionpool][debug] Starting new HTTPS connection (1): www.patreon.com:443
    [urllib3.connectionpool][debug] https://www.patreon.com:443 "GET /xxx/posts HTTP/1.1" 403 None
    [patreon][warning] Cloudflare CAPTCHA
    [patreon][error] HttpError: '403 Forbidden' for 'https://www.patreon.com/xxx/posts'
    

    This VM runs Windows 7, so I also tried a Windows 10 LTSC VM, but the error is the same. Getting new cookies or a new IP does not change anything. I updated gallery-dl and got the same error. It applies to all links on Patreon I have access to. I would love to find out what's going on and keep using this wonderful app.

    cloudflare 
    opened by astraetech 83
  • Questions, Feedback and Suggestions

    A central place for these things might be a good idea.

    This thread could serve as a starting point, results will eventually be collected in the project wiki, if appropriate and useful.

    Edited 2017-04-15 For conciseness

    Edited 2017-05-04 Removed nonsensical checklist thing


    Questions Meta 
    opened by Hrxn 82
  • (Patreon) New "403 Forbidden" Cloudflare CAPTCHA error with 1.15.3

    Have all dependencies and gallery-dl up to date, but have been getting constant 403 errors.

    Made sure that session_id cookie was up to date in config, no dice.

    Exporting all Patreon cookies into a cookies.txt and updating config to point to it leads to an error which reads "No session_id set", but downloads free posts.

    Cloudflare has been blocking for about a whole day at this point, and verbose doesn't really give any useful information.

    cloudflare 
    opened by biznizz 58
  • [kemono.party] "404 NOT FOUND" error

    For the first time, I tried downloading all the posts of a user from kemono party today. I exported the cookies, I placed them in my gallery-dl conf file, but I got the error. You can see the first three lines of the command output below:

    gallery-dl "kemono party user profile"
    
    [downloader.http][warning] '404 NOT FOUND' for 'file name'
    
    [download][error] Failed to download 
    

    Do you have any idea on the reason why this could happen? I'm downloading other stuff using gallery-dl in the background. Might this be the reason why the command isn't working?

    site-change fixed 
    opened by Vishvamitra 45
  • Twitter is now broken

    For every Twitter post, I am now receiving this output. It was happening every so often a few hours ago, and now it happens all the time. Using gallery-dl 1.14.0.

    Input

    $ gallery-dl -j https://twitter.com/MotocrossNews/status/1267608884325216256

    Output

    [
      [
        "ValueError",
        "substring not found"
      ]
    ]
    

    Verbose

    [gallery-dl][debug] Version 1.14.0
    [gallery-dl][debug] Python 3.5.2 - Linux-4.4.0-179-generic-x86_64-with-Ubuntu-16.04-xenial
    [gallery-dl][debug] requests 2.23.0 - urllib3 1.25.9
    [gallery-dl][debug] Starting DownloadJob for 'https://twitter.com/MotocrossNews/status/1267608884325216256'
    [twitter][debug] Using TwitterTweetExtractor for 'https://twitter.com/MotocrossNews/status/1267608884325216256'
    [urllib3.connectionpool][debug] Starting new HTTPS connection (1): twitter.com:443
    [urllib3.connectionpool][debug] https://twitter.com:443 "GET /i/web/status/1267608884325216256 HTTP/1.1" 200 None
    [twitter][error] An unexpected error occurred: ValueError - substring not found. Please run gallery-dl again with the --verbose flag, copy its output and report this issue on https://github.com/mikf/gallery-dl/issues .
    [twitter][debug] 
    Traceback (most recent call last):
      File "/usr/local/lib/python3.5/dist-packages/gallery_dl/job.py", line 61, in run
        for msg in self.extractor:
      File "/usr/local/lib/python3.5/dist-packages/gallery_dl/extractor/twitter.py", line 50, in items
        for tweet in self.tweets():
      File "/usr/local/lib/python3.5/dist-packages/gallery_dl/extractor/twitter.py", line 416, in tweets
        end = page.index('class="js-tweet-stats-container')
    ValueError: substring not found
    
    site-change 
    opened by electricduck 42
  • [twitter] `-o text-tweets=true` now downloads reply target's posts

    The bug appeared after the update.

    I use -o text-tweets=true only when I need to save the text of tweets without media (in effect, to download everything from the given profile), since this is usually required only for some profiles.

    alias gga='gallery-dl --download-archive ~/gallery-dl/gallery-dl.sqlite'
    alias ggat='gallery-dl -o text-tweets=true --download-archive ~/gallery-dl/gallery-dl.sqlite'
    

    The conf:

            "twitter":
            {
                "retweets": false,
                "directory": ["[gallery-dl]", "[{category}] {author[name]}"],
                "filename": "[{category}] {author[name]}—{date:%Y.%m.%d}—{retweet_id|tweet_id}—{filename}.{extension}",
                "size": ["orig", "4096x4096", "large", "medium", "small"],
                "fallback": false,
                "cards": false,
                "pinned": true,
                "replies": "self",
                "cookies": {
                    "auth_token": "XXX"
                },
                "text-tweets": false,
                "postprocessors": [{
                    "name": "mtime",
                    "event": "post"
                }, {
                    "directory": "metadata",
                    "filename": "[{category}] {author[name]}—{date:%Y.%m.%d}—{retweet_id|tweet_id}.html",
                    "name": "metadata",
                    "event": "post",
                    "mtime": true,
                    "mode": "custom",
                    "archive": "~/gallery-dl/gallery-dl-postprocessors.sqlite",
                    "archive-format": "{tweet_id}_{retweet_id}_p1",
                    "format": "<div id='{retweet_id|tweet_id}'><h4><a href='https://twitter.com/{author[name]}/status/{retweet_id|tweet_id}'>{retweet_id|tweet_id}</a> by <a href='https://twitter.com/{author[name]}'>{author[name]}</a></h4><div class='content'>{content}</div><hr><div>{date:%Y.%m.%d %H:%M:%S}</div><hr></div><br>"
                }]
            },
    

    Now when I use ggat, it downloads the reply targets' posts (not retweets), which is undesirable. gga works as expected (as before).

    opened by AlttiRi 37
  • Twitter - API limit?

    It seems that there is an API limit for Twitter? Downloading https://twitter.com/k3_spaceybear/media doesn't download anything beyond https://twitter.com/i/web/status/802921901689425921. twMediaDownloader and Hitomi Downloader don't seem to have this problem, though. Is there any way around this limit?

    EDIT: I don't know how the code of twMediaDownloader works because I'm not able to read through it, but I found some simple solutions by searching online:

    https://github.com/bpb27/twitter_scraping
    https://github.com/MatthewWolff/TwitterScraper
    https://stackoverflow.com/questions/8471489/find-all-tweets-from-a-user-not-just-the-first-3-200

    The general opinion seems to be to use a headless Selenium browser and query a Twitter search. I like MatthewWolff's solution, which searches results month by month. It seems very effective, as you're probably not going to get blocked by Twitter's scrolling limit that way. Even if we are, maybe we can still decrease the range from monthly to weekly or even daily.

    As an example, these are the results from k3_spaceybear's join date (March 2009) from: k3_spaceybear since: 2009-03-01 until: 2009-03-31

    Maybe there are better solutions, but this looks pretty feasible

    opened by panhartstuff 37
  • gallery-dl can't download specific images?

    When I downloaded the whole gallery it worked just fine, but for some reason it gives a 404 error when I tried to download from this image's url: https://www.deviantart.com/hikariangelove/art/Kyoka-with-Kaminari-distant-835556023 Is gallery-dl only capable of downloading galleries? Is it not actually possible to just download a specific image from its specific page?

    opened by KirbyFan102 33
  • How to install?

    I cannot install it. Can someone help me, please?

    When I run "gallery-dl.exe", it crashes immediately.

    (I don't speak english.. thx google translation..)

    opened by Izysan 31
  • Instagram stopped working

    gallery-dl stopped working on Instagram today. I'm getting the following error:

    E:\gallery-dl>gallery-dl https://www.instagram.com/migichen_/
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIP9dLAhkn3/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIN3Hhwhtne/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CIDVKJshBuM/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH-cjDIh0Tz/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH4-mdcBlAP/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH2itYohHD8/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CH0I8u5BWVQ/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    [instagram][warning] Unable to fetch data from 'https://www.instagram.com/p/CHxLcqxBqfe/': JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    . . . .

    fixed 
    opened by mikaljan 30
  • [telegra.ph] some posts cannot be downloaded

    When you download images from telegra.ph, gallery-dl normally prepends the domain (telegra.ph) to the image path (e.g. /file/2e072ce962a702ce6e846.jpg) to build the image URL. Sometimes, however, the image path already includes a domain, which causes the extractor to fail.

    example: (NSFW)

    gallery-dl https://telegra.ph/Vsyo-o-druzyah-moej-sestricy-05-27
    [downloader.http][warning] HTTPSConnectionPool(host='telegra.phhttps', port=443): Max retries exceeded with url: //pith1.ru/uploads/posts/2019-12/1576437427_01.jpg (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x048C2868>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed')) (1/5)
    
    bug 
    opened by ImVantexHD 0
  • Add AIBooru AI metadata

    Currently there are posts that include AI metadata associated with the generated image, but that info is not available when saving the metadata with a postprocessor. Could it be added?

    The AI metadata appears below the images if available. Examples:

    https://aibooru.online/posts/13958 https://aibooru.online/posts/13959 https://aibooru.online/posts/13964

    site-feature-request 
    opened by taskhawk 0
  • Download all galleries in ImageFap user Category

    Trying to download all of the galleries inside an ImageFap user's folder/category does not work at all; using the URL just gives me a message saying it's unsupported. Also, images on ImageFap are downloaded out of order, and sorting them by date does not make any difference. (Using the latest version of gallery-dl with the default config.)

    nsfw site-feature-request 
    opened by cheese529 4
  • Clarification on these three keywords please

    Is date the same as create_date but in a different time zone? What's date_url meant to be? Is create_date or date the Pixiv post itself? Is date_url when anything was changed to the post?

    create_date
      2019-06-16T00:04:21+09:00
    date
      2019-06-15 15:04:21
    date_url
      2019-06-15 15:04:21
    
    opened by a84r7a3rga76fg 2
  • bunkr json output has double dots in file urls

    The JSON output for a generic bunkr link has double dots in it again.

    The output looks as follows:

    \test\gallery-dl.exe https://bunkr.ru/a/XXXXXXXX -j

    [
      [
        2,
        {
          "album_id": "XXXXXXXX",
          "album_name": "XXXXXXXX",
          "category": "bunkr",
          "count": 4,
          "subcategory": "album"
        }
      ],
      [
        3,
        "https://media-files.bunkr..ru/XXXXXXXX.m4v",
        {
          "album_id": "XXXXXXXX",
          "album_name": "XXXXXXXX",
          "category": "bunkr",
          "count": 4,
          "extension": "m4v",
          "file": "https://media-files.bunkr..ru/XXXXXXXX.m4v",
          "filename": "XXXXXXXX",
          "id": "XXXXXXXX",
          "name": "XXXXXXXX",
          "num": 1,
          "subcategory": "album",
          "thumb": "https://cdn.bunkr..ru/thumbs/XXXXXXXX.png"
        }
      ],
      [
    

    This seems to be related to issue #3481; see https://github.com/mikf/gallery-dl/blob/master/gallery_dl/extractor/bunkr.py#L62

    opened by ThisLimn0 1
Releases(v1.24.2)
Owner
Mike Fährmann