A multithreaded tool for searching and downloading images from popular search engines. It is straightforward to set up and run!

Last update: Dec 31, 2022

🕳️ CygnusX1

Code by Trong-Dat Ngo.

Overview

🕳️ CygnusX1 is a multithreaded tool 🛠️ for searching and downloading images from popular search engines 🔎. It is straightforward to set up and run!

Key features

🥰 No prior knowledge is required to get up and running.
🚀 Download images using a customizable number of threads.
⛏️ Crawl all available images (search results and recommendations).
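
To illustrate what downloading with a customizable number of threads can look like, here is a minimal sketch built on the standard library's thread pool. It is not CygnusX1's actual implementation: the names `fetch_bytes` and `download_all`, the `image_00000.bin` file naming, and the error handling are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.request import urlopen


def fetch_bytes(url, timeout=10):
    """Download the raw bytes behind a URL (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, out_dir, workers=2, fetch=fetch_bytes):
    """Download every URL into out_dir using a pool of worker threads.

    Returns a dict mapping each URL to the saved path, or to None on failure.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    def save(index, url):
        # Assumed naming scheme; a real tool would infer the image extension.
        target = out_path / "image_{:05d}.bin".format(index)
        target.write_bytes(fetch(url))
        return target

    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit one task per URL; at most `workers` of them run at a time.
        futures = {pool.submit(save, i, url): url for i, url in enumerate(urls)}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception:
                # One failed download should not stop the others.
                results[url] = None
    return results
```

Threads suit this workload because each download spends most of its time waiting on the network, so raising `workers` mainly increases how many requests are in flight at once.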

Installation

This repository is tested on Python 3.6+ and selenium 3.141.0+, and it works on macOS, Windows, and Linux.
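
If you want to confirm that an environment meets the minimum versions listed above (Python 3.6 and selenium 3.141.0), a small check along these lines may help. This is only a sketch and not part of the repository: `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical helper names, and the version parsing is deliberately simple.

```python
import re
import sys


def version_tuple(version):
    """Turn '3.141.0' into (3, 141, 0), dropping any non-numeric suffix."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '3.141' compares equal to '3.141.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python 3.6+ is required")
    try:
        import selenium
    except ImportError:
        problems.append("selenium is not installed")
    else:
        installed = getattr(selenium, "__version__", "0")
        if not meets_minimum(installed, "3.141.0"):
            problems.append("selenium 3.141.0+ is required, found " + installed)
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```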

You should set up and run 🕳️ CygnusX1 in a virtual environment. If you're unfamiliar with Python virtual environments, check out the Python virtual environments user guide.

First, create a virtual environment with the version of Python you're going to use and activate it. (You can skip this step if you prefer to install directly into the system environment.)

python3 -m venv venv
source venv/bin/activate

Then download 🕳️ CygnusX1 from GitHub:

git clone https://github.com/dat821168/CygnusX1.git

Finally, install the dependencies listed in requirements.txt (run this from inside the cloned CygnusX1 directory):

pip install -r requirements.txt

Run

Use run.py to start the script:

python run.py --keywords "keyword 1, keyword 2" --workers 8 --use_suggestions --headless

Argument details:

--keywords: The keywords or key phrases to search for. Separate multiple keywords with commas.
--out_dir: Directory where results are saved. Default = './IMAGES'.
--workers: The maximum number of workers used to crawl images. Default = 2.
--use_suggestions: Also crawl the search engine's suggestions/recommendations. Default = False.
--headless: Hide the browser window during scraping. Default = False.
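
The arguments above could be declared with `argparse` roughly as follows. This is a sketch reconstructed from the documented flags and defaults, not the contents of run.py: `build_parser` and `split_keywords` are hypothetical names, and making `--keywords` mandatory is an assumption.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        description="Search and download images from search engines.")
    # Assumption: a search needs at least one keyword, so the flag is required.
    parser.add_argument("--keywords", required=True,
                        help="comma-separated keywords or key phrases")
    parser.add_argument("--out_dir", default="./IMAGES",
                        help="directory where results are saved")
    parser.add_argument("--workers", type=int, default=2,
                        help="maximum number of crawl workers")
    # Boolean switches: absent means False, present means True.
    parser.add_argument("--use_suggestions", action="store_true",
                        help="also crawl suggestions/recommendations")
    parser.add_argument("--headless", action="store_true",
                        help="hide the browser window during scraping")
    return parser


def split_keywords(raw):
    """Split 'keyword 1, keyword 2' into clean, non-empty keywords."""
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(split_keywords(args.keywords), args.out_dir, args.workers)
```

With the example command shown earlier, `split_keywords` would receive the string "keyword 1, keyword 2" and return the two keywords with surrounding whitespace removed.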
