GitHub Actions and Python code to extract code repository URLs and put them into a standard form, starting with GitHub

Overview

repo_link_extractor

---- NOTE: JUST STARTED ONLY AN IDEA TO COME BACK TO ----

Summary

First minimum viable product goal

The first minimum viable product goal is to harvest all GitHub repository URLs from https://github.com/softwareunderground/awesome-open-geoscience/blob/main/README.md and return each one in the form https://github.com/ + "username" + "/" + "repository name". The "username/repository name" part is then added to the "repos" key of an existing JSON file such as https://github.com/softwareunderground/open_geosciene_code_projects_viz/blob/main/_explore/input_lists.json , which is summarized below:

{
    "memberOrgs": [
        "softwareunderground"
    ],
    "orgs": [
        "agile-geoscience",
        "softwareunderground"
    ],
    "repos": [
        "ahotovec/redpy",
        "whamlyn/auralib"
    ]
}
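
A minimal sketch of that normalization and merge step, assuming the JSON has already been loaded into a dictionary. The function names to_repo_slug and add_repos are hypothetical and not part of the existing project; the sample data is the summary above.

```python
import json
from urllib.parse import urlparse


def to_repo_slug(url):
    """Return 'username/repository' for a GitHub URL, or None if it is not one."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != "github.com":
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) < 2:
        return None  # e.g. a bare organization or user page
    owner, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[:-4]
    return f"{owner}/{repo}"


def add_repos(input_lists, urls):
    """Return a copy of input_lists with any new slugs added to 'repos'."""
    updated = dict(input_lists)
    repos = list(updated.get("repos", []))
    for url in urls:
        slug = to_repo_slug(url)
        if slug is not None and slug not in repos:
            repos.append(slug)
    updated["repos"] = sorted(repos, key=str.lower)
    return updated


input_lists = json.loads(
    '{"memberOrgs": ["softwareunderground"],'
    ' "orgs": ["agile-geoscience", "softwareunderground"],'
    ' "repos": ["ahotovec/redpy", "whamlyn/auralib"]}'
)
harvested = [
    "https://github.com/ahotovec/redpy",
    "https://github.com/softwareunderground/awesome-open-geoscience/blob/main/README.md",
]
updated = add_repos(input_lists, harvested)
print(json.dumps(updated, indent=4))
```

Deep links such as .../blob/main/README.md are reduced to their first two path segments, so every URL from the same repository collapses to one entry.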

2nd intermediate product goal

  • Fires from a GitHub Action

3rd intermediate product goal

Eventual product goal

  • Works for public, internal, and private GitHub URLs
  • Works for GitHub, GitLab, Bitbucket, and other code repository URLs & APIs
  • Keeps track of harvest date, source file name, source file URL, code platform & domain in an intermediate file.
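
One possible way to record the code platform and domain for each URL is a small lookup keyed on the host name. This is only a sketch: the PLATFORMS table and the classify function are hypothetical, and self-hosted instances (for example a private GitLab server) would need extra entries.

```python
from urllib.parse import urlparse

# Hypothetical lookup table covering only the public hosted services.
PLATFORMS = {
    "github.com": "GitHub",
    "gitlab.com": "GitLab",
    "bitbucket.org": "Bitbucket",
}


def classify(url):
    """Return (platform, domain); platform is 'unknown' for unrecognized hosts."""
    domain = urlparse(url).netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return PLATFORMS.get(domain, "unknown"), domain
```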

Related Projects

This is referenced on an issue here: https://github.com/softwareunderground/open_geosciene_code_projects_viz/issues/23

Potential Useful Bits

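The regular expression (https:\/\/github.com\/)\w+(\/)\w+ seems like a good starting point for extracting GitHub URLs. Note that \w does not match hyphens or dots, so owner or repository names such as agile-geoscience would need a wider character class.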
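
To see how the pattern behaves, the sketch below runs it against a short piece of sample Markdown next to a widened variant. The WIDENED pattern and the sample text are assumptions made for illustration, not something the project has settled on.

```python
import re

# Pattern exactly as written in the text above.
ORIGINAL = re.compile(r"(https:\/\/github.com\/)\w+(\/)\w+")

# Hypothetical widened variant: escapes the dot in the host name and
# also allows '-' and '.' inside owner and repository names.
WIDENED = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+")

text = (
    "See [redpy](https://github.com/ahotovec/redpy) and "
    "[bruges](https://github.com/agile-geoscience/bruges)."
)

original_hits = [m.group(0) for m in ORIGINAL.finditer(text)]
widened_hits = [m.group(0) for m in WIDENED.finditer(text)]

print(original_hits)  # misses the URL whose owner contains a hyphen
print(widened_hits)   # finds both URLs
```

The widened class has its own trade-off: a URL written at the end of a plain sentence would pick up the trailing full stop, so some post-processing may still be needed.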

Tentative GitHub Actions structure:

  • download README file
  • replace old README file with new
  • extract all links matching a regular expression
  • sort & take out duplicates
  • make into JSON with domain, URL, org or username, repository name, source file name, source file link, and harvest date
  • pull out the org or username & repository name from the above and put them into the appropriate key of the JSON file if they are not already present under either the orgs or repos key.
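
The extraction, de-duplication, record-building, and merge steps above could be sketched roughly as follows, starting from README text that has already been downloaded. All names here (LINK_PATTERN, harvest, merge, and the record keys) are hypothetical. The merge rule is one reading of the last step: a repository is skipped when its owner is already listed under "orgs" or the repository is already listed under "repos".

```python
import re
from datetime import date
from urllib.parse import urlparse

# Assumed link pattern; see the caveats about the regular expression above.
LINK_PATTERN = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+")


def harvest(readme_text, source_url, harvest_date=None):
    """Extract links, drop duplicates, sort, and build one record per link."""
    harvest_date = harvest_date or date.today().isoformat()
    urls = sorted(set(LINK_PATTERN.findall(readme_text)))
    records = []
    for url in urls:
        parsed = urlparse(url)
        owner, repo = parsed.path.strip("/").split("/")[:2]
        records.append({
            "domain": parsed.netloc,
            "url": url,
            "owner": owner,
            "repo": repo,
            "source_file_name": source_url.rsplit("/", 1)[-1],
            "source_file_url": source_url,
            "harvest_date": harvest_date,
        })
    return records


def merge(records, input_lists):
    """Add 'owner/repo' to 'repos' unless the owner or the repo is already listed."""
    known_orgs = {org.lower() for org in input_lists.get("orgs", [])}
    repos = list(input_lists.get("repos", []))
    known_repos = {r.lower() for r in repos}
    for rec in records:
        slug = f"{rec['owner']}/{rec['repo']}"
        if rec["owner"].lower() in known_orgs or slug.lower() in known_repos:
            continue
        repos.append(slug)
        known_repos.add(slug.lower())
    return {**input_lists, "repos": repos}
```

Keeping the records separate from the merged JSON means the intermediate file can retain provenance (source file and harvest date) even for links that are skipped during the merge.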

How to integrate into https://github.com/softwareunderground/open_geosciene_code_projects_viz ?

Options:

  1. Put all of the code here into the repository: https://github.com/softwareunderground/open_geosciene_code_projects_viz
  2. Call the code here from https://github.com/softwareunderground/open_geosciene_code_projects_viz
If calling the code:
  • (1) Add the script that reads the README to MASTER.sh as the first step.
  • (2) Set MASTER.sh to be called by GitHub Actions.
  • (3) When triggered, the GitHub Actions workflow runs the entirety of the GitHub Actions in this repo, including calling the Python scripts as its first step.
  • (4) Later steps include setting up the environment and calling all the Python scripts that the MASTER.sh bash script calls. The code would need to be called either by a GitHub Action (on pull request, push, manual trigger, or cron job) or by a trigger after the call to refresh the
Owner
Justin Gosses
Machine-Learning | Data Visualization | Geoscience | NASA |