Polyglot Machine Learning example for scraping similar news articles.

Overview

Machine Learning Polyglot with Python and NodeJS

In this example, we will see how a NodeJS script can call Machine Learning code written in Python, building a polyglot Machine Learning application that scrapes news articles similar to a given one.

Install

Install MetaCall CLI:

$ curl -sL https://raw.githubusercontent.com/metacall/install/master/install.sh | sh

Install application dependencies:

  • For Python: metacall pip3 install -r requirements.txt
  • For NodeJS: metacall npm i readline-sync

Run the Example

$ metacall app.js

Once the application starts, you will be prompted to enter the URL of a news article for which you would like to find similar articles. Let's use this sample article to test the application: https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html

Here is the application output:

$ metacall app.js
Information: Global configuration loaded from /gnu/store/5cxmq6y8z24ijnvhh6lndgpriwnhf3jl-metacall-0.3.17/configurations/global.json
Enter the News URL:
https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬───────────────┐
│                                                       (index)                                                       │    Values     │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────┤
│ https://auto.timesofindia.com/news/others/teslas-autopilot-technology-faces-fresh-scrutiny/articleshow/81652823.cms │ '83.68405286' │
│                    https://www.autosafety.org/teslas-autopilot-technology-faces-fresh-scrutiny/                     │ '60.35694007' │
│                    https://www.anandmarket.in/teslas-autopilot-technology-faces-fresh-scrutiny/                     │ '94.97681053' │
│                                     https://www.entrepreneur.com/article/367724                                     │ '60.67538891' │
│                 http://www.newsnetworks.in/india/teslas-autopilot-technology-faces-fresh-scrutiny/                  │     '0.'      │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────┘
Script (app.js) loaded correctly
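
The Values column above holds one score per URL, printed as strings. As a minimal sketch, assuming a higher score means a more similar article (the output itself does not state this), the results could be ranked as follows. The `rank_by_similarity` helper is hypothetical and not part of the example application; the URLs and scores are copied from the sample output.

```python
# Scores copied from the sample output above; the application prints them as strings.
scores = {
    "https://auto.timesofindia.com/news/others/teslas-autopilot-technology-faces-fresh-scrutiny/articleshow/81652823.cms": "83.68405286",
    "https://www.autosafety.org/teslas-autopilot-technology-faces-fresh-scrutiny/": "60.35694007",
    "https://www.anandmarket.in/teslas-autopilot-technology-faces-fresh-scrutiny/": "94.97681053",
    "https://www.entrepreneur.com/article/367724": "60.67538891",
    "http://www.newsnetworks.in/india/teslas-autopilot-technology-faces-fresh-scrutiny/": "0.",
}


def rank_by_similarity(raw_scores):
    """Hypothetical helper: return (url, score) pairs, highest score first.

    Assumption: a higher score means a more similar article.
    """
    # Convert the string values to floats so they sort numerically.
    parsed = {url: float(value) for url, value in raw_scores.items()}
    return sorted(parsed.items(), key=lambda item: item[1], reverse=True)


ranked = rank_by_similarity(scores)
for url, score in ranked:
    print(f"{score:6.2f}  {url}")
```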

Deployment using MetaCall FaaS

After deploying the application to the MetaCall FaaS (https://dashboard.metacall.io), it can be called as follows (replace <your_alias> with the alias you used to sign up):

curl -X POST https://api.metacall.io/<your_alias>/ml-news-article-scraper-example/v1/call/links --data '{ "url": "https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html" }'
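
The same call can be sketched in Python using only the standard library. This is an illustration, not part of the example repository: `build_links_request` is a hypothetical helper, `your_alias` is a placeholder, and the format of the response is not documented here, so the sketch only builds the request and leaves sending it commented out.

```python
import json
import urllib.request

API_BASE = "https://api.metacall.io"
DEPLOYMENT = "ml-news-article-scraper-example"


def build_links_request(alias, article_url):
    """Hypothetical helper: build the POST request shown in the curl command."""
    endpoint = f"{API_BASE}/{alias}/{DEPLOYMENT}/v1/call/links"
    # Same JSON payload as the curl --data argument.
    body = json.dumps({"url": article_url}).encode("utf-8")
    # No Content-Type header is set explicitly, mirroring the curl command.
    return urllib.request.Request(endpoint, data=body, method="POST")


request = build_links_request(
    "your_alias",  # placeholder: replace with the alias you used to sign up
    "https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html",
)
print(request.get_method(), request.full_url)

# Uncomment to send the request once the application is deployed:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```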

LICENSE

Apache License 2.0

Owner
MetaCall