Polyglot Machine Learning example for scraping similar news articles.

Overview

Polyglot Machine Learning example for scraping similar news articles

Machine Learning Polyglot with Python and NodeJS

In this example, we will see how we can work with Machine Learning applications written in Python with a NodeJS Script, to build a Polyglot Machine Learning application for scraping similar news articles.

Install

Install MetaCall CLI:

$ curl -sL https://raw.githubusercontent.com/metacall/install/master/install.sh | sh

Install application dependencies:

  • For Python: metacall pip3 install -r requirements.txt
  • For NodeJS: metacall npm i readline-sync

Run the Example

$ metacall app.js

Once the application is kick-started, you will be prompted to enter a News Article which you would like to find similar articles for. Let's use this sample article for testing our application: https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html

Here is the application output:

$ metacall app.js
Information: Global configuration loaded from /gnu/store/5cxmq6y8z24ijnvhh6lndgpriwnhf3jl-metacall-0.3.17/configurations/global.json
Enter the News URL:
https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬───────────────┐
│                                                       (index)                                                       │    Values     │
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┼───────────────┤
│ https://auto.timesofindia.com/news/others/teslas-autopilot-technology-faces-fresh-scrutiny/articleshow/81652823.cms │ '83.68405286' │
│                    https://www.autosafety.org/teslas-autopilot-technology-faces-fresh-scrutiny/                     │ '60.35694007' │
│                    https://www.anandmarket.in/teslas-autopilot-technology-faces-fresh-scrutiny/                     │ '94.97681053' │
│                                     https://www.entrepreneur.com/article/367724                                     │ '60.67538891' │
│                 http://www.newsnetworks.in/india/teslas-autopilot-technology-faces-fresh-scrutiny/                  │     '0.'      │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴───────────────┘
Script (app.js) loaded correctly

Deployment using MetaCall FaaS

After deploying the application into the FaaS https://dashboard.metacall.io, it can be accessed with (change by the alias you used to sign up):

curl -X POST https://api.metacall.io/<your_alias>/ml-news-article-scraper-example/v1/call/links -X POST --data '{ "url": "https://www.nytimes.com/2021/03/23/business/teslas-autopilot-safety-investigations.html" }'

LICENSE

Apache License 2.0

Owner
MetaCall
MetaCall
A high-performance topological machine learning toolbox in Python

giotto-tda is a high-performance topological machine learning toolbox in Python built on top of scikit-learn and is distributed under the G

giotto.ai 632 Dec 29, 2022
🎛 Distributed machine learning made simple.

🎛 lazycluster Distributed machine learning made simple. Use your preferred distributed ML framework like a lazy engineer. Getting Started • Highlight

Machine Learning Tooling 44 Nov 27, 2022
Learning --> Numpy January 2022 - winter'22

Numerical-Python Numpy NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along

Shahzaneer Ahmed 0 Mar 12, 2022
Bayesian Additive Regression Trees For Python

BartPy Introduction BartPy is a pure python implementation of the Bayesian additive regressions trees model of Chipman et al [1]. Reasons to use BART

187 Dec 16, 2022
Graphsignal is a machine learning model monitoring platform.

Graphsignal is a machine learning model monitoring platform. It helps ML engineers, MLOps teams and data scientists to quickly address issues with data and models as well as proactively analyze model

Graphsignal 143 Dec 05, 2022
Practical Time-Series Analysis, published by Packt

Practical Time-Series Analysis This is the code repository for Practical Time-Series Analysis, published by Packt. It contains all the supporting proj

Packt 325 Dec 23, 2022
Apple-voice-recognition - Machine Learning

Apple-voice-recognition Machine Learning How does Siri work? Siri is based on large-scale Machine Learning systems that employ many aspects of data sc

Harshith VH 1 Oct 22, 2021
My project contrasts K-Nearest Neighbors and Random Forrest Regressors on Real World data

kNN-vs-RFR My project contrasts K-Nearest Neighbors and Random Forrest Regressors on Real World data In many areas, rental bikes have been launched to

1 Oct 28, 2021
AutoX是一个高效的自动化机器学习工具,它主要针对于表格类型的数据挖掘竞赛。 它的特点包括: 效果出色、简单易用、通用、自动化、灵活。

English | 简体中文 AutoX是什么? AutoX一个高效的自动化机器学习工具,它主要针对于表格类型的数据挖掘竞赛。 它的特点包括: 效果出色: AutoX在多个kaggle数据集上,效果显著优于其他解决方案(见效果对比)。 简单易用: AutoX的接口和sklearn类似,方便上手使用。

4Paradigm 431 Dec 28, 2022
Distributed scikit-learn meta-estimators in PySpark

sk-dist: Distributed scikit-learn meta-estimators in PySpark What is it? sk-dist is a Python package for machine learning built on top of scikit-learn

Ibotta 282 Dec 09, 2022
Automatic extraction of relevant features from time series:

tsfresh This repository contains the TSFRESH python package. The abbreviation stands for "Time Series Feature extraction based on scalable hypothesis

Blue Yonder GmbH 7k Jan 06, 2023
TensorFlow Decision Forests (TF-DF) is a collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models.

TensorFlow Decision Forests (TF-DF) is a collection of state-of-the-art algorithms for the training, serving and interpretation of Decision Forest models. The library is a collection of Keras models

538 Jan 01, 2023
MiniTorch - a diy teaching library for machine learning engineers

This repo is the full student code for minitorch. It is designed as a single repo that can be completed part by part following the guide book. It uses

1.1k Jan 07, 2023
Open-Source CI/CD platform for ML teams. Deliver ML products, better & faster. ⚡️🧑‍🔧

Deliver ML products, better & faster Giskard is an Open-Source CI/CD platform for ML teams. Inspect ML models visually from your Python notebook 📗 Re

Giskard 335 Jan 04, 2023
ML-powered Loan-Marketer Customer Filtering Engine

In Loan-Marketing business employees are required to call the user's to buy loans of several fields and in several magnitudes. If employees are calling everybody in the network it is also very length

Sagnik Roy 13 Jul 02, 2022
Scikit-Learn useful pre-defined Pipelines Hub

Scikit-Pipes Scikit-Learn useful pre-defined Pipelines Hub Usage: Install scikit-pipes It's advised to install sklearn-genetic using a virtual env, in

Rodrigo Arenas 1 Apr 26, 2022
Python Machine Learning Jupyter Notebooks (ML website)

Python Machine Learning Jupyter Notebooks (ML website) Dr. Tirthajyoti Sarkar, Fremont, California (Please feel free to connect on LinkedIn here) Also

Tirthajyoti Sarkar 2.6k Jan 03, 2023
A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in Python using Scikit-Learn, Keras and TensorFlow 2.

Machine Learning Notebooks, 3rd edition This project aims at teaching you the fundamentals of Machine Learning in python. It contains the example code

Aurélien Geron 1.6k Jan 05, 2023
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Model Serving Made Easy BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. Supports multi

BentoML 4.4k Jan 04, 2023
ETNA is an easy-to-use time series forecasting framework.

ETNA is an easy-to-use time series forecasting framework. It includes built in toolkits for time series preprocessing, feature generation, a variety of predictive models with unified interface - from

Tinkoff.AI 674 Jan 07, 2023