Predictive AI layer for existing databases.

Overview

MindsDB

MindsDB workflow Python supported PyPi Version PyPi Downloads MindsDB Community MindsDB Website

MindsDB is an open-source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning models using SQL queries. Tweet

Predictive AI layer for existing databases
MindsDB

Try it out

Contributing

To contribute to mindsdb, please check out our Contribution guide.

Current contributors

Made with contributors-img.

Report Issues

Please help us by reporting any issues you may have while using MindsDB.

License

Comments
  • [Bug]: Exception

    [Bug]: Exception "Lightgbm mixer not supported for type: tags" when following process plant tutorial

    Is there an existing issue for this?

    • [x] I have searched the existing issues

    Current Behaviour

    I'm following the Manufacturing process quality tutorial and during training I get the error Exception: Lightgbm mixer not supported for type: tags, raised at: /opt/conda/lib/python3.7/site-packages/mindsdb/interfaces/model/learn_process.py#177.

    Expected Behaviour

    I'd expect AutoML to train the model successfully

    Steps To Reproduce

    Follow the tutorial steps until training.
    

    Anything else?

    The tutorials seem to be of mixed quality, some resulting in errors, some in low model performance or "null" predictions (for the bus ride tutorial).

    bug documentation 
    opened by philfuchs 22
  • install issues on windows 10

    install issues on windows 10

    Describe the bug When installing mindsdb, the following error message is output with the following command.

    command: pip install --requirement reqs.txt

    error message: ERROR: Could not find a version that satisfies the requirement torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2) ERROR: No matching distribution found for torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25))

    Desktop (please complete the following information):

    • OS: windows 10

    Additional context I think pytorch for Windows is currently not available through PYPI, so using the commands in https://pytorch.org is a better way.

    bug 
    opened by YottaGin 22
  • Integration merlion issue2377

    Integration merlion issue2377

    1. Modify sql_query.py to support to define customized return columns and dtype of ml_handlers;
    2. [issue2377] Merlion integrated, forecaster: default, sarima, prophet, mses, detector: default, isolation forest, windstats, prophet;
    3. Corresponding test cases added, /mindsdb/tests/unit/test_merlion_handler.
    opened by rfsiten 21
  • [Bug]:  metadata-generation-failed

    [Bug]: metadata-generation-failed

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I keep on trying to pip install mindsdb. Screenshot 2022-05-10 11 13 31

    This is the message that it comes up with.

    Things tried:

    • Installing wheel
    • Upgrading pip
    • Installing sktime
    • googling/StackOverflow

    Expected Behavior

    I anticipated that I would be able to download the package.

    Steps To Reproduce

    pip install mindsdb
    

    Anything else?

    No response

    bug 
    opened by TyanBr 20
  • [Bug]: Not able to preview data due to internal server error HTTP 500

    [Bug]: Not able to preview data due to internal server error HTTP 500

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I was trying the mindsdb get started tutorial, that time i encountered HTTP 500 error, is it due to URL authorization scope not found ? or anything else. I tried googling why this happens but couldn't get a proper explanation. I can't do further steps it keeps on giving me a HTTP 500 error. Is it a common issue, coz i keep getting it on the website while running the query for the tutorial and can't preview my tables. Says that the query is not proper, i ran the exact code from tutorial. Is it is some issue from my side...anyway do help me overcome it. I really like mindsdb concept and an AI/ML enthusiast :)

    Here is the ss of what im getting while running the queries:

    Screenshot 2022-07-14 012410

    It says query with error but i am running the code from the tutorial the syntax is intact. Is there any changes to be made in the syntax?

    Expected Behavior

    I should be able to preview the home rentals data that i will train in the further steps.

    Steps To Reproduce

    Go to the mindsdb website and then start the home rentals demo tutorial. Run the demo code given in the step 1. Try to preview the data
    

    Anything else?

    nothing

    bug 
    opened by prathikshetty2002 19
  • Could not load module ModelInterface

    Could not load module ModelInterface

    hello can someone help to solve this :

    • ERROR:mindsdb-logger-f7442ec0-574d-11ea-a55e-106530eaf271:c:\users\drpbengrir\appdata\local\programs\python\python37\lib\site-packages\mindsdb\libs\controllers\transaction.py:188 - Could not load module ModelInterface
    bug 
    opened by simofilahi 18
  • FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'

    FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'

    Describe the bug A FileNotFoundError occurs when running the predict.py script described below.

    The full traceback is the following:

    Traceback (most recent call last):
      File "predict.py", line 12, in <module>
        result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/predictor.py", line 472, in predict
        transaction = Transaction(session=self, light_transaction_metadata=light_transaction_metadata, heavy_transaction_metadata=heavy_transaction_metadata, breakpoint=breakpoint)
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 53, in __init__
        self.run()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 259, in run
        self._execute_predict()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 157, in _execute_predict
        with open(CONFIG.MINDSDB_STORAGE_PATH + '/' + self.lmd['name'] + '_light_model_metadata.pickle', 'rb') as fp:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'
    

    To Reproduce Steps to reproduce the behavior:

    1. Create a train.py script using the dataset: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016#master.csv. The train.py script is the one below:
    from mindsdb import Predictor
    
    Predictor(name='suicide_rates').learn(
        to_predict='suicides_no', # the column we want to learn to predict given all the data in the file
        from_data="master.csv" # the path to the file where we can learn from, (note: can be url)
    )
    
    1. Run the train.py script.
    2. Create and run the predict.py script:
    from mindsdb import Predictor
    
    # use the model to make predictions
    result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
    
    # you can now print the results
    print(result)
    
    1. See error

    Expected behavior What was expected was to see the results.

    Desktop (please complete the following information):

    • OS: Ubuntu 18.04.2 LTS
    • mindsdb 1.0.5
    • python 3.6.7
    bug 
    opened by mlliarm 18
  • MySQL / Singlestore DB SSL support

    MySQL / Singlestore DB SSL support

    Problem I cannot connect my Singlestore DB (MySQL driver) because Mindsdb doesn't support SSL options.

    Describe the solution you'd like Full support for MySQL SSL (key, cert, ca).

    Describe alternatives you've considered No alternative is possible at the moment, security first.

    enhancement question 
    opened by pierre-b 17
  • Caching historical data for streams

    Caching historical data for streams

    I'll be using an example here and generalize whenever needed, let's say we have the following dataset we train a timeseries predictor on:

    time,gb,target,aux
    1     ,A, 7,        foo
    2     ,A, 10,      foo
    3     ,A,  12,     bar
    4     ,A,  14,     bar
    2     ,B,  5,       foo
    4     ,B,  9,       foo
    

    In this case target is what we are predicting, gb is the column we are grouping on and we are ordering by time. aux is an unrelated column that's not timeseries in nature and just used "normally".

    We train a predictor with a window of n

    Then let's say we have an input stream that looks something like this:

    time, gb, target, aux
    6,      A ,   33,     foo
    7,      A,    54,     foo
    

    Caching

    First, we will need to store, for each value of the column gb n recrods.

    So, for example, if n==1 we would save the last row in the data above, if n==2 we would save both, whem new rows come in, we un-cache the older rows`.

    Infering

    Second, when a new datapoint comes into the input stream we'll need to "infer" that the prediction we have to make is actually for the "next" datapoint. Which is to say that when: 7, A, 54, foo comes in we need to infer that we need to actually make predictions for:

    8, A, <this is what we are predicting>, foo

    The challenge here is how do we infer that the next timestamp is 8, one simple way to do this is to just subtract from the previous record, but that's an issue for the first observation (since we don't have a previous record to substract from, unless we cache part of the training data) or we could add a feature to native, to either:

    a) Provide a delta argument for each group by (representing by how much we increment the order column[s]) b) Have an argument when doing timeseries prediction that tells it to predict for the "next" row and then do the inferences under the cover.

    @paxcema let me know which of these features would be easy to implement in native, since you're now the resident timeseries expert.

    enhancement 
    opened by George3d6 16
  • Data extraction with mindsdb v.2

    Data extraction with mindsdb v.2

    This is a followup to #334, which was about the same use case and dataset, but different version of MindsDB with different errors.

    Your Environment

    Google Colab.

    • Python version: 3.6
    • Pip version: 19.3.1
    • Mindsdb version you tried to install: 2.13.8

    Describe the bug Running .learn() fails.

    [nltk_data]   Package stopwords is already up-to-date!
    
    /usr/local/lib/python3.6/dist-packages/lightwood/mixers/helpers/ranger.py:86: UserWarning: This overload of addcmul_ is deprecated:
    	addcmul_(Number value, Tensor tensor1, Tensor tensor2)
    Consider using one of the following signatures instead:
    	addcmul_(Tensor tensor1, Tensor tensor2, *, Number value) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
      exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
    
    Downloading: 100%
    232k/232k [00:00<00:00, 1.31MB/s]
    
    
    Downloading: 100%
    442/442 [00:06<00:00, 68.0B/s]
    
    
    Downloading: 100%
    268M/268M [00:06<00:00, 43.1MB/s]
    
    
    Token indices sequence length is longer than the specified maximum sequence length for this model (606 > 512). Running this sequence through the model will result in indexing errors
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:173 - Could not load module ModelInterface
    
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:239 - index out of range in self
    
    ---------------------------------------------------------------------------
    
    IndexError                                Traceback (most recent call last)
    
    <ipython-input-13-a5e3bd095e46> in <module>()
          7 mdb.learn(
          8     from_data=train,
    ----> 9     to_predict='birth_year' # the column we want to learn to predict given all the data in the file
         10 )
    
    20 frames
    
    /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       1812         # remove once script supports set_grad_enabled
       1813         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
    -> 1814     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       1815 
       1816 
    
    IndexError: index out of range in self
    

    https://colab.research.google.com/drive/1a6WRSoGK927m3eMkVdlwBhW6-3BtwaAO?usp=sharing

    To Reproduce

    1. Rerun this notebook on Google Colab https://github.com/opendataby/vybary2019/blob/e08c32ac51e181ddce166f8a4fbf968f81bd2339/canal03-parsing-with-mindsdb.ipynb
    bug 
    opened by abitrolly 15
  • Upload new Predictor

    Upload new Predictor

    Your Environment Scout The Scout upload Prediction does not function when a zip file is trying to upload.

    Errror is "Just .zip files are allowed" even though the zip file was selected. 
    
    

    Question on prediction, is there any way to upload a model of Tensorflow, or do we need to Convert TensorFlow model into the MindDB prediction model?

    bug question 
    opened by Winthan 15
  • [Bug]: TS dates not interpreted correctly

    [Bug]: TS dates not interpreted correctly

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    Predictions from a timeseries model (using >LATEST) gives dates starting Dec 1st, however there are Dec dates (e.g. 6th) within the training data.

    Expected Behavior

    Predictions using ">LATEST" should start 1 day after the latest date in the training data.

    Steps To Reproduce

    CREATE MODEL mindsdb.callsdata_time
    FROM files
    (SELECT * from CallData)
    PREDICT CallsOfferedTo5
    ORDER BY DateTime
    GROUP BY PrecisionQueue
    WINDOW 8    
    HORIZON 4;
    
    
    SELECT m.DateTime as date,
      m.CallsOfferedTo5 as forecast
      FROM mindsdb.callsdata_time as m
      JOIN files.CallData as t
      WHERE t.DateTime > LATEST
      AND t.PrecisionQueue = 5000;
    
    
    
    ### Anything else?
    
    **Training data**
    [export (3) (1).csv](https://github.com/mindsdb/mindsdb/files/10335508/export.3.1.csv)
    
    **Example of dates in Dec**
    ![image](https://user-images.githubusercontent.com/34073127/210332080-d2471ece-5c3a-4a3c-942c-ea1e41a2d4cd.png)
    
    **Model output**
    ![image](https://user-images.githubusercontent.com/34073127/210332293-418f2a5a-c6d7-4bba-98a7-1effb8e1bca2.png)
    
    bug 
    opened by tomhuds 1
  • [ENH] Override dtypes - LW handler

    [ENH] Override dtypes - LW handler

    Description

    Enables support for overriding automatically inferred data types when using the Lightwood handler.

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [x] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 This change requires a documentation update

    What is the solution?

    If the user provides:

    CREATE ...
    USING 
    dtype_dict = {
       'col1': 'type',
        ...
    };
    

    The handler will extract this mapping and use it to build a modified problem definition, which will trigger the default encoder belonging to each new dtype (note: manually specifying encoders is out of scope for this PR and will be added at a later date).

    Checklist:

    • [x] My code follows the style guidelines(PEP 8) of MindsDB.
    • [x] I have commented my code, particularly in hard-to-understand areas.
    • [ ] I have updated the documentation, or created issues to update them.
    • [x] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [x] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    opened by paxcema 0
  • updated all links

    updated all links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4263 Fixes copied modified content of databases.mdx into its copy databases2.mdx Fixes removed https://docs.mindsdb.com from links

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 Documentation update

    What is the solution?

    (Describe at a high level how the feature was implemented) As in Description.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • updated links

    updated links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4261

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 Documentation update

    What is the solution?

    Updated file names in links wherever necessary.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • Predictions for ranges/sets of values

    Predictions for ranges/sets of values

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Is your feature request related to a problem? Please describe.

    I'd like to retrieve a prediction filtering over a range of values rather than only individual exact values. Example: select rental_price, rental_price_explain from mindsdb.home_rentals_model where sqft between 1000 and 1200 and neighborhood in ('berkeley_hills', 'westbrae', 'downtown');

    This currently generates an error: Only 'and' and '=' operations allowed in WHERE clause, found: BetweenOperation(op='between', args=( Identifier(parts=['sqft']), Constant(value=1000), Constant(value=1200) )

    Describe the solution you'd like.

    I'd like to have the ability to use filter criteria such as "in", "between", or "or" logic to retrieve a single prediction.

    Describe an alternate solution.

    No response

    Anything else? (Additional Context)

    https://mindsdbcommunity.slack.com/archives/C01S2T35H18/p1672320517307839

    opened by rbkrejci 0
Releases(v22.12.4.3)
Official implementation of our paper "LLA: Loss-aware Label Assignment for Dense Pedestrian Detection" in Pytorch.

LLA: Loss-aware Label Assignment for Dense Pedestrian Detection This project provides an implementation for "LLA: Loss-aware Label Assignment for Dens

35 Dec 06, 2022
A collection of IPython notebooks covering various topics.

ipython-notebooks This repo contains various IPython notebooks I've created to experiment with libraries and work through exercises, and explore subje

John Wittenauer 2.6k Jan 01, 2023
[ICML 2022] The official implementation of Graph Stochastic Attention (GSAT).

Graph Stochastic Attention (GSAT) The official implementation of GSAT for our paper: Interpretable and Generalizable Graph Learning via Stochastic Att

85 Nov 27, 2022
Fast and simple implementation of RL algorithms, designed to run fully on GPU.

RSL RL Fast and simple implementation of RL algorithms, designed to run fully on GPU. This code is an evolution of rl-pytorch provided with NVIDIA's I

Robotic Systems Lab - Legged Robotics at ETH Zürich 68 Dec 29, 2022
Object detection using yolo-tiny model and opencv used as backend

Object detection Algorithm used : Yolo algorithm Backend : opencv Library required: opencv = 4.5.4-dev' Quick Overview about structure 1) main.py Load

2 Jul 06, 2022
Dataset used in "PlantDoc: A Dataset for Visual Plant Disease Detection" accepted in CODS-COMAD 2020

PlantDoc: A Dataset for Visual Plant Disease Detection This repository contains the Cropped-PlantDoc dataset used for benchmarking classification mode

Pratik Kayal 109 Dec 29, 2022
An intelligent, flexible grammar of machine learning.

An english representation of machine learning. Modify what you want, let us handle the rest. Overview Nylon is a python library that lets you customiz

Palash Shah 79 Dec 02, 2022
Transferable Unrestricted Attacks, which won 1st place in CVPR’21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet.

Transferable Unrestricted Adversarial Examples This is the PyTorch implementation of the Arxiv paper: Towards Transferable Unrestricted Adversarial Ex

equation 16 Dec 29, 2022
Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

Daniele Grattarola 2.2k Jan 08, 2023
Reinforcement Learning for the Blackjack

Reinforcement Learning for Blackjack Author: ZHA Mengyue Math Department of HKUST Problem Statement We study playing Blackjack by reinforcement learni

Dolores 3 Jan 24, 2022
Distance correlation and related E-statistics in Python

dcor dcor: distance correlation and related E-statistics in Python. E-statistics are functions of distances between statistical observations in metric

Carlos Ramos Carreño 108 Dec 27, 2022
Autotype on websites that have copy-paste disabled like Moodle, HackerEarth contest etc.

Autotype A quick and small python script that helps you autotype on websites that have copy paste disabled like Moodle, HackerEarth contests etc as it

Tushar 32 Nov 03, 2022
An ever-growing playground of notebooks showcasing CLIP's impressive zero-shot capabilities.

Playground for CLIP-like models Demo Colab Link GradCAM Visualization Naive Zero-shot Detection Smarter Zero-shot Detection Captcha Solver Changelog 2

Kevin Zakka 101 Dec 30, 2022
Unofficial PyTorch Implementation of Multi-Singer

Multi-Singer Unofficial PyTorch Implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus. Requirements See re

SunMail-hub 123 Dec 28, 2022
Download and preprocess popular sequential recommendation datasets

Sequential Recommendation Datasets This repository collects some commonly used sequential recommendation datasets in recent research papers and provid

125 Dec 06, 2022
Categorizing comments on YouTube into different categories.

Youtube Comments Categorization This repo is for categorizing comments on a youtube video into different categories. negative (grievances, complaints,

Rhitik 5 Nov 26, 2022
PyBrain - Another Python Machine Learning Library.

PyBrain -- the Python Machine Learning Library =============================================== INSTALLATION ------------ Quick answer: make sure you

2.8k Dec 31, 2022
Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021)

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion (CVPR 2021) This repository is for BAAF-Net introduce

90 Dec 29, 2022
PiRank: Learning to Rank via Differentiable Sorting

PiRank: Learning to Rank via Differentiable Sorting This repository provides a reference implementation for learning PiRank-based models as described

54 Dec 17, 2022
Anchor-free Oriented Proposal Generator for Object Detection

Anchor-free Oriented Proposal Generator for Object Detection Gong Cheng, Jiabao Wang, Ke Li, Xingxing Xie, Chunbo Lang, Yanqing Yao, Junwei Han, Intro

jbwang1997 56 Nov 15, 2022