A lightweight tool to get an AI Infrastructure Stack up in minutes, not days.

Overview


Welcome to K3ai Project

K3ai is a lightweight tool to get an AI Infrastructure Stack up in minutes, not days.



NOTE on the K3ai origins

The original K3ai project was developed in two weeks at the end of October 2020 by:

K3ai v1.0 was entirely rewritten by Alessandro Festa in October 2021 to offer a better user experience.

Thanks to the amazing people and projects that have been instrumental in creating the K3ai project repositories, website, and more.

⚡️ Quick start

Let's discover K3ai in three simple steps.

🌘 Getting Started

Get started by downloading k3ai from the release page.

Or try the K3ai companion script using this command:

curl -sfL https://get.k3ai.in | sh -

🌗 Load K3ai configuration

Let's start by loading the configuration:

k3ai up

The first time it runs, k3ai asks for a GitHub PAT (Personal Access Token), which it uses to avoid API rate limits. Check the GitHub documentation to learn how to create one. Your personal PAT only needs read permission on repositories.

🌖 Configure the base infrastructure

Choose your favourite Kubernetes flavor and run it.

To see which K8s flavors are available:

k3ai cluster list --all

It should print something like:

┌─────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ INFRASTRUCTURE                                                                                          │
├───────┬─────────────────────────────────────────────────────┬───────┬────────┬─────────┬────────────────┤
│ TYPE  │ DESCRIPTION                                         │ KIND  │ TAG    │ VERSION │ STATUS         │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ CIVO  │ The First Cloud Native Service Provider Power...    │ infra │ cloud  │ latest  │ Available      │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ EKS-A │ Amazon Eks Anywhere Is A New Deployment Option...   │ infra │ hybrid │ v0.5.0  │ Available      │
│       │ ate And Operate Kubernetes Clusters On Custome...   │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ K3S   │ K3s Is A Highly Available, Certified Kubernetes...  │ infra │ local  │ latest  │ Available      │
│       │ oads In Unattended, Resource-Constrained...         │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ KIND  │ Kind Is A Tool For Running Local Kubernetes...      │ infra │ local  │ v0.11.2 │ Available      │
│       │ as Primarily Designed For Testing Kubernetes...     │       │        │         │                │
│       │  Or Ci.                                             │       │        │         │                │
├───────┼─────────────────────────────────────────────────────┼───────┼────────┼─────────┼────────────────┤
│ TANZU │ Tanzu Community Edition Is A Fully-Featured...      │ infra │ hybrid │ latest  │ In Development │
│       │ ers And Users. It Is A Freely Available...          │       │        │         │                │
│       │  Of Vmware Tanzu.                                   │       │        │         │                │
└───────┴─────────────────────────────────────────────────────┴───────┴────────┴─────────┴────────────────┘

Now let's start with something super fast and super simple:

k3ai cluster deploy --type k3s -n mycluster

🌝 Install a plugin to do your AI experimentations

Now that the server is up and running, let's type:

k3ai plugin deploy -n mlflow -t mycluster

At the end of the installation, K3ai prints the URL where you can access the MLFlow tracking server. That's all; now just start having fun with K3ai!

🌈 Push a piece of code to the AI tools and focus on your goals

Let's push some code to the AI tool (for example, MLFlow):

k3ai run --source https://github.com/k3ai/quickstart --target mycluster --backend mlflow

Wait for the run to complete, then log in to the backend AI tool (for example, the MLFlow UI, served on port 30500 of your cluster address).
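The quick-start steps above can be collected into one script. The following is only a sketch: `CLUSTER`, `DRY_RUN`, `run`, and `quickstart` are names introduced here for illustration, and the k3ai commands are the ones shown in this README. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the quick-start sequence as a single script.
# CLUSTER and DRY_RUN are hypothetical variables, not k3ai settings.
CLUSTER="${CLUSTER:-mycluster}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded.
quickstart() {
    run k3ai up &&
    run k3ai cluster deploy --type k3s -n "$CLUSTER" &&
    run k3ai plugin deploy -n mlflow -t "$CLUSTER" &&
    run k3ai run --source https://github.com/k3ai/quickstart --target "$CLUSTER" --backend mlflow
}
```

Calling `quickstart` prints the four commands; calling it with `DRY_RUN=0` set in the environment would execute them in order and stop at the first failure.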

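Current implementation support

Operating Systems

| Operating System | K3ai v1.0.0 |
|------------------|-------------|
| Linux            | Yes         |
| Windows          | In Progress |
| macOS            | In Progress |
| Arm              | In Progress |

Clusters

| K8s Clusters               | K3ai v1.0.0 |
|----------------------------|-------------|
| Rancher K3s                | Yes         |
| VMware Tanzu Community Ed. | Yes         |
| Amazon EKS Anywhere        | Yes         |
| KinD                       | Yes         |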

Plugins

| Plugins             | K3ai v1.0.0 |
|---------------------|-------------|
| Kubeflow Components | Yes         |
| MLFlow              | Yes         |
| Apache Airflow      | Yes         |
| Argo Workflows      | Yes         |

⭐️ Project assistance

If you want to say thank you and/or support active development of the K3ai Project:

Together, we can make this project better every day! 😘

⚠️ License

K3ai is free and open-source software licensed under the BSD 3-Clause license. The official logo was created by Alessandro Festa.

Comments
  • [Core] - Initial work for v3 version

    This PR is the initial work to re-write k3ai into a more flexible tool. This PR implements:

    • [x] #3
    • [x] #4
    • [x] #5
    • [ ] #6
    • [x] #7
    • [x] #8
    • [ ] #9

    This PR also includes the issues in the plugins repo:

    • [x] https://github.com/k3ai/plugins/issues/1
    • [x] https://github.com/k3ai/plugins/issues/2
    • [x] https://github.com/k3ai/plugins/issues/3
    done 
    opened by alefesta 11
  • [BUG] k3ai up yields version `GLIBC_2.28' not found error

    Describe the bug: After following the installation instructions, the following error is reported:

    k3ai: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.28' not found (required by k3ai)

    To Reproduce Steps to reproduce the behavior:

    1. curl -LO https://get.k3ai.in | sh -
    2. k3ai up

    Expected behavior It should have spun up the cluster!

    OS: Ubuntu 18.04

    done bug 
    opened by htahir1 7
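
    The error above says the k3ai binary needs glibc 2.28 or newer, while Ubuntu 18.04 ships glibc 2.27. A minimal sketch of a pre-flight check follows; `version_at_least` and `check_glibc` are hypothetical helpers written for this note, not part of k3ai, and the sketch assumes GNU `sort -V` and `getconf` are available.

```shell
# Succeeds when version $2 is at least version $1 (compared with sort -V).
version_at_least() {
    want="$1"
    have="$2"
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# Compares the local glibc against a required version.
# On glibc systems, getconf prints a line such as "glibc 2.27".
check_glibc() {
    required="$1"
    found=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
    if [ -z "$found" ]; then
        echo "could not detect a glibc version"
        return 2
    fi
    if version_at_least "$required" "$found"; then
        echo "glibc $found satisfies >= $required"
    else
        echo "glibc $found is older than $required"
        return 1
    fi
}
```

    Running `check_glibc 2.28` before `k3ai up` would report whether the host is likely to hit this error.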
  • [Feature] - Running Kubeflow and MLFLow code through "One-Click" approach

    This PR addresses:

    • #10
    • #14

    It also introduces -x as Extra and -e as Entrypoint.

    Examples:

    k3ai run -s https://github.com/alefesta/sample/mlflow -b mlflow -t <clustername>
    k3ai run -s https://github.com/alefesta/sample/kfp -b kfp -e condition.py -t <clustername>
    

    For MLFlow, we still need to handle boto3, which must be listed in the conda.yaml file in order to run on K8s. This needs to be addressed before merging this PR. We may:

    • try to inject boto3 into the conda.yaml file at runtime
    • force the user to fork the example first, change conda.yaml, and then run k3ai

    The first option seems more in line with k3ai's goal of making the user's life easier.
    done 
    opened by alefesta 6
  • [BUG] - Kubeflow Pipelines Quickstart Repository Missing

    Describe the bug I was trying to follow the kubeflow pipelines tutorial as described in the k3ai website. It seems the final step of running the pipeline fails because the quickstart repository for kubeflow pipelines does not exist.

    To Reproduce Steps to reproduce the behavior:

    1. k3ai up
    2. k3ai cluster deploy -t k3s -n myk3scluster
    3. k3ai plugin deploy -n kf-pa -t myk3scluster
    4. k3ai run -s https://github.com/k3ai/quickstart/kfp -b kfp -e condition.py -t mycluster

    Expected behavior Pipeline to run successfully.

    Actual behavior Pipeline run fails.

    done 
    opened by harshitmahapatra 5
  • [Feature] - Add support for k3d

    🚀 Is your feature request related to a problem? Please describe.

    Currently, k3ai doesn't have support for k3d.

    k3s is known to have issues with WSL2 deployment (systemd requirement, etc.), so it would be better to have k3d support.

    💡 Describe the solution you'd like

    We can add k3d support to k3ai in a subsequent release (this would require some work on pkg/io/execution).

    epic done 
    opened by burntcarrot 5
  • Fix lint issues

    Fixed 100+ issues related to ineffectual assignments and added error checks. Added golangci-lint workflow to check linting issues while pushing code.

    The log level used while error checking is Fatal (log.Fatal(err)).

    opened by burntcarrot 4
  • [BUG] - MLFlow endpoint doesn't work in WSL2

    Describe the bug

    While running the MLFlow plugin, the endpoint URI displayed by k3ai is not accessible.


    The following endpoints are not accessible:

    • http://172.29.170.187:30500/ (displayed by k3ai)
    • http://172.29.170.187:5000/
    • http://10.96.150.194:30500/
    • http://10.244.0.7:30500/

    The IP address for the WSL2 machine is (through wsl hostname -I): 172.29.170.187

    WSL2 uses dynamic IP allocation.

    To Reproduce Steps to reproduce the behavior:

    k3ai run -s https://github.com/k3ai/quickstart -b mlflow
    

    Expected behavior The MLFlow endpoint exposed through k3ai should have worked.

    bug 
    opened by burntcarrot 4
  • [Feature] - Implement a system domain to automatically bind the plugins

    K3ai should implement an automatic system domain (for example, sslip.io or nip.io) so that any installed plugin can be exposed under the standard name <plugin-name>.<clusterIP>.nip.io. This way we can use the same IP in cases like:

    • WSL
    • Laptops
    epic 
    opened by alefesta 4
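
    The naming scheme proposed above can be sketched as a small helper. nip.io resolves any name of the form `<anything>.<IP>.nip.io` to that IP, so no DNS setup is needed. `plugin_host` is a hypothetical function used only for illustration; it is not a k3ai command.

```shell
# Builds the proposed host name <plugin-name>.<clusterIP>.nip.io.
plugin_host() {
    plugin="$1"
    cluster_ip="$2"
    printf '%s.%s.nip.io\n' "$plugin" "$cluster_ip"
}
```

    For example, `plugin_host mlflow 172.29.170.187` yields a name that resolves back to 172.29.170.187 wherever nip.io is reachable.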
  • [BUG] - runtime error with index out of range when running quickstart

    Describe the bug: k3ai run panics with an index out of range runtime error when running the quickstart.

    To Reproduce Steps to reproduce the behavior:

    1. Follow quickstart steps:
    k3ai up
    k3ai cluster deploy -t k3s -n mycluster
    k3ai plugin deploy -n mlflow -t mycluster
    
    2. Try running quickstart: $ k3ai run -s https://github.com/k3ai/quickstart -b mlflow
    3. Receive error:
    🧪	Initializing code...
    panic: runtime error: index out of range [0] with length 0
    
    goroutine 1 [running]:
    github.com/k3ai/pkg/runner.Loader({0x7fff7fe7bb4e, 0x22}, {0x0, 0x0}, {0x7fff7fe7bb74, 0x6}, {0x0, 0x0}, {0x0, 0x0})
    	/home/joshec/git/k3ai/pkg/runner/run.go:78 +0x10f6
    github.com/k3ai/cmd.runCommand.func1(0xc000403680, {0xc0003d57c0, 0x0, 0x4})
    	/home/joshec/git/k3ai/cmd/run.go:71 +0x58b
    github.com/spf13/cobra.(*Command).execute(0xc000403680, {0xc0003d5780, 0x4, 0x4})
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:860 +0x5f8
    github.com/spf13/cobra.(*Command).ExecuteC(0x2254960)
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:974 +0x3bc
    github.com/spf13/cobra.(*Command).Execute(...)
    	/home/joshec/go/pkg/mod/github.com/spf13/[email protected]/command.go:902
    github.com/k3ai/cmd.Execute(...)
    	/home/joshec/git/k3ai/cmd/root.go:34
    main.main()
    	/home/joshec/git/k3ai/main.go:10 +0x25
    

    Expected behavior: A successful run with proper artifact storage and tracking URI settings.

    in progress bug 
    opened by jeinstei 3
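
    The command in step 2 has no `-t` flag, while the README's own `k3ai run` example passes `--source`, `--target`, and `--backend`. Assuming the missing target is what triggers the panic, a wrapper could check the flags before calling k3ai. The sketch below is hypothetical: `check_run_args` is not part of k3ai and only looks for the flag names that appear in this document.

```shell
# Reports which of the source/backend/target flags are absent.
# Returns 0 when all three are present, 1 otherwise.
check_run_args() {
    have_s=0
    have_b=0
    have_t=0
    for arg in "$@"; do
        case "$arg" in
            -s|--source)  have_s=1 ;;
            -b|--backend) have_b=1 ;;
            -t|--target)  have_t=1 ;;
        esac
    done
    missing=""
    [ "$have_s" -eq 1 ] || missing="$missing --source"
    [ "$have_b" -eq 1 ] || missing="$missing --backend"
    [ "$have_t" -eq 1 ] || missing="$missing --target"
    if [ -n "$missing" ]; then
        echo "missing required flag(s):$missing" >&2
        return 1
    fi
}
```

    A call such as `check_run_args "$@" && k3ai run "$@"` would then fail with a readable message instead of a stack trace.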
  • [CI/CD] - Add Lint support

    On running golangci-lint on my local machine, I was able to find 40+ linting issues.

    10 of them were deadcode issues, which can be ignored because they belong to code added for future releases.

    The rest are ineffectual assignments and skipped error checks. We can log the error message for the skipped error checks; it would help us more in debugging.

    I know this sounds like a minor issue, but with more code coming in the subsequent releases, addressing this earlier can help us save a lot of time maintaining good quality code.

    Suggested Fix: Add golangci-lint action as workflow to check linting issues. We can add a rule for excluding deadcode issues for now.

    done 
    opened by burntcarrot 3
  • 'invalid argument'

    Hello - I am trying out the Mlflow deployment as in the tutorials and I get a stream of logs that say "invalid argument" and after a while I get "We tried to publish MLFLow at:http://172.17.0.2:30500" .. but when I go to this page there is no Mlflow server.

    Would appreciate the help. Thanks.

    Great work btw! this library is amazing!

    opened by jsnanavati 2
  • [BUG] - Kubeflow Pipelines not starting

    Describe the bug: I am trying to run the Kubeflow plugin on a single node with 8 vCPUs and 16 GB of RAM.

    To Reproduce

    curl -sfL https://get.k3ai.in | sh -
    k3ai up
    k3ai cluster deploy --type k3s -n mycluster
    k3ai plugin deploy -n kf-pa -t mycluster

    Issue: The installation never ends; it seems the pods are not being started correctly.

    ubuntu:~$ k3s kubectl get pods -n kubeflow
    NAME                                              READY   STATUS                   RESTARTS        AGE
    workflow-controller-b7f95d6c6-q2wkf               1/1     Running                  0               4m22s
    ml-pipeline-scheduledworkflow-5c549bc5f5-drkmn    1/1     Running                  0               4m23s
    ml-pipeline-viewer-crd-7555c4d55f-fpd2m           1/1     Running                  0               4m23s
    metadata-envoy-deployment-7654b98955-rkt2g        1/1     Running                  0               4m24s
    ml-pipeline-ui-656466fdc9-qg9xv                   1/1     Running                  0               4m23s
    mysql-55778745b6-g4vbd                            1/1     Running                  0               4m22s
    minio-6d6d45469f-xgmz2                            1/1     Running                  0               4m24s
    cache-deployer-deployment-6f8ff5b986-tvwn4        1/1     Running                  0               4m24s
    metadata-grpc-deployment-5c8599b99c-b45jf         1/1     Running                  1 (3m17s ago)   4m24s
    ml-pipeline-8995b746f-dhznz                       1/1     Running                  1 (2m31s ago)   4m23s
    cache-server-74494cbf5-k956w                      0/1     Pending                  0               2m20s
    cache-server-74494cbf5-6v5lj                      0/1     ContainerStatusUnknown   0               4m24s
    ml-pipeline-persistenceagent-59689585f6-s8dhd     1/1     Running                  1 (2m5s ago)    4m23s
    ml-pipeline-visualizationserver-6b8fb8c44-mmrk8   0/1     ContainerStatusUnknown   0               4m22s
    ml-pipeline-visualizationserver-6b8fb8c44-svm25   0/1     Pending                  0               113s
    metadata-writer-fd965db48-9lw22                   0/1     Error                    0               4m24s
    metadata-writer-fd965db48-rqt7d                   0/1     Pending                  0               82s
    

    Pod metadata-writer-fd965db48-9lw22 error : message: 'The node was low on resource: ephemeral-storage. Container main was using 392Ki, which exceeds its request of 0. '

    Any ideas? Thanks!

    needs-triage bug 
    opened by tonxxd 1
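
    With a long pod list like the one above, it can help to print only the pods that are not in the Running state. A minimal sketch follows, assuming the default `kubectl get pods` column layout (NAME, READY, STATUS); `not_running_pods` is a hypothetical helper, not a k3ai or kubectl command.

```shell
# Reads `kubectl get pods` output on stdin and prints "<pod> <status>"
# for every pod whose STATUS column is not "Running". Skips the header row.
not_running_pods() {
    awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}
```

    Usage would be `k3s kubectl get pods -n kubeflow | not_running_pods`, followed by `k3s kubectl describe pod <pod> -n kubeflow` on each result to read its events, such as the ephemeral-storage eviction quoted above.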
  • [BUG] - postgress crashes when deploying mlflow on k3s / intel

    Describe the bug: The postgres pod fails with: Bus error (core dumped)

    Running on:

    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 20.04.4 LTS
    Release:        20.04
    Codename:       focal

    To Reproduce

    k3ai cluster deploy --type k3s --name arrakis
    k3ai plugin deploy -n mlflow -t arrakis
    k3s kubectl logs postgres-0

    Expected behavior successful mlflow startup

    Screenshots

    $ kubectl get all -A
    NAMESPACE     NAME                                          READY   STATUS             RESTARTS      AGE
    kube-system   pod/local-path-provisioner-6c79684f77-996dg   1/1     Running            0             7m28s
    kube-system   pod/coredns-d76bd69b-8zqzz                    1/1     Running            0             7m28s
    kube-system   pod/metrics-server-7cd5fcb6b7-6zwwl           1/1     Running            0             7m28s
    default       pod/minio-0                                   1/1     Running            0             6m40s
    default       pod/mlflow-7c6768c4c-m6j6d                    1/1     Running            0             6m23s
    default       pod/postgres-0                                0/1     CrashLoopBackOff   6 (20s ago)   6m31s

    NAMESPACE     NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                  AGE
    default       service/kubernetes         ClusterIP   10.43.0.1       <none>        443/TCP                  7m43s
    kube-system   service/kube-dns           ClusterIP   10.43.0.10      <none>        53/UDP,53/TCP,9153/TCP   7m40s
    kube-system   service/metrics-server     ClusterIP   10.43.39.68     <none>        443/TCP                  7m39s
    default       service/minio-service      ClusterIP   10.43.144.140   <none>        9000/TCP                 6m40s
    default       service/postgres-service   ClusterIP   10.43.236.158   <none>        5432/TCP                 6m23s
    default       service/mlflow-service     NodePort    10.43.192.251   <none>        5000:30500/TCP           6m8s

    NAMESPACE     NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
    kube-system   deployment.apps/local-path-provisioner   1/1     1            1           7m40s
    kube-system   deployment.apps/coredns                  1/1     1            1           7m40s
    kube-system   deployment.apps/metrics-server           1/1     1            1           7m39s
    default       deployment.apps/mlflow                   1/1     1            1           6m23s

    NAMESPACE     NAME                                                DESIRED   CURRENT   READY   AGE
    kube-system   replicaset.apps/local-path-provisioner-6c79684f77   1         1         1       7m29s
    kube-system   replicaset.apps/coredns-d76bd69b                    1         1         1       7m29s
    kube-system   replicaset.apps/metrics-server-7cd5fcb6b7           1         1         1       7m29s
    default       replicaset.apps/mlflow-7c6768c4c                    1         1         1       6m23s

    NAMESPACE   NAME                        READY   AGE
    default     statefulset.apps/minio      1/1     6m40s
    default     statefulset.apps/postgres   0/1     6m31s

    $ kubectl logs postgres-0
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.

    The database cluster will be initialized with locale "en_US.utf8".
    The default database encoding has accordingly been set to "UTF8".
    The default text search configuration will be set to "english".

    Data page checksums are disabled.

    fixing permissions on existing directory /var/lib/postgresql/mlflow/data ... ok
    creating subdirectories ... ok
    selecting default max_connections ... 20
    selecting default shared_buffers ... 400kB
    selecting default timezone ... Etc/UTC
    selecting dynamic shared memory implementation ... posix
    creating configuration files ... ok

    bug todo :spiral_notepad: 
    opened by paxinos 1
  • [BUG] - Incompatible k3s version for kubeflow

    Describe the bug: Kubeflow does not currently support k8s 1.22. At the moment Kubeflow Pipelines seems to work with k3s, but kf-dashboard, for example, fails, which might be related to unsupported k8s API versions.

    ...
     ⏳     Working...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-crds/base": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "EnvoyFilter" in version "networking.istio.io/v1alpha3"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "Gateway" in version "networking.istio.io/v1alpha3"
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "AuthorizationPolicy" in version "security.istio.io/v1beta1"
    ...
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "MutatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
     🚀 unable to recognize "/home/user/.k3ai/git/common/istio-1-9/istio-install/base": no matches for kind "ValidatingWebhookConfiguration" in version "admissionregistration.k8s.io/v1beta1"
    ...
    

    To Reproduce

    k3ai plugin deploy -n kf-dashboard -t myk3scluster

    Expected behavior

    Successful deployment of all kubeflow components on k3s.

    in progress docs 
    opened by Adiqq 1
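
    Kubernetes 1.22 stopped serving `apiextensions.k8s.io/v1beta1` and `admissionregistration.k8s.io/v1beta1`, which matches the first and last errors in the log; the Istio kinds probably fail because their CRDs were never installed. A rough way to find manifests that still use the removed versions is sketched below. `find_removed_apis` is a hypothetical helper, and the pattern covers only the two API groups named in this log.

```shell
# Lists manifest lines that reference API versions removed in Kubernetes 1.22.
# Arguments are files or directories to search (recursively).
find_removed_apis() {
    grep -rnE 'apiVersion:[[:space:]]*(apiextensions\.k8s\.io|admissionregistration\.k8s\.io)/v1beta1' "$@"
}
```

    Running it against the plugin checkout mentioned in the log, for example `find_removed_apis ~/.k3ai/git/common/istio-1-9`, would show which files need the `v1` API versions instead.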
  • [Feature] - Clean up CLI error messages

    🚀 Is your feature request related to a problem? Please describe.

    See issue https://github.com/k3ai/k3ai/issues/53 regarding k3ai run not returning a useful error when a required argument is missing.

    💡 Describe the solution you'd like

    A better CLI handler with more descriptive errors, or at least a fix for this bug in this command's handling.

    🤩 Describe alternatives you've considered

    None yet.

    help-wanted epic todo :spiral_notepad: docs 
    opened by jeinstei 0
  • Implement Plugin remove

    Hey, great work with K3ai. It works pretty smoothly for most operations.

    After experimenting a bit with Kubeflow I wanted to remove a plugin, but it seems that the command is not implemented:

    ➜ k3ai  plugin remove --name kf-pa
    Remove a given plugin based on NAME
    
    Usage:
      k3ai[options] plugin remove [-n NAME] [other flags]
    
    Flags:
      -n, --name string     NAME of plugin to be created/deleted
      -t, --target string   Target from where to remove plugin.
      -q, --quiet           Suppress output messages. Useful when k3ai is used within scripts.
      -c, --config string   Configure K3ai using a custom config file.[-c /path/tofile] [-c https://urlToFile]
    

    See here: https://github.com/k3ai/k3ai/blob/main/cmd/plugin.go#L138

    I am not sure, whether I am missing something but couldn't find anything related in the issues or Roadmap.


    • Your operating system name and version: Ubuntu 18.04
    • Detailed steps to reproduce the bug: Follow exact steps from documentation or README to deploy a plugin
    help-wanted epic todo :spiral_notepad: 
    opened by daniel-vera-g 2
  • [Feature] - Use Github Actions to create issues for exported reports

    🚀 Is your feature request related to a problem? Please describe.

    Related:

    • #37

    Using the metrics report exported through the executor, we can use Github Actions workflows to create automated issues containing the reports.

    💡 Describe the solution you'd like

    Create issues with the exported report as the content using Github Actions.

    epic todo :spiral_notepad: 
    opened by burntcarrot 1
Releases(1.0.1)
  • 1.0.1(Dec 7, 2021)

    Full Changelog: https://github.com/k3ai/k3ai/compare/1.0...1.0.1

    What's Changed

    K3ai Features:

    • ARM support #13 @alefesta
    • Kubeflow one-click pipeline #14 @alefesta
    • Implementing GH actions (GHA) as a method to run K3ai from within the repo #19 Minimal documentation to run K3ai as GH @burntcarrot
    • Implementing a config file to mimic an e2e workflow #18 @burntcarrot
    • Add support for k3d #25 @burntcarrot

    K3ai Plugins:

    • https://github.com/k3ai/plugins/issues/6 @alefesta
    • https://github.com/k3ai/plugins/issues/7 @alefesta

    Bugs

    • [BUG] - certain plugins fail to install by @alefesta in https://github.com/k3ai/k3ai/pull/23
    • [BUG] - Fixes right tools download for Architecture by @alefesta in https://github.com/k3ai/k3ai/pull/43
    • [BUG] - minor fixes on download tools by @alefesta in https://github.com/k3ai/k3ai/pull/44
    • [BUG] - Fixes on Civo CLI for ARM by @alefesta in https://github.com/k3ai/k3ai/pull/45

    New Contributors

    • @burntcarrot made their first contribution in https://github.com/k3ai/k3ai/pull/27
  • 1.0(Nov 1, 2021)

    Full Changelog: https://github.com/k3ai/k3ai/commits/1.0

    What's Changed

    • [Core] - Initial work for v1.0.0 version by @alefesta in https://github.com/k3ai/k3ai/pull/2

    New Contributors

    • @alefesta made their first contribution in https://github.com/k3ai/k3ai/pull/2

    Full Changelog:

    • Introducing K3ai DB to manage clusters and plugins dynamically
    • Introducing new CLI logic : K3ai [COMMAND] [ACTION] [OPTIONS]
    • Introducing the One Click experience to run training over deployed plugins.

    Current Operating Systems supported

    • Linux x64
    • macOS (Not Tested)

    Have fun with K3ai