LynxKite: a complete graph data science platform for very large graphs and other datasets.

Overview

LynxKite

LynxKite is a complete graph data science platform for very large graphs and other datasets. It seamlessly combines the benefits of a friendly graphical interface and a powerful Python API.

  • Hundreds of scalable graph operations, including graph metrics like PageRank, embeddedness, and centrality; machine learning methods including GCNs; graph segmentations like modular clustering; and various transformation tools like aggregations on neighborhoods.
  • The two main data types are graphs and relational tables. Switch back and forth between the two as needed to describe complex logical flows. Run SQL on both.
  • A friendly web UI for building powerful pipelines of operation boxes. Define your own custom boxes to structure your logic.
  • Tight integration with Python lets you implement custom transformations or create whole workflows through a simple API (see the short example after this list).
  • Integrates with the Hadoop ecosystem. Import from and export to CSV, JSON, Parquet, ORC, JDBC, Hive, or Neo4j.
  • Fully documented.
  • Proven in production on large clusters and real datasets.
  • Fully configurable graph visualizations and statistical plots. Experimental 3D and ray-traced graph renderings.
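
A taste of the Python API (a minimal sketch: it assumes a LynxKite instance reachable at the default address, and the camelCase method names below only illustrate how operation boxes are exposed, they are not tied to a specific release):

import lynx.kite

lk = lynx.kite.LynxKite()            # recent versions default to http://localhost:2200
graph = lk.createExampleGraph()      # assumed box name: a small built-in demo graph
ranked = graph.computePageRank()     # assumed box name: one of the graph metric boxes
print(ranked.sql('select * from vertices').df())  # SQL on the result, as a DataFrame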

LynxKite is under active development. Check out our Roadmap to see what we have planned for future releases.

Getting started

Quick try:

docker run --rm -p 2200:2200 lynxkite/lynxkite

Setup with persistent data:

docker run \
  -p 2200:2200 \
  -v ~/lynxkite/meta:/metadata -v ~/lynxkite/data:/data \
  -e KITE_MASTER_MEMORY_MB=1024 \
  --name lynxkite lynxkite/lynxkite
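
Once the container is up, the web UI is served at http://localhost:2200. For a quick check from Python (assuming a recent release, where the client defaults to the same address):

import lynx.kite
lk = lynx.kite.LynxKite()  # connects to http://localhost:2200 by default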

Contributing

If you find any bugs or have questions, feature requests, or comments, please file an issue or email us at [email protected].

You can install LynxKite's dependencies (Scala, Node.js, Go) with Conda.

Before the first build:

tools/git/setup.sh # Sets up pre-commit hooks.
conda env create --name lk --file conda-env.yml
conda activate lk
cp conf/kiterc_template ~/.kiterc

We use make for building the whole project.

make
target/universal/stage/bin/lynxkite interactive

Tests

We have test suites for the different parts of the system:

  • Backend tests are unit tests for the Scala code. They can also be executed with Sphynx as the backend. If you run make backend-test it will do both. Or you can start sbt and run testOnly *SomethingTest to run just one test. Run ./test_backend.sh -si to start sbt with Sphynx as the backend.

  • Frontend tests use Protractor to simulate a user's actions on the UI. make frontend-test will build everything, start a temporary LynxKite instance and run the tests against that. Use xvfb-run for headless execution. If you already have a running LynxKite instance and you don't mind erasing all data from it, run npx gulp test in the web directory. You can start up a dev proxy that watches the frontend source code for changes with npx gulp serve. Run the test suite against the dev proxy with npx gulp test:serve.

  • Python API tests are started with make remote_api-test. If you already have a running LynxKite that is okay to test on, run python/remote_api/test.sh. This script can also run a subset of the test suite: python/remote_api/test.sh -p *something*

License

Comments
  • R in LynxKite

    It's working!

    [screenshot]

    TODO:

    • [x] The same for edges.
    • [x] Add "derive table" and "create graph".
    • [x] Docs.
    • [x] Tests.
    • [x] Better type support. In the screenshot as.numeric() is needed because Sphynx only supports int64 and float64, but nchar() returns int32. I don't think I want to add more types to Sphynx. Rather I think we can automatically cast to the declared type.
    • [x] Make the type declarations more idiomatic. float, str, etc. are from Python.
    • [x] Try some fancy R package, like https://github.com/digitalcytometry/ecotyper.
    • [ ] Check whether the Docker image needs any changes for this.
    • [ ] Add test for Long. (Python too.)
    opened by darabos 11
  • Upgrade to Spark 3.1.1, Scala 2.12, and Play 2.8.7

    Major highlights so far:

    • Removed Vegas.
    • Removed Ammonite.
    • Play switched to dependency injection. Controllers are classes instead of objects now. It was not obvious how to convert the one test that was affected, so I just deleted it.
    • ScalaTest renamed org.scalatest.FunSuite to org.scalatest.funsuite.AnyFunSuite. (Funnily, this didn't happen in 3.0.0 but in 3.1.0.) This affected 100+ files.
    • The Play JSON API changed a bit. It's not very exciting but affected a lot of files.
    • Looks like HADOOP_HOME must be set now even in single-node usage. I'll come back to look at it a bit more later but for now I just set it to an empty directory and it's fine.
    • A lot of other API changes and version conflicts, but nothing terribly interesting I think.

    LynxKite appears to be working now! I computed stuff on the example graph, looked at histograms, and used SQL.

    Next step is to fix the failing tests:

    [error] Failed: Total 724, Failed 217, Errors 0, Passed 507, Ignored 4
    
    opened by darabos 9
  • NetworKit integration

    Super early state, but I can finally call NetworKit from Go. It's similar to Jano's solution from a year ago, but doesn't require hand-crafted wrappers. SWIG generates them just fine!

    For now I only communicate "scalars" between the two systems. Passing arrays was another hurdle in Jano's PR. We will see.

    (Internal link for his PR: https://github.com/biggraph/biggraph/pull/8676)

    opened by darabos 9
  • GitHub actions for testing

    For #8. It's hard to test this locally. I'm using https://github.com/nektos/act but I'm getting weird errors and caching doesn't work, so each attempt takes ages. Will this PR trigger a run, I wonder? If not, I may merge this and try to see if I can trigger it that way.

    opened by darabos 8
  • Zero copy import when the schema is known

    Resolves #258.

    [screenshot]

    No import button! The corresponding Python code is:

    lk.importParquet(eager='no', filename='/home/darabos/eg.parquet', schema='name: String, age: Double')
    

    Outstanding issues:

    • Currently you can only "import" a file this way once. LynxKite assumes it will never change. This could be avoided with a version parameter, the same as it's done with export operations.
    • Add the three parameters: imported_columns, limit, and sql.
    • Tests, documentation.
    opened by darabos 7
  • Neo4j export

    This is part 1: exporting attributes for existing nodes.

    There's an option to set node.keys and let the connector build the query. But if I use that, specifying a label is a must. If I write the same query manually, I can leave it off. (http://5.9.211.195:8000/neo4j-spark-docs/1.0.0/writing.html#bookmark-write-node)

    Open tasks:

    • [x] Make sure this works if the keys are not defined everywhere.
    • [x] Attribute export for edges.
    • [ ] Edge export for existing nodes. (I don't think this is important.)
    • [x] Export whole graph as new stuff.
    • [x] Documentation.
    • [x] Tests. (Maybe when the final Neo4j Spark Connector is released.)
    opened by darabos 7
  • Ditch ordered mapping

    The idea (from @xandrew-lynx) is that MappingToOrdered takes up a lot of memory. The tests seem to be passing locally. I haven't measured the impact on memory use yet. I also haven't thought backward compatibility entirely through.

    opened by darabos 5
  • Upgrade to Spark 3.0

    It seems despite the new major version, "No major code changes are required to adopt this version of Apache Spark."

    It seems to have quite a few improvements. It would also allow for GPU acceleration, as pointed out by Gyorgy Mezo.

    opened by xandrew-lynx 5
  • Allow starting and stopping LynxKite from Scala

    The idea is that you have a JVM which already has a Spark session. You want to run LynxKite in this session. And you want to use it from Python too while it's running. This is a common situation in a Databricks notebook, which allows mixing Scala and Python cells.

    Instead of a kiterc you can set environment variables or provide overrides like this:

    com.lynxanalytics.lynxkite.Environment.set(
      "KITE_ENABLE_CUDA" -> "yes",
      "KITE_CONFIGURE_SPARK" -> "no",
      "KITE_META_DIR" -> "/home/darabos/kite/meta",
      "KITE_DATA_DIR" -> "file:/home/darabos/kite/data",
      "KITE_ALLOW_PYTHON" -> "yes",
      "KITE_ALLOW_NON_PREFIXED_PATHS" -> "true",
      "SPHYNX_HOST" -> "localhost",
      "SPHYNX_PORT" -> "5551",
      "ORDERED_SPHYNX_DATA_DIR" -> "/home/darabos/kite/sphynx/ordered",
      "UNORDERED_SPHYNX_DATA_DIR" -> "/home/darabos/kite/sphynx/unordered",
    )
    com.lynxanalytics.lynxkite.Main.start()
    // ...
    com.lynxanalytics.lynxkite.Main.stop()
    
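    While the embedded instance is running, the Python side could connect over HTTP as usual. A minimal sketch, assuming the default address:

    import lynx.kite
    lk = lynx.kite.LynxKite()  # assumed default address: http://localhost:2200
    # drive boxes and SQL through lk while Main.start() keeps the instance alive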

    All this wouldn't be too bad. But it's the first time we're really exposing the LynxKite package name. I wanted it to be com.lynxanalytics.lynxkite rather than com.lynxanalytics.biggraph. So there are a few more diffs than strictly necessary.

    But we should have renamed it already anyway! What's a "biggraph"? Nobody knows.

    opened by darabos 4
  • sphynx: bump dependency versions

    Hi, this change just bumps the dependency versions of Sphynx. After ./build.sh, there is an error (with or without this change) though:

    networkit_wrap.cxx: In function ‘std::vector<double>* _wrap_Centrality_TLX_DEPRECATED_networkit_77eaa497b00f90e1(NetworKit::Centrality*, void*)’:
    networkit_wrap.cxx:2001:12: error: ‘arg2’ was not declared in this scope; did you mean ‘arg1’?
    

    I don't know how to fix it. :) Cheers.

    opened by jfcg 4
  • Segmentation metrics from NetworKit

    There are 7 more per-segment metrics like this one. One of them takes two segmentations as input. I think I'll skip that one and just add the 6 that have the same interface.

    There are also 5 segmentation metrics that are just a single scalar for a whole segmentation. An example is modularity. (I originally missed these because they don't derive from the Algorithm class.) I'll add these too.

    I would be fine putting these all into the new "Segmentation attributes" box category. Or do you have a better idea for organization?

    Also, I'm not sure about separate boxes vs. one box with a dropdown. But I like separate boxes: it leaves more room for documentation, makes them easier to find in the box search, and saves the user from picking from a dropdown. So I'll go that way if you don't stop me.

    opened by darabos 4
  • Bump fast-json-patch from 3.0.0-1 to 3.1.1 in /web

    Bumps fast-json-patch from 3.0.0-1 to 3.1.1.

    Release notes

    Sourced from fast-json-patch's releases.

    3.1.1

    Security Fix for Prototype Pollution - huntr.dev #262

    Bug fixes and ES6 modules

    Use ES6 Modules

    • package now exports non-bundled ES module Starcounter-Jack/JSON-Patch#232
    • main still points to CommonJS module for backward compatibility
    • README recommends use of named ES imports

    List of changes https://github.com/Starcounter-Jack/JSON-Patch/compare/v2.2.1...3.0.0-0

    Maintainer changes

    This version was pushed to npm by mountain-jack, a new releaser for fast-json-patch since your current version.


    dependencies javascript 
    opened by dependabot[bot] 0
  • Pass DataFrames to/from managed LynxKite

    When LynxKite is running in a user-provided SparkSession, it should be possible to pass Spark DataFrames between the user's Python code and LynxKite. This would be very efficient and very powerful.

    opened by darabos 0
  • Better errors if edge src/dst indexing is wrong

    I "Create graph in R" (and maybe in Python too) if you set an out of bounds edge src/dst then Sphynx will just crash. You get "UNAVAILABLE: Network closed for unknown reason". Let's add a better error.

    good first issue 
    opened by darabos 0
  • Clicking a box doesn't open its popup until it's saved

    This came up in https://github.com/lynxkite/lynxkite/pull/307#discussion_r1032239653 but I think I've also experienced it when using a LynxKite instance on a different continent. Maybe we could fix it?

    bug 
    opened by darabos 0
Releases(5.2.0)
  • 5.2.0(Dec 1, 2022)

    LynxKite 5.2.0 brings a large number of cool new features! In addition to Python, Scala, and SQL, we now have boxes for running R in LynxKite. We've made it possible to output custom plots from these new R boxes and also from the existing Python boxes. You can output static plots (as with Matplotlib) or even dynamic visualizations (as with Deck.gl).

    On the other hand, if you're running LynxKite as part of an automated workflow, our Python API can now start and stop LynxKite automatically to avoid wasting resources when LynxKite is idle.

    The changes in detail:

    • The Python API can now be used without a running LynxKite instance. If you pass in a SparkSession to LynxKite (lk = lynx.kite.LynxKite(spark=spark)), LynxKite will run in that SparkSession. #294 Useful if you want to run LynxKite as part of a pipeline rather than as a permanent fixture. (See the sketch after this list.)
    • The LynxKite() constructor in the Python API now defaults to connecting to http://localhost:2200. #291
    • Added "Compute in R" and "Create graph in R" boxes that behave the same as their Python counterparts, but let you use R. #292
    • Set up an Earthly build. #296 This should make builds very reliable for everyone.
    • "Compute in Python" boxes can now output plots. Just set the output to matplotlib, or html. #297
    Source code(tar.gz)
    Source code(zip)
    lynxkite-5.2.0.jar(213.53 MB)
  • 5.1.0(Sep 28, 2022)

    LynxKite 5.1.0 brings a major change in how LynxKite is started. It also includes a high-performance Neo4j import box, support for Google's BigQuery, and several other improvements.

    Changes to how LynxKite is started

    Until now, the script generated by Play Framework was in charge of starting LynxKite. We added a significant amount of code to it with tools/call_spark_submit.sh. You would run this script as lynxkite/bin/lynxkite interactive. And this script started spark-submit with parameters based on .kiterc.

    All that is gone now. LynxKite is distributed as a single jar file. You can run it with spark-submit lynxkite-5.1.0.jar. Most of the settings from your .kiterc still apply, but you now have to load these into the environment.

    . ~/.kiterc
    spark-3.3.0/bin/spark-submit lynxkite-5.1.0.jar
    

    The benefit of this change is that LynxKite is now started like any other Spark application. Any environment that is set up to run Spark applications will be able to run LynxKite too.

    Our Docker images have been updated with this change. If you are running LynxKite in Docker, you don't have to change anything.

    Detailed changelist

    • Upgraded to Apache Spark 3.3.0. #272
    • LynxKite is now started more simply, with spark-submit. #269 This makes deployment much simpler in Hadoop environments.
    • The new box "Import from Neo4j files" can be used to import Neo4j data directly from files instead of reading from a running Neo4j instance. This can reduce the memory requirements from terabytes to gigabytes on large datasets. #268
    • Added two new "Import from BigQuery" boxes. #245
    • Changed the font styling on legends to make them more readable over maps. #267
    • The "Import from Parquet" box now has an option for using the source files directly instead of pulling the data into LynxKite. #261 This avoids an unnecessary copy and is more convenient to use through the Python API.
    • The "Weighted aggregate on neighbors" box now supports weighting by edge attributes. #257
    • The "Add rank attribute" box now supports ranking edges by edge attributes. #255

    Congratulations to @tuckging and @lacca0 for their first LynxKite commits in this release! 🎉

    Source code(tar.gz)
    Source code(zip)
    lynxkite-5.1.0.jar(220.53 MB)
  • 5.0.0(Jun 13, 2022)

    LynxKite 5.0.0 is a big release giving us fast GPU-accelerated algorithms, a new internal storage format, and other improvements.

    Download the attached release file or follow the instructions for running our Docker image.

    • Added GPU implementations of several algorithms using RAPIDS cuGraph. #241 Enable GPU usage by setting KITE_ENABLE_CUDA=yes in .kiterc. The list of algorithms includes PageRank, connected components, betweenness and Katz centrality, the Louvain method, k-core decomposition, and ForceAtlas2, a new option in Place vertices with edge lengths.
    • Switched the internal storage of graph entities from custom SequenceFiles to Parquet. #237 This is an incompatible change, but the migration is simple: delete $KITE_DATA/partitioned. Everything will be recomputed when accessed, and will be stored in the new format.
    • Added methods in the Python API for conversion between PySpark DataFrames and LynxKite tables. #240
    • Domain preference is now configurable. #236 This is useful if you want the distributed Spark backend to take precedence over the local Sphynx backend.

    Migration from LynxKite 4.x

    #237 changed the data format for graph data. You will have to delete your $KITE_DATA/partitioned directory. The data will be regenerated in the new format.

    Source code(tar.gz)
    Source code(zip)
    lynxkite-5.0.0.tgz(169.99 MB)
  • 4.4.0(May 24, 2022)

    LynxKite 4.4.0 is a maintenance release with optimizations, bug fixes, and dependency upgrades.

    • Upgraded to PyTorch Geometric (PyG) 2.0.1. #206
    • Upgraded to NetworKit 10.0. #234
    • The workspace interface is much faster now. #220
    • Now using Conda for managing all dependencies. #209
    • Fixed an issue with Python boxes returning errors unnecessarily. #225
    • Fixed an issue with GCS. #224
    • Fixed CUDA issues with GCN and Node2vec boxes. #234
    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.4.0.tgz(170.03 MB)
  • 4.3.0(Sep 10, 2021)

    LynxKite 4.3.0 is a massive maintenance release. We have long wanted to upgrade to Spark 3.x, but this required upgrading to Scala 2.12, which in turn required upgrading Play Framework and other things. And now it's all done!

    We found the time to include some user-visible improvements too. Check out the full list of changes below:

    • Upgraded to Apache Spark 3.1.2. This also brought us up to Scala 2.12, Java 11, Play Framework 2.8.7, and new versions of some other dependencies. #178 #184
    • The "Custom plot" box now lets you use the latest version of Vega-Lite by directly writing JSON instead of going through the Vegas Scala DSL.
    • Logistic regression models can now be configured to use elastic net regularization.
    • Boxes used as steps in a wizard are highlighted in the workspace view by a faint glow. #183
    • "Compute in Python" boxes can be used on tables. #160
    • Added a "Draw ROC curve" built-in custom box. #197
    • Performance and compatibility improvements. #188 #194
    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.3.0.tgz(173.04 MB)
  • 4.2.2(Apr 30, 2021)

  • 4.2.1(Apr 15, 2021)

  • 4.2.0(Jan 29, 2021)

    LynxKite 4.2.0 comes with a series of minor bugfixes and a much expanded collection of graph algorithms.

    • 42 algorithms from NetworKit have been integrated into LynxKite. They include new centrality measures, random graph generators, community detection methods, graph metrics (diameter, effective diameter, assortativity), optimal spanning trees and more. (#102, #106, #111, #123)
    • Users can now opt in to sharing anonymous usage statistics with the LynxKite team. (#128)
    • Environment variables can be used to override .kiterc settings. (#110)
    • Added a built-in for parametric parameters (workspaceName) that can be used to force recomputation in wizards. (#131)
    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.2.0.tgz(248.60 MB)
  • 4.1.0(Oct 5, 2020)

    LynxKite 4.1.0 comes with a big update for our Neo4j support. This has been the most frequently raised point by our new users. Thanks for all the feedback!

    • Neo4j 4.x support.
    • Revamped Neo4j import. Instead of importing tables, you can now import a whole graph. (#90)
    • Added Neo4j export. You can export vertex or edge attributes, or the whole graph. (#91)
    • AVRO and Delta Lake import and export. (#63, #86)
    • Added the "Filter with SQL" box as a more flexible alternative to "Filter by attributes".
    • Added a visualization option to not display edges. Great for large geographic datasets.
    • "Use table as vertex/edge attributes" boxes are more friendly and handle name conflicts better now.
    • Added aggregation support for Vector attributes. (Elementwise average, sum, etc.)
    • Added an option to disable generated suffixes for aggregated variables.
    • Fix for edge coloring. (#84)
    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.1.0.tgz(245.43 MB)
  • 4.0.1(Jul 3, 2020)

    • Fixed issue with interactive tutorials. (#30)
    • Fixed issue with graph attributes in “Create graph in Python”. (#25)
    • Fixed issue with non-String attributes in “Use table as graph”. (#26)
    • Replaced trademarked box icons (it was an accident!) with free ones. Also switched to FontAwesome 5 everywhere to get a better selection of icons. (#37)
    • Improved the User Guide. (#38, #39)
    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.0.1.tgz(221.39 MB)
  • 4.0.0(Jun 22, 2020)

    We've open-sourced LynxKite!

    We took this opportunity to make many changes that break compatibility with the LynxKite 3.x series. We can help migrate existing workspaces to LynxKite 4.0 if necessary.

    • Replaced the separate Long, Int, Double attribute types with number.
    • Instead of the (Double, Double) attribute type, 2D positions are now represented as Vector[number]. This type is widely supported and more flexible. Use "Bundle vertex attributes into a Vector" instead of "Convert vertex attributes to position", which is now gone.
    • Renamed "scalars" to "graph attributes". Renamed "projects" to "graphs". These mysterious names were largely used for historical reasons.
    • Removed "Predict with a graph neural network" operation. (It was an early prototype, long since succeeded by the "Predict with GCN" box.)
    • Removed "Predict attribute by viral modeling" box. It is more flexible to do the same thing through a series of more elemental boxes. A built-in box ("Predict from communities") has been added to serve as a starting point.
    • Made it easier to use graph convolutional boxes: added "Bundle vertex attributes into a Vector" and "One-hot encode attribute" boxes.
    • Replaced the "Reduce vertex attributes to two dimensions" and "Embed with t-SNE" boxes with the new "Reduce attribute dimensions" box which offers both PCA and t-SNE.
    • "Compute in Python" boxes now support Vector[Double] attributes.
    • "Create Graph in Python" box added.
    • Inputs and outputs for "Compute in Python" can now be inferred from the code.

    See our changelog for release notes for older releases.

    Source code(tar.gz)
    Source code(zip)
    lynxkite-4.0.0.tgz(220.81 MB)