Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.

Introduction

BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can write their deep learning applications as standard Spark programs, which can directly run on top of existing Spark or Hadoop clusters.

Installation

Please download BigDL Packages or pip install BigDL (conda)

How to run Program on Spark

Usage: spark-submit-with-bigdl.sh + [options] + file.py

Options:

master MASTER URL: spark, yarn, k8s, local.
local[k]: Run Spark locally with k worker threads as logical cores on your machine.
File.py: File for executing program.

System configuration

Program run on system includes:

System/Host Processor: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
CPU(s): 48
Core(s) per socket: 12
Socket(s): 2
Memory: 183 G (free)

Data Description and Run Model

It is a dataset of 60,000 small square 28×28 pixel grayscale images of handwritten single digits between 0 and 9. The MNIST data is split into three parts: 60,000 data points of training data, 10,000 points of test data.

With this BigDL Problem, We use LSTM model for MNIST digit classification problem.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
images		images
README.md		README.md
Report.pdf		Report.pdf
Slide.pdf		Slide.pdf
package.json		package.json
run.py		run.py
source_code.py		source_code.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

images

images

README.md

README.md

Report.pdf

Report.pdf

Slide.pdf

Slide.pdf

package.json

package.json

run.py

run.py

source_code.py

source_code.py

Repository files navigation

Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.

Introduction

Installation

How to run Program on Spark

System configuration

Data Description and Run Model

BigDL Performance Evaluation

Execution running time

Computation Evaluation (SPEED UP)

About

Releases

Packages

Languages

congthanhquangphu/BigDL

Folders and files

Latest commit

History

Repository files navigation

Evaluate the performance of BigDL (Distributed Deep Learning on Apache Spark) in big data analysis problems.

Introduction

Installation

How to run Program on Spark

System configuration

Data Description and Run Model

BigDL Performance Evaluation

Execution running time

Computation Evaluation (SPEED UP)

About

Resources

Stars

Watchers

Forks

Languages