Python Data Science Handbook: full text in Jupyter Notebooks

Overview

Python Data Science Handbook

Binder Colab

This repository contains the entire Python Data Science Handbook, in the form of (free!) Jupyter notebooks.

cover image

How to Use this Book

About

The book was written and tested with Python 3.5, though other Python versions (including Python 2.7) should work in nearly all cases.

The book introduces the core libraries essential for working with data in Python: particularly IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and related packages. Familiarity with Python as a language is assumed; if you need a quick introduction to the language itself, see the free companion project, A Whirlwind Tour of Python: it's a fast-paced introduction to the Python language aimed at researchers and scientists.

See Index.ipynb for an index of the notebooks available to accompany the text.

Software

The code in the book was tested with Python 3.5, though most (but not all) will also work correctly with Python 2.7 and other older Python versions.

The packages I used to run the code in the book are listed in requirements.txt (Note that some of these exact version numbers may not be available on your platform: you may have to tweak them for your own use). To install the requirements using conda, run the following at the command-line:

$ conda install --file requirements.txt

To create a stand-alone environment named PDSH with Python 3.5 and all the required package versions, run the following:

$ conda create -n PDSH python=3.5 --file requirements.txt

You can read more about using conda environments in the Managing Environments section of the conda documentation.

License

Code

The code in this repository, including all code samples in the notebooks listed above, is released under the MIT license. Read more at the Open Source Initiative.

Text

The text content of the book is released under the CC-BY-NC-ND license. Read more at Creative Commons.

Owner
Jake Vanderplas
Python, Astronomy, Data Science
Jake Vanderplas
Zipline, a Pythonic Algorithmic Trading Library

Zipline is a Pythonic algorithmic trading library. It is an event-driven system for backtesting. Zipline is currently used in production as the backte

Quantopian, Inc. 15.7k Jan 07, 2023
SCICO is a Python package for solving the inverse problems that arise in scientific imaging applications.

Scientific Computational Imaging COde (SCICO) SCICO is a Python package for solving the inverse problems that arise in scientific imaging applications

Los Alamos National Laboratory 37 Dec 21, 2022
Efficient Python Tricks and Tools for Data Scientists

Why efficient Python? Because using Python more efficiently will make your code more readable and run more efficiently.

Khuyen Tran 944 Dec 28, 2022
Doing bayesian data analysis - Python/PyMC3 versions of the programs described in Doing bayesian data analysis by John K. Kruschke

Doing_bayesian_data_analysis This repository contains the Python version of the R programs described in the great book Doing bayesian data analysis (f

Osvaldo Martin 851 Dec 27, 2022
Data intensive science for everyone.

InVesalius InVesalius generates 3D medical imaging reconstructions based on a sequence of 2D DICOM files acquired with CT or MRI equipments. InVesaliu

Galaxy Project 1k Jan 08, 2023
A simple computer program made with Python on the brachistochrone curve.

Brachistochrone-curve This is a simple computer program made with Python on the brachistochrone curve. I decided to write it after a physics lesson on

Diego Romeo 1 Dec 16, 2021
PennyLane is a cross-platform Python library for differentiable programming of quantum computers.

PennyLane is a cross-platform Python library for differentiable programming of quantum computers. Train a quantum computer the same way as a neural network.

PennyLaneAI 1.6k Jan 04, 2023
Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica

Mathics is a general-purpose computer algebra system (CAS). It is an open-source alternative to Mathematica. It is free both as in "free beer" and as in "freedom".

Mathics 535 Jan 04, 2023
AnuGA for the simulation of the shallow water equation

ANUGA Contents ANUGA What is ANUGA? Installation Documentation and Help Mailing Lists Web sites Latest source code Bug reports Developer information L

Geoscience Australia 147 Dec 14, 2022
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Dec 31, 2022
A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

Cookiecutter Data Science A logical, reasonably standardized, but flexible project structure for doing and sharing data science work. Project homepage

6.4k Jan 02, 2023
CoCalc: Collaborative Calculation in the Cloud

logo CoCalc Collaborative Calculation and Data Science CoCalc is a virtual online workspace for calculations, research, collaboration and authoring do

SageMath, Inc. 1k Dec 29, 2022
Float2Binary - A simple python class which finds the binary representation of a floating-point number.

Float2Binary A simple python class which finds the binary representation of a floating-point number. You can find a class in IEEE754.py file with the

Bora Canbula 3 Dec 14, 2021
Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Aesara

PyMC3 is a Python package for Bayesian statistical modeling and Probabilistic Machine Learning focusing on advanced Markov chain Monte Carlo (MCMC) an

PyMC 7.2k Dec 30, 2022
Statsmodels: statistical modeling and econometrics in Python

About statsmodels statsmodels is a Python package that provides a complement to scipy for statistical computations including descriptive statistics an

statsmodels 8.1k Dec 30, 2022
A framework for feature exploration in Data Science

Beehive A framework for feature exploration in Data Science Background What do we do when we finish one episode of feature exploration in a jupyter no

Steven IJ 1 Jan 03, 2022
Animation engine for explanatory math videos

Manim is an engine for precise programatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This repo

Grant Sanderson 48.9k Jan 03, 2023
Algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos

Bioinformatics This is a repository of all the algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos Algorithm

16 Jun 30, 2022
Open Delmic Microscope Software

Odemis Odemis (Open Delmic Microscope Software) is the open-source microscopy software of Delmic B.V. Odemis is used for controlling microscopes of De

Delmic 32 Dec 14, 2022
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

Kenji Hiranabe 3.2k Jan 08, 2023