Mining-the-Social-Web-3rd-Edition - The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)

Last update: Jan 01, 2023

Overview

Mining the Social Web, 3rd Edition

The official code repository for Mining the Social Web, 3rd Edition (O'Reilly, 2019). The book is available from Amazon and Safari Books Online.

The notebooks folder of this repository contains the latest bug-fixed sample code used in the book chapters.

Quickstart

The easiest way to start playing with code right away is to use Binder. Binder is a service that takes a GitHub repository containing Jupyter Notebooks and spins up a cloud-based server to run them. You can start experimenting with the code without having to install anything on your machine. Click the badge above, or follow this link to get started right away.

NOTE: Binder will not save your files on its servers. During your next session, it will be a completely fresh instantiation of this repository. If you need a more persistent solution, consider running the code on your own machine.

Getting started on your own machine using Docker

Install Docker
Install repo2docker: pip install jupyter-repo2docker
From the command line:

repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

This will create a Docker container from the repository directly. It takes a while to finish building the container, but once it's done, you will see a URL printed to screen. Copy and paste the URL into your browser.

A longer set of instructions can be found here.

Getting started on your own machine from source

If you are familiar with git and have a git client installed on your machine, simply clone the repository to your own machine. However, it is up to you to install all the dependencies for the repository. The necessary Python libraries are detailed in the requirements.txt file. The other requirements are detailed in the Requirements section below.

If you prefer not to use a git client, you can instead download a zip archive directly from GitHub. The only disadvantage of this approach is that in order to synchronize your copy of the code with any future bug fixes, you will need to download the entire repository again. You are still responsible for installing any dependencies yourself.

Install all the prerequisites using pip:

pip install -r requirements.txt

Once you're done, step into the notebooks directory and launch the Jupyter notebook server:

jupyter notebook

Side note on MongoDB

If you wish to complete all the examples in Chapter 9, you will need to install MongoDB. We do not provide support on how to do this. This is for more advanced users and is really only relevant to a few examples in Chapter 9.

Contributing

There are several ways in which you can contribute to the project. If you discover a bug in any of the code, the first thing to do is to create a new issue under the Issues tab of this repository. If you are a developer and would like to contribute a bug fix, please feel free to fork the repository and submit a pull request.

The code is provided "as-is" and we make no guarantees that it is bug-free. Keep in mind that we access the APIs of various social media platforms and their APIs are subject to change. Since the start of this project, various social media platforms have tightened the permissions on their platform. Getting full use out of all the code in this book may require submitting an application the social media platform of your choice for approval. Despite these restrictions, we hope that the code still provides plenty of flexibility and opportunities to go deeper.

Mining-the-Social-Web-3rd-Edition - The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)

Related tags

Overview

Mining the Social Web, 3rd Edition

Quickstart

Getting started on your own machine using Docker

Getting started on your own machine from source

Side note on MongoDB

Contributing

Owner

Mikhail Klassen

Codebase for Time-series Generative Adversarial Networks (TimeGAN)

To propose and implement a multi-class classification approach to disaster assessment from the given data set of post-earthquake satellite imagery.

Multiview Dataset Toolkit

This is the implementation of the paper LiST: Lite Self-training Makes Efficient Few-shot Learners.

A PyTorch Implementation of Neural IMage Assessment

Pytorch code for semantic segmentation using ERFNet

TensorFlow tutorials and best practices.

MINOS: Multimodal Indoor Simulator

LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

KIND: an Italian Multi-Domain Dataset for Named Entity Recognition

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

Streamlit app demonstrating an image browser for the Udacity self-driving-car dataset with realtime object detection using YOLO.

Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)

OBG-FCN - implementation of 'Object Boundary Guided Semantic Segmentation'

Training DiffWave using variational method from Variational Diffusion Models.

Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit