Mining-the-Social-Web-3rd-Edition - The official online compendium for Mining the Social Web, 3rd Edition (O'Reilly, 2018)

Overview

Mining the Social Web, 3rd Edition

The official code repository for Mining the Social Web, 3rd Edition (O'Reilly, 2019). The book is available from Amazon and Safari Books Online.

The notebooks folder of this repository contains the latest bug-fixed sample code used in the book chapters.

Quickstart

Binder

The easiest way to start playing with code right away is to use Binder. Binder is a service that takes a GitHub repository containing Jupyter Notebooks and spins up a cloud-based server to run them. You can start experimenting with the code without having to install anything on your machine. Click the badge above, or follow this link to get started right away.

NOTE: Binder will not save your files on its servers. During your next session, it will be a completely fresh instantiation of this repository. If you need a more persistent solution, consider running the code on your own machine.

Getting started on your own machine using Docker

  1. Install Docker
  2. Install repo2docker: pip install jupyter-repo2docker
  3. From the command line:
repo2docker https://github.com/mikhailklassen/Mining-the-Social-Web-3rd-Edition

This will create a Docker container from the repository directly. It takes a while to finish building the container, but once it's done, you will see a URL printed to screen. Copy and paste the URL into your browser.

A longer set of instructions can be found here.

Getting started on your own machine from source

If you are familiar with git and have a git client installed on your machine, simply clone the repository to your own machine. However, it is up to you to install all the dependencies for the repository. The necessary Python libraries are detailed in the requirements.txt file. The other requirements are detailed in the Requirements section below.

If you prefer not to use a git client, you can instead download a zip archive directly from GitHub. The only disadvantage of this approach is that in order to synchronize your copy of the code with any future bug fixes, you will need to download the entire repository again. You are still responsible for installing any dependencies yourself.

Install all the prerequisites using pip:

pip install -r requirements.txt

Once you're done, step into the notebooks directory and launch the Jupyter notebook server:

jupyter notebook

Side note on MongoDB

If you wish to complete all the examples in Chapter 9, you will need to install MongoDB. We do not provide support on how to do this. This is for more advanced users and is really only relevant to a few examples in Chapter 9.

Contributing

There are several ways in which you can contribute to the project. If you discover a bug in any of the code, the first thing to do is to create a new issue under the Issues tab of this repository. If you are a developer and would like to contribute a bug fix, please feel free to fork the repository and submit a pull request.

The code is provided "as-is" and we make no guarantees that it is bug-free. Keep in mind that we access the APIs of various social media platforms and their APIs are subject to change. Since the start of this project, various social media platforms have tightened the permissions on their platform. Getting full use out of all the code in this book may require submitting an application the social media platform of your choice for approval. Despite these restrictions, we hope that the code still provides plenty of flexibility and opportunities to go deeper.

Owner
Mikhail Klassen
Co-Founder and CTO at @PaladinAI. PhD, astrophysics. I specialize in machine learning, AI, data mining, and data visualization.
Mikhail Klassen
Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Codebase for Time-series Generative Adversarial Networks (TimeGAN)

Jinsung Yoon 532 Dec 31, 2022
To propose and implement a multi-class classification approach to disaster assessment from the given data set of post-earthquake satellite imagery.

To propose and implement a multi-class classification approach to disaster assessment from the given data set of post-earthquake satellite imagery.

Kunal Wadhwa 2 Jan 05, 2022
Multiview Dataset Toolkit

Multiview Dataset Toolkit Using multi-view cameras is a natural way to obtain a complete point cloud. However, there is to date only one multi-view 3D

11 Dec 22, 2022
This is the implementation of the paper LiST: Lite Self-training Makes Efficient Few-shot Learners.

LiST (Lite Self-Training) This is the implementation of the paper LiST: Lite Self-training Makes Efficient Few-shot Learners. LiST is short for Lite S

Microsoft 28 Dec 07, 2022
A PyTorch Implementation of Neural IMage Assessment

NIMA: Neural IMage Assessment This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Proc

yunxiaos 418 Dec 29, 2022
Pytorch code for semantic segmentation using ERFNet

ERFNet (PyTorch version) This code is a toolbox that uses PyTorch for training and evaluating the ERFNet architecture for semantic segmentation. For t

Edu 394 Jan 01, 2023
TensorFlow tutorials and best practices.

Effective TensorFlow 2 Table of Contents Part I: TensorFlow 2 Fundamentals TensorFlow 2 Basics Broadcasting the good and the ugly Take advantage of th

Vahid Kazemi 8.7k Dec 31, 2022
MINOS: Multimodal Indoor Simulator

MINOS Simulator MINOS is a simulator designed to support the development of multisensory models for goal-directed navigation in complex indoor environ

194 Dec 27, 2022
LeafSnap replicated using deep neural networks to test accuracy compared to traditional computer vision methods.

Deep-Leafsnap Convolutional Neural Networks have become largely popular in image tasks such as image classification recently largely due to to Krizhev

Sujith Vishwajith 48 Nov 27, 2022
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022
KIND: an Italian Multi-Domain Dataset for Named Entity Recognition

KIND (Kessler Italian Named-entities Dataset) KIND is an Italian dataset for Named-Entity Recognition. It contains more than one million tokens with t

Digital Humanities 5 Jun 21, 2022
A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis

A Shading-Guided Generative Implicit Model for Shape-Accurate 3D-Aware Image Synthesis Figure: Shape-Accurate 3D-Aware Image Synthesis. A Shading-Guid

Xingang Pan 115 Dec 18, 2022
A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation This repository contains the source code of the paper A Differentiable

Bernardo Aceituno 2 May 05, 2022
Streamlit app demonstrating an image browser for the Udacity self-driving-car dataset with realtime object detection using YOLO.

Streamlit Demo: The Udacity Self-driving Car Image Browser This project demonstrates the Udacity self-driving-car dataset and YOLO object detection in

Streamlit 992 Jan 04, 2023
Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

Repository under construction. Example dataset, checkpoints, and training/testing scripts will be avaible soon! 💡 Collated best practices from most p

4 Jun 26, 2022
Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)

Vision-Language Transformer and Query Generation for Referring Segmentation Please consider citing our paper in your publications if the project helps

Henghui Ding 143 Dec 23, 2022
OBG-FCN - implementation of 'Object Boundary Guided Semantic Segmentation'

OBG-FCN This repository is to reproduce the implementation of 'Object Boundary Guided Semantic Segmentation' in http://arxiv.org/abs/1603.09742 Object

Jiu XU 3 Mar 11, 2019
Training DiffWave using variational method from Variational Diffusion Models.

Variational DiffWave Training DiffWave using variational method from Variational Diffusion Models. Quick Start python train_distributed.py discrete_10

Chin-Yun Yu 26 Dec 13, 2022
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit

CNTK Chat Windows build status Linux build status The Microsoft Cognitive Toolkit (https://cntk.ai) is a unified deep learning toolkit that describes

Microsoft 17.3k Dec 29, 2022