MapReader: A computer vision pipeline for the semantic exploration of maps at scale

MapReader is an end-to-end computer vision (CV) pipeline designed by the Living with Machines project. It has two main components: preprocessing/annotation and training/inference.

MapReader provides a set of tools to:

  • load images/maps stored locally or retrieve maps via web-servers (e.g., tileservers which can be used to retrieve maps from OpenStreetMap (OSM), the National Library of Scotland (NLS), or elsewhere). ⚠️ Refer to the credits and re-use terms section if you are using digitized maps or metadata provided by NLS.
  • preprocess images/maps (e.g., divide them into patches, resample the images, remove borders outside the neatline, or reproject the map).
  • annotate images/maps or their patches (i.e. slices of an image/map) using an interactive annotation tool.
  • train, fine-tune, and evaluate various CV models.
  • predict labels (i.e., model inference) on large sets of images/maps.
  • Other functionalities include:
    • various plotting tools using, e.g., matplotlib, cartopy, Google Earth, and kepler.gl.
    • compute mean/standard-deviation pixel intensity of image patches.
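
The patch-level operations in the list above can be illustrated with a minimal, dependency-free sketch. It is not MapReader's own API: the function names `patchify` and `pixel_stats`, and the representation of an image as a list of pixel rows, are assumptions made for illustration only.

```python
from statistics import mean, pstdev


def patchify(image, patch_size):
    """Split a 2D image (a list of pixel rows) into square patches.

    Returns a dict mapping the (top, left) corner of each patch to its rows.
    Patches on the right/bottom edge are smaller if the image size is not a
    multiple of patch_size.
    """
    height, width = len(image), len(image[0])
    patches = {}
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches[(top, left)] = [
                row[left:left + patch_size]
                for row in image[top:top + patch_size]
            ]
    return patches


def pixel_stats(patch):
    """Return (mean, population standard deviation) of a patch's pixels."""
    pixels = [value for row in patch for value in row]
    return mean(pixels), pstdev(pixels)


# A hypothetical 4x4 grayscale image, split into four 2x2 patches.
image = [
    [0, 0, 10, 20],
    [0, 0, 30, 40],
    [5, 5, 255, 255],
    [5, 5, 255, 255],
]
patches = patchify(image, patch_size=2)
stats = {origin: pixel_stats(patch) for origin, patch in patches.items()}
```

Here `stats` holds one (mean, standard deviation) pair per patch, keyed by the patch's top-left corner in pixel coordinates.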

Below is an example of MapReader CV model output (see the paper on MapReader for more details):

British 'railspace' and buildings as predicted by a MapReader computer vision model. ~30.5M patches from ~16K nineteenth-century Ordnance Survey map sheets were used (courtesy of the National Library of Scotland). (a) Predicted railspace; (b) predicted buildings; (c) and (d) predicted railspace (red) and buildings (black) in and around Middlesbrough and London, respectively. MapReader extracts information from large images or a set of images at a patch level, as depicted in the insets. For both railspace and buildings, we removed those patches that had no other neighboring patches with the same label within a distance of 250 meters.
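
The neighbor-based filtering described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the code used for the figure: patch centres are assumed to be given in a projected coordinate system with units of meters, and the name `drop_isolated` is hypothetical.

```python
from math import hypot


def drop_isolated(patches, max_distance_m=250.0):
    """Keep only patches that have a same-label neighbor nearby.

    patches: list of (x_m, y_m, label) tuples, with coordinates in meters.
    A patch is kept if at least one *other* patch with the same label lies
    within max_distance_m of it.
    """
    kept = []
    for i, (x, y, label) in enumerate(patches):
        has_neighbor = any(
            j != i
            and other_label == label
            and hypot(x - other_x, y - other_y) <= max_distance_m
            for j, (other_x, other_y, other_label) in enumerate(patches)
        )
        if has_neighbor:
            kept.append((x, y, label))
    return kept


# Hypothetical patch centres: two nearby railspace patches, one isolated
# railspace patch, and one building patch with no building neighbors.
example = [
    (0.0, 0.0, "railspace"),
    (100.0, 0.0, "railspace"),
    (1000.0, 0.0, "railspace"),
    (1050.0, 0.0, "building"),
]
filtered = drop_isolated(example)
```

This pairwise comparison is quadratic in the number of patches, so it only suits small examples; at the scale mentioned in the caption a spatial index would likely be needed.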

Installation

Set up a conda environment

We strongly recommend installation via Anaconda:

  • Create a new environment for mapreader called mr_py38:
conda create -n mr_py38 python=3.8
  • Activate the environment:
conda activate mr_py38

Method 1

  • Install mapreader:
pip install git+https://github.com/Living-with-machines/MapReader.git
  • To use the environment in Jupyter notebooks, register it as a kernel:
python -m ipykernel install --user --name mr_py38 --display-name "Python (mr_py38)"

Method 2

  • Clone mapreader source code:
git clone https://github.com/Living-with-machines/MapReader.git
  • Install with poetry and activate the resulting environment:
cd /path/to/MapReader
poetry install
poetry shell

How to cite MapReader

Please consider acknowledging MapReader if it helps you to obtain results and figures for publications or presentations, by citing:

Link: https://arxiv.org/abs/2111.15592

Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen and Katherine McDonough (2021), MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale, arXiv:2111.15592.

and in BibTeX:

@misc{hosseini2021mapreader,
      title={MapReader: A Computer Vision Pipeline for the Semantic Exploration of Maps at Scale}, 
      author={Kasra Hosseini and Daniel C. S. Wilson and Kaspar Beelen and Katherine McDonough},
      year={2021},
      eprint={2111.15592},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Credits and re-use terms

Digitized maps

MapReader can retrieve maps from NLS (National Library of Scotland) via webservers. For all the digitized maps (retrieved or locally stored), please note the re-use terms:

⚠️ Use of the digitised maps for commercial purposes is currently restricted by contract. Use of these digitised maps for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and re-use license.

Metadata

We have provided some metadata files in mapreader/persistent_data. For all these files, please note the re-use terms:

⚠️ Use of the metadata for commercial purposes is currently restricted by contract. Use of this metadata for non-commercial purposes is permitted under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC-BY-NC-SA) licence. Please refer to https://maps.nls.uk/copyright.html#exceptions-os for details on copyright and re-use license.

Acknowledgements

This work was supported by Living with Machines (AHRC grant AH/S01179X/1) and The Alan Turing Institute (EPSRC grant EP/N510129/1). Living with Machines, funded by the UK Research and Innovation (UKRI) Strategic Priority Fund, is a multidisciplinary collaboration delivered by the Arts and Humanities Research Council (AHRC), with The Alan Turing Institute, the British Library and the Universities of Cambridge, East Anglia, Exeter, and Queen Mary University of London.

Comments
  • Update README.md

    • [x] TODOs: See https://github.com/Living-with-machines/MapReader/pull/38#issuecomment-1109569025
    • [x] Rename Maps / Non-maps to Geospatial / Non-geospatial.
    • [x] @kasra-hosseini Review the changes, check the links and merge.
    opened by kasra-hosseini 24
  • Testing `MapReader`

    Hi All 👋🏼

    I will be testing the MapReader installation and running the demo notebooks to execute the analysis. I will document my process here.

    • [x] Installation
      • [x] Install git clone git@github.com:Living-with-machines/MapReader.git
      • [x] git branch -> * dev
      • [X] git pull origin dev
      • [X] poetry install
      • [X] poetry shell
        • this command was not included in the README.md (unlike conda activate ...)
    • [x] Notebooks code execution
      • [x] 001_retrieve_patchify_plot.ipynb
      • [x] 002_annotation.ipynb
      • [x] 003_train_classifier.ipynb
      • [x] 004_inference.ipynb
    opened by ChristinaLast 18
  • :bug: some errors in `binder` deployment.

    Tasks

    • [x] Fix 'great_circle' is not defined
    • [x] Fix simplekml needs to be installed to create KML outputs!

    Associated tracebacks

    ---------------------------------------------------------------------------
    NameError                                 Traceback (most recent call last)
    /tmp/ipykernel_60/1620428857.py in <module>
          3 
          4 xmin, xmax, ymin, ymax, myimg_shape, size_in_m = \
    ----> 5         mymaps.calc_pixel_width_height(all_maps[0])
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in calc_pixel_width_height(self, parent_id, calc_size_in_m)
        349 
        350         elif calc_size_in_m in ['gc', 'great-circle']:
    --> 351             bottom = great_circle((ymin, xmin), (ymin, xmax)).meters
        352             right = great_circle((ymin, xmax), (ymax, xmax)).meters
        353             top = great_circle((ymax, xmax), (ymax, xmin)).meters
    
    NameError: name 'great_circle' is not defined
    
    ---------------------------------------------------------------------------
    ModuleNotFoundError                       Traceback (most recent call last)
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in _createKML(self, path2kml, value, coords, counter)
        817         try:
    --> 818             import simplekml
        819         except:
    
    ModuleNotFoundError: No module named 'simplekml'
    
    During handling of the above exception, another exception occurred:
    
    ImportError                               Traceback (most recent call last)
    /tmp/ipykernel_60/28836796.py in <module>
          4             save_kml_dir="./kml_tutorial",
          5             figsize=(20, 20),
    ----> 6             image_width_resolution=600)
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in show(self, image_ids, value, plot_parent, border, border_color, vmin, vmax, colorbar, alpha, discrete_colorbar, tree_level, grid_plot, plot_histogram, save_kml_dir, image_width_resolution, kml_dpi_image, **kwds)
        675                                     value=one_image_id,
        676                                     coords=self.images["parent"][one_image_id]["coord"],
    --> 677                                     counter=-1)
        678                 else:
        679                     plt.title(one_image_id)
    
    /srv/conda/envs/notebook/lib/python3.7/site-packages/mapreader/loader/images.py in _createKML(self, path2kml, value, coords, counter)
        818             import simplekml
        819         except:
    --> 820             raise ImportError("[ERROR] simplekml needs to be installed to create KML outputs!")
        821 
        822         (lon_min, lon_max, lat_min, lat_max) = coords
    
    ImportError: [ERROR] simplekml needs to be installed to create KML outputs!
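
    The first traceback is a missing name rather than a logic error. The call pattern `great_circle(a, b).meters` matches `geopy.distance.great_circle`, so the fix is presumably to import it in `images.py`; that is an inference from the traceback, not something confirmed here. For reference, the quantity being computed can be reproduced without any dependency using the haversine formula. The helper name `great_circle_m` and the Earth-radius constant below are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt

# Mean Earth radius in meters (assumed value for this sketch).
EARTH_RADIUS_M = 6_371_009


def great_circle_m(point_a, point_b):
    """Great-circle distance in meters between two (lat, lon) points.

    Uses the haversine formula on a spherical Earth; inputs are in degrees.
    """
    lat1, lon1 = map(radians, point_a)
    lat2, lon2 = map(radians, point_b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(h))


# Same argument order as in the traceback: (ymin, xmin) is (lat, lon).
xmin, xmax, ymin, ymax = 0.0, 1.0, 0.0, 1.0  # hypothetical bounding box
bottom = great_circle_m((ymin, xmin), (ymin, xmax))
right = great_circle_m((ymin, xmax), (ymax, xmax))
```

    With this bounding box, `bottom` and `right` each span one degree, i.e. roughly 111 km.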
    
    opened by ChristinaLast 6
  • 🐛 `LoadAnnotations` not returning annotation interface

    When using a local notebook to run through the annotation section of the quick_start notebook, I am unable to see the LoadAnnotations object returned in order to generate new labels! See screenshot below:

    Screenshot 2022-05-09 at 13 58 51

    opened by ChristinaLast 3
  • d actual edits to first para

    I've restored the order to 'maps' -> 'images' so we get a clearer narrative, as in the existing repo. I also shortened and combined a sentence that was repeating 'non-maps' and 'maps', using 'any images' instead to make it more intuitive to read.

    I was also going to add a few sentences giving the nice positive spin about interdisciplinary cross-pollination of image analysis, but not sure where this should go: I don't want to break the flow to the instructions, so perhaps it can go after the bullet points?

    opened by dcsw2 2
  • Deploying `MapReader` through `binder`

    • [x] @ChristinaLast and @andrewphilipsmith to walk through binder deployment
      • [x] adding requirements.txt with no hashed libraries for binderhub deployment
    opened by ChristinaLast 2
  • Model inference in one step

    Summary

    Currently, we first need to patchify an image and then do the model inference (in two separate steps). In this issue, we plan to have a method that does both steps, i.e.,

    # example interface
    my_classifier.inference(path2image, **kwds for the slice method, including patch size, ...)
    my_classifier.plot()
    

    TODO

    • Refer to https://github.com/alan-turing-institute/mapreader-plant-scivision. Here, we have a function/method called "predict" that does model inference on an image. Under the hood, it slices an image into patches, runs model inference on the patches, plots the results, and returns the predicted labels.
    • It would be interesting to have a similar function/method in MapReader.
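
    A minimal sketch of what such a one-step method could look like is given below. It is not the implementation from either repository: the name `predict` is borrowed from the description above, while `classify_patch`, the list-of-rows image format, and the toy classifier are assumptions standing in for a trained model. Plotting is left out.

```python
def predict(image, classify_patch, patch_size):
    """Slice an image into patches and classify each one in a single call.

    image: 2D list of pixel rows.
    classify_patch: callable taking one patch (list of rows), returning a label.
    Returns a dict mapping each patch's (top, left) corner to its label.
    """
    predictions = {}
    for top in range(0, len(image), patch_size):
        for left in range(0, len(image[0]), patch_size):
            patch = [
                row[left:left + patch_size]
                for row in image[top:top + patch_size]
            ]
            predictions[(top, left)] = classify_patch(patch)
    return predictions


def bright_or_dark(patch):
    """Toy stand-in for a trained model: label a patch by its mean intensity."""
    pixels = [value for row in patch for value in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


# Hypothetical 2x4 image: a dark left half and a bright right half.
image = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]
labels = predict(image, bright_or_dark, patch_size=2)
```

    Passing the classifier in as a callable keeps the slicing step independent of the model, which is one possible way to let the same entry point serve different trained classifiers.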
    opened by kasra-hosseini 2
  • Dev

    Creating requirements.txt from pyproject.toml to generate package list needed for binderhub build

    Commands run:

    • to generate requirements.txt
    poetry export -f requirements.txt --output requirements.txt --without-hashes
    

    After doing this, I have to add the GitHub repo manually to the requirements.txt file so that MapReader itself is installed, for example:

    git+https://github.com/Living-with-machines/[email protected]#egg=mapreader
    
    opened by ChristinaLast 1
  • Bump ipython from 8.0.0 to 8.0.1

    Bumps ipython from 8.0.0 to 8.0.1.

    dependencies 
    opened by dependabot[bot] 1
  • Add plant phenotyping example notebooks and data

    Add a directory with cleaned and updated notebooks demonstrating classification of plant patches in images. It also includes examples of open access data that can be used to run these notebooks, and annotation files to facilitate annotating plant vs. non-plant patches.

    opened by evangeline-corcoran 1
  • Bump pillow from 8.4.0 to 9.0.0

    Bumps pillow from 8.4.0 to 9.0.0.

    Release notes

    Sourced from pillow's releases.

    9.0.0

    https://pillow.readthedocs.io/en/stable/releasenotes/9.0.0.html

    Changes

    ... (truncated)

    Changelog

    Sourced from pillow's changelog.

    9.0.0 (2022-01-02)

    • Restrict builtins for ImageMath.eval(). CVE-2022-22817 #5923 [radarhere]

    • Ensure JpegImagePlugin stops at the end of a truncated file #5921 [radarhere]

    • Fixed ImagePath.Path array handling. CVE-2022-22815, CVE-2022-22816 #5920 [radarhere]

    • Remove consecutive duplicate tiles that only differ by their offset #5919 [radarhere]

    • Improved I;16 operations on big endian #5901 [radarhere]

    • Limit quantized palette to number of colors #5879 [radarhere]

    • Fixed palette index for zeroed color in FASTOCTREE quantize #5869 [radarhere]

    • When saving RGBA to GIF, make use of first transparent palette entry #5859 [radarhere]

    • Pass SAMPLEFORMAT to libtiff #5848 [radarhere]

    • Added rounding when converting P and PA #5824 [radarhere]

    • Improved putdata() documentation and data handling #5910 [radarhere]

    • Exclude carriage return in PDF regex to help prevent ReDoS #5912 [hugovk]

    • Fixed freeing pointer in ImageDraw.Outline.transform #5909 [radarhere]

    • Added ImageShow support for xdg-open #5897 [m-shinder, radarhere]

    • Support 16-bit grayscale ImageQt conversion #5856 [cmbruns, radarhere]

    • Convert subsequent GIF frames to RGB or RGBA #5857 [radarhere]

    ... (truncated)

    dependencies 
    opened by dependabot[bot] 1
  • Satellite images (some references)

    • I just had a talk with one of the REG members on https://github.com/urbangrammarai and they are using this tool to download satellite images: https://github.com/urbangrammarai/gee_pipeline/.
    • The other option is : https://planetarycomputer.microsoft.com/
    opened by kasra-hosseini 0
  • Add `min_std_pixel` and `max_std_pixel` to `prepare_annotation`

    This would let us filter out black patches more easily. We have trained some MapReader models using ~6K annotated patches (the plant phenotyping project), and now we need to extend the dataset, particularly for non-black patches.
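
    The requested filter could behave roughly as in the sketch below. The parameter names `min_std_pixel` and `max_std_pixel` come from the issue title; the function name `filter_by_std`, the flat list-of-pixels patch format, and the inclusive bounds are assumptions, not the behaviour of `prepare_annotation`.

```python
from statistics import pstdev


def filter_by_std(patches, min_std_pixel=None, max_std_pixel=None):
    """Keep patches whose pixel standard deviation lies within the bounds.

    patches: dict mapping a patch id to a flat list of pixel intensities.
    A bound set to None is not applied; bounds are inclusive.
    """
    kept = {}
    for patch_id, pixels in patches.items():
        std = pstdev(pixels)
        if min_std_pixel is not None and std < min_std_pixel:
            continue
        if max_std_pixel is not None and std > max_std_pixel:
            continue
        kept[patch_id] = pixels
    return kept


# A uniformly black patch has zero standard deviation, so any positive
# lower bound removes it while keeping patches with visible content.
patches = {
    "black": [0, 0, 0, 0],
    "textured": [0, 50, 100, 150],
}
non_black = filter_by_std(patches, min_std_pixel=1.0)
```

    Filtering on standard deviation rather than mean intensity also removes uniformly bright patches, which may or may not be desirable depending on the dataset.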

    enhancement 
    opened by kasra-hosseini 1
  • Choose a tool to simplify diffs on .ipynb files.

    Consider

    • https://www.reviewnb.com/
    • https://jupyter.org/enhancement-proposals/08-notebook-diff/notebook-diff.html
    • https://blog.ouseful.info/2017/01/27/displaying-differences-in-jupyter-notebooks-nbdime-nbdiff/

    and others

    Build into workflow using pre-commit/CI as appropriate.

    opened by andrewphilipsmith 1
  • Create CODE_OF_CONDUCT.md

    @DavidBeavan Could you please review this PR? I am using "Contributor Covenant" of GitHub with the following edit:

    Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at https://livingwithmachines.ac.uk/contact-us/. All complaints will be reviewed and investigated promptly and fairly.

    opened by kasra-hosseini 0
  • Adding a notebook containing start of implementation for maps

    This PR aims to implement the requirements of this issue https://github.com/Living-with-machines/MapReader/issues/36

    For details: See https://hackmd.io/bL3y2cWdT-y3qPGkyVzD5Q?both

    Tasks:

    • [ ] Get or create annotations for example map data
    • [ ] Complete text in HackMD above and transfer it into an appropriate place within the repo. (readme.md or quick_start.ipynb etc)
    • [ ] Resolve all of the questions in the HackMD (whether adding more detail or explicitly deciding to exclude from a quick start guide).
    • [ ] Give the quick_start.ipynb (maps) and quick_start.ipynb (plants) distinct names.
    • [ ] Complete the quick_start.ipynb (maps) to, at least, the same level of detail as the quick_start.ipynb (plants).
    opened by andrewphilipsmith 1
Releases(v0.3.3)
Owner
Living with Machines
A radical collaboration between computational linguists, curators, data scientists, software engineers, geographers and historians