Exploring dimension-reduced embeddings

Overview

sleepwalk

This is the code repository for the sleepwalk R package. See the Sleepwalk web page for more information.

License and disclaimer

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Comments
  • Error running sleepwalk: cannot open the connection

    Dear sleepwalk developers, thanks a lot for providing such a nice method. I could install the package, but I get the following error when I try to run it:

    > sleepwalk([email protected][email protected], [email protected][email protected])
    Estimating 'maxdist' for feature matrix 1
    Server has been stopped.
    Server has been stopped.
    Error in app$openPage(useViewer, browser) : 
      Timeout waiting for websocket.
    In addition: Warning messages:
    1: In file(con, "r") :
      cannot open file 'sleepwalk_canvas.html': No such file or directory
    2: In func(req) : File '/favicon.ico' is not found
    

    I know this is probably not a sleepwalk-specific error, but I couldn't find a solution for it. Any hints or help on how to fix this issue?

    Also, I have a question about the output. Besides using the interactive mode to manually inspect cells that might be "misplaced" in the reduced-dimension space, I would like to systematically find the cells that don't quite fit the clusters they were originally assigned to. In other words, how would you suggest using sleepwalk to refine my clustering? I suspect that many of my cells were wrongly assigned to their clusters. I am using the Seurat package for dimension reduction and clustering.

    Thank you very much, Gustavo

    opened by gufranca 2
  • Error: 'browser' must be a non-empty character string

    Hello,

    After calling the sleepwalk function on a Seurat object, I got this error:

    > sleepwalk( as.matrix([email protected][email protected]), as.matrix([email protected][email protected]) )
    
    Estimating 'maxdist' for feature matrix 1
    Error in browseURL(str_c("http://localhost:", port, "/", pageobj$startPage),  :
      'browser' must be a non-empty character string
    

    I have loaded the stringr library (which contains the function str_c()), and I cannot find the file where this error originates. Has anyone had this problem before?

    Thank you

    opened by PedroRaposo 2
  • slw_on_selection error when sleepwalk is not attached

    Running sleepwalk without attaching the package (i.e., NOT specifying library(sleepwalk)) like this works fine:

    sleepwalk::sleepwalk(se[email protected][email protected], t([email protected][[email protected],]))

    But the moment you select cells with your mouse, it crashes (the browser tab closes) and R gives this error:

    Error in slw_on_selection(selPoints, 1) : could not find function "slw_on_selection"

    Loading the package using library(sleepwalk) solves the issue, but it'd be nice if it weren't necessary.

    opened by FelixTheStudent 0
  • doc for comparison

    The example on the web page for comparing two embeddings still uses the old version, where both distances are used concurrently. We also need to change the explanation below it to say that the same cell always has the same colour in all embeddings.

    opened by simon-anders 0
  • Suggestion: Link embeddings from transposed table

    Let's say I have a matrix with individuals (e.g. cells) as rows and features as columns, and I run UMAP on both the original matrix and its transpose. It would then be natural to look at the UMAP of individuals with the default usage (the distances to other individuals), but it would also be interesting to see the features for the selected individual (and vice versa).

    Is it clear what I mean?

    opened by StaffanBetner 2
Releases(v0.3.2)
  • v0.3.2(Sep 17, 2021)

    • As of v0.5.0, jrc uses the setLimits function for all security restrictions. This update fixes the dependency problem caused by that change.
  • v0.3.1(Sep 30, 2020)

  • v.0.3.0(Feb 27, 2020)

    • The new argument metric allows using angular distance (metric = "cosine") as an alternative to the default Euclidean distance (metric = "euclid").

    • If compare = "distances", it is no longer required to provide several embeddings. If only one embedding is given, it will be used for all the distances.
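
As a rough illustration of how the two metrics differ, the sketch below computes both distances in base R on a toy matrix. It assumes that cosine distance means 1 minus cosine similarity, which is one common definition; the exact formula sleepwalk uses internally is not stated in this release note. The guarded call at the end uses the iris data and a PCA embedding purely as placeholders.

```r
# Toy feature matrix: 3 objects (rows) x 2 features (columns).
feat <- rbind(a = c(1, 0), b = c(2, 0), c = c(0, 3))

# Euclidean distance between rows.
euclid <- as.matrix(dist(feat))

# Cosine distance, assumed here to be 1 - cosine similarity.
unit <- feat / sqrt(rowSums(feat^2))  # scale every row to unit length
cosine <- 1 - unit %*% t(unit)

euclid["a", "b"]  # 1: a and b differ only in length
cosine["a", "b"]  # 0: a and b point in the same direction
cosine["a", "c"]  # 1: a and c are orthogonal

# Placeholder data: iris measurements with a 2D PCA embedding.
if (interactive() && requireNamespace("sleepwalk", quietly = TRUE)) {
  library(sleepwalk)
  iris_feat <- as.matrix(iris[, 1:4])
  iris_emb <- prcomp(iris_feat)$x[, 1:2]
  sleepwalk(iris_emb, iris_feat, metric = "cosine")
}
```

With the Euclidean metric, objects a and b are one unit apart; with the cosine metric they coincide, because only the direction of the feature vector matters.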

  • v0.2.1(Oct 2, 2019)

    • Changes due to an update of the jrc package.

    • Indices of selected points are no longer stored in a variable and can be accessed only via the callback function. Thus, no changes are made to the global environment unless the user makes them explicitly.

    • Added the possibility to pass arguments to jrc::openPage (such as the port number or the browser in which to open the app).
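
The callback and the pass-through arguments might be used together roughly as follows. This is a sketch, not the package's documented usage: the argument names on_selection, port and browser, and the callback signature (selected indices first, embedding index second), are assumptions based on the wording of these notes and on the slw_on_selection(selPoints, 1) call that appears in an error message. The helper remember_selection, the port number and the browser name are hypothetical.

```r
# Keep selections in a dedicated environment instead of the global one.
selection <- new.env()

# Hypothetical callback: assumed to receive the indices of the selected
# points and the index of the embedding they were selected in.
remember_selection <- function(points, emb) {
  selection$points <- points
  selection$emb <- emb
  invisible(NULL)
}

# The callback can be exercised without opening the app.
remember_selection(c(3L, 7L), 1L)
selection$points  # 3 7

if (interactive() && requireNamespace("sleepwalk", quietly = TRUE)) {
  library(sleepwalk)
  feat <- as.matrix(iris[, 1:4])  # placeholder feature matrix
  emb <- prcomp(feat)$x[, 1:2]    # placeholder 2D embedding
  # `port` and `browser` are assumed to be forwarded to jrc::openPage.
  sleepwalk(emb, feat, on_selection = remember_selection,
            port = 8080, browser = "firefox")
}
```

Storing the indices in a separate environment keeps the global environment untouched, in the spirit of the change described above.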

  • v0.2.0(Sep 27, 2019)

    • The embedding is now plotted with HTML Canvas. This makes Sleepwalk faster and allows more points to be displayed simultaneously.

    • A new parameter, mode = c("canvas", "svg"), allows the user to go back to the old SVG-based version of the Sleepwalk app.

    • Fixed a bug in slw_snapshot. The function no longer returns a list of identical plots when used with several different embeddings.

Owner
S. Anders's research group at ZMBH