Exploring dimension-reduced embeddings

Last update: Nov 29, 2022

Overview

sleepwalk

Exploring dimension-reduced embeddings

This is the code repository. The Sleepwalk web page is at https://anders-biostat.github.io/sleepwalk/.

License and disclaimer

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

Comments

Error running sleepwalk: cannot open the connection
Dear sleepwalk developers, thanks a lot for providing such a nice method. I could install the package, but I get the following error when I try to run it:

> sleepwalk([email protected][email protected], [email protected][email protected])
Estimating 'maxdist' for feature matrix 1
Server has been stopped.
Server has been stopped.
Error in app$openPage(useViewer, browser) : Timeout waiting for websocket.
In addition: Warning messages:
1: In file(con, "r") : cannot open file 'sleepwalk_canvas.html': No such file or directory
2: In func(req) : File '/favicon.ico' is not found

I know this is probably not a sleepwalk-specific error, but I couldn't find a solution for it. Any hints on how to fix this issue?

Also, I have a question about the output. Besides using the interactive mode to manually inspect cells that might be "misplaced" in the reduced-dimension space, I would like to systematically find the cells that don't quite fit the clusters they were originally assigned to. In other words, how would you suggest using sleepwalk to refine my clustering? I suspect that many of my cells were wrongly assigned to their clusters. I am using the Seurat package for dimension reduction and clustering.

Thank you very much, Gustavo
opened by gufranca 2
Error: 'browser' must be a non-empty character string
Hello,

After calling the sleepwalk function on a Seurat object, I got this error:

> sleepwalk( as.matrix([email protected][email protected]), as.matrix([email protected][email protected]) )
Estimating 'maxdist' for feature matrix 1
Error in browseURL(str_c("http://localhost:", port, "/", pageobj$startPage), : 'browser' must be a non-empty character string

I have loaded the stringr library (which contains the function str_c()), and I cannot find the file this error originates from. Has anyone else run into this problem?

Thank you
opened by PedroRaposo 2
slw_on_selection error when sleepwalk is not attached

Running sleepwalk without attaching the package (i.e., NOT specifying library(sleepwalk)) like this works fine:

sleepwalk::sleepwalk(se[email protected][email protected], t([email protected][[email protected],]))

But the moment you select cells with your mouse, it crashes (the browser tab closes) and R gives this error:

Error in slw_on_selection(selPoints, 1) : could not find function "slw_on_selection"

Loading the package using library(sleepwalk) solves the issue, but it'd be nice if it weren't necessary.

opened by FelixTheStudent 0
doc for comparison

The example on the web page for comparing two embeddings still uses the old version, in which both distances are used concurrently. We also need to change the explanation below it to say that the same cell always has the same colour in all embeddings.

opened by simon-anders 0
Suggestion: Link embeddings from transposed table

Let's say I have a matrix with individuals (e.g. cells) as rows and features as columns, and I run a UMAP on both the ordinary matrix and the transposed one. It would then be natural to look at the UMAP of individuals with the default usage (showing distances to other individuals), but it would also be interesting to see the features for the selected individual highlighted in the feature UMAP (and vice versa).

Is it clear what I mean?

opened by StaffanBetner 2

Releases (v0.3.2)

v0.3.2 (Sep 17, 2021)
As of v0.5.0, jrc uses the setLimits function for all security restrictions. This update fixes the dependency problem caused by that change.

v0.3.1 (Sep 30, 2020)
Fixed the broken path to the start page that was caused by a jrc update.

v0.3.0 (Feb 27, 2020)
The new argument metric makes it possible to use angular distance (metric = "cosine") as an alternative to the default Euclidean distance (metric = "euclid").
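
The difference between the two metrics can be illustrated in base R. The following is only a sketch of the two distance notions on a toy matrix; it makes no claim about how sleepwalk computes them internally, and `feat` and the other variable names are made up for the illustration.

```r
# Illustrative only: Euclidean versus angular (cosine-based) distance.
# Toy feature matrix, one row per object (hypothetical data).
feat <- rbind(
  a = c(1, 0),
  b = c(2, 0),  # same direction as 'a', twice the length
  c = c(0, 1)   # orthogonal to 'a'
)

# Euclidean distance between all pairs of rows.
euclid_dist <- as.matrix(dist(feat, method = "euclidean"))

# Cosine similarity: scale each row to unit length, then take dot products.
unit <- feat / sqrt(rowSums(feat^2))
cos_sim <- unit %*% t(unit)
cos_sim <- pmin(pmax(cos_sim, -1), 1)  # guard against rounding outside [-1, 1]

# Angular distance in radians.
angular_dist <- acos(cos_sim)

print(round(euclid_dist, 3))
print(round(angular_dist, 3))
```

Rows `a` and `b` differ only in length, so they are one unit apart in Euclidean terms but have an angular distance of zero. This is the kind of situation in which metric = "cosine" may be preferable, for example when the overall magnitude of a feature vector is not of interest.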

If compare = "distances", it is no longer required to provide several embeddings. If only one embedding is given, it will be used for all the distances.

v0.2.1 (Oct 2, 2019)
Changes due to an update of the jrc package.

Indices of selected points are no longer stored in a variable and can be accessed only via the callback function. Thus, no changes are made to the global environment unless the user makes them explicitly.
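
A callback of this kind might look like the sketch below. It assumes that the callback is passed through an `on_selection` argument and receives two arguments, the indices of the selected points and the index of the embedding they were selected in (the error message `slw_on_selection(selPoints, 1)` quoted in the comments above suggests this shape, but check the package documentation). The names `selection_store` and `record_selection` are hypothetical.

```r
# Keep selections in a dedicated environment instead of the global one.
selection_store <- new.env()
selection_store$history <- list()

# Hypothetical callback: 'points' holds the indices of the selected points,
# 'emb' the index of the embedding in which the selection was made.
record_selection <- function(points, emb) {
  entry <- list(embedding = emb, points = points)
  n <- length(selection_store$history)
  selection_store$history[[n + 1]] <- entry  # environments are modified by reference
  invisible(entry)
}

# Simulate one selection by calling the callback directly (no browser needed).
record_selection(c(3, 17, 42), 1)
print(selection_store$history[[1]]$points)

# With the package attached, the callback would be handed over roughly like
# this (not run; 'emb' and 'feat' are placeholder matrices):
# library(sleepwalk)
# sleepwalk(emb, feat, on_selection = record_selection)
```

Storing the selections in a separate environment keeps the global environment untouched while still making the indices available after the app has been closed.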

Added the possibility to pass arguments to jrc::openPage (such as the port number or the browser in which to open the app).

v0.2.0 (Sep 27, 2019)
HTML Canvas is now used to plot the embedding. This makes Sleepwalk faster and allows more points to be displayed simultaneously.

A new parameter mode = c("canvas", "svg") has been added, which lets the user go back to the old SVG-based version of the Sleepwalk app.

A bug in slw_snapshot has been fixed. The function no longer returns a list of identical plots when used with several different embeddings.


Owner

S. Anders's research group at ZMBH

GitHub Repository https://anders-biostat.github.io/sleepwalk/