pubmex.py - a script to get a fancy paper title based on a given DOI or PMID

Overview

Pu(b)mex

pubmex.py is a script to get a fancy paper title based on a given DOI or PMID (it can also be combined with macOS Finder).

Format of the title:

first author . last author - (title ("dotted") or your custom title) . PMID . journal . year . pdf
e.g.
  Kelley.Scott.The.evolution.biology.shift.towards.engineering.prediction-generating.tools.away.traditional.research.practice.EMBORep.2008.pdf
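
To make the format concrete, here is a minimal sketch of how such a name could be assembled from plain metadata strings. It is not the code pubmex.py uses: `fancy_name` and `SKIP_WORDS` are hypothetical names, the list of dropped words is a guess, and the PMID part is left out because the examples in this README do not show it.

```python
# Hypothetical list of short words to drop; pubmex.py may drop a different set.
SKIP_WORDS = {"a", "an", "and", "of", "for", "in", "on", "to", "from", "by", "with"}


def surname(author):
    """Return the surname from an author string such as 'Miska EA'."""
    return author.split()[0]


def fancy_name(first_author, last_author, title, journal, year):
    """Build 'First.Last.Dotted.Title.Journal.Year.pdf' from plain strings."""
    # Drop the trailing full stop, then remove the short skip words.
    words = [w for w in title.rstrip(".").split() if w.lower() not in SKIP_WORDS]
    # Join with dots; a dangling hyphen ('Short-' + 'Long-Range') is glued back together.
    dotted_title = ".".join(words).replace("-.", "-")
    journal_tag = journal.replace(" ", "")
    parts = [surname(first_author), surname(last_author), dotted_title, journal_tag, str(year)]
    return ".".join(parts) + ".pdf"


print(fancy_name("Ziv O", "Miska EA",
                 "The Short- and Long-Range RNA-RNA Interactome of SARS-CoV-2.",
                 "Mol Cell", 2020))
# prints Ziv.Miska.The.Short-Long-Range.RNA-RNA.Interactome.SARS-CoV-2.MolCell.2020.pdf
```

Titles with other punctuation (colons, quotes) would need extra handling that this sketch does not attempt.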

Nowadays, with Mendeley and other tools, this is not a big issue, however...

I don’t want to put every PDF file I collect along the way into my library, because then it gets super big (and then it’s hard to sync, for example with Dropbox). So now I can keep these PDF files in a pdf-icebox and rename them nicely and automatically:

$ ls
Hnisz.Sharp.Phase.Separation.Model.Transcriptional.Control.Cell.2017.pdf
Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf

Usage:

$ pubmex.py sharp2017.pdf
Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
mv sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf

$ pubmex.py Query.Konarska.pdf
mv Query.Konarska.pdf --> ./Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
$ pubmex.py eabc9191.full.pdf
mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf

DEPENDENCIES

INSTALLATION

pip install pubmex
# Ubuntu (Debian-based system)
apt-get install xclip python3-biopython poppler-utils # poppler-utils provides pdftotext
# macOS
brew install poppler biopython # or "sudo port install poppler biopython"
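
After installation, a quick check like the sketch below can show whether the Python interpreter you are running can see Biopython and whether `pdftotext` is on the PATH. The function name `check_pubmex_dependencies` is made up for this example and is not part of pubmex. Running it with the same interpreter that runs pubmex.py is the useful part, because several Python installations on one machine can each have a different set of packages.

```python
import importlib.util
import shutil
import sys


def check_pubmex_dependencies():
    """Report the interpreter in use and whether the tools named above are visible to it."""
    return {
        # Path of the Python interpreter running this check.
        "python": sys.executable,
        # Biopython is imported as 'Bio'.
        "biopython": importlib.util.find_spec("Bio") is not None,
        # pdftotext comes with poppler.
        "pdftotext": shutil.which("pdftotext") is not None,
    }


if __name__ == "__main__":
    for name, status in check_pubmex_dependencies().items():
        print(f"{name}: {status}")
```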

HISTORY

  • 1.4 Add osx-automator
  • 1.3 Fixed #4 #5
  • 1.2 Fixed #2
  • 1.1 Simplify input, pubmex.py *.pdf
  • 1.0 With recent bugfixes 2021
  • 0.3 OSX installation
  • 0.2 Small changes
  • 0.1 Init version in 2010! :-)
Comments
  • Automator not working

    It seems that when using the Automator workflows that come with pubmex, pubmex.py cannot be found.

        for f in "$@"
        do
            pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “zsh:3: command not found: pubmex.py”

    When specifying the direct location of just the pubmex.py file, another error occurs.

        for f in "$@"
        do
            /users/suntim/miniforge3/bin/pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “”

    When specifying the direct location of python and the pubmex.py file, another error occurs.

        for f in "$@"
        do
            /usr/local/bin/python3 /users/suntim/miniforge3/bin/pubmex.py $f
        done
    

    The following error is displayed:

    The action “Run Shell Script” encountered an error: “Traceback (most recent call last): File "/users/suntim/miniforge3/bin/pubmex.py", line 27, in <module> from Bio import Entrez ModuleNotFoundError: No module named 'Bio'”

    I have all dependencies installed: pip3 install pubmex, pip3 install biopython, brew install poppler. The readme.md says that biopython should be installed via brew; I assume that was a mistake, so I installed it via pip3 instead.

    The same error messages occur regardless of whether the zsh or bash version is used.

    opened by LinusKaiser 2
  • Not found in PubMed, although DOI (.ORG/10.1016/J.BBAGRM.2015.08.009) was detected

        ~/Desktop/pdfs$ pubmex.py -a -r -f 1-s2.0-S1874939915001868-main.pdf
        ERROR: Not found in PubMed, although DOI (.ORG/10.1016/J.BBAGRM.2015.08.009) was detected in the pdf!
        Traceback (most recent call last):
          File "/home/magnus/bin/pubmex.py", line 472, in <module>
            main()
          File "/home/magnus/bin/pubmex.py", line 451, in main
            title = get_title_auto_from_text(text, OPTIONS.debug, False, OPTIONS.keywords)
          File "/home/magnus/bin/pubmex.py", line 239, in get_title_auto_from_text
            return get_title_via_doi(doi, debug, reference, customed_title)
          File "/home/magnus/bin/pubmex.py", line 359, in get_title_via_doi
            pmid = get_pmid_via_doi_net(doi)
          File "/home/magnus/bin/pubmex.py", line 333, in get_pmid_via_doi_net
            return get_value('citation_pmid', content)
        TypeError: get_value() takes exactly 3 arguments (2 given)

    opened by mmagnus 2
  • Invalid git clone (edit: on windows machines)

    The colon in 'demo/10.1261:rna.418407.pdf' causes problems when cloning on Windows machines.

        Cloning into 'pubmex'...
        remote: Enumerating objects: 426, done.
        remote: Counting objects: 100% (9/9), done.
        remote: Total 426 (delta 8), reused 8 (delta 8), pack-reused 417
        Receiving objects: 100% (426/426), 3.79 MiB | 2.86 MiB/s, done.
        Resolving deltas: 100% (252/252), done.
        error: invalid path 'demo/10.1261:rna.418407.pdf'
        fatal: unable to checkout working tree
        warning: Clone succeeded, but checkout failed.
        You can inspect what was checked out with 'git status'
        and retry with 'git restore --source=HEAD :/'

    opened by gcasale 1
  • ct200162x.pdf

    (py37) [mx] rna$ pubmex.py ct200162x.pdf --debug
    filename: .......... ct200162x.pdf
    filename: .......... ct200162x.pdf
    doi: ............... ct200162x
    IdList.............. []
    pmid: .............. False
    ERROR: 		Not found in PubMed, although DOI (ct200162x) was detected in the pdf!
    generate ./temp.....[OK]
    out:
    err:
    temp is going to be opened
    doi_line: .......... DX.DOI.ORG/10.1021/CT200162X | J. CHEM. THEORY COMPUT. 2011, 7, 28862902
    doi is found: ...... 10.1021/CT200162X
    doi: ............... 10.1021/CT200162X
    IdList.............. ['21921995']
    pmid: .............. 21921995
    summary_dict........ {'Item': [], 'Id': '21921995', 'PubDate': '2011 Sep 13', 'EPubDate': '2011 Aug 2', 'Source': 'J Chem Theory Comput', 'AuthorList': ['Zgarbová M', 'Otyepka M', 'Sponer J', 'Mládek A', 'Banáš P', 'Cheatham TE 3rd', 'Jurečka P'], 'LastAuthor': 'Jurečka P', 'Title': 'Refinement of the Cornell et al. Nucleic Acids Force Field Based on Reference Quantum Chemical Calculations of Glycosidic Torsion Profiles.', 'Volume': '7', 'Issue': '9', 'Pages': '2886-2902', 'LangList': ['English'], 'NlmUniqueID': '101232704', 'ISSN': '1549-9618', 'ESSN': '1549-9626', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed', 'PubStatus': 'ppublish+epublish', 'ArticleIds': {'pubmed': ['21921995'], 'medline': [], 'doi': '10.1021/ct200162x', 'pmc': 'PMC3171997', 'rid': '21921995', 'eid': '21921995', 'pmcid': 'pmc-id: PMC3171997;'}, 'DOI': '10.1021/ct200162x', 'History': {'pubmed': ['2011/09/17 06:00'], 'medline': ['2011/09/17 06:01'], 'received': '2011/03/08 00:00', 'entrez': '2011/09/17 06:00'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(242, attributes={}), 'FullJournalName': 'Journal of chemical theory and computation', 'ELocationID': '', 'SO': '2011 Sep 13;7(9):2886-2902'}
    ERROR: 		Problem! The pubmex could not find automatically a title for the pdf file! Sorry!
    
    opened by mmagnus 0
  • gkz1184.pdf

    (py37) [mx] rna$ pubmex.py gkz1184.pdf --debug
    filename: .......... gkz1184.pdf
    filename: .......... gkz1184.pdf
    doi: ............... gkz1184
    IdList.............. []
    pmid: .............. False
    ERROR: 		Not found in PubMed, although DOI (gkz1184) was detected in the pdf!
    generate ./temp.....[OK]
    out:
    err:
    temp is going to be opened
    doi_line: .......... 11641174 NUCLEIC ACIDS RESEARCH, 2020, VOL. 48, NO. 3 DOI: 10.1093/NAR/GKZ1184
    doi is found: ...... 10.1093/NAR/GKZ1184
    doi: ............... 10.1093/NAR/GKZ1184
    IdList.............. ['31889193']
    pmid: .............. 31889193
    summary_dict........ {'Item': [], 'Id': '31889193', 'PubDate': '2020 Feb 20', 'EPubDate': '', 'Source': 'Nucleic Acids Res', 'AuthorList': ['Reißer S', 'Zucchelli S', 'Gustincich S', 'Bussi G'], 'LastAuthor': 'Bussi G', 'Title': 'Conformational ensembles of an RNA hairpin using molecular dynamics and sparse NMR data.', 'Volume': '48', 'Issue': '3', 'Pages': '1164-1174', 'LangList': ['English'], 'NlmUniqueID': '0411011', 'ISSN': '0305-1048', 'ESSN': '1362-4962', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed - indexed for MEDLINE', 'PubStatus': 'ppublish', 'ArticleIds': {'pubmed': ['31889193'], 'medline': [], 'pii': '5691221', 'doi': '10.1093/nar/gkz1184', 'pmc': 'PMC7026608', 'rid': '31889193', 'eid': '31889193', 'pmcid': 'pmc-id: PMC7026608;'}, 'DOI': '10.1093/nar/gkz1184', 'History': {'pubmed': ['2020/01/01 06:00'], 'medline': ['2020/03/20 06:00'], 'accepted': '2019/12/09 00:00', 'revised': '2019/12/05 00:00', 'received': '2019/10/14 00:00', 'entrez': '2020/01/01 06:00'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(3, attributes={}), 'FullJournalName': 'Nucleic acids research', 'ELocationID': 'doi: 10.1093/nar/gkz1184', 'SO': '2020 Feb 20;48(3):1164-1174'}
    ERROR: 		Problem! The pubmex could not find automatically a title for the pdf file! Sorry!
    
    opened by mmagnus 0
  • some problem when I removed some prints to make the script quiet

    (py37) [mx] d$ pubmex -p 10.1016/j.molcel.2020.11.004
    (py37) [mx] d$ pubmex -p 10.1016/j.molcel.2020.11.004 -d
    doi: ............... 10.1016/j.molcel.2020.11.004
    IdList.............. ['33259809']
    pmid: .............. 33259809
    summary_dict........ {'Item': [], 'Id': '33259809', 'PubDate': '2020 Dec 17', 'EPubDate': '2020 Nov 5', 'Source': 'Mol Cell', 'AuthorList': ['Ziv O', 'Price J', 'Shalamova L', 'Kamenova T', 'Goodfellow I', 'Weber F', 'Miska EA'], 'LastAuthor': 'Miska EA', 'Title': 'The Short- and Long-Range RNA-RNA Interactome of SARS-CoV-2.', 'Volume': '80', 'Issue': '6', 'Pages': '1067-1077.e5', 'LangList': ['English'], 'NlmUniqueID': '9802571', 'ISSN': '1097-2765', 'ESSN': '1097-4164', 'PubTypeList': ['Journal Article'], 'RecordStatus': 'PubMed - indexed for MEDLINE', 'PubStatus': 'ppublish+epublish', 'ArticleIds': {'pubmed': ['33259809'], 'medline': [], 'pii': 'S1097-2765(20)30782-6', 'doi': '10.1016/j.molcel.2020.11.004', 'pmc': 'PMC7643667', 'rid': '33259809', 'eid': '33259809', 'pmcid': 'pmc-id: PMC7643667;'}, 'DOI': '10.1016/j.molcel.2020.11.004', 'History': {'pubmed': ['2020/12/02 06:00'], 'medline': ['2021/01/12 06:00'], 'received': '2020/07/20 00:00', 'revised': '2020/10/05 00:00', 'accepted': '2020/10/29 00:00', 'entrez': '2020/12/01 20:08'}, 'References': [], 'HasAbstract': IntegerElement(1, attributes={}), 'PmcRefCount': IntegerElement(10, attributes={}), 'FullJournalName': 'Molecular cell', 'ELocationID': 'doi: 10.1016/j.molcel.2020.11.004', 'SO': '2020 Dec 17;80(6):1067-1077.e5'}
    Ziv.Miska.The.Short-Long-Range.RNA-RNA.Interactome.SARS-CoV-2.MolCell.2020.pdf
    
    bug 
    opened by mmagnus 0
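
In the debug logs above, the DOI guessed from the file name (for example `ct200162x`) is not found in PubMed, while the DOI read from a line of the PDF text (`doi_line`) is. The sketch below shows one way such a line could be searched for a DOI with a regular expression. It is an illustration only: `find_doi` and `DOI_PATTERN` are hypothetical names, and the pattern is a common heuristic rather than the one pubmex.py uses.

```python
import re

# Heuristic DOI shape: '10.', a 4-9 digit registrant code, '/', then a suffix
# that runs until whitespace or a quote/angle bracket. This is an assumption,
# not the pattern used by pubmex.py.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')


def find_doi(text):
    """Return the first DOI-like string found in text, or None if there is none."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Strip punctuation that often trails a DOI at the end of a sentence.
    return match.group(0).rstrip(".,;)")


print(find_doi("DX.DOI.ORG/10.1021/CT200162X | J. CHEM. THEORY COMPUT. 2011, 7, 28862902"))
# prints 10.1021/CT200162X
```

A bare file-name stem such as `ct200162x` does not match the pattern, so `find_doi` returns `None` for it.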
Releases(1.4.2)
  • 1.4.2(Mar 15, 2022)

    pubmex is now available as a Quick Action in Finder, so you can quickly run it on a number of PDF files.

    Install pubmex_zsh.workflow from pubmex/osx-automator/ if your default shell is zsh, or pubmex_bash.workflow for bash.

  • 1.4.1(Mar 12, 2022)

  • 1.4(Sep 27, 2021)

    pubmex is now available as a Quick Action in Finder, so you can quickly run it on a number of PDF files.

    Install pubmex_zsh.workflow from pubmex/osx-automator/ if your default shell is zsh, or pubmex_bash.workflow for bash.

  • 1.3(Sep 26, 2021)

  • 1.2(Sep 14, 2021)

  • 1.1(Aug 18, 2021)

    Simplify input to pubmex.py *.pdf. Fixed #2

    Now, usage:

    $ pubmex.py sharp2017.pdf
    mv  sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
    
    $ pubmex.py  Query.Konarska.pdf
    mv  Query.Konarska.pdf --> Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
    $ pubmex.py eabc9191.full.pdf
    mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf
    
  • 1.0(Jun 23, 2021)

    I don’t want to put every PDF file I collect along the way into my library, because then it gets super big (and then it’s hard to sync, for example with Dropbox). So now I can keep these PDF files in a pdf-icebox and rename them nicely and automatically:

    Usage:

    $ pubmex.py -a -f sharp2017.pdf -r
    mv  sharp2017.pdf --> ./Sharp.Hockfield.Convergence.The.future.health.Science.2017.pdf
    
    $ pubmex.py -a -f Query.Konarska.pdf -r
    mv  Query.Konarska.pdf --> Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf
    
    $ pubmex.py -a -f eabc9191.full.pdf -r
    mv  eabc9191.full.pdf --> ./Balas.Johnson.Establishing.RNA-RNA.interactions.remodels.lncRNA.structure.promotes.PRC2.activity.SciAdv.2021.pdf
    

    ... and we get a file:

    Smith.Konarska."Nought.may.endure.but.mutability".spliceosome.dynamics.regulation.splicing.MolCell.2008.pdf

Owner
Marcin Magnus
Ph.D., molecular biologist & bioinformatician, uses Pen & Paper and Emacs for notes, coding & RNA!