Pelican plugin that adds site search capability

Overview

Search: A Plugin for Pelican

This plugin generates an index for searching content on a Pelican-powered site.

Why would you want this?

Static sites are, well, static… and thus usually don’t have an application server component that could be used to power site search functionality. Rather than give up control (and privacy) to third-party search engine corporations, this plugin adds elegant and self-hosted site search capability to your site. Last but not least, searches are really fast. 🚀

Installation

This plugin uses Stork to generate a search index. Follow the Stork installation instructions to install this required command-line tool and ensure it is available within /usr/local/bin/ or another $PATH-accessible location of your choosing. For example, Stork can be installed on macOS via:

export STORKVERSION="v1.2.1"
wget -O /usr/local/bin/stork https://files.stork-search.net/releases/$STORKVERSION/stork-macos-10-15
chmod +x /usr/local/bin/stork

Confirm that Stork is properly installed via:

stork --help
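
The same check can be scripted, for example as a guard in a build step. The following is a minimal sketch that uses only the Python standard library; `find_stork` and `stork_version` are hypothetical helper names and are not part of this plugin:

```python
import shutil
import subprocess
from typing import Optional


def find_stork(executable: str = "stork") -> Optional[str]:
    """Return the absolute path of the Stork binary, or None if it is not on PATH."""
    return shutil.which(executable)


def stork_version(executable: str = "stork") -> Optional[str]:
    """Return the output of `stork --version`, or None if Stork is unavailable."""
    path = find_stork(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout.strip() or None


print(stork_version() or "Stork was not found on PATH")
```

If the script reports that Stork was not found, revisit the installation step above before installing the plugin.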

Once Stork has been successfully installed and tested, this plugin can be installed via:

python -m pip install pelican-search
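
If your settings file defines PLUGINS explicitly, the plugin probably also needs to be listed there. The fragment below is a sketch based on the configurations quoted in the issue reports further down, which list the plugin under the short name search; treat it as an assumption about your setup rather than a required step:

```python
# pelicanconf.py (fragment)

# Assumption: PLUGINS is defined explicitly in this settings file, so the
# search plugin is listed alongside any other plugins the site uses.
PLUGINS = [
    "search",
]
```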

Settings

This plugin’s behavior can be customized via Pelican settings. Those settings, and their default values, follow below.

SEARCH_MODE = "output"

In addition to plain-text files, Stork can recognize and index HTML and Markdown-formatted content. The default behavior of this plugin is to index generated HTML files, since Stork is good at extracting content from tags, scripts, and styles. But that mode may require a slight theme modification that isn’t necessary when indexing Markdown source (see SEARCH_HTML_SELECTOR setting below). That said, indexing Markdown means that markup information may not be removed from the indexed content and will thus be visible in the search preview results. With that caveat aside, if you want to index Markdown source content files instead of the generated HTML output, you can use:

SEARCH_MODE = "source"

SEARCH_HTML_SELECTOR = "main"

By default, Stork looks for <main>[…]</main> tags to determine where your main content is located. If such tags do not already exist in your theme’s template files, you can either (1) add <main> tags or (2) change the HTML selector that Stork should look for.

To use the default main selector, in each of your theme’s relevant template files, wrap the content you want to index with <main> tags. For example:

article.html:

<main>
{{ article.content }}
</main>

page.html:

<main>
{{ page.content }}
</main>

For more information, refer to Stork’s documentation on HTML tag selection.
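
One way to confirm that a theme change took effect is to scan the generated HTML for the selector's tag. The sketch below is a hypothetical helper, not part of this plugin or of Stork. It handles only plain tag-name selectors such as main or body and relies on Python's standard html.parser module:

```python
from html.parser import HTMLParser


class TagFinder(HTMLParser):
    """Record whether a given tag name occurs in the parsed HTML."""

    def __init__(self, tag: str) -> None:
        super().__init__()
        self.tag = tag.lower()
        self.found = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == self.tag:
            self.found = True


def has_tag(html: str, tag: str = "main") -> bool:
    """Return True if the HTML text contains at least one <tag> element."""
    finder = TagFinder(tag)
    finder.feed(html)
    finder.close()
    return finder.found


sample = "<html><body><main><p>Hello</p></main></body></html>"
print(has_tag(sample))          # True
print(has_tag("<p>Hello</p>"))  # False
```

Running has_tag over the text of a generated article page would indicate whether the default main selector has anything to match.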

Static Assets

There are two options for serving the necessary JavaScript, WebAssembly, and CSS static assets:

  1. Use a content delivery network (CDN) to serve Stork’s static assets
  2. Self-host the Stork static assets

The first option is easier to set up. The second option is provided for folks who prefer to self-host everything. After you have decided which option you prefer, follow the relevant section’s instructions below.

Static Assets - Option 1: Use CDN

CSS

Add the Stork CSS before the closing </head> tag in your theme’s base template file, such as base.html:

<link rel="stylesheet" href="https://files.stork-search.net/basic.css" />

If your theme supports dark mode, you may want to also add Stork’s dark CSS file:

<link rel="stylesheet" media="screen and (prefers-color-scheme: dark)" href="https://files.stork-search.net/dark.css">

JavaScript

Add the following script tags to your theme’s base template, just before your closing </body> tag, which will load the Stork module for the referenced release along with the matching WASM binary:

<script src="https://files.stork-search.net/releases/v1.2.1/stork.js"></script>
<script>
    stork.register("sitesearch", "{{ SITEURL }}/search-index.st")
</script>

Static Assets - Option 2: Self-Host

Download the Stork JavaScript, WebAssembly, and CSS files and put them in your theme’s respective static asset directories:

export STORKVERSION="v1.2.1"
cd "$YOUR_THEME_DIR"
mkdir -p static/{js,css}
wget -O static/js/stork.js https://files.stork-search.net/releases/$STORKVERSION/stork.js
wget -O static/js/stork.wasm https://files.stork-search.net/releases/$STORKVERSION/stork.wasm
wget -O static/css/stork.css https://files.stork-search.net/basic.css
wget -O static/css/stork-dark.css https://files.stork-search.net/dark.css

CSS

Add the Stork CSS before the closing </head> tag in your theme’s base template file, such as base.html:

<link rel="stylesheet" href="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/css/stork.css">

If your theme supports dark mode, you may want to also add Stork’s dark CSS file:

<link rel="stylesheet" media="screen and (prefers-color-scheme: dark)" href="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/css/stork-dark.css">

JavaScript & WebAssembly

Add the following script tags to your theme’s base template file, such as base.html, just before the closing </body> tag:

<script src="{{ SITEURL }}/{{ THEME_STATIC_DIR }}/js/stork.js"></script>
<script>
    stork.initialize("{{ SITEURL }}/{{ THEME_STATIC_DIR }}/js/stork.wasm")
    stork.downloadIndex("sitesearch", "{{ SITEURL }}/search-index.st")
    stork.attach("sitesearch")
</script>

Search Input Form

Decide in which place(s) on your site you want to put your search field, such as your index.html template file. Then add the search field to the template:

Search: <input data-stork="sitesearch" />
<div data-stork="sitesearch-output"></div>

For more information regarding this topic, see the Stork search interface documentation.

Deployment

Ensure your production web server serves the WebAssembly file with the application/wasm MIME type. For folks using older versions of Nginx, that might look like the following:

…
http {
    …
    include             mime.types;
    # Types not included in older Nginx versions:
    types {
        application/wasm                                 wasm;
    }
    …
}
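
For a quick local preview, the same MIME mapping can be approximated with Python's standard http.server module. This is only a sketch for testing on your own machine, not a replacement for a production server configuration. The WasmAwareHandler and make_server names, the output directory, and port 8000 are illustrative assumptions:

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class WasmAwareHandler(SimpleHTTPRequestHandler):
    """Static file handler that always reports .wasm as application/wasm."""

    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".wasm": "application/wasm",
    }


def make_server(directory: str = "output", port: int = 8000) -> ThreadingHTTPServer:
    """Create (but do not start) a local server for the generated site."""
    handler = functools.partial(WasmAwareHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server().serve_forever() from the project root would then serve the generated site until interrupted.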

For other self-hosting considerations, see the Stork self-hosting documentation.

Contributing

Contributions are welcome and much appreciated. Every little bit helps. You can contribute by improving the documentation, adding missing features, and fixing bugs. You can also help out by reviewing and commenting on existing issues.

To start contributing to this plugin, review the Contributing to Pelican documentation, beginning with the Contributing Code section.

Comments
  • unexpected keyword argument 'capture_output'

    Installed the module according to the installation documentation.

    But I get the following error message: CRITICAL TypeError: __init__() got an unexpected keyword argument 'capture_output'

    stork --help results in the expected output.

    I've added the plugin to the configuration, as well as the settings:

    PLUGINS = [ .....
      'post_stats', 'related_posts', 'search', 'seo', 'simple_footnotes', 'share_post', 'sitemap',
      ....
    ]
    
    SEARCH_MODE = "output"
    SEARCH_HTML_SELECTOR = "main"
    

    Can this be investigated?

    opened by radoeka 8
  • expected newline, found an identifier

    • [X] I have read the Filing Issues and subsequent “How to Get Help” sections of the documentation.
    • [X] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    • OS version and name: FreeBSD 13.0-RELEASE-p4
    • Python version: 3.8.12
    • Pelican version: 4.7.1
    • Version of this plugin: 1.0.0

    Steps up to this point:

    • I installed Stork as needed for this plugin via Rustup. stork --version reports 1.3.0.
    • I set up a venv with Pelican and generated some posts using the default theme. It all generated okay.
    • I installed pelican-search via pip in the venv
    • I edited the theme to add the CDN code and add the search box.
    • I added SEARCH_MODE = "output" and SEARCH_HTML_SELECTOR = "main" to the pelicanconf.py file.
    • I called pelican to regenerate the site.

    After that I got this error.

    CRITICAL Exception: Search plugin         __init__.py:550
                        reported Error: expected                        
                        newline, found an identifier at                 
                        line 438 column 11   
    

    Help! I do not understand Python at all, so I don't have any insight into the issue beyond what I have here. :( I am happy to provide additional information if requested (and given a little instruction on how).

    bug 
    opened by oiseaumanifesto 6
  • How to translate stork?

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.

    Issue

    How can the search be translated? My site is in Portuguese and the search is in English:

    image

    question 
    opened by paulocoutinhox 3
  • Wrong URL when inside an article

    • [x] I have read the Filing Issues and subsequent “How to Get Help” sections of the documentation.
    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    Issue

    Hi,

    When search is enabled, the Stork URL does not point to the correct place when I'm inside an article:

    image

    Example on screenshot above: http://localhost:8000/2022/06/28/2022/06/28/apocalipse-de-jesus-cristo-fase-3-o-trono-e-os-seres-viventes.html

    duplicated: 2022/06/28/2022/06/28
    

    If I'm on the home page, it works.

    Thanks.

    bug 
    opened by paulocoutinhox 2
  • demo website ?

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.
    • [ ] I am willing to lend a hand to help implement this feature. (well, ATM I don't have time, but in a few weeks, why not)

    Feature Request

    Hello ! It could be cool to have a demo site for this plugin ! :D

    enhancement 
    opened by ebanDev 2
  • search.toml

    I have installed the plugin, using Flex theme. I cannot see any search.toml, or instructions on how to generate it.

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [x] I have searched the documentation and believe that my question is not covered.

    Issue

    Trying to generate my theme, with the search enabled, and I get errors regarding no search.toml file - it is similar to https://github.com/pelican-plugins/search/issues/3 and it appears all I need is a search.toml file. However, I can't see any definitive / suggested configurations for pelican.

    If you could point me in the correct direction that would be superb!

    Thanks

    question 
    opened by Bobspadger 0
  • Escape double quotes in page titles in stork TOML file

    Pull Request Checklist

    Resolves: https://github.com/pelican-plugins/search/issues/3#issuecomment-1186287268

    • [x] Conformed to code style guidelines by running appropriate linting tools
    • [x] Updated documentation for changed code

    Description

    If a page title contains a double quote ("), the double quote is rendered into the stork toml file verbatim, creating a syntax error and causing stork to fail.

    This PR escapes double quotes in titles (AFAICT this is the only place where they should appear) with \", fixing the syntax errors.
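
The escaping described above can be sketched as follows. This is an illustration, not the code from the PR: escape_toml_basic_string is a hypothetical name, and the additional backslash escaping is an assumption that goes beyond what the PR description states.

```python
def escape_toml_basic_string(value: str) -> str:
    """Escape a value for use inside a double-quoted TOML basic string."""
    # Escape backslashes first so that the backslashes added for quotes
    # are not escaped a second time.
    return value.replace("\\", "\\\\").replace('"', '\\"')


title = 'The "Best" Article'
print(f'title = "{escape_toml_basic_string(title)}"')
# prints: title = "The \"Best\" Article"
```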

    opened by s3lph 0
  • Layout issues due to Stork progress bar

    • [X] I have searched the issues (including closed ones) and believe that this is not a duplicate.
    • [X] I have searched the documentation and believe that my question is not covered.

    I have a couple layout issues with the Stork progress bar.

    First it appears in the middle of the article (seemingly where the results box would end).

    Second, more bothersome, blank space appears on the right side of my page, which impacts usability, especially on mobiles when swiping.

    Narrowed down the issue (deleting elements in DevTools until I found the culprit) to this div:

    <div class="stork-progress" style="width: 100%; opacity: 0;"></div>
    

    which is not mine - it seemingly gets inserted by the plugin at build time, and I have not managed to override the inline style via external CSS.

    When width: 0% my layout issue disappears (because the progress bar does not have any space, ie is hidden).

    Where/how can I change width to 0%, or deactivate the progress bar entirely (ie not have this div inserted)? Perhaps better for a future version to define those inline styles in the external CSS?

    question 
    opened by ndeville 0
  • hint to include plugins in pelicanconf.py

    Please add a hint to the README.md about including PLUGINS = ['search'] in pelicanconf.py.

    • [x] I have searched the issues (including closed ones) and believe that this is not a duplicate.

    Issue

    opened by kika21 0
  • fix(windows): convert back-slashes to forward-slashes for Windows

    1. Summary

    Pelican Stork search doesn’t work correctly on my Windows if OUTPUT_PATH setting is custom.

    After fixing I can successfully use Pelican Stork search:

    404 page search demo

    2. MCVE files

    You can see this MCVE configuration on the KiraPelicanPluginsSitemapStork branch of my demo repository for testing Pelican.

    All files except those listed below are the result of running the command pelican-quickstart.

    1. pelicanconf.py

      """MCVE."""
      
      # [INFO] Default settings
      AUTHOR = 'Sasha Chernykh'
      SITENAME = 'SashaPelicanDebugging'
      SITEURL = 'https://kristinita.netlify.app'
      
      PATH = 'content'
      
      TIMEZONE = 'Europe/Moscow'
      
      DEFAULT_LANG = 'en'
      
      ARTICLE_PATHS = [
          'Articles'
      ]
      
      MARKDOWN = {
          'output_format': 'html5',
      }
      
      
      # [INFO] Settings for this issue
      PLUGINS = [
          'search'
      ]
      
      SEARCH_HTML_SELECTOR = 'body'
      
      OUTPUT_PATH = 'output/'
      
      
    2. content/Articles/KiraArticle.md:

      Slug: KiraArticle
      Title: KiraArticle
      Date: 2020-09-24 18:57:33
      
      Kira Goddess!
      
      
    3. .circleci/config.yml:

      version: 2.1
      
      jobs:
        build:
          machine:
            image: ubuntu-2204:current
          steps:
          - checkout
          - run: pyenv global 3.10.5
          - run: pip install pelican markdown
          - run: pip install pelican-search
          # [INFO] Non-interactive Rust installation on Ubuntu
          # https://stackoverflow.com/a/57251636/5951529
          - run: curl https://sh.rustup.rs -sSf | sh -s -- -y
          - run: cargo install stork-search --locked
          - run: stork --version
          - run: pelican content -s pelicanconf.py --fatal warnings --debug
          - run: ls output
          - run: cat output/search.toml
      
      

    3. Behavior before change

    If OUTPUT_PATH is customized on Windows, Pelican Stork search generates backslash path separators in the value of the base_directory setting of the search.toml file, which are invalid inside a TOML string:

    [input]
    base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"
    html_selector = "body"
    
    [[input.files]]
    path = "KiraArticle.html"
    url = "/KiraArticle.html"
    title = "KiraArticle"
    

    Incorrect TOML

    If I run:

    pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    

    I get an error:

    Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`
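
The error arises because a backslash starts an escape sequence inside a double-quoted TOML string, and \S is not a valid one. A minimal sketch of the conversion named in the title of this change, using Python's standard pathlib module, follows; to_forward_slashes is a hypothetical helper name and not necessarily how the plugin implements the fix:

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Return the path with backslash separators replaced by forward slashes."""
    return PureWindowsPath(windows_path).as_posix()


base_directory = to_forward_slashes(
    r"D:\SashaDemoRepositories\SashaPelicanDebugging\output"
)
print(f'base_directory = "{base_directory}"')
# prints: base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"
```

PureWindowsPath parses Windows-style paths on any operating system, so the sketch behaves the same outside Windows.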
    

    Full output:

    D:\SashaDemoRepositories\SashaPelicanDebugging>pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    [11:23:23] DEBUG    Pelican version: 4.8.0                                                                                                                                                                                                                        __init__.py:531
               DEBUG    Python version: 3.10.6                                                                                                                                                                                                                        __init__.py:532
               DEBUG    Adding current directory to system path                                                                                                                                                                                                        __init__.py:66
               DEBUG    Finding namespace plugins                                                                                                                                                                                                                        _utils.py:81
               DEBUG    Namespace plugins found:                                                                                                                                                                                                                         _utils.py:84
                        pelican.plugins.search
                        pelican.plugins.sitemap
               DEBUG    Loading plugin `search`                                                                                                                                                                                                                          _utils.py:90
               DEBUG    Registering plugin `pelican.plugins.search`                                                                                                                                                                                                    __init__.py:73
               DEBUG    Found generator: ArticlesGenerator (internal)                                                                                                                                                                                                 __init__.py:209
               DEBUG    Found generator: PagesGenerator (internal)                                                                                                                                                                                                    __init__.py:209
               DEBUG    Found generator: SearchSettingsGenerator (pelican.plugins.search.search)                                                                                                                                                                      __init__.py:209
               DEBUG    Found generator: StaticGenerator (internal)                                                                                                                                                                                                   __init__.py:209
               DEBUG    Template list: ['!simple/archives.html', '!simple/article.html', '!simple/author.html', '!simple/authors.html', '!simple/base.html', '!simple/categories.html', '!simple/category.html', '!simple/gosquared.html', '!simple/index.html',     generators.py:70
                        '!simple/page.html', '!simple/pagination.html', '!simple/period_archives.html', '!simple/tag.html', '!simple/tags.html', '!simple/translations.html', '!theme/analytics.html', '!theme/archives.html', '!theme/article.html',
                        '!theme/article_infos.html', '!theme/author.html', '!theme/authors.html', '!theme/base.html', '!theme/categories.html', '!theme/category.html', '!theme/comments.html', '!theme/disqus_script.html', '!theme/github.html',
                        '!theme/index.html', '!theme/page.html', '!theme/period_archives.html', '!theme/tag.html', '!theme/taglist.html', '!theme/tags.html', '!theme/translations.html', '!theme/twitter.html', 'analytics.html', 'archives.html', 'article.html',
                        'article_infos.html', 'author.html', 'authors.html', 'base.html', 'categories.html', 'category.html', 'comments.html', 'disqus_script.html', 'github.html', 'gosquared.html', 'index.html', 'page.html', 'pagination.html',
                        'period_archives.html', 'tag.html', 'taglist.html', 'tags.html', 'translations.html', 'twitter.html']
               DEBUG    Read file Articles/KiraArticle.md -> Article                                                                                                                                                                                                   readers.py:547
               DEBUG    Signal article_generator_preread.send(ArticlesGenerator)                                                                                                                                                                                       readers.py:560
               DEBUG    Successfully imported extension module "markdown.extensions.meta".                                                                                                                                                                                core.py:163
               DEBUG    Successfully loaded extension "markdown.extensions.meta.MetaExtension".                                                                                                                                                                           core.py:126
               DEBUG    Signal article_generator_context.send(ArticlesGenerator, <metadata>)                                                                                                                                                                           readers.py:627
    [11:23:24] DEBUG    Read file images/.keep -> Static                                                                                                                                                                                                               readers.py:547
               DEBUG    Signal static_generator_preread.send(StaticGenerator)                                                                                                                                                                                          readers.py:560
               DEBUG    Signal static_generator_context.send(StaticGenerator, <metadata>)                                                                                                                                                                              readers.py:627
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all.atom.xml                                                                                                                                                               writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/articles.atom.xml                                                                                                                                                          writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.atom.xml                                                                                                                                                    writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.rss.xml                                                                                                                                                     writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all-en.atom.xml                                                                                                                                                            writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/KiraArticle.html                                                                                                                                                                 writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/index.html                                                                                                                                                                       writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/tags.html                                                                                                                                                                        writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/categories.html                                                                                                                                                                  writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/authors.html                                                                                                                                                                     writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/archives.html                                                                                                                                                                    writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/category/articles.html                                                                                                                                                           writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/author/sasha-chernykh.html                                                                                                                                                       writers.py:212
               CRITICAL Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`                                               __init__.py:566
    
    ┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:38 in build_search_index         │
    │                                                                                                  │
    │    35 │   │   if not which("stork"):                                                             │
    │    36 │   │   │   raise Exception("Stork must be installed and available on $PATH.")             │
    │    37 │   │   try:                                                                               │
    │ >  38 │   │   │   output = subprocess.run(                                                       │
    │    39 │   │   │   │   [                                                                          │
    │    40 │   │   │   │   │   "stork",                                                               │
    │    41 │   │   │   │   │   "build",                                                               │
    │                                                                                                  │
    │ C:\Python310\lib\subprocess.py:524 in run                                                        │
    │                                                                                                  │
    │    521 │   │   │   raise                                                                         │
    │    522 │   │   retcode = process.poll()                                                          │
    │    523 │   │   if check and retcode:                                                             │
    │ >  524 │   │   │   raise CalledProcessError(retcode, process.args,                               │
    │    525 │   │   │   │   │   │   │   │   │    output=stdout, stderr=stderr)                        │
    │    526 │   return CompletedProcess(process.args, retcode, stdout, stderr)                        │
    │    527                                                                                           │
    └──────────────────────────────────────────────────────────────────────────────────────────────────┘
    CalledProcessError: Command '['stork', 'build', '--input', 'D:\\SashaDemoRepositories\\SashaPelicanDebugging\\output\\search.toml', '--output', 'D:\\SashaDemoRepositories\\SashaPelicanDebugging\\output/search-index.st']' returned non-zero exit status 1.
    
    During handling of the above exception, another exception occurred:
    
    ┌─────────────────────────────── Traceback (most recent call last) ────────────────────────────────┐
    │ C:\Python310\lib\site-packages\pelican\__init__.py:562 in main                                   │
    │                                                                                                  │
    │   559 │   │   │   watcher = FileSystemWatcher(args.settings, Readers, settings)                  │
    │   560 │   │   │   watcher.check()                                                                │
    │   561 │   │   │   with console.status("Generating…"):                                            │
    │ > 562 │   │   │   │   pelican.run()                                                              │
    │   563 │   except KeyboardInterrupt:                                                              │
    │   564 │   │   logger.warning('Keyboard interrupt received. Exiting.')                            │
    │   565 │   except Exception as e:                                                                 │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\__init__.py:127 in run                                    │
    │                                                                                                  │
    │   124 │   │                                                                                      │
    │   125 │   │   for p in generators:                                                               │
    │   126 │   │   │   if hasattr(p, 'generate_output'):                                              │
    │ > 127 │   │   │   │   p.generate_output(writer)                                                  │
    │   128 │   │                                                                                      │
    │   129 │   │   signals.finalized.send(self)                                                       │
    │   130                                                                                            │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:113 in generate_output           │
    │                                                                                                  │
    │   110 │   │   │   fd.write(search_settings)                                                      │
    │   111 │   │                                                                                      │
    │   112 │   │   # Build the search index                                                           │
    │ > 113 │   │   build_log = self.build_search_index(search_settings_path)                          │
    │   114 │   │   build_log = "".join(["Search plugin reported ", build_log])                        │
    │   115 │   │   logger.error(build_log) if "error" in build_log else logger.debug(build_log)       │
    │   116                                                                                            │
    │                                                                                                  │
    │ C:\Python310\lib\site-packages\pelican\plugins\search\search.py:52 in build_search_index         │
    │                                                                                                  │
    │    49 │   │   │   │   check=True,                                                                │
    │    50 │   │   │   )                                                                              │
    │    51 │   │   except subprocess.CalledProcessError as e:                                         │
    β”‚ >  52 β”‚   β”‚   β”‚   raise Exception("".join(["Search plugin reported ", e.stdout, e.stderr]))      β”‚
    β”‚    53 β”‚   β”‚                                                                                      β”‚
    β”‚    54 β”‚   β”‚   return output.stdout                                                               β”‚
    β”‚    55                                                                                            β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    Exception: Search plugin reported Error: Couldn't read the configuration file: Cannot parse config as TOML. Stork recieved error: `invalid escape character in string: `S` at line 2 column 22`
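
The error is what the TOML format prescribes: inside a double-quoted (basic) string, a backslash starts an escape sequence, and `\S` is not one of the permitted escapes. The following is a minimal sketch of that rule, not Stork's parser. The function name `first_invalid_escape` and the escape set (taken from TOML 1.0) are assumptions made for illustration, and the scanner deliberately ignores whether the backslash sits inside a basic or a literal string.

```python
# Characters allowed directly after a backslash in a TOML 1.0 basic string
# (assumption: TOML 1.0 rules; this is an illustration, not Stork's parser).
VALID_ESCAPES = set('btnfr"\\uU')


def first_invalid_escape(line):
    """Return (column, char) for the first invalid escape in `line`, else None.

    Columns are 1-based. Simplification: every backslash in the line is
    treated as the start of an escape sequence.
    """
    i = 0
    while i < len(line):
        if line[i] == "\\":
            nxt = line[i + 1:i + 2]  # empty string if the backslash ends the line
            if nxt not in VALID_ESCAPES:
                return (i + 2, nxt)
            i += 2  # skip the backslash and its escape character
        else:
            i += 1
    return None


broken = r'base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"'
fixed = 'base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"'

print(first_invalid_escape(broken))
print(first_invalid_escape(fixed))
```

For the backslash version of the line, the sketch points at the `S` in column 22, the same character and column named in the error message; the forward-slash version contains no escapes at all.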
    

    4. Change

    I used os.sep to convert backslashes to forward slashes, changing this line of search.py:

    - self.output_path = output_path
    + self.output_path = output_path.replace(os.sep, '/')
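
The effect of that one-line change can be sketched outside the plugin. This is a minimal illustration, assuming a hypothetical helper `to_forward_slashes`; it passes the separator explicitly (`ntpath.sep` for Windows, `posixpath.sep` for Linux and macOS) so that both cases can be run on any operating system, whereas the plugin change relies on `os.sep`.

```python
import ntpath
import posixpath


def to_forward_slashes(path, sep):
    """Replace the platform separator `sep` with '/' (hypothetical helper)."""
    return path.replace(sep, "/")


# On Windows, os.sep equals ntpath.sep ("\\");
# on Linux and macOS, it equals posixpath.sep ("/").
windows_path = r"D:\SashaDemoRepositories\SashaPelicanDebugging\output"
posix_path = "/home/circleci/project/output"

print(to_forward_slashes(windows_path, ntpath.sep))
print(to_forward_slashes(posix_path, posixpath.sep))
```

On POSIX systems the separator is already `/`, so the replacement leaves the path unchanged; only Windows paths are rewritten.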
    

    5. Behavior after change

    search.toml on Windows after my changes:

    - base_directory = "D:\SashaDemoRepositories\SashaPelicanDebugging\output"
    + base_directory = "D:/SashaDemoRepositories/SashaPelicanDebugging/output"
    

    This is the correct path for Windows. No errors in output.

    D:\SashaDemoRepositories\SashaPelicanDebugging>pelican content -s pelicanconf.py --fatal warnings --ignore-cache --debug
    [11:30:32] DEBUG    Pelican version: 4.8.0                                                                                                                                                                                                                        __init__.py:531
               DEBUG    Python version: 3.10.6                                                                                                                                                                                                                        __init__.py:532
               DEBUG    Adding current directory to system path                                                                                                                                                                                                        __init__.py:66
               DEBUG    Finding namespace plugins                                                                                                                                                                                                                        _utils.py:81
               DEBUG    Namespace plugins found:                                                                                                                                                                                                                         _utils.py:84
                        pelican.plugins.search
                        pelican.plugins.sitemap
               DEBUG    Loading plugin `search`                                                                                                                                                                                                                          _utils.py:90
               DEBUG    Registering plugin `pelican.plugins.search`                                                                                                                                                                                                    __init__.py:73
               DEBUG    Found generator: ArticlesGenerator (internal)                                                                                                                                                                                                 __init__.py:209
               DEBUG    Found generator: PagesGenerator (internal)                                                                                                                                                                                                    __init__.py:209
               DEBUG    Found generator: SearchSettingsGenerator (pelican.plugins.search.search)                                                                                                                                                                      __init__.py:209
               DEBUG    Found generator: StaticGenerator (internal)                                                                                                                                                                                                   __init__.py:209
               DEBUG    Template list: ['!simple/archives.html', '!simple/article.html', '!simple/author.html', '!simple/authors.html', '!simple/base.html', '!simple/categories.html', '!simple/category.html', '!simple/gosquared.html', '!simple/index.html',     generators.py:70
                        '!simple/page.html', '!simple/pagination.html', '!simple/period_archives.html', '!simple/tag.html', '!simple/tags.html', '!simple/translations.html', '!theme/analytics.html', '!theme/archives.html', '!theme/article.html',
                        '!theme/article_infos.html', '!theme/author.html', '!theme/authors.html', '!theme/base.html', '!theme/categories.html', '!theme/category.html', '!theme/comments.html', '!theme/disqus_script.html', '!theme/github.html',
                        '!theme/index.html', '!theme/page.html', '!theme/period_archives.html', '!theme/tag.html', '!theme/taglist.html', '!theme/tags.html', '!theme/translations.html', '!theme/twitter.html', 'analytics.html', 'archives.html', 'article.html',
                        'article_infos.html', 'author.html', 'authors.html', 'base.html', 'categories.html', 'category.html', 'comments.html', 'disqus_script.html', 'github.html', 'gosquared.html', 'index.html', 'page.html', 'pagination.html',
                        'period_archives.html', 'tag.html', 'taglist.html', 'tags.html', 'translations.html', 'twitter.html']
               DEBUG    Read file Articles/KiraArticle.md -> Article                                                                                                                                                                                                   readers.py:547
               DEBUG    Signal article_generator_preread.send(ArticlesGenerator)                                                                                                                                                                                       readers.py:560
               DEBUG    Successfully imported extension module "markdown.extensions.meta".                                                                                                                                                                                core.py:163
               DEBUG    Successfully loaded extension "markdown.extensions.meta.MetaExtension".                                                                                                                                                                           core.py:126
               DEBUG    Signal article_generator_context.send(ArticlesGenerator, <metadata>)                                                                                                                                                                           readers.py:627
               DEBUG    Read file images/.keep -> Static                                                                                                                                                                                                               readers.py:547
               DEBUG    Signal static_generator_preread.send(StaticGenerator)                                                                                                                                                                                          readers.py:560
               DEBUG    Signal static_generator_context.send(StaticGenerator, <metadata>)                                                                                                                                                                              readers.py:627
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all.atom.xml                                                                                                                                                               writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/articles.atom.xml                                                                                                                                                          writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.atom.xml                                                                                                                                                    writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/sasha-chernykh.rss.xml                                                                                                                                                     writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/feeds/all-en.atom.xml                                                                                                                                                            writers.py:163
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/KiraArticle.html                                                                                                                                                                 writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/index.html                                                                                                                                                                       writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/tags.html                                                                                                                                                                        writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/categories.html                                                                                                                                                                  writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/authors.html                                                                                                                                                                     writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/archives.html                                                                                                                                                                    writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/category/articles.html                                                                                                                                                           writers.py:212
               INFO     Writing D:/SashaDemoRepositories/SashaPelicanDebugging/output/author/sasha-chernykh.html                                                                                                                                                       writers.py:212
               DEBUG    Search plugin reported                                                                                                                                                                                                                          search.py:118
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\fonts.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\fonts.css                                                                                utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\main.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\main.css                                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\pygment.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\pygment.css                                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\reset.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\reset.css                                                                                utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\typogrify.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\typogrify.css                                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\css\wide.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\css\wide.css                                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\font.css to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\font.css                                                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.eot to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.eot                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.svg to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.svg                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.ttf to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.ttf                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.woff to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.woff                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\fonts\Yanone_Kaffeesatz_400.woff2 to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\fonts\Yanone_Kaffeesatz_400.woff2                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\aboutme.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\aboutme.png                                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\bitbucket.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\bitbucket.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\delicious.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\delicious.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\facebook.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\facebook.png                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\github.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\github.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\gitorious.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\gitorious.png                                                      utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\gittip.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\gittip.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\google-groups.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\google-groups.png                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\google-plus.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\google-plus.png                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\hackernews.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\hackernews.png                                                    utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\lastfm.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\lastfm.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\linkedin.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\linkedin.png                                                        utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\reddit.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\reddit.png                                                            utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\rss.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\rss.png                                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\slideshare.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\slideshare.png                                                    utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\speakerdeck.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\speakerdeck.png                                                  utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\stackoverflow.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\stackoverflow.png                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\twitter.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\twitter.png                                                          utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\vimeo.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\vimeo.png                                                              utils.py:332
               INFO     Copying C:\Python310\lib\site-packages\pelican\themes\notmyidea\static\images\icons\youtube.png to D:\SashaDemoRepositories\SashaPelicanDebugging\output\theme\images\icons\youtube.png                                                          utils.py:332
               INFO     Copying D:\SashaDemoRepositories\SashaPelicanDebugging\content\images\.keep to D:\SashaDemoRepositories\SashaPelicanDebugging\output\images\.keep                                                                                                utils.py:302
               INFO     Copying D:\SashaDemoRepositories\SashaPelicanDebugging\content\images\.keep to images/.keep                                                                                                                                                 generators.py:906
    Done: Processed 1 article, 0 drafts, 0 hidden articles, 0 pages, 0 hidden pages and 0 draft pages in 0.56 seconds
    

    6. Possible consequences on UNIX

    My change shouldn’t affect *nix operating systems. I checked this on Circle CI.

    1. Circle CI build with the configuration from item 2.3 of this issue; generated search.toml:

      [input]
      base_directory = "/home/circleci/project/output"
      html_selector = "body"
      
      [[input.files]]
      path = "KiraArticle.html"
      url = "/KiraArticle.html"
      title = "KiraArticle"
      
    2. I changed my config.yml:

      - - run: pip install pelican-search
      + - run: pip install git+https://github.com/Kristinita/[email protected]
      

    The Circle CI build generated the same search.toml.

    I didn’t see any difference in the Ubuntu build after my change.

    7. Reproducing problem

    I can’t reproduce my problem on free remote CI services. Unfortunately, installing Stork on Windows isn’t quick: a Windows user must install Rust and compile Stork on their own machine. I can compile Stork on my machine, but I ran into errors on Circle CI and AppVeyor CI.

    If you know how to compile Stork for Windows on free remote CI services, please tell me. See also my issue on the Stork issue tracker.

    8. Environment

    1. Operating system:

      1. Local: Microsoft Windows [Version 10.0.19041.1415]
      2. Circle CI: Ubuntu 22.04 LTS (Jammy Jellyfish)
    2. Python: 3.10.5, 3.10.6

    3. Pelican: 4.8.0

    4. Stork: 1.5.0

    5. pelican-search: 1.0.1

    Thanks.

    opened by Kristinita 0
Web Scraping images using Selenium and Python

Web Scraping images using Selenium and Python About this document This is a markdown document about Web scraping images and videos using Selenium

Nafaa BOUGRAINE 3 Jul 01, 2022
Scraping Thailand COVID-19 data from the DDC's tableau dashboard

Scraping COVID-19 data from DDC Dashboard Scraping Thailand COVID-19 data from the DDC's tableau dashboard. Data is updated at 07:30 and 08:00 daily.

Noppakorn Jiravaranun 5 Jan 04, 2022
Proxy scraper. Format: IP | PORT | COUNTRY | TYPE

proxy scraper 🔎 Installation: git clone https://github.com/ebankoff/proxy_scraper Required pip libraries (pip install library name): lxml beautifulso

Eban'ko 19 Dec 07, 2022
Parse feeds in Python

feedparser - Parse Atom and RSS feeds in Python. Copyright 2010-2020 Kurt McKee

Kurt McKee 1.5k Dec 30, 2022

DaProfiler allows you to get emails, social medias, adresses, works and more on your target using web scraping and google dorking techniques

DaProfiler allows you to get emails, social medias, adresses, works and more on your target using web scraping and google dorking techniques, based in France Only. The particularity of this program i

Dalunacrobate 347 Jan 07, 2023
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

Pattern Pattern is a web mining module for Python. It has tools for: Data Mining: web services (Google, Twitter, Wikipedia), web crawler, HTML DOM par

Computational Linguistics Research Group 8.4k Jan 08, 2023
Scrape Twitter for Tweets

Backers Thank you to all our backers! 🙏 [Become a backer] Sponsors Support this project by becoming a sponsor. Your logo will show up here with a lin

Ahmet Taspinar 2.2k Jan 05, 2023
Haphazard scripts for scraping bitcoin/bitcoin data from GitHub

This is a quick-and-dirty tool used to scrape bitcoin/bitcoin pull request and commentary data. Each output/pr number folder contains comments.json:

James O'Beirne 8 Oct 12, 2022
PyQuery-based scraping micro-framework.

demiurge PyQuery-based scraping micro-framework. Supports Python 2.x and 3.x. Documentation: http://demiurge.readthedocs.org Installing demiurge $ pip

Matias Bordese 109 Jul 20, 2022
A simplistic scraper made to download tons of random screenshots made by people.

printStealer 1.1 What is this tool? This tool is developed to show the insecurity of the screenshot utility called prnt sc. It is a site that stores s

appelsiensam 4 Jul 26, 2022
Moutai purchase bot, latest optimized version: Moutai flash-sale sniping with an optimized purchase coroutine queue

Moutai purchase bot, latest optimized version: Moutai flash-sale sniping with an optimized purchase coroutine queue

MaoTai 33 Sep 03, 2022
Simple tool to scrape and download cross country ski timings and results from live.skidor.com

LiveSkidorDownload Simple tool to scrape and download cross country ski timings and results from live.skidor.com Usage: Put the python file in a dedic

0 Jan 07, 2022
Extract gene TSS site form gencode/ensembl/gencode database GTF file and export bed format file.

GetTss python Package extract gene TSS site form gencode/ensembl/gencode database GTF file and export bed format file. Install $ pip install GetTss Us

laojunjun 6 Nov 21, 2022
This is a python api to scrape search results from a url.

googlescrape Installation Installation is simple! # Stable version pip install googlescrape Examples from googlescrape import client scrapeClient=cli

1 Dec 15, 2022
Simple python tool for the purpose of swapping latinic letters with cirilic ones and vice versa in txt, docx and pdf files in Serbian language

Alpha Swap English This is a simple python tool for the purpose of swapping latinic letters with cirylic ones and vice versa, in txt, docx and pdf fil

Aleksandar Damnjanovic 3 May 31, 2022
Dictionary - Application focused on word search through web scraping

Dictionary - Application focused on word search through web scraping, in addition to other functions such as dictation, spell and conjugation of syllables.

Juan Manuel 2 May 09, 2022
Current Antarctic large iceberg positions derived from ASCAT and OSCAT-2

Iceberg Locations Antarctic large iceberg positions derived from ASCAT and OSCAT-2. All data collected here are from the NASA SCP website Overview Thi

Joel Hanson 5 Jul 27, 2022
Scrape plants scientific name information from Agroforestry Species Switchboard 2.0.

Agroforestry Species Switchboard 2.0 Scraper Scrape plants scientific name information from Species Switchboard 2.0. Requirements python = 3.10 (you

Mgs. M. Rizqi Fadhlurrahman 2 Dec 23, 2021
Scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info

SpaceX Sofware I developed software to scrape data on SpaceX: Capsules, Rockets, Cores, Roadsters, SpaceX Info to use the software you need Python a

Maxence Rémy 16 Aug 02, 2022
This repo has the source code for the crawler and data crawled from auto-data.net

This repo contains the source code for crawler and crawled data of cars specifications from autodata. The data has roughly 45k cars

Tô Đức Anh 5 Nov 22, 2022