Bringing sanity to the world of messed-up data

Last update: Oct 26, 2021


Sanitize

sanitize is a Python module for making sure various things (e.g. HTML) are safe to use. It was originally written by Mark Pilgrim and is distributed under the BSD license.

Usage

>>> from sanitize import HTML
>>> HTML('<b>hello')
'<b>hello</b>'
>>> HTML('<img>')
'<img />'
>>> HTML('<b><b><b>hello')
'<b><b><b>hello</b></b></b>'
>>> HTML('<img src="foo"/')
''
>>> HTML('<input type="checkbox" checked>')
'<input type="checkbox" checked="checked" />'
>>> # dangerous tags (a small sample)
>>> HTML('safe<applet code="foo.class" codebase="http://example.com/"></applet> <b>description</b>')
'safe <b>description</b>'
>>> HTML('safe<frameset rows="*"><frame src="http://example.com/"></frameset> <b>description</b>')
'safe <b>description</b>'
>>> # bad protocols (a small sample)
>>> HTML('<a href="java' + chr(1) + 'script:foo">bar</a>')
'<a href="#foo">bar</a>'
>>> HTML('<a href="vbscript:foo">bar</a>')
'<a href="#foo">bar</a>'

For more usage examples, see tests/test_sanitize_html.py.
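The bad-protocol cases above (`java` + chr(1) + `script:` and `vbscript:`) can be read as a form of scheme allowlisting. The following is a minimal standalone sketch of that idea using only the standard library. It is not sanitize's implementation: the names `SAFE_SCHEMES` and `is_safe_url`, and the choice of allowed schemes, are assumptions made for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist for this sketch; not taken from sanitize itself.
SAFE_SCHEMES = {"http", "https", "mailto", "ftp"}

# ASCII control characters and whitespace, which can be used to
# disguise a scheme name (for example "java\x01script:").
_CONTROL_CHARS = re.compile(r"[\x00-\x20\x7f]")


def is_safe_url(url):
    """Return True if url is relative or uses an allowlisted scheme."""
    cleaned = _CONTROL_CHARS.sub("", url)
    try:
        scheme = urlsplit(cleaned).scheme.lower()
    except ValueError:
        # urlsplit rejects some malformed URLs; treat those as unsafe.
        return False
    return scheme == "" or scheme in SAFE_SCHEMES


if __name__ == "__main__":
    for candidate in ("http://example.com/", "#foo",
                      "java" + chr(1) + "script:foo", "vbscript:foo"):
        print(repr(candidate), is_safe_url(candidate))
```

An allowlist is used here rather than a blocklist because unknown schemes are rejected by default. Unlike the library output shown above, this sketch only reports whether a URL looks safe and does not rewrite it.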

Installation

python-sanitize is available on PyPI:

http://pypi.python.org/pypi/sanitize

Install it with pip:

$ pip install sanitize

Or with easy_install:

$ easy_install sanitize

Alternatively, clone python-sanitize's git repository:

$ git clone https://github.com/Alir3z4/python-sanitize.git

Then install it from the cloned directory by running:

$ python setup.py install

Tests

To run unit tests:

$ python setup.py test

License

Sanitize is distributed under the BSD license.

Releases

Version 2014.10.7 - 2014-10-07

Feature: Add ChangeLog.rst file.

Feature: Add AUTHORS.rst file.

Feature: Add setup.cfg for wheel support.

Feature #2: Add travis-ci testing.

Feature #4: Using unittest for testing.

Feature #7: Add coveralls support.

Feature #8: Add MANIFEST.in file.

Feature #5: Better Readme and documentation.

Feature #1: Python packaging done right.

Feature #9: Change version numbering.


Owner

Alireza Savand
