Unified file system operation experience for different backends

Overview

megfile - Megvii FILE library


megfile provides a smooth, consistent way to work with different storage backends (currently the local file system and OSS), which lets you focus on the logic of your own project instead of the question "Which backend is used for this file?"

megfile provides:

  • A nearly unified file system operation experience. A target path can be moved from the local file system to OSS with little effort.
  • Thorough boundary case handling. megfile handles even the most difficult boundary conditions for you, including ones you may not have thought of.
  • Complete type hints and built-in documentation, so you can rely on your IDE's auto-completion and static checking.
  • Semantic versioning and an upgrade guide, which let you adopt the latest features easily.

megfile's advantages are:

  • smart_open can open resources over various protocols, including fs, s3, http(s) and stdio. In particular, the s3 reader and writer in megfile are multi-threaded, which is faster than known competitors.
  • smart_glob is available on s3, and it supports the zsh extended pattern syntax {}, e.g. s3://bucket/video.{mp4,avi}.
  • Comprehensive functions such as smart_exists / smart_stat / smart_sync. If you can't find a function you need, submit an issue.
  • Compatible with the pathlib.Path interface; see S3Path and SmartPath.
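
To make the brace pattern in the smart_glob item concrete, here is a minimal pure-Python sketch of what a pattern such as video-?.{mp4,avi} is expected to match. It is not megfile's implementation: expand_braces and brace_match are hypothetical helpers written only for illustration, they handle non-nested brace groups, and they rely on the standard fnmatch module, whose * also matches path separators.

```python
import fnmatch
import re
from typing import List

# Matches one brace group that contains no nested braces, e.g. "{mp4,avi}".
_BRACE_GROUP = re.compile(r"\{([^{}]*)\}")


def expand_braces(pattern: str) -> List[str]:
    """Expand non-nested {a,b} groups into a list of plain patterns."""
    match = _BRACE_GROUP.search(pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []  # type: List[str]
    for option in match.group(1).split(","):
        # Recurse so that later brace groups in the tail are expanded too.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def brace_match(name: str, pattern: str) -> bool:
    """Return True if name matches any brace-expanded variant of pattern."""
    return any(fnmatch.fnmatchcase(name, p) for p in expand_braces(pattern))


patterns = expand_braces("s3://bucket/video.{mp4,avi}")
```

Under these assumptions, the single pattern stands for one plain glob pattern per listed extension, and ? still matches exactly one character.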

Quick Start

Here's an example of writing a file to OSS, syncing it to the local file system, reading it and finally deleting it.

from megfile import smart_open, smart_exists, smart_sync, smart_remove, smart_glob
from megfile.smart_path import SmartPath

# open a file in an s3 bucket for writing
with smart_open('s3://playground/refile-test', 'w') as fp:
    fp.write('refile is not silver bullet')

# test whether the file exists in the s3 bucket
smart_exists('s3://playground/refile-test')

# copy files or directories
smart_sync('s3://playground/refile-test', '/tmp/playground')

# remove files or directories
smart_remove('s3://playground/refile-test')

# glob files or directories in s3 bucket
smart_glob('s3://playground/video-?.{mp4,avi}')

# or in local file system
smart_exists('/tmp/playground/refile-test')

# smart_open also supports protocols like http / https
smart_open('https://www.google.com')

# SmartPath interface
path = SmartPath('s3://playground/megfile-test')
if path.exists():
    # open in binary mode so that read() returns bytes
    with path.open('rb') as f:
        result = f.read(7)
        assert result == b'megfile'
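
The point of the unified interface is that the same function can serve any backend. The sketch below illustrates that idea without requiring megfile: count_lines and its opener parameter are hypothetical names, and the demo uses the built-in open on a temporary local file. Passing smart_open as the opener assumes it accepts a path and a mode the way the built-in open does, as the Quick Start calls above suggest.

```python
import os
import tempfile
from typing import IO, Callable


def count_lines(path: str, opener: Callable[..., IO] = open) -> int:
    """Count the lines of a text file using any open()-compatible callable."""
    with opener(path, "r") as fp:
        return sum(1 for _ in fp)


# Demo on the local file system with the built-in open().
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.txt")
    with open(demo_path, "w") as fp:
        fp.write("first\nsecond\n")
    local_count = count_lines(demo_path)

# With megfile installed, the same function could be pointed at OSS
# (assumption: smart_open takes (path, mode) like the built-in open):
#
#     from megfile import smart_open
#     count_lines('s3://playground/refile-test', opener=smart_open)
```

Keeping the opener as a parameter means the business logic never needs to know which backend a path refers to.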

Installation

PyPI

pip3 install megfile

You can also pin a specific megfile version:

pip3 install "megfile~=0.0"

Build from Source

megfile can be installed from source

git clone git@github.com:megvii-research/megfile.git
cd megfile
pip3 install -U .
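
After either installation route, you may want to confirm which version ended up in your environment. The following is a small sketch using only the standard library (importlib.metadata, available from Python 3.8); installed_version is a hypothetical helper name, not part of megfile.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed megfile version, or None if megfile is not installed.
print(installed_version("megfile"))
```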

Development Environment

git clone git@github.com:megvii-research/megfile.git
cd megfile
sudo apt install libgl1-mesa-glx libfuse-dev fuse
pip3 install -r requirements.txt -r requirements-dev.txt

How to Contribute

  • Everyone is welcome to contribute code to the megfile project. Contributed code should meet the following conditions as far as possible:

    You can submit code even if it doesn't meet these conditions. The project members will evaluate it and help you make the necessary changes.

    • Code format: Your code needs to pass the code format check. megfile uses yapf as its lint tool, with the version locked at 0.27.0. The version lock may be removed in the future.

    • Static check: Your code needs complete type hints. megfile uses pytype as its static check tool. If pytype fails the static check, use # pytype: disable=XXX to disable the error, and please tell us why you disabled it.

      Note: Because pytype doesn't support variable type annotations, the variable type hint syntax introduced in Python 3.6 cannot be used.

      For example, variable: int is invalid; replace it with variable # type: int

    • Test: Your code needs complete unit test coverage. megfile uses pyfakefs and moto to simulate the local file system and OSS in unit tests. Newly added code should come with complete unit tests to ensure its correctness.

  • You can help to improve megfile in many ways:

    • Write code.
    • Improve documentation.
    • Report or investigate bugs and issues.
    • If you find a problem or have a suggestion for improvement, submit a new issue. We will reply as soon as possible and evaluate whether to adopt it.
    • Review pull requests.
    • Star megfile repo.
    • Recommend megfile to your friends.
    • Any other form of contribution is welcome.
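
The type hint rule from the static check item above can be sketched as follows. This is an illustrative example rather than code from the megfile repository: first_even is a hypothetical function, annotated in its signature as usual, while the local variable uses a type comment instead of the Python 3.6 annotation syntax.

```python
from typing import List, Optional


def first_even(numbers: List[int]) -> Optional[int]:
    """Return the first even number in numbers, or None if there is none."""
    # Variable hint written as a type comment; the guideline asks
    # contributors to avoid the form `result: Optional[int] = None`.
    result = None  # type: Optional[int]
    for number in numbers:
        if number % 2 == 0:
            result = number
            break
    return result
```

Type comments are plain comments at runtime, so this style changes nothing about how the code executes; it only affects what the static checker sees.
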
Owner
MEGVII Research