Starred repositories
Official PyTorch Implementation of "WordStylist: Styled Verbatim Handwritten Text Generation with Latent Diffusion Models" - ICDAR 2023
Official Code for ECCV 2024 paper — One-Shot Diffusion Mimicker for Handwritten Text Generation
This repository is the official implementation of Disentangling Writer and Character Styles for Handwriting Generation (CVPR 2023)
Download images from Google, Bing, and Baidu.
Collects the best open-source table recognition models currently available, improves their pre-processing and post-processing, and converts the models to ONNX.
A PyTorch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
[TPAMI'24] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
A toolbox of scene text super-resolution and recognition
Font Datasets used in "Font and Calligraphy Style Recognition Using Complex Wavelet Transform"
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...
Document Image Quality Assessment via Convolutional Neural Network
Real-CE: A Benchmark for Chinese-English Scene Text Image Super-resolution (ICCV2023)
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can be efficiently fine-tuned on downstream datasets.
A collection of papers and resources on scene text image super-resolution.
Swift Parameter-free Attention Network for Efficient Super-Resolution
[TAI 2023] Appearance Enhancement for Camera-captured Document Images in the Wild
Collection of recent shadow removal works, including papers, code, datasets, and metrics.
This repository contains a collection of papers on methods for document image processing, including appearance enhancement, deshadowing, dewarping, deblurring, and binarization.
Detect dog face rectangles and facial landmarks (6 points) using dlib
Cat facial detection and landmark recognition in Python
Rembg is a tool to remove image backgrounds
Open-Sora: Democratizing Efficient Video Production for All
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, try Sync Labs.
[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
This repository contains an Automatic1111 extension that lets users select and apply different styles to their inputs using SDXL 1.0.
Set a separate prompt for each divided region