- Fondazione Bruno Kessler
- https://fabiopoiesi.github.io/
Stars
Code for the paper "IFFNeRF: Initialisation Free and Fast 6DoF pose estimation from a single image and a NeRF model"
A curated list of 3D vision papers related to robotics in the era of large models (i.e., LLMs/VLMs), inspired by awesome-computer-vision, including papers, code, and related websites
Given a file or YouTube URL, generates the transcript; supports multiple speakers (diarization)
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, and other large language models.
Code of the paper: 6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model
We write your reusable computer vision tools. 💜
Source code of CVPR 2024 paper 'FastMAC: Stochastic Spectral Sampling of Correspondence Graph'
An anomaly detection library comprising state-of-the-art algorithms and features such as experiment management, hyper-parameter optimization, and edge inference.
Fast maximal clique finder and robust registration library
An overview of different quaternion implementations and their chosen order: x-y-z-w or w-x-y-z?
Metric depth estimation from a single image
OCR, layout analysis, reading order, table recognition in 90+ languages
The official PyTorch implementation of Google's Gemma models
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
This repository is built in association with our position paper on "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As a part of this release we share the informati…
#1 Locally hosted web application that allows you to perform various operations on PDF files
Visual Speech Recognition for Multiple Languages
A toolkit for making real world machine learning and data analysis applications in C++
[CVPR'24 Oral] Official repository of Point Transformer V3 (PTv3)
A CUDA implementation of Bundle Adjustment
A Python script to help manage a Gmail inbox by filtering out promotional emails using GPT-3 or GPT-4.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
📷 📷 📷 📷 📷 Wireless software synchronization of multiple distributed smartphone cameras.
[ICCV 2023] SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes