Stars
Hibiki is a model for streaming speech translation (also known as simultaneous translation). Unlike offline translation—where one waits for the end of the source utterance to start translating--- H…
Efficient Triton Kernels for LLM Training
A beautiful, simple, clean, and responsive Jekyll theme for academics
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
[AAAI 2024 Oral] AnomalyGPT: Detecting Industrial Anomalies Using Large Vision-Language Models
Audio processing by using pytorch 1D convolution network