GitHub - binhrobles/improved-parakeet: Demo project for live, broadcasted transcription using Amazon Transcribe + EB

Overview

Learnings

AWS transcribe streams

there's also a medical option (Amazon Transcribe Medical), but I didn't explore it
streaming functionality is different from Transcribe batch jobs
- https://docs.aws.amazon.com/transcribe/latest/dg/streaming.html
transcribe streaming API responds chunk by chunk to the sender
- so if streaming from client, client would need to track all listeners
- better to have a separate backend service handling those connections
- allows more flexibility with what to do with the transcribed chunks
presigned url generation would've been a HUGE headache without a good community library
- https://github.com/qasim9872/aws-transcribe
- official AWS examples suck/are too low level
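To give a feel for why presigning is a headache, here is a minimal sketch of SigV4 query-string signing for the streaming websocket endpoint, using only the Python standard library. It is not the code used in this project (that relied on the community library linked above). The function name `presign_transcribe_url`, its parameters, and the default values are assumptions for illustration, and the endpoint, port, and query parameter names reflect my reading of the AWS streaming docs, so verify them there before relying on this. Temporary credentials would also need an `X-Amz-Security-Token` parameter, which is left out here.

```python
import hashlib
import hmac
from datetime import datetime, timezone
from urllib.parse import quote


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def presign_transcribe_url(access_key, secret_key, region="us-east-1",
                           language_code="en-US", sample_rate=16000,
                           expires=300, now=None):
    """Build a presigned wss:// URL (hypothetical helper, SigV4 query signing)."""
    now = now or datetime.now(timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    service = "transcribe"
    # Assumed endpoint layout; check the streaming docs for your region.
    host = f"transcribestreaming.{region}.amazonaws.com:8443"
    path = "/stream-transcription-websocket"
    scope = f"{date_stamp}/{region}/{service}/aws4_request"

    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
        "language-code": language_code,
        "media-encoding": "pcm",
        "sample-rate": str(sample_rate),
    }
    # Canonical query string: URI-encoded and sorted by parameter name.
    canonical_query = "&".join(
        f"{quote(k, safe='-_.~')}={quote(v, safe='-_.~')}"
        for k, v in sorted(params.items())
    )

    # Canonical request: method, path, query, headers, signed headers, payload hash.
    canonical_request = "\n".join([
        "GET",
        path,
        canonical_query,
        f"host:{host}\n",
        "host",
        hashlib.sha256(b"").hexdigest(),  # empty payload
    ])
    string_to_sign = "\n".join([
        "AWS4-HMAC-SHA256",
        amz_date,
        scope,
        hashlib.sha256(canonical_request.encode("utf-8")).hexdigest(),
    ])

    # Derive the signing key: date -> region -> service -> "aws4_request".
    key = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    key = _hmac_sha256(key, region)
    key = _hmac_sha256(key, service)
    key = _hmac_sha256(key, "aws4_request")
    signature = hmac.new(key, string_to_sign.encode("utf-8"),
                         hashlib.sha256).hexdigest()

    return f"wss://{host}{path}?{canonical_query}&X-Amz-Signature={signature}"
```

Every step has to match byte for byte what AWS computes on its side, which is why a maintained library is the saner choice.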
no auto-lang detect: language must be declared on connection
struggles with informal speech/accents
domain-specific speech can be improved with custom vocabularies
- https://docs.aws.amazon.com/transcribe/latest/dg/how-vocabulary.html
no multi-channel/multi-speaker support
AWS-provided HTTP/2 clients are only available for select languages (not JS)
streaming API expects very specific 16-bit PCM audio format
- this was a struggle until I found a lib
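For reference, the format in question is 16-bit signed little-endian PCM. A minimal sketch of the conversion, assuming the capture side hands you float samples in the range [-1.0, 1.0] (as browser audio APIs typically do). The function name `float_to_pcm16` is hypothetical, and this is not the library the project ended up using:

```python
import struct


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to 16-bit signed little-endian PCM bytes."""
    out = bytearray()
    for s in samples:
        # Clip out-of-range samples instead of letting them wrap around.
        s = max(-1.0, min(1.0, s))
        # int16 is asymmetric: -32768..32767, so scale each side separately.
        value = int(s * 32767) if s >= 0 else int(s * 32768)
        out += struct.pack("<h", value)  # "<h" = little-endian signed 16-bit
    return bytes(out)
```

Resampling to the rate declared on the connection is a separate step and is not covered here.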

AWS Translate streams

allows auto detection of text language
300-500 ms latency, which seemed to do fine when piping in Transcribe partial output

Modern audio streaming architectures

for web tech, websockets are king, but HTTP/2 + SSE is also possible
API Gateway (APIG) is an option for simpler use cases
should separate inbound/outbound concerns with a data streaming architecture
could be a managed service or another container, depending on the tradeoff between portability, scale, and persistence
Kinesis, Kafka, Redis Pub/sub, Redis streams, ...
data stream tech discussion
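To illustrate the inbound/outbound split, here is a rough in-memory stand-in for the streaming layer: the inbound side (whatever receives Transcribe chunks) only calls `publish`, and each outbound listener connection reads from its own queue. The `TranscriptHub` class and its method names are hypothetical; a real deployment would swap this for one of the options listed above (Kinesis, Kafka, Redis pub/sub, ...).

```python
import asyncio


class TranscriptHub:
    """In-process fan-out: one publisher, many listener queues (sketch only)."""

    def __init__(self, max_backlog=100):
        self._listeners = set()
        self._max_backlog = max_backlog

    def subscribe(self):
        # Each listener gets its own bounded queue.
        queue = asyncio.Queue(maxsize=self._max_backlog)
        self._listeners.add(queue)
        return queue

    def unsubscribe(self, queue):
        self._listeners.discard(queue)

    def publish(self, chunk):
        for queue in list(self._listeners):
            if queue.full():
                # Slow listener: drop its oldest chunk rather than block everyone.
                queue.get_nowait()
            queue.put_nowait(chunk)


async def demo():
    hub = TranscriptHub()
    a, b = hub.subscribe(), hub.subscribe()
    hub.publish("hello")
    hub.publish("hello world")
    hub.unsubscribe(b)  # b disconnects and misses the last chunk
    hub.publish("hello world again")
    drained_a = [a.get_nowait() for _ in range(a.qsize())]
    drained_b = [b.get_nowait() for _ in range(b.qsize())]
    return drained_a, drained_b
```

Running `asyncio.run(demo())` shows the inbound side never needing to know who is listening. The in-memory version gives up persistence and cross-process scale, which is exactly the tradeoff noted above.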

ADO (Azure DevOps)

templating of specific pipeline pieces super useful
- should create shareable example templates for common scenarios
would like to cache intermediate docker images
- yarn caching a bit more difficult when inside docker container

Elastic Beanstalk

Dockerrun.aws.json defines a single Task which is created on each underlying EC2 instance
horizontal scaling would thus duplicate the entire definition
if you want more granular scaling, move to ECS, or multiple EB apps
small hack required to enable websockets
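As a rough picture of why scaling duplicates everything, here is a sketch of a multi-container `Dockerrun.aws.json` (version 2), built as a Python dict and rendered to JSON. The image names, memory values, and port mapping are hypothetical placeholders and not this repo's actual file. Every container in `containerDefinitions` is part of the one Task that runs on each instance.

```python
import json

# Hypothetical two-container definition; each EC2 instance runs all of it.
dockerrun = {
    "AWSEBDockerrunVersion": 2,
    "containerDefinitions": [
        {
            "name": "transcriber",
            "image": "example/transcriber:latest",  # placeholder image
            "essential": True,
            "memory": 256,
        },
        {
            "name": "nginx",
            "image": "example/nginx:latest",  # placeholder image
            "essential": True,
            "memory": 128,
            "portMappings": [{"hostPort": 80, "containerPort": 80}],
            "links": ["transcriber"],
        },
    ],
}


def render_dockerrun(definition):
    """Serialize the definition as it would appear in Dockerrun.aws.json."""
    return json.dumps(definition, indent=2)
```

Scaling from one instance to two gives two nginx containers and two transcriber containers, with no way to scale just one of them inside a single EB environment.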

