hyphenconnect Greenhouse · Posted 3mo ago

Multimodal AI Systems Architect (AI Engineering)

United States

Indexed description

We are seeking a talented Multimodal AI Systems Architect to develop and optimize AI systems that seamlessly integrate vision and audio models. This role focuses on enhancing our voice-to-voice interactions and multimodal retrieval capabilities, ensuring our systems are efficient and innovative.

Responsibilities:

Integrate vision encoders and audio-native models into core agent reasoning loops.
Optimize streaming latency for voice-to-voice AI interactions.
Architect multimodal RAG systems capable of retrieving insights from videos and PDFs.

Qualifications:

Experience with Whisper, CLIP, and multimodal LLM integration.
Knowledge of streaming architectures and WebRTC.
Expertise in cross-modal alignment.

hyphenconnect Company profile preview

Source: Greenhouse
Location: United States
Compensation: Not listed
Open on Caio: 847 roles

Company stats

Current index details for hyphenconnect, based on roles Caio has indexed from public sources.

847 open roles · 2 sources · 3 markets · Latest role posted 2mo ago