Stream-Omni is a GPT-4o-like language-vision-speech chatbot that supports interaction across any combination of text, vision, and speech modalities.