Topic: multi-modality
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
llama, chatgpt, llama2, chatbot, gpt-4, instruction-tuning, vision-language-model, visual-language-learning, foundation-models, llama-2, llava, multi-modality, multimodal · Python
Updated 7 months ago
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
machine-learning, deep-learning, chatgpt, gpt-4, instruction-tuning, visual-language-learning, foundation-models, multi-modality, apple-vision-pro, artificial-inteligence, egocentric-vision, embodied, embodied-ai, large-scale-models · Python
Updated 7 months ago