The big change from GPT-3.5 is that OpenAI's fourth-generation language model is multimodal, which means it can process text, images and audio. You can show it images and it will respond to them alongside a text prompt – an early example https://mickq764zkw7.bloggosite.com/profile