GPT-4

OpenAI releases update: Achieving real-time reasoning across audio, visual, and text inputs

ChainCatcher news: according to Cointelegraph, OpenAI shipped four model updates in October to help its AI models hold better conversations and improve their image-recognition capabilities.

The first major update is the Realtime API, which lets developers build AI-generated voice applications from a single prompt, enabling natural conversations similar to ChatGPT's Advanced Voice Mode. Previously, developers had to "stitch together" multiple models to create these experiences, and audio input typically had to be fully uploaded and processed before a response came back, producing high latency in real-time uses such as voice interaction. With the Realtime API's streaming capabilities, developers can now achieve instant, natural interactions, just like a voice assistant. The API runs on GPT-4o, released in May 2024, which can reason in real time across audio, visual, and text inputs (a connection sketch follows this item).

Another update provides fine-tuning tools that let developers improve the responses the AI generates from image and text inputs. Image-based fine-tuning helps the model understand images better, strengthening visual search and object detection; the process incorporates human feedback, with people supplying examples of good and bad responses for training (a sample training record is sketched below).

Beyond the voice and visual updates, OpenAI also introduced "model distillation" and "prompt caching," which let smaller models learn from larger ones and cut development cost and time by reusing already-processed text (see the example below). According to Reuters, OpenAI expects its revenue to grow to $11.6 billion next year, up from an estimated $3.7 billion in 2024.
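For readers curious how the streaming interaction works in practice, here is a minimal sketch of connecting to the Realtime API over WebSocket and requesting a text response. The endpoint, model name (`gpt-4o-realtime-preview`), and event names (`response.create`, `response.text.delta`) follow OpenAI's beta announcement and should be checked against current documentation; the third-party `websockets` package is assumed.

```python
# Minimal sketch: open a Realtime API session and stream back one response.
# Audio would be streamed the same way via input_audio_buffer.append events.
import asyncio
import json
import os

import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    # Note: newer versions of the websockets library rename
    # extra_headers to additional_headers.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # Ask the model to produce a response (text-only for brevity).
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {"modalities": ["text"], "instructions": "Say hello."},
        }))
        # Print streamed text deltas as they arrive, then stop.
        async for raw in ws:
            event = json.loads(raw)
            if event["type"] == "response.text.delta":
                print(event["delta"], end="", flush=True)
            elif event["type"] == "response.done":
                break

asyncio.run(main())
```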
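The image fine-tuning workflow centers on a training file in which each line pairs an image with the desired answer. Below is a hedged sketch of one such JSONL record, assuming the chat-format fields used by GPT-4o fine-tuning; the image URL, prompt, and answer are placeholders.

```python
# Sketch: write one vision fine-tuning example (user image + "good" answer)
# as a line of JSONL, the format the fine-tuning API ingests.
import json

example = {
    "messages": [
        {"role": "user", "content": [
            {"type": "text", "text": "What traffic sign is shown?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/sign.jpg"}},  # placeholder
        ]},
        # The assistant turn supplies the desired ("good") response.
        {"role": "assistant", "content": "A stop sign."},
    ]
}

with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```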
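As for distillation and caching, here is a rough sketch of the collection step, assuming the `store` and `metadata` parameters of the official `openai` Python SDK: completions stored from a large "teacher" model can later seed a fine-tuning dataset for a smaller model, while prompt caching applies automatically when requests share a long fixed prefix.

```python
# Sketch: collect a large-model output for later distillation.
from openai import OpenAI  # pip install openai

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-4o",                # the large "teacher" model
    store=True,                    # retain the output for a distillation dataset
    metadata={"task": "distill"},  # tag for filtering stored completions later
    messages=[
        # A long, unchanging system prompt placed first is the part that
        # prompt caching can reuse across requests.
        {"role": "system", "content": "LONG_SHARED_INSTRUCTIONS"},  # placeholder
        {"role": "user", "content": "Summarize prompt caching in one line."},
    ],
)
print(completion.choices[0].message.content)
```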

OpenAI Roadmap: Will reduce GPT-4 API costs, considering open-sourcing GPT-3

ChainCatcher news: according to a blog post from the AI development platform HumanLoop, OpenAI CEO Sam Altman said in a closed-door seminar that OpenAI is currently severely constrained by GPU availability, which has forced it to delay many short-term plans. Most complaints about ChatGPT's reliability and speed stem from the shortage of GPU resources.

Sam Altman also shared OpenAI's near-term roadmap: in 2023, the cost of the GPT-4 API will be reduced; a longer ChatGPT context window (up to 1 million tokens) will become available, along with a future API version that remembers conversation history; and GPT-4's multimodal capabilities will not be publicly available until 2024, since OpenAI cannot scale the visual version of GPT-4 to everyone until it acquires more GPU resources.

Additionally, OpenAI is considering open-sourcing GPT-3. One reason it has not done so yet is that it believes few individuals and companies are capable of properly operating such a large language model. Altman also pushed back on recent claims in many articles that "the era of giant AI models is over," calling them incorrect: OpenAI's internal data indicate that scaling laws still hold, with performance continuing to improve as models grow, and that OpenAI's models may double or triple in scale each year (various sources put GPT-4 at roughly 1 trillion parameters) rather than growing by many orders of magnitude. (Source link)