ChatGPT is no longer alone: Claude 2 is overtaking it.

We may be witnessing the last days of ChatGPT's reign over artificial intelligence.

Stanford University's paper "How Is ChatGPT's Behavior Changing Over Time?" is flooding the Internet, documenting a clear loss of quality in ChatGPT. Meanwhile, Claude, ChatGPT's biggest competitor, is only getting stronger.

Along the way, it has broken into several unexplored areas of generative AI and opened the door to use cases that were impossible only days ago.

Claude's version 2 is now available, and it brings features that make it unique in several categories.

  But how scared should OpenAI be?

In my opinion, very.

  The battle is real.

GPT-4 has been the undisputed king of large language models (LLMs) since its release.

Its raw ability lets it beat every other chatbot in most benchmarks: it covers an ultra-wide data distribution (it correctly recalls huge amounts of obscure content) and thrives on complex reasoning problems.

Needless to say, the strongest evidence that GPT-4 is considered the best model in the game is that most open-source models trained by distillation (a process in which a larger model teaches a smaller one) use ChatGPT as the teacher model, even though it is the most expensive option.

But a large number of users, now including Stanford University, have publicly demonstrated a clear loss of quality.

  The king is no longer invincible.

The safety trade-off

It is generally believed that the usefulness of artificial intelligence is inversely proportional to its safety.

Simply put, if you train your chatbot to be "more helpful", that usually means it is open to helping with more topics, which makes it easier to get help on topics it, to put it mildly... shouldn't touch.

On the other hand, if you prioritize making it "safer", it usually becomes less helpful and more focused on complying with human principles.

An example of this vision is Inflection's chatbot Pi; Inflection has publicly declared that its goal is not to build the smartest chatbot, but the safest one.

The strongest proof comes from our friend GPT-4, the LLM behind ChatGPT's most advanced mode.

As the Stanford paper linked at the end of this article shows, it is now clear that OpenAI is committed to making ChatGPT a "safer" experience.

Overall, the "GPT-4 experience" has deteriorated sharply: it has gotten worse at mathematics, worse at code generation, and more reluctant to answer sensitive questions.

In addition, although not shown in the figure, the June 2023 update also made GPT-4 less verbose while speeding up text generation.

  The reason for this is very clear:

ChatGPT is safer to use now... but at a price. As the paper puts it, these "LLM services may have become safer, but also provide less rationale when refusing to answer certain questions." And by running shorter completions, they are also trying to save on inference costs.

I can understand why they would want both things, but the timing could not be worse, because Claude is taking the opposite route.

  ChatGPT is getting worse ... but its main competitor is getting better.

Watch out, Bugs Bunny, you're not the coolest anymore.

As mentioned above, a group of outstanding former OpenAI researchers, now Anthropic, are competing with OpenAI through their model Claude, whose version 1 was already considered "comparable" to ChatGPT's version 3.5.

But if you think this is just a head-to-head comparison, you are dead wrong.

In fact, Claude brings many things that set it apart and make it unique.

To show how unique Claude has become, we need to look closely at its distinctive features. Two of them stand above the rest.

Bigger and more up to date.

As is well known, ChatGPT's knowledge is cut off at September 2021, while Claude's training data extends into early 2023.

In addition, ChatGPT is limited to an already impressive 32k tokens, or about 26,000 words, per prompt, while Claude can handle up to 100k tokens, or about 75,000 words.

For reference, that is the complete script of six Star Wars movies.
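
To get a feel for these numbers, here is a quick sketch using OpenAI's tiktoken tokenizer. Claude uses a different tokenizer, so for it the count is only approximate, and the file name is hypothetical:

```python
# Rough token count for a large document (tiktoken is OpenAI's tokenizer;
# Claude's differs slightly, so treat the numbers as approximate).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = open("six_star_wars_scripts.txt").read()  # hypothetical file

n_tokens = len(enc.encode(text))
print(f"{n_tokens:,} tokens")
print("Fits in Claude's 100k window:", n_tokens <= 100_000)
print("Fits in GPT-4's 32k window:", n_tokens <= 32_000)
```

The rule of thumb is roughly 0.75 words per token in English, which is where the 26,000-word and 75,000-word figures above come from.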

Considering that you can also send Claude multiple documents at a time (ChatGPT can't), this unlocks a capability not seen in AI until now: you can provide several source documents and generate content that takes all of them into account.

  Are you considering creating a report on a topic that requires multiple documents and thousands of words?

Claude has you covered, and the most impressive part is... only Claude can, because for this use case it is literally your only option.
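
For the curious, here is a minimal sketch of that workflow using Anthropic's Python SDK as it looked at Claude 2's launch; the file names and prompt wording are my own invention:

```python
# Sketch: feeding several documents to Claude 2 in a single prompt via the
# 2023 text-completions interface of Anthropic's Python SDK.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical file names, purely for illustration.
docs = [open(path).read() for path in ["report_q1.txt", "report_q2.txt", "notes.txt"]]
sources = "\n\n---\n\n".join(docs)

completion = client.completions.create(
    model="claude-2",
    max_tokens_to_sample=1024,
    prompt=(
        f"{anthropic.HUMAN_PROMPT} Using all of the documents below, write one "
        f"report that reconciles them:\n\n{sources}{anthropic.AI_PROMPT}"
    ),
)
print(completion.completion)
```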

  Claude is unique in some use cases.

Claude's other impressive feature is its new coding capabilities, which matter all the more now that GPT-4 has gotten worse in this area.

Claude can understand code, debug code, and even propose or implement improvements on command.

This means Claude 2 is poised to become a real competitor to GitHub Copilot and GPT-4 as a coder's copilot assistant, something nobody could have said until today.
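
As a rough illustration, asking Claude 2 to debug a snippet uses the same completions interface shown earlier; the buggy function here is invented for the example:

```python
# Sketch: a code-debugging prompt for Claude 2 via Anthropic's Python SDK.
import anthropic

client = anthropic.Anthropic()

# An invented buggy function, for illustration only.
buggy_code = '''
def mean(values):
    return sum(values) / (len(values) - 1)  # off-by-one in the denominator
'''

response = client.completions.create(
    model="claude-2",
    max_tokens_to_sample=512,
    prompt=(
        f"{anthropic.HUMAN_PROMPT} Find the bug in this function, fix it, and "
        f"explain the fix:\n{buggy_code}{anthropic.AI_PROMPT}"
    ),
)
print(response.completion)
```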

But what makes Claude truly different? In fact, it is how it was trained.

  Rule AI with AI

When building a chatbot like ChatGPT, there is a critical step in the process that only a handful of companies in the world can afford: RLHF.

Reinforcement Learning from Human Feedback (RLHF) not only teaches the AI to be good at conversation, but also helps it "unlearn" some of the biases picked up during pre-training.

Because ChatGPT is pre-trained by self-supervision on Internet text, and the Internet is full of hate, racism, and other unacceptable biases, RLHF is used to counteract this.

In the context of LLMs, training by self-supervision means that humans do not manually annotate the data to teach the model (telling it what the correct answer is, which would make it a supervised process); instead, the next word is simply hidden and the model learns to predict it. With billions of words fed to the model, a supervised process would not be feasible. The next step of the training pipeline, however, is supervised: the dataset holds only thousands of examples (so manual annotation is feasible) and teaches the model to operate in a conversational way. The model at this stage is called the SFT (supervised fine-tuning) model and precedes the RLHF stage.
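
To make the distinction concrete, here is a minimal PyTorch sketch of the self-supervised next-token objective described above; names and shapes are illustrative:

```python
# Self-supervised next-token prediction: the text supervises itself,
# no human annotation required.
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq, vocab) model outputs; tokens: (batch, seq) input ids."""
    # The "label" for position t is simply the token at position t + 1.
    pred = logits[:, :-1, :].reshape(-1, logits.size(-1))
    target = tokens[:, 1:].reshape(-1)
    return F.cross_entropy(pred, target)
```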

To carry out this RLHF process, OpenAI used humans to teach ChatGPT what is good and what is bad.

But Anthropic trained Claude differently.

They used an AI that follows a set of human-written principles (called a "constitution") to teach Claude what is good and what is bad.

They call this method "Constitutional AI", and it makes Claude a chatbot that could take us into a world where humans no longer directly decide what is good or bad for the model.
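
Conceptually, the supervised phase of Constitutional AI looks something like the sketch below; `generate` is a placeholder for any LLM call, and the two principles are paraphrases, not Anthropic's actual constitution:

```python
# Sketch of the Constitutional AI critique-and-revise loop.
CONSTITUTION = [
    "Choose the response that is least harmful.",
    "Choose the response that is most respectful and fair.",
]

def generate(prompt: str) -> str:
    raise NotImplementedError  # stand-in for an actual model call

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        critique = generate(
            f"Critique the response below against this principle: {principle}\n\n"
            f"Response: {draft}"
        )
        draft = generate(
            f"Rewrite the response to address the critique.\n"
            f"Critique: {critique}\n\nResponse: {draft}"
        )
    return draft  # revised answers become supervised fine-tuning data
```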

  Man should not play God.

Anthropic believes that no particular group of humans is fit to "play God" and teach models what is right; instead, we should use AIs guided by humanity's global principles to steer other AIs.

RLAIF vs. RLHF, and moral self-correction

If we are creating something much smarter than ourselves (what the AI world calls a superintelligence), what makes us think humans are ready to control such a superior "being"?

Therefore, Anthropic decided to use other AIs to guide Claude, defining RLAIF: Reinforcement Learning from AI Feedback. The process is the same as RLHF, but the feedback comes from an AI instead of a human.
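
In sketch form, the only change from RLHF is who produces the preference label. A minimal illustration, where `generate` again stands in for any LLM call rather than a real API:

```python
# Sketch of RLAIF preference labeling: an AI judge replaces the human rater.
def generate(prompt: str) -> str:
    raise NotImplementedError  # stand-in for an actual model call

def ai_preference(prompt: str, response_a: str, response_b: str) -> str:
    """Return 'A' or 'B', the label a reward model would be trained on."""
    verdict = generate(
        "According to the constitution, which response is more helpful and "
        "harmless, (A) or (B)? Answer with one letter.\n"
        f"Prompt: {prompt}\n(A) {response_a}\n(B) {response_b}"
    )
    return "A" if "A" in verdict.upper() else "B"
```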

But Anthropic also uses other techniques to make its model "safer", one of which is moral self-correction.

Just tell it to be fair

According to their research, linked at the end of this article, LLMs with more than 22 billion parameters are large enough to learn complex concepts such as "fairness" or "injustice", and their bias decreases when they are instructed to avoid it.

Most importantly, if you tell such an LLM to be fair, it will be fair.

Logically, because many future users will not explicitly ask Claude to be fair, Anthropic fine-tuned Claude 2 to be naturally inclined to abide by its constitution, a set of principles (such as fairness) that must not be violated. The result is probably the most harmless chatbot available today (although OpenAI seems determined to contest that title, even at the cost of destroying ChatGPT's usefulness).
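
For reference, the intervention studied in the paper boils down to prepending an instruction to the prompt; a minimal sketch, with wording paraphrased rather than quoted:

```python
# Moral self-correction, reduced to its essence: for models past the ~22B
# scale, an instruction like this measurably reduces biased completions.
instruction = "Please ensure that your answer is unbiased and does not rely on stereotypes."
question = "..."  # any bias-sensitive question
prompt = f"{instruction}\n\n{question}"
```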

If we take this study at face value, three important conclusions follow:

- Size still matters: the larger the model, the more complex the representations it can learn about our world.
- Emergent behavior seems to be real: even though the design principles of LLMs stay the same, unexpected behaviors appear as scale increases, and they are often beneficial.
- We really don't understand LLMs at all. For example, the researchers demonstrated the importance of the 22-billion-parameter milestone, but did not explain why it happens at 22 billion rather than 21. All of this shows how poorly we understand the new technology we are building.

The game has entered a white-hot stage.

If you think no one can stand up to OpenAI, you are forgetting that many of the researchers who put it where it is today have left the company and are building other excellent ones, such as Anthropic and its chatbot Claude.

To be honest, the idea of "governing AI with AI" appeals to me, but taking humans out of the pipeline may be dangerous.

In the meantime, Claude has just shown us the strength of GenAI. If you are lucky enough to live in the United States or the UK, you can try it yourself.