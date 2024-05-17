



For months, OpenAI has been losing employees who care deeply about making sure AI is safe. Now the company is hemorrhaging them.

Ilya Sutskever and Jan Leike announced on Tuesday that they were leaving OpenAI, the developer of ChatGPT. They were the leaders of the company's superalignment team, tasked with ensuring that AI stays aligned with the goals of its creators rather than acting unpredictably and harming humanity.

They weren't the only ones who left. Since OpenAI's board of directors tried to fire CEO Sam Altman last November, only to see him quickly return to power, at least five more of the company's most safety-conscious employees have either resigned or been pushed out.

What's happening?

If you've been following this story on social media, you might think that OpenAI has quietly made a huge technological breakthrough. The meme "What did Ilya see?" speculates that Sutskever, the former chief scientist, quit because he saw something terrifying, like an AI system that could wipe out humanity.

But the real answer may have less to do with pessimism about technology and more to do with pessimism about people, and one person in particular: Altman. People familiar with the company say safety-minded employees have lost confidence in him.

A person familiar with the company's internal affairs, speaking on condition of anonymity, described it as a process of trust collapsing bit by bit, like dominoes falling one by one.

Not many employees are willing to talk about this publicly. Part of the reason is that OpenAI is known for getting departing employees to sign offboarding agreements that contain non-disparagement provisions. If you refuse to sign, you give up your equity in the company, which could mean losing millions of dollars.

However, one former employee refused to sign the offboarding agreement so that he would be free to criticize the company. Daniel Kokotajlo joined OpenAI in 2022 with the hope of steering the company toward the safe deployment of AI, and worked on the governance team until he left last month.

"OpenAI is training ever more powerful AI systems with the goal of eventually surpassing human intelligence across the board. This may be the best thing to ever happen to humanity, but it could also be the worst if we don't proceed with caution," Kokotajlo said this week.

OpenAI says it wants to build artificial general intelligence (AGI), a hypothetical system that can perform at human or superhuman levels across many domains.

"I came in with high hopes that OpenAI would rise to the occasion and act more responsibly as it moved closer to achieving AGI. For many of us, it became increasingly clear that this was not going to happen," Kokotajlo said. "I gradually lost faith in OpenAI's leadership and their ability to handle AGI responsibly, so I quit."

And Leike, explaining in a thread on X on Friday why he resigned as co-leader of the superalignment team, described a very similar situation. “I have disagreed with OpenAI's leadership about the company's core priorities for quite some time, but I have finally reached a breaking point,” he wrote.

OpenAI did not respond to a request for comment in time for publication.

Why OpenAI's safety team came to distrust Sam Altman

To understand what happened, we have to rewind to November of last year. At that time, Sutskever worked with the OpenAI board to try to remove Altman. The board said Altman had not been consistently candid in his communications. Translation: We don't trust him.

The ouster failed spectacularly. Altman and his ally, company president Greg Brockman, threatened to take OpenAI's top talent to Microsoft, effectively destroying OpenAI, unless Altman was reinstated. Faced with that threat, the board relented. Altman came back more powerful than ever, with new, more supportive board members and a freer hand to run the company.

If you shoot at the king and miss, things can get tricky.

In public, Sutskever and Altman appeared to remain friends. When Sutskever announced his resignation this week, he said he was leaving to pursue a project that is very personally meaningful to him. Altman posted on X two minutes later, saying that the departure was very sad to him and that Ilya is a dear friend.

However, Sutskever has not been seen at the OpenAI office for about six months, ever since the attempted coup. He has been remotely co-leading the superalignment team, which is tasked with ensuring that a future AGI aligns with humanity's goals rather than going rogue. That's a fine enough ambition, but it is far removed from the day-to-day operations of a company racing to commercialize products under Altman's leadership. And then there was the following tweet, posted shortly after Altman's return and quickly deleted:

So despite their public show of friendship, there is reason to doubt that Sutskever and Altman remained friends after the former tried to oust the latter.

And Altman's reaction to being fired revealed something about his character. His threat to hollow out OpenAI unless the board rehired him, and his insistence on filling the board with new members skewed in his favor, showed a determination to hold on to power and avoid future checks on it. Former colleagues and employees have described him as a manipulator who speaks out of both sides of his mouth: someone who says he wants to prioritize safety but whose actions contradict that.

For example, it emerged that Altman was seeking funding from autocratic regimes such as Saudi Arabia to start a new AI chip-making company, which would give him a vast supply of the coveted resources needed to build cutting-edge AI. This alarmed safety-conscious employees. If Altman really cares about building and deploying AI in the safest way possible, why is he seemingly in a mad dash to accumulate as many chips as possible, which will only accelerate the technology? And for that matter, why would he take on the safety risk of working with regimes that could use AI to escalate digital surveillance and human rights abuses?

"For employees, all of this led to a gradual loss of belief that when OpenAI says it's going to do something or says it cares about something, that is actually true," a source familiar with the company's internal affairs told me.

That gradual process culminated this week.

Jan Leike, co-leader of the superalignment team, did not bother to play nice. Just hours after Sutskever announced his departure, Leike posted on X simply that he had resigned. No warm farewell. No vote of confidence in the company's leadership.

Other safety-minded former employees quote-tweeted Leike's blunt resignation, adding heart emojis. One of them was Leopold Aschenbrenner, a Sutskever ally and member of the superalignment team who was fired from OpenAI last month. According to media reports, he and Pavel Izmailov, another researcher on the same team, were fired for allegedly leaking information. However, OpenAI has not provided any evidence of a leak. And given the strict confidentiality agreements everyone signs when first joining OpenAI, it would be easy for Altman, a deeply networked Silicon Valley veteran who is an expert at working the press, to portray the sharing of even the most innocuous information as a leak if he were keen to get rid of Sutskever's allies.

The same month that Aschenbrenner and Izmailov were fired, another safety researcher, Cullen O'Keefe, also left the company.

And two weeks ago, another safety researcher, William Saunders, published a cryptic post on the EA Forum, an online gathering place for members of the effective altruism movement, who have been deeply involved in the cause of AI safety. Saunders summarized the work he had done at OpenAI as part of the superalignment team. He then wrote that he had left OpenAI on February 15, 2024. One commenter asked the obvious question: Why was Saunders posting this?

Saunders replied that he had no comment. Commenters concluded that he was probably bound by a non-disparagement agreement.

All of this, combined with my conversations with company insiders, points to at least seven people who tried to push OpenAI toward greater safety from within but ultimately lost so much trust in its charismatic leader that their position became untenable.

"I think many people in the company who take safety and social impact seriously consider this an open question: Is it a good idea to work for a company like OpenAI?" said a person familiar with the company's internal affairs. "And the answer is yes only to the extent that OpenAI is truly thoughtful and responsible about what it is doing."

Now that the safety team has been dismantled, who will ensure that OpenAI is working safely?

With Leike no longer running the superalignment team, OpenAI has replaced him with company co-founder John Schulman.

However, the team has been hollowed out. And Schulman already has his hands full with his existing full-time job of ensuring that OpenAI's current products are safe. How much serious, forward-looking safety work can we expect from OpenAI in the future?

Probably not much.

"The whole point of creating the superalignment team was that there are actually different kinds of safety issues that arise if the company succeeds in building AGI," the person familiar with the company told me. "So this was a dedicated investment in that future."

Even when the team was at full capacity, that dedicated investment involved only a small fraction of OpenAI's researchers and was promised only 20% of the company's computing power, perhaps the most important resource at an AI company. Now, that computing power may be siphoned off to other OpenAI teams, and it's unclear whether there will be much focus on avoiding catastrophic risks from future AI models.

To be clear, this does not mean that the products OpenAI is releasing now, such as GPT-4o, the new version of ChatGPT that can hold natural-sounding conversations with users, are going to destroy humanity. But what's coming down the pike?

"It is important to distinguish between 'Are they currently building and deploying unsafe AI systems?' and 'Are they on track to build and deploy AGI or superintelligence safely?'" the person familiar with the company said. "I think the answer to the second question is no."

Leike expressed the same concerns in his Friday thread on X. He noted that his team had been struggling to get enough computing power to do its work and had generally been sailing against the wind.

Most strikingly, Leike said that much more of the company's bandwidth should be spent preparing for the next generations of models, on security, monitoring, preparedness, safety, adversarial robustness, (super)alignment, confidentiality, societal impact, and related topics. These problems are very hard to get right, he wrote, and he is concerned that the company is not on a trajectory to get there.

When one of the world's leading minds in AI safety says that the world's leading AI company is not on the right track, we all have reason to be concerned.

