AI learns from what you said on Reddit, Stack Overflow or Facebook. Are you okay with that?

CAMBRIDGE, Mass. (AP) — Post a comment on Reddit, answer coding questions on Stack Overflow, edit a Wikipedia article or share a baby photo on your public Facebook or Instagram feed and you’ll also be helping to train the next generation of artificial intelligence.

Not everyone is happy about this, especially since the same online forums where they have been expressing their opinions for years are increasingly being flooded with AI-generated comments that sound like what real people would say.

Some loyal users have tried to delete their previous posts or rewrite them into nonsense, but the protests have had little effect. A handful of governments — including Brazil’s privacy regulator on Tuesday — have also tried to intervene.

“A larger portion of the population just feels helpless,” said Reddit volunteer Sarah Gilbert, who also studies online communities at Cornell University. “There’s no other way than to go completely offline or not contribute in ways that add value to themselves and others.”

Platforms are responding — with mixed results. Take Stack Overflow, the popular hub for computer programming tips. It first banned comments written by ChatGPT because of frequent errors, but now it’s partnering with AI chatbot developers and has punished some of its own users who tried to delete their previous contributions in protest.

It’s one of many social media platforms grappling with user reluctance — and occasional uprisings — as they try to adapt to the changes brought on by generative AI.

Software developer Andy Rotering of Bloomington, Minnesota, has used Stack Overflow daily for 15 years and says he worries that the company is “inadvertently harming its greatest resource”: the community of contributors who have donated time to help other programmers.

“It should be of paramount importance to ensure that contributors remain motivated to comment,” he said.

Stack Overflow CEO Prashanth Chandrasekar said the company is trying to balance the growing demand for immediate chatbot-generated coding support with the desire for a community “knowledge base” where people still want to post and “be recognized” for what they’ve contributed.

“Fast forward five years — there’s going to be all kinds of machine-generated content on the Web,” he said in an interview. “There’s going to be very few places where there’s really authentic, original human thought. And we’re one of those places.”

Chandrasekar likens Stack Overflow’s challenges to one of the “case studies” he learned about at Harvard Business School on how a company survives, or dies, after a disruptive technological change.

For more than a decade, users typically arrived at Stack Overflow after typing a programming question into Google, then finding, copying, and pasting the answer. The answers they saw most often came from volunteers who had been racking up points that gauge their credibility, which in some cases could help them get jobs.

Now, programmers can simply ask an AI chatbot (some of which have already been trained on everything ever posted on Stack Overflow) a question and it can instantly provide an answer.

The launch of ChatGPT in late 2022 threatened to put Stack Overflow out of business. So Chandrasekar assembled a dedicated 40-person team at the company to accelerate the launch of its own specialized AI chatbot, called Overflow AI. The company then struck deals with Google and ChatGPT maker OpenAI, allowing the AI developers to tap into Stack Overflow’s Q&A repository to further improve their large language models.

That kind of strategy makes sense, but it may have come too late, said Maria Roche, an assistant professor at Harvard Business School. “I’m surprised Stack Overflow didn’t work on this sooner,” she said.

When some Stack Overflow users attempted to delete their previous comments after the OpenAI partnership was announced, the company responded by suspending their accounts due to its terms that ensure all contributions are “perpetually and irrevocably licensed to Stack Overflow.”

“We addressed it quickly and said, ‘Look, that is not acceptable behavior,’” said Chandrasekar, who described the protesters as a small minority in the “low hundreds” of the platform’s 100 million users.

Brazil’s national data protection authority took action Tuesday to ban social media giant Meta Platforms from training its AI models on Brazilians’ Facebook and Instagram posts, imposing a daily fine of 50,000 reais ($8,820) for non-compliance.

Meta called it a “step back for innovation” in a statement and said it has been more transparent than many industry peers doing similar AI training on public content, and that its practices comply with Brazilian law.

Meta has also faced resistance in Europe, where it recently shelved plans to use people’s public messages to train AI systems — which were originally supposed to begin last week. In the U.S., where there is no national law protecting online privacy, such training is likely already taking place.

“The vast majority of people have no idea that their data is being used,” Gilbert said.

Reddit has taken a different approach: It has partnered with AI developers like OpenAI and Google, while also making it clear that commercial entities acting “without regard to user rights or privacy” cannot take content in bulk without the platform’s approval. The deals helped give Reddit the cash it needed to debut on Wall Street in March, with investors driving the company’s valuation to nearly $9 billion seconds after it began trading on the New York Stock Exchange.

Reddit hasn’t tried to punish users who protest — and it probably wouldn’t, given how much say volunteer moderators have in what happens in their specialized forums, called subreddits. But what worries Gilbert, who helps moderate the “AskHistorians” subreddit, is the increasing flood of AI-generated comments that moderators must decide whether to allow or ban.

“People come to Reddit because they want to talk to people, they don’t want to talk to bots,” Gilbert said. “There are apps that let them talk to bots if they want to. But Reddit has historically been about connecting with people.”

She said it’s ironic that the AI-generated content threatening Reddit is based on the comments of millions of human Redditors, and “there’s a real risk that it will ultimately drive people away.”

——

Eléonore Hughes, an Associated Press journalist in Rio de Janeiro, contributed to this report.

——

The Associated Press and OpenAI have a licensing and technology agreement that gives OpenAI access to some of AP’s text archives.
