You may not have noticed it, but the rapid progress in AI technology has ushered in a new wave of AI-generated content, from surreal images to compelling video and text. This proliferation has opened Pandora's box, unleashing a potential torrent of misinformation and deception and challenging our ability to discern truth from fabrication.
The fear that we are drowning in synthetic content is, of course, not unfounded. Since 2022, AI users have collectively created more than 15 billion images. To put this huge number into perspective, it took humans 150 years to produce that many photographs, a milestone reached only in 2022.
This is an astonishing amount of AI-generated content, and its impact is one we are only beginning to discover. Because of the sheer volume of AI-generated images and content, historians will likely have to view the post-2023 internet as radically different from what came before, much as the atomic bomb disrupted radiocarbon dating. Already, many Google Image searches yield AI-generated results, and genuine evidence of war crimes in the Israel/Gaza conflict is increasingly dismissed as AI-generated when in fact it is not.
Embedding “signatures” into AI content
For the uninitiated, deepfakes are essentially counterfeit content generated using machine learning (ML) algorithms. These algorithms create realistic footage by mimicking human facial expressions and voices, and last month's preview of Sora (OpenAI's text-to-video model) further demonstrated just how quickly virtual reality is becoming indistinguishable from physical reality.
Naturally, with concerns mounting, tech giants have stepped into the fray, proposing solutions to stem the tide of AI-generated content in a preemptive attempt to take control of the situation.
In early February, Meta announced a new effort to label images created using its AI tools on platforms like Facebook, Instagram, and Threads, incorporating visible markers, invisible watermarks, and detailed metadata that signals their artificial origin. Google and OpenAI followed with similar measures aimed at embedding "signatures" within content generated by their AI systems.
These efforts are supported by the Coalition for Content Provenance and Authenticity (C2PA), a group founded in 2021 by Arm, BBC, Intel, Microsoft, Truepic, and Adobe with the aim of developing an open-source internet protocol that traces the origins of digital files and helps distinguish genuine from manipulated content.
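The core idea behind such provenance schemes is to bind a cryptographic hash of the content to a signed set of assertions about its origin, so any later modification can be detected. The sketch below illustrates that principle only; it is not the real C2PA format, which embeds signed JUMBF/CBOR structures inside the file itself, and the manifest fields and generator name here are illustrative assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone

def make_manifest(content: bytes, generator: str) -> str:
    """Build a simplified provenance manifest for a piece of content.

    Mimics the idea behind a C2PA claim (binding a content hash to
    origin assertions), not the actual C2PA wire format.
    """
    manifest = {
        "claim_generator": generator,  # hypothetical AI tool name
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "created": datetime.now(timezone.utc).isoformat(),
        "assertions": [{"label": "ai_generated", "value": True}],
    }
    return json.dumps(manifest)

def verify_manifest(content: bytes, manifest_json: str) -> bool:
    """Check whether the content still matches the hash in the manifest."""
    manifest = json.loads(manifest_json)
    return manifest["content_sha256"] == hashlib.sha256(content).hexdigest()

image_bytes = b"\x89PNG...stand-in image data..."
m = make_manifest(image_bytes, "ExampleImageGenerator/1.0")
print(verify_manifest(image_bytes, m))              # True: untouched
print(verify_manifest(image_bytes + b"x", m))       # False: modified
```

Note what this sketch cannot do: a bare hash only proves the content changed, not who made the claim. Real provenance systems add a digital signature over the manifest, which is precisely where the trust questions discussed below come in.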
These efforts are an attempt to promote transparency and accountability in content creation, which is of course a force for good. But while they are well-intentioned, are we trying to run before we can walk? Will they really be enough to prevent potential abuse of this evolving technology? Or is this a solution ahead of its time?
Who decides what is true?
I ask because such tools immediately create problems of their own. Can detection be made universal without handing those with access the power to abuse it? If not, how do we prevent those who administer the system from abusing it themselves? We find ourselves back at square one, facing the question of who decides what is true. That question is the elephant in the room, and I am evidently not the only one to have noticed it.
This year's Edelman Trust Barometer uncovered important insights into public trust in technology and innovation. The report highlights widespread skepticism about how institutions manage innovation: globally, more people believe innovation is poorly managed (39%) than well managed (22%), and a significant proportion worry that the pace of technological change is not benefiting society as a whole.
The report also points to widespread public skepticism about how companies, NGOs, and governments introduce and regulate new technologies, as well as concerns about science's independence from political and economic interests.
As technology has repeatedly shown, as countermeasures become more sophisticated, so do the problems they are meant to counter (and vice versa). If watermarking is to take hold, we must start by reversing the widespread public mistrust of innovation.
As we have seen, this is easier said than done. Last month, Google Gemini came under fire after producing absurd images through shadow prompting (a method by which an AI model silently rewrites a user's prompt to fit particular biases). One Google employee described it on the X platform as "the most embarrassing" episode of his time at the company, and the model's refusal to produce images of white people made it the center of a culture war. Google apologized, but the damage was done.
Shouldn't CTOs know what data their models are trained on?
Recently, a video of OpenAI CTO Mira Murati being interviewed by The Washington Post went viral. In the clip, Murati is asked what data was used to train Sora, and she replies, "publicly available data and licensed data." Pressed with a follow-up about exactly what data that was, she admitted she wasn't actually sure.
Given how important the quality of training data is, you would expect this to be the central question a CTO discusses when deciding to commit resources to a video transformer. Her subsequent shutting down of that line of questioning (in what was otherwise, I might add, a very friendly interview) is also alarming. There are only two reasonable conclusions to draw from the clip: either she is a lackluster CTO, or she is a lying one.
Of course, there will be many more episodes like this as the technology is rolled out en masse, but if we want to reverse the trust deficit, we need standards in place. General education about what these tools are and why they are needed would be a good start. Consistency in how content is labeled, and steps to hold individuals and organizations accountable when things go wrong, would also be welcome additions. And when things inevitably do go wrong, there needs to be open communication about why they happened the way they did. Throughout every process, transparency is essential.
Without such measures, I fear that watermarking will amount to little more than a whitewash, failing to address the underlying problems of misinformation and declining trust in synthetic content. Rather than serving as a robust tool for verifying authenticity, it could become a token gesture, circumvented by those intent on deceiving or simply ignored by those who assume they are being deceived anyway.
As we are seeing (and in some places have already seen), election interference through deepfakes is likely to define this year's generative AI story. With more than half of the world's population voting and public trust in institutions still at rock bottom, this is the problem that must be solved before something like content watermarking can be expected to swim rather than sink.
Elliott Levy is the founder of Europe's first generative AI consulting company.
DataDecisionMakers
Welcome to the VentureBeat community!
DataDecisionMakers is a place where experts, including technologists who work with data, can share data-related insights and innovations.
If you want to read about cutting-edge ideas, updates, best practices, and the future of data and data technology, join DataDecisionMakers.
You might even consider contributing an article of your own!