The Cookie Jar, January 2024
NYT-OpenAI showdown; AVs too ambitious; Sharenting 2.0; Captive internet; Duolingo catches the AI bug; Deepfakes going too far; and more…
Steal for free, or pay a fee
Would you like it if someone copied your diligently crafted project or report and passed it off as their own to anyone who asked for it? This appears to be the ‘New York Times’ (NYT) peeve with OpenAI and Microsoft, as it sued the two for copyright infringement in what is set to be a watershed moment for AI-related copyright murkiness. The NYT claims in the lawsuit that ChatGPT has been reproducing near-verbatim versions of its investigative journalism.
While some experts claim that training LLMs on vast amounts of internet data falls under “fair use” and keeps open information ecosystems healthy, others call it plain stealing; and then there are the consequences for privacy. Copyright law has evolved over time to protect intellectual property, specifically to prevent the theft of creative or intellectual labour and the usurping of its economic value. Even platforms like Google, which consume all the information on the internet, only give the public glimpses of copyrighted material, driving traffic to the source instead of substituting the original work. For example, you only see an excerpt from books or research papers unless you buy or subscribe to them, and copyrighted images show up as thumbnails or watermarked results unless you pay for them.
It’s one thing to draw readers to your AI chatbot by offering a genuinely “transformative”, “fair use” take on human-produced works; it’s a whole other deal to scrape content online and co-opt, ingest, free-ride on, and infringe upon copyrighted works without seeking permission. The times are such that OpenAI claims it’s impossible to build these models without copying, and that it should not be charged for all that stealing!
Just pay and use: Axel Springer (the news publishing giant and parent of popular publications Politico and Business Insider) will allow OpenAI to use its content for a fee. This is akin to Grimes offering to split revenues with AI artists who use her voice, or Universal Music’s bid to monetise AI-generated music. Last we checked, without consent, it’s still theft. So what’s the right thing to do? Just pay licence fees or royalties like Netflix and Spotify do. Or be prepared to dish out itsy-bitsy billions for non-compliance.
Self-drive yourself out
The Indian start-up fraternity has been unveiling one self-driving autonomous vehicle after another. In July 2023, Bengaluru-based AI start-up ‘Minus Zero’ introduced their self-driving car — the ‘zPod’ — that they claim can be scaled up to ‘Level 5’, the highest level of autonomy for self-driving vehicles.
But little do these AV start-ups seem to realise that self-driving vehicles are potentially dangerous on Indian roads; that what is marketed as full autonomy today is at best semi-autonomous; that even Tesla is nowhere near true Level 5 self-driving, currently sitting at Level 2 on the six-level (0 to 5) scale of vehicle automation; and that General Motors’ driverless taxi fleet Cruise was recently grounded over safety concerns.
Tesla’s tech may be inherently flawed, holding drivers responsible for covering for its failures; its ‘solution’ focuses on measuring drivers’ attention. Tesla has even admitted on its website that its ‘Autopilot’ and ‘Full Self-Driving’ features can’t actually drive the car by themselves and require active driver supervision at all times, making it clear as day that this is more hype than actual tech.
Which brings us to India’s Minister for Road Transport and Highways, Nitin Gadkari, and his statement on self-driving cars: “Will never allow driverless cars to come into India”. He is on a mission to prevent job losses among Indian drivers, and in the process, is also preventing Tesla’s famous “Self-Driving Cars” from lining up roadkill on the infamous Indian roads.
The audacity — $50 for your child’s data.
Parents hungover on the good ol’ days of social media were still weighing the prospect of easy celebrity status and money for their kids against the perils of sharenting when Google, the ever-hungry data mammoth, sneaked in on their children’s facial recognition data.
We kid you not: for 50 measly dollars, Google managed to convince parents to share video footage of the faces of their children, aged 13 to 17, supposedly for “age verification purposes”. Crouching behind a contractor, ‘Telus International’, Google ran this project in Canada from November 2023 to January 2024.
It’s baffling how these schemes achieve their objectives so freely, when just in December 2023 we learned that AI training datasets include child sexual abuse material. Any parent who thought 50 bucks was a fair exchange for their children’s invaluable data should know that AI systems are capable of producing explicit photos of fake children, as well as turning real photos of actual kids into deepfake nudes. Time to reconcile with the fact that social media is not the same as when you first signed up on Orkut!
Bicycle? More like marijuana.
Remember a time when being away from your phone for two minutes didn’t induce crippling anxiety? We can’t either. Shaking our heads at the naiveté of tech scions like Steve Jobs, who thought the computer would be a ‘bicycle for the mind’, we are left to deal with the effects of smartphone and social media addiction.
In May 2023, the Surgeon General of the United States, Dr. Vivek Murthy, warned of the risk of harm to the mental health and well-being of children and adolescents in the Surgeon General’s Advisory on Social Media and Youth Mental Health. Experts in online child protection are stressing the urgent need to raise awareness of children’s exposure to digital devices, with some as young as two suffering vision problems and speech delays.
AI’s war on writers
Employers everywhere are replacing writers and translators with AI. Duolingo, the language learning app, rose from nothing to clock over 23 billion completed lessons and 8.4 million active learners studying Hindi. We loved it for teaching us the nuances of a language utterly foreign to our existence. But now, it’s set to become just another app doling out robotic, poor-quality lessons teeming with errors and missing the human touch.
In the latest high-profile tech layoff, Duolingo let go of 10% of its contract workforce, mainly translators. Turns out, those very translators are all too aware of the deteriorating quality of the lessons and the barrage of errors in them.
We already know that LLMs hallucinate (they can confidently produce incorrect answers to questions they don’t know) and are essentially incapable of admitting when they’re wrong. Now imagine your language teacher blurting out wrong answers to your questions with total confidence. It’s just not going to be the same learning with Duolingo anymore, tut tut!
Miss us OG writers yet? 😉
Deepfake takedowns not Swift enough
Deepfakes were cute when the first few took us by surprise. Remember the one with the dancing Queen Elizabeth? Not anymore, as deepfakes of Taylor Swift are shaking fans and policymakers into demanding action. X’s attempts at clearing the mess (post Musk’s decision to remove content filtering) only elicit a collective facepalm.
What does this mean for ordinary women and girls? Experts are actively sharing their concerns about how women are the prime target (still surprised?) of weaponised AI in the form of abuse and intimidation via deepfake pornography. AI companies releasing these tools are somehow always late to the show. By the time they start to impose limits and ask users to pay a fee, the damage is already done.
Not this again: 404 Media reports that the Swift fake was created using Microsoft’s free text-to-image generator, and popped up on 4chan (a fringe site notorious for propagating conspiracy theories and mainstreaming trolling ‘for the lulz’) and a Telegram group before hitting X. Suffice it to say, when sophisticated AI tools like image generators and audio editors are put in the hands of nameless, faceless internet trolls via free, open access, we can’t really be surprised at the churn of malicious, hateful content, the kind that can even disrupt court proceedings by casting doubt on evidence.
Kill-able Bills
In a spree to modernise and update archaic laws, the Indian government has announced the Broadcast Services Bill and the Indian Telecommunications Bill, and there are hardly any surprises! They remain colonial, draconian, and archaic! The Telecom Bill has emerged as a gigantic threat to our fundamental rights, and the Broadcast Bill ensures that the Internet’s gone to the dads.
These bills may have time-travelled us back to 1885 with their ambiguity, threatening user privacy and online anonymity. And the icing on the cake is the excessive, unchecked power they hand the Government to control OTT platforms, surveil your device, and suspend your internet. What this means is that you have to put up with dad jokes, but dad will not put up with jokes… or questions.
Think Signal, Zoom, Skype, Gmail, WhatsApp: all your favourite apps under Government watch. The makers of the Telecom Bill and the Broadcast Bill would argue that both are a welcome change. However, they come with numerous trade-offs and challenges that need addressing. We’d say ‘change’ and ‘reform’ are two different things.
Snoop supreme
On January 10th, OpenAI amended its usage policies to reword one of its clauses to be more “readable”. Who knew lucidity would come at the cost of integrity? OpenAI’s policy used to ban “activity that has high risk of physical harm”, including “weapons development” and “military and warfare”. In the new clause, the phrase “military and warfare” has vanished, signalling the potent tech’s potential for more surreptitious applications, such as reconnaissance.
So what, you ask? India’s contentious Digital Personal Data Protection Act (DPDPA) exempts publicly available personal data. This could mean that Indians’ data, including that of security personnel, could be used for surveillance, training, and targeting, among other things. And although OpenAI’s new policy states that users must not use its tools to “harm yourself or others”, it abounds in grey areas born of a lack of specificity, which raises questions about its effectiveness.
Speaking of: Facebook is doing it again, rolling out a new sus feature called ‘Link History’. The company will continue to track and store data in the name of yet another confusing feature, doing what it does best. This is Meta’s way of working around tech regulations amidst iOS and Android privacy beef-ups aimed at tightening the noose around the massive data-harvesting empire that is Facebook.
Via Link History, every link you open in Facebook’s in-app browser is saved by default (unless you turn it off), which FB will most definitely use for more targeted content. It’s like giving users the illusion of privacy while stealing more of it from right under their noses.
And the apple doesn’t fall far: turns out, Apple was no good either. The company has known since as early as 2019 that AirDrop users could be identified and tracked, and just didn’t do anything about it, even as Chinese authorities used the flaw to track down users. In a plot twist, it’s rare for China to publicly disclose such capabilities, so all fingers point towards an intentional reveal; one reason could be to scare dissidents off AirDrop.
Odds stacked against Substack
If you’re reading this newsletter, you probably know what Substack is. As with any platform that interacts with a large audience on the daily, Substack’s content moderation policy was called into question, with readers and publishers pointing out anti-Semitic and white supremacist content on the platform despite its terms of use prohibiting hateful content. The fallout: popular tech publication Platformer left Substack for Ghost (an open-source alternative), and more than 200 writers sent the company an open letter demanding an explanation, while others, like evolutionary biologist Richard Dawkins, signed a counter-post titled “Substack shouldn’t decide what we read”.
Substack’s response, “…we don’t like Nazis either — we wish no one held those views”, obviously hasn’t gone down well. FT argues that, unlike other platforms that algorithmically recommend and influence the content you consume, Substack does not recommend content and hence cannot be treated the same way as other publishers. Mashable thinks Substack’s launch of ‘Notes’ (which allows users to share their thoughts and content related to newsletters) calls for the platform’s content moderation policies to be rigorously enforced. The jury’s still out on this one.
CDF chips
CDF contributes to the whitepaper on DPDPA
Following CDF’s roundtable discussion with the Wadhwani Centre for Government Digital Transformation (WGDT) last month, the WGDT has released a whitepaper, “Personal Data Protection Act - Ambiguities, Limitations and Recommendations”, compiling the recommendations put forth by the panelists in the discussion. CDF’s recommendations are outlined in Section 2.3 of the paper and include appointing an intermediary certification body to assess the nature and risks of the data processing methods of specific platforms or apps, among others we believe will help ensure that the DPDP Act’s rules leave as little room as possible for self-serving, ambiguous interpretations.
RAI hacks for all
Ashoka’s Tech and Humanity Initiative and OpenNyAI curated the 23rd issue of the Social Innovations Journal (SIJ), to which CDF contributed a practical implementation guide to getting started on Responsible AI (RAI). The guide covers the what, when, and how of incorporating AI governance in social enterprises, and highlights the opportunities and real-world challenges encountered by social sector organisations at various stages of AI transformation. SIJ officially launched the journal, “Social Entrepreneurs Leveraging and Shaping AI”, on January 18 at a meet-the-author event, where the authors and top global voices in RAI, including Daniela Matielo (Co-founder, Ashoka AI Lab), Odin Mühlenbein (Co-founder, Ashoka AI Lab), Nidhi Sudhan (Co-founder, CDF), Smita Gupta (Co-leader, OpenNyAI), and Hera Hussain (Founder and CEO, Chayn), participated in panel discussions on RAI moderated by Sachin Malhan (Co-founder, Agami) and Hanae Baruchel (Senior Change Leader, Ashoka's Next Now).
Enter Good Tech Squad 🥁🥁
January was a busy one for the Good Tech Squad (GTS) at Trivandrum International School (TRINS). Shepherded by CDF, GTS is a peer-to-peer support system that fosters safe tech interactions in schools. From installing ‘Black Boxes’ around the campus (for students to anonymously share their online concerns) to putting their heads together for launch activities and building the first interactive tipsheet for students, the GTS has rolled out with much enthusiasm. This marks the first of many events on the GTS’s roster, and we’re excited to see how their plans for the year unfold!
CDF is a non-profit tackling techno-social issues like misinformation, online harassment, online child sexual abuse, polarisation, data & privacy breaches, cyber fraud, hypertargeting, behaviour manipulation, AI bias, election engineering, etc. We do this through knowledge solutions, responsible tech advocacy, and good tech collaborations under our goals of Awareness, Accountability and Action.