AI for the rest of us | Loos and Shoes


AI for the rest of us

Welcome to the AI community for everyone.

Hello friends,

With tropical storms rumbling around, Hannah has been unusually productive from the beautiful island of Okinawa. She's also discovered a tiny hole in the bottom of her 10-year-old Adidas Gazelles, which is causing her left foot to get really rather wet. If the next pair lasts 10 years she'll be wearing them until she's 50 (WTF), which makes this a very big decision! [those of us well past 50 would like a word about that WTF, Hannah].

Of course there is only one sensible outcome, the next pair will have to be black or grey, just like the gazelles that have been retired and the ones before that. Is that boring? Frugal? Minimalist? Normcore?

You have just 2 more days to get your half-price ticket to Agent Craft for £48. The discount code expires on Monday at midnight. Agent Craft is an unconference, a unique format for meeting like-minded people. During the afternoon we will create the agenda together and then host multiple discussion groups on the topics that YOU voted for. There is something for everyone!

Hannah will be flying home from Japan on Tuesday, just in time to recover from jetlag, repack her case and head to London for our next AI for the rest of us meetup on Thursday! This one is a bit special - both speakers focus on the same problem but tackle it in different ways. If AI tools now write the code, we need better ways to interact with that code, to review that code, to understand that code.

From powerful visualisations to narrative programming and cognitive science, we'll be exploring the possibilities and hearing from the innovators pushing the boundaries.

Free on Thursday? Join us!

Being the wrong side of 50 and essentially a walking middle-aged male cliché, Charles took advantage of the sunny weekend in Surrey and spent it gardening, and well away from screens and computers. It was bliss.

Have a great week,

Hannah & Charles

Grab A Free Book On AI Native

174 patterns, 422 pages — #1 Bestseller 'From Cloud Native to AI Native' is free for a limited time.

What’s Charles reading this week?

Hannah wrote about GitHub's reliability woes last week, and then LeadDev asked me to do the same. That let me go meta and quote Hannah in the story, but it also got me thinking. GitHub argues that its problems are caused by unprecedented scaling demands driven by AI, but I'm not convinced. I think they might be more down to poor engineering combined with conflicting priorities at Microsoft. There's also a huge irony here, given the role GitHub played in promoting AI-powered development in the first place.

The Register has an interesting, but to my mind slightly odd, story that the UK's National Health Service (NHS) is ordering all of its technology leaders to temporarily wall off the organisation's open-source projects, over concerns relating to advanced AI and Anthropic's Mythos.

The move reverses the NHS's own longstanding policy that publicly funded code should be publicly available, and was apparently triggered by a model that is itself locked behind a restricted access programme called Project Glasswing.

Security experts, including the former head of open technology at NHSX, point out that the NHS code was almost certainly already ingested into AI training data years ago, that copies exist across digital archives and the collections of data hoarders, and that the real threats to NHS cybersecurity — phishing, weak passwords, supply chain vulnerabilities — have nothing to do with whether a documentation repo is set to public. The net result is that the NHS has managed to make its code less transparent to legitimate researchers and the public it serves, while doing essentially nothing to deter any attacker worth worrying about.

Google launched its Gemma 4 open-source AI models this spring, and they're already getting a significant speed boost. The advertising behemoth has released experimental tools called Multi-Token Prediction (MTP) Drafters, which use a clever trick to make the AI generate text noticeably faster.

To understand why this matters, it helps to know how AI language models normally work. When an AI writes a response, it builds it one word (or part of a word) at a time. Each piece requires roughly the same amount of computing effort, whether it's a throwaway word like "the" or a critical part of a complex answer. It's a slow, step-by-step process.
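For the code-curious, that step-by-step loop can be sketched in a few lines of Python. The "model" here is just a lookup table standing in for the expensive forward pass; real models predict from probabilities, but the shape of the loop is the same:

```python
# Toy autoregressive generation: each new token depends on everything
# generated so far, so the (expensive) model runs once per token.

def next_token(context):
    """Stand-in for a full model forward pass (the slow part)."""
    vocab = {"": "The", "The": "cat", "The cat": "sat", "The cat sat": "."}
    return vocab.get(" ".join(context))  # None when there's nothing more to say

tokens = []
while (tok := next_token(tokens)) is not None:
    tokens.append(tok)  # one expensive model call per token

print(" ".join(tokens))  # -> "The cat sat ."
```

Whether the next piece is "the" or the crux of the answer, the cost of each trip through the loop is the same.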

Google’s approach, called speculative decoding, gets around this by using a small, lightweight "helper" model to take educated guesses at what the next several words will be. Think of it like a capable assistant drafting a reply for their boss to review: the boss (the main Gemma model) checks the draft quickly in one sweep and either approves it or corrects it. At the same time, the main model generates its own next word as a backup. The result is that several words can be confirmed and accepted at once, instead of laboriously producing them one by one.

In practice, this can make responses arrive roughly three times faster, while also using less energy, since the AI is doing more useful work in each computing cycle. That's a win for running AI on everyday devices like laptops and phones, but also for large data centres looking to cut costs and power use.

All of this is cool, but does mean I need to update my talk for MLCon on Tuesday. If you want the full technical details Google has, for some weird and inexplicable reason, shared them on X.

Edge is really interesting for AI, and can in some cases be much better environmentally. It's nuanced though. Warwickshire-based Conflow Power Group Limited (CPG) is “betting on data centres using thousands of connected smart lampposts, and has signed a formal agreement with a Nigerian state to deploy 50,000 of them,” Chris Vallance reports for the BBC.

Each iLamp has batteries which are charged by a cylindrical solar panel. These supply the energy used by a low-powered computer suitable for AI tasks.

However, the broader question of energy and economics is more mixed than the solar-powered pitch might suggest. AI already consumes a staggering amount of electricity globally — comparable, by some estimates, to the entire UK — so the idea of harvesting solar power through street furniture has obvious appeal. But the iLamps are better understood as a low-cost complement to large data centres rather than a replacement for them; the distances between posts would make them too slow for the heavy computing demands of training major AI models. The more likely role is as local access points, routing users to more powerful infrastructure elsewhere, much like mobile masts.

Financially, the model relies on a green bond funded by renting the lampposts' computing capacity to AI companies, while Katsina, the Nigerian state taking the devices, will pocket fines for traffic violations caught by the cameras, at least for the first three years, after which CPG takes a 20% share. The cameras are also already handling number plate recognition at Warwick Hospital's car park.

iLamp has a somewhat disturbing surveillance angle. Facial recognition is on the roadmap, potentially helping to identify wanted or missing people, with advanced talks underway with schools and local authorities in Florida. CPG insists it will only deploy these features in partnership with the relevant authorities and within local laws, and has even floated the idea of using gesture recognition to let the public vote in street polls via the lamps. Whether those assurances will satisfy privacy advocates is another matter; facial recognition technology carries a well-established record of bias and misuse.

Writing for The Information (sorry, no gift link) Stephanie Palazzolo and Anissa Gardizy have a scoop that Anthropic has recently been in talks with London-based startup Fractile to buy its inference chips, which aim to run AI models efficiently, when the chips become available next year. Deals such as these would give Anthropic more leverage with suppliers just at the time when its spending on servers and chips is projected to reach tens of billions of dollars a year. Fractile claims its chips will “run the most advanced models up to 25x faster and at 1/10th the cost”.

The core technical explanation as to how comes down to another bottleneck in AI inference: the constant shuttling of data between a processor and its memory. Fractile's chip design physically interleaves memory and compute on-die, meaning the memory and the transistors doing the arithmetic sit right next to each other on the same piece of silicon. They also use SRAM-based in-memory compute, which is an architecture that performs calculations directly within the chip's memory rather than shuttling data back and forth to separate high-bandwidth memory modules. SRAM is much faster than DRAM, but is also more expensive per bit, which is why conventional chips don't use it for bulk model storage.
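A rough back-of-envelope shows why that shuttling matters so much. For single-stream decoding, every generated token requires reading roughly all of the model's weights from memory once, so memory bandwidth sets a hard ceiling on speed. All the numbers below are illustrative ballparks, not Fractile's figures:

```python
# Back-of-envelope: decoding is usually memory-bound, because each
# token needs (roughly) one full read of the model's weights.
# Bandwidth numbers are rough, publicly quoted ballparks.

def tokens_per_second(model_gb, bandwidth_gb_s):
    """Upper bound on decode speed if weight reads dominate."""
    return bandwidth_gb_s / model_gb

model_gb = 14  # e.g. a 7B-parameter model stored at 16 bits per weight

hbm = tokens_per_second(model_gb, 3350)  # datacentre GPU HBM, ~3.35 TB/s
ddr = tokens_per_second(model_gb, 64)    # laptop-class DDR5, ~64 GB/s

print(f"HBM-backed ceiling: ~{hbm:.0f} tokens/s")
print(f"DDR-backed ceiling: ~{ddr:.1f} tokens/s")
```

On-die SRAM offers effective bandwidth far beyond either of those, which is the headroom a design like Fractile's is chasing: move the arithmetic next to the memory and the ceiling lifts dramatically.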

Anthropic has signed a deal with SpaceX to use all of the compute capacity at their Colossus 1 data center, which sounds innocuous enough but, as Musk is involved, isn't. When building and running Colossus, xAI and its subsidiary MZX Tech installed and operated dozens of natural gas-burning turbines to power the facility, claiming no federal permit was required because they were only for temporary use. The facility came under regulatory scrutiny and it was ruled that xAI acted illegally by using methane gas turbines to power Colossus 1 and 2.

Coal is generally considered worse than methane for CO2 emissions per kilowatt-hour, though methane (unburned) is a far more potent greenhouse gas than CO2, so leakage makes it deeply problematic. Claude is, to misquote Marvin the Paranoid Android from The Hitchhiker's Guide to the Galaxy, the single least annoying LLM it has ever been my misfortune not to be able to avoid using. For a company that positions itself around responsible AI development this is, um, uncomfortable.

I've been trying to come up with a defence, and the best I can do is that the deal is for compute rental, not ownership, and Colossus is apparently moving away from the turbines as it scales. But "we're renting the illegal infrastructure rather than owning it" isn't much of a defence.

I think the best option is to run open-source LLMs in regions you know about. You don’t know what the training costs were, or the provenance of that data, but at least you have a bit of control over inferencing costs. The provenance problem is thorny though. None of the big open-source labs such as Meta, Mistral, or the Allen Institute have been fully transparent about training data — any more than Anthropic and other frontier labs have been — and the copyright questions follow open weights models just as much as closed ones. So the practical calculation probably looks something like: run Ollama or similar locally or on a known-good VPS, pick a model with the most credible training transparency you can find (Allen AI's OLMo project is probably the most genuinely open in terms of documenting training data, though it's not frontier-capable), and accept that you're making a harm-reduction choice rather than a clean one. *Sigh*.

Google DeepMind has partnered with Fenris Creations — the newly independent studio behind the long-running space MMO EVE Online — to use the game as a testing ground for advanced AI research.

DeepMind sees EVE's sprawling, player-driven universe as an ideal sandbox for developing AI capable of long-term planning and decision-making. Experiments will run on a private offline version of the game so as not to disrupt existing players. As Kyle Orland pointed out for Ars Technica:

“Google DeepMind has a long history of using games as a proving ground for machine learning models, from enabling breakthroughs in complex board games like Go to outperforming humans in Atari VCS games and StarCraft, for example. More recently, the company has begun using so-called “virtual world” models to help AI systems learn to operate in physical reality.”

What's Hannah reading this week?

I thought it was satire when I read that Allbirds (the shoe brand) were pivoting to AI, but apparently this is real. The BBC reported that the switch from footwear to AI hardware had been so well received it caused the share price to soar by 580%.

The San Francisco-based firm said it has struck a $50m (£37m) deal to become an "AI compute infrastructure" business and change its name to NewBird AI.

Then I read that the Japanese toilet company Toto had done the same! Toto’s share price increased by 18% upon the announcement that they were boosting production of semiconductor components.

If you can do it with loos and shoes, who else is going to pivot their business to AI? I wouldn’t be surprised if there’s an innovation team somewhere at Tesco, McDonald’s or Starbucks thinking about this move. Perhaps this is the question all business owners should be asking themselves?

The "pivot or persevere" moment is something all businesses need. Confirmation bias will creep in silently, urging teams to keep going despite shifting markets or a lack of traction. A pivot is wise when you have the means to succeed in a new domain, or serve different users.

Japanese toilets are full of tech (and the toilet seats are heated too! It's lovely!) and whilst Toto is known as a toilet brand, they are actually more of a general appliance manufacturer and the second-largest producer of electrostatic chucks, whatever they are (apparently very important for computer chips)! This pivot makes sense. Allbirds is more difficult to get your head around until you realise how much they were struggling as a shoe brand. Perhaps this pivot was born out of some extreme blue-sky thinking in one board meeting where someone asked, "If we stopped making shoes tomorrow, what should we do instead?"

This week the AI Safety Institute published a new study about how much an OpenClaw Agent was able to learn about itself. This revealing experiment demonstrates how much “information” a sandboxed agent has access to without you even realising. During multiple iterations of the experiment the agent was able to identify the organisation name, the operator's identity, the cloud provider, the internal architecture of the system and a list of recent research activities.

“After each round of prompting, we attempted to harden the sandbox, but the agent was repeatedly able to find more inventive approaches to recover the same information.”

The research comes from a team of security specialists who were actively trying to lock the agent down. The issue isn't that the agent has access to this information; it's that as we relax certain boundaries and grant more access, the agent could be susceptible to leaking it. You might not realise that your sandboxed agent has figured out your recent research activities, so it would come as a pretty huge surprise when that information was extracted from the agent by a hacker.

I am still incredibly paranoid about agent security and will remain so! This research has done nothing to ease my mind! One of the sessions I’m looking forward to at Agent Craft is the panel on agent security, but I know we won’t be able to cover everything in 30 minutes. There will certainly need to be more discussion during the afternoon unconference sessions. If you’re also a bit paranoid or just want to feel more informed it would be great to see you at Agent Craft.

Finally, I was happy to see that The Oscars have drawn a line on the use of AI in film-making. No Oscars can be awarded to AI actors or writers. (I'm afraid it doesn't cover music yet - Charles will be sad about that) (I am - Charles). I'm also seeing an increasing amount of chatter online about how using AI for writing (workslop) is unacceptable, even within AI companies like Synthesia. Hear hear! I want to see more of this and much less slop!

Updates

Agent Craft

Use AI4NEWS for a massive 50% discount

Next Meetup May 14th

Join our next meetup in London

AI DevCon

1-2 June, London & Virtual. Newsletter readers get 30% off with discount code AIFTROU30 - thanks Tessl!

Follow us on LinkedIn

Bite sized nuggets of AI learning!

Follow us on BlueSky

Bite sized nuggets of AI learning!

Catch Up On The Conference

Subscribe now and don't miss all the latest recordings!

Visit AIfortherestofus.live to find out about our next events or follow us on LinkedIn, BlueSky, YouTube, Meetup


Unsubscribe · Preferences

AI for the rest of us

AI for the rest of us is the AI community for everyone. No jargon. No hype. No confusing terminology. Join our newsletter and join the conversation. We’re shaping our future with AI. Creating opportunity through AI fluency, connection and community.
