Confidential AI
Part III: AI 2030: Scale, Deploy and Secure. aka how VCs try to sound smart when they should just invest in talented entrepreneurs to figure it out
It Was Supposed to Be so Easy. Invest in semiconductors they said. All the value is being captured by hardware and Nvidia they said. Look at Etched they said. Generative gaming they said. Okay, sure, I get it, compute is the output, semiconductors are the process, and energy is the input. So looking at energy infra makes sense. Good, second-order consequences, I hear that’s how you make money. But what about third-order consequences? I’m talking 7-minute abs…
Chapter I:
Chapter II:
Surprise. The answer is Crypto. You say: We don’t believe you, you need more people. I say: buckle up and hear this:
Basically every model trained to date was trained on a scrape of the Internet. The AI labs are now paying publishers and companies for access to new data sources. Any data on the Internet was used to train models regardless of copyright, and reparations will be made. But that world is over. Companies, humans and Governments are acutely aware of the value of data for frontier models. The next vanguard is access to personal data for training and inference. So here we are again. The people versus the capitalists. But this time the capitalists are the good guys right?
Wrong. Obviously. The EU knows, man. But the US tech influencers tell us that we are lazy and that we should let it rip. I'm standing here next to the World’s second oldest parliament and I’m here to tell you speed isn’t everything. Like I tell my sons “Life is a dance, not a race”.
I’m here to tell you, models will not be widely used across society and public services unless we finally “Take This Seriously”. The frontiersman in the Valley does not a culture make. I’m not going all libertarian and telling you we need to protect ourselves from the Government because of the right to bear arms and leave-me-alone etc., but because if this really is the most transformative technology we’ve ever created, the societal impacts will be profound. We can’t “figure it out later”. Just tell that to the 14 year olds addicted to Chinese propaganda on TikTok. Lovely precautionary principle you guys had there, would be a shame if you forgot to use it.
I would 100% advocate for regulation if I thought we could write it good. But we can’t. So now what? Well, technology. Let’s use technology to fight technology. If we solve three problems, we have a fighting chance of bending the Gini Coefficient towards 0.
How can we protect data privacy during training and inference?
How can we ensure fair and unbiased models?
How can we scale open-access AI infrastructure?
Before we get into it, take 10 seconds and share this newsletter with one person who you think will enjoy it. I’m not chasing numbers, I’m chasing legacy.
7. Protect data privacy during training and inference
I dunno what to tell you, I still care about privacy. And Apple seem to think it’s a competitive advantage. They are seemingly prepared to deliver a worse iPhone because they want to ship AI with privacy. It’s a bold move Cotton. Sure, people still Google and use Facebook and TikTok and have seemingly accepted the Faustian bargain of giving up privacy for cool applications. And look, I get it, cool apps are better than privacy. But I think maybe this is a bridge too far? As AI becomes more pervasive and personal, we need to think very carefully about the danger of misuse. Until recently, the performance cost of preserving privacy simply hasn’t been worth paying. But what if I told you, maybe it’s not a trade-off anymore? Or maybe the trade-off isn’t so big? The computational overhead of advanced privacy-preserving techniques is coming down every day. It can’t come fast enough.
7.1. Confidential Computing
tldr, yes it’s a trade-off, always is. Confidential computing is the brand name for running Trusted Execution Environments (TEEs) to isolate sensitive computations from the operating system, hypervisor, and other potentially untrusted software layers. By creating a secure enclave for computations, Confidential Computing protects against both external threats, such as malware or hackers, and insider risks, including malicious administrators. In the realm of AI, where vast amounts of sensitive data are processed for training models or running inferences, the ability to secure data during computation is crucial, particularly in cloud environments where users may not fully control the infrastructure. For example, Microsoft’s Azure Confidential Computing and Google’s Confidential VM offer platforms that enable secure data processing, mitigating the risks associated with outsourcing AI workloads to public clouds.
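If you want a feel for what the client side of that handshake looks like, here is a deliberately simplified Python sketch. The report format, keys and helpers are placeholders I’ve made up for illustration, not any vendor’s actual SDK; real deployments go through things like Intel/AMD attestation services or Azure Attestation.

```python
# Minimal, vendor-agnostic sketch of the client-side trust check in a
# confidential-computing flow: verify an enclave's attestation report before
# sending it sensitive data. The report format and keys are hypothetical
# placeholders, not a real vendor API.
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

# Measurement (hash of the enclave code) we have audited and agreed to trust.
EXPECTED_MEASUREMENT = b"sha256-of-the-model-serving-enclave"

def verify_report(report: dict, vendor_key: ed25519.Ed25519PublicKey) -> bool:
    """Accept the enclave only if the vendor signed the report AND it runs the code we expect."""
    try:
        vendor_key.verify(report["signature"], report["measurement"])
    except InvalidSignature:
        return False
    return report["measurement"] == EXPECTED_MEASUREMENT

# --- toy end-to-end run with a simulated vendor and enclave ---
vendor_private = ed25519.Ed25519PrivateKey.generate()   # stands in for the chip vendor's root key
report = {
    "measurement": EXPECTED_MEASUREMENT,
    "signature": vendor_private.sign(EXPECTED_MEASUREMENT),
}

if verify_report(report, vendor_private.public_key()):
    print("Attestation OK: safe to send encrypted data into the enclave")
else:
    print("Attestation failed: refuse to share data")
```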
Buy: Distributed TEEs for AI Model Training and Inference (maybe open-source, maybe not, depends on what your trust model is), MPC with TEEs, Verifiable confidential computing
Companies to watch: Flashbots, Nethermind, Quex
7.2. Programmable Crypto
We can use the relative simplicity of confidential computing and hardware to maintain confidentiality. Or we can try and do better with pure software. Hardware is hard and hard to update, so if we can avoid it, let’s try, yeah? Software-based cryptography will always have the advantages of flexibility and no reliance on a hardware vendor. The evolution from special-purpose to programmable cryptography marks a paradigm shift in securing AI systems. This transition expands our cryptographic toolkit, enabling complex computations on encrypted data and revolutionizing collaborative AI development. Special-purpose cryptography refers to protocols designed for specific operations, like public-key encryption with RSA and digital signatures. Programmable cryptography, in contrast, allows for general-purpose computation within cryptographic protocols and includes advanced techniques like fully homomorphic encryption (FHE), secure multi-party computation (MPC), zero-knowledge proofs (ZKPs), witness encryption and obfuscation. We wrote about this in more detail in our collaborative computer papers. A recent example would be DeepMind using federated learning with homomorphic encryption to train an AI model for breast cancer detection across multiple hospitals. These techniques will be combined in different ways to meet the performance, latency, cost and security guarantees of specific applications.
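To make “computing on data you can’t see” concrete, here is a toy additive secret-sharing round, the primitive underneath MPC. It is a sketch of the principle only, under an honest-but-curious assumption with no networking; real protocols (and the FHE/ZKP stacks above) are vastly more involved.

```python
# Toy illustration of the idea behind MPC: three hospitals want the sum of
# their patient counts without revealing individual values. Each splits its
# number into random additive shares modulo a large prime; only the aggregate
# is ever reconstructed. A sketch of the principle, not a production protocol.
import secrets

PRIME = 2**61 - 1  # arithmetic is done modulo a large prime

def share(value: int, n_parties: int) -> list[int]:
    """Split `value` into n random shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % PRIME

hospital_counts = [1200, 845, 3020]          # private inputs, never shared directly
all_shares = [share(v, 3) for v in hospital_counts]

# Each party i locally adds up the i-th share from every hospital...
partial_sums = [sum(s[i] for s in all_shares) % PRIME for i in range(3)]

# ...and only the combined result is revealed.
print(reconstruct(partial_sums))             # -> 5065, no single input exposed
```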
Buy: There is no single technology that will win; the best solutions will combine Federated Learning (FL), Fully Homomorphic Encryption (FHE), Secure Multi-Party Computation (MPC), etc. to deliver the optimal balance of throughput, latency, cost and security that the application demands.
Companies to watch: Zama, Roseman Labs, OpenMined, Flower Labs, FLoCk.io
7.3. Synthetic Data
I used to think synthetic data was a waste of time. That eventually hardware and software-based cryptography would improve so that we wouldn’t need to create synthetic data to protect privacy at all. But then transformers just wanted all the data in the world to train on. And then it ran out. And we had to feed the beast. So here we are. fwiw, synthetic data refers to artificially generated information that mimics the statistical properties and patterns of real-world data without containing actual individual records. In the context of AI privacy, synthetic data serves as a tool for mitigating privacy risks during both training and inference phases. By training models on synthetic datasets that capture the essential characteristics of sensitive real data without incorporating personally identifiable information, organizations can develop AI systems that generalize well to real-world scenarios while minimizing exposure of individual data points. During inference, synthetic data can be used to query models or test system behavior without risking the privacy of real individuals. The key to effective synthetic data lies in its ability to preserve the utility of the data for the intended AI task while breaking the one-to-one mapping between synthetic and real data points, thereby providing a layer of privacy protection. In the context of training frontier models, state-of-the-art approaches to synthetic data generation have become integral to the development process, in particular masked language modeling, data augmentation, and instruction tuning.
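As a bare-bones illustration of the idea (emphatically not how frontier labs do it), here is a sketch that fits a multivariate Gaussian to some “real” tabular records and samples synthetic rows with the same statistical shape. Production tools layer on mixed data types, rare categories and differential-privacy noise; the numbers below are invented.

```python
# Bare-bones tabular synthetic data: fit a multivariate Gaussian to the real
# records and sample new rows that preserve the means and correlations but
# correspond to no real individual.
import numpy as np

rng = np.random.default_rng(0)

# Pretend these are real, sensitive records: [age, income, num_visits]
real = rng.multivariate_normal(
    mean=[45, 52_000, 3],
    cov=[[90, 18_000, 2], [18_000, 6_000_000, 300], [2, 300, 4]],
    size=1_000,
)

# "Train" the generator: just the empirical mean and covariance.
mu, cov = real.mean(axis=0), np.cov(real, rowvar=False)

# Sample synthetic rows with the same statistical shape.
synthetic = rng.multivariate_normal(mu, cov, size=1_000)

print("real means     :", np.round(real.mean(axis=0), 1))
print("synthetic means:", np.round(synthetic.mean(axis=0), 1))
```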
Buy: Masked language modeling, Data augmentation, Instruction tuning
Companies to watch: Gretel.ai
8. Ensure fair and unbiased models
Honestly, I appreciate that people are putting the effort into thinking about existential risk. It’s God’s work. But I haven’t got it in me. Maybe I am too scared of death. But what is also important, at least until the very moment when it isn’t important at all, is building fair and unbiased models. Ned Ludd happened for a reason. Weavers don’t just go quietly into the night. And software developers won’t either. The systems we are building are just too valuable and useful not to be deployed widely. But but but. You cannot ignore people. They call them users over in the Valley, to depersonalise them. But honestly, this stuff will get lobbied against, petitioned and stopped if real-world, actual meat people aren’t brought into the tent. Biased models can perpetuate and amplify existing societal inequalities, leading to discriminatory outcomes. We can and must do better. And as per this entire essay, I’ve brought solutions which, if we mix them just right, might barely slow down progress at all. I promise it will be worth it. Note, most of these are techniques. An open question is the extent to which these techniques are enough to base a startup on, or if they will just be features incorporated into “compound AI systems”. I can’t imagine a world where finance, healthcare and other heavily regulated sectors use fair and unbiased, but worse, models while Claude and ChatGPT give stock tips to the masses.
8.1. Causal Inference
Causal inference in AI tries to move away from purely correlational learning to understanding the causal relationships in data. Some argue that we can’t get to human-level reasoning or so-called “System-2” thinking without causal inference. They argue statistics can only take you so far. But you know, everyone is guessing out here, with different heuristics and different appetites for arguing on Twitter. Nevertheless, there is a strong case that this approach is crucial for developing fair and robust AI systems that can make reliable predictions and interventions in complex real-world scenarios. Unlike traditional machine learning methods that focus on pattern recognition, causal inference aims to uncover the underlying causal structure of a problem. Judea Pearl's work on causal diagrams and the do-calculus has been foundational in this field. Recent applications, such as Microsoft's DoWhy library and Uber's CausalML, demonstrate the practical potential of causal inference in AI. For instance, in healthcare, causal models have been used to identify effective treatments by distinguishing between correlation and causation in patient data. This approach is particularly valuable in addressing bias in AI systems by helping to identify and control for confounding variables that might lead to unfair or inaccurate predictions.
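Here is the core problem in a toy simulation: a confounder (age) drives both who gets treated and the outcome, so the naive comparison is badly biased, and adjusting for the confounder recovers the true effect. DoWhy and CausalML formalise this with proper causal graphs and estimators; this is just the intuition in numpy, with made-up numbers.

```python
# Toy confounding example: age drives both "treatment" and "recovery", so the
# naive comparison understates the treatment effect (treated patients are
# older and sicker). Adjusting for the confounder with a plain regression
# recovers the true effect of +2.0.
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

age = rng.normal(50, 10, n)                                    # confounder
treatment = (age + rng.normal(0, 5, n) > 55).astype(float)     # older patients treated more often
recovery = 2.0 * treatment - 0.1 * age + rng.normal(0, 1, n)   # true treatment effect = +2.0

# Naive estimate: difference in mean outcomes, biased by age.
naive = recovery[treatment == 1].mean() - recovery[treatment == 0].mean()

# Adjusted estimate: regress recovery on treatment AND age.
X = np.column_stack([np.ones(n), treatment, age])
coef, *_ = np.linalg.lstsq(X, recovery, rcond=None)

print(f"naive estimate   : {naive:.2f}")    # pulled well below 2.0 by the confounder
print(f"adjusted estimate: {coef[1]:.2f}")  # ~2.0, close to the true causal effect
```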
Buy: Scalable causal discovery algorithms, Counterfactual reasoning, Combining causal and transformer-based models
Companies to watch: Causalens, Fiddler.ai
8.2. Adversarial Debiasing Techniques
Adversarial debiasing techniques try to mitigate bias in AI models by leveraging adversarial learning. These methods involve training a model to perform its primary task while simultaneously ensuring that it cannot predict sensitive attributes, thus reducing bias. The technique draws inspiration from Generative Adversarial Networks (GANs), where two neural networks compete against each other. In the context of debiasing, one network attempts to make predictions, while the other tries to identify biases in those predictions. This adversarial process forces the main model to become "blind" to protected attributes, thereby reducing discriminatory outcomes. IBM's AI Fairness 360 toolkit (I know, IBM, but bear with me pls) implements several adversarial debiasing algorithms, demonstrating their practical applicability. For instance, in credit scoring models, adversarial debiasing has been shown to significantly reduce gender and racial biases while maintaining high predictive accuracy. This approach is particularly valuable in high-stakes domains where fairness is crucial, such as healthcare diagnostics and criminal justice risk assessment. You might scoff at this as small potatoes, but srsly when these god-like models are making decisions of life and death, this sort of stuff is going to be needed, and let’s be honest the EU will regulate it pretty soon anyway, so we might as well get cracking.
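For the curious, here is a compressed PyTorch sketch of that two-network setup: a predictor learns the task while an adversary tries to recover the protected attribute from the predictor’s output, and the predictor is rewarded for making that impossible. The synthetic data and hyperparameters are placeholders; AI Fairness 360 is where you’d go for hardened implementations.

```python
# Adversarial debiasing sketch: predictor vs adversary on synthetic data where
# the features leak a protected attribute.
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 4_000
s = torch.randint(0, 2, (n, 1)).float()                      # protected attribute (e.g. gender)
x = torch.randn(n, 4) + 1.5 * s                              # features leak the protected attribute
y = ((x[:, :1] + 0.3 * torch.randn(n, 1)) > 0.75).float()    # task label, correlated with s via x

predictor = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))
adversary = nn.Sequential(nn.Linear(1, 8), nn.ReLU(), nn.Linear(8, 1))
opt_p = torch.optim.Adam(predictor.parameters(), lr=1e-3)
opt_a = torch.optim.Adam(adversary.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
lam = 1.0                                                    # fairness/accuracy trade-off knob

for step in range(2_000):
    y_logit = predictor(x)

    # 1) adversary tries to guess the protected attribute from the prediction
    adv_loss = bce(adversary(y_logit.detach()), s)
    opt_a.zero_grad(); adv_loss.backward(); opt_a.step()

    # 2) predictor: do the task, but make the adversary fail
    pred_loss = bce(y_logit, y) - lam * bce(adversary(y_logit), s)
    opt_p.zero_grad(); pred_loss.backward(); opt_p.step()

with torch.no_grad():
    acc = ((predictor(x) > 0) == y.bool()).float().mean()
    leak = ((adversary(predictor(x)) > 0) == s.bool()).float().mean()
print(f"task accuracy {acc:.2f}, adversary guesses protected attr {leak:.2f} (~0.5 is the goal)")
```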
Buy: Multi-attribute fairness, Adaptive adversarial debiasing, Privacy-preserving adversarial debiasing
Companies to watch: FairPlay AI
8.3. Explainable AI (XAI)
My previous work, the 2022 State of the Future, noted that XAI “would never happen because frontier models would blow past the need for explainability.” I do still believe that, but as with causal inference and adversarial debiasing, I think the regulators are coming, so we might as well get ahead of it. fwiw, XAI encompasses a suite of techniques designed to demystify the decision-making processes of deep neural networks. Which today basically consists of 🤷🏻‍♂️🤷🏻‍♂️🤷🏻‍♂️. These methods aim to turn opaque AI systems into transparent, interpretable entities whose outputs can be traced and justified. Model-agnostic techniques, which can be applied to any AI system regardless of its architecture, have emerged as a particularly promising avenue of research. These methods operate by probing the model's behavior through manipulation of inputs and analysis of corresponding outputs, effectively treating the AI as a black box while still extracting meaningful insights about its internal logic. DARPA's XAI program has been instrumental in advancing this field, spurring the development of various explanation methods. Recent implementations, such as LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations), have gained traction in practical applications. Model-agnostic explanation, counterfactual explanations and visual explanation techniques are the most interesting current opportunities. Now, we just have to work out if enough people care about this or just want generated answers NOW!
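If you want a feel for “model-agnostic”, the simplest probe is permutation importance: shuffle one feature at a time and watch the model’s score drop, treating the model as a black box throughout. A quick sklearn sketch (LIME and SHAP are more refined takes on the same black-box idea):

```python
# Model-agnostic explanation via permutation importance: the model is probed
# purely through inputs and outputs, never opened up.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Shuffle each feature in turn and measure how much held-out accuracy drops.
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

for idx in result.importances_mean.argsort()[::-1][:5]:
    print(f"{X.columns[idx]:<25} drop in accuracy: {result.importances_mean[idx]:.3f}")
```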
Buy: Model-agnostic explanation, Counterfactual explanations, Visual explanation techniques for deep learning models
Companies to watch: Prometheux, Benevolent.ai, Google Vertex
9. Scale open access AI infrastructure
Now, to finish, hear me out. Crypto. Yeah, I know it’s not cool rn, but the fact remains: AI is dominated by a few large companies. Capex is going to be $10bn+ annually to train next-gen models. Even Inflection, founded by LinkedIn and DeepMind founders, didn’t have the money to win. And the biggest lab doesn’t even have a board anymore. Or senior management. Only OpenAI, Google, Microsoft, Meta, Apple, Amazon, and Nvidia can build billion-dollar AI clusters. But there is an opportunity to aggregate a fungible pool of compute. Blockchains offer an alternative to closed AI providers (Meta being the outlier with its open-source strategy). These protocols are designed for broad-based participation, allow for independent verification of results and are technically unstoppable. It may be closer to activism than customer-driven problem solving, but there is a role for crypto in AI as secure, auditable and open-access infrastructure. Because can we afford for the future to be determined by Sam? Feels like it’s time to get those crypto folk to finally ship some stuff that works, yeah? It’s final boss time.
9.1. Decentralized Training
Okay, what if, and hear me out, we trained models not on a GPU cluster in someone’s Cloud, but across thousands and ideally millions of everyday computers distributed across the globe. It’s a tricky challenge yes, but much of the groundwork has already been laid by the crypto industry since 2008. While Bitcoin's proof-of-work and Ethereum's smart contracts laid the groundwork for decentralised and distributed computation, their capabilities fall short of the intensive demands posed by frontier models. Pioneering projects like Golem and iExec have offered peer-to-peer compute resources for years, yet widespread adoption has been constrained by limited demand and technological hurdles. However, a convergence of technologies is now setting the stage for decentralised compute to become a viable infrastructure at scale. It’s interesting to think that at a micro scale we are connecting multiple chips together to scale out into clusters of distributed servers. The same trend may lead us to connect up smaller data centers, servers and computers around the world to basically aggregate all the computing resources into a “global fungible pool of compute”, distributed compute at the macro scale. Note the term decentralised training appears to be used by Microsoft and others when talking about training runs across multiple datacentre sites. This is really distributed training because there is a network operator. This is obviously an easier problem to solve because the operator controls the hardware stack and doesn’t have to worry about heterogeneous hardware and some random dude’s Acer Aspire 5315 with Intel Celeron and 1GB of RAM.
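As a toy of the core loop, here is federated averaging in a few lines of numpy: each node trains on its own shard and only the averaged weights cross the network. Everything that makes this hard over the open Internet (gradient compression, stragglers, fault tolerance, verification, incentives) is deliberately left out, and the data is simulated.

```python
# Toy federated-averaging (FedAvg) loop: raw data never leaves a node, only
# model weights are aggregated each round.
import numpy as np

rng = np.random.default_rng(42)
true_w = np.array([2.0, -3.0, 0.5])

# Each "node" holds a private shard of (x, y) data for the same linear task.
def make_shard(n=200):
    x = rng.normal(size=(n, 3))
    y = x @ true_w + rng.normal(0, 0.1, n)
    return x, y

shards = [make_shard() for _ in range(5)]            # 5 nodes around the world
global_w = np.zeros(3)

for round_ in range(50):
    local_updates = []
    for x, y in shards:
        w = global_w.copy()
        for _ in range(10):                          # a few local SGD steps per round
            grad = 2 * x.T @ (x @ w - y) / len(y)
            w -= 0.05 * grad
        local_updates.append(w)
    global_w = np.mean(local_updates, axis=0)        # FedAvg: only weights are aggregated

print("recovered weights:", np.round(global_w, 2))   # ~ [ 2.  -3.   0.5]
```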
Problems to solve: Network efficiency, Blockchain scalability, Private decentralised training
Companies to watch: Prime Intellect, Gensyn.ai, SuperChain
9.2. Decentralized Inference
Right, and if we are training these models on rando computers around the world, we might as well do inference that way too right? Unlike cloud inference, which relies on centralized servers, or edge inference, which processes data locally on devices, decentralized inference distributes the task of running ML models across a network of independent nodes, ideally not owned by China. This approach offers enhanced privacy, data sovereignty, and collective computational power, but it also introduces three big problems: ensuring model consistency and versioning across all nodes, maintaining privacy and security in a distributed environment, and efficiently managing resources and load balancing among heterogeneous nodes. These challenges differ from those faced in decentralized training, which focuses on collaboratively building models rather than using them for predictions. While decentralized training requires frequent, large data exchanges and intensive computation, decentralized inference typically involves less communication overhead but demands consistent low-latency performance. Additionally, decentralized inference places greater emphasis on model version control, data privacy, and system reliability to ensure timely and accurate predictions.
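Here is the trust problem in miniature: fan one query out to several untrusted nodes and only accept an answer a quorum agrees on. The projects below replace this brute-force redundancy with ZK or optimistic proofs, but the failure mode being guarded against is the same. The node functions are stand-ins, not any real network’s API.

```python
# Decentralized-inference trust problem in miniature: quorum voting over
# untrusted nodes. Real systems use ZK / optimistic proofs instead of
# brute-force redundancy.
import random
from collections import Counter

random.seed(7)

def honest_node(prompt: str) -> str:
    return f"answer-to:{prompt}"            # stand-in for running the real model

def faulty_node(prompt: str) -> str:
    return "garbage"                        # lazy, buggy or malicious node

nodes = [honest_node, honest_node, faulty_node, honest_node, faulty_node]

def decentralized_inference(prompt: str, quorum: int = 3) -> str:
    votes = Counter(node(prompt) for node in random.sample(nodes, k=len(nodes)))
    answer, count = votes.most_common(1)[0]
    if count < quorum:
        raise RuntimeError("no quorum -- cannot trust any single node's answer")
    return answer

print(decentralized_inference("what is the capital of France?"))
```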
Problems to solve: Model consistency and versioning, Resource and load balancing among heterogeneous nodes, Private decentralised inference
Companies to watch: Giza, EZKL, Modulus (zk), gensyn, ora (optimistic ML), Ritual, Atoma Network (crypto-economics)
9.3. Decentralized Agents
You made it? Congrats, for you, the biggest opportunity of them all. Jks, but actually maybe not? Not sure, but still: decentralized agents represent a frontier in AI, blending autonomous decision-making capabilities with distributed systems. Current state-of-the-art agent frameworks like LangChain and AutoGPT showcase the potential of AI agents to perform complex, multi-step tasks with minimal human intervention. We should expect OpenAI o1 and the associated scaling pathway to make more sophisticated agents economically viable going into 2025. For agents to run on blockchains, more work is needed on verifiable computing, agent transactions, and inter-agent communication. Sorry to say we need more infrastructure, it’s always more infrastructure with crypto I know, and yes we need actual users soon. But maybe the users are actually AI, not humans? Maybe everyone has it wrong. Maybe crypto was never for humans. Maybe we were supposed to build infrastructure for agents all along. Is this actually how AI wins? It’s already got us building a computational substrate for it to become economically productive and accumulate wealth? What have we done?
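As a tiny sketch of one prerequisite named above, inter-agent communication: agents signing their messages so a counterparty can verify who said what before acting on it. The agent names and message fields are made up for illustration; settlement, payments and verifiable compute are the much harder layers above this.

```python
# Signed inter-agent messages: a counterparty only acts on messages it can
# authenticate. Agent names and message schema are hypothetical.
import json
from cryptography.hazmat.primitives.asymmetric import ed25519
from cryptography.exceptions import InvalidSignature

class Agent:
    def __init__(self, name: str):
        self.name = name
        self._key = ed25519.Ed25519PrivateKey.generate()
        self.public_key = self._key.public_key()        # shared with other agents / put on-chain

    def send(self, body: dict) -> dict:
        payload = json.dumps({"from": self.name, **body}, sort_keys=True).encode()
        return {"payload": payload, "signature": self._key.sign(payload)}

    @staticmethod
    def verify(message: dict, sender_public_key) -> dict:
        try:
            sender_public_key.verify(message["signature"], message["payload"])
        except InvalidSignature:
            raise ValueError("forged message -- ignore it")
        return json.loads(message["payload"])

buyer, seller = Agent("buyer-agent"), Agent("seller-agent")
offer = buyer.send({"intent": "buy_compute", "gpu_hours": 8, "max_price_usd": 12.0})
print(Agent.verify(offer, buyer.public_key))   # the seller only acts on authenticated offers
```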
Problems to solve: Off-chain verification and verifiable computing, Autonomous agent transactions, Inter-agent communication
Companies to watch: Nevermined, Naptha.ai, Olas Network, Fetch.ai
And that’s the thesis: scale, deploy and secure. As I said, it’s happening, update your priors. First, scale. By 2026 we'll see Gen5 GPT, Claude, Gemini and Llama; Gen6 models in 2028 requiring $100bn+ and 10GW of power; and MAYBE 100GW, $1 trillion Gen7 models just into the 2030s. Second, deploy. For systems to be pervasive, we must massively reduce token costs, making intelligence too cheap to meter at $0.0001/million tokens. Finally, secure. For society to accept AI we must protect privacy, offer fair and unbiased models and have open-access AI infrastructure.
Have I convinced you? Have I managed to pull together these 3 themes into a cohesive framework? Hmm, it’s decent I think. The “reduce semiconductor manufacturing costs” piece is the weakest imo. There is probably also something about embodiment and robotics missing too. But you know, we can only do our best out here.
This was the macro. Over the following weeks I’ll bring you the meso level for each of these opportunities. Next week: 1.1. On-site Power Generation. The week after: 1.2. Grid Upgrades and Ultra-High Voltage (UHV) Networks. Guess what I’ll bring you after that?
In iconic Lawrence fashion: a cohesive thesis delivered in a wild yet accessible manner. Repeat after me: SCALE DEPLOY SECURE