
Summary: TGT’s 2026 ICML Papers

The International Conference on Machine Learning (ICML), held annually for over forty years, is among the most influential conferences in modern AI research. This year in Seoul, ICML is hosting its second workshop on Technical AI Governance Research (TAIGR), and several members of MIRI’s Technical Governance Team (TGT) will attend in July.

This post summarizes the six TGT papers featured in this year’s TAIGR workshop. One discusses how states might lose the option to govern AI the longer they delay, and proposes actions that can be taken today to retain that option. Another examines the degree to which distributed training threatens compute governance options, and suggests countermeasures. A third addresses restrictions on AI research, spotlighting nearly thirty mechanisms that might help verify compliance. The remaining three papers cover specific technical methods that could help states verify chip use, even when dealing with untrusted rivals: a way to detect training workloads without reading everything on a chip, a way to monitor data to and from a cluster, and a demonstration that AI inference can be exactly reproduced for verification purposes.

The Closing Window: How Governments Could Lose Their Ability to Restrain Advanced AI

Peter Barnett

As AI capabilities increase, governments may eventually decide to prevent developers from making even more dangerous models. But if they wait too long, new developments might make effective restrictions unworkable. This paper analyzes plausible developments that may hamper governance, including proliferation of AI hardware, advances in AI algorithms, and release of models with dangerous capabilities, such as the ability to automate AI R&D. It also suggests prompt actions that preserve the option to govern AI, which include tracking existing and newly manufactured chips, developing robust verification technology, and facilitating information sharing between the U.S. and China.
Does Distributed Training Undermine Compute Governance?

Robi Rahman

Large language models are typically trained in networked computer clusters with a staggering number of chips. Compute governance regulates AI based on the amount of compute used to train a model, and so requires identifying and monitoring hardware. Governance could be undermined by distributed training, which spreads workloads across multiple linked clusters (e.g. over the internet). To address this problem, this paper (1) simulates the performance of distributed training based on published machine learning experiments, (2) tests the feasibility of training large models using nodes too small for regulators to track, and (3) explores detection methods. It finds that distributed training can produce frontier-level models above every existing or proposed compute threshold, even when using hardware that doesn’t trigger monitoring requirements. For instance, the highest currently regulated threshold of 10^26 FLOP can be surpassed with under $4 billion of hardware arranged in small nodes, and lower thresholds with even less. These costs would be unappealing, but not prohibitive, for those with incentives to develop frontier AI, and would therefore not prevent evasion. Robust regulation would need to be accompanied by countermeasures, such as chip tracking, whistleblower programs, and limits on unregistered hardware.

Verifying Restrictions on Frontier AI Research

Aaron Scher

AI progress depends primarily on three inputs: compute, algorithms, and data. A durable halt to the capabilities race might aim to freeze all three of these inputs; this paper focuses on algorithms (and somewhat on data). Specifically, improvements to AI algorithms come through machine learning research, which must be restricted if AI capabilities are to be lastingly controlled. Without recommending specific restrictions, this paper analyzes how states could verify compliance in an environment of low trust.
It explores over 20 verification options, including international search warrants, reviews of AI training code, and standard intelligence gathering tools. The paper highlights the potential of automated reviews that preserve privacy and security, escalating to human attention only when necessary. Whistleblower protections emerge as clear low-hanging fruit: easy to implement and requiring no technical R&D.

Detecting Hidden ML Training With Zero-Overhead Telemetry

Robi Rahman, Sabiha Tajdari

Many compute governance proposals, including MIRI’s international agreement, call for restricting AI model training. Such regulation would not restrict harmless use of powerful hardware, such as for physics simulations, graphics rendering, or running already-trained AI models (inference). However, if developers can disguise their training workloads as other computations, they could evade restrictions. Thankfully, the high-end graphics processing units (GPUs) used in AI training behave differently when running different workloads, and those differences can be used to tell training from harmless usage without having to read what’s on the chip. This work develops a classifier that detects when a GPU is being used for ML training based on hardware usage characteristics such as memory use and power consumption. It also describes a way to create classifiers that can detect disguised training workloads.

Fingerprinting All AI Cluster I/O Without Mutually Trusted Processors

Naci Cankaya, Jakub Kryś, Jonathan Ng, Luke Marks, Felix Krückel

If there is to be an international agreement on AI restrictions, we need to start building verification infrastructure for AI datacenters, so auditors don’t have to trust the systems they test, and datacenter owners don’t have to trust the auditors. This paper proposes a solution: record and encrypt all the information that enters or leaves a facility.
It works like this: all the wires that carry information into or out of a facility are tapped using a Secure Gateway Device whose architecture is known to all parties involved. The proposed device efficiently scrubs the input and output data to remove covert channels, then splits, hashes, and routes a copy of the hash offsite for verification. Auditors do not have access to the data, but they can challenge the facility owner to re-send inputs to the facility and confirm the hashes match. This setup makes it much harder for bad actors to extract the results of illicit workloads from a facility, and increases confidence that the facility is only running declared workloads.

Bit-Exact AI Inference Verification Without Performance Tradeoffs

Best paper award winner.

Naci Cankaya

International AI governance needs the ability to verify what is running on monitored chips in a low-trust environment with incentives to evade. One option is to re-run inputs on verifier machines using the same software that facility owners claim they’re using. Unfortunately, the floating-point operations performed by chips might not always produce identical results. Auditors reviewing calculation records would need to settle for an approximate match, and this approximation leaves degrees of freedom that can be exploited, such as by encoding covert information. This paper shows that sufficiently informed methods can still verify the outputs of modern AI setups. Most seemingly random variation is due to differences in how a given developer arranges their hardware and software. These differences can be recorded and duplicated to remove the seeming randomness, and they can also be used to fingerprint a developer’s setup. When setup details are known, inference outputs can be reproduced exactly, and the auditor is not required to have identical hardware.
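
To make the floating-point point concrete, here is a minimal Python sketch. It is not the paper's method. It only illustrates the general idea that the same inputs can yield different bits under different operation orders, and that recording the order makes an exact replay possible. The two reduction functions, the `REDUCTIONS` table, and the `run` and `bit_hash` helpers are all hypothetical names invented for this illustration, and "setup details" are reduced here to a single choice of summation order.

```python
import hashlib
import struct


def left_fold(values):
    """Add values strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total


def tree_reduce(values):
    """Add values pairwise, as a parallel reduction might."""
    if len(values) == 1:
        return values[0]
    mid = len(values) // 2
    return tree_reduce(values[:mid]) + tree_reduce(values[mid:])


# Hypothetical stand-in for "setup details": which reduction order was used.
REDUCTIONS = {"left_fold": left_fold, "tree_reduce": tree_reduce}


def bit_hash(x):
    """Hash the exact 64-bit pattern of a float so any bit difference shows."""
    return hashlib.sha256(struct.pack("<d", x)).hexdigest()


def run(setup, values):
    """Compute a result under a named setup and return (result, hash)."""
    result = REDUCTIONS[setup](values)
    return result, bit_hash(result)


values = [0.1, 0.2, 0.3]

# The facility declares its setup and publishes the hash of its output.
declared_setup = "tree_reduce"
_, declared_hash = run(declared_setup, values)

# An auditor who ignores the setup gets different bits for the same inputs.
_, naive_hash = run("left_fold", values)

# An auditor who replays the recorded setup reproduces the output exactly.
_, replay_hash = run(declared_setup, values)

print("naive replay matches:   ", naive_hash == declared_hash)
print("informed replay matches:", replay_hash == declared_hash)
```

The mismatch in the naive replay arises because floating-point addition is not associative: (0.1 + 0.2) + 0.3 and 0.1 + (0.2 + 0.3) round differently. Real inference stacks have many more sources of ordering variation than this toy example captures.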
