DeepMind breakthrough is a JEST

Posted on July 8, 2024 by Webmaster

A new way of training AI, JEST (Joint Example Selection), is a multimodal contrastive learning technique that, according to a DeepMind paper entitled “Data curation via joint example selection further accelerates multimodal learning”, surpasses state-of-the-art models with up to 13× fewer iterations and 10× less computation, making training significantly quicker, cheaper and more efficient. Essential to the performance of JEST is the ability to steer the data-selection process towards the distribution of smaller, well-curated datasets via pretrained reference models, exposing the level of data curation as a new dimension for neural scaling laws.

Data quality is a key driver of performance for large-scale pretraining, regardless of modality, and training on well-curated datasets has consistently demonstrated that strong performance can be achieved with significantly less data, even where up to 60% of the data is suspect. Manual curation, however, is difficult and expensive to scale, whereas model-based data curation, which uses features of the model being trained to select high-quality data, may improve the slow, power-law scaling of large-scale pretraining across all modalities. In computer vision, clusters of points that lie close to one another but carry different labels have been found to provide a more effective learning signal than trivially solvable ones, suggesting that batching data using model-based selection criteria may accelerate learning beyond what is possible by selecting examples individually. This multimodal learning, which exposes the interactions between examples in a batch, can be used to derive “a simple and tractable algorithm for joint example selection” (JEST) that efficiently selects relevant ‘sub-batches’ of data from much larger ‘super-batches’ given their model-based scores.
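The selection step can be sketched as follows. This is a simplified illustration, not DeepMind’s implementation: it assumes per-example losses from the learner and from a pretrained reference model, uses the paper’s “learnability” idea (learner loss minus reference loss) as the score, and fills the sub-batch in chunks by weighted sampling as a cheap stand-in for fully joint scoring; the function names and the chunked sampling scheme are my own.

```python
import numpy as np

def learnability(learner_loss, reference_loss):
    # High when the learner still finds the example hard but the
    # well-curated reference model finds it easy: worth training on.
    return learner_loss - reference_loss

def select_sub_batch(learner_loss, reference_loss, sub_batch_size,
                     n_chunks=4, seed=0):
    # Pick a sub-batch from the scored super-batch in several chunks,
    # sampling without replacement in proportion to exp(score).
    rng = np.random.default_rng(seed)
    scores = learnability(learner_loss, reference_loss)
    remaining = np.arange(len(scores))
    chosen = []
    chunk = sub_batch_size // n_chunks
    for _ in range(n_chunks):
        s = scores[remaining]
        p = np.exp(s - s.max())
        p /= p.sum()
        idx = rng.choice(len(remaining), size=chunk, replace=False, p=p)
        chosen.extend(remaining[idx].tolist())
        remaining = np.delete(remaining, idx)
    return np.array(chosen)

# Toy super-batch of 32 examples; keep a sub-batch of 8.
rng = np.random.default_rng(1)
learner = rng.uniform(0.5, 2.0, size=32)
reference = rng.uniform(0.1, 1.5, size=32)
sub = select_sub_batch(learner, reference, sub_batch_size=8)
```

In the paper the scores come from the batch-level contrastive loss, so examples are scored jointly rather than independently; the chunked sampling above only gestures at that interaction.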

By training a single model at multiple resolutions in parallel, DeepMind found that it could use the model to score large super-batches and find their most learnable sub-batches far more effectively and efficiently. Applying these savings to both learning and example scoring reduced the overhead of scoring from 133% to 10% additional FLOPs while maintaining significant gains in training efficiency, achieving comparable performance with 11× fewer iterations and 10× fewer FLOPs. Central to these gains is the ability to steer the curation process towards the distribution of smaller, well-curated datasets; coupled with model-based selection criteria, this prioritizes examples that most resemble the data the reference model was trained on. This strong data-quality bootstrapping effectively guides the curation of a much larger dataset, allowing the training of a model that strongly surpasses the quality of the reference model on many downstream tasks.
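A back-of-the-envelope sketch of where overhead figures like these can come from. The 4× super-batch ratio and the roughly 13× cheaper low-resolution forward pass below are illustrative assumptions, not figures taken from the paper; the only structural assumption is that training an example costs about three forward-pass equivalents (one forward plus a backward pass of roughly twice that), while scoring needs only a forward pass, which multi-resolution training lets the model run at much lower resolution.

```python
def scoring_overhead(super_to_sub_ratio, rel_forward_cost=1.0):
    # Extra scoring FLOPs relative to training FLOPs, assuming
    # training ~ 3 forward-pass equivalents per example (fwd + bwd)
    # and scoring ~ one forward pass per super-batch example.
    return super_to_sub_ratio * rel_forward_cost / 3.0

# Full-resolution scoring of a hypothetical 4x super-batch:
full = scoring_overhead(4)           # 4/3, i.e. ~133% extra FLOPs
# Low-resolution scoring at ~1/13 the forward cost:
cheap = scoring_overhead(4, 1 / 13)  # 4/39, i.e. ~10% extra FLOPs
```

Under these assumed numbers, cheap low-resolution scoring is what turns super-batch filtering from more-expensive-than-training into a modest overhead.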

The AI industry is known for its high energy consumption, demanding large amounts of electricity and of water for cooling. Microsoft’s water consumption, for example, rose by 34% from 2021 to 2022 due to increased AI computing demands, and ChatGPT uses nearly half a litre of water for every 5 to 50 prompts, leading the International Energy Agency (IEA) to predict that data-centre electricity consumption will double between 2022 and 2026. By optimizing data selection for AI training, JEST can significantly reduce the number of iterations and the computational power needed, and thus overall energy consumption.

Category: NEWS, TECHNOLOGY

