Researchers from the University of York and Université Paris-Saclay Introduce DeepKnowledge for Generalisation-Driven Deep Learning Testing 

  • Deep Neural Networks (DNNs) excel in complex tasks but lack consistency when faced with data variations.
  • DeepKnowledge, developed by researchers at the University of York and Université Paris-Saclay, focuses on improving DNN generalization through knowledge-driven testing.
  • The method uses zero-shot learning to evaluate DNNs’ generalization capacity under domain shifts.
  • DeepKnowledge identifies transfer knowledge (TK) neurons crucial for information transfer across domains, allocating more testing resources to them.
  • It employs a TK-based adequacy criterion to measure how adequate an input set is for testing; large-scale evaluations support the criterion’s effectiveness.
  • The project webpage offers case studies and an open-source DeepKnowledge tool for collaboration and exploration.
  • Future plans involve extending support for object detection models, automating data augmentation, and enabling model pruning.

Main AI News:

Deep Neural Networks (DNNs) have advanced rapidly and now handle many complex tasks, in some cases matching or surpassing human performance. They are widely used in critical domains such as autonomous driving, flight control systems, and healthcare, where safety and security are paramount concerns.

However, consistency remains a challenge: DNN models can behave unstably when confronted with even minor deviations in input data. High-profile accidents, such as crashes involving Tesla’s Autopilot, have underscored concerns regarding the reliability of DNNs in safety-critical applications. Industrial studies indicate that operational data often deviates significantly from the training data distribution, which degrades performance and raises doubts about the models’ resilience to unforeseen shifts and adversarial attacks.

To address these challenges, a recent study by researchers at the University of York and Université Paris-Saclay introduces DeepKnowledge, a knowledge-driven test criterion for DNN systems. DeepKnowledge is built on the principle of out-of-distribution generalization, aiming to enhance the understanding of how models make decisions and improve their generalization capacity.

The method uses zero-shot learning to assess a model’s generalization ability when confronted with domain shifts. By analyzing generalization behavior at the neuron level, DeepKnowledge identifies transfer knowledge (TK) neurons, which play a crucial role in reusing and transferring information across different domains. These TK neurons are deemed essential for ensuring proper DNN behavior and are allocated a larger portion of the testing budget.
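To make the neuron-level analysis more concrete, here is a minimal sketch of how candidate TK neurons might be ranked. It is an illustration under stated assumptions, not the DeepKnowledge implementation: the scoring by Hellinger distance between activation histograms, the rule of keeping the neurons whose activations shift least across domains, and the names `neuron_shift_scores` and `select_tk_candidates` are all hypothetical choices made for this example.

```python
import math

def histogram(values, lo, hi, bins):
    """Normalised histogram of `values` over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def neuron_shift_scores(source_acts, shifted_acts, bins=10):
    """Per-neuron distance between activation distributions on two domains.

    source_acts / shifted_acts: list of rows, one row per input,
    each row holding the activation of every neuron in the layer.
    """
    scores = []
    for j in range(len(source_acts[0])):
        src = [row[j] for row in source_acts]
        shf = [row[j] for row in shifted_acts]
        lo, hi = min(src + shf), max(src + shf)
        if hi == lo:  # constant neuron: no measurable shift
            scores.append(0.0)
            continue
        scores.append(hellinger(histogram(src, lo, hi, bins),
                                histogram(shf, lo, hi, bins)))
    return scores

def select_tk_candidates(scores, fraction=0.2):
    """Assumption: keep the `fraction` of neurons with the smallest shift."""
    k = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j])
    return sorted(ranked[:k])
```

In this sketch, `source_acts` would come from in-distribution inputs and `shifted_acts` from the unseen (zero-shot) domain; the returned indices are the neurons that would receive the larger share of the testing budget.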

DeepKnowledge employs a TK-based adequacy criterion to measure the appropriateness of input sets, considering the coverage of transfer knowledge neuron clusters. The study demonstrates the effectiveness of DeepKnowledge through large-scale evaluations with diverse datasets and DNN models for image recognition tasks. Results highlight the correlation between test suite diversity, DNN problem detection, and the efficacy of DeepKnowledge’s test adequacy criterion.
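A coverage-style adequacy score of this kind can be sketched as follows. This is a simplified illustration rather than the criterion defined in the study: it assumes each TK neuron's activation range is split into equal-width buckets standing in for the neuron clusters, and the function name `tk_coverage`, the `bounds` mapping, and the `n_clusters` parameter are hypothetical.

```python
def tk_coverage(test_acts, tk_neurons, bounds, n_clusters=4):
    """Fraction of (TK neuron, activation bucket) pairs hit by a test set.

    test_acts:  list of rows, one per test input, each row = per-neuron activations
    tk_neurons: indices of the neurons selected as TK neurons
    bounds:     dict mapping neuron index -> (lo, hi) activation range
                observed on reference data
    """
    covered = set()
    for row in test_acts:
        for j in tk_neurons:
            lo, hi = bounds[j]
            if hi <= lo or not (lo <= row[j] <= hi):
                continue  # out-of-range activations are ignored in this sketch
            width = (hi - lo) / n_clusters
            bucket = min(int((row[j] - lo) / width), n_clusters - 1)
            covered.add((j, bucket))
    total = len(tk_neurons) * n_clusters
    return len(covered) / total if total else 0.0
```

Under these assumptions, a test set that only exercises a narrow band of each TK neuron's activation range scores low, while a more diverse set that reaches more buckets scores closer to 1.0.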

In addition to the research findings, the project webpage offers access to case studies and an open-source DeepKnowledge tool, fostering collaboration and further exploration in this domain. Looking ahead, the research team envisions expanding DeepKnowledge’s capabilities to support object detection models, automating data augmentation, and enabling model pruning. These initiatives underscore the team’s dedication to advancing DNN testing methodologies and enhancing the reliability of DNN systems in real-world applications.


DeepKnowledge represents a notable step toward more reliable Deep Neural Networks (DNNs) by targeting their generalization weaknesses. By exposing where a model fails to generalize, the approach can help teams improve their models and build trust in deploying DNN systems across safety-critical domains. Businesses in sectors that rely on DNN technologies should consider adopting DeepKnowledge to mitigate risks and improve the robustness of their applications, which may in turn strengthen competitiveness and customer satisfaction.