Lancelot Framework Enables Secure, Encrypted AI Collaboration Across Institutions


Researchers have developed a new framework for machine learning that allows multiple institutions to collaborate on training artificial intelligence models without exposing sensitive private data. Named Lancelot, the system uses an advanced form of encryption to build a defensive wall against malicious attacks and privacy leaks, creating a secure environment for developing AI in heavily regulated fields like healthcare and finance. This approach not only strengthens security but also dramatically increases computational speed, outperforming previous methods by more than twentyfold and clearing a critical barrier to practical use.

The development addresses a central challenge in modern AI: how to harness the power of diverse datasets while complying with strict data privacy laws. The technique, known as federated learning, trains models by having participants submit model updates, or “weights,” instead of their raw data. While this protects privacy, it leaves the central model vulnerable to data poisoning, where bad actors submit corrupted updates. Furthermore, even secure models can inadvertently “memorize” and leak parts of the private data they were trained on. Lancelot confronts both problems by allowing a central server to perform calculations on fully encrypted data, ensuring that private information is never visible while simultaneously identifying and rejecting malicious contributions.

The Federated Learning Dilemma

Federated learning has emerged as a crucial paradigm for collaborative AI, especially in sectors where data governance is paramount. This distributed machine learning approach allows organizations—such as hospitals wanting to train a diagnostic model on patient scans—to work together without sharing the underlying confidential data. Instead of pooling sensitive information in one place, each institution trains a copy of the model on its local data and then sends only the resulting model parameters to a central server. The server then aggregates these parameters to create a more robust, globally improved model.
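
To make the mechanics concrete, the sketch below shows a toy federated-averaging round in Python using NumPy and a simple linear model. It is purely illustrative and not Lancelot's implementation: real deployments train neural networks, run many local steps per round, and typically weight each client's contribution by its dataset size.

```python
import numpy as np

def local_update(global_w, X, y, lr=0.1):
    # One gradient step of least-squares regression on a client's private
    # data; only the updated weights leave the institution, never X or y.
    grad = 2 * X.T @ (X @ global_w - y) / len(y)
    return global_w - lr * grad

def aggregate(client_weights):
    # Server-side federated averaging of the submitted parameter vectors.
    return np.mean(np.stack(client_weights), axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])            # the pattern hidden in every client's data
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    clients.append((X, y))

global_w = np.zeros(2)
for _ in range(20):
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = aggregate(updates)

print(global_w)                           # approaches [2, -1]; no raw data was pooled
```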

Despite its privacy advantages, this process is susceptible to adversarial threats. Malicious clients can intentionally send manipulated model updates in what are known as poisoning or Byzantine attacks, aiming to corrupt the final aggregated model and compromise its integrity. To counter this, researchers developed Byzantine-robust federated learning (BRFL) systems, which use specialized aggregation rules to identify and mitigate the impact of these harmful updates. However, these defenses introduced another pressing risk: to vet the updates, the server must inspect them, and because neural networks can effectively memorize and reveal specific details from their training data, the exposed updates give attackers a pathway to reconstruct sensitive information. This left a critical gap: no system was simultaneously robust against attacks, fully protected against information leakage, and computationally efficient enough for real-world applications.

A Novel Cryptographic Shield

Lancelot provides a comprehensive solution by integrating fully homomorphic encryption (FHE), a cutting-edge cryptographic technique that allows computations to be performed directly on encrypted data. This means the central server can aggregate the model updates without ever needing to decrypt them, providing an unprecedented layer of security against both internal and external threats.

Fully Homomorphic Encryption

The core of Lancelot’s privacy protection is FHE. In a typical federated learning setup, model updates might be exposed to the central server during the aggregation phase. Lancelot eliminates this risk entirely. Each client encrypts its trained model weights before transmission, and they remain encrypted throughout the entire process. The framework uses a specific FHE scheme known as CKKS, which is well-suited for machine learning because it supports the approximate arithmetic required for these complex calculations. By leveraging FHE, Lancelot ensures that sensitive data embedded in the model weights is never visible to the server or any potential eavesdroppers.
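
As an illustration of what CKKS makes possible, the following sketch uses the open-source TenSEAL library to average two encrypted weight vectors without decrypting them. The library choice and the encryption parameters here are illustrative assumptions, not the configuration Lancelot itself uses.

```python
import tenseal as ts

# Shared CKKS context; in Lancelot, key generation and distribution are
# handled by a dedicated Key Generation Center rather than a single party.
context = ts.context(ts.SCHEME_TYPE.CKKS,
                     poly_modulus_degree=8192,
                     coeff_mod_bit_sizes=[60, 40, 40, 60])
context.global_scale = 2 ** 40
context.generate_galois_keys()

# Each client encrypts its trained weight vector before transmission.
client_a = ts.ckks_vector(context, [0.12, -0.40, 0.93])
client_b = ts.ckks_vector(context, [0.10, -0.36, 0.91])

# The server averages the ciphertexts without ever decrypting them.
encrypted_mean = (client_a + client_b) * 0.5

# Only a key holder can recover the result; values come back approximate,
# which is why CKKS suits machine-learning arithmetic.
print(encrypted_mean.decrypt())   # roughly [0.11, -0.38, 0.92]
```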

Resisting Malicious Attacks

To achieve Byzantine robustness, the system must be able to identify and discard poisoned updates sent by malicious clients. Lancelot accomplishes this entirely in the encrypted domain. The server homomorphically computes the pairwise distances between all the encrypted model updates it receives, which lets it gauge how similar the contributions are without ever seeing their content. Updates that are significant outliers from the consensus are flagged as likely malicious and excluded from the final aggregated model, preserving its integrity and reliability.
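
The sketch below shows this filtering logic in plain Python on unencrypted vectors; in Lancelot the same arithmetic is carried out homomorphically over ciphertexts. The simple "smallest total distance" selection rule used here is a stand-in for the framework's actual Byzantine-robust criterion.

```python
import numpy as np

def pairwise_sq_distances(updates):
    # Squared Euclidean distance between every pair of model updates.
    U = np.stack(updates)
    diff = U[:, None, :] - U[None, :, :]
    return (diff ** 2).sum(axis=-1)

def select_benign(updates, n_keep):
    # Keep the n_keep updates whose total distance to all others is smallest,
    # i.e. those closest to the overall consensus.
    scores = pairwise_sq_distances(updates).sum(axis=1)
    keep = np.argsort(scores)[:n_keep]
    return [updates[i] for i in keep]

rng = np.random.default_rng(1)
honest = [np.array([1.0, 1.0]) + 0.05 * rng.normal(size=2) for _ in range(4)]
poisoned = [np.array([50.0, -50.0])]              # a corrupted contribution
kept = select_benign(honest + poisoned, n_keep=4)
# The outlier is dropped before aggregation; only the four honest updates remain.
```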

Architecture of the Digital Round Table

The Lancelot framework operates on a unique tripartite architecture consisting of the clients, a central server, and a dedicated Key Generation Center (KGC). This structure is essential for its novel interactive sorting mechanism, which is designed to overcome the immense computational challenges typically associated with ordering encrypted data. The KGC is a trusted entity responsible for generating and managing the cryptographic keys used in the FHE system.

The process of filtering malicious updates relies on a technique called mask-based encrypted sorting. After the server calculates the distances between encrypted models, it sends the encrypted distance list to the KGC. The KGC decrypts this list, sorts it to identify the legitimate models, and then encodes this sorting information into a cryptographic permutation matrix, or a “mask.” This mask, which effectively hides the indices of the models to be selected, is sent back to the server. The server can then apply this mask to aggregate the correct, verified model updates without ever learning which client contributed which update, achieving robust filtering with zero information leakage.
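
The following sketch walks through this round in the clear, purely to show the division of labor. In the real protocol the distance list reaches the KGC encrypted and the returned mask is itself encrypted, so the server never learns which clients were selected; the simple 0/1 selection vector here stands in for the permutation matrix described above.

```python
import numpy as np

def kgc_build_mask(distance_scores, n_keep):
    # KGC role: decrypt and sort the scores, then encode the selection as a
    # 0/1 mask over client indices (a simplified stand-in for the
    # cryptographic permutation matrix).
    keep = np.argsort(distance_scores)[:n_keep]
    mask = np.zeros(len(distance_scores))
    mask[keep] = 1.0
    return mask

def server_masked_aggregate(updates, mask):
    # Server role: apply the mask to the stacked updates and average only the
    # selected ones; with an encrypted mask it never learns which indices
    # were chosen.
    U = np.stack(updates)
    return (mask[:, None] * U).sum(axis=0) / mask.sum()

updates = [np.array([1.0, 2.0]), np.array([1.1, 1.9]), np.array([9.0, -9.0])]
scores = np.array([0.2, 0.3, 40.0])      # summed distances from the previous step
mask = kgc_build_mask(scores, n_keep=2)  # KGC returns [1, 1, 0]
global_update = server_masked_aggregate(updates, mask)
print(global_update)                     # [1.05, 1.95]: the poisoned update is ignored
```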

Breakthroughs in Speed and Efficiency

A major barrier to the widespread adoption of fully homomorphic encryption has been its immense computational overhead. Performing calculations on encrypted data is inherently slower than on unencrypted data. Lancelot overcomes this challenge through a combination of algorithmic innovations and hardware acceleration. Extensive experiments, including tests on medical imaging diagnostics and standard public image datasets, have shown that Lancelot is remarkably efficient.

The framework delivers more than a twenty-fold increase in processing speed compared to existing Byzantine-robust federated learning systems. This dramatic performance gain is achieved by implementing several cryptographic enhancements, such as Lazy Relinearization and Dynamic Hoisting, which optimize the complex operations involved in FHE. Furthermore, the system is designed to leverage GPU hardware acceleration to further streamline computations. This leap in efficiency makes Lancelot not just a theoretical model but a practical and scalable solution for real-world, large-scale AI applications.

Implications for Sensitive Data Collaboration

The development of the Lancelot framework marks a significant advance for secure multi-party computations and privacy-preserving machine learning. By successfully combining robust security against targeted attacks with complete data privacy through advanced encryption, it sets a new benchmark for the field. The platform is particularly transformative for regulated industries like healthcare and finance, where the potential of collaborative AI has been constrained by data-sharing restrictions.

With Lancelot, multiple hospitals could collaborate to train a powerful diagnostic AI on their collective medical data without violating patient privacy. Similarly, financial institutions could build more effective fraud detection models by securely leveraging shared insights. The framework paves the way for safer, more efficient, and more powerful federated learning applications, opening the door to new scientific collaborations that were previously considered too risky from a data privacy standpoint.
