End-to-End In-Silico Antibody Humanization Pipeline for Google Colab
Keywords:
Antibody humanization, immunogenicity reduction, protein language models, sequence embeddings, humanness scoring, framework mutations, in-silico antibody design, ColabFold structure prediction
Abstract
Antibodies are Y-shaped proteins produced by the immune system to recognise and tag foreign substances for destruction. Therapeutic antibodies have become an important class of drugs for treating infectious diseases, autoimmune disorders, and cancers, but many promising candidates are raised in non-human species and cannot be used directly in humans because they are immunogenic. The human immune system may recognise these non-human antibodies as foreign and attack them, rendering them unsafe for clinical use. Antibody humanization addresses this problem by modifying the amino acid sequence so that the antibody looks more human while retaining its structure, binding function, and basic developability properties.
In this project, I built an end-to-end, fully in-silico antibody humanization pipeline in Google Colab. The workflow starts from raw antibody FASTA sequences, applies standardized numbering to define framework and CDR regions, and then uses protein language model embeddings and a supervised classifier to estimate how “human-like” each sequence is. Guided by this humanness score, the pipeline proposes small mutations mainly in the framework regions, keeping CDRs fixed, and accepts only those changes that improve the score. Finally, ColabFold is used to check that the humanized sequence still preserves a reasonable antibody fold at the structural level. By combining AI-based sequence representations with simple, transparent design rules in a Colab notebook, this pipeline aims to make antibody humanization faster and more accessible to students and small labs, while still keeping the steps understandable and easy to modify.
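The accept-only-if-improved mutation step described above can be sketched as a small greedy loop. This is a minimal illustration under stated assumptions, not the notebook's actual code: the function names (`toy_humanness`, `greedy_humanize`), the 0-based position indexing, and the example sequences are all hypothetical, and the toy identity score merely stands in for the embedding-based classifier used in the pipeline.

```python
from typing import Callable, Iterable, List, Set, Tuple


def toy_humanness(seq: str, human_ref: str) -> float:
    """Stand-in score (assumption): fraction of positions matching a human reference.

    The pipeline itself uses a classifier on protein language model embeddings;
    any function mapping a sequence to a float can be plugged in instead.
    """
    matches = sum(a == b for a, b in zip(seq, human_ref))
    return matches / len(human_ref)


def greedy_humanize(
    seq: str,
    cdr_positions: Set[int],
    candidates: Iterable[Tuple[int, str]],
    score: Callable[[str], float],
) -> Tuple[str, float, List[Tuple[int, str, str]]]:
    """Try point mutations one by one; keep a mutation only if the score rises.

    Positions listed in cdr_positions (0-based) are never mutated.
    """
    current = seq
    best = score(current)
    accepted = []
    for pos, new_aa in candidates:
        old_aa = current[pos]
        if pos in cdr_positions or old_aa == new_aa:
            continue  # CDRs stay fixed; skip no-op substitutions
        trial = current[:pos] + new_aa + current[pos + 1:]
        trial_score = score(trial)
        if trial_score > best:  # accept only strict improvements
            current, best = trial, trial_score
            accepted.append((pos, old_aa, new_aa))
    return current, best, accepted


# Hypothetical 8-residue fragments, for illustration only.
query = "QVKLQESG"
human_ref = "EVQLVESG"
cdrs = {4, 5}  # pretend these positions belong to a CDR

# Propose the human reference residue at every position.
proposals = list(enumerate(human_ref))
humanized, final_score, changes = greedy_humanize(
    query, cdrs, proposals, lambda s: toy_humanness(s, human_ref)
)
print(humanized, final_score, changes)
```

In this toy run the two framework mismatches are replaced while the mismatch at CDR position 4 is left untouched, which is the behaviour the design rule is meant to guarantee. Swapping the lambda for a real humanness model changes the scores but not the control flow.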
Conference Proceedings Volume
License
Copyright (c) 2026 DMPedia Lecture Notes in Computer Science & Engineering

This work is licensed under a Creative Commons Attribution 4.0 International License.