Issue: October 2022

DRUG DISCOVERY - Getting the Most From a DNA-Encoded Library Screen


DNA-encoded library (DEL) technology has emerged in the past 10 years as a powerful and innovative approach to pharmaceutical lead discovery.1 By harnessing the power of molecular biology, DEL technology allows the screening of chemical libraries with unprecedented size and diversity. In addition, because the libraries are screened as mixtures, the entire screening experiment can be conducted in a volume of a few microliters. This allows screens to be multiplexed across various parameters to produce uniquely informative output data.

Most large pharmaceutical companies now access a DEL platform through partnering, an internal capability, or a hybrid of the two. Smaller biotech companies are increasingly interested in accessing DEL to help build their pipelines. Despite this high level of interest, there remains variable understanding of how to maximize the potential of a DEL screen. The following will share how we at X-Chem think about DEL screening and how our partners get the most from this powerful technology when working with our platform.


DEL technology is rooted in the challenge of combinatorial chemical synthesis. Starting in the 1990s, chemists found that large numbers of compounds could be prepared through a procedure known as split-and-pool synthesis, where all possible combinations of individual reactants are used to generate large numbers of different products. While large libraries could be conveniently created, identifying the active library members was a challenge. Compounds were often attached to beads, where most analytical methods cannot operate.
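The combinatorial logic of split-and-pool synthesis can be illustrated with a minimal sketch. The building-block names and counts below are hypothetical and chosen only to keep the example small; the point is that the number of products is the product of the number of reactants used in each cycle.

```python
from itertools import product

def split_and_pool(cycles):
    """Enumerate every product of a split-and-pool synthesis.

    `cycles` is a list of lists; each inner list holds the reactants
    (building blocks) used in one synthetic cycle.
    """
    return list(product(*cycles))

# Hypothetical building-block names: three reactants in each of two cycles.
cycles = [["A1", "A2", "A3"], ["B1", "B2", "B3"]]
library = split_and_pool(cycles)

print(len(library))  # 9
print(library[0])    # ('A1', 'B1')
```

Enumerating a library explicitly is only practical for toy sizes; for real libraries one would compute the size arithmetically rather than list every member.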

These challenges were addressed through the concept of encoding. In an encoded library, an easily analyzable chemical system is employed as an identifier for the library compounds of interest. Many encoding schemes were developed, notably the Still GC tagging system, later employed by Pharmacopeia.2 The idea of using DNA as the encoding element was first proposed in 1992, and its advantages as an information storing chemical system were immediately apparent.3

It took another 15 years, however, for DNA-encoding to really take hold. The key advance was the recognition that one could discard the bead and synthesize the libraries as mixtures in solution. Borrowing from molecular evolution techniques, the resulting encoded mixtures could be enriched for binders by affinity-mediated selection procedures, and the identities of the ligands decoded through DNA sequencing of the retained fraction.


“Garbage in, garbage out” is a truism that can be applied to numerous processes and experiments in science and beyond. A DEL selection experiment is no different. The discovery of ligands from numerically large encoded libraries depends on the physical segregation of the binders from the bulk library. This separation is achieved by exposing the library to the protein of interest and removing the unbound fraction by washing. Obviously, the quality of the protein reagent will be a primary factor in the quality of the screening output.

High-quality protein reagents from Proteros allowed the discovery and characterization of this covalent BTK inhibitor.

In functional screening paradigms, such as HTS, the primary quality metric is the reproducibility and signal-to-noise ratio of the functional assay, often expressed as the Z’.4 By optimizing assay conditions and choosing the appropriate reagents, high Z’ values can be achieved even with relatively crude protein samples. This stands in contrast to biophysical screening techniques, such as crystallography, SPR, ASMS, or DEL, in which the primary quality metrics are related to the purity, aggregation state, and fraction of the protein sample that is in the biologically relevant conformation.
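For readers less familiar with the Z’ statistic, the sketch below computes it from positive- and negative-control wells using the standard definition, Z’ = 1 - 3(σp + σn)/|μp - μn|. The control readings are invented for illustration, and the use of the sample standard deviation is an assumption of this sketch.

```python
from statistics import mean, stdev

def z_prime(positive, negative):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(positive) - mean(negative))
    if separation == 0:
        raise ValueError("control means are identical; Z' is undefined")
    return 1 - 3 * (stdev(positive) + stdev(negative)) / separation

# Hypothetical control-well readings (arbitrary signal units).
positive_controls = [100, 102, 98, 101, 99]
negative_controls = [10, 12, 8, 11, 9]

print(round(z_prime(positive_controls, negative_controls), 2))  # 0.89
```

The statistic rewards a wide gap between the control means and tight replicates. It says nothing about protein purity or aggregation state, which is why it is not a sufficient quality metric for biophysical methods.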

At X-Chem, we subject protein reagents to a rigorous qualification process. It includes solution-phase techniques, such as dynamic light-scattering (DLS), size-exclusion chromatography (SEC), and differential scanning fluorimetry (DSF), as well as capture assessment and a check that the protein maintains the appropriate conformation and accessibility in the immobilized state. We sometimes observe that protein samples that are amenable to functional assay development do not meet our standards for DEL screening.

For researchers considering a DEL screen for pharmaceutical hit generation, we would make the following recommendations:

  1. If you do not have an internal protein production capability, partner with a premium provider. Companies that provide access to structural biology capability often have expertise in generating high-quality protein. We have found that providers, such as Proteros, can offer a differentiated capability in delivering high-quality protein (as well as biophysical assays and structures). X-Chem and Proteros have a rich history of pairing high-quality reagents with DEL screening.
  2. If you do have an internal capability, be sure to set stringent criteria for the reagents you will produce for DEL screening. Reagents generated for X-ray crystallography often meet our quality requirements.
  3. Consider exploring alternative selection modalities that circumvent the need for purified protein reagents. At X-Chem, we have developed selection protocols in which the library is applied directly to cell lysate. This technique is often the method of choice for projects whose protein is difficult to express, or that require a multi-protein complex to maintain correct fold and function.


One of the key aspects of DEL-based discovery is its ability to probe specific aspects of selectivity and site-of-action during the screening experiment. Due to its miniaturized format, affinity-mediated selection can be conveniently conducted under a variety of conditions in parallel. These multiplexed selections can provide rich data sets that assess the output compounds across a large number of useful parameters.

Example of a profile plot of three separate chemical series across four selection conditions.5

The most common and useful parameter examined during DEL selection is selectivity. Tuning out activity against undesirable but closely related targets is a frequent challenge in target-based drug discovery. Conversely, pan-activity across a number of targets can be therapeutically advantageous, particularly when the related targets are mutants that confer resistance to an established treatment. Both these situations can be effectively addressed by multiplexed DEL selection. One need only conduct selections against the various targets in parallel, and examine the output for overlap or uniqueness across the various selections. At X-Chem, we commonly visualize such an analysis using a bar chart, referred to as a “profile.” For a particular chemical series, its enrichment in each selection is represented by the bar height. Selective compounds should only exhibit enrichment at a single target. Poly-selective compounds will show enrichments against a number of targets. Compounds that exhibit a profile consistent with the pharmacological rationale of the project are prioritized for follow-up.
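The profile logic described above can be sketched in a few lines. This is an illustration only, not X-Chem's analysis code: the enrichment values, target names, and the cutoff of 10 are all hypothetical, and a real analysis would also account for no-target controls and sequencing depth.

```python
def classify_profile(enrichment, threshold=10.0):
    """Label a chemical series by the targets at which it is enriched.

    `enrichment` maps a selection (target) name to an enrichment value.
    `threshold` is an arbitrary illustrative cutoff, not a published value.
    """
    hits = sorted(t for t, e in enrichment.items() if e >= threshold)
    if not hits:
        return "not enriched", hits
    if len(hits) == 1:
        return "selective", hits
    return "poly-selective", hits

# Hypothetical profiles for two chemical series.
series_a = {"target": 45.0, "homolog": 1.2, "mutant": 0.8}
series_b = {"wild type": 30.0, "mutant 1": 25.0, "mutant 2": 2.0}

print(classify_profile(series_a))  # ('selective', ['target'])
print(classify_profile(series_b))  # ('poly-selective', ['mutant 1', 'wild type'])
```

Which label is desirable depends on the project: a selective profile suits a program that must avoid a related target, while a poly-selective profile suits one that must cover resistance mutants.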

While selection multiplexing of this sort is a powerful tool for discovery of useful ligands, it does come with an important caveat. As discussed in the preceding section on protein quality, in order for the data to be useful, the targets must be of similarly high quality, available at similar concentrations, and ideally appended with the same affinity tag. A selection campaign in which the primary target is of high quality, but the various additional targets are not, will have only limited utility in probing questions of selectivity. Therefore, the decision to conduct a multiplexed selection experiment must take into account the additional expense of reagent (or cell lysate) generation. We have found that protein production is a key factor governing the scope of a selection campaign. At X-Chem, we routinely conduct campaigns that contain multiple individual selection conditions.

Structure of two atom-efficient X-Chem libraries and their resulting physicochemical properties.

Another parameter that can be conveniently addressed in selection is competitive behavior. By adding a high concentration of a known ligand to a selection, we can saturate a protein’s binding site and render it unavailable to library members. Library compounds that are enriched in the apo selection, but absent when the ligand is added, are likely to be competitive with the added ligand. This technique is effective at focusing efforts on ligands that have a high likelihood of being functionally active, as they compete with a known functional ligand. On the other hand, library members that do not compete with a known ligand could potentially be allosteric binders. While their functional characteristics cannot be predicted prior to follow-up, they could, if active, represent discovery of a new functional binding pocket on the target.

There are a few factors to keep in mind when designing a competition experiment in DEL selection. The first is the potency and solubility of the added ligand. At X-Chem, we aim to saturate the binding site with a stoichiometric excess of ligand over protein. Because DEL selections are typically run at high protein concentrations (ca. 1 μM), the tool compound usually needs a solubility greater than 10 μM. For lower affinity ligands, even higher concentrations may be needed. All these factors must be assessed prior to designing the selection experiment.
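To see why ligand affinity matters as much as solubility, the following sketch estimates the fraction of protein occupied by the competitor using the standard 1:1 binding equation with ligand depletion. The 1 μM protein and 10 μM ligand concentrations come from the text above; the Kd values are hypothetical.

```python
from math import sqrt

def fraction_protein_bound(protein_um, ligand_um, kd_um):
    """Fraction of protein in complex for simple 1:1 binding.

    Solves the quadratic for the complex concentration, so it stays
    valid when the protein concentration is comparable to Kd.
    All concentrations are in micromolar.
    """
    total = protein_um + ligand_um + kd_um
    complex_um = (total - sqrt(total**2 - 4 * protein_um * ligand_um)) / 2
    return complex_um / protein_um

# 1 uM protein, 10 uM competitor, potent ligand (hypothetical Kd = 0.1 uM).
print(round(fraction_protein_bound(1, 10, 0.1), 2))   # 0.99
# Same concentrations, weak ligand (hypothetical Kd = 10 uM).
print(round(fraction_protein_bound(1, 10, 10), 2))    # 0.49
# Weak ligand at a tenfold higher concentration.
print(round(fraction_protein_bound(1, 100, 10), 2))   # 0.91
```

Under these assumptions, a tenfold excess of a potent ligand nearly saturates the site, whereas a weak ligand at the same concentration leaves about half of the protein free to capture library members.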

Interpretation of competitive selection results requires a thoughtful approach. It is commonly assumed that reduced enrichment of library members by added ligands is an indicator of orthosteric competition. It is possible, however, that such behavior can be caused by allosteric communication between the library member binding site and that of the added ligand. This situation is especially likely in proteins, such as GPCRs, in which long-range conformational changes have functional consequences.

In addition to known inhibitors, competition selection experiments can be conducted using protein binding partners, cofactors, peptides, antibodies, nucleic acids, or any other entities known to bind or otherwise influence the target protein. In some cases, binding to a cofactor or other binding partner is required to organize the target protein into its active conformation. In these cases, one may often observe ligand-dependent enrichment of library members. When the ligand is a substrate or cofactor, then the resulting complex may represent a biologically relevant form of the target.


One of the appealing aspects of DEL screening is that it does not require choosing which chemical matter to search in the selection experiment. The depth of modern sequencing techniques allows the inclusion of all the DELs available to the practitioner. At X-Chem, we include all of our libraries in every selection we conduct. At the analysis stage, however, we do find it useful to prioritize certain classes of chemistry, depending on the needs of the project and the productivity of the selection.

DELs in the past have often suffered from poor physicochemical properties, in particular high molecular weights and calculated lipophilicities. This property inflation was a result of the enthusiasm for DELs with ever-larger numerical size. To obtain libraries with billions of compounds, or more, four-cycle libraries were constructed. Because a typical synthetic cycle will add at least 150 Da of molecular weight, four-cycle libraries have average molecular weights of at least 600 Da, without taking into account cores or other constant moieties in the library. In the early days of DEL, the technology was aimed at intractable targets that had failed in other hit generation approaches. In that context, inflated properties were forgivable because no other platform could provide actionable ligands. As DEL has matured, however, it has been increasingly applied to broad portfolios of projects across therapy areas and target classes. In response, DELs need to be designed with a greater focus on physical properties and developability. At X-Chem, our library strategy is focused on atom economy, so that we can deliver lead-like or drug-like matter for most targets. Still, we often observe that many high-value targets only yield to compounds with high molecular weight, lipophilicity, and/or peptidic character. Therefore, we also maintain a rich set of peptidic, macrocyclic, and covalent libraries, so that we can be confident that we will find hits for even the most difficult and previously intractable targets.
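The trade-off between numerical size and molecular weight is simple arithmetic, sketched below. The 150 Da per cycle figure is the lower bound quoted above; the choice of 200 building blocks per cycle is a hypothetical round number used only to show the scaling.

```python
def library_size(blocks_per_cycle, cycles):
    """Number of compounds when every cycle uses the same number of building blocks."""
    return blocks_per_cycle ** cycles

def min_average_mw(cycles, da_per_cycle=150, constant_da=0):
    """Lower bound on average molecular weight (Da).

    `constant_da` stands in for cores or other constant moieties,
    which the 600 Da estimate in the text deliberately leaves out.
    """
    return cycles * da_per_cycle + constant_da

# Hypothetical: 200 building blocks in every cycle.
for cycles in (2, 3, 4):
    print(cycles, library_size(200, cycles), min_average_mw(cycles))
# 2 40000 300
# 3 8000000 450
# 4 1600000000 600
```

Under these assumptions, each added cycle multiplies the library size by the number of building blocks but also adds at least 150 Da, so the billion-compound mark is reached only once the average member is already at or beyond 600 Da.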

X-Chem libraries therefore span a number of chemical classes and property profiles: small and lead-like, drug-like, and “beyond rule-of-5.” While all libraries are put into a given selection experiment, not all libraries receive equal attention during analysis. Macrocyclic libraries, for instance, would not be prioritized for a typical kinase inhibitor project, unless the project demanded an allosteric inhibitor or some other nonstandard modality. Likewise, it may not be fruitful to focus on lead-like matter for a challenging target like a shallow pocket protein-protein interaction or a highly disordered protein.


DEL technology produces copious quantities of data. A typical selection experiment at X-Chem can easily generate over one billion reads of encoding DNA. These sequences must be translated into the corresponding chemical structural information, clustered on chemical similarity, and profiled across the various selection conditions. While this process provides great depth of understanding of the chemical space selected by the target, it also places great demand on informatics systems. Researchers interested in fully exploiting the power of DEL data need access to a robust and scalable suite of informatics tools.
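The translation step can be pictured with a toy decoder. Everything here is a simplifying assumption made for illustration: the three-base tags, the two-cycle codebook, and the reads are invented, and real encoding schemes carry additional elements (such as primer regions) and require error handling that this sketch omits.

```python
from collections import Counter

# Hypothetical codebook: one fixed-length tag per building block, per cycle.
CODEBOOK = [
    {"AAC": "A1", "AGT": "A2"},  # cycle 1
    {"CCA": "B1", "CTG": "B2"},  # cycle 2
]
TAG_LEN = 3

def decode(read):
    """Translate one read into a tuple of building blocks, or None if it cannot be decoded."""
    if len(read) != TAG_LEN * len(CODEBOOK):
        return None
    blocks = []
    for i, table in enumerate(CODEBOOK):
        tag = read[i * TAG_LEN:(i + 1) * TAG_LEN]
        if tag not in table:
            return None
        blocks.append(table[tag])
    return tuple(blocks)

def count_compounds(reads):
    """Count decodable reads per encoded compound."""
    counts = Counter()
    for read in reads:
        compound = decode(read)
        if compound is not None:
            counts[compound] += 1
    return counts

reads = ["AACCCA", "AACCCA", "AGTCTG", "AACNNN", "AACCCA"]
counts = count_compounds(reads)
print(counts.most_common(1))  # [(('A1', 'B1'), 3)]
```

At the scale of a billion reads, the same counting has to be streamed or distributed rather than held in a single in-memory structure, which is part of the informatics demand described above.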

There are currently no commercial solutions for the analysis of DEL selection data. Various DEL practitioners have implemented bespoke informatics platforms with varying degrees of scalability. At X-Chem, we have created tools that can efficiently mine the 40+ terabytes of data generated by our platform to date. Our tools allow rapid assessment of selectivity and promiscuity, convenient profiling across selection conditions, clustering on chemical similarity, and formatting for input into predictive model-building.


While DEL technology is conceptually straightforward, the successful implementation of this platform requires innovations at all stages. Numerous pitfalls exist that can confound DEL-based lead generation, including suboptimal protein reagents, uninformative selection campaigns, libraries with poor physical properties, and time-consuming analysis. We recommend that researchers looking to exploit this powerful technology partner with an experienced and innovative service provider with a large library of attractive compounds, a fully dedicated suite of informatics tools and, most importantly, a proven track record of success. Even organizations that have an existing DEL platform can benefit from a partner who is driving new innovations in library design, selection science, and informatics.


  1. Goodnow, R. A. Jr.; Dumelin, C. E.; Keefe, A. D. Nat. Rev. Drug Discovery (2016); DOI: 10.1038/nrd.2016.213.
  2. Ohlmeyer, Michael H. J.; Swanson, Robert N.; Dillard, Lawrence; Reader, John C.; Asouline, Gigi; Kobayashi, Ryuji; Wigler, Michael; Still, W. Clark Proc. Nat. Acad. Sci. U.S.A. 90, 10922-6 (1993).
  3. Brenner, Sydney; Lerner, Richard A. Proc. Nat. Acad. Sci. U.S.A. 89, 5381-3 (1992).
  4. Zhang, J.-H.; Chung, T. D. Y.; Oldenburg, K. R. J. Biomolec. Screening 4, 67-73 (1999).

Dr. Matthew A. Clark is a world-recognized innovator and leader in the DNA-encoded library (DEL) field. He was part of X-Chem’s founding team and served as VP of Chemistry and SVP of research prior to his appointment to the CEO position. Under his scientific leadership, the company developed from a niche chemical discovery platform to a world-leading drug discovery engine serving the biopharma industry. Before joining X-Chem, he was Director of Chemistry at GlaxoSmithKline, where he led the group responsible for design and synthesis of early iteration DELs. He began his professional career at Praecis Pharmaceuticals, where he played a key role in the early development and implementation of technologies that would become the basis for DEL. Dr. Clark is a thought leader in the DEL space, with numerous patents and key DEL publications to his name. He earned his BS in Biochemistry from the University of California, San Diego, his PhD in Chemistry from Cornell University, and conducted post-doctoral studies at the Massachusetts Institute of Technology.