Generating a SARS-CoV-2 Mpro chemical space with REINVENT

Installation

  • Create the conda environment from reinvent.yml:
In [ ]:
# conda env create -f reinvent.yml
  • Activate the environment:
In [ ]:
# conda activate reinvent

Run Jupyter Notebook

  • Start Jupyter: jupyter notebook
In [1]:
# load dependencies
import os
import re
import json
import tempfile

# --------- change these path variables as required
reinvent_dir = os.path.expanduser("/mnt/NVMe/Reinvent2-master")
reinvent_env = os.path.expanduser("/home/hazemaj/anaconda3/envs/reinvent_shared.v2.1")
output_dir = os.path.expanduser("/mnt/NVMe/Reinvent2-master/production/SARS_CoV-2_Mpro_5_min_fame")
output_dir_TL = os.path.join(output_dir, "TL")
output_dir_RL = os.path.join(output_dir, "RL")
output_dir_sampling = os.path.join(output_dir, "sampling")

# --------- do not change
# get the notebook's root path
try: ipynb_path
except NameError: ipynb_path = os.getcwd()

# if required, generate the folders to store the results
# (os.makedirs also creates missing parent folders and ignores existing ones)
for folder in (output_dir, output_dir_TL, output_dir_RL, output_dir_sampling):
    os.makedirs(folder, exist_ok=True)
In [2]:
# For Transfer Learning: run the next cells
# For Reinforcement Learning: go to "Reinforcement Learning" and run the RL cells
# For Sampling: go to "Sampling" and run the cells

Transfer Learning

In [3]:
# initialize the dictionary
configuration = {
    "version": 2,                          # we are going to use REINVENT's newest release
    "run_type": "transfer_learning"        # other run types: "scoring", "validation",
                                           #                  "sampling",
                                           #                  "reinforcement_learning" and
                                           #                  "create_model"
}
In [4]:
# add block to specify whether to run locally or not and
# where to store the results and logging
configuration["logging"] = {
    "sender": "http://127.0.0.1",          # only relevant if "recipient" is set to "remote"
    "recipient": "local",                  # either to local logging or use a remote REST-interface
    "logging_path": os.path.join(output_dir_TL, "progress.log"), # where the run's output is stored
    "job_name": "Transfer Learning", # set an arbitrary job name for identification
    "job_id": "TL1"                       # only relevant if "recipient" is set to "remote"
}
  • Define parameters for Transfer Learning
  • Define the input prior (original prior trained with ChEMBL)
  • Define the output focused agent path
  • Define the Mpro chemical space (a selection of active molecules from ChEMBL and CovidMoonshot) on which Transfer Learning should be performed
  • Define the number of epochs
In [5]:
adaptive_lr_config = {
      "mode": "constant", # other modes: "exponential", "adaptive"
      "gamma": 0.8,
      "step": 1,
      "start": 5E-4,
      "min": 1E-5,
      "threshold": 1E-4,
      "average_steps": 4,
      "patience": 8,
      "restart_value": 1E-5,
      "sample_size": 10000,
      "restart_times": 0
    }
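The configuration above keeps the learning rate constant. As a rough illustration of what switching "mode" to "exponential" could mean for the "start", "gamma" and "step" values, the sketch below assumes a step decay in the style of PyTorch's StepLR (the rate is multiplied by gamma once every step epochs). Whether REINVENT applies exactly this rule, and how it treats "min", is an assumption here and not something this notebook shows; the function name is hypothetical.

```python
def step_decay_schedule(start, gamma, step, num_epochs):
    """Return the learning rate used in each epoch under step decay.

    Assumption: the rate is multiplied by `gamma` once every `step` epochs,
    which is how PyTorch's StepLR scheduler behaves.
    """
    return [start * gamma ** (epoch // step) for epoch in range(num_epochs)]


# values taken from adaptive_lr_config, with the mode imagined as "exponential"
rates = step_decay_schedule(start=5e-4, gamma=0.8, step=1, num_epochs=20)
minimum = 1e-5

for epoch, rate in enumerate(rates, start=1):
    # only report when the rate drops under "min"; no claim is made about
    # what REINVENT does at that point
    flag = "" if rate >= minimum else "  (below 'min')"
    print(f"epoch {epoch:2d}: lr = {rate:.2e}{flag}")
```

Printing the schedule before a run makes it easy to see whether the chosen gamma would push the rate under "min" within the planned number of epochs.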
In [6]:
output_model_path = os.path.join(output_dir_TL, "focused.agent")
# The final focused agent will be named "focused.agent"
# The intermediate steps will be named "focused.agent.1", "focused.agent.2", "focused.agent.3", etc.

# add the "parameters" block
configuration["parameters"] = {
    "input_model_path": os.path.join(reinvent_dir,        # path to prior or trained agent
                               "data",
                               "augmented.prior"),
    "output_model_path": output_model_path,                # location to store the focused agent
    "input_smiles_path": os.path.join(output_dir,          # path to the input SMILES file;
                               "SARS-CoV-2_ChemSpace.smi"),# a dataset of non-covalent Mpro inhibitors
    "save_every_n_epochs": 1,      # how often to save the focused agent; here it is stored after each epoch
    "batch_size": 128,             # batch size for the input data
    "num_epochs": 20,              # number of epochs to focus the agent for
    "standardize": True,           # the input may contain SMILES strings that are invalid according to the agent;
                                   # this attempts to clean up the input dataset
    "randomize": True,             # this triggers data augmentation which is quite important for small datasets
    "adaptive_lr_config": adaptive_lr_config        # setting the learning rate behavior
}
In [7]:
# write the configuration file to disk
configuration_JSON_path = os.path.join(output_dir_TL, "transfer_learning_config.json")
with open(configuration_JSON_path, 'w') as f:
    json.dump(configuration, f, indent=4, sort_keys=True)
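Because the configuration is built as a Python dictionary and serialised with json.dump, Python literals such as True and None appear as true and null in the written file. The sketch below shows this on a small stand-in dictionary (the keys are borrowed from this notebook, the dictionary itself is only an example) and checks that reading the text back gives the same content.

```python
import json

# a small stand-in for the configuration dictionary built in the notebook
example = {
    "run_type": "transfer_learning",
    "parameters": {"standardize": True, "specific_parameters": None},
}

# same options as used when writing the configuration file
text = json.dumps(example, indent=4, sort_keys=True)
print(text)  # the JSON text contains "true" and "null" rather than True / None

# reading the text back restores the original Python values
restored = json.loads(text)
print(restored == example)
```

The same round trip can be applied to the real file with json.load to confirm that it was written completely before starting a long run.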

Run REINVENT

Now it is time to execute REINVENT locally.

Execute REINVENT:

python <reinvent_dir>/input.py <configuration>.json

In [8]:
%%capture captured_err_stream --no-stderr

# execute REINVENT from the command-line
!python {reinvent_dir}/input.py {configuration_JSON_path}

Analyse the results

In order to analyze the run in a more intuitive way, we can use tensorboard:

  • go to the root folder of the output:
    $ cd "your_path"/SARS_CoV-2_Mpro_5_min_fame/TL

  • make sure you have activated the proper environment:
    $ conda activate reinvent

  • start tensorboard:
    $ tensorboard --logdir progress.log
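The transfer learning run saves intermediate agents as "focused.agent.1", "focused.agent.2" and so on, and the reinforcement learning configuration refers to one of them by name. A small helper such as the one below can list which numbered checkpoints are present in a folder. The function name list_checkpoints is hypothetical, and the demonstration uses empty placeholder files in a temporary folder rather than real model files.

```python
import os
import re
import tempfile


def list_checkpoints(folder, stem="focused.agent"):
    """Return (epoch, path) pairs for files named '<stem>.<number>', sorted by epoch."""
    pattern = re.compile(re.escape(stem) + r"\.(\d+)$")
    found = []
    for name in os.listdir(folder):
        match = pattern.match(name)
        if match:
            found.append((int(match.group(1)), os.path.join(folder, name)))
    return sorted(found)


# demonstration on a temporary folder with empty placeholder files
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["focused.agent", "focused.agent.2", "focused.agent.10", "focused.agent.1"]:
        open(os.path.join(demo_dir, name), "w").close()
    checkpoints = list_checkpoints(demo_dir)
    epochs = [epoch for epoch, _ in checkpoints]
    print(epochs)  # numeric order, so 10 comes after 2
```

Calling list_checkpoints(output_dir_TL) after the run would show whether the checkpoint named in the reinforcement learning configuration actually exists.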

Reinforcement Learning

In [9]:
# initialize the dictionary
configuration = {
    "version": 2,                          # we are going to use REINVENT's newest release
    "run_type": "reinforcement_learning"   # other run types: "sampling", "validation",
                                           #                  "transfer_learning",
                                           #                  "scoring" and "create_model"
}
In [10]:
# add block to specify whether to run locally or not and
# where to store the results and logging
configuration["logging"] = {
    "sender": "http://127.0.0.1",          # only relevant if "recipient" is set to "remote"
    "recipient": "local",                  # either to local logging or use a remote REST-interface
    "logging_frequency": 10,               # log every x-th steps
    "logging_path": os.path.join(output_dir_RL, "progress.log"), # load this folder in tensorboard
    "resultdir": os.path.join(output_dir_RL, "results"),         # will hold the compounds (SMILES) and summaries
    "job_name": "Reinforcement Learning",                # set an arbitrary job name for identification
    "job_id": "RL1"                       # only relevant if "recipient" is set to "remote"
}
In [11]:
# add the "parameters" block
configuration["parameters"] = {}

# add a "diversity_filter"
configuration["parameters"]["diversity_filter"] =  {
    "name": "IdenticalMurckoScaffold",     # other options are: "IdenticalTopologicalScaffold", 
                                           #                    "NoFilter" and "ScaffoldSimilarity"
                                           # -> use "NoFilter" to disable this feature
    "nbmax": 25,                           # the bin size; penalization will start once this is exceeded
    "minscore": 0.4,                       # the minimum total score to be considered for binning
    "minsimilarity": 0.4                   # the minimum similarity to be placed into the same bin
}

# prepare the inception (we do not use it in this example, so "smiles" is an empty list)
configuration["parameters"]["inception"] = {
    "smiles": [],                          # fill in a list of SMILES here that can be used (or leave empty)
    "memory_size": 100,                    # sets how many molecules are to be remembered
    "sample_size": 10                      # how many are to be sampled each epoch from the memory
}

# set all "reinforcement learning"-specific run parameters
configuration["parameters"]["reinforcement_learning"] = {
    "prior": os.path.join(reinvent_dir, "data", "augmented.prior"), # path to the pre-trained prior
    "agent": os.path.join(output_dir_TL, "focused.agent.19"), # path to the focused agent (intermediate checkpoint 19 of the transfer learning run)
    "n_steps": 500,                        # the number of epochs (steps) to be performed; often 1000
    "sigma": 128,                          # used to calculate the "augmented likelihood", see publication
    "learning_rate": 0.0001,               # sets how strongly the agent is influenced by each epoch
    "batch_size": 128,                     # specifies how many molecules are generated per epoch
    "reset": 0,                            # if not 0, reset the agent once the threshold is reached
                                           # to get more diverse solutions
    "reset_score_cutoff": 0.5,             # if resetting is enabled, this is the threshold
    "margin_threshold": 50                 # specify the (positive) margin between agent and prior
}
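The comment on "sigma" points to the augmented likelihood described in the REINVENT publication. As a minimal sketch, assuming the formulation from that publication (augmented log-likelihood = prior log-likelihood + sigma × score, with the loss being the squared difference to the agent's log-likelihood), the arithmetic looks as follows. The function names and all numbers are illustrative, and REINVENT's actual implementation may add further terms, for example for the margin threshold.

```python
def augmented_log_likelihood(prior_log_likelihood, score, sigma):
    """Prior log-likelihood shifted upwards in proportion to the score."""
    return prior_log_likelihood + sigma * score


def squared_difference_loss(agent_log_likelihood, prior_log_likelihood, score, sigma):
    """Squared gap between the augmented target and the agent's log-likelihood."""
    target = augmented_log_likelihood(prior_log_likelihood, score, sigma)
    return (target - agent_log_likelihood) ** 2


# illustrative numbers only; they are not taken from an actual run
prior_ll, agent_ll, sigma = -30.0, -28.0, 128
for score in (0.0, 0.5, 1.0):
    target = augmented_log_likelihood(prior_ll, score, sigma)
    loss = squared_difference_loss(agent_ll, prior_ll, score, sigma)
    print(f"score={score:.1f}  target={target:.1f}  loss={loss:.1f}")
```

A larger sigma moves the target further from the prior for the same score, so high-scoring molecules pull the agent more strongly.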
In [12]:
# prepare the scoring function definition and add at the end
scoring_function = {
    "name": "custom_product",              # this is our default one (alternative: "custom_sum")
    "parallel": True,                     # sets whether components are to be executed
                                           # in parallel; note, that python uses "False" / "True"
                                           # but the JSON "false" / "true"

    # the "parameters" list holds the individual components
    "parameters": [

    # add component: calculate the QED drug-likeness score (using RDKit)
    {
        "component_type": "qed_score",
        "name": "QED Score",                   # arbitrary name for the component
        "weight": 1,                           # the weight of the component (default: 1)
        "model_path": None,                    # not required; note, this is "null" in JSON
        "smiles": [],                          # not required
        "specific_parameters": None            # not required; note, this is "null" in JSON
    },
        
    # add component: enforce the match to a given substructure
    {
        "component_type": "matching_substructure", 
        "name": "Matching substructure",       # arbitrary name for the component
        "weight": 1,                           # the weight of the component (default: 1)
        "model_path": None,                    # not required; note, this is "null" in JSON
        "smiles": ["Brc1ccc2c(c1)CCCO2","CC(=O)N(Cc1ccsc1)c1ccc(N(C)C)cc1","Cc1ccc2nc(N)sc2c1","Cc1ccc2ncsc2c1","CC(C(N)=O)c1cccc(Cl)c1","Cc1ccccc1C1CCC1","Cc1cncc2ccccc12","Cc1nc2n(n1)CC(C(N)=O)CC2","Cc1nc2n(n1)CCCC2","CC(C)(C(N)=O)c1ccccc1","Clc1cc(Cl)c2c(c1)CCCO2","Clc1cc2c(cc1Cl)NCCC2","Clc1cc2c(cc1Cl)OCCC2","Clc1ccc(Cl)cc1","Clc1ccc2c(c1)C=CCC2","Clc1ccc2c(c1)CCCC2","Clc1ccc2c(c1)CCCN2","Clc1ccc2c(c1)CCCO2","Clc1ccc2c(c1)CCCS2","CC(C)(C)NC=O","Clc1ccc2c(c1)CCNC2","Clc1ccc2c(c1)CCO2","Clc1ccc2c(c1Cl)OCCC2","Clc1cccc(CN2CCNCC2)c1","Clc1cccc(Cc2ccn[nH]2)c1","Clc1cccc(Cl)c1","Clc1cccc(N2CCNCC2)c1","Clc1cccc(NCCC2CCCCC2)c1","Clc1cccc(Oc2cccnc2)c1","Clc1cccc2c1OCC2","CC(C)(C)c1ccc(NC(=O)c2cocn2)cc1","Clc1ccccc1Cl","Cn1nc2n(c(=O)c1=O)CCC2","FC(F)c1ccc(Cl)c(Cl)c1","Fc1cc2c(cc1Cl)CCCO2","Fc1ccc(Cl)cc1","Fc1ccc(F)cc1","Fc1ccc2cnccc2c1","Fc1cccc(F)c1","Fc1ccccc1Cl","N#Cc1ccc2cncc(N)c2c1","N#Cc1ccc2cnccc2c1","CC(C)(C)c1cncs1","NC(=O)C1CCOc2ccc(Cl)cc21","NC(=O)CN1CCN(C2CC2)C1=O","NC(=O)Cc1ccc(Cl)c(Cl)c1","NC(=O)Cn1ccnc1","NC(=O)Cn1nc2n(c(=O)c1=O)CCC2","NC(=O)N(CCC1CCCCC1)c1cccc(Cl)c1","NC(=O)[[email protected]@H]1CCOc2ccc(Cl)cc21","NC(=O)[[email 
protected]]1CCOc2ccc(Cl)cc21","NC(=O)c1cn[nH]c1S(N)(=O)=O","NC1CC(F)(F)C1","NC1CCOCC1","NCC1CCCO1","CC1CCOc2ccc(Cl)cc21","NCc1ccc(Cl)c(Cl)c1","Nc1ccc(C(F)(F)F)cn1","Nc1cncc2c1CCC2","Nc1cncc2c1CNCC2","Nc1cncc2ccc(F)cc12","Nc1cncc2ccccc12","Nc1cncc2ccncc12","CC1Cn2nccc2CN1","Nc1cncc2cnccc12","Nn1nnc2ccccc21","O=C(Cc1ccc(Cl)cc1)Nc1cncc2ccccc12","O=C(Cc1cccc(Cl)c1)Nc1cncc2ccccc12","O=C(Cc1cccc(F)c1)Nc1cncc2ccccc12","O=C(Cc1ccccc1)Nc1cncc2ccccc12","CC1Cn2nccc2CN1C=O","O=C(Cc1cccnc1)NC1CCOC1","O=C(Cc1cccnc1)NC1COC1","O=C(Cc1cncc2ccccc12)NCCC1CCCCC1","O=C(Cn1nnc2ccccc21)NCc1ccsc1","O=C(NCCC1CCCCC1)Nc1cncc2ccccc12","O=C(Nc1ccccc1)C1CC1","O=C(Nc1cccnc1)C1CCOc2ccc(Cl)cc21","O=C(c1ccco1)N1CCNCC1","O=C(c1cncc2ccccc12)N1CCNCC1","O=C1CC(Cc2cccc(Cl)c2)N1","O=C1CC(Oc2cc(O)cc(Cl)c2)N1","O=C1CCCCN1c1cncc2ccccc12","O=C1CCc2cc(Br)ccc2N1","O=C1CCc2cc(Cl)cc(Cl)c2N1","O=C1NCCN1C1CC1","O=CN(CCC1CCCCC1)c1cccc(Cl)c1","O=CN1CCN(Cc2cccc(Cl)c2)CC1","O=CN1CCN(c2cccc(Cl)c2)CC1","O=CNC1CC(F)(F)C1","O=CNC1CCOCC1","O=CNC1COC1","O=CNc1cncc2ccc(F)cc12","O=CNc1cncc2ccccc12","O=CNc1cncc2ccncc12","O=CNc1cncc2cnccc12","O=COc1cncc(Cl)c1","O=c1cccc(O)[nH]1","OCC1CCOc2ccc(Cl)cc21","OCc1ccc(Cl)c(Cl)c1","CCNc1ccc(C#N)cn1","Oc1cncc(Cl)c1","c1cc2c(cn1)CCC2","c1cc2c(cn1)CCNC2","c1cc2ccncc2cn1","c1cc2cnccc2cn1","c1ccc(C2CCCC2)cc1","CCOc1c(C)cccc1Cl","c1ccc2cnccc2c1","c1cnc2ccncc2c1","CCOc1ccccc1Cl","CCS(=O)(=O)N1CCNCC1","CCc1ccc(Cl)c(Cl)c1","CC(=O)N1CCNCC1","CN(C)Cc1ccc(Cl)c(Cl)c1","CN(C)c1ccc(N(C=O)Cc2ccsc2)cc1","CN(C)c1ccc(NC(=O)Cn2nnc3ccccc32)cc1","CN(C)c1ccc(NCc2ccsc2)cc1","CN(C)c1cccnc1","CN1CCCc2cc(Cl)ccc21","CC(=O)N1Cc2ccnn2CC1C","CN1CCN(C2CC2)C1=O","CNC(=O)Cn1ccnc1","CNC(=O)c1cn[nH]c1S(N)(=O)=O","CNCc1ccc(Cl)c(Cl)c1","CNCc1cccc2ccccc12","CNc1ccc(NC(=O)Cn2nnc3ccccc32)cc1","COC(=O)C1(N)CCCC1","COC(=O)C1(NC=O)CCCC1","COC(=O)C1CC(N)C1","COC(=O)C1CC(NC=O)C1","COC(=O)C1CCC1","COC(=O)C1CCCC1","COC(=O)[[email protected]@H]1CC[[email protected]](N)C1","COC(=O)[[email protected]@H]1CC[[email 
protected]](NC=O)C1","CC(=O)NCc1ccc(Cl)c(Cl)c1","COC1CCOc2ccc(Cl)cc21","COCc1ccc(Cl)c(Cl)c1","COc1cc(Br)ccc1N","COc1cc(Cl)cc(OC2CC(=O)N2)c1","COc1cc(Cl)nc(N)n1","COc1cc(Cl)ncn1","COc1ccc(Br)cn1","COc1ccc(Cl)cc1","COc1ccc(Cl)cc1C","CC(=O)Nc1cncc2ccccc12","COc1ccc2cncc(N)c2c1","COc1ccc2cncc(NC=O)c2c1","COc1ccc2cnccc2c1","COc1cccc(N2CCN(C=O)CC2)c1","COc1cccc(N2CCNCC2)c1","COc1ccccn1","COc1ncc(Br)cc1N","CC(=O)Nc1cncc2ccncc12","CS(=O)(=O)c1ccc2cnccc2c1","Cc1cc(Cl)cc(C#N)c1","Cc1cc(Cl)cc(C(C)C(N)=O)c1","Cc1cc(Cl)cc(Cl)c1","Cc1cc(Cl)cc2c1OCC2","CC(=O)Nc1cncc2cnccc12","Cc1cc(Cl)ccc1Cl","Cc1cc(Cl)ccc1F","Cc1cc(Cl)ccn1","Cc1cc(F)cc(F)c1","Cc1cc(F)ccc1F","Cc1ccc(C)c(Cl)c1","Cc1ccc(Cl)c(Cl)c1","Cc1ccc(Cl)c(F)c1","Cc1ccc(F)c(Cl)c1"],              # a match with this substructure is required
        "specific_parameters": None            # not required; note, this is "null" in JSON
    },
    
    # add component: reward distance from the listed structures (known SARS-CoV-2 Mpro inhibitors)
    {
        "component_type": "jaccard_distance", 
        "name": "Jaccard distance",         # arbitrary name for the component
        "weight": 1,                           # the weight of the component (default: 1)
        "model_path": None,                    # not required; note, this is "null" in JSON
        "smiles": ['Nc1ccc(O)c2ncccc12', 'Cc1c(-c2cnccn2)ssc1=S', 'O=C(Oc1cncc(Cl)c1)c1ccco1', 'O=C(Oc1cncc(Br)c1)c1ccco1', 'O=C(Oc1cncc(Cl)c1)c1cscn1', 'Cn1sc(=O)n(Cc2ccccc2)c1=O', 'CN1CC[[email protected]@]2(C)c3cc(O)ccc3N(C)[[email protected]@H]12', 'Nc1ccc2cc3ccc(N)cc3nc2c1', 'N#Cc1cnn(C(=O)c2cccs2)c1C1CC1', 'Nc1ccc(S(=O)(=O)[N-]c2ncccn2)cc1', 'COc1cc(C(=S)N2CCOCC2)ccc1O', 'c1ccc2sc(SNC3CCCCC3)nc2c1', 'O=c1oc(Nc2ccccc2)nc2ccccc12', 'Cc1nnn(NC(=O)Cc2cccc(Cl)c2)c1C', 'Nc1ncc(S(=O)(=O)c2ccc(Cl)cc2)c(N)n1', 'COc1cccc(C(=O)Oc2cncc(Cl)c2)c1', 'CCOC(=O)Cc1ccc(-c2ccccc2)cc1', 'O=C(Oc1cncc(Cl)c1)c1cccc2[nH]ccc12', 'COc1cc(Cl)cc(OC(=O)Nc2cccnc2)c1', 'Cc1c(N)cncc1NC(=O)Cc1cccc(Cl)c1', 'O=C(Oc1cncc(Cl)c1)c1cc2ccccc2o1', 'O=C(Oc1cncc(Cl)c1)c1cc2ccccc2[nH]1', 'O=C(Oc1cncc(Cl)c1)c1cc2ccccc2s1', 'C/N=c1/nc(-c2ccccc2)n(-c2ccccc2)s1', 'CC(CN1CC(=O)NC(=O)C1)N1CC(=O)NC(=O)C1', 'Cc1ccc(C)c(-n2sc3cc(F)ccc3c2=O)c1', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2c1CCC2', 'O=C(O)[[email protected]@H](c1ccccc1)N1Cc2ccccc2C1=O', 'N#Cc1ccc(CN2CCC(C3CCNC3)CC2)cc1', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2nc[nH]c12', 'O=C(Cc1cccc(Cl)c1)Nn1cnc2ccccc21', 'O=C(Cc1cccc(Cl)c1)Nc1cnccc1C1CC1', 'O=C(Cc1ccccc1)Nc1cncc2ccccc12', 'O=C(Cc1cccc(Cl)c1)Nn1nnc2ccccc21', 'O=C(Cc1cccc(Cl)c1)Nc1cnc2ccccn12', 'CCCCc1nc(-c2nccn2Cc2ccoc2)c[nH]1', 'CC(=O)N1CCN(CC(=O)Nc2cnccc2C)CC1', 'O=C(CC1Sc2ncnn2C1=O)Nc1ccc(Br)cc1', 'O=C(O)C(c1ccccc1)N1Cc2ccccc2C1=O', 'O=C(Cc1cncc2ccccc12)Nc1cccc(Cl)c1', 'O=C(Cc1cc(Cl)ccn1)Nc1cncc2ccccc12', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2c1CCCN2', 'Cc1ccncc1NC(=O)[[email protected]]1CCOc2ccc(Cl)cc21', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2ccncc12', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2cnccc12', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2cccnc12', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2c1OCCC2', 'O=C(Cc1cccc(Cl)c1)Nc1cncc2ccccc12', 'O=C(Cc1ccc(F)cc1)Nc1cncc2ccccc12', 'O=C(Cc1cccc(F)c1)Nc1cncc2ccccc12', 'Cc1cccc(CC(=O)Nc2cncc3ccccc23)c1', 'O=C(Cc1cncc(Cl)c1)Nc1cncc2ccccc12', 'O=C(Cc1ccc(Cl)cn1)Nc1cncc2ccccc12', 'Cc1ccncc1NC(=O)C1CCOc2ccc(Cl)cc21', 
'Cc1cc(C)cc(N2C(=O)CC(Sc3nc[nH]n3)C2=O)c1', 'O=C(Cc1ccc(Cl)c(Cl)c1)Nc1cncc2ccccc12', 'O=C(COc1cccc(Cl)c1)Nc1cncc2ccccc12', 'Cc1ccncc1NC(=O)Cc1cc(Cl)cc(S(N)(=O)=O)c1', 'O=C(Cc1ccc(Cl)c(Cl)c1)Nc1cncc2cnccc12', 'O=C(Cc1ccc(Cl)c(Cl)c1)Nc1cncc2ccncc12', 'O=C(Cc1cc(Cl)cc(Cl)c1)Nc1cncc2ccccc12', 'O=C(Cc1ccc(F)c(Cl)c1)Nc1cncc2ccccc12', 'Cc1ccc(CC(=O)Nc2cncc3ccccc23)cc1Cl', 'O=C(Cc1ccc(Cl)c(F)c1)Nc1cncc2ccccc12', 'N#Cc1cccc(CC(=O)Nc2cncc3ccccc23)c1', 'O=C(Cc1cc(F)cc(Cl)c1)Nc1cncc2ccccc12', 'O=C(Cc1cc(F)cc(F)c1)Nc1cncc2ccccc12', 'O=C(Cc1cccc(F)c1F)Nc1cncc2ccccc12', 'O=C(Cc1cc(F)ccc1F)Nc1cncc2ccccc12', 'O=C(Cc1cc(Cl)ccc1F)Nc1cncc2ccccc12', 'O=C(Cc1cc(Cl)ccc1Cl)Nc1cncc2ccccc12', 'C#Cc1cccc(CC(=O)Nc2cncc3ccccc23)c1', 'O=C(Nc1cnccc1CO)C1CCOc2ccc(Cl)cc21', 'O=C(Nc1nncn1C1CC1)C1CCOc2ccc(Cl)cc21', 'O=C(Cc1cccc(Cl)c1)Nn1cnc2ccccc2c1=O', 'O=C(Oc1cncc(Cl)c1)c1ccc(-c2ccc(Cl)cc2)o1', 'Cc1ccc(-n2sc(=O)n(Cc3ccc(F)cc3)c2=O)cc1', 'CC(C)c1ccc2oc3nc(N)c(C(=O)O)cc3c(=O)c2c1', 'O=C(Nc1cncc2ccccc12)C1COc2c(Cl)cccc21', 'COC(=O)c1c(C)nn(C(=O)c2cccs2)c1-c1snnc1C', 'O=C(Nc1cncc2ccccc12)C1COc2ccc(Cl)cc21', 'O=C1C(c2cccc(Cl)c2)CCN1c1cncc2ccccc12', 'COc1ccc(Cl)cc1CC(=O)Nc1cncc2ccccc12', 'COc1ccc2cncc(NC(=O)Cc3cccc(Cl)c3)c2c1', 'O=C1N(Cc2cccc(Cl)c2)CCN1c1nccc2[nH]ncc12', 'NC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'CC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'O=C(Nc1cncc2ccccc12)C(O)c1ccc(Cl)c(Cl)c1', 'N#Cc1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'O=C(Nc1cncc2ccccc12)[[email protected]@H]1COc2ccc(Cl)cc21', 'O=[N+]([O-])c1cc(C(F)(F)F)ccc1S(=O)(=O)c1ccc(Cl)cc1', 'OC1OC(c2cccs2)=N/C1=C/c1ccc(N2CCOCC2)s1', 'O=C(Cc1cncc2ccccc12)N1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1=CCCc2ccc(Cl)cc21', 'COc1ccccc1OCCNC(=O)c1cncc2ccccc12', 'O=C(c1cncc2ccccc12)N1CCN(c2ccccc2)CC1', 'O=C(Nc1cncc2ccccc12)C1CCCc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCNc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCOc2ccc(Br)cc21', 'O=C(Cc1cc(Cl)cc2c1OCC2)Nc1cncc2ccccc12', 'O=C(Nc1cncc2ccccc12)[[email protected]@H]1CCOc2ccc(Cl)cc21', 
'O=C(Nc1cncc2ccccc12)C1CCOc2ccc(Cl)cc21', 'O=C1C(c2cccc(Cl)c2)CCCN1c1cncc2ccccc12', 'Cc1ccncc1NC(=O)Cc1cc(Cl)cc(OC2CC(=O)N2)c1', 'O=C(Nc1cncc2ccccc12)C1CCSc2ccc(Cl)cc21', 'COC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'CCOc1c(Cl)cccc1CC(=O)Nc1cncc2ccccc12', 'CNC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'O=C(Nc1cncc2ccccc12)C(F)(F)c1ccc(Cl)c(Cl)c1', 'Cn1cnc2cncc(NC(=O)C3CCOc4ccc(Cl)cc43)c21', 'O=C(Cc1cccc(C(F)(F)F)c1)Nc1cncc2ccccc12', 'O=C(Nc1cncc2ccccc12)[[email protected]@H]1CCNc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccncc12)[[email protected]]1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2c1CNCC2)C1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)N1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CNCc2ccc(Cl)cc21', 'O=C(Nc1cncc2cnccc12)[[email protected]@H]1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccncc12)[[email protected]@H]1CCOc2ccc(Cl)cc21', 'OCCn1c(CSc2nc3ccccc3o2)nc2ccc(Cl)cc21', 'Cn1nc(-c2ccc(C(F)(F)F)cc2)nc2c(=O)n(C)c(=O)nc1-2', 'CSc1[nH]nc(NC(=O)c2cccs2)c1S(=O)(=O)c1ccccc1', 'O=c1sn(-c2cccc3ccccc23)c(=O)n1Cc1ccccc1', 'O=C(Nc1cncc2cnccc12)C1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccncc12)C1CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccc(F)cc12)C1CCOc2ccc(Cl)cc21', 'O=C(c1cncc2ccccc12)N1CCN(c2cccc(Cl)c2)CC1', 'O=C1CC(C(=O)Nc2cncc3ccccc23)c2cc(Br)ccc2N1', 'C[[email protected]@H]1C[[email protected]](C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc2O1', 'C[[email protected]@H]1COc2ccc(Cl)cc2[[email protected]@H]1C(=O)Nc1cncc2ccccc12', 'C[[email protected]@H]1C[[email protected]@H](C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc2O1', 'C[[email protected]]1COc2ccc(Cl)cc2[[email protected]@H]1C(=O)Nc1cncc2ccccc12', 'C[[email protected]@]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'Cc1ccncc1NC(=O)Cc1cc(C#N)cc(OC2CC(=O)N2)c1', 'COc1ccccc1OCCNC(=O)c1cc(=O)[nH]c2ccccc12', 'O=C(Nc1cccnc1)N(CCC1CCCCC1)c1cccc(Cl)c1', 'O=C(Nc1cncc2ccccc12)[[email protected]@H]1CCOc2cc(F)c(Cl)cc21', 'N[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)[[email protected]@H]1CCNc2cc(Cl)c(Cl)cc21', 'O=C(Nc1cncc2ccc(F)cc12)[[email protected]@H]1CCOc2ccc(Cl)cc21', 
'CN1CC[[email protected]](C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'CN(C)C(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'O=C(Nc1cncc2ccccc12)C1CCOc2cc(F)c(Cl)cc21', 'O=C1C(c2ccc(Cl)c(Cl)c2)CCCN1c1cncc2ccccc12', 'O=C(Nc1cncc2ccccc12)C1CCOc2c(Cl)cc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCNc2cc(Cl)c(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCOc2cc(Cl)c(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCOc2c1ccc(Cl)c2Cl', 'O=C(Nc1cncc2ccccc12)[[email protected]]1(O)CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCOc2c(O)cc(Cl)cc21', 'O=C1CC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc2N1', 'CN1CCC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'Cc1c(Cl)ccc2c1[[email protected]](C(=O)Nc1cncc3ccccc13)CCO2', 'CCn1c(-c2nonc2N)nc2cncc(CNC3CCNCC3)c21', 'CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'O=C(c1cncc2ccccc12)N1CCN(c2cccc(Cl)c2)C(=O)C1', 'O=C(c1cc(=O)[nH]c2ccccc12)N1CCN(c2cccc(Cl)c2)CC1', 'c1ccc(-c2cc(Sc3nnc(-c4cccs4)o3)n3ncnc3n2)cc1', 'COc1ccccc1OCCNC(=O)c1cc(=O)[nH]c2cccc(C)c12', 'O=C(Nc1cncc2ccccc12)C1(CO)CCOc2ccc(Cl)cc21', 'O=C1CC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)cc(Cl)c2N1', 'COc1ccc2cncc(NC(=O)C3CCOc4ccc(Cl)cc43)c2c1', 'COC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'CC(=O)NC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'CC(C(N)=O)c1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'NC[[email protected]@]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'N#Cc1ccc2cncc(NC(=O)C3CCOc4ccc(Cl)cc43)c2c1', 'CO[[email protected]@]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'O=C(Cc1cc(Cl)cc(-c2nnc[nH]2)c1)Nc1cncc2ccccc12', 'COc1c(Cl)cccc1OCCNC(=O)c1cc(=O)[nH]c2ccccc12', 'O=C(c1cc(=O)[nH]c2ccccc12)N1CCN(Cc2ccccc2)CC1', 'C[[email protected]]1COc2c(N3CCN(C)CC3)c(F)cc3c(=O)c(C(=O)O)cn1c23', 'O=C(c1cc(O)nc2ccccc12)N1CCN(c2cccc(Cl)c2)CC1', 'Cc1ccc(N(Cc2cscn2)C(=O)Cc2cncc3ccccc23)cc1', 'O=C(Cc1cc(Cl)cc(CC2CC(=O)N2)c1)Nc1cncc2ccccc12', 'O=C(Cc1cc(Cl)cc(Cc2ccn[nH]2)c1)Nc1cncc2ccccc12', 'Cc1cc(C)c(C#N)c(Sc2ncnc3c2nnn3Cc2ccccc2)n1', 'O=C(Cc1cc(Cl)cc(OC2CC(=O)N2)c1)Nc1cncc2ccccc12', 'CO[[email protected]@]1(C(=O)Nc2cncc3ccccc23)CCOc2c(F)cc(F)cc21', 
'COC1(C(=O)Nc2cncc3ccccc23)CCOc2c(F)cc(F)cc21', 'COC1(C(=O)Nc2cncc3ccccc23)CCOc2cc(F)c(Cl)cc21', 'CC(=O)N1CC[[email protected]](C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'CO[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2cc(Cl)c(Cl)cc21', 'COC1(C(=O)Nc2cncc3ccccc23)CCOc2cc(Cl)c(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C(NCC1CC1)c1ccc(Cl)c(Cl)c1', 'COCCOC(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'NC(=O)N1CCC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'CC1CN(C(=O)c2cc(=O)[nH]c3ccccc23)CCN1c1cccc(Cl)c1', 'CC(=O)N1CCC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'O=C1CC2(CCOc3ccc(Cl)cc32)C(=O)N1c1cncc2ccccc12', 'O=C(c1cc(=O)[nH]c2ccccc12)N1CCN(Cc2cccc(Cl)c2)CC1', 'CNc1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1', 'COc1cccc(N2CCN(C(=O)c3cc(=O)[nH]c4ccccc34)CC2)c1', 'Cc1noc(NC(=O)c2ccc(-c3cc(C(F)(F)F)nn3C)s2)c1[N+](=O)[O-]', 'COc1cccc(N2CCN(C(=O)c3cc(O)nc4ccccc34)CC2)c1', 'O=C(c1cc(O)nc2ccccc12)N1CCN(Cc2cccc(Cl)c2)CC1', 'COc1cccc2[nH]c(=O)cc(C(=O)N3CCN(c4cccc(Cl)c4)CC3)c12', 'Cc1ccc(N(Cc2cccc(Cl)c2)C(=O)Cn2nnc3ccccc32)cc1', 'O=C(Cc1cncc2ccccc12)N(CCC1CCCCC1)Cc1cccs1', 'CN(C)c1ccc(N(Cc2cscn2)C(=O)Cn2nnc3ccccc32)cc1', 'O=C(Cc1cc(Cl)cc(Oc2cccnc2)c1)Nc1cncc2ccccc12', 'CNc1ccc(N(C(=O)Cn2nnc3ccccc32)c2cccc(Cl)c2)cc1', 'CN(C)c1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1', 'COC[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2cc(Cl)c(Cl)cc21', 'COCC1(C(=O)Nc2cncc3ccccc23)CCOc2cc(Cl)c(Cl)cc21', 'CC(NC(=O)CN)(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'COCCN1CCC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'CS(=O)(=O)c1ccc2cncc(NC(=O)C3CCOc4ccc(Cl)cc43)c2c1', 'Cc1ccncc1NC(=O)Cc1cc(Cl)cc(NC2(C)CCC(F)(F)CC2)c1', 'CN(C)c1ccc(N(Cc2ccsc2)C(=O)CN2N=NC3C=CC=CC32)cc1', 'CCC(=O)N(C(=O)C1CCOc2ccc(Cl)cc21)c1cncc2ccccc12', 'COc1ccc(N(Cc2cccc(Cl)c2)C(=O)Cn2nnc3ccccc32)cc1', 'O=C(c1cc(=O)[nH]c2ccccc12)N1CCN(c2cccc(C(F)(F)F)c2)CC1', 'O=C(Cc1cncc2ccccc12)N(CCC1CCCCC1)c1cccc(Cl)c1', 'CN(C)c1ccc(N(Cc2ccsc2)C(=O)Cc2cncc3ccccc23)cc1', 'CN(C)c1ccc(N(Cc2ccccc2)C(=O)Cn2nnc3ccccc32)cc1', 'Cc1ccncc1NC(=O)Cc1cc(Cl)cc(-c2ccc(C3CC3(F)F)cc2)c1', 
'O=C(Nc1cncc2ccccc12)N(CCC1CCCCC1)c1cccc(Cl)c1', 'CNC(=O)C[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'COCC(=O)NC(C)(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'CNC(=O)CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'Cc1ccncc1NC(=O)Cc1cc(Cl)cc(-c2ccc(C3(C)CCO3)cc2)c1', 'O=C(Cc1cc(Cl)cc(Oc2cccc(=O)[nH]2)c1)Nc1cncc2ccccc12', 'CN(C)S(=O)(=O)c1cccc(C(=O)Nc2c(-c3ccccc3)nc3sccn23)c1', 'COC(=O)c1c(C)n(S(C)(=O)=O)c2ccc(N(C(=O)c3ccco3)S(C)(=O)=O)cc12', 'CN(C)c1ccc(N(Cc2cccc(Cl)c2)C(=O)Cn2nnc3ccccc32)cc1', 'COC(=O)c1ccc(-c2cc(Cl)cc(CC(=O)Nc3cnccc3C)c2)c(OC)c1', 'COc1cccc(Oc2cc(Cl)cc(NC(=O)Cc3cncc4ccccc34)c2)n1', 'O=C1CC(Oc2cc(Cl)cc(OCCNC(=O)c3cc(=O)[nH]c4ccccc34)c2)N1', 'CCC(=O)Nc1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1', 'O=C(Cc1cc(Cl)cc(NCc2ccc(Br)cc2)c1)Nc1cncc2ccccc12', 'CN(C)C(=O)C[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'CN(C)C(=O)CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCN(Cc2ncc[nH]2)c2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCN(Cc2ccn[nH]2)c2ccc(Cl)cc21', 'CN(C)C(=O)CN1CCC(C(=O)Nc2cncc3ccccc23)c2cc(Cl)ccc21', 'O=C(Nc1cncc2ccccc12)C1CCN(Cc2ncn[nH]2)c2ccc(Cl)cc21', 'O=C(Nc1cncc2ccccc12)C1CCN(Cc2cnc[nH]2)c2ccc(Cl)cc21', 'CNC(=O)Nc1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1', 'CCOC(=O)c1c(C)n(S(C)(=O)=O)c2ccc(N(C(=O)c3ccco3)S(C)(=O)=O)cc12', 'CCOC(=O)c1cc(-c2ccccc2)sc1NC(=O)Cc1csc(-c2ccccc2)n1', 'CCC(=O)Nc1ccc(N(Cc2ccccc2)C(=O)Nn2nnc3ccccc32)cc1', 'O=C(Cn1nnc2ccccc21)N(Cc1ccsc1)c1ccc(N2CCCCC2)cc1', 'COc1cc(Cl)nc(Nc2cc(Cl)cc(CC(=O)Nc3cncc4ccccc34)c2)n1', 'COc1cc(Br)ccc1Nc1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'COc1ncc(Br)cc1Nc1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'CN(C)c1ccc(N(Cc2cccc(Cl)c2)C(=O)Cc2cncc3ccccc23)nc1', 'O=C(Nc1cncc2ccccc12)C1CCN(Cc2ccccc2)c2ccc(Cl)cc21', 'O=C(Cn1nnc2ccccc21)N(Cc1ccsc1)c1ccc(N2CCOCC2)cc1', 'COc1cccc(CCN(C(=O)c2cc(=O)[nH]c3ccccc23)c2cccc(Cl)c2)c1', 'O=C(Cn1nnc2ccccc21)N(Cc1ccsc1)c1ccc(-c2ccccc2)cc1', 'O=C(Nc1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1)C1CC1', 
'CC(C)(C(N)=O)c1ccc(N(Cc2ccsc2)C(=O)Cn2nnc3ccccc32)cc1', 'O=S(=O)(Cc1ccccc1F)c1oc(-c2ccc(F)cc2)nc1S(=O)(=O)c1ccccc1', 'Cc1cc(C)c(C)c(S(=O)(=O)N2CCN(C(=O)c3cc(=O)[nH]c4ccccc34)CC2)c1C', 'O=C(Cc1cc(Cl)cc(Nc2ccc(C(F)(F)F)cn2)c1)Nc1cncc2ccccc12', 'CCN(Cc1ccc(F)cc1)c1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'Cc1ccc2nc(Nc3cc(Cl)cc(CC(=O)Nc4cncc5ccccc45)c3)sc2c1', 'CCS(=O)(=O)N1CCN(c2cc(Cl)cc(CC(=O)Nc3cncc4ccccc34)c2)CC1', 'CN(C)CC(=O)NCC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'COCCNC(=O)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NC2COC2)c2cccnc2)cc1', 'CC(=O)Nc1ccc(N(Cc2cccc(Cl)c2)C(=O)Cc2cncc3ccccc23)cn1', 'CC(C)(C)NC(=O)C(c1cccnc1)N(C(=O)c1ccco1)c1ccc(C(C)(C)C)cc1', 'CC(C)c1cc(CNc2nc(Nc3cc(C4CC4)[nH]n3)cc(N3CCN(C)CC3)n2)on1', 'CC[[email protected]](CO)Nc1nc(NCc2ccc(-c3ccccn3)cc2)c2ncn(C(C)C)c2n1', 'CC(C)Nc1cccnc1N1CCN(C(=O)c2cc3cc(NS(C)(=O)=O)ccc3[nH]2)CC1', 'O=C(Cn1nnc2ccccc21)N(Cc1cccc(Cl)c1)c1ccc(N2CCOCC2)cc1', 'O=C(Cn1nnc2ccccc21)N(Cc1cccc(Cl)c1)c1ccc(N2CCCCC2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NC2CCOC2)c2cccnc2)cc1', 'COC(=O)CNC(=O)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'Cn1ccc(CNCC2(C(=O)Nc3cncc4ccccc34)CCOc3ccc(Cl)cc32)n1', 'Cc1nc2n(n1)CC(C(=O)NC(C(=O)Nc1cncc3ccccc13)c1cccc(Cl)c1)CC2', 'O=C(Cc1cc(Cl)cc(N2CCN(C(=O)c3ccco3)CC2)c1)Nc1cncc2ccccc12', 'CN(Cc1cccc2ccccc12)c1cc(Cl)cc(CC(=O)Nc2cncc3ccccc23)c1', 'O=C(Cn1ccnc1)NCC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'COC(=O)C(C)NC(=O)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NC2CCOCC2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCC2CCOC2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCC2CCCO2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NC2CC(F)(F)C2)c2cccnc2)cc1', 'Cn1c(=O)ccn(CC(=O)NC(C(=O)Nc2cncc3ccccc23)c2cccc(Cl)c2)c1=O', 'O=C(Cn1cnc(C(F)(F)F)n1)NC(C(=O)Nc1cncc2ccccc12)c1cccc(Cl)c1', 'O=C(Cn1cc(Cl)c(=O)[nH]c1=O)NC(C(=O)Nc1cncc2ccccc12)c1cccc(Cl)c1', 
'O=C(CN1CCN(C2CC2)C1=O)NC(C(=O)Nc1cncc2ccccc12)c1cccc(Cl)c1', 'Cn1nc(C(=O)NC(C)(C(=O)Nc2cncc3ccccc23)c2ccc(Cl)c(Cl)c2)cc1C#N', 'CC(NC(=O)c1cc2n(n1)CCO2)(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'CC(C)(C)c1ccc(N(C(=O)c2ccco2)C(C(=O)Nc2ccccc2)c2cccnc2)cc1', 'O=C1CC(Oc2cc(Cl)cc(N(CCC3CCCCC3)C(=O)Nc3cncc4ccccc34)c2)N1', 'Cn1cc(C2=NOC(C(=O)NC(C(=O)Nc3cncc4ccccc34)c3cccc(Cl)c3)C2)cn1', 'CC1CN(C)CCN1C(=O)CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'CC(C)N1CCOC(C(=O)NC(C)(C(=O)Nc2cncc3ccccc23)c2ccc(Cl)c(Cl)c2)C1', 'CC(C)(C)c1ccc(N(C(=O)c2ccco2)C(C(=O)NCc2ccccc2)c2cccnc2)cc1', 'N#Cc1ccc(N(C(=O)c2ccco2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'COC(=O)C1CC(NC(=O)C(c2cccnc2)N(C(=O)c2cocn2)c2ccc(C(C)(C)C)cc2)C1', 'COc1ccc(NC(=O)C(c2cccnc2)N(C(=O)c2cocn2)c2ccc(C(C)(C)C)cc2)cn1', 'O=C(Cn1nc2n(c(=O)c1=O)CCC2)NC(C(=O)Nc1cncc2ccccc12)c1cccc(Cl)c1', 'O=C(Cc1ccc(-n2cnnn2)cc1)NC(C(=O)Nc1cncc2ccccc12)c1cccc(Cl)c1', 'CC(NC(=O)CN1CCN(C2CC2)C1=O)(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'Cc1nc2n(n1)CC(C(=O)NC(C)(C(=O)Nc1cncc3ccccc13)c1ccc(Cl)c(Cl)c1)CC2', 'CC(C)(C)c1ccc(N(C(=O)c2ccco2)C(C(=O)NCCN2CCOCC2)c2cccnc2)cc1', 'CC(C)(C)c1cnc(N(C(=O)c2ccco2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)s1', 'CC(C)(C)c1ccc(N(C(=O)c2ccco2)[[email protected]](C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2c[nH]cn2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cn[nH]c2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2ccco2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCCc2cccc(F)c2)c2ccnnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCCc2cccc(F)c2)c2cncnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCCc2cccc(F)c2)c2cnccn2)cc1', 'O=C(CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21)N1CCN(CCCO)CC1', 'COC(=O)[[email protected]@H]1CC[[email protected]](NC(=O)C(c2cccnc2)N(C(=O)c2cocn2)c2ccc(C(C)(C)C)cc2)C1', 'COC(=O)C1(NC(=O)C(c2cccnc2)N(C(=O)c2cocn2)c2ccc(C(C)(C)C)cc2)CCCC1', 'CC1Cn2nccc2CN1C(=O)CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 
'CC(C)Oc1ccc(N(C(=O)c2cocn2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2c[nH]nn2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cnco2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'NS(=O)(=O)c1[nH]ncc1C(=O)NC[[email protected]@]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21', 'CC(C)Oc1ccc(N(C(=O)c2c[nH]cn2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)C2CCCO2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2ncc[nH]2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2nccs2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'CC(C)(C)c1ccc(N(C(=O)c2ccn[nH]2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'COc1nc2ccc(Br)cc2cc1[[email protected]@H](c1ccccc1)[[email protected]@](O)(CCN(C)C)c1cccc2ccccc12', 'CC(C)(C)c1ccc(N(C(=O)c2ccc(Cl)o2)C(C(=O)NCCc2cccc(F)c2)c2cccnc2)cc1', 'Cc1ccncc1C(C(=O)NCCc1cccc(F)c1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'O=C(CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21)N1CCCCC1c1cn[nH]c1', 'O=C(CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21)N1CCN(C2CCOC2)CC1', 'O=C(C[[email protected]]1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21)N1CCN(C2CCOC2)CC1', 'CC(NC(=O)Cc1ccc(-n2cnnn2)cc1)(C(=O)Nc1cncc2ccccc12)c1ccc(Cl)c(Cl)c1', 'Cc1ccncc1C(C(=O)NCCc1cccc(F)c1)N(C(=O)c1ccco1)c1ccc(C(C)(C)C)cc1', 'O=C(NCCc1cccc(F)c1)C(c1cccnc1)N(C(=O)c1ccco1)c1ccc(C2CCCC2)cc1', 'O=C(NCCc1cccc(F)c1)C(c1cccnc1)N(Cc1ccccc1C1CCC1)C(=O)c1ccco1', 'COc1ncncc1C(C(=O)NCCc1cccc(F)c1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'O=C(CC1(C(=O)Nc2cncc3ccccc23)CCOc2ccc(Cl)cc21)N1CCN(CC(F)(F)CO)CC1', 'CC(C)(C)c1ccc(N(C(=O)c2cocn2)C(C(=O)NCC(=O)OCc2ccccc2)c2cccnc2)cc1', 'CCOC(=O)c1ncoc1CNC(=O)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(C(C)(C)C)cc1', 'O=C(NCCc1cccc(F)c1)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(Cn2cccn2)cc1', 'O=C(NCCc1cccc(F)c1)C(c1cccnc1)N(C(=O)c1cocn1)c1ccc(OC2CCOC2)cc1'], # a list of SMILES can be provided
        "specific_parameters": None            # not required; note, this is "null" in JSON
    }]
}
configuration["parameters"]["scoring_function"] = scoring_function
In [13]:
# write the configuration file to disk
configuration_JSON_path = os.path.join(output_dir_RL, "RL_config.json")
with open(configuration_JSON_path, 'w') as f:
    json.dump(configuration, f, indent=4, sort_keys=True)
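
The comment on "specific_parameters" notes that Python's None becomes "null" in JSON. The following is a minimal sketch of that round trip. It uses a small stand-in dictionary (`example_configuration`, hypothetical and much shorter than the real configuration) and a temporary folder, so it does not touch the files written above.

```python
import json
import os
import tempfile

# Hypothetical, shortened stand-in for the real configuration dictionary
example_configuration = {
    "version": 2,
    "run_type": "reinforcement_learning",
    "parameters": {
        "scoring_function": {
            "parameters": [{"specific_parameters": None}]
        }
    },
}

with tempfile.TemporaryDirectory() as tmp_dir:
    example_path = os.path.join(tmp_dir, "RL_config.json")

    # same call as in the cell above
    with open(example_path, "w") as f:
        json.dump(example_configuration, f, indent=4, sort_keys=True)

    # read the file back, once as raw text and once as a Python object
    with open(example_path) as f:
        raw_text = f.read()
    reloaded = json.loads(raw_text)

# None is written as JSON null and restored as None on loading
contains_null = "null" in raw_text
round_trip_ok = reloaded == example_configuration
```

Reloading the written file in this way is a quick check that the configuration on disk matches the dictionary built in the notebook before a long run is started.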

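Run REINVENT

Now it is time to execute REINVENT locally. The command has the following form, using the path variables defined above:

# execute REINVENT
python {reinvent_dir}/input.py {configuration_JSON_path}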

In [14]:
%%capture captured_err_stream --no-stderr

# execute REINVENT from the command-line
!python {reinvent_dir}/input.py {configuration_JSON_path}
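
Outside a notebook, the same call can be issued with the standard library instead of the `!` shell escape. The sketch below is one possible way to do this, not part of REINVENT itself: `build_reinvent_command` and `run_command` are hypothetical helper names, and the interpreter defaults to the one running the script, which is only correct if that interpreter belongs to the REINVENT environment. Unlike the `%%capture ... --no-stderr` cell above, it captures both output streams.

```python
import os
import subprocess
import sys


def build_reinvent_command(reinvent_dir, config_path, python_executable=sys.executable):
    """Return the argument list for running input.py with a configuration file."""
    return [python_executable, os.path.join(reinvent_dir, "input.py"), config_path]


def run_command(command):
    """Run a command, capture both streams, and return (return code, stdout, stderr)."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode, completed.stdout, completed.stderr


# Demonstration with a harmless command; for a real run, pass
# build_reinvent_command(reinvent_dir, configuration_JSON_path) instead
demo_code, demo_out, demo_err = run_command([sys.executable, "-c", "print('ok')"])
```

A non-zero return code together with the captured stderr text is usually the first place to look when a run stops early.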

Sampling

In [38]:
# initialize the dictionary
configuration = {
    "version": 2,                          # we are going to use REINVENT's newest release
    "run_type": "sampling"                 # other run types: "scoring", "validation",
                                           #                  "transfer_learning",
                                           #                  "reinforcement_learning" and
                                           #                  "create_model"
}
In [39]:
# add block to specify whether to run locally or not and
# where to store the results and logging
configuration["logging"] = {
    "sender": "http://127.0.0.1",          # only relevant if "recipient" is set to "remote"
    "recipient": "local",                  # either log locally or use a remote REST interface
    "logging_path": os.path.join(output_dir,"sampling/progress.log"), # where the run's output is stored
    "job_name": "Sample_Mpro",             # set an arbitrary job name for identification
    "job_id": "Sample_Mpro_1"              # only relevant if "recipient" is set to "remote"
}
In [40]:
# set up the file path, where the generated molecules are
# deposited (as SMILES)
output_SMILES_path = os.path.join(output_dir,"sampling/sampled/sampled.smi")

# add the "parameters" block
configuration["parameters"] = {
    "model_path": os.path.join(output_dir_RL,       # path to prior or trained agent
                               "results",
                               "Agent.ckpt"),
    "output_smiles_path": output_SMILES_path,       # output path where SMILES will be deposited
    "num_smiles": 100,                              # specifies how many molecules are to be sampled
    "batch_size": 128,                              # internal batch size; should match the one used
                                                    # during training
    "with_likelihood": False                        # also provide the log-likelihood
}
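
The output path above points into a nested "sampled" folder, while the setup cell only creates the "sampling" folder. Whether REINVENT creates missing folders itself is not stated here, so the sketch below assumes it does not. It creates the folder in advance and checks that the agent checkpoint exists. `prepare_sampling_paths` is a hypothetical helper, and the demonstration runs inside a temporary folder rather than the real output directories.

```python
import os
import tempfile


def prepare_sampling_paths(model_path, output_smiles_path):
    """Create the output folder if needed and report whether the model file exists."""
    os.makedirs(os.path.dirname(output_smiles_path), exist_ok=True)
    return os.path.isfile(model_path)


with tempfile.TemporaryDirectory() as tmp_dir:
    # mirrors the layout used in this notebook, but under a temporary folder
    demo_model = os.path.join(tmp_dir, "RL", "results", "Agent.ckpt")
    demo_output = os.path.join(tmp_dir, "sampling", "sampled", "sampled.smi")

    model_found = prepare_sampling_paths(demo_model, demo_output)
    output_dir_created = os.path.isdir(os.path.dirname(demo_output))
```

In the notebook, the call would be `prepare_sampling_paths(configuration["parameters"]["model_path"], output_SMILES_path)`. A result of False indicates that the Reinforcement Learning section has not yet produced an agent at the expected location.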
In [41]:
# write the configuration file to disk
configuration_JSON_path = os.path.join(output_dir, "sampling/sampling_config.json")
with open(configuration_JSON_path, 'w') as f:
    json.dump(configuration, f, indent=4, sort_keys=True)

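Run REINVENT

Now it is time to execute REINVENT locally. The command has the following form, using the path variables defined above:

# execute REINVENT
python {reinvent_dir}/input.py {configuration_JSON_path}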

In [42]:
%%capture captured_err_stream --no-stderr

# execute REINVENT from the command-line
!python {reinvent_dir}/input.py {configuration_JSON_path}

Generated molecules

In [43]:
from rdkit import Chem
import pandas as pd
from rdkit.Chem import PandasTools
#gen_smi = Chem.SmilesMolSupplier(os.path.join(output_dir_sampling,"sampled/sampled.smi"))
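# the file has no header row, so read every line as a SMILES string;
# without header=None the first molecule would be consumed as the column name
df = pd.read_csv(os.path.join(output_dir_sampling, "sampled/sampled.smi"),
                 header=None, names=['smiles'])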
PandasTools.AddMoleculeColumnToFrame(df, smilesCol='smiles')
PandasTools.FrameToGridImage(df, molsPerRow=4)
Out[43]:
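
Before inspecting the grid image, it can help to know how many of the sampled strings are repeats. The following sketch counts total, unique, and repeated entries using only the standard library. It compares the SMILES text exactly and does not canonicalize, so two different spellings of the same molecule are counted as distinct. `summarize_smiles` is a hypothetical helper, and the demonstration uses three short made-up lines rather than the sampled file.

```python
from collections import Counter


def summarize_smiles(lines):
    """Count total, unique, and repeated SMILES strings (exact text match)."""
    smiles = [line.strip() for line in lines if line.strip()]
    counts = Counter(smiles)
    duplicates = {s: n for s, n in counts.items() if n > 1}
    return {"total": len(smiles), "unique": len(counts), "duplicates": duplicates}


# made-up example lines, including a repeat and an empty line
demo_lines = ["CCO\n", "c1ccccc1\n", "CCO\n", "\n"]
summary = summarize_smiles(demo_lines)
```

For the sampled file, an open file handle can be passed directly, for example `with open(os.path.join(output_dir_sampling, "sampled/sampled.smi")) as f: summarize_smiles(f)`.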