Tutorials: Configuration Selection

Import the necessary packages

[1]:
from scipy import stats
import numpy as np
import random
import copy
import sys

Load data for the selection

In the next block, we load the simulated and experimental inter-chromosomal contacts and DamID sequencing data.

[2]:
target_1 = np.loadtxt('expt_constraints_HFF_100KB.txt')[-231:]
target_2 = np.loadtxt('DamID_haploid.txt')
non_zeros = np.where(target_2 != 0)
model_1 = np.loadtxt('inter_contact_simulation.txt')
model_damid = np.loadtxt('DamID_simulation.txt')

model_2 = np.zeros((len(model_damid), len(model_damid[0])))

#The simulated DamID results are taken the logarithm
for i in range(len(model_damid)):
    non_zero_sim = np.where(model_damid[i] != 0)
    model_2[i][non_zero_sim] = np.log2(model_damid[i][non_zero_sim]/np.mean(model_damid[i][non_zero_sim]))

We randomly choose 20 configurations and calculate the pearson correlation coefficients between experimental and simulated results

[3]:
choose = np.random.randint(1,201,20)

pearson_previous_1 = stats.pearsonr(target_1,np.mean(model_1[choose],axis=0))[0]
pearson_previous_2 = stats.pearsonr(target_2[non_zeros],np.mean(model_2[choose],axis=0)[non_zeros])[0]

Iteration

We then iteratively go through every configuration in the initial configuration ensemble (ICE) and replace with a structure from FCE that’s not already included in ICE. We then compute the Pearson correlation coefficient between new average ICE contact probabilities and experimental values. If the Pearson correlation coefficient is higher than the value determined from the original ICE, the new structure is accepted and the ICE is updated. Otherwise, the new structure is rejected.

[4]:
max_iteration = 1 #Choose the maximum iteration

for i in range(max_iteration):
    for j in range(len(choose)):
        for k in range(len(model_1)):
            if k not in choose:
                mid_choose = copy.deepcopy(choose)
                mid_choose[j] = k
                pearson_1 = stats.pearsonr(target_1,np.mean(model_1[mid_choose],axis=0))[0]
                pearson_2 = stats.pearsonr(target_2[non_zeros],np.mean(model_2[mid_choose],axis=0)[non_zeros])[0]
                if pearson_previous_1 < pearson_1 and pearson_previous_2 < pearson_2:
                    pearson_previous_1 = pearson_1
                    pearson_previous_2 = pearson_2
                    choose = copy.deepcopy(mid_choose)
                    print('pearson_previous_1:', pearson_previous_1)
                    print('pearson_previous_2:', pearson_previous_2)
pearson_previous_1: 0.623154678072617
pearson_previous_2: 0.6096369858640148
pearson_previous_1: 0.6342043259811628
pearson_previous_2: 0.6127018990732918
pearson_previous_1: 0.6416742081908886
pearson_previous_2: 0.6235081508950805
pearson_previous_1: 0.6507173333210494
pearson_previous_2: 0.6242140802564254
pearson_previous_1: 0.6525561813708923
pearson_previous_2: 0.6274832627994167
pearson_previous_1: 0.6546429824792156
pearson_previous_2: 0.6355704848623595
pearson_previous_1: 0.6613731669384822
pearson_previous_2: 0.6361247320921432
pearson_previous_1: 0.6664841689688658
pearson_previous_2: 0.6395072016323031
pearson_previous_1: 0.6674629251354353
pearson_previous_2: 0.6409561900073473
pearson_previous_1: 0.6729137884168772
pearson_previous_2: 0.6440289132802018
pearson_previous_1: 0.6769007569408156
pearson_previous_2: 0.6539153304771814
pearson_previous_1: 0.6786271476443753
pearson_previous_2: 0.6550591732765327
pearson_previous_1: 0.6890704364541482
pearson_previous_2: 0.6559477577096485
pearson_previous_1: 0.689291881554134
pearson_previous_2: 0.6589510083013823
pearson_previous_1: 0.6895894266804906
pearson_previous_2: 0.6606082808771188
pearson_previous_1: 0.6915695372195538
pearson_previous_2: 0.661312582709897
pearson_previous_1: 0.7006532764788085
pearson_previous_2: 0.6668180711252502
pearson_previous_1: 0.7021688436140852
pearson_previous_2: 0.6669982033733982
pearson_previous_1: 0.7136459660830662
pearson_previous_2: 0.6759114711227185
pearson_previous_1: 0.7194465415433162
pearson_previous_2: 0.6858685834822772
pearson_previous_1: 0.7269670672376706
pearson_previous_2: 0.6968069360562964
pearson_previous_1: 0.7313887518441096
pearson_previous_2: 0.697440751360122
pearson_previous_1: 0.7376432364351209
pearson_previous_2: 0.6991309548994008
pearson_previous_1: 0.7412082016688102
pearson_previous_2: 0.7007149882409178
pearson_previous_1: 0.7415915203437293
pearson_previous_2: 0.7008510946511858
pearson_previous_1: 0.7535564031431198
pearson_previous_2: 0.7015526131599754
pearson_previous_1: 0.7558136196901397
pearson_previous_2: 0.7033194415452361
pearson_previous_1: 0.760299190214547
pearson_previous_2: 0.7035751060750525
pearson_previous_1: 0.7668049571483361
pearson_previous_2: 0.7042624026198036
pearson_previous_1: 0.7675502264787035
pearson_previous_2: 0.7047260396144769
pearson_previous_1: 0.7701351949373072
pearson_previous_2: 0.7059434043197194
pearson_previous_1: 0.77016524989917
pearson_previous_2: 0.7062846327830313
pearson_previous_1: 0.7705616971572704
pearson_previous_2: 0.7066482922181797
pearson_previous_1: 0.770797485505686
pearson_previous_2: 0.707329688025022
[5]:
np.savetxt('choose_20.txt', choose.reshape((-1,1)), fmt='%d')