{ "cells": [ { "cell_type": "markdown", "id": "b8ed9c19-df9c-4b7a-a6a4-2b0ddfe19b0f", "metadata": {}, "source": [ "## Tutorials: Configuration Generation" ] }, { "cell_type": "markdown", "id": "3f36fe4a-073a-4b08-8ba9-f95b7334aa08", "metadata": {}, "source": [ "### Import the necessary packages" ] }, { "cell_type": "code", "execution_count": 1, "id": "a6b78326-7474-4740-a410-8368cc806706", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import subprocess\n", "import math\n", "import random\n", "\n", "N_configs = 10 #The number of configs that the users hope to generate\n", "bond_length = 0.5 #Bond length\n", "lamina_prox_tol = 1.0 #Extra proximity tolerance for lamina\n", "R_small_sphere = 4.0 #Radius of the small sphere in which each individual chromosome is collapsed\n", "R_low_radial = 0.5 #The lowest values for chrcom uniform position\n", "N_chr_beads = 60642 #The chromatin beads we have for the 100KB model\n", "R_nucleus = 13.0 #Radius of the nucleus (reduced unit)" ] }, { "cell_type": "markdown", "id": "51eb866c-53c7-4f81-98e8-1f95020984dc", "metadata": {}, "source": [ "### Import the necessary reference files" ] }, { "cell_type": "code", "execution_count": 2, "id": "4924d4cc-919e-4c6b-919a-13c6336742e9", "metadata": {}, "outputs": [], "source": [ "damid_average = np.loadtxt('averaged_damid.txt') #averaged DamID signals of homologous chromosomes\n", "### chromatin_info.txt: This file contains the information of each chromatin bead, and is necessary when creating the final configuration. \n", "### 1st column: the index of each bead; 2nd column: the index of belonging chromosome; 3rd column: the type of each chromatin bead.\n", "mol_info = np.loadtxt(\"chromatin_info.txt\", dtype=int)" ] }, { "cell_type": "markdown", "id": "05617565-c386-4e1d-b99c-6873fca8a9e2", "metadata": {}, "source": [ "### Generate the raw configs of chromosomes\n", "\n", "As an example, we first created a total of 10 configurations for the genome by sequentially generating the conformation of each one of the 46 chromosomes as follows. For a given chromosome, we start by placing the first bead at the center (origin) of the nucleus. The positions of the following beads, $i$, were determined from the $(i-1)$-th bead as $r_i= r_{i-1} + 0.5 v$. $v$ is a normalized random vector, and 0.5 was selected as the bond length between neighboring beads. To produce globular chromosome conformations, we rejected vectors, $v$, that led to bead positions with distance from the center larger than $4\\sigma$. Upon creating the conformation of a chromosome $i$, we shift its center of mass to a value $r^i_\\text{com}$ determined as follows. We first compute a mean radial distance, $r^i_\\text{o}$ with the following equation $$\\frac{6\\sigma - r^i_\\text{o}}{r^i_\\text{o}-2\\sigma} = \\frac{D_\\text{hi}-D}{D-D_\\text{lo}},$$ where $D_i$ is the average value of Lamin B DamID profile for chromosome $i$. $D_\\text{hi}$ and $D_\\text{lo}$ represent the highest and lowest average DamID values of all chromosomes, and $6\\sigma$ and $2\\sigma$ represent the upper and lower bound in radial positions for chromosomes. As shown in Fig. S6, the average Lamin B DamID profiles are highly correlated with normalized chromosome radial positions as reported by DNA MERFISH, supporting their use as a proxy for estimating normalized chromosome radial positions. We then select $r^i_\\text{com}$ as a uniformly distributed random variable within the range $[r^i_\\text{o}-2\\sigma,r^i_\\text{o}+2\\sigma]$. Without loss of generality, we randomly chose the directions for shifting all 46 chromosomes." ] }, { "cell_type": "code", "execution_count": 3, "id": "daf056d4-ad2b-4c72-8572-3d5611f4fff2", "metadata": {}, "outputs": [], "source": [ "ave_largest = np.amax(damid_average)\n", "ave_smallest = np.amin(damid_average)\n", "\n", "rp_pos = []\n", "\n", "for k in damid_average:\n", " p1 = k-ave_smallest\n", " p2 = ave_largest-k\n", " rp_pos.append((6*p1+2*p2)/(p1+p2)) # Artificial estimated radial positions from the DamID" ] }, { "cell_type": "code", "execution_count": 4, "id": "fd29ea00-7c29-4da0-b47f-df4798d72130", "metadata": {}, "outputs": [], "source": [ "subprocess.call([\"mkdir -p raw_configs_txt\"],shell=True,stdout=subprocess.PIPE)\n", "\n", "for k in range(N_configs):\n", " file_write = open(\"raw_configs_txt/human_init_%d\"%(k+1),\"w\")\n", "\n", " xyz_chr = np.zeros((N_chr_beads,3))\n", " xyz_chr[:,:] = 100.0\n", " atom_index = 0\n", " for chr in range(1,47):\n", " #Go chromosome by chromosome and initialize them randomly and save in xyz_chr\n", " for beads_chr in range(np.count_nonzero(mol_info[:,1] == chr)):\n", " if beads_chr!= 0:\n", " while(np.sum(xyz_chr[atom_index]**2) > (R_small_sphere)**2):\n", " #Brute force but only needs to be done once\n", " draw_norm = np.random.normal(loc=0.0,scale=1.0,size=3)\n", " draw_norm /= np.sum(draw_norm**2,axis=None)**0.5\n", " xyz_chr[atom_index] = xyz_chr[atom_index-1] + bond_length*draw_norm\n", " else:\n", " #First bead of the chr is at the origin\n", " xyz_chr[atom_index,:] = 0.\n", " atom_index += 1\n", "\n", " #Push the COM of the chr to some random point in the nucleus by choosing the radial coordinate uniformly\n", " draw_norm = np.random.normal(loc=0.0,scale=1.0,size=3)\n", " draw_norm /= np.sum(draw_norm**2,axis=None)**0.5\n", " if chr < 45: ## 1 - 44 chromosomes \n", " ## uniformly distributed random variable within the range [r-2sigma,r+2sigma]\n", " com_chr = draw_norm*(np.random.uniform(low=rp_pos[(chr-1)//2]-2,high=rp_pos[(chr-1)//2]+2)) \n", " else: # sex chromosome\n", " com_chr = draw_norm*(np.random.uniform(low=0.5,high=8.0))\n", "\n", " chr_only = (mol_info[:,1] == chr)\n", " xyz_chr[chr_only,:] += com_chr\n", " \n", " ###\n", " atom_index = 0\n", " random_counter = 0\n", " \n", " for j in range(len(mol_info)):\n", " file_write.write(\"%d %d %d 0.0 %.6f %.6f %.6f\\n\" %(mol_info[atom_index][0],mol_info[atom_index][1],mol_info[atom_index][2],\n", " xyz_chr[atom_index][0],xyz_chr[atom_index][1],xyz_chr[atom_index][2]))\n", " atom_index += 1\n", " \n", " file_write.close()" ] }, { "cell_type": "markdown", "id": "87ce6527-ac4e-4a8c-a30a-e5960db282e0", "metadata": {}, "source": [ "### Generate Speckle and Nucleolus beads\n", "\n", "All the beads are uniformly distributed within the cell nucleus in the initial configurations" ] }, { "cell_type": "code", "execution_count": 5, "id": "38561767-bce7-468c-9bb4-fedd919f2012", "metadata": {}, "outputs": [], "source": [ "n_nuc = 300\n", "n_spec = 1600\n", "\n", "for i in range(N_configs):\n", " beads = np.loadtxt(\"raw_configs_txt/human_init_%d\"%(i+1), usecols=[4,5,6])\n", " count = 0\n", "\n", " while (count < n_nuc+n_spec): # 300 Nucleolus beads and 1600 Speckle beads\n", " theta = np.pi*random.random()\n", " phi = 2*np.pi*random.random()\n", " radial = np.sqrt(12.25**2*random.random())\n", " ran1 = radial*np.sin(theta)*np.cos(phi)\n", " ran2 = radial*np.sin(theta)*np.sin(phi)\n", " ran3 = radial*np.cos(theta)\n", " pos = np.array([ran1,ran2,ran3])\n", "\n", " m = beads-pos\n", " # if there are beads within this cutoff, it will lead to a unstable system, so reject this sampling\n", " if len(np.where(np.sqrt(np.sum(m**2,axis=1))<0.4)[0]) == 0: #0.4 is the empirical cutoff\n", " beads = np.vstack((beads, pos.reshape((-1,3))))\n", " count += 1\n", "\n", " np.savetxt('raw_configs_txt/with_Sp_No_%d.txt'%(i+1), beads, fmt='%.6f')" ] }, { "cell_type": "markdown", "id": "27075bf2-fb05-49e2-95fe-0e1a9ad74879", "metadata": {}, "source": [ "### Generate Lamina beads\n", "\n", "The nuclear envelope provides an enclosure to confine DNA and a repressive environment to organize chromatin with specific interactions. To account for the role of the nuclear lamina while keeping our model simple, we approximate it with discrete particles uniformly placed on a sphere. \n", "\n", "We used the Fibonacci grid to initialize the lamina particles, which form a uniform and almost equidistant network of lamina particles on the surface of the nucleus. The Cartesian coordinates associated with the $i^{th}$ lamina particles are defined as\n", " $$x_i = 2R_\\text{N}\\times(1 - \\frac{i}{N_\\text{La}-1}), y_i = \\sqrt{R_\\text{N}^2-x^2}\\times \\cos[i\\Phi], z_i = \\sqrt{R_\\text{N}^2-x^2}\\times \\sin[i\\Phi]$$\n", "\n", "where $N_\\text{La} =8000$ represents the number of lamina particles, $i\\in \\{0,1,\\dots,N_\\text{La}-2,N_\\text{La}-1\\}$, and $\\Phi = \\pi\\times (3-\\sqrt{5})$ is the golden angle. We set $R_\\text{N} = 5\\mu m$ as the radius of the human foreskin fibroblasts (HFF) cell nucleus. " ] }, { "cell_type": "code", "execution_count": 6, "id": "b8d50afe-88a9-4039-8d3b-cd7b748f4022", "metadata": {}, "outputs": [], "source": [ "def fibonacci_sphere(samples=8000):\n", "\n", " points = []\n", " phi = math.pi * (3. - math.sqrt(5.)) # golden angle in radians\n", "\n", " for i in range(samples):\n", " y = 1 - (i / float(samples - 1)) * 2 # y goes from 1 to -1\n", " radius = math.sqrt(1 - y * y) # radius at y\n", "\n", " theta = phi * i # golden angle increment\n", "\n", " x = math.cos(theta) * radius\n", " z = math.sin(theta) * radius\n", "\n", " points.append([x, y, z])\n", "\n", " return points\n", "\n", "Lamina = R_nucleus*np.array(fibonacci_sphere())" ] }, { "cell_type": "markdown", "id": "52e50be9-1d68-4297-8971-663817b0b623", "metadata": {}, "source": [ "### Generate the initial configuration files used in the simulation" ] }, { "cell_type": "code", "execution_count": 7, "id": "45490760-43fd-4357-a66c-cad71d405c6b", "metadata": {}, "outputs": [], "source": [ "X, zero = 'X', 0\n", "\n", "n_chrom = 46\n", "nuc_type = 5\n", "spe_type1 = 6\n", "spe_type2 = 7\n", "lam_type = 8\n", "\n", "for i in range(N_configs):\n", " beads = np.loadtxt('raw_configs_txt/with_Sp_No_%d.txt'%(i+1))\n", " coor = np.vstack((beads, Lamina))\n", "\n", " file_write = open('init_config_pool/human_%d.pdb'%(i+1), 'w')\n", " for i in range(N_chr_beads):\n", " if i == 0:\n", " file_write.write('CRYST1 26.400 26.400 26.400 90.00 90.00 90.00 P 1 1\\n')\n", " file_write.write('ATOM%7s%3s%8s%4s%12.3f%8.3f%8.3f%6.2f%6.2f \\n'\n", " %(mol_info[i,0],mol_info[i,2],X,mol_info[i,1],coor[i,0],coor[i,1],coor[i,2],zero,zero))\n", " for k in range(1,len(coor)-N_chr_beads+1):\n", " idx_curr = N_chr_beads+k\n", " mol_curr = n_chrom+k\n", " if k<=n_nuc:\n", " file_write.write('ATOM%7s%3s%8s%4s%12.3f%8.3f%8.3f%6.2f%6.2f \\n'\n", " %(idx_curr,nuc_type,X,mol_curr,coor[idx_curr-1,0],coor[idx_curr-1,1],coor[idx_curr-1,2],zero,zero))\n", " elif k<=n_nuc+n_spec:\n", " file_write.write('ATOM%7s%3s%8s%4s%12.3f%8.3f%8.3f%6.2f%6.2f \\n'\n", " %(idx_curr,spe_type1,X,mol_curr,coor[idx_curr-1,0],coor[idx_curr-1,1],coor[idx_curr-1,2],zero,zero))\n", " else:\n", " file_write.write('ATOM%7s%3s%8s%4s%12.3f%8.3f%8.3f%6.2f%6.2f \\n'\n", " %(idx_curr,lam_type,X,mol_curr,coor[idx_curr-1,0],coor[idx_curr-1,1],coor[idx_curr-1,2],zero,zero))\n", " file_write.write(\"END\")\n", " file_write.close()" ] }, { "cell_type": "markdown", "id": "5fab3bce-5158-427c-b8e1-42cfac020d79", "metadata": {}, "source": [ "Delete the unnecessary raw configs" ] }, { "cell_type": "code", "execution_count": 8, "id": "da1c7076-7853-4bc6-8a31-b49827d64a49", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "subprocess.call([\"rm -r raw_configs_txt\"],shell=True,stdout=subprocess.PIPE)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.15" } }, "nbformat": 4, "nbformat_minor": 5 }