Binary to DNA encoding

Problem description

I have an 8-bit binary sequence that I need to encode as a DNA sequence.

For example, given 10011100 and the following encoding rule:

A = 00, T = 11, G = 10, C = 01

I want the result to be GCTA. In other words, each 8-bit sequence should become a 4-character DNA sequence.

I need to do this for a 256 x 256 matrix where each element is an 8-bit binary sequence.

I created the matrix with the following code:

a=imread('C:\Users\Desktop\lena.png');
disp(a);
imshow(a);
for i=1:1:256
    for j=1:1:256
        b{i,j,1} = dec2bin(a(i,j),8);
    end
end 
disp(b)

Solution

Here's a loop-free approach for you. We can actually do this in three lines.

You already have the first step, which is to take each 8-bit number in your image and convert it into its binary representation. Note that the result is a 2D cell array of the same size as the image you converted. Each cell holds the binary representation of one number as a string.
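As a side note, this first step can also be written without the nested loops. The sketch below is only one possible way to do it; it uses a small stand-in matrix instead of lena.png (an assumption made so the example is self-contained) and casts to double before calling dec2bin to be safe across versions.

```matlab
% Stand-in for the image: a small uint8 matrix (assumption, not lena.png)
a = uint8([255 170; 204 13]);

% dec2bin on a column vector returns a char matrix with one
% 8-character row per value, in column-major order
bits = dec2bin(double(a(:)), 8);

% Turn each row into its own string and restore the image layout
b = reshape(cellstr(bits), size(a));

disp(b)
```

Because both a(:) and reshape use column-major order, each string ends up at the same position as the pixel it came from.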

All you really need to do now is create a lookup, then use it to generate four characters per location in a new 2D cell array. I would use the containers.Map() class to create a key-value lookup where each pair of bits is mapped to a single character. Once we have that, we can use cellfun to iterate over each 8-character string in your cell array, break the bits into 2-character strings, and use these as keys into the lookup. values returns 4 separate cells per string, so we need cell2mat to join them back into a single string. As such, try doing this:

codebook = containers.Map({'00','11','10','01'},{'A','T','G','C'}); % Lookup
outputCell = cellfun(@(x) values(codebook, {x(1:2),x(3:4),x(5:6),x(7:8)}), ...
             b, 'UniformOutput', false);
finalOutput = cellfun(@cell2mat, outputCell, 'UniformOutput', false);
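To see what the anonymous function does for a single element, here is a step-by-step sketch using the 10011100 example from the question. The variable names x, pairKeys, letters and dna are made up for illustration.

```matlab
% Same lookup as above: each pair of bits maps to one nucleotide letter
codebook = containers.Map({'00','11','10','01'}, {'A','T','G','C'});

x = '10011100';                               % example from the question

% Split the 8-character string into four 2-character keys
pairKeys = {x(1:2), x(3:4), x(5:6), x(7:8)};  % {'10','01','11','00'}

% Look up all four keys at once; the result is a 1x4 cell of characters
letters = values(codebook, pairKeys);

% Join the four characters into one string
dna = cell2mat(letters);

disp(dna)   % GCTA
```

cellfun simply repeats these steps for every cell of b, which is why the intermediate result is a cell array of cell arrays until the second cellfun call joins each one.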

As an example, let's say we had this 2 x 2 matrix of cell elements:

b = {'11111111', '10101010'; '11001100', '00001101'}

b = 

'11111111'    '10101010'
'11001100'    '00001101'

Running this through the code above gives:

finalOutput = 

'TTTT'    'GGGG'
'TATA'    'AATC'
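If the binary strings are not needed for anything else, a possible alternative is to skip them and work on the pixel values directly. The sketch below is not part of the approach above; it assumes the same encoding rule, uses the hypothetical names lut, v and idx, and starts from the numeric values that correspond to the 2 x 2 example (255, 170, 204, 13).

```matlab
% Numeric values matching the 2 x 2 example above (assumption: uint8 input)
a = uint8([255 170; 204 13]);

% Lookup string ordered by 2-bit value: 00->A, 01->C, 10->G, 11->T
lut = 'ACGT';

% Work in double so division and floor behave as ordinary arithmetic
v = double(a(:));

% Extract the four 2-bit groups, most significant first, as values 0..3,
% then add 1 because MATLAB indices start at 1
idx = [floor(v / 64), mod(floor(v / 16), 4), ...
       mod(floor(v / 4), 4), mod(v, 4)] + 1;

% Indexing with an N-by-4 matrix gives an N-by-4 char array,
% one 4-letter row per pixel
dna = lut(idx);

% One string per pixel, restored to the image layout
finalOutput = reshape(cellstr(dna), size(a));

disp(finalOutput)
```

This trades the readable key-value lookup for plain indexing, which avoids creating a containers.Map and calling cellfun on every element.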
