Ancestral reconstruction

Methods

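  1. Search UniProt for the target protein.
  2. Run BLAST on the target to collect homologous sequences, then download the resulting sequences.
  3. Run a multiple sequence alignment (MSA) with Clustal Omega.
  4. Upload the alignment to http://fastml.tau.ac.il and start the analysis.
  5. Download the ancestral reconstruction sequences and the tree from the result files.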
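
Before uploading the alignment in step 4, it can help to confirm that every row of the MSA has the same length, since a truncated or hand-edited file is a common cause of failed runs. The following is a minimal sketch, assuming the Clustal Omega result was saved in FASTA format; `read_fasta`, `alignment_problems`, and the toy alignment are hypothetical helpers and data for illustration, not part of the original workflow.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {record id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # the record id is the first word after ">"
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def alignment_problems(records):
    """Return a list of problems found; an empty list means the MSA looks consistent."""
    problems = []
    if len(records) < 2:
        problems.append("an alignment needs at least two sequences")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        problems.append(f"rows have different lengths: {sorted(lengths)}")
    return problems


# Toy alignment (hypothetical data, not the albumin MSA): the last row is too short.
toy_msa = """>seq_a
MKWV-TFISLL
>seq_b
MKWVATFLSLL
>seq_c
MKWV-TFIS
"""
toy_problems = alignment_problems(read_fasta(toy_msa))
print(toy_problems)
```

For a real run, the text would come from the downloaded alignment file instead of `toy_msa`.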

Results

In [24]:
import matplotlib
import matplotlib.pyplot as plt
from Bio import Phylo


def plot_tree(treedata, x, y):
    """Draw the Newick tree stored in the file `treedata` on an x-by-y inch figure."""
    # Phylo.read accepts a file path or an open file handle
    tree = Phylo.read(treedata, "newick")
    # sort clades in place by number of terminal nodes for a tidier layout
    tree.ladderize()
    matplotlib.rc("font", size=7)
    # set the size of the figure
    fig = plt.figure(figsize=(x, y), dpi=100)
    axes = fig.add_subplot(1, 1, 1)
    axes.axis("off")
    Phylo.draw(tree, axes=axes)
In [25]:
# Albumin phylogenetic tree
plot_tree("tree.newick.txt", 8, 3)
[Figure: albumin phylogenetic tree drawn by plot_tree]

Amino acid sequences of each common ancestor

The reconstructed sequences are listed below in FASTA format.

>N1
MKWVTLISLLFLFSSATSRNLQRFRRDAEAHKSEIAHRYNDLGEEHFKGLVLITFAQYLQKCPYEELAKLVKEVTDLAQACVADESAADCSKPLHTIFLDKICAVPKLRDTYGAMADCCAKADPERNECFLSHKDSQPDLVPPYQRPEPDVLCQAYQDNKESFLGHYIYEVARRHPFLYAPAILSFAQKFKAVLTECCEEADKGACLTTKLTALREKALIVSVKQRLSCGILQKFGDRVFQAWQLVRLSQKYPKAPFAEVSKLVTDLTKVHKECCHGDMLECMDDRADLTKHMCEHQDTISSKLKECCEKPIVERSHCIVELENDEMPADLPSLVEKFVEDKEVCKSFEEAKDVFLAEFLYEYSRRHPEFSVQLLLRIAKGYESTLEKCCETDNPHECYANAQDELNQLIKEPQDLVKQNCELLQKLGEYNFQNALLIRYTKKMPQVSTPTLVEISKSMTKVGSKCCKLPEAQRMPCAEGYLSVVINELCVLQETTPINENVTKCCSQSYANRRPCFTALGVDETYVPPEFNADTFTFHEDLCTLPEEERKIKKQTLLVNLVKHKPHVTEEQLKTIAGEFTAMVDKCCAAEDKEACFAEEGPKLIEQSKATLGLGA
>N2
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEIAHRFNDLGEEHFKGLVLITFSQYLQKCPYEEHAKLVKEVTDLAKACVADESAANCDKSLHTIFGDKICAVPSLRDTYGDMADCCEKQEPERNECFLQHKDDKPDLVPPFARPEPDVLCKAFHDNEEAFLGHYLYEVARRHPYFYAPELLYYAQKYKAVLTECCEAADKGACLTPKLDALREKALISSAKQRLRCASLQKFGDRAFKAWALVRLSQKFPKADFAEISKLVTDLTKVHKECCHGDLLECADDRADLAKYMCEHQDTISSKLKECCDKPILEKSHCIAELENDEMPADLPALAEEFVEDKDVCKNYEEAKDVFLGKFLYEYSRRHPDYSVSLLLRLAKAYEATLEKCCATDDPHACYAKVLDEFKPLVEEPQNLVKQNCELFEKLGEYNFQNALLVRYTKKVPQVSTPTLVEISRSLGKVGSKCCKHPEAERMPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCSESLVNRRPCFSALGVDETYVPKEFNAETFTFHADICTLPETERKIKKQTALVELVKHKPHATEEQLKTVVGEFTALVDKCCAAEDKEACFAEEGPKLVESSKATLGLGA
>N3
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEIAHRFNDLGEEHFKGLVLIAFSQYLQQCPFEEHVKLVNEVTEFAKTCVADESAANCDKSLHTLFGDKLCTVASLRETYGEMADCCEKQEPERNECFLQHKDDNPDLVPPLVRPEPDAMCTAFHDNEETFLGKYLYEVARRHPYFYAPELLYYAEKYKAVFTECCQAADKAACLTPKLDALREKVLASSAKQRLKCASLQKFGERAFKAWAVARLSQKFPKADFAEISKLVTDLTKVHKECCHGDLLECADDRADLAKYMCENQDSISSKLKECCDKPLLEKSHCIAEVENDEMPADLPALAADFVEDKDVCKNYQEAKDVFLGTFLYEYSRRHPDYSVSLLLRLAKAYEATLEKCCATDDPHACYAKVFDEFKPLVEEPQNLVKQNCELFEKLGEYGFQNALLVRYTKKVPQVSTPTLVEVSRSLGKVGSKCCKHPEAERMPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLPETEKQIKKQTALVELVKHKPKATEEQLKTVMGDFAAFVDKCCAAEDKEACFAEEGPKLVASSQAALALGA
>N4
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEIAHRFNDLGEEHFKGLVLIAFSQYLQQCPFEEHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCEKQEPERNECFLQHKDDNPNLVPPLVRPEPDAMCTAFHDNEETFLGKYLYEVARRHPYFYAPELLYYAEKYKAVFTECCQAADKAACLTPKLDALREKVLASSAKQRLKCASLQKFGERAFKAWAVARLSQKFPKADFAEVSKLVTDLTKVHKECCHGDLLECADDRADLAKYMCENQDSISSKLKECCDKPLLEKSHCIAEVENDEMPADLPALAADFVEDKDVCKNYAEAKDVFLGTFLYEYSRRHPDYSVSLLLRLAKAYEATLEKCCATADPHACYAKVFDEFKPLVEEPQNLVKQNCELFEKLGEYGFQNALLVRYTKKVPQVSTPTLVEVSRSLGKVGSKCCKHPEAERMPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLPEKEKQIKKQTALVELVKHKPKATEEQLKTVMGDFAAFVDKCCKAEDKEACFAEEGPKLVASSQAALALGA
>N5
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEIAHRFNDLGEKHFKGLVLIAFSQYLQQCPFEEHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCEKQEPERNECFLQHKDDNPNLVPPLVRPEPDAMCTAFQENPETFLGKYLYEVARRHPYFYAPELLYYAEKYKAVFTECCQAADKAACLTPKLDALKEKVLVSSAKQRLKCSSLQKFGERAFKAWAVARLSQKFPKADFAEVSKLVTDLTKVHKECCHGDLLECADDRADLAKYMCENQDSISSKLKACCDKPLLQKSHCIAEVENDDMPADLPALAADFVEDKDVCKNYAEAKDVFLGTFLYEYSRRHPDYSVSLLLRLAKTYEATLEKCCAEADPHACYATVFDEFKPLVEEPQNLVKQNCELFEKLGEYGFQNALLVRYTKKAPQVSTPTLVEVSRSLGKVGSKCCKLPEAERLPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCTESLVERRPCFSALEVDETYVPKEFKAETFTFHADICTLPEKEKQIKKQTALAELVKHKPKATEEQLKTVMGDFAAFVDKCCKAEDKEACFAEEGPKLVASSQAALALGA
>N6
MKWVTFLLLLFVSGSAFSRGVQRFRRDAEAHKSEIAHRYKDLGEKHFKGLVLIAFSQYLQKCPYEEHVKLVQEVTDFAKTCVADESAENCDKSLHTLFGDKLCAIPNLRENYGEMADCCAKQEPERNECFLQHKDDNPNLVPPFQRPEPDAMCTAFQENPETFMGHYLHEVARRHPYFYAPELLYYAEKYNAVLTECCAAADKAACLTPKLDALKEKALVSAVRQRLKCSSMQKFGERAFKAWAVARMSQTFPNADFAEITKLATDLTKVNKECCHGDLLECADDRAELAKYMCENQASISSKLQACCDKPLLQKSHCLAEVEHDDMPADLPALAADFVEDKDVCKNYAEAKDVFLGTFLYEYSRRHPDYSVSLLLRLAKKYEATLEKCCAEADPHACYGTVFDEFKPLVEEPQNLVKTNCELYEKLGEYGFQNAVLVRYTKKAPQVSTPTLVEAARSLGRVGTKCCTLPEAQRLPCVEDYLSAILNRVCVLHEKTPVSEKVTKCCSGSLVERRPCFSALTVDETYVPKEFKAETFTFHADICTLPEKEKQIKKQTALAELVKHKPKATEEQLKTVMGDFAEFVDKCCKAEDKEACFSTEGPKLVARSQEALALGA
>N7
MKWVTFLLLLFVSGSAFSRGVQRFRREAEAHKSEIAHRYKDLGEQHFKGLVLIAFSQYLQKCPYEEHVKLVQEVTDFAKTCVADESAENCDKSLHTLFGDKLCAIPNLRENYGELADCCAKQEPERNECFLQHKDDNPNLVPPFQRPEAEAMCTSFQENPTTFMGHYLHEVARRHPYFYAPELLYYAEKYNEVLTQCCAEADKAACLTPKLDAVKEKALVSAVRQRMKCSSMQKFGERAFKAWAVARMSQTFPNADFAEITKLATDLTKVNKECCHGDLLECADDRAELAKYMCENQATISSKLQACCDKPLLQKSHCLAEVEHDNMPADLPAIAADFVEDKEVCKNYAEAKDVFLGTFLYEYSRRHPDYSVSLLLRLAKKYEATLEKCCAEADPPACYGTVLAEFQPLVEEPKNLVKTNCELYEKLGEYGFQNAVLVRYTQKAPQVSTPTLVEAARNLGRVGTKCCTLPEAQRLPCVEDYLSAILNRVCVLHEKTPVSEKVTKCCSGSLVERRPCFSALTVDETYVPKEFKAETFTFHSDICTLPEKEKQIKKQTALAELVKHKPKATEEQLKTVMGDFAQFVDKCCKAADKDTCFSTEGPNLVARSKEALALGA
>N8
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEVAHRFKDLGEEHFKGLVLIAFSQYLQQCPFEEHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDDNPNLVPPLVRPEVDVMCTAFHDNEETFLKKYLYEVARRHPYFYAPELLFFAARYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKCASLQKFGERAFKAWAVARLSQKFPKAEFAEVSKLVTDLTKVHTECCHGDLLECADDRADLAKYMCENQDSISSKLKECCDKPLLEKSHCIAEVENDEMPADLPSLAADFVESKDVCKNYAEAKDVFLGMFLYEYARRHPDYSVVLLLRLAKAYEATLEKCCAAADPHECYAKVFDEFKPLVEEPQNLVKQNCELFEQLGEYKFQNALLVRYTKKVPQVSTPTLVEVSRNLGKVGSKCCKHPEAKRMPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLSEKEKQIKKQTALVELVKHKPKATKEQLKTVMDDFAAFVEKCCKADDKEACFAEEGPKLVAASQAALALGA
>N9
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDDNPNLVPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKCASLQKFGERAFKAWAVARLSQRFPKAEFAEVSKLVTDLTKVHTECCHGDLLECADDRADLAKYICENQDSISSKLKECCEKPLLEKSHCIAEVENDEMPADLPSLAADFVESKDVCKNYAEAKDVFLGMFLYEYARRHPDYSVVLLLRLAKTYETTLEKCCAAADPHECYAKVFDEFKPLVEEPQNLIKQNCELFEQLGEYKFQNALLVRYTKKVPQVSTPTLVEVSRNLGKVGSKCCKHPEAKRMPCAEDYLSVVLNQLCVLHEKTPVSERVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLSEKERQIKKQTALVELVKHKPKATKEQLKTVMDDFAAFVEKCCKADDKETCFAEEGKKLVAASQAALGLGA
>N10
MKWVTFISLLFLFSSAYSRGVQRFRRDAEAHKSEIAHRFNDLGEEHFKGLVLIAFSQYLQQCPFEEHVKLVNEVTEFAKTCVADESAANCDKSLHTLFGDKLCTVASLRETYGDMADCCEKQEPERNECFLQHKDDNPDLVPPLVRPEPDAMCTAFHDNEQRFLGKYLYEIARRHPYFYAPELLYYAEKYKGVFTECCQAADKAACLTPKIDALREKVLASSAKQRLKCASLQKFGERAFKAWSVARLSQKFPKAEFAEISKLVTDLTKVHKECCHGDLLECADDRADLAKYMCENQDSISSKLKECCDKPLLEKSHCIAEVEKDEMPADLPPLAADFVEDKDVCKNYQEAKDVFLGTFLYEYSRRHPEYSVSLLLRLAKEYEATLEKCCATDDPHACYAKVFDEFKPLVEEPQNLVKQNCELFEKLGEYGFQNALLVRYTKKVPQVSTPTLVEVSRSLGKVGSKCCKHPEAERMPCAEDYLSVVLNRLCVLHEKTPVSEKVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLPETEKQIKKQTALVELLKHKPKATEEQLKTVMGDFAAFVDKCCAAEDKEACFAEEGPKLVASSQAALALGA
>N11
MKWVTFISLLFLFSSAYSRGVQRVRREAEAHKSEIAHRFNDLGEEHFRGLVLVAFSQYLQQCPFEDHVKLVNEVTEFAKACVADESAANCDKSLHTLFGDKLCTVASLRDKYGDMADCCEKQEPERNECFLQHKDDNPGFVPPLVTPEPDAMCTAFHDNEQRFLGKYLYEIARRHPYFYAPELLYYAEKYKGVFTECCQAADKAACLTPKIDALREKVLASSAKERLKCASLQKFGERAFKAWSVARLSQKFPKAEFAEISKLVTDLTKVHKECCHGDLLECADDRADLAKYMCENQDSISTKLKECCDKPVLEKSHCIAEVERDELPADLPPLAADFVEDKEVCKNYQEAKDVFLGTFLYEYSRRHPEYSVSLLLRLAKEYEATLEKCCATDDPPACYAKVFDEFKPLVEEPQNLVKTNCELFEKLGEYGFQNALLVRYTKKVPQVSTPTLVEVSRSLGKVGSKCCKHPEAERMSCAEDYLSVVLNRLCVLHEKTPVSERVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADLCTLPEAEKQIKKQTALVELLKHKPKATEEQLKTVMGDFGAFVDKCCAAEDKEACFAEEGPKLVAAAQAALALGA
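
One way to read these sequences is to compare two ancestral nodes position by position and list the substitutions between them. The sketch below is illustrative only: `differences` and `percent_identity` are hypothetical helper names, and the comparison assumes both sequences have the same length (sequences of different lengths would need to be aligned first). The two inputs are the first 20 residues of N1 and N2 from the listing above; full sequences can be loaded with any FASTA reader, for example `Bio.SeqIO.parse`.

```python
def differences(seq_a, seq_b):
    """List (position, residue_a, residue_b) where the sequences differ (1-based positions)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


def percent_identity(seq_a, seq_b):
    """Percentage of positions at which the two sequences carry the same residue."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = len(differences(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


# First 20 residues of N1 and N2, copied from the FASTA listing above.
n1_prefix = "MKWVTLISLLFLFSSATSRN"
n2_prefix = "MKWVTFISLLFLFSSAYSRG"

for pos, a, b in differences(n1_prefix, n2_prefix):
    print(f"{a}{pos}{b}")  # substitution written as N1 residue, position, N2 residue
print(percent_identity(n1_prefix, n2_prefix))
```

Applying the same functions to each parent and child pair in the tree would give the substitutions inferred along each branch, provided the pair has equal length.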