A total of 40,331 rice protein sequences* from release 6.1 were used to identify segmental duplications in rice with an all-versus-all BLAST search (WU-BLASTP, parameters "V=5 B=5 E=1e-10"). Because some rice genes have multiple splice forms, the longest peptide sequence was used whenever alternative isoforms existed. Short protein sequences (<50 aa), organellar genes, and pseudogenes were excluded from this analysis. Segmentally duplicated blocks were identified using DAGchainer (Haas et al. 2004) with parameters "-s -I -D", which respectively include self comparisons, ignore tandem duplication alignments, and set the maximum distance permitted between collinear gene pairs. Because the maximum distance significantly affects which segmental duplications are identified, we generated two result sets, with -D 100000 and -D 500000. Use the links below to view each result set.
distance = 100kb
distance = 500kb
*: Transposable element-related genes, small gene models (<50 amino acids), organellar inserted genes, and pseudogenes were excluded from this analysis.
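
The isoform selection and length filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used for this analysis. It assumes, hypothetically, that isoform identifiers are the gene identifier followed by a numeric suffix such as ".1", and that sequences are already loaded into a dictionary. The names `gene_id`, `longest_isoforms`, and `MIN_LENGTH` are made up for this example. Removal of transposable element-related genes, organellar genes, and pseudogenes needs annotation data and is not shown.

```python
MIN_LENGTH = 50  # amino acids; the text excludes sequences shorter than this


def gene_id(isoform_id: str) -> str:
    """Strip a trailing '.N' isoform suffix (assumed naming scheme)."""
    base, sep, suffix = isoform_id.rpartition(".")
    return base if sep and suffix.isdigit() else isoform_id


def longest_isoforms(proteins: dict, min_length: int = MIN_LENGTH) -> dict:
    """Map each gene to its longest isoform ID, dropping short peptides.

    `proteins` maps isoform ID -> peptide sequence. On a length tie,
    the isoform seen first is kept.
    """
    best = {}
    for iso_id, seq in proteins.items():
        gene = gene_id(iso_id)
        current = best.get(gene)
        if current is None or len(seq) > len(proteins[current]):
            best[gene] = iso_id
    # Apply the length cutoff after choosing the representative isoform.
    return {g: i for g, i in best.items() if len(proteins[i]) >= min_length}


if __name__ == "__main__":
    # Hypothetical identifiers and placeholder sequences for illustration.
    example = {
        "GeneA.1": "M" * 120,
        "GeneA.2": "M" * 200,
        "GeneB.1": "M" * 30,
    }
    print(longest_isoforms(example))  # {'GeneA': 'GeneA.2'}
```

In this example, GeneA is represented by its longer isoform and GeneB is dropped because its only peptide is shorter than 50 aa. The retained sequences would then be the input to the all-versus-all BLAST search.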