Coding regions MUST be specified in the following format, one protein per line:
[c][:label:]from-to[,from-to[,...]]
...

Each coding region is specified by two numbers separated by a dash (no spaces around the dash) and coding regions (exons) of the same protein are separated by comma(s). Coding regions of different proteins must be specified on different lines. The numbers always refer to the top strand counted from left. If the first number is omitted, it is taken as 1 and if the second number is omitted, it is taken as the end of the sequence. If the optional leading letter 'c' is given, the coding region(s) of the protein will be translated from the complementary strand (bottom strand). If a region is specified with only one number, the number is taken as the start position and the translation ends at the end of the sequence. But note that the translation goes from left to right on the top strand and from right to left on the bottom strand.
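The format rules above can be sketched as a small parser. This is only an illustration, not the tool's own code: the names `CodingSpec` and `parse_coding_line` are hypothetical, and the sketch follows the literal reading that a single number means "from that position to the end of the sequence" in top-strand coordinates, for both strands.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class CodingSpec(NamedTuple):
    """One protein: strand flag, optional label, and its exons (hypothetical type)."""
    complement: bool
    label: Optional[str]
    regions: List[Tuple[int, int]]  # 1-based, inclusive, top-strand coordinates


# Optional leading 'c' (upper or lower case), optional :label:, then the region list.
_LINE = re.compile(r"^(?P<c>[cC])?(?::(?P<label>[^:]*):)?(?P<regions>.+)$")


def parse_coding_line(line: str, seq_len: int) -> CodingSpec:
    """Parse one line of the form [c][:label:]from-to[,from-to[,...]]."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a coding region line: {line!r}")

    regions = []
    for part in match.group("regions").split(","):
        if not part:
            continue  # tolerate repeated commas ("separated by comma(s)")
        start_text, _, end_text = part.partition("-")
        start = int(start_text) if start_text else 1    # omitted start -> 1
        end = int(end_text) if end_text else seq_len    # omitted end -> sequence end
        if not 1 <= start <= end <= seq_len:
            raise ValueError(f"region {part!r} is outside 1..{seq_len}")
        regions.append((start, end))

    if not regions:
        raise ValueError(f"no regions given in: {line!r}")
    return CodingSpec(bool(match.group("c")), match.group("label"), regions)
```

For a 70-base sequence, `parse_coding_line("c:pep2:7-26,48-60", 70)` yields the complement flag set, the label `pep2`, and the regions `[(7, 26), (48, 60)]`.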
           10        20        30        40        50        60        70
             ---------->          ------------>      ----------->
5' GAGCTGTTAGATGGAGCAACAGGCAACTGTTAGAACTACCAGCTGTTAGAACTCCCACATAAAAGACCTT 3'

Suppose you want to translate the three regions on the top strand marked with ------>, and label the translated peptide as pep1. The coding regions would be specified as
:pep1:11-21,32-44,51-62
           10        20        30        40        50        60        70
5' GAGCTGTTAGATGGAGCAACAGGCAACTGTTAGAACTACCAGCTGTTAGAACTCCCACATAAAAGACCTT 3'
         <-------------------                     <------------

Now suppose you want to translate the two regions on the bottom strand marked with <------, and label the translated peptide as pep2. The coding regions would be specified as
c:pep2:7-26,48-60
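To see what the two specifications select, the following sketch extracts the regions from the 70-base example sequence and translates them. It is an illustration under stated assumptions: the helper names `coding_sequence` and `translate` are hypothetical, the standard genetic code is assumed (this text does not say which table the tool uses), and regions are assumed to be listed in ascending left-to-right order, as in the examples.

```python
from itertools import product
from typing import List, Tuple

# Standard genetic code laid out in TCAG order (assumed, not confirmed by this text).
_BASES = "TCAG"
_AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def coding_sequence(seq: str, regions: List[Tuple[int, int]],
                    complement: bool = False) -> str:
    """Join the exons (1-based, inclusive, top-strand coordinates).

    For the bottom strand, the joined top-strand text is reverse-complemented,
    so the rightmost exon is read first, from right to left.
    """
    joined = "".join(seq[start - 1:end] for start, end in regions)
    if complement:
        return joined.translate(_COMPLEMENT)[::-1]
    return joined


def translate(dna: str) -> str:
    """Translate complete codons only; '*' marks a stop codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


SEQ = "GAGCTGTTAGATGGAGCAACAGGCAACTGTTAGAACTACCAGCTGTTAGAACTCCCACATAAAAGACCTT"

pep1 = translate(coding_sequence(SEQ, [(11, 21), (32, 44), (51, 62)]))
pep2 = translate(coding_sequence(SEQ, [(7, 26), (48, 60)], complement=True))
print(pep1)  # MEQQELPATPT*
print(pep2)  # MWEFFACCSI*
```

Both joined coding sequences start with ATG and end with the stop codon TAA, which is consistent with the top-strand and bottom-strand reading directions described above.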
Examples:
Example1:
:pep1:301-900,1500-2000
c:pep2:5000-7258
:pep3:9000

Example2:
450-800,1215-2348
C4589-7237
:partial:6893

Example3:
c:CDS:600

The first example specifies three proteins, named pep1, pep2 and pep3. Pep2 is translated from the complementary strand. Note that the optional label, if given, must be enclosed between two colons.