Template 2

Description of Template 2

The input format of Template 2 is more restricted than that of Template 1. When an input file is loaded in this format, a number of default values are assumed.

An advantage of this template is that the layout of the input file is simpler and may be better suited to gene expression data. The template was suggested by users of the system, and a corresponding input filter was developed for it. It is very similar to the J-Express format.

In most cases Template 1 is recommended, as it provides more flexibility in terms of data definition.

Layout of the Input File

The input file consists of a single table. An example is shown below:

Screenshot of input format of input file 2.
Example Layout, which conforms to the input format for gene expression data

Two possible columns of the Excel sheet are used for identification: either a column whose header contains the text "clust-ID" or a column whose header starts with the text "EC" (e.g. "EC-number"). If a column with an EC-number header is found, it is preferred over a column with the header clust-ID.
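
The selection rule above can be sketched in a few lines of Python. This is only an illustration, not the actual input filter: the function name pick_id_column is hypothetical, and case-sensitive matching is an assumption, since the description does not say how upper and lower case are handled.

```python
def pick_id_column(headers):
    """Return the index of the identifier column, or None if there is none.

    Assumption: matching is case-sensitive and surrounding spaces are ignored.
    """
    clust_index = None
    for index, header in enumerate(headers):
        text = header.strip()
        if text.startswith("EC"):
            # A header starting with "EC" is preferred over "clust-ID".
            return index
        if clust_index is None and "clust-ID" in text:
            # Remember the first clust-ID column as a fallback.
            clust_index = index
    return clust_index
```

For a header row such as ["spot", "clust-ID", "EC-number"], this sketch selects the EC-number column even though the clust-ID column comes first.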

While the input file is read, a number of predefined, optional description columns are evaluated. The columns listed below are processed whenever they are present. A "*" in a column header description means that any text may precede or follow the given text; for example, "*secure*" also matches a column header with the content "secure?". All other columns, apart from the measurement value columns described in the next section, are ignored.

  • Column header "spot": A node attribute "spot" is created, filled with the corresponding value
  • Column header "info": A node attribute "info" is created, filled with the corresponding value
  • Column header "New blast result": A node attribute "new blast" is created, filled with the corresponding value
  • Column header "*E-value*": A node attribute "new blast e val" is created, filled with the corresponding value
  • Column header "New Blast score": A node attribute "new blast score" is created, filled with the corresponding value
  • Column header "*affy*": A node attribute "affy hit" is created, filled with the corresponding value
  • Column header "score": A node attribute "score" is created, filled with the corresponding value
  • Column header "*funcat*": A node attribute "funcat" is created, filled with the corresponding value
  • Column header "*secure*": A node attribute "secure" is created, filled with the corresponding value
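
The mapping from column headers to node attributes listed above can be illustrated with a small lookup table. The following is a minimal sketch, not the actual implementation: the names COLUMN_RULES and attribute_for_header are hypothetical, the "*" wildcard is emulated with Python's fnmatch module, and case-insensitive comparison is an assumption.

```python
from fnmatch import fnmatchcase

# (header pattern, node attribute) pairs taken from the list above.
COLUMN_RULES = [
    ("spot", "spot"),
    ("info", "info"),
    ("New blast result", "new blast"),
    ("*E-value*", "new blast e val"),
    ("New Blast score", "new blast score"),
    ("*affy*", "affy hit"),
    ("score", "score"),
    ("*funcat*", "funcat"),
    ("*secure*", "secure"),
]


def attribute_for_header(header):
    """Return the node attribute name for a column header, or None.

    Assumption: headers are compared case-insensitively after trimming.
    """
    text = header.strip().lower()
    for pattern, attribute in COLUMN_RULES:
        if fnmatchcase(text, pattern.lower()):
            return attribute
    return None  # column is not a known description column
```

With these rules, a header such as "secure?" maps to the attribute "secure", while a header that matches no rule yields None and would be ignored unless it is the identifier column or a measurement value column.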

Layout of the Measurement Value Headers

Each data column header needs to conform to the following layout: "[MeasurementDay][species/genotype]_[replicate number]". The value [MeasurementDay] specifies the time of the measurement in a time series. In contrast to Template 1, no time unit can be specified in this template; the default unit "day" is used instead. The [species/genotype] value divides the dataset into groups; it may also be used to describe other relationships in the data, as shown in the example image. If replicate measurements are available, each replicate dataset must be identified by its [replicate number]. The [species/genotype] and [replicate number] values must be separated by an underscore "_".
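
To make the header layout concrete, the following Python sketch splits a header into its parts. It is an illustration under stated assumptions, not the parser used by the input filter: the names HEADER_RE, MeasurementHeader and parse_measurement_header are hypothetical, the measurement day is assumed to be a leading whole number, and a missing replicate suffix is assumed to be allowed. Headers such as "3WT_2" are made-up examples.

```python
import re
from typing import NamedTuple, Optional

# Leading digits = measurement day, optional trailing "_<digits>" = replicate.
# Assumption: the day is a whole number; everything in between is the
# species/genotype text.
HEADER_RE = re.compile(r"^(\d+)(.*?)(?:_(\d+))?$")


class MeasurementHeader(NamedTuple):
    day: int
    group: str                 # the [species/genotype] part
    replicate: Optional[int]   # None if no replicate number is given
    time_unit: str = "day"     # fixed default unit in this template


def parse_measurement_header(header):
    """Split a measurement column header into its parts, or return None."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    day, group, replicate = match.groups()
    return MeasurementHeader(
        int(day),
        group,
        int(replicate) if replicate is not None else None,
    )
```

Under these assumptions, "3WT_2" is read as day 3, group "WT", replicate 2. A species/genotype name that itself begins with a digit would be ambiguous in this sketch, because its leading digits would be read as part of the day.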

In this layout, each measured substance occupies its own row, whereas in Template 1 additional substances are placed in new columns. Conversely, additional time points require new columns here instead of new rows.

Additional, more standards-compliant or more common data input formats may be supported in the future. Please send suggestions for interesting candidates by email to the author of VANTED: feedback@vanted.org.