File Formats

Toolkit Data File Formats

Prepared by Geoff Podger and Robert Argent

Document History

Date             Author                       Revision  Description of Change
Feb 2004         Podger                       0         Content for RRL manual
11/6/04          Argent                       1         Initial development of content
Jan 2005         Argent                       1.1       Small changes to match recent changes; addition of .flt
19 Jan 2005      Nick Murray                  1.1.1     Minor formatting changes; each example file format now fits completely on a page for convenience
Oct 2005         Robert Argent, Nick Murray   1.1.2     Added information of bsm. Updated the CDT file format descriptions and examples to disambiguate month formatting. Added Appendix A - zero padding data in Excel. Added Appendix B - Times and Dates in Data Files
February 2006    Robert Argent                1.1.3     Corrected information on cell size in .asc
20 Feb 2006      Nick Murray                  1.1.4     Combined file format examples with file format descriptions. Added link to SHP file format. Fixed several typos.

Copyright notice

© CRC for Catchment Hydrology, Australia 2006

Legal Information

To the extent permitted by law, the CRC for Catchment Hydrology (including its employees and consultants) accepts no responsibility and excludes all liability whatsoever in respect of any person's use or reliance on this publication or any part of it.

Acknowledgements

The input and assistance of Catchment Modelling Toolkit Product Managers and TIME developers is gratefully acknowledged.

Contents

1 Introduction
  1.1 The user guide
  1.2 File Formats
    1.2.1 Overview
    1.2.2 Features
    1.2.3 Audience
    1.2.4 Limitations

2 File Format Overview
  2.1 Time Series Formats
    2.1.1 .cdt Comma Delimited Column Time Series
      TIMEExample.cdt
      Daily time series
      Six minute time series
      Annual time series
      Daily time series in ISO format
    2.1.2 .sdt Space Delimited Column Time Series
      TIMEExample.sdt
    2.1.3 .tts Tarsier Daily Time Series
      TIMEExample.tts
    2.1.4 .ar1 Annual Stochastic
      TIMEExample.ar1
    2.1.5 .awb AWBM Daily Time Series
      TIMEExample.awb
    2.1.6 .bsb SWAT BSB File Time Series
      TIMEExample.bsb
    2.1.7 .bsm Bureau of Meteorology 6 Minute (was .pluv )
      Data Records
      TIMEExample.bsm
    2.1.8 .csv Comma Separated Variable
      TIMEExample.csv
    2.1.9 .dat Fixed Format Time Series Files (F.Chiew)
      TIMEExample.dat
    2.1.10 .iqqm IQQM Time Series
      TIMEExample.iqqm
    2.1.11 .mrf MFM Monthly Rainfall Files
      TIMEExample.mrf
    2.1.12 .pcp Swat Daily Time Series
      TIMEExample.pcp
    2.1.13 .silo5 QDNR SILO Time Series (was .txt )
      TIMEExample.silo5
    2.1.14 .silo8 SILO Standard Time Series (8 Variable)
      TIMEExample.silo8
  2.2 Raster Formats
    2.2.1 .asc Arc ASCII Grids
      TIMEExample.asc
    2.2.2 .mwasc Map Window Ascii Grids
      TIMEExample.mwasc
    2.2.3 .tapesg TapesG Grid files
      TIMEExample.tapesg
    2.2.4 .flt ESRI Binary Raster Interchange format

Appendix A Zero-padding data in Excel

Appendix B Times and Dates in Data Files
  B.1 Formats for CDT and CSV files
  B.2 The ISO8601 standard for representing dates and times

1 Introduction

Many Catchment Modelling Toolkit products and the TIME software development system have the capacity to load and save data in a wide range of formats. Adhering to these formats across Toolkit applications will increase the usability and sharing of data files, and improve the software user experience.

1.1 The user guide

This document provides an overview of file formats commonly used in TIME and relevant Catchment Modelling Toolkit products, with associated file extensions. Small examples of data files for relevant text formats are also provided.

1.2 File Formats

1.2.1 Overview

Clearly specifying file formats makes it easier to prepare files, transfer files between applications, and to interpret modelling results in output files.

The file formats included in this manual are:

Time Series (text)

.cdt Comma Delimited Column Time Series

.sdt Space Delimited Column Time Series

.tts Tarsier Daily Time Series

.ar1 Annual Stochastic

.awb AWBM Daily Time Series

.bsb Swat BSB File Time Series

.bsm Bureau of Meteorology 6 Minute (was .pluv )

.csv Comma Separated Variable

.dat Fixed Format Time Series Files (F.Chiew)

.iqqm IQQM Daily Time Series

.mrf MFM Monthly Rainfall Files

.pcp Swat Daily Time Series

.silo5 QDNR SILO Time Series Files (was .txt )

.silo8 SILO Standard Time Series (8 Variable)

Raster (text)

.asc Arc Ascii Grids

.mwasc Map Window Ascii Grids

.tapesg TapesG Grid files

.tile Tiled Raster Files

.tra Tarsier Raster Files

Others

There are several GIS, Graphics and other formats recognised by TIME but not included in this manual. These are:

.flt ESRI Binary Raster Interchange format

.mif MapInfo Interchange

.shp ESRI Shape files (SHP)

.jpg GEO JPG Image

.jpeg GEO JPEG Image

.tif GEOTif Image

.tiff GEOTiff Image

.tile tile raster file

.tra Tarsier raster file (binary)

.tne Tarsier Node Link Network Files

.tsd Tarsier Sites Data Files

1.2.2 Features

File types, formats and extensions are provided in this manual, covering time series and raster files.

1.2.3 Audience

This document is intended for use by software developers and users, as well as those preparing documentation for Toolkit products.

1.2.4 Limitations

This document does not provide specifications for common external file formats, such as .shp. For the shp file format, see http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf.

2 File Format Overview

2.1 Time Series Formats

2.1.1 .cdt Comma Delimited Column Time Series

A comma delimited column time series format file is an ASCII text file that contains regular (periodic) time series data. The file commonly has no header line, but can support a single line header specifying “Date, Time series 1” if required.

There are two columns of data (see footnote 1), with the first column being a date string, followed by the time series value. The date string may be specified in two formats: the ISO date-time string (e.g. 2000-01-24, 1998-12), or the local culture settings of the machine (e.g. 24/01/2000 for Australia), i.e. typically what Microsoft Excel will produce by default. Both can be recognised unambiguously on a given machine, but the latter will not necessarily be transferable to another machine. You are thus encouraged to use the ISO format (see Appendix B for more details).

See also Appendix B, Times and Dates in Data Files. All time series data readings assume that the reading is at time 00:00 at the beginning of the day/month/year "date" field.

The columns are separated by a comma. Annual data can be entered using month number as 01 (e.g. 01/1995; 01/1996).

Note: We strongly recommend using zero padding, e.g. use "01/1996" for January 1996, not "1/1996". See Appendix A for instructions on how to zero-pad files in Microsoft Excel 2003 (the instructions will be similar for other versions).

1 Many cultures use a comma as the decimal separator, and Excel then uses a semicolon as the field separator in its equivalent of “CSV” files (arguably a better choice, since the comma may also serve as the thousands separator, as in Australia). In other words, the default output of Excel will likely be unreadable or, worse, be read with incorrect values on many machines with a non-English locale.

TIMEExample.cdt

Daily time series

01/2000,2600
02/2000,2454
03/2000,2270
04/2000,2224
05/2000,2201
06/2000,2061
07/2000,1963
08/2000,1985
09/2000,71
10/2000,24
11/2000,23
12/2000,45
01/2001,43

Six minute time series

2000-01-01,00:00,0
2000-01-01,00:06,1
2000-01-01,00:12,2
2000-01-01,00:18,3
2000-01-01,00:24,4

Annual time series

2000,0
2001,1
2002,2

Daily time series in ISO format

2000-01-01,0.9,0.9
2000-02-01,0.7,0.9
2000-03-01,0.9,0.9
2000-04-01,0.4,0.9
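As an illustration of how records like these could be read, here is a minimal Python sketch. The function name `parse_cdt_line` is hypothetical and is not part of TIME. It assumes ISO dates, mm/yyyy dates or bare years, and it deliberately does not attempt locale-specific forms such as 24/01/2000, since those depend on the machine.

```python
from datetime import datetime

def parse_cdt_line(line):
    """Split one .cdt record into (timestamp, [values])."""
    fields = [f.strip() for f in line.split(",")]
    stamp, rest = fields[0], fields[1:]
    # A second field such as "00:06" is a time of day (six minute example).
    if rest and ":" in rest[0]:
        stamp += "T" + rest[0]
        rest = rest[1:]
    # Try the most specific layout first (assumed set of layouts).
    for fmt in ("%Y-%m-%dT%H:%M", "%Y-%m-%d", "%Y-%m", "%m/%Y", "%Y"):
        try:
            when = datetime.strptime(stamp, fmt)
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unrecognised date: {stamp!r}")
    return when, [float(v) for v in rest]

print(parse_cdt_line("2000-01-01,00:06,1"))
```

Dates without a day or month resolve to the start of the period, which matches the 00:00 convention described above.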

2.1.2 .sdt Space Delimited Column Time Series

A space delimited column time series format file is an ASCII text file that contains time series data. There is no header line in the file. There are four columns of data, being:

• Year (yyyy)

• Month (mm)

• Day (dd), and

• Value

separated by a space or tab character. Monthly and annual data can be entered using month and/or day number as 01. These files can be created in a spreadsheet application by saving correctly formatted columns to a text (txt) format.

TIMEExample.sdt

2000 1 1 14.000
2000 1 2 2.000
2000 1 3 16.000
2000 1 4 9.000
2000 1 5 13.000
2000 1 6 15.000
2000 1 7 1.000
2000 1 8 19.000
2000 1 9 45.000
2000 1 10 0.000
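A record of this kind can be read with very little code. The sketch below is illustrative only; `parse_sdt_line` is a hypothetical name, and the function assumes exactly four whitespace-separated fields per line.

```python
from datetime import date

def parse_sdt_line(line):
    """Return (date, value) from a 'yyyy mm dd value' record."""
    year, month, day, value = line.split()  # splits on spaces or tabs
    return date(int(year), int(month), int(day)), float(value)

print(parse_sdt_line("2000 1 9 45.000"))
```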

2.1.3 .tts Tarsier Daily Time Series

The Tarsier daily time series format file is an ASCII text file that contains daily time series data. The file has a 21 line header followed by daily data values.

The lines of information contained within the header are:

Line Purpose

1 The Tarsier version number header

2 Reference to author of Tarsier

3 File path and name

4 Name of software used to create the file

5 Date and time file was created

6 Tarsier time series data class (e.g. TTimeSeriesData)

7 File version number

8 Number of header lines (set to 1)

9 1.

10 Number of daily data entries in the file

11 'Xlabel' is always Date/Time for time series data

12 'Y1Label Y1' fixed field

13 'Y2Label Y2' fixed field

14 Data units

15 Format information (e.g. 1)

16 Grid position east in metres

17 Grid position north in metres

18 'Latitude' followed by the latitude of the site in degrees

19 'Longitude' followed by the longitude of the site in degrees

20 'Elevation' followed by the elevation of the site in metres

21 Header character

The data is organized in four columns separated by spaces. The first column is the year, followed by the Julian day. The third column is the data value. The fourth column is a data quality code: '.' means the value is OK and '-' means it is missing.

TIMEExample.tts

Tarsier modelling framework, Version 2.0.
: Created by Fred Watson.
: File Name : C:\data\TIME\TIMEExample.tts
: Generated from TIME Framework
: Date : 24/12/2004 11:59:30 PM
: File class: TTimeSeriesData.
FileVersion unknown
HeaderLines 1
1.
NominalNumEntries 10
XLabel Date/Time
Y1Label Y1
Y2Label Y2
Units mm.day^-1
Format 1
Easting 0.000000
Northing 0.000000
Latitude 0.000000
Longitude 0.000000
Elevation 0.000000
*
2001 363 0 .
2001 364 0 .
2001 365 10.2 .
2002 1 0 .
2002 2 0 .
2002 3 0 .
2002 4 16 .
2002 5 16 .
2002 6 56 .
2002 7 0 .
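The data section of a .tts file could be read along the following lines. This is a sketch, not TIME's reader: the function names are hypothetical, the header is assumed to be exactly 21 lines, and "Julian day" is assumed to mean the day of the year (1 to 365 or 366).

```python
from datetime import date, timedelta

def parse_tts_record(line):
    """Return (date, value or None) from a 'year day value quality' record."""
    year, day_of_year, value, quality = line.split()
    when = date(int(year), 1, 1) + timedelta(days=int(day_of_year) - 1)
    # '.' marks a good value; anything else (e.g. '-') is treated as missing.
    return when, (float(value) if quality == "." else None)

def read_tts(lines, header_lines=21):
    """Skip the fixed header, then parse every non-blank data line."""
    return [parse_tts_record(l) for l in lines[header_lines:] if l.strip()]

print(parse_tts_record("2001 365 10.2 ."))
```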

2.1.4 .ar1 Annual Stochastic

This format contains replicates of annual time series data generated using the AR(1) stochastic method. The file has two header lines, consisting of:

• the site name, and

• the number of years per replicate, and number of replicates, separated by space

The replicate data sets then follow, each with a single line header with the replicate number, followed by a single line containing all the data for that replicate, separated by spaces.

Note This format is not the same as the AR(1) format (.gen) generated and exported by the Stochastic Climate Library.

TIMEExample.ar1

Melbourne Stochastic Annual Rainfall
20 2
1
785.307 679.228 621.467 558.083 519.046 450.351 789.494 821.882 646.454 549.185 499.563 468.864 850.161 691.136 702.094 607.146 720.839 793.515 586.834 586.203
2
600.649 693.050 605.785 858.584 555.068 642.778 840.894 799.987 629.751 518.411 811.712 489.935 905.870 738.260 589.943 485.806 491.066 540.803 912.968 671.630
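The layout described above (two header lines, then a number line and a data line per replicate) can be read as in this sketch. `read_ar1` is a hypothetical helper, and it assumes each replicate's data sits on a single line, as the description states.

```python
def read_ar1(text):
    """Return (site name, {replicate number: [annual values]})."""
    lines = [l.strip() for l in text.splitlines() if l.strip()]
    site = lines[0]
    years, count = (int(t) for t in lines[1].split())
    replicates = {}
    for i in range(count):
        number = int(lines[2 + 2 * i])            # replicate header line
        values = [float(v) for v in lines[3 + 2 * i].split()]
        if len(values) != years:
            raise ValueError(f"replicate {number}: expected {years} values")
        replicates[number] = values
    return site, replicates
```

Checking each replicate against the declared number of years catches files where a long data line has been wrapped by an editor.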

2.1.5 .awb AWBM Daily Time Series

An AWBM daily time series format file is an ASCII text file that contains daily time series data. There is no header line in the file. The data are organized in rows of one month of data separated by spaces. The first entry in a row is the number of days in the month. This is followed by data values for each day in the month. The data values are followed by the year and month of the data.

Note Dates were optional in the original AWBM file format, but are not optional in TIME.

TIMEExample.awb

31 0 0 0.8 0 0 0 0 0 0 1.2 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 1
28 0 0 0 0 0 0 0 0 0.2 0 0 0 0 0 0 0 0 0 0.8 0 0 0 0 0 0 0 0 0 0 2001 2
31 0 0 0 1.2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.2 0 0 0 0 0 0 0 0 0 0 2001 3
30 0 0 0 0 0 0 0 0.4 0 0 0 0 0 0 0 67.4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 4
31 0 45.3 0 0 0 0 0 0 0 0 0 1.2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 5
30 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3.6 0 0 0 0 0 0 0 0 0 0 2001 6
31 0 0 0 8.4 0 0 0 0 0 45.6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 7
31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 8
30 0 0 0 0 0.6 6 12.3 45 0 4 0 0 0 0 0 0 0 0 0 0 3.5 0 0 0 0 0 0 0 0 0 2001 9
31 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7.2 0 0 0 0 0 0 0 0 0 0 0 0 0 2001 10
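One way to split a monthly row into its parts is sketched below. `parse_awb_row` is a hypothetical name; the sketch takes the year and month from the last two fields and treats everything between the leading day count and those fields as daily values, so it reports the declared count rather than enforcing it.

```python
def parse_awb_row(line):
    """Return (year, month, declared_days, daily_values) for one row."""
    tokens = line.split()
    declared_days = int(tokens[0])          # first entry: days in the month
    year, month = int(tokens[-2]), int(tokens[-1])
    values = [float(v) for v in tokens[1:-2]]
    return year, month, declared_days, values

year, month, days, values = parse_awb_row("3 0 1.5 0 2001 2")
print(year, month, days == len(values))
```

Comparing `declared_days` with `len(values)` is a cheap consistency check before loading a file.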

2.1.6 .bsb SWAT BSB File Time Series

This is a line-based fixed format file, of the kind typical of FORTRAN programs. Each line consists of data for a particular basin and time step. The header line gives the fields for the file, with each data line providing the basin name, identifier, MON, area, and a value for each data field (e.g. precipitation). Data are given for every basin at one time step, followed by every basin at the next time step. For more details refer to the SWAT manual.

TIMEExample.bsb

SUB GIS MON AREAkm2 PRECIPmm
BoggyCk 1 0 1.14170E+02 0.000
SwampyCk 2 0 1.12000E+02 0.000
DampCk 3 0 1.79000E+02 0.000
SoggyCk 4 0 10.0200E+02 0.000
MuddyR 5 0 10.0200E+02 0.000
BoggyCk 1 0 1.14170E+02 1.200
SwampyCk 2 0 1.12000E+02 5.600
DampCk 3 0 1.79000E+02 9.200
SoggyCk 4 0 10.0200E+02 4.000
MuddyR 5 0 10.0200E+02 3.200

2.1.7 .bsm Bureau of Meteorology 6 Minute (was .pluv )

A fixed format file type, typically supplied by the Australian Bureau of Meteorology for 6 minute pluviograph data. The file has two header lines, with the first containing the station identifier and an integer (1), and the second the station number and name. Data lines start with the station number, followed by the date (yyyymmdd), then 240 values of 6 minute data for that day. All values are on a fixed spacing when supplied, but space separated values can also be read by TIME. Several details of the bsm file format are explained below, taken from a note from the Bureau of Meteorology.

Note 1 All data available in the computer archive are provided. However very few sites have a completely uninterrupted historical record, with no gaps. Such gaps or missing data may be due to many reasons from illness of the observer to a broken instrument. A site may have been closed, reopened, upgraded or downgraded during its existence, possibly causing breaks in the record of any particular element.

Note 2 Final quality control for any element usually occurs once the manuscript records have been received and processed, which may be 6-12 weeks after the end of the month. Thus quality controlled data will not normally be available immediately, in 'real time'.

Data Records

Record Type 1

Element No.  Start Byte  End Byte  Record Size  Description      Units/Values
1            1           6         6            Station Number
2            7           15        9            Blank
3            16          16        1            Record type
4            17          N                      Blank

Record Type 2

Element No.  Start Byte  End Byte  Record Size  Description      Units/Values
5            1           6         6            Station Number
6            7           15        9            Blank
7            16          16        1            Record type
8            17          20        4            Blank
9            21          54        34           Station Name
10           55          N                      Blank

Record Type 3

Element No.  Start Byte  End Byte  Record Size  Description                        Units/Values
11           1           6         6            Station Number
12           7           12        6            Blank
13           13          16        4            Year                               YYYY
14           17          18        2            Month                              MM
15           19          20        2            Day (Midnight to Midnight)         DD
16           21          27        7            Rainfall value of first interval   See Notes.
17           28          34        7            Rainfall value of second interval  See Notes.
18           35          41        7            Rainfall value of third interval   See Notes.
19           42          N                      Rainfall value of other intervals  See Notes.

Notes

• Fortran format F7.1 i.e. a field width of seven bytes with one decimal place. A rainfall of 2mm will show up as a figure of 20.0 tenths of mm.

• Rainfall values will be:

• 0 = Intervals of zero rain.

• +ve = Value for intervals when total for the period is known.

• -9999.0 = Intervals when no data is available.

• -8888.0 = Intervals when rain fell but the total is known only for a period of several intervals. This total is entered as a negative value in the last interval of the accumulated period.

• The length of each record depends on the interval required.

• Record length (N) = 20 + 7 * (1440 / Interval)

• eg. Interval = 6 minutes, N = 20 + 7 * (1440 / 6) = 1700

• Record types 1 and 2 occur at the beginning of each station's data and are followed by a number of records of type 3.

• If there is zero rain for the whole day, no record is written for that day.

If the entire month is missing, no records are written unless padding indicator is set. Then records with days filled with missing values (-9999.0) are written. However if complete years are missing, no attempt is made to write these dummy records.

TIMEExample.bsm

61078 1
61078 2 WILLIAMTOWN RAAF
61078 19521231 .0 .0 .0 [etc., 240 values]
61078 1953 1 1 .0 .0 .0 [etc., 240 values]
61078 1953 1 3 .0 .2 .0 [etc., 240 values]
61078 1953 115 .0 .0 .2 [etc., 240 values]
61078 1953 118 .0 .0 .0 [etc., 240 values]
61078 1953 212 .0 .0 .0 [etc., 240 values]
61078 1953 213 .0 .0 .0 [etc., 240 values]
61078 1953 214 .0 .0 .0 [etc., 240 values]
61078 19521231 .0 .0 .0 [etc., 240 values]
61078 19521231 .0 .0 .0 [etc., 240 values]
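The byte positions given for record type 3 and the record length formula translate directly into slicing. The following sketch uses hypothetical function names and assumes a strictly fixed-width record as supplied by the Bureau; it does not handle the space separated variant that TIME can also read. Values are returned as stored, in tenths of a millimetre.

```python
def record_length(interval_minutes):
    """Record length N = 20 + 7 * (1440 / interval)."""
    return 20 + 7 * (1440 // interval_minutes)

def parse_type3(record):
    """Split a fixed-width type 3 record into station, date parts, values."""
    record = record.rstrip("\r\n")
    station = record[0:6].strip()      # bytes 1-6
    year = int(record[12:16])          # bytes 13-16
    month = int(record[16:18])         # bytes 17-18
    day = int(record[18:20])           # bytes 19-20
    # Each rainfall field is Fortran F7.1: seven characters wide.
    values = [float(record[i:i + 7]) for i in range(20, len(record), 7)]
    return station, (year, month, day), values

print(record_length(6))
```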

2.1.8 .csv Comma Separated Variable

A comma separated variable file is an ASCII text file that contains regular (periodic) time series data. There are two columns of data, with the first column being a date string (e.g. 01/01/2003; 01/1995), followed by the time series value. The two values are separated by a comma. Annual data can be entered using month number as 01 (e.g. 01/1995; 01/1996). The file may have a header line with a title for each column. The first column header is usually "Date" followed by a relevant data column header (e.g. flow, velocity, natural, historic).

TIMEExample.csv

1/01/1974,336.9
2/01/1974,771.2
3/01/1974,837.7
4/01/1974,940.2
5/01/1974,1249.3
6/01/1974,1445.4
7/01/1974,974.7
8/01/1974,797.0
9/01/1974,849.8
10/01/1974,168.4

2.1.9 .dat Fixed Format Time Series Files (F.Chiew)

This two-column daily time series file has a fixed date format of yyyymmdd in the first column (starting at column position 3 from the left margin), and the value in the second column. Spaces are used in place of the leading digit for single digit days or months (e.g. 1997_227______3.00, where each underscore stands for a space).

TIMEExample.dat

19691228 0.00
19691229 0.00
19691230 3.00
19691231 2.00
1970 1 1 12.68
1970 1 2 0.00
1970 1 3 17.10
1970 1 4 3.53
1970 1 5 0.00
1970 1 6 0.00
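Because single digit months and days are space padded, the date has to be read by position rather than by splitting on whitespace. The sketch below shows the idea with a hypothetical function `parse_dat_line`; it assumes the date occupies columns 3 to 10 and that the value fills the rest of the line.

```python
from datetime import date

def parse_dat_line(line):
    """Return (date, value) from a fixed-width .dat record."""
    stamp = line[2:10]                      # yyyymmdd, space padded
    when = date(int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8]))
    return when, float(line[10:])

print(parse_dat_line("  1970 1 1     12.68"))
```

Python's `int` ignores surrounding spaces, so a field such as " 1" reads as 1 without special handling.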

2.1.10 .iqqm IQQM Time Series

An IQQM time series format file is an ASCII text file that contains daily, monthly or annual time series data. The file has a five line header followed by tables of data. The five lines of information contained within the header are:

• Title line – a 40-character string detailing how the file was created. It is suffixed with a date and time of creation;

• Site name – a 40-character string;

• Type – a 15 character string specifying the data type (eg. precipitation, evaporation or gauged flow);

• Units – a 10 character string specifying the units of data (eg. mm, mm*0.1, ML/d); and

• Date – specifies the time span of the time series data and the time interval of data stored within the file.

In daily IQQM format files the daily data is grouped in tables of yearly data with 31 columns representing each day in a month and 12 rows for each month of the year. At the end of each row of daily data is a monthly total and at the end of each table is a yearly total. These tables are repeated for the number of years of time series data.

The first line of each table specifies the year related to the data within the table. This year may optionally be followed by "factor=" and a number, which is a factor that is applied to the whole of the table. If no factor is given then no factor is applied to the table.

The time series data values may be suffixed by a special character, which modifies the time series data value. The special characters and their functions are described below:

'*' Indicates the time series value is to be multiplied by 1000;

'e' Indicates the time series value is estimated;

'E' Indicates the time series value is estimated and should be multiplied by 1000; NOTE – TIME does not act on these quality codes at present.

'n' Indicates the time series value is negative. Note negative numbers not followed by a 'n' are assumed to be missing;

'N' Indicates the time series value is negative and is to be multiplied by 1000 Note negative numbers not followed by a 'N' are assumed to be missing; and

'?' Indicates that the time series value is missing. Note typically missing values are flagged as "-1?". A negative value not suffixed with a "n" or "N" is also considered as missing e.g. "-1 " is considered as a missing value.
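The suffix rules above can be expressed as a small decoder. This is a sketch of the rules as written, not of TIME's behaviour (which, as noted, does not currently act on the quality codes). The name `decode_iqqm_value` is hypothetical, and the sketch assumes that a negative value carries its minus sign in the number itself, with 'n' or 'N' confirming that it is genuine. The table-wide factor is not applied here.

```python
def decode_iqqm_value(token):
    """Return the numeric value, or None when the entry is missing."""
    token = token.strip()
    if not token:
        return None
    suffix = token[-1] if token[-1] in "*eEnN?" else ""
    number = float(token[:-1] if suffix else token)
    # '?' or an unflagged negative number both mean "missing".
    if suffix == "?" or (number < 0 and suffix not in ("n", "N")):
        return None
    if suffix in ("*", "E", "N"):
        number *= 1000
    return number

print([decode_iqqm_value(t) for t in ("165e", "33*", "-1?", "97")])
```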

TIMEExample.iqqm

Title: Winding River at Middle Gauge Date:30/07/2003 Time:17:47:24.66
Site : Winding River
Type : Flow
Units: ML/d
Date : 01/01/1985 to 31/12/2000 Interval : Daily
Year:1985 Factor= 0.1E-01
----------------------------------------- ----------------------
01 02 03 04 05 06 30 31 Total
----------------------------------------- ----------------------
Jan 0 0 0 0 165e 97 3070 2509 10956
Feb 2040 1666 1337 1081 -1? 870 -1?
Mar 545 450 358 286 217 216 1987 7309 45937
Apr 6910 5528 4520 4818 4947 3982 9081 219146
Dec 33* 32558 31655 30729 29827 29154 19364 19051 767745
----------------------------------------- ----------------------
14627179

2.1.11 .mrf MFM Monthly Rainfall Files

A text file format containing a header line followed by a line giving the number of months of data. Data are formatted in lines with year given first, followed by 12 monthly values, all space separated.

TIMEExample.mrf

Swiftflow River @ Wooden Bridge
10
1985 12 23 34 45 35 46 56 67 65 43 11 12
1986 13 23 34 45 5 46 6 67 65 3 11 72
1987 14 23 34 45 35 46 56 67 65 43 11 82
1988 15 3 45 35 46 56 6 65 43 11 12 2
1989 16 23 34 45 35 46 56 67 6 43 11 18
1990 22 2 34 45 35 46 5 67 65 43 11 15
1991 22 23 34 45 35 46 56 67 65 43 15 12
1992 25 3 34 45 3 46 5 67 65 4 11 17
1993 15 23 3 45 35 46 56 67 6 43 11 15
1994 16 2 34 45 35 46 5 67 65 43 11 14
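The year-per-row layout can be flattened into a monthly series as in the sketch below. `parse_mrf_rows` is a hypothetical helper; it skips the two leading lines without interpreting them and assumes every data row holds a year followed by exactly 12 values.

```python
def parse_mrf_rows(lines):
    """Map (year, month) -> value from 'year v1 ... v12' rows."""
    series = {}
    for line in lines[2:]:                  # skip header and count lines
        tokens = line.split()
        if not tokens:
            continue
        if len(tokens) != 13:
            raise ValueError(f"expected year + 12 values: {line!r}")
        year = int(tokens[0])
        for month, value in enumerate(tokens[1:], start=1):
            series[(year, month)] = float(value)
    return series

rows = ["Site", "12", "1985 12 23 34 45 35 46 56 67 65 43 11 12"]
print(parse_mrf_rows(rows)[(1985, 5)])
```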

2.1.12 .pcp Swat Daily Time Series

A SWAT daily rainfall time series format file is an ASCII text file that contains daily time series rainfall data. The file has a four line header followed by daily data values.

The four lines of information contained within the header are:

• The SWAT file description header

• 'Lati' followed by the latitude of the site in degrees

• 'Long' followed by the longitude of the site in degrees

• 'Elev' followed by the elevation of the site in metres

The data is organized in one column, in format yyyydddvvv.v, where yyyy is the year, ddd is the Julian day, and vvv.v is the data value to one decimal place (5 characters).

TIMEExample.pcp

Precipitation Input File TIMEExample.pcp 20041224 AVSWAT2000 - SWAT interface MDL
Lati 14.77
Long 102.7
Elev 167
1985001000.0
1985002000.0
1985003000.0
1985004000.0
1985005000.0
1985006000.0
1985007000.0
1985008000.0
1985009000.0
1985010000.0
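Since the record has no separators, it is read by character position. The sketch below follows the yyyydddvvv.v layout described above; `parse_pcp_record` is a hypothetical name, and "Julian day" is assumed to mean the day of the year.

```python
from datetime import date, timedelta

def parse_pcp_record(record):
    """Split 'yyyydddvvv.v' into (date, value)."""
    year = int(record[0:4])
    day_of_year = int(record[4:7])
    value = float(record[7:12])             # five characters, one decimal
    return date(year, 1, 1) + timedelta(days=day_of_year - 1), value

print(parse_pcp_record("1985010000.0"))
```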

2.1.13 .silo5 QDNR SILO Time Series (was .txt )

A QDNR SILO daily time series format file is an ASCII text file that contains daily time series data. There is no header line in the file. The data is organized in five columns separated by spaces. The first four columns are the date in year, month, day and Julian day format. The fifth column is the data value.

TIMEExample.silo5

2001 12 29 363 0
2001 12 30 364 0
2001 12 31 365 10.2
2002 1 1 1 0
2002 1 2 2 0
2002 1 3 3 0
2002 1 4 4 0
2002 1 5 5 12
2002 1 6 6 11
2002 1 7 7 8

2.1.14 .silo8 SILO Standard Time Series (8 Variable)

The .silo8 format contains the full 8 column daily data set from the SILO data base. The file can have multiple header lines, enclosed in inverted commas. Data, in order, are maximum temperature, minimum temperature, rainfall, evaporation, radiation, vapour pressure, maximum relative humidity, minimum relative humidity.

TIMEExample.silo8

Big River Station
12 23 34 67 65 43 11 12
13 23 34 45 65 3 11 72
14 23 34 45 35 46 56 67
15 3 45 35 46 56 6 65
16 23 34 45 35 46 11 18
22 2 34 46 5 67 65 43
22 23 34 45 65 43 15 12
45 3 46 5 67 65 4 11
15 45 35 46 56 67 6 43
34 45 35 46 5 67 65 34

2.2 Raster Formats

2.2.1 .asc Arc ASCII Grids

A space delimited grid file, with a 6 line header. The header lines are:

• ncols aa

• nrows bb

• xllcorner cc

• yllcorner dd

• cellsize ee

• NODATA_value ff

Where:

aa is the number of columns, bb the number of rows, cc and dd the coordinates of the lower left corner of the grid (may be set to 0), ee is the cell side length, and ff is the null data value (e.g. -9999).

Values are arranged in space delimited rows and columns, reflecting the structure of the grid. Units for cell size length depend upon the input data, and could be either geographic (eg degrees) or projected (eg. m, km). Units are generally determined by the using application, and m is common for most TIME-based applications.

Arcinfo grid coverages can be converted to .asc files using Arc's GRIDASCII command. .asc files can be imported into Arc using the ASCIIGRID command.

TIMEExample.asc

ncols 10
nrows 10
xllcorner 0
yllcorner 0
cellsize 100
NODATA_value -9999
5 6 3 4 4 5 6 6 6 4
4 4 2 4 4 5 4 5 6 6
3 3 3 3 3 4 4 4 5 6
2 2 2 2 2 3 3 3 4 5
2 9 2 2 2 2 2 2 3 4
5 6 4 4 4 4 2 2 2 3
5 6 3 3 3 3 2 2 2 2
5 6 2 2 2 2 9 8 2 2
5 6 8 2 2 2 4 5 4 4
4 5 6 6 4 2 3 4 4 4
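A grid in this layout can be loaded with a short routine such as the sketch below. `read_asc` is a hypothetical helper; it assumes the six header lines appear in the order listed above and that each grid row occupies one line of text.

```python
def read_asc(text):
    """Return (header dict, grid rows with NODATA cells set to None)."""
    lines = [l for l in text.splitlines() if l.strip()]
    header = {}
    for line in lines[:6]:
        key, value = line.split()
        header[key.lower()] = float(value)
    nodata = header["nodata_value"]
    grid = [[None if float(v) == nodata else float(v) for v in line.split()]
            for line in lines[6:]]
    # Check the grid against the declared dimensions.
    if len(grid) != int(header["nrows"]) or any(
            len(row) != int(header["ncols"]) for row in grid):
        raise ValueError("grid shape does not match ncols/nrows")
    return header, grid
```

The first grid row in the file is the northernmost row, so `grid[-1][0]` is the cell at the lower left corner given by xllcorner and yllcorner.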

2.2.2 .mwasc Map Window Ascii Grids

The .mwasc ASCII grid is similar to .asc except the header rows xllcorner and yllcorner are replaced by xllcenter and yllcenter – that is, the coordinates are offset by 1/2 cellsize. The header rows do not have titles. Thus there are six header rows with parameters only, followed by the gridded data.

TIMEExample.mwasc

10
10
0
0
1
-9999
5 6 3 4 4 5 6 6 6 4
4 4 2 4 4 5 4 5 6 6
3 3 3 3 3 4 4 4 5 6
2 2 2 2 2 3 3 3 4 5
2 9 2 2 2 2 2 2 3 4
5 6 4 4 4 4 2 2 2 3
5 6 3 3 3 3 2 2 2 2
5 6 2 2 2 2 9 8 2 2
5 6 8 2 2 2 4 5 4 4
4 5 6 6 4 2 3 4 4 4
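The half-cell offset between the two conventions is simple arithmetic, sketched below. Both function names and the key names are hypothetical, and the header order is assumed to match the .asc header (columns, rows, x, y, cell size, null value).

```python
MWASC_KEYS = ("ncols", "nrows", "xllcenter", "yllcenter",
              "cellsize", "nodata_value")

def read_mwasc_header(lines):
    """Pair the six untitled header rows with assumed parameter names."""
    return dict(zip(MWASC_KEYS, (float(l) for l in lines[:6])))

def centre_to_corner(xllcenter, yllcenter, cellsize):
    """Lower-left cell centre -> lower-left corner of that cell."""
    half = cellsize / 2.0
    return xllcenter - half, yllcenter - half

header = read_mwasc_header(["10", "10", "0", "0", "1", "-9999"])
print(centre_to_corner(header["xllcenter"], header["yllcenter"],
                       header["cellsize"]))
```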

2.2.3 .tapesg TapesG Grid files

A three column raster data format, with space separated values. Each line consists of the X coordinate, Y coordinate, and value.

TIMEExample.tapesg

1 1 167
2 1 165
3 1 152
4 1 133
5 1 134
1 2 144
2 2 151
3 2 156
4 2 146
5 2 132
1 3 150
2 3 141
3 3 139
4 3 135
5 3 131
1 4 158
2 4 151
3 4 153
4 4 146
5 4 136
1 5 158
2 5 160
3 5 160
4 5 153
5 5 148
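Triples like these can be gathered into a two-dimensional grid as in the following sketch. `tapesg_to_grid` is a hypothetical name. Rows are ordered by ascending Y, which is an assumption; whether the first row should be treated as north or south depends on the coordinate system of the data.

```python
def tapesg_to_grid(lines):
    """Build {(x, y): value} plus a row-major grid from 'x y value' lines."""
    cells = {}
    for line in lines:
        if line.strip():
            x, y, value = line.split()
            cells[(float(x), float(y))] = float(value)
    xs = sorted({x for x, _ in cells})
    ys = sorted({y for _, y in cells})
    # Cells absent from the file come back as None.
    grid = [[cells.get((x, y)) for x in xs] for y in ys]
    return cells, grid

cells, grid = tapesg_to_grid(["1 1 167", "2 1 165", "1 2 144", "2 2 151"])
print(grid)
```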

2.2.4 .flt ESRI Binary Raster Interchange format

Arcinfo grid coverages can be converted to .flt files using Arc's GRIDFLOAT command. .flt files can be imported into Arc using the FLOATGRID command.


Appendix A Zero-padding data in Excel

Certain file formats require data to be zero-padded. In the table below, the first column represents months and is not zero-padded. Some applications, including Toolkit products, will sort this data as shown in the second column, which is clearly not correct. The third column is zero-padded and sorts correctly.

Non zero-padded    What happens after sorting    Zero-padded data
                   (some applications)           (will sort correctly)
1                  1                             01
2                  10                            02
3                  11                            03
4                  12                            04
5                  2                             05
6                  3                             06
7                  4                             07
8                  5                             08
9                  6                             09
10                 7                             10
11                 8                             11
12                 9                             12
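The sorting behaviour in the table arises whenever values are compared as text rather than as numbers. The short Python sketch below reproduces it; the helper name zero_pad is hypothetical and is used here only for illustration.

```python
def zero_pad(values, width):
    """Left-pad each value with zeroes to the given width."""
    return [str(v).zfill(width) for v in values]


months = [str(m) for m in range(1, 13)]

# Text sorting compares character by character, so "10" sorts before "2".
print(sorted(months))

# Once padded to a common width, text order and numeric order agree.
padded = zero_pad(months, 2)
print(sorted(padded))
```

The same helper with a width of 3 would pad Julian days.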

How to zero-pad data in Excel 2003

In this example, we will pad numbers representing the months, for example in the CDT format (see page 3).

1 Select the column representing the data


2 Right-click the selection and select Format Cells...

3 Click the Number tab, then in the Categories list, click Custom, then select General in the Type list.

4 Type in as many zeroes as there are digits in the data field. For example, "months" has two digits, so type two zeroes (00) in the Type field.


In the Sample area at the top, you can see what the padded number looks like.

If you wanted to pad Julian days (which have 3 digits), you would type three zeroes into the Type field.

5 Click OK, and Excel pads the data.


Appendix B Times and Dates in Data Files

B.1 Formats for CDT and CSV files

Line format

The format of each line in a CDT, SDT or CSV file is:

Date,value1,value2,value3,value4

Missing Entries

Missing entries must be specified as -9999. Empty strings or whitespace between commas are also read as missing values.

Decimal separators

Always use a dot as a decimal separator for numerical values, irrespective of the local culture/language/locale settings for Windows.

Headers

There may be a header on the first line specifying the names of the time series in the columns. Each element in the header must also be separated by commas, e.g.

Date,rainfall,evaporation,temperature
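The line format, missing-entry and header rules above can be sketched in Python as follows. This is an illustration, not the TIME implementation: the function names are hypothetical, and treating any first line that does not start with a digit as a header is an assumption made for this sketch.

```python
MISSING = -9999.0


def parse_value(field):
    """Return a float, or None for -9999, an empty string or whitespace."""
    field = field.strip()
    if field == "":
        return None
    value = float(field)  # float() expects a dot as the decimal separator
    return None if value == MISSING else value


def parse_line(line):
    """Split one data line into (date string, list of values)."""
    fields = line.rstrip("\r\n").split(",")
    return fields[0].strip(), [parse_value(f) for f in fields[1:]]


def read_series(lines):
    """Return (names, records); names is None when there is no header."""
    lines = [ln for ln in lines if ln.strip()]
    names = None
    # Assumed heuristic: a first line that does not start with a digit is a header.
    if lines and not lines[0].lstrip()[:1].isdigit():
        names = [n.strip() for n in lines[0].split(",")]
        lines = lines[1:]
    return names, [parse_line(ln) for ln in lines]


names, records = read_series([
    "Date,rainfall,evaporation",
    "2000-01-01,0.9,",
    "2000-01-02,-9999,0.7",
])
print(names, records)
```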


Dates

Dates must comply with the ISO 8601 standard. The full date-time format used is: yyyy-MM-dd HH:mm:ss

A subset of this format may be used for the sake of compactness if there is no ambiguity or loss of precision, for example: yyyy-MM-dd

That is, at most the seconds, minutes and hours can be omitted.

The TIME framework (used for most Toolkit products) will always write the following: yyyy-MM-dd

though more compact formats will be read if unambiguous.

The smallest time step that CSV files can currently handle is a second. See section B.2 for more details on the ISO date-time format.

Note Microsoft Excel seems to recognise the ISO 8601 date format with at least the precision yyyy-MM-dd. This correct detection seems to be independent of the computer regional and language settings.

Dates read when loading a CSV file do not necessarily have to follow the ISO 8601 standard, but end-users should be encouraged to use it when generating CSV data files from another software tool. Local date-time culture settings are supported for the benefit of users for two main practical reasons: the default behaviour of Microsoft Excel and backward compatibility.

How the TIME framework determines date/time formats

When reading a file, the TIME framework examines the first few lines to detect the date-time format and the time step of the time series:

• If the format is recognised as being ISO 8601 compliant, this format will be used to read all subsequent dates.

• Failing that, an attempt is made to detect the dates and time step with English-Australia ("en-AU") settings, for backward-compatibility reasons.

• Last, an attempt is made to use the computer configuration for regional and language settings.

An exception, LoadDataFileIOException, occurs if all methods fail. If you see this exception you will need to check your input data.
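The fallback order described above can be sketched in Python. This is a simplified illustration, not the TIME code: it checks only whether every date parses and ignores time-step detection, the format lists are assumptions, the computer's regional formats are passed in by the caller as local_formats, and DateFormatError is a hypothetical stand-in for LoadDataFileIOException.

```python
from datetime import datetime

# Assumed format lists, expressed as Python strptime patterns.
ISO_FORMATS = ["%Y-%m-%d %H:%M:%S", "%Y-%m-%d %H:%M", "%Y-%m-%d", "%Y-%m", "%Y"]
EN_AU_FORMATS = ["%d/%m/%Y %H:%M:%S", "%d/%m/%Y %H:%M", "%d/%m/%Y"]


class DateFormatError(ValueError):
    """Hypothetical stand-in for LoadDataFileIOException."""


def parses_all(date_strings, fmt):
    """True if every string can be read with the given format."""
    try:
        for text in date_strings:
            datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


def detect_format(date_strings, local_formats=()):
    """Try ISO 8601 first, then en-AU, then the caller's local formats."""
    for group in (ISO_FORMATS, EN_AU_FORMATS, list(local_formats)):
        for fmt in group:
            if parses_all(date_strings, fmt):
                return fmt
    raise DateFormatError("no supported date-time format matches the input")


print(detect_format(["2000-01-01", "2000-01-02"]))
print(detect_format(["13/01/2000", "14/01/2000"]))
print(detect_format(["01/13/2000"], local_formats=["%m/%d/%Y"]))
```

Because a single format must match every sampled line, the sketch also reflects the requirement that the date-time format be consistent throughout the file.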

Note The date-time format must be consistent throughout the file.

Possible problems with time steps

The only known case where a time step may be incorrectly detected when reading a file is on a computer with U.S. settings, due to the mm/dd/yyyy date format. This may happen if the whole of a daily time series covers less than 13 days, or less than 12 months for a monthly time series. Users working on such computers should use the recommended ISO 8601 format in order to prevent any ambiguity.
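The ambiguity can be demonstrated with a short Python sketch. The function name is_ambiguous is hypothetical; the check simply tests whether a set of slash-separated dates is valid under both the day-first and the month-first reading.

```python
from datetime import datetime


def parse_all(date_strings, fmt):
    """Return the parsed dates, or None if any string does not fit the format."""
    try:
        return [datetime.strptime(text, fmt) for text in date_strings]
    except ValueError:
        return None


def is_ambiguous(date_strings):
    """True when the dates are valid both day-first and month-first."""
    day_first = parse_all(date_strings, "%d/%m/%Y")
    month_first = parse_all(date_strings, "%m/%d/%Y")
    return day_first is not None and month_first is not None


short_series = ["01/01/2000", "02/01/2000", "03/01/2000", "04/01/2000", "05/01/2000"]
# Five rows: valid as 1 to 5 January (daily) and as January to May (monthly).
print(is_ambiguous(short_series))

# A day number above 12 can only be read day-first, which removes the ambiguity.
print(is_ambiguous(short_series + ["13/01/2000"]))
```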


Examples

A CSV file with one daily time series, where the time of record is 00:00:00 every day, will be written as:

2000-01-01,0.9
2000-01-02,0.7
2000-01-03,0.9
2000-01-04,0.4

An example of a daily time series with missing values on the second and fourth days:

2000-01-01,0.9
2000-01-02,-9999
2000-01-03,0.9
2000-01-04,
2000-01-05,1.1

Parsing the following file would throw an exception: missing records (gaps in the sequence of dates) are not allowed, since their interpretation is likely to be ambiguous (missing data, or zero record?):

2000-01-01,0.9
2000-01-02,2.2
2000-01-05,1.1
2000-01-06,1.1

A time series with a six-minute time step:

2000-01-01 00:00,0
2000-01-01 00:06,1
2000-01-01 00:12,2
2000-01-01 00:18,3
2000-01-01 00:24,4

The following time series will be read as one monthly time series, with records on the first day of every month at time 00:00:00:

2000-01,0
2000-02,1
2000-03,2
2000-04,3
2000-05,4

The following time series will be read as one annual time series, with records on January the first, at time 00:00:00:

2000,0
2001,1
2002,2

Any regular (evenly spaced) time step is supported, e.g. a weekly time series would be written as:

2004-01-02,0
2004-01-09,1.2
2004-01-16,2.3
2004-01-23,3
2004-01-30,4
2004-02-06,4

A header may be specified:

Date,rainfall,evaporation
2000-01-01,0.9,0.9
2000-02-01,0.7,0.9
2000-03-01,0.9,0.9
2000-04-01,0.4,0.9

The following file, generated for instance with Excel, will be recognised as a monthly time series on a computer with U.S. culture (locale) settings, but loading it on a computer with different culture settings will likely fail:

01/01/2000,0.1
02/01/2000,0.8
03/01/2000,0.6
04/01/2000,0.9
05/01/2000,0.6
06/01/2000,0.4
07/01/2000,0.6
08/01/2000,0.2
09/01/2000,0.8
10/01/2000,0.6
11/01/2000,0.3
12/01/2000,0.1
01/01/2001,0.389
02/01/2001,0.389

The following file, irrespective of the computer's culture settings, will be read as a daily time series from the first of January to the fifth:

01/01/2000,0.1
02/01/2000,0.8
03/01/2000,0.6
04/01/2000,0.9
05/01/2000,0.6
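The rule that gaps in the date sequence are rejected can be sketched in Python as follows. The sketch is an illustration, not the TIME implementation: the function name check_even_step is hypothetical, it raises ValueError rather than the framework's exception, and it covers only fixed-length steps (seconds up to weeks), not monthly or annual series, whose steps vary in length.

```python
from datetime import datetime, timedelta


def check_even_step(date_strings, fmt="%Y-%m-%d"):
    """Return the common time step, or raise ValueError if it is uneven."""
    dates = [datetime.strptime(text, fmt) for text in date_strings]
    if len(dates) < 2:
        raise ValueError("at least two records are needed to determine a time step")
    step = dates[1] - dates[0]
    # Every consecutive pair must be separated by the same interval.
    for previous, current in zip(dates, dates[1:]):
        if current - previous != step:
            raise ValueError(f"uneven time step between {previous} and {current}")
    return step


daily = ["2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04"]
weekly = ["2004-01-02", "2004-01-09", "2004-01-16", "2004-01-23"]
gappy = ["2000-01-01", "2000-01-02", "2000-01-05", "2000-01-06"]

print(check_even_step(daily))
print(check_even_step(weekly))
try:
    check_even_step(gappy)
except ValueError as error:
    print("rejected:", error)
```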

B.2 The ISO 8601 standard for representing dates and times

The TIME framework uses a subset of the ISO 8601 standard. The central part of this subset is the use of the format string:

yyyy-MM-ddTHH:mm:ss

It appears that de facto 'standards' such as Microsoft Excel do not detect dates with the T symbol between the date and time parts of the date-time. However, if the T symbol is replaced with a space, the date-time is recognised by Excel, irrespective of the regional settings of the computer. Houston (reference 2, below) reports that according to the ISO standard the T symbol "may be omitted by those interchanging data, if ambiguity can be avoided". The T symbol will thus be replaced by a space to ensure compatibility with de facto Microsoft 'standards'.

The members of this class are qualified as 'compatible' if the T symbol is replaced by a whitespace, otherwise the term 'strict compliance' will be used.
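The difference between the 'strict' and 'compatible' forms can be illustrated with Python's standard datetime module. The function names to_compatible and to_strict are hypothetical and are not members of the TIME class mentioned above.

```python
from datetime import datetime

STRICT = "%Y-%m-%dT%H:%M:%S"      # yyyy-MM-ddTHH:mm:ss
COMPATIBLE = "%Y-%m-%d %H:%M:%S"  # yyyy-MM-dd HH:mm:ss


def to_compatible(stamp):
    """Convert a strict date-time (with T) to the space-separated form."""
    return datetime.strptime(stamp, STRICT).isoformat(sep=" ")


def to_strict(stamp):
    """Convert a space-separated date-time back to the strict form."""
    return datetime.strptime(stamp, COMPATIBLE).isoformat()


print(to_compatible("2000-01-01T00:06:00"))
print(to_strict("2000-01-01 00:06:00"))
```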

References

1 Kuhn M., "A Summary of the International Standard Date and Time Notation", http://www.cl.cam.ac.uk/~mgk25/iso-time.html

2 Houston G., "ISO 8601:1988 Date/Time Representations", http://hydracen.com/dx/iso8601.htm

Note The official ISO document is copyrighted and not freely available.