|
ActiveTcl User Guide
|
|
|
[ Main table Of Contents | Tcllib Table Of Contents | Tcllib Index ]
csv(n) 0.5.1 "CSV processing"
csv - Procedures to handle CSV data.
TABLE OF
CONTENTS
SYNOPSIS
DESCRIPTION
COMMANDS
FORMAT
EXAMPLE
SEE ALSO
KEYWORDS
COPYRIGHT
package require Tcl 8.3
package require csv ?0.5.1?
The csv package provides commands to manipulate
information in CSV FORMAT (CSV = Comma
Separated Values).
The following commands are available:
- ::csv::join values {sepChar ,}
- Takes a list of values and returns a string in CSV format
containing these values. The separator character can be defined by
the caller, but this is optional. The default is ",".
- ::csv::joinlist values {sepChar ,}
- Takes a list of lists of values and returns a string in CSV
format containing these values. The separator character can be
defined by the caller, but this is optional. The default is ",".
Each element of the outer list is considered a record, these are
separated by newlines in the result. The elements of each record
are formatted as usual (via ::csv::join).
- ::csv::read2matrix
?-alternate? chan m {sepChar ,} {expand none}
- A wrapper around ::csv::split2matrix (see
below) reading CSV-formatted lines from the specified channel
(until EOF) and adding them to the given matrix. For an explanation
of the expand argument see ::csv::split2matrix.
- ::csv::read2queue
?-alternate? chan q {sepChar ,}
- A wrapper around ::csv::split2queue (see
below) reading CSV-formatted lines from the specified channel
(until EOF) and adding them to the given queue.
- ::csv::report cmd
matrix ?chan?
- A report command which can be used by the matrix methods format 2string and format 2chan.
For the latter this command delegates the work to ::csv::writematrix. cmd is expected to
be either printmatrix or
printmatrix2channel. The channel argument, chan, has to be present for the latter and must not
be present for the first.
- ::csv::split
?-alternate? line {sepChar ,}
- converts a line in CSV format into a list of
the values contained in the line. The character used to separate
the values from each other can be defined by the caller, via sepChar, but this is optional. The default is
",".
If the option -alternate is spcified a slightly
different syntax is used to parse the input. This syntax is
explained below, in the section FORMAT.
- ::csv::split2matrix
?-alternate? m line {sepChar ,} {expand none}
- The same as ::csv::split, but appends the
resulting list as a new row to the matrix m,
using the method add row. The expansion mode
specified via expand determines how the command
handles a matrix with less columns than contained in line. The allowed modes are:
- none
- This is the default mode. In this mode it is the responsibility
of the caller to ensure that the matrix has enough columns to
contain the full line. If there are not enough columns the list of
values is silently truncated at the end to fit.
- empty
- In this mode the command expands an empty matrix to hold all
columns of the specified line, but goes no further. The overall
effect is that the first of a series of lines determines the number
of columns in the matrix and all following lines are truncated to
that size, as if mode none was set.
- auto
- In this mode the command expands the matrix as needed to hold
all columns contained in line. The overall
effect is that after adding a series of lines the matrix will have
enough columns to hold all columns of the longest line encountered
so far.
- ::csv::split2queue
?-alternate? q line {sepChar ,}
- The same as ::csv::split, but appending the
resulting list as a single item to the queue q,
using the method put.
- ::csv::writematrix m chan {sepChar ,}
- A wrapper around ::csv::join taking all rows
in the matrix m and writing them CSV formatted
into the channel chan.
- ::csv::writequeue q chan {sepChar ,}
- A wrapper around ::csv::join taking all
items in the queue q (assumes that they are
lists) and writing them CSV formatted into the channel chan.
The format of regular CSV files is specified as
- Each record of a csv file (comma-separated values, as exported
e.g. by Excel) is a set of ASCII values separated by ",". For other
languages it may be ";" however, although this is not important for
this case as the functions provided here allow any separator
character.
- If and only if a value contains itself the separator ",", then
it (the value) has to be put between "". If the value does not
contain the separator character then quoting is optional.
- If a value contains the character ", that character is
represented by "".
- The output string "" represents the value ". In other words, it
is assumed that it was created through rule 3, and only this rule,
i.e. that the value was not quoted.
An alternate format definition mainly used by MS products
specifies that the output string "" is an representatation of the
empty string. In other words, it is assumed that the output was
generated out of the empty string by quoting it (i.e. rule 2), and
not through rule 3. This is the only difference between the regular
and the alternate format.
The alternate format is activated through specification of the
option -alternate to the various split
commands.
Using the regular format the record
|
123,"123,521.2","Mary says ""Hello, I am Mary""",""
|
is parsed into the items
|
a) 123
b) 123,521.2
c) Mary says "Hello, I am Mary"
d) (the empty string)
|
Using the alternate format the result is
|
a) 123
b) 123,521.2
c) Mary says "Hello, I am Mary"
d) "
|
instead. As can be seen only item (d) is different, now a "
instead of the empty string.
matrix , queue
csv , matrix , package , queue , tcllib
Copyright © 2002 Andreas Kupries
<andreas_kupries@users.sourceforge.net>