|
ActiveTcl User Guide
|
|
|
[ Main table Of Contents | Tcllib Table Of Contents | Tcllib Index ]
bee(n) 0.1 "BitTorrent"
bee - BitTorrent Serialization Format Encoder/Decoder
TABLE OF
CONTENTS
SYNOPSIS
DESCRIPTION
PUBLIC API
ENCODER
DECODER
FORMAT
DEFINITION
EXAMPLES
KEYWORDS
COPYRIGHT
package require Tcl 8.4
package require bee ?0.1?
The bee package provides de- and encoder
commands for data in bencoding (speak 'bee'), the serialization
format for data and messages used by the BitTorrent
application.
The package provides one encoder command for each of the basic
forms, and two commands per container, one taking a proper tcl data
structure to encode in the container, the other taking the same
information as several arguments.
- ::bee::encodeString string
- Returns the bee-encoding of the string.
- ::bee::encodeNumber integer
- Returns the bee-encoding of the integer
number.
- ::bee::encodeListArgs value...
- Takes zero or more bee-encoded values and returns the
bee-encoding of their list.
- ::bee::encodeList list
- Takes a list of bee-encoded values and returns the bee-encoding
of the list.
- ::bee::encodeDictArgs key value...
- Takes zero or more pairs of keys and values and returns the
bee-encoding of the dictionary they form. The values are expected
to be already bee-encoded, but the keys must not be. Their encoding
will be done by the command itself.
- ::bee::encodeDict dict
- Takes a dictionary list of string keys and bee-encoded values
and returns the bee-encoding of the list. Note that the keys in the
input must not be bee-encoded already. This will be done by the
command itself.
The package provides two main decoder commands, one for decoding
a string expected to contain a complete data structure, the other
for the incremental decoding of bee-values arriving on a channel.
The latter command is asynchronous and provides the completed
decoded values to the user through a command callback.
- ::bee::decode string ?endvar? ?start?
- Takes the bee-encoding in the string and returns one decoded
value. In the case of this being a container all contained values
are decoded recursively as well and the result is a properly nested
tcl list and/or dictionary.
If the optional endvar is set then it is the
name of a variable to store the index of the first character
after the decoded value into. In other words, if the
string contains more than one value then endvar
can be used to obtain the position of the bee-value after the
bee-value currently decoded. together with start, see below, it is possible to iterate over the
string to extract all contained values.
The optional start index defaults to
0, i.e. the beginning of the string. It is the
index of the first character of the bee-encoded value to
extract.
- ::bee::decodeIndices string ?endvar? ?start?
- Takes the same arguments as ::bee::decode
and returns the same information in endvar. The
result however is different. Instead of the tcl value contained in
the string it returns a list describing the
value with respect to type and location (indices for the first and
last character of the bee-value). In case of a container the
structure also contains the same information for all the embedded
values.
Formally the results for the various types of bee-values are:
- string
- A list containing three elements:
- The constant string string, denoting the type
of the value.
- An integer number greater than or equal to zero. This is the
index of the first character of the bee-value in the input string.
- An integer number greater than or equal to zero. This is the
index of the last character of the bee-value in the input string.
Note that this information is present in the results for
all four types of bee-values, with only the first element changing
according to the type of the value.
- integer
- The result is like for strings, except that the type element
contains the constant string integer.
- list
- The result is like before, with two exceptions: One, the type
element contains the constant string list. And
two, the result actually contains four elements. The last element
is new, and contains the index data as described here for all
elements of the bee-list.
- dictionary
- The result is like for strings, except that the type element
contains the constant string dict. A fourth
element is present as well, with a slightly different structure
than for lists. The element is a dictionary mapping from the
strings keys of the bee-dictionary to a list containing two
elements. The first of them is the index information for the key,
and the second element is the index information for the value the
key maps to. This structure is the only which contains not only
index data, but actual values from the bee-string. While the index
information of the keys is unique enough, i.e. serviceable as keys,
they are not easy to navigate when trying to find particular
element. Using the actual keys makes this much easier.
- ::bee::decodeChannel chan -command cmdprefix ?-exact?
?-prefix data?
- The command creates a decoder for a series of bee-values
arriving on the channel chan and returns its
handle. This handle can be used to remove the decoder again.
Setting up another bee decoder on chan while a
bee decoder is still active will fail with an error message.
- -command
- The command prefix cmdprefix specified by
the required option -command is used to
report extracted values and exceptional situations (error, and EOF
on the channel). The callback will be executed at the global level
of the interpreter, with two or three arguments. The exact call
signatures are
- cmdprefix eof
token
- The decoder has reached eof on the channel chan. No further invocations of the callback will be made
after this. The channel has already been closed at the time of the
call, and the token is not valid anymore as
well.
- cmdprefix
error token message
- The decoder encountered an error, which is not eof. For example
a malformed bee-value. The message provides
details about the error. The decoder token is in the same state as
for eof, i.e. invalid. The channel however is kept open.
- cmdprefix
value token value
- The decoder received and successfully decoded a bee-value. The
format of the equivalent tcl value is the same
as returned by ::bee::decode. The channel is
still open and the decoder token is valid. This means that the
callback is able to remove the decoder.
- -exact
- By default the decoder assumes that the remainder of the data
in the channel consists only of bee-values, and reads as much as
possible per event, without regard for boundaries between
bee-values. This means that if the the input contains non-bee data
after a series of bee-value the beginning of that data may be lost
because it was already read by the decoder, but not processed.
The -exact was made for this situation. When
specified the decoder will take care to not read any characters
behind the currently processed bee-value, so that any non-bee data
is kept in the channel for further processing after removal of the
decoder.
- -prefix
- If this option is specified its value is assumed to be the
beginning of the bee-value and used to initialize the internal
decoder buffer. This feature is required if the creator of the
decoder used data from the channel to determine if it should create
the decoder or not. Without the option this data would be lost to
the decoding.
- ::bee::decodeCancel token
- This command cancels the decoder set up by ::bee::decodeChannel and represented by the handle token.
- ::bee::decodePush token string
- This command appends the string to the
internal decoder buffer. It is the runtime equivalent of the option
-prefix of ::bee::decodeChannel. Use it to push data back into the
decoder when the value callback used data from the
channel to determine if it should decode another bee-value or
not.
Data in the bee serialization format is constructed from two
basic forms, and two container forms. The basic forms are strings
and integer numbers, and the containers are lists and
dictionaries.
- String S
- A string S of length L is
encoded by the string "L:S", where the length is written out in textual
form.
- Integer N
- An integer number N is encoded by the string
"iNe".
- List v1 ... vn
- A list of the values v1 to vn is encoded by the string "lBV1...BVne" where
"BVi" is the bee-encoding of the value
"vi".
- Dict k1 -> v1 ...
- A dictionary mapping the string key ki to the value vi, for i in
1 ... n is encoded by the string
"dBKiBVi...e" for i in
1 ... n, where
"BKi" is the bee-encoding of the key string
"ki". and "BVi" is the
bee-encoding of the value "vi".
Note: The bee-encoding does not retain the order of the
keys in the input, but stores in a sorted order. The sorting is
done for the "raw strings".
Note that the type of each encoded item can be determined
immediately from the first character of its representation:
- i
- Integer.
- l
- List.
- d
- Dictionary.
- [0-9]
- String.
By wrapping an integer number into
i...e the format makes sure that
they are different from strings, which all begin with a digit.
BitTorrent , bee , bittorrent , serialization , torrent
Copyright © 2004 Andreas Kupries
<andreas_kupries@users.sourceforge.net>