one, so it can also be passed as the first, unnamed parameter (see third
calling convention);

=item C

name of the input field, defaults to C;

=item C

name of the tube, useful for debugging;

=item C

name of the output field, defaults to C;

=item C

remove leading and trailing whitespace from the extracted values;

=item C

set how you are going to accept input values, e.g. escaped or quoted.
See L for details.

=back

=head2 B<< by_regex >>

   my $tube = by_regex($regex, %args); # OR
   my $tube = by_regex(%args); # OR
   my $tube = by_regex(\%args);

parse the input text based on a regular expression, passed as argument
C or C<$regex> as unnamed first parameter. The regular expression is
supposed to have named captures, that will eventually be used to
populate the rendered output.

The following arguments are supported:

=over

=item C

name of the input field, defaults to C;

=item C

name of the tube, useful for debugging;

=item C

name of the output field, defaults to C;

=item C

the regular expression to use for splitting the inputs. This is the I
argument, and can be passed also as the first unnamed one in the
argument list.

=back

=head2 B<< by_separators >>

   my $tube = by_separators($separators, %args); # OR
   my $tube = by_separators(%args); # OR
   my $tube = by_separators(\%args);

parse the input according to a series of separators, that will be
applied in sequence. For example, if the list of separators is the
following:

   @separators = (';', '~~');

the following input:

   $text = 'foo;bar~~/baz/';

will be split as:

   @split = ('foo', 'bar', '/baz/');

The following arguments are supported:

=over

=item C

name of the input field, defaults to C;

=item C

a reference to an array containing the list of keys to be associated to
the values from the split;

=item C

name of the tube, useful for debugging;

=item C

name of the output field, defaults to C;

=item C

a reference to an array containing the list of separators to be used for
splitting the input. This parameter can also be passed as the first,
unnamed argument. Each separator can be:

=over

=item *

a I<sub reference>, that is invoked once with a reference to the arguments,
and must return either of the following forms;

=item *

a I<regular expression>, that will be used as-is at the right
place;

=item *

a I<string>, that will be matched verbatim (through a regular
expression matching the string after passing it through
C);

=back

=item C

remove leading and trailing whitespace from the extracted values.
Example:

   @seps = qw< : ; , >;
   $input = ' what : ever ;you,do ';
   @elements = ('what', 'ever', 'you', 'do');

=item C

this is how you provide a description of what you consider a
I. It can be multiple things:

=over

=item *

a I<sub reference>, that is called and MUST provide back one of the
following alternatives;

=item *

a I<regular expression>, that is used directly;

=item *

a I<string>, that is turned into an array reference by creating an
anonymous array with the string as its only element, then processed as
in the following bullet;

=item *

an I<array reference> with elements inside, that will be described in
the following list.

=back

If you end up with an I<array reference>, each element will be put in a
big regular expression that is the C of all elements. Each can be:

=over

=item *

a I<regular expression>, that is placed as-is in the big regular
expression;

=item *

the string C, that is the same as having put the three strings
C, C and C;

=item *

the string C, that is the same as having put the two strings
C and C;

=item *

the string C (or C), that allows you to
match a string that is delimited by single quotes, with no escaping
inside. This is always put at the beginning of the big regular
expression (although C strings can be fit before
actually);

=item *

the string C (or C), that allows you to
match a string that is delimited by double quotes, also allowing escaped
elements inside (via backslashes). This is always put at the beginning
of the big regular expression;

=item *

the string C, that allows you to match a non-greedy sequence of
escaped characters (via backslash). If C is also
specified, single quotes need to be escaped too. If C is
also specified, double quotes need to be escaped too. This is always set
at the end of the big regular expression (except for C, that
might appear after it);

=item *

the string C, that allows you to match a non-greedy sequence
of characters, i.e. it is a synonym of regular expression C<(?ms:.*?)>.
If present, it is always set at the end of the big regular expression.

=back

For example, if you want to accept single quoted, double quoted and
unquoted strings, you might provide the following:

   [qw< single-quoted double-quoted whatever >]

=back
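
The sequential splitting described above can be sketched in plain Perl.
This is only an illustration of the behaviour, not the module's
implementation: C<split_in_sequence> is a hypothetical helper, and every
separator is assumed to be a plain string that is matched verbatim.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: apply each separator
# once, in order; the last chunk takes whatever remains of the input.
sub split_in_sequence {
   my ($text, @separators) = @_;

   # one lazy capture per separator, each separator matched verbatim
   my $pattern = '';
   $pattern .= '(.*?)' . quotemeta($_) for @separators;
   my $re = qr{\A${pattern}(.*)\z}ms;

   my @chunks = $text =~ $re
      or die "input does not match the sequence of separators\n";
   return @chunks;
}

my @split = split_in_sequence('foo;bar~~/baz/', ';', '~~');
print join('|', @split), "\n";    # prints: foo|bar|/baz/
```

Building a single anchored regular expression means that an input missing
one of the separators is rejected as a whole, instead of yielding a
shorter list.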

=head2 B<< by_split >>

   my $tube = by_split(%args); # OR
   my $tube = by_split(\%args); # OR
   my $tube = by_split($separator, %args);

split the input according to a separator string, passed either as the
first unnamed parameter C<$separator> or as hash options C.

The following arguments are supported:

=over

=item C

set to the number of missing trailing elements that you are fine to
lose, in case you also provide C (see below). This is particularly
important when this function is called behind the scenes by
L, because I sets C.

In practice, suppose that you set the following C:

   [qw< foo bar baz whatever >]

A normal parsing will expect to find at least four elements, so the
following input would fail:

   FOO,BAR,BAZ

On the other hand, if you set C to 1, you are accepting
that there might be a missing value for C, that will be filled
with the undefined value.

=item C

name of the input field, defaults to C;

=item C

optional reference to an array containing a list of keys to be
associated to the split data. If present, it will be used as such; if
absent, a reference to an array will be set as output.

=item C

name of the tube, useful for debugging;

=item C

name of the output field, defaults to C;

=item C

the separator to be used for C. If it is a code reference,
it is invoked once with the provided arguments to get the separator
back. After this, it can be either a regular expression, used as-is, or
a string that is passed through C before being used;

=item C

remove leading and trailing whitespace from the extracted values. As
you might expect, if the C is a colon, the following input:

   $input = ' what : ever :you:do ';

would be split into the following elements:

   @elements = ('what', 'ever', 'you', 'do');

=back
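
The interplay between separator, trimming, keys and tolerated missing
values can be sketched as follows. This is a plain-Perl illustration under
assumptions, not the module's code: C<split_record> and its positional
parameters are hypothetical, and handling of surplus values is left out.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's code. $allow_missing plays the
# role described above: how many trailing keys may lack a value.
sub split_record {
   my ($text, $separator, $keys, $allow_missing) = @_;

   # a regular expression is used as-is, a string is matched verbatim
   my $re = ref($separator) eq 'Regexp' ? $separator : quotemeta($separator);
   my @values = split /$re/, $text, -1;    # -1 keeps trailing empty fields
   s{\A\s+|\s+\z}{}g for @values;          # trim both ends of each value

   return \@values unless $keys;           # no keys: array reference

   my $missing = @$keys - @values;
   die "too few values\n" if $missing > ($allow_missing // 0);
   my %record;
   @record{@$keys} = @values;    # keys without a value are set to undef
   return \%record;
}

my $list = split_record(' what : ever :you:do ', ':');
print join('|', @$list), "\n";    # prints: what|ever|you|do

my $record = split_record('FOO,BAR,BAZ', ',', [qw< foo bar baz whatever >], 1);
print defined($record->{whatever}) ? 'defined' : 'undef', "\n";    # prints: undef
```

Without the trailing C<1>, the second call dies because only three of the
four expected values are present.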

=head2 B<< by_value_separator >>

   $tube = by_value_separator($separator, %args); # OR
   $tube = by_value_separator(%args); # OR
   $tube = by_value_separator(\%args);

parse a sequence of value-and-separator. This is a generalization of
L, where you can provide a way to specify what you consider
I values, e.g. to allow for escaping or quoting (hence also
allowing having the separator inside your values).

B: this function uses the regular expression construct
C<(?{...})> internally. While it is supported as of perl 5.10, this has
evolved in time, up to perl 5.18 where it was stabilized. In particular,
before perl 5.18 it was not possible to use lexical variables in the
construct, so for older perls C uses a package
variable for collecting values. This should not be a problem, but might
be.

As an example, suppose that you are using semicolons as
separators. C would allow you to take this:

   'some;thing'; what\;ever ; "this;\"goes\";fine"

and turn it into this:

   ['some;thing', 'what;ever', 'this;"goes";fine']

As noted, it is similar to L; as a matter of fact, this might
be re-implemented (less efficiently) through L.
Unless there are bugs, of course. Like L, you can provide a
C parameter (also via the first, unnamed parameter) that can
be either a sub reference, a string or a regular expression.

Additionally, you can provide a C parameter that tells what is
considered an I input value. A value can be different things
(see below), but it boils down to providing regular expressions,
indication of pre-canned matching expressions, or a combination.

When you match values, you can then I them. For example, if you
specify that you want to accept double-quoted strings, it makes sense to
remove the quotes and un-escape the remaining sequence before using it.
Depending on what you pass as a definition for a valid C, your
decoding approach might vary. Decoding can happen in two ways: either
you provide a C function that will be applied to each value, or
a C that is applied to the whole values array. You might
want to choose the latter for improving performance (1 sub call against
N).

Normally, an input would be split and an array reference would populate
the C field (that is, the field indicated by the C
argument). If you would rather get a hash, you can pass C to use,
in order. If this is the case, you can also accept getting more values
than you have keys for with C, or fewer of them with
C.

Last, you might want to take advantage of C if your values
shouldn't have leading/trailing spaces. Be sure to read the fine print
about trimming quoted strings, though.
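
To make the matching and decoding steps more concrete, here is a
plain-Perl sketch that handles the semicolon example shown earlier. It is
an illustration under assumptions, not the module's implementation:
C<parse_values> and C<decode> are hypothetical helpers, the separator is
fixed to a semicolon, whitespace around values is always trimmed, and a
C<\G> loop is used in place of the C<(?{...})> construct.

```perl
use strict;
use warnings;

# Hypothetical definition of a valid value: three alternatives, no
# capturing groups inside.
my $value = qr{
     ' [^']* '                   # single-quoted, no escaping inside
   | " (?: [^\\"] | \\. )* "     # double-quoted, backslash escapes
   | (?: [^\\;'"] | \\. )*?      # bare text, quotes and ; must be escaped
}xms;

# Decode one raw value: strip quotes, then resolve backslash escapes.
sub decode {
   my ($raw) = @_;
   return $1 if $raw =~ m{\A'(.*)'\z}ms;    # single quotes: strip only
   $raw =~ s{\A"(.*)"\z}{$1}ms;             # double quotes: strip...
   $raw =~ s{\\(.)}{$1}gms;                 # ...then un-escape
   return $raw;
}

sub parse_values {
   my ($text) = @_;
   my @values;
   while (1) {
      # a value must be followed by a separator or the end of input
      $text =~ m{\G \s* ($value) \s* (?= ; | \z )}gcxms
         or die 'invalid value at offset ' . (pos($text) // 0) . "\n";
      push @values, decode($1);
      last unless $text =~ m{\G ;}gcxms;    # no separator left: done
   }
   return \@values;
}

my $parsed = parse_values(q{'some;thing'; what\;ever ; "this;\"goes\";fine"});
print join('|', @$parsed), "\n";    # prints: some;thing|what;ever|this;"goes";fine
```

Here decoding happens one value at a time; decoding the whole array in a
single call would trade this simplicity for fewer sub calls.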

Accepted arguments are:

=over

=item C

=item C

these are integer values that set how many fewer/more values you are
willing to admit with respect to the provided C (see below).
Hence, they only work when C is set.

By default they are set to 0, meaning that you expect to have exactly
the same number of values as there are keys. Allowing I means
that you accept getting fewer values than there are keys, that will be
associated to C. Allowing I means that you're willing to
ditch that number of exceeding values;

=item C

name of the input field, defaults to C;

=item C

an array reference with the keys to be associated (one-by-one, in order)
to the extracted values;

=item C

name of the tube, useful for debugging. Defaults to
C;

=item C

name of the output field, defaults to C;

=item C

the separator to be used between two consecutive valid Is. It can
be one of the following:

=over

=item *

a I<sub reference>, that is called with whatever arguments provided (as
a hash reference) and MUST return one of the following two alternatives;

=item *

a I<regular expression>, that will be matched for the
separator;

=item *

a I<string>, that will be matched verbatim.

=back

There is no default; you MUST provide one either as the first, unnamed
parameter or as argument C;

=item C

remove leading and trailing whitespace from the extracted values. This
is applied I decoding is applied, which means that
leading/trailing whitespace I quoted strings will be kept.
Defaults to a I value, meaning that no trimming is performed;

=item C

this is how you provide a description of what you consider a
I. It can be multiple things:

=over

=item *

a I<sub reference>, that is called and MUST provide back one of the
following alternatives;

=item *

a I<regular expression>, that is used directly;

=item *

a I<string>, that is turned into an array reference by creating an
anonymous array with the string as its only element, then processed as
in the following bullet;

=item *

an I<array reference> with elements inside, that will be described in
the following list.

=back

If you end up with an I<array reference>, each element will be put in a
big regular expression that is the C of all elements. Each can be:

=over

=item *

a I<regular expression>, that is placed as-is in the big regular
expression;

=item *

the string C, that is the same as having put the three strings
C, C and C;

=item *

the string C, that is the same as having put the two strings
C and C;

=item *

the string C (or C), that allows you to
match a string that is delimited by single quotes, with no escaping
inside. This is always put at the beginning of the big regular
expression (although C strings can be fit before
actually);

=item *

the string C (or C), that allows you to
match a string that is delimited by double quotes, also allowing escaped
elements inside (via backslashes). This is always put at the beginning
of the big regular expression;

=item *

the string C, that allows you to match a non-greedy sequence of
escaped characters (via backslash). If C is also
specified, single quotes need to be escaped too. If C is
also specified, double quotes need to be escaped too. This is always set
at the end of the big regular expression (except for C, that
might appear after it);

=item *

the string C, that allows you to match a non-greedy sequence
of characters, i.e. it is a synonym of regular expression C<(?ms:.*?)>.
If present, it is always set at the end of the big regular expression.

=back

For example, if you want to accept single quoted, double quoted and
unquoted strings, you might provide the following:

   [qw< single-quoted double-quoted whatever >]

=back
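
The way a list of elements ends up in one big regular expression can be
sketched as below. This is a simplified illustration, not the module's
code: C<value_regex> is a hypothetical helper, the patterns behind the
pre-canned names are assumptions, and the ordering is reduced to quoted
forms first, custom regular expressions next and C<whatever> last.

```perl
use strict;
use warnings;

# Assumed patterns for three of the pre-canned names; the real ones
# may differ.
my %canned = (
   'single-quoted' => qr{'[^']*'},
   'double-quoted' => qr{"(?:[^\\"]|\\.)*"},
   'whatever'      => qr{(?ms:.*?)},
);

# Hypothetical helper: turn a list of elements into one alternation.
sub value_regex {
   my (@elements) = @_;
   my (@first, @middle, @last);
   for my $element (@elements) {
      if (ref($element) eq 'Regexp') {
         push @middle, $element;              # used as-is
      }
      elsif ($element eq 'whatever') {
         push @last, $canned{$element};       # always at the end
      }
      elsif (exists $canned{$element}) {
         push @first, $canned{$element};      # quoted forms go first
      }
      else {
         die "unknown element '$element'\n";
      }
   }
   my $alternation = join '|', @first, @middle, @last;
   return qr{(?:$alternation)};
}

# The order in the input list does not matter for the canned names.
my $re = value_regex(qw< whatever single-quoted >);
my ($first) = q{'a;b';c} =~ m{\A($re);}ms;
print "$first\n";    # prints: 'a;b'
```

Had the lazy C<whatever> alternative been tried first, the match would
have stopped at the semicolon inside the quotes.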

=head2 B<< ghashy >>

   my $tube = ghashy(%args); # OR
   my $tube = ghashy(\%args);

parse the input text as a hash, generalized. The algorithm used is the
same as L. It is a generalization of
L below.

Accepts all arguments as L, with the
same default values except for C that is set to the empty
string (as opposed to not being defined). This means that stand-alone
values will always be accepted. This setting is in line with L
and has been set for backwards/mutual compatibility.

The following arguments are recognised too:

=over

=item C

a hash reference with default values for the output;

=item C

name of the input field, defaults to C;

=item C

name of the tube, useful for debugging. Defaults to C;

=item C

name of the output field, defaults to C;

=back

=head2 B<< hashy >>

   my $tube = hashy(%args); # OR
   my $tube = hashy(\%args);

parse the input text as a hash. The algorithm used is the same as
L.

=over

=item C

character used to divide chunks in the input, defaults to a space
character (ASCII 0x20);

=item C

the default key to be used when a key is not present in a chunk,
defaults to the empty string;

=item C

a hash reference with default values for the output;

=item C

name of the input field, defaults to C;

=item C

character used to divide the key from the value in a chunk, defaults to
the equal sign C<=>;

=item C

name of the tube, useful for debugging. Defaults to C;

=item C

name of the output field, defaults to C;

=back

This tube factory is strict in what it accepts as inputs, in that the
separators MUST be single characters and there is no escaping mechanism.
If you need something more flexible, see L above.
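
The chunk and key/value splitting described in this section can be
sketched in plain Perl. This is an illustration under assumptions, not the
module's algorithm: C<hashy_sketch> and its option names (C<chunk_sep>,
C<kv_sep>, C<default_key>, C<defaults>) are hypothetical, and both
skipping empty chunks and splitting only on the first key/value separator
are choices made for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper and option names, not the module's API.
sub hashy_sketch {
   my ($text, %opt) = @_;
   my $chunk_sep   = $opt{chunk_sep}   // ' ';    # divides chunks
   my $kv_sep      = $opt{kv_sep}      // '=';    # divides key from value
   my $default_key = $opt{default_key} // '';     # key for stand-alone values
   my %record = %{ $opt{defaults} || {} };        # start from the defaults

   for my $chunk (split /\Q$chunk_sep\E/, $text) {
      next unless length $chunk;                  # assumption: skip empty chunks
      my ($key, $value) = split /\Q$kv_sep\E/, $chunk, 2;
      ($key, $value) = ($default_key, $key) unless defined $value;
      $record{$key} = $value;
   }
   return \%record;
}

my $record = hashy_sketch('foo=bar standalone baz=a=b', defaults => {qux => 1});
for my $key (sort keys %$record) {
   print "<$key> <$record->{$key}>\n";
}
# prints:
# <> <standalone>
# <baz> <a=b>
# <foo> <bar>
# <qux> <1>
```

With no escaping mechanism, a value cannot contain the chunk separator,
which is the strictness mentioned above.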

=head2 B<< parse_by_format >>

Alias for L.

=head2 B<< parse_by_regex >>

Alias for L.

=head2 B<< parse_by_separators >>

Alias for L.

=head2 B<< parse_by_split >>

Alias for L.

=head2 B<< parse_by_value_separator >>

Alias for L.

=head2 B<< parse_ghashy >>

Alias for L.

=head2 B<< parse_hashy >>

Alias for L.

=head2 B<< parse_single >>

Alias for L.

=head2 B<< single >>

   my $tube = single(%args); # OR
   my $tube = single(\%args);

consider the input text as already parsed, and generate as output a hash
reference where the text is associated to a key.

=over

=item C

name of the input field, defaults to C;

=item C

key to use for associating the input text;

=item C

name of the tube, useful for debugging;

=item C

name of the output field, defaults to C;

=back

=head1 BUGS AND LIMITATIONS

Report bugs either through RT or GitHub (patches welcome).

=head1 AUTHOR

Flavio Poletti

=head1 COPYRIGHT AND LICENSE

Copyright (C) 2016 by Flavio Poletti

This module is free software. You can redistribute it and/or modify it
under the terms of the Artistic License 2.0.

This program is distributed in the hope that it will be useful, but
without any warranty; without even the implied warranty of
merchantability or fitness for a particular purpose.

=cut