=head1 NAME

Web::DataService::Configuration - how to configure a data service

=head1 SYNOPSIS

This page describes how to configure a data service with L, covering the
configuration attributes that apply to the data service as a whole. A full
data service definition includes several different kinds of data service
elements, which are documented on the following pages:

=over

=item L

How to define data service nodes, and the attributes available for defining
them.

=item L

How to define output formats, and the attributes available for defining them.

=item L

How to define vocabularies, and the attributes available for defining them.

=item L

How to define value sets, and the attributes available for defining them.

=item L

How to define output blocks, and the attributes available for defining them.

=item L

How to define parameter rulesets, and the attributes available for defining
them.

=back

=head1 SYNTAX

The various configuration methods provided by L all use a consistent syntax.
With the possible exception of an initial name argument, all of the rest of
the arguments must be either hashrefs or strings. The hashrefs each configure
some object, and the strings each document the object whose definition they
follow. We refer to this mix of attribute hashrefs and documentation strings
as a I.

    $ds->define_format(
        { name => 'json', content_type => 'application/json',
          doc_node => 'formats/json', title => 'JSON',
          default_vocab => 'com' },
            "The JSON format is intended primarily to support client applications,",
            "including the PBDB Navigator. Response fields are named using compact",
            "3-character field names.",
        { name => 'xml', disabled => 1, content_type => 'text/xml', title => 'XML',
          doc_node => 'formats/xml', default_vocab => 'dwc' },
            "The XML format is intended primarily to support data interchange with",
            "other databases, using the Darwin Core element set.");

For example, the above call defines two response formats: one named 'json' and
the other named 'xml'.
Each of these formats is defined by the set of attributes contained in a
hashref. The documentation strings are automatically collected (joined by
newlines) as the attribute C of the object whose definition they immediately
follow.

Note that this does not apply to C<< Web::DataService->new >>, which must be
called with a single hash argument only.

=head2 Attribute value syntax

In general, whenever an attribute can take a list of values, you specify those
values as a string with the items separated by commas and arbitrary
whitespace. For example, the following are identical:

    output => 'basic , extra'
    output => 'basic,extra'

=head1 CONFIGURATION PROCESS

In order to fully define a data service using this framework, your code must
carry out the following steps (see L for more about this):

=over

=item 1.

Load one or more modules ("operation modules") that can serve as L L. The
subroutines that implement your data service operations must be placed in
these modules.

=item 2.

Generate a new L. The rest of the steps will be carried out using method calls
on this instance.

=item 3.

Define one or more L using C. This step is optional, and a "null" vocabulary
consisting of the field names and values obtained from the backend will be
automatically used if you do not specify any.

=item 4.

Define one or more L using C. This must follow any vocabulary definitions, and
must precede the node definitions.

=item 5.

Define some L using C.

=item 6.

Define one or more L using C. These may occur in any order with respect to the
node definitions.

=item 7.

Define L using C (or C). This step is optional, but you will need to do this
if you wish to provide optional output blocks or parameters with enumerated
values. These definitions must occur before any output blocks or rulesets that
depend on them.

=item 8.

Define one or more L using C. These may occur in any order with respect to the
other definitions.

=back

If some or all of your operation modules define a subroutine called C, this
will be called once for each module as soon as the module name is encountered
as the value of a C attribute in a node definition. You can also trigger this
explicitly by calling C. The routine will be called as a class method, so the
module name will be the first argument. The data service instance will be the
second, so you can use that to make further definitions.

You may find it convenient to put some or all of the definitions from steps
5-8 (C, C, C, C, C) in these initialization routines. That will serve to
locate these definitions together with the operations to which they apply. You
may instead find it convenient to put all of the node definitions together,
either in the main application file or in some subsidiary module, so that the
hierarchical relationships will be apparent. Exactly how you structure your
application is up to you.

=head1 CONFIGURATION DETAILS

The attributes that you can use in defining these different types of elements
are listed in the following sections.

=head2 Data service instantiation

A new data service is instantiated by calling the C method of L, as follows:

    my $ds = Web::DataService->new({ name => 'data1.0', ... });

The "..." in the above example represents some set of attributes chosen from
the list below. With a few exceptions noted below, any attributes that you do
not specify in the call to C will be looked up in the configuration file
provided by the foundation framework (F in the case of L). Any not specified
there will be given default values, as indicated in the documentation for the
individual attributes. For most attributes, it is up to you whether to specify
them in the instantiation call or in the configuration file.

When a new data service is instantiated, attributes that are not explicitly
specified in the instantiation call are looked up in the configuration file
under the value provided for the required attribute C.
If not found, they are then looked up as direct attributes. For example, if
the configuration file has the contents listed below, the above call will
produce a data service with a C of C<1000> and a C of C<1>. This allows you to
configure several different data services that share some attribute values but
not others.

    default_limit: 500
    default_header: 1
    data1.0:
      default_limit: 1000
    data2.0:
      default_limit: 1200

=head2 Data service attributes

In the list below, entries indicated by C<[req]> are required attributes.
Those indicated by C<[inst]> must be specified in the call to C rather than in
the configuration file. Those indicated by C<[mod]> have default values
according to which modules have been loaded at the time the data service is
instantiated.

All of the data service attributes have identically-named accessor methods.
These are all read-only; the attributes may only be set at the time of
instantiation.

=head3 name [req] [inst]

Specifies a unique identifier for this data service. You must specify this in
the instantiation call, because it is used to find attribute values in the
configuration file.

=head3 features [req] [inst]

Specifies the set of built-in features to be enabled for this data service.
The value of this attribute must be a comma-separated list of feature names
from the list given below. You can turn a feature off by prefixing its name
with C, and you can use 'standard' to enable all of the available features. So
the following will enable all of the features except "doc_paths":

    features => 'standard, no_doc_paths'

while the following will enable just 'format_suffix' and 'documentation':

    features => 'format_suffix, documentation'

The individual features are as follows:

=head4 format_suffix

This feature causes the response format of any request to be set from the
suffix on the URL path.
If enabled, a request with the URL path "/my/operation.json" will select the
operation corresponding to the data service node "my/operation" and will
render the output using the "json" format.

=head4 documentation

This feature will auto-generate documentation pages for the various data
service operations. If enabled, the URL path "/" will always generate a main
documentation page, and a URL without any suffix will generate a documentation
page corresponding to the selected data service node. You are also able to
create additional documentation nodes and templates at will.

In order to make use of this feature, you must also ensure that a L is loaded.

=head4 doc_paths

This feature will enable additional URL paths for accessing documentation. If
enabled, a request with the URL path "/my/operation_doc" or (if C is also
enabled) "/my/operation_doc.html" will produce the documentation page for the
data service node "my/operation". So will "/my/operation/index.html". The URL
path "/my/operation" (or "/my/operation.json" if C is also enabled) will
execute the operation and return the result.

You can change the documentation suffixes by setting the attributes L and L.

=head4 send_files

This feature will enable you to define data service nodes that respond with
the contents of files from disk. Its primary purpose is to provide access to
the stylesheet used by the documentation pages. You can use it to provide
access to other files as well. If you disable this feature but enable the
'documentation' feature, you will need to arrange for the stylesheet to be
provided separately.

=head4 strict_params

If this feature is enabled, then any parameter names that are not recognized
by the ruleset corresponding to the selected data service node will cause a
request to be rejected with a result code of 400 (bad request). If disabled,
then bad parameter names will generate warnings instead.
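
To make the C<features> value syntax concrete, here is a minimal standalone sketch of how a feature list such as C<'standard, no_doc_paths'> could be expanded into a set of enabled features. The helper name C<expand_features> is hypothetical and is not part of Web::DataService; the module's own parsing may differ. The sketch assumes only what the text above states: items are separated by commas and arbitrary whitespace, 'standard' enables every feature listed in this section, and a C<no_> prefix (as in the example 'no_doc_paths') turns a feature off.

```perl
use strict;
use warnings;

# The features documented in this section; 'standard' expands to all of them.
my @STANDARD = qw(format_suffix documentation doc_paths
                  send_files strict_params stream_output);

# Hypothetical helper (not part of Web::DataService): expand a feature
# list string into a hashref whose keys are the enabled feature names.
sub expand_features {
    my ($value) = @_;
    my ( %enabled, %disabled );

    # Items are separated by commas and arbitrary whitespace.
    foreach my $item ( split /\s*,\s*/, $value ) {
        $item =~ s/^\s+|\s+$//g;
        next unless length $item;

        if ( $item eq 'standard' ) {
            $enabled{$_} = 1 foreach @STANDARD;
        }
        elsif ( $item =~ /^no_(\w+)$/ ) {
            $disabled{$1} = 1;          # remembered and applied at the end
        }
        else {
            $enabled{$item} = 1;
        }
    }

    # Apply the 'no_' entries last, so the order of items does not matter.
    delete @enabled{ keys %disabled };
    return \%enabled;
}

my $features = expand_features('standard, no_doc_paths');
print join( ', ', sort keys %$features ), "\n";
```

Applying the C<no_> entries after everything else is a design choice of this sketch: it makes C<'no_doc_paths, standard'> behave the same as C<'standard, no_doc_paths'>.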

=head4 stream_output

If this feature is enabled, then any response body larger than the value of L
will be streamed to the client instead of being sent in a single chunk. This
feature should be enabled for any service which can produce large responses,
because otherwise the process of marshalling such responses will take up large
amounts of server memory and CPU time, and may cause excessive paging.

=head3 special_params [req] [inst]

The Web::DataService module can process certain request parameters in special
ways. Each of these special parameters has an internal name for use in the
data service application code, and an external name which you can set to any
string you choose. It is this external parameter name which is used by clients
when making requests to the data service.

The value of C must be a list of special parameter internal names. You can
turn off any of these by prefixing the name with C, and you can change the
external name (i.e. the name actually used in requests) by adding C<=name>.
The name C enables the following set of parameters:

    show, limit, offset, header, datainfo, count, vocab, linebreak, save

So the following attribute value would enable the parameters listed above
except for 'datainfo', and would set the external name of the 'header'
parameter to 'head'.

    special_params => 'standard, no_datainfo, header=head'

Once a set of special parameters is chosen, clients of the data service may
include any of them (or none) in any request.

The special parameters are as follows:

=head4 selector

If enabled, this special parameter is used to select which version of the data
service should respond to the request. Its external name defaults to C unless
overridden. If you enable this parameter, then you should give each data
service a different value for the attribute L.
If you are running multiple versions of your data service from a single
application, or I, then you should either enable this parameter from the very
beginning or use a different value of L for each of your data services. One or
the other mechanism will ensure that the proper version of your service is
selected to respond to each request. See the L section of L for a more
comprehensive discussion.

=head4 format

If enabled, this special parameter is used to select the response format for
the request. It is not included in the standard set, but you can turn it on if
you prefer your clients to select the response format by means of a parameter
rather than through a suffix on the URL path. If you do this, then you must
also disable the feature L.

=head4 show

If enabled, this special parameter is used to select optional output blocks in
addition to the default output for a particular request. In this way, clients
can tailor the output of each request to provide just the information they
need and leave out information they do not need. See the documentation for .

=head4 limit

If enabled, this special parameter is used to limit the number of result
records returned by a request. The data service attribute L can be used to
provide a default limit for any request that does not specify this parameter.
The value of this parameter can be any positive integer, 0, or the string C.
By using the latter value, a client can ensure that the entire result set is
provided.

This parameter, in combination with C, can be useful for data services that
are able to generate large result sets. This combination prevents clients from
accidentally sending in request URLs that generate enormous responses, while
allowing the ability to acquire the full results when necessary.
A client can either use this parameter with a value of C to obtain the entire
result set deliberately with one query, or use it in conjunction with L to
obtain a large result set using a series of requests, each of which returns a
portion of the desired result.

=head4 offset

If enabled, this parameter indicates that the response should start at the
indicated position in the result set rather than at the beginning. See also L.

=head4 count

If enabled, a true value for this parameter indicates that the response should
include not only the result of the data service operation but also a count of
the number of records found, the number returned, and the elapsed time taken
in executing the operation. A false value indicates that this information
should not be included. The attribute L specifies whether or not that
information will be included when this parameter is not specified. This is a L
(see below).

=head4 datainfo

If enabled, a true value for this parameter indicates that the response should
include not only the result of the data service operation but also a set of
descriptive information about the data. The attribute L specifies whether or
not that information will be included when this parameter is not specified.
This is a L (see below).

=head4 header

If enabled, a true value for this parameter indicates that the response should
include header material, the contents of which vary according to the output
format and the values of the C and C parameters if these are enabled. If
false, no header material should be included.

This parameter is ignored by the JSON output module. With a text format
response (tsv or csv), if this parameter is provided with a false value then
all header material is suppressed and only the data records (one per line) are
returned. The attribute L specifies whether or not the header will be included
when this parameter is not specified. This is a L (see below).
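
The mapping between internal and external names of special parameters, as described under C<special_params>, can be illustrated with a small standalone sketch. The helper name C<expand_special_params> is hypothetical and is not part of Web::DataService; the module's own handling may differ. The sketch assumes only what this document states: 'standard' enables the nine parameters listed for it, a C<no_> prefix (as in 'no_datainfo') turns a parameter off, and C<=name> (as in 'header=head') sets the external name.

```perl
use strict;
use warnings;

# The parameters that 'standard' enables, as listed in this document.
my @STANDARD = qw(show limit offset header datainfo count vocab linebreak save);

# Hypothetical helper (not part of Web::DataService): returns a hashref
# mapping each enabled internal parameter name to its external name.
sub expand_special_params {
    my ($value) = @_;
    my ( %external, %disabled );

    foreach my $item ( split /\s*,\s*/, $value ) {
        $item =~ s/^\s+|\s+$//g;
        next unless length $item;

        if ( $item eq 'standard' ) {
            # Keep any external name that was already set explicitly.
            $external{$_} //= $_ foreach @STANDARD;
        }
        elsif ( $item =~ /^no_(\w+)$/ ) {
            $disabled{$1} = 1;
        }
        elsif ( $item =~ /^(\w+)=(\w+)$/ ) {
            $external{$1} = $2;         # internal name => external name
        }
        else {
            $external{$item} //= $item;
        }
    }

    # Apply the 'no_' entries last, so the order of items does not matter.
    delete @external{ keys %disabled };
    return \%external;
}

my $params = expand_special_params('standard, no_datainfo, header=head');
print "$_ => $params->{$_}\n" foreach sort keys %$params;
```

With the value shown, the resulting map has no entry for 'datainfo' and maps the internal name 'header' to the external name 'head'; every other standard parameter keeps its internal name as its external name.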

=head4 linebreak

If enabled, this parameter can be used to select the linebreak sequence used
with text format responses. The accepted values are C for a carriage return, C
for a linefeed, and C for a carriage return/linefeed combination. The default
external name for this parameter is C.

=head4 save

If enabled, this parameter can be used to indicate that the response should be
saved to disk rather than displayed in a browser window. The server will
provide the appropriate headers, but it is up to the web browser or other
client software to decide how to handle them.

If this parameter is provided with a value other than C, C, C, C, C<1>, C<0>,
C, or C, then this value will be used as the default filename with the
selected response format appended as a suffix. You can also use the attribute
L to provide a default in case no filename was specified by the client.

=head4 vocab

If enabled, this parameter can be used by the client to specify which
vocabulary to use in expressing the result of a data service operation. The
client can use this to override the default vocabulary for the selected output
format, or to select a vocabulary if the format does not specify a default.
This special parameter is only relevant if you have defined one or more output
vocabularies for this data service.

=head3 foundation_plugin [req] [inst] [mod]

This attribute is not required if one of the known foundation frameworks
(currently only L) is already loaded. If you put C in your main application
file before the call to instantiate your data service, then the plugin L will
be loaded automatically. The purpose of this plugin module is to interact with
the foundation framework, to carry out tasks such as: receiving HTTP requests,
producing HTTP responses, and reading application configuration information.

The only reason you might need to specify this attribute explicitly is if you
wish to load a different plugin and override the default choice.
If you do so, and the named module is not already loaded, it will be
automatically loaded. See L for more about plugins.

=head3 templating_plugin [mod]

This attribute may be specified either at instantiation or in the
configuration file. It must be the name of a Perl module, and will be loaded
at instantiation time if it has not already been loaded. The purpose of this
plugin module is to interface with a templating engine for the purpose of
producing documentation pages and/or result pages [note: result pages are not
yet implemented].

If this attribute is not specified, and if the module L