Rev 311 | Blame | Compare with Previous | Last modification | View Log | RSS feed
############################################################################## Pod/Select.pm -- function to select portions of POD docs## Copyright (C) 1996-2000 by Bradford Appleton. All rights reserved.# This file is part of "PodParser". PodParser is free software;# you can redistribute it and/or modify it under the same terms# as Perl itself.#############################################################################package Pod::Select;use strict;use vars qw($VERSION @ISA @EXPORT $MAX_HEADING_LEVEL %myData @section_headings @selected_sections);$VERSION = '1.36'; ## Current version of this packagerequire 5.005; ## requires this Perl version or lateruse JatsError;#############################################################################=head1 NAMEPod::Select, podselect() - extract selected sections of POD from input=head1 SYNOPSISuse Pod::Select;## Select all the POD sections for each file in @filelist## and print the result on standard output.podselect(@filelist);## Same as above, but write to tmp.outpodselect({-output => "tmp.out"}, @filelist):## Select from the given filelist, only those POD sections that are## within a 1st level section named any of: NAME, SYNOPSIS, OPTIONS.podselect({-sections => ["NAME|SYNOPSIS", "OPTIONS"]}, @filelist):## Select the "DESCRIPTION" section of the PODs from STDIN and write## the result to STDERR.podselect({-output => ">&STDERR", -sections => ["DESCRIPTION"]}, \*STDIN);oruse Pod::Select;## Create a parser object for selecting POD sections from the input$parser = new Pod::Select();## Select all the POD sections for each file in @filelist## and print the result to tmp.out.$parser->parse_from_file("<&STDIN", "tmp.out");## Select from the given filelist, only those POD sections that are## within a 1st level section named any of: NAME, SYNOPSIS, OPTIONS.$parser->select("NAME|SYNOPSIS", "OPTIONS");for (@filelist) { $parser->parse_from_file($_); }## Select the "DESCRIPTION" and "SEE ALSO" sections of the PODs from## STDIN and write the result to STDERR.$parser->select("DESCRIPTION");$parser->add_selection("SEE ALSO");$parser->parse_from_filehandle(\*STDIN, \*STDERR);=head1 REQUIRESperl5.005, Pod::Parser, Exporter, Carp=head1 EXPORTSpodselect()=head1 DESCRIPTIONB<podselect()> is a function which will extract specified sections ofpod documentation from an input stream. This ability is provided by theB<Pod::Select> module which is a subclass of B<Pod::Parser>.B<Pod::Select> provides a method named B<select()> to specify the set ofPOD sections to select for processing/printing. B<podselect()> merelycreates a B<Pod::Select> object and then invokes the B<podselect()>followed by B<parse_from_file()>.=head1 SECTION SPECIFICATIONSB<podselect()> and B<Pod::Select::select()> may be given one or more"section specifications" to restrict the text processed to only thedesired set of sections and their corresponding subsections. A sectionspecification is a string containing one or more Perl-style regularexpressions separated by forward slashes ("/"). If you need to use aforward slash literally within a section title you can escape it with abackslash ("\/").The formal syntax of a section specification is:=over 4=item *I<head1-title-regex>/I<head2-title-regex>/...=backAny omitted or empty regular expressions will default to ".*".Please note that each regular expression given is implicitlyanchored by adding "^" and "$" to the beginning and end. Also, if agiven regular expression starts with a "!" character, then theexpression is I<negated> (so C<!foo> would match anything I<except>C<foo>).Some example section specifications follow.=over 4=item *Match the C<NAME> and C<SYNOPSIS> sections and all of their subsections:C<NAME|SYNOPSIS>=item *Match only the C<Question> and C<Answer> subsections of the C<DESCRIPTION>section:C<DESCRIPTION/Question|Answer>=item *Match the C<Comments> subsection of I<all> sections:C</Comments>=item *Match all subsections of C<DESCRIPTION> I<except> for C<Comments>:C<DESCRIPTION/!Comments>=item *Match the C<DESCRIPTION> section but do I<not> match any of its subsections:C<DESCRIPTION/!.+>=item *Match all top level sections but none of their subsections:C</!.+>=back=begin _NOT_IMPLEMENTED_=head1 RANGE SPECIFICATIONSB<podselect()> and B<Pod::Select::select()> may be given one or more"range specifications" to restrict the text processed to only thedesired ranges of paragraphs in the desired set of sections. A rangespecification is a string containing a single Perl-style regularexpression (a regex), or else two Perl-style regular expressions(regexs) separated by a ".." (Perl's "range" operator is "..").The regexs in a range specification are delimited by forward slashes("/"). If you need to use a forward slash literally within a regex youcan escape it with a backslash ("\/").The formal syntax of a range specification is:=over 4=item */I<start-range-regex>/[../I<end-range-regex>/]=backWhere each the item inside square brackets (the ".." followed by theend-range-regex) is optional. Each "range-regex" is of the form:=cmd-expr text-exprWhere I<cmd-expr> is intended to match the name of one or more PODcommands, and I<text-expr> is intended to match the paragraph text forthe command. If a range-regex is supposed to match a POD command, thenthe first character of the regex (the one after the initial '/')absolutely I<must> be a single '=' character; it may not be anythingelse (not even a regex meta-character) if it is supposed to matchagainst the name of a POD command.If no I<=cmd-expr> is given then the text-expr will be matched againstplain textblocks unless it is preceded by a space, in which case it ismatched against verbatim text-blocks. If no I<text-expr> is given thenonly the command-portion of the paragraph is matched against.Note that these two expressions are each implicitly anchored. Thismeans that when matching against the command-name, there will be animplicit '^' and '$' around the given I<=cmd-expr>; and when matchingagainst the paragraph text there will be an implicit '\A' and '\Z'around the given I<text-expr>.Unlike with section-specs, the '!' character does I<not> have any specialmeaning (negation or otherwise) at the beginning of a range-spec!Some example range specifications follow.=over 4=itemMatch all C<=for html> paragraphs:C</=for html/>=itemMatch all paragraphs between C<=begin html> and C<=end html>(note that this will I<not> work correctly if such sectionsare nested):C</=begin html/../=end html/>=itemMatch all paragraphs between the given C<=item> name until the end of thecurrent section:C</=item mine/../=head\d/>=itemMatch all paragraphs between the given C<=item> until the next item, oruntil the end of the itemized list (note that this will I<not> work asdesired if the item contains an itemized list nested within it):C</=item mine/../=(item|back)/>=back=end _NOT_IMPLEMENTED_=cut##############################################################################use diagnostics;use Carp;use Pod::Parser 1.04;@ISA = qw(Pod::Parser);@EXPORT = qw(&podselect);## Maximum number of heading levels supported for '=headN' directives*MAX_HEADING_LEVEL = \3;#############################################################################=head1 OBJECT METHODSThe following methods are provided in this module. Each one takes areference to the object itself as an implicit first parameter.=cut##---------------------------------------------------------------------------## =begin _PRIVATE_#### =head1 B<_init_headings()>#### Initialize the current set of active section headings.#### =cut#### =end _PRIVATE_sub _init_headings {my $self = shift;local *myData = $self;## Initialize current section heading titles if necessaryunless (defined $myData{_SECTION_HEADINGS}) {local *section_headings = $myData{_SECTION_HEADINGS} = [];for (my $i = 0; $i < $MAX_HEADING_LEVEL; ++$i) {$section_headings[$i] = '';}}}##---------------------------------------------------------------------------=head1 B<curr_headings()>($head1, $head2, $head3, ...) = $parser->curr_headings();$head1 = $parser->curr_headings(1);This method returns a list of the currently active section headings andsubheadings in the document being parsed. The list of headings returnedcorresponds to the most recently parsed paragraph of the input.If an argument is given, it must correspond to the desired sectionheading number, in which case only the specified section heading isreturned. If there is no current section heading at the specifiedlevel, then C<undef> is returned.=cutsub curr_headings {my $self = shift;$self->_init_headings() unless (defined $self->{_SECTION_HEADINGS});my @headings = @{ $self->{_SECTION_HEADINGS} };return (@_ > 0 and $_[0] =~ /^\d+$/) ? $headings[$_[0] - 1] : @headings;}##---------------------------------------------------------------------------=head1 B<select()>$parser->select($section_spec1,$section_spec2,...);This method is used to select the particular sections and subsections ofPOD documentation that are to be printed and/or processed. The existingset of selected sections is I<replaced> with the given set of sections.See B<add_selection()> for adding to the current set of selectedsections.Each of the C<$section_spec> arguments should be a section specificationas described in L<"SECTION SPECIFICATIONS">. The section specificationsare parsed by this method and the resulting regular expressions arestored in the invoking object.If no C<$section_spec> arguments are given, then the existing set ofselected sections is cleared out (which means C<all> sections will beprocessed).This method should I<not> normally be overridden by subclasses.=cutsub select {my ($self, @sections) = @_;local *myData = $self;local $_;### NEED TO DISCERN A SECTION-SPEC FROM A RANGE-SPEC (look for m{^/.+/$}?)##---------------------------------------------------------------------## The following is a blatant hack for backward compatibility, and for## implementing add_selection(). If the *first* *argument* is the## string "+", then the remaining section specifications are *added*## to the current set of selections; otherwise the given section## specifications will *replace* the current set of selections.#### This should probably be fixed someday, but for the present time,## it seems incredibly unlikely that "+" would ever correspond to## a legitimate section heading##---------------------------------------------------------------------my $add = ($sections[0] eq '+') ? shift(@sections) : '';## Reset the set of sections to useunless (@sections) {delete $myData{_SELECTED_SECTIONS} unless ($add);return;}$myData{_SELECTED_SECTIONS} = []unless ($add && exists $myData{_SELECTED_SECTIONS});local *selected_sections = $myData{_SELECTED_SECTIONS};## Compile each specfor my $spec (@sections) {if ( defined($_ = _compile_section_spec($spec)) ) {## Store them in our sections arraypush(@selected_sections, $_);}else {carp qq{Ignoring section spec "$spec"!\n};}}#DebugDumpData("select",\@sections, \@selected_sections );}##---------------------------------------------------------------------------=head1 B<add_selection()>$parser->add_selection($section_spec1,$section_spec2,...);This method is used to add to the currently selected sections andsubsections of POD documentation that are to be printed and/orprocessed. See <select()> for replacing the currently selected sections.Each of the C<$section_spec> arguments should be a section specificationas described in L<"SECTION SPECIFICATIONS">. The section specificationsare parsed by this method and the resulting regular expressions arestored in the invoking object.This method should I<not> normally be overridden by subclasses.=cutsub add_selection {my $self = shift;return $self->select('+', @_);}##---------------------------------------------------------------------------=head1 B<clear_selections()>$parser->clear_selections();This method takes no arguments, it has the exact same effect as invoking<select()> with no arguments.=cutsub clear_selections {my $self = shift;return $self->select();}##---------------------------------------------------------------------------=head1 B<match_section()>$boolean = $parser->match_section($heading1,$heading2,...);Returns a value of true if the given section and subsection headingtitles match any of the currently selected section specifications ineffect from prior calls to B<select()> and B<add_selection()> (or ifthere are no explicitly selected/deselected sections).The arguments C<$heading1>, C<$heading2>, etc. are the heading titles ofthe corresponding sections, subsections, etc. to try and match. IfC<$headingN> is omitted then it defaults to the current correspondingsection heading title in the input.This method should I<not> normally be overridden by subclasses.=cutsub match_section {my $self = shift;my (@headings) = @_;local *myData = $self;## Return true if no restrictions were explicitly specifiedmy $selections = (exists $myData{_SELECTED_SECTIONS})? $myData{_SELECTED_SECTIONS} : undef;return 1 unless ((defined $selections) && @{$selections});## Default any unspecified sections to the current onemy @current_headings = $self->curr_headings();for (my $i = 0; $i < $MAX_HEADING_LEVEL; ++$i) {(defined $headings[$i]) or $headings[$i] = $current_headings[$i];}## Look for a match against the specified section expressionsfor my $section_spec ( @{$selections} ) {##------------------------------------------------------## Each portion of this spec must match in order for## the spec to be matched. So we will start with a## match-value of 'true' and logically 'and' it with## the results of matching a given element of the spec.##------------------------------------------------------my $match = 1;#DebugDumpData("Match", $section_spec, \@headings );for (my $i = 0; $i < $MAX_HEADING_LEVEL; ++$i) {my $regex = $section_spec->[$i];my $negated = ($regex =~ s/^\!//);$match &= ($negated ? ($headings[$i] !~ /${regex}/): ($headings[$i] =~ /${regex}/));last unless ($match);}return 1 if ($match);}return 0; ## no match}##---------------------------------------------------------------------------=head1 B<is_selected()>$boolean = $parser->is_selected($paragraph);This method is used to determine if the block of text given inC<$paragraph> falls within the currently selected set of POD sectionsand subsections to be printed or processed. This method is alsoresponsible for keeping track of the current input section andsubsections. It is assumed that C<$paragraph> is the most recently read(but not yet processed) input paragraph.The value returned will be true if the C<$paragraph> and the rest of thetext in the same section as C<$paragraph> should be selected (included)for processing; otherwise a false value is returned.=cutsub is_selected {my ($self, $paragraph) = @_;local $_;local *myData = $self;$self->_init_headings() unless (defined $myData{_SECTION_HEADINGS});## Keep track of current sections levels and headings$_ = $paragraph;if (/^=((?:sub)*)(?:head(?:ing)?|sec(?:tion)?)(\d*)\s+(.*?)\s*$/){## This is a section heading commandmy ($level, $heading) = ($2, $3);$level = 1 + (length($1) / 3) if ((! length $level) || (length $1));## Reset the current section heading at this level$myData{_SECTION_HEADINGS}->[$level - 1] = $heading;## Reset subsection headings of this one to emptyfor (my $i = $level; $i < $MAX_HEADING_LEVEL; ++$i) {$myData{_SECTION_HEADINGS}->[$i] = '';}}return $self->match_section();}#############################################################################=head1 EXPORTED FUNCTIONSThe following functions are exported by this module. Please note thatthese are functions (not methods) and therefore C<do not> take animplicit first argument.=cut##---------------------------------------------------------------------------=head1 B<podselect()>podselect(\%options,@filelist);B<podselect> will print the raw (untranslated) POD paragraphs of allPOD sections in the given input files specified by C<@filelist>according to the given options.If any argument to B<podselect> is a reference to a hash(associative array) then the values with the following keys areprocessed as follows:=over 4=item B<-output>A string corresponding to the desired output file (or ">&STDOUT"or ">&STDERR"). The default is to use standard output.=item B<-sections>A reference to an array of sections specifications (as described inL<"SECTION SPECIFICATIONS">) which indicate the desired set of PODsections and subsections to be selected from input. If no sectionspecifications are given, then all sections of the PODs are used.=begin _NOT_IMPLEMENTED_=item B<-ranges>A reference to an array of range specifications (as described inL<"RANGE SPECIFICATIONS">) which indicate the desired range of PODparagraphs to be selected from the desired input sections. If no rangespecifications are given, then all paragraphs of the desired sectionsare used.=end _NOT_IMPLEMENTED_=backAll other arguments should correspond to the names of input filescontaining POD sections. A file name of "-" or "<&STDIN" willbe interpreted to mean standard input (which is the default if nofilenames are given).=cutsub podselect {my(@argv) = @_;my %defaults = ();my $pod_parser = new Pod::Select(%defaults);my $num_inputs = 0;my $output = '>&STDOUT';my %opts;local $_;for (@argv) {if (ref($_)) {next unless (ref($_) eq 'HASH');%opts = (%defaults, %{$_});##-------------------------------------------------------------## Need this for backward compatibility since we formerly used## options that were all uppercase words rather than ones that## looked like Unix command-line options.## to be uppercase keywords)##-------------------------------------------------------------%opts = map {my ($key, $val) = (lc $_, $opts{$_});$key =~ s/^(?=\w)/-/;$key =~ /^-se[cl]/ and $key = '-sections';#! $key eq '-range' and $key .= 's';($key => $val);} (keys %opts);## Process the options(exists $opts{'-output'}) and $output = $opts{'-output'};## Select the desired sections$pod_parser->select(@{ $opts{'-sections'} })if ( (defined $opts{'-sections'})&& ((ref $opts{'-sections'}) eq 'ARRAY') );#! ## Select the desired paragraph ranges#! $pod_parser->select(@{ $opts{'-ranges'} })#! if ( (defined $opts{'-ranges'})#! && ((ref $opts{'-ranges'}) eq 'ARRAY') );}else {$pod_parser->parse_from_file($_, $output);++$num_inputs;}}$pod_parser->parse_from_file('-') unless ($num_inputs > 0);}#############################################################################=head1 PRIVATE METHODS AND DATAB<Pod::Select> makes uses a number of internal methods and data fieldswhich clients should not need to see or use. For the sake of avoidingname collisions with client data and methods, these methods and fieldsare briefly discussed here. Determined hackers may obtain furtherinformation about them by reading the B<Pod::Select> source code.Private data fields are stored in the hash-object whose reference isreturned by the B<new()> constructor for this class. The names of allprivate methods and data-fields used by B<Pod::Select> begin with aprefix of "_" and match the regular expression C</^_\w+$/>.=cut##---------------------------------------------------------------------------=begin _PRIVATE_=head1 B<_compile_section_spec()>$listref = $parser->_compile_section_spec($section_spec);This function (note it is a function and I<not> a method) takes asection specification (as described in L<"SECTION SPECIFICATIONS">)given in C<$section_sepc>, and compiles it into a list of regularexpressions. If C<$section_spec> has no syntax errors, then a referenceto the list (array) of corresponding regular expressions is returned;otherwise C<undef> is returned and an error message is printed (usingB<carp>) for each invalid regex.=end _PRIVATE_=cutsub _compile_section_spec {my ($section_spec) = @_;my (@regexs, $negated);## Compile the spec into a list of regexslocal $_ = $section_spec;s{\\\\}{\001}g; ## handle escaped backward slashess{\\/}{\002}g; ## handle escaped forward slashes## Parse the regexs for the heading titles@regexs = split(/\//, $_, $MAX_HEADING_LEVEL);## Set default regex for ommitted levelsfor (my $i = 0; $i < $MAX_HEADING_LEVEL; ++$i) {$regexs[$i] = '.*' unless ((defined $regexs[$i])&& (length $regexs[$i]));}## Modify the regexs as needed and validate their syntaxmy $bad_regexs = 0;for (@regexs) {$_ .= '.+' if ($_ eq '!');s{\001}{\\\\}g; ## restore escaped backward slashess{\002}{\\/}g; ## restore escaped forward slashes$negated = s/^\!//; ## check for negationeval "m{$_}"; ## check regex syntaxif ($@) {++$bad_regexs;carp qq{Bad regular expression /$_/ in "$section_spec": $@\n};}else {## Add the forward and rear anchors (and put the negator back)$_ = '^' . $_ unless (/^\^/);$_ = $_ . '$' unless (/\$$/);$_ = '!' . $_ if ($negated);}}return (! $bad_regexs) ? [ @regexs ] : undef;}##---------------------------------------------------------------------------=begin _PRIVATE_=head2 $self->{_SECTION_HEADINGS}A reference to an array of the current section heading titles for eachheading level (note that the first heading level title is at index 0).=end _PRIVATE_=cut##---------------------------------------------------------------------------=begin _PRIVATE_=head2 $self->{_SELECTED_SECTIONS}A reference to an array of references to arrays. Each subarray is a listof anchored regular expressions (preceded by a "!" if the expression is tobe negated). The index of the expression in the subarray should correspondto the index of the heading title in C<$self-E<gt>{_SECTION_HEADINGS}>that it is to be matched against.=end _PRIVATE_=cut#############################################################################=head1 SEE ALSOL<Pod::Parser>=head1 AUTHORPlease report bugs using L<http://rt.cpan.org>.Brad Appleton E<lt>bradapp@enteract.comE<gt>Based on code for B<pod2text> written byTom Christiansen E<lt>tchrist@mox.perl.comE<gt>=cut1;# vim: ts=4 sw=4 et