Need help with Perl Tutorial from Spidering Hacks

dmgmn
Posted Jun 24 2010 03:30 PM

I hope I'm in the right area. I looked in "Get Satisfaction" and it didn't seem like the right place for this question. If I'm wrong, please let me know.

Anyway, I'm working on Hack #19 and I'm getting this error:


Global symbol "$book" requires explicit package name at ./SpiderTutorial_19_01.pl line 23.
syntax error at ./SpiderTutorial_19_01.pl line 23, near "$book("
Unmatched right curly bracket at ./SpiderTutorial_19_01.pl line 29, at end of line
syntax error at ./SpiderTutorial_19_01.pl line 29, near "}"
Execution of ./SpiderTutorial_19_01.pl aborted due to compilation errors.


Here is the code.

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use HTML::TreeBuilder;

my $url = 'http://oreilly.com/store/complete.html';
my $page = get( $url ) or die $!;
my $p = HTML::TreeBuilder->new_from_content( $page );

my @links = $p->look_down(
	_tag => 'a',
	href => qr{^ /Qhttp://www.oreilly.com/catalog/\E \w+ $}x
);

my @rows = map { $_->parent->parent } @links;

my @books;
for my $row (@rows) {
	my %book;
	my @cells = $row->look_down( _tag => 'td' );
	$book{title}	=$cells[0]->as_trimmed-text;
	$book{price}	=$cells[2]->as_trimmed-text;
	$book(price} =~ s/^\$//;
	$book{url}	 	= get_url( $cells[0] );
	$book{ebook}	= get_url( $cells[3] );
	$book{safari}	= get_url( $cells[4] );
	$book{examples}	= get_url( $cells[5] );
	push @books, \%book;
}

sub get_url {
	my $node = shift;
	my @hrefs = $node-.look_down( _tag => 'a');
	return unless @hrefs;
	my $url = $hrefs[0]->atr('href');
	$url =~ s/\s+$//;
	return $url;
}

$p = $p->delete; #we don't need this anymore.

{
	my $count = 1;
	my @perlbooks = sort { $a->{price} <=> $b-.{price} }
					grep { $_->{title} =~/perl/i } @books;
	print $count++, "\t", $_->{price}, "\t", $_->{title} for @perlbooks;
}

{
	my @perlbooks = grep { $_->{title} =~ /perl/i } @books;
	my @javabooks = grep { $_->{title} =~ /java/i } @books;
	my $diff =  @javabooks - @perlbooks;
	print "There are ".@perlbooks." Perl books and ".@javabooks.
		" Java books. $diff more Java than Perl.";
}




I've run into this on at least one of the previous Hacks.

Any help would be greatly appreciated.



4 Replies

0
  jwgaynor
Posted Jun 25 2010 07:01 AM

You need to declare your variables near the start of your program, even if they are not yet defined. You are using 'strict', which requires that your variables be declared. Global declarations are done near the beginning of the program. A simple line such as this will do:

my($book,$page,$word1,$word2,$blah,$blahblah,$etc);
my @array=(); # declaring an empty array;

Subroutines may have local declarations:

sub something {
my $localvar = shift; # assign data passed in call
# do something with it
$localvar =~ s/xx//;
return($localvar); #send it back
}

#end sample

Hope that helps you get started. By-The-Way: O'Reilly has a terrific assortment of PERL books :rolleyes:
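
For what it's worth, here is a minimal sketch of how 'strict' treats declarations. It assumes nothing from the book's script beyond the %book and @books names; the titles and the $undeclared variable are made up for illustration. It suggests that 'my' can sit right where a variable is first used, and that $book{title} is an element of the hash %book, so it does not rely on a separate scalar $book being declared.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Under strict, each variable needs a declaration before its first use.
# "my" may appear at the top of the file or where the variable is first needed.
my @books;

for my $title ('Learning Perl', 'Perl & LWP') {
    my %book;                 # declared inside the loop, fresh on every pass
    $book{title} = $title;    # element of the hash %book, not a scalar $book
    push @books, \%book;
}

print scalar(@books), " books\n";
print "$_->{title}\n" for @books;

# An undeclared name (hypothetical $undeclared) is rejected at compile time.
# The string eval lets this script keep running and report the message.
my $ok = eval q{ $undeclared = 1; 1 };
print "strict says: $@" unless $ok;
```

Declaring everything at the top also works; keeping each 'my' close to its first use just makes it easier to see which names are really in play.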
+ 1
  RsrchBoy
Posted Jun 25 2010 11:41 AM

There's a syntax error:

$book(price} =~ s/^\$//;



should be

$book{price} =~ s/^\$//;



That is, { vs (.
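
A small sketch of why that one character matters, assuming a made-up %book hash rather than the data scraped in the script. With braces, Perl sees an element of %book. With a parenthesis, it appears to look for a scalar named $book, which would explain the 'Global symbol "$book"' message even though the real problem is the typo.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample data for illustration only.
my %book = ( title => 'Learning Perl', price => '$39.99' );

# Braces index into the hash %book; no scalar $book is involved.
$book{price} =~ s/^\$//;
print "price: $book{price}\n";

# The mistyped line, compiled separately in a string eval so that this
# script still runs and can print the resulting compile error.
my $ok = eval q[ $book(price} =~ s/^\$//; 1 ];
print "compile error: $@" unless $ok;
```

Running 'perl -c' on the script is a quick way to get the same compile-time check without executing anything.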
Posted Jun 25 2010 12:04 PM
Thanks to both of you. I found the syntax error earlier today but I could have been stuck on that one for a while.

I'm making progress but I've run into a new problem. I've added the rest of the script (I thought I was done but when I turned the page I found more).

This is the new code:

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use HTML::TreeBuilder;

my $url = 'http://oreilly.com/store/complete.html';
my $page = get( $url ) or die $!;
my $p = HTML::TreeBuilder->new_from_content( $page );
my($book);
my($edition);

my @links = $p->look_down(
	_tag => 'a',
	href => qr{^ /Qhttp://www.oreilly.com/catalog/\E \w+ $}x
);

my @rows = map { $_->parent->parent } @links;

my @books;
for my $row (@rows) {
	my %book;
	my @cells = $row->look_down( _tag => 'td' );
	$book{title}	=$cells[0]->as_trimmed-text;
	$book{price}	=$cells[2]->as_trimmed-text;
	$book{price} =~ s/^\$//;
	
	$book{url}	 	= get_url( $cells[0] );
	$book{ebook}	= get_url( $cells[3] );
	$book{safari}	= get_url( $cells[4] );
	$book{examples}	= get_url( $cells[5] );
	push @books, \%book;
}

sub get_url {
	my $node = shift;
	my @hrefs = $node->look_down( _tag => 'a');
	return unless @hrefs;
	my $url = $hrefs[0]->atr('href');
	$url =~ s/\s+$//;
	return $url;
}

$p = $p->delete; #we don't need this anymore.

{
	my $count = 1;
	my @perlbooks = sort { $a->{price} <=> $b->{price} }
					grep { $_->{title} =~/perl/i } @books;
	print $count++, "\t", $_->{price}, "\t", $_->{title} for @perlbooks;
}

{
	my @perlbooks = grep { $_->{title} =~ /perl/i } @books;
	my @javabooks = grep { $_->{title} =~ /java/i } @books;
	my $diff =  @javabooks - @perlbooks;
	print "There are ".@perlbooks." Perl books and ".@javabooks.
		" Java books. $diff more Java than Perl.";
}

for my $book ( $books[34] ) {
	my $url = $book->{url};
	my $page = get( $url );
	my $tree = HTML::TreeBuilder->new_from_content( $page );
	my ($pubinfo) = $tree->look_down(
									_tag => 'span',
									class => 'secondary2'
	);
	my $html = $pubinfo->as_HTML; print $html;
	my ($pages) = $html =~ /(\d+) pages/,
	my ($edition) = $html =~ /(\d)(?:st|nd|rd|th) Edition/;
	my ($date) = $html =~ /(\w+ (19|20)\d\d)/;
	
	print "\n$pages $edition $date\n";
	
	my ($img_node) = $tree->look_down(
									_tag => 'img',
									src  => qr{^/catalog/covers/},
	);
	my $img_url = 'http://www.oreilly.com'.$img_node->attr('src');
	my $cover = get( $img_url );
	# now save $cover to disk
}


I've added my($book); and my($edition); to the script but it only fixed the original problem. And I can't believe this wasn't in the tutorial but at least I'm getting help.

my $p = HTML::TreeBuilder->new_from_content( $page );
my($book);
my($edition);

my @links = $p->look_down(


And I have found other typos but now I'm stuck again. I'm getting these errors now:


Bareword "text" not allowed while "strict subs" in use at ./SpiderTutorial_19_06.pl line 23.
Bareword "text" not allowed while "strict subs" in use at ./SpiderTutorial_19_06.pl line 24.
Execution of ./SpiderTutorial_19_06.pl aborted due to compilation errors.

Again, any help would be greatly appreciated.


BTW, I have also bought Perl & LWP and Learning Perl 5th edition.
0
  brian_d_foy
Posted Jul 08 2010 04:05 PM

It looks like you're typing in the programs by hand and mistyping some of the code. These new syntax errors look like you typed a - instead of a _ in those lines of code. It should be "as_trimmed_text". Ensure that you are typing in the programs exactly as they are (or cut and paste from the programs in Safari Online).
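
As a rough illustration of where the 'Bareword "text"' message comes from: a hyphen in the middle of a method name is read as a minus sign, so Perl sees a call to as_trimmed followed by a subtraction of the bare word text. The sketch below uses a hypothetical FakeCell class as a stand-in for an HTML::Element node so it runs without fetching a page; only the method name as_trimmed_text is taken from the posted script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for an HTML::Element node (illustration only).
package FakeCell;
sub new { my ($class, $text) = @_; return bless { text => $text }, $class }
sub as_trimmed_text {
    my $self = shift;
    (my $t = $self->{text}) =~ s/^\s+|\s+$//g;   # strip leading/trailing spaces
    return $t;
}

package main;

my $cell = FakeCell->new("  Learning Perl  ");

# Mistyped: parsed as $cell->as_trimmed minus the bareword "text".
# Compiled in a string eval so this script can print the error and go on.
my $ok = eval q{ use strict; my $title = $cell->as_trimmed-text; 1 };
print "compile error: $@" unless $ok;

# Correct: a single method name joined by underscores.
my $title = $cell->as_trimmed_text;
print "title: '$title'\n";
```

The same reading applies to both lines that the error output points at, since each one calls the method with the same hyphen.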

You said that you've bought Learning Perl. Once you learn the language, you should have an easier time fixing these basic sorts of errors.

Good luck,