Millibrachiate Tentacular Coelenterate (nja) wrote in cyber_derives,

Geekery

Here am the bot.

You'll need Perl, and you'll also need to install the XML::RSSLite module (run ppm from the command prompt, then "install xml-rsslite"). I can't remember whether Cache::FileCache is part of the standard distribution, but if it isn't you'll need that too.

The bot creates a cache directory structure in the directory you run it from. The cache is never checked for expiry or for more recent versions of anything because frankly I couldn't be arsed, so it will only ever return the "most recent post" from the first time it saw that user unless you delete the cache directory. It's a well-behaved bot, so it only downloads stuff every ten seconds. It occasionally throws up when the RSS parser finds a malformed RSS feed (the full-on expat RSS parser is even worse/better, depending on whether you think a parser ought to cope with malformed XML).

The output is an HTML list in a file called lj.out. The strings "YOUR DESCRIPTION" and "YOUR EMAIL ADDRESS" need to be changed to your own details.
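
If the never-expiring cache gets annoying, Cache::FileCache can expire entries by itself. A minimal sketch, assuming the default_expires_in option described in the Cache::Cache documentation; the one-hour lifetime is an arbitrary choice for illustration, and the script that follows does not do this:

```perl
use strict;
use Cache::FileCache;

# Same cache settings as the bot, plus an assumed lifetime of
# 3600 seconds (one hour) for every entry stored with set().
my $cache = Cache::FileCache->new({
  'namespace'          => 'my-cache',
  'cache_root'         => './cache',
  'default_expires_in' => 3600 }
  );

# Once an entry has expired, get() returns undef for its key, so a
# lookup that checks the cache first falls through to the web again.
```

With that in place there should be no need to delete the cache directory by hand to see newer posts.
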
use strict;
use LWP::RobotUA;
use HTTP::Response;
use HTTP::Request;
use XML::RSSLite;
use Cache::FileCache;

sub get_entries($);
sub get_friends($);
sub cached_get($$);

my $cache = Cache::FileCache->new({
  'namespace' => 'my-cache',
  'cache_root' => './cache' }
  );

my $robot = LWP::RobotUA->new(
  'YOUR DESCRIPTION', 'YOUR EMAIL ADDRESS');
$robot->delay(10/60);  # delay() takes minutes, so this is ten seconds

my %users_seen;
my $nusers = 0;
my $user = "nja";

open OUT, ">lj.out" or die "Can't open lj.out: $!\n";
print OUT "<ul>\n";
while ($nusers < 20)
{

  print "User : $user";
  my $e = get_entries($user);
  my $f = get_friends($user);
  
  my $link = $e->[0];
  print OUT qq(<li><a href="$link">$user</a>  </li>\n);
  if ($users_seen{$user})
  {
    print "Loop.\n";
    last;
  }

  $users_seen{$user} = 1;
  $nusers++;
  print "Maximum reached.\n" if ($nusers == 20);
  my $new = int rand @$f;
  $user = $f->[$new];
}
print OUT "</ul>\n";
close OUT;
  
exit;

#----------------------------------------------------------------------
sub get_entries($)
{
  my $user = shift @_;
  my $url = "http://www.livejournal.com/users/$user/data/rss";
  my $response = cached_get($url, 1);

  my @links;
  if (defined $response)
  {
    my %rss;
    my $content = $response->content;
    parseRSS(\%rss, \$content);
    if (ref $rss{'item'} eq 'ARRAY')
    {
      foreach my $item (@{$rss{'item'}})
      {
        push @links, $item->{'link'};
      }
    }
  }
  die "No entries!\n" unless (@links);
  return \@links;
}
    
#----------------------------------------------------------------------
sub get_friends($)
{
  my $user = shift @_;
  my $url = "http://www.livejournal.com/misc/fdata.bml?user=$user";
  my $response = cached_get($url, 0);

  my @friends;
  foreach (split /\n/, $response->content)
  {
    chomp;
    if (/^</)
    {
      s/^<\s*//;
      push @friends, $_;
    }
  }
  die "No friends!\n" unless (@friends);
  return \@friends;
}

#----------------------------------------------------------------------
sub cached_get($$)
{
  my ($url, $verbose) = @_;
  my $request = HTTP::Request->new('GET', $url);

  # Try the cache first; fall back to the web on a miss.
  my $response = $cache->get($url);
  print ((defined $response) ? " (cache)\n" : " (web)\n") if $verbose;
  $response = $robot->request($request) unless (defined $response);
  die "Catastrophic failure!\n" unless (defined $response);

  if ($response->code == 200)
  {
    die "No content!\n" unless ($response->content);
    $cache->set($url, $response);
  }
  else
  {
    die "Oops!  " .  $response->code . " " . $response->message . "\n";
  }
  return $response;
}
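
The bot dies outright whenever a feed is malformed or a user has no entries or friends. One way round that is to wrap the fetch in eval and skip the offending user. A minimal sketch of the idea, not part of the script above: fetch_friends, try_fetch and the %friends_of data are hypothetical stand-ins so that it runs without the network.

```perl
use strict;
use warnings;

# Hypothetical stand-in for get_friends(): a tiny in-memory friend list.
# The user 'broken' simulates a feed that makes the parser die.
my %friends_of = (
  'alice' => [ 'bob', 'broken' ],
  'bob'   => [ 'alice' ],
);

sub fetch_friends
{
  my ($user) = @_;
  die "Malformed feed for $user\n" if $user eq 'broken';
  die "No friends!\n" unless $friends_of{$user};
  return $friends_of{$user};
}

# Run the fetch inside eval so that a die becomes a warning and an
# undef return value instead of ending the whole script.
sub try_fetch
{
  my ($user) = @_;
  my $friends = eval { fetch_friends($user) };
  unless (defined $friends)
  {
    warn "Skipping $user: $@";
    return;
  }
  return $friends;
}

foreach my $user ('alice', 'broken')
{
  my $friends = try_fetch($user);
  print "$user: ", (defined $friends ? "@$friends" : "skipped"), "\n";
}
```

In the real loop, an undef result could be the cue to pick a different random friend rather than giving up.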
