DBD::SQLite::Cookbook.3pm

Langue: en

Version: 2010-01-08 (fedora - 01/12/10)

Section: 3 (Bibliothèques de fonctions)

NAME

DBD::SQLite::Cookbook - The DBD::SQLite Cookbook

DESCRIPTION

This is the DBD::SQLite cookbook.

It is intended to provide a place to keep a variety of functions and formals for use in callback APIs in DBD::SQLite.

Variance

This is a simple aggregate function which returns a variance. It is adapted from an example implementation in pysqlite.
   package variance;
   
   sub new { bless [], shift; }
   
   sub step {
       my ( $self, $value ) = @_;
   
       push @$self, $value;
   }
   
   sub finalize {
       my $self = $_[0];
   
       my $n = @$self;
   
       # Variance is NULL unless there is more than one row
       return undef unless $n || $n == 1;
   
       my $mu = 0;
       foreach my $v ( @$self ) {
           $mu += $v;
       }
       $mu /= $n;
   
       my $sigma = 0;
       foreach my $v ( @$self ) {
           $sigma += ($x - $mu)**2;
       }
       $sigma = $sigma / ($n - 1);
   
       return $sigma;
   }
   
   # NOTE: If you use an older DBI (< 1.608),
   # use $dbh->func(..., "create_aggregate") instead.
   $dbh->sqlite_create_aggregate( "variance", 1, 'variance' );
 
 

The function can then be used as:

   SELECT group_name, variance(score)
   FROM results
   GROUP BY group_name;
 
 

Variance (Memory Efficient)

A more efficient variance function, optimized for memory usage at the expense of precision:
   package variance2;
   
   my $sum   = 0;
   my $count = 0;
   my %hash;
   
   sub new { bless [], shift; }
   
   sub step {
       my ( $self, $value ) = @_;
   
       # by truncating and hashing, we can comsume many more data points
       $value = int($value); # change depending on need for precision
                             # use sprintf for arbitrary fp precision
       if (defined $hash{$value}) {
           $hash{$value}++;
       } else {
           $hash{$value} = 1;
       }
       $sum += $value;
       $count++;
   }
   
   sub finalize {
       my $self = $_[0];
   
       # Variance is NULL unless there is more than one row
       return undef unless $count > 1;
   
       # calculate avg
       my $mu = $sum / $count;
   
       my $sigma = 0;
       foreach my $h (keys %hash) {
           $sigma += (($h - $mu)**2) * $hash{$h};
       }
       $sigma = $sigma / ($count - 1);
   
       return $sigma;
   }
 
 

The function can then be used as:

   SELECT group_name, variance2(score)
   FROM results
   GROUP BY group_name;
 
 

Variance (Highly Scalable)

A third variable implementation, designed for arbitrarily large data sets:
   package variance;
   
   my $mu = 0;
   my $count = 0;
   my $S = 0
   
   sub new { bless [], shift; }
   
   sub step {
       my ( $self, $value ) = @_;
       $count++;
       $delta = $value - $mu;
       $mu = $mu + $delta/$count
       $S = $S + $delta*($value - $mu);
   }
   
   sub finalize {
       my $self = $_[0];
       return $S / ($count - 1);
   }
 
 

The function can then be used as:

   SELECT group_name, variance3(score)
   FROM results
   GROUP BY group_name;
 
 

SUPPORT

Bugs should be reported via the CPAN bug tracker at

http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBD-SQLite <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBD-SQLite>

TO DO

* Add more and varied cookbook recipes, until we have enough to turn them into a seperate CPAN distribution.

* Create a series of tests scripts that validate the cookbook recipies.

AUTHOR

Adam Kennedy <adamk@cpan.org> Copyright 2009 Adam Kennedy.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

The full text of the license can be found in the LICENSE file included with this module.