Getting started with Perl XS and Inline::C

Inline::C, The eminently palatable approach to Perl XS.

We should forget about the small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.

- Donald Knuth, "Structured Programming with go to Statements"

Dave Oswald

[email protected]@cpan.org

What is XS?

XS is an interface description file format used to create an extension interface between Perl and C code (or a C library) which one wishes to use with Perl.

perldoc perlxs

It's a complex and awkward intermediary language.

Simon Cozens (co-author: Extending and Embedding Perl), in Advanced Perl Programming, 2nd Edition.

Hooking Perl to C using XS requires you to write a shell .pm module to bootstrap an object file that has been compiled from C code, which was in turn generated by xsubpp from a .xs source file containing pseudo-C annotated with an XS interface description. If that sounds horribly complicated, then you have achieved an accurate understanding of the use of xsubpp.

Damian Conway in Perl Best Practices

There is a big fat learning curve involved with setting up and using the XS environment.

Brian "Ingy" Ingerson (Inline POD)

The Cognitive Load is high, which is counter-indicated for producing cleanly implemented, bug-free code.

Dave Oswald

...and then he said "btw, do you know C++?" and I do, though I prefer to work in perl, and he said "good, cuz these log files might be terrabytes big, and in that case we're going to need c++"

reyjyar (http://perlmonks.org/?node=38117)

Tell him you'll write a prototype in Perl, and if that turns out not to be fast enough, you'll recode the critical parts in C, thanks to Perl's sophisticated interlanguage binding and runtime profiling tools.

Randal Schwartz (http://perlmonks.org/?node=38180)

The Binary Search (pure Perl prototype)

    sub binsearch (&$\@) {
        my ( $code, $target, $aref ) = @_;
        my ( $min, $max ) = ( 0, $#{$aref} );
        no strict 'refs';    # Symbolic ref abuse follows.
        while ( $max > $min ) {
            my $mid = int( ( $min + $max ) / 2 );
            local ( ${caller() . '::a'}, ${caller() . '::b'} )
                = ( $target, $aref->[$mid] );
            if ( $code->( $target, $aref->[$mid] ) > 0 ) {
                $min = $mid + 1;
            }
            else {
                $max = $mid;
            }
        }
        local ( ${caller() . '::a'}, ${caller() . '::b'} )
            = ( $target, $aref->[$min] );
        return $min if $code->( $target, $aref->[$min] ) == 0;
        return;    # Not found.
    }

The XS rewrite

List::BinarySearch::XS
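As a rough idea of what the same search looks like once it moves to C, here is a standalone sketch over a sorted int array. It mirrors the narrowing loop of the Perl prototype but is not the List::BinarySearch::XS source: the name binsearch_int, the fixed int element type (instead of a comparison callback), and the -1 "not found" convention are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: index of the first element equal to target in a
 * sorted array, or -1 if it is absent. Same loop shape as the Perl
 * prototype: narrow [min, max] until the two bounds meet. */
static long binsearch_int(const int *a, size_t n, int target)
{
    size_t min = 0, max;

    if (n == 0)
        return -1;                          /* nothing to search */
    max = n - 1;

    while (max > min) {
        size_t mid = min + (max - min) / 2; /* avoids overflow of min + max */
        if (target > a[mid])
            min = mid + 1;                  /* target lies above mid */
        else
            max = mid;                      /* target is at or below mid */
    }
    return a[min] == target ? (long)min : -1;
}
```

Called on a sorted array such as { 100, 200, 300, 300, 400, 500 }, a search for 300 lands on the first of the two matching elements, and a search for 250 reports not found.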

A Shortest Path

Use Inline::C to prototype the code.

Paste output (with minor modifications) into an XS framework for release.

Ingy got fed up with writing XS. Advanced Perl Programming, 2nd edition.

What IS Inline::C

Inline::C was created by Ingy döt Net.

Inline::C creates XS code, but you mostly don't need to care how.

Embed C in your Perl application

Inline::C creates XS bindings, making the C code available to your application.

On first run it compiles the C source code.

Subsequent runs use the previously compiled dynamic library.

Compiled code is cached unless a change is made to the C source.

Modules build on install, so no delay for end user once installed.

Inline::C: Hello World!

From the Inline::C-Cookbook (on CPAN), Hello World in a HERE doc.

    use Inline C=>

    more e_101c.xs

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"
    #include "INLINE.h"

    void greet() { printf("Hello, world\n"); }

    MODULE = e_101c  PACKAGE = main

    PROTOTYPES: DISABLE

    void
    greet ()
        PREINIT:
        I32* temp;
        PPCODE:
        temp = PL_markstack_ptr++;
        greet();
        if (PL_markstack_ptr != temp) {
            /* truly void, because dXSARGS not invoked */
            PL_markstack_ptr = temp;
            XSRETURN_EMPTY; /* return empty stack */
        }
        /* must have used dXSARGS; list context implied */
        return; /* assume stack size is correct */

The __C__ segment

At the end of a Perl program, add an __END__ or __DATA__ tag, and on a subsequent line add a __C__ tag.

Everything after that will be treated as C code.

An example:

    use strict;
    use warnings;
    use Inline C => 'DATA';    # 'DATA' is actually the default.

    greet();

    __DATA__
    __C__
    void greet() { printf("Hello world!\n"); }

Batteries Included: Basic Data Types.

Inline::C already knows how to deal with passing basic C data types as parameters and return values.

See perl/lib/ExtUtils/typemap for the full list.

Let's pass C ints into and out of a function.

    use Inline 'C';

    my ( $apples, $oranges ) = ( 10, 5 );
    my $total_fruit = add( $apples, $oranges );
    print "$apples apples plus $oranges oranges equals "
        . "$total_fruit pieces of fruit.\n";

    __END__
    __C__
    int add( int first, int second ) { return first + second; }

Returning a string is almost as easy (ignoring Unicode for now)

    use Readonly;
    use Inline C => 'DATA';

    Readonly my $string => return_string();
    print $string;

    __DATA__
    __C__
    char* return_string() { return "Hello world!\n"; }

It's not going to stay that easy forever

If your data type isn't in the typemap file, it doesn't get converted automatically.

You may supply additional typemaps of your own.

If you try to pass a type that isn't auto-converted, you will get a strange error message at compile-time.

    int clength( char const *str ) { }

    Use of inherited AUTOLOAD for non-method main::clength() is deprecated at simple params2.pl line 11.

This and other cryptic messages will taunt you often.

A quick definition of terms

Perl has the following containers:

SV = Scalar Value

AV = Array Value: Contains Scalars.

HV = Hash Value: Contains Scalars.

SV's can contain one (or often more) of the following:

IV = Integer Value (or pointer)

UV = Unsigned Int

NV = double

PV = string

CV = A coderef (subref)

RV = a pointer to another SV (SV*)

(Source: perlguts)

I'm afraid it's time to read perldoc perlguts.

perlguts introduces the components that comprise the Perl API

The functions documented in perlguts will be your means of manipulating Perl data in C.

The four scenarios

These four calling scenarios are documented in Inline::C

Simple: Fixed number of params, all types built into the typemap:

int Foo ( int arg1, char* arg2, SV* arg3 )

All arguments and single return value are specified in the typemap, so you just write your pretty little C subroutine and life is good. The conversions are automatic.

Return a list or nothing. Parameter list fixed.

void Foo( int arg1, char* arg2, SV* arg3 )

Either you really want to return nothing, or you want to build the return value yourself and push it onto The Stack. Either way you have to be explicit.

Variable length parameter list, simple return.

char* Foo( SV* arg1, ... )

Pop arguments off of The Stack.

Variable length return, variable length param list.

void Foo( SV* arg1, ... )

Void return and unfixed number of args: Combine 2nd and 3rd techniques.

The First Scenario

The first situation passes basic data types (or none), and returns a single basic data type. Parameter list is fixed length.

    use Inline C => 'DATA';

    print add_one( 10 ), "\n";

    __DATA__
    __C__
    int add_one( int arg ) { return arg + 1; }

The Stack

The stack is Perl's internal means of passing parameters to, and returning values from, subroutines.

INLINE.h defines a set of convenient macros used to manipulate the stack. These are sugar for less convenient XS macros.

    Inline_Stack_Vars
    Inline_Stack_Items
    Inline_Stack_Item(i)
    Inline_Stack_Reset
    Inline_Stack_Push(sv)
    Inline_Stack_Done
    Inline_Stack_Return(n)
    Inline_Stack_Void

Examples are better than definitions...

The Second Scenario

Returning a list. (C doesn't do this by nature.)

    __C__
    void Foo( int arg1, char* arg2, SV* arg3 ) {
        int i;
        int max = arg1;    /* assumed here: the slide leaves max unset */
        Inline_Stack_Vars;
        Inline_Stack_Reset;
        for (i = 0; i < max; i++)
            Inline_Stack_Push(sv_2mortal(newSViv(i)));    /* mortal, so Perl frees it */
        Inline_Stack_Done;
    }

Other macros that create and manage new SVs dynamically are explained in the cookbook and in perlguts.

The Third Scenario

Variable length argument list. Returning a single basic type.

Variable length lists must pass at least one parameter.

    use Inline C => 'DATA';

    my $count = greet( qw/ George John Thomas James James / );
    print "That's the first $count presidents.\n";

    __DATA__
    __C__
    int greet(SV* name1, ...) {
        Inline_Stack_Vars;
        int i;
        for (i = 0; i < Inline_Stack_Items; i++)
            printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
        return i;
    }
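The one-parameter minimum echoes plain C, where a variadic function traditionally needs a named parameter ahead of the "...". The standalone sketch below is ordinary C using stdarg.h, not Inline::C, and is only meant to show the contrast: C has no counterpart to Inline_Stack_Items, so the caller has to pass the count itself. The names greet_all and count are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical plain-C analogue of greet(): the named 'count' parameter
 * tells the function how many char* arguments follow it. */
static int greet_all(int count, ...)
{
    va_list ap;
    int i;

    va_start(ap, count);    /* va_start needs the last named parameter */
    for (i = 0; i < count; i++)
        printf("Hello %s!\n", va_arg(ap, char *));
    va_end(ap);
    return i;               /* number of names greeted */
}
```

A call such as greet_all(3, "George", "John", "Thomas") greets three names and returns 3. With Inline::C the stack macros supply the argument count, which is why the Perl-facing version needs no such parameter.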

The Third scenario, part 2

Variable length arg list, void function (no return value).

When manipulating the stack, if there's no return value you must specify that explicitly.

    use Inline C => 'DATA';

    greet( qw/ George John Thomas James James / );

    __DATA__
    __C__
    void greet(SV* name1, ...) {
        Inline_Stack_Vars;
        int i;
        for (i = 0; i < Inline_Stack_Items; i++)
            printf("Hello %s!\n", SvPV(Inline_Stack_Item(i), PL_na));
        Inline_Stack_Void;    /* Explicitly specify no RV */
    }

The Fourth Scenario

Variable length argument list, return multiple values.

Use The Stack both to read the parameters and to push the return values.

Read items off the stack, and optionally modify them, with Inline_Stack_Item(). Inline_Stack_Item() is both a setter and a getter.

Call Inline_Stack_Reset if you plan to push to the stack. It resets the stack pointer to the beginning of the stack.

Inline_Stack_Push() as needed.

Inline_Stack_Done

...no longer a dark, gray bird, ugly and disagreeable to look at, but a graceful and beautiful swan.

Hans Christian Andersen The Ugly Duckling

Well, it's still ugly... It's C... and Perl... and Internals ;)

When coding for Inline::C (and XS, for that matter), you may access Perl's containers: SV's, AV's, HV's, etc.

C's native data types do less, but do it faster

Perl's containers do more but are slower.

Weigh the tradeoffs:

It's often easier to write generic algorithms that don't care what Perl's SV's contain, rather than extracting and manipulating the SV's contents.

If you're going to implement the "do more" functionality anyway, just use the Perl containers unless you need non-Perl portability.

If you're building a structure that will immediately get passed back, build it with Perl's containers.

Use basic C data types in tight loops or for non-Perl portability.

An aside: Duff's Device: A loop...

    do {                  /* count > 0 assumed */
        *to = *from++;    /* Note that the 'to' pointer is NOT incremented */
    } while (--count > 0);

An aside: Duff's Device: A loop...unrolled

    send(to, from, count)
    register short *to, *from;
    register count;
    {
        register n = (count + 7) / 8;
        switch (count % 8) {
        case 0: do { *to = *from++;
        case 7:      *to = *from++;
        case 6:      *to = *from++;
        case 5:      *to = *from++;
        case 4:      *to = *from++;
        case 3:      *to = *from++;
        case 2:      *to = *from++;
        case 1:      *to = *from++;
                } while (--n > 0);
        }
    }
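Duff's original writes to a single output location, which is why 'to' never advances. For an ordinary memory-to-memory copy the destination pointer has to be incremented as well. A sketch of that variant in ANSI C follows; duff_copy is a hypothetical name, and like the original it assumes count > 0.

```c
#include <stddef.h>

/* Hypothetical array-to-array variant of Duff's Device.
 * Unlike the original, 'to' is incremented. Requires count > 0. */
static void duff_copy(short *to, const short *from, size_t count)
{
    size_t n = (count + 7) / 8;    /* passes through the unrolled loop */

    switch (count % 8) {           /* jump into the loop body for the remainder */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

With count = 11, the switch enters at case 3 and copies three elements, then the loop runs one more full pass of eight. Whether this beats a plain loop or memcpy on a modern compiler is something to measure rather than assume.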

If your code is too slow, you must make it faster. If no better algorithm is available, you must trim cycles.

Tom "Duff's Device" Duff, comp.lang.c, Aug 29, 1988

The Sieve of Eratosthenes

Problem: Find all primes less than or equal to the integer 'n'.

One Efficient Method:

Sieve of Eratosthenes: uses a sieve to flag outcasts and retain candidates.

The sieve may be implemented as a bit vector for memory efficiency, provided the bit-vector implementation is also computationally efficient (often it's not). Otherwise, use a simple array.

O(n log log n) time complexity.

A pure function that lends itself well to computational benchmarking.

We will look at Pure Perl, Inline::C, and Inline::CPP (C++).

The Sieve of Eratosthenes (Pure Perl)

    sub pure_perl {
        my $top = ( $_[0] // $Bench::input ) + 1;
        return [] if $top < 2;
        my @primes = (1) x $top;
        my $i_times_j;
        for my $i ( 2 .. sqrt $top ) {
            if ( $primes[$i] ) {
                for ( my $j = $i; ( $i_times_j = $i * $j ) < $top; $j++ ) {
                    undef $primes[$i_times_j];
                }
            }
        }
        return [ grep { $primes[$_] } 2 .. $#primes ];
    }
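For comparison with the Perl version, the same sieve in plain C over a byte array might look like the sketch below. It is standalone C using calloc, not the talk's Inline::C implementation, so there are no SVs, AVs, or XS macros. The name count_primes is hypothetical, and to keep the example short it returns only the number of primes found rather than the list.

```c
#include <stdlib.h>

/* Hypothetical helper: number of primes <= search_to,
 * or -1 if the sieve cannot be allocated. */
static long count_primes(int search_to)
{
    unsigned char *composite;    /* composite[k] != 0 means k is not prime */
    long long i, j;
    long count = 0;

    if (search_to < 2)
        return 0;

    composite = calloc((size_t)search_to + 1, 1);    /* zeroed: all candidates */
    if (!composite)
        return -1;

    for (i = 2; i * i <= search_to; i++) {
        if (!composite[i]) {
            /* Smaller multiples of i were already flagged by smaller primes. */
            for (j = i * i; j <= search_to; j += i)
                composite[j] = 1;
        }
    }

    for (i = 2; i <= search_to; i++)
        if (!composite[i])
            count++;

    free(composite);
    return count;
}
```

The inner loop touches only native bytes, which is the kind of tight loop where basic C types pay off. Building the Perl array of results would happen once, after the sieve is complete.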

The Sieve targeting Inline::C (with XS macros)

    SV* il_c_eratos_primes_av ( int search_to )
    {
        AV* av = newAV();
        bool* primes = 0;
        int i;
        if( search_to < 2 )
            return newRV_noinc( (SV*) av );
        Newxz( primes, search_to + 1, bool );
        if( ! primes ) croak( "Failed to allocate memory.\n" );
        for( i = 2; i * i 6;

use Inline C => Config => LIBS => '-lm', ENABLE => 'AUTOWRAP' ;

    Inline->import( C => LIBS => '-lghttp';

    package MyTerm;
    use Inline C => Config =>
        ENABLE => AUTOWRAP =>
        LIBS   => "-lreadline -lncurses -lterminfo -ltermcap ";
    use Inline C => q{ char * readline(char *); };

    package main;
    my $x = MyTerm::readline("xyz: ");

Perl's power tools.

XS can call Perl subroutines by symbol, or subref.

XS can accept subrefs as params, or return subrefs. (Callbacks, Currying)

XS functions have access to package globals.

XS functions have access to the lexical pad.

XS functions can create lexical blocks (create closures, for example).

Practical approaches toward objects

Build in Perl, optimize in C

Build all the methods in Perl

Those object methods that stand out in profiling may be rewritten in Inline::C.

90% of the benefit of Inline::C with only 10% of the fuss.

Inline::CPP: C++ objects become Perl objects.

Public data members get accessors and are exposed to Perl.

Public methods are exposed to Perl.

Private data members and methods aren't exposed to Perl.

C++ constructors, destructors, inheritance, and so on.

An object with Inline::CPP

use Inline CPP =>