April 29, 1998
This document introduces libperl++, which provides a safe,
simple and complete interface to an embedded perl interpreter. What's that
you ask? An embedded perl interpreter is the same perl engine that normally
runs your Perl scripts, but it's running inside your application instead of in
a separate process. Big deal you say? Imagine all the things you've been
doing (or avoiding) in your C++ applications that are trivial in Perl.
Imagine all of the reusable modules in CPAN that you've never been able to use
with C++. Imagine how nice it would be to extend your applications with
macros written in Perl. Embedded perl makes all this possible.
The only catch is that embedded perl uses a complex API for communicating with
your application. You've got to understand the API before you can start using
embedded perl. That's why libperl++ was developed. It makes
using embedded perl much easier and safer. It also throws in quite a bit of
support for common idioms such as using Perl as your application configuration
system or macro language.
The core of libperl++ consists of a set of classes and templates
that "wrap" perl data and completely insulate your application code from
the perl internals. The wrappers provide you several benefits over just
using the perl internals directly:
libperl++ consists of: a class that interfaces with
the perl interpreter itself; some helper classes for dealing with special
Perl features like regular expressions; classes for implementing XS
routines; macros and templates for making your C++ objects available to
Perl code; and some support code for common uses of the library.

    #include
    int main ()
    {
        wPerl perl;
        perl.eval("print q(Hello, world!\n)");
        return 0;
    }

The class wPerl handles the overhead of starting, initializing and stopping
the interpreter. The eval() method takes a chunk of Perl code and asks perl
to run it. The result is then returned back to C++. (In this example the
return value is ignored.) Here's a more complex, useful example:

    #include
    #include
    int main ()
    {
        wPerl perl;
        perl.use("LWP::Simple");
        wPerlScalar getstore = perl.subroutine("getstore");
        getstore("ftp://ftp.sunet.se/pub/lang/perl/CPAN/src/latest.tar.gz",
                 "perl.tar.gz");
        cout << "fetched latest perl release as perl.tar.gz\n";
        return 0;
    }

This example was taken from the libwww module cookbook. None of the return
values are checked so the code is not robust, but it does demonstrate
several of the key features of libperl++. For the rest of this document,
only program fragments will be shown, not complete working examples.
Currently, only one interpreter may exist at any given time, but multiple
interpreters can be created and destroyed in sequence. In most of the
examples, the Perl interpreter is wrapped by a local variable in main().
That will probably not be very convenient for your applications. You will
probably want to use a global wPerl * variable.
A decent interface that supports multiple simultaneous interpreters, possibly
in multiple threads, hasn't been implemented in libperl++ yet. Certainly this
needs to be developed soon in order to support the new features of Perl
5.005. However, if your code only creates a single interpreter using
new wPerl(), then your code will continue to work with all future versions
of libperl++.
Perl scalars are wrapped by the wPerlScalar class. Many C++ built-in types
are considered scalars in Perl, so libperl++ automatically converts from
these C++ types to the equivalent perl scalar form. This greatly simplifies
writing C++ code. When converting perl scalars to C++ values, however, you
must explicitly request the C++ type you want. Perl automatically performs
any necessary conversions, though, so you don't need to write any type
checking code. In fact, it is extremely uncommon to write type checking
code. Perl types are often, and sometimes surprisingly, changed as
side-effects of other operations.
The functionality provided by the wPerlScalar class covers most of the Perl
scalar operators and functions. A few extra methods are required, e.g.
is_true(), because C++ compilers aren't able to differentiate as many
contexts as Perl.
The following example uses Perl's localtime() function to get the current
date. The eval() method uses scalar context so localtime() returns a string.

    wPerlScalar t = perl.eval("localtime");
    cout << t.as_string() << '\n';
    if (t.find("Apr")) {
        cout << "excellent time of the year!\n";
    }


    my $str = "bar";
    $str = "foo" . $str;

The mechanical translation of this code to libperl++ is:

    wPerlScalar str = "bar";
    str = wPerlScalar("foo").append(str);

C++ creates two temporary values to execute this: one for "foo" and one for
the return value from append(). This is quite wasteful. Libperl++ provides a
prepend() method to solve the problem:

    wPerlScalar str = "bar";
    str = str.prepend("foo");

Only one temporary, "foo", is created. That temporary can be eliminated by
using a wPerlScalar object instead of a char *.

    wPerlScalar add = perl.eval("sub { my($x, $y) = @_; $x + $y }");
    double x = (add(3, 2) + add(4, 2)).as_real();

Perl is obviously used in the add() function. Less obviously, Perl is also
used in the + operator.

    unsigned short int x = 1, y = 2;
    add(wPerlScalar(x, wPerlScalar::Force_Integer),
        wPerlScalar(y, wPerlScalar::Force_Integer));

That code explicitly tells the compiler to build a perl scalar from the C++
integer values.
The ambiguous conversion problem also pops up when assigning values to
wPerlScalar objects. Libperl++ has long, ugly assignment methods that you can
use instead of the = operator. For example, instead of:

    wPerlScalar x = 0;
    wPerlScalar y = 1.0;
    x = 2;
    y = 3.0;

you can write:

    wPerlScalar x(0, wPerlScalar::Force_Integer);
    wPerlScalar y(1.0, wPerlScalar::Force_Real);
    x.set_as_integer(2);
    y.set_as_real(3.0);

Generally you will only need to resort to these techniques when you are
working with enum, short, char or unsigned integer values.
Type conversion is the most difficult part of libperl++ because it involves
C++ overloading features and type conversion operators. Thankfully, these
problems are rare.
Most of the Perl array functions, including map, grep and sort, have C++
equivalents. Here's the basic Perl-like array:

    wPerlArray a;
    a.push(1);
    a.push(2.0);
    a.push("three");
    cout << "a[1] = " << a[1].as_string() << '\n';

The homogeneous STL-like array is similar:

    tPerlArray<int> a;
    a.push(1);
    a.push(2);
    cout << "a[1] = " << *a[1] << '\n';

This array only accepts integer values. The bracket operator returns a
pointer to an integer so that a non-existent array element can be indicated
with NULL.
The tPerlArray template is completely generic and can be used for any type of
data, including C++ objects. It uses a placement constructor to copy values
into the perl array and properly destroys values when removing elements.
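
The two ideas in this paragraph and the one before it (copying a value into existing storage with a placement constructor, and returning a pointer so that a missing element can be reported as null) can be sketched in plain C++. This is only an illustration under stated assumptions: SlotArray is a hypothetical class invented for this sketch, it uses a fixed-size buffer rather than a perl array, and it is not the libperl++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical fixed-capacity array; NOT the libperl++ tPerlArray.
template <typename T, std::size_t N>
class SlotArray {
public:
    SlotArray() = default;
    SlotArray(const SlotArray&) = delete;
    SlotArray& operator=(const SlotArray&) = delete;

    // Destroy every element that is still stored.
    ~SlotArray() {
        for (std::size_t i = 0; i < N; ++i) clear(i);
    }

    // Copy-construct `value` directly inside the buffer (placement new).
    bool set(std::size_t i, const T& value) {
        if (i >= N) return false;
        clear(i);
        items_[i] = new (storage_ + i * sizeof(T)) T(value);
        return true;
    }

    // Run the destructor explicitly when an element is removed.
    void clear(std::size_t i) {
        if (i < N && items_[i] != nullptr) {
            items_[i]->~T();
            items_[i] = nullptr;
        }
    }

    // A pointer result lets a missing element be reported as null.
    T* operator[](std::size_t i) {
        return i < N ? items_[i] : nullptr;
    }

private:
    alignas(T) unsigned char storage_[N * sizeof(T)];
    T* items_[N] = {};   // null means "no element in this slot"
};
```

A caller checks the pointer before dereferencing it, for example `if (std::string* s = a[1]) { use(*s); }` where `use` is whatever the caller wants to do with the value.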

    tPerlArrayI<int> b;
    b.push(3);
    b.push(4);
    cout << "b[1] = " << b[1] << '\n';

This array is similar to the previous example, but it is optimized to only
hold integer-like values. This is a slight performance advantage because the
value can fit directly in a perl scalar without needing additional memory.
The bracket operator returns the value itself, not a pointer. -1 is used to
indicate a non-existent element.
One other variation, tPerlArrayP, can be used to hold pointer values. It
works exactly like tPerlArrayI, but returns NULL to indicate a non-existent
element.
Libperl++ allows bit strings, strings, integers, and reals as hash keys. The
difference between it and Perl is that Perl always converts the value to a
string and the C++ wrappers don't. For example, if you use an integer hash
key, the key will be the bit string value of the integer itself, not the
printed representation of the integer. If the hash you create needs to be
accessed from Perl, take care to ensure that the keys are always strings.
Most of the Perl hash functions, including keys, values and each, have C++
equivalents. Here's the basic Perl-like hash:

    wPerlHash a;
    a.set("one", 1);
    a.set("two", 2.0);
    a.set("three", "three");
    cout << "a{two} = " << a.get("two").as_string() << '\n';

The homogeneous STL-like hash is similar:

    tPerlHash<int> a;
    a.set("one", 1);
    a.set("two", 2);
    cout << "a{two} = " << *a.get("two") << '\n';

This template, the tPerlHashI template and the tPerlHashP template have the
same differences from wPerlHash as the array templates have from wPerlArray.
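
To make the warning about integer hash keys concrete, the following plain C++ sketch contrasts the raw bytes of an integer with its printed form. The function names raw_key and printed_key are hypothetical, and the assumption that the "bit string value" is simply the integer's in-memory bytes is mine; the exact key layout libperl++ uses is not specified here.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Key built from the raw bytes of the integer (assumed meaning of
// "bit string value"); the bytes depend on the platform's int layout.
std::string raw_key(int n) {
    std::string key(sizeof n, '\0');
    std::memcpy(&key[0], &n, sizeof n);
    return key;
}

// Key as Perl code would write it: the printed decimal representation.
std::string printed_key(int n) {
    return std::to_string(n);
}
```

For 42, the printed key is the two characters "42" while the raw key is sizeof(int) bytes, so Perl code looking up $hash{42} would not find an entry stored under the raw key.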
wPerlScalar has regexp methods, such as find(), that can be used for simple
searches and iteration. However, those methods compile a regexp each time
they're used. This is very inefficient. A wPerlPattern object can be used to
avoid this because it compiles its regexp exactly once. The compiled regexp
can then be used over and over. For example, this code creates a regexp that
is used several times to find words in an array:

    wPerlPattern word("/\\w+/");
    while (input.is_true()) {
        scalar = input.shift();
        if (scalar.apply_pattern(word)) {
            output.push(scalar);
        }
    }

The pattern object can also use the full range of Perl's regexp operations
including transliteration, substitution and all of the associated flags.
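
The compile-once idea is not specific to Perl. As an analogy only, here is the same loop written with the standard C++ std::regex class instead of wPerlPattern; the function name keep_words is hypothetical, and std::regex syntax is not identical to Perl's.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Keep only the strings that contain at least one word character.
std::vector<std::string> keep_words(const std::vector<std::string>& input) {
    // Compiled once, on the first call, and reused for every element
    // and every later call.
    static const std::regex word("\\w+");

    std::vector<std::string> output;
    for (const std::string& s : input) {
        if (std::regex_search(s, word)) {
            output.push_back(s);
        }
    }
    return output;
}
```

Constructing the std::regex inside the loop body would work too, but it would repeat the compilation for every element, which is the cost the paragraph above warns about.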

    wPerlScalar getstore = perl.subroutine("getstore");

asks the perl interpreter to fetch a subroutine called getstore from the top
level package. If the subroutine doesn't exist, the method returns Perl
undef. Creating an anonymous subroutine is easy too. Here's an example from
above:

    wPerlScalar add = perl.eval("sub {"
                                    "my($x, $y) = @_;"
                                    "$x + $y"
                                "}");

This is my preferred way of formatting a Perl subroutine embedded in C++
code. The C++ standard guarantees consecutive string constants are treated
as a single constant.
When using anonymous subroutines, make sure you're using a modern version of
perl; versions up to 5.004 had serious bugs and memory leaks.
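
The string-constant guarantee can be checked without libperl++ at all. In this small sketch (the variable names are made up for illustration) the multi-line form and the single-literal form are the same string:

```cpp
#include <cassert>
#include <cstring>

// Four adjacent literals; the compiler joins them into one string.
const char* const add_source =
    "sub {"
        "my($x, $y) = @_;"
        "$x + $y"
    "}";

// The same text written as a single literal.
const char* const add_source_joined = "sub {my($x, $y) = @_;$x + $y}";
```

Note that the compiler inserts nothing between the pieces, neither spaces nor newlines. Each Perl statement therefore needs its own terminating semicolon, and a Perl # comment inside one piece would swallow everything after it, because the joined string contains no line breaks unless you write \n yourself.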
Libperl++ has special syntax defined so that you can use a perl scalar just
like a regular function call:

    wPerlScalar r = add(1, 2);
    int n = add(r, "3").as_integer();

There are some limitations with this syntax though. For one, the Perl
subroutine is always called in scalar context. Second, the syntax is pretty
rigid. The library only allows up to 10 scalar parameters passed to the
subroutine and a single scalar is always returned. If an array or hash is
given, it is silently converted to a reference.

    wPerlScalar join = perl.eval("sub {"
                                     "my $sep = shift;"
                                     "join $sep, @_"
                                 "}");
    cout << join(", ", 1, 2.0, "three").as_string() << '\n';
    wPerlArray a;
    a.push(1);
    a.push(2.0);
    a.push("three");
    cout << join(", ", a, 4, 5.0, "six").as_string() << '\n';

This produces the output:

    1, 2, three
    ARRAY(0xd5850), 4, 5, six

The last warning about calling subroutines is that prototypes are always
ignored. An array is always passed as an array reference and a hash as a
hash reference.

    wPerlScalarShadow x = perl.scalar("Some::Module::x");
    x = 10;

This code looks up a scalar Perl variable known as $Some::Module::x and
shadows it to the C++ variable x. Any assignment to x affects
$Some::Module::x and vice versa. They share the same perl scalar value.
Normal wrappers have pass by value semantics whenever they are constructed
or copied. For example, when a normal wrapper is passed into a subroutine, a
copy of the value is made and it is the copy that the subroutine uses. The
shadow wrappers have pass by reference semantics when constructed and pass
by value semantics when copied. Using the subroutine example again, when a
shadow wrapper is passed to a subroutine, the value is not copied and the
subroutine is able to modify the original.
Here is a brief comparison of the two wrapper flavors.

    # Sample Perl code
    sub ModifyArgument {
        $_[0] = 1;
    }
    sub DontModifyArgument {
        my($arg) = @_;
        $arg = 1;
    }

    // Sample C++ code with identical semantics
    void ModifyArgument(wPerlScalarShadow arg) {
        arg = 1;
    }
    void DontModifyArgument(wPerlScalar arg) {
        arg = 1;
    }

This example only serves to help you understand shadow wrappers. You should
probably not write code like this. It is better to use standard C++
references, i.e. wPerlScalar &arg, for handling output arguments because
that is easier to read and has better performance. The shadow wrappers are
necessary for sharing values between C++ and Perl code, but try to avoid
them if possible because they are surprising to the casual reader.
Shadow wrappers have the same base name as the common wrapper, but end in
the name Shadow. For example, wPerlArrayShadow is the shadow wrapper for
shadowing array values.
Libperl++ has many features that are easy to use, but also have a fairly
heavy performance cost. This section examines a simple, but fairly common,
performance problem you might encounter. At first glance, you may think the
following C++ code runs much faster than the equivalent Perl code. You'd be
wrong. The Perl code is actually faster. On the Sun Ultra 2, the Perl code
runs 10% faster than the C++ code.

    // C++ code                           # Perl code
    wPerlScalar r = 0;                    $r = 0;
    for (int i = 0; i < 100000; ++i)      for ($i = 0; $i < 100_000; ++$i)
    {                                     {
        r = r + i;                            $r = $r + $i;
    }                                     }

The trouble with the C++ code is that it creates a lot more temporary values
than the Perl code -- which means a lot more calls to malloc(). The
following C++ code runs about 10x faster than the Perl code. (Possibly
faster if you have a really good C++ compiler.)

    // C++ code                           # Perl code
    wPerlScalar r = 0;                    $r = 0;
    for (int i = 0; i < 100000; ++i)      for ($i = 0; $i < 100_000; ++$i)
    {                                     {
        r += i;                               $r += $i;
    }                                     }

Of course, the way to get the best performance out of C++ is to avoid using
Perl objects directly when regular C++ will work. The following C++ code
runs much faster than the Perl code. Perl is no slouch though, so even
though the C++ code is faster, you might not notice it in your application's
over-all performance. Making a habit of writing code like this won't win any
points with future colleagues maintaining your code either.

    // C++ code                           # Perl code
    wPerlScalar r = 0;                    $r = 0;
    int temp_r = r.as_integer();
    for (int i = 0; i < 100000; ++i)      for ($i = 0; $i < 100_000; ++$i)
    {                                     {
        temp_r += i;                          $r += $i;
    }                                     }
    r = temp_r;

The performance advice given here only applies to doing things that either
C++ or libperl++ can do quickly, i.e. as direct functions without having to
use the Perl interpreter to evaluate. If you are doing a lot of Perl
subroutine calls or string evaluation then the C++ code will by definition
run only as fast as the equivalent Perl. The other condition to watch out
for when tuning your application is when neither your C++ code nor the Perl
code is the bottleneck. This frequently happens when doing intense I/O or
large memory allocations.
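
The temporary-value effect behind the first two loops can be observed in plain C++. The sketch below uses a hypothetical Counted class that merely counts how many objects are constructed; it stands in for a wrapper object and is not libperl++ code, and object constructions are used as a rough proxy for the malloc() calls mentioned above.

```cpp
#include <cassert>

// Hypothetical stand-in for a wrapper object; it only counts constructions.
struct Counted {
    static int constructions;
    long value;

    Counted(long v) : value(v) { ++constructions; }
    Counted(const Counted& other) : value(other.value) { ++constructions; }
    Counted& operator=(const Counted& other) {
        value = other.value;
        return *this;
    }

    // In-place update: no new object is needed.
    Counted& operator+=(long n) { value += n; return *this; }
};

int Counted::constructions = 0;

// Binary +: a new object must be built to hold the result.
Counted operator+(const Counted& lhs, long n) {
    return Counted(lhs.value + n);
}

// Objects constructed inside a loop written as r = r + i.
int constructions_with_plus(int iterations) {
    Counted r(0);
    Counted::constructions = 0;          // count only the loop itself
    for (int i = 0; i < iterations; ++i) r = r + i;
    return Counted::constructions;
}

// Objects constructed inside a loop written as r += i.
int constructions_with_plus_equals(int iterations) {
    Counted r(0);
    Counted::constructions = 0;          // count only the loop itself
    for (int i = 0; i < iterations; ++i) r += i;
    return Counted::constructions;
}
```

The r = r + i loop constructs at least one temporary per iteration, while the r += i loop constructs none. With a real wrapper each of those temporaries would also involve the perl interpreter, which is presumably where the measured difference comes from.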
Avoid creating static wPerl, wPerlScalar, wPerlArray, etc. objects because
static C++ object initialization sequencing is a nightmare. You'll have
cases where your scalars are initialized before your interpreter or other
horrible things. If you absolutely need to create static perl objects,
you're probably better off just letting libperl++ start a perl interpreter
whenever you need one. This is a long way from perfect because you won't be
able to control when/how the interpreter gets created, but at least your
static constructors will (probably) succeed. There is the static method
wPerl::run() that will return (and possibly start) the default running perl
interpreter.
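
A method that "returns (and possibly starts)" a default object resembles the common C++ construct-on-first-use idiom. The sketch below shows that idiom with hypothetical Interpreter and Config classes; it is an assumption about the general technique, not a description of how wPerl::run() is actually implemented.

```cpp
#include <cassert>

// Hypothetical stand-in for an interpreter object.
class Interpreter {
public:
    Interpreter() { ++starts; }
    static int starts;   // how many interpreters were ever started
};
int Interpreter::starts = 0;

// Construct on first use: the object is created the first time this
// function is called, so it exists before anything that asks for it.
Interpreter& default_interpreter() {
    static Interpreter instance;
    return instance;
}

// A static object whose constructor needs the interpreter.
struct Config {
    Interpreter* perl;
    Config() : perl(&default_interpreter()) {}
};

// Safe regardless of the order in which static objects are initialized,
// because the constructor above creates the interpreter on demand.
Config global_config;
```

The drawback is the one the text mentions: the interpreter comes into existence whenever the first static constructor happens to ask for it, so the application does not control when or how it is created.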

    perl.eval("print qq(hello, world\\n)");

is much better than:

    perl.eval("print \"hello, world\\n\"");

The rules for building a Perl expression, including those for building
strings, are exactly the same when storing Perl expressions in C++ strings
as they are for Perl scripts. However, the C++ compiler might throw a few
surprises at you. Take the following code:

    perl.eval("print q(\n)");

If you're reading this code with your brain in Perl mode, you might think it
will print a backslash followed by the letter n. Perl would, if that was
what it saw. C++ parses the string first and converts all backslash notation
before Perl even sees the string. This example actually prints a newline.
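
The difference is easy to see by looking at the characters the C++ compiler actually stores. The sketch below needs no interpreter; the variable names are made up, and the raw string literal in the third definition requires C++11 or later.

```cpp
#include <cassert>
#include <cstring>

// C++ turns \n into a newline character before perl ever sees the text.
const char* const prints_newline = "print q(\n)";

// Doubling the backslash makes perl receive a backslash followed by 'n'.
const char* const prints_backslash_n = "print q(\\n)";

// A raw string literal passes the text through exactly as written.
const char* const prints_backslash_n_raw = R"perl(print q(\n))perl";
```

The first string is 10 characters long and contains a real newline; the other two are 11 characters long and are identical to each other. If you want Perl to do the backslash processing, one of the last two spellings is the one to use.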
wPerlShadow objects can be difficult to understand. As a general rule of
thumb you should only use shadow objects to shadow a Perl variable that you
want to access from both C++ and Perl.
It is fine to create a wPerlScalar from a char [] allocated on the stack, as
long as the string has been initialized. The following code has serious
problems however:

    wPerlScalar str = new char[20];

When perl creates a new scalar, it uses strlen() to compute the initial
length. In this case the string is not nul terminated and the result of
strlen() is undefined. Even if the code doesn't crash, a memory leak will
occur, because the pointer returned by new is never deleted.
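
Both problems can be avoided in plain C++ before any perl object is involved. The following sketch uses a hypothetical helper, make_empty_buffer, that zero-fills the buffer so strlen() is well defined and hands ownership to a std::unique_ptr so the memory is released. It assumes, as the text above implies, that the scalar copies the characters rather than taking ownership of the pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Returns a heap buffer of `size` bytes (size must be at least 1) that
// already holds a valid, empty, nul-terminated C string.
std::unique_ptr<char[]> make_empty_buffer(std::size_t size) {
    // The trailing () value-initializes the array, i.e. fills it with '\0'.
    std::unique_ptr<char[]> buf(new char[size]());
    return buf;
}
```

A buffer obtained this way can be filled in and then passed on as buf.get(); it is freed automatically when the unique_ptr goes out of scope.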
libperl++ is still evolving and growing to fit the needs of its
users. The basic features have stabilized though, and it is being used
to implement several production applications. I'm very interested in hearing
feedback from people using the library. If you have any comments, please
mail them to me.
Hopefully the safety and simplicity of libperl++ will encourage
more programmers to embed perl into their applications. That's good for
everybody.