Introduction to Perl

Dec 13, 2013

Practical Extraction and Report Language
A high-level, general-purpose 'scripting' language, used mainly for
text processing, administrative tasks, and database and network/web
programming, as well as data manipulation tasks in fields such as
bioinformatics and finance.
It's interpreted, procedural and dynamically typed, with automatic
memory management.
It's widely deployed on *nix systems, ships with most modern
distributions, and is also available for Windows (ActivePerl project).

"duct tape of the Internet"

by Liam McNamara
History of Perl
Larry Wall is a linguist and the 'benevolent dictator' of the
language; he created it while he was a sysadmin at NASA.
Version 1.0 was released at the end of 1987, and it is now at
v5.8.
It was predominantly created for source code patching and
report generation. The perl distribution is written in C and
has been ported to over 100 platforms.
Behaviour and syntax were influenced by
awk, sed, C, shell script and even some Lisp.
It's free software, dual-licensed under the Artistic License and
the GNU General Public License.
Overview of the language

Dynamically typed, making it flexible

Native regular expressions

'Interpreted', allowing fast development cycles

Modular and object-oriented(ish)

Provides automated security checks

Very expressive and problem-oriented

Arguably ugly? It's been described as a 'write-only language'

The language is currently defined by its implementation, but
the next version is being specified first.

'Makes easy things trivial and hard things possible'
Simple Example
Mandatory Hello world
$ perl -e 'print "Hello World\n";'
$scalar = '4'; # the ASCII character
$scalar++; # increment
print $scalar . "\n"; # prints '5' with newline
Variables are denoted by sigils: a special character ($, @, %, &
or *) shows the type (like in BASIC). This facilitates implicit type
conversion, which speeds development but is sometimes confusing, as
a lot can be going on behind the scenes.
$ perl -e 'print "$> $$";' # print effective UID and PID
A perl program is a list of statements terminated by semicolons.
Perl is often used for 'one-liners' as part of shell scripts.
Example 1
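The implicit conversion described above can be sketched as follows. This is a minimal illustration only; the variable names ($line, $sum, $joined) are made up for the example. The operator, not the variable, decides whether a value is treated as a number or a string.

```perl
use strict;
use warnings;

my $scalar = '4';            # starts life as a one-character string
$scalar++;                   # increment: '4' becomes 5
my $line = $scalar . "\n";   # '.' concatenates, so this is "5\n"

my $sum    = '10' + 5;       # '+' forces numeric context
my $joined = '10' . 5;       # '.' forces string context

print $line;
print "sum=$sum joined=$joined\n";
```

Using '+' where '.' was intended is an easy mistake to make: "\n" is silently treated as 0 in numeric context (with a warning if warnings are enabled), so no newline is printed.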
Arrays, Hashes and Refs
@array = ('a', 'b', 'c');
print $array[0]; #will print 'a'
%hash = ('key1' => 'value1', 'key2' => 'value2');
$hash{'key3'} = 'value3';
$ref = \@array;
print $ref->[0];
$coderef = sub {print "howdy doody\n"};
&$coderef; #will perform print
All mod-cons
Array - Numbered list of data items; will automatically increase in size.
Hash - An associative list, mapping scalar keys to values.
Reference - Similar to a C pointer, it's simply a scalar containing the
address of another variable, which could be another scalar, array,
hash, code block or even another reference.
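As a rough sketch of the three data types working together, under 'use strict' (which the slide code omits for brevity), something like the following should run as-is. The names $aref, $href and $cref are illustrative only.

```perl
use strict;
use warnings;

my @array = ('a', 'b', 'c');
push @array, 'd';                 # arrays grow automatically

my %hash = (key1 => 'value1', key2 => 'value2');
$hash{key3} = 'value3';           # add a new key/value pair

my $aref = \@array;               # reference to an array
my $href = \%hash;                # reference to a hash
my $cref = sub { return "howdy $_[0]" };   # reference to anonymous code

my $count = scalar @array;        # an array in scalar context gives its size
my $keys  = join(',', sort keys %$href);

print "first=$aref->[0] last=$array[-1] count=$count\n";
print "$keys\n";
print $cref->('doody'), "\n";
```

The arrow syntax ($aref->[0], $cref->(...)) dereferences a reference, while %$href dereferences the whole hash at once.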
TMTOWTDI
There's More Than One Way To Do It
All of these do the same:
if ($x == 0) {$y = 10;} else {$y = 20;}
$y = $x==0 ? 10 : 20;
$y = 20; $y = 10 if $x==0;
unless ($x == 0) {$y=20} else {$y=10}
if ($x) {$y=20} else {$y=10}
$y = (10,20)[$x != 0];
Blessing or curse?
Although this allows greater expressiveness when describing a
problem, it can reduce readability for other programmers (and
sometimes yourself at a later date).
Syntactic Sugar
Range operator:

1..10 # equates to the list (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
Default variables are specially named variables whose content depends
on the enclosing block:

$_ - the variable being iterated over

@_ - the parameters to a subroutine
The readline operator reads a line from an open filehandle, or by default
from STDIN (or the files named on the command line):

<FileHandle> or <>
Backticks run a shell command and return its output:

$daemons = `cat /etc/passwd | grep daemon`;
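A short sketch of the range operator, the default variables and the readline operator in use. The sub name total and the in-memory filehandle are assumptions made for the example, so that it runs without any external file; backticks are left out because their output depends on the host system.

```perl
use strict;
use warnings;

# Range operator plus $_ : square each number from 1 to 5
my @squares = map { $_ * $_ } 1..5;

# @_ holds the arguments; $_ is each one in turn
sub total {
    my $sum = 0;
    $sum += $_ for @_;
    return $sum;
}
my $total = total(1..10);

# Readline operator on a filehandle (here opened on a string in memory)
my $text = "first line\nsecond line\n";
open(my $fh, '<', \$text) or die "cannot open: $!";
my $first = <$fh>;                # scalar context: reads one line
close $fh;

print "@squares\n";
print "total=$total\n";
print $first;
```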
Flow Control
C style for
for (initialise; exit-condition; per-loop) {
...
}
Scripting style for
foreach (@list) {
print $_; # don't even need the '$_' here
...
}
Iterates over the given '@list', setting '$_' to each data item in
the list.

Straightforward while
while (expression) {
...
}
do {
...
} while (expression);
Example 2
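The four loop forms above could be filled in like this. It is only a sketch; the variable names (@seen, $joined, $countdown, $runs) are invented for illustration.

```perl
use strict;
use warnings;

my @list = ('a', 'b', 'c');

# C style for: explicit index
my @seen;
for (my $i = 0; $i < @list; $i++) {
    push @seen, "$i=$list[$i]";
}

# Scripting style foreach: $_ is set to each item in turn
my $joined = '';
foreach (@list) {
    $joined .= $_;
}

# while: condition tested before each pass
my $n = 3;
my $countdown = '';
while ($n > 0) {
    $countdown .= $n--;
}

# do ... while: body runs once before the condition is tested
my $runs = 0;
do {
    $runs++;
} while ($runs < 0);

print "@seen | $joined | $countdown | $runs\n";
```

The do/while form is the only one whose body runs even when the condition is false from the start.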
Subroutines
Informal parameters
Parameters are passed as a flattened list, merging all arguments into one
array called @_. It is available inside the subroutine block and can be
accessed with 'shift' (which defaults to shifting the first element
off @_).
Subroutines are also aware of their calling context, allowing
different behaviour depending on whether a scalar or a list is wanted,
using the builtin function wantarray.
# adduser('someguy','secret',('users','wheel'))
sub adduser ($$@) {
my $username = shift;
my $password = shift;
my @groups = @_;
...
}
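A runnable sketch of argument flattening and context awareness follows. The body of adduser is hypothetical: it only returns a summary string rather than creating a user. The context sub is invented purely to show wantarray.

```perl
use strict;
use warnings;

# Hypothetical stand-in for adduser: reports what it was given
sub adduser {
    my $username = shift;          # shift works on @_ by default
    my $password = shift;
    my @groups   = @_;             # whatever remains after the two shifts
    return $username . ':' . join(',', @groups);
}

# The nested list is flattened, so adduser sees four plain arguments
my $summary = adduser('someguy', 'secret', ('users', 'wheel'));

# wantarray is true when the caller wants a list
sub context {
    return wantarray ? 'list' : 'scalar';
}

my @in_list   = context();
my $in_scalar = context();

print "$summary\n";
print "$in_list[0] $in_scalar\n";
```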
Regex
Regular Expressions
Matching /pattern/ - tests if a pattern matches the input
if ($username =~ /admin/) { doAdmin(); }
. - matches any single character except newline
\s - matches any whitespace character
a* - matches zero or more 'a's
b{1,4} - matches 1-4 occurrences of 'b'
Grouping () - captures matched sections into $1, $2, ...
'bookmark.html' =~ /(.*)\.(.*)/;
$1 # 'bookmark'
$2 # 'html'
Character class [abcd] - matches one of a, b, c or d
The regex engine will try every possible way of matching the pattern
to the input, backtracking when a decision leads to a failed match.
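The matching constructs listed above can be tried out with a sketch like this one. The test strings and variable names are made up for the example.

```perl
use strict;
use warnings;

my $username = 'sysadmin';
my $is_admin = ($username =~ /admin/) ? 1 : 0;   # pattern found anywhere

# Capture groups: $1 and $2 are only valid after a successful match
my ($name, $ext);
if ('bookmark.html' =~ /(.*)\.(.*)/) {
    ($name, $ext) = ($1, $2);
}

my $quantified = ('abbbb' =~ /^a*b{1,4}$/) ? 1 : 0;   # 'a' then four 'b's
my $in_class   = ('cab'   =~ /^[abcd]+$/)  ? 1 : 0;   # only a, b, c or d
my $too_many   = ('bbbbb' =~ /^b{1,4}$/)   ? 1 : 0;   # five 'b's is too many

print "admin=$is_admin name=$name ext=$ext\n";
print "quantified=$quantified class=$in_class too_many=$too_many\n";
```

Checking the result of the match before using $1 and $2 matters, because a failed match leaves them holding values from an earlier successful match.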
Regex 2
Substitution - s/old/new/

"The old brown fox" becomes "The new brown fox"

"bold and old" becomes "bnew and old"
Regular expressions can perform quite complex tasks.
Strips any leading non-word characters (such as whitespace) and repeats
the first word:

s/\W*((\w*).*)/$2 $1/
Tests if a number is prime:

perl -wle 'print "Prime" if (1 x shift) !~ /^1?$|^(11+?)\1+$/' number

Example 3
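The substitutions and the prime test can be checked with a small script along these lines. The sub name is_prime and the sample strings are assumptions for the example. The prime test writes the number in unary (a string of that many 1s) and rejects it if the string can be split into two or more equal groups of at least two 1s.

```perl
use strict;
use warnings;

my $fox = 'The old brown fox';
$fox =~ s/old/new/;                         # replaces the first match only

my $bold = 'bold and old';
(my $first_only = $bold) =~ s/old/new/;     # first match sits inside 'bold'
(my $every      = $bold) =~ s/old/new/g;    # /g replaces every match

my $words = '  hello world';
$words =~ s/\W*((\w*).*)/$2 $1/;            # repeat the first word

sub is_prime {
    my $n = shift;
    my $unary = 1 x $n;                     # e.g. 5 becomes '11111'
    return $unary !~ /^1?$|^(11+?)\1+$/ ? 1 : 0;
}

my @primes = grep { is_prime($_) } 1..20;

print "$fox\n$first_only\n$every\n$words\n";
print "@primes\n";
```

The regex approach to primality is a curiosity rather than a practical method: the backtracking makes it very slow for large numbers.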
Object Orientated (nearly)
package objecttype;
sub new {
my $class = shift;
my $self = {}; # empty hash
bless($self, $class);
return $self;
}
...
1; # returned to the 'use' command to show success
A perl module can be created with special subroutines that allow the
module to be instantiated as an 'object'.
The 'object' can then be instantiated by calling the 'new' method
(a naming convention rather than a keyword):
use objecttype;
$obj = objecttype->new();
$obj->methodname();
Example 4
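A minimal single-file sketch of the same idea. The class name Counter and its methods are hypothetical, and the class is declared in the same file as the code that uses it, so no 'use' line or trailing '1;' is needed here.

```perl
use strict;
use warnings;

package Counter;                   # hypothetical class, for illustration only

sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };   # object state lives in a hash
    return bless($self, $class);   # tie the hash to the class
}

sub increment {
    my $self = shift;              # the object is always the first argument
    $self->{count}++;
    return $self;                  # returning $self allows chained calls
}

sub value {
    my $self = shift;
    return $self->{count};
}

package main;

my $obj = Counter->new(start => 5);
$obj->increment->increment;
print "value: ", $obj->value, "\n";
```

Passing the class name to bless, rather than relying on the current package, keeps the constructor usable by subclasses.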
Tie variables
Most OO functionality can be replicated with a much cleaner feature
of perl, tie. An OO-style module is defined with special sub names;
this can then be bound to a variable (scalar, array or hash). When
that variable is subsequently used, the subs of the module are called.
Specially named subs need to be defined depending on the variable
type that the 'class' is going to be tied to.

$scalar - TIESCALAR, FETCH, STORE

@array - TIEARRAY, FETCH, STORE, FETCHSIZE, STORESIZE, SHIFT...

%hash - TIEHASH, FETCH, STORE, FIRSTKEY, NEXTKEY, EXISTS...
This feature provides many OO benefits through the natural interface
of variables, avoiding the awkward perl OO syntax.
Example 5
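As an illustration of a tied scalar, the sketch below binds a variable to a hypothetical UpperCase class (not a real CPAN module) that upper-cases whatever is read back from it.

```perl
use strict;
use warnings;

package UpperCase;                 # hypothetical class, for illustration only

sub TIESCALAR {                    # called by tie()
    my ($class) = @_;
    my $value = '';
    return bless \$value, $class;  # the object is a reference to a scalar
}

sub STORE {                        # called on every assignment
    my ($self, $new) = @_;
    $$self = $new;
}

sub FETCH {                        # called on every read
    my $self = shift;
    return uc $$self;
}

package main;

tie my $shout, 'UpperCase';
$shout = 'hello';                  # goes through STORE
my $copy = $shout;                 # goes through FETCH
print "$copy\n";
```

The calling code uses $shout like any ordinary scalar; the class behaviour is invisible at the point of use.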
XS
Extension interface between Perl and C
An XSUB provides a layer of abstraction between the Perl engine
and some C code. The interface definition written in XS allows perl code
to call C functions, which can provide a performance improvement
(useful for computational bottlenecks).
The XS interface stubs are translated into C by the special
compiler xsubpp and combined with the C library, allowing the perl code
to utilise it. The extra code handles the necessary context switches,
making parameters and global variables available to the C functions.
An empty perl module has to serve as a dummy module to allow the
bootstrapping of the C code. The C library is then linked to the perl
runtime library.
POD - Embedded Docs
Plain Old Documentation
POD allows the documentation to be embedded in the code, allowing
them to be distributed together, and hopefully promoting good
commenting.
=head1 Main-Header-Text
Documentation with B<Bold Text>
=over 4
=item and indented sections or 'items'
=back
=cut
There are tools to extract the POD and create different file types:
pod2latex, pod2html, pod2man.
Example 6
CGI - Web coding
One of the major reasons perl has become one of the main
languages of the web is its quick development time (automatic
memory management, dynamic typing, interpreted).
It offers advanced text (HTML, XML) processing, and the community's
focus on related tasks has produced a lot of web/network-related code.
It's a key feature of 'platforms' such as LAMP (Linux + Apache +
MySQL + perl) for web development.
use CGI;
mod_perl is a module of the Apache web server which embeds a perl
interpreter in the server. This avoids spawning a separate process
for each request, keeps code cached in memory and provides
persistent database connections between separate HTTP requests.
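To show what a CGI script actually does, here is a rough sketch that handles the protocol by hand instead of using the CGI module mentioned above (an assumption made so the example runs without installing anything). The sub names parse_query, escape_html and render are invented for the example. The web server passes the query string in the QUERY_STRING environment variable and expects a header, a blank line, then the page on standard output.

```perl
use strict;
use warnings;

# Turn "name=Larry+Wall&lang=perl" into a hash of decoded values
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /&/, $query) {
        next unless length $pair;
        my ($key, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($key, $value) {
            tr/+/ /;                                 # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;     # %XX encodes one byte
        }
        $param{$key} = $value;
    }
    return %param;
}

# Escape user input before putting it into HTML
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

sub render {
    my $query = defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
    my %param = parse_query($query);
    my $name  = exists $param{name} ? $param{name} : 'World';
    return "Content-Type: text/html\r\n\r\n"
         . "<p>Hello, " . escape_html($name) . "!</p>\n";
}

print render();
```

In real code the CGI module (or a modern framework) takes care of the parsing and header generation shown here.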
Community
There is a well-developed community surrounding the perl language,
providing an extensive resource of testers and developers for the
distribution and 3rd party code projects.
CPAN is a huge repository (9144+ projects) of perl code providing
modules to perform a wide selection of tasks, so you can avoid
'reinventing the wheel'.
There are Obfuscated Perl Contests celebrating the non-alphanumeric
syntax and general unreadability. Entries are marked on brevity,
creativity and power.
Perl Mongers is an international collection of perl user groups. The
most recent meet for London.pm was last month.
Usenet - comp.lang.perl
CPAN - Comprehensive Perl Archive Network
Example 7
Future of Perl
Perl6
Perl is currently undergoing a complete rebuild of the language
and internals; this time a full specification is being written first. It
will not keep backward compatibility, in an effort to clean up
many historical features. It adds ideas such as optional static
typing, Unicode operators and formal parameters.
Parrot VM features

Generation of an intermediate bytecode form (like python) for
quick re-execution

Native binary format creation possible

Register-based VM (the JVM is stack-based)

Will be coded in itself.
There is also PUGS, an implementation of Perl6 in
Haskell, which will eventually be used to translate itself.
References

http://perl.com - main site

http://cpan.org - massive collection of Perl modules

http://parrotcode.org - new Perl6 VM

http://activestate.com/ActivePerl/ - Windows Perl

http://macperl.com - Perl for old Mac OS

http://pm.org - Perl Mongers

http://perl.apache.org - mod_perl

$ man perl

$ man perldoc
Questions?