Perl Scripts - Resources for CGI and Perl Programming

Information Technology Resources

Perl, also expanded as Practical Extraction and Report Language (a backronym), is a dynamic procedural programming language designed by Larry Wall and first released in 1987. Perl borrows features from C, shell scripting (sh), awk, sed, Lisp, and, to a lesser extent, many other programming languages.

The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal). Its major features are ease of use, support for both procedural and object-oriented (OO) programming, powerful built-in text processing, and a large collection of third-party modules.

Language features

The overall structure of Perl derives broadly from the programming language C. Perl is a procedural programming language, with variables, expressions, assignment statements, brace-delimited code blocks, control structures, and subroutines.

Perl also takes features from shell programming. All variables are marked with leading sigils. Sigils unambiguously identify variable names, allowing Perl to have a rich syntax. Importantly, sigils allow variables to be interpolated directly into strings. Like the Unix shells, Perl has many built-in functions for common tasks, like sorting, and for accessing system facilities. Perl takes lists from Lisp, associative arrays from awk, and regular expressions from sed. These simplify and facilitate all manner of parsing, text handling, and data management tasks.
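As a rough illustration of the features described above, the following minimal sketch shows sigils, string interpolation, a list, an associative array (hash), a built-in sort, and a regular expression. The variable names and sample data are invented for this example.

```perl
use strict;
use warnings;

# Sigils mark the kind of variable: $ for scalars, @ for arrays, % for hashes.
my $language = 'Perl';
my @sources  = ('C', 'sh', 'awk', 'sed', 'Lisp');
my %borrowed = (
    Lisp => 'lists',
    awk  => 'associative arrays',
    sed  => 'regular expressions',
);

# Because of the sigils, variables interpolate directly into strings.
# An interpolated array is joined with single spaces by default.
my $summary = "$language borrows from @sources";

# sort and keys are built-in functions; map builds a new list from the keys.
my @lines = map { "$_ => $borrowed{$_}" } sort keys %borrowed;

# A regular expression with a capture group extracts the first word.
my ($first_word) = $summary =~ /^(\w+)/;

print "$summary\n";
print "$_\n" for @lines;
print "First word: $first_word\n";
```

The default sort compares strings by character code, so the capitalized key Lisp sorts ahead of awk and sed.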

In Perl 5, features were added that support complex data structures, first-class functions (i.e. closures as values), and an object-oriented programming model. These include references, packages, and class-based method dispatch. Perl 5 also saw the introduction of lexically scoped variables, which make it easier to write robust code, and modules, which make it practical to write and distribute libraries of Perl code.
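The Perl 5 features listed above can be sketched in a few lines. This is only an illustrative example: the Counter package, the make_adder function, and the sample data are hypothetical names chosen for the sketch, not part of any standard library.

```perl
use strict;
use warnings;

# A package serves as a class; bless ties a reference to that class.
package Counter;

sub new {
    my ($class, $start) = @_;
    my $self = { count => $start || 0 };   # reference to an anonymous hash
    return bless $self, $class;
}

sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

# References allow nested data structures: here, a hash holding an array.
my %project = (name => 'demo', tags => ['web', 'cgi']);
push @{ $project{tags} }, 'glue';

# A closure: the anonymous sub keeps access to the lexical variable $total.
sub make_adder {
    my $total = 0;
    return sub { $total += shift; return $total };
}
my $add = make_adder();
$add->(5);
my $sum = $add->(7);

# Method calls are dispatched through the class the object was blessed into.
my $counter = Counter->new(10);
$counter->increment;
my $count = $counter->increment;

print 'tags: ', join(', ', @{ $project{tags} }), "\n";
print "sum: $sum, count: $count\n";
```

Each call to make_adder creates a fresh $total, so separate adders keep separate running totals.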

All versions of Perl do automatic data typing and memory management. The interpreter knows the type and storage requirements of every data object in the program; it allocates and frees storage for them as necessary. Legal type conversions are done automatically at run time; illegal type conversions are fatal errors.
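A small sketch may make this concrete. It assumes "use strict" is in effect and takes, as one example of an illegal conversion, the use of a plain string where an array reference is required; the variable names are invented for the illustration.

```perl
use strict;
use warnings;

# Strings and numbers convert automatically, depending on the operator.
my $sum   = "10" + 5;              # numeric addition gives 15
my $count = 10;
my $label = "count: " . $count;    # string concatenation gives "count: 10"

# Storage is allocated and released automatically as data grows and shrinks.
my @items = (1 .. 3);
push @items, 4;
my $size = @items;                 # an array in scalar context yields its length

# An illegal conversion: a plain string used as an array reference.
# Under "use strict" this is a fatal run-time error; eval traps it here.
my $not_a_ref = 'plain string';
my $ok    = eval { my @copy = @{$not_a_ref}; 1 };
my $error = $ok ? '' : $@;

print "sum=$sum size=$size\n";
print "$label\n";
print "error: $error";
```

Without the eval block, the failed dereference would terminate the program with the same message.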

Applications

Perl has many and varied applications. It has been used since the early days of the Web to write CGI scripts, and is an integral component of the popular LAMP (Linux / Apache / MySQL / Perl, PHP, and Python) platform for web development. Large projects written in Perl include Slash, early implementations of PHP, and UseModWiki, the wiki software used in Wikipedia until 2002. Perl is known as one of "the three Ps" (Perl, Python, and PHP), the most popular open source server-side scripting languages for the Web, though Ruby and open source implementations of Java and C# have grown popular in recent years.

Perl is often used as a "glue language", tying together systems and interfaces that were not specifically designed to interoperate. Systems administrators use Perl as an all-purpose tool; short Perl programs can be entered and run on a single command line. Perl is widely used in finance and bioinformatics, where it is valued for rapid application development, ability to handle large data sets, and the availability of many standard and third-party modules.

Implementation

Perl is implemented as a core interpreter, written in C, together with a large collection of modules, written in Perl and C. The source distribution is, as of 2005, 12 MB when packaged in a tar file and compressed. The interpreter is 150,000 lines of C code and compiles to a 1 MB executable on typical machine architectures. Alternatively, the interpreter can be compiled to a link library and embedded in other programs. There are nearly 500 modules in the distribution, comprising 200,000 lines of Perl and an additional 350,000 lines of C code. Much of the C code in the modules consists of character encoding tables.

The interpreter has an object-oriented architecture. All of the elements of the Perl language—scalars, arrays, hashes, coderefs, file handles—are represented in the interpreter by C structs. Operations on these structs are defined by a large collection of macros, typedefs and functions; these constitute the Perl C API. The Perl API can be bewildering to the uninitiated, but its entry points follow a consistent naming scheme, which provides guidance to those who use it.

The execution of a Perl program divides broadly into two phases: compile-time and run-time. At compile time, the interpreter parses the program text into a syntax tree. At run time, it executes the program by walking the tree. The text is parsed only once, and the syntax tree is subject to optimization before it is executed, so the execution phase is relatively efficient. Compile-time optimizations on the syntax tree include constant folding, context propagation, and peephole optimization.
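The two phases can be observed from within a program. The following sketch relies on BEGIN blocks, which run during compilation, and on the core module B::Deparse, which turns a compiled syntax tree back into source text. The exact layout of the deparsed text may vary between Perl versions, so treat the output as indicative only.

```perl
use strict;
use warnings;
use B::Deparse;   # core module: regenerates source text from the compiled tree

our @order;

# This statement executes at run time, after the whole file has been compiled.
push @order, 'run time';

# A BEGIN block executes as soon as it has been parsed, during compilation,
# even though it appears later in the file than the statement above.
BEGIN { push @order, 'compile time' }

# Constant folding: 2 * 3 + 1 is evaluated once by the compiler, so the
# syntax tree for this subroutine holds a single constant.
my $folded = B::Deparse->new->coderef2text(sub { return 2 * 3 + 1 });

print join(', ', @order), "\n";
print "$folded\n";
```

The first line printed lists "compile time" ahead of "run time", and the deparsed subroutine body shows the folded constant rather than the original arithmetic.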

Perl is a dynamic language and has a context-sensitive grammar that cannot be parsed by a straight Lex/Yacc lexer/parser combination. Instead, it implements its own lexer, which coordinates with a modified GNU bison parser to resolve ambiguities in the language. It is said that "only perl can parse Perl", meaning that only the Perl interpreter (perl) can parse the Perl language (Perl). The truth of this is attested to by the persistent imperfections of other programs that undertake to parse Perl, such as source code analyzers and auto-indenters.

Maintenance of the Perl interpreter has become increasingly difficult over the years. The code base has been in continuous development since 1994. The code has been optimized for performance at the expense of simplicity, clarity, and strong internal interfaces. New features have been added, yet virtually complete backward compatibility with earlier versions is maintained. The size and complexity of the interpreter are a barrier to developers who wish to work on it.

Perl is distributed with some 90,000 functional tests. These run as part of the normal build process, and extensively exercise the interpreter and its core modules. Perl developers rely on the functional tests to ensure that changes to the interpreter do not introduce bugs; conversely, Perl users who see the interpreter pass its functional tests on their system can have a high degree of confidence that it is working properly.

There is no written specification or standard for the Perl language, and no plans to create one for the current version of Perl. There has only ever been one implementation of the interpreter. That interpreter, together with its functional tests, stands as a de facto specification of the language.

Availability

Perl is free software, and is licensed under both the Artistic License and the GNU General Public License. It is available for most operating systems. It is particularly prevalent on Unix and Unix-like systems (such as Linux, FreeBSD, and Mac OS X), and is growing in popularity on Microsoft Windows systems.

Perl has been ported to over a hundred different platforms, and can, with only six reported exceptions, be compiled from source on all Unix-like, POSIX-compliant or otherwise Unix-compatible platforms, including AmigaOS, BeOS, Cygwin, and Mac OS X. A special port, MacPerl, is available for Mac OS Classic.

Perl can be compiled from source on Windows; however, many Windows installations lack a C compiler, so Windows users typically install a binary distribution, such as ActivePerl or IndigoPerl. Users without a C compiler are also limited to pure Perl modules if they wish to add to the module library that comes with Perl. Free software exists that enables these users to install C modules, but it tends to be poorly documented, especially for beginners.

Links

Perl.Com - The official Perl home page, run by O'Reilly. Contains documentation, news, and links to a variety of resources.

CPAN - The Comprehensive Perl Archive Network, the gateway to all things Perl. The canonical location for Perl code and modules.

dev.perl.org - Perl 6 - The official site for the development of Perl 6, the next generation of the Perl programming language.

Kamango.com : PERL Channel - Wide range of Perl scripts, programs and tutorials.

O'Reilly Network: Perl Weblog - Features links and commentary by a variety of authors.

O'Reilly Perl Center - Current and past products, resources, and news on O'Reilly and Associates' Perl involvement.

Perl Cabal - Humorous profiles of Perl luminaries.

The Perl Institute - Non-profit organization dedicated to making the incredibly useful Perl language even more useful for everyone. Supports Perl creators, developers, maintainers, and users. It recently disbanded and is being incorporated into the Perl Mongers.

Perl Paraphernalia - By Mark-Jason Dominus. Perl Advanced Techniques Handbook (drafts). Hints, articles, Perl modules, programs.

Perl Quiz - Test your knowledge of Perl by answering 15 questions at basic, intermediate or advanced level. Questions change every time.

CGI Perl on the Web - From the page's author: I put together this page back in '98, I think, while working on a CGI project. I thought I was the only user, but a few months later I found, to my surprise (it was and still is nothing but about a dozen links), that this page was quite heavily requested and linked to. Its popularity has not decreased since, and occasionally I get e-mails from people telling me that some link is broken. Prompted by one such e-mail, I took the trouble to test the links, remove the bad ones, and find substitutes for others. However, I really don't think I am ever going to update it again, and I'm sure there are already plenty of other places with good information that I don't know about. If no other webpage provides a better portal to the CGI world than this one (which I find very doubtful), I hope someone will step up and undertake maintaining it. If that happens, I would appreciate a notice so that I can take this one off-line.

Perl Project Course - A one-quarter project seminar to learn the programming languages Perl and Java. Course CS4920-5 (3-0).

use Perl - General discussion of Perl and related issues, plus Perl news. Maintained by Naval Postgraduate School.

Perl Sites - A comprehensive list of Perl sites maintained by University of Tennessee, Knoxville.

Attribution: Some information for this article has been adapted from Wikipedia under the Creative Commons Attribution-ShareAlike License.