Perl notes
----------

Curro Pérez Bernal

$Id: perl_notes.sgml,v 1.3 2011/12/12 15:41:58 curro Exp curro $

-------------------------------------------------------------------------------

Abstract
--------

Short notes for an introduction to `Perl' based on R.L. Schwartz et al.
_Learning Perl_ (see References).

Copyright Notice
----------------

Copyright © 2011 by Curro Perez-Bernal

This document may be used under the terms of the GNU General Public
License version 3 or higher. (http://www.gnu.org/copyleft/gpl.html)

-------------------------------------------------------------------------------

Contents
--------

1.        Scalar Data
1.1.      Numbers
1.1.1.    Numeric Operators
1.2.      Strings
1.2.1.    String Operators
1.3.      Scalar Variables
1.4.      Basic Output with `print'
1.5.      Operator Associativity and Precedence
1.6.      The `if' Control Structure
1.6.1.    Comparison Operators
1.6.2.    Using the `if' Control Structure
1.7.      Getting User Input
1.8.      The `while' Control Structure
1.9.      The `undef' Value

2.        Lists and Arrays
2.1.      Defining and Accessing an Array
2.1.1.    List Literals
2.2.      List Assignment
2.3.      Interpolating Arrays
2.4.      Array Operators
2.4.1.    Operators `pop' and `push'
2.4.2.    Operators `shift' and `unshift'
2.4.3.    The `splice' Operator
2.4.4.    The `reverse' Operator
2.4.5.    The `sort' Operator
2.4.6.    The `each' Operator
2.4.7.    Array clearing
2.5.      The `foreach' control structure
2.5.1.    `Perl''s default scalar variable `$_'.
2.6.      Scalar and List context
2.6.1.    List-producing expressions in scalar context.
2.6.2.    Scalar-producing expressions in list context.
2.6.3.    `STDIN' in list context.

3.        References

-------------------------------------------------------------------------------

1. Scalar Data
--------------

The simplest kind of data in `Perl' is _scalar data_: mostly numbers or
strings of characters.

1.1. Numbers
------------

Both integers and floating-point numbers have an internal
double-precision floating-point representation.

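Two practical consequences of the floating-point representation can be sketched as follows. This is only an illustration: the variable names `$ratio' and `$difference' are made up for the example, and `my' and `print' are covered later in these notes.

```perl
use strict;
use warnings;

# Division is not truncated to an integer.
my $ratio = 10 / 4;
print "10 / 4 = $ratio\n";

# Decimal fractions such as 0.1 are stored only approximately, so the
# difference below is tiny but not exactly zero.
my $difference = (0.1 + 0.2) - 0.3;
print "(0.1 + 0.2) - 0.3 = $difference\n";
```

For this reason it is usually safer to compare non-integer results within a small tolerance than to test them for exact equality.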
Examples of floating-point literals[1]

* `1.24'
* `255.005'
* `5.235E45'
* `-25.0E-11'
* `251.0'
* `-1.9221'

Examples of integer literals

* `24'
* `5'
* `0'
* `-45'
* `-2501'
* `19221'
* `2519988585883'
* `2_519_988_585_883'

[1] A literal is how a particular value is represented in `Perl'.

1.1.1. Numeric Operators
------------------------

* `2 + 4'
* `5.6 - 34.4447'
* `4.5 * 4'
* `-4 / 3'
* `55.5 / 3'
* `75 % 2'
* `70 % 3'

1.2. Strings
------------

Strings are character sequences that may contain any possible
combination of characters. We may differentiate between _single-_ and
_double-quoted_ string literals.

Single-Quoted

     'Markus'
     'Lena'
     ''
     'Shannon'
     'let\'s include an apostrophe!'
     'and a backslash: \\'
     'a backslash and n: \n'

Double-Quoted

In this case the backslash is used to specify certain control
characters.

     "Barbara"
     "Ana"
     "Hello Karen!\n"
     "Black\tWhite"

The most important string backslash escapes are the following

     \a      Beep
     \b      Backspace
     \c      "Control" character. \cD = CTRL-D
     \e      Escape
     \f      Form feed
     \l      Make the next letter lowercase
     \n      New line, return.
     \r      Carriage return.
     \t      Tab.
     \u      Make the next letter uppercase
     \x      Enables hex numbers
     \x0b    Vertical tab (`Perl' strings have no \v escape)
     \\      Print backslash
     \"      Print double quotes
     \       Escape next character if known otherwise print.
             Also allows octal numbers.
     \L      Make all letters lowercase until the \E
     \U      Make all letters uppercase until the \E
     \Q      Add a backslash-quote to all the nonalphanumerics until the \E
     \E      Terminates the effects of \L, \U, or \Q
     \007    Any octal ASCII value
     \x7f    Any hexadecimal value
     \cx     Control-x

1.2.1. String Operators
-----------------------

* `"hello"."world"'
* `"GNU"."/"."Linux"'
* `"This is"." "."a sentence.\n"'
* `"Tuxie " x 5'
* `70 x 3'

`Perl' performs the conversion between numbers and strings when it is
necessary.

1.3. Scalar Variables
---------------------

Scalar variables hold exactly one value; their names start with a `$'
(named the _sigil_) followed by a `Perl' identifier.
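As a minimal sketch of the automatic conversion between numbers and strings described in Section 1.2.1, using scalar variables to hold the results (all variable names here are illustrative assumptions, not part of the original notes):

```perl
use strict;
use warnings;

my $value    = 70;            # a scalar holding a number
my $repeated = $value x 3;    # x treats its left operand as the string "70"
my $joined   = "12" . "3";    # . concatenates strings
my $added    = "12" + "3";    # + converts both strings to numbers

print "$repeated\n";          # prints 707070
print "$joined\n";            # prints 123
print "$added\n";             # prints 15
```

The operator, not the variable, decides whether a value is treated as a number or as a string.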
Identifiers are case-sensitive, so all the variables that follow are
different

     $hello
     $Hello
     $HELLO
     $Starting_Value
     $quite_long_variable_name

It is important to select meaningful variable names, making use of
underscores when possible[1].

The `Perl' assignment operator is the equals sign, which takes a
variable name on the left side and gives it the value of the expression
on the right.

     $hello = 5;
     $Hello = 4.33;
     $HELLO = "Good morning!\n";
     $Starting_Value = $index - 3;
     $quite_long_variable_name = $x * 2;

Binary assignments are shortcuts like the following

     $a = $a + 3;
     $a += 3;                   # shortcut for the line above

     $a = $a * 3;
     $a *= 3;                   # shortcut for the line above

     $string = $string." ";
     $string .= " ";            # shortcut for the line above

[1] More valuable advice in the _perlstyle_ documentation.

1.4. Basic Output with `print'
------------------------------

     print "This is a message for you.\n";
     print "This is ";
     print "a message for you.";
     print "\n";
     print "This is ", "a message for you.";
     print "The solution is ", 2*3.125,"\n";

Scalar variables in double-quoted string literals are subject to
_variable interpolation_

     $op_sys = "GNU/Linux";
     print "One of the best operating systems is $op_sys\n";

To print the dollar sign it has to be escaped or placed between single
quotes

     print "The \$op_sys variable value is $op_sys\n";
     print 'The $op_sys variable value ', "is $op_sys\n";

The variable name can be placed between curly braces to prevent errors
delimiting variable names

     $job = "student";
     print "The book\'s owner is a $job\n";
     print "This is the favourite bar of the college ${job}s\n";

1.5. Operator Associativity and Precedence
------------------------------------------

From the _perlop_ documentation.

     Associativity     Precedence (highest to lowest)
     left              terms and list operators (leftward)
     left              ->
     nonassoc          ++ --
     right             **
     right             ! ~ \ and unary + and -
     left              =~ !~
     left              * / % x
     left              + - .
     left              << >>
     nonassoc          named unary operators
     nonassoc          < > <= >= lt gt le ge
     nonassoc          == != <=> eq ne cmp ~~
     left              &
     left              | ^
     left              &&
     left              || //
     nonassoc          ..  ...
     right             ?:
     right             = += -= *= etc.
     left              , =>
     nonassoc          list operators (rightward)
     right             not
     left              and
     left              or xor

In case of doubt: use parentheses...

1.6. The `if' Control Structure
-------------------------------

1.6.1. Comparison Operators
---------------------------

Comparison operators return a _true_ or _false_ value and are the
following

     Comparison                  Numeric     String
     Equal                       ==          eq
     Not equal                   !=          ne
     Less than                   <           lt
     Less than or equal          <=          le
     Greater than                >           gt
     Greater than or equal       >=          ge
     Comparison                  <=>         cmp

The unary _not_ operator (`!') gives the opposite value of any Boolean
value.

1.6.2. Using the `if' Control Structure
---------------------------------------

The `if' control structure defines a block that is only executed if its
associated condition returns a true value

     if ($name gt "Monika") {
         print "'$name' is after 'Monika' in sort order\n";
     }

The keyword `else' allows an alternative choice

     if ($name gt "Monika") {
         print "'$name' is after 'Monika' in sort order\n";
     } else {
         print "'$name' is not after 'Monika' in sort order\n";
     }

You may use any scalar value in the conditional

     $value_1 = 10.0;
     $value_2 = 2;
     $check = $value_1 > $value_2;
     if ($check) {
         print "\$value_1 is larger than \$value_2\n";
     }

The rules for deciding if a value is true or false are the following:

* All numbers are true except `0' (zero).
* All strings are true except the empty string ('') and the string '0'.
* All other cases are converted to a number or a string and the
  previous rules apply.

1.7. Getting User Input
-----------------------

The simplest way to get a value from the keyboard into the program is
the line-input operator, `<STDIN>'. Every time a program finds an
`<STDIN>' where a scalar value is expected, `Perl' reads the next
complete line from the _standard input_. The newline character at the
end of the line can be removed using the `chomp' operator.
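The effect of `chomp' can be sketched without typing anything at the keyboard. In this small illustration a hard-coded string stands in for a line returned by the line-input operator; the names `$line', `$removed' and `$removed_again' are assumptions made for the example.

```perl
use strict;
use warnings;

# A hard-coded line standing in for one read from the standard input.
my $line = "42\n";

# chomp removes the trailing newline and returns the number of
# characters it removed.
my $removed = chomp($line);
print "After chomp: '$line' (characters removed: $removed)\n";

# A second chomp finds no trailing newline and leaves the string alone.
my $removed_again = chomp($line);
print "Second chomp removed $removed_again characters\n";
```

Because `chomp' only removes a trailing newline, it is safe to call it on a value that has already been chomped.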
     $value_1 = 10.0;
     $value_2 = <STDIN>;
     print "\$value_2 = $value_2\n";
     chomp($value_2);            # newline is removed
     print "\$value_2 = $value_2\n";
     $check = $value_1 > $value_2;
     if ($check) {
         print "\$value_1 is larger than \$value_2\n";
     }

This can be done in a single step

     $value_1 = 10.0;
     chomp($value_2 = <STDIN>);
     print "\$value_2 = $value_2\n";
     $check = $value_1 > $value_2;
     if ($check) {
         print "\$value_1 is larger than \$value_2\n";
     }

1.8. The `while' Control Structure
----------------------------------

This is one of the possible control structures in `Perl'. It repeats a
block of code as long as a given condition holds:

     $counter = 10;
     while ($counter > 0) {
         print "\$counter = $counter\n";
         $counter -= 2;
     }

The conditional is evaluated prior to the first iteration, thus it is
possible that the block is not executed a single time if the condition
is initially false.

1.9. The `undef' Value
----------------------

Variables used before being assigned hold the special value `undef'.
Where a number is expected `undef' acts as zero, and where a string is
expected it acts as the empty string. This is standard behavior, though
`Perl' will usually warn the user (when warnings are enabled) when
unusual uses of the `undef' value occur.

-------------------------------------------------------------------------------

2. Lists and Arrays
-------------------

A list is defined as an _ordered collection of scalars_, and an array is
a _variable that contains a list_. Each element is an independent scalar
value and a list can hold a mixture of different scalars (numbers and
strings).

2.1. Defining and Accessing an Array
------------------------------------

When using the _strict_ pragma it is necessary to declare an array
before it is first used. The character that defines a variable as an
array variable is `@'.
Thus, to define an array called `replicants' we execute[1]:

     my @replicants;

The array elements are numbered using sequential integers, _starting at
zero_, and each array element behaves as a scalar variable

     my @replicants;
     #
     $replicants[0] = "roy";
     $replicants[1] = "leon";
     $replicants[2] = "pris";
     $replicants[3] = "zhora";
     #
     print "$replicants[1]\n";
     #
     my $index = 3;
     print $replicants[$index-1],"\n";
     # floating-point indexes truncate to the next lower integer.

The storage of an array element beyond the end of the array extends the
array size, with intervening elements created as `undef' values.

     #
     $replicants[10] = "rachel"; # six undef elements
     #

The last index of the array `replicants' is `$#replicants', which is
the number of elements minus one

     #
     my $end = $#replicants;
     my $number_of_replicants = $end + 1;
     print "$replicants[$end]\n";
     #

To extract elements from the end of the list a negative index can be
used.

     #
     print "$replicants[-1]\n";
     print "$replicants[-8]\n";
     #

[1] In fact the scalar variable `$replicants' is a different variable,
    though for the sake of clarity it is better to avoid having arrays
    and scalar variables with the same name.

2.1.1. List Literals
--------------------

A list literal is how a list is represented in the code, as a list of
comma-separated values between parentheses. The range operator (`..')
can also be used inside a list literal.

     @replicants = ("zhora", "pris", "leon", "rachel", "roy");
     my @numbers = (12, 32, 13, 44, 14, 66);
     my @elist = ();            # Empty list - zero elements
     my @list_1 = (1..100);
     my @list_2 = (0..10, 50..100);
     my @list_3 = ($replicants[1], $replicants[0], 45 + $list_2[3]);

The `qw' (quoted word) shortcut simplifies the list definition:

     my @numbers = qw(12 32 13 44 14 66);

The elements are treated as single-quoted strings, and any punctuation
character can be chosen as a delimiter

     @numbers = qw!12 32 13 44 14 66!;
     @numbers = qw/12 32 13 44 14 66/;
     @numbers = qw<12 32 13 44 14 66>;

2.2. List Assignment
--------------------

You can assign list values to variables and easily swap variables'
values

     my ($var_1, $var_2, $var_3) = ("one", "two", "three");
     ($var_1, $var_2) = ($var_2, $var_1);   # swap two values
     @replicants = ("zhora", "pris");
     @replicants = ("pris", "zhora",);      # a trailing comma is allowed

If there are extra values on the right side they are ignored, and if
there are extra variables on the left side they are given the `undef'
value. You can mix arrays and scalars

     my @characters = (@replicants,"deckard","gaff");

And you can easily copy an array into another array

     my @copy_arr = @characters;

2.3. Interpolating Arrays
-------------------------

An array in a double-quoted string is interpolated: its values are
expanded, with the elements separated by spaces.

     print "The replicants are @replicants\n";

Thus, it is important to be careful when including the character `@' in
a double-quoted string. For example, to define a variable containing an
email address one should do it in one of these two alternative ways

     my $email;
     $email = "sebastian\@tyrell.com";
     $email = 'sebastian@tyrell.com';

A single array element interpolates into its value

     print "The 3rd element of the array \@replicants is $replicants[2].\n";

2.4. Array Operators
--------------------

2.4.1. Operators `pop' and `push'
---------------------------------

An array can be considered as a _stack_ of information, where you add
and remove from the end of the array using the operators `push' and
`pop'.

     print pop @numbers, "\n";
     print "$#numbers\n";
     push @numbers, 20;
     push @numbers, 1..60;

2.4.2. Operators `shift' and `unshift'
--------------------------------------

These are equivalent to the previous operators, but they take and add
elements at the beginning of the list.

     print shift @numbers, "\n";
     shift @numbers;
     print "$numbers[1]\n";
     unshift @numbers, 10;
     print "$numbers[1]\n";
     unshift @numbers, 1..60;
     print "$numbers[1]\n";

2.4.3. The `splice' Operator
----------------------------

This operator takes a maximum of four arguments and allows working with
sections of an array, deleting or adding elements at any place. The last
two arguments are optional.

The first argument is the array and the second is the starting
position. If only these two arguments are used `Perl' removes all the
elements from the starting position to the end of the array and returns
them to you.

The third argument is a length, making it possible to remove some
elements from the middle of the array.

The fourth argument is a replacement list that is added to the array in
the position stated by the second argument. Thus you can remove and add
elements in a single statement. If no elements should be deleted, then
argument three is made equal to zero.

As with scalars, array values between double quotes are interpolated.

     print "\@characters = @characters\n";
     my @array = splice @characters, 2; # remove the third array element
                                        # and everything after it
     print "\@array = @array\n";
     print "\@characters = @characters\n";
     my @removed_1 = splice @numbers, 3, 6;
     my @new_array = splice @numbers, 3, 0, 1..5; # inserts 1..5 at index 3;
                                                  # nothing is removed, so
                                                  # @new_array is empty

2.4.4. The `reverse' Operator
-----------------------------

This operator takes a list of values as an argument and returns the
list in the opposite order.

     my @num_range = 1..10;
     print "@num_range\n";
     my @reversed_num_range = reverse @num_range;
     print "Original array = @num_range\n";
     print "Reversed array = @reversed_num_range\n";

2.4.5. The `sort' Operator
--------------------------

This operator takes a list of values as an argument and returns the
list sorted according to the internal character values (code point
order). Note that numbers are also compared as strings.

     my @sorted_characters = sort @characters;
     print "Original array = @characters\n";
     print "Sorted array = @sorted_characters\n";
     my @sorted_num_range = sort @num_range;
     print "Original array = @num_range\n";
     print "Sorted array = @sorted_num_range\n"; # 1 10 2 3 4 5 6 7 8 9

2.4.6. The `each' Operator
--------------------------

This operator[1] takes an array as an argument and each time it is
called returns a pair of values `(index, array_value)':

     while (my ($index_value, $array_element) = each @characters) {
         print "Index: $index_value\tElement: $array_element\n";
     }

[1] Only valid for `Perl 5.12' and later versions.

2.4.7. Array clearing
---------------------

The correct way to clear an array is to assign an empty list to the
array

     @replicants = ("leon", "pris", "zhora");
     . . .
     @replicants = ();        # The array is emptied

Note that this is different from

     @replicants = ("leon", "pris", "zhora");
     . . .
     @replicants = undef;     # The array is the one-element list (undef)

2.5. The `foreach' control structure
------------------------------------

This is a useful control structure to process every element of a list,
one at a time, executing a block of instructions in each iteration. For
example

     print "Contents of \@characters:\n";
     foreach my $chtr (@characters) {
         print "$chtr\n";
     }

Be careful because if you modify the control variable (`$chtr' in the
example) you modify the actual list element. For example, if you want to
precede every list element by a tab and add a newline character after
the list element

     print "Contents of \@characters: @characters\n";
     foreach my $chtr (@characters) {
         $chtr = "\t$chtr";
         $chtr .= "\n";
     }
     print "Contents of \@characters: @characters\n";

2.5.1. `Perl''s default scalar variable `$_'.
---------------------------------------------

If the control variable is omitted in the definition of the `foreach'
control structure the default variable is used: `$_'. For example

     foreach (1..20) {
         print;               # with no argument, print outputs $_
         print "$_ ", 2*$_, " ", 3*$_, " ", 4*$_,"\n";
     }

This is by far the most commonly used default variable in `Perl'. In
many cases, when a variable is needed and no name is provided, `Perl'
will use `$_' as a replacement.

2.6. Scalar and List context
----------------------------

In `Perl' a given expression can have a different meaning according to
the context, that is, according to where it appears and how it is used.
This is common in natural languages. Considering scalars and lists,
when `Perl' parses a particular expression, it expects either a scalar
or a list value. For example

     99 + <something>   # Something should be a scalar (scalar context)
     sort <something>   # Something should be a list (list context)

Even if `<something>' is the same exact sequence of characters, it may
give completely different values depending on the evaluation context.
For example, an array variable gives the list of its elements in list
context, while in scalar context it gives the number of elements of the
list.

     my @characters = ("leon","deckard","gaff");
     @sorted = sort @characters;  # list context :: deckard gaff leon
     my $i = 22 + @characters;    # scalar context :: i = 22 + 3 = 25

Each expression can have different output according to the evaluation
context.

2.6.1. List-producing expressions in scalar context.
----------------------------------------------------

In principle you should check the documentation to see what the output
is in scalar context of expressions that are usually used to produce a
list. Some expressions (e.g. `sort') have no scalar-context value.
Others have a different value according to the context, like `reverse'

     my @characters = ("leon","deckard","gaff");
     @reversed = reverse @characters; # list context :: gaff deckard leon
     $reversed = reverse @characters; # scalar context :: ffagdrakcednoel

Some scalar context examples

     $variable = <something>;
     $tyrell[3] = <something>;
     1234 + <something>;
     if (<something>) { ...
     while (<something>) { ...
     $tyrell[<something>] = <something>;

Some list context examples

     @variable = <something>;
     ($tyrell,$sebastian) = <something>;
     ($tyrell) = <something>;
     push @tyrell, <something>;
     foreach (<something>) { ...
     sort <something>;
     print <something>;

2.6.2. Scalar-producing expressions in list context.
----------------------------------------------------

The use of a scalar-producing expression in list context always results
in the promotion of the scalar to a one-element list.

     @output = 6*12; # One-element list (72)

2.6.3. `STDIN' in list context.
-------------------------------

The line-input operator `<STDIN>' in list context returns a list with
all the remaining lines up to an end-of-file; each line is a separate
list element. For example, reading several lines and removing the
end-of-lines:

     @input = <STDIN>;
     chomp(@input);

Input from a file will be read until the end of the file, while
keyboard input should be stopped with the end-of-file key combination,
normally `CTRL-D' in `UNIX' systems.

You can chomp all lines simultaneously

     chomp(@input = <STDIN>);

-------------------------------------------------------------------------------

3. References
-------------

Learning Perl. Ed. O'Reilly
(http://en.wikipedia.org/wiki/Learning_Perl)

perlop: Precedence and Associativity
(http://perldoc.perl.org/perlop.html#Operator-Precedence-and-Associativity)