Special Variables¶
Special variables in Perl are sometimes called "sigil variables" or "punctuation variables."
Table of Contents¶
- List of Special Variables
- $/ and $\
- Practical Usage Examples
- Memorize These
- Perl Special Variable Cheatsheet
List of Special Variables¶
- $_: Default variable. Holds the current line when processing text, or the default input.
    - The most common special variable, used in while (<>) {...}, foreach, map, regex matches, etc.
    - In one-liners, this is the current line.
- $.: Current line number.
    - In a file read loop, it increments on every line.
- $0: Name of the running script.
    - Just like $0 in Bash. Shows the name of the Perl script being executed.
- $?: Exit status of the last system command.
    - Similar to $? in Bash, but it holds the full wait status. The actual exit code is in the high byte ($? >> 8).
- @ARGV: Command-line arguments.
    - Equivalent to $@ in Bash (not to be confused with Perl's own $@, which holds the last eval error).
- $ARGV: The name of the current file when looping over @ARGV in while (<>).
    - Used inside while (<>) loops.
    - For example, while reading file1.md, $ARGV will be "file1.md", and so on for each file.
- $!: Last system error message.
    - Like strerror(errno) in C.
    - E.g.: die "Error: $!"
- $^O: Operating system name.
    - linux, darwin, MSWin32, etc.
- $ENV{VAR}: Environment variables.
    - Access shell env vars. E.g., $HOME would be $ENV{HOME}.
- $|: Autoflush output buffer.
    - Normally, Perl buffers output and flushes it to the terminal when a newline is found.
    - Setting $| = 1; disables buffering, so output is immediately written.
    - Useful for progress bars, interactive output, long running scripts, etc.
- $&: Matched string in last regex.
    - Holds the whole matched string from the last successful regex match.
    - Like ${BASH_REMATCH[0]}.
- $1, $2, etc.: Capture groups in regex.
    - Like bash regex captures, but instead of \1, it's $1.

The difference between $ARGV[n] and @ARGV comes from how variables are accessed in Perl:

- @ARGV: Refers to the entire array, i.e., all the command-line arguments.
- $ARGV[0]: Accesses a single element (a scalar) from the array @ARGV.
- $ARGV (without []): A separate scalar variable, not an element of the array. It holds the name of the file currently being read by <> (or "-" when reading from stdin).
    - This will hold the filename that is currently being processed if there are multiple files.
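To make the @ARGV / $ARGV[0] / $ARGV distinction concrete, here is a minimal sketch. It assumes two throwaway files (the names file1.md and file2.md are just placeholders created in a temp directory) and assigns @ARGV by hand to stand in for real command-line arguments.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Create two small throwaway files so the example is self-contained.
my $dir   = tempdir(CLEANUP => 1);
my @files = ("$dir/file1.md", "$dir/file2.md");
for my $path (@files) {
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "line one\nline two\n";
    close $out;
}

# Pretend these were passed on the command line.
@ARGV = @files;

my $arg_count = @ARGV;      # the array in scalar context: number of arguments
my $first_arg = $ARGV[0];   # a single element of the array

my %lines_per_file;
while (<>) {
    # $ARGV (no brackets) is the name of the file <> is reading right now.
    $lines_per_file{$ARGV}++;
}

print "$_: $lines_per_file{$_} lines\n" for sort keys %lines_per_file;
```

Note that <> shifts file names off @ARGV as it opens them, which is why the count and first element are captured before the loop.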
Advanced/Less Common Special Vars¶
- $^I: Stores the in-place edit extension (used with the -i flag).
    - Define this variable to enable in-place editing. Use undef $^I to disable in-place editing.
    - Like sed -i.bak, Perl supports in-place editing with a backup. $^I stores the backup extension you set (perl -p -i.bak -e '..').
    - If you set it (e.g., $^I = '.bak'), Perl will create a backup of the original file.
    - Example from the command line: perl -pi.bak -e 's/foo/bar/' file.txt will back up the original file to file.txt.bak.
- $^W: Current value of the global warnings flag.
    - Shows if warnings are enabled via the -w switch.
    - Rarely used directly. Instead, use use warnings;.
- $.: Line number in the current input file.
- $/: Input record separator (default is newline).
    - Changing it lets you change how Perl reads input.
    - You can change it to read whole files in one go.
    - Example: undef $/; reads the entire file at once.
- $\: Output record separator.
    - Adds a string after every print.
    - Ex: with $\ = "\n";, every print automatically ends with \n.
- $": Separator used when interpolating arrays into a string (default is a space " ").
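A small sketch of the list separator $" in action. The array contents are made up for illustration; local keeps the change confined to one block.

```perl
use strict;
use warnings;

my @items = ('a', 'b', 'c');

# Default: interpolated elements are joined with a single space.
my $default = "@items";

my $custom;
{
    local $" = ', ';      # change the separator only inside this block
    $custom = "@items";
}

print "$default\n";   # a b c
print "$custom\n";    # a, b, c
```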
$/ and $\¶
These can be changed to modify input/output behavior:
Variable | Behavior |
---|---|
$/ | Input record separator. Default is newline. Example: undef $/; reads the whole file at once. |
$\ | Output record separator. Example: $\ = "\n"; automatically adds newline after every print. |
$/ - Input Record Separator¶
$/ defines what Perl considers the "end of a record" when reading input.
It's set to a newline \n by default, meaning one line at a time.
$/ (Input Record Separator) Examples:¶
If you want to read the entire file contents into one variable, you can use undef to unset this variable:
undef $/; # Remove the newline separator
open my $fh, '<', 'file.txt' or die $!;
my $file_contents = <$fh>;
close $fh;
print $file_contents;
- open: Builtin Perl function to open a file.
- my $fh: Defines $fh as a filehandle. Like a pointer to the opened file.
    - The my keyword makes it lexically scoped (only available in that block).
- '<': Open in read-only mode. Other options:
    - '>': Write (overwrite)
    - '>>': Append
- 'file.txt': The file to open.
- or die $!;: If open fails, this will terminate the script and print the system error message from $!.
- my $file_contents = <$fh>;
    - The angle brackets around $fh are the readline operator. The empty form, <>, is what's usually called the diamond operator.
    - It reads input from the given filehandle, or from the files named in @ARGV (or stdin) if no filehandle is specified.
    - Normally this would only read one line from $fh.
        - But, because we did undef $/;, it changed the behavior to read the entire file in one shot (called "slurping").
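One thing to watch with undef $/; is that it changes $/ for the rest of the program. A common variation, sketched here, is to use local $/; inside a sub so the change is undone when the sub returns. The slurp name and the temp file are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Read a whole file into one string (hypothetical helper name).
sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;                 # undef only until this sub returns
    my $contents = <$fh>;     # one read returns the entire file
    close $fh;
    return $contents;
}

# Throwaway file so the example can run anywhere.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "first line\nsecond line\n";
close $tmp_fh or die "Cannot close $tmp_path: $!";

my $contents = slurp($tmp_path);
print $contents;
```

After slurp() returns, $/ is back to "\n", so later line-by-line reads behave normally.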
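If you want to read files where entries are separated by blank lines (like paragraphs, config entries, etc.), you can set this to the empty string (""). This enables paragraph mode: each read returns one block of text, up to the next run of blank lines.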
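Here is a rough sketch of paragraph mode. The input text is invented for the example and is read through an in-memory filehandle instead of a real file.

```perl
use strict;
use warnings;

# Hypothetical input: three entries separated by one or more blank lines.
my $text = "name=alpha\nport=1\n\nname=beta\nport=2\n\n\nname=gamma\nport=3\n";

my @records;
{
    local $/ = "";    # paragraph mode: a record ends at a run of blank lines
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (my $record = <$fh>) {
        chomp $record;            # in paragraph mode, chomp strips all trailing newlines
        push @records, $record;
    }
    close $fh;
}

printf "Read %d records\n", scalar @records;
```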
If you're parsing delimited records, like CSV, without a CSV parser:
$/ = ",";
open my $fh, '<', 'file.csv' or die $!;
while (<$fh>) {
print "CSV field: $_\n";
}
close $fh;
$\ - Output Record Separator¶
The $\ variable is the output record separator.
This variable controls what is appended to the end of every print call. (say is not affected by it; say always appends a newline.)
For instance, setting $\ = "\n"; will append a newline to the end of all print calls.
The default value of this variable is undef, so nothing is appended.
This is useful for formatting data (e.g., $\ = ","; to end each printed item with a comma when building comma-separated output).
$\ (Output Record Separator) Examples¶
Perl will append the contents of the $\ variable to every single print statement. By default, it's empty.
Setting $\ = "\n"; automatically adds \n to the end of every print statement.
You can also auto-separate output records or lines with something else, such as a custom delimiter string.
To format output as comma-separated values, you can set $\ to ",".
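A minimal sketch of both settings. Output goes to an in-memory buffer (an assumption made so the result is easy to inspect), and local limits each change to its own block.

```perl
use strict;
use warnings;

my $output = '';
open my $out, '>', \$output or die "Cannot open in-memory file: $!";

{
    local $\ = "\n";      # every print in this block ends with a newline
    print {$out} "first";
    print {$out} "second";
}
{
    local $\ = ",";       # every print in this block ends with a comma
    print {$out} "a";
    print {$out} "b";
}
close $out;

print $output;   # "first\nsecond\na,b,"
```

Notice the trailing comma after b: $\ is appended after every print, including the last one.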
Practical Usage Examples¶
Print the Current Line Number¶
The $. variable stores the current line number while looping over a file.
-n wraps the code in an implicit while (<>) { ... } loop.
So, this is an alternative to -p if you want to print modified versions of the lines.
If you did this with -p, you'd get two copies of the lines. Since the -e code is run before -p prints the $_ variable, it would print what's in the print statement first, then it would print the actual line from the file ($_).
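As a script-form sketch of the same idea: the loop below does by hand what -n sets up implicitly, reading from an in-memory filehandle with made-up contents.

```perl
use strict;
use warnings;

# Hypothetical input, standing in for a file on disk.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my @numbered;
while (my $line = <$fh>) {
    # $. is the number of the line that was just read from $fh.
    push @numbered, sprintf("%d: %s", $., $line);
}
close $fh;

print @numbered;
```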
Using Environment Variables¶
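A short sketch of reading and setting environment variables through %ENV. MY_APP_MODE is a made-up name used only for illustration, and the // operator needs Perl 5.10 or newer.

```perl
use strict;
use warnings;

# HOME is usually set on Unix-like systems; fall back if it is missing.
my $home = $ENV{HOME} // '(HOME is not set)';
print "Home directory: $home\n";

# %ENV is an ordinary hash, so you can assign to it too.
# Child processes started afterwards inherit the new value.
$ENV{MY_APP_MODE} = 'debug';    # hypothetical variable name
my $mode = $ENV{MY_APP_MODE};
print "Mode: $mode\n";
```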
Showing the Last Error¶
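A minimal sketch of reporting $! after a failed open. The path is a placeholder that is assumed not to exist.

```perl
use strict;
use warnings;

my $missing = '/nonexistent/dir/file.txt';   # hypothetical path that should not exist
my $error;

if (open my $fh, '<', $missing) {
    close $fh;
} else {
    # Copy $! right away; a later system call may overwrite it.
    $error = "$!";
    warn "Could not open $missing: $error\n";
}
```

$! is only meaningful right after a failure, so check the return value of the call first and read $! second.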
Auto-flushing Output with $|¶
Setting $| to a true value stops Perl from buffering output on the currently selected filehandle (STDOUT by default), so each write is flushed straight to the terminal.
Without $| = 1, Perl will buffer output and only display "Processing..." after a newline (\n) or when the buffer is flushed.
With $| = 1, the text is immediately flushed and visible on the screen without waiting.
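A rough sketch of the "Processing..." case described above, with sleep standing in for slow work.

```perl
use strict;
use warnings;

$| = 1;                    # autoflush the currently selected handle (STDOUT by default)

print "Processing...";     # no newline, but it shows up right away
sleep 1;                   # stand-in for a slow task
print " done\n";
```

Try commenting out the $| = 1; line and running it in a terminal: the first message should then only appear together with " done".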
Memorize These¶
- $_: Current line
- $.: Current line number
- $1, $2, $&: Regex captures
    - $& is the entire match, not a capture group.
- $!: Last system error
- @ARGV: Command-line args
- $ARGV: Current file in @ARGV
- $ENV{VAR}: Environment vars
- $?: Exit code
- $^O: OS name
- $|: Output autoflush
- $/: Input separator
- $\: Output separator
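To see how the regex variables from the list relate to each other, here is a small sketch with an invented input string.

```perl
use strict;
use warnings;

my $line = 'user=alice id=42';
my ($whole, $user, $id);

if ($line =~ /user=(\w+) id=(\d+)/) {
    $whole = $&;    # the entire matched text
    $user  = $1;    # first capture group
    $id    = $2;    # second capture group
}

print "match: $whole\n";
print "user:  $user\n";
print "id:    $id\n";
```

These variables are only reliable after a successful match, which is why they are read inside the if block.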
Perl Special Variable Cheatsheet¶
Variable | Description | Example |
---|---|---|
$_ | Default variable. Holds the current line when processing input. | |
$. | Current line number when reading a file. | |
$0 | Name of the running script. | |
$? | Exit status of the last system command (like Bash). | |
@ARGV | Array of command-line arguments (like Bash's $@). | |
$ARGV | Current file being read when looping over @ARGV using while (<>). | |
$! | Last system error message. | |
$^O | Operating system name (linux, darwin, MSWin32, etc.). | |
$ENV{VAR} | Access shell environment variables. | |
$\| | Autoflush output buffer (1 = no buffering, 0 = default buffering). | |
$& | Entire matched string from the last regex match. | |
$1, $2, etc. | Capture groups in regex. | |
$^I | In-place edit extension (for -i flag). | perl -pi.bak -e 's/foo/bar/' file.txt |
$^W | Warnings flag. | Rarely used; instead, use use warnings;. |
$/ | Input record separator (default: \n). | undef $/; reads entire file at once. |
$" | Array element separator when interpolated. | Default: " " |
$\ | Output record separator. | Example: $\ = "\n"; print "Hello"; adds newline after every print. |
Regular Variables¶
Variables have three typical data types in Perl, indicated by their sigil:
- $var: Scalars
- @var: Arrays
- %var: Hashes
A scalar holds a single value: basically a string, a number, or a reference.
Variable Contexts¶
Variables act differently when they're in different "contexts."
There are four main contexts, which define what Perl wants:
- Scalar: Perl wants a single value.
- List: A list of values.
- Void: Throwaway result.
- Boolean: True/false evaluation.
Another way to look at it:
Context is how Perl decides what kind of value it wants from an expression.
- A single value: scalar context
- A list of values: list context
- A boolean test: boolean context (a special case of scalar)
- A void context: the value is thrown away
$x = @array; # Scalar context (returns number of elements)
@x = split /,/, $str; # List context (returns full list)
split /,/, $str; # Void context (result ignored)
if (@array) { ... }; # Boolean context (true if array is non-empty)
There are other kinds of context, but the four main ones are what we usually care about.
One more idea worth mentioning is taking a reference with a backslash (\@array, \%hash, etc.). This is sometimes loosely called reference context, although it is not one of Perl's formal contexts.
Scalar Context¶
When using the $var = ...
syntax, this is known as scalar context.
List Context¶
When perl expects a list of values, it's list context.
E.g., using @var = ...
, this is list context.
List context is used when assigning to arrays, list assignments, foreach
loops,
function arguments, and when returning a list.
print join(", ", @items); # join gets a list context
sub names { return ("alice", "bob"); }
my @n = names(); # names() in list context
You can assign scalars from list context by using parentheses. For instance, to assign the first element of an array to a scalar:
my ($first_name) = @name_list;
# or to grab the first two elements:
my ($first_name, $second_name) = @name_list;
Void Context¶
Void context is used when the result of an expression is ignored.
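One way to see void context next to the other two is the builtin wantarray, which reports the context a sub was called in. This is a small sketch; the context sub and the @seen array are names made up for the example.

```perl
use strict;
use warnings;

my @seen;   # records the context of each call

sub context {
    my $want = wantarray;   # true: list, false but defined: scalar, undef: void
    push @seen, !defined $want ? 'void' : $want ? 'list' : 'scalar';
    return;
}

my @list   = context();   # list context
my $scalar = context();   # scalar context
context();                # void context: the return value is thrown away

print "@seen\n";   # list scalar void
```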
Boolean Context¶
Boolean context is used when evaluating truthiness, like in the condition of an if statement.
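A short sketch of what counts as true and false in boolean context. The variable names and values are invented for illustration.

```perl
use strict;
use warnings;

my @queue = ();       # empty array
my $zero  = '0';      # the string "0"
my $dec   = '0.0';    # a non-empty string that is not exactly "0"
my $nothing;          # undef

my $queue_state   = @queue   ? 'true' : 'false';   # empty array is false
my $zero_state    = $zero    ? 'true' : 'false';   # "0" is false
my $dec_state     = $dec     ? 'true' : 'false';   # "0.0" is true
my $nothing_state = $nothing ? 'true' : 'false';   # undef is false

if (@queue) {
    print "queue has items\n";
} else {
    print "queue is empty\n";
}
```

The false values are undef, 0, the strings "0" and "", and an empty list or array. Everything else is true.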
Sigils¶
Sigils are what come before variables to define what kind of variable they are.
Sigil | Used for |
---|---|
$ | Scalars |
@ | Arrays |
% | Hashes |
& | Subroutines (code) |
Keywords for Declaring Variables¶
There are three main keywords used to declare variables.
- my: The most common one. Makes a new lexical variable (which is privately scoped to the current block).
    - This is what you'll use 99% of the time in a Perl script.
- our: Makes a lexical alias to a package global. The real global lives in a package, usually main::.
- local: Temporarily changes the value of a package global for the duration of a block (then it auto-restores).
When you're creating normal variables for your script, use my.
When you want to declare and use your own package global, use our. Built-in globals like @ARGV need no declaration; assigning to them directly changes them for the rest of the program.
When you want to temporarily change a global, e.g., @ARGV or $^I, use local.
for my $file (<*.md>) {
chomp($file);
local $^I = '.bak';
local @ARGV = ($file);
while (<>) {
s/old/new/g;
print;
}
}
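To compare the three keywords side by side, here is a minimal sketch. $level, show_level, and @results are hypothetical names used only for this example.

```perl
use strict;
use warnings;

our $level = 'global';               # package global

sub show_level { return $level }     # always reads the package global

my @results;
{
    local $level = 'temporary';      # new value until this block exits
    push @results, show_level();     # sees 'temporary'
}
push @results, show_level();         # value restored: 'global'

{
    my $level = 'lexical';           # a separate lexical; the global is untouched
    push @results, show_level();     # still 'global'
}

print join(', ', @results), "\n";    # temporary, global, global
```

The difference is that local changes what other code sees in the global for a while, whereas my creates a new variable that only the enclosing block can see.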
There's also use vars, which can be used to declare package global variables.
This does much the same thing as our, but pre-dates it. (One difference: use vars applies to the whole package, while our is lexically scoped.)
A declaration such as use vars qw(@markdown_files); basically tells Perl that "there's going to be a global variable named @markdown_files available in this package, don't warn me about it."
We can specify multiple different variable names in the same statement as well.
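A small sketch of use vars next to our. @markdown_files comes from the text above; $verbose and @text_files are extra names made up to show multiple declarations.

```perl
use strict;
use warnings;

# Older style: declare several package globals in one statement.
use vars qw(@markdown_files $verbose);

# Modern style: our does the same job for a (hypothetical) second array.
our @text_files;

@markdown_files = ('README.md', 'notes.md');
$verbose        = 1;
@text_files     = ('todo.txt');

my $md_count  = @markdown_files;
my $txt_count = @text_files;
print "markdown: $md_count, text: $txt_count\n";
```

Without either declaration, use strict would reject these unqualified global names at compile time.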
Typeglobs and Globrefs¶
A typeglob is a special kind of value that holds all the variables sharing one name, whatever their type (scalar, array, hash, subroutine, filehandle). It lets you refer to everything associated with a particular name at once.
It basically globs all variables of the same name together, regardless of what type they are.
The main use of typeglobs in modern Perl is to create symbol table aliases.
A globref is a reference to a typeglob.
- Basic syntax for a typeglob: *glob_name = *target_name;
    - Makes every variable named glob_name an alias for the variable of the same type named target_name.
- Basic syntax for a globref: \*glob_name
    - The backslash in Perl is how we take a reference to something.
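A minimal sketch of both forms, using the glob_name / target_name placeholders from above. The our declarations are there only so use strict accepts the short names.

```perl
use strict;
use warnings;

our $target_name = 'scalar value';
our @target_name = (1, 2, 3);
our ($glob_name, @glob_name);     # declared so strict accepts the aliased names

# Typeglob assignment: every slot of glob_name now aliases target_name.
*glob_name = *target_name;

# Globref: a reference to the typeglob itself.
my $globref = \*target_name;

print "$glob_name\n";              # scalar value
print scalar(@glob_name), "\n";    # 3
print ref($globref), "\n";         # GLOB
```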
Using a Typeglob for Aliases¶
A single typeglob assignment can alias multiple values of different types.
For instance, *this = *that; makes:
- $this an alias for $that
- @this an alias for @that
- %this an alias for %that
- &this an alias for &that
- etc...
All variables of all types sharing the same name will be aliased.
It's generally safer to alias a single slot by assigning a reference to the typeglob, e.g. local *Here::blue = \$There::green;.
This makes $Here::blue (scalar) a temporary alias for $There::green (scalar) but does not apply to other types/values. E.g., it does not alias @Here::blue (array) to @There::green, and does not apply to the other types either.
This way you're only aliasing the thing you need, with very few side effects.
If we wanted to use the this/that example again, *this = \$that; makes $this an alias for $that only, a scalar value.
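Here is a sketch of single-slot aliasing, first with this/that and then with the temporary Here::blue / There::green alias. The values assigned are made up for the example.

```perl
use strict;
use warnings;

our $that = 'original';
our @that = (1, 2, 3);
our ($this, @this);

# Alias only the scalar slot: $this is now another name for $that.
*this = \$that;
$this = 'changed via alias';

print "$that\n";                # changed via alias
print scalar(@this), "\n";      # 0, because @this is not aliased to @that

# A temporary alias with local, undone when the block exits.
$There::green = 'green';
my ($inside, $outside);
{
    local *Here::blue = \$There::green;
    $inside = $Here::blue;      # 'green'
}
$outside = defined $Here::blue ? 'still aliased' : 'alias removed';
print "$inside, $outside\n";
```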
Typeglob Terminology¶
- Typeglob (*name): A single entry in a package's symbol table for the bareword name.
- Globref (\*name): A reference to a typeglob.
    - Useful when you want to pass a handle to the bundle around (e.g., pass a filehandle to a subroutine).
- Symbol Table: A package's symbol table is itself a hash: %Package::
    - It maps names to typeglobs.
    - *{"Package::foo"} fetches the typeglob for foo.
Accessing Specific Typeglob Slots¶
You can fetch references to the individual slots of a typeglob by putting the
type in braces {...}
.
*myvar{SCALAR}; # Ref to $myvar
*myvar{ARRAY}; # Ref to @myvar
*myvar{HASH}; # Ref to %myvar
*myvar{CODE}; # Ref to &myvar
*myvar{IO}; # Ref to I/O (filehandle) for myvar
*myvar{FORMAT}; # Ref to FORMAT slot
*myvar{GLOB}; # Ref to the glob itself
*myvar{NAME}; # "myvar" - expands to the name of the glob
*myvar{PACKAGE}; # Name of the package that *myvar belongs to
For instance, if we wanted to set the scalar value of the myvar
typeglob:
# Create the (global) vars and typeglob
our $test;
our @test;
*myvar = *test;
# Modify via the typeglob
${ *myvar{SCALAR} } = 42; # Set the scalar value of myvar ($test)
push @{ *myvar{ARRAY} }, 1, 2, 3; # Add to the array value of myvar (@test)
Temporarily Redirecting Output with Typeglobs¶
The same type of syntax is used when you want to redirect STDOUT or STDERR.
If you localize the filehandle's typeglob (local *STDERR;) inside a block ({...}) and reopen it there, the local change to STDERR will not persist once the block exits. So this is a way to temporarily change where output goes.
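A rough sketch of a temporary STDERR redirect. It assumes an in-memory buffer as the target so the example is self-contained; a log file path would work the same way. This affects Perl-level prints only, not child processes.

```perl
use strict;
use warnings;

my $captured = '';
{
    local *STDERR;    # the real STDERR comes back when this block exits
    open STDERR, '>', \$captured or die "Cannot redirect STDERR: $!";
    print STDERR "this goes to the buffer\n";
    close STDERR;
}

print STDERR "this goes to the real STDERR again\n";
print "captured: $captured";
```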
Passing a Filehandle as a Globref¶
When you need to pass a filehandle to a subroutine, this is where globrefs come
in handy.
sub say_to {
my ($fh_globref, $msg) = @_;
print {$fh_globref} "$msg\n";
}
say_to(\*STDERR, "Error message");