Setting up a secure HTTP server on a FreeBSD virtual machine

NB: Before you begin following this article you’ll need to understand how to use vi, as we’ll be working at a command-line interface with this as the only text editor at our disposal. If you don’t know how to do basic text editing in vi, I suggest you read a few articles and familiarize yourself with it, otherwise you won’t have a good time. Also, if you’re on an IPv6 network you can still follow this article, but everything here is done in IPv4, so you’ve been warned.

Today I’ll take you through the set-up of a secure (HTTPS) server on a FreeBSD virtual machine which you can use to host your own website for dev purposes, or for educational purposes, or whatever.

We’re going to be using VirtualBox, which is free to download, and FreeBSD 10.2, which is the latest stable release at time of writing, although you probably won’t have any problems using a newer release.

The web server we’re going to use is NginX, which is a highly configurable, simple, fast, secure and powerful web server.

What are the advantages of using a virtual machine?

Glad you asked that question. I’ll start with the disadvantages: A virtual machine runs software slightly slower and consumes some more resources than the host machine it’s running on. The advantages are…everything else.

What are the advantages of using FreeBSD?

FreeBSD is an industrial-grade operating system that is extremely secure, fast, and surprisingly simple to use. Ever used OSX? Then you’ve already used a lot of FreeBSD code. Yeah. OSX has its own kernel, but much of its userland and a good part of that kernel come from FreeBSD. Ever played a PS4? Its operating system is a modified FreeBSD, and the PS3’s is reportedly BSD-derived as well. It’s also used for mission-critical high-traffic networking equipment such as load balancers, firewalls, email servers and webservers, basically any really important application where its fans will tell you Linux is just not “up to the job”.

And speaking of webservers, that’s what we’re going to be setting up.

If you haven’t already done it, install VirtualBox on your Mac, Windows or Linux machine.

Oracle VM VirtualBox Manager_002

Next you’ll need to head over to FreeBSD.org and grab the 64-bit ISO disk image of the 10.2 release. Click Production 10.2, scroll down to ftp, click on the first link, then click on the 64-bit CD image and let it start downloading.

Selection_003

Go back into VirtualBox, and click the New icon to start creating your virtual machine. When you create something, a lot of times the first thing you do is give it a name, and creating a VirtualBox virt is no exception. In this case I’ve gone for `BSDBox’, but you’re free to use anything you like, as long as it doesn’t contain any illegal characters. So go on, pick something wacky, type in your crazy name, then select `BSD’ under type and `FreeBSD (64-bit)’ under version.
Create Virtual Machine_005

Next you choose how much of your computer’s memory will be allocated for this virt. In this case I’ve gone for 512MB, which I’m hoping will be enough to run NginX and get me through this article. Feel free to use more if you can afford it, but not less, please.

Create Virtual Machine_006

Next we create a virtual Hard Drive… do what the picture says.

Create Virtual Machine_007

Do what the picture says.

Create Virtual Hard Drive_008

Again.

Create Virtual Hard Drive_009

Again.

Create Virtual Hard Drive_010

Wait, might take a while…

Create Virtual Hard Drive: Creating fixed medium storage unit '-home-noel-VirtualBox VMs-BSDBox-BSDBox.vdi'_014

Ok, now we’ve got a virt set up with an 8GB hard drive and 512MB of memory but no operating system.

Oracle VM VirtualBox Manager_015

If you click `Start’ you’ll get something like this. FYI, when you do this VirtualBox might tell you about capturing the mouse pointer, or about mouse pointer integration, which is supported on most modern operating systems (i.e. operating systems that are capable of knowing that they’re running on a virtual machine). If you’re asked, the best choice is to pick yes. Capturing `traps’ the mouse pointer inside the VirtualBox window, which just makes it feel more like a real computer. The mouse pointer can be released by pressing the right Ctrl key.
BSDBox [Running] - Oracle VM VirtualBox_016

Installing FreeBSD

Next we insert our virtual CD into our virtual CD drive and install the operating system. Open up the settings window for your virt, click `Storage’, highlight the DVD drive under `Storage Tree’, click the picture of a DVD on the far right, click `Choose a virtual CD or DVD file…’, and browse to the one you downloaded from FreeBSD.org. If it’s not downloaded yet, wait.

BSDBox - Settings_019

Click OK, then click Start to start up your virt. Hit Enter when you see the black screen.
BSDBox [Running] - Oracle VM VirtualBox_020

Hit Enter to install

BSDBox [Running] - Oracle VM VirtualBox_021

Choose a keymap, test it, then choose the first option to Continue

BSDBox [Running] - Oracle VM VirtualBox_022

BSDBox [Running] - Oracle VM VirtualBox_023

Type in the same name you chose for your virt as the hostname, hit Enter

BSDBox [Running] - Oracle VM VirtualBox_024

For speed just choose 32-bit compatibility libraries and ports tree, hit Enter

 

BSDBox [Running] - Oracle VM VirtualBox_026

Choose guided disk setup.

BSDBox [Running] - Oracle VM VirtualBox_027

Entire disk.

BSDBox [Running] - Oracle VM VirtualBox_028

GUID Partition Table.

 

BSDBox [Running] - Oracle VM VirtualBox_029

Enter

BSDBox [Running] - Oracle VM VirtualBox_030

Enter.

BSDBox [Running] - Oracle VM VirtualBox_031

Wait.

 

BSDBox [Running] - Oracle VM VirtualBox_033

Choose a root password. BSD isn’t very picky here; in fact I think you may be able to just press Enter and leave it without a root password, so type carefully.

BSDBox [Running] - Oracle VM VirtualBox_034

The next screen you see will be asking you about configuring network interfaces. Just press Esc to take you to a main menu then press enter to exit the installer.

BSDBox [Running] - Oracle VM VirtualBox_035

BSDBox [Running] - Oracle VM VirtualBox_037

After a few seconds it will ask you if you want to make any final changes. Choose `No’, wait for a few seconds and the virt should shut down. If not, choose “ACPI shutdown” from the Machine menu. Wait for the machine to go into the stopped state. Now remove the virtual disk from the virtual drive, otherwise when you switch it on it will run the installer again.

Networking

At this point we have got a fresh install of FreeBSD on our virtual hard drive and the next step is to configure VirtualBox such that the FreeBSD virt has its own IP address on the same network the host machine is on. This means we can open a web browser and browse to a page that’s hosted on the virt from our host machine, or from any machine on the network. VirtualBox has lots of options for configuring networking and the default one is `NAT’. Click Settings then click Network to bring up the network settings.
BSDBox - Settings_043
As you can see `NAT’ is selected, which means VirtualBox will run a sort of “virtual router” which the guest OS (FreeBSD) thinks is a real router. When you start up a virt, the guest OS sends a DHCP request over its virtual network card, which is picked up by the virtual router, which responds by giving it an IP address, subnet mask, gateway address and DNS address(es). Next, when a packet is sent from the virt to the virtual router, the virtual router changes the source address of the packet (and possibly the port) to be the address of the host machine, then forwards it off to the real router to get to its final destination. When a response comes back VirtualBox again translates the destination address (and possibly the port) of the packet to match the address and port from which the first packet was sent.

This is convenient and fairly secure for tasks such as web browsing and email, but not if we want to run a server, since we want others on the network to be able to contact us directly. This is achieved by something called a `Bridged Adapter’, so change from NAT to bridged adapter and click OK.

BSDBox - Settings_044

Start up the virtual machine again. A bridged adapter sends the packets emitted from the virt directly to the host machine’s network card, circumventing the host’s TCP/IP stack. The network card handles the Ethernet protocol only and has no clue about what is inside any of the packets it’s transmitting and receiving, so VirtualBox can just inject packets generated by the virt and they get mixed in with the traffic from the host machine. The network card forwards them to the nearest router. On a wired network the virt shows up with its own MAC address (you can see it in the nmap output near the end of this article), so as far as the router is concerned it’s just another machine on the network; on some wireless set-ups the router may instead see 2 IP addresses associated with the same MAC address. Either way it almost certainly won’t care, and will forward the packets on and deliver back the responses.

When the responses come back in, VirtualBox will be sniffing all the packets and will grab any that have the virt’s IP address as a destination. At the same time the host machine’s TCP/IP stack will ignore any packets that don’t have its IP address.

BSDBox [Running] - Oracle VM VirtualBox_041

When you see the login screen type `root’ as the login and the password you chose earlier. Once you’re logged in type

# ping www.google.com

You should get a message indicating that the host couldn’t be found. That’s to be expected. What we’re going to do now is enable DHCP on the virt and reboot the machine. When it reboots it should get an IP address (and the rest) from the DHCP server on the host machine’s network, which is probably also the gateway. Once we have this IP address and have tested our network connection (i.e. we can ping google.com), we’ll make a note of the IP address, netmask, gateway, etc. and then set them manually so the virt will always use these settings. This means we don’t have to go to the bother of checking what kind of network we’re on (chances are you’re on a standard 192.168.x.y, but some people have weird network configurations). More importantly, we’ll have a static IP address which will persist across reboots, which is essential for running our webserver.

Type the following command:

# echo 'ifconfig_em0="DHCP"' >> /etc/rc.conf

Then this:

# reboot

The first command adds a line to rc.conf which, at startup, configures the network interface (em0) to get its address via DHCP. Wait for the virt to reboot, log in again and try pinging google.com. You should get replies. Press Ctrl-C to break out of ping. Now we need to determine the virt’s IP address, netmask, gateway and DNS servers. Type ifconfig and you should see something like this:

BSDBox [Running] - Oracle VM VirtualBox_046

Make a note of the inet and netmask in the em0 section; in my case they are 192.168.1.16 and 0xffffff00. We’ll have to convert the netmask to dotted decimal notation. This can be done by stripping off the 0x, then splitting the eight remaining digits into groups of two to give ff.ff.ff.00, and finally using a hex calculator to convert each of these bytes into decimal (ff is 255 and 00 is 0), to give 255.255.255.0.
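If you don’t have a hex calculator handy, the shell can do the conversion for you. The following is only a minimal sketch, not part of the set-up itself: hex_to_dotted is a name I’ve made up for illustration, and it assumes the netmask is given with its leading 0x, exactly as ifconfig prints it.

```shell
# hex_to_dotted: hypothetical helper (not a FreeBSD command) that turns a
# hex netmask such as 0xffffff00 into dotted decimal notation.
hex_to_dotted() {
    n=$(( $1 ))    # shell arithmetic understands the 0x prefix
    # Peel off one byte at a time, most significant byte first.
    printf '%d.%d.%d.%d\n' \
        $(( (n >> 24) & 255 )) \
        $(( (n >> 16) & 255 )) \
        $(( (n >> 8) & 255 )) \
        $(( n & 255 ))
}

hex_to_dotted 0xffffff00    # prints 255.255.255.0
```

Each shift-and-mask does the same job as reading off one of the two-digit groups by hand.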

That’s the IP and netmask sorted. To get the gateway address type the command

# netstat -r -f inet

which will display the local routing table. There should be 4 columns, the first two of which are Destination and Gateway. Look under Destination for default, check what is listed in the corresponding Gateway column, and… that’s your gateway address!
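If the table is long, you can let awk pick out the default route for you. This is a sketch under a couple of assumptions: default_gateway is a hypothetical helper name of my own, and the table is laid out as described above, with Destination in the first column and Gateway in the second. Adding -n to netstat makes it print numeric addresses rather than trying to resolve them to names. The sample table in the example uses made-up addresses.

```shell
# default_gateway: hypothetical helper that reads a routing table on
# standard input and prints the Gateway column of the "default" row.
default_gateway() {
    awk '$1 == "default" { print $2 }'
}

# On the virt you would feed it the real routing table:
#   netstat -rn -f inet | default_gateway
# Here it runs on a made-up sample table so the example works anywhere.
default_gateway <<'EOF'
Destination        Gateway            Flags      Netif Expire
default            192.168.1.1        UGS         em0
127.0.0.1          link#2             UH          lo0
EOF
```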

Finally you need your DNS address(es). Type

# cat /etc/resolv.conf

You should see something like

nameserver X.X.X.X
...

There’s a high probability that there’s only one nameserver listed and it has the same address as your gateway. Whether this is true or not, it doesn’t matter, because the good news is you can just leave this file alone and it will continue to be used as the nameservers. Next you’ll want to open the file /etc/rc.conf in vi

# vi /etc/rc.conf

Remove the last line and replace it with

ifconfig_em0="inet X.X.X.X netmask Y.Y.Y.Y"
defaultrouter="Z.Z.Z.Z"

of course replacing X.X.X.X, Y.Y.Y.Y and Z.Z.Z.Z with your IP address, netmask and gateway respectively. Save and close the file and reboot your virt again. When you log in, do another ping test to check that you’re online. If you get replies, fantastic. While you’re at it, open a command prompt on your host OS and type

$ ping bsdbox

Or whatever name you chose for your virt. You should also get replies. As a final step, we’re going to edit our hosts file so it has the hostname-to-IP mapping. Sendmail was giving me problems on my virt and it probably will on yours too if you don’t take this step, so to do it all in one go type (CAREFULLY):

# echo 192.168.1.3 bsdbox >> /etc/hosts

inserting your own IP address and hostname instead. Restart the virt again. It would be no harm to do this on your host machine and any other machine(s) you plan on accessing the webserver from. It’s done the same way in Linux (as root), for OSX and Windows google “editing hosts file in (OSX/Windows)” and you should find tons of info on how to do it.
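Appending with >> is quick, but run it twice and you end up with duplicate entries. As a rough sketch of a more careful approach, the hypothetical add_host_entry function below (my own name, not a FreeBSD tool) only appends the line if the hostname isn’t already listed. The optional third argument, the file to edit, is something I’ve added so it can be tried on a scratch file first.

```shell
# add_host_entry IP NAME [FILE]: hypothetical helper that appends
# "IP NAME" to FILE (default /etc/hosts) unless NAME is already listed.
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}

    # Look for NAME in any column after the address, skipping comment lines.
    if awk -v n="$name" '
        /^[ \t]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$file"; then
        echo "$name is already in $file"
    else
        echo "$ip $name" >> "$file"
    fi
}

# Try it on a scratch file before touching the real one.
scratch=$(mktemp)
add_host_entry 192.168.1.3 bsdbox "$scratch"
add_host_entry 192.168.1.3 bsdbox "$scratch"    # second call appends nothing
cat "$scratch"
rm -f "$scratch"
```

On the virt itself you would call it as root with just the first two arguments, using your own IP address and hostname.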

Installing NginX

Now it’s time to install our web server. First you’ll need to enter this command to get FreeBSD to download an up-to-date snapshot of the ports tree

# portsnap fetch

This will take a few minutes, next you’ll need

# portsnap extract

This will take even more time. What’s happening here is that a directory tree, rooted at /usr/ports, is being built for the entire FreeBSD ports collection. That is: software that’s been ported to FreeBSD, has been tested and is known to work on FreeBSD. The ports collection is divided into categories for different types of software, such as development, games, web, math, multimedia, and so on.

When that’s done, and it will take a while, you’ll want to navigate to the /usr/ports/www/nginx directory. Now we’re going to compile NginX. If you type ls you’ll notice a Makefile in there, so type

# make install clean

to begin building NginX. After a while you should see a blue screen asking what features of NginX you want. Just make sure HTTP_SSL is selected, then hit Enter to continue the install.

BSDBox [Running] - Oracle VM VirtualBox_048

When all that is finished, NginX should be installed on your virt. Before we reboot again, we’re going to set NginX to start up automatically on boot, that means adding one line to rc.conf:

# echo 'nginx_enable="YES"' >> /etc/rc.conf

Reboot the virt again and wait till you see the login screen. By this time NginX should have started a basic HTTP server listening on port 80. Open a web browser and into the address bar type http://<hostname>, using your hostname of course. You should see something like this:

Welcome to nginx! - Chromium_049
So we’ve got a webserver up and running serving normal unencrypted HTTP traffic. Getting there..

Generating SSL Certificate and Key

Navigate to the /etc/ssl directory of your BSD system. Issue the following commands to create two directories:

# mkdir keys
# mkdir certs

Now we’ll create a private key, remember to substitute the name of your host in place of `bsdbox’:

# openssl genrsa -aes128 -out keys/bsdbox.key 2048

This generates an RSA private key with a 2048-bit modulus, which is considered secure at time of writing. You’ll also be asked for a passphrase for your key. Choose a strong passphrase and make a note of it. Next we’ll secure this file by making it unreadable to anybody but root:

# chmod 400 keys/bsdbox.key

Next we’ll make a certificate signing request:

# openssl req -new -key keys/bsdbox.key -out bsdbox.csr

You’ll be asked for some information on your `Company’, fill in the details; for the FQDN use your hostname, and for email address use root@<your hostname>. Finally we’ll create and self-sign the certificate which contains our public key and is valid for the next 365 days:

# openssl x509 -req -days 365 -in bsdbox.csr -signkey keys/bsdbox.key -out certs/bsdbox.crt

You can delete the bsdbox.csr file now as it’s no longer needed.

Configuring NginX

We’re nearly there. Open up /usr/local/etc/nginx/nginx.conf in vi. This is the main config file for NginX and is hierarchical in structure. Observe that there’s an http { ... } section which encompasses all the settings and inside that there’s a server { ... } section which holds all the settings for the default website on our server. We’re only going to be editing what’s inside the server { ... } section.

Firstly you’ll see

   listen 80;

Change this to

   listen 443 ssl;

Next change server_name from localhost to <yourhostname>; Don’t forget the semicolon! (You need a semicolon after every directive in NginX configs.) Then under that add the following 3 lines. If you’re on a newer NginX that complains about `ssl on;’, leave that line out, since `listen 443 ssl;’ already switches SSL on:

   ssl on;
   ssl_certificate /etc/ssl/certs/<yourhostname>.crt;
   ssl_certificate_key /etc/ssl/keys/<yourhostname>.key;

Here’s a picture of my vi screen after making these changes:

BSDBox [Running] - Oracle VM VirtualBox_053

These are the minimal changes you need to make to have an HTTPS server up-and-running. Save the file and exit. You’ll now need to restart NginX. You can either use the command

# nginx -s reload

or if you’re like me you can just reboot the virt. Whichever you do, at some stage you’ll be asked to provide the passphrase for your private key so NginX can use it. Type it in, and wait for the server to start.

Testing Your Server

Open a browser window and enter the address https://<yourhostname>. You should see a warning telling you it’s unsafe to proceed, or that this connection is untrusted.

Untrusted Connection - Mozilla Firefox_054

Follow these instructions to view the page

Firefox: “I understand the risks”, “Add exception”, “Confirm security exception”

Chrome: “Advanced” , “Proceed to…”

Welcome to nginx! - Chromium_050

And there it is in all its glory! A secure webserver using a 2048-bit RSA key, hosted inside a virtual FreeBSD instance. Of course to keep it secure you’ll need to log out of FreeBSD every time you leave the computer alone since there’s only the root user, but ignoring the “physical” risks, this server gives an attacker very little to aim at. An output from nmap on my Linux host machine:

root@CodeCook:~/ca# nmap bsdbox

Starting Nmap 6.40 ( http://nmap.org ) at 2015-10-21 08:44 IST
Nmap scan report for bsdbox (192.168.1.3)
Host is up (0.00059s latency).
rDNS record for 192.168.1.3: BSDBox
Not shown: 999 closed ports
PORT STATE SERVICE
443/tcp open https
MAC Address: 08:00:27:52:60:43 (Cadmus Computer Systems)

Nmap done: 1 IP address (1 host up) scanned in 44.05 seconds

As you can see the only open port is 443 because that’s the only software we installed. FreeBSD assumes nothing about what you’re going to do with it and errs on the side of caution. It’s a generic operating system just built to work and nothing else.

I’ll follow up this article with one where we will create our own certificate authority and install a Root cert on the client browser(s) so we can access our secure server without all the annoying warnings.


Perl: references, scalars, etc

A lot of people are put off using Perl because of what they believe is over-complicated syntax when dealing with references. I’m on a mission to dispel this rumour and give a simple guide to Perl references, the best ways to think about them and draw some comparisons with C pointers, and hopefully get more people using Perl.

The Perl language may not be as popular as it used to be but there are still tens of millions of lines of Perl code out there that need people to maintain them.

We’ll start off with the humble scalar: A scalar is a variable with a name starting with $, so if we wanted to define a scalar called $s, we’d write:

my $s;

A scalar can be any of 4 things: undefined, a number, a string, or a reference. I’ll repeat that. A scalar (that is: a Perl variable that begins with a $) can be any one of:

1. Undefined, as in the example above; we’ve declared a variable $s but not said $s=anything
2. A number, for example:

my $s=42;

3. A string, for example:

my $s="hello";

4. A reference, for example:

my $s=\$y;

See the back-slash? That’s the same as the & operator in C, it gives the address of $y and stores it in $s. $s now contains the address of $y. Let’s write a simple program to demonstrate this.

#!/usr/bin/perl

use strict;
use warnings;

my $y="Noel";
my $s=\$y;

print $s ,"\n";
print $$s ,"\n";

The output from this program:

SCALAR(0x1f43310)
Noel

2 lines of output: the first is SCALAR(0x1f43310) which is what we get when we try to print $s directly. Not very useful; it tells us $s is a reference to a scalar, which itself may be one of the 4 kinds of scalar listed above. But if we put another $ in front of $s we get the scalar that $s points to, which is the string “Noel” (AKA $y). This $ put in front of the scalar is a scalar dereference operator and is similar to the * operator in C. This is kind of confusing because in C the * symbol has 3 uses: Multiplication, Pointer definition and Pointer dereference. Similarly in Perl the $ has two uses: 1) Denoting a scalar, and 2) dereferencing a scalar that references ANOTHER SCALAR.

Arrays

Moving on, Arrays (or lists) are the second kind of data structure we encounter in Perl, and an array is defined with a @. So…

Scalar … $
Array … @

If it starts with a $ it’s a scalar, and if it starts with a @ it’s an array. Got it? Good. Let’s define an array and print its contents.

#!/usr/bin/perl

use strict;
use warnings;

my @beatles=("John","Paul","George","Ringo");
print "@beatles\n";

The output from this program:

John Paul George Ringo

This program declares and initializes an array called @beatles in one line, then prints the array. At this point you should be wondering “Can I use the \ operator to access the memory address of @beatles?” and the answer is yes. Let’s add a few lines to the program:

#!/usr/bin/perl

use strict;
use warnings;

my @beatles=("John","Paul","George","Ringo");
print @beatles , "\n";

my $beatles_ref=\@beatles;
print $beatles_ref , "\n";

Ok so we’ve created a scalar called $beatles_ref which holds the address of @beatles. But if we try to print $beatles_ref, we get

ARRAY(0xf96360)

This tells us $beatles_ref is a reference to an array. So how do we dereference it? I’ll tell you how. Using the array dereference operator, that is: @{}

#!/usr/bin/perl

use strict;
use warnings;

my @beatles=("John","Paul","George","Ringo");

my $beatles_ref=\@beatles;
my $beatles_ref_ref=\$beatles_ref;

print @beatles , "\n";
print @{$beatles_ref} , "\n";
print @{$$beatles_ref_ref} , "\n";

I’ve jumped the gun a little with this program but study it, and run it, and I’m sure you’ll figure it out. This program should print:

JohnPaulGeorgeRingo
JohnPaulGeorgeRingo
JohnPaulGeorgeRingo

What we’ve done here is made an array called @beatles with 4 elements, then made a scalar called $beatles_ref, set it to point to @beatles, then we created another scalar called $beatles_ref_ref and pointed that to $beatles_ref. We then printed the @beatles array 3 times, first by using the @beatles array variable, then by using the array dereference operator @{} on $beatles_ref, and finally by double-dereferencing the $beatles_ref_ref variable to access $beatles_ref, then @beatles. Here’s the same program in C++11:

#include <stdio.h>

int main(){

   char* beatles [4] ={(char*)"John", (char*)"Paul", (char*)"George",(char*)"Ringo"};
   char** beatles_ref = (char**) &beatles;
   char*** beatles_ref_ref=&beatles_ref;

   for (char* &beatle:beatles){
      printf("%s",beatle);
   }
   printf("\n");

   // we must figure out the length of the array now since we cannot use
   // C++11-style for loops on a pointer
   int arrlen=sizeof(beatles) / sizeof(char*);

   for (int i=0; i<arrlen; i++){
      printf("%s",beatles_ref[i]);
   }
   printf("\n");

   for (int i=0; i<arrlen; i++){
      printf("%s",(*beatles_ref_ref)[i]);
   }
   printf("\n");

   return 0;
}

My eyes are bleeding. As you can see we can only use the concise C++11-style for loop on the ‘beatles’ array, since this is a static chunk of data and its size is hard-coded into the program. If all we have is a pointer to it, we need to manually compute or otherwise determine the length of the array. All of this is automagic in Perl, so, eh... use Perl. Let’s get on to hashes.

Hashes

Here’s a Perl program that defines a hash called %beer and assigns it two attributes: Name and Alcohol, then prints them.

#!/usr/bin/perl

use strict;
use warnings;

my %beer=(Name=>'Heineken', Alcohol=>0.05);

print $beer{Name} , "\n";
print $beer{Alcohol}*100 , "%\n";

So as you can see we use the % character before the variable name to declare a hash. So, to recap:

Scalar … $
Array … @
Hash … %

%beer is a hash, not a hash reference, although to access the attributes of %beer, we use the notation $beer{…}, which admittedly is confusing, but is less confusing if you think of the $ as referring to the scalar attribute rather than the containing hash. So you have

$<hashname>{<attributename>}

to access a scalar element called <attributename> in the hash <hashname>.

Like arrays we can also create a reference to a hash:

my $beer_ref=\%beer;

and we access the hash’s attributes with the -> operator like so:

print $beer_ref->{Name} , "\n";

$beer_ref is colloquially known as a hashref (a reference to a hash). Hashrefs crop up a lot in Perl and, as with arrays, Perl knows $beer_ref is a hashref because if we do

print $beer_ref;

we get:

HASH(0x15c5488)

But wait! we can create a hash another way! And this is the way I nearly always create hashes… We can declare a scalar and point it to an anonymous hash, like so:

my $hashref={ Name=>'Peter', Age=>'45' };

We’ve skipped the creation of a hash variable and gone straight to assigning a hash to a hashref. Did I mention we can also do this with arrays? I don’t think I did…

my $arrayref=["John","Paul","George","Ringo"];
my $hashref={Name=>'Peter', Age=>'45'};

See? Square brackets for the anonymous array and curly brackets for the anonymous hash and we can dereference them in a similar way to arrayrefs, using the %{} operator. So a %{$hashref_to_foo} will be treated the same as a %foo.

For completeness:

#!/usr/bin/perl

use strict;
use warnings;

my $arrayref=["john","paul","george","ringo"];
my $hashref={Name=>'peter', Age=>'25'};

my @array=("JOHN","PAUL","GEORGE","RINGO");
my %hash=(Name=>'PETER', Age=>'45');

# Printing the arrays
print "@{$arrayref}\n";
print "@array\n";

# Printing the hashes
print "$_:$hash{$_} " for (keys %hash); print "\n";
print "$_:$hashref->{$_} " for (keys %{$hashref}); print "\n";

# Let's now re-assign $arrayref and $hashref to point to @array
# and %hash respectively and print them again

$arrayref=\@array;
$hashref=\%hash;

print "@{$arrayref}\n";
print "$_:$hashref->{$_} " for (keys %{$hashref}); print "\n";

# At this point the original values of $hashref and $arrayref
# are lost forever 😦

That code sums up everything you’ve learned so far. So: round brackets for the direct array and hash structures, square brackets for anonymous arrays and curly brackets for anonymous hashes. But we prefer using references, because a reference is a scalar and a scalar is small, and small things can be moved around faster and stored using less space. More importantly, if you’re using functions (subs in Perl), using references is essential.

Subs

Compared to other languages, writing functions is syntactically very simple in Perl. There’s no need to specify a return type or function parameter list. Every function (aka sub) has an implicit parameter list (aka array) called @_.

So if I declare a function called “foo” which prints the contents of an array I could write:

sub foo{
   print "@_\n";
}

and that should do it. Should I want to call foo, I could write

foo;

which calls the foo function, with an empty parameter list, which will just print a blank line. If I wanted to pass it our @array array from the last listing, I’d say

foo @array;

which would give me

JOHN PAUL GEORGE RINGO

which prints @array (known as @_ in foo). BUT… If I write

foo @array,"STUART";

I’ll get:

JOHN PAUL GEORGE RINGO STUART

And this is a problem. What Perl does here is when foo is being called it goes through the parameter list and when it finds a scalar it adds that scalar to the @_ list, but when it finds an array it adds each element of that array to @_, making the elements of @array indistinguishable from any arbitrary scalars (ie “STUART”) in the parameter list, so foo doesn’t know which elements are from @array. This is useful in cases where foo just wants to do a simple job on each element of the array (such as run a regex) and you might have several arrays and several scalars on which you want to perform this operation; you can simply do something like this:

#!/usr/bin/perl

use strict;
use warnings;

sub capitalize{
   map{ $_ = uc $_ } @_;
}

my @beatles=("john","paul","george","ringo");
my @stones=("mick","keith","ronnie","charlie");

my $fifth_beatle="stuart";
my $stones_manager="andrew";

print "Beatles: @beatles\n";
print "Fifth Beatle: $fifth_beatle\n";
print "Stones: @stones\n";
print "Stones Manager: $stones_manager\n";
print "\n";

capitalize $stones_manager, @beatles, $fifth_beatle, @stones;

print "Beatles: @beatles\n";
print "Fifth Beatle: $fifth_beatle\n";
print "Stones: @stones\n";
print "Stones Manager: $stones_manager\n";
print "\n";

Here we’ve defined two arrays: @beatles and @stones, two scalars: $fifth_beatle and $stones_manager, printed them, and then decided we want to capitalize them all in one fell swoop. So we pass all four of these in no particular order to our capitalize function, which builds the @_ array (“andrew”, “john”, “paul”, “george”, “ringo”, “stuart”, “mick”, “keith”, “ronnie”, “charlie”), then runs a map function on the array. The map runs a lambda function ($_ = uc $_) which reassigns each array element ($_) to an upper-case version of itself (uc $_). Because the elements of @_ are aliases for the variables that were passed in, the originals are changed too. Doing this in Java would take lots of coding time doing array manipulation and typing parameter lists. Furthermore if we want a copy of the newly capitalized and concatenated array we just need to change one line:

my @all=capitalize $stones_manager, @beatles, $fifth_beatle, @stones;

And we have an array with them all capitalized then taped together in the order they were sent in, nice and concise.

Passing Arrays and Hashes to Subs

Perl has a dirty little secret and it’s that you can pass a hash to a subroutine then `cast’ @_ into a %hash inside the subroutine which works but I’m sure you’ll agree is illogical and horrible.

#!/usr/bin/perl

use strict;
use warnings;

sub foo{
   my %hash=@_;
   print $hash{bar};
}

my %hash=(bar=>'baz');

foo(%hash);

The `nice’ way to do it is to pass all arrays and hashes to functions as references. So the above program would become

#!/usr/bin/perl

use strict;
use warnings;

sub foo{
   my $hashref=shift;
   print $hashref->{bar};
}

my %hash=(bar=>'baz');

foo(\%hash);

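It’s also faster, since foo no longer copies every key and value into a new hash. FYI, the shift function removes the first value off the array @_, which is our hashref. I could have equivalently used

my $hashref=$_[0];

which reads the first value without removing it from @_.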

Other Neat Things We Can Do

To create a two-dimensional array we simply fill an array with references to other arrays:

#!/usr/bin/perl

use strict;
use warnings;

sub print_2d_array{
   my $arr_ref=shift;

   foreach my $ele (@{$arr_ref}){
      print "@{$ele}\n";
   }
}

my @two_d_array=(
   ['A','B','C','D','E'],
   ['F','G','H','I','J']
);

print_2d_array \@two_d_array;

If we have a function that returns an array (such as sort or map), we can enclose the function call in [ and ] to give us a reference (similar to how \ converts an array variable to a reference):

#!/usr/bin/perl

use strict;
use warnings;

sub three_random_numbers{
   (rand,rand,rand);
}

my $arr_ref=[three_random_numbers];

print "@{$arr_ref}\n";

# or even (!)
print "@{[three_random_numbers]}\n";

Reversing this process, we can also obtain an array from a function that returns an array reference:

#!/usr/bin/perl

use strict;
use warnings;

sub three_random_numbers{
   [rand,rand,rand];
}

my @arr=@{(three_random_numbers)};

print "@arr\n";

# or even (!)
print "@{(three_random_numbers)}\n";

Similarly, with hashes, we can use { and } to convert the return value of random_person to a hashref

#!/usr/bin/perl

use strict;
use warnings;

sub random_person{
   (
      Name=>('Randy','Donna','Betty','Harry')[int(rand()*4)],
      Age=> int(rand()*100)
   );
}

my $hashref={random_person};

print "Name=$hashref->{Name} Age=$hashref->{Age}\n";

And for completeness’ sake:

#!/usr/bin/perl

use strict;
use warnings;

sub random_person{
   {
      Name=>@{['Randy','Donna','Betty','Harry']}[int(rand()*4)],
      Age=> int(rand()*100)
   };
}

my %hash=%{(random_person)};

print "Name=$hash{Name} Age=$hash{Age}\n";

Notice the way we have to enclose the call to random_person in ( and ) ? This ensures we are dereferencing the return value
of random_person, not random_person itself, which is a subroutine, and did I mention that this can also be referenced using
\&random_person ? Yeah, you can do that too. More on that in my next Perl post.