Created Fri Oct, 09 2020 at 10:20PM

Installing perl modules locally

All credit to iu.edu Post on May 30, 2019 by cganote

Introduction

Since you’re here, you are probably trying to install a perl package or run a perl program. But what exactly is perl and why would you use it as opposed to other programming languages? Here I will share some of my perls of wisdom.

Perl is an interpreted language similar to python or R, meaning that the code you write doesn’t have to be compiled ahead of time before you run it. It excels at regular expressions – special coded search strings that allow for very flexible matching of whatever you’re looking for in whatever you’re looking in. It is not great for casual scripts because its syntax is unforgiving, and is thus becoming less and less popular.

See Perl tutorials (click here) for an introduction to the gory details of regular expressions.

Perl doesn’t give you an interactive interpreter the way python does. If you type in ‘perl’ at the command line and hit enter, it will wait silently, with no prompt, for you to type a complete program; nothing runs until you end the input with Ctrl-D.

Perl files generally end in .pl for runnable scripts and .pm for perl packages that can be included/imported into your own code with the ‘use’ command. Perl can issue commands from the command line in something called a one-liner; this can be very handy if you want to run a regex over a file or do a quick check without opening a text editor and creating a script file.

Test if dependencies are met by running this perl one-liner:

perl -e "use JSON::XS;"

This invokes perl and executes the ‘use’ statement, which looks for a perl module called JSON::XS. If the module is found, the command prints nothing; if it is not, perl reports an error. Perl puts :: between layers of modules that have a hierarchical relationship. Usually, there is just one set of :: but sometimes you’ll see two or more.
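If a program needs several modules, the same check can be wrapped in a loop. This is only a sketch: have_module is a helper name I made up, and the module list is an example to replace with your own.

```shell
# have_module is a hypothetical helper name, not part of perl itself.
# -M<Module> loads the module; -e 1 runs a trivial program that does nothing.
have_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# The module names here are only examples; substitute your own list.
for mod in strict JSON::XS Bio::SeqIO; do
    if have_module "$mod"; then
        echo "found:   $mod"
    else
        echo "missing: $mod"
    fi
done
```

The helper relies only on perl's exit status, so it stays quiet and leaves the reporting to the loop.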

Why install perl modules locally

If you are looking into installing a perl package locally, it’s likely that you have run into an error similar to the one shown below:

 perl -e "use Bio::SeqIO"

Can't locate Bio/SeqIO.pm in @INC (you may need to install the Bio::SeqIO module) (@INC contains: /opt/moab/lib/perl5 /N/soft/rhel7/perl/gnu/5.24.1/lib/site_perl/x86_64-linux-thread-multi /N/soft/rhel7/perl/gnu/5.24.1/lib/site_perl /N/soft/rhel7/perl/gnu/5.24.1/lib/x86_64-linux-thread-multi /N/soft/rhel7/perl/gnu/5.24.1/lib .) at /N/soft/rhel7/perl/gnu/5.24.1/lib/site_perl/IO/Socket/SSL.pm line 18.
BEGIN failed--compilation aborted at /N/soft/rhel7/perl/gnu/5.24.1/lib/site_perl/IO/Socket/SSL.pm line 18.
Compilation failed in require at -e line 1.
BEGIN failed--compilation aborted at -e line 1.

The above error message tells us that the perl module “Bio::SeqIO” is either not in the perl search path (@INC) or not installed on the system. The clue is in the phrase “you may need to install the Bio::SeqIO module”.
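To read such a message, it helps to know how a module name maps to the file perl is hunting for, and which directories it searches. A small sketch follows; module_to_path is a helper name I invented for the illustration.

```shell
# module_to_path is a hypothetical helper: it rewrites Bio::SeqIO as
# Bio/SeqIO.pm, the relative file name perl looks for under each @INC directory.
module_to_path() {
    printf '%s.pm\n' "$(printf '%s' "$1" | sed 's|::|/|g')"
}

module_to_path Bio::SeqIO

# List the directories perl will search on this machine, one per line.
perl -e 'print "$_\n" for @INC'
```

If the file named by the first command does not exist under any directory printed by the second, perl raises the "Can't locate" error.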

In such cases, installing into the system-wide perl installation is generally not an option. Writing to the system-wide perl directories requires special permissions that ordinary users usually do not have. The other option is to download the missing perl modules locally and add their location to the perl environment variables.

Source installation

To get a perl module installed from source, the first step is to find it. Metacpan.org is your best bet for perl modules. For example, if I need to pull in a sequence reading module from perl so that I can digest fasta files quickly, I would probably look to SeqIO:

This link (click here) gets me to the information page for SeqIO. On the left, there is a link to download the code.

[Image: the location of the download button on cpan.]

Right-click and copy the link location or address, then paste it after wget into your terminal:

wget https://cpan.metacpan.org/authors/id/C/CD/CDRAUG/BioPerl-1.7.4.tar.gz
tar -xzvf BioPerl-1.7.4.tar.gz
# Makefile.PL lives inside the unpacked directory, so move into it first
cd BioPerl-1.7.4
perl Makefile.PL PREFIX=$PWD
make
make install

The above set of commands downloads and installs the perl module locally. However, you still need to add this module to the perl environmental variables to let perl know where to look for the package. So how do we do this?

Environment variables that affect Perl

If you are looking for more information about environmental variables in general, here is another blogpost (link here) explaining these terms and why they are important.

Here are some perl-related environment variables that you might need or run into:

PERL_LOCAL_LIB_ROOT="/home/username/perl5"
PERL_MB_OPT="--install_base /home/username/perl5"
PERL_MM_OPT="INSTALL_BASE=/home/username/perl5"
PERL5LIB=/home/username/perl5/lib/perl5:$PERL5LIB

The PERL5LIB variable works similarly to PYTHONPATH or LD_LIBRARY_PATH: it can hold several directories separated by colons, and perl searches them from left to right for perl packages (files ending in .pm).

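You may also want to explicitly clear PERL_MB_OPT and PERL_MM_OPT, which otherwise pass a default install location to every Build.PL and Makefile.PL you run. This is especially true in cases where you want to share the code you install with others.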

export PERL_MB_OPT=
export PERL_MM_OPT=

If you get an @INC error when installing or running a program (like the error shown above), either perl can’t find a required package in your PERL5LIB or the package wasn’t installed correctly. You can use a package manager for Perl called CPAN. This is useful for setting up your own perl dependencies, but I find that installing manually and adding the files to the PERL5LIB path is best for software packages that need to be widely shared.

So, to fix the error shown earlier, first install the perl module from source and then add its location to the PERL5LIB environment variable. To do this, run the command below:

export PERL5LIB=/home/username/BioPerl/lib:$PERL5LIB

To make this setting permanent, add the line to your .bashrc file so that it runs at every login. Otherwise, run it by hand only in the sessions where you use Bioperl scripts: a variable exported at the command line lasts for that session and is gone once you log out and log back in.
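When adding the export line to .bashrc from a script, it is worth guarding against adding it twice. Here is one possible sketch, demonstrated on a scratch file rather than a real .bashrc. The helper name add_line_once is my own, and the path is just the example path from above.

```shell
# add_line_once is a hypothetical helper: it appends a line to a file only
# if that exact line is not already present, so re-running it is harmless.
add_line_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on a scratch file; point it at "$HOME/.bashrc" for real use.
# The single quotes keep $PERL5LIB unexpanded, so it is resolved at login time.
rc=$(mktemp)
add_line_once 'export PERL5LIB=/home/username/BioPerl/lib:$PERL5LIB' "$rc"
add_line_once 'export PERL5LIB=/home/username/BioPerl/lib:$PERL5LIB' "$rc"

count=$(grep -c 'PERL5LIB' "$rc")
echo "lines mentioning PERL5LIB: $count"
```

After editing the real file, the change takes effect in new login sessions, or right away if you run source ~/.bashrc.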

Debugging Perl

Use of uninitialized value…

In C, it’s the segmentation fault. In Java, it’s a Null Pointer Exception. Perl’s bane is uninitialized values. If you have to stick your hands elbow-deep into the guts of a perl program, I would start by placing this at the top:

use Data::Dumper;

Then, find the line that the error is happening on, and start dumping out variables around it. Uninitialized means the variable is not set to anything at the time the line is executed, so it won’t do you any good to dump that variable at that line, but you might be able to trace it back to some point where it failed to be set. To print the contents of a variable, stick this line into the code:

print Dumper($variable_name); 

Some things will be more complex than this; before learning perl in-depth or spending a week debugging a program, we would recommend contacting the software developers or opening an issue on the project if it is on GitHub! If the project is abandoned and impossible to debug, it also might be a good time to investigate whether more recent alternatives have become available. Old, crufty software that isn’t maintained might not be the best choice for reproducibility or future endeavors. If you really don’t have a choice, contact us for help.