The combination of pip and virtualenv provides package management for Python in isolated environments. pip retrieves packages from the Python Package Index (PyPI). Though Anaconda is the preferred method for package and environment management on the cluster, pip and virtualenv let you use packages that are available only in PyPI. In this document, you will learn how to create and manage virtualenv environments and pip packages on the cluster. Consult the official documentation for pip and virtualenv for further information on these tools.
How you create virtualenv environments depends on the Python version you require. If you use Python 2, the virtualenv module creates environments. If you use Python 3, the venv module creates environments. In either case, the result is an isolated environment in which PyPI packages can be installed and managed. Apart from the module name, the syntax for creating an environment is the same in Python 2 and Python 3.
To begin, make a directory in your home directory and name it “Environments”. Next, change into this directory. Use module list to verify that Python 3 is loaded into your environment. If it is not, execute module load python3. Then execute the command shown in Figure 2.1 to create a virtualenv environment using Python 3.
python3 -m venv <env-name>
Figure 2.1 – Creating an Environment with Python 3
In this example, <env-name> is a descriptive name for the environment. You can also give an absolute path that ends in the environment’s name. For instance, to create a virtualenv environment in your Lustre scratch space, you could use the command in Figure 2.2.
python3 -m venv $SCRATCHDIR/<env-name>
Figure 2.2 – Creating an Environment with an Absolute Path
The same steps used for Python 3 work for Python 2. However, as noted previously, the module that creates the environment differs. To use Python 2 in Figures 2.1 and 2.2, change python3 to python and venv to virtualenv. Additionally, verify that the Python 3 module is not loaded before you create a Python 2 virtual environment. Apart from these differences, the function and result are the same.
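The creation steps above can be collected into a short shell sketch. The environment name test-env and the ENV_ROOT and ENV_NAME variables are illustrative assumptions, not cluster settings. The module command is called only when it is available in the current shell.

```shell
# Sketch: create a Python 3 environment under ~/Environments.
# ENV_ROOT and ENV_NAME are illustrative names, not cluster settings.
ENV_ROOT="${ENV_ROOT:-$HOME/Environments}"
ENV_NAME="${ENV_NAME:-test-env}"

# Load Python 3 only if the module command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load python3
fi

mkdir -p "$ENV_ROOT"

# Create the environment unless a directory with that name already exists.
if [ -d "$ENV_ROOT/$ENV_NAME" ]; then
    echo "Environment already exists: $ENV_ROOT/$ENV_NAME"
else
    python3 -m venv "$ENV_ROOT/$ENV_NAME"
fi

# Python 2 equivalent, per the text (requires the virtualenv module):
#   python -m virtualenv "$ENV_ROOT/$ENV_NAME"
```

Checking for an existing directory first avoids writing a new environment over one that is already in use.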
After the environment has been created, activate it with the source command. Figure 2.3 shows how to activate an environment named test-env in the ~/Environments directory.
source ~/Environments/test-env/bin/activate
Figure 2.3 – Activating an Environment
Every environment you create is activated in the same way, regardless of its location or Python version. The environment’s bin directory always contains the activate script, which makes the environment you specify active in your current shell. To deactivate an environment, use the deactivate command with no options or arguments.
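A full activate and deactivate cycle might look like the following sketch. ENV_DIR and ACTIVE_PYTHON are illustrative variable names, and the default path assumes the test-env example from the text. The sketch creates the environment first if it is missing so that it can run on its own.

```shell
# Sketch: activate an environment, confirm it is active, then deactivate it.
# ENV_DIR is an illustrative variable; the default matches the example in the text.
ENV_DIR="${ENV_DIR:-$HOME/Environments/test-env}"

# Create the environment first if it does not exist yet.
[ -f "$ENV_DIR/bin/activate" ] || python3 -m venv "$ENV_DIR"

source "$ENV_DIR/bin/activate"

# While the environment is active, VIRTUAL_ENV holds its path and
# python3 resolves to the copy inside the environment's bin directory.
echo "Active environment: $VIRTUAL_ENV"
ACTIVE_PYTHON="$(command -v python3)"
echo "python3 resolves to: $ACTIVE_PYTHON"

deactivate

# After deactivation, VIRTUAL_ENV is unset again.
echo "After deactivate: ${VIRTUAL_ENV:-no active environment}"
```

Printing the interpreter path is a quick way to confirm which environment a shell is using before you install anything.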
To delete an environment, use the rm command. If the initial deletion process prompts you to delete each individual file, use rm -rf to force the removal recursively. Before you execute the rm command, verify that you specified the correct environment. Deleted environments cannot be restored.
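Because rm -rf cannot be undone, one possible safeguard is to wrap the deletion in a small function that refuses to act unless the target looks like an environment. The helper name remove_env is hypothetical. Its check relies only on the bin/activate script described above.

```shell
# Sketch: delete an environment only if the path looks like one.
# remove_env is a hypothetical helper, not part of pip or virtualenv.
remove_env() {
    local env_dir="$1"

    # Refuse empty arguments and directories without an activate script.
    if [ -z "$env_dir" ] || [ ! -f "$env_dir/bin/activate" ]; then
        echo "Refusing to remove '$env_dir': no bin/activate found" >&2
        return 1
    fi

    # The -- guards against paths that begin with a dash.
    rm -rf -- "$env_dir"
    echo "Removed $env_dir"
}

# Example use:
#   remove_env ~/Environments/test-env
```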
Within each virtual environment is an installation of pip. Though it has several subcommands and options, the most important are described below.
The pip list command shows all the packages installed in the active environment, along with their version numbers. Before you install or update a package, it is helpful to execute pip list to determine what is already present in the environment.
The pip search command queries PyPI for the package you specify. Figure 3.1 shows the general syntax for this command.
pip search <package>
Figure 3.1 – Searching for a Package with pip
The output of pip search can be extensive, depending on the package for which you search. Because it has no built-in mechanism to narrow down the results, pipe the output of pip search through the grep command. In Figure 3.2, the pip search command uses scikit-learn as an example of this process. Only lines that begin with the string scikit-learn are captured by this search.
pip search scikit-learn | grep "^scikit-learn"
Figure 3.2 – Narrowing the Output of pip search
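The effect of the ^ anchor can be checked without querying PyPI. The sketch below runs the same grep pattern over three placeholder lines. The lines, including the package name my-scikit-learn-helper and the version 0.0, are invented for illustration and are not real pip search output.

```shell
# Sketch: how the ^ anchor narrows a list of package lines.
# The sample lines are placeholders, not real search results.
sample_lines='scikit-learn (0.0) - placeholder description
scikit-learn-extra (0.0) - placeholder description
my-scikit-learn-helper (0.0) - placeholder description'

# ^ matches only at the start of a line, so the third line is dropped.
prefix_matches=$(printf '%s\n' "$sample_lines" | grep '^scikit-learn')

# A trailing space in the pattern keeps only the exact package name.
exact_matches=$(printf '%s\n' "$sample_lines" | grep '^scikit-learn ')

printf 'Prefix matches:\n%s\n' "$prefix_matches"
printf 'Exact-name matches:\n%s\n' "$exact_matches"
```

Without the anchor, all three lines would match, because each contains the string scikit-learn somewhere.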
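Be aware that the entire PyPI library is available online. Rather than search through long outputs or experiment with different package versions, you can use the PyPI website to quickly identify the package and version your work requires. PyPI has also disabled the search interface that pip search depends on, so the command may return an error instead of results; in that case, use the website.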
The pip install command retrieves a package from PyPI and makes it part of your active environment. To install a package, use the command shown in Figure 3.3. pip installs the package itself and all of its dependencies.
pip install <package>
Figure 3.3 – Installing a Package with pip
Updating a package with pip is a straightforward process. Figure 3.4 demonstrates a basic package update on Cython using pip; the -U option is the short form of --upgrade.
pip install -U cython
Figure 3.4 – Updating a Package with pip
If you need to uninstall a package from your environment, use the pip uninstall command. Figure 3.5 shows how the command is used.
pip uninstall <package>
Figure 3.5 – Uninstalling a Package with pip
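The install commands above can be combined with a check for what is already present. The following is a minimal sketch, assuming a Python 3 environment is active. The helper name ensure_installed is hypothetical, and python3 -m pip is used here in place of the bare pip command so that the call goes to the pip belonging to the active interpreter.

```shell
# Sketch: install a package only when it is not already in the active environment.
# ensure_installed is a hypothetical helper name, not a pip feature.
ensure_installed() {
    local package="$1"

    # pip show exits with a nonzero status when the package is not installed.
    if python3 -m pip show "$package" >/dev/null 2>&1; then
        echo "$package is already installed"
    else
        python3 -m pip install "$package"
    fi
}

# Example use, with an environment active:
#   ensure_installed cython
```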
In general, it is best not to move or rename virtualenv environments. Once an environment is created, its scripts expect to find it in its original location under the name assigned at creation. However, pip allows portability through requirements files. These simple text files list all the packages installed in a given environment and can be used to rebuild the environment in another location.
Consider a scenario in which you need to move an existing environment from your home directory into Lustre scratch space. You cannot use cp for this scenario; instead, you must use pip to create the requirements file, then rebuild the environment in the appropriate location.
To begin, activate the original environment, then execute pip list. All the pip packages installed in your current environment will be output to your screen. This allows you to verify that all the necessary packages are installed in your environment before they are ported over to the new one.
After you ensure that you have all the packages you need, execute the command shown in Figure 4.1. It takes a snapshot of the packages in the environment and writes it to a text file named requirements.txt. Be aware that the requirements.txt file is placed in your working directory. If you change directories during this process, be sure to copy the file to your new working directory before you attempt to install packages from it.
pip freeze > requirements.txt
Figure 4.1 – Capturing the Packages in an Environment
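One way to avoid losing track of the file when you change directories is to write it to an absolute path. The sketch below assumes the original Python 3 environment is active. REQ_FILE is an illustrative variable, and the default location in the home directory is only a suggestion.

```shell
# Sketch: write the snapshot to a fixed, absolute location.
# REQ_FILE is an illustrative variable; choose any path you can find again later.
REQ_FILE="${REQ_FILE:-$HOME/requirements.txt}"

# Run this with the original environment active.
python3 -m pip freeze > "$REQ_FILE"

# Each line of the file normally has the form <package>==<version>.
echo "Wrote $(wc -l < "$REQ_FILE") requirement lines to $REQ_FILE"
```

The same absolute path can then be passed to pip install -r from any working directory.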
Next, create a new virtualenv environment in your Lustre scratch space. Please review the Setting up Environments section for information on creating these environments. Activate the new environment once it is created. After that, execute the command shown in Figure 4.2.
pip install -r requirements.txt
Figure 4.2 – Installing Packages with a Requirements File
pip will install all the packages listed in the requirements.txt file into the environment. Execute pip list to verify that all the packages were successfully installed.
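Beyond reading the pip list output by eye, the rebuilt environment can be compared against the requirements file. The following is a sketch under the assumption that both inputs are files of pip freeze output. The helper name compare_requirements and the file name new-requirements.txt are hypothetical.

```shell
# Sketch: compare two files of pip freeze output, ignoring line order.
# compare_requirements is a hypothetical helper, not a pip command.
compare_requirements() {
    local expected="$1" actual="$2"
    local expected_sorted actual_sorted status

    expected_sorted="$(mktemp)"
    actual_sorted="$(mktemp)"

    # Sort both lists so that ordering differences are not reported.
    sort "$expected" > "$expected_sorted"
    sort "$actual" > "$actual_sorted"

    # diff prints any lines that differ and returns nonzero if there are any.
    diff "$expected_sorted" "$actual_sorted"
    status=$?

    rm -f -- "$expected_sorted" "$actual_sorted"

    if [ "$status" -eq 0 ]; then
        echo "Package lists match"
    else
        echo "Package lists differ (see the diff output above)" >&2
    fi
    return "$status"
}

# Example use, with the new environment active:
#   pip freeze > new-requirements.txt
#   compare_requirements requirements.txt new-requirements.txt
```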