The combination of pip and virtualenv provides package management in isolated environments using Python. pip retrieves packages from the Python Package Index (PyPI). Though Anaconda is the preferred method for package and environment management on the cluster, pip and virtualenv permit you to use packages only available in PyPI. In this document, you will learn how to create and manage pip packages and virtualenv environments in the cluster context. Consult the official documentation for pip and virtualenv for further information on these tools.
How you create virtualenv environments depends on the Python version you require. With Python 3, the built-in venv module creates them. The result is an isolated environment in which PyPI packages can be installed and managed.
To begin, create a directory named “Environments” in your home directory, then change into it. Next, use module avail to see which versions of Python are available, and module list to check whether a Python 3 version is already loaded into your shell session. If it is not, execute module load Python/3.6.15-gcc. Then execute the command shown in Figure 2.1 to create a virtualenv environment using Python 3.
python3 -m venv <env-name>
Figure 2.1 – Creating an Environment with Python 3
In this example, <env-name> is a descriptive name for the environment, which is created as a directory of that name inside your current working directory. To create the environment elsewhere, give an absolute path that ends with the environment’s name. For instance, to create a virtualenv environment in your Lustre scratch space, you could use the command in Figure 2.2.
python3 -m venv $SCRATCHDIR/<env-name>
Figure 2.2 – Creating an Environment with an Absolute Path
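The creation steps above can be collected into a small shell function. This is only a sketch under stated assumptions: create_env and ENV_ROOT are hypothetical names chosen for illustration, not part of venv or of the cluster software, and the function assumes a Python 3 module has already been loaded. Any extra arguments are passed through to venv unchanged.

```shell
# Hypothetical helper; create_env and ENV_ROOT are illustrative names.
# Assumes Python 3 is already loaded (check with: module list).
ENV_ROOT="${ENV_ROOT:-$HOME/Environments}"

create_env() {
    env_name="$1"
    if [ -z "$env_name" ]; then
        echo "usage: create_env <env-name> [venv options]" >&2
        return 2
    fi
    shift
    # Create the parent directory if it is missing, then build the environment.
    mkdir -p "$ENV_ROOT" || return 1
    python3 -m venv "$@" "$ENV_ROOT/$env_name"
}
```

For example, create_env my-project would build the environment under the Environments directory in your home directory. To build it in scratch space instead, set ENV_ROOT to $SCRATCHDIR before calling the function.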
After the environment has been created, activate it with the source command. Figure 2.3 shows how to activate an environment using the activate script in its bin directory.
source <env-name>/bin/activate
Figure 2.3 – Activating an Environment (bash/zsh)
Every environment you create will be activated the same way, regardless of the location or Python version. The environment’s bin directory will always have the activate script that makes the environment you specify functional. You may deactivate an environment by using the
deactivate command with no options or arguments.
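The activate script sets the VIRTUAL_ENV variable to the environment's path, and deactivate unsets it. A small check like the following sketch can confirm which environment, if any, is active before you install packages. The function name show_active_env is hypothetical.

```shell
# Hypothetical helper: report the active environment, if there is one.
show_active_env() {
    # activate exports VIRTUAL_ENV; deactivate unsets it.
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active environment: $VIRTUAL_ENV"
    else
        echo "no environment active"
        return 1
    fi
}
```

The non-zero return status when no environment is active lets you use the function as a guard in scripts, for instance show_active_env && pip list.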
To delete an environment, remove its directory with the rm -r command. If rm prompts you to confirm each individual file, use rm -rf to force the removal recursively. Before you execute the rm command, verify that you specified the correct environment. Deleted environments cannot be restored.
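One way to guard against deleting the wrong directory is to check for the pyvenv.cfg file that the venv module writes at the top level of every environment it creates. The sketch below assumes that layout. remove_env is a hypothetical helper name, not a pip or venv command.

```shell
# Hypothetical helper: delete a directory only if it looks like a venv.
remove_env() {
    env_dir="$1"
    # venv writes pyvenv.cfg at the top level of each environment.
    if [ -z "$env_dir" ] || [ ! -f "$env_dir/pyvenv.cfg" ]; then
        echo "Refusing to delete '$env_dir': no pyvenv.cfg found there" >&2
        return 1
    fi
    rm -rf -- "$env_dir"
}
```

A call such as remove_env "$HOME/Environments/<env-name>" then fails harmlessly if the path is mistyped or points at an ordinary directory.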
Within each virtual environment is an installation of pip. Though it has several subcommands and options, the most important are pip list, pip install, and pip uninstall.
The pip list command shows all the packages installed in the active environment, along with their version numbers. Before you install or update a package, it is helpful to execute pip list to see what is already present in the environment.
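When you only care about one package, scanning the full pip list output is unnecessary. As a sketch, the hypothetical has_package function below relies on pip show, which exits with a non-zero status when the named package is not installed. It calls pip as python3 -m pip so that it uses the pip belonging to the Python interpreter currently on your PATH.

```shell
# Hypothetical helper: succeed only if the package is installed
# for the python3 currently on PATH (the active environment's, if any).
has_package() {
    python3 -m pip show "$1" >/dev/null 2>&1
}

# Example use (commented out so the block only defines the function):
# if has_package cython; then echo "cython is already installed"; fi
```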
The pip search command was built on the PyPI XML-RPC search API. That API has been disabled, so pip search no longer works and should not be used.
Be aware that the entire PyPI library is available online. Rather than search through long outputs or experiment with different package versions, you can use the PyPI website to quickly identify the package and version your work requires.
The pip install command retrieves a package from PyPI and makes it part of your active environment. To install a package, use the command shown in Figure 3.3. pip installs the package itself along with all of its dependencies.
pip install <package>
Figure 3.3 – Installing a Package with pip
To update a package that is already installed, pass the -U (--upgrade) option to pip install. Figure 3.4 demonstrates a basic update of Cython.
pip install -U cython
Figure 3.4 – Updating a Package with pip
If you need to uninstall a package from your environment, use the pip uninstall command, as shown in Figure 3.5.
pip uninstall <package>
Figure 3.5 – Uninstalling a Package with pip
In general, it is best not to move or rename virtualenv environments. The activate script and the scripts installed in the environment’s bin directory record the absolute path at which the environment was created, so they stop working if the directory is moved or renamed. However, pip allows portability through requirements files. These simple text files list the packages installed in a given environment, along with their versions, and can be used to rebuild the environment in another location.
Consider a scenario in which you need to move an existing environment from your home directory into Lustre scratch space. You cannot use cp for this. Instead, you must use pip to create a requirements file, then rebuild the environment in the new location.
To begin, activate the original environment, then execute pip list. All the packages installed in the environment are printed to your screen, which lets you verify that everything you need is present before you port the environment.
After you ensure you have all the packages you need, execute the command shown in Figure 4.1. It takes a snapshot of the packages in the environment, with their exact versions, and writes it to a text file named requirements.txt in your working directory. If you change directories during this process, copy the file to your new working directory before you attempt to install packages from it.
pip freeze > requirements.txt
Figure 4.1 – Capturing the Packages in an Environment
Next, create a new virtualenv environment in your Lustre scratch space. Review the Setting up Environments section for information on creating these environments. Activate the new environment once it is created, then execute the command shown in Figure 4.2.
pip install -r requirements.txt
Figure 4.2 – Installing Packages with a Requirements File
pip will install all the packages listed in the requirements.txt file into the environment. Execute pip list to verify that all the packages were successfully installed.
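Comparing two long pip list outputs by eye is error-prone. As a sketch, the hypothetical compare_freeze helper below sorts two pip freeze outputs and runs diff on them. It assumes sort, diff, and mktemp are available, which is typical on Linux systems.

```shell
# Hypothetical helper: compare two pip freeze snapshots, ignoring line order.
compare_freeze() {
    a_sorted="$(mktemp)" && b_sorted="$(mktemp)" || return 2
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    # diff prints nothing and returns 0 when the package sets match.
    diff "$a_sorted" "$b_sorted"
    rc=$?
    rm -f "$a_sorted" "$b_sorted"
    return "$rc"
}
```

In the rebuilt environment, you could run pip freeze > rebuilt.txt (a placeholder file name) and then compare_freeze requirements.txt rebuilt.txt. No output and a zero exit status mean both environments contain the same packages at the same versions.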