It’s easy to clog up your machine with bits of software. So before downloading software and installing it, look at this method which keeps things nice and separate and clean.
Anaconda — the easiest way
If you are not very comfortable installing things from the command line (the Terminal app), or you only need to do data science on your laptop, then the best way to install on a Mac is to use the Anaconda distribution.
It’s easy and fast, but it is not the best way if you are going to want to run lots of projects or are doing coding projects alongside your data science.
Virtualenv — the better way
If, however, you are going to do more than data science with Python on your laptop, then Anaconda is probably not the best way. This applies if:
- You are going to write some applications in Python or run other projects beyond just using Jupyter notebooks
- You are comfortable using the command line
In this case it is better to create an installation in a separate virtual environment for different projects. This enables you to keep a very clean separate installation for data science vs your different Python projects. They are easy to maintain and uninstall and don’t clog up your base machine environment.
Here’s how to do this.
These instructions are focused on Mac OS X. Linux setup is the same with the exception of step 1a. Windows install is quite different.
Step 1 — check that Python 3 is installed
Older versions of macOS come with Python 2.7 installed already (Apple removed it in macOS 12.3). If you type python on the command line in Terminal on one of those machines, it should start Python 2.7. But Python 2.7 reached end of life on 1 January 2020, so you really want to use Python 3.
Check if you already have Python 3 installed. On the command line type:
python3
If it is already installed it will load and show you the version of Python 3. Something like this:
Python 3.6.4 (default, Jan 6 2018, 11:51:15)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
If it is already installed then that is great. Type exit() to leave the Python prompt and go on to Step 2.
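If you would rather confirm the version from a script than from the interactive prompt, a minimal sketch like the one below does the same job. The function name is_python3 is only an illustrative choice and is not part of any library.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3."""
    return version_info[0] == 3


if __name__ == "__main__":
    # sys.version holds the same banner text that the interactive prompt prints.
    print(sys.version)
    print("Python 3:", is_python3())
```

Save it under any name you like and run it with python3 followed by that file name. If the python3 command itself is not found, go to Step 1a.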
Step 1a — install Python 3 if you don’t have it
There are many ways of installing Python 3 and many blogs on the web. I have found that the easiest is to use Homebrew. Follow the instructions here:
Install Python3 on a Mac How to set up Python3 on a Mac | Practical Programming classes and…
Step 2 — install virtual environments
A virtual environment is like a Python sandbox into which you can install whatever Python packages you like with the pip command, and they will only be installed there. It is nice and clean.
Install the virtualenv software from the command line:
pip install virtualenv
If you don’t have permission to install it, you may need to use the following and enter your admin user password when prompted:
sudo pip install virtualenv
Now you should create a directory in which to store your virtual environments. You can put it wherever you like, but I have found that it is good to have a virtualenvs (or .virtualenvs if you prefer to make it hidden) directory just off your home directory. You can set this up by going to your home directory and creating it:
cd ~
mkdir virtualenvs
Step 2a — (optional) install virtualenvwrapper
To have an easier way to activate the different virtual environments, you can also install virtualenvwrapper. This requires a bit of configuration but it is worth it in my view.
First install the virtualenvwrapper package:
pip install virtualenvwrapper
Now you need to set up your bash configuration so that it knows where to get your virtual environments from and where the virtualenvwrapper program is stored.
To do this you first need to have:
- The directory path where you created your virtualenv folder
- The directory path where your Python 3 is installed (type which python3 to see the path)
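Go to your home directory with cd ~ and carefully edit your .bashrc file, which may be empty. (On a Mac, Terminal opens login shells, which read .bash_profile rather than .bashrc, and recent versions of macOS use zsh with .zshrc by default, so edit whichever file your shell actually reads.) Nano is a decent editor to do this:
nano .bashrc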
Add the following 3 lines inserting the right paths for your own system and save the file.
export WORKON_HOME=~/virtualenvs
export VIRTUALENVWRAPPER_PYTHON=/usr/local/bin/python3
source /usr/local/bin/virtualenvwrapper.sh
Close your terminal and reopen it to ensure that these new configuration options are read. Then type workon to test that virtualenvwrapper is working. It should give you no errors.
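If workon does report errors, the usual cause is one of the two exported paths. The sketch below is one way to check them from Python. The function name check_wrapper_config and its messages are hypothetical, and it only inspects the two variables named in this step.

```python
import os


def check_wrapper_config(env=None):
    """Return a list of problems found in the virtualenvwrapper settings."""
    env = os.environ if env is None else env
    problems = []

    # WORKON_HOME should point at the folder that holds your environments.
    workon_home = env.get("WORKON_HOME")
    if not workon_home:
        problems.append("WORKON_HOME is not set")
    elif not os.path.isdir(os.path.expanduser(workon_home)):
        problems.append("WORKON_HOME is not a directory: " + workon_home)

    # VIRTUALENVWRAPPER_PYTHON should be the Python 3 binary from `which python3`.
    python_path = env.get("VIRTUALENVWRAPPER_PYTHON")
    if not python_path:
        problems.append("VIRTUALENVWRAPPER_PYTHON is not set")
    elif not (os.path.isfile(python_path) and os.access(python_path, os.X_OK)):
        problems.append(
            "VIRTUALENVWRAPPER_PYTHON is not an executable file: " + python_path
        )

    return problems


if __name__ == "__main__":
    found = check_wrapper_config()
    for problem in found:
        print("Problem:", problem)
    if not found:
        print("Both settings look fine")
```

Run it in a freshly opened terminal so that it sees the values your shell configuration file exports.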
Step 3 — create a new virtual environment for this Jupyter project
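Go to your virtualenv folder and create a new virtual environment with the mkvirtualenv command. (mkvirtualenv comes with virtualenvwrapper. If you skipped Step 2a, run virtualenv --python=/usr/local/bin/python3 jupyter from inside your virtualenvs folder instead.) The parameters you will give it are the location of your Python 3 (found by typing which python3 on the command line) and a name for your new virtual environment. Let’s call this one jupyter: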
mkvirtualenv --python=/usr/local/bin/python3 jupyter
This will create your virtual environment and activate it, as shown by the bit in brackets at the start of your prompt:
(jupyter) matt$
Type cd ~ to go back to your home directory.
Step 4 — install the software you need inside the virtual environment
When you have the virtual environment activated, whenever you install software using the pip install command, it will be installed within this virtual environment only and not clog up the rest of your machine.
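To see this for yourself, the following sketch asks the running interpreter whether it is inside a virtual environment and which directory packages go into by default. The function names in_virtualenv and site_packages_dir are illustrative. The check relies on sys.prefix and sys.base_prefix, with a fallback for older virtualenv releases.

```python
import sys
import sysconfig


def in_virtualenv():
    """Return True when this interpreter runs inside a virtual environment."""
    # virtualenv 20+ and the built-in venv module make sys.prefix differ from
    # sys.base_prefix; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def site_packages_dir():
    """Return the default install directory for pure-Python packages."""
    return sysconfig.get_paths()["purelib"]


if __name__ == "__main__":
    print("Inside a virtual environment:", in_virtualenv())
    print("Packages are installed into:", site_packages_dir())
```

With the jupyter environment active, the second line should point somewhere under your virtualenvs folder rather than at a system-wide location.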
Let’s install a bunch of tools we want for data science and Jupyter notebooks in this virtual environment.
First let’s install numpy, pandas, matplotlib as core data science tools
(jupyter) matt$ pip install --upgrade pip
(jupyter) matt$ pip install numpy pandas matplotlib
Let’s install the jupyter package of tools and also jupyterlab.
(jupyter) matt$ pip install jupyter jupyterlab
You can install any other packages you like using the same pip install [package_name] command.
To list what you already have installed use:
(jupyter) matt$ pip list
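The installed versions can also be read from Python itself. This is a small sketch using importlib.metadata from the standard library (Python 3.8 and later). The helper names installed_version and report are made up for this example.

```python
from importlib import metadata  # standard library in Python 3.8 and later


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def report(package_names):
    """Map each package name to its installed version (None when absent)."""
    return {name: installed_version(name) for name in package_names}


if __name__ == "__main__":
    wanted = ["numpy", "pandas", "matplotlib", "jupyterlab"]
    for name, version in report(wanted).items():
        print(name, version if version else "not installed")
```

Run it with the virtual environment active. Run outside the environment, the same script reports what the base Python has installed, which is a quick way to confirm that the two are separate.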
Step 5 — fire up your first notebook
Create a directory for your notebook projects to keep all the files together.
(jupyter) matt$ cd ~
(jupyter) matt$ mkdir my_project
(jupyter) matt$ cd my_project
Starting Jupyter Notebook (the classic notebook style) or JupyterLab (a more modern UX version of the notebook) takes a single command. Use one of the following:
(jupyter) matt$ jupyter notebook
(jupyter) matt$ jupyter lab
You can now do your data science projects
Step 6 — Activating and deactivating your virtual environment
If you installed virtualenvwrapper then you can type workon on the command line to see all your virtual environments and workon [env_name] to activate any one of them.
matt$ workon
analytics
dummy
matt$ workon jupyter
(jupyter) matt$
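If you did not install virtualenvwrapper, a rough equivalent of that listing can be sketched in Python. This is only an approximation of what workon shows, not its actual implementation. It assumes that a folder counts as a virtual environment when it contains a bin/activate script, and the name list_virtualenvs is hypothetical.

```python
import os


def list_virtualenvs(workon_home):
    """Return the names of sub-directories that look like virtual environments."""
    workon_home = os.path.expanduser(workon_home)
    names = []
    for name in sorted(os.listdir(workon_home)):
        # On macOS and Linux a virtual environment has a bin/activate script.
        if os.path.isfile(os.path.join(workon_home, name, "bin", "activate")):
            names.append(name)
    return names


if __name__ == "__main__":
    home = os.environ.get("WORKON_HOME", "~/virtualenvs")
    if os.path.isdir(os.path.expanduser(home)):
        print("\n".join(list_virtualenvs(home)))
    else:
        print("No such directory:", home)
```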
If you did not install virtualenvwrapper, you need to call the activate script buried within the bin folder within the virtual environment. To activate, use the source command:
matt$ source ~/virtualenvs/jupyter/bin/activate
(jupyter) matt$
To deactivate, type:
(jupyter) matt$ deactivate
matt$
Step 7 — Dealing with the ‘Python path’
Sometimes your programs might not work if the python path is not set correctly. This should be the root directory of your project, from where you start the jupyter notebook.
You can get around this by adding the python path to your virtual environment.
You can add a Python path manually. Assuming your project is in the directory we created, on the command line you type a single line such as:
(jupyter) matt$ export PYTHONPATH="$HOME/my_project"
To see what is set you can type export from the command line.
The postactivate file in the bin folder of your virtual environment (a hook provided by virtualenvwrapper) will run each time you activate this environment. Using nano or another editor you can add Python paths to the environment there.
And for good practice you can add a line to the predeactivate file to reset the Python path when you exit the environment.
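To confirm that the Python path is being picked up, you can look at sys.path in a freshly started interpreter. The sketch below starts a child Python process with PYTHONPATH set and checks that the directory shows up. The function names child_sys_path and is_on_path are illustrative, and ~/my_project is the example project folder from Step 5.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Start a new interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ)
    env["PYTHONPATH"] = extra_dir
    output = subprocess.check_output(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
    )
    return json.loads(output.decode("utf-8"))


def is_on_path(directory, path_entries):
    """Return True when directory appears among the given sys.path entries."""
    target = os.path.realpath(directory)
    # Skip the empty entry that stands for the current working directory.
    return any(os.path.realpath(entry) == target for entry in path_entries if entry)


if __name__ == "__main__":
    project = os.path.expanduser("~/my_project")
    print("Visible to Python:", is_on_path(project, child_sys_path(project)))
```

To check the value your shell has actually exported, call is_on_path(project, sys.path) inside a notebook started from the activated environment.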
That’s it. Now you have a specific environment for your jupyter notebook work. You can install what you like in this environment and if it gets too messy just delete the virtual environment’s directory and rebuild it.
You can use separate virtual environments for different projects if they have different packages they need or specific settings. If you are also developing software you are likely to want to set up a different virtual environment for each software project you have.