Setup¶
Software Setup¶
Discussion
Installing Python
Python is a popular language for scientific computing, and a frequent choice for machine learning as well. To install Python, follow the Beginner’s Guide or head straight to the download page.
Please set up your Python environment at least a day before the workshop. If you run into problems during installation, e-mail your workshop organizers for assistance so that you are ready to go as soon as the workshop begins.
Installing the required packages¶
Pip is the package management system bundled with Python. It should be available on your system once you have installed Python successfully.
Open a terminal (Mac/Linux) or Command Prompt (Windows) and run the following commands.
Create a virtual environment called dl_workshop:
On Linux/macOS
python3 -m venv dl_workshop
On Windows
py -m venv dl_workshop
Activate the newly created virtual environment:
On Linux/macOS
source dl_workshop/bin/activate
On Windows
dl_workshop\Scripts\activate
Remember that you need to activate your environment every time you restart your terminal!
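If you are unsure whether the environment is currently active, one quick check is to ask Python itself. The sketch below relies on the standard-library behaviour that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `venv`; the helper name `in_virtualenv` is only illustrative and not part of the lesson.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter location:", sys.executable)
```

If the first line prints `False`, activate dl_workshop again before installing or importing any packages.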
Install the required packages:
In this course, you can choose between two tracks: one using PyTorch and one using Keras. We recommend PyTorch for Python users at intermediate level and above, and Keras for beginners.
PyTorch track:
On Linux/macOS
python3 -m pip install jupyter seaborn scikit-learn pandas tqdm torchinfo torchmetrics torch torchvision
On Windows
py -m pip install jupyter seaborn scikit-learn pandas tqdm torchinfo torchmetrics torch torchvision
If you have a GPU, you might benefit from following the official commands from PyTorch for installing the torch and torchvision packages.
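Once torch is installed, it can be useful to see which device PyTorch actually detects. The following is a minimal sketch, assuming a torch version that meets this lesson's minimum; `describe_torch_device` is an illustrative helper name, and the function simply reports that torch is missing if it has not been installed yet.

```python
import importlib.util


def describe_torch_device() -> str:
    """Report which device PyTorch would use, or that torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"

    import torch  # imported lazily so the check above can fail gracefully

    if torch.cuda.is_available():
        # Name of the first CUDA device visible to PyTorch
        return "CUDA GPU: " + torch.cuda.get_device_name(0)
    if torch.backends.mps.is_available():
        # Apple silicon GPUs are exposed through the MPS backend
        return "Apple MPS GPU"
    return "CPU only"


print(describe_torch_device())
```

A result of "CPU only" is fine for following along; training will just take longer.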
Keras track:
On Linux/macOS
python3 -m pip install jupyter seaborn scikit-learn pandas keras tensorflow pydot
On Windows
py -m pip install jupyter seaborn scikit-learn pandas keras tensorflow pydot
In this course, for the Keras track, we will be using the TensorFlow backend. Keras can also use either PyTorch or JAX as a backend.
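As a rough illustration of how the backend is chosen, the sketch below assumes Keras 3 or later, which reads the `KERAS_BACKEND` environment variable once, when keras is first imported. With older Keras versions that ship with TensorFlow, the variable has no effect and the backend is always TensorFlow.

```python
import os

# Assumption: Keras 3 or later. The variable must be set before keras is
# imported; setdefault keeps any value already exported in the shell.
os.environ.setdefault("KERAS_BACKEND", "tensorflow")

try:
    import keras  # must come after the environment variable is set

    print("Keras backend in use:", keras.backend.backend())
except ImportError as err:
    # Raised when keras (or the selected backend) is not installed
    print("Keras could not be imported:", err)
```

For this course, the printed backend should be `tensorflow`.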
Note for macOS users: the tensorflow-metal package accelerates TensorFlow model training on recent Macs with an Apple silicon chip (M1/M2/M3).
However, as of January 2025, installation is broken in the most recent version; see the developer forum.
An optional challenge in episode 2 requires Graphviz; installation instructions can be found by following this link.
Starting Jupyter Lab¶
We will teach using Python in Jupyter Lab, a programming environment that runs in a web browser. Jupyter Lab is compatible with Firefox, Chrome, Safari and Chromium-based browsers. Note that Internet Explorer and Edge are not supported. See the Jupyter Lab documentation for an up-to-date list of supported browsers.
To start Jupyter Lab, open a terminal (Mac/Linux) or Command Prompt (Windows), make sure that you have activated the virtual environment you created for this course, and type the command:
jupyter lab
Check your setup¶
To check whether all packages were installed correctly, start a Jupyter notebook in Jupyter Lab as explained above. Run the following lines of code:
PyTorch track:
import sklearn
print('sklearn version: ', sklearn.__version__)
import seaborn
print('seaborn version: ', seaborn.__version__)
import pandas
print('pandas version: ', pandas.__version__)
import torchinfo
print('torchinfo version: ', torchinfo.__version__)
import torch
print('PyTorch version: ', torch.__version__)
This should output the versions of all required packages without giving errors. Most versions will work fine with this lesson, but:
For PyTorch, the minimum version is 2.1.0
For sklearn, the minimum version is 1.2.2
Keras track:
import sklearn
print('sklearn version: ', sklearn.__version__)
import seaborn
print('seaborn version: ', seaborn.__version__)
import pandas
print('pandas version: ', pandas.__version__)
import keras
print('Keras version: ', keras.__version__)
import tensorflow
print('Tensorflow version: ', tensorflow.__version__)
This should output the versions of all required packages without giving errors. Most versions will work fine with this lesson, but:
For Keras and TensorFlow, the minimum version is 2.12.0
For sklearn, the minimum version is 1.2.2
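The minimum versions listed above can also be checked programmatically. The sketch below is one possible approach using only the standard library; the helper names `parse_version` and `check_minimums` are illustrative, and the comparison is deliberately simple (it ignores suffixes such as `+cu121` or `rc1`).

```python
import re
from importlib import metadata

# Minimum versions stated in this lesson, keyed by pip distribution name.
MINIMUMS = {
    "torch": "2.1.0",
    "keras": "2.12.0",
    "tensorflow": "2.12.0",
    "scikit-learn": "1.2.2",
}


def parse_version(text: str) -> tuple:
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring any non-numeric suffix."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return (0, 0, 0)
    parts = tuple(int(part) for part in match.group().split("."))
    # Pad short versions such as '2.12' so tuples compare as expected
    return parts + (0,) * max(0, 3 - len(parts))


def check_minimums(minimums: dict) -> dict:
    """Map each package name to 'ok', 'too old' or 'not installed'."""
    report = {}
    for name, required in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        is_ok = parse_version(installed) >= parse_version(required)
        report[name] = "ok" if is_ok else "too old"
    return report


for package, status in check_minimums(MINIMUMS).items():
    print(f"{package}: {status}")
```

Packages belonging to the track you did not install will show up as "not installed", which is expected.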
Fallback option: cloud environment¶
If a local installation does not work for you, you can also run this lesson in Binder Hub. This should give you an environment with all the software and data required for this lesson. Nothing you save there will be stored permanently, so please copy any files you want to keep. Note that if you are the first person to launch it in the last few days, it can take several minutes to start up; the next person to load it should find that it loads in under a minute. Instructors who intend to use this option should start it themselves shortly before the workshop begins.
Note
Training deep-learning models can take a long time on Binder, so you may need to reduce the number of epochs.
Alternatively, you can use Google Colab. If you open a Jupyter notebook here, most of the required packages are already pre-installed. Note that Google Colab uses Jupyter Notebook instead of Jupyter Lab.
Downloading the required datasets¶
Download the weather prediction dataset CSV and the Dollar Street dataset (4 files in total).
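After downloading, you may want to confirm that the files ended up where your notebook can find them. The sketch below lists the files in a folder; the folder name `data` is purely an assumption for illustration, so substitute whichever location you saved the downloads to.

```python
from pathlib import Path


def list_data_files(data_dir: str = "data") -> list:
    """Return the sorted names of regular files found directly in data_dir."""
    folder = Path(data_dir)
    if not folder.is_dir():
        # Folder missing: nothing has been downloaded to this location yet
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())


files = list_data_files()  # "data" is a hypothetical folder name
print(f"Found {len(files)} file(s):", files)
```

Compare the printed names against the files offered on the download links.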