Jupyter Notebook is a standard document format for Python programs.
Jupyter Notebook: A free, open-source web application to create and share interactive documents that contain Python (and possibly other) code with optional narrative text in notebook form.
The Anaconda download provides for working with Jupyter Notebooks on your computer. The same notebooks are also available in the cloud at anaconda.cloud, where a Sign In link for a free cloud account appears in the top-right corner of the resulting window.
Further, on your computer or in the cloud, Anaconda provides the same environment, called Jupyter Lab, for developing Jupyter Notebooks. The notebooks are interchangeable across the cloud and your computer, as is the environment that contains the notebooks and data files. The Jupyter Lab environment not only provides for creating and modifying Jupyter Notebooks, it also displays your file directory in the top-left of the window, on your computer or in the cloud, for locating and manipulating files.
Google also offers a free cloud environment, Google’s Colab, for creating standard Jupyter notebook files, though in a slightly different environment. The Google alternative may appeal more to those who prefer to work with cloud documents stored on Google Drive. Regardless, Colab works perfectly well for developing Jupyter Notebooks.
To run our first Python program, we need to simultaneously understand two concepts:
- Jupyter Notebook: How to enter and process information.
- Python: The syntax of a Python function call.
We need to know something about the development environment, and we need to know enough Python to enter and run at least one specific Python function call. We begin with a description of a Jupyter Notebook.
2.1 Open a Notebook
To begin or continue a Python analysis, open a new or existing Jupyter Notebook, which opens in your default web browser. Identify Jupyter Notebook files by their filetype of .ipynb, for Interactive Python Notebook. As with most files in modern data analytics, including machine learning, .ipynb files are non-proprietary standard text files that a standard text editor could edit outside the development environment. However, a Jupyter Notebook renders the structure of the text file in convenient notebook form and provides the run-time environment for executing Python code.
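Because an .ipynb file is plain text, ordinary code can read it. The following is a minimal sketch, assuming the version 4 notebook layout, in which the file is JSON with a top-level list named cells. The file name example.ipynb and the cell contents are invented for the illustration.

```python
import json
import os
import tempfile

# A minimal notebook in the version 4 JSON layout: one markdown cell, one code cell.
# The cell contents are invented for this illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# My First Notebook"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ['print("Hello World")'],
        },
    ],
}

# Write the notebook as ordinary text, then read it back like any other text file.
path = os.path.join(tempfile.mkdtemp(), "example.ipynb")
with open(path, "w", encoding="utf-8") as f:
    json.dump(notebook, f, indent=1)

with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

# List the type and the text of each cell.
cell_summary = [(cell["cell_type"], "".join(cell["source"])) for cell in loaded["cells"]]
for cell_type, source in cell_summary:
    print(f"{cell_type}: {source}")
```

Opening the same file in a standard text editor shows this JSON text directly, which is what Jupyter Lab or Colab renders in notebook form.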
2.1.1 Anaconda
Before opening a notebook, first create a folder to store all notebooks and data files. A suggestion is a folder called Python in your Documents folder. If working on your computer you can do this any time. If working in the cloud you will first need to log into your cloud account.
2.1.1.1 Your Computer
To access the notebooks, we begin by entering the Jupyter Lab environment. One way to enter this environment is from Anaconda’s Navigator app, part of the Anaconda download. Figure 2.1 shows the Jupyter Lab tile, displayed among other tiles when the app opens.
Identify the Jupyter Lab tile, then click the Launch button to access current notebooks and create new notebooks from the displayed file directory. For your first access to Navigator, you may need to click Install on the tile.
2.1.1.2 In the Cloud
Alternatively, log into your anaconda.cloud account, click Notebooks at the top of the window as in Figure 2.2, and you are automatically transported to the Jupyter Lab environment.
After entering the Jupyter Lab environment, navigate to the file directory where you wish to store a new notebook or where an existing notebook is located.
When first entering Jupyter Lab, the Launcher tab appears, which displays multiple icons from which to launch a variety of tasks.
Chapter 1 of the video, Login and New Notebook, demonstrates some basic concepts of the Jupyter Lab environment as well as working with a newly created Jupyter Notebook. Figure 2.3 shows the relevant part of the resulting display, located at the top of the window, which is part of the display from the Jupyter Lab Launcher. Click the first option to access the most recent version of Anaconda. If already processing notebook and data files in your cloud account, click the + icon at the top-left of the window.
The name of the newly created notebook is Untitled. If you do not change the notebook name from Untitled, you end up with a collection of notebooks with names such as Untitled1, Untitled2, etc. Right-click on the file name in the file directory in the top-left of the window, select Rename, and enter a meaningful name.
2.1.2 Colab
Enter Google Colab through a browser linked to Colab at colab.research.google.com. Once there, bookmark the site for an easy return. Colab will automatically create a folder on your Google Drive called Colab Notebooks in your MyDrive folder, and then store your Python Notebooks in that folder. A suggestion is to create your own data folder at the top level of MyDrive to store your data files.
To open a Notebook: Go to the File menu and select New notebook. Change the notebook name by clicking on the name at the top of the window. Another way to open a Colab Notebook is directly from Google Drive. Locate the notebook in the Colab Notebooks folder, right-click, choose Open with, then Google Colaboratory.
2.2 Cells
A notebook, such as Jupyter Notebook, is a computer document that consists of an ordered collection of cells, one after the other. To enter information into a Jupyter Notebook is to enter information into a specific cell, as illustrated beginning in Chapter 1 of the previously linked video, Login and New Notebook. A cell contains one of two types of information.
Cell: Self-contained block of code or narrative text.
Work with the information in a cell separately from the information in other cells. The default type of cell is for Python (or other) code.
Code cell: Cell that contains code, such as Python code.
The purpose of the code cell is to enter and process Python code.
The new notebook file within Jupyter Lab appears as a single blank line with the name of Untitled.ipynb, shown in Figure 2.4.
The Colab interface shown in Figure 2.5 looks a little different but follows exactly the same concept.
Working with information in a cell is to either (a) enter and edit that information, or (b) process that information, such as running the corresponding Python code. These two types of interactions correspond to two different modes, one of which describes the state of each cell in the notebook.
Edit mode: Edit information in cells.
When running Jupyter Notebook, edit mode is indicated by a green cell border and a prompt in the editor area.
When a cell is in edit mode, you can enter information into the cell as in a normal text editor. With either Jupyter Notebook or Colab, enter edit mode by pressing Enter/Return, or use the mouse to double-click the content in the cell.
Enter/Return: Enter edit mode.
Once the code is entered into the cell, Python can run the code and display any output.
Command mode: Cell contents ready to be processed.
There needs to be a way to leave edit mode and activate command mode.
Esc: Leave edit mode.
Jupyter Notebook running on your computer indicates command mode with a grey cell border and blue left margin. Colab indicates the contents of a code cell ready to be run with an arrow that appears in the left-margin of the cell when the cursor is positioned over the cell.
Run the code in a cell from the notebook menu. Alternatively, to run with a keyboard instruction, enter CTRL/Enter or CTRL/Return, either with Anaconda Jupyter Notebook or Colab.
2.3 First Python Program
If the information entered into a cell is a Python instruction, a function call, then that instruction can be processed and run. According to tradition, the first instruction that is run when learning a new computer language prints, that is, displays, the phrase Hello World.
Anaconda: Chapter 3, Hello World, of the previously referenced video “Hello World” illustrates the first Python program.
Google Colab: Hello World [video first 1:40]
This process involves entering the code as function calls, running the code, fixing the inevitable errors, and then re-running.
print() function: Display information as output.
The proper Python function call includes the quotes and the parentheses:
print("Hello World")
Then re-run the code.
With Python, Excel, and most other computer languages, enclose a character string in quotes. The print() function further requires that the information to be displayed be contained within a set of parentheses. Otherwise, Python generates a syntax error. If the parentheses are omitted, return to Edit mode to fix the error.
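The sketch below illustrates these two requirements. It uses Python's built-in compile() function to check the syntax of each line without running it, so the errors can be compared side by side. The label "<cell>" is only a placeholder name for the source of the code.

```python
# Each string is the text of a code cell; compile() checks the syntax without running it.
attempts = [
    'print("Hello World")',   # correct: quotes and parentheses
    'print "Hello World"',    # parentheses omitted
    'print(Hello World)',     # quotes omitted
]

results = {}
for code in attempts:
    try:
        compile(code, "<cell>", "exec")
        results[code] = "OK"
    except SyntaxError as err:
        # The exact wording of the message varies across Python versions.
        results[code] = f"SyntaxError: {err.msg}"
    print(f"{code!r:25} -> {results[code]}")
```

Only the first version passes. In a notebook, running either of the other two versions in a code cell displays the syntax error beneath the cell.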
Using the print() function provides the most flexibility for displaying information. The Jupyter Notebook also defaults to printing the output of the last line of a code cell if that function call results in displayable output. That convention saves entering many print statements, though print() is needed even then for customization.
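A small sketch of the difference, with a made-up variable name. In a notebook, the final line of the cell displays the value of greeting without a print() call. Run as an ordinary script outside a notebook, that final line displays nothing.

```python
greeting = "Hello World"

# print() gives control over the display, such as the separator between items.
print("Greeting", greeting, sep=": ")   # displays: Greeting: Hello World

# A bare expression on the last line of a code cell: the notebook displays its value.
greeting
```

The notebook shows the last-line value as 'Hello World', quotes included, whereas print() shows the text without the quotes.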
2.3.1 Exit and Return
When your work in a Jupyter Notebook is complete for the moment, exit and then return [video 1:41] as needed. Quitting the notebook is more than just leaving a notebook. The Anaconda system runs multiple processes in the background to support the running notebooks.
To quit all the processes running on your computer, from the notebook File menu, choose Close and Halt. See Figure 2.6.
Return by re-opening the notebook and continue entering, modifying, and running Python code.
On Anaconda, to save the notebook, click the disk icon at the top of the notebook window, then close the window.
On Colab, do File --> Save or CMD/CTRL-s to save the notebook between autosaves. Close the window to exit.
2.3.2 Markdown
A notebook can, and almost always should, contain narrative text that explains and describes the Python code and the resulting output, the results. The finished notebook reads like a book chapter or a scrolling PowerPoint presentation, with explanation interspersed with Python code and output. Accordingly, there are two primary types of cells in a notebook.
Markdown: A lightweight markup language for creating readable, formatted text using a plain-text editor by embedding codes such as italics and headers into the document that format accordingly when the cell is rendered (processed).
Cells in a Jupyter Notebook can be designated as markdown (text) cells that display the information according to any markdown codes that are present. Initially created cells are Code cells. To convert to Markdown, access the corresponding menu from the top of the file’s window or, when not editing the cell, enter the letter m.
Markdown cell (Jupyter Notebook) or Text cell (Colab): Cell for the display and formatting of narrative text.
Begin editing a markdown (text) cell by pressing Enter/Return with the cursor over the cell, or double-click the cell. On the Mac, press CMD-Return to leave, which, on Windows, would presumably be CTRL-Return.
To illustrate, refer to Chapter 5, Markdown Cell, in the video “Hello World”.
Google Colab: markdown [video after the first 1:40]
A markdown cell can either be rendered or unrendered. When rendered, the cell’s contents are displayed formatted. When unrendered, the raw text source of the cell is displayed. With Jupyter Notebook, to render the selected cell with the mouse, click the Run button in the toolbar. To unrender the selected cell, double-click on the cell. With Colab, both the unrendered version where you edit and the rendered version are displayed simultaneously.
There are a variety of standard markdown codes applicable to a wide range of environments, including Jupyter Notebooks. Markdown renders a display, such as a web page, without the need for much coding. One set of useful markdown codes is for italics and bold formatting. Italicize a word or phrase by beginning and ending it with either an underscore or an asterisk. Bold with two underscores or two asterisks. Rendering the corresponding text then displays the text according to these markdown instructions.
Figure 2.7 illustrates the unrendered version, the raw text, in Edit mode.
Render the text with Run or the corresponding icon from the window menu or with CMD/Return on a Mac. Find the rendered version in Figure 2.8.
Another set of markdown codes formats a markdown cell as a header, useful to outline the notebook into meaningful sections. Begin a line with a single pound sign, #, for a first-level header. A line that begins with two pound signs, ##, formats as a second-level header, and so forth.
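As a sketch of what the raw text of a markdown cell might look like, the Python string below holds a few sample lines with header, italic, and bold codes. The wording of the sample is invented, and the helper header_level() is a hypothetical function written only to show how the leading pound signs set the header level. It is not how Jupyter or Colab renders markdown.

```python
# Sample raw text for a markdown cell (invented content).
sample = """# Sales Analysis
## Read the Data
This step is *required*, and the file name is **case sensitive**.
Underscore characters work the same way: _required_ and __case sensitive__."""


def header_level(line):
    """Return the header level of a markdown line, or 0 for ordinary text."""
    stripped = line.lstrip()
    pounds = len(stripped) - len(stripped.lstrip("#"))
    # A header has one to six pound signs followed by a space.
    if 1 <= pounds <= 6 and stripped[pounds:pounds + 1] == " ":
        return pounds
    return 0


for line in sample.splitlines():
    print(header_level(line), line)
```

To see the formatting itself, paste the four lines of sample text into a markdown (text) cell and render the cell.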
A notebook almost always consists of more than one cell. Add a new cell from Jupyter Notebook with the Insert menu, choosing either Insert Cell Above or Insert Cell Below. With Colab, there are + Code and + Text buttons toward the top of the screen. These buttons also appear when hovering with the mouse at the beginning or end of a cell. Or, for either environment, when not editing a cell, enter a to add a cell above the current cell, or b to add a cell after the current cell.
2.3.3 Re-Initialize a Notebook
When working on your Python code in a Jupyter or cloud notebook, you often need multiple work sessions to complete the task. On your computer, close down a session with Close and Halt from the notebook File menu. Then log out of Jupyter Notebook from the home screen. I notice on my Mac that I also need to close and terminate a Terminal window (gently with CTRL-C) to end the session completely.
When you reopen the Jupyter notebook, all of your previous output will be displayed. The problem is that this display of your notebook cells is only a display. The underlying variables and objects, such as the data frame that contains your data, are no longer active. To re-initialize and restore all variables and objects, you need to run the entire notebook.
Anaconda: Go to the double-arrowhead at the top of the window for the code file and click.
Colab: Go to the Runtime menu and choose Run all.
Then the notebook will be in the same state as when the notebook was closed down. All displayed output will be live with the corresponding values active.
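The need to re-run can be sketched in plain Python. In the illustration below, an empty dictionary stands in for the memory of a freshly reopened notebook, an assumption made only for this example, and the variable name total is invented. Running a later cell first fails because the variable it uses does not yet exist. Running the cells in order restores it.

```python
# An empty dictionary stands in for the memory of a reopened notebook
# (an assumption for illustration only).
fresh_session = {}

# Running a later cell first fails: the variable is on screen but not in memory.
try:
    exec("print(total)", fresh_session)
    status = "ran"
except NameError as err:
    status = f"NameError: {err}"
print(status)

# Re-running the cells in order restores the variable.
exec("total = 2 + 3", fresh_session)
exec("print(total)", fresh_session)   # displays: 5
```

Run all performs the second step for every cell in the notebook, from top to bottom.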