# NumPy Basics: Part 1

Ha Khanh Nguyen (hknguyen)

## 1. What is NumPy?

• NumPy is short for Numerical Python.
• Some of the main features of NumPy that we will look at in our course:

• ndarray: a multi-dimensional array providing fast, array-oriented arithmetic operations.
• Mathematical functions for fast operations on entire arrays of data without having to write loops! (just like R)
• Tools for reading/writing array data to disk and working with memory-mapped files.
• Linear algebra, random number generation, etc.
• Let's compare the performance of a NumPy array and the equivalent Python list!

• NumPy-based algorithms are generally 10-100 times faster than their pure Python counterparts and use significantly less memory! (again, sounds just like R!)
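
The comparison mentioned above can be sketched with the standard-library `timeit` module. This is only an illustrative benchmark: the array size, the doubling operation, and the names `py_list` and `np_arr` are assumptions chosen for the example, and the measured speedup will vary from machine to machine.

```python
import timeit

import numpy as np

n = 1_000_000                 # illustrative size (an assumption, not a fixed requirement)
py_list = list(range(n))      # pure Python list
np_arr = np.arange(n)         # equivalent NumPy array

# Double every element 10 times over.
# The list comprehension loops in the Python interpreter,
# while the NumPy expression runs in compiled code.
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=10)
numpy_time = timeit.timeit(lambda: np_arr * 2, number=10)

print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```

On most machines the NumPy version should finish well ahead of the list version, but the exact ratio depends on hardware, array size, and the operation being timed.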

## 2. NumPy ndarray

• NumPy arrays enable you to perform mathematical operations on whole blocks of data using similar syntax to the equivalent operations between scalar elements.
• An ndarray is a generic multidimensional container for homogeneous data: all of the elements must be the same type.
• Again, like vectors and arrays in R!
• Every array has ndim and shape attributes, which give the number of dimensions and the size along each dimension.
• Every array also has a dtype attribute, an object describing the data type of the array.
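
A small sketch of these attributes, using an arbitrary 2 x 3 array (the variable name `data` and its values are made up for illustration):

```python
import numpy as np

# A 2 x 3 array built from a nested list; the values are arbitrary.
data = np.array([[1.5, -0.1, 3.0],
                 [0.0, -3.0, 6.5]])

print(data.ndim)    # number of dimensions: 2
print(data.shape)   # size along each dimension: (2, 3)
print(data.dtype)   # element type: float64

# Arithmetic applies to the whole block of data at once,
# with the same syntax as for scalars.
print(data * 10)
print(data + data)
```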

### 2.1 Creating ndarray

• You can use the np.array() function to create a NumPy ndarray.
• This function accepts any sequence-like object, such as a list or tuple (including other arrays), and produces a new NumPy array.
• A nested sequence (like a list of equal-length lists) is converted into a multi-dimensional array.
• NumPy also provides functions to create special arrays, such as arrays of all 0s or all 1s.
• np.arange() is an array-valued version of the built-in Python range() function.
• Here is a list of common functions for creating arrays in NumPy:

| Function | Description |
| --- | --- |
| array | Convert input data (list, tuple, array, or other sequence type) to an ndarray, either by inferring a dtype or by explicitly specifying one; copies the input data by default |
| asarray | Convert input to an ndarray, but do not copy if the input is already an ndarray |
| arange | Like the built-in range, but returns an ndarray instead of a range object |
| ones | Produce an array of all 1s with the given shape and dtype |
| ones_like | Take another array and produce an array of 1s with the same shape and dtype |
| zeros | Like ones, but producing arrays of 0s instead |
| zeros_like | Like ones_like, but producing arrays of 0s instead |
| empty | Create a new array by allocating new memory but, unlike ones and zeros, do not populate it with any values |
| empty_like | Take another array and produce a new array of the same shape and dtype, but do not populate it with values |
| full | Produce an array of the given shape and dtype with all values set to the indicated "fill value" |
| full_like | Take another array and produce a filled array of the same shape and dtype |
| eye, identity | Create a square N x N identity matrix (1s on the diagonal and 0s elsewhere) |
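
A few of the creation functions listed above can be tried out as follows. This is a minimal sketch; the variable names and input values are chosen only for illustration.

```python
import numpy as np

arr1 = np.array([6, 7.5, 8, 0, 1])     # flat list -> 1-D array
arr2 = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8]])         # nested lists -> 2-D array of shape (2, 4)

zeros = np.zeros(5)                     # five 0.0 values
ones = np.ones((2, 3))                  # 2 x 3 array of 1.0 values
steps = np.arange(5)                    # array([0, 1, 2, 3, 4])
filled = np.full((2, 2), 7)             # 2 x 2 array where every element is 7
ident = np.eye(3)                       # 3 x 3 identity matrix
like = np.zeros_like(arr2)              # zeros with arr2's shape and dtype

# asarray returns the input itself when it is already an ndarray,
# whereas array makes a copy by default.
same = np.asarray(arr2)
copy = np.array(arr2)
print(same is arr2)   # True
print(copy is arr2)   # False
```

Note that `arr1` ends up with a floating-point dtype because one of its inputs (7.5) is a float: `np.array` infers a single dtype that can hold every element.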

### 2.2 Arithmetic with NumPy Arrays

• Arrays are important because they enable you to express batch operations on data WITHOUT writing any loops.
• NumPy users call this vectorization (the operations are applied element-wise).
• So does R, as some of you saw in STAT 385 last semester!
• Any arithmetic operation between equal-size arrays is applied element-wise.
• We can also use comparison operators with arrays! Comparing two arrays of the same size yields a boolean array.
• What happens if the two arrays have different sizes? In that case, the operation is called broadcasting, and it is discussed in detail in Appendix A of the textbook.
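
The points above can be illustrated with two small arrays. The names `a`, `b`, and `row` and their values are assumptions made for this sketch; the last few lines show only the simplest cases of broadcasting, not the full set of rules.

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
b = np.array([[0.0, 4.0, 1.0],
              [7.0, 2.0, 12.0]])

print(a * a)    # element-wise product: [[1, 4, 9], [16, 25, 36]]
print(a - a)    # element-wise difference: all zeros
print(b > a)    # boolean array: [[False, True, False], [True, False, True]]

# Simplest case of broadcasting: a scalar is combined with every element.
print(1 / a)
print(a * 0.5)

# A 1-D array of length 3 is broadcast across each row of the 2 x 3 array.
row = np.array([10.0, 20.0, 30.0])
print(a + row)  # [[11, 22, 33], [14, 25, 36]]
```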

This lecture note is modified from Chapter 4 of Wes McKinney's Python for Data Analysis 2nd Ed.