NumPy Application: Random Walks

Ha Khanh Nguyen (hknguyen)

1. What are Random Walks?

• In short, a random walk is a stochastic process: a path built by adding up a sequence of random steps.
• In this example, we will consider a simple random walk starting at 0 with steps of 1 and -1 occurring with equal probability.
• Now let's plot our walk!
• Before plotting, you may need to install the matplotlib package by running the following command at the command line (Terminal on Mac, Anaconda Prompt on Windows):
conda install matplotlib
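As a rough illustration of the simple walk described in this section, here is a minimal pure-Python sketch using the built-in random module. The seed, the step count nsteps, and the variable names are assumptions chosen for illustration, not values fixed by the notes.

```python
import random

random.seed(12345)  # arbitrary seed (assumption), only for repeatable runs

nsteps = 1000       # number of steps (assumption)
position = 0        # the walk starts at 0
walk = [position]   # record every position, including the start

for _ in range(nsteps):
    # random.randint(0, 1) returns 0 or 1 with equal probability
    step = 1 if random.randint(0, 1) else -1
    position += step
    walk.append(position)
```

If matplotlib is installed, something like plt.plot(walk[:100]) after import matplotlib.pyplot as plt should draw the first 100 positions of the walk.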

2. Using NumPy for Simulating Random Walks

• walk is simply the cumulative sum of the random steps, so it can be evaluated as an array expression.
• Note: we have to use np.random.randint() to generate an array of random integers; random.randint() returns only one number at a time.
• Because the two approaches use different random functions (random.randint() and np.random.randint()), setting the same seed does not give the same result: the two generators work differently internally.
• Here is the plot of the newly generated walk:
• From this we can begin to extract statistics like the minimum and maximum value along the walk’s trajectory:
• A more advanced statistic is the first crossing time: the step at which the random walk first reaches a particular value.
• Let's say we want to know how long it takes the random walk to get at least 10 steps away from the starting point in either direction.
• np.abs(walk) >= 10 gives us a boolean array indicating where the walk has reached or exceeded 10.
• argmax() returns the first index of the maximum value of the array. In a boolean array True is the maximum value, so here it returns the index of the first True value.
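The steps in this section can be sketched with NumPy as follows. This is a minimal sketch rather than the original notebook code: the seed, nsteps, and the names draws, steps, reached, and first_crossing are assumptions made for illustration. Note that the array walk here does not include the starting position 0, so index i holds the position after step i + 1.

```python
import numpy as np

np.random.seed(12345)  # arbitrary seed (assumption), only for repeatable runs

nsteps = 1000  # number of steps (assumption)

# One call produces the whole array of coin flips; the upper bound 2 is exclusive
draws = np.random.randint(0, 2, size=nsteps)
steps = np.where(draws > 0, 1, -1)  # map 0 -> -1 and 1 -> +1

# The walk is the cumulative sum of the steps
walk = steps.cumsum()

# Simple statistics along the trajectory
walk_min = walk.min()
walk_max = walk.max()

# Boolean array: True wherever the walk is at least 10 away from the start
reached = np.abs(walk) >= 10

# argmax() gives the index of the first True; it also returns 0 when there is
# no True at all, so check any() before trusting the result
first_crossing = reached.argmax() if reached.any() else None
```

The any() guard is a design choice added here: without it, a walk that never reaches 10 would misleadingly report a crossing at index 0.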

3. Simulating Many Random Walks at Once

• Say our goal is to generate 5000 random walks at once! How do we do that?
• Out of these walks, let's compute the minimum crossing time to 30 or -30!
• Note that not all of the 5000 walks reach 30 or -30.
• any(1) checks along axis 1 and returns True for each row (each walk) in which at least one value is True.
• Use the Boolean array hist30 to select only the walks that actually hit 30 or -30! Then use argmax() along axis 1 (across the columns of each row) to get the crossing times:
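The approach in this section can be sketched by passing a 2-D size to np.random.randint(), so that each row is one walk. This is a minimal sketch under assumptions: the seed, the choice of 1000 steps per walk, and the names nwalks, draws, steps, walks, and crossing_times are illustrative; only hist30 is taken from the text above.

```python
import numpy as np

np.random.seed(12345)  # arbitrary seed (assumption), only for repeatable runs

nwalks = 5000  # number of walks, as in the text
nsteps = 1000  # steps per walk (assumption)

# Each row holds the coin flips for one walk; the upper bound 2 is exclusive
draws = np.random.randint(0, 2, size=(nwalks, nsteps))
steps = np.where(draws > 0, 1, -1)

# Cumulative sum along axis 1 turns each row of steps into a walk
walks = steps.cumsum(axis=1)

# One boolean per walk: did it ever reach 30 or -30?
hist30 = (np.abs(walks) >= 30).any(axis=1)

# For the walks that did, the index of the first crossing in each row
crossing_times = (np.abs(walks[hist30]) >= 30).argmax(axis=1)

# The minimum crossing time, guarded in case no walk reached the threshold
min_crossing = crossing_times.min() if crossing_times.size else None
```

Here any(axis=1) is the keyword form of any(1). Since each step changes the position by 1, no walk can reach 30 or -30 before its 30th step, so every crossing index is at least 29.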

This lecture note is modified from Chapter 4 of Wes McKinney's Python for Data Analysis 2nd Ed.