LAB 05.01 - Predictions impact

!wget --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/ai4eng.v1/main/content/init.py
import init; init.init(force_download=False); init.get_weblink()
from local.lib.rlxmoocapi import submit, session
session.LoginSequence(endpoint=init.endpoint, course_id=init.course_id, lab_id="L05.01", varname="student");

Task 1. Compute PNL from strategy

observe the following signal s and the model's trend predictions p (the predictions are not perfect!)

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
s = np.round((np.random.normal(size=20)*5+.5).cumsum()+100,2)
p = (np.random.random(size=len(s)-1)>.3).astype(int)
print (s.shape, p.shape)
 
plt.plot(s, color="black")
plt.scatter(np.arange(len(p))[p==0], s[:-1][p==0], color="red", label="prediction down")
plt.scatter(np.arange(len(p))[p==1], s[:-1][p==1], color="blue", label="prediction up")
plt.grid(); plt.legend(); plt.xlabel("time instant"); plt.ylabel("price")
plt.xticks(range(len(s)), range(len(s)))
pd.DataFrame(np.vstack((s,list(p)+[np.nan])), index=["signal", "prediction"])
print ("SIGNAL    ", s)
print ("PREDICTION", p)
SIGNAL     [ 92.88  90.96  94.72  89.5   87.1   94.69  87.64  87.08  85.47  88.02
  90.72  97.44  99.37  94.65  91.76  97.29  93.15  95.55 102.04 106.61]
PREDICTION [1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 0]
[Figure: the signal s (price) plotted against time instant, with red dots marking "prediction down" and blue dots marking "prediction up"]

fill in the pnl variable below with a list of 19 values obtained by applying the same strategy as in the notes, always buying or selling ONE money unit:

  • if the prediction is zero, we believe the price is going down, so we sell ONE money unit at the current price and buy it at the next instant of time

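  • if the prediction is one, we do the opposite: we believe the price is going up, so we buy ONE money unit at the current price and sell it at the next instant of time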

  • BUT there is a commission of 1%, applied on the instant you make the first operation (which uses the current price)

observe that there are 20 signal points, and 19 predictions.

you can use your tool of choice (Excel, Python, etc.) to compute your answer

HINT: Understand each component of the expression for perfect_prediction below to try to obtain your answer with Python.

For instance: the following signal and predictions:

 
from IPython.display import Image
Image("local/imgs/timeseries-prediction.png", width=600)
[Figure: local/imgs/timeseries-prediction.png, an example signal with its predictions]

produce the following set of PNL

 2.65 7.86 -0.31 7.48 2.61 2.19 1.33 -2.08 -2.71 -2.88 0.42 -5.39 3.03 1.53 3.45 9.88 10.70 -7.69 -0.60
  • at t=0 the PNL is \((107.06-103.38)\times 1 - 103.38\times 1 \times .01=2.65\), since the prediction was correct

  • at t=2 the PNL is \((116.84-115.99)\times 1 - 115.99\times 1 \times .01=-0.31\): the prediction was correct, BUT the price difference is small and the commission exceeds the profit.

  • at t=7 the PNL is \((111.76 - 112.71)\times1 - 112.71\times1\times.01=-2.08\), since the prediction was incorrect

in the expressions above, the first term is the profit or loss from the price change, and the second one is the commission. Multiplication by \(1\) simply signals that we are trading ONE unit.
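The three worked values above can be reproduced with a small helper. This is only a minimal sketch: the name trade_pnl and its arguments are hypothetical (they are not part of the lab's code), and it assumes that the prediction at t=0, t=2 and t=7 was 1 (up), which is what the three expressions imply.

```python
def trade_pnl(price_now, price_next, prediction, commission=0.01, units=1):
    """PNL of a single trade (hypothetical helper, not part of the lab)."""
    # prediction 1 -> buy now, sell at the next instant (+1)
    # prediction 0 -> sell now, buy at the next instant (-1)
    direction = 1 if prediction == 1 else -1
    gross = direction * (price_next - price_now) * units
    # the commission is charged on the first operation, at the current price
    fee = price_now * units * commission
    return gross - fee

pnl_t0 = round(trade_pnl(103.38, 107.06, prediction=1), 2)  # correct prediction
pnl_t2 = round(trade_pnl(115.99, 116.84, prediction=1), 2)  # correct, small move
pnl_t7 = round(trade_pnl(112.71, 111.76, prediction=1), 2)  # incorrect prediction
print(pnl_t0, pnl_t2, pnl_t7)
```

The same expression with prediction=0 flips the sign of the first term only; the commission is always subtracted.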

also, observe that the following python code generates a perfect prediction signal which, when applied to our strategy, never loses on the price change itself. The commission can still make the PNL of a small move negative, as in the t=2 example above.

perfect_prediction = (s[1:]>s[:-1]).astype(int)
perfect_prediction
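To see what each piece of that expression does, here is a minimal sketch on a short made-up signal. The array toy and the other variable names are assumptions for illustration only; the 1% commission is the one stated in the task.

```python
import numpy as np

toy = np.array([100.0, 102.0, 101.0, 101.5])   # hypothetical prices

# compare each price with the previous one: True where the price goes up
perfect = (toy[1:] > toy[:-1]).astype(int)

direction = (perfect - 0.5) * 2                # 0/1 -> -1/+1
gross = direction * (toy[1:] - toy[:-1])       # profit before commission
net = gross - 0.01 * toy[:-1]                  # 1% commission on the current price

print("perfect prediction:", perfect)
print("gross PNL:", gross)
print("net PNL:  ", net.round(2))
```

With a perfect prediction every gross value is non-negative, yet the net PNL of the two smaller moves is negative because the commission is larger than the price change.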

CHALLENGE 1 (not mandatory): make your answer in python

hints:

s[1:]            will give you all elements of s except the first one
s[:-1]           will give you all elements of s except the last one
s[1:] - s[:-1]   will give you the price change from each time instant to the next one
(p-0.5)*2        will convert vector p (containing 0's and 1's) into a vector of -1's and +1's

fill in the following variable

pnl = [... ]

submit your answer

student.submit_task(globals(), task_id="task_01");

Task 2: Simulated prediction signal

given the following signal, produce a synthetic prediction signal with the given percentage of correct predictions.

observe that s has length 21, but your synthetic prediction will have a length of 20.

fill in the variable my_synthetic_prediction below with a list of 20 zeros or ones in which a fraction acc of the predictions is correct.

for instance, with the following signal

    [100.37 102.92 102.69 104.57 105.06  97.9  103.   100.32  97.59 107.07
     112.19 106.32 104.14 100.3   97.03 107.28 100.36 100.99 111.48 117.07
     126.04]

the following predictions:

    p = [1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]

achieve a trend prediction accuracy of 60% (acc=0.6), since 12 of the 20 predictions match the actual trend
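The accuracy of this example can be checked by comparing p with the actual trend of the signal. The sketch below is one way to do it; the names s_example, p_example and actual_trend are introduced here for illustration.

```python
import numpy as np

# example signal and predictions copied from the text above
s_example = np.array([100.37, 102.92, 102.69, 104.57, 105.06,  97.9 , 103.  ,
                      100.32,  97.59, 107.07, 112.19, 106.32, 104.14, 100.3 ,
                       97.03, 107.28, 100.36, 100.99, 111.48, 117.07, 126.04])
p_example = np.array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1,
                      1, 0, 0, 1, 1, 0, 0, 1, 0, 0])

# 1 where the price actually went up, 0 where it went down
actual_trend = (s_example[1:] > s_example[:-1]).astype(int)

# fraction of predictions that match the actual trend
accuracy = np.mean(p_example == actual_trend)
print(accuracy)
```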

HINT: Do it in Python

  • use the perfect prediction from the exercise above to start with.

  • use np.random.permutation

for instance:

 
# a list
a = np.r_[10,20,30,40,50,60,70,80,90]

# 3 positions randomly chosen
k = np.random.permutation(len(a))[:3]
print (k)

# changing the value of the items on those positions
a[k] = a[k] + 1
a
[2 4 5]
array([10, 20, 31, 40, 51, 61, 70, 80, 90])
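The same indexing idea works on a vector of zeros and ones, where changing a value means flipping it. The following is a minimal sketch on a made-up vector b (not the lab's signal); the seeded generator is used only so that the sketch is repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)                 # seeded only for repeatability

b = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])   # hypothetical 0/1 vector

# 3 distinct positions randomly chosen
k = rng.permutation(len(b))[:3]

# flip the items on those positions (0 -> 1, 1 -> 0)
flipped = b.copy()
flipped[k] = 1 - flipped[k]

agreement = np.mean(flipped == b)
print(k, flipped, agreement)
```

Because the positions returned by a permutation are distinct, exactly 3 of the 10 items change, so 70% of the items still agree with the original vector.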

your signal and target accuracy to achieve

 
s = ((np.random.normal(size=21)*5+.5).cumsum()+100).round(2)
acc = np.round(np.random.random()*.9+.1, 1)
print ("YOUR SIGNAL", s)
print ("THE ACCURACY YOUR SYNTHETIC PREDICTIONS MUST ACHIEVE: ", acc)
YOUR SIGNAL [ 98.79  96.72  94.71 102.29  97.47  94.66  91.14 100.89 108.59 120.8
 114.14 117.63 115.83 121.9  115.78 123.07 123.68 129.23 123.68 123.83
 120.89]
THE ACCURACY YOUR SYNTHETIC PREDICTIONS MUST ACHIEVE:  0.7
my_synthetic_prediction = [ ... ]

submit your answer

student.submit_task(globals(), task_id="task_02");

Task 3: ML Metric vs Business Metric

now, you are given a signal (length=21) and you will have to create

  • an array of 9 rows x 20 columns with synthetic predictions so that the first row (row number zero in python) has accuracy of 10%, the second has 20%, etc.

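  • a list of 9 numbers containing, for each row of the above array, the total PNL (the sum over the 20 trades, with the same 1% commission as in Task 1) of using that row of synthetic predictions as input for the trading strategy.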

for instance, for this signal:

[101.33,  96.75,  98.2 ,  95.3 ,  97.96,  98.75,  92.46,  82.2 , 78.61,  80.  ,  
  88.78,  98.72, 103.22, 113.65, 103.89, 107.36, 114.6 , 103.9 , 108.71, 104.2 , 107.8 ]

you will have to create the following variables:

pset = np.array([[1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0],
                 [1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0],
                 [1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0],
                 [1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
                 [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
                 [0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
                 [0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1],
                 [0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
                 [0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1]])


pnl = np.array([-121.5, -69.44, -62.90, -46.72, -4.08, -19.04, 23.5, 41.0, 77.02])

NOTE: Specify your PNL rounded to TWO decimal places
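As a sanity check on the example, the sketch below recomputes the accuracy and the PNL of the first and last rows of the example pset. It assumes that each PNL is the sum of the 20 per-trade PNLs with the 1% commission of Task 1; the function name total_pnl and the variable names are hypothetical.

```python
import numpy as np

# example signal and the first and last rows of the example pset above
s_example = np.array([101.33,  96.75,  98.2 ,  95.3 ,  97.96,  98.75,  92.46,
                       82.2 ,  78.61,  80.  ,  88.78,  98.72, 103.22, 113.65,
                      103.89, 107.36, 114.6 , 103.9 , 108.71, 104.2 , 107.8 ])
first_row = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0])
last_row  = np.array([0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1])

def total_pnl(signal, prediction, commission=0.01):
    """Sum of per-trade PNLs, trading ONE unit each time (assumed definition)."""
    direction = (prediction - 0.5) * 2              # 0/1 -> -1/+1
    gross = direction * (signal[1:] - signal[:-1])  # profit from the price change
    fee = commission * signal[:-1]                  # commission on the current price
    return float(np.sum(gross - fee))

actual_trend = (s_example[1:] > s_example[:-1]).astype(int)
for row in (first_row, last_row):
    acc_row = np.mean(row == actual_trend)
    print(acc_row, round(total_pnl(s_example, row), 2))
```

Under that assumption the two rows give accuracies of 0.1 and 0.9 and PNLs of -121.5 and 77.02, matching the first and last entries of the example pnl array.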

s = ((np.random.normal(size=21)*5+.5).cumsum()+100).round(2)
s
# a 9x20 numpy array
pset = 

# 9 elements numpy array or list
pnl  = 

submit your answer

student.submit_task(globals(), task_id="task_03");

understand accuracy vs. PNL

  • what is the minimum accuracy from which a model might be profitable?

  • and what if the commission changes?

 
accuracies = np.linspace(.1,.9,9)
plt.plot(accuracies, pnl)
plt.axhline(0, color="black", lw=2)
plt.title("ML metric vs. Business metric")
plt.grid(); plt.xlabel("model accuracy"); plt.ylabel("PNL")
[Figure: PNL plotted against model accuracy, with a horizontal line at PNL = 0]