LAB 05.01 - Predictions impact
!wget --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/ai4eng.v1/main/content/init.py
import init; init.init(force_download=False); init.get_weblink()
from local.lib.rlxmoocapi import submit, session
session.LoginSequence(endpoint=init.endpoint, course_id=init.course_id, lab_id="L05.01", varname="student");
Task 1. Compute PNL from strategy
observe the following signal s, and model trend predictions p (not perfect predictions!!)
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
s = np.round((np.random.normal(size=20)*5+.5).cumsum()+100,2)
p = (np.random.random(size=len(s)-1)>.3).astype(int)
print (s.shape, p.shape)
plt.plot(s, color="black")
plt.scatter(np.arange(len(p))[p==0], s[:-1][p==0], color="red", label="prediction down")
plt.scatter(np.arange(len(p))[p==1], s[:-1][p==1], color="blue", label="prediction up")
plt.grid(); plt.legend(); plt.xlabel("time instant"); plt.ylabel("price")
plt.xticks(range(len(s)), range(len(s)))
pd.DataFrame(np.vstack((s,list(p)+[np.nan])), index=["signal", "prediction"])
print ("SIGNAL ", s)
print ("PREDICTION", p)
SIGNAL [ 92.88 90.96 94.72 89.5 87.1 94.69 87.64 87.08 85.47 88.02
90.72 97.44 99.37 94.65 91.76 97.29 93.15 95.55 102.04 106.61]
PREDICTION [1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 1 0]
fill in the pnl variable below with a list of 19 values resulting from applying the same strategy as in the notes, always buying or selling ONE money unit:
if the prediction is zero, we believe the price is going down, so we sell ONE money unit at the current price and buy it at the next instant of time
if the prediction is one, we do the opposite
BUT there is a commission of 1%, applied on the instant you make the first operation (which uses the current price)
observe that there are 20 signal points, and 19 predictions.
you can use your tool of choice (Excel, Python, etc.) to compute your answer
HINT: Understand each component of the expression for perfect_prediction below to try to obtain your answer with Python.
For instance: the following signal and predictions:
from IPython.display import Image
Image("local/imgs/timeseries-prediction.png", width=600)
produce the following set of PNL
2.65 7.86 -0.31 7.48 2.61 2.19 1.33 -2.08 -2.71 -2.88 0.42 -5.39 3.03 1.53 3.45 9.88 10.70 -7.69 -0.60
at t=0 the PNL is \((107.06-103.38)\times 1 - 103.38\times 1 \times .01=2.65\), since the prediction was correct
at t=2 the PNL is \((116.84-115.99)\times 1 - 115.99\times 1 \times .01=-0.31\), since the prediction was correct, BUT the price difference is small and the commission overcomes the profit.
at t=7 the PNL is \((111.76 - 112.71)\times 1 - 112.71\times 1\times .01=-2.08\), since the prediction was incorrect
in the expressions above, the first term is the net profit or loss, and the second one is due to the commission. Multiplication by \(1\) simply signals we are trading ONE unit.
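also, observe that the following python code will generate a perfect prediction signal which, when applied to our strategy, yields a non-negative profit on every trade before the commission is subtracted (as the t=2 example shows, the commission can still make the PNL negative).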
perfect_prediction = (s[1:]>s[:-1]).astype(int)
perfect_prediction
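To make the arithmetic of the three worked examples concrete, here is a minimal sketch of the PNL of a single trade. The helper name trade_pnl and its parameters are hypothetical (they are not part of the lab code). It assumes the prediction was 1 (buy) at t=0, t=2 and t=7, which is what the expressions above imply.

```python
def trade_pnl(price_now, price_next, prediction, commission=0.01, units=1):
    """PNL of one trade (hypothetical helper, not part of the lab library)."""
    # prediction 1: buy now, sell next; prediction 0: sell now, buy next
    direction = 1 if prediction == 1 else -1
    gross = (price_next - price_now) * direction * units
    # the commission is charged on the first operation, at the current price
    fee = price_now * units * commission
    return gross - fee

print(round(trade_pnl(103.38, 107.06, 1), 2))  # t=0 -> 2.65
print(round(trade_pnl(115.99, 116.84, 1), 2))  # t=2 -> -0.31
print(round(trade_pnl(112.71, 111.76, 1), 2))  # t=7 -> -2.08
```

With a prediction of 0 the sign of the price difference is reversed, while the commission term stays the same.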
CHALLENGE 1 (not mandatory): compute your answer in python
hints:
s[1:] will give you all elements of s except the first one
s[:-1] will give you all elements of s except the last one
s[1:] - s[:-1] will give you the price difference between each time instant and the next one
(p-0.5)*2 will convert vector p (containing 0's and 1's) into a vector of -1's and +1's
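The hints above can be tried on a tiny made-up example. This is only a sketch: the arrays s_toy and p_toy are invented for illustration, and the commission is deliberately left out so that each piece can be inspected on its own.

```python
import numpy as np

# invented toy data: 4 prices, 3 predictions
s_toy = np.array([100., 102., 101., 105.])
p_toy = np.array([1, 0, 0])

diff = s_toy[1:] - s_toy[:-1]        # price change to the next instant
direction = (p_toy - 0.5) * 2        # 1 -> +1 (buy), 0 -> -1 (sell)
gross = diff * direction             # profit or loss before commission

print(diff)       # [ 2. -1.  4.]
print(direction)  # [ 1. -1. -1.]
print(gross)      # [ 2.  1. -4.]

# with the perfect prediction every gross value is non-negative
perfect_toy = (s_toy[1:] > s_toy[:-1]).astype(int)
gross_perfect = diff * (perfect_toy - 0.5) * 2
print(perfect_toy, gross_perfect)
```

The last toy prediction is wrong (the price rises from 101 to 105 but the prediction is 0), which is why its gross value is negative.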
fill in the following variable
pnl = [... ]
submit your answer
student.submit_task(globals(), task_id="task_01");
Task 2: Simulated prediction signal
given the following signal, produce a synthetic prediction signal with the given percentage of correct predictions.
observe that s has length 21, but your synthetic prediction will have a length of 20.
fill in the variable my_synthetic_prediction with a list of 20 zeros or ones, so that a fraction acc of the predictions is correct.
for instance, with the following signal
[100.37 102.92 102.69 104.57 105.06 97.9 103. 100.32 97.59 107.07
112.19 106.32 104.14 100.3 97.03 107.28 100.36 100.99 111.48 117.07
126.04]
the following predictions:
p = [1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
produce a trend prediction accuracy of 60% (acc=0.6)
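The 60% figure can be checked by comparing the example predictions with the actual trend of the example signal. The sketch below copies both from the example above; the variable names s_example, p_example and actual_trend are illustrative choices.

```python
import numpy as np

s_example = np.array([100.37, 102.92, 102.69, 104.57, 105.06, 97.9, 103., 100.32, 97.59, 107.07,
                      112.19, 106.32, 104.14, 100.3, 97.03, 107.28, 100.36, 100.99, 111.48, 117.07,
                      126.04])
p_example = np.array([1, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0])

# 1 where the price goes up at the next instant, 0 otherwise
actual_trend = (s_example[1:] > s_example[:-1]).astype(int)

# fraction of predictions that match the actual trend
accuracy = np.mean(p_example == actual_trend)
print(accuracy)  # 0.6 (12 of 20 predictions are correct)
```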
HINT: Do it in Python
use the perfect prediction from the exercise above to start with.
use np.random.permutation, for instance:
# a list
a = np.r_[10,20,30,40,50,60,70,80,90]
# 3 positions randomly chosen
k = np.random.permutation(len(a)-1)[:3]
print (k)
# changing the value of the items on those positions
a[k] = a[k] + 1
a
[2 4 5]
array([10, 20, 31, 40, 51, 61, 70, 80, 90])
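One possible way to combine both hints, sketched on the example signal with acc=0.6 rather than on your own signal: start from the perfect prediction and flip a randomly chosen subset of positions. The names acc_target, n_wrong, flip and synthetic are illustrative, and the approach assumes that no two consecutive prices are equal.

```python
import numpy as np

s_example = np.array([100.37, 102.92, 102.69, 104.57, 105.06, 97.9, 103., 100.32, 97.59, 107.07,
                      112.19, 106.32, 104.14, 100.3, 97.03, 107.28, 100.36, 100.99, 111.48, 117.07,
                      126.04])
acc_target = 0.6

# perfect prediction: 1 where the price rises at the next instant
perfect = (s_example[1:] > s_example[:-1]).astype(int)

# number of predictions that must be wrong to reach the target accuracy
n_wrong = int(round((1 - acc_target) * len(perfect)))

# random positions to turn into wrong predictions
flip = np.random.permutation(len(perfect))[:n_wrong]
synthetic = perfect.copy()
synthetic[flip] = 1 - synthetic[flip]   # 0 becomes 1 and 1 becomes 0

print(synthetic)
print(np.mean(synthetic == perfect))    # fraction of correct predictions
```

Because the positions are drawn at random, each run gives a different prediction vector with the same accuracy.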
your signal and target accuracy to achieve
s = ((np.random.normal(size=21)*5+.5).cumsum()+100).round(2)
acc = np.round(np.random.random()*.9+.1, 1)
print ("YOUR SIGNAL", s)
print ("THE ACCURACY YOUR SYNTHETIC PREDICTIONS MUST ACHIEVE: ", acc)
YOUR SIGNAL [ 98.79 96.72 94.71 102.29 97.47 94.66 91.14 100.89 108.59 120.8
114.14 117.63 115.83 121.9 115.78 123.07 123.68 129.23 123.68 123.83
120.89]
THE ACCURACY YOUR SYNTHETIC PREDICTIONS MUST ACHIEVE: 0.7
my_synthetic_prediction = [ ... ]
submit your answer
student.submit_task(globals(), task_id="task_02");
Task 3: ML Metric vs Business Metric
now, you are given a signal (length=21) and you will have to create
an array of 9 rows x 20 columns with synthetic predictions so that the first row (row number zero in python) has an accuracy of 10%, the second has 20%, etc.
a list of 9 numbers containing the total PNL obtained by using each row of the above array as input for the trading strategy.
for instance, for this signal:
[101.33, 96.75, 98.2 , 95.3 , 97.96, 98.75, 92.46, 82.2 , 78.61, 80. ,
88.78, 98.72, 103.22, 113.65, 103.89, 107.36, 114.6 , 103.9 , 108.71, 104.2 , 107.8 ]
you will have to create the following variables:
pset = np.array([[1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0],
[1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0],
[1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0],
[1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
[1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0],
[0, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0],
[0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1],
[0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1],
[0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0, 1]])
pnl = np.array([-121.5, -69.44, -62.90, -46.72, -4.08, -19.04, 23.5, 41.0, 77.02])
NOTE: Specify your PNL rounded to TWO decimal places
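As a sanity check on the example, the sketch below recomputes the accuracy and the total PNL of the first row of the example pset. It assumes the same strategy and 1% commission as in Task 1; under that assumption the result matches the first entry of the example pnl. The variable names are illustrative.

```python
import numpy as np

s_example = np.array([101.33, 96.75, 98.2, 95.3, 97.96, 98.75, 92.46, 82.2, 78.61, 80.,
                      88.78, 98.72, 103.22, 113.65, 103.89, 107.36, 114.6, 103.9, 108.71,
                      104.2, 107.8])
# first row of the example pset (target accuracy 10%)
row0 = np.array([1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0])

# accuracy of the row against the actual trend
actual_trend = (s_example[1:] > s_example[:-1]).astype(int)
row_accuracy = np.mean(row0 == actual_trend)

# per-trade PNL: signed price difference minus 1% commission on the current price
direction = (row0 - 0.5) * 2
per_trade = (s_example[1:] - s_example[:-1]) * direction - s_example[:-1] * 0.01
total = np.round(per_trade.sum(), 2)

print(row_accuracy, total)  # 0.1 -121.5
```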
s = ((np.random.normal(size=21)*5+.5).cumsum()+100).round(2)
s
# a 9x20 numpy array
pset = ...
# 9 elements numpy array or list
pnl = ...
submit your answer
student.submit_task(globals(), task_id="task_03");
understand accuracy vs. PNL
what is the minimum accuracy from which a model might be profitable?
and what if the commission changes?
accuracies = np.linspace(.1,.9,9)
plt.plot(accuracies, pnl)
plt.axhline(0, color="black", lw=2)
plt.title("ML metric vs. Business metric")
plt.grid(); plt.xlabel("model accuracy"); plt.ylabel("PNL")
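To explore how the commission shifts the break-even accuracy, one option is to average the total PNL over many synthetic predictions for several commission rates. This is only a sketch under stated assumptions: the helpers total_pnl and synthetic_prediction are hypothetical names, the signal is generated with the same recipe as above but from a fixed seed, and the 200 repetitions and the commission values are arbitrary choices.

```python
import numpy as np

def total_pnl(s, p, commission=0.01):
    """Total PNL of trading ONE unit at every instant (hypothetical helper)."""
    s = np.asarray(s, dtype=float)
    direction = (np.asarray(p) - 0.5) * 2          # 1 -> +1, 0 -> -1
    return float(np.sum((s[1:] - s[:-1]) * direction - s[:-1] * commission))

def synthetic_prediction(s, acc, rng):
    """Perfect prediction with a random subset flipped to reach accuracy acc."""
    s = np.asarray(s, dtype=float)
    pred = (s[1:] > s[:-1]).astype(int)
    n_wrong = int(round((1 - acc) * len(pred)))
    wrong = rng.permutation(len(pred))[:n_wrong]
    pred[wrong] = 1 - pred[wrong]
    return pred

rng = np.random.default_rng(0)                     # fixed seed, arbitrary
s = ((rng.normal(size=21) * 5 + .5).cumsum() + 100).round(2)
accuracies = np.linspace(.1, .9, 9)

for commission in (0.0, 0.01, 0.02):
    # average over 200 random predictions per accuracy level
    mean_pnl = [np.mean([total_pnl(s, synthetic_prediction(s, a, rng), commission)
                         for _ in range(200)])
                for a in accuracies]
    print(commission, np.round(mean_pnl, 2))
```

For a fixed prediction, raising the commission lowers the total PNL by the same amount regardless of accuracy, so the accuracy at which the averaged curve crosses zero moves to the right.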