LAB 03.02 - Timeseries model
A synthetic timeseries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

date_split = "2018-03-01"  # train/test split date
idx = pd.date_range("2018-01-01", "2018-03-31", freq="6h")
# random start and end points, so each run produces a different signal
i = np.linspace(np.random.random()*5-5, np.random.random()*5+2, len(idx))
t = np.log(i**2+.3)*np.cos(4*i)
t += np.random.normal(size=len(idx))*.4  # additive gaussian noise
t = np.round(t*3+30, 3)
d = pd.DataFrame({"signal": t}, index=idx)
d.index.name = "date"
plt.figure(figsize=(15,3))
plt.plot(d[:date_split].index, d[:date_split].signal, color="black", lw=2, label="train")
plt.plot(d[date_split:].index, d[date_split:].signal, color="red", lw=2, label="test")
plt.axvline(pd.Timestamp(date_split), color="grey"); plt.legend(); plt.grid()
signal = d

Task 1. Build a time series training dataset
In this task, starting from the time signal above, you must build an annotated dataset so that, at any time instant, given the last n_timesteps_lookback signal values and the current one, we want to predict the next one.
Complete the following function so that, when receiving a time indexed dataframe such as the one above, the resulting dataframe is such that:
- the column signal is left untouched.
- there are n_timesteps_lookback+1 new columns: the column signal+1 contains the signal one time step into the future, and the columns signal-1, signal-2, etc. contain the signal one, two, etc. time steps into the past.
- the resulting dataset contains n_timesteps_lookback+1 fewer rows than the original dataset: one due to the signal+1 column and the rest due to the signal-x columns. For instance, if the original dataset contained 357 rows, with n_timesteps_lookback=3 the resulting dataframe will contain 353 rows.
Hint: use pandas.DataFrame.join, pandas.DataFrame.shift and pandas.DataFrame.dropna.
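The hint can be illustrated with a minimal sketch, which is one possible approach and not necessarily the intended solution. The function name make_timeseries_dataset is the one used later in this lab; its exact signature, the column order of the result and the toy dataframe are assumptions made here for illustration.

```python
import numpy as np
import pandas as pd

def make_timeseries_dataset(signal, n_timesteps_lookback):
    # Start from a copy holding only the untouched 'signal' column.
    r = signal[["signal"]].copy()
    # shift(k) moves values k rows down, so each row sees the value k steps in the past.
    for k in range(1, n_timesteps_lookback + 1):
        r = r.join(signal["signal"].shift(k).rename(f"signal-{k}"))
    # shift(-1) moves values one row up, so each row sees the next value.
    r = r.join(signal["signal"].shift(-1).rename("signal+1"))
    # Rows at both ends have NaNs introduced by the shifts and are dropped.
    return r.dropna()

# Hypothetical toy input: 10 consecutive values sampled every 6 hours.
toy_idx = pd.date_range("2018-01-01", periods=10, freq="6h")
toy = pd.DataFrame({"signal": np.arange(10, dtype=float)}, index=toy_idx)
toy.index.name = "date"

td = make_timeseries_dataset(toy, n_timesteps_lookback=3)
print(td)
```

With n_timesteps_lookback=3 the first three rows are lost to the signal-x columns and the last row to signal+1, so the 10 toy rows become 6.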
For instance, with this input
you should produce the following output
test your code
submit your answer
Task 2. Manually apply a regression model to create predictions
Complete the following function to apply a linear regression model to a dataframe such as the one resulting from the previous task:
where s corresponds to the column named signal, s−1 to the column named signal-1, etc.
Observe that:
- the column signal+1 is not used, as it is the expected prediction. You will use it in the next task.
- you will have n_timesteps_lookback+2 w parameters: one for each of the n_timesteps_lookback lookback columns, plus w0, plus w1.
The function receives the following arguments:
- td: a Pandas dataframe such as the output of the function make_timeseries_dataset of the previous task, with exactly the same column names.
- w: a Numpy array with n_timesteps_lookback+2 elements in the order [w0,w1,...]
Warning: the DataFrame td may contain any number of lookback columns, and its columns might be in any order.
Challenge: solve it with one single line of Python code
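As a rough sketch of how such a function might look, the code below assumes the model has the form prediction = w0 + w1*signal + w2*signal-1 + w3*signal-2 + ..., which is inferred from the ordering [w0,w1,...] and is not confirmed by this text. The function name apply_linear_model, the pd.Series return type and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

def apply_linear_model(td, w):
    # Hypothetical helper; assumes w = [w0, w1 (signal), w2 (signal-1), w3 (signal-2), ...].
    w = np.asarray(w, dtype=float)
    n_lookback = len(w) - 2
    # Select columns by name so the result does not depend on the column order in td.
    cols = ["signal"] + [f"signal-{k}" for k in range(1, n_lookback + 1)]
    # Intercept plus the weighted sum of the current and past values; 'signal+1' is never used.
    return pd.Series(w[0] + td[cols].values @ w[1:], index=td.index)

# Hypothetical toy input with deliberately shuffled columns.
toy_td = pd.DataFrame({
    "signal-2": [1.0, 2.0],
    "signal+1": [4.0, 5.0],
    "signal":   [3.0, 4.0],
    "signal-1": [2.0, 3.0],
})
toy_w = np.array([0.5, 1.0, 2.0, 3.0])

preds = apply_linear_model(toy_td, toy_w)
print(preds)
```

Under the assumed ordering, the first row evaluates to 0.5 + 1.0*3 + 2.0*2 + 3.0*1 = 10.5. Selecting the columns by name, rather than by position, is what makes the sketch robust to reordered columns.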
EXAMPLE: For the following dataframe and w
you should get the following results
Task 3: Measure trend prediction
You will now use the predictions to measure trend accuracy. We will compare any predictions which are given to us with the actual next value in column signal+1 in the following way:
- if signal+1>signal and ALSO your prediction>signal, then your model has a correct trend prediction, regardless of how different signal+1 and the prediction are.
- if signal+1<=signal and ALSO your prediction<=signal, then your model has a correct trend prediction, regardless of how different signal+1 and the prediction are.
- otherwise, your model has an incorrect trend prediction.
Complete the following function so that, when receiving a dataframe such as the one resulting from Task 1 above and a pd.Series with the same index containing the signal predictions, it computes the accuracy of the predictions (the fraction of correct predictions, between 0 and 1).
The accuracy must be a float and its correctness will be checked up to 2 decimal places.
Challenge: solve it with one single line of Python code.
EXAMPLE: for the following time series dataset
And the following predictions
The trend accuracy is 0.4, since the trend is correctly hit by the predictions on rows 1, 4, 7 and 8 (assuming the first row is numbered 0).
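The trend rules of this task can be sketched as follows. This is one possible formulation under stated assumptions: the function name measure_trend_accuracy is hypothetical, and the toy values are invented for illustration and are unrelated to the example above.

```python
import pandas as pd

def measure_trend_accuracy(td, preds):
    # Hypothetical helper: a prediction is correct when it falls on the same side
    # of 'signal' as the actual next value 'signal+1'.
    actual_up = td["signal+1"] > td["signal"]
    predicted_up = preds > td["signal"]
    # Both up, or both not up (<=), counts as a hit; the mean of the hits is the accuracy.
    return float((actual_up == predicted_up).mean())

# Hypothetical toy data, only the two columns the rule needs.
toy_td = pd.DataFrame({
    "signal":   [10.0, 10.0, 10.0, 10.0, 10.0],
    "signal+1": [11.0,  9.0, 10.0, 12.0,  8.0],
})
toy_preds = pd.Series([10.5, 10.5, 9.0, 9.5, 10.0], index=toy_td.index)

acc = measure_trend_accuracy(toy_td, toy_preds)
print(acc)
```

In the toy data the trend is hit on rows 0, 2 and 4, so the accuracy is 3/5 = 0.6. Comparing the two boolean series with == covers both "correct" cases of the rule at once, since a mismatch is exactly the "otherwise" case.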
test your code manually
submit your answer