!wget -nc --no-cache -O init.py -q https://raw.githubusercontent.com/rramosp/2021.deeplearning/main/content/init.py
import init; init.init(force_download=False)
from local.lib.rlxmoocapi import submit, session
session.LoginSequence(endpoint=init.endpoint, course_id=init.course_id, lab_id="L04.03", varname="student");
import pandas as pd
import matplotlib.pyplot as plt
from skimage import io
import numpy as np
%matplotlib inline
from IPython.display import Image

TASK 1: Create anchor boxes
Observe how we download the bounding-box annotations of the Open Images V6 dataset and extract the first 20K of them, out of the total 14M available. We are interested in the width and height of each box, which we obtain by subtracting the box coordinates, according to the dataset's annotation description. Recall that the coordinates are relative to the image size.
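As a small illustration of the subtraction described above, the sketch below uses a made-up two-row table with the same column names as the annotation file. The values and the 640x480 image size are hypothetical and only serve to show that relative widths and heights scale to pixels by multiplying by the image size.

```python
import numpy as np
import pandas as pd

# Hypothetical annotations, with coordinates relative to the image size
toy = pd.DataFrame({
    "XMin": [0.10, 0.50],
    "XMax": [0.40, 0.75],
    "YMin": [0.20, 0.00],
    "YMax": [0.30, 1.00],
})

# Relative width and height of each box
toy_w = (toy.XMax - toy.XMin).values
toy_h = (toy.YMax - toy.YMin).values

# Assumed image size in pixels (not taken from the dataset)
img_w, img_h = 640, 480
toy_w_px = toy_w * img_w
toy_h_px = toy_h * img_h
print(toy_w, toy_h)
print(toy_w_px, toy_h_px)
```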
!wget -nc https://storage.googleapis.com/openimages/v6/oidv6-train-annotations-bbox.csv
!wc oidv6-train-annotations-bbox.csv
!head -5 oidv6-train-annotations-bbox.csv
The X numpy array built below contains the width and height of each bounding box.
!head -20001 oidv6-train-annotations-bbox.csv > oidv6-train-annotations-bbox-20k.csv
d = pd.read_csv('oidv6-train-annotations-bbox-20k.csv')
w = (d.XMax-d.XMin).values
h = (d.YMax-d.YMin).values
X = np.r_[[w,h]].T
X[:6]
We can paint a sample of them:
from matplotlib.patches import Rectangle
plt.figure(figsize=(5,5));
ax = plt.subplot(111)
for w,h in np.random.permutation(X)[:25]:
    ax.add_patch(Rectangle((0.5-w/2,0.5-h/2),w,h, linewidth=2,edgecolor='r',facecolor='none'))

Complete the following function so that it creates n anchor boxes from the bounding boxes in X using sklearn.cluster.KMeans, with n_clusters set to the number of anchor boxes desired. After fitting KMeans, return the cluster centers. Pass the random_state argument on to KMeans.
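To get a feel for the KMeans API before writing the function, here is a minimal sketch on synthetic data. The two blobs of (width, height) pairs are made up, and n_init=10 is set explicitly only to keep the behavior the same across scikit-learn versions; neither is part of the task.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Made-up (width, height) pairs: one group of small boxes, one of large boxes
small = rng.normal(loc=[0.10, 0.10], scale=0.01, size=(50, 2))
large = rng.normal(loc=[0.60, 0.50], scale=0.01, size=(50, 2))
toy_X = np.vstack([small, large])

# Fit KMeans and read the cluster centers, one (w, h) row per cluster
toy_km = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_X)
toy_centers = toy_km.cluster_centers_

# Sort the centers by area so the output order is deterministic
toy_centers = toy_centers[np.argsort(toy_centers.prod(axis=1))]
print(toy_centers.round(2))
```

Each row of cluster_centers_ is itself a (width, height) pair, which is why the centers can be used directly as anchor boxes.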
def get_anchor_boxes(X, n, random_state=0):
    from sklearn.cluster import KMeans
    # YOUR CODE HERE
    km = ...
    return ...
You can visualize your anchor boxes. You should get something similar to this:
Image("local/imgs/anchor_boxes.png", width=300)
from matplotlib.patches import Rectangle
anchors = get_anchor_boxes(X, n=10, random_state=0)
plt.figure(figsize=(5,5));
ax = plt.subplot(111)
for w,h in anchors:
    ax.add_patch(Rectangle((0.5-w/2,0.5-h/2),w,h, linewidth=2,edgecolor='r',facecolor='none'))
Submit your solution online
student.submit_task(namespace=globals(), task_id='T1');

TASK 2: Get closest anchor
Assume we have these anchor boxes, sorted by size:
kc = np.array([0.03612632, 0.05025544, 0.0982887 , 0.1392435 , 0.11913009,
0.28577818, 0.32945173, 0.23846835, 0.18874274, 0.48914381,
0.25347843, 0.77500826, 0.45506799, 0.51589807, 0.83168319,
0.39802428, 0.5539543 , 0.86824085, 0.93553054, 0.89561131]).reshape(10,2)
kc = kc[np.argsort(np.prod(kc, axis=1))]
anchors = pd.DataFrame(kc, columns=['w', 'h'])
anchors
Complete the following function so that, given a bounding box XMin, XMax, YMin, YMax and a dataframe of anchors such as the one built from kc above, it returns the index of the anchor most similar to the bounding box.
Recall that all values are in the [0, 1] range.
Given two boxes (bounding box and anchor), each described by its width and height, we define their similarity measure as:
Your return value must be an integer between 0 and 9.
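As an illustration only, and not necessarily the measure intended above, one common way to compare two boxes by width and height alone is the intersection over union (IoU) they would have if they shared the same center. The sketch below assumes that measure; the function name centered_iou, the toy anchors, and the box size are all hypothetical.

```python
import numpy as np

def centered_iou(w0, h0, w1, h1):
    """IoU of two boxes assumed to share the same center (illustrative measure)."""
    inter = np.minimum(w0, w1) * np.minimum(h0, h1)
    union = w0 * h0 + w1 * h1 - inter
    return inter / union

# Hypothetical anchors, one (w, h) row each, and a hypothetical box
toy_anchors = np.array([[0.1, 0.1], [0.2, 0.4], [0.5, 0.5]])
box_w, box_h = 0.25, 0.35

# Score the box against every anchor at once and keep the best index
scores = centered_iou(box_w, box_h, toy_anchors[:, 0], toy_anchors[:, 1])
best = int(np.argmax(scores))
print(scores.round(3), best)
```

Whatever measure is used, the pattern is the same: compute one score per anchor, then take the position of the best score as a plain Python integer.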
def get_closest_anchor_box(XMin, XMax, YMin, YMax, anchors):
    anchor_index = ...
    return anchor_index
You can visualize random annotations and their corresponding anchor boxes, as chosen by your function, with the code below. You should see something like this:
Image("local/imgs/annotations-anchors.png", width=600)
plt.figure(figsize=(12,4));
for i in range(3):
    b = d.iloc[np.random.randint(len(d))]
    bx, by = b.XMin + (b.XMax-b.XMin)/2, b.YMin + (b.YMax-b.YMin)/2
    bw, bh = b.XMax - b.XMin, b.YMax - b.YMin
    pw, ph = anchors.values[get_closest_anchor_box(b.XMin, b.XMax, b.YMin, b.YMax, anchors)]
    ax = plt.subplot(1,3,i+1)
    ax.add_patch(Rectangle((bx-bw/2,by-bh/2),bw,bh, linewidth=2,edgecolor='r',facecolor='none', label="annotation"))
    ax.add_patch(Rectangle((bx-pw/2,by-ph/2),pw,ph, linewidth=2,edgecolor='b',facecolor='none', label="anchor box"))
    plt.scatter(bx, by, color="black", label="object center")
    plt.grid(); plt.legend()
    plt.xlim(0,1); plt.ylim(0,1);
Submit your solution online
student.submit_task(namespace=globals(), task_id='T2');

TASK 3: Compute desired model bounding box predictions
Understand the schema of YOLO coordinates below
$b_w$, $b_h$ are the width and height of the annotation we want the model to predict
$b_x$, $b_y$ are the $x$ and $y$ coordinates of the center of the annotation we want the model to predict
$p_w$, $p_h$ are the width and height of its closest anchor box
$c_x$, $c_y$ are the $x$ and $y$ coordinates of the TOP LEFT corner of the image cell responsible for detecting the annotation
Image("local/imgs/yolo_predictions.png", width=400)
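For readers without the figure at hand, the relations it is assumed to depict are the usual YOLO parametrization, written here with $b$ for the annotation box, $p$ for the anchor box, $c$ for the top left corner of the responsible cell, and $t$ for the raw model outputs. This is an assumption about the figure, chosen because it agrees with the check values given further down, where the cell corner is expressed in image-relative units.

$$
\begin{aligned}
b_x &= \sigma(t_x) + c_x, & b_y &= \sigma(t_y) + c_y,\\
b_w &= p_w\, e^{t_w}, & b_h &= p_h\, e^{t_h},
\end{aligned}
\qquad \sigma(t) = \frac{1}{1 + e^{-t}}
$$

Under this reading, the targets are obtained by inverting each relation: a logit for the center offsets and a logarithm for the size ratios.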
Complete the following function so that, when given $b_x$, $b_y$, $b_w$, $b_h$, $p_w$, $p_h$, $n_w$, $n_h$, it returns:
$n_x$, $n_y$: the cell number in which the annotation center ($b_x$, $b_y$) falls.
$t_x$, $t_y$, $t_w$, $t_h$: the desired model predictions according to the figure above.
$n_w$ and $n_h$ specify the grid size as the number of cells wide and the number of cells high.
def get_model_target_predictions(bx, by, bw, bh, pw, ph, nw, nh):
    # assume all x,y,w,h are in the [0,1] range, and nw, nh > 2
    nx = ...
    ny = ...
    tx = ...
    ty = ...
    tw = ...
    th = ...
    return nx, ny, tx, ty, tw, th
Check your code. For the following values, you should get:
$n_x$, $n_y$ = 1, 4
$t_x$, $t_y$ = -2.63, -3.89
$t_w$, $t_h$ = -0.92, 0.14
Make sure the values make sense: why is each one positive or negative?
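One way to sanity-check such values is to run them in the opposite direction, from model outputs back to a box. The sketch below assumes the usual YOLO decoding (sigmoid offset added to the cell's top left corner, anchor size scaled by an exponential), with the cell corner taken as nx/nw and ny/nh in image-relative units. The helper name decode_prediction is hypothetical and not part of the lab code.

```python
import numpy as np

def decode_prediction(nx, ny, tx, ty, tw, th, pw, ph, nw, nh):
    """Map cell indices and raw outputs back to a box (assumed YOLO-style decoding)."""
    cx, cy = nx / nw, ny / nh          # top left corner of the responsible cell
    sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))
    bx = cx + sigmoid(tx)              # center offset is always positive
    by = cy + sigmoid(ty)
    bw = pw * np.exp(tw)               # tw < 0 shrinks the anchor, tw > 0 enlarges it
    bh = ph * np.exp(th)
    return bx, by, bw, bh

# Feed in the rounded check values listed above
decoded = decode_prediction(1, 4, -2.63, -3.89, -0.92, 0.14,
                            pw=0.05, ph=0.13, nw=7, nh=5)
print(np.round(decoded, 2))  # recovers roughly bx=0.21, by=0.82, bw=0.02, bh=0.15
```

Read this way, a negative $t_w$ means the annotation is narrower than its anchor, a positive $t_h$ means it is taller, and strongly negative $t_x$, $t_y$ mean the center sits close to the cell's top left corner.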
nw, nh = 7, 5
bx, by = 0.21, 0.82
bw, bh = 0.02, 0.15
pw, ph = 0.05, 0.13
get_model_target_predictions(bx, by, bw, bh, pw, ph, nw, nh)
Try with other cases generated randomly:
nw, nh = np.random.randint(6, size=2)+5
bx, by = np.round(np.random.random(size=2)*.4+.3,3)
bw, bh = np.round(np.random.random(size=2)*.25,3)
pw, ph = np.round(np.r_[bw, bh] * (1+np.random.random(size=2)*0.3-0.15),3)
print ("inputs", nw, nh, bx, by, bw, bh, pw, ph)
get_model_target_predictions(bx, by, bw, bh, pw, ph, nw, nh)
Submit your solution online
student.submit_task(namespace=globals(), task_id='T3');