Skip to content

Commit

Permalink
updates
Browse files Browse the repository at this point in the history
  • Loading branch information
lrhgit committed Jan 27, 2017
1 parent 2e0416c commit 07a9aba
Showing 1 changed file with 212 additions and 4 deletions.
216 changes: 212 additions & 4 deletions sensitivity_introduction.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@
"<!-- dom:AUTHOR: Leif Rune Hellevik at Department of Structural Engineering, NTNU -->\n",
"<!-- Author: --> **Leif Rune Hellevik**, Department of Structural Engineering, NTNU\n",
"\n",
"Date: **Jan 17, 2017**"
"Date: **Jan 27, 2017**"
]
},
{
Expand Down Expand Up @@ -937,9 +937,217 @@
" error.append(LA.norm((s_pc-s**2)/s**2))\n",
" \n",
"plt.figure()\n",
"plt.semilogy(Npc_list,error)\n",
"plt.xlabel('Nr samples')\n",
"plt.ylabel('L2-norm of error in Sobol indices')"
"plt.semilogy(Npc_list,error);\n",
"plt.xlabel('Nr samples');\n",
"plt.ylabel('L2-norm of error in Sobol indices');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conditional variances\n",
"\n",
"As noted previously, the importance of a factor $Z_i$ is manifested\n",
"the existence of a 'shape' or 'pattern' in the model outputs\n",
"$Y$. Conversely, a uniform cloud of output points $Y$ as a function of\n",
"$Z_i$ is a symptom, albeit not a proof, indicating that $Z_i$ is a\n",
"noninfluential factor. In this section we seek to demonstrate that\n",
"conditional variances is a usefull means to quantify the 'shape' or\n",
"'pattern' in the outputs.\n",
"\n",
"The shape in the outputs $Y$ for a given $Z_i$, may be seen in the\n",
"scatterplot as of $Y$ versus $Z_i$. In particular, we may cut the\n",
"$Z_i$-axis into slices and assess how the distribution of the outputs\n",
"$Y$ changes from slice to slice. This is illustrated in the code\n",
"snippet below, where the slices are identified with vertical dashed\n",
"lines at equidistant locations on each $Z_i$-axis, $i=1, \\ldots,4$."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# # Scatter plots of data, z-slices, and linear model \n",
"plt.figure()\n",
"\n",
"Ndz=10 #Number of slices of the Z-axes\n",
"\n",
"Zslice = np.zeros((r,Ndz)) # array for mean-values in the slices\n",
"ZBndry = np.zeros((r,Ndz+1)) # array for boundaries of the slices\n",
"dz=np.zeros(r)\n",
"\n",
"for k in range(r):\n",
" plt.subplot(2,2,k+1)\n",
" \n",
" zmin=np.min(Z[k,:]); zmax=np.max(Z[k,:]) # each Z[k,:] may have different extremas\n",
" dz[k] = (zmax-zmin )/Ndz\n",
" \n",
" ZBndry[k,:] = np.linspace(zmin,zmax, Ndz+1)\n",
" Zslice[k,:] = np.linspace(zmin+dz[k]/2.,zmax-dz[k]/2., Ndz)\n",
" \n",
" # Plot the slices\n",
" for i in range(Ndz):\n",
" plt.axvline(ZBndry[k,i], np.amin(Y), np.amax(Y),linestyle='--', color='.75')\n",
" \n",
" # Plot the data\n",
" plt.plot(Z[k,:],Y[:],'.')\n",
" xlbl='Z'+str(k)\n",
" plt.xlabel(xlbl)\n",
" plt.ylabel('Y') \n",
"\n",
" Ymodel=linear_model(w[k],Zslice[k,:],Ndz)\n",
" \n",
" plt.plot(Zslice[k,:],Ymodel)\n",
" \n",
" ymin=np.amin(Y); ymax=np.amax(Y)\n",
" plt.ylim([ymin,ymax])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note, that average value of $Y$ in a very thin slice, corresponds to keeping $Z_i$ fixed whiel while averaging over all output values of $Y$ due to all-but $Z_i$, which corresponds to the conditional expected value:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"\\begin{equation}\n",
"E_{Z_{\\sim i}} (Y\\;|\\;Z_i) \n",
"\\label{}\n",
"\\end{equation}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For convenience we denote 'all-but $Z_i$' by $Z_{\\sim\n",
"i}$. Naturally, a measure of how much $E_{Z_{\\sim i}} (Y\\;|\\;Z_i)$\n",
"varies in the range of $Z_i$ is given by the conditional variance:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"\\begin{equation}\n",
"\\text{V}_{Z_i}(E_{Z_{\\sim i}} (Y\\;|\\;Z_i))\n",
"\\label{}\n",
"\\end{equation}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Further, the variance the output $Y$ may be decomposed into:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<!-- Equation labels as ordinary links -->\n",
"<div id=\"eq:VarDecomp\"></div>\n",
"\n",
"$$\n",
"\\begin{equation}\n",
"\\text{V}(Y) = E_{Z_i} ( V_{Z_{\\sim i}} (Y \\; | Z_{i})) + \\text{V}_{Z_i}(E_{Z_{\\sim i}} (Y\\;|\\;Z_i))\n",
"\\label{eq:VarDecomp} \\tag{19}\n",
"\\end{equation}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A large $\\text{V}_{Z_i}(E_{Z_{\\sim i}} (Y\\;|\\;Z_i))$ will imply that\n",
"$Z_i$ is an important factor and is therefore coined the first-order\n",
"effect of $Z_i$ on $Y$, and its fraction of the total variation of $Y$ is expressed by $S_i$, `the first-order sensitivity index` of $Z_i$ on $Y$:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"$$\n",
"\\begin{equation}\n",
"S_i = \\frac{\\text{V}_{Z_i}(E_{Z_{\\sim i}} (Y\\;|\\;Z_i))}{\\text{V}(Y)}\n",
"\\label{}\n",
"\\end{equation}\n",
"$$"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"By ([eq:VarDecomp](#eq:VarDecomp)), $S_i$ is number always in the range $[0,1]$,\n",
"and a high value implies an important factor."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# # Scatter plots of averaged y-values per slice, with averaged data\n",
"\n",
"Zsorted=np.zeros_like(Z)\n",
"Ysorted=np.zeros_like(Z)\n",
"YsliceMean=np.zeros((r,Ndz))\n",
"\n",
"plt.figure()\n",
"for k in range(r):\n",
" plt.subplot(2,2,k+1)\n",
"\n",
" # sort data\n",
" sidx=np.argsort(Z[k,:])\n",
" Zsorted[k,:]=Z[k,sidx].copy() \n",
" Ysorted[k,:]=Y[sidx].copy() \n",
" \n",
" for i in range(Ndz):\n",
" plt.axvline(ZBndry[k,i], np.amin(Y), np.amax(Y),linestyle='--', color='.75')\n",
"\n",
" # find indexes of z-values in the current slice \n",
" zidx_range=np.logical_and(Zsorted[k,:]>=ZBndry[k,i],Zsorted[k,:]<ZBndry[k,i+1])\n",
" \n",
" if np.any(zidx_range): #check if range has elements\n",
" YsliceMean[k,i]=np.mean(Ysorted[k,zidx_range])\n",
" else: #set value to None if noe elements in z-slice\n",
" YsliceMean[k,i]=None\n",
"\n",
" plt.plot(Zslice[k,:],YsliceMean[k,:],'.')\n",
"\n",
"# # Plot linear model\n",
" Nmodel=3\n",
" zmin=np.min(Zslice[k,:])\n",
" zmax=np.max(Zslice[k,:])\n",
" \n",
" zvals=np.linspace(zmin,zmax,Nmodel)\n",
" Ymodel=linear_model(w[k],zvals,Nmodel)\n",
" plt.plot(zvals,Ymodel)\n",
" \n",
" xlbl='Z'+str(k)\n",
" plt.xlabel(xlbl)\n",
" \n",
" plt.ylim(ymin,ymax)"
]
},
{
Expand Down

0 comments on commit 07a9aba

Please sign in to comment.