Wednesday, November 3, 2010

Math is like music, statistics is like literature

An excellent link from Dr. Gelman's blog. Statistician Dick De Veaux says:

We haven’t evolved to be statisticians. Our students who think statistics is an unnatural subject are right. This isn’t how humans think naturally. But it is how humans think rationally. And it is how scientists think. This is the way we must think if we are to make progress in understanding how the world works and, for that matter, how we ourselves work.

Monday, November 1, 2010

Multilevel Modeling on UK School Data

I came across a dataset on UK schools that has scores, student / school gender information. After some digging on the Net, I found a sample Bugs model for this data, so I decided to give it a try. The data file was mentioned here http://dss.princeton.edu/training/Multilevel101.pdf and is at http://dss.princeton.edu/training/schools.dta.

I converted this data Python readable format and saved it as testing.dat. Sample Bugs model was at http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/reviewwinbugs.pdf

Other useful info: http://www.cmm.bristol.ac.uk/learning-training/multilevel-m-software/reviewaml.pdf

The example code is written in Python / JAGS. The model can certainly be improved. Keyword dflat() does not seem to be supported anymore, so dnorm(0.0,1.0E-6) was used instead when necessary.

Data and recent code

import rpy2.robjects as R
import rpy2.rinterface as rinterface
from rpy2.robjects.packages import importr
import rpy2.robjects.numpy2ri
from rpy2.robjects.vectors import FloatVector, StrVector
import numpy as np
r2jags = importr('R2jags')

schools = np.loadtxt("testing.dat",  skiprows=1)

n=len(schools)
J = np.unique(np.max(schools[:,0]))[0]
y = schools[:,2]
school = schools[:,0]
standlrt = schools[:,3] # reading score
gender = schools[:,4] # student gender
school_gender = schools[:,5] # school gender, 1 mixed, 2 boys, 3 girls, 

boysch = np.zeros(n)
boysch[school_gender == '2'] = 1

girlsch = np.zeros(n)
girlsch[school_gender == '3'] = 1

R.r.assign('n',n)
R.r.assign('J',J)
R.r.assign('y',y)
R.r.assign('school',school)
R.r.assign('standlrt',standlrt)
R.r.assign('gender',gender)
R.r.assign('boysch',boysch)
R.r.assign('girlsch',girlsch)

jags_data = StrVector(("n", "J", "y", "school", "standlrt", "gender", "boysch", "girlsch"))
jags_params = StrVector(("a", "beta","tau.y", "tau.a"))

jags_inits = R.r('''function (){
  list (a=rnorm(J),mu=rnorm(1),sigma.y=runif(1), sigma.a=runif(1))
  }''')

jags_schools = r2jags.jags(data = jags_data, inits = jags_inits,
                           parameters_to_save = jags_params,
                           n_iter = 10000,
                           n_chains = 3, n_thin = 100,
                           model_file = "testing.bug")

print jags_schools