Thursday, July 13, 2006

 

Post the Eighteenth

Wherein your Host Announces that Phase II is Almost Complete

Life Plan
Phase I: Graduate School
Phase II: ICPSR
Phase III: ????
Phase IV: PROFIT

Only a week and some change left. I'm coming home soon, everyone!



Monday, July 10, 2006

 

Post the Seventeenth

Wherein your Host Praises the Gamma Distribution

Your host is a stupid man. But he is a stupid man who recently discovered one of his own mistakes and so is feeling pretty good right now!

I have a project that I’ve been working on for a year or more now, and it just wasn't going anywhere. I think the underlying ideas are interesting and important, but the analysis wasn't holding up to statistical scrutiny.

One of the problems I have had with quantitative methods is learning to think statistically. This is hard to do when there is just so much out there that I simply haven’t learned yet. How could I think of framing a research question on differences in variation across some category when I didn’t know that heteroscedastic regression existed? How could I come up with a topic that examines different effects across various levels an observation is nested in when I didn’t know about hierarchical models?

It is also hard for me to think statistically because, as I have related previously, I am math phobic. So today in class we were covering Generalized Linear Models. This is something a very able professor at UVA had attempted to teach me previously, but we only covered the binomial and Poisson distributions. She had informed us that other probability distributions were available for analysis but, in my math-stupidity, I didn’t fully grasp what this meant.

Now I have discovered the glorious gamma distribution which far better fits my data than the normal, binomial or Poisson distributions. I get it now!

This:

Looks more like this (the red line):


Than it does like any of these:


After very quickly running some new analyses, it appears that, indeed, my hypothesized relationships do hold up to scrutiny (given a gamma distribution) and I might have a good conference paper (cross my fingers and pray it is publishable) on my hands now.
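For anyone who wants to see the idea in miniature, here is a rough sketch in R of what swapping the normal distribution for a gamma in a GLM can look like. It is not the analysis from the project described above: the data are simulated, and the names x, y, fit_gamma and fit_normal are made up for illustration. It assumes a strictly positive, right-skewed outcome, fits it with glm() using the Gamma family and a log link, and compares that fit with an ordinary normal model by AIC.

```r
# Hypothetical, simulated data: a positive, right-skewed outcome.
set.seed(42)
n  <- 500
x  <- runif(n)
mu <- exp(0.5 + 1.2 * x)                  # mean depends on x through a log link
y  <- rgamma(n, shape = 2, rate = 2 / mu) # gamma draws whose mean is mu

# Same predictor, two different assumed distributions for y.
fit_gamma  <- glm(y ~ x, family = Gamma(link = "log"))
fit_normal <- glm(y ~ x, family = gaussian())

# Coefficients on the log scale, then a side-by-side AIC comparison
# (lower AIC indicates the better-fitting distribution for these data).
print(coef(fit_gamma))
print(AIC(fit_gamma, fit_normal))
```

Because the outcome here was generated from a gamma distribution, the gamma model should come out ahead; with real data the comparison is the whole point of running it.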

Oh the things one learns at math camp! Angels, saints, ministers of grace and methodologists pray for the humble student Nathan of modest mind who tries so hard yet has so far to go.

Credo ut intelligam.



 

Post the Sixteenth

Wherein your Host Displays "Non-Quantitative" Tables

Today in class, a professor referred to “quantitative tables” – certainly a redundancy which he recognized – but it made me wonder what “non-quantitative tables” would look like...

... so I made up two sample "non-quantitative" tables for the qualitative methods folks and historians to use. Knock yourselves out guys:



Thursday, July 06, 2006

 

Post the Fifteenth

Wherein your Host Proposes a Research Topic

Someone should look into this. Seriously.



... although I think their causal arrows are mixed up: it seems more likely that global warming is destroying delicate pirate habitats than that a decrease in pirates is causing global warming. Maybe I'll use my new R skillz to investigate further...



 

Post the Fourteenth

Wherein your Host Demonstrates Why R is Superior to STATA

Suppose we want to run a heteroskedastic regression in STATA. The dependent variable is the vote gap between Republican and Democratic candidates (votegap). The independent variables are the partisan polls of the two major party candidates (dempoll, reppoll), the gap between these two polls (pollgap), and dummy variables showing whether an incumbent is running (deminc, repinc). We also want to see if the variance changes by days before the election (days2go) and depending on who was conducting the poll (dempoll, reppoll). We would type:

ml model lf hetreg (slopes:votegap=dempoll reppoll pollgap deminc repinc) (variance: days2go dempoll reppoll)

-----------------------------------------------------------------------------------------------

In R, to run the same model, we would type:

hetreg <- function(y, X, Z, method = "BFGS", Xnames = colnames(X), Znames = colnames(Z)) {
  X <- cbind(1, X)
  colnames(X)[1] <- "Constant"
  nx <- ncol(X)
  Z <- cbind(1, Z)
  colnames(Z)[1] <- "Z Constant"
  nz <- ncol(Z)

  # Negative log-likelihood: mean is X %*% b, log-variance is Z %*% g
  neglnl <- function(theta, X, Z, y) {
    b <- theta[1:ncol(X)]
    g <- theta[ncol(X) + 1:ncol(Z)]
    lnl <- as.vector(-.5 * (Z %*% g) - (.5 / exp(Z %*% g)) * (y - X %*% b)^2)
    -sum(lnl)
  }

  # Start at the mean and log-variance of y, with all other terms at zero
  result <- c(optim(c(mean(y), rep(0, ncol(X) - 1), log(var(y)), rep(0, ncol(Z) - 1)),
                    neglnl, hessian = TRUE, method = method, X = X, Z = Z, y = y),
              list(varnames = c(Xnames, Znames), nx = nx, nz = nz))
  class(result) <- "hetreg"
  return(result)
}

print.hetreg <- function(object, ...) {
  coef <- object$par
  names(coef) <- object$varnames
  print(coef)
  if (object$convergence == 0) cat("\n hetreg converged\n")
  if (object$convergence != 0) cat("\n *** hetreg failed to converge *** \n")
  invisible(object)
}

summary.hetreg <- function(object, cover = FALSE, ...) {
  coef <- object$par
  names(coef) <- object$varnames
  nx <- object$nx
  nz <- object$nz
  maxl <- object$value
  vc <- solve(object$hessian)
  colnames(vc) <- names(coef)
  rownames(vc) <- names(coef)
  se <- sqrt(diag(vc))
  zscore <- coef / se
  pz <- 2 * pnorm(-abs(zscore))  # two-sided p-value
  dn <- c("Estimate", "Std.Error")
  coef.table <- cbind(coef, se, zscore, pz)
  dimnames(coef.table) <- list(names(coef), c(dn, "z-value", "Pr(>|z|)"))

  cat("\n Heteroskedastic Linear Regression by Nathan A. Jones, Esq. of the Mad R Skillz \n")
  cat("\n Estimated Parameters \n")
  print(coef.table)
  cat("\n Log-Likelihood: ", -object$value, "\n")

  if (cover) {
    cat("\n Variance-Covariance Matrix for Parameters \n")
    print(vc)
  }

  # Wald test that every variance term except the Z constant is zero
  ghat <- coef[(nx + 2):length(coef)]
  gvc <- vc[(nx + 2):length(coef), (nx + 2):length(coef)]
  wald <- t(ghat) %*% solve(gvc) %*% ghat
  pwald <- 1 - pchisq(wald, nz - 1)
  cat("\n Wald Statistic: ", wald, "with", nz - 1, "degrees of freedom\n")
  cat(" p=", pwald, "\n")
}

hreg1 <- hetreg(votegap, cbind(dempoll, reppoll, deminc, repinc, pollgap), cbind(days2go, dempoll, reppoll))

summary(hreg1)

-----------------------------------------------------------------------------------------------

The question here is: why code myself in R when someone far smarter than I am (Charles Franklin) has already coded the same formula in STATA for me?

One obvious answer is that by coding myself (see above), I can make the printout say "Heteroskedastic Linear Regression by Nathan A. Jones, Esq. of the Mad R Skillz" at the top of my computer screen. That, in and of itself, must be worth SOMETHING because the STATA print out just says "Results." Bo-RING.

I think I can see now why R is so much better than STATA.



 

Post the Thirteenth

Wherein Your Host Presents A Comic





 

Post the Twelfth

Wherein your Host Recounts a Math Joke

A herpetologist grew frustrated while trying to mate two endangered snakes. After months of work she threw up her hands and exclaimed, “Nothing I’ve tried will get these snakes to breed!” One of the snakes looked up and said to her, “you could try dimming the lights.” The herpetologist was surprised at the talking snake but turned down the lights anyway.

A few weeks later, the snakes had still not yet mated and the herpetologist asked them: “I turned down the lights, is there anything else you need?” The second snake said, “Dimming the lights helped, but it still isn’t very romantic – could you put on some good music?” So the herpetologist got a Barry White album from her car and played some sweet soulful tunes near their cage.

A few weeks later, the snakes had still not yet mated and the herpetologist asked: “I turned down the lights and put on some romantic music, why aren’t you breeding?” The first snake said: “Well, it might seem silly, but back in our native jungle we had a coffee table made of wood that we really liked. If you built a table just like that in our cage, it would probably help.”

So the herpetologist got some logs and built the table and left it in the cage. A few weeks later she came back and there were hundreds of baby snakes. This story just goes to show that “with a log table even an adder can multiply.”

log(ab)=log(a)+log(b)

exp(log(a)+log(b))=ab
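For the skeptical, the punchline can be checked numerically. A quick sketch in R, with the values of a and b picked arbitrarily for illustration:

```r
# Arbitrary example values (not from the joke itself).
a <- 6
b <- 7

# "Adding" on the log scale ...
log_sum <- log(a) + log(b)

# ... is the same as "multiplying" on the original scale.
product_via_logs <- exp(log_sum)

print(all.equal(log(a * b), log_sum))
print(all.equal(product_via_logs, a * b))
```

The comparison uses all.equal() rather than == because the round trip through log() and exp() is only exact up to floating-point error.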



Wednesday, July 05, 2006

 

Post the Eleventh

Wherein your Host Is Amused by the Co-op Sign

In fairness to Jones House (the co-op where I am staying for the month) they are TRYING to turn things around...

I understand the previous residents were put on probation for “deliberate destruction of ICC property.” The current crowd is doing a bit better -- entropy, not intentional acts of destruction, is doing most of the damage now. Objectively, the place is still a pit, but at least they are TRYING…

And as evidence of these new efforts I submit to you, gentle readers, a sign that just appeared on the co-op front door. Apparently half of the disheveled people I see wandering through my hallways don’t live here. There should only be 14 people living here, but I see at least four or five sleeping on couches in the common room when I leave for class every morning.

So to fight this problem, the following sign appeared on the front door today:

If you want to sleep here tonight…
And you don’t have a contract…
And its because you are either:

a. A crackwhore
b. A homeless townie, who happens to be too lazy to get a job doing… anything!
c. Some other illegitimate, sketchy, worthless, coked up person

YOU CAN GO F**K YOURSELF.

We will throw you out of our humble abode because:

1. You are trespassing
2. We hate you
3. We don’t care if or how you die
4. You may have fleas.
5. We think you are ALL <= losers

(ed: clearly they know R code)

6. You’re going to hell either way.

Like I said, at least the lads and lasses of Jones House are trying. And I can't complain much since the co-op was 50% less than all other housing options I looked at.



Sunday, July 02, 2006

 

Post the Tenth

Wherein your Host Submits a T-Shirt Design

So there is a contest to design the t-shirt for summer methods camp and your host intends to win. My “real” entry will be a rather tame shirt that says “ICPSR Summer Methods Camp 2006” on the front and then has the R code to program that display on the back.

(Match THAT shirt with a pair of shorts and some black socks and you will really turn on the ladies, methods boys.)

My “other” design is “Chuck Norris versus the Quantitative Methodologists” and is two columns on the back:
* Chuck Norris does not sleep – he waits.
* Methodologists do not sleep – we do problem sets.

* Outer space exists because it is afraid to be in the same place with Chuck Norris.
* Residuals exist because data is afraid to be in the same place as our predictions.

* There is no evolution, just animals Chuck Norris allows to live.
* There is no population, just samples methodologists allow to represent it.

* Chuck Norris is the reason Waldo is hiding.
* Methodologists are the reason undergraduates don’t come to class.

* Chuck Norris counted to infinity – twice.
* Methodologists approach a limit of infinity – every day.

* The chief export of Chuck Norris is pain.
* The chief export of methodologists is journal articles you can’t understand.

* Oscar Wilde is the Chuck Norris of words.
* R.A. Fisher is the Chuck Norris of regression.

Which shirt would you rather wear?

More Chuck Norris FACTS


