Paired Comparison and Multinomial
Julie. Added: November 15, 2007

Presentation Transcript

Paired Comparison and Multiple Response Models:
  The Bradley-Terry Model
  Multinomial Logistic Regression

Copa America Penalty Shootouts:

Bradley-Terry Model:
The probability that A beats B in a game is
  $$P(A \text{ beats } B) = \frac{\phi_A}{\phi_A + \phi_B}$$
so that the odds of A beating B are the ratio of their strengths:
  $$\frac{P(A \text{ beats } B)}{P(B \text{ beats } A)} = \frac{\phi_A}{\phi_B}$$
and the log odds (lods or logit) is a linear function:
  $$\operatorname{logit} P(A \text{ beats } B) = \log \phi_A - \log \phi_B$$

Logistic Model:
Treat the event as a Bernoulli trial. All these trials are independent (given the strengths $\phi_A$ and $\phi_B$ of the teams A and B). Therefore, we can use a generalized linear model with a binomial family and logit link.
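As a small numerical illustration of the Bradley-Terry relations above, the following sketch uses two made-up strengths (the values 3 and 1 and all variable names are assumptions for illustration, not estimates from the Copa America data). It checks that the log odds equal the difference of the log strengths, which is what lets a logit-link model estimate them.

```r
# Hypothetical strengths for two teams (illustrative values only).
phi_A <- 3
phi_B <- 1

# Bradley-Terry probability that A beats B.
p_AB <- phi_A / (phi_A + phi_B)

# Odds of A beating B: the ratio of the two strengths.
odds_AB <- p_AB / (1 - p_AB)

# On the log scale the strengths enter linearly.
theta_A <- log(phi_A)
theta_B <- log(phi_B)
logit_AB <- log(odds_AB)

# plogis() is the inverse logit, so it maps theta_A - theta_B
# back to the win probability.
p_from_theta <- plogis(theta_A - theta_B)

print(c(p_AB = p_AB, odds_AB = odds_AB, logit_AB = logit_AB))
```

Only the difference theta_A - theta_B matters here, which is why one team's parameter has to be fixed (or dropped) when the model is fitted.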
The parameters to be estimated are $\theta_A = \log(\phi_A)$ and $\theta_B = \log(\phi_B)$.
The only trick is how to set up the variables and/or the design matrix from the link function:
  $$\operatorname{logit}(E[Y_{n \times 1}]) = X_{n \times p}\, \theta_{p \times 1}$$

Design Matrix for Copa America Penalty Shootout:
Each row is one shootout: +1 marks the winner, -1 the loser.

  Arg  Braz  Chile  Col  Ecua  Hon  Mex  Para  Peru  Uru  USA
    0     0      0    1     0    0    0     0     0   -1    0
    1    -1      0    0     0    0    0     0     0    0    0
    1     0      0   -1     0    0    0     0     0    0    0
    0     0      0    1     0    0    0    -1     0    0    0
   -1     1      0    0     0    0    0     0     0    0    0
    0     0      0    0     0    0   -1     0     0    0    1
    0    -1      0    0     0    0    0     0     0    1    0
    0     0      0    0    -1    0    1     0     0    0    0
    0     0      0    0     0    0    1     0    -1    0    0
    0     0      0    0     0    0    0    -1     0    1    0
    0     0     -1    0     0    0    0     0     0    1    0
    0     0      0    0     0    1    0     0     0   -1    0
    0     1      0    0     0    0    0     0     0   -1    0
   -1     1      0    0     0    0    0     0     0    0    0

Summary of glm fit:
  Call:
  glm(formula = resp ~ . - 1, family = binomial, data = bt1.dat)

  Deviance Residuals:
      1      2      3       4      5       6      7
  1.024  1.230  1.024  .00008  1.126  .00008  1.560
       8       9      10      11      12     13     14
  .00008  .00008  .00008  .00008  .00008  .8378  1.126

  Coefficients: (2 not defined because of singularities)
  Rank             Estimate  Std. Error  z value  Pr(>|z|)
  3     Argentina    0.7439      1.6048    0.464     0.643
  2     Brazil       0.8666      1.3858    0.625     0.532
  6     Chile      -19.5661  10754.0129   -0.002     0.999
  4     Colombia     0.3719      1.6474    0.226     0.821
  8     Ecuador    -39.1321  15208.4709   -0.003     0.998
  1     Honduras    19.5661  10754.0129    0.002     0.999
  6     Mexico     -19.5661  10754.0129   -0.002     0.999
  5     Paraguay   -19.3885   7570.8437   -0.003     0.998
  8     Peru       -39.1321  15208.4709   -0.003     0.998
        Uruguay          NA          NA       NA        NA
        USA              NA          NA       NA        NA

  Null deviance: 19.4081 on 14 degrees of freedom
  Residual deviance: 9.2818 on 5 degrees of freedom
  AIC: 27.282

Points to Notice:
  Response vector consists of all 1's
  Only 14 observations to estimate 10 parameters
  Design matrix not adequate to estimate all 10
    Implies there is some aliasing
  Results are only partially informative
    Standard errors are extremely large
  Ordering of top 4 seems reasonable: Honduras, Brazil, Argentina, Colombia

Dominance Graph:
[Figure: dominance graph over the nodes Co, Ar, Ho, Ur, US, Br, Ec, Pa, Pe, Ch, Mx. Disconnected design.]

R-code:
  copa <- read.csv("F:\\Stat 665\\Data\\CopaAmerica.csv", as.is = TRUE)
  names(copa) <- c("Year", "Stage", "Winner", "Loser")
  n.games <- nrow(copa)
  team.names <- sort(unique(c(copa$Winner, copa$Loser)))
  n.teams <- length(team.names)
  winnerColumns <- match(copa$Winner, team.names)
  loserColumns <- match(copa$Loser, team.names)
  # One row per game: +1 in the winner's column, -1 in the loser's column
  X1 <- matrix(0, n.games, n.teams)
  colnames(X1) <- team.names  # so the coefficients are labelled by team
  seqn <- seq(n.games)
  Wspots <- cbind(seqn, winnerColumns)
  Lspots <- cbind(seqn, loserColumns)
  X1[Wspots] <- 1
  X1[Lspots] <- -1
  # Every row is written from the winner's side, so the response is all 1's
  resp <- rep(1, n.games)
  bt1.dat <- data.frame(resp, X1)
  bt1.model <- glm(resp ~ . - 1, family = binomial, data = bt1.dat)
  summary(bt1.model)

Multinomial Response:

Logit Models for Multinomial Response:
Response variable Y has J categories.
Explanatory variable(s) X may be either discrete or continuous.
  $\Pr(Y = j \mid X = x) = \pi_j(x)$, j = 1, 2, …, J
Baseline-category logit models:
  $$\log \frac{\pi_j(x)}{\pi_J(x)} = \alpha_j + \beta_j x, \quad j = 1, \dots, J-1$$

Implication for Other Logits:
  $$\log \frac{\pi_a(x)}{\pi_b(x)} = \log \frac{\pi_a(x)}{\pi_J(x)} - \log \frac{\pi_b(x)}{\pi_J(x)} = (\alpha_a - \alpha_b) + (\beta_a - \beta_b)\, x$$
So all logits are linear functions of x.

Estimation:
  Use software routines that fit the J-1 logits simultaneously
    More computationally efficient than fitting J-1 separate logistic regressions
    Gives smaller standard errors
  R-library nnet
    Make it accessible by the command library(nnet)
    Find out details of the function using ?multinom

Primary Food Choice of Alligators:
  J = 3 response categories
    Invertebrates: snails, aquatic insects, crayfish
    Fish
    Other: turtles, frogs, buckeyes, baby alligators
  Explanatory variable X = length of alligator, measured in meters
  Source: Florida Game and Fresh Water Fish Commission
  59 gators sampled from Lake George, Florida

Gator Data:
Length in meters followed by primary food choice (I = Invertebrates, F = Fish, O = Other).

  1.24 I  1.30 I  1.30 I  1.32 F  1.32 F  1.40 F  1.42 I  1.42 F  1.45 I  1.45 O
  1.47 I  1.47 F  1.50 I  1.52 I  1.55 I  1.60 I  1.63 I  1.65 O  1.65 I  1.65 F
  1.65 F  1.68 F  1.70 I  1.73 O  1.78 I  1.78 I  1.78 O  1.80 I  1.80 F  1.85 F
  1.88 I  1.93 I  1.98 I  2.03 F  2.03 F  2.16 F  2.26 F  2.31 F  2.31 F  2.36 F
  2.36 F  2.39 F  2.41 F  2.44 F  2.46 F  2.56 O  2.67 F  2.72 I  2.79 F  2.84 F
  3.25 O  3.28 O  3.33 F  3.56 F  3.58 F  3.66 F  3.68 O  3.71 F  3.89 F

Output from multinom:
  Call:
  multinom(formula = Choice ~ Length, data = gators, test = "none")

  Coefficients:
    (Intercept)     Length
  I    4.079701 -2.3553303
  O   -1.617713  0.1101012

  Residual Deviance: 98.34124
  AIC: 106.3412

Notes:
  The reference category is Fish (the default is the first level of the response, which is the first category alphabetically).
  Big gators lose their taste for invertebrates relative to Fish.
  Big gators develop a taste for more exotic fare.

Recovering Response Probabilities from the Logit Models:
With Fish as the baseline, the fitted logits give
  $$\frac{\pi_I(x)}{\pi_F(x)} = e^{\alpha_I + \beta_I x}, \qquad \frac{\pi_O(x)}{\pi_F(x)} = e^{\alpha_O + \beta_O x}, \qquad \frac{\pi_F(x)}{\pi_F(x)} = 1$$
Adding these three expressions gives
  $$\frac{1}{\pi_F(x)} = 1 + e^{\alpha_I + \beta_I x} + e^{\alpha_O + \beta_O x}$$
Replacing the denominators with this expression then gives back
  $$\pi_I(x) = \frac{e^{\alpha_I + \beta_I x}}{1 + e^{\alpha_I + \beta_I x} + e^{\alpha_O + \beta_O x}}$$
etc.

Probability of Food Choice vs. Length of Alligator:
[Figure: probability of food choice vs. length of alligator.]

R-code for Gator Model & Plot:
  Model
  gators <- read.csv("F:\\Stat665\\Data\\GatorFoodPref.csv")
  library(nnet)
  # The Hessian argument of multinom() is spelled Hess (capital H)
  mult.model <- multinom(Choice ~ Length, data = gators, Hess = TRUE)
  mult.model
  Plot
  attach(gators)
  # Empty frame spanning all three fitted-probability curves
  plot(rep(Length, 3), mult.model$fitted.values, type = "n",
       xlab = "Length of Gator", ylab = "Prob")
  # Columns of fitted.values follow the response levels: F, I, O
  lines(Length, mult.model$fitted.values[, 1], col = 1, lty = 1)
  lines(Length, mult.model$fitted.values[, 2], col = 2, lty = 2)
  lines(Length, mult.model$fitted.values[, 3], col = 3, lty = 3)
  legend("right", legend = c("Fish", "Invertebrates", "Other"),
         col = c(1, 2, 3), lty = c(1, 2, 3))
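The recovery of response probabilities from the fitted logits can also be checked by hand, without refitting the model. The sketch below is an illustration rather than part of the original slides: it copies the coefficients from the multinom output, treats Fish as the baseline with linear predictor 0, and defines a hypothetical helper `food_probs()` (a name introduced here, not an nnet function) that returns the three probabilities at a given length.

```r
# Coefficients copied from the multinom output (baseline category: Fish).
coefs <- rbind(I = c(4.079701, -2.3553303),
               O = c(-1.617713, 0.1101012))
colnames(coefs) <- c("(Intercept)", "Length")

# Hypothetical helper: response probabilities at a given length in meters.
food_probs <- function(len) {
  # Linear predictors relative to Fish; the baseline contributes exp(0) = 1.
  eta <- c(F = 0, coefs[, "(Intercept)"] + coefs[, "Length"] * len)
  exp(eta) / sum(exp(eta))
}

# Probabilities for a 2-meter alligator; the three values sum to 1.
p2 <- food_probs(2)
print(round(p2, 3))

# The log ratio against Fish recovers the fitted logit for Invertebrates.
logit_I <- log(p2[["I"]] / p2[["F"]])
```

Evaluating `food_probs()` over a grid of lengths should trace out curves like those drawn from `mult.model$fitted.values`, assuming the coefficients above are the ones from the fitted model.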
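Returning to the Copa America fit: the aliasing noted under Points to Notice can be checked directly from the design matrix. The sketch below is not from the original slides. It rebuilds the matrix from winner/loser pairs read off the rows of the design matrix (the vectors `winner` and `loser` and the names `design_rank` and `n_components` are illustrative), then uses the matrix rank to count how many separate groups of teams never played each other.

```r
# Column order of the design matrix.
teams <- c("Arg", "Braz", "Chile", "Col", "Ecua", "Hon",
           "Mex", "Para", "Peru", "Uru", "USA")

# One entry per game, read off the design matrix rows (+1 winner, -1 loser).
winner <- c("Col", "Arg", "Arg", "Col", "Braz", "USA", "Uru",
            "Mex", "Mex", "Uru", "Uru", "Hon", "Braz", "Braz")
loser  <- c("Uru", "Braz", "Col", "Para", "Arg", "Mex", "Braz",
            "Ecua", "Peru", "Para", "Chile", "Uru", "Uru", "Arg")

# Rebuild the design matrix with matrix indexing.
n_games <- length(winner)
X <- matrix(0, n_games, length(teams), dimnames = list(NULL, teams))
X[cbind(seq_len(n_games), match(winner, teams))] <- 1
X[cbind(seq_len(n_games), match(loser, teams))] <- -1

# Only strength differences within a connected group of teams are
# identified, so each connected group costs one column of rank.
design_rank <- qr(X)$rank
n_components <- length(teams) - design_rank

print(c(rank = design_rank, components = n_components))
```

With the game list as transcribed here, the rank falls two short of the 11 columns, which agrees with the "2 not defined because of singularities" line in the glm summary and with the disconnected dominance graph.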