- 8 Group winner teams will be seeded.
- 8 Group runner-up teams will be unseeded.
- Teams coming from the same group and from same association can not be drawn against.

*20 million times)*using randomly generated draws. I have called this exercise a Monte Carlo simulation, rather a simple sort while there is no fancy sampling of any sort, only rejection of pair up or the given draw if the last remaining pair does not satisfy the rules. A statistical language GNU R is used to generate the likely outcome. We use Mersenne Twister RNG, a default random number generator which is known to have very large cycle. In result we basically count the likely pair-ups in a matrix (matrix of pairs) and find the frequency of each likely pair up. Simulation takes 3-4 hours on a single CPU Intel Xeon with 16 GB RAM running Ubuntu GNU/Linux. Well, to be honest I haven't optimized the code.

Qualified teams, their groups and associations definitions, the simulation and results reporting R code are as follows (note that we use xtable for html outputs, for other functions we use, see below codes after tables):

winnerTeams <- c('Paris-Saint-Germain', 'Schalke-04', 'Malaga', 'Borussia-Dortmund', 'Juventus', 'Bayern-Munich', 'Barcelona', 'Manchester-United'); winnerAssociation <- c('FR', 'DE', 'ES', 'DE', 'IT', 'DE', 'ES', 'ENG'); runnersUpTeams <- c('Porto', 'Arsenal', 'Milan', 'Real-Madrid', 'Shakhtar-Donetsk', 'Valencia', 'Celtic', 'Galatasaray'); runnersUpTeamsAssociation <- c('PT', 'ENG', 'IT', 'ES', 'UA', 'ES', 'SCO', 'TR'); countMatrix <- matrix(0, 8, 8) row.names(countMatrix) <- winnerTeams; colnames(countMatrix) <- runnersUpTeams; many <- 20e6; system.time(drawMany(winnerTeams, winnerAssociation, runnersUpTeams, runnersUpTeamsAssociation, countMatrix, many)) countMatrix <- countMatrix/many; print(countMatrix);

Simulations results can be interpreted based on frequencies (probabilities) of pairings, rather intuitively while probabilities are not that far off . For example if we consider Barcelona, Milan and Arsenal score the largest 0.23 and 0.21. So my guess based on these frequencies, which I select maximums first ,then the second maximum and so on. If all ties I select the highest count using the second table. Here are the predicted pairs, (ordered with highest probability):

Barcelona - Milan

Malaga - Arsenal

Bayern Munich - Real Madrid

Borissia Dordmund - Valencia

Manchester United - Celtic

Juventus - Galatasaray

Schalke 04 - Porto

PSG - Donetsk

Note that predicted pairs are quite depending on your selection strategy.

Table for the frequencies:

Porto | Arsenal | Milan | Real-Madrid | Shakhtar-Donetsk | Valencia | Celtic | Galatasaray | |
---|---|---|---|---|---|---|---|---|

Paris-Saint-Germain | 0.00 | 0.13 | 0.14 | 0.18 | 0.12 | 0.18 | 0.12 | 0.12 |

Schalke-04 | 0.12 | 0.00 | 0.15 | 0.19 | 0.12 | 0.19 | 0.13 | 0.12 |

Malaga | 0.19 | 0.23 | 0.00 | 0.00 | 0.19 | 0.00 | 0.20 | 0.19 |

Borussia-Dortmund | 0.13 | 0.14 | 0.15 | 0.00 | 0.13 | 0.19 | 0.13 | 0.13 |

Juventus | 0.13 | 0.15 | 0.00 | 0.22 | 0.00 | 0.22 | 0.14 | 0.13 |

Bayern-Munich | 0.13 | 0.14 | 0.15 | 0.19 | 0.13 | 0.00 | 0.13 | 0.13 |

Barcelona | 0.18 | 0.21 | 0.23 | 0.00 | 0.19 | 0.00 | 0.00 | 0.18 |

Manchester-United | 0.13 | 0.00 | 0.16 | 0.22 | 0.13 | 0.22 | 0.14 | 0.00 |

Table for the counts (actually all are integers):

Porto | Arsenal | Milan | Real-Madrid | Shakhtar-Donetsk | Valencia | Celtic | Galatasaray | |
---|---|---|---|---|---|---|---|---|

Paris-Saint-Germain | 0.00 | 2589164.00 | 2897024.00 | 3658337.00 | 2356581.00 | 3658892.00 | 2494247.00 | 2345755.00 |

Schalke-04 | 2348458.00 | 0.00 | 2924099.00 | 3735314.00 | 2371245.00 | 3743246.00 | 2517610.00 | 2360028.00 |

Malaga | 3781019.00 | 4506559.00 | 0.00 | 0.00 | 3845872.00 | 0.00 | 3997679.00 | 3868871.00 |

Borussia-Dortmund | 2500221.00 | 2819591.00 | 3098797.00 | 0.00 | 2542013.00 | 3863888.00 | 2646626.00 | 2528864.00 |

Juventus | 2659773.00 | 2983255.00 | 0.00 | 4393444.00 | 0.00 | 4392158.00 | 2892103.00 | 2679267.00 |

Bayern-Munich | 2500835.00 | 2818331.00 | 3098181.00 | 3866890.00 | 2542742.00 | 0.00 | 2647340.00 | 2525681.00 |

Barcelona | 3620075.00 | 4283100.00 | 4684023.00 | 0.00 | 3721268.00 | 0.00 | 0.00 | 3691534.00 |

Manchester-United | 2589619.00 | 0.00 | 3297876.00 | 4346015.00 | 2620279.00 | 4341816.00 | 2804395.00 | 0.00 |

This is the main simulation function:

drawMany <- function(winnerTeams, winnerAssociation, runnersUpTeams, runnersUpTeamsAssociation, countMatrix, many) { for(i in 1:many) { repeat { dr <- drawOneKnock(winnerTeams, winnerAssociation, runnersUpTeams, runnersUpTeamsAssociation,0); if(sum(dr) > 0) break; } updateCount <- mapply(incMatrix, dr[,1], dr[,2]) } }

A single draw can be generated as follows:

drawOneKnock <- function(winnerTeams, winnerAssociation, runnersUpTeams, runnersUpTeamsAssociation, names=1) { k=1 repeat { k=k+1; if(k > 1000) return(-1); blockWin = 1:8 ; # tracking for draw blockRun = blockWin; winners = c(); # Draw results runners = c(); for(i in 1:7) { kk =1; repeat { kk=kk+1; if(kk > 1000) return(-1); winner <- sample(blockWin, 1); runner <- sample(blockRun, 1); if(!(runner == winner) && !(winnerAssociation[winner] == runnersUpTeamsAssociation[runner])) { break; } } blockWin <- blockWin[-which(blockWin == winner)]; blockRun <- blockRun[-which(blockRun == runner)]; winners <- c(winners, winner); runners <- c(runners, runner); } winner <- blockWin; runner <- blockRun; # check if last remaining is ok, otherwise invalidate draw if(!(runner == winner) && !(winnerAssociation[winner] == runnersUpTeamsAssociation[runner])) { winners <- c(winners, blockWin); runners <- c(runners, blockRun); if(names) dr <- cbind(winnerTeams[winners], runnersUpTeams[runners]); if(!names) dr <- cbind(winners, runners); break; } } dr }

And counting the pair-ups is performed by a simple function:

incMatrix <- function(i, j) { countMatrix[i,j] <<- countMatrix[i,j]+1; return(0); }

## No comments:

Post a Comment