Polarity's Character Ability Statistical Analysis (1/6/15)


  • Of all the simulations you've run, only Prehistoric Arms, Chemical Reaction, and Fireball have a lower average tiles destroyed in cascades than Mjolnir's Might. And no one thinks of those as being cascade starters, so it's not like Mjolnir's Might is in rarefied air or anything.

    Thinking about it, though, a cascade has to have a minimum of 3 tiles destroyed. However, there really doesn't have to be an upper limit to how many tiles are destroyed. Something like Mjolnir's Might is rarely going to cause a 10+ tile cascade, but it will happen enough when you run 200,000 simulations. I think the averages work out all right for all simulations because for an ability with a low probability for cascade, the 0's will drag down the average to where it should be. However, when you take the average of only the cascades, it is probably being pulled too high by a group of outliers. The most common result for Mjolnir's Might is probably something between 4 and 5. But with a set minimum of 3 and no real maximum, large cascades can skew the average. It's possible that for low-probability abilities, or maybe all abilities in general, instead of using the mean for the tiles destroyed when a cascade occurs, the median would be a better indicator of how many tiles are typically being destroyed.

    I don't know how you would get a result between 4 and 5. What exactly did you mean here?
    The median would probably be 3, 5 or 6, not super useful. Further, it'd take a lot more programming to find the median as then you'd have to keep every result, not just sum up the totals.

    If we look at devil dino level 4:
    Probability of a cascade occurring: 0.278366
    Average tiles destroyed: 1.598566
    Average tiles destroyed when a match occurs: 5.74267690738


    Let's take 5 tiles in a row, where we know the middle 3 match.
    [image: TU tile, yellow tile, yellow tile, yellow tile, TU tile]
    What are the odds that the match will be larger than 3?
    Each side has a 1/7 chance of matching, which means the entire row is destroyed: 13/49 overall (assuming center placement). This means an extra 5 tiles.
    5 * 13/49 =~ 1.33

    Now the odds of also forming a T or + are about 1/16, which adds about 1/8 on average.

    This already takes us to over 4.4 minimum for the average when ANYTHING happens, NOT COUNTING cascades!
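The estimate above can be checked with a few lines of JavaScript. This is only a sketch of the arithmetic in this post, assuming 7 equally likely tile colors, an 8-wide row, and the rough 1/8 figure quoted for T and + shapes; the variable names are made up for illustration.

```javascript
// Rough lower bound on the average tiles destroyed per match, ignoring cascades.
// Assumptions (taken from the estimate above): 7 equally likely tile colors,
// 8-wide rows, and a longer-than-3 match clears the whole row.
const colors = 7;
const rowWidth = 8;

// Chance that at least one of the two tiles flanking a match-3 has the same color.
const pExtend = 1 - Math.pow((colors - 1) / colors, 2); // 13/49

// Clearing the row destroys the other 5 tiles on top of the original 3.
const extraFromRow = (rowWidth - 3) * pExtend; // 65/49

// Rough contribution of T and + shapes, used as quoted in the post.
const extraFromShapes = 1 / 8;

const lowerBound = 3 + extraFromRow + extraFromShapes;
console.log(pExtend, extraFromRow, lowerBound);
```

Under these assumptions `lowerBound` comes out a little above 4.4, which is the figure used in the argument.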

    Honestly, the results look reasonable.
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    Just out of curiosity, does the simulation account for any particular color biases when considering the moves used? For example, if I gathered 10 green AP for Mohawk's green, the board will almost certainly have notably less green than other colors.

    Either way, this is amazing. Really, really cool stuff, and thanks for making this. You rock!

    Thanks! The simulation does not take that into account... yet. The only way I can think of reasonably implementing this is programming the game's AI, and then programming a player AI that will try to always match green and then cast lightning storm or something like that, which is a ways off.
    daibar wrote:
    Of all the simulations you've run, only Prehistoric Arms, Chemical Reaction, and Fireball have a lower average tiles destroyed in cascades than Mjolnir's Might. And no one thinks of those as being cascade starters, so it's not like Mjolnir's Might is in rarefied air or anything.

    Thinking about it, though, a cascade has to have a minimum of 3 tiles destroyed. However, there really doesn't have to be an upper limit to how many tiles are destroyed. Something like Mjolnir's Might is rarely going to cause a 10+ tile cascade, but it will happen enough when you run 200,000 simulations. I think the averages work out all right for all simulations because for an ability with a low probability for cascade, the 0's will drag down the average to where it should be. However, when you take the average of only the cascades, it is probably being pulled too high by a group of outliers. The most common result for Mjolnir's Might is probably something between 4 and 5. But with a set minimum of 3 and no real maximum, large cascades can skew the average. It's possible that for low-probability abilities, or maybe all abilities in general, instead of using the mean for the tiles destroyed when a cascade occurs, the median would be a better indicator of how many tiles are typically being destroyed.

    I don't know how you would get a result between 4 and 5. What exactly did you mean here?
    The median would probably be 3, 5 or 6, not super useful. Further, it'd take a lot more programming to find the median as then you'd have to keep every result, not just sum up the totals.

    If we look at devil dino level 4:
    Probability of a cascade occurring: 0.278366
    Average tiles destroyed: 1.598566
    Average tiles destroyed when a match occurs: 5.74267690738


    Let's take 5 tiles in a row, where we know the middle 3 match.
    [image: TU tile, yellow tile, yellow tile, yellow tile, TU tile]
    What are the odds that the match will be larger than 3?
    Each side has a 1/7 chance of matching, which means the entire row is destroyed: 13/49 overall (assuming center placement). This means an extra 5 tiles.
    5 * 13/49 =~ 1.33

    Now the odds of also forming a T or + are about 1/16, which adds about 1/8 on average.

    This already takes us to over 4.4 minimum for the average when ANYTHING happens, NOT COUNTING cascades!

    Honestly, the results look reasonable.

    Actually, I'm already saving the results of each run. One of the next features I'm planning is a statistical breakdown of the types of matches that occur, something along the lines of:
    0 tiles destroyed - X%
    3 tiles destroyed - Y%
    5 tiles destroyed - Z%
    .. so on and so forth.
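A breakdown like this also makes the median cheap to compute: a count per distinct tile total is enough, without storing every run. The sketch below is not the simulator's actual code; `recordResult`, `breakdown`, `medianFromCounts`, and the sample totals are all made up for illustration.

```javascript
// Tally how often each "tiles destroyed" total occurs instead of storing every run.
const counts = new Map();

function recordResult(tilesDestroyed) {
  counts.set(tilesDestroyed, (counts.get(tilesDestroyed) || 0) + 1);
}

// Percentage breakdown, sorted by tile total.
function breakdown(tally) {
  let total = 0;
  for (const c of tally.values()) total += c;
  return [...tally.entries()]
    .sort((a, b) => a[0] - b[0])
    .map(([tiles, c]) => ({ tiles, percent: (100 * c) / total }));
}

// Lower median: walk the sorted totals until half of the runs are covered.
// Pass minTiles = 3 to look only at runs where a match occurred.
function medianFromCounts(tally, minTiles = 0) {
  const entries = [...tally.entries()]
    .filter(([tiles]) => tiles >= minTiles)
    .sort((a, b) => a[0] - b[0]);
  const total = entries.reduce((sum, [, c]) => sum + c, 0);
  if (total === 0) return undefined;
  let seen = 0;
  for (const [tiles, c] of entries) {
    seen += c;
    if (seen * 2 >= total) return tiles;
  }
}

// Made-up sample: mostly no cascade, a few small ones, one large outlier.
[0, 0, 0, 0, 0, 0, 3, 3, 5, 20].forEach(recordResult);
console.log(breakdown(counts));
console.log(medianFromCounts(counts));    // median over all runs
console.log(medianFromCounts(counts, 3)); // median over runs with a match
```

In this made-up sample the single 20-tile run pulls the mean of the matched runs well above their median, which is the skew being discussed.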
  • Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

    Is it possible to test the difference between 4 and 5 black covers of 2* Daken?
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    KevinMark wrote:
    Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

    Is it possible to test the difference between 4 and 5 black covers of 2* Daken?

    The Loki thing is almost assuredly just how your mind remembers it: statistically speaking, higher levels have a higher chance of cascades.

    I can't test 4 vs 5 Daken covers meaningfully. Phaserhawk gave some numbers for the probability that the board has that number of tiles on a fully random board, so you should probably use those. A meaningful simulation would be something like "what is the probability of the passive working given that you avoid matching blue", but the technology isn't there yet.
  • Pwuz_
    Pwuz_ Posts: 1,214 Chairperson of the Boards
    I apologize if this has been addressed already:

    Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    Pwuz_ wrote:
    I apologize if this has been addressed already:

    Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.

    It does not. I'm planning on adding crit tiles, and on handling the case where a match-5 exists on the board after the ability is cast (since that's essentially another free cascade), but the technology isn't there yet.
  • KevinMark wrote:
    Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

    Is it possible to test the difference between 4 and 5 black covers of 2* Daken?

    It will always look like the lower level is better, because when you run level 5 Illusions you basically get to see the result of what would've happened at each of levels 1-5 (a lower level would just stop earlier), and chances are very good that one of those results is better than level 5's. But that's pretty meaningless. Say you run it twice, and one time level 3 is better than level 5, and another time level 4 is better than level 5. If you had run level 3 or level 4 both times, you'd still do worse than level 5 on the other run, and over the long run level 5 would average out better. You don't get to pick "3 and then 4" just because you can see that was the optimal result after running level 5 twice.
  • Pwuz_
    Pwuz_ Posts: 1,214 Chairperson of the Boards
    Pwuz_ wrote:
    I apologize if this has been addressed already:

    Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.

    It does not. I'm planning on adding crit tiles, and on handling the case where a match-5 exists on the board after the ability is cast (since that's essentially another free cascade), but the technology isn't there yet.

    Quite impressive nonetheless, though. Kudos on your work.
  • Pylgrim
    Pylgrim Posts: 2,332 Chairperson of the Boards
    So, if I'm reading this correctly, Loki's Purple hardly ever improves going from 3 to 5 covers? Good to keep in mind if they give him a third ability. Similarly with Storm, it seems that going from 3 green to 5 barely nets you anything besides 4 more random AP. It's arguable whether going to 5 yellow would be more useful.

    Could you please do the charts for Ragnarok's powers and Doc Ock's Manipulation? Oh and X-force.
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    Small update: added Punisher's Judgement. Other patterned destroy abilities such as Ragnarok, X-Force, etc. will be next, but they need further testing.
  • This is awesome. I only see one problem with it, which is not bad considering I'm clueless about this sort of thingie. Depending on the characters in the game, the colors of the board will not be random. For example, for Thor, if you are trying to proc red, you will collect red, so there will typically be less red on the board and a normal amount of yellow (assuming the AI is not collecting yellow). However, if you are trying to collect yellow and "happen" to end up with 8 red before 12 yellow, the board will have much less yellow than normal. Therefore, the probability of red making a yellow match is much lower. Same thing if you are trying to collect green and happen to get enough yellow: the board will have much less green than a random or normal board.
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    This is awesome. I only see one problem with it, which is not bad considering I'm clueless about this sort of thingie. Depending on the characters in the game, the colors of the board will not be random. For example, for Thor, if you are trying to proc red, you will collect red, so there will typically be less red on the board and a normal amount of yellow (assuming the AI is not collecting yellow). However, if you are trying to collect yellow and "happen" to end up with 8 red before 12 yellow, the board will have much less yellow than normal. Therefore, the probability of red making a yellow match is much lower. Same thing if you are trying to collect green and happen to get enough yellow: the board will have much less green than a random or normal board.

    Yup. I'm planning on having a future version of the simulator be able to simulate a full MPQ match. Once this is possible, those problems go away. I could have an AI that models the current in-game AI, another one representing a player who prioritizes, say, green and black matches, and simulate the effects of the ability when the player AI gets, say, 8 green to cast X-Force. The idea is to recreate the MPQ game engine from scratch, after all. That's a while away, so we'll have to deal with slightly suboptimal findings for now.
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    I am also building an engine, in JavaScript.

    For critical tiles, I used the following technique.

    When searching for matches, I search for each color individually, and criticals are treated as whichever color is being searched.
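A minimal sketch of that per-color search, for a single row only. It is not the engine's code: `runsInRow`, the `"crit"` marker, and the rule that a run needs at least one real tile of the color are assumptions made for illustration.

```javascript
// Find horizontal runs of 3+ in one row, one color at a time,
// treating critical tiles ("crit") as the color currently being searched.
function runsInRow(row, colors) {
  const runs = [];
  for (const color of colors) {
    let start = 0;
    while (start < row.length) {
      let end = start;
      let real = 0; // real tiles of this color in the run (not criticals)
      while (end < row.length && (row[end] === color || row[end] === "crit")) {
        if (row[end] === color) real++;
        end++;
      }
      // Assumption: a run made only of criticals does not count as a match.
      if (end - start >= 3 && real > 0) {
        runs.push({ color, start, length: end - start });
      }
      start = Math.max(end, start + 1);
    }
  }
  return runs;
}

const row = ["red", "crit", "red", "blue", "blue", "crit", "green", "green"];
console.log(runsInRow(row, ["red", "blue", "green"]));
```

In the sample row the critical at index 5 takes part in both the blue run and the green run, which is what searching each color separately allows.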

    When we have a shape that creates a critical, I do:
    // arr = array of matched positions, where position p = 8*y + x
    this.Where2Crit = function (arr) {
        if (arr.length < 5) return false;               // fewer than 5 tiles: no critical
        arr.sort(function (a, b) { return a - b; });    // numeric sort; plain sort() compares as strings
        for (var i = 0; i < arr.length; i++) {          // var keeps i from leaking into global scope
            var p = arr[i];
            var N = this.Neighbours(p, arr);            // neighbouring positions of p that are also in arr
            if (N.length > 2) return p;                 // T or + shape found
            if (N.length == 2 && N.avg() % 1) return p; // L shape found (avg() is a custom array helper)
        }
        return arr[Math.floor(arr.length / 2)];         // | shape: use the middle position
    };

    It might not be 100% game-accurate, but it is accurate enough for simulation.
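For anyone who wants to try the idea outside the engine, here is a standalone sketch of the same placement rule. `neighboursIn` is a hypothetical stand-in for the engine's `Neighbours` helper (assumed here to return orthogonal neighbours without wrapping across row edges), and the parity check replaces the custom `avg()` call: two neighbours on different axes always have an odd sum, while two on the same axis sum to 2*p.

```javascript
// Standalone sketch of the critical-placement rule on an 8x8 board (p = 8*y + x).
const WIDTH = 8;

// Hypothetical replacement for the engine's Neighbours helper.
function neighboursIn(p, positions) {
  const set = new Set(positions);
  const x = p % WIDTH;
  const candidates = [p - WIDTH, p + WIDTH];
  if (x > 0) candidates.push(p - 1);         // left, unless p is in the first column
  if (x < WIDTH - 1) candidates.push(p + 1); // right, unless p is in the last column
  return candidates.filter((q) => set.has(q));
}

function whereToCrit(positions) {
  if (positions.length < 5) return false;
  const sorted = [...positions].sort((a, b) => a - b); // numeric sort, input left untouched
  for (const p of sorted) {
    const n = neighboursIn(p, sorted);
    if (n.length > 2) return p; // T or + junction
    if (n.length === 2 && (n[0] + n[1]) % 2 !== 0) return p; // L corner
  }
  return sorted[Math.floor(sorted.length / 2)]; // straight line: middle tile
}

console.log(whereToCrit([16, 17, 18, 19, 20])); // horizontal line in row 2
console.log(whereToCrit([2, 10, 18, 17, 16]));  // L shape with its corner at 18
console.log(whereToCrit([1, 9, 17, 8, 10]));    // + shape centred on 9
```

Like the original, this picks the first qualifying tile in position order, so it may not match the game's choice when a shape has more than one junction.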
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    Basic test of a few abilities on a randomly generated board (10,000 simulations):
    Rags/Green: edit - removed because it was data from buggy code -

    Daken/Blue
    additional cascade: 2107
    AP generated
    165 (blue), 6710 (green), 237, 243, 236, 274, 194
    948 criticals matched (one can be matched twice).
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    edited November 2014
    Reran my scripts after some bugfixes and created some heatmaps for Ragnarok green and Daken blue.

    Rags green: [image: rags.png]

    Daken blue: [image: dakblue.png]
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    Magneto red / Groot yellow / X-Force black
    The TU AP/black generation shown on the image is wrong; the actual figure is about 1,200 TU AP collected per 10,000 ability uses.
    [image: magred2.png]
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    What is the correct swap algorithm for Loki?

    I used:
    L = L.difference(B).shuffle();       // list of non-team-up tiles, shuffled (difference/shuffle are custom helpers)
    var len = L.length;
    var n = 14;                          // number of tiles to swap
    if (n > len) n = len;                // fewer non-team-up tiles than swaps: limit swaps to the tiles available
    for (var j = 0; j < n; j++) {        // var keeps j from leaking into global scope
        var p1 = L[j];                   // "source" tile (random, because the array was shuffled)
        var p2 = L[randomInt(len, [j])]; // any other tile: random index below len, excluding the source
        GG.Swap(p1, p2);
    }

    Chance of cascade (cover 1-5): 66 - 71 - 73 - 74.5 - 75.5 %

    Edit: did some bugfixes and updated the numbers.

    However, modifying the strictness of tile reselection can change this ratio by about 2%
    (less strict => bigger gaps),
    excluding completely disabling tile reselection.
  • NorthernPolarity
    NorthernPolarity Posts: 3,531 Chairperson of the Boards
    turul wrote:
    What is the correct swap algorithm for Loki?

    I used:
    L = L.difference(B).shuffle();       // list of non-team-up tiles, shuffled (difference/shuffle are custom helpers)
    var len = L.length;
    var n = 14;                          // number of tiles to swap
    if (n > len) n = len;                // fewer non-team-up tiles than swaps: limit swaps to the tiles available
    for (var j = 0; j < n; j++) {        // var keeps j from leaking into global scope
        var p1 = L[j];                   // "source" tile (random, because the array was shuffled)
        var p2 = L[randomInt(len, [j])]; // any other tile: random index below len, excluding the source
        GG.Swap(p1, p2);
    }

    Chance of cascade (cover 1-5): 66 - 71 - 73 - 74.5 - 75.5 %

    Edit: did some bugfixes and updated the numbers.

    However, modifying the strictness of tile reselection can change this ratio by about 2%
    (less strict => bigger gaps),
    excluding completely disabling tile reselection.

    That looks correct, and roughly matches my results. I think ensuring that non-TU tiles are swapped is relevant for calculating ensuing cascades, so you can't exactly hand-wave that fact away by reducing the number of swaps.

    Updated the original post with results for using Loki's Illusions right after Polarizing Force. Added color breakdown stats to Illusions.
  • turul
    turul Posts: 1,622 Chairperson of the Boards
    Cascade chances for Loki (cover 3-5): (by my algorithm!)

    after Mags red: 83 - 84 - 86 %
    after XF black: 77.5 - 78.5 - 80 %
    Mags->XF->Loki: 86 - 87.5 - 89 %

    (differences may be due to my algorithm creating and resolving criticals)
  • I'm a little confused; I thought level 1 illusions swapped 7 pairs of colored tiles, but my memory comes from Loki Lightning rounds and the approximate timing of the swap. Is that not the way it's implemented? Unless I'm misunderstanding, your implementation has it so that tiles can be swapped back into the same place and has twice the swaps.

    I would have thought it would be p2 = random(len-n)+n
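A minimal sketch of that variant, assuming partners are drawn only from the shuffled entries after the first `n`, so a source tile is never picked as a partner and a swap is never undone. `shuffle`, `swapPairs`, and the array-based board are hypothetical stand-ins for the engine's helpers, and the 7 pairs for level 1 is the figure remembered above, not a confirmed value.

```javascript
// Fisher-Yates shuffle on a copy of the input.
function shuffle(items, rng = Math.random) {
  const a = [...items];
  for (let i = a.length - 1; i > 0; i--) {
    const k = Math.floor(rng() * (i + 1));
    [a[i], a[k]] = [a[k], a[i]];
  }
  return a;
}

// Swap `pairs` source tiles with partners taken from outside the source group.
function swapPairs(board, swappable, pairs, rng = Math.random) {
  const L = shuffle(swappable, rng);
  const n = Math.min(pairs, Math.floor(L.length / 2)); // keep the partner pool non-empty
  const swaps = [];
  for (let j = 0; j < n; j++) {
    const p1 = L[j];
    const p2 = L[n + Math.floor(rng() * (L.length - n))]; // p2 = random(len - n) + n
    [board[p1], board[p2]] = [board[p2], board[p1]];
    swaps.push([p1, p2]);
  }
  return swaps;
}

const board = Array.from({ length: 64 }, (_, p) => p); // each tile labelled by its start position
const swappable = board.map((_, p) => p);              // assume no team-up tiles on this board
const swaps = swapPairs(board, swappable, 7);
console.log(swaps, board.filter((tile, p) => tile !== p).length);
```

Partners can still repeat in this version, so the number of tiles that move varies between runs; pairing `L[j]` with `L[n + j]` instead would give strictly disjoint pairs.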