Polarity's Character Ability Statistical Analysis (1/6/15)

Unknown · November 2014

Disembiggen wrote:

Of all the simulations you've run, only Prehistoric Arms, Chemical Reaction, and Fireball have a lower average tiles destroyed in cascades than Mjolnir's Might. And no one thinks of those as being cascade starters, so it's not like Mjolnir's Might is in rarefied air or anything.

Thinking about it, though, a cascade has to have a minimum of 3 tiles destroyed. However, there really doesn't have to be an upper limit to how many tiles are destroyed. Something like Mjolnir's Might is rarely going to cause a 10+ tile cascade, but it will happen often enough when you run 200,000 simulations. I think the averages work out all right for all simulations because for an ability with a low probability for cascade, the 0's will drag down the average to where it should be. However, when you take the average of only the cascades, it is probably being pulled too high by a group of outliers. The most common result for Mjolnir's Might is probably something between 4 and 5. But with a set minimum of 3 and no real maximum, large cascades can skew the average. It's possible that for low-probability abilities, or maybe all abilities in general, instead of using the mean for the tiles destroyed when a cascade occurs, the median would be a better indicator of how many tiles are typically being destroyed.

I don't know how you would get a result between 4 and 5. What exactly did you mean here?
The median would probably be 3, 5 or 6, not super useful. Further, it'd take a lot more programming to find the median as then you'd have to keep every result, not just sum up the totals.

If we look at devil dino level 4:
Probability of a cascade occurring: 0.278366
Average tiles destroyed: 1.598566
Average tiles destroyed when a match occurs: 5.74267690738

Let's take 5 tiles in a row, where we know the middle 3 match.

What are the odds that the match will be larger than 3?
Each side has a 1/7 chance of matching, and a match on either side means the entire row is destroyed. That's 1 - (6/7)^2 = 13/49 overall (assuming center placement). This means an extra 5 tiles.
5 * 13/49 ≈ 1.33

Now the odds of also forming a T or + are about 1/16, which adds about 1/8 of a tile on average.

3 + 1.33 + 0.125 ≈ 4.45, so this already takes us to over 4.4 minimum for the average when ANYTHING happens, NOT COUNTING cascades!

Honestly, the results look reasonable.
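
The back-of-envelope estimate above can be written out as a minimal sketch. The function name `estimateMinimumAverage` is hypothetical, and the 1/7 match chance, the 5 extra tiles, and the 1/8 shape bonus are simply the rough figures from this post, not simulator output.

```javascript
// Rough lower bound on "average tiles destroyed when a match occurs",
// using only the figures quoted in the post (all of them approximations).
function estimateMinimumAverage(matchChancePerSide) {
  // Chance that at least one of the two neighbouring tiles extends the match-3.
  var pLonger = 1 - Math.pow(1 - matchChancePerSide, 2);
  // A match-4 or longer clears the whole 8-tile row: 5 tiles beyond the original 3.
  var extraFromRow = 5 * pLonger;
  // The post's rough allowance for T or + shapes.
  var extraFromShape = 1 / 8;
  return {
    pLonger: pLonger,
    estimate: 3 + extraFromRow + extraFromShape
  };
}

var result = estimateMinimumAverage(1 / 7);
console.log(result.pLonger.toFixed(4), result.estimate.toFixed(2));
```

With a 1/7 chance per side, `pLonger` comes out to 13/49 and the estimate lands a little above 4.4, which is the figure the post compares against the simulated 5.74.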

NorthernPolarity · November 2014

Budget Player Cadet wrote:

Just out of curiosity, does the simulation account for any particular color biases when considering the moves used? For example, if I gathered 10 green AP for Mohawk's green, the board will almost certainly have notably less green than other colors.

Either way, this is amazing. Really, really cool stuff, and thanks for making this. You rock!

Thanks! The simulation does not take that into account... yet. The only way I can think of reasonably implementing this is programming the game's AI, and then programming a player AI that will try to always match green and then cast lightning storm or something like that, which is a ways off.

daibar wrote:

Disembiggen wrote:

Of all the simulations you've run, only Prehistoric Arms, Chemical Reaction, and Fireball have a lower average tiles destroyed in cascades than Mjolnir's Might. And no one thinks of those as being cascade starters, so it's not like Mjolnir's Might is in rarefied air or anything.

Thinking about it, though, a cascade has to have a minimum of 3 tiles destroyed. However, there really doesn't have to be an upper limit to how many tiles are destroyed. Something like Mjolnir's Might is rarely going to cause a 10+ tile cascade, but it will happen often enough when you run 200,000 simulations. I think the averages work out all right for all simulations because for an ability with a low probability for cascade, the 0's will drag down the average to where it should be. However, when you take the average of only the cascades, it is probably being pulled too high by a group of outliers. The most common result for Mjolnir's Might is probably something between 4 and 5. But with a set minimum of 3 and no real maximum, large cascades can skew the average. It's possible that for low-probability abilities, or maybe all abilities in general, instead of using the mean for the tiles destroyed when a cascade occurs, the median would be a better indicator of how many tiles are typically being destroyed.

I don't know how you would get a result between 4 and 5. What exactly did you mean here?
The median would probably be 3, 5 or 6, not super useful. Further, it'd take a lot more programming to find the median as then you'd have to keep every result, not just sum up the totals.

If we look at devil dino level 4:
Probability of a cascade occurring: 0.278366
Average tiles destroyed: 1.598566
Average tiles destroyed when a match occurs: 5.74267690738

Let's take 5 tiles in a row, where we know the middle 3 match.

What are the odds that the match will be larger than 3?
Each side has a 1/7 chance of matching, and a match on either side means the entire row is destroyed. That's 1 - (6/7)^2 = 13/49 overall (assuming center placement). This means an extra 5 tiles.
5 * 13/49 ≈ 1.33

Now the odds of also forming a T or + are about 1/16, which adds about 1/8 of a tile on average.

3 + 1.33 + 0.125 ≈ 4.45, so this already takes us to over 4.4 minimum for the average when ANYTHING happens, NOT COUNTING cascades!

Honestly, the results look reasonable.

Actually, I'm already saving the results of each run. One of the next features I'm planning is a statistical breakdown of the types of matches that occur, something along the lines of:
0 tiles destroyed - X%
3 tiles destroyed - Y%
5 tiles destroyed - Z%
... and so on.
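
A breakdown like that also gives the median for free, since a tally of counts per outcome is enough and the individual results don't all need to be kept. Here is a minimal sketch; `tally` and `summarise` are hypothetical names, and the sample results in the usage lines are made up for illustration, not simulator output.

```javascript
// counts[k] = number of runs in which exactly k tiles were destroyed.
function tally(results) {
  var counts = {};
  results.forEach(function (k) { counts[k] = (counts[k] || 0) + 1; });
  return counts;
}

// Percentage per outcome, plus mean and (lower) median over runs where
// at least one tile was destroyed.
function summarise(counts) {
  var keys = Object.keys(counts).map(Number).sort(function (a, b) { return a - b; });
  var total = 0, hits = 0, sum = 0;
  keys.forEach(function (k) {
    total += counts[k];
    if (k > 0) { hits += counts[k]; sum += k * counts[k]; }
  });
  var percent = {};
  keys.forEach(function (k) { percent[k] = 100 * counts[k] / total; });
  // Walk the outcomes in ascending order until half of the hits are covered.
  var median = null, seen = 0;
  for (var i = 0; i < keys.length && median === null; i++) {
    if (keys[i] === 0) continue;
    seen += counts[keys[i]];
    if (seen * 2 >= hits) median = keys[i];
  }
  return { percent: percent, meanWhenHit: hits ? sum / hits : 0, medianWhenHit: median };
}

// Made-up sample: one large cascade pulls the mean well above the median.
var summary = summarise(tally([0, 0, 0, 0, 3, 3, 5, 12]));
console.log(summary.percent, summary.meanWhenHit, summary.medianWhenHit);
```

In the made-up sample the single 12-tile result lifts the mean to 5.75 while the median stays at 3, which is the kind of skew Disembiggen was describing.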

Unknown · November 2014

Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

Is it possible to test the difference between 4 and 5 covers of 2* Daken?

NorthernPolarity · November 2014

KevinMark wrote:

Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

Is it possible to test the difference between 4 and 5 covers of 2* Daken?

The Loki thing is almost assuredly your mind remembering it that way: statistically speaking, higher levels have a higher chance of cascades.

I can't test 4 vs 5 Daken covers meaningfully. Phaserhawk gave some numbers for the probability that the board has that number of tiles on a fully random board, so you should probably use those. A meaningful simulation would be something like "what is the probability of the passive working given that you avoid matching blue", but the technology isn't there yet.

Pwuz_ · November 2014

I apologize if this has been addressed already:

Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.

NorthernPolarity · November 2014

Pwuz_ wrote:

I apologize if this has been addressed already:

Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.

It does not: I'm planning on adding crit tiles, and on handling the case where a match-5 exists on the board after the ability is cast (since that's essentially another free cascade), but the technology isn't there yet.

Unknown · November 2014

KevinMark wrote:

Recently it felt like Loki's Illusions were better at making matches at lower covers to me. Did anyone feel this way? My Loki has one purple cover and I feel it's more successful than the ones I face in LR or PvEs which have more purple covers. I use it with a board with no decent benefit to me and it often makes a 5-match, 4-match and/or some cascading. With AI though while it's changing location of tiles, it sets up a match and then breaks it before changing tiles ends making the board worse. Probably not worse than original though. It is possible it's just these instances happened to me and I took note of them cos of cognitive bias (it was called like that I think).

Is it possible to test the difference between 4 and 5 covers of 2* Daken?

It will always look like a lower level is better, because when you run level 5 Illusions you basically get to see the result of what would've happened at each of levels 1-5 (a lower level would just stop earlier), and chances are very good that one of those choices is better than the 5. But that's pretty meaningless. Say you run it twice, and one time level 3 is better than level 5, and another time level 4 is better than level 5. If you had run level 3 or 4 both times, you'd still do worse than level 5 on the other run, and over the long run level 5 would average out better. You don't get to pick "3 and then 4" just because you can see that was the optimal result from running level 5 twice.
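
The hindsight effect can be shown with a tiny sketch. The outcome numbers below are invented purely to mirror the "3 and then 4" example (they are not simulator results), and the function names are hypothetical.

```javascript
// Invented tiles-destroyed outcomes per cover level for two casts.
var casts = [
  { 3: 6, 4: 2, 5: 5 },  // cast 1: stopping at level 3 would have been best
  { 3: 1, 4: 7, 5: 5 }   // cast 2: stopping at level 4 would have been best
];

// Average outcome if you always play the same fixed level.
function averageAtLevel(casts, level) {
  var sum = 0;
  casts.forEach(function (c) { sum += c[level]; });
  return sum / casts.length;
}

// Average outcome if you could pick the best level after seeing each cast.
function averageBestInHindsight(casts) {
  var sum = 0;
  casts.forEach(function (c) {
    var values = Object.keys(c).map(function (k) { return c[k]; });
    sum += Math.max.apply(null, values);
  });
  return sum / casts.length;
}

console.log(averageAtLevel(casts, 3), averageAtLevel(casts, 4),
            averageAtLevel(casts, 5), averageBestInHindsight(casts));
```

In this invented data the hindsight pick averages 6.5, yet any fixed lower level (3.5 or 4.5) still trails fixed level 5 (5.0), because no player actually gets to choose after the fact.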

Pwuz_ · November 2014

NorthernPolarity wrote:

Pwuz_ wrote:

I apologize if this has been addressed already:

Does the simulation account for the Critical Tiles that any Match 5 generates? If not some of these abilities may be skewed a bit toward the low side since those Critical Tiles match with any 2 tiles resulting in much bigger cascades.

It does not: I'm planning on adding crit tiles, and on handling the case where a match-5 exists on the board after the ability is cast (since that's essentially another free cascade), but the technology isn't there yet.

Quite impressive nonetheless, though. Kudos on your work.

Pylgrim · November 2014

So, if I'm reading this correctly, Loki's Purple hardly ever improves going from 3 to 5 covers? Good to keep in mind if they give him a third ability. Similarly with Storm, it seems that going from 3 green to 5 barely nets you anything beside 4 more random AP. It's arguable whether going into 5 yellow would be more useful.

Could you please do the charts for Ragnarok's powers and Doc Ock's Manipulation? Oh and X-force.

NorthernPolarity · November 2014

Small update for Punisher's judgement. Other patterned destroy abilities such as Ragnarok, X-Force, etc will be next, but need further testing.

Unknown · November 2014

This is awesome. I only see one problem with it, which is not bad considering I'm clueless about this sort of thingie. Depending on the characters in the game, the colors of the board will not be random. For example, for Thor, if trying to proc red, you will collect red, so there will typically be less red on the board and a normal amount of yellow (assuming the AI is not collecting yellow). However, if you are trying to collect yellow and "happen" to end up with 8 red before 12 yellow, the board will have much less yellow than normal. Therefore, the probability of red making a yellow match is much lower. Same thing if trying to collect green and you happen to get enough yellow: the board will have much less green than a random or normal board.

NorthernPolarity · November 2014

stephen43084 wrote:

This is awesome. I only see one problem with it, which is not bad considering I'm clueless about this sort of thingie. Depending on the characters in the game, the colors of the board will not be random. For example, for Thor, if trying to proc red, you will collect red, so there will typically be less red on the board and a normal amount of yellow (assuming the AI is not collecting yellow). However, if you are trying to collect yellow and "happen" to end up with 8 red before 12 yellow, the board will have much less yellow than normal. Therefore, the probability of red making a yellow match is much lower. Same thing if trying to collect green and you happen to get enough yellow: the board will have much less green than a random or normal board.

Yup. I'm planning on having a future version of the simulator be able to simulate a full MPQ match. Once this is possible, those problems go away. I could have an AI that models the current in-game AI, another one representing a player who prioritizes, say, green and black matches, and simulate the effects of the ability when the player AI gets, say, 8 green to cast X-Force: the idea is to recreate the MPQ game engine from scratch, after all. That's a while away, so we'll have to deal with slightly suboptimal findings for now.

turul · November 2014

I am also building an engine, in Javascript.

For critical tiles, I used the following technique.

When searching for matches, I search for each color individually, and criticals are treated as every color.

When we have a shape that creates a critical, I do

this.Where2Crit = function(arr){
	if (arr.length<5) return false;  // arr = array of positions, position p = 8*y + x
	arr.sort(function(a,b){ return a-b; });  // numeric sort (the default sort compares values as strings)
	for (var i=0;i<arr.length;i++){
		var p=arr[i];
		var N=this.Neighbours(p,arr);  // returns the neighbouring positions of p extracted from arr
		if (N.length>2) return p;                // T or + shape found
		if (N.length==2 && N.avg()%1) return p;  // L shape found (avg() is a custom Array extension)
	}
	return arr[Math.floor(arr.length/2)];  // | shape: take the middle position
};

might not be 100% game accurate, but accurate enough for simulation.
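
For anyone who wants to try the same idea outside that engine, here is a minimal self-contained sketch. The names `neighbours` and `critPosition` are hypothetical stand-ins for `this.Neighbours` and `Where2Crit`, the 8-wide board with p = 8*y + x is taken from the comment in the snippet above, and whether the real game places criticals this way is not confirmed.

```javascript
var WIDTH = 8; // board width; positions are encoded as p = 8*y + x

// Orthogonal neighbours of p that are also part of the matched shape.
// Left/right neighbours are skipped at the board edges so rows do not wrap.
function neighbours(p, cells) {
  var x = p % WIDTH;
  var candidates = [p - WIDTH, p + WIDTH];
  if (x > 0) candidates.push(p - 1);
  if (x < WIDTH - 1) candidates.push(p + 1);
  return candidates.filter(function (q) { return cells.indexOf(q) !== -1; });
}

// Returns the position where the critical tile would go, or false if the
// shape has fewer than 5 tiles.
function critPosition(cells) {
  if (cells.length < 5) return false;
  var sorted = cells.slice().sort(function (a, b) { return a - b; });
  for (var i = 0; i < sorted.length; i++) {
    var p = sorted[i];
    var n = neighbours(p, sorted);
    if (n.length > 2) return p; // junction of a T or + shape
    // Two neighbours that are not opposite each other have a non-integer
    // average position, which marks the corner of an L shape.
    if (n.length === 2 && ((n[0] + n[1]) / 2) % 1 !== 0) return p;
  }
  return sorted[Math.floor(sorted.length / 2)]; // straight line: middle tile
}

console.log(critPosition([0, 1, 2, 3, 4]));   // straight row of five
console.log(critPosition([0, 1, 2, 9, 17]));  // T shape
console.log(critPosition([0, 1, 2, 10, 18])); // L shape
```

The L-shape test works because opposite neighbours (p-1 and p+1, or p-8 and p+8) average to p exactly, while a corner pairs a horizontal neighbour with a vertical one and the average ends in .5.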

turul · November 2014

basic test of few abilities, on randomly generated board: (10000 simulations)
Rags/Green: edit -removed because it was data from buggy code-

Daken/Blue
additional cascade: 2107
AP generated
165(blue),6710 (green),237,243,236,274,194
948 criticals matched. (One can be matched twice.)

turul · November 2014

Reran my scripts after some bugfixes and created some heatmaps for Ragnarok green & Daken blue.

[Heatmap images: Rags green, Daken blue]

turul · November 2014

Magneto red / Groot yellow / Xforce black

The TUAP/Black generation shown in the image is wrong; the actual figure is 1200ish TUAP collected per 10000 ability uses.

turul · December 2014

What is the correct swap algorithm for Loki?

I used:

L=L.difference(B).shuffle();  // List of non-teamup tiles, shuffled
var len=L.length;
var n=14;                     // Set number of tiles to swap
if (n>L.length) n=L.length;   // If there are fewer non-teamup tiles, limit swaps to the number available
for(var j=0;j<n;j++){
	var p1=L[j];                   // Select a random "source" tile (array was previously shuffled!)
	var p2=L[randomInt(len,[j])];  // Select another tile, except the "source" tile, and swap those
	GG.Swap(p1,p2);
	}

Chance of cascade (cover 1-5): 66 - 71 - 73 - 74.5 - 75.5 %

edit: done some bugfixes, updated

However, modifying the strictness of tile reselection can affect this ratio by about 2%.
(less strict => bigger gaps )
(excluding complete disable of tile reselection)

NorthernPolarity · December 2014

turul wrote:
What is the correct swap algorithm for Loki?

I used:
L=L.difference(B).shuffle();  // List of non-teamup tiles, shuffled
var len=L.length;
var n=14;                     // Set number of tiles to swap
if (n>L.length) n=L.length;   // If there are fewer non-teamup tiles, limit swaps to the number available
for(var j=0;j<n;j++){
	var p1=L[j];                   // Select a random "source" tile (array was previously shuffled!)
	var p2=L[randomInt(len,[j])];  // Select another tile, except the "source" tile, and swap those
	GG.Swap(p1,p2);
	}
Chance of cascade (cover 1-5): 66 - 71 - 73 - 74.5 - 75.5 %

edit: done some bugfixes, updated

However, modifying the strictness of tile reselection can affect this ratio by about 2%.
(less strict => bigger gaps )
(excluding complete disable of tile reselection)

That looks correct, and roughly matches my results. I think ensuring that non TU tiles are swapped is relevant for calculating ensuing cascades, so you can't exactly hand wave that fact away by reducing the number of swaps.

Updated the original post with using Loki's illusions right after polarizing force. Added color breakdown stats to illusions.

turul · December 2014

Cascade chances for Loki (cover 3-5): (by my algorithm!)

after Mags red: 83 - 84 - 86 %
after XF black: 77.5 - 78.5 - 80 %
Mags->XF->Loki: 86 - 87.5 - 89 %

(differences may be due to my algorithm creating and resolving criticals)

Unknown · December 2014

I'm a little confused; I thought level 1 illusions swapped 7 pairs of colored tiles, but my memory comes from Loki Lightning rounds and the approximate timing of the swap. Is that not the way it's implemented? Unless I'm misunderstanding, your implementation has it so that tiles can be swapped back into the same place and has twice the swaps.

I would have thought it would be p2 = random(len-n)+n
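
To make that alternative reading concrete, here is a minimal sketch of swapping disjoint pairs, so that no tile moves twice and none can be swapped back. It assumes the "7 pairs at level 1" recollection above, which is unconfirmed; `shuffled` and `swapDisjointPairs` are hypothetical names, and the board is just a flat array rather than turul's `GG` object.

```javascript
// Fisher-Yates shuffle returning a new array.
// `rand` returns a float in [0, 1) and defaults to Math.random.
function shuffled(items, rand) {
  rand = rand || Math.random;
  var out = items.slice();
  for (var i = out.length - 1; i > 0; i--) {
    var k = Math.floor(rand() * (i + 1));
    var tmp = out[i]; out[i] = out[k]; out[k] = tmp;
  }
  return out;
}

// Swap up to `pairs` disjoint pairs of tiles in place.
// `board` is a flat array of tiles; `eligible` lists the swappable positions
// (for example, every non-teamup position). Returns the number of swaps done.
function swapDisjointPairs(board, eligible, pairs, rand) {
  var order = shuffled(eligible, rand);
  var count = Math.min(pairs, Math.floor(order.length / 2));
  for (var j = 0; j < count; j++) {
    var p1 = order[2 * j];      // consecutive entries of the shuffled list
    var p2 = order[2 * j + 1];  // form pairs, so no position is reused
    var tmp = board[p1]; board[p1] = board[p2]; board[p2] = tmp;
  }
  return count;
}
```

Compared with the snippet in the earlier post, this moves exactly 2 * pairs tiles, whereas picking any other tile as the partner on each of n iterations can move the same tile more than once.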
