## Average abundance of strategy *A* as a function of the payoff entry *a*_{0}.

Our simulation results clearly show that strategy *A* is more abundant than strategy *B* when *a*_{0} exceeds approximately 2.0 for regular networks (*k* = 2) (Panel (a)). This is in perfect agreement with our calculation based on the theorem, i.e., the inequality *a*_{0} + 2*a*_{1} + *a*_{2} > *b*_{0} + 2*b*_{1} + *b*_{2} with *a*_{1} = 2, *a*_{2} = 1, *b*_{0} = 4, *b*_{1} = 1 and *b*_{2} = 1, which simplifies to *a*_{0} > 2. Note that the criterion *a*_{0} > 2 is valid for aspirations drawn from various distributions. It implies that the criterion to favor one strategy over the other is independent of the individualised aspirations, as stated in the theorem. Furthermore, the criterion also holds beyond regular networks (Panel (a)), namely, on random (Panel (b)) and scale-free networks (Panel (c)). This suggests that the criterion can be extrapolated to general population structures.

The details of the simulations are as follows. The minimum degree of all the networks is set to two, such that every individual has enough neighbours to play the three-player game with. We consider a population with homogeneous aspiration *e*_{i} = 2 for all *i* = 1, 2, …, *N* (blue ◯), and populations with heterogeneous aspirations generated from a uniform distribution on the interval [0, 5] (red □), a bimodal distribution constructed from the normal distribution with mean 2.5 and standard deviation 0.5 (orange ◊), and a power-law distribution with probability density function *f*(*x*) = 2*x*^{−3} (purple △). The minimum value of aspiration sampled from the power-law distribution is 1.0. In the beginning, we randomly set 45% of the population to be of strategy *A* and the rest to be of strategy *B*. At each time step, a randomly chosen focal individual selects two individuals from its neighbourhood and plays a single three-player game with them to obtain its payoff. The Fermi function is employed as the decision-making function for all individuals.
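A minimal sketch of one such update step, under stated assumptions: the payoff table, the function names, and the Fermi convention below (an individual switches strategy with probability that increases as its payoff falls below its aspiration) are illustrative choices, not taken from the paper's code. The entry `a_0` is the varied payoff parameter from the figure.

```python
import math
import random

# Hypothetical payoff table: payoff[s][n_A] is the focal player's payoff when
# it uses strategy s and n_A of its two co-players use strategy A.
# Values follow the caption: a_1 = 2, a_2 = 1, b_0 = 4, b_1 = 1, b_2 = 1;
# a_0 is varied, so it is passed in separately.
PAYOFF = {
    "A": {1: 2.0, 2: 1.0},
    "B": {0: 4.0, 1: 1.0, 2: 1.0},
}

def fermi(payoff_i, aspiration_i, beta):
    """Probability of switching strategy: a Fermi function of the
    aspiration-payoff difference, with selection intensity beta
    (one common convention in aspiration dynamics; assumed here)."""
    return 1.0 / (1.0 + math.exp(-beta * (aspiration_i - payoff_i)))

def update_step(strategies, neighbors, aspirations, beta, a0, rng=random):
    """One time step: a random focal individual plays a single three-player
    game with two random neighbours, then switches with Fermi probability."""
    i = rng.randrange(len(strategies))
    co_players = rng.sample(neighbors[i], 2)      # needs degree >= 2
    n_a = sum(strategies[j] == "A" for j in co_players)
    s = strategies[i]
    pi = a0 if (s == "A" and n_a == 0) else PAYOFF[s][n_a]
    if rng.random() < fermi(pi, aspirations[i], beta):
        strategies[i] = "B" if s == "A" else "A"
```

When payoff exactly equals aspiration, the switching probability is 1/2 for any selection intensity, which is why the homogeneous-aspiration case *e*_{i} = 2 sits at the boundary of the payoff range used here.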
Each data point is the mean of the average abundance of strategy *A* over three independent runs (5 × 10^{9} samples in each run, 1.5 × 10^{10} samples in total). In each run, we start sampling after a relaxation time of 5 × 10^{7} time steps, and the average abundance of strategy *A* is obtained by averaging its abundance over the subsequent 5 × 10^{9} time steps. The population size is *N* = 1000 and the selection intensity is *β* = 0.005.
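As a quick arithmetic check (a sketch, not part of the original analysis), plugging the caption's payoff values into the theorem's inequality confirms that it reduces to the threshold *a*_{0} > 2:

```python
# Payoff entries from the caption; a_0 is the varied parameter.
a1, a2 = 2, 1
b0, b1, b2 = 4, 1, 1

def strategy_A_favored(a0):
    # Theorem's condition: a0 + 2*a1 + a2 > b0 + 2*b1 + b2.
    # With these values: a0 + 5 > 7, i.e. a0 > 2.
    return a0 + 2 * a1 + a2 > b0 + 2 * b1 + b2

assert not strategy_A_favored(2.0)   # at the threshold, neither is favored
assert strategy_A_favored(2.5)       # above it, strategy A is favored
```

The check makes explicit that the threshold depends only on the payoff entries, not on the aspiration distribution, matching the panels.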