-
- AlphaStar plays really well (executes actions close to flawlessly), however, the type of strategies it employs are limited and does not adapt to opponent moves especially in the late game.
- Informed MCTS is an interesting strategy to enable exploitation of possible enemy action distribution.
- However, we first require some prediction of enemy actions. Here, Bayesian classification and multivariate regression can help.
- First cluster buildorder sequences to inform priors.
- Next feed information from clustering into regression for next possible step taken.
-
- spawningtool extracted data: Discrete action list of units, structures, and upgrades produced. Unfortunately does not make available game-level resource information such as minerals available or supply utilization.
- sc2reader parsing helper functions adapted from IBM/starcraft2-replay-analysis under Apache 2.0 License. sc2reader helps us with the extraction of resource information. In particular, we are interested in the distribution of resources across the 3 classes {economy, army, technology} as the game progresses.
-
- Time Series extracted information is a huge dictionary for each player's statistics in a match and follows the following structured format. The
'data'field is a pandas dataframe with each column as a distinct feature in this particular instance of a time series (1 game for 1 player).
'1597403940_2305463': { 'race': 'P', 'matchup': 'PvP', 'data': mineral_collection_rate mineral_per_worker_rate mineral_queued_army \ 0 0 0.000000 0 1 293 24.416667 0 2 671 51.615385 0 3 671 51.615385 0 4 755 53.928571 0 .. ... ... ... 121 1903 32.254237 0 122 1875 31.779661 0 123 1903 33.982143 0 124 1903 35.240741 0 125 1903 35.240741 0 ...- Action Sequence extracted information (pending compression & cleanup) largely follows the same format.
'1597403940_2305463': { 'race': 'P', 'matchup': 'PvP', 'data': { [...] # Raw spawningtool 'buildOrder' field dictionary for now => can be compressed } }- As the data is quite large, we use python's (>=3.0) in-built compression tool bzip2 to reduce our footprint (to around 10MB from a 350MB replay pack - of course, we also threw away quite some information along the way). Each set of extracted data can be easily extracted and loaded into a python dictionary using the included
zipUtilfunctions:
from zipUtil import zip_write, zip_read # Write dic = {} # some dictionary zip_write('filename', dic) # Read read_dic = zip_read('filename')
- Time Series extracted information is a huge dictionary for each player's statistics in a match and follows the following structured format. The
-
- The nearest slice of timeseries information is found for each action taken and put into a (state,action) feature-label pair. In addition to the timeseries information (on economic resources), accumulated actions taken thus far is also extracted as a set of discrete features.
-
- Baseline: Random Forest Classifier & AdaBoost
- Naive Bayes: Condition on continuous, discrete, mixture + KDE likelihood estimation
- Gaussian Process Classification: Binary One-vs-Rest -> softmax
- Hard to select inducing set, and did not perform above heuristic strategy of picking most frequent actions
- Results are put into a json and plotted here
Top-1 and Top-3 label classification accuracy is tested and computed on a 70/10/20 train/validation/test split between the selected models. Unfortunately, we only average a measly ~10% improvement over the naive heuristic option of picking k-most frequent actions in order.
- Features are highly correlated, Naive Bayes also doesn't do so well beyond top-1
- Dataset may be underdetermined, in the sense that the same (for discrete features) or close feature vectors are labeled differently.
| [P] | Baseline (RF) | Heuristic | Continuous (Multinomial) | Discrete (Complement) | Mixture | KDE (exp) |
|---|---|---|---|---|---|---|
| Top-1 | 0.412 | 0.399 | 0.399 | 0.421 | 0.420 | 0.444 |
| Top-3 | 0.626 | 0.586 | 0.604 | 0.587 | 0.621 | 0.678 |
| [T] | Baseline (RF) | Heuristic | Continuous (Multinomial) | Discrete (Complement) | Mixture | KDE (exp) |
|---|---|---|---|---|---|---|
| Top-1 | 0.426 | 0.352 | 0.410 | 0.424 | 0.427 | 0.446 |
| Top-3 | 0.668 | 0.644 | 0.649 | 0.584 | 0.623 | 0.707 |
| [Z] | Baseline (RF) | Heuristic | Continuous (Multinomial) | Discrete (Complement) | Mixture | KDE (exp) |
|---|---|---|---|---|---|---|
| Top-1 | 0.396 | 0.322 | 0.346 | 0.411 | 0.409 | 0.442 |
| Top-3 | 0.638 | 0.596 | 0.608 | 0.648 | 0.669 | 0.716 |
Automate clustering of replays into different types of Build Orders in an unsupervised and possibly online fashion.
Dataset: 1979 Replays (Patch >= 5.0.2 BU)
- Pro-players, non-AI games
Curated Icon set: Structure,Unit,Upgrades Icon Set + File mapping
- Just unzip into same directory as the notebooks and it should work
- Python data stuctures used for mapping, will update to more portable .json format at a later time
- All copyrights belong to Blizzard Entertainment Inc., fair usage for purposes of scientific publication/educational use in this project.