Starcraft II Build Order Classification

Proposal
- AlphaStar plays very well (it executes actions almost flawlessly). However, the range of strategies it employs is limited, and it does not adapt to opponent moves, especially in the late game.
- Informed MCTS is an interesting way to exploit a predicted distribution over possible enemy actions.
- However, we first need some prediction of enemy actions. Here, Bayesian classification and multivariate regression can help.
- First, cluster build-order sequences to inform priors.
- Next, feed the clustering output into a regression that predicts the next step taken.
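
The "clusters inform priors" idea can be sketched as a small Bayesian update. This is a minimal illustration, not the project's implementation: the function names `fit_cluster_model` and `cluster_posterior`, the cluster labels, and the toy build orders are all hypothetical, and the cluster ids are assumed to come from an earlier clustering step that is not shown.

```python
import math
from collections import Counter


def fit_cluster_model(build_orders, cluster_ids, alpha=1.0):
    """Estimate P(cluster) and P(action | cluster) with Laplace smoothing."""
    vocab = sorted({a for bo in build_orders for a in bo})
    cluster_counts = Counter(cluster_ids)
    action_counts = {c: Counter() for c in cluster_counts}
    for bo, c in zip(build_orders, cluster_ids):
        action_counts[c].update(bo)
    prior = {c: n / len(build_orders) for c, n in cluster_counts.items()}
    likelihood = {}
    for c, counts in action_counts.items():
        total = sum(counts.values()) + alpha * len(vocab)
        likelihood[c] = {a: (counts[a] + alpha) / total for a in vocab}
    return prior, likelihood


def cluster_posterior(observed, prior, likelihood):
    """Posterior over clusters given the opponent actions observed so far."""
    log_post = {}
    for c in prior:
        lp = math.log(prior[c])
        for a in observed:
            if a in likelihood[c]:  # skip actions never seen in training
                lp += math.log(likelihood[c][a])
        log_post[c] = lp
    m = max(log_post.values())  # subtract the max for numerical stability
    unnorm = {c: math.exp(v - m) for c, v in log_post.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


# Toy data (hypothetical): three build orders already assigned to two clusters.
build_orders = [
    ["Probe", "Pylon", "Gateway", "Zealot"],
    ["Probe", "Pylon", "Gateway", "Stalker"],
    ["Probe", "Pylon", "Nexus", "Probe"],
]
cluster_ids = ["aggressive", "aggressive", "economic"]
prior, likelihood = fit_cluster_model(build_orders, cluster_ids)
posterior = cluster_posterior(["Pylon", "Nexus"], prior, likelihood)
print(posterior)
```

The posterior over clusters could then serve as an input feature, or as a weighting, for whatever model predicts the next step.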
Data Visualization
- spawningtool extracted data: a discrete action list of the units, structures, and upgrades produced. Unfortunately, it does not expose game-level resource information such as available minerals or supply utilization.
- sc2reader: parsing helper functions are adapted from IBM/starcraft2-replay-analysis under the Apache 2.0 License. sc2reader lets us extract resource information. In particular, we are interested in how resources are distributed across the 3 classes {economy, army, technology} as the game progresses.

The extracted time-series information is a large dictionary keyed by each player's statistics in a match, in the structured format shown below. The 'data' field is a pandas DataFrame in which each column is a distinct feature of this particular time series (1 game for 1 player).

'1597403940_2305463': {
     'race': 'P',
     'matchup': 'PvP',
     'data':      mineral_collection_rate  mineral_per_worker_rate  mineral_queued_army  \
     0                          0                 0.000000                    0   
     1                        293                24.416667                    0   
     2                        671                51.615385                    0   
     3                        671                51.615385                    0   
     4                        755                53.928571                    0   
     ..                       ...                      ...                  ...   
     121                     1903                32.254237                    0   
     122                     1875                31.779661                    0   
     123                     1903                33.982143                    0   
     124                     1903                35.240741                    0   
     125                     1903                35.240741                    0 
...
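
As a quick sanity check on how two of these columns relate, the sample rows above are consistent with `mineral_per_worker_rate` being the collection rate divided by the number of workers. That relationship is an inference from the printed numbers, not something the parsers are confirmed to compute, and `implied_workers` is a hypothetical helper written only for this check.

```python
# Rows copied from the sample above:
# (mineral_collection_rate, mineral_per_worker_rate)
rows = [
    (293, 24.416667),
    (671, 51.615385),
    (755, 53.928571),
    (1903, 32.254237),
]


def implied_workers(collection_rate, per_worker_rate):
    """Worker count implied by the two rates.

    Assumes per_worker_rate = collection_rate / workers (an inference).
    """
    if per_worker_rate == 0:
        return 0
    return round(collection_rate / per_worker_rate)


workers = [implied_workers(c, p) for c, p in rows]
print(workers)
```

Each ratio lands on a whole number, which is what one would expect if the assumption holds.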

Action Sequence extracted information (pending compression & cleanup) largely follows the same format.

'1597403940_2305463': {
     'race': 'P',
     'matchup': 'PvP',
     'data': {
          [...] # Raw spawningtool 'buildOrder' field dictionary for now => can be compressed
     }
}

Because the extracted data is large, we compress it with bzip2 from Python's (>=3.0) standard library, which reduces our footprint to around 10MB from a 350MB replay pack (we also discarded a fair amount of information along the way). Each extracted dataset can be decompressed and loaded into a Python dictionary using the included zipUtil functions:

from zipUtil import zip_write, zip_read
# Write
dic = {} # some dictionary
zip_write('filename', dic)
# Read
read_dic = zip_read('filename')
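
The internals of zipUtil are not shown here. A minimal sketch of how such helpers could work, assuming the dictionary is pickled and then bzip2-compressed (the real zipUtil.py may serialize differently), is:

```python
import bz2
import pickle


def zip_write(filename, obj):
    """Pickle obj and write it to filename with bzip2 compression."""
    with bz2.open(filename, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def zip_read(filename):
    """Load an object written by zip_write.

    Only open files you trust: unpickling can execute arbitrary code.
    """
    with bz2.open(filename, "rb") as f:
        return pickle.load(f)
```

Pickle is assumed here because the 'data' field holds pandas objects, which JSON cannot store directly.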

State-Action Generation
- For each action taken, the nearest slice of time-series information is found and paired with it as a (state, action) feature-label pair. In addition to the time-series information (on economic resources), the accumulated actions taken so far are also extracted as a set of discrete features.
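
The pairing step can be sketched as follows. This is an illustrative version under assumptions, not the notebook's code: `nearest_index` and `make_state_action_pairs` are hypothetical names, the timestamps are toy values, and the state is reduced to a single feature for brevity.

```python
from bisect import bisect_left


def nearest_index(times, t):
    """Index of the entry in sorted `times` closest to t (ties go earlier)."""
    i = bisect_left(times, t)
    if i == 0:
        return 0
    if i == len(times):
        return len(times) - 1
    return i if times[i] - t < t - times[i - 1] else i - 1


def make_state_action_pairs(times, states, actions):
    """Build (features, label) pairs.

    times:   sorted timestamps of the time-series slices
    states:  one feature tuple per timestamp
    actions: list of (time, action_name), sorted by time
    """
    vocab = sorted({name for _, name in actions})
    counts = dict.fromkeys(vocab, 0)
    pairs = []
    for t, name in actions:
        state = list(states[nearest_index(times, t)])
        history = [counts[a] for a in vocab]  # actions taken before this one
        pairs.append((state + history, name))
        counts[name] += 1
    return pairs


# Toy example: one continuous feature plus cumulative action counts.
times = [0, 10, 20, 30]
states = [(0,), (293,), (671,), (671,)]
actions = [(4, "Probe"), (16, "Pylon"), (29, "Probe")]
pairs = make_state_action_pairs(times, states, actions)
print(pairs)
```

The history counts are taken before the current action is added, so the label never leaks into its own feature vector.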
Naive Bayes + Baselines
- Baseline: Random Forest Classifier & AdaBoost
- Naive Bayes: conditioned on continuous features, discrete features, or a mixture of both, plus KDE likelihood estimation
- Gaussian Process Classification: Binary One-vs-Rest -> softmax
  - It was hard to select an inducing set, and the model did not outperform the heuristic strategy of picking the most frequent actions
- Results are written to a JSON file and plotted here
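
To make the KDE variant concrete, here is a minimal sketch of Naive Bayes with per-feature kernel density likelihoods. It swaps in a Gaussian kernel with one shared bandwidth for simplicity, which is an assumption; the kernel and bandwidth choices in the notebooks may differ. The class name `KDENaiveBayes` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def gaussian_kde_logpdf(x, samples, bandwidth):
    """Log density at x of a 1-D Gaussian KDE built from samples."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return math.log(max(total, 1e-300)) - math.log(norm)


class KDENaiveBayes:
    """Naive Bayes with an independent 1-D KDE per (class, feature)."""

    def __init__(self, bandwidth=1.0):
        self.bandwidth = bandwidth

    def fit(self, X, y):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row)
        self.log_prior = {
            c: math.log(len(rows) / len(X)) for c, rows in by_class.items()
        }
        # Transpose rows so each class holds one tuple of values per feature.
        self.samples = {c: list(zip(*rows)) for c, rows in by_class.items()}
        return self

    def top_k(self, row, k=3):
        """Return the k labels with the highest posterior score."""
        scores = {}
        for c, cols in self.samples.items():
            scores[c] = self.log_prior[c] + sum(
                gaussian_kde_logpdf(v, col, self.bandwidth)
                for v, col in zip(row, cols)
            )
        return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data: (mineral rate, actions taken so far) -> next action.
X = [(50.0, 0.0), (55.0, 1.0), (400.0, 3.0), (420.0, 4.0)]
y = ["Probe", "Probe", "Gateway", "Gateway"]
model = KDENaiveBayes(bandwidth=25.0).fit(X, y)
ranked = model.top_k((60.0, 1.0), k=2)
print(ranked)
```

In practice each feature would need its own bandwidth (or prior standardization), since the features live on very different scales.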

Results

Top-1 and Top-3 label classification accuracy is computed for each selected model on a 70/10/20 train/validation/test split. Unfortunately, we average only a ~10% improvement over the naive heuristic of picking the k most frequent actions in order.
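
For reference, the metric and the heuristic baseline can be sketched in a few lines. The helper names and the toy labels are illustrative, not taken from the notebooks.

```python
from collections import Counter


def most_frequent_actions(train_labels, k):
    """Heuristic baseline: the k most frequent training labels, in order."""
    return [a for a, _ in Counter(train_labels).most_common(k)]


def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the first k predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


train = ["Probe", "Probe", "Probe", "Pylon", "Pylon", "Gateway"]
test = ["Probe", "Pylon", "Zealot", "Probe"]

guess = most_frequent_actions(train, 3)
preds = [guess] * len(test)  # the heuristic predicts the same ranking everywhere
top1 = top_k_accuracy(preds, test, 1)
top3 = top_k_accuracy(preds, test, 3)
print(top1, top3)
```

Because the heuristic ignores the features entirely, any model that uses the state should be judged by how far it rises above these two numbers.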

- Features are highly correlated, and Naive Bayes does not do well beyond top-1.
- The dataset may be underdetermined, in the sense that identical (for discrete features) or nearly identical feature vectors are labeled differently.
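
One way to quantify the label-conflict concern for discrete features is to compute the best top-1 accuracy achievable when identical feature vectors must receive the same prediction. This is a hedged sketch with a hypothetical helper name and toy data; it measures an in-sample ceiling only and says nothing about nearly identical vectors.

```python
from collections import Counter, defaultdict


def top1_ceiling(features, labels):
    """Upper bound on top-1 accuracy for this data.

    Any classifier that maps identical feature vectors to the same label
    can at best pick the majority label of each group of duplicates.
    """
    by_vector = defaultdict(Counter)
    for x, y in zip(features, labels):
        by_vector[tuple(x)][y] += 1
    best = sum(max(counts.values()) for counts in by_vector.values())
    return best / len(labels)


features = [(1, 0), (1, 0), (1, 0), (2, 1)]
labels = ["Probe", "Pylon", "Probe", "Gateway"]
ceiling = top1_ceiling(features, labels)
print(ceiling)
```

A ceiling well below 1.0 on the real dataset would support the reading that the features alone cannot separate the labels.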

[P]	Baseline (RF)	Heuristic	Continuous (Multinomial)	Discrete (Complement)	Mixture	KDE (exp)
Top-1	0.412	0.399	0.399	0.421	0.420	0.444
Top-3	0.626	0.586	0.604	0.587	0.621	0.678

[T]	Baseline (RF)	Heuristic	Continuous (Multinomial)	Discrete (Complement)	Mixture	KDE (exp)
Top-1	0.426	0.352	0.410	0.424	0.427	0.446
Top-3	0.668	0.644	0.649	0.584	0.623	0.707

[Z]	Baseline (RF)	Heuristic	Continuous (Multinomial)	Discrete (Complement)	Mixture	KDE (exp)
Top-1	0.396	0.322	0.346	0.411	0.409	0.442
Top-3	0.638	0.596	0.608	0.648	0.669	0.716
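
The size of the improvement over the heuristic can be recomputed directly from the tables. The sketch below takes one possible reading, the absolute gain of the KDE model over the heuristic averaged across the three races; whether this is the averaging behind the "~10%" figure is an assumption.

```python
# Accuracies copied from the tables above: race -> metric -> (heuristic, KDE)
results = {
    "P": {"top1": (0.399, 0.444), "top3": (0.586, 0.678)},
    "T": {"top1": (0.352, 0.446), "top3": (0.644, 0.707)},
    "Z": {"top1": (0.322, 0.442), "top3": (0.596, 0.716)},
}


def mean_gain(metric):
    """Mean absolute gain of KDE over the heuristic across races."""
    gains = [kde - heur for heur, kde in (results[r][metric] for r in results)]
    return sum(gains) / len(gains)


top1_gain = mean_gain("top1")
top3_gain = mean_gain("top3")
print(f"top-1 gain: {top1_gain:.3f}, top-3 gain: {top3_gain:.3f}")
```

Under this reading the KDE model gains roughly 0.09 in accuracy on both metrics, with the smallest top-1 gain for Protoss and the largest for Zerg.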

Purpose

Automate clustering of replays into different types of Build Orders in an unsupervised and possibly online fashion.

Resources

Dataset: 1979 Replays (Patch >= 5.0.2 BU)

Pro-players, non-AI games

Curated Icon Set: Structure, Unit, and Upgrade icons + file mapping

Just unzip it into the same directory as the notebooks and it should work.
Python data structures are used for the mapping; this will be updated to a more portable .json format at a later time.
All copyrights belong to Blizzard Entertainment Inc.; the icons are included under fair use for scientific publication and educational purposes in this project.

devYaoYH/SCII_BO_Classifier

Folders and files

Latest commit

History

Repository files navigation

Starcraft II Build Order Classification

Contents

Results

Purpose

Resources

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages