Skip to content

Conversation

@Magentah
Copy link
Contributor

@Magentah Magentah commented May 12, 2025

Initial commit for match stat events.

Replays can be parsed with w3gjs but needs 0x77 and 0x78 swapped as they're not correct.

When parsing, data needs to be base64 decoded then zlib inflateRaw.

Requires #35 because of typescript-to-lua not embedding lua files in the version we currently use, which is needed for this.

Magentah added 2 commits May 13, 2025 06:55
…s but it needs to be updated to fix the 0x77 and 0x78 actions being opposite.

Parsing the replay needs to base64 decode, then zlib inflateRaw.
---@class W3Metrics
---@field prefix string Prefix used when sending events. Used to identify syncdata packets as events. Should be 2 characters at most otherwise will cause latency issues
---@field byteBudget integer Maximum number of bytes to send in a single SyncData packet. There's a hard limit of 255, but more than 180 will likely cause lag and desync issues. Defaults to 150.
---@field flushInterval number How frequently to flush events in seconds. Defaults to 2.0 seconds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend splitting the raw "transferring data" from the intent of the data. You're mixing communication layer (flushing, byte budgets, pooling) with application layer logic here.

If the events you're targeting here are purely for generating statistics etc, I would say that we should not have a time-based flushing interval at all. As long as the data is flushed eventually, we should not need to worry about when it's actually emitted.

If we re-use this library for the eventual backend communication stuff, it might make sense to split the two areas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed in discord but yup, that's the plan to do before making this not a draft

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has been separated to W3CData that handles the data schemas, compressing, packing, chunking, etc and W3CEvents that handles batching, flushing, etc.

Probably some things that could be moved around but I think it's cleaner than it was before


---comment
---@return integer
local function now() return math.floor(TimerGetElapsed(w3cMetrics.clock)) end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The game already has a clock. But it's worth noting that the game clock can be halted during lag or pauses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I saw when searching, people recommended to use this approach for being able to get game time. I wasn't able to find how to get the elapsed time from the existing clock, I'm probably just blind though

value: {},
}

payload.value["hero"] = GetUnitName(unit);
Copy link
Member

@marcoabreu marcoabreu May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be aware of localization and later processing - I'd recommend not working with names but IDs.

Same for the other cases, I'll refrain from commenting it in other places.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Names are mostly just for me being able to make sense of the data when testing. I also don't know if there's a way during parsing to map ids to names or if we need to extract data for that or something

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I'm following your question. Could you rephrase?

Copy link
Contributor Author

@Magentah Magentah May 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UnitTypeIds are the id for a specific unit type, but I don't know if we can map that id easily outside of f the map itself to be able to know what the unit data actually is when parsing the payloads later. I think that data is something we need to extract from the map files and keep as a lookup table.

payload.value["skill"] = GetObjectName(GetLearnedSkill());
payload.value["skillLevel"] = GetLearnedSkillLevel();

W3Metrics.event("HeroSkill", payload);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for us to have schemas? This way, we can ensure compatibility, and it allows us to automatically generate visualizations templates etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about schemas tbh. I like the simplicity of basically having a simple json object with only strings/numbers. Having and enforcing schemas also makes the library itself quite a bit more complex I think, which I'd like to avoid.

As it is we can probably just let mapmakers know the schemas we expect for some things that they can use if the want, otherwise we can just parse it as keyvalues and make available for them without us providing any visualizations for them

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it's less about format and more about "these fields are expected".

So where would the visualizations live? My understanding is that we would want to host all the visualizations, for each game mode.

So there should be some sort of strategy as to how we deal with new or removed fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not relevant with w3cdata.lua. I also wasn't planning on hosting any visualizations for non-melee games, I was just expecting to make the data available to map makers so they can do what they want with it.

TriggerAddCondition(heroDamageTaken, Condition(checkIsPlayerHeroTarget));
TriggerAddAction(heroDamageTaken, onHeroDamaged);

W3CMetrics.track("HeroDamage", () => heroDamageMap, 5.0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That feels like it will EXPLODE in size. You're retrieving the same map every 5 seconds and you're never cleaning it during a game. And even if for a specific hero it has not changed since the last time, you're still emitting it.

It seems like the library (not this code here) should be able to detect diffs between maps/tables and remove data that has not changed.

Or rather, I'd say that we should reset this map on emit and submit the difference since the last time (it hence becomes a rate, rather than the absolute values).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably can, and only sending deltas / changes in the base lua library is something I want to add which should handle this

isHero = IsUnitType(unit, UNIT_TYPE_HERO);
}

const eventType =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not differentiate between heroes and units and hence emit different events but simply have the unit type be a property of the event.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, makes sense. I do that in another event I'm pretty sure. I'll need to go through and probably update the event data to be a bit nicer as they're very much just trying to see if I can get data at all and not caring about structure/schema

}
}

W3CMetrics.event("ResearchDone", payload);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there no ResearchStarted and ResearchCancelled?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, I just didn't add it yet

import * as W3CMetrics from "../lua/w3cMetrics"
import { Players } from "w3ts/globals";

export function trackUnitDeaths() {
Copy link
Member

@marcoabreu marcoabreu May 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be aware that custom maps do unit deaths VASTLY different. They basically never ever really kill a unit, because it leaks memory.

Not relevant for melee, but just for you to keep in mind in terms of allowing to specify custom events for "UNIT_DEATH" which are not actually EVENT_PLAYER_UNIT_DEATH. Basically to register more than 1 event that corresponds to the same thing you want to track - in melee, the size of that list would be 1 (EVENT_PLAYER_UNIT_DEATH). In customs, it might be more.

The same applies to dmg calculation and unit creation btw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything in TS is only going to be used for W3C maps and not custom maps. Only the lua library would actually be used for custom maps, so I don't think it'll be a problem, but good to know!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that something we would want to change?

Of course not everything would make sense, but I think it would be good if we could have a library of defaults collectors that custom maps could use, rather than them having to define it themselves.

const killingPlayer = GetOwningPlayer(killingUnit);
const killingPlayerId = GetPlayerId(killingPlayer);

if (IsUnitType(killingUnit, UNIT_TYPE_HERO)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels .. odd. Is that the right way to do this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea, I don't know anything about WC3 scripting. It works though, and at least makes sense to me that it's probably a right way to tell if a unit is a hero or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I initially had my comment at a different line. Should've been more specific.

I mean, this whole approach of starting a timer and then reading the XP; is that the way to read the XP gained?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The timer is used here because there's a slight delay before a unit receives xp after the unit dies. If you read it immediately, the xp hasn't updated yet. I dunno if there's a better way to do this, there's no event for xp gain from what I saw when searching.

@diewolke9
Copy link
Member

diewolke9 commented May 14, 2025

I am generally very impressed and satisfied with how quickly @Magentah made something quite comprehensive and mature. Obviously you're a far better and focused coder than I am, kudos to you for that.

Two things that I am concerned about here:

  • encoding, as I said above, this is probably not the most effecient encoding. I would think that there's a logt of space for optimization - for example, storing time as strings is a huge waste of bandwidth, same applies to event names (as I mentioned above). I think it's important to really focus on optimizing the encoding to the maximum, because that would mean a big improvement in performance under higher load (although I have not analysed what HIGH load means in this context)
  • the batching of events - I would make it opt-in. One reason for that is, again, that you can save bandwidth, because if you send each event whenever it fires, you don't need to store the time, as you can get that from the replay then, and the time will be more reliable than a timer (although I cannot attest to how reliable or unreliable timers are).

@marcoabreu
Copy link
Member

We 100% need to keep batching on the communication level. We can't allow the map to just decide when it wants to emit data to our server. Remember, these are statistical events that have zero time pressure to be emitted.

Magentah added 2 commits May 15, 2025 16:49
W3CData is used for the compression and decompression of data, as well
as the registering of schemas for event types that data is used for.

W3CChecksum is a CRC32 implementation that can be used to continuously
update a CRC value. This can then be passed to W3CData to create
checksum packets
@Magentah Magentah force-pushed the stat-events branch 9 times, most recently from 53bcbda to 05f5946 Compare May 16, 2025 16:34
@Magentah
Copy link
Contributor Author

I am generally very impressed and satisfied with how quickly @Magentah made something quite comprehensive and mature. Obviously you're a far better and focused coder than I am, kudos to you for that.

Two things that I am concerned about here:

  • encoding, as I said above, this is probably not the most effecient encoding. I would think that there's a logt of space for optimization - for example, storing time as strings is a huge waste of bandwidth, same applies to event names (as I mentioned above). I think it's important to really focus on optimizing the encoding to the maximum, because that would mean a big improvement in performance under higher load (although I have not analysed what HIGH load means in this context)

The w3cdata.lua should hopefully be sufficient for handling this now.

  • the batching of events - I would make it opt-in. One reason for that is, again, that you can save bandwidth, because if you send each event whenever it fires, you don't need to store the time, as you can get that from the replay then, and the time will be more reliable than a timer (although I cannot attest to how reliable or unreliable timers are).

I think this is less of a concern with w3cdata.lua. The time can be 2 bytes (for up to 65k, or 18 hours if the time is by second), or set to a specific bit length (e.g. 12 bits for up to 4k, or a bit over 1 hour). I think my approach at the moment will likely be to buffer all events until it hits a max size (180 probably, vaguely remember seeing that number somewhere for being a soft limit) then flush, with some limit to how quickly we flush to prevent being able to send dozens of SyncData packets all at once.

@Magentah Magentah force-pushed the stat-events branch 2 times, most recently from 59feaa7 to 0602caa Compare May 17, 2025 15:11
Magentah added 2 commits May 23, 2025 02:13
W3CData:
- requires type=int to set bit sizes
- added 'number' type that allows setting minimum and maximum values
instead
- Updated checksum payloads and added test
- Added schema registry payload and added test
- Made signed default instead of unsigned
- Added function to register multiple schemas at once
- Added assertions that numbers are integers when expected

W3CEvents:
- Init now only inits once
- Added end_game that sends all remaining events and prevents any more
events. Also adds players results as the final events.
- Schemas are now required to be registered at once

json.lua added so we can easily compress and send schemas as we don't
support nested objects
import { ItemProps } from "war3-objectdata-th/dist/cjs/generated/items";
import * as W3Metrics from "../lua/w3cMetrics";

export function trackHeroes() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider renaming file from heros.ts to heroes.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants