Attacking-Lab Scoring Formula v1
A scoring formula designed by Attacking-Lab to address perceived shortcomings.
Summary
Each team's score is calculated from the `offense`, `defense` and `sla` components of each of their services in all rounds played.
The checker returns one of three results for each service, `up`, `recovering` and `down`:
- A service is considered `up` if all flags could be successfully deployed and retrieved, and all other checks were successful.
- A service is considered `down` if any checks for the current round failed.
- A service is considered `recovering` if the checker could not retrieve every flag that was successfully deployed during the so-called retention period. In this case, points are awarded in proportion to the share of flags that could be recovered (`sla_ratio`), as proposed in Tenet 7. The scoreboard visualizes which flags from the retention period are missing.
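The three checker results and the `sla_ratio` can be illustrated with a small sketch. The function name `service_sla` and its input shape (one retrievable-flag count per flagstore) are hypothetical; the point values are taken from the score calculation in this document.

```python
from typing import Literal

CheckerResult = Literal["up", "recovering", "down"]


def service_sla(
    result: CheckerResult,
    active_flags: list[int],
    retention_rounds: int,
) -> float:
    """SLA points a single service earns in a single round.

    active_flags is a hypothetical input: for each flagstore, the number of
    flags from the retention period that the checker could still retrieve.
    """
    if result == "down":
        return 0.0  # a down service earns no SLA at all
    # The extra point is only granted when every check succeeded.
    sla = 1.0 if result == "up" else 0.0
    for active in active_flags:
        sla_ratio = active / retention_rounds
        sla += 2 * sla_ratio  # each flagstore contributes at most 2 points
    return sla
```

For example, with two flagstores and a retention period of 10 rounds, `service_sla("up", [10, 10], 10)` yields the full 5 points, while a `recovering` service that is missing two flags in one flagstore earns `2 * 0.8 + 2 * 1.0` points.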
The retention period should last at least the round-equivalent of 5 minutes, so that teams have enough time to recover from sudden flag submitter downtime and need not exploit every round.
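As a rough illustration of "the round-equivalent of 5 minutes", the minimum number of retention rounds can be derived from the round length. The helper name `min_retention_rounds` and the `round_seconds` parameter are assumptions made for this sketch; the document itself only fixes the 5-minute lower bound.

```python
import math


def min_retention_rounds(round_seconds: float, min_retention_seconds: float = 300.0) -> int:
    """Smallest whole number of rounds that covers the minimum retention time."""
    if round_seconds <= 0:
        raise ValueError("round_seconds must be positive")
    # Round up so the retention period is never shorter than the minimum.
    return math.ceil(min_retention_seconds / round_seconds)
```

With 60-second rounds this gives 5 retention rounds; with 45-second rounds it gives 7, since 6 rounds would only cover 4.5 minutes.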
Additionally, the end of every round should feature a 5-second checker hold during which no checker requests are active, giving teams a time slot in which services can be restarted without downtime. This also ensures that no team is disadvantaged by unfortunate scheduling of checker requests.
The following Python pseudo-code captures the team score calculation:
from dataclasses import dataclass
from typing import Literal

@dataclass
class CTFInfo:
    team_count: int  # includes NOP team
    retention_rounds: int

@dataclass
class RoundStateFlagstore:
    lost: str | None  # flag of the current round if stolen by any team
    active: list[str]  # flags of this flagstore deployed in the retention period
    captures: list[str]  # flags of this flagstore captured from other teams

@dataclass
class RoundStateService:
    flagstores: list[RoundStateFlagstore]
    checker_result: Literal["up", "recovering", "down"]

    @property
    def max_sla(self) -> int:  # called sla_max in the text
        return 2 * len(self.flagstores) + 1

@dataclass
class RoundState:
    services: list[RoundStateService]

# captures maps each flag to the number of teams that captured it
def score(rounds: list[RoundState], ctf: CTFInfo, captures: dict[str, int]) -> tuple[float, float, float]:
    attack = defense = sla = 0
    for rnd in rounds:
        for service in rnd.services:
            if service.checker_result == "up":
                sla += 1
            for flagstore in service.flagstores:
                sla_ratio = len(flagstore.active) / ctf.retention_rounds
                if service.checker_result != "down":
                    sla += 2 * sla_ratio
                if (flag := flagstore.lost) is not None:
                    defense -= (1 + captures[flag] / ctf.team_count) / 2
                for flag in flagstore.captures:
                    attack += (1 + 1 / captures[flag]) / 2
    return (attack, defense, sla)
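To make the per-flag terms of the pseudo-code above concrete, the following sketch isolates them into two small functions. The names `attack_value` and `defense_penalty` are hypothetical; the expressions are copied from the `attack` and `defense` updates in `score`.

```python
def attack_value(capture_count: int) -> float:
    """Points an attacker earns for one flag captured by capture_count teams."""
    return (1 + 1 / capture_count) / 2


def defense_penalty(capture_count: int, team_count: int) -> float:
    """Points (negative) a victim receives for one flag lost to capture_count teams."""
    return -(1 + capture_count / team_count) / 2
```

A flag captured by a single team is worth `attack_value(1) == 1.0` to that team, and the value approaches 0.5 as more teams capture it. For the victim, the penalty moves from roughly -0.5 toward -1 as the share of capturing teams grows.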
Review
- In any round in which a service is unavailable, the corresponding team loses SLA equal to `sla_max` (`max_sla` in the pseudo-code) for that round. Additionally, since some flags could not be deployed, the team will receive only partial SLA for subsequent rounds in the retention period, at most `(retention_rounds - 1) / retention_rounds * sla_max`. Therefore, the total cost of a service becoming unavailable for `n` rounds is at least `sla_max` and at most `n * sla_max + (retention_rounds - 1) / retention_rounds * 2 ≈ n * sla_max + 2`, both of which are greater than the maximum relative gain of an attacker (`len(flagstores) * 2`).
- To incentivize defense and reduce the relative cost of patching, `defense` points start at `-0.5` for a single attacker and scale linearly to `-1` with the number of captures thereafter.
- When a service becomes unavailable due to patching, the lost points can only be recovered relative to the unpatched state if the service is subsequently attacked unsuccessfully for (at worst, with `len(flagstores) = 1`) `(n * sla_max + 2) / (len(flagstores) / 2) - n = 5 * n + 4` rounds more than the patching made the service unavailable for. Since patching should reasonably result in at most a few rounds of downtime (e.g. `2`), the lost points can be recovered in only a few rounds of subsequent uptime (`5 * 2 + 4 = 14`). Additionally, the checker hold makes it feasible for valid patches to be deployed deterministically with zero downtime.
- The value of a captured flag scales with the number of captures. This formula therefore suffers from the same quirk as FaustCTF 2024 and similar, namely that the attack score may decrease over time, confusing players. To mitigate this, the scoreboard displays both the expected and realized attack points.
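The downtime bounds and the break-even estimate from the review points above can be evaluated directly. This is a minimal sketch that only transcribes the expressions as stated in the text; the function names `downtime_cost_bounds` and `break_even_rounds` are hypothetical.

```python
def downtime_cost_bounds(n: int, flagstores: int, retention_rounds: int) -> tuple[float, float]:
    """Lower and upper bound on SLA lost when a service is unavailable for n rounds."""
    sla_max = 2 * flagstores + 1
    lower = float(sla_max)
    # Upper bound as stated in the text; it approaches n * sla_max + 2
    # as retention_rounds grows.
    upper = n * sla_max + (retention_rounds - 1) / retention_rounds * 2
    return lower, upper


def break_even_rounds(n: int, flagstores: int = 1) -> float:
    """Rounds of prevented exploitation needed beyond n rounds of downtime.

    Uses the approximate upper cost bound n * sla_max + 2 and the minimum
    defense loss of 0.5 points per flagstore and round.
    """
    sla_max = 2 * flagstores + 1
    return (n * sla_max + 2) / (flagstores / 2) - n
```

Calling `break_even_rounds(n)` with the default single flagstore reproduces the `5 * n + 4` expression from the text, which is the worst case because `sla_max` is largest relative to the per-round defense loss when there is only one flagstore.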
Tenets
Total score MUST increase with more flags captured
Score increases with attack, which scales with flags captured.
Total score MUST decrease with more flags lost
Score decreases with defense, which scales with flags lost.
Flag value MUST diminish with more successful attacks
A flag's value scales inversely with the number of captures.
Perfect SLA MUST be worth more than any attacker's relative gain
The maximum points gained by any attack (`flagstores * 2`) is less than the minimum cost of downtime (`sla_max = flagstores * 2 + 1`).
The cost of downtime MUST NOT outweigh the benefits of patching
The cost of downtime due to patching can be recovered in few subsequent rounds of prevented exploitation.
SLA SHOULD decrease fairly with every missing flag in the retention period
`sla_ratio` decreases linearly, by `1 / retention_rounds`, with every missing flag in the retention period.
Flag value SHOULD be calculated independent of its flagstore
Flag value does not depend on the number of flagstores in the service.
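The margin behind the tenet on perfect SLA can be checked numerically for any number of flagstores. This is only a sanity check of the two expressions quoted in that tenet; the helper names `max_attack_gain` and `min_downtime_cost` are assumptions made for the sketch.

```python
def max_attack_gain(flagstores: int) -> int:
    """Maximum relative gain of an attacker per round, as quoted in the tenet."""
    return flagstores * 2


def min_downtime_cost(flagstores: int) -> int:
    """Minimum SLA lost in one round of downtime (sla_max)."""
    return flagstores * 2 + 1


def sla_margin(flagstores: int) -> int:
    """How much one round of perfect SLA exceeds the best possible attack gain."""
    return min_downtime_cost(flagstores) - max_attack_gain(flagstores)
```

The margin is exactly one point regardless of the number of flagstores, which corresponds to the extra point awarded only when the checker result is `up`.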