We'd like some clarification on how qualification works across rounds 1 and 2.
1) Does the agent know whether it is playing in round 1 or round 2 (e.g. via a command-line parameter), so that it can try something different in the other round?
We're aware that some comparison scores are available via the interface, but that doesn't seem sufficient to identify which round they come from.
2) How exactly is the score calculated?
Let's say we score X in round 1 and Y in round 2.
As we read the rules, this would mean that the score for this level will be:
X + max(X, Y)
i.e. the sum of the round 1 and round 2 scores, where the round 2 contribution can only improve on round 1, never fall below it.
Is this correct?
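
To make our reading concrete, here is a minimal sketch in Python. The function name `level_score` and the example scores are ours and purely illustrative; they are not taken from the rules, and the formula is only our interpretation.

```python
def level_score(x: float, y: float) -> float:
    """Score for one level under our reading of the rules.

    x is the round 1 score, y is the round 2 score.
    Round 2 counts as max(x, y), so a worse round 2 cannot lower the total.
    """
    return x + max(x, y)


# Hypothetical scores, for illustration only.
print(level_score(10, 15))  # round 2 improves: 10 + 15 = 25
print(level_score(10, 7))   # round 2 is worse: 10 + 10 = 20
```

If the intended rule is instead a plain sum X + Y, or just max(X, Y), the second case would give 17 or 10 respectively, so the difference matters for how much risk we take in round 2.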