@vladh hmm, i'd guess that you already know this one, but you could just use a very old x86_64 laptop (12 y/o or so) and measure the performance of applications on that. maybe you can even disable cpu boosting and stuff, that'd give somewhat accurate results.
i think the biggest problem with software these days is that it's simply written in very inefficient and, by design, very energy-hungry technology such as electron, react and whatnot. people have to buy new hardware every few years to run new software, which is bad for user experience and obviously also creates a lot of carbon emissions for the new hardware.
i think the best we can do is improve actually good native tech like gtk or qt, or create new toolkits, so that people can use those to build their apps. large-scale things such as fixing microsoft defender (which is still largely a piece of shit and energy hungry like nothing i've ever seen) can only really be done by big tech companies.
if you want to measure software, for a badge or something, that is going to be damn hard to do right. but not impossible: i think you could use stuff like getrusage(2), which gives back the cpu time used by the process and some other very useful counters. maybe linux has some more fancy stuff for that, but i am on openbsd and that works perfectly fine here.
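
to make the getrusage(2) idea a bit more concrete, here's a rough sketch using python's `resource` module, which wraps the same call on unix-likes. the `measure` helper and the toy workload are made up for illustration, and it assumes nothing else in the script is spawning children at the same time, so treat it as a starting point rather than a proper benchmark harness.

```python
import resource
import subprocess
import sys


def measure(cmd):
    """Run cmd to completion and report what getrusage(2) says it consumed.

    Hypothetical helper: RUSAGE_CHILDREN is cumulative over all children
    that have been waited for, so we diff a snapshot taken before and after.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        # cpu time in seconds, split into user and kernel time
        "user_s": after.ru_utime - before.ru_utime,
        "sys_s": after.ru_stime - before.ru_stime,
        # high-water mark across waited-for children, not a delta;
        # the unit differs between platforms
        "max_rss": after.ru_maxrss,
        "vol_ctx_switches": after.ru_nvcsw - before.ru_nvcsw,
        "invol_ctx_switches": after.ru_nivcsw - before.ru_nivcsw,
    }


if __name__ == "__main__":
    # toy workload: a short cpu-bound python one-liner
    usage = measure([sys.executable, "-c", "sum(i * i for i in range(2_000_000))"])
    for name, value in usage.items():
        print(f"{name}: {value}")
```

cpu time is only a proxy for energy use, but it's cheap to collect and at least comparable between runs on the same machine.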