Appeal for Feedback: A New Reporting System

November 12, 2018

Last week there was a heated Reddit discussion about ProtonDB's rating system. This led me to reflect on where we were and where we are now, a little over two months after we switched to the medal rating system.

While I'd say that the medal rating system has been an improvement over the previous system, it was never the final destination. As reports have proliferated there are patterns that make me think we could do better. Here's my take on some of the current weaknesses:

  • Definition: While more objective than what we had before, there is still noticeable variance between reporters' conclusions about which tier should be given. There's particular drift in what constitutes a 'workaround' and in how the degree of effort to set up is judged. This leads to inconsistent grading, and some reporters are perceived to be too generous with Platinum ratings.
  • Aggregation: It's enticing but inherently flawed to gather up the current reports for a game and come to a qualitative conclusion. One platinum and one silver report in aggregate with the current algorithm would convey a 'Gold' tier. But the Gold tier definition does not describe the experience of either reporter. Having the same tiers for aggregate ratings as individual ratings is not serving us well.
  • Clarity: Does a set of reports give enough insight to recommend whether a game is worth purchasing for play on Linux? It's often unclear, because the meat on the bones is usually in the notes rather than the rating itself. It's very unstructured data. And what is a 'Platinum' game anyway?
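
To make the aggregation problem concrete, here is a minimal sketch. It is not ProtonDB's actual algorithm: it assumes the aggregate is a simple mean of ordinal tier ranks, and both the tier ordering and the function name `aggregate_tier` are illustrative assumptions.

```python
from statistics import mean

# Assumed ordering of medal tiers, lowest to highest (illustrative only).
TIERS = ["Borked", "Bronze", "Silver", "Gold", "Platinum"]


def aggregate_tier(reports):
    """Average the ordinal ranks of the reports and round to the nearest tier.

    This is a hypothetical stand-in for the real aggregation algorithm.
    """
    ranks = [TIERS.index(tier) for tier in reports]
    return TIERS[round(mean(ranks))]


# One Platinum and one Silver report average out to a tier nobody reported.
print(aggregate_tier(["Platinum", "Silver"]))  # Gold
```

Neither reporter described a Gold experience, yet Gold is what the aggregate conveys.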

In hopes of alleviating each of these, I'd like to propose a new reporting flow, one that leaves medals behind to focus instead on two things:

  1. A binary answer to: is the game in a reasonable enough state to purchase and play right now?
  2. Collect as much actionable structured data as possible on what is or is not working

For the first focus, similar to the Steam store itself, why not leave it open to interpretation by the reporter and rely on the collective average to come to a rough conclusion? A hip shot.

For the second focus, a commitment to working with the facts we know: Does the game have graphical artifacts? Does the sound get out of sync? Does the Steam Controller work? Is the multiplayer functional? Then let the data speak for itself over time as new releases of Proton improve the experience.

The two balance each other out nicely. One gives a rough idea of what everyone thinks, the other gives the tools to identify and measure more precisely what works and what still needs fixing.
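
As a rough sketch of how the two could sit side by side in one report, consider the following. The `Report` class, its field names, and the helper functions are all hypothetical and are not taken from the experimental branch. Unanswered questions are stored as `None` so that a quick report and a comprehensive report can share the same shape.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Report:
    """Hypothetical report: one gut-call verdict plus optional structured facts."""
    recommended: bool                               # the binary verdict
    graphical_artifacts: Optional[bool] = None      # None means "not answered"
    audio_out_of_sync: Optional[bool] = None
    steam_controller_works: Optional[bool] = None
    multiplayer_functional: Optional[bool] = None


def recommend_share(reports):
    """Fraction of reporters who would recommend the game, or None if no reports."""
    if not reports:
        return None
    return sum(r.recommended for r in reports) / len(reports)


def fact_share(reports, field_name):
    """Fraction of 'yes' answers to one structured question, ignoring blanks."""
    answers = [getattr(r, field_name) for r in reports]
    answers = [a for a in answers if a is not None]
    if not answers:
        return None
    return sum(answers) / len(answers)


reports = [
    Report(recommended=True, graphical_artifacts=False),
    Report(recommended=True),  # a quick report: verdict only
    Report(recommended=False, graphical_artifacts=True),
]
print(recommend_share(reports))                    # share of positive verdicts
print(fact_share(reports, "graphical_artifacts"))  # share reporting artifacts
```

Keeping the verdict separate from the facts means the rough score and the per-issue numbers can each be tracked independently as new Proton releases arrive.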

To that end, I've made an experimental branch of ProtonDB that uses a new reporting flow. Here you can make either a quick or a comprehensive report, without the pressure of deciding which tier the game belongs to. Tell us what you know, then how you feel, with a final verdict in response to:

"At this point in time, independent of the quality of the game itself and based on its compatibility and the effort you've taken to set up, would you recommend to the typical gamer that this game could be purchased to play on Linux with Proton?"

Scores can be derived from the gut calls and the data can be pooled into algorithms that determine tiers based as much as possible on the facts provided in comprehensive reports. Those tiers can be ones that are broadly understandable. Borked/Flawed/Capable/Strong/Champion?

I encourage you all to try out the new flow and let me know what you think, either on Reddit or on Discord. Nothing here is a done deal. There are no tiers, and there's no submit functionality yet. Try different combinations. What's missing? Is it too long? Does it achieve what we want it to?

Looking forward to hearing from you!

migelius (@buck)
