trevelyaninc said: why can’t you just say “I don’t like this UX” or “78% of users surveyed disliked Metro”
You can say “I don’t like that”, obviously - that’s an opinion, and it’s fine, but that is all it is: someone else can say “I like it, though”, and that has just as much substance.
You can also say “78% of users disliked Metro”, but that gives you only a very vague measure of whether it is actually good. You will have measured one aspect of usability - user satisfaction - of some interface, in some vague, non-specific way, for some undefined group of users, and you will not know whether the result is significant in any sense. And if you want to be an “expert” who people depend on to make decisions that affect lots of things, then there is a standard you should have to live up to:
- Maybe you asked 50 people, 39 said “I like this”… but what if you ask 50 more? Could the result be entirely different? You need a significance measure, and you need to define what significant means _before_ you even start surveying, to keep your own biases from getting in the way.
- Maybe the users really hated the interface, but when asked to actually use it, were twice as fast. Maybe they really loved the interface, but the quality of the results was terrible! Usability comprises effectiveness (how good are the results), efficiency (how quickly do users get results) and user satisfaction (how much does the user like the interface), and for complex tasks and interfaces, the correlation between those is not well understood at this time - you _have_ to measure all three separately.
- To begin with, what is “good”? What do we want to measure, and when do we consider it significant? If we make an interface that gets users quality results faster but they hate it (say, make an interface that automates a lot of the task - but thus takes away a lot of control from the user), do we consider this “better”?
- I keep saying good or better, so let’s throw that in also: Comparisons! Saying only that some percentage disliked some thing doesn’t really establish anything, because a greater percentage might have disliked the thing it is being compared with. If 78% of people hated Metro and 74% hated the previous iteration of the UI, when does that difference become significant?
- What group of people is our target group, what are our target tasks?
- Bias! Are we properly sampling our users?
- Experiment design! Are we controlling for things we don’t want to measure? Are we randomizing enough? Does our design avoid introducing further bias? If we’re testing different groups of people, can we even claim a causal link?
- Probably other things I am forgetting!
This might seem like asking a little much, but it’s either this or, fine, state your opinions, but stop pretending that they are anything more than that. At best you have a hypothesis based on some more or less consistent theory, but that is about it - and, as mentioned, nothing is worse than an untested theory that seems to make sense.
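
To put rough numbers on the “39 out of 50” and “78% vs 74%” points from the list, here is a minimal Python sketch. It is only one possible way to look at it: the Wilson score interval, the pooled two-proportion z-test and the 95% level are my choices for illustration, and the sample size of 50 for both groups in the comparison is an assumption (only the percentages are given above). The function names are made up for this example.

```python
import math

Z_95 = 1.96  # normal quantile for a 95% confidence level (assumed choice)


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a single observed proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # 39 of 50 respondents gave the same answer (numbers from the list above).
    low, high = wilson_interval(39, 50)
    print(f"39/50: plausible range {low:.1%} to {high:.1%}")

    # 78% vs 74%, assuming (hypothetically) 50 respondents in each group.
    z, p_value = two_proportion_z_test(39, 50, 37, 50)
    print(f"78% vs 74%: z = {z:.2f}, p = {p_value:.2f}")
```

With these assumed sample sizes, the interval for 39 out of 50 runs from roughly 65% to 87%, and the 78% vs 74% comparison gives a p-value of around 0.64, nowhere near any usual threshold. None of this touches the other points (target group, sampling bias, experiment design); it only shows how little a bare percentage says on its own.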






