Discard proven low-score answers when computing question "hotness" in the network "hot" questions formula
Even when the votes of 20... 30... 100 users clearly indicate that only one or two answers are popular, all answers (up to 10 of them) are assumed to contribute to the question's "hotness" score, including those downvoted into oblivion.
When there is strong evidence that particular answers do not provide good data points for question popularity, it would make sense to discard them.
Note that the hotness formula already sets a precedent for this kind of discarding: accepted answers are given less weight in hotness, on the assumption that acceptance is not a good data point. I feel that accepting answers is a fine social contract, but not a good indicator of question or answer quality.
I would argue that indiscriminately including low-score answers in questions with lots of views and votes also goes against an underlying assumption in the formula's "specification".
The negative impact of counting low-quality answers is amplified by the popularity of the Hot Questions sidebar.
Some SE users add their answers to whatever question sits at the top of the sidebar. Since this audience involves hundreds of users, the number of answers brought into a question can grow quickly.
By counting these answers toward the hotness score, the formula pushes affected questions closer to the top of the sidebar, which in turn brings more visitors, who add their answers, and so on.
The resulting growth in hotness also overwhelms the time-decay mechanism embedded in the formula, reinforcing the positive feedback loop.
As a result, a question can remain at the top of the sidebar for hours, even when the number of genuinely popular answers stays the same.
Discarding such answers would let the time-decay component of the formula work as intended.
Last but not least, counting low-score posts toward "question hotness" makes questions with multiple low-quality answers stick at the top of the sidebar for a long time, giving a wrong impression of what kind of posts are welcome at Stack Exchange.
Because these questions are highly visible to the sidebar audience, they look like good questions, and the effect is amplified: misguided users spread that attitude further into other questions and answers, following what they saw in the "hot" questions.
Do not count proven low-score answers in the hotness formula. Let user voting and time decay contribute to the hotness score by rolling the dice fairly, and please promote less brain-damaging content to the sidebar audience.
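For reference, the hotness formula as publicly documented on Meta Stack Exchange can be sketched as below. This is a sketch only: a staff reply further down notes the live network formula differs in several ways (views were dropped, answers are capped at 10).

```python
from math import log

def hotness(q_views, q_answers, q_score, answer_scores,
            q_age_hours, hours_since_last_activity):
    """Sketch of the publicly documented hotness formula.

    The live implementation differs in details; parameter names here
    are illustrative.
    """
    numerator = (log(max(q_views, 1)) * 4          # views term
                 + q_answers * q_score / 5.0       # QScore * NumAnswers / 5
                 + sum(answer_scores))             # sum(Ascores)
    # Age drag: blends total age with time since last activity, in hours
    age_drag = ((q_age_hours + 1)
                - (q_age_hours - hours_since_last_activity) / 2.0) ** 1.5
    return numerator / age_drag

# Padding a question with zero-score answers raises its hotness:
h_few = hotness(1000, 2, 10, [5, 3], 4, 1)
h_padded = hotness(1000, 10, 10, [5, 3, 0, 0, 0, 0, 0, 0, 0, 0], 4, 1)
```

The usage lines illustrate the complaint above: eight extra zero-score answers raise the score purely through the answer-count term.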
That's a great idea! It would bring balance back into the hotness values and cut down the run-away questions.
Being an engineer, I have a tendency to want numbers to back up assertions, so I gathered some data and am editing my answer to reflect it.
I picked a question that looked like it would ride the top of the hotness collider for a while. I'm not suggesting anything was wrong with that question; it simply promised useful data.
I don't trust the values I calculated for the denominator. The equation is supposed to have an age-drag effect, so I took the calculated numerator and divided it by the observed collider value to back out what the denominator should be. The problem is that most of the time the result implies the question is being boosted, not dragged, by age.
Since the denominator wasn't relevant to the proposed change, I assumed it was safe to ignore.
Looking at the "percentage of numerator from..." columns, we can see that the QScore * NumAnswers / 5 term grows in its impact on the collider value. In other words, the total number of answers and the question's score matter more than the quality of the answers: noisy answers don't help the question, but they do help its hotness value. The scores on the answers themselves are quite telling: a long tail of answers with low to no up-votes.
During the time I watched this question, the number of answers leveled off and held steady at 9, and the relative effect of QScore stayed fairly consistent. I think ignoring zero-score answers in the collider formula would be a good way to curb excessive hotness scores.
With that change, the QScore * NumAnswers / 5 term would remain at zero until an answer gets an up-vote. That may seem unfair to the question, but the first answer to earn an up-vote will make the question gain traction in this category.
The first thing to notice is that as the question ages, the other numerator factors matter less and less. I think that is the right behavior: the quality of the answers should determine how hot a question remains.
The impact of total views is a little greater and doesn't decay as quickly; by the end of my observations, the views term retained about half of its original impact on the hotness score.
If you're still following along, you might wonder how my suggestion would treat negatively voted answers. The impact is not much: at the end, only 2 answers had a negative score, so take 2 points off SumAnswers. The final numerator value from my observations is about 20% off the QScore term.
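The kind of per-term breakdown described above can be sketched as follows. The observation values are hypothetical, not the ones from my table, and the terms follow the publicly documented numerator:

```python
from math import log

def numerator_breakdown(q_views, q_answers, q_score, answer_scores):
    """Percentage contribution of each documented numerator term."""
    terms = {
        "log(views) * 4": log(max(q_views, 1)) * 4,
        "QScore * NumAnswers / 5": q_score * q_answers / 5.0,
        "sum(AnswerScores)": float(sum(answer_scores)),
    }
    total = sum(terms.values())
    return {name: round(100 * value / total, 1) for name, value in terms.items()}

# Hypothetical hot question: 3000 views, 9 answers, question at +25,
# answers showing the long tail of low scores described above.
shares = numerator_breakdown(3000, 9, 25, [12, 4, 2, 1, 0, 0, 0, 0, -1])
```

With numbers in this ballpark, the QScore * NumAnswers / 5 term is the single largest contributor, which matches the observation that answer count trumps answer quality.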
Of course, this is only one question and one set of observations; you can't draw a trend from a single point.
Still, there is enough information here to show the suggestion is worth considering. It would slow down runaway hotness scores without punishing high-quality questions, and it would probably give up-and-coming high-quality questions a better chance, since the impact of answer scores would carry greater weight.
Sometimes it takes an engineer and their equations to make a discussion about change clearer. Indeed.
I think you have a good idea for reining in the hotness formula, but I don't think your approach goes far enough, even though the data shows it would already be pretty decent without my addition.
Rather than being ignored, negatively scored answers should drag the hotness value down: an answer with a negative score should affect hotness with the same magnitude as a positively voted answer, just flipped around.
Poor questions attract poor answers, and the current formula encourages that behavior; that is part of the under-damped feedback cycle you describe. Extending your request into an over-damped control system would make the formula actively punish poor answers.
It also provides an incentive to down-vote poor answers, since doing so reduces the potential popularity of a poor question.
We like to see people contribute in concrete ways like this, but this one is almost impossible for you to solve without knowing the implementation details.
The network hot question list uses a different hotness formula than the "hot" page; the documented collider algorithm is the one that favors questions that are "instantly" hot. A lot of the research here is therefore based on a false premise.
We cap the number of answers that contribute to the score at 10; any answer beyond the first 10 does not contribute to the hotness score.
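The cap can be sketched like this. The constant and the assumption that it is the first 10 answers that count are illustrative; the real query works off denormalized columns:

```python
ANSWER_CAP = 10  # stated cap; which 10 answers count is an assumption here

def capped_answer_terms(answer_scores):
    """Return the answer count and score sum as the cap would see them."""
    counted = answer_scores[:ANSWER_CAP]
    return len(counted), sum(counted)

# 12 answers, but only the first 10 contribute to either term
n_counted, score_sum = capped_answer_terms(list(range(12)))
```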
Views don't contribute either; they were removed from the calculation because they were inefficient to query.
The formula is shaped by what data is easy to query. Using the denormalized answer-count column is cheap, while forcing the query to consider each answer and decide whether to count it is not feasible in a query that has to cover so many posts. Adding a denormalized field for this seems like overkill; any attack on the problem would have to use the columns already in the database.
I'm marking this status-declined since it's predicated on a specific modification to the algorithm. If you want to pursue it, I suggest opening a more open-ended feature-request that demonstrates the problem you perceive, and then we can start playing with the formula.
Here is a more detailed explanation of the suggested cut value.
There were other discussions, but my main focus was on the data in the two posts above.
I observed that the answers in the posts listed above tend to have low scores.
The way the current formula tries to take this into account in sum(Ascores) is insufficient, because these answers are also indiscriminately stuffed into Qanswers.
So I tried to figure out how the formula could be adjusted to take low scores into account.
The first approach was to simply ignore answers with a negative score; it's hard to imagine how those could be considered popular. If I had proposed that back then, maybe the formula would have been changed months ago.
It didn't work on the data I studied: there were too few answers with negative scores to make a difference. My guess is that sympathy upvotes are at work in hot questions; when hundreds or thousands of eyeballs land on a negative-score answer, there is a good chance someone will like it.
A cut at zero score didn't fare much better: again there were too few answers below it to make a difference, and random upvotes in high-view questions made such a cut impossible to rely on.
A cut at, say, +2 would probably be OK for a question with 1K views, but for one with 3K views it would fail miserably and the positive feedback loop of fake popularity would kick in again. Cutting at something like +5 would in turn make no sense at lower views, at or below 1K. That was a dead end.
Messing with constant-score cuts led me to believe the cut should be proportional to the score of the top-voted post. I tried TopPost/100 first; it made sense on the set provided in the sticky-questions post but made too little difference in the examples from the answer-quality post, so I decided to see how a harder cut would work.
I later switched from TopPost to TopAnswer, because I wanted the cut to be based on a comparison of more uniform kinds of posts: comparing scores between answers felt more reliable than comparing answers against the question.
If memory serves, cuts up to around TopAnswer/15 all worked well on my data sets, so I decided to pick a single value among these, the one that would be easiest to explain.
TopAnswer/10 made the cut because it blends well with Stack Exchange's "value system": a 10x score difference is the one that makes a Nice Answer, and the same 10x difference separates a Nice answer from a Great one.
Next, I needed to figure out how to score questions at early stages, when there are not yet enough votes to make sense of. My studies of how the hotness score works had made me believe that the current formula works quite well at early stages, so I wanted it to keep working until at least one of the answers gets enough votes to qualify for the Nice Answer badge. Subtracting 1 from the cut turned out to be a good approximation of that. Quoting myself:
While the question has only answers with non-negative scores, the formula ignores only the answers at -2 or below. Once one of the answers reaches a score of +10, the formula starts discarding answers with a negative score. When there is an answer at +20, the formula would ignore those having less than +2, and so on.
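A minimal sketch of that proportional cut, assuming the threshold works out to roughly TopAnswerScore/10 minus a fixed offset; the exact rounding is my guess from the first two examples in the quote:

```python
def filtered_answer_scores(answer_scores):
    """Keep only answers above a cut proportional to the top answer score.

    Assumed parameterization: keep score > top/10 - 2, which reproduces
    the "-2 or below" and "negative at top +10" cases; the exact
    rounding of the real proposal may differ.
    """
    if not answer_scores:
        return []
    cut = max(max(answer_scores), 0) / 10.0 - 2
    return [s for s in answer_scores if s > cut]

# Only non-negative answers: just -2 and below are dropped
low = filtered_answer_scores([0, -1, -2])
# Top answer at +10: negative answers are dropped
mid = filtered_answer_scores([10, 0, -1])
```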
My evaluation of quality was subjective: I read the answers and tried to judge whether they made a fair attempt to answer the question. One-liners without an explanation were considered low quality even when I agreed with them 200%.
After I arrived at the conclusion that the cut should be proportional to the score of the top-voted post, I took a while to ponder performance. Even without knowing whether it is feasible, I wanted a variant that would be comparable to the current one in terms of performance.
I took into account that an evidence-based score is really needed only for a limited number of posts, and that the current formula tends to get into trouble only when posts are near the top of the collider.
This led me to the idea of pre-selecting candidates for the collider using the current formula, which is already proven to have acceptable performance, and then re-calculating and re-ordering only those posts, just a few hundred questions.
My idea was to pre-select at least 3x as many questions as the collider needs. The point is to make sure the candidate pool includes questions that would otherwise have been crowded out: even if the current hot questions have their initial approximate scores inflated by low-interest answers, they are still guaranteed to compete against questions that gained popularity in a more natural way.
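The two-stage idea might be sketched like this. The function names are illustrative: cheap_hotness stands in for the current formula and adjusted_hotness for the re-scored, evidence-based variant:

```python
def rerank_for_collider(questions, cheap_hotness, adjusted_hotness, slots):
    """Two-stage ranking: cheap formula pre-selects, adjusted one reorders.

    questions: iterable of question records
    cheap_hotness / adjusted_hotness: scoring callables
    slots: number of questions the hot list actually shows
    """
    # Stage 1: current formula (known-acceptable performance) picks a
    # candidate pool at least 3x the size of the hot list.
    pool = sorted(questions, key=cheap_hotness, reverse=True)[:3 * slots]
    # Stage 2: the expensive score is computed only for the small pool.
    return sorted(pool, key=adjusted_hotness, reverse=True)[:slots]

# Hypothetical records; "cheap"/"adj" stand in for the two scores.
questions = [{"id": 0, "cheap": 10, "adj": 1},
             {"id": 1, "cheap": 9, "adj": 9},
             {"id": 2, "cheap": 8, "adj": 8},
             {"id": 3, "cheap": 1, "adj": 100}]
picked = rerank_for_collider(questions,
                             lambda q: q["cheap"], lambda q: q["adj"],
                             slots=1)
```

Note that question 3, despite the highest adjusted score, never reaches stage 2; that is the trade-off of reusing the cheap formula as a pre-filter.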
There is a related feature request: reorder questions picked for the hot list based on an adjusted hotness score.
One way to approximate the suggested feature is to make the aging factor depend on the number of answers, so that questions with more answers start aging away sooner and more strongly.
The underlying concept is the same as in the proposed feature: a question either 1) is voted highly enough to compensate for the increased age decay and keeps its popularity high, or 2) has its hotness score decrease faster, lowering its exposure and decreasing the chances of further damage.
The purpose is the same: to make low-quality answers contribute to resolving the problem instead of making it worse.
Hot questions with multiple answers age away faster on smaller sites.
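One hypothetical way to parameterize that idea, built on the publicly documented formula: the 1.5 exponent is from that formula, while the per-answer penalty k is made up purely for illustration:

```python
from math import log

def hotness_with_answer_aging(q_views, q_answers, q_score, answer_scores,
                              q_age_hours, hours_since_last_activity,
                              k=0.05):
    """Documented formula, but the age-decay exponent grows with answer count.

    k is an illustrative knob: each answer past the first steepens decay,
    so heavily answered questions must earn votes to stay hot.
    """
    numerator = (log(max(q_views, 1)) * 4
                 + q_answers * q_score / 5.0
                 + sum(answer_scores))
    exponent = 1.5 + k * max(q_answers - 1, 0)   # more answers -> faster aging
    age_drag = ((q_age_hours + 1)
                - (q_age_hours - hours_since_last_activity) / 2.0) ** exponent
    return numerator / age_drag

# A day-old question with 10 low-interest answers decays harder (k=0.05)
# than under the unmodified exponent (k=0).
h_fast = hotness_with_answer_aging(1000, 10, 10, [0] * 10, 24, 1, k=0.05)
h_plain = hotness_with_answer_aging(1000, 10, 10, [0] * 10, 24, 1, k=0.0)
```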
I think it's a great idea, though any algorithm in the ballpark would probably be fine. At least a little easier to implement would be QScore * (sum of AnswerScores): we want to show off great answers at least as much as great questions, so this would allow answers to bring a question into the collider.
I think this is a great idea. In my experience, the collider habitually latches onto a fixed set of questions which all have the same thing in common. If answers lacking distinct quality no longer counted toward the chance to enter the collider, this would have a clear and distinct effect on questions of that nature, which are frequently not particularly good to begin with.
The synergy between questions that attract many answerers and the advertising the collider provides can only end well when the much-answered questions are adding quality, up-voted content. When it's just everyone adding their two-cent opinions, doing no benefit to the information archive that is SE, everyone benefits from stemming the tide the collider brings into the question.