During Summer 2022, I interned at Meta on the news recommendations team. This team owns all of the code related to how ranking and recommendations work on the news tab of the Facebook app. One of the features currently being built out is user feedback on specific news articles. If the algorithm messes up and shows users an article they do not like, they can say that they want to see less of this type of content. Conversely, a user can say they want to see more of a specific type of content. My project this summer involved taking this feedback and incorporating it into our machine learning ranking stack in a less naive way.
The current ranking pipeline has two stages: candidate retrieval and the ranking model pass. A pool of candidate articles is produced by the generation pipeline, and these candidates are then filtered and ranked. The final output is a list of articles in the order that they will be presented to the user in the near future.
I added custom candidate generator pipelines that pull in news articles covering news events similar to those that users gave a "see more" reaction to. In addition, I added new filters to remove news articles from the candidate pool if they covered news events similar to those that users gave a "see less" reaction to. This allowed see more/see less feedback to be reflected more accurately in the articles that our pipeline outputs.
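To make the generator-plus-filter idea concrete, here is a minimal sketch in Python. It is not the production code: the `Article` type, the `event_ids` field, and the `build_candidates` function are hypothetical names made up for illustration, and "similar news events" is simplified to "shares at least one event ID", which is an assumption rather than how similarity was actually computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    article_id: str
    # Hypothetical field: IDs of the news events this article covers.
    event_ids: frozenset


def shares_event(article, feedback_events):
    """True if the article covers any event in the feedback set.

    Assumption: exact event-ID overlap stands in for event similarity.
    """
    return bool(article.event_ids & set(feedback_events))


def build_candidates(base_pool, corpus, see_more_events, see_less_events):
    """Add "see more" candidates, then filter out "see less" candidates."""
    # Extra generator: pull in corpus articles covering a "see more" event.
    boosted = [a for a in corpus if shares_event(a, see_more_events)]
    # Merge and de-duplicate by ID, keeping first-seen order.
    merged = {}
    for article in [*base_pool, *boosted]:
        merged.setdefault(article.article_id, article)
    # Filter: drop anything covering a "see less" event.
    return [a for a in merged.values() if not shares_event(a, see_less_events)]
```

In this sketch the filter runs after the generators, so an article that matches both a "see more" and a "see less" event is dropped. That ordering is a design choice of the example, not something stated about the real pipeline.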
The second part of my project involved the actual ranking pass. Currently, very few users use these feedback features, and among those who do, a sizable proportion are very heavy users. This means our training dataset is heavily biased towards heavy users. I performed an extensive analysis to group users into light, medium, and heavy buckets based on their article hide usage. I then trained a new model with a custom demotion applied to each training sample based on the article hide usage of that sample's user. This allowed the model to be more accurate for the light and medium user groups. After online experimentation, I observed a 3% increase in model AUC, along with improvements in precision and recall for the light and medium user groups.
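The bucketing and demotion step can be sketched roughly as follows. This is only an illustration under stated assumptions: the thresholds (`light_max`, `medium_max`), the weight values in `BUCKET_WEIGHT`, and the function names are all invented for the example, and "demotion" is interpreted here as a smaller per-sample training weight, which may differ from what the real model did.

```python
# Hypothetical weights: samples from heavy users count less during training.
BUCKET_WEIGHT = {"light": 1.0, "medium": 1.0, "heavy": 0.25}


def bucket_for(hide_count, light_max=2, medium_max=10):
    """Map a user's article-hide count to a usage bucket.

    The thresholds are placeholders, not the values from the real analysis.
    """
    if hide_count <= light_max:
        return "light"
    if hide_count <= medium_max:
        return "medium"
    return "heavy"


def sample_weights(sample_user_ids, hides_per_user):
    """Return one training weight per sample, based on its user's bucket."""
    return [
        BUCKET_WEIGHT[bucket_for(hides_per_user.get(user_id, 0))]
        for user_id in sample_user_ids
    ]
```

The resulting list would then be handed to whatever training loop is in use as per-sample weights, so that the loss contribution of heavy users' samples is scaled down relative to light and medium users.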
I really enjoyed my Summer 2022 internship! It was the first in-person experience I'd ever had, and I made the most of it while living in New York City. I would love to come back to the city again soon!