Explain the estimated probability that one or more Twitter accounts are "bots"

explain_bot(x, batch_size = 100, ...)

Arguments

x

Input data: either a character vector of Twitter identifiers (user IDs or screen names) or a data frame of Twitter data.

batch_size

Number of users to process per batch. Relevant if x contains user names or timeline data for more than 100 Twitter users. Because the data processing aggregates at the user level (grouping by user), it can create computational bottlenecks, which are easily avoided by breaking the data into batches of users. Changing this number may speed up or slow down data processing, but for most jobs the difference is likely negligible, so this argument is mainly useful on either a very slow/low-memory machine or a very fast/high-memory machine. Default is 100.
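
To make the batching idea concrete, here is a minimal base R sketch of splitting a vector of user identifiers into groups of at most batch_size users. The helper name split_into_batches and the made-up user names are hypothetical; this only illustrates the concept and is not the package's internal implementation.

```r
# Hypothetical helper (not part of the package): split a vector of user
# identifiers into consecutive batches of at most `batch_size` users.
split_into_batches <- function(users, batch_size = 100) {
  stopifnot(batch_size >= 1)
  # Assign each user a batch index: 1, 1, ..., 2, 2, ...
  batch_id <- ceiling(seq_along(users) / batch_size)
  # split() groups users by batch index; unname() drops the "1", "2", ... names
  unname(split(users, batch_id))
}

# Made-up identifiers, for illustration only
users <- paste0("user", 1:250)
batches <- split_into_batches(users, batch_size = 100)

# Number of users in each batch
lengths(batches)
```

Each batch could then be processed independently and the per-batch results row-bound together, which keeps any single grouped aggregation small.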

...

Other arguments passed on to rtweet functions. This mostly allows users to specify the Twitter API token, e.g., explain_bot("kearneymw", token = token) or explain_bot("kearneymw", token = rtweet::bearer_token()).

Value

A data frame with the user ID, screen name, probability estimate, feature name, and feature contribution.

Examples

if (FALSE) {
  ## estimate prediction and return with feature contributions
  kmw <- explain_bot("kearneymw")

  ## view data
  kmw

  ## the log-odds of prob_bot should be roughly equal to the sum of 'value'
  kmw[,
    .(prob_bot = stats::qlogis(prob_bot[1]), contr = sum(value)),
    by = screen_name
  ]
}
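
The check in the example above relies on feature contributions being additive on the log-odds scale. The following sketch shows that relationship with made-up numbers and base R only. The contribution values and their names are hypothetical, and whether the real output includes a baseline term is an assumption here, which is one reason the match is described as rough.

```r
# Made-up contributions on the log-odds scale (illustrative only; real values
# come from the 'value' column returned by explain_bot()).
contributions <- c(
  baseline  = -1.2,  # assumed baseline term
  feature_a =  0.8,
  feature_b =  1.5,
  feature_c = -0.4
)

# Summing the contributions gives the predicted log-odds
log_odds <- sum(contributions)

# plogis() is the inverse logit: it maps log-odds to a probability
prob <- stats::plogis(log_odds)

# qlogis() is the logit: it maps the probability back to log-odds,
# so this round trip recovers the sum of the contributions
all.equal(stats::qlogis(prob), log_odds)
```

Positive contributions push the estimated probability toward 1 and negative ones toward 0, so sorting the rows by value gives a quick view of which features drive a given account's score.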