Preprocess data

Prepares data for the Twitter bot model.

preprocess_bot(x, batch_size = 100, ...)

Arguments

x	Input data: either a character vector of Twitter identifiers (user IDs or screen names) or a data frame of Twitter data.
batch_size	Number of users to process per batch. Relevant only if x contains user names or timeline data for more than 100 Twitter users. Because the processing aggregates at the user level (grouping by user), large inputs can create computational bottlenecks, which are easily avoided by breaking the data into batches of users. Changing this number may speed up or slow down processing, but for most jobs the difference is likely negligible, so this argument is mainly useful on a very slow/low-memory machine or a very fast/high-memory machine. Default is 100.
...	Other arguments are passed on to rtweet functions. This is mostly just to allow users to specify the Twitter API token, e.g., `predict_bot("kearneymw", token = token)` or `predict_bot("kearneymw", token = rtweet::bearer_token())`.
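
To make the batching described for batch_size concrete, the sketch below splits a vector of users into groups of at most batch_size. It illustrates the general idea in base R only; it is not the package's internal code, and the user identifiers are made up for the example.

```r
# Hypothetical illustration of user-level batching (not preprocess_bot internals)
users <- paste0("user_", seq_len(250))  # made-up identifiers
batch_size <- 100

# Assign each user a batch index: 1 for users 1-100, 2 for users 101-200, ...
batch_id <- ceiling(seq_along(users) / batch_size)

# Split the users into a list with one element per batch
batches <- split(users, batch_id)

# Number of users in each batch
lengths(batches)
```

Each element of batches could then be processed on its own and the results row-bound, which keeps any single grouping operation small.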

Value

Returns a data frame of features used to generate predictions.

Examples


if (FALSE) {

## vector of screen names
x <- c("netflix_bot", "aasfdiouyasdoifu", "madeupusernamethatiswrong",
  "a_quilt_bot", "jack", "SHAQ", "aasfdiouyasdoifu5", NA_character_,
  "madeupusernamethatiswrong", "a_quilt_bot")

## preprocess_bot() returns a data.table of features
ftrs <- preprocess_bot(x)

## use features to generate predictions
predict_bot(ftrs)

}
