Error in mutate_impl(.data, dots) using "join" code

I have a dataset with 100000 rows, where order_date is the order date and user_id is the user's ID. I am trying to create a new variable that shows each user's total number of orders on the same day. My data looks like this:

order_date=structure(c(15587, 15647, 15734, 15560, 15599, 15778, 15708, 
15520, 15592, 15447, 15718, 15787, 15519, 15486, 15514, 15784, 
15619, 15705, 15552, 15734, 15493, 15661, 15563, 15600, 15790, 
15485, 15546, 15767, 15704, 15726), class = "Date") 

user_id=c(22607, 28275, 32238, 20202, 4391, 7983, 29590, 11820, 22956, 
3196, 31125, 11709, 6586, 2920, 9698, 36814, 6954, 30368, 19052, 
827, 6599, 517, 8761, 20174, 37367, 11647, 18764, 27271, 30302, 
14808)

daten = data.frame(order_date = order_date, user_id = user_id)

I am using this code:

daten<-join(daten, count(daten, c("order_date", "user_id")))

It creates a new variable called "freq", and it worked until today. Now it fails with this error message:

Error in mutate_impl(.data, dots) : Column c("order_date", "user_id") must be length 100000 (the number of rows) or one, not 2

I checked the structure of both variables using str and it says both have 100000 rows.

Comments

I'm not sure which join you intend to use (inner_join, perhaps), but the call to count is certainly incorrect: dplyr's count expects bare column names, not a character vector.

count(daten, c("order_date", "user_id")) should be changed to:

count(daten, order_date, user_id)
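
For reference, here is a minimal sketch of the whole step written with dplyr only. It assumes dplyr 0.8.0 or later (for the name argument of count) and uses a small made-up data frame rather than the asker's data. The "freq" column and the join() call in the question suggest the original code relied on plyr's count and join, which accept a character vector; once dplyr is attached after plyr, its count masks plyr's, which would explain why the code stopped working. That diagnosis is an assumption, since the question does not show which packages were loaded.

```r
library(dplyr)

# Toy data (hypothetical, not the asker's): user 1 orders twice on the same day
daten <- data.frame(
  order_date = as.Date(c("2012-09-04", "2012-09-04", "2012-09-05", "2012-09-04")),
  user_id    = c(1, 1, 1, 2)
)

# Count orders per (order_date, user_id) pair; name = "freq" keeps the old column name
counts <- dplyr::count(daten, order_date, user_id, name = "freq")

# Attach the per-pair count to every original row
result <- dplyr::left_join(daten, counts, by = c("order_date", "user_id"))

print(result)
```

If the intermediate table is not needed, add_count(daten, order_date, user_id, name = "freq") should give the same column in one call.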

I ran into the same error message when passing a vector of column-name strings to group_by. Following the clarification by @MKR, I'll add the solution to my problem, which also seems to solve the problem in the original question:

daten %>%
  group_by_at(vars(one_of(c("order_date", "user_id")))) %>%
  summarise(n = n())

With the sample data this doesn't make much sense (every order_date/user_id combination is unique), but it may be useful in other cases.
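
On newer dplyr versions the same idea can be written without the scoped group_by_at verb. The sketch below assumes dplyr 1.0.0 or later (for across and the .groups argument); group_cols and the toy data frame are illustrative names, not part of the original question.

```r
library(dplyr)

# Column names held as strings, e.g. passed in from elsewhere
group_cols <- c("order_date", "user_id")

# Toy data (hypothetical): one repeated order_date/user_id pair
daten <- data.frame(
  order_date = as.Date(c("2012-09-04", "2012-09-04", "2012-09-05", "2012-09-04")),
  user_id    = c(1, 1, 1, 2)
)

# all_of() selects the columns named in the character vector;
# across() lets group_by() accept that selection
result <- daten %>%
  group_by(across(all_of(group_cols))) %>%
  summarise(n = n(), .groups = "drop")

print(result)
```

all_of() raises an error if one of the named columns is missing, which is usually preferable to silently grouping by fewer columns.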
