Can not set sheet name with dplyr package

2020-03-26 r dplyr

I am exporting multiple dataframe in a list to different sheet in one excel file, and I can do this with the below code(use mtcars as an example):

library(tidyverse)
library(ImportExport)

data_list <- split(mtcars, mtcars$cyl)
table_name <- names(data_list)

# Run
excel_export(data_list, "foo.xlsx", table_names = tab_name)

And then, I want to do this with another way, cause I see the dplyr document said that:

.y to refer to the key, a one row tibble with one column per grouping variable that identifies the group.

So I thought .y equal to my created variable table_name, and I do this:

data_list %>% excel_export("foo.xlsx", table_names = .y)

Then I got an error:

Error in .jcall(wb, "Lorg/apache/poi/ss/usermodel/Sheet;", "createSheet", : object '.y' not found

Could someone explain why and how can I do this with .y.

Any help will be highly appreciated.

Answers

  1. If you reference a quote, I think it makes sense to use the function in which that quote is used. In this case, I found it in group_map (which includes group_walk, a complementary function that operates primarily in side-effect).

  2. You still need to group_by the data. More specifically, it needs to operate on a tbl_df (not a list), typically (but not always) a grouped tibble.

I have neither ImportExport nor xlsx (on which the former depends) installed, so I'll proxy your action with write.csv.

library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  group_walk(~ write.csv(.x, paste0(.y, ".csv")))

The side-effect of this is that three files are created in the current directory named 4.csv, 6.csv, and 8.csv.

If you want to operate on a named list, one could also use one of:

# using: data_list <- split(mtcars, mtcars$cyl)
purrr::imap(data_list,
            ~ write.csv(.x, paste0("~/Downloads/", .y, ".csv")))
Map(write.csv, data_list, paste0(names(data_list), ".csv"))

The effect is the same.

.x and .y are not global parameters that can be used anywhere, those are reserved for specific functions. (usually map)

From ?map

.f A function, formula, or vector (not necessarily atomic).

If a formula, e.g. ~ .x + 2, it is converted to a function. There are three ways to refer to the arguments:

  • For a single argument function, use .
  • For a two argument function, use .x and .y
  • For more arguments, use ..1, ..2, ..3 etc

Let's take a simple list

library(purrr)
listvec1 <- list(a = 1:3, b = 4:6, c = 2:4)

1) Let's say we want to multiply every element in the list with 2, map has a single argument so we use . here.

map(listvec1, ~. * 2)
#$a
#[1] 2 4 6

#$b
#[1]  8 10 12

#$c
#[1] 4 6 8

Coincidently .x works here as well :

map(listvec1, ~.x * 2)

but if you use .y it will give an error because there are no 2 arguments in map.

map(listvec, ~.y * 2)

Error in .f(.x[[i]], ...) : the ... list contains fewer than 2 elements

2) Let's take another list and now add listvec1 with listvec2. For this, we can use map2 which has two arguments so here we refer them as .x and .y.

listvec2 <- list(a = 7, b = 8, c = 9)
map2(listvec1,listvec2, ~.x + .y)
#you could actually simplify this as but anyway this is just an example
#map2(listvec1,listvec2, `+`)

#$a
#[1]  8  9 10

#$b
#[1] 12 13 14

#$c
#[1] 11 12 13

In the same we use .x and .y in imap where .x is the element and .y is either the name of the list (if present) or index.


In the post above .y means nothing, so using it as

data_list %>% excel_export("foo.xlsx", table_names = .y)

would definitely yield an error. You need to use specific functions as described above to use .x, .y. So if you want to use pipes for your command, you should use

data_list %>% excel_export("foo.xlsx", table_names = tab_name)

Related