Begging for feedback! #20

Closed
moodymudskipper opened this issue Nov 9, 2019 · 8 comments

@moodymudskipper
Owner

I received positive comments when I released the package, but it's REALLY HARD to get specific feedback. So if you end up here, have 5 minutes to spare, and would like to make me happy, please share:

  • For which types of tasks do you use unglue?
  • Does it work as you expect?
  • What would you like to do with unglue that you can't, or think you can't?
    • Any feature requests and criticism
@moodymudskipper moodymudskipper pinned this issue Nov 9, 2019
@tmastny

tmastny commented Feb 19, 2020

Love the package!

One thing that would be cool would be to have a shortcut to ignore whitespace. Here's an example:

library(unglue)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)

# common string format
example_text <- c("20-20-32    1 file_name", "20-20-33   23 file_name2")
cat(example_text[1], example_text[2], sep = '\n')
#> 20-20-32    1 file_name
#> 20-20-33   23 file_name2

# how I do it today
example_text %>%
  unglue_data("{date} {size} {file}") %>%
  mutate(unglued = unglue(str_trim(file), "{bytes} {name}")) %>%
  tidyr::unnest(unglued)
#> # A tibble: 2 x 5
#>   date     size  file             bytes name      
#>   <chr>    <lgl> <chr>            <int> <chr>     
#> 1 20-20-32 NA    "  1 file_name"      1 file_name 
#> 2 20-20-33 NA    " 23 file_name2"    23 file_name2

# I'd like to do this to get the same answer
example_text %>%
  unglue_data("{date} {size} {file}")
#>       date size           file
#> 1 20-20-32   NA    1 file_name
#> 2 20-20-33   NA  23 file_name2

Created on 2020-02-19 by the reprex package (v0.3.0)

@moodymudskipper
Owner Author

Hi @tmastny, thanks for the kind words!

I believe you can get what you want by running:

example_text %>%
  unglue_data("{date}{=\\s+}{size}{=\\s+}{file}")

Here, {=\\s+} matches one or more whitespace characters without assigning the match to any variable.
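
As a rough sketch of the pattern above, assuming the unglue package is installed and reusing the example_text vector from the earlier comment, the call could be run like this (the name `parsed` is only illustrative):

```r
library(unglue)

# Sample input from earlier in this thread: fields separated by a
# variable number of spaces.
example_text <- c("20-20-32    1 file_name", "20-20-33   23 file_name2")

# {=\\s+} matches a run of whitespace without creating a column for it,
# so only date, size and file end up in the result.
parsed <- unglue_data(example_text, "{date}{=\\s+}{size}{=\\s+}{file}")
parsed
```

The columns should come back as character vectors here; passing convert = TRUE to unglue_data() is one way to have them converted to more specific types.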

I understand that there might be value in something more obvious, but I can't make it the default. I don't have a good idea right now, so I'll think about it.

@tmastny

tmastny commented Feb 19, 2020

Thanks for the tip!

I definitely agree, ignoring whitespace shouldn't be the default. I was thinking of something like a function argument, unglue(..., ignore_whitespace = TRUE). But I think {=\\s+} makes more sense, and is consistent with the rest of glue.

I was originally thinking something along the lines of this issue: #19
so you don't need any regex (even something like \\s+).

One reason I like unglue so much is that it is intuitive and I can figure out the parsing without any regex.

@moodymudskipper
Owner Author

moodymudskipper commented Feb 20, 2020

I had forgotten this wild experiment in #19! I was hesitant to implement this as I've tried to make unglue "tidy compliant" and I don't think they'd approve this weird feature.

Do you feel that the following is intuitive ?

example_text %>%
  unglue_data("{date}{~space(s)}{size}{~space(s)}{file}")

example_text %>%
  unglue_data("{date}{~one or more spaces}{size}{~one or more spaces}{file}")

Or is it just weird and mildly interesting ? :)

ignore_space = TRUE seems ambiguous to me, I'm not sure what it means here exactly.

Note that you can also do (still using regex):

example_text %>%
  gsub("\\s+", " ", .) %>% 
  unglue_data("{date} {size} {file}")

To avoid writing the regex each time, if the task is common enough, we can define a helper function:

merge_multiple_spaces <- function(x) gsub("\\s+", " ", x) 

example_text %>%
  merge_multiple_spaces() %>% 
  unglue_data("{date} {size} {file}")

Or use stringr::str_squish(), which does just that:

example_text %>%
  stringr::str_squish() %>% 
  unglue_data("{date} {size} {file}")
#>       date size       file
#> 1 20-20-32    1  file_name
#> 2 20-20-33   23 file_name2

Actually the latter is now my official recommended solution for this case if you use tidyverse tools in your workflow :).

@ymer

ymer commented Mar 19, 2020

I would like to use it with a tidyverse tibble in a simple way.

For example, let's say we start with this tibble:
a <- tibble(l = c("so_word1", "so_word2"))

Then I would like to run a command like this:
a %>% unglue(l, "so_{word}")

To get a tibble like this:
tibble(word = c("word1", "word2"))

Maybe this is already simple to do, but I have a hard time understanding how from the vignette.

@moodymudskipper
Owner Author

Hi @ymer, I believe you want unglue_unnest(), which does just that. I'll try to clarify the docs. Tell me if you still have issues and I'll run a reprex when I'm in front of my computer.
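
A minimal sketch of what that might look like, assuming the unglue package is installed. The `id` column and the name `res` are hypothetical additions for illustration, and a plain data frame stands in for the tibble:

```r
library(unglue)

# A small data frame like the one described above, plus a hypothetical
# `id` column added only for illustration.
a <- data.frame(
  id = 1:2,
  l = c("so_word1", "so_word2"),
  stringsAsFactors = FALSE
)

# Extract `word` from column `l` using the pattern "so_{word}".
# The column is passed unquoted, as in the request above.
res <- unglue_unnest(a, l, "so_{word}")
res
```

As far as I understand the documentation, the source column `l` is dropped by default and can be kept with remove = FALSE.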

@moodymudskipper
Owner Author

Leaving this pinned since feedback is always welcome, but closing. Please open new issues!

@github-actions

github-actions bot commented Mar 8, 2022

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Mar 8, 2022