
I have a dataframe with object names. Some of the objects have special symbols like ® and ™

Example:
ProStand® Front Air Suspension
OnCo™ Connector

I've tried db[grep("®",db$objectName), ] to find the special symbols but R isn't picking it up even though I see them in the dataframe.

Replacing the value with an exact match didn't work either:
db$objectName[db$objectName == "ProStand® Front Air Suspension" ]<- "ProStand Front Air Suspension"

How do I find the special characters and remove them from the strings in my dataframe?

lomingchun
    `gsub("[®™]", "", db$objectName)` – G5W Oct 09 '19 at 22:06
    From https://stackoverflow.com/questions/9934856/removing-non-ascii-characters-from-data-files: `x = c("ProStand® Front Air Suspension", "OnCo™ Connector") ; iconv(x, "latin1", "ASCII", sub="")` – user20650 Oct 09 '19 at 23:15
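To put the first comment's suggestion into the context of a data frame, here is a minimal base R sketch. The `db` below is a made-up stand-in for the data in the question, and the symbols are written as `\u` escapes so the pattern does not depend on how the script file itself is encoded. One possible reason the original `grep()` found nothing (a guess, since the question doesn't show how the data was read) is an encoding mismatch between the pattern and the column; `Encoding(db$objectName)` can help check that.

```r
# Hypothetical data frame standing in for the `db` from the question
db <- data.frame(
  objectName = c("ProStand\u00AE Front Air Suspension",
                 "OnCo\u2122 Connector",
                 "Plain Item"),
  stringsAsFactors = FALSE
)

# Find the rows that contain a registered (U+00AE) or trademark (U+2122) sign
hits <- grepl("[\u00AE\u2122]", db$objectName)
db[hits, , drop = FALSE]

# Remove both symbols from every value in the column
db$objectName <- gsub("[\u00AE\u2122]", "", db$objectName)
db$objectName
```

Matching with a pattern rather than `==` means the replacement applies to every row that contains a symbol, not just one exact name.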

2 Answers


If you're looking for something more generic, you could use stringr, for example:

library(stringr)
str_remove_all(string = "ProStand® Front Air Suspension", pattern = "[^[:alnum:][:space:]]+")

which gives

"ProStand Front Air Suspension"

This removes every character that is not a letter, a number, or whitespace, so punctuation such as hyphens is dropped as well. Note that str_remove() would strip only the first match; str_remove_all() handles strings that contain several symbols.
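As a rough illustration of how this behaves on a whole column, the sketch below applies the same pattern to a made-up data frame (`db` and the "Heavy-Duty" row are invented for the example). It also shows a narrower alternative that I'm adding as an assumption about what you might want: the Unicode category `\p{So}` ("Symbol, other"), which covers ®, ™ and © but leaves ordinary punctuation alone.

```r
library(stringr)

# Hypothetical data frame standing in for the `db` from the question
db <- data.frame(
  objectName = c("ProStand\u00AE Front Air Suspension",
                 "OnCo\u2122 Connector",
                 "Heavy-Duty\u00AE Axle"),
  stringsAsFactors = FALSE
)

# Broad: drop everything except letters, digits and whitespace.
# This also removes the hyphen in "Heavy-Duty".
broad <- str_remove_all(db$objectName, "[^[:alnum:][:space:]]+")

# Narrower: drop only characters in the "Symbol, other" category,
# so the hyphen survives.
narrow <- str_remove_all(db$objectName, "\\p{So}")

broad
narrow
```

Which of the two fits depends on whether punctuation in the names carries meaning; to update the data frame, assign the chosen result back to `db$objectName`.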

giocomai

Here is one option: match the symbols by their Unicode code points and replace them with an empty string.

library(stringr)
str1 <- "ProStand® Front Air© Suspension™"
# \u00AE is ®, \u00A9 is ©, \u2122 is ™
str_replace_all(str1, "\\u00AE|\\u00a9|\\u2122", "")
#[1] "ProStand Front Air Suspension"
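Listing code points explicitly only works if you know which symbols occur in the data. A possible way to find out, sketched here with a made-up vector `x` in place of `db$objectName`, is to extract every non-ASCII character and print its code point; the names `special` and `code_points` are just for illustration.

```r
library(stringr)

# Hypothetical sample; in practice this would be db$objectName
x <- c("ProStand\u00AE Front Air Suspension", "OnCo\u2122 Connector")

# Collect each distinct character outside the ASCII range
special <- unique(unlist(str_extract_all(x, "[^\\x00-\\x7F]")))

# Format each character's code point in the usual U+XXXX notation
code_points <- sprintf(
  "U+%04X",
  vapply(special, utf8ToInt, integer(1), USE.NAMES = FALSE)
)

special
code_points
```

The resulting code points can then be plugged into the alternation pattern used above.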
akrun