Is there an R function (or any way) to color code all numeric values within a single string?

I have successfully managed to color code certain words within a string.
However, I am unable to color code ALL numeric values within that same string.
I want all the numeric values within the string (example: 1, 2.3, 4, 6.87) to be the same color (example: blue).
I am all ears to any solution, thank you so much for your help.
library(tidyverse)
library(crayon)
library(stringr)
library(readr)
# creating the string/vector (ALL GOOD HERE)
text <- tolower(c("Kyste simple inférieur de
56 mm. Aorto-bi-iliaque en
5,9cm. Artères communes de 19mm et
de 87mm. 120mm stable.")) # tolower converts capital letters to lower case
# individuate words (ALL SEEMS GOOD HERE)
unique_words <- function(x) {
  purrr::map(.x = x,
             .f = ~ unique(base::strsplit(x = ., split = " ")[[1]],
                           collapse = " "))
}
# creating a dataframe with crayonized text (ALL SEEMS GOOD HERE)
df <- tibble::enframe(unique_words(x = text)) %>% tidyr::unnest() %>%
  # here you can specify the color/word combinations you need (PROBLEM SEEMS TO BE HERE)
  dplyr::mutate(.data = .,
                value2 = dplyr::case_when(value == "aorto-bi-iliaque" ~ crayon::green(value),
                                          value == gsub(".*(\\b\\d+\\b).*", "\\1", text) ~ crayon::blue(value), # Here is where I need help
                                          TRUE ~ value)) %>%
  dplyr::select(., -value)
# printing the text
print(cat(df$value2))
This is what it gives me: the word is correctly color coded, but all the numeric values are still unchanged.

Related

Python3, Random entries into a format string problem

I am making a program that generates the phrase
"The enemy of my friend is my enemy!"
Actually, I want it to print a random choice of Friend/Enemy
in each place every time it is rerun.
So far I have got the code to print the same word three times in each sentence,
but not a different word in each place.
I couldn't get Python to access each string individually from a list.
Any ideas?
Thanks!
import random
en = 'Enemy'
fr = 'Friend'
words = en, fr
for word in words:
    sentence = f"The {word} of my {word} is my {word}!"
    print(sentence)
If you want to print a random sentence each time the script is run, you can use random.choices to pick 3 words at random (with repetition allowed) and str.format to build the new string. For example:
import random
en = "Enemy"
fr = "Friend"
words = en, fr
sentence = "The {} of my {} is my {}!".format(*random.choices(words, k=3))
print(sentence)
Prints (randomly):
The Enemy of my Friend is my Friend!
I'd make more changes to the code as it looks like the code in the question is trying to use the wrong tools.
import random
words = ['Enemy', 'Friend']
# three independent draws from your words
w1 = random.choice(words)
w2 = random.choice(words)
w3 = random.choice(words)
# assemble together using an f-string
sentence = f"The {w1} of my {w2} is my {w3}!"
print(sentence)
Not sure if this will be easier to understand, but hopefully!
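To make "three independent draws" concrete, here is a small sketch (not part of either answer above; the variable names are only illustrative) that enumerates every sentence the code can produce. With two words and three slots there are 2 ** 3 = 8 possibilities, and each run picks one of them.

```python
from itertools import product

words = ["Enemy", "Friend"]

# Every ordered triple of words, repetition allowed: 2 ** 3 = 8 in total.
all_sentences = [
    f"The {w1} of my {w2} is my {w3}!"
    for w1, w2, w3 in product(words, repeat=3)
]

for sentence in all_sentences:
    print(sentence)
```

This also shows why the loop in the question could only ever print two sentences: it reuses the same word in all three slots.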

Identifying, counting, AND labelling spaces in a column?

I have a dataframe of 1 column in R. In it is a bunch of names, e.g. Claire Randall Fraser. I know how to make a looping function that will apply a second function to each and every cell. But I'm stuck on how to create that second function, which will be to identify and LABEL each space (" ") in each cell. E.g. Claire[1]Randall[2]Fraser.
Is there a way to do this?
Thanks in advance, and please explain like I'm a beginner in R.
Here's an initial solution using a mixed bag of methods:
Data:
str <- c("Claire Randall Fraser", "Peter Dough", "Orson Dude Welles Man")
Solution:
library(data.table) # for rleid()
library(dplyr)
library(tidyr) # for separate_rows()
data.frame(str) %>%
# create row ID:
mutate(row = row_number()) %>%
# split strings into separate words:
separate_rows(str, sep = " ") %>%
# for each `row`:
group_by(row) %>%
# create two new columns:
mutate(
# add a column with the run-length number enclosed in "[]":
add = paste0("[", cumsum(rleid(row)), "]"),
# paste the separate names and the values in `add` together:
str_0 = paste0(str, add)) %>%
# put everything back onto the original rows:
summarise(str = paste0(str_0, collapse = "")) %>%
# deactivate grouping:
ungroup() %>%
# remove string-final `[...]`:
mutate(str = sub("\\[\\d+\\]$", "", str))
Result:
# A tibble: 3 × 2
row str
<int> <chr>
1 1 Claire[1]Randall[2]Fraser
2 2 Peter[1]Dough
3 3 Orson[1]Dude[2]Welles[3]Man

Python - how to accumulate comma using a for loop

community - this is my first post, so please forgive me if I failed to properly display this message. I am trying to add commas as depicted below in the test cases. It appears there are more efficient ways than what I have coded below; however, I would like to solve the problem using my code below. What in the world am I missing?
def get_country_codes(prices):
    country_prices = prices
    p = ""
    for i in country_prices:
        if i.isalpha() and i == ",":
            p = p + i[0] + ","
    return (p)
My code is returning:
Test Failed: expected NZ, KR, DK but got
Test Failed: expected US, AU, JP but got
Test Failed: expected AU, NG, MX, BG, ES but got
Test Failed: expected CA but got
from test import testEqual
testEqual(get_country_codes("NZ$300, KR$1200, DK$5"), "NZ, KR, DK")
testEqual(get_country_codes("US$40, AU$89, JP$200"), "US, AU, JP")
testEqual(get_country_codes("AU$23, NG$900, MX$200, BG$790, ES$2"), "AU, NG, MX, BG, ES")
testEqual(get_country_codes("CA$40"), "CA")
It would be better to accept a list instead of a string type as a parameter for get_country_codes. This will prevent you from having to worry about parsing the string and ignoring the comma. I'd recommend Joining Lists and Splitting Strings by diveintopython.net.
This code accepts a list, iterates through it, splits each value on the $, takes the first token, and checks whether it passes isalpha(). If it does, the token is appended to the list that is returned.
def get_country_codes(prices):
    """Get country codes given a list of prices.
    :param prices: list of prices
    :return: list of alphabetical country codes
    """
    countries = []
    for price in prices:
        country = price.split('$')[0]
        if country.isalpha():
            countries.append(country)
    return countries
# As a proof, this line will concatenate the returned country codes
# with a comma and a space:
print(', '.join(get_country_codes(["NZ$300", "KR$1200", "DK$5"])))
# As another proof, if your input has to be a string:
print(get_country_codes("US$40, AU$89, JP$200".split(", ")))
print(get_country_codes(["AU$23", "NG$900", "MX$200", "BG$790", "ES$2"]))
print(get_country_codes(["CA$40"]))
The code returns
NZ, KR, DK
['US', 'AU', 'JP']
['AU', 'NG', 'MX', 'BG', 'ES']
['CA']
Finally, to make an assertion:
testEqual(get_country_codes(["NZ$300", "KR$1200", "DK$5"]), ["NZ", "KR", "DK"])
Using your code ...
For every character, check whether it is allowed.
Include "," and the space in the allowed characters.
def get_country_codes(prices):
    country_prices = prices
    p = ""
    for i in country_prices:
        # keep letters, commas and spaces; drop "$" and digits
        if i.isalpha() or i == "," or i == " ":
            p = p + i[0]
    return (p)
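If the function has to keep the string-in, string-out contract that the question's tests expect, a split-and-join version is a possible middle ground between the two answers. This is only a minimal sketch: the name country_codes_from_string is made up for illustration, and it assumes every entry looks like CODE$amount with entries separated by commas.

```python
def country_codes_from_string(prices):
    """Return the country codes in a string like "NZ$300, KR$1200" as "NZ, KR"."""
    codes = []
    for entry in prices.split(","):
        # Everything before the first "$" is taken to be the country code.
        code = entry.strip().split("$")[0]
        if code.isalpha():
            codes.append(code)
    # join() puts ", " between the codes only, so there is no trailing comma.
    return ", ".join(codes)


print(country_codes_from_string("NZ$300, KR$1200, DK$5"))
print(country_codes_from_string("CA$40"))
```

Collecting the codes in a list and joining them at the end sidesteps the trailing-comma problem that appears when commas are appended inside the loop.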

Convert string into a Tkinter notebook frame

Ok so I am trying to find the frame which Tkinter is using, then take its width and height and resize the window so that everything fits nicely without ugly spaces left. So far I have gotten the following...
convert = {"tab1_name", "tab1"; "tab2_name", "tab2"; "tab3_name", "tab3") ##(it goes on)
a = mainframe.tab(mainframe.select(), "text")
b = convert[a]
w = b.winfo_reqwidth()
h = b.winfo_reqheight()
mainframe.configure(width=w, height=h)
The names of each frame in the notebook are tab1, tab2, tab3, etc., but the labels on them are unique because they describe what happens in the tab. I want to be able to take the string returned from the convert dictionary function and use it as the frame's name. I am not sure if the frame is a class or what else. Is there a way to convert the string b into the frame's name and somehow use it in the .winfo_reqheight()? I do not want to have to make a thing which says...
if b=="tab1":
    w = tab1.winfo_reqwidth()
    h = tab1.winfo_reqheight()
    mainframe.configure(width=w, height=h)
for each frame because I want it to be easy to add new frames without having to add so much code.
Thank you
Option 1:
You can store actual objects in dictionaries. So try:
convert = {"tab1_name": tab1, "tab2_name": tab2, "tab3_name": tab3}
a = mainframe.tab(mainframe.select(), "text")
b = convert[a]
w = b.winfo_reqwidth()
h = b.winfo_reqheight()
mainframe.configure(width=w, height=h)
Option 2:
Executing code held in a string is possible with the exec('arbitrary code in a string') function.
See "How do I execute a string containing Python code in Python?".
You could do this (with just text in the dictionary, or whatever convert is):
convert = {"tab1_name": "tab1", "tab2_name": "tab2", "tab3_name": "tab3"}
a = mainframe.tab(mainframe.select(), "text")
b = convert[a]
code1 = "w = %s.winfo_reqwidth()" % b
code2 = "h = %s.winfo_reqheight()" % b
exec(code1) # for python 2 it is: exec code1
exec(code2) # python 3 changed the exec statement to a function
mainframe.configure(width=w, height=h)
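Be careful that you don't let malicious code into the exec call, because Python will run it. This is usually only a problem if an end user can supply the string (it sounds like you don't have to worry about this). Also note that in Python 3, names assigned inside exec do not become local variables when the call is made inside a function, so the snippet above only works as written at module level.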
btw, I think your first line is incorrect. You open with a { but close with ). Proper dictionary syntax would be:
convert = {"tab1_name": "tab1", "tab2_name": "tab2", "tab3_name": "tab3"}
Notice the colons separating key and value, and commas in-between entries.
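A further route avoids the lookup dictionary altogether. Calling select() with no arguments on a ttk.Notebook returns the widget path name of the current tab, and every Tkinter widget offers nametowidget() to turn such a path back into the widget object. The sketch below assumes mainframe is a ttk.Notebook; the function name resize_to_selected_tab is hypothetical.

```python
def resize_to_selected_tab(notebook):
    """Resize a notebook to the requested size of its currently selected tab.

    `notebook` is assumed to be a ttk.Notebook (or anything with the same
    select / nametowidget / configure methods).
    """
    tab_path = notebook.select()  # widget path name, or "" if there are no tabs
    if not tab_path:
        return None
    frame = notebook.nametowidget(tab_path)  # path name -> actual widget object
    width = frame.winfo_reqwidth()
    height = frame.winfo_reqheight()
    notebook.configure(width=width, height=height)
    return width, height


# Possible usage, assuming `mainframe` is your ttk.Notebook:
# mainframe.bind("<<NotebookTabChanged>>",
#                lambda event: resize_to_selected_tab(mainframe))
```

Because nothing here refers to tab1, tab2 and so on by name, new tabs can be added without touching this code.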

Subset a dataframe by a variable number of specific columns R

This one has been bugging me for a couple of days now, and I haven't had any luck on Stack Exchange yet. Essentially, I have two tables; one table defines which columns (by column number) to select from the second table. My initial plan was to string together the column numbers and pass that into a subselect statement; however, when I define the string with as.character it's not happy, i.e.:
# Data Sets, Variable_Selection: Table of Columns to Select from Variable_Table
VARIABLE_SELECTION <- data.frame(Set.1 = c(3,1,1,1,1), Set.2 = c(0,3,2,2,2), Set.3 = c(0,0,3,4,3),
Set.4 = c(0,0,0,5,4), Set.5 = c(0,0,0,0,5))
VARIABLE_TABLE <- data.frame(Var.1 = runif(100,0,10), Var.2 = runif(100,-100,100), Var.3 = runif(100,0,1),
Var.4 = runif(100,-1000,1000), Var.5 = runif(100,-1,1), Var.6 = runif(100,-10,10))
# String rows into character string of columns to select
VARIABLE_STRING <- apply(VARIABLE_SELECTION,1,paste,sep = ",",collapse = " ")
VARIABLE_STRING <- gsub(" ",",",VARIABLE_STRING)
VARIABLE_STRING <- data.frame(VAR_STRING = gsub(",0","",VARIABLE_STRING))
# Will actually be part of lapply function but, one line selection for demonstration:
VARIABLE_SINGLE_SET <- as.character(VARIABLE_STRING[4,])
# Subset table for selected columns
VARIABLE_TABLE_SUB_SELECT <- VARIABLE_TABLE[,c(VARIABLE_SINGLE_SET)]
# Error Returned:
# Error in `[.data.frame`(VARIABLE_TABLE, , c(VARIABLE_SINGLE_SET)) :
# undefined columns selected
I know the text formatting is the problem but I can't find a workaround, any suggestions?
You should avoid subsetting by column number and work with variable names instead, or at least keep your indices as an integer vector (there is no need to coerce them to a string).
First, to stay close to your approach, this corrects your code. Basically, I convert your string into a numeric vector:
VARIABLE_TABLE[,as.numeric(unlist(strsplit(
VARIABLE_SINGLE_SET,',')))]
Does this give the desired result?
lapply(VARIABLE_SELECTION, function(x) VARIABLE_TABLE[ , x[x != 0], drop = FALSE])
Produces a list where each element is a subset of 'VARIABLE_TABLE' given by 'VARIABLE_SELECTION' (using a 'VARIABLE_TABLE' with fewer rows).
# $Set.1
# Var.3 Var.1 Var.1.1 Var.1.2 Var.1.3
# 1 0.09536403 5.593292 5.593292 5.593292 5.593292
# 2 0.09086404 6.339074 6.339074 6.339074 6.339074
#
# $Set.2
# Var.3 Var.2 Var.2.1 Var.2.2
# 1 0.09536403 65.81870 65.81870 65.81870
# 2 0.09086404 66.79157 66.79157 66.79157
#
# $Set.3
# Var.3 Var.4 Var.3.1
# 1 0.09536403 -674.6672 0.09536403
# 2 0.09086404 -576.7986 0.09086404
#
# $Set.4
# Var.5 Var.4
# 1 0.5155411 -674.6672
# 2 -0.9593219 -576.7986
#
# $Set.5
# Var.5
# 1 0.5155411
# 2 -0.9593219
