Extract part of string - string

I've been trying to extract something from a string (actually a $call) in R, and it's driving me nuts. If you have:
library(vars)
data <- as.data.frame(matrix(c(runif(40)), ncol=2))
z <- matrix(c(runif(40)), ncol=2)
var.modell <- VAR(data, p = 2, exogen=z, type = "trend")
How do you extract the z? I've tried googling and searching stack overflow. I found this: R extract a part of a string in R
which made me try:
sub(".*?exogen=(.*?)", "\\1", var.modell$call, perl = TRUE)
But it returns:
[1] "VAR" "data" "2" "trend" "z"
What am I doing wrong?

Look at the call object itself:
m <- lm(speed~dist,data=cars)
m$call$data
## cars
You'll want var.modell$call$exogen.

Related

Specific string generation in python of 5 alphabets followed by 4 numbers. example ABCDE1234

I have used the following code but not getting the desired output. please help me.
"{}{}{}".format((random.choices(string.ascii_uppercase)for i in range(5)), random.randint(1000,9999))
I don't know exactly why it doesn't work, but I managed to get it to print the desired result using this code:
x = []
y = ""
for i in range(5):
x += random.choices(string.ascii_uppercase)
x += str(random.randint(1000,9999))
print (y.join(x))
My guess is that it's because you're trying to add a list (your method of string generation produces a list of string characters) and an integer (randint produces an integer) to a string.

Getting the largest and smallest word at a string

when I run this codes the output is (" "," "),however it should be ("I","love")!!!, and there is no errors . what should I do to fix it ??
sen="I love dogs"
function Longest_word(sen)
x=" "
maxw=" "
minw=" "
minl=1
maxl=length(sen)
p=0
for i=1:length(sen)
if(sen[i]!=" ")
x=[x[1]...,sen[i]...]
else
p=length(x)
if p<min1
minl=p
minw=x
end
if p>maxl
maxl=p
maxw=x
end
x=" "
end
end
return minw,maxw
end
As #David mentioned, another and may be better solution can be achieved by using split function:
function longest_word(sentence)
sp=split(sentence)
len=map(length,sp)
return (sp[indmin(len)],sp[indmax(len)])
end
The idea of your code is good, but there are a few mistakes.
You can see what's going wrong by debugging a bit. The easiest way to do this is with #show, which prints out the value of variables. When code doesn't work like you expect, this is the first thing to do -- just ask it what it's doing by printing everything out!
E.g. if you put
if(sen[i]!=" ")
x=[x[1]...,sen[i]...]
#show x
and run the function with
Longest_word("I love dogs")
you will see that it is not doing what you want it to do, which (I believe) is add the ith letter to the string x.
Note that the ith letter accessed like sen[i] is a character not a string.
You can try converting it to a string with
string(sen[i])
but this gives a Unicode string, not an ASCII string, in recent versions of Julia.
In fact, it would be better not to iterate over the string using
for i in 1:length(sen)
but iterate over the characters in the string (which will also work if the string is Unicode):
for c in sen
Then you can initialise the string x as
x = UTF8String("")
and update it with
x = string(x, c)
Try out some of these possibilities and see if they help.
Also, you have maxl and minl defined wrong initially -- they should be the other way round. Also, the names of the variables are not very helpful for understanding what should happen. And the strings should be initialised to empty strings, "", not a string with a space, " ".
#daycaster is correct that there seems to be a min1 that should be minl.
However, in fact there is an easier way to solve the problem, using the split function, which divides a string into words.
Let us know if you still have a problem.
Here is a working version following your idea:
function longest_word(sentence)
x = UTF8String("")
maxw = ""
minw = ""
maxl = 0 # counterintuitive! start the "wrong" way round
minl = length(sentence)
for i in 1:length(sentence) # or: for c in sentence
if sentence[i] != ' ' # or: if c != ' '
x = string(x, sentence[i]) # or: x = string(x, c)
else
p = length(x)
if p < minl
minl = p
minw = x
end
if p > maxl
maxl = p
maxw = x
end
x = ""
end
end
return minw, maxw
end
Note that this function does not work if the longest word is at the end of the string. How could you modify it for this case?

R: combinatorial string replacement

I am on the lookout for a gsub based function which would enable me to do combinatorial string replacement, so that if I would have an arbitrary number of string replacement rules
replrules=list("<x>"=c(3,5),"<ALK>"=c("hept","oct","non"),"<END>"=c("ane","ene"))
and a target string
string="<x>-methyl<ALK><END>"
it would give me a dataframe with the final string name and the substitutions that were made as in
name x ALK END
3-methylheptane 3 hept ane
5-methylheptane 5 hept ane
3-methyloctane 3 oct ane
5-methyloctane 5 ... ...
3-methylnonane 3
5-methylnonane 5
3-methylheptene 3
5-methylheptene 5
3-methyloctene 3
5-methyloctene 5
3-methylnonene 3
5-methylnonene 5
The target string would be of arbitrary structure, e.g. it could also be string="1-<ALK>anol" or each pattern could occur several times, as in string="<ALK>anedioic acid, di<ALK>yl ester"
What would be the most elegant way to do this kind of thing in R?
How about
d <- do.call(expand.grid, replrules)
d$name <- paste0(d$'<x>', "-", "methyl", d$'<ALK>', d$'<END>')
EDIT
This seems to work (substituting each of these into the strplit)
string = "<x>-methyl<ALK><END>"
string2 = "<x>-ethyl<ALK>acosane"
string3 = "1-<ALK>anol"
Using Richards regex
d <- do.call(expand.grid, list(replrules, stringsAsFactors=FALSE))
names(d) <- gsub("<|>","",names(d))
s <- strsplit(string3, "(<|>)", perl = TRUE)[[1]]
out <- list()
for(i in s) {
out[[i]] <- ifelse (i %in% names(d), d[i], i)
}
d$name <- do.call(paste0, unlist(out, recursive=F))
EDIT
This should work for repeat items
d <- do.call(expand.grid, list(replrules, stringsAsFactors=FALSE))
names(d) <- gsub("<|>","",names(d))
string4 = "<x>-methyl<ALK><END>oate<ALK>"
s <- strsplit(string4, "(<|>)", perl = TRUE)[[1]]
out <- list()
for(i in seq_along(s)) {
out[[i]] <- ifelse (s[i] %in% names(d), d[s[i]], s[i])
}
d$name <- do.call(paste0, unlist(out, recursive=F))
Well, I'm not exactly sure we can even produce a "correct" answer to your question, but hopefully this helps give you some ideas.
Okay, so in s, I just split the string where it might be of most importance. Then g gets the first value in each element of r. Then I constructed a data frame as an example. So then dat is a one row example of how it would look.
> (s <- strsplit(string, "(?<=l|\\>)", perl = TRUE)[[1]])
# [1] "<x>" "-methyl" "<ALK>" "<END>"
> g <- sapply(replrules, "[", 1)
> dat <- data.frame(name = paste(append(g, s[2], after = 1), collapse = ""))
> dat[2:4] <- g
> names(dat)[2:4] <- sapply(strsplit(names(g), "<|>"), "[", -1)
> dat
# name x ALK END
# 1 3-methylheptane 3 hept ane

Haskell flag pattern writing function?

I really need help in writing this function in Haskell, I don't even know where to start. Here are the specs:
Define a function flagpattern that takes a positive Int value greater than or equal to five and returns a String that can be displayed as the following `flag' pattern of dimension n, e.g.
Main> putStr (flagpattern 7)
#######
## ##
# # # #
# # #
# # # #
## ##
#######
Assuming you want a "X" enclosed in 4 lines, you need to write a function that given a coordinate (x,y) returns what character should be at that position:
coordinate n x y = if i == 0 then 'X' else ' '
(This version outputs only the leftmost X'es, modify it, remember indices start with 0)
Now you want them nicely arranged in a matrix, use a list comprehension, described in the linked text.
You should start from your problem definition:
main :: IO ()
main = putStr . flagPattern $ 7
Then, you should ask yourself about how much dots flag has:
flagPattern :: Int -> String
flagPattern = magic $ [1..numberOfDots]
Then, (hard) part of magic function should decide for each dot whether it is   or #:
partOfMagic ...
| ... = "#" -- or maybe even "#\n" in some cases?
| otherwise = " "
Then, you can concatenate parts into one string and get the answer.
Start with the type signature.
flagpattern :: Int -> String
Now break the problem into subproblems. For example, suppose I told you to produce row 2 of a size 7 flag pattern. You would write:
XX XX
Or row 3 of a size 7 flag pattern would be
X X X X
So suppose we had a function that could produce a given row. Then we'd have
flagpattern :: Int -> String
flagpattern size = unlines (??? flagrow ???)
flagrow :: Int -> Int -> String
flagrow row size = ???
unlines takes a list of Strings and turns it into a single String with newlines between each element of the list. See if you can define flagrow, and get it working correctly for any given row and size. Then see if you can use flagrow to define flagpattern.

R: How can I replace let's say the 5th element within a string?

I would like to convert the a string like be33szfuhm100060 into BESZFUHM0060.
In order to replace the small letters with capital letters I've so far used the gsub function.
test1=gsub("be","BE",test)
Is there a way to tell this function to replace the 3rd and 4th string element? If not, I would really appreciate if you could tell me another way to solve this problem. Maybe there is also a more general solution to change a string element at a certain position into a capital letter whatever the element is?
A couple of observations:
Cnverting a string to uppercase can be done with toupper, e.g.:
> toupper('be33szfuhm100060')
> [1] "BE33SZFUHM100060"
You could use substr to extract a substring by character positions and paste to concatenate strings:
> x <- 'be33szfuhm100060'
> paste(substr(x, 1, 2), substr(x, 5, nchar(x)), sep='')
[1] "beszfuhm100060"
As an alternative, if you are going to be doing this alot:
String <- function(x="") {
x <- as.character(paste(x, collapse=""))
class(x) <- c("String","character")
return(x)
}
"[.String" <- function(x,i,j,...,drop=TRUE) {
unlist(strsplit(x,""))[i]
}
"[<-.String" <- function(x,i,j,...,value) {
tmp <- x[]
tmp[i] <- String(value)
x <- String(tmp)
x
}
print.String <- function(x, ...) cat(x, "\n")
## try it out
> x <- String("be33szfuhm100060")
> x[3:4] <- character(0)
> x
beszfuhm100060
You can use substring to remove the third and fourth elements.
x <- "be33szfuhm100060"
paste(substring(x, 1, 2), substring(x, 5), sep = "")
If you know what portions of the string you want based on their position(s), use substr or substring. As I mentioned in my comment, you can use toupper to coerce characters to uppercase.
paste( toupper(substr(test,1, 2)),
toupper(substr(test,5,10)),
substr(test,12,nchar(test)),sep="")
# [1] "BESZFUHM00060"

Resources