How to correctly read column from file when first element is empty

I have a data file data.txt
  a
5 b
3 c 7
which I would like to load and have as
julia> loaded_data
3×3 Matrix{Any}:
"" "a" ""
5 "b" ""
3 "c" 7
but it is unclear to me how to do this. Trying readdlm
julia> using DelimitedFiles
julia> readdlm("data.txt")
3×3 Matrix{Any}:
"a" "" ""
5 "b" ""
3 "c" 7
does not recognize that the first column is empty in the first row; it reads "a" as the first element instead (which makes sense, since whitespace is the delimiter). The closest I think I've gotten to what I want is using readlines
julia> readlines("data.txt")
3-element Vector{String}:
"  a  "
"5 b  "
"3 c 7"
but from here I'm not sure how to proceed. I can take one of the rows that has all the columns and split it, but I'm not sure how that helps me identify the empty elements in the other rows.

Here's a possibility:
cnv(s) = (length(s) > 0 && all(isdigit, s)) ? parse(Int, s) : s
cnv.(stack(split.(replace.(eachline("data.txt"),"  "=>" "), " "), dims=1))

If the contents of the columns are sufficiently distinguishable to make the parsing uniquely defined, I'd use a regex on each line:
julia> lines
3-element Vector{String}:
"  a  "
"5 b  "
"3 c 7"
julia> [match(r"\s*(\d*)\s*([a-z]*)\s*(\d*)", s).captures for s in lines]
3-element Vector{Vector{Union{Nothing, SubString{String}}}}:
["", "a", ""]
["5", "b", ""]
["3", "c", "7"]
You can then proceed to parse and concatenate as you wish, e.g.
julia> mapreduce(vcat, lines) do line
x, y, z = match(r"\s*(\d*)\s*([a-z]*)\s*(\d*)", line).captures
[tryparse(Int, x) y tryparse(Int, z)]
end
3×3 Matrix{Any}:
nothing "a" nothing
5 "b" nothing
3 "c" 7
In Julia 1.9, I think you should be able to write this as
stack(lines; dims=1) do line
x, y, z = match(r"\s*(\d*)\s*([a-z]*)\s*(\d*)", line).captures
(tryparse(Int, x), y, tryparse(Int, z))
end
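For readers comparing across languages, the same per-line regex idea can be sketched in Python. This is only an illustration, assuming the layout in the question (an optional integer, an optional lowercase word, an optional integer). The names `LINE_RE` and `parse_line` are hypothetical, and empty numeric fields are mapped to `None`, mirroring `tryparse` returning `nothing`.

```python
import re

# Assumed layout: optional digits, optional lowercase word, optional digits.
LINE_RE = re.compile(r"\s*(\d*)\s*([a-z]*)\s*(\d*)")

def parse_line(line):
    """Split one line into [int or None, str, int or None]."""
    x, y, z = LINE_RE.match(line).groups()
    to_int = lambda s: int(s) if s else None
    return [to_int(x), y, to_int(z)]

# Stand-in for the lines of data.txt, with the blank first field kept as spaces.
lines = ["  a  ", "5 b  ", "3 c 7"]
rows = [parse_line(line) for line in lines]
print(rows)
```

Because every group in the pattern is optional, the match always succeeds; a line that does not follow the assumed layout would simply yield empty fields rather than an error.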

This problem may have many edge cases to clarify.
Here is a longer option than the other answer, but perhaps better suited to tweak for the edge cases:
function splittable(d)
# find all non-space locations
t = sort(union(findall.(!isspace, d)...))
# find initial indices of fields
tt = t[vcat(1,findall(diff(t).!=1).+1)]
# prepare ranges to extract fields
tr = [tt[i]:tt[i+1]-1 for i in 1:length(tt)-1]
# extract substrings
vs = map(s -> strip.(vcat([s[intersect(r,eachindex(s))] for r in tr],
tt[end]<=length(s) ? s[tt[end]:end] : "")), d)
# fit substrings into matrix
L = maximum(length.(vs))
String.([j <= length(vs[i]) ? vs[i][j] : ""
for i in 1:length(vs), j in 1:L])
end
And:
julia> d = readlines("data.txt")
3-element Vector{String}:
"  a  "
"5 b  "
"3 c 7"
julia> dd = splittable(d)
3×3 Matrix{String}:
"" "a" ""
"5" "b" ""
"3" "c" "7"
To get the partial parsing effect:
function parsewhatmay(m)
M = tryparse.(Int, m)
map((x,y)->isnothing(x) ? y : x, M, m)
end
and now:
julia> parsewhatmay(dd)
3×3 Matrix{Any}:
"" "a" ""
5 "b" ""
3 "c" 7
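The column-detection idea can also be sketched in Python, as a rough illustration rather than a port of `splittable`. The assumptions: the file is fixed-width, the list of lines is non-empty, and a field starts wherever a character position is non-blank in at least one line while the position before it is blank in every line. The names `split_table` and `parse_what_may` are hypothetical.

```python
def split_table(lines):
    """Split fixed-width lines into fields using shared column boundaries."""
    width = max(len(line) for line in lines)
    # A position is occupied if any line has a non-space character there.
    occupied = [
        any(i < len(line) and not line[i].isspace() for line in lines)
        for i in range(width)
    ]
    # A field starts at an occupied position whose left neighbour is blank.
    starts = [i for i in range(width)
              if occupied[i] and (i == 0 or not occupied[i - 1])]
    bounds = starts[1:] + [width]
    return [[line[a:b].strip() for a, b in zip(starts, bounds)]
            for line in lines]

def parse_what_may(table):
    """Convert fields made only of ASCII digits to int; keep the rest as text."""
    def conv(s):
        return int(s) if s.isascii() and s.isdigit() else s
    return [[conv(s) for s in row] for row in table]

table = split_table(["  a  ", "5 b  ", "3 c 7"])
print(table)
print(parse_what_may(table))
```

Slicing past the end of a short line returns an empty string in Python, so lines without trailing padding still produce empty fields for the missing columns.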

Related

Julia: concat strings with separator (equivalent of R's paste)

I have an array of strings that I would like to concatenate together with a specific separator.
x = ["A", "B", "C"]
Expected results (with sep = ;):
"A; B; C"
The R equivalent would be paste(x, collapse="; ")
I've tried things like string(x) but the result is not what I am looking for.

Use join. It is not clear if you want ";" or "; " as a separator.
julia> x = ["A", "B", "C"]
3-element Array{String,1}:
"A"
"B"
"C"
julia> join(x, ';')
"A;B;C"
julia> join(x, "; ")
"A; B; C"
If you just want ; then use the character ';' as the separator. If you also want the space, you need to use a string: "; "

How would you solve the letter changer in Julia?

I found this challenge:
Using your language, have the function LetterChanges(str) take the str parameter being passed and modify it using the following algorithm. Replace every letter in the string with the letter following it in the alphabet (ie. c becomes d, z becomes a). Then capitalize every vowel in this new string (a, e, i, o, u) and finally return this modified string.
I am new to Julia and set myself this challenge. I found it very hard in Julia and could not find a solution.
Below is my attempt, but I got an error: the x value is not defined.
How would you solve this?
function LetterChanges(stringis::AbstractString)
alphabet = "abcdefghijklmnopqrstuvwxyz"
vohels = "aeiou"
for Char(x) in split(stringis, "")
if x == 'z'
x = 'a'
elseif x in vohels
uppercase(x)
else
Int(x)+1
Char(x)
println(x)
end
end
end
Thank you

As a side note:
The proposed solution works properly. However, if you need high performance (which you probably do not, given the source of your problem), it is more efficient to use a string builder:
function LetterChanges2(str::AbstractString)
v = Set("aeiou")
#sprint(sizehint=sizeof(str)) do io # use on Julia 0.7 - new keyword argument
sprint() do io # use on Julia 0.6.2
for c in str
c = c == 'z' ? 'a' : c+1 # we assume that we got only letters from 'a':'z'
print(io, c in v ? uppercase(c) : c)
end
end
end
it is over 10x faster than the above.
EDIT: for Julia 0.7 this is a bit faster:
function LetterChanges2(str::AbstractString)
v = BitSet(collect(Int,"aeiou")) # the challenge lists only a, e, i, o, u as vowels
sprint(sizehint=sizeof(str)) do io # use on Julia 0.7 - new keyword argument
for c in str
c = c == 'z' ? 'a' : c+1 # we assume that we got only letters from 'a':'z'
write(io, Int(c) in v ? uppercase(c) : c)
end
end
end

There is a logic error. The task says "Replace every letter in the string with the letter following it in the alphabet. Then capitalize every vowel in this new string". Your code checks whether a letter is a vowel and then either capitalizes it or replaces it. That is different behavior: you have to replace first and then check whether the result is a vowel.
You are replacing 'a' by 'Z'. You should be replacing 'z' by 'a'.
The function split(stringis, "") returns an array of strings. You can't store these strings in Char(x). Store them in x; you can then convert each string to a Char with c = x[1].
After transforming a char, you have to store the result back in the variable: c = uppercase(c)
You don't need to convert a char to an Int. You can add a number to a char directly: c = c + 1
You have to collect the new characters in a string and return it.
function LetterChanges(stringis::AbstractString)
# some code
str = ""
for x in split(stringis, "")
c = x[1]
# logic
str = "$str$c"
end
return str
end
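The steps listed above (shift first, then capitalize vowels, then collect and return the result) can be sketched end to end in Python. This is a minimal illustration, not the answer's Julia code: the name `letter_changes` is hypothetical, and it assumes the input uses lowercase ASCII letters, leaving any other character unchanged.

```python
def letter_changes(text):
    """Shift each lowercase letter forward by one, then capitalize vowels."""
    vowels = "aeiou"
    out = []
    for ch in text:
        if "a" <= ch <= "z":
            # Step 1: replace the letter with its successor, wrapping z to a.
            ch = "a" if ch == "z" else chr(ord(ch) + 1)
            # Step 2: only after shifting, check whether the result is a vowel.
            if ch in vowels:
                ch = ch.upper()
        out.append(ch)
    # Step 3: collect the new characters and return them as one string.
    return "".join(out)

print(letter_changes("hello world"))
```

Doing the vowel check after the shift is the point of the first item in the list: `d` becomes `E`, while an original `e` becomes a plain `f`.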

Here's another version that is a bit faster than @BogumilKaminski's answer on version 0.6, but that might be different on 0.7. On the other hand, it might be a little less intimidating than the do-block magic ;)
function changeletters(str::String)
vowels = "aeiou" # the challenge lists only a, e, i, o, u as vowels
carr = Vector{Char}(length(str))
i = 0
for c in str
newchar = c == 'z' ? 'a' : c + 1
carr[i+=1] = newchar in vowels ? uppercase(newchar) : newchar
end
return String(carr)
end

At the risk of being accused of cheating, this is a dictionary-based approach:
function change_letters(s::String)::String
k = collect('a':'z')
v = vcat(collect('b':'z'), 'A')
d = Dict{Char, Char}(zip(k, v))
for c in Set("eiou")
d[c - 1] = uppercase(d[c - 1])
end
b = IOBuffer()
for c in s
print(b, d[c])
end
return String(take!(b))
end
It seems to compare well in speed terms with the other Julia 0.6 methods for long strings (e.g. 100,000 characters). There's a bit of unnecessary overhead in constructing the dictionary which is noticeable on small strings, but I'm far too lazy to type out the 'a'=>'b' construction long-hand!

Type mismatch when converting averages to grade (beginner)

I am taking a computer sciences class right now, and I just have no idea how to convert an average made from 3 scores to a letter grade. At first I thought I could do something like:
PRINT name$(c); TAB(6) ; USING("###.#", average(c))
As:
PRINT name$(c); TAB(6) ; USING("***something for text here***", average(c))
But after searching and scouring the internet, I came up with nothing. After a while I rewrote most of my code, but it still doesn't work correctly. Can someone tell me what I can do to get it working?
Here it is:
dim names(20)
dim average$(20)
x = 0
input "Please input Teacher's name:"; teacher$
rem teacher$
cls
input "Input student's name:"; studentname$
do while studentname$ <> ""
name$(x)=studentname$
rem name$(x)
input "Input first number:"; e
input "Input second number:"; f
input "Input third number:"; g
avg$=(e+f+g)/3
average(x)= avg
x=x+1
cls
input "Input the next name or press enter to finish:"; studentname$
loop
print teacher$; "'s Class Report"
for c = 1 to X
if (avg$>89 and avg$<101) then let avg= "A" else if
if (avg$>79 and avg$<89) then let avg= "B" else if
if (avg$>69 and avg$<79) then let avg= "C" else if
if (avg$>59 and avg$<69) then let avg= "D" else if
if (avg$<59) then let avg= "F"; print names(c), TAB(6) average$(c)
next c
end

Three things to note here.
First off, the dollar sign $ is only used at the end of variable names that contain text values, not numeric values. So it's a$ = "hello" and i = (12+34+56) / 3 etc.
Secondly, in the input part you calculate the average value and store it in the variable avg$. Then in the for-loop where you want to print the letter grades you check the same variable name. However, you never set avg$ within that for-loop, so it will always contain just the last calculated value. It should also be written without $ because it's a numeric value.
Finally, as Shawn Mehan already commented, you should rename your variables to better reflect what they are used for. That will probably clear up some of the confusion. So use something like dim avgpoint(20) for the 0-100 scores, and avgletter$="A" etc. for the letter grades.
So to combine these things, I would change your code to something like this:
input "Input first grade number (0-100):"; grade1
input "Input second grade number (0-100):"; grade2
input "Input third grade number (0-100):"; grade3
calcavg = (grade1+grade2+grade3)/3
avgpoint(x) = calcavg
and then
for c = 1 to x
p = avgpoint(c)
if (p>89 and p<=101) then let avgletter$ = "A"
'etc.
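The remaining grade bands can be sketched compactly in Python. This is an illustration only, assuming cut-offs of 90, 80, 70 and 60 as implied by the question; the name `letter_grade` is hypothetical. Testing the cut-offs from the top down leaves no gaps, whereas paired conditions such as p>79 and p<89 would leave an average of exactly 89 without a grade.

```python
def letter_grade(average):
    """Map a 0-100 average to a letter using descending cut-offs (assumed)."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if average >= cutoff:
            return letter
    return "F"

# Averages of three scores are usually fractional, so the bands must cover them.
for scores in ((90, 95, 100), (89, 89, 90), (59, 60, 60)):
    avg = sum(scores) / 3
    print(round(avg, 1), letter_grade(avg))
```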

Here is a code sample for a grade report program:
DIM Names(20) AS STRING
DIM Average(20) AS SINGLE
INPUT "Please input Teacher's name"; Teacher$
PRINT "Enter up to 20 names, <enter> to quit:"
DO UNTIL x = 20
PRINT "Input student"; x + 1; "name";
INPUT StudentName$
IF StudentName$ = "" THEN EXIT DO
x = x + 1: Names(x) = StudentName$
INPUT "Input first number"; J
INPUT "Input second number"; K
INPUT "Input third number"; L
Average(x) = (J + K + L) / 3
LOOP
PRINT Teacher$; "'s Class Report"
FOR c = 1 TO x
SELECT CASE Average(c)
' IS < n keeps fractional averages such as 59.5 in the right band
CASE IS < 60
Grade$ = "F"
CASE IS < 70
Grade$ = "D"
CASE IS < 80
Grade$ = "C"
CASE IS < 90
Grade$ = "B"
CASE ELSE
Grade$ = "A"
END SELECT
PRINT Names(c); SPC(6); Grade$
NEXT
END

Another code sample of a grade report program, this time with a variable number of scores:
DIM Names(20) AS STRING
DIM Average(20) AS SINGLE
INPUT "Please input Teacher's name"; Teacher$
PRINT "Enter up to 20 names, <enter> to quit:"
DO UNTIL x = 20
PRINT "Input student"; x + 1; "name";
INPUT StudentName$
IF StudentName$ = "" THEN EXIT DO
x = x + 1: Names(x) = StudentName$
y = 0 ' number of scores
z = 0 ' total of scores
PRINT "Enter scores, <enter> to quit:"
DO
PRINT "Enter score"; y + 1;
INPUT I$
IF I$ = "" THEN EXIT DO
IF VAL(I$) >= 0 AND VAL(I$) <= 100 THEN
y = y + 1
z = z + VAL(I$)
ELSE
PRINT "Value must be 0 to 100."
END IF
LOOP
IF y > 0 THEN ' avoid division by zero
Average(x) = z / y
END IF
LOOP
PRINT Teacher$; "'s Class Report"
FOR c = 1 TO x
SELECT CASE Average(c)
' IS < n keeps fractional averages such as 59.5 in the right band
CASE IS < 60
Grade$ = "F"
CASE IS < 70
Grade$ = "D"
CASE IS < 80
Grade$ = "C"
CASE IS < 90
Grade$ = "B"
CASE ELSE
Grade$ = "A"
END SELECT
PRINT Names(c); SPC(6); Grade$
NEXT
END

Generating substrings and random strings in R

Please bear with me, I come from a Python background and I am still learning string manipulation in R.
OK, so let's say I have a string of length 100 with random A, B, C, or D letters:
> df<-c("ABCBDBDBCBABABDBCBCBDBDBCBDBACDBCCADCDBCDACDDCDACBCDACABACDACABBBCCCBDBDDCACDDACADDDDACCADACBCBDCACD")
> df
[1]"ABCBDBDBCBABABDBCBCBDBDBCBDBACDBCCADCDBCDACDDCDACBCDACABACDACABBBCCCBDBDDCACDDACADDDDACCADACBCBDCACD"
I would like to do the following two things:
1) Generate a '.txt' file that is comprised of 20-length subsections of the above string, each starting one letter after the previous with their own unique name on the line above it, like this:
NAME1
ABCBDBDBCBABABDBCBCB
NAME2
BCBDBDBCBABABDBCBCBD
NAME3
CBDBDBCBABABDBCBCBDB
NAME4
BDBDBCBABABDBCBCBDBD
... and so forth
2) Take that generated list and from it comprise another list that has the same exact substrings with the only difference being a change of one or two of the A, B, C, or Ds to another A, B, C, or D (any of those four letters only).
So, this:
NAME1
ABCBDBDBCBABABDBCBCB
Would become this:
NAME1.1
ABBBDBDBCBDBABDBCBCB
As you can see, the "C" in the third position became a "B" and the "A" in position 11 became a "D", with no implied relationship between those changed letters. Purely random.
I know this is a convoluted question, but like I said, I am still learning basic text and string manipulation in R.
Thanks in advance.

Create a text file of substrings:
n <- 20 # length of substrings
starts <- seq(nchar(df) - n + 1)
v1 <- mapply(substr, starts, starts + n - 1, MoreArgs = list(x = df))
names(v1) <- paste0("NAME", seq_along(v1), "\n")
write.table(v1, file = "filename.txt", quote = FALSE, sep = "",
col.names = FALSE)
Randomly replace one or two letters (A-D):
myfun <- function() {
idx <- sample(seq(n), sample(1:2, 1))
rep <- sample(LETTERS[1:4], length(idx), replace = TRUE)
return(list(idx = idx, rep = rep))
}
new <- replicate(length(v1), myfun(), simplify = FALSE)
v2 <- mapply(function(x, y, z) paste(replace(x, y, z), collapse = ""),
strsplit(v1, ""),
lapply(new, "[[", "idx"),
lapply(new, "[[", "rep"))
# v2 inherits names that end in "\n", so rebuild them instead of appending ".1"
names(v2) <- paste0("NAME", seq_along(v2), ".1")
write.table(v2, file = "filename2.txt", quote = FALSE, sep = "\n",
col.names = FALSE)
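Since the question mentions a Python background, here is a rough Python sketch of the same two steps for comparison. The names `windows`, `mutate` and `labelled` are hypothetical. One deliberate difference from the R code above is labeled in the comments: each replacement is forced to differ from the original letter, so every mutated string changes in exactly one or two positions.

```python
import random

def windows(seq, n=20):
    """All length-n substrings, each starting one character after the previous."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def mutate(s, alphabet="ABCD", rng=random):
    """Change one or two randomly chosen positions to a different letter."""
    chars = list(s)
    for pos in rng.sample(range(len(chars)), rng.choice([1, 2])):
        # Assumption: the new letter must differ from the old one.
        chars[pos] = rng.choice([c for c in alphabet if c != chars[pos]])
    return "".join(chars)

def labelled(strings, suffix=""):
    """Interleave NAME<i><suffix> lines with the strings, as in the target file."""
    out = []
    for i, s in enumerate(strings, start=1):
        out += [f"NAME{i}{suffix}", s]
    return "\n".join(out)

seq = "ABCD" * 25                      # stand-in for the 100-letter string
subs = windows(seq)
mutated = [mutate(s) for s in subs]
print(labelled(subs[:2]))
print(labelled(mutated[:2], suffix=".1"))
```

Writing the two files is then a matter of passing the `labelled(...)` text to `open(path, "w").write(...)`.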

I tried breaking this down into multiple simple steps; hopefully you can learn a few tricks from this:
# Random data
df<-c("ABCBDBDBCBABABDBCBCBDBDBCBDBACDBCCADCDBCDACDDCDACBCDACABACDACABBBCCCBDBDDCACDDACADDDDACCADACBCBDCACD")
n<-10 # Number of cuts
set.seed(1)
# Pick n random numbers between 1 and the length of string-20
nums<-sample(1:(nchar(df)-20),n,replace=TRUE)
# Make your cuts
cuts<-sapply(nums,function(x) substring(df,x,x+20-1))
# Generate some names
nams<-paste0('NAME',1:n)
# Make it into a matrix, transpose, and then recast into a vector to get alternating names and cuts.
names.and.cuts<-c(t(matrix(c(nams,cuts),ncol=2)))
# Drop a file.
write.table(names.and.cuts,'file.txt',quote=FALSE,row.names=FALSE,col.names = FALSE)
# Pick how many changes are going to be made to each cut.
changes<-sample(1:2,n,replace=TRUE)
# Pick that number of positions to change
pos.changes<-lapply(changes,function(x) sample(1:20,x))
# Find the letter at each position of the corresponding cut (not of the full string).
letter.at.change.pos<-mapply(function(cut,x) substring(cut,x,x),cuts,pos.changes,SIMPLIFY=FALSE,USE.NAMES=FALSE)
# Make a function that takes any letter, and outputs any other letter from c(A-D)
letter.map<-function(x){
# Make a list of alternate letters.
alternates<-lapply(x,setdiff,x=c('A','B','C','D'))
# Pick one of each
sapply(alternates,sample,size=1)
}
# Find another letter for each
letter.changes<-lapply(letter.at.change.pos,letter.map)
# Make a function to replace character by position
# Inefficient, but who cares.
rep.by.char<-function(str,pos,chars){
for (i in 1:length(pos)) substr(str,pos[i],pos[i])<-chars[i]
str
}
# Change every letter at pos.changes to letter.changes
mod.cuts<-mapply(rep.by.char,cuts,pos.changes,letter.changes,USE.NAMES=FALSE)
# Generate names
nams<-paste0(nams,'.1')
# Use the matrix trick to alternate names. Drop a file.
names.and.mod.cuts<-c(t(matrix(c(nams,mod.cuts),ncol=2)))
write.table(names.and.mod.cuts,'file2.txt',quote=FALSE,row.names=FALSE,col.names = FALSE)
Also, instead of the rep.by.char function, you could just use strsplit and replace like this:
mod.cuts<-mapply(function(x,y,z) paste(replace(x,y,z),collapse=''),
strsplit(cuts,''),pos.changes,letter.changes,USE.NAMES=FALSE)

One way, albeit slowish:
Rgames> foo<-paste(sample(c('a','b','c','d'),20,rep=T),sep='',collapse='')
Rgames> bar<-matrix(unlist(strsplit(foo,'')),ncol=5)
Rgames> bar
[,1] [,2] [,3] [,4] [,5]
[1,] "c" "c" "a" "c" "a"
[2,] "c" "c" "b" "a" "b"
[3,] "b" "b" "a" "c" "d"
[4,] "c" "b" "a" "c" "c"
Now you can select random indices and replace the selected locations with sample(c('a','b','c','d'),1) . For "true" randomness, I wouldn't even force a change - if your newly drawn letter is the same as the original, so be it.
Like this:
ibar<-sample(1:5,4,rep=T) # one random column number for each row
for ( j in 1: 4) bar[j,ibar[j]]<-sample(c('a','b','c','d'),1)
Then, if necessary, recombine each row using paste

For the first part of your question:
df <- c("ABCBDBDBCBABABDBCBCBDBDBCBDBACDBCCADCDBCDACDDCDACBCDACABACDACABBBCCCBDBDDCACDDACADDDDACCADACBCBDCACD")
nstrchars <- 20
count <- nchar(df)-nstrchars+1
length20substrings <- data.frame(length20substrings=sapply(1:count,function(x)substr(df,x,x+nstrchars-1)))
# Save to a text file. I chose not to include row names or a column name in the .txt file.
write.table(length20substrings,"length20substrings.txt",row.names=F,col.names=F)
For the second part:
# create a function that will randomly pick one or two spots in a string and replace
# those spots with one of the other characters present in the string:
changefxn<- function(x){
x<-as.character(x)
nc<-nchar(as.character(x))
id<-seq(1,nc)
numchanges<-sample(1:2,1)
ids<-sample(id,numchanges)
chars2repl<-strsplit(x,"")[[1]][ids]
charspresent<-unique(unlist(strsplit(x,"")))
splitstr<-unlist(strsplit(x,""))
if (numchanges>1) {
splitstr[ids[1]] <- sample(setdiff(charspresent,chars2repl[1]),1)
splitstr[ids[2]] <- sample(setdiff(charspresent,chars2repl[2]),1)
}
else {splitstr[ids[1]] <- sample(setdiff(charspresent,chars2repl[1]),1)
}
newstr<-paste(splitstr,collapse="")
return(newstr)
}
# try it out
changefxn("asbbad")
changefxn("12lkjaf38gs")
# apply changefxn to all the substrings from part 1
length20substrings<-length20substrings[seq_along(length20substrings[,1]),]
newstrings <- lapply(length20substrings, function(ii)changefxn(ii))

Password generator function in R

I am looking for a smart way to code a password generator function in R:
generate.password (length, capitals, numbers)
length: the length of the password
capitals: a vector defining where capitals shall occur; each entry is a position in the password string. Default: no capitals.
numbers: a vector defining where numbers shall occur; each entry is a position in the password string. Default: no numbers.
Examples:
generate.password(8)
[1] "hqbfpozr"
generate.password(length=8, capitals=c(2,4))
[1] "hYbFpozr"
generate.password(length=8, capitals=c(2,4), numbers=c(7:8))
[1] "hYbFpo49"

There is a function that generates random strings in the stringi package (version >= 0.2-3):
require(stringi)
stri_rand_strings(n=2, length=8, pattern="[A-Za-z0-9]")
## [1] "90i6RdzU" "UAkSVCEa"
So using different patterns you can generate parts for your desired password and then paste it like this:
x <- stri_rand_strings(n=4, length=c(2,1,2,3), pattern=c("[a-z]","[A-Z]","[0-9]","[a-z]"))
x
## [1] "ex" "N" "81" "tsy"
stri_flatten(x)
## [1] "exN81tsy"

Here's one approach
generate.password <- function(length,
capitals = integer(0),
numbers = integer(0)) {
stopifnot(is.numeric(length), length > 0L,
is.numeric(capitals), capitals > 0L, capitals <= length,
is.numeric(numbers), numbers > 0L, numbers <= length,
length(intersect(capitals, numbers)) == 0L)
lc <- sample(letters, length, replace = TRUE)
uc <- sample(LETTERS, length(capitals), replace = TRUE)
num <- sample(0:9, length(numbers), replace = TRUE)
pass <- lc
pass[capitals] <- uc
pass[numbers] <- num
paste0(pass, collapse = "")
}
## Examples
set.seed(1)
generate.password(8)
# [1] "gjoxfxyr"
set.seed(1)
generate.password(length=8, capitals=c(2,4))
# [1] "gQoBfxyr"
set.seed(1)
generate.password(length=8, capitals=c(2,4), numbers=c(7:8))
# [1] "gQoBfx21"
You can also add other special characters in the same fashion. The replace = TRUE argument to sample is what allows letters and digits to repeat.
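For comparison, the same positional interface can be sketched in Python. This is an illustration under stated assumptions, not a port of the R function: the name `generate_password` is hypothetical, positions are 1-based as in the R examples, and the standard-library `secrets` module is swapped in for R's `sample` as the source of randomness.

```python
import secrets
import string

def generate_password(length, capitals=(), numbers=()):
    """Lowercase password with capitals/digits at the given 1-based positions."""
    capitals, numbers = set(capitals), set(numbers)
    if length <= 0:
        raise ValueError("length must be positive")
    if capitals & numbers:
        raise ValueError("capitals and numbers must not overlap")
    if any(not 1 <= p <= length for p in capitals | numbers):
        raise ValueError("positions must lie between 1 and length")
    chars = []
    for pos in range(1, length + 1):
        if pos in capitals:
            pool = string.ascii_uppercase
        elif pos in numbers:
            pool = string.digits
        else:
            pool = string.ascii_lowercase
        chars.append(secrets.choice(pool))  # one independent draw per position
    return "".join(chars)

pw = generate_password(8, capitals=[2, 4], numbers=[7, 8])
print(pw)
```

Drawing each character independently from the pool for its position gives the same "repeats allowed" behavior as `replace = TRUE` in the R version.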

I liked the solution given by @Hadd E. Nuff. What I did was include digits between 0 and 9 at random. Here is the modified solution:
generate.password <- function(LENGTH){
punct <- c("!", "#", "$", "%", "&", "(", ")", "*", "+", "-", "/", ":",
";", "<", "=", ">", "?", "@", "[", "^", "_", "{", "|", "}", "~")
nums <- c(0:9)
chars <- c(letters, LETTERS, punct, nums)
p <- c(rep(0.0105, 52), rep(0.0102, 25), rep(0.02, 10))
pword <- paste0(sample(chars, LENGTH, TRUE, prob = p), collapse = "")
return(pword)
}
generate.password(8)
This generates passwords that mix letters, digits, and punctuation, for example:
"C2~mD20U" # 8 alpha-numeric-specialchar
"+J5Gi3" # 6 alpha-numeric-specialchar
"77{h6RsGQJ66if5" # 15 alpha-numeric-specialchar
