How to extract only the odd-position values of a string? - string

I have a string Data="0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0";
I want to extract only the values at the odd (zero-based) positions of Data.
That is, a new string with the value: 0 0 0 0 0 0 0 0

You could do this:
var Data = "0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0";
var output = string.Join(" ", Data
    .Split(' ')
    .Select((s, i) => new { s, i })
    .Where(w => w.i % 2 != 0)
    .Select(s => s.s));
Output will be:
0 0 0 0 0 0 0 0
You could also do this:
private IEnumerable<string> GetOdd(string data)
{
    var split = data.Split(' ');
    for (int i = 0; i < split.Length; i++)
    {
        if (i % 2 != 0)
            yield return split[i];
    }
}
And then call the function like this:
var output = string.Join(" ", GetOdd(Data));
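For comparison, the same index-parity filter can be sketched in Python. This is not part of the original C# answers; the function name odd_positions is hypothetical, and the sketch assumes whitespace-separated tokens and that "odd position" means an odd zero-based index, as in the i % 2 != 0 test above.

```python
def odd_positions(data: str) -> str:
    """Return the tokens at odd zero-based indices, joined by single spaces."""
    tokens = data.split()
    # A slice starting at index 1 with step 2 picks indices 1, 3, 5, ...
    return " ".join(tokens[1::2])


if __name__ == "__main__":
    print(odd_positions("0 0 1 0 2 0 3 0 4 0 5 0 6 0 7 0"))
```

The slice plays the role of the Where filter; no explicit index bookkeeping is needed.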

Related

how do you replace only a certain number of items in a list randomly?

board = []
for x in range(0,8):
    board.append(["0"] * 8)
def print_board(board):
    for row in board:
        print(" ".join(row))
This code creates a grid of zeros, but I wish to replace five of them with ones and another five with twos.
Does anyone know a way to do this?
If you want to randomly set some coordinates with "1" and "2", you can do it like this:
import random
board = []
for x in range(0, 8):
    board.append(["0"] * 8)
def print_board(board):
    for row in board:
        print(" ".join(row))
def generate_coordinates(x, y, k):
    coordinates = [(i, j) for i in range(x) for j in range(y)]
    random.shuffle(coordinates)
    return coordinates[:k]
coo = generate_coordinates(8, 8, 10)
ones = coo[:5]
twos = coo[5:]
for i, j in ones:
    board[i][j] = "1"
for i, j in twos:
    board[i][j] = "2"
print_board(board)
Output
0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 0
0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0
0 0 2 0 0 0 0 0
1 0 0 0 2 0 0 0
0 0 0 0 2 0 0 2
2 0 0 0 0 0 0 1
Notes:
The code above generates a random sample each time, so the output will differ between runs. To get the same output every time, call random.seed(42) before generating the coordinates; you can replace 42 with any number you want.
The function generate_coordinates receives x (number of rows), y (number of columns) and k (the number of coordinates to pick). It generates the list of all x*y coordinates, shuffles it and picks the first k.
In your specific case x = 8, y = 8 and k = 10 (5 for the ones and 5 for the twos).
Finally, this picks the positions for the ones and twos and changes the values:
ones = coo[:5]
twos = coo[5:]
for i, j in ones:
    board[i][j] = "1"
for i, j in twos:
    board[i][j] = "2"
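As a possible variation on the shuffle-and-slice approach, random.sample can draw the distinct cells directly. The sketch below is only an illustration: the function name place_markers, its counts dictionary and the seed parameter are assumptions, not part of the answer above.

```python
import random


def place_markers(rows, cols, counts, seed=None):
    """Build a rows x cols board of "0" and place each marker counts[marker] times."""
    rng = random.Random(seed)  # local generator; pass a seed only for reproducibility
    board = [["0"] * cols for _ in range(rows)]
    total = sum(counts.values())
    # sample() draws without replacement, so no cell is chosen twice
    cells = rng.sample(range(rows * cols), total)
    start = 0
    for marker, n in counts.items():
        for cell in cells[start:start + n]:
            r, c = divmod(cell, cols)  # flat cell number -> (row, column)
            board[r][c] = marker
        start += n
    return board


if __name__ == "__main__":
    for row in place_markers(8, 8, {"1": 5, "2": 5}):
        print(" ".join(row))
```

Sampling flat cell numbers avoids building and shuffling the full coordinate list, which only matters for much larger boards.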

BASH Script - Check if consecutive numbers in a string are above a value

I am echoing some data from an Oracle DB cluster, via a bash script. Currently, my output into a variable in the script from SQLPlus is:
11/12 0 0 0 0 0 0 1 0 1 0 5 4 1 0 0 0 0 0 0 0 0 0 0 0
What I'd like to be able to do is evaluate that string of numbers, excluding the first one (the date), to see if any 6 consecutive numbers are at or above a certain value, let's say 10.
I only want the logic to return true if all 6 consecutive values were 10 or more.
So for example, if the output was:
11/12 0 0 8 10 5 1 1 0 8 10 25 40 6 2 0 0 0 0 0 0 0 0 0 0
The logic should return false/null/zero, anything I can handle negatively.
But if the string looked like this:
11/12 0 0 0 0 5 9 1 0 1 10 28 10 12 19 15 11 6 7 0 0 0 0
Then it would return true/1, etc.
Is there any bash component that I can make use of to do this? I've been stuck on this part for a while now.
For variety, here is a solution not depending on awk:
#!/usr/bin/env bash
contains() {
    local nums=$* count=0 threshold=10 limit=6 i
    for i in ${nums#* }; do
        if (( i >= threshold )); then
            (( ++count >= limit )) && return 0
        else
            count=0
        fi
    done
    return 1
}
output="11/12 0 0 0 0 5 9 1 0 1 10 28 10 12 19 15 11 6 7 0 0 0 0"
if contains "$output"; then
    echo "Yaaay!"
else
    echo "Noooo!"
fi
Say your string is in $S; then
echo "$S" | awk '
{
    L = 0; threshold = 10; reqLength = 6
    for (i = 2; i <= NF; ++i) {
        if ($i >= threshold) {
            L += 1
            if (L >= reqLength) {
                exit(1)
            }
        } else {
            L = 0
        }
    }
}'
would do it. ($? will be 1 if at least reqLength consecutive numbers reach your threshold, and 0 otherwise.)
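Both answers use the same run-counting idea: increment a counter while values reach the threshold and reset it otherwise. A minimal Python sketch of that logic follows; the name has_run and its parameters are hypothetical, and it assumes the first field is the date and every other field is an integer.

```python
def has_run(line, threshold=10, run_length=6):
    """True if run_length consecutive values (after the first field) are >= threshold."""
    count = 0
    for token in line.split()[1:]:  # skip the leading date field
        if int(token) >= threshold:
            count += 1
            if count >= run_length:
                return True
        else:
            count = 0  # the run is broken, start over
    return False


if __name__ == "__main__":
    print(has_run("11/12 0 0 0 0 5 9 1 0 1 10 28 10 12 19 15 11 6 7 0 0 0 0"))
```

Like the bash and awk versions, this treats a value equal to the threshold as part of a run.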

Creating 2D matrix from RDD

I have the following RDD of the Type ((UserID, MovieID),1):
val data_wo_header=dropheader(data).map(_.split(",")).map(x=>((x(0).toInt,x(1).toInt),1))
I want to convert this data structure into a 2D array such that every (userID, movieID) pair present in the original RDD has a 1, and every other cell has a 0.
I think we have to map the user IDs to 0-N, where N is the number of distinct users, and map the movie IDs to 0-M, where M is the number of distinct movies.
EDIT: example
Movie ID->
Userid 1 2 3 4 5 6 7
1 0 1 1 0 0 1 0
2 0 1 0 1 0 0 0
3 0 1 1 0 0 0 1
4 1 1 0 0 1 0 0
5 0 1 1 0 0 0 1
6 1 1 1 1 1 0 0
7 0 1 1 0 0 0 0
8 0 1 1 1 0 0 1
9 0 1 1 0 0 1 0
The RDD will be of the sort
(userID, movID,rating)
101,1002,3.5
101,1003,2.5
101,1006,3
102,1002,3.5
102,1004,4.0
103,1002,1.0
103,1003,1.0
103,1007,5.0
….
val baseRDD = sc.parallelize(Seq((101, 1002, 3.5), (101, 1003, 2.5), (101, 1006, 3.0), (102, 1002, 3.5), (102, 1004, 4.0), (103, 1002, 1.0), (103, 1003, 1.0), (103, 1007, 5.0)))
baseRDD.map(x => (x._1, x._2)).groupByKey().foreach(println)
The input here is in the (userID, movID, rating) format you mentioned.
Result:
(101,CompactBuffer(1002, 1003, 1006))
(102,CompactBuffer(1002, 1004))
(103,CompactBuffer(1002, 1003, 1007))
Hi, I managed to generate the 2D matrix using the following function. It takes an RDD of the format
((userID, movID),rating)
101,1002,3.5
101,1003,2.5
101,1006,3
102,1002,3.5
102,1004,4.0
103,1002,1.0
103,1003,1.0
103,1007,5.0
and returns the characteristic matrix:
import org.apache.spark.rdd.RDD
import scala.collection.mutable.ArrayBuffer

def generate_characteristic_matrix(data_wo_header: RDD[((Int, Int), Int)]): Array[Array[Int]] = {
  val distinct_user_IDs = data_wo_header.sortByKey().map(x => x._1._1).distinct().collect().sorted
  val distinct_movie_IDs = data_wo_header.sortByKey().map(x => x._1._2).distinct().collect().sorted
  val movie_count = distinct_movie_IDs.size
  val user_count = distinct_user_IDs.size
  val map_movie = new ArrayBuffer[(Int, Int)]()
  val map_user = new ArrayBuffer[(Int, Int)]()
  // map movie IDs to 0 until movie_count
  for (a <- 0 to movie_count - 1) {
    map_movie += ((distinct_movie_IDs(a), a))
  }
  // map user IDs to 0 until user_count
  for (a <- 0 to user_count - 1) {
    map_user += ((distinct_user_IDs(a), a))
  }
  // size of char matrix is user_count x movie_count
  val char_matrix = Array.ofDim[Int](user_count, movie_count)
  data_wo_header.collect().foreach(x => {
    val user = x._1._1
    val movie = x._1._2
    val movie_mappedid = map_movie.filter(x => x._1 == movie).map(x => x._2).toArray
    val user_mappedid = map_user.filter(x => x._1 == user).map(x => x._2).toArray
    char_matrix(user_mappedid(0))(movie_mappedid(0)) = 1
  })
  return char_matrix
}
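The core of the function above is the ID-to-index mapping, which does not depend on Spark. The following plain-Python sketch shows that step on already-collected pairs; the name characteristic_matrix is hypothetical, and it swaps the linear filter lookups for dictionaries, which is an assumption about what would be preferable rather than part of the original answer.

```python
def characteristic_matrix(pairs):
    """Build a 0/1 user x movie matrix from (user_id, movie_id) pairs."""
    users = sorted({u for u, _ in pairs})
    movies = sorted({m for _, m in pairs})
    # Map each distinct ID to a dense row or column index
    user_index = {u: i for i, u in enumerate(users)}
    movie_index = {m: j for j, m in enumerate(movies)}
    matrix = [[0] * len(movies) for _ in users]
    for u, m in pairs:
        matrix[user_index[u]][movie_index[m]] = 1
    return matrix, users, movies


if __name__ == "__main__":
    sample = [(101, 1002), (101, 1003), (101, 1006), (102, 1002),
              (102, 1004), (103, 1002), (103, 1003), (103, 1007)]
    matrix, users, movies = characteristic_matrix(sample)
    for user, row in zip(users, matrix):
        print(user, row)
```

Returning the sorted ID lists alongside the matrix keeps the row and column labels recoverable.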

Count the number of overlapping substrings within a string

example:
s <- "aaabaabaa"
p <- "aa"
I want to return 4, not 3 (i.e. counting the number of "aa" instances in the initial "aaa" as 2, not 1).
Is there any package to solve it? Or is there any way to count in R?
I believe that
find_overlaps <- function(p,s) {
    gg <- gregexpr(paste0("(?=",p,")"),s,perl=TRUE)[[1]]
    if (length(gg)==1 && gg==-1) 0 else length(gg)
}
find_overlaps("aa","aaabaabaa") ## 4
find_overlaps("not_there","aaabaabaa") ## 0
find_overlaps("aa","aaaaaaaa") ## 7
will do what you want, which would be more clearly expressed as "finding the number of overlapping substrings within a string".
This is a minor variation on "Finding the indexes of multiple/overlapping matching substrings".
substring might be useful here, by taking every successive pair of characters.
( ss <- sapply(2:nchar(s), function(i) substring(s, i-1, i)) )
## [1] "aa" "aa" "ab" "ba" "aa" "ab" "ba" "aa"
sum(ss %in% p)
## [1] 4
I needed the answer to a related more-general question. Here is what I came up with generalizing Ben Bolker's solution:
my.data <- read.table(text = '
my.string my.cov
1.2... 1
.21111 2
..2122 3
...211 2
112111 4
212222 1
', header = TRUE, stringsAsFactors = FALSE)
desired.result.2ch <- read.table(text = '
my.string my.cov n.11 n.12 n.21 n.22
1.2... 1 0 0 0 0
.21111 2 3 0 1 0
..2122 3 0 1 1 1
...211 2 1 0 1 0
112111 4 3 1 1 0
212222 1 0 1 1 3
', header = TRUE, stringsAsFactors = FALSE)
desired.result.3ch <- read.table(text = '
my.string my.cov n.111 n.112 n.121 n.122 n.222 n.221 n.212 n.211
1.2... 1 0 0 0 0 0 0 0 0
.21111 2 2 0 0 0 0 0 0 1
..2122 3 0 0 0 1 0 0 1 0
...211 2 0 0 0 0 0 0 0 1
112111 4 1 1 1 0 0 0 0 1
212222 1 0 0 0 1 2 0 1 0
', header = TRUE, stringsAsFactors = FALSE)
find_overlaps <- function(s, my.cov, p) {
    gg <- gregexpr(paste0("(?=",p,")"),s,perl=TRUE)[[1]]
    if (length(gg)==1 && gg==-1) 0 else length(gg)
}
p <- c('11', '12', '21', '22', '111', '112', '121', '122', '222', '221', '212', '211')
my.output <- matrix(0, ncol = (nrow(my.data)+1), nrow = length(p))
for(i in seq(1,length(p))) {
    my.data$p <- p[i]
    my.output[i,1] <- p[i]
    my.output[i,(2:(nrow(my.data)+1))] <- apply(my.data, 1, function(x) find_overlaps(x[1], x[2], x[3]))
}
my.output
desired.result.2ch
desired.result.3ch
pre.final.output <- matrix(t(my.output[,2:7]), ncol=length(p), nrow=nrow(my.data))
final.output <- data.frame(my.data[,1:2], t(apply(pre.final.output, 1, as.numeric)))
colnames(final.output) <- c(colnames(my.data[,1:2]), paste0('x', p))
final.output
# my.string my.cov x11 x12 x21 x22 x111 x112 x121 x122 x222 x221 x212 x211
#1 1.2... 1 0 0 0 0 0 0 0 0 0 0 0 0
#2 .21111 2 3 0 1 0 2 0 0 0 0 0 0 1
#3 ..2122 3 0 1 1 1 0 0 0 1 0 0 1 0
#4 ...211 2 1 0 1 0 0 0 0 0 0 0 0 1
#5 112111 4 3 1 1 0 1 1 1 0 0 0 0 1
#6 212222 1 0 1 1 3 0 0 0 1 2 0 1 0
A tidy and, I think, more readable solution is:
library(tidyverse)
PatternCount <- function(text, pattern) {
    #Generate all sliding substrings
    map(seq_len(nchar(text) - nchar(pattern) + 1),
        function(x) str_sub(text, x, x + nchar(pattern) - 1)) %>%
    #Test them against the pattern
    map_lgl(function(x) x == pattern) %>%
    #Count the number of matches
    sum
}
PatternCount("aaabaabaa", "aa")
# 4
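The zero-width lookahead trick from the first answer carries over to other regex engines. As a rough sketch in Python (the function name count_overlapping is hypothetical), each position where the pattern starts yields one empty match. Unlike the R version, this sketch escapes the pattern so that it is treated as a literal string; that is a choice made here, not something the answers above do.

```python
import re


def count_overlapping(text, pattern):
    """Count occurrences of a literal pattern in text, allowing overlaps."""
    # (?=...) matches without consuming characters, so overlapping hits are all found
    return len(re.findall(f"(?={re.escape(pattern)})", text))


if __name__ == "__main__":
    print(count_overlapping("aaabaabaa", "aa"))
```

Note that an empty pattern matches at every position, so callers may want to reject it explicitly.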

Matlab string operation

I have converted a string to binary as follows
message='hello my name is kamran';
messagebin=dec2bin(message);
Is there any method for storing it in an array?
I am not really sure of what you want to do here, but if you need to concatenate the rows of the binary representation (which is a matrix of numchars times bits_per_char), this is the code:
message = 'hello my name is kamran';
messagebin = dec2bin(double(message));
linearmessagebin = reshape(messagebin',1,numel(messagebin));
Please note that the double conversion returns the ASCII codes of the characters. I do not have access to a Matlab installation here, but Octave, for example, complains about the code you provided in the original question.
NOTE
As it was kindly pointed out to me, you have to transpose the messagebin before "serializing" it, in order to have the correct result.
If you want the result as numeric matrix, try:
>> str = 'hello world';
>> b = dec2bin(double(str),8) - '0'
b =
0 1 1 0 1 0 0 0
0 1 1 0 0 1 0 1
0 1 1 0 1 1 0 0
0 1 1 0 1 1 0 0
0 1 1 0 1 1 1 1
0 0 1 0 0 0 0 0
0 1 1 1 0 1 1 1
0 1 1 0 1 1 1 1
0 1 1 1 0 0 1 0
0 1 1 0 1 1 0 0
0 1 1 0 0 1 0 0
Each row corresponds to a character. You can easily reshape it into a sequence of 0s and 1s.
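To make the row-per-character layout and the flattening step concrete, here is a small Python sketch of the same idea. It is not Matlab code; the name to_bits and the default width of 8 are assumptions chosen to mirror dec2bin(double(str),8) - '0'.

```python
def to_bits(message, width=8):
    """Return (rows, flat): one list of bits per character, and the flattened sequence."""
    # format(..., "08b") zero-pads the binary form of the character code to `width` digits
    rows = [[int(b) for b in format(ord(ch), f"0{width}b")] for ch in message]
    # Row-major flattening, the counterpart of transposing before reshape in Matlab
    flat = [bit for row in rows for bit in row]
    return rows, flat


if __name__ == "__main__":
    rows, flat = to_bits("hello world")
    for row in rows:
        print(*row)
```

Flattening row by row keeps each character's bits together, which is why the Matlab answer transposes before calling reshape.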
