MATLAB string vector / array handling (multiplication and str2num)

I would like to understand if this is really correct, or if this might be an issue in matlab.
I create a string vector/array via:
>>a=['1','2';'3','4']
It returns:
a =
12
34
Now I would like to convert the content from string to number and multiply this with a number:
>>6*str2num(a)
The result looks like this:
ans =
72
204
I don't understand why the comma-separated elements (strings) are concatenated rather than handled separately. If you use numbers instead of strings, they are handled separately. Then it looks like this:
>> a=[1,2;3,4]
a =
1 2
3 4
>> 6*a
ans =
6 12
18 24
I would expect the same results. Any ideas ?
Thanks

Have you read about how string handling is done in MATLAB?
Basically, a MATLAB string is a row vector of characters, so multiple strings can only be stacked vertically, one per row of a character matrix. If you try to place them side by side in a row, they are concatenated: ['1','2'] is the same as '12'. This is why the strings '1' and '2' are concatenated, as well as '3' and '4', and why str2num sees the rows 12 and 34. Also note that stacking is only possible if all resulting strings have the same length.
I'm not sure what you're trying to do, but if you want to store strings as a matrix (that is, multiple strings in a row), consider storing them in a cell array, for instance:
>> A = {'1', '2'; '3', '4'}
A =
'1' '2'
'3' '4'
>> cellfun(@str2num, A)
ans =
1 2
3 4

I would say that using a cell array as @EitanT suggests would probably be the best solution for you.
However, it is possible to handle strings (or rather characters) the way you tried, by manually inserting spaces and making sure every row has the same number of characters.
For example
>> a=['1 2';'3 4']
produces
a =
1 2
3 4
and using
>> 6*str2num(a)
produces
ans =
6 12
18 24
When a numeric matrix is converted to a character array using
b=[1,2;3,10000];
num2str(b)
spaces are inserted automatically and the characters are lined up properly. This produces
ans =
1      2
3  10000

Related

How to start a string at a certain character and end it at a certain character? (Lua)

Here's my question. I'm using Lua and I have a string that looks something like this:
"Start1.2.3.4.5-1.2.3.4.5-1.2.3.4.5-1.2.3.4.5-1.2.3.4.5End"
The five numbers between each hyphen are all paired to the same "object" but each represents a separate set of data. The period between the numbers separates the data.
So after Start, 1 = our first value, 2 = our second value, 3 = our third value, 4 = our fourth value, and 5 = our fifth value. These 5 values are stored to the same object. Then we hit our first hyphen which separates the "objects". So there's 5 objects and 5 values per object.
I used 1.2.3.4.5 as an example but these numbers will be randomized with up to 4 digits. So it could say something like Start12.3.100.1025.50- etc...
Hopefully that makes sense. Here's what I have done so far:
MyString = the long string I posted above
local extracted = string.match(MyString, "Start(.*)")
This returns everything beyond Start in the string. However, I want it to return everything after Start and then cut off once it reaches the next hyphen. Then from that point on I'll repeat the process but instead find everything between the hyphens until I reach End. I also need to filter out the periods. Also, the hyphens/periods can change to something else as long as they aren't numbers.
Any ideas on how to do this?

Just use a pattern that captures any run of digits and periods.
"([%d%.]+)" Note that the period is a magic character in Lua patterns, so it is escaped with %. (Inside a [] set the escape is optional, but it does no harm.)
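local text = "Start1.2.3.4.5-1.2.3.4.5-1.2.3.4.5-1.2.3.4.5-1.2.3.4.5End"
-- Outer loop: one match per object (a run of digits and periods)
for set in text:gmatch("([%d%.]+)") do
    print(set)
    local numbers = {}
    -- Inner loop: one match per value within the object
    for num in set:gmatch("%d+") do
        table.insert(numbers, num)
    end
    print(table.unpack(numbers)) -- use unpack(numbers) on Lua 5.1
end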
prints:
1.2.3.4.5
1 2 3 4 5
1.2.3.4.5
1 2 3 4 5
1.2.3.4.5
1 2 3 4 5
1.2.3.4.5
1 2 3 4 5
1.2.3.4.5
1 2 3 4 5
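
For comparison, the same two-level split (first into objects, then into values) can be sketched in Python with the standard re module. This is only an illustration under assumptions: the function name parse_objects is made up here, and like the Lua pattern above it assumes the values inside an object are separated by periods.

```python
import re

def parse_objects(text):
    """Split a string like 'Start1.2-3.4End' into lists of integers."""
    # Outer step: every run of digits and periods is one object.
    groups = re.findall(r"[\d.]+", text)
    # Inner step: every run of digits inside an object is one value.
    return [[int(n) for n in re.findall(r"\d+", group)] for group in groups]

objects = parse_objects("Start12.3.100.1025.50-1.2.3.4.5End")
print(objects)
```

Each inner list holds the values of one object, already converted to integers.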

Python: setting a string so it always has 2 decimals after comma

I'm looking to write a Python function that appends a digit to the end of a string. However, the string should always have 2 characters after the decimal separator.
I believe a string makes it easier to remove and skip characters. I will convert the result with float().
As an example:
I start at the string "0.00"
Adding a 5 will make it "0.05"
Adding a 5 and a 6 will make it "5.56" etc
Another example:
Again we start at "0.00". Adding the characters "5" "4" "3" "2" "1" one after another will ultimately result in "543.21".

Why not just convert the string to an integer and divide the number by 100?
num = int(input())
print(num/float(100))
E.g. input = '5',
convert to integer = 5,
Divide by 100 = 0.05
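
If the value has to stay a string between keystrokes, the same divide-by-100 idea can be applied one digit at a time. The sketch below is one possible way to do it, not the only one; the function name append_digit is an assumption made for this example.

```python
def append_digit(amount, digit):
    """Append one digit to a two-decimal string such as '0.05'."""
    if len(digit) != 1 or digit not in "0123456789":
        raise ValueError("digit must be a single character 0-9")
    # Drop the decimal point, append the digit, and read the result as hundredths.
    hundredths = int(amount.replace(".", "") + digit)
    whole, frac = divmod(hundredths, 100)
    # Always emit exactly two digits after the decimal point.
    return f"{whole}.{frac:02d}"

value = "0.00"
for ch in "54321":
    value = append_digit(value, ch)
print(value)  # 543.21
```

Working in whole hundredths avoids floating-point rounding until the final float() conversion.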

What is the most efficient format for storing strings from a for loop?

I have a script that runs through a series of strings and using regex pulls out certain strings (approx 4 output strings per input string).
e.g. HelloStackOverflowWorld
-> Hello; Stack; Overflow; World;
The final output would ideally be a table where I can filter based upon the strings in the columns. Using the case above, column 1 row 1 would have 'Hello', column 2 row 1 would have 'Stack' and so on.
The problem is, the size of the output will change depending on the input so I am unsure of what output format to use.
At the moment I used something similar to this:
if strfind(missing{ii},'hello')
miss.exch = [miss.exch;'hello'];
temp.exc = regexp(missing{ii},'(?<=\d[Q|T])(\w*?)(?=[q])','match');
miss.exc = [miss.exc;temp.exc];
temp.TQ= regexp(missing{ii},'(Qc|Tc)','match');
if strcmp(temp.TQ{1,1}, 'Tc')
miss.TQ = [miss.TQ;'variableA'];
elseif temp.TQ{1,1} == 'Qc'
miss.TQ = [miss.TQ;'variableB'];
end
else if .........
end
Which obviously results in a 1x1 struct consisting of a number of fields each with many cells. This makes filtering on strings an issue!
How can I define and add data into a 'table of strings' that I can then filter?

I think you are just looking for a cell array. Here is a simple example of what they can do:
C = {'Abc','Bcd';'Cde',[]}
strcmp(C,'Cde')
Results in:
ans =
0 0
1 0
Make sure to check doc cell to see how you can access them.

How to change stringified numbers in data frame into pure numeric values in R

I have the following data.frame:
employee <- c('John Doe','Peter Gynn','Jolie Hope')
# Note that the salary below is in stringified format.
# In reality there are more such stringified numerical columns.
salary <- as.character(c(21000, 23400, 26800))
df <- data.frame(employee,salary)
The output is:
> str(df)
'data.frame': 3 obs. of 2 variables:
$ employee: Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
$ salary : Factor w/ 3 levels "21000","23400",..: 1 2 3
What I want to do is convert the values from strings into plain numbers
directly in the df variable, while preserving the string names for employee.
I tried this, but it doesn't work:
as.numeric(df)
At the end of the day I'd like to perform arithmetic on these numeric
values from df. Such as df2 <- log2(df), etc.

Ok, there's a couple of things going on here:
R has two different datatypes that look like strings: factor and character
You can't modify most R objects in place, you have to change them by assignment
The actual fix for your example is:
df$salary = as.numeric(as.character(df$salary))
If you try to call as.numeric on df$salary without converting it to character first, you'd get a somewhat strange result:
> as.numeric(df$salary)
[1] 1 2 3
When R creates a factor, it turns the unique elements of the vector into levels, and then represents those levels using integers, which is what you see when you try to convert to numeric.
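
To make the level/code mechanism concrete, here is a rough imitation of it in Python. It is an analogy only, not how R is implemented; the variable names are invented for the example, and the levels are assumed to sort as plain strings.

```python
values = ["21000", "23400", "26800"]

# A factor keeps the sorted unique values as its levels ...
levels = sorted(set(values))
# ... and stores each element as the 1-based position of its level.
codes = [levels.index(v) + 1 for v in values]
print(codes)  # [1, 2, 3], what as.numeric(df$salary) shows

# Going through the level labels first recovers the real numbers,
# which is what as.numeric(as.character(df$salary)) does.
recovered = [float(levels[c - 1]) for c in codes]
print(recovered)  # [21000.0, 23400.0, 26800.0]
```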

read complicated .txt file into Matlab

I would like to read a .txt file into Matlab.
One of the columns contains both letters and numbers.
(So I guess one way is to read this column as strings.)
The problem is I also need to find out numbers which are larger than 5 within that column.
e.g. The .txt looks like
12 1
21 2
32 7
11 a
03 b
22 4
13 5
31 6
i.e. Ultimately, I would like to get
32 7
31 6
How can I get it?? Any experts, please help!

You can read the contents of the file into a cell array of strings using TEXTSCAN, convert the strings to numeric values using CELLFUN and STR2NUM (characters like 'a' and 'b' will result in the empty matrix []), remove rows of the cell array that have any empty cells in them, then convert the remaining data into an N-by-2 matrix using CELL2MAT:
fid = fopen('junk.txt','r'); %# Open the file
data = textscan(fid,'%s %s','CollectOutput',true); %# Read the data as strings
fclose(fid); %# Close the file
data = cellfun(@str2num,data{1},'UniformOutput',false); %# Convert to numbers
data(any(cellfun('isempty',data),2),:) = []; %# Remove empty cells
data = cell2mat(data); %# Convert to N-by-2 array
The matrix data will now look like this, given your sample file in the question:
>> data
data =
12 1
21 2
32 7
22 4
13 5
31 6
And you can get the rows that have a value greater than 5 in the second column like so:
>> data(data(:,2) > 5,:)
ans =
32 7
31 6

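fid = fopen('txt.txt','r');   % open the file for reading
Aout = [];
while true
    % Read the next two whitespace-delimited tokens as strings
    [a1,count1] = fscanf(fid,'%s',1);
    [a2,count2] = fscanf(fid,'%s',1);
    if count1 < 1 || count2 < 1
        break;   % end of file
    end
    n1 = str2num(a1);
    n2 = str2num(a2);
    % Keep the row only if both tokens are numeric and the second exceeds 5
    if ~isempty(n1) && ~isempty(n2) && n2 > 5
        Aout = [Aout; n1 n2];
    end
end
fclose(fid);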
This violates the unspoken rule against growing a MATLAB variable inside a loop, but it's text processing anyway, so you probably won't notice the slowness.
Edit: Had too many errors in previous version, had to start fresh.
