kdb/q: How to apply a string manipulation function to a vector of strings to output a vector of strings?


Thanks in advance for the help. I am new to kdb/q, coming from a Python and C++ background.
Just a simple syntax question: I have a string with fields and their corresponding values
pp_str: "field_1:abc field_2:xyz field_3:kdb"
I wrote an atomic (scalar) function to extract the value of a given field.
get_field_value: {[field; pp_str] pp_fields: " " vs pp_str; pid_field: pp_fields[where like[pp_fields; field,":*"]]; start_i: (pid_field[0] ss ":")[0] + 1; end_i: count pid_field[0]; indices: start_i + til (end_i - start_i); pid_field[0][indices]}
show get_field_value["field_1"; pp_str]
"abc"
show get_field_value["field_3"; pp_str]
"kdb"
Now how do I generalize this so that if I input a vector of fields, I get a vector of values? I want to input ("field_1"; "field_2"; "field_3") and output ("abc"; "xyz"; "kdb"). I tried multiple approaches (below) but I just don't understand kdb/q's syntax well enough to vectorize my function:
/ Attempt 1 - Fail
get_field_value[enlist ("field_1"; "field_2"); pp_str]
/ Attempt 2 - Fail
get_field_value[; pp_str] /. enlist ("field_1"; "field_3")
/ Attempt 3 - Fail
fields: ("field_1"; "field_2")
get_field_value[fields; pp_str]

To run your function once per field, project it on pp_str (fixing the second argument) and apply the projection to the list of fields with each:
q)get_field_value[;pp_str]each("field_1";"field_3")
"abc"
"kdb"
kdb+ also has built-in functionality for parsing key-value pairs: https://code.kx.com/q/ref/file-text/#key-value-pairs
q){@[;x](!/)"S: "0:y}[`field_1;pp_str]
"abc"
q)
q){@[;x](!/)"S: "0:y}[`field_1`field_3;pp_str]
"abc"
"kdb"
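Since the asker comes from Python, the key-value approach above can be sketched in Python terms: parse the string into a dictionary once, then index it with a list of keys. This is only an analogy, not q code; parse_fields and get_field_values are hypothetical names introduced for illustration.

```python
pp_str = "field_1:abc field_2:xyz field_3:kdb"


def parse_fields(text):
    """Build a dict from space-separated "key:value" items."""
    # Split on the first colon only, so values may themselves contain colons.
    return dict(item.split(":", 1) for item in text.split(" "))


def get_field_values(fields, text):
    """Look up each requested field, returning the values in the same order."""
    lookup = parse_fields(text)
    return [lookup[field] for field in fields]


print(get_field_values(["field_1", "field_3"], pp_str))  # ['abc', 'kdb']
```

The list comprehension plays the role of each, and the dictionary lookup plays the role of indexing the q dictionary with a list of symbols.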

I think this might be the syntax you're looking for.
q)get_field_value[; pp_str]each("field_1";"field_2")
"abc"
"xyz"

Related

Elifs conditions are not working in my program

This is my code that takes a number of codons. Codons are a group of three nucleotides, each coding for an Amino Acid
codon_sequence=[]
print("Enter no. of codons you want")
n=int(input())
for i in range(n):
    codon=str(input())
    codon_sequence.append(codon)
print(codon_sequence)
for i in range(n):
    if(codon_sequence[i]=="UUU" or "UUC" or "TTT" or "TTC"):
        print("Phe_")
    elif(codon_sequence[i]=="UUA" or "UUG" or "CUU" or "CUC" or "CUG" or "CUA" or "TTA" or "TTG" or "CTT" or "CTC" or "CTG" or "CTA"):
        print("Leu_")
    elif(codon_sequence[i]=="UCU" or "UCC" or "UCG" or "UCA" or "AGU" or "AGC" or "TCT" or "TCC" or "TCG" or "TCA" or "AGT" or "AGC"):
        print("Ser_")
    elif(codon_sequence[i]=="UAU" or "UAC" or "TAT" or "TAC"):
        print("Tyr_")
    elif(codon_sequence[i]=="UGU" or "UGC" or "TGT" or "TGC"):
        print("Cys_")
    elif(codon_sequence[i]=="UGG" or "TGG"):
        print("Trp_")
    elif(codon_sequence[i]=="CCU" or "CCC" or "CCA" or "CCG" or "CCT"):
        print("Pro_")
    elif(codon_sequence[i]=="CGU" or "CGC" or "CGA" or "CGG" or "AGA" or "AGG" or "CGT"):
        print("Arg_")
    elif(codon_sequence[i]=="CAU" or "CAC" or "CAT"):
        print("His_")
    elif(codon_sequence[i]=="CAA" or "CAG"):
        print("Gln_")
    elif(codon_sequence[i]=="AUU" or "AUC" or "AUA" or "ATT" or "ATC" or "ATA"):
        print("Ile_")
    elif(codon_sequence[i]=="AUG"):
        print("Met_")
    elif(codon_sequence[i]=="ACU" or "ACC" or "ACA" or "ACG" or "ACT"):
        print("Thr_")
    elif(codon_sequence[i]=="GUU" or "GUC" or "GUA" or "GUG" or "GTT" or "GTC" or "GTA" or "GTG"):
        print("Val_")
    elif(codon_sequence[i]=="GCU" or "GCC" or "GCA" or "GCG" or "GCT"):
        print("Ala_")
    elif(codon_sequence[i]=="GGU" or "GGC" or "GGA" or "GGG" or "GGT"):
        print("Gly_")
    elif(codon_sequence[i]=="GAU" or "GAC" or "GAT"):
        print("Asp_")
    elif(codon_sequence[i]=="GAA" or "GAG"):
        print("Glu_")
    elif(codon_sequence[i]=="AAU" or "AAC" or "AAT"):
        print("Asn_")
    elif(codon_sequence[i]=="AAA" or "AAG"):
        print("Lys_")
    else:
        print("Stop_")
However, this prints only 'Phe_' for every codon and ignores all the other conditions.

Why your code never reaches the elif blocks
Each comparison has to be written out in full, checking codon_sequence[i] against every string of interest.
Your if and elif conditions should look like this:
if(codon_sequence[i]=="UUU" or codon_sequence[i]=="UUC" or codon_sequence[i]=="TTT" or codon_sequence[i]=="TTC"):
Instead, you are or-ing the first comparison with bare strings such as "UUC".
A non-empty string is always truthy in Python, so the first if condition is always true.
As a result, you never reach the elif blocks.
A cleaner way of writing the if statement would be:
if codon_sequence[i] in ["UUU", "UUC", "TTT", "TTC"]:
    print("Phe_")
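As a rough sketch of where this leads, the whole if/elif chain can be replaced by a dictionary lookup. The table below is deliberately partial (three amino acids only) and the names CODON_GROUPS, CODON_TO_AMINO and translate are made up for illustration; the "Stop_" default mirrors the else branch in the question.

```python
# Why the original condition is always true: `or` returns the first truthy
# operand, and a non-empty string such as "UUC" is truthy.
always_truthy = ("XYZ" == "UUU" or "UUC")
print(always_truthy)  # UUC

# Partial, illustrative table: amino acid label -> codons that code for it.
CODON_GROUPS = {
    "Phe_": ["UUU", "UUC", "TTT", "TTC"],
    "Tyr_": ["UAU", "UAC", "TAT", "TAC"],
    "Trp_": ["UGG", "TGG"],
}

# Invert the table once so that each codon maps directly to its label.
CODON_TO_AMINO = {
    codon: amino for amino, codons in CODON_GROUPS.items() for codon in codons
}


def translate(codon_sequence, default="Stop_"):
    """Return the label for each codon, falling back to `default`."""
    return [CODON_TO_AMINO.get(codon, default) for codon in codon_sequence]


print(translate(["UUU", "TGG", "UAA"]))  # ['Phe_', 'Trp_', 'Stop_']
```

With a partial table, any codon that is missing falls through to the default, so the table would need every codon from the question before the output could be trusted.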

This would be a great candidate for a switch statement, but as the previous answer mentioned you can't put an "or" between each string like you're doing.

How can I take the outer product of string vectors in J?

I'm trying to replicate the outer product notation in APL:
∘.,⍨ 'x1' 'y1' 'z1' 'x2' 'y2' 'z2' 'x3' 'y3' 'z3'
which yields
x1x1 x1y1 x1z1 x1x2 x1y2 x1z2 x1x3 x1y3 x1z3
y1x1 y1y1 y1z1 y1x2 y1y2 y1z2 y1x3 y1y3 y1z3
z1x1 z1y1 z1z1 z1x2 z1y2 z1z2 z1x3 z1y3 z1z3
x2x1 x2y1 x2z1 x2x2 x2y2 x2z2 x2x3 x2y3 x2z3
y2x1 y2y1 y2z1 y2x2 y2y2 y2z2 y2x3 y2y3 y2z3
z2x1 z2y1 z2z1 z2x2 z2y2 z2z2 z2x3 z2y3 z2z3
x3x1 x3y1 x3z1 x3x2 x3y2 x3z2 x3x3 x3y3 x3z3
y3x1 y3y1 y3z1 y3x2 y3y2 y3z2 y3x3 y3y3 y3z3
z3x1 z3y1 z3z1 z3x2 z3y2 z3z2 z3x3 z3y3 z3z3
But I can't figure out how to do something similar in J. I found this Cartesian product in J post that I thought would be similar enough, but I just can't seem to translate it to an array of strings from an array of numbers.
Adapting Dan Bron's answer therein and applying it to a simpler example
6 6 $ , > { 2 # < 'abc'
gives
aaabac
babbbc
cacbcc
aaabac
babbbc
cacbcc
which is almost what I want, but I don't know how to generalize it to use 2-letter (or more) strings instead of single ones in a similar fashion. I also don't know how to format those results with spaces between the pairs like the APL output, so it may not be the right path either.
Similarly, I tried adapting Michael Berry's answer from that thread to get
9 36 $ ,,"1/ ~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
which gives
x1x1x1y1x1z1x1x2x1y2x1z2x1x3x1y3x1z3
y1x1y1y1y1z1y1x2y1y2y1z2y1x3y1y3y1z3
z1x1z1y1z1z1z1x2z1y2z1z2z1x3z1y3z1z3
x2x1x2y1x2z1x2x2x2y2x2z2x2x3x2y3x2z3
y2x1y2y1y2z1y2x2y2y2y2z2y2x3y2y3y2z3
z2x1z2y1z2z1z2x2z2y2z2z2z2x3z2y3z2z3
x3x1x3y1x3z1x3x2x3y2x3z2x3x3x3y3x3z3
y3x1y3y1y3z1y3x2y3y2y3z2y3x3y3y3y3z3
z3x1z3y1z3z1z3x2z3y2z3z2z3x3z3y3z3z3
Again, this is almost what I want, and this one handled the multiple characters, but there are still no spaces between them and the command is getting farther from the simplicity of the APL version.
I can get the same results a bit more cleanly with ravel items
,. ,"1/ ~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
I've been going through the J primer and exploring parts that look relevant in the dictionary, but I'm still very new, so I apologize if this is a dumb question. I feel like the rank conjunction operator should be able to help me here, but I had a hard time following its explanation in the primer. I played with ": to try to format the strings to have trailing spaces, but I also couldn't figure that out. The fact that this was so easy in APL also makes me think I'm doing something very wrong in J to be having this much trouble.
After reading more of the primer I got something that looks like what I want with
,. 9 1 $ ' ' ,."2 ,"1/~ [ ;._2 'x1 y1 z1 x2 y2 z2 x3 y3 z3 '
but this is still way more complicated than the APL version, so I'm still hoping there is an actually elegant and concise way to do this.

I think the only thing I can add to what you have already pointed out is that, to keep each string as a separate component, you need to box it.
<@,"1/~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
+----+----+----+----+----+----+----+----+----+
|x1x1|x1y1|x1z1|x1x2|x1y2|x1z2|x1x3|x1y3|x1z3|
+----+----+----+----+----+----+----+----+----+
|y1x1|y1y1|y1z1|y1x2|y1y2|y1z2|y1x3|y1y3|y1z3|
+----+----+----+----+----+----+----+----+----+
|z1x1|z1y1|z1z1|z1x2|z1y2|z1z2|z1x3|z1y3|z1z3|
+----+----+----+----+----+----+----+----+----+
|x2x1|x2y1|x2z1|x2x2|x2y2|x2z2|x2x3|x2y3|x2z3|
+----+----+----+----+----+----+----+----+----+
|y2x1|y2y1|y2z1|y2x2|y2y2|y2z2|y2x3|y2y3|y2z3|
+----+----+----+----+----+----+----+----+----+
|z2x1|z2y1|z2z1|z2x2|z2y2|z2z2|z2x3|z2y3|z2z3|
+----+----+----+----+----+----+----+----+----+
|x3x1|x3y1|x3z1|x3x2|x3y2|x3z2|x3x3|x3y3|x3z3|
+----+----+----+----+----+----+----+----+----+
|y3x1|y3y1|y3z1|y3x2|y3y2|y3z2|y3x3|y3y3|y3z3|
+----+----+----+----+----+----+----+----+----+
|z3x1|z3y1|z3z1|z3x2|z3y2|z3z2|z3x3|z3y3|z3z3|
+----+----+----+----+----+----+----+----+----+
If you want to get rid of the boxes and insert spaces instead, you will no longer have the character items separately; you will have long strings with the spaces as part of the result.
It is a very good question, because it requires you to understand that character strings in J are vectors. I suppose that technically what you are looking for is the following, which results in an array of shape 9 9 4, but it won't look the way you expect.
,"1/~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
x1x1
x1y1
x1z1
x1x2
x1y2
x1z2
x1x3
x1y3
x1z3
y1x1
y1y1
y1z1
y1x2
y1y2
y1z2
y1x3
y1y3
y1z3
z1x1
z1y1
z1z1
z1x2
z1y2
z1z2
z1x3
z1y3
z1z3
x2x1
x2y1
x2z1
x2x2
x2y2
x2z2
x2x3
x2y3
x2z3
y2x1
y2y1
y2z1
y2x2
y2y2
y2z2
y2x3
y2y3
y2z3
z2x1
z2y1
z2z1
z2x2
z2y2
z2z2
z2x3
z2y3
z2z3
x3x1
x3y1
x3z1
x3x2
x3y2
x3z2
x3x3
x3y3
x3z3
y3x1
y3y1
y3z1
y3x2
y3y2
y3z2
y3x3
y3y3
y3z3
z3x1
z3y1
z3z1
z3x2
z3y2
z3z2
z3x3
z3y3
z3z3
$ ,"1/~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
9 9 4
You could also take the boxes and convert them to symbols, which might be closer to what you want, although they do have the backtick indicator as part of their representation.
s:@<@,"1/~ 9 2 $ 'x1y1z1x2y2z2x3y3z3'
`x1x1 `x1y1 `x1z1 `x1x2 `x1y2 `x1z2 `x1x3 `x1y3 `x1z3
`y1x1 `y1y1 `y1z1 `y1x2 `y1y2 `y1z2 `y1x3 `y1y3 `y1z3
`z1x1 `z1y1 `z1z1 `z1x2 `z1y2 `z1z2 `z1x3 `z1y3 `z1z3
`x2x1 `x2y1 `x2z1 `x2x2 `x2y2 `x2z2 `x2x3 `x2y3 `x2z3
`y2x1 `y2y1 `y2z1 `y2x2 `y2y2 `y2z2 `y2x3 `y2y3 `y2z3
`z2x1 `z2y1 `z2z1 `z2x2 `z2y2 `z2z2 `z2x3 `z2y3 `z2z3
`x3x1 `x3y1 `x3z1 `x3x2 `x3y2 `x3z2 `x3x3 `x3y3 `x3z3
`y3x1 `y3y1 `y3z1 `y3x2 `y3y2 `y3z2 `y3x3 `y3y3 `y3z3
`z3x1 `z3y1 `z3z1 `z3x2 `z3y2 `z3z2 `z3x3 `z3y3 `z3z3

I'd say the closest direct analogue of the APL expression is to keep each string boxed:
,&.>/~ 'x1';'y1';'z1';'x2';'y2';'z2';'x3';'y3';'z3'
┌────┬────┬────┬────┬────┬────┬────┬────┬────┐
│x1x1│x1y1│x1z1│x1x2│x1y2│x1z2│x1x3│x1y3│x1z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│y1x1│y1y1│y1z1│y1x2│y1y2│y1z2│y1x3│y1y3│y1z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│z1x1│z1y1│z1z1│z1x2│z1y2│z1z2│z1x3│z1y3│z1z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│x2x1│x2y1│x2z1│x2x2│x2y2│x2z2│x2x3│x2y3│x2z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│y2x1│y2y1│y2z1│y2x2│y2y2│y2z2│y2x3│y2y3│y2z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│z2x1│z2y1│z2z1│z2x2│z2y2│z2z2│z2x3│z2y3│z2z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│x3x1│x3y1│x3z1│x3x2│x3y2│x3z2│x3x3│x3y3│x3z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│y3x1│y3y1│y3z1│y3x2│y3y2│y3z2│y3x3│y3y3│y3z3│
├────┼────┼────┼────┼────┼────┼────┼────┼────┤
│z3x1│z3y1│z3z1│z3x2│z3y2│z3z2│z3x3│z3y3│z3z3│
└────┴────┴────┴────┴────┴────┴────┴────┴────┘

Eliminate one list according to another list in Python

I have a two-dimensional list like this:
x_irp_group = [['x1_1_4', 'x1_2_4', 'x1_3_4', 'x1_4_4', 'x1_5_4', 'x1_6_4', 'x1_7_4', 'x1_8_4', 'x1_9_4', 'x1_10_4', 'x1_1_5', 'x1_2_5', 'x1_3_5', 'x1_4_5', 'x1_5_5', 'x1_6_5', 'x1_7_5', 'x1_8_5', 'x1_9_5', 'x1_10_5', 'x1_1_6', 'x1_2_6', 'x1_3_6', 'x1_4_6', 'x1_5_6', 'x1_6_6', 'x1_7_6', 'x1_8_6', 'x1_9_6', 'x1_10_6', 'x1_1_7', 'x1_2_7', 'x1_3_7', 'x1_4_7', 'x1_5_7', 'x1_6_7', 'x1_7_7', 'x1_8_7', 'x1_9_7', 'x1_10_7', 'x1_1_8', 'x1_2_8', 'x1_3_8', 'x1_4_8', 'x1_5_8', 'x1_6_8', 'x1_7_8', 'x1_8_8', 'x1_9_8', 'x1_10_8'], ['x1_1_8', 'x1_2_8', 'x1_3_8', 'x1_4_8', 'x1_5_8', 'x1_6_8', 'x1_7_8', 'x1_8_8', 'x1_9_8', 'x1_10_8', 'x1_1_9', 'x1_2_9', 'x1_3_9', 'x1_4_9', 'x1_5_9', 'x1_6_9', 'x1_7_9', 'x1_8_9', 'x1_9_9', 'x1_10_9', 'x1_1_10', 'x1_2_10', 'x1_3_10', 'x1_4_10', 'x1_5_10', 'x1_6_10', 'x1_7_10', 'x1_8_10', 'x1_9_10', 'x1_10_10', 'x1_1_11', 'x1_2_11', 'x1_3_11', 'x1_4_11', 'x1_5_11', 'x1_6_11', 'x1_7_11', 'x1_8_11', 'x1_9_11', 'x1_10_11', 'x1_1_12', 'x1_2_12', 'x1_3_12', 'x1_4_12', 'x1_5_12', 'x1_6_12', 'x1_7_12', 'x1_8_12', 'x1_9_12', 'x1_10_12']]
I want to remove from this two-dimensional list every element that appears in another, one-dimensional list like this:
x_irp_eliminated_list = ['x1_1_4', 'x1_1_8', 'x1_1_12', 'x1_1_16', 'x1_1_19', 'x1_1_22', 'x1_1_26', 'x1_1_30', 'x1_1_34', 'x1_1_37', 'x1_1_43', 'x1_1_49', 'x1_1_55', 'x1_1_61', 'x1_1_68', 'x1_1_75', 'x1_1_81', 'x1_1_87', 'x1_1_92', 'x1_1_96', 'x1_1_101', 'x1_1_107', 'x1_1_112', 'x1_1_116', 'x1_1_121', 'x1_1_126', 'x1_1_131', 'x1_1_134', 'x1_1_137', 'x1_1_141', 'x1_1_145', 'x1_1_149', 'x1_1_152', 'x1_1_155', 'x1_1_160', 'x1_1_164', 'x1_1_169', 'x1_1_173', 'x1_1_181', 'x1_1_189', 'x1_1_197', 'x1_1_205', 'x1_2_8', 'x1_2_10', 'x1_2_13', 'x1_2_17', 'x1_2_21', 'x1_2_25', 'x1_2_28', 'x1_2_30', 'x1_2_34', 'x1_2_40', 'x1_2_45', 'x1_2_51', 'x1_2_58', 'x1_2_66', 'x1_2_71', 'x1_2_77', 'x1_2_82', 'x1_2_86', 'x1_2_91', 'x1_2_97', 'x1_2_102', 'x1_2_106', 'x1_2_111', 'x1_2_117', 'x1_2_122', 'x1_2_125', 'x1_2_129', 'x1_2_132', 'x1_2_135', 'x1_2_139', 'x1_2_143', 'x1_2_147', 'x1_2_151', 'x1_2_154', 'x1_2_157', 'x1_2_161', 'x1_2_166', 'x1_2_172', 'x1_2_177', 'x1_2_181', 'x1_2_189', 'x1_2_197', 'x1_2_205', 'x1_2_214', 'x1_3_1', 'x1_3_4', 'x1_3_8', 'x1_3_11', 'x1_3_15', 'x1_3_18', 'x1_3_22', 'x1_3_25', 'x1_3_28', 'x1_3_32', 'x1_3_35', 'x1_3_39', 'x1_3_42', 'x1_3_46', 'x1_3_49', 'x1_3_52', 'x1_3_56', 'x1_3_59', 'x1_3_63', 'x1_3_66', 'x1_3_70', 'x1_3_73', 'x1_3_77', 'x1_3_81', 'x1_3_85', 'x1_3_88', 'x1_3_91', 'x1_3_94', 'x1_3_97', 'x1_3_101', 'x1_3_105', 'x1_3_109', 'x1_3_112', 'x1_3_115', 'x1_3_118', 'x1_3_122', 'x1_3_126', 'x1_3_130', 'x1_3_134', 'x1_3_137', 'x1_3_140', 'x1_3_143', 'x1_3_147', 'x1_3_151', 'x1_3_156', 'x1_3_159', 'x1_3_163']
I wrote the following code, but it did not work:
x_final = [i for i, j in zip(x_irp_group, x_irp_eliminated_list) if i == j]
I have shortened the lists here. Normally they are much bigger than this.

The list comprehension you have isn't working because zip pairs the elements of the two lists position by position, which isn't the operation you want (they are not parallel arrays). What you want is something along these lines:
x_final = [i for i in x_irp_group[0] if (i not in x_irp_eliminated_list)]
Note that for a 2D list you need to nest this, like so:
# writing normal loops you'd write:
# for row in x_irp_group:
#     for i in row:
#         if (...):
# so I typically try to indent the loops similarly since nested array comprehension
# gets complicated, honestly I'd likely prefer using generator functions for this anyway
x_final = [[i for i in row
            if (i not in x_irp_eliminated_list)
           ] for row in x_irp_group
          ]
Be aware, though, that i not in x_irp_eliminated_list is slow for a list, because each test scans the whole list. Converting it to a set improves performance:
x_irp_eliminated_set = set(x_irp_eliminated_list)
x_final = [i for i in x_irp_group[0] if (i not in x_irp_eliminated_set)]
Or, if the order you need is simply sorted order, you could convert both lists to sets, subtract one from the other, and sort the result again:
x_final = [ sorted(set(x_irp_group[0]) - set(x_irp_eliminated_list)) ]
although if you have very large lists this would probably be less desirable.

x_irp_eliminated_list_set = set(x_irp_eliminated_list)
x_last = [i for row in x_irp_group
          for i in row
          if (i in x_irp_eliminated_list_set)]
print(x_last[:30])
I used this for faster operation. The set approach made it faster; thanks for that information, I learned something new. However, it creates a one-dimensional list, and I would like to create a two-dimensional list like the original x_irp_group.
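One way to keep the two-dimensional shape while still using the set is to nest the comprehension per row, as in the answer above. The following is a minimal sketch using tiny made-up sample data with the same variable names, not the asker's real lists.

```python
# Small illustrative data in the same format as the question's lists.
x_irp_group = [
    ["x1_1_4", "x1_2_4", "x1_1_8"],
    ["x1_1_8", "x1_2_8", "x1_3_9"],
]
x_irp_eliminated_list = ["x1_1_4", "x1_1_8", "x1_2_8"]

# Build the set once so that each membership test is a constant-time lookup.
eliminated = set(x_irp_eliminated_list)

# The outer comprehension walks the rows and the inner one filters each row,
# so the result has one (possibly shorter) list per original row.
x_final = [[item for item in row if item not in eliminated] for row in x_irp_group]

print(x_final)  # [['x1_2_4'], ['x1_3_9']]
```

Swapping not in for in would instead keep only the elements that appear in the eliminated list, which is what the one-dimensional snippet above does.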

How to match optional Number along with alphanumeric in Ruta Script

I am working on entity extraction in Pega. I have a requirement to match a policy number that has three parts:
1) An optional leading 1 as the first character of the policy number
2) Two alphanumeric characters, optionally followed by a hyphen or space
3) Three alphanumeric characters
Some example formats are:
AB-CDE, AB CDE, ABCDE, 1AB-CDE
23-456, 23 456, 23456, 123456
AB-2B4, AB-B2C, A1-2B4, 2A-34B, 12A-34B, 123-45C etc.
I run into problems whenever the policy number starts with 2 or 3 digits, or does not contain a space or hyphen.
For example: 12A-34B, 123-45C, 23456, 123456.
I have written the script below:
PACKAGE uima.ruta.example;
Document{-> RETAINTYPE(SPACE)};
("1")+? ((NUM* W*)|(W* NUM*)){REGEXP(".{2}")} ("-"|SPACE)? ((NUM* W* NUM*)|(W* NUM* W*)){REGEXP(".{3}")->MARK(EntityType,1,4)};
((NUM* W*)|(W* NUM*)){REGEXP(".{2}")} ("-"|SPACE)? ((NUM* W* NUM*)|(W* NUM* W*)){REGEXP(".{3}")->MARK(EntityType,1,3)};
This code works for patterns that contain a space or hyphen, such as
AB-CDE, AB CDE, 1AB-CDE, but it does not work when there is no space or hyphen, or when the pattern starts with 2 or 3 digits.
Please help me write the correct pattern.
Thanks in advance.

The UIMA Ruta seed annotation NUM covers the whole number. Therefore, examples like 23456 and 123456 cannot be split into sub-annotations by Ruta.
A solution would be to use a pure regular expression to annotate all the mentioned examples:
"\\w{2,3}[\\-\\s]?\\w{2,3}" -> EntityType;
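To sanity-check a pattern like this outside Ruta, it can help to try it against the examples from the question. The sketch below uses Python's re module rather than Ruta, and it uses a stricter pattern written directly from the three parts in the question (optional 1, two alphanumerics, optional hyphen or space, three alphanumerics). That stricter pattern and the name is_policy_number are assumptions for illustration, not part of the answer above.

```python
import re

# Assumed pattern: optional leading "1", two alphanumerics, an optional
# hyphen or space, then three alphanumerics.
POLICY_RE = re.compile(r"1?[A-Za-z0-9]{2}[- ]?[A-Za-z0-9]{3}")


def is_policy_number(text):
    """Return True if the whole string has the policy-number shape."""
    return POLICY_RE.fullmatch(text) is not None


examples = [
    "AB-CDE", "AB CDE", "ABCDE", "1AB-CDE",
    "23-456", "23 456", "23456", "123456",
    "AB-2B4", "AB-B2C", "A1-2B4", "2A-34B", "12A-34B", "123-45C",
]
for example in examples:
    print(example, is_policy_number(example))
```

Every example from the question matches in full, while strings such as ABCD or AB--CDE do not.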

Sort list python3

I would like to order this list.
From:
01104D-BB'42
01104D-BB42
01104D-BB43
01104D-CC'42
01104D-CC'72
01104D-CC32
01104D-CC42
01104D-CC62
01104D-CC72
01104D-DD'74
01104D-DD'75
01104D-DD'76
01104D-DD'77
01104D-DD'78
01104D-DD75
01104D-DD76
01104D-DD77
01104D-DD78
01104D-EE'102
01104D-EE'12
01104D-EE'2
01104D-EE'32
01104D-EE'42
01104D-EE'52
01104D-EE'53
01104D-EE'72
01104D-EE'82
01104D-EE'92
01104D-EE102
01104D-EE12
01104D-EE2
01104D-EE3
01104D-EE32
01104D-EE42
01104D-EE52
01104D-EE62
01104D-EE72
01104D-EE82
01104D-EE83
01104D-EE92
01104D-EE93
To:
01104D-BB42
01104D-BB43
01104D-BB'42
01104D-CC32
01104D-CC42
01104D-CC62
01104D-CC72
01104D-CC'42
01104D-CC'72
01104D-DD75
01104D-DD76
01104D-DD77
01104D-DD78
01104D-DD'74
01104D-DD'75
01104D-DD'76
01104D-DD'77
01104D-DD'78
01104D-EE102
01104D-EE12
01104D-EE2
01104D-EE3
01104D-EE32
01104D-EE42
01104D-EE52
01104D-EE62
01104D-EE72
01104D-EE82
01104D-EE83
01104D-EE92
01104D-EE93
01104D-EE'102
01104D-EE'12
01104D-EE'2
01104D-EE'32
01104D-EE'42
01104D-EE'52
01104D-EE'53
01104D-EE'72
01104D-EE'82
01104D-EE'92
Can you help me?
thanks

I'm guessing here, because you haven't explained how you want the sort to be done. But it looks like you want the character ' to sort after the digits 0-9, whereas ASCII order puts it before the digits. If that is correct, then you need to substitute a different character for ' in the sort key. A good choice might be ~, because it is the last printable ASCII character.
If your data is in mylist, then
mylist.sort(key=lambda a: a.replace("'","~"))
will sort it in the order I'm guessing you want.
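As a quick check of that guess, here is a small runnable sketch on a handful of the values from the question. The names codes and ordered are made up for illustration, and sorted is used instead of list.sort so that the input list is left untouched.

```python
# A few of the values from the question, in their original (plain ASCII) order.
codes = [
    "01104D-BB'42",
    "01104D-BB42",
    "01104D-BB43",
    "01104D-EE'102",
    "01104D-EE102",
    "01104D-EE12",
    "01104D-EE2",
]

# Only the sort key swaps ' for ~; the strings themselves are not modified.
ordered = sorted(codes, key=lambda code: code.replace("'", "~"))

for code in ordered:
    print(code)
```

This prints the BB42 and BB43 entries before BB'42, and EE102, EE12, EE2 before EE'102, which agrees with the target order in the question for these entries.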
