Kotlin and String.contains() not working as I thought

I thought I knew how string.contains() worked in Kotlin and Java, but apparently I don't.
I've got a small piece of code that takes a list of file names and puts them in another list if they do not contain certain words.
for (i in 0..filliste.size-1) {
    if (!filliste[i].contains("utenfastbopel") || !filliste.contains("sperret") ||
        !filliste.contains("reservert")) {
        var a = filliste[i]
        tempFnrliste += filliste[i].split("_")[0]
    }
}
However, this does not exclude a file whose name contains the phrase "sperretstrengtreservert", even though both "reservert" and "sperret" are in the "not contains" checks.
How come? I thought .contains found every occurrence of a substring?
But if you look at the debug run, a file containing two of the phrases that are to be ignored, is indeed not ignored:
UPDATE:
To be clear, I'm looking for any of the file names to contain one OR more of the strings. So the logical OR/|| is correct.
However, I missed some indices. But adding them changed nothing. See the updated code below:
As far as I can see, the code now clearly says IF THE STRING DOES NOT CONTAIN THIS, THIS OR THIS SUBSTRING... But still, a string containing two of the substrings gets a match.
Strangely, if I only use ONE substring in the "not-contains" check, for instance "reservert", the code does indeed skip all strings containing it. But when I use the || operator for several substrings, things get messed up.

"sperretstrengtreservert" does not contain utenfastbopel.
You are using || aka OR. Your first condition is true.
If any of these is true, it will go to the body of the condition.
!filliste[i].contains("utenfastbopel") ||
!filliste.contains("sperret") ||
!filliste.contains("reservert")
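Also, as already pointed out, the follow-up conditions do not access the same object: filliste.contains("sperret") asks whether the list itself has an element equal to "sperret", not whether the file name filliste[i] contains that substring. Fixing that alone would not change the result, though, because the first condition is already true.
You need to change the condition from "at least one of these must be true" (||, OR) to "all of these must be true" (&&, AND).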
for (i in 0..filliste.size-1) {
    val f = filliste[i]
    if (!f.contains("utenfastbopel") && !f.contains("sperret") && !f.contains("reservert")) {
        tempFnrliste += f.split("_")[0]
    }
}
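The difference between the two conditions is De Morgan's law: "contains none of the words" is !a && !b && !c, which is the same as !(a || b || c), but not the same as !a || !b || !c. A minimal sketch in Python (used here only for illustration; the file names are made up, only the three excluded words come from the question) that compares both forms:

```python
# Hypothetical file names; only the three excluded words come from the question.
EXCLUDED = ("utenfastbopel", "sperret", "reservert")

def passes_or(name):
    # Original logic: true as soon as ANY one of the words is missing.
    return any(word not in name for word in EXCLUDED)

def passes_and(name):
    # Corrected logic: true only if ALL of the words are missing.
    return all(word not in name for word in EXCLUDED)

names = ["123_normal.txt", "456_sperretstrengtreservert.txt"]
kept_or = [n.split("_")[0] for n in names if passes_or(n)]
kept_and = [n.split("_")[0] for n in names if passes_and(n)]

print(kept_or)   # ['123', '456']  (the unwanted file slips through)
print(kept_and)  # ['123']
```

The OR version only rejects a name that contains all three words at once, which is why the single-word test appeared to work.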

Related

Dict key getting overwritten when created in a loop

I'm trying to create individual dictionary entries while looping through some input data. Part of the data is used for the key, while a different part is used as the value associated with that key. I'm running into a problem (due to Python's "everything is an object, and you reference that object" way of operating): every iteration through my loop alters the key set in previous iterations, thus overwriting the previously set value, instead of creating a new dict key and setting it with its own value.
popcount = {}
for oneline in datafile:
    if oneline[:3] == "POP":
        dat1, dat2, dat3, dat4, dat5, dat6 = oneline.split(":")
        datid = str.join(":", [dat2, dat3])
        if datid in popcount:
            popcount[datid] += int(dat4)
        else:
            popcount = { datid : int(dat4) }
This iterates over seven lines of data (datafile is a list containing that information) and should create four separate keys for datid, each with its own value. However, what ends up happening is that only the last value for datid exists in the dictionary when the code is run. That happens to be the one that has duplicates, and they get summed properly (so at least I know that part of the code works), but the other key entries are just ... gone.
The data is read from a file, is colon (:) separated, and is treated as a string even when it's numeric (thus the int() calls in the if datid in popcount branch).
What am I missing/doing wrong here? So far I haven't been able to find anything that helps me out on this one (though you folks have answered a lot of other Python questions I've run into, even if you didn't know it). I know why it's failing; or, I think I do -- it is because when I update the value of datid the key gets pointed to the new datid value object even though I don't want it to, correct? I just don't know how to fix or work around this behavior. To be honest, it's the one thing I dislike about working in Python (hopefully once I grok it, I'll like it better; until then...).
Simply change your last line
popcount = { datid : int(dat4) } # This does not do what you want
This creates a new dict and assigns it to popcount, throwing away your previous data.
What you want to do is add an entry to your dict instead:
popcount[datid] = int(dat4)
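Putting the fix into the whole loop, a runnable sketch might look like the following. The sample lines are invented (the question does not show the data); only the POP prefix, the six colon-separated fields, and the variable names come from the question. Using dict.get with a default of 0 is one way to fold the if/else into a single line.

```python
# Invented sample data: six colon-separated fields per line, as in the question.
datafile = [
    "POP:a:x:10:foo:bar",
    "POP:a:y:5:foo:bar",
    "HDR:ignored:line:0:foo:bar",
    "POP:a:x:7:foo:bar",
]

popcount = {}
for oneline in datafile:
    if oneline[:3] == "POP":
        dat1, dat2, dat3, dat4, dat5, dat6 = oneline.split(":")
        datid = ":".join([dat2, dat3])
        # Update the existing dict in place; .get() supplies 0 for a new key.
        popcount[datid] = popcount.get(datid, 0) + int(dat4)

print(popcount)  # {'a:x': 17, 'a:y': 5}
```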

Processing Split (server)

I am making a two-player game, and the information I get from the server is in the format "topic;arg1;arg2", so when I send positions it's "PlayerPos;x;y".
I then use the split method with the character ";".
But then... I even tried writing it on screen: "PlayerPos" was displayed correctly, but the if condition never matches it.
This is how I send the info on the server:
server.write("PlayerPos;"+player1.x+";"+player1.y);
And how I accept it on client:
String Get=client.readString();
String [] Getted = split(Get, ';');
fill(0);
text(Get,20,20);
text(Getted[0],20,40);
if(Getted[0]=="PlayerPos"){
    text("HERE",20,100);
    player1.x=parseInt(Getted[1]);
    player1.y=parseInt(Getted[2]);
}
It writes "PlayerPos;200;200" on screen, and even "PlayerPos" under it. But it never writes "HERE", so it never gets inside the if.
Where is my mistake?
Don't use == when comparing String values. Use the equals() function instead:
if(Getted[0].equals("PlayerPos")){
From the Processing reference:
To compare the contents of two Strings, use the equals() method, as in if (a.equals(b)), instead of if (a == b). A String is an Object, so comparing them with the == operator only compares whether both Strings are stored in the same memory location. Using the equals() method will ensure that the actual contents are compared. (The troubleshooting reference has a longer explanation.)
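For comparison, here is the same split-and-compare logic as a minimal Python sketch (not Processing). The helper name parse_message and the strip() call are assumptions added for illustration; only the "topic;arg1;arg2" message format comes from the question. Note that Python's == compares string contents, so it plays the role that equals() plays in Processing/Java.

```python
def parse_message(raw):
    # Split "topic;arg1;arg2" into the topic and a list of integer arguments.
    # strip() is an assumption: it guards against stray whitespace or newlines.
    topic, *args = raw.strip().split(";")
    return topic, [int(a) for a in args]

topic, args = parse_message("PlayerPos;200;200")

# Compare by content, not by object identity.
if topic == "PlayerPos":
    x, y = args

print(topic, x, y)  # PlayerPos 200 200
```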

String Comparison with Elasticsearch Groovy Dynamic Script

I have an elasticsearch index that contains various member documents. Each member document contains a membership object, along with various fields associated with / describing individual membership. For example:
{membership:{'join_date':2015-01-01,'status':'A'}}
Membership status can be 'A' (active) or 'I' (inactive); both are Unicode string values. I'm interested in providing a slight boost to the score of documents that contain an active membership status.
In my groovy script, along with other custom boosters on various numeric fields, I have added the following:
String status = doc['membership.status'].value;
float status_boost = 0.0;
if (status=='A') {status_boost = 2.0} else {status_boost=0.0};
return _score + status_boost
For some reason associated with how strings operate via groovy, the check (status=='A') does not work. I've attempted (status.toString()=='A'), (status.toString()=="A"), (status.equals('A')), plus a number of other variations.
How should I go about troubleshooting this (in a productive, efficient manner)? I don't have a stand-alone installation of Groovy, but when I pull the response data in Python the status is very clearly either a Unicode 'A' or 'I' with no additional spacing or characters.
@VineetMohan is most likely right about the value being 'a' rather than 'A'.
You can check how the values are indexed by spitting them back out as script fields:
$ curl -XGET localhost:9200/test/_search -d '
{
  "script_fields": {
    "status": {
      "script": "doc[\"membership.status\"].values"
    }
  }
}
'
That should give you an indication of what you're actually working with. More than likely, based on the name and your usage, you will want to reindex (recreate) your data so that membership.status is mapped as a not_analyzed string. Once that is done, you won't need to worry about lowercasing anything.
In the meantime, you can probably get by with:
return _score + (doc['membership.status'].value == 'a' ? 2 : 0)
As a big aside, you should not be using dynamic scripting. Use stored scripts in production to avoid security issues.

Groovy - How can I compare part of a list to a string

I have a list which contains two types of text. One type is used for authorization while other type is used for all other purposes.
The type used for authorization always uses the same text + some code after it.
I would like to compare content of these two types of text and separate them based on content.
My idea is to look for pattern in authorization type and if it matches the pattern then this would be marked as authorization, otherwise it would be marked as "other".
I researched about comparison of patterns in Groovy, but all variations I tried did not work for me. Here is the part which should do the comparison, I am obviously doing something wrong but I don't know what.
jdbcOperations.queryForList(sql).collect { row ->
    if(assert (row['MSG'] ==~ /token/)){
        mark as authorization
    }
    else{
        mark as other
    }
}
Sorry for the vague code, I can not share more than this.
I think you are just missing a match for the rest of the text: ==~ requires the whole string to match the pattern, and your pattern only covers the first part.
assert ("abc" ==~ /abc/) == true
assert ("abcdefg" ==~ /abc/) == false
assert ("abcdefg" ==~ /abc(.*)/) == true // <--- This can also be made more complicated
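The same distinction between a full match and a partial match can be sketched in Python, where re.fullmatch behaves like Groovy's ==~ and re.search accepts a match anywhere in the string. The function name classify and the sample messages are hypothetical; "token" stands in for the fixed authorization text, as in the question.

```python
import re

def classify(msg):
    # fullmatch requires the entire string to match, like Groovy's ==~.
    # The trailing .* lets any code follow the fixed "token" prefix.
    return "authorization" if re.fullmatch(r"token.*", msg) else "other"

print(re.fullmatch(r"token", "tokenABC123") is None)   # True: full match fails
print(re.search(r"token", "tokenABC123") is not None)  # True: partial match succeeds
print(classify("tokenABC123"))                         # authorization
print(classify("some other message"))                  # other
```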

Searching for Number of Term Appearances in Mathematica

I'm trying to search across a large array of textual files in Mathematica 8 (12k+). So far, I've been able to plot the sheer numbers of times that a word appears (i.e. the word "love" appears 5,000 times across those 12k files). However, I'm running into difficulty determining the number of files in which "love" appears once - which might only be in 1,000 files, with it repeating several times in others.
I'm finding the documentation WRT FindList, streams, RecordSeparators, etc. a bit murky. Is there a way to set it up so it finds an incidence of a term once in a file and then moves onto the next?
Example of filelist:
{"89001.txt", "89002.txt", "89003.txt", "89004.txt", "89005.txt", "89006.txt", "89007.txt", "89008.txt", "89009.txt", "89010.txt", "89011.txt", "89012.txt", "89013.txt", "89014.txt", "89015.txt", "89016.txt", "89017.txt", "89018.txt", "89019.txt", "89020.txt", "89021.txt", "89022.txt", "89023.txt", "89024.txt"}
The following returns all of the lines with love across every file. Is there a way to return only the first incidence of love in each file before moving onto the next one?
FindList[filelist, "love"]
Thanks so much. This is my first post and I'm largely learning Mathematica through peer/supervisory help, online tutorials, and the documentation.
In addition to Daniel's answer, you also seem to be asking for a list of files where the word only occurs once. To do that, I'd continue to run FindList across all the files
res =FindList[filelist, "love"]
Then, reduce the results to single lines only, via
lines = Select[ res, Length[#]==1& ]
But, this doesn't eliminate the cases where there is more than one occurrence in a single line. To do that, you could use StringCount and only accept instances where it is 1, as follows
Select[ lines, StringCount[ #, RegularExpression[ "\\blove\\b" ] ] == 1& ]
The RegularExpression specifies that "love" must be a distinct word using the word boundary marker (\\b), so that words like "lovely" won't be included.
Edit: It appears that FindList when passed a list of files returns a flattened list, so you can't determine which item goes with which file. For instance, if you have 3 files, and they contain the word "love", 0, 1, and 2 times, respectively, you'd get a list that looked like
{, love, love, love }
which is clearly not useful. To overcome this, you'll have to process each file individually, and that is best done via Map (/@), as follows
res = FindList[#, "love"]& /@ filelist
and the rest of the above code works as expected.
But, if you want to associate the results with a file name, you have to change it a little.
res = {#, FindList[#, "love"]}& /@ filelist
lines = Select[res,
    Length[ #[[2]] ] == 1 && (* <-- Note the use of [[2]] *)
    StringCount[ #[[2]], RegularExpression[ "\\blove\\b" ] ] == 1&
]
which returns a list of the form
{ {filename, { "string with love in it" } },
{filename, { "string with love in it" } }, ...}
To extract the file names, you simply type lines[[All, 1]].
Note, in order to Select on the properties you wanted, I used Part ([[ ]]) to specify the second element in each datum, and the same goes for extracting the file names.
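The per-file idea (count occurrences file by file, then filter on the count) can also be sketched outside Mathematica. The Python snippet below is only an illustration of that logic: the file contents are invented and held in memory, whereas real code would read each file from disk.

```python
import re

# Hypothetical file contents keyed by file name; real code would read the files.
texts = {
    "89001.txt": "no match here",
    "89002.txt": "all you need is love",
    "89003.txt": "love, love me do; lovely day",
}

# \b marks word boundaries, so "lovely" is not counted.
word = re.compile(r"\blove\b")
counts = {name: len(word.findall(text)) for name, text in texts.items()}

files_with_word = [name for name, c in counts.items() if c >= 1]
files_with_exactly_one = [name for name, c in counts.items() if c == 1]

print(counts)                  # {'89001.txt': 0, '89002.txt': 1, '89003.txt': 2}
print(files_with_word)         # ['89002.txt', '89003.txt']
print(files_with_exactly_one)  # ['89002.txt']
```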
Help > Documentation Center > FindList item 4:
"FindList[files,text,n]
includes only the first n lines found."
So you could set n to 1.
Daniel Lichtblau

Resources