How to split a String by bodySize in Groovy Script

Before anything else, I hope that this world situation is not affecting you too much and that you can be as long as possible at home and in good health.
You see, I'm very, very new to Groovy Script and I have a question: How can I separate a String based on its body size?
Assume the String is 3,000 characters long. I get the body like this:
def body = message.getBody(java.lang.String) as String
and its size like this:
def bodySize = body.getBytes().length
I should be able to separate it into 500-character segments and save each segment in a different variable (which I will later set in a property).
I read some examples but I can't adjust them to what I need.
Thank you very much in advance.

Assuming it's ok to have a List of segment strings, you can simply do:
def segments = body.toList().collate(500)*.join()
This splits the body into a list of characters, collates these into 500 length groups, and then joins each group back to a String.
As a small example
def body = 'abcdefghijklmnopqrstuvwxyz'
def segments = body.toList().collate(5)*.join()
Then segments equals
['abcde', 'fghij', 'klmno', 'pqrst', 'uvwxy', 'z']

Related

Is there a way to only list a certain format of text from a list?

I am quite new to python.
And i want to only get a certain format from a bigger list, example:
Whats in the list:
/ABC/EF213
/ABC/EF
/ABC/12AC4
/ABC/212
However, the only ones I want listed are those with the format /###/#####; the rest should be discarded.
You could use a generator expression or a for loop to check each element of the list to see if it matches a pattern. One way of doing this would be to check if the item matches a regex pattern.
As an example:
import re
original_list = ["Item I don't want", "/ABC/EF213", "/ABC/EF", "/ABC/12AC4", "/ABC/212", "123/456", "another useless item", "/ABC/EF"]
filtered_list = [item for item in original_list if re.fullmatch(r"/\w+/\w+", item) is not None]
print(filtered_list)
outputs
['/ABC/EF213', '/ABC/EF', '/ABC/12AC4', '/ABC/212', '/ABC/EF']
If you need help making regex patterns, there are many great websites, such as regexr, which can help you.
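The pattern above accepts any two slash-separated words, which is why /ABC/EF and /ABC/212 still appear in the output. A stricter sketch follows, assuming each # in /###/##### stands for exactly one letter or digit (the question does not spell this out). The names PATTERN and filter_by_format are made up for illustration.

```python
import re

# Assumption: every '#' in /###/##### is one ASCII letter or digit.
PATTERN = re.compile(r"/[A-Za-z0-9]{3}/[A-Za-z0-9]{5}")


def filter_by_format(items):
    """Keep only the items that match /###/##### in full."""
    return [item for item in items if PATTERN.fullmatch(item)]


sample = ["/ABC/EF213", "/ABC/EF", "/ABC/12AC4", "/ABC/212"]
print(filter_by_format(sample))  # ['/ABC/EF213', '/ABC/12AC4']
```

If the # characters are meant to be digits only, or letters only, the character classes would need to change accordingly.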
Every string can be indexed like a list without any conversion. If the only format you want to check is /###/#####, you can simply write if conditions like these:
for text in your_list:
    # add further checks on the other positions as needed
    if len(text) == 10 and text[0] == "/" and text[4] == "/":
        print(text)
Of course, this would require a lot of conditions and would be tedious to write, so I would recommend a simpler scan. One way is to split the text, which would look something like this:
for text in your_list:
    checkstring = text.split("/")
Now you have your text split into parts, and you can simply check the lengths of these parts with len().
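A minimal sketch of the split-and-check idea, assuming the wanted shape is a leading slash followed by a 3-character part and a 5-character part. The function name has_expected_shape and the lengths parameter are hypothetical.

```python
def has_expected_shape(text, lengths=(3, 5)):
    """Return True if text looks like /###/##### judged by part lengths only."""
    parts = text.split("/")
    # "/ABC/EF213".split("/") gives ['', 'ABC', 'EF213']: the leading slash
    # produces an empty first element.
    if len(parts) != len(lengths) + 1 or parts[0] != "":
        return False
    return all(len(part) == n for part, n in zip(parts[1:], lengths))


your_list = ["/ABC/EF213", "/ABC/EF", "/ABC/12AC4", "/ABC/212"]
for text in your_list:
    if has_expected_shape(text):
        print(text)
```

This only checks lengths, not which characters appear in each part, so it is looser than a regex with explicit character classes.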

Defining a function to find the unique palindromes in a given string

I'm kinda new to Python. I'm trying to define a function that, when called, outputs only the unique words in a string that are palindromes.
I used casefold() to make it case-insensitive and set() to print only uniques.
Here's my code:
def uniquePalindromes(string):
    x=string.split()
    for i in x:
        k=[]
        rev= ''.join(reversed(i))
        if i.casefold() == rev.casefold():
            k.append(i.casefold())
            print(set(k))
        else:
            return
I've tried to run this line
print( uniquePalindromes('Hanah asked Sarah but Sarah refused') )
The expected output should be ['hanah','sarah'] but it's returning only {'hanah'} as the output. Please help.
Your logic is sound, and your function is mostly doing what you want it to. Part of the issue is how you're returning things - all you're doing is printing the set of each individual word. For example, when I take your existing code and do this:
>>> print(uniquePalindromes('Hannah Hannah Alomomola Girafarig Yes Nah, Chansey Goldeen Need log'))
{'hannah'}
{'alomomola'}
{'girafarig'}
None
hannah, alomomola, and girafarig are the palindromes I would expect to see, but they're not given in the format I expect. For one, they're being printed, instead of returned, and for two, that's happening one-by-one.
And the function is returning None, and you're trying to print that. This is not what we want.
Here's a fixed version of your function:
def uniquePalindromes(string):
    x = string.split()
    k = []  # note how we put it *outside* the loop, so it persists across each iteration without being reset
    for i in x:
        rev = ''.join(reversed(i))
        if i.casefold() == rev.casefold():
            k.append(i.casefold())
            # the print statement isn't what we want
        # no need for an else statement - the loop will continue anyway
    # now, once all elements have been visited, return the set of unique elements from k
    return set(k)
Now it returns roughly what you'd expect: a single set with multiple words, instead of printing multiple sets with one word each. Then we can print that set.
>>> print(uniquePalindromes("Hannah asked Sarah but Sarah refused"))
{'hannah'}
>>> print(uniquePalindromes("Hannah and her friend Anna caught a Girafarig and named it hannaH"))
{'anna', 'hannah', 'girafarig', 'a'}
They're not going to like me on here for only giving tips, but try this: split the characters of the word (ignoring whitespace) into two halves. If the halves can't be the same length, the word has an odd number of letters and there is a single middle letter. Either way, you can use the middle as a "jump off" point and walk outwards from it, one index going down and the other going up, comparing the two letters at each step. Hope this helps.
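The middle-outwards comparison described above could be sketched like this. It is only one way to read that tip; the function name is_palindrome is made up, and casefold() is reused from the question to keep the check case-insensitive.

```python
def is_palindrome(word):
    """Compare letters pairwise, starting at the middle and walking outwards."""
    letters = word.casefold()
    n = len(letters)
    left = n // 2 - 1      # last index of the left half
    right = (n + 1) // 2   # first index of the right half (skips the middle letter when n is odd)
    while left >= 0:
        if letters[left] != letters[right]:
            return False
        left -= 1
        right += 1
    return True


words = "Hannah asked Sarah but Sarah refused".split()
print({w.casefold() for w in words if is_palindrome(w)})  # {'hannah'}
```

Walking inwards from both ends gives the same result; the direction only changes which mismatch is found first.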

Concatenating FOR loop output

I am very new to Python (first week of active use). I have some bash scripting experience but have decided to learn Python.
I have a list of several strings which I am using to build a URL in a for loop. The output of each URL is JSON, and I would like to concatenate the complete output into one file.
I will put random URL for privacy reasons.
The code looks like this:
==================
numbers = ['24246', '83367', '37643', '24245', '24241', '77968', '63157', '76004', '71665']
for id in numbers:
    restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
    result = restAPI.json
==================
The problem is that if I do print(result) I only get the output of the last iteration, i.e. www.google.com/test/71665&test2
Creating a list by adding text = [] worked (content was concatenated) but I would like to keep the original format.
text = []
for id in numbers:
    restAPI = s.get(urljoin(baseurl, '/test/' + id + '&test2'))
Does anyone have an idea how to do this?
When the for loop ends, a variable assigned inside it only keeps its last value, because every pass through the loop overwrites restAPI.
If you want to keep each result, you can append to a list defined outside the for loop on every iteration, i.e.
url_list = []
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    url_list.append(restAPI.json)
Or if you just wanted to print...
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    print(restAPI.json)
If you added them to a list, you could perform separate functions with the new list of URLs.
If you think there might be duplicates, feel free to use a set() instead (which automatically drops duplicates as new values are added). You can use set_name.add(restAPI.json)
Better still, you could use a dict and assign the id as the key and the JSON object as the value. So you could:
dict_obj = dict()
for id in numbers:
    restAPI = s.get(urljoin(baseurl, ...
    dict_obj[id] = restAPI.json
That way you can query the dictionary later in the script.
Note that if you're querying many URLs, storing all the JSON objects in memory might be demanding, depending on your hardware.
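A sketch of the dict approach that ends with one combined file, under some assumptions: the question never says what s is, so the network call is replaced here by a stand-in function (fake_fetch) to keep the example runnable offline. If s is a requests.Session, note that json is a method on the response and has to be called, i.e. restAPI.json(), to get the parsed data. The names collect_results, save_results and fake_fetch are hypothetical.

```python
import json


def collect_results(ids, fetch_json):
    """Return {id: parsed JSON}, calling fetch_json once per id."""
    return {item_id: fetch_json(item_id) for item_id in ids}


def save_results(results, path):
    """Write every collected response into a single JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(results, fh, indent=2)


def fake_fetch(item_id):
    # Stand-in for the real request. With requests this might look like
    # s.get(url).json(), but that is an assumption about the asker's setup.
    return {"id": item_id, "status": "ok"}


numbers = ["24246", "83367", "37643"]
results = collect_results(numbers, fake_fetch)
print(json.dumps(results, indent=2))
```

Calling save_results(results, "output.json") would then write all responses to one file, keyed by id, so each response keeps its original structure.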

Proper Syntax for List Comprehension Involving an Integer and a Float?

I have a List of Lists that looks like this (Python3):
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
I'm trying to use a list comprehension to convert the first item of each inner list to an int and the second item to a float so that I end up with this:
newLOL = [[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
I'm learning list comprehensions, can somebody please help me with this syntax?
Thank you!
[edit - to explain why I asked this question]
This question is a means to an end - the syntax requested is needed for testing. I'm collecting sensor data on a ZigBee network, and I'm using an Arduino to format the sensor messages in JSON. These messages are published to an MQTT broker (Mosquitto) running on a Raspberry Pi. A Redis server (also running on the Pi) serves as an in-memory message store. I'm writing a service (python-MQTT client) to parse the JSON and send a LoL (a sample of the data you see in my question) to Redis. Finally, I have a dashboard running on Apache on the Pi. The dashboard utilizes Highcharts to plot the sensor data dynamically (via a web socket connection between the MQTT broker and the browser). Upon loading the page, I pull historical chart data from my Redis LoL to "very quickly" populate the charts on my dashboard (before any realtime data is added dynamically). I realize I can probably format the sensor data the way I want in the Redis store, but that is a problem I haven't worked out yet. Right now, I'm trying to get my historical data to plot correctly in Highcharts. With the data properly formatted, I can get this piece working.
Well, you could use ast.literal_eval:
from ast import literal_eval
myLOL = ["['1466279297', '703.0']", "['1466279287', '702.0']", "['1466279278', '702.0']", "['1466279268', '706.0']", "['1466279258', '713.0']"]
items = [[int(literal_eval(i)[0]), float(literal_eval(i)[1])] for i in myLOL]
Try:
import json
newLOL = [[int(a[0]), float(a[1])] for a in (json.loads(s.replace("'", '"')) for s in myLOL)]
Here I'm considering each element of the list as a JSON, but since it's using ' instead of " for the strings, I have to replace it first (it only works because you said there will be only numbers).
This may work? I wish I was more clever.
newLOL = []
for listObj in myLOL:
    listObj = listObj.replace('[', '').replace(']', '').replace("'", '').split(',')
    newListObj = [int(listObj[0]), float(listObj[1])]
    newLOL.append(newListObj)
This iterates through your current list and peels each string apart into a list by replacing the unwanted characters and splitting on the comma. Then we take the modified list object and create a new list object whose values are the respective ints and floats, and append the prepared newListObj to the newLOL list. This assumes you want actual lists within your list: your input list actually contains strings which merely look like lists.
This is a very strange format and the best solution is likely to change the code which generates that.
That being said, you can use ast.literal_eval to safely evaluate the elements of the list as Python tokens:
>>> import ast
>>> lit = ast.literal_eval
>>> [[lit(str_val) for str_val in lit(str_list)] for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
We need to do it twice - once to turn the string into a list containing two strings, and then once per resulting string to convert it into a number.
Note that this will succeed even if the strings contain other valid tokens. If you want to validate the format too, you'd want to do something like:
>>> def process_str_list(str_list):
...     l = ast.literal_eval(str_list)
...     if not isinstance(l, list):
...         raise TypeError("Expected list")
...     str_int, str_float = l
...     return [int(str_int), float(str_float)]
...
>>> [process_str_list(str_list) for str_list in myLOL]
[[1466279297, 703.0], [1466279287, 702.0], [1466279278, 702.0], [1466279268, 706.0], [1466279258, 713.0]]
Your input consists of a list of strings, where each string is the string representation of a list. The first task is to convert the strings back into lists:
import ast
lol2 = list(map(ast.literal_eval, myLOL)) # [['1466279297', '703.0'], ...]
Now, you can simply get int and float values from lol2:
newlol = [[int(a[0]), float(a[1])] for a in lol2]

how use struct.pack for list of strings

I want to write a list of strings to a binary file. Suppose I have a list of strings, mylist. Assume each item of the list has a '\t' at the end, except the last one, which has a '\n' at the end (to help me recover the data). Example: ['test\t', 'test1\t', 'test2\t', 'testl\n']
For a numpy ndarray, I found the following script that worked (got it from here: numpy to r converter):
binfile = open('myfile.bin','wb')
for i in range(mynpdata.shape[1]):
    binfile.write(struct.pack('%id' % mynpdata.shape[0], *mynpdata[:,i]))
binfile.close()
Does binfile.write automatically parse all the data if the variable has a * in front of it (as in the *mynpdata[:,i] example above)? Would this work with a list of integers in the same way (e.g. *myIntList)?
How can I do the same with a list of strings?
I tried it on a single string using (which I found somewhere on the net):
oneString = 'test'
oneStringByte = bytes(oneString,'utf-8')
struct.pack('I%ds' % (len(oneString),), len(oneString), oneString)
but I couldn't understand why the % within 'I%ds' above is replaced by (len(oneString),) instead of len(oneString) as in the ndarray example, and also why both len(oneString) and oneString are passed.
Can someone help me with writing a list of strings (if necessary, assuming it is written to the same binary file where I wrote out the ndarray)?
There's no need for struct. Simply join the strings and encode them using either a specified or an assumed text encoding in order to turn them into bytes.
''.join(L).encode('utf-8')
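The join-and-encode approach relies on the '\t' and '\n' terminators to recover the items later. If struct is still wanted, for example because the strings share a file with other binary data, a length-prefixed layout is one option. The sketch below is an assumption about what the 'I%ds' snippet in the question was aiming for, not a required format; pack_strings and unpack_strings are made-up names. It also shows the two points the question asks about: '%ds' % n and '%ds' % (n,) build the same format string, and both the length and the bytes are passed because the format has two fields (I and Ns).

```python
import struct


def pack_strings(strings, encoding="utf-8"):
    """Pack each string as a 4-byte little-endian length followed by its bytes."""
    chunks = []
    for text in strings:
        data = text.encode(encoding)  # struct's 's' format needs bytes, not str
        # "<I" consumes len(data); "%ds" % len(data) becomes e.g. "5s" and consumes data
        chunks.append(struct.pack("<I%ds" % len(data), len(data), data))
    return b"".join(chunks)


def unpack_strings(blob, encoding="utf-8"):
    """Reverse pack_strings: read a length, then that many bytes, until done."""
    strings, offset = [], 0
    while offset < len(blob):
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        strings.append(blob[offset:offset + length].decode(encoding))
        offset += length
    return strings


mylist = ["test\t", "test1\t", "test2\t", "testl\n"]
blob = pack_strings(mylist)
print(unpack_strings(blob) == mylist)  # True
```

The resulting bytes can be written with binfile.write(blob). The * in the ndarray example is ordinary Python argument unpacking: it spreads the sequence into separate arguments for struct.pack, and would do the same for a list of integers.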

Resources