I'm using Groovy scripting in SoapUI and I'm facing a comparison problem between two lists of maps:
on the one hand I have the expected value (type String), on the other I have the recovered value (type ArrayList).
expected value is like this :
expected [[hsb:[100, 100, 100], type:hsb, wt:null], [hsb:null, type:wt, wt:60]]
recovered one is like this :
recovered [[type:hsb, wt:null, hsb:[100, 100, 100]], [type:wt, wt:60, hsb:null]]
so basically it should match, however I can't figure out how to do it programmatically.
If I convert "expected" into an array, using expected_collection = expected_value.tokenize('[]'), I lose my elements and parsing the array gives the following :
Thu Nov 14 16:42:31 CET 2019: INFO: hsb:
Thu Nov 14 16:42:31 CET 2019: INFO: 100, 100, 100
Thu Nov 14 16:42:31 CET 2019: INFO: , type:hsb, wt:null
Thu Nov 14 16:42:31 CET 2019: INFO: ,
Thu Nov 14 16:42:31 CET 2019: INFO: hsb:null, type:wt, wt:60
Is it possible to tokenize on a limited level, i.e. only on the first level of []?
I am fairly new to web scraping and decided to dive straight into the deep end. I want to select any product and "all months" in the dropdown above the table at https://www.cmegroup.com/tools-information/quikstrike/options-calendar.html and extract the table data into a CSV file. The problem arises because the website is dynamic (not all of the HTML is displayed when viewing the page source in the browser) and, from what I managed to understand, generates the table in CSS. I tried using Selenium to load the webpage, but I am getting an error.
[12508:8412:0216/220631.827:ERROR:ssl_client_socket_impl.cc(985)] handshake failed; returned -1, SSL error code 1, net_error -101
I am assuming this has to do with the webdriver initialisation and I need to give it some settings, just not sure which ones.
Here is the code:
from selenium import webdriver
from bs4 import BeautifulSoup
# Set up the Selenium driver
driver = webdriver.Chrome()
# Open the webpage
url = 'https://www.cmegroup.com/tools-information/quikstrike/options-calendar.html'
driver.get(url)
# Render the page and extract the HTML code
html = driver.page_source
# Parse the HTML using BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
# Extract the data you want from the soup object
tables = soup.findAll("table")
print(tables)
# Close the Selenium driver
driver.quit()
I have also tried the short route of reproducing the requests made by the browser and capturing the response with the HTML code (yes, the request only returns HTML, not JSON), but this backfired because I couldn't reproduce the payload. How do I get the data from the calendar?
The <table> element is within an <iframe> so to access/print the <table> contents you have to:
Induce WebDriverWait for the desired frame to be available and switch to it.
Induce WebDriverWait for the visibility_of_element_located.
You can use either of the following locator strategies:
driver.get('https://www.cmegroup.com/tools-information/quikstrike/options-calendar.html')
WebDriverWait(driver, 10).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button#onetrust-accept-btn-handler"))).click()
WebDriverWait(driver, 10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@class='cmeIframe']")))
print(WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='ui-widget-info clearfix']/table//tbody[not(@class)][.//tr[@class='group compact']]"))).text)
Console output:
FEBRUARY 2023
FIRST AVAIL DATE OPTION EXPIRATION DTE PRODUCT OPTION FUTURE FUTURE EXPIRATION DTE
20 Jan 2023 - Fri 17 Feb 2023 - Fri 1 Natural Gas Weekly Financial Option Week 3 LN3G3 NGH3 24 Feb 2023 - Fri 8
24 Nov 2010 - Wed 23 Feb 2023 - Thu 7 Natural Gas Option (European) LNEH3 NGH3 24 Feb 2023 - Fri 8
27 Jan 2023 - Fri 24 Feb 2023 - Fri 8 Natural Gas Weekly Financial Option Week 4 LN4G3 NGJ3 29 Mar 2023 - Wed 41
MARCH 2023
FIRST AVAIL DATE OPTION EXPIRATION DTE PRODUCT OPTION FUTURE FUTURE EXPIRATION DTE
03 Feb 2023 - Fri 03 Mar 2023 - Fri 15 Natural Gas Weekly Financial Option Week 1 LN1H3 NGJ3 29 Mar 2023 - Wed 41
10 Feb 2023 - Fri 10 Mar 2023 - Fri 22 Natural Gas Weekly Financial Option Week 2 LN2H3 NGJ3 29 Mar 2023 - Wed 41
21 Feb 2023 - Tue 17 Mar 2023 - Fri 29 Natural Gas Weekly Financial Option Week 3 LN3H3 NGJ3 29 Mar 2023 - Wed 41
27 Feb 2023 - Mon 24 Mar 2023 - Fri 36 Natural Gas Weekly Financial Option Week 4 LN4H3 NGJ3 29 Mar 2023 - Wed 41
24 Nov 2010 - Wed 28 Mar 2023 - Tue 40 Natural Gas Option (European) LNEJ3 NGJ3 29 Mar 2023 - Wed 41
06 Mar 2023 - Mon 31 Mar 2023 - Fri 43 Natural Gas Weekly Financial Option Week 5 LN5H3 NGK3 26 Apr 2023 - Wed 69
APRIL 2023
FIRST AVAIL DATE OPTION EXPIRATION DTE PRODUCT OPTION FUTURE FUTURE EXPIRATION DTE
13 Mar 2023 - Mon 06 Apr 2023 - Thu 49 Natural Gas Weekly Financial Option Week 1 LN1J3 NGK3 26 Apr 2023 - Wed 69
20 Mar 2023 - Mon 14 Apr 2023 - Fri 57 Natural Gas Weekly Financial Option Week 2 LN2J3 NGK3 26 Apr 2023 - Wed 69
27 Mar 2023 - Mon 21 Apr 2023 - Fri 64 Natural Gas Weekly Financial Option Week 3 LN3J3 NGK3 26 Apr 2023 - Wed 69
24 Nov 2010 - Wed 25 Apr 2023 - Tue 68 Natural Gas Option (European) LNEK3 NGK3 26 Apr 2023 - Wed 69
03 Apr 2023 - Mon 28 Apr 2023 - Fri 71 Natural Gas Weekly Financial Option Week 4 LN4J3 NGM3 26 May 2023 - Fri 99
Note : You have to add the following imports :
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Reference
You can find a couple of relevant discussions in:
Switch to an iframe through Selenium and python
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element while trying to click Next button with selenium
selenium in python : NoSuchElementException: Message: no such element: Unable to locate element
I have a column of timestamps converted to human-readable form.
I have tried sorting it both by epoch time and after converting. It gives me
Fri, 08 Feb 2019 17:24:16 IST
Mon, 11 Feb 2019 02:19:40 IST
Sat, 09 Feb 2019 00:22:43 IST
which is not sorted.
I have used sort_values()
each_tracker_df = each_tracker_df.sort_values(["timestamp"],ascending=True)
Why isn't it working?
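Since all the times are in IST, strip the string " IST" from each value, parse what is left into datetime objects, and sort those. Sorting the raw strings compares them alphabetically, which is why "Fri" comes before "Mon" and "Mon" before "Sat".
import datetime
# Sample values with the ' IST' suffix already removed
times = ['Fri, 10 Feb 2010 17:24:16', 'Fri, 11 Feb 2010 17:24:16', 'Fri, 11 Feb 2019 17:24:16']
change_format = []
for time in times:
    # Parse each string into a datetime so the sort is chronological
    change_format.append(datetime.datetime.strptime(time, '%a, %d %b %Y %H:%M:%S'))
change_format.sort()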
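As a minimal sketch of the same idea applied to the three values from the question, the strings can be kept as they are and only the sort key parsed. The helper name parse_ist is hypothetical, and the code assumes every value ends in the literal suffix " IST" and that the locale uses English day and month names.

```python
from datetime import datetime


def parse_ist(stamp):
    """Parse a string such as 'Fri, 08 Feb 2019 17:24:16 IST' into a naive datetime."""
    # Assumption: every value ends in ' IST', so the suffix can simply be dropped.
    return datetime.strptime(stamp.replace(' IST', ''), '%a, %d %b %Y %H:%M:%S')


stamps = [
    'Fri, 08 Feb 2019 17:24:16 IST',
    'Mon, 11 Feb 2019 02:19:40 IST',
    'Sat, 09 Feb 2019 00:22:43 IST',
]

# Sort the original strings chronologically by using the parsed value as the key.
ordered = sorted(stamps, key=parse_ist)
print(ordered)
```

In a DataFrame the same principle applies: sort on a column of parsed datetimes (or on the epoch values as numbers), not on the formatted strings.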
Below is the content of the file. I want to find the difference in the first field between each line and the previous one.
0.607401 # Tue Mar 27 04:30:01 IST 2018
0.607401 # Tue Mar 27 04:35:02 IST 2018
0.606325 # Tue Mar 27 04:40:02 IST 2018
0.606223 # Tue Mar 27 04:45:01 IST 2018
0.606167 # Tue Mar 27 04:50:02 IST 2018
0.605716 # Tue Mar 27 04:55:01 IST 2018
0.605716 # Tue Mar 27 05:00:01 IST 2018
0.607064 # Tue Mar 27 05:05:01 IST 2018
output:-
0
-0.001076
-0.000102
.019944
..
..
.001348
CODE:
awk '{s=$0;getline;print s-$0;next}' a.txt
However this does not work as expected...
Could you help me please?
You can use the following awk code:
$ awk 'NR==1{save=$1;next}NR>1{printf "%.6f\n",($1-save);save=$1}' file
0.000000
-0.001076
-0.000102
-0.000056
-0.000451
0.000000
0.001348
and format the output as you want by modifying the printf.
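Your current approach skips lines: getline consumes the next line inside the action, so each line is used in only one subtraction (line 1 with line 2, line 3 with line 4, and so on) and the differences in between are never computed.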
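For comparison, here is a rough Python sketch of the same "remember the previous value" logic, run on the sample data from the question. The function name first_field_diffs is hypothetical; the sketch assumes the first whitespace-separated field of every non-empty line is a number.

```python
def first_field_diffs(lines):
    """Return the difference between each first field and the previous one, as strings."""
    # Take the first whitespace-separated field of every non-empty line.
    values = [float(line.split()[0]) for line in lines if line.strip()]
    # Pair each value with its predecessor, so every line is used (no skipping).
    return ["%.6f" % (cur - prev) for prev, cur in zip(values, values[1:])]


sample = [
    "0.607401 # Tue Mar 27 04:30:01 IST 2018",
    "0.607401 # Tue Mar 27 04:35:02 IST 2018",
    "0.606325 # Tue Mar 27 04:40:02 IST 2018",
    "0.606223 # Tue Mar 27 04:45:01 IST 2018",
    "0.606167 # Tue Mar 27 04:50:02 IST 2018",
    "0.605716 # Tue Mar 27 04:55:01 IST 2018",
    "0.605716 # Tue Mar 27 05:00:01 IST 2018",
    "0.607064 # Tue Mar 27 05:05:01 IST 2018",
]

diffs = first_field_diffs(sample)
print("\n".join(diffs))
```

Eight input lines give seven differences, matching the awk output above.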
I need to extract all sequential lines from a text file based on the sequence in the 4th column. This sequence is the current time, and there is only one entry for each second (so only one line). Sometimes in the file the sequence will break, because something has slowed down the script that creates it and it has skipped a second or two. As in the below example:
Thu Jun 8 14:17:31 CEST 2017 sync:1
Thu Jun 8 14:17:32 CEST 2017 sync:1
Thu Jun 8 14:17:33 CEST 2017 sync:1
Thu Jun 8 14:17:37 CEST 2017 sync:1 <--
Thu Jun 8 14:17:38 CEST 2017 sync:1
Thu Jun 8 14:17:39 CEST 2017 sync:1
Thu Jun 8 14:17:40 CEST 2017 sync:1
I need bash to ignore this line and continue without printing it, but still print everything before and after it. How should I go about that?
If you only care about the seconds field, the following is enough. Be aware that it treats a jump such as 14:17:39 -> 15:22:40 as sequential even though it clearly is not; if your data is sufficiently simple this may be fine:
awk 'NR==1 || $6 == (p + 1)%60 ; {p=$6}' FS='[: ]+' input
To check the hour and minute, you could simply convert to seconds from midnight or add logic to compare the hours and minutes. Something like:
awk '{s=$4 * 3600 + $5 * 60 + $6} NR==1 || s == (p + 1)%86400 ; {p=s}' FS='[: ]+' input
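The seconds-from-midnight check can also be sketched in Python, which may be easier to extend. The function names below are hypothetical, and the code assumes the time is always the fourth whitespace-separated field in HH:MM:SS form, as in the sample.

```python
def seconds_since_midnight(line):
    """Convert the HH:MM:SS value in the 4th field to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in line.split()[3].split(':'))
    return hours * 3600 + minutes * 60 + seconds


def sequential_lines(lines):
    """Keep the first line and every line exactly one second after its predecessor."""
    kept = []
    prev = None
    for line in lines:
        cur = seconds_since_midnight(line)
        # The modulo lets 23:59:59 -> 00:00:00 count as sequential.
        if prev is None or cur == (prev + 1) % 86400:
            kept.append(line)
        # Always remember the current time, even when the line was dropped.
        prev = cur
    return kept


sample = [
    "Thu Jun 8 14:17:31 CEST 2017 sync:1",
    "Thu Jun 8 14:17:32 CEST 2017 sync:1",
    "Thu Jun 8 14:17:33 CEST 2017 sync:1",
    "Thu Jun 8 14:17:37 CEST 2017 sync:1",
    "Thu Jun 8 14:17:38 CEST 2017 sync:1",
    "Thu Jun 8 14:17:39 CEST 2017 sync:1",
    "Thu Jun 8 14:17:40 CEST 2017 sync:1",
]

kept = sequential_lines(sample)
print("\n".join(kept))
```

As in the awk version, the line that follows a gap is dropped, and the comparison then continues from that line's timestamp.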
I've got a date string as such:
Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)
How can I add two hours to this?
So I get:
Tue Jul 29 2014 01:44:06 GMT+0000 (UTC)
Here's one solution:
var date = new Date('Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)').getTime();
date += (2 * 60 * 60 * 1000);
console.log(new Date(date).toUTCString());
// displays: Wed, 30 Jul 2014 01:44:06 GMT
Obviously once you have the (new) date object, you can format the output to your liking if the native Date functions do not give you what you need.
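The underlying arithmetic (parse, add a fixed duration, format again) is not specific to JavaScript. As an illustration only, here is a minimal Python sketch of the same steps; the format string FMT and the helper add_hours are assumptions made for this example, and the input is assumed to always carry the literal "GMT" prefix and "(UTC)" suffix.

```python
from datetime import datetime, timedelta

# Assumed layout of the date string: 'Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)'
FMT = '%a %b %d %Y %H:%M:%S GMT%z (UTC)'


def add_hours(stamp, hours):
    """Parse the date string, add the given number of hours, and format it back."""
    parsed = datetime.strptime(stamp, FMT)
    # Adding a timedelta rolls over into the next day when needed.
    return (parsed + timedelta(hours=hours)).strftime(FMT)


result = add_hours('Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)', 2)
print(result)
```

Note that adding two hours to 23:44:06 crosses midnight, so the day of the week and the day of the month both change.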
Using MomentJS:
var moment = require('moment');
var date1 = moment("Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)");
//sets an internal flag on the moment object.
date1.utc();
console.log(date1.format("ddd MMM DD YYYY HH:mm:ss [GMT]ZZ (UTC)"));
//adds 2 hours
date1.add(2, 'h');
console.log(date1.format("ddd MMM DD YYYY HH:mm:ss [GMT]ZZ (UTC)"));
Prints out the following:
Tue Jul 29 2014 23:44:06 GMT+0000 (UTC)
Wed Jul 30 2014 01:44:06 GMT+0000 (UTC)
This works well:
const epoch = new Date('01-01-2000')
const notBeforeDate = new Date(epoch).setSeconds(notBefore)
const notAfterDate = new Date(epoch).setSeconds(notAfter)
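NOTE: setSeconds() sets the seconds field of the Date, and a value outside 0-59 rolls over into the minutes, hours and days. Because the seconds of epoch are 0 here, the call effectively adds that many seconds to epoch; it does not reset the Date to some absolute number of seconds. It also returns the new timestamp in milliseconds rather than a Date object, so wrap the result in new Date(...) if you need a Date. This detail is easy to miss in the documentation and causes a lot of heartache when first trying to work with Dates in JavaScript.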
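If the intent is simply "a reference date plus N seconds", writing the addition explicitly avoids relying on rollover behaviour. A minimal Python sketch of that idea follows; the reference date mirrors the one above, while the function name and the offsets are made up for the example.

```python
from datetime import datetime, timedelta

# Reference date, mirroring the 1 January 2000 epoch used above.
EPOCH = datetime(2000, 1, 1)


def from_offset(seconds):
    """Return the date that lies the given number of seconds after EPOCH."""
    # An explicit addition: no field is overwritten, so no rollover rules are involved.
    return EPOCH + timedelta(seconds=seconds)


# Hypothetical offsets, for illustration only.
print(from_offset(90))
print(from_offset(86400))
```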