pd.to_datetime with different date formats - python-3.x

I need to analyse a large dataset with dates formatted in several different formats:
Mon, 04 Nov 2019 06:12:44 -0800 (PST)
Mon, 4 Nov 2019 15:16:58 +0100 (CET)
Mon, 4 Nov 2019 08:03:13 +0000 (UTC)
Mon, 4 Nov 2019 12:05:54 +0100
dfMail.Date = pd.to_datetime(dfMail.Date, format = "%a, %d %b %Y %H:%M:%S %z")
returns error: ValueError: unconverted data remains: (PST)
What is the best strategy to convert these dates?
Thanks

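The trailing parenthesised part, such as (PST), is what breaks the parse. Since the numeric offset already carries the time zone, you can strip it before converting (regex=True is needed on recent pandas versions, where str.replace treats the pattern literally by default):
pd.to_datetime(dfMail.Date.str.replace(r' \(.*\)', '', regex=True), utc=True)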
Input:
Date
0 Mon, 04 Nov 2019 06:12:44 -0800 (PST)
1 Mon, 4 Nov 2019 15:16:58 +0100 (CET)
2 Mon, 4 Nov 2019 08:03:13 +0000 (UTC)
3 Mon, 4 Nov 2019 12:05:54 +0100
4 Thu, 17 Oct 2019 23:19:41 +0100 (GMT+01:00)
Output:
0 2019-11-04 14:12:44+00:00
1 2019-11-04 14:16:58+00:00
2 2019-11-04 08:03:13+00:00
3 2019-11-04 11:05:54+00:00
4 2019-10-17 22:19:41+00:00
Name: 0, dtype: datetime64[ns, UTC]
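The same idea can be checked without pandas. The following is a minimal sketch using only the standard library: it strips a trailing parenthesised comment with a regular expression and then applies the format string from the question. The helper name parse_mail_date is hypothetical, and the sketch assumes English day and month names (the default C locale).

```python
import re
from datetime import datetime, timezone

MAIL_FORMAT = "%a, %d %b %Y %H:%M:%S %z"  # format string from the question


def parse_mail_date(text):
    """Parse a mail-style date, ignoring a trailing comment such as '(PST)'."""
    # Drop an optional trailing " (...)" comment; the numeric offset is kept.
    cleaned = re.sub(r"\s*\(.*\)\s*$", "", text)
    # %d accepts both "4" and "04"; %z reads offsets such as "-0800".
    return datetime.strptime(cleaned, MAIL_FORMAT).astimezone(timezone.utc)


samples = [
    "Mon, 04 Nov 2019 06:12:44 -0800 (PST)",
    "Mon, 4 Nov 2019 12:05:54 +0100",
    "Thu, 17 Oct 2019 23:19:41 +0100 (GMT+01:00)",
]
for s in samples:
    print(parse_mail_date(s).isoformat())
```

Row by row this is slower than the vectorised pandas call, so it is mainly useful for checking individual strings.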

Related

AttributeError: object has no attribute 'published' when parsing CNN source

I run into this problem when parsing the CNN RSS feed. It only gets the first 7 entries and then fails with this error. My log is below. Please help, thank you.
This is my code:
import feedparser
url = "http://rss.cnn.com/rss/edition.rss"
feed = feedparser.parse(url)
for news in feed.entries:
    print(news.published)
My log:
https://pastebin.com/vMJSXD0J
To debug this, first check whether published is among the keys of news:
>>> news.keys()
dict_keys(['title', 'title_detail', 'summary', 'summary_detail', 'links', 'link', 'id', 'guidislink', 'published', 'published_parsed', 'media_content'])
According to this, published is one of the keys, so your code should work, and it does when I run it:
>>> import feedparser
>>> url = "http://rss.cnn.com/rss/edition.rss"
>>> feed = feedparser.parse(url)
>>> for news in feed.entries:
...     print(news.published)
...
Thu, 11 Mar 2021 04:53:36 GMT
Thu, 11 Mar 2021 03:21:32 GMT
Wed, 10 Mar 2021 12:54:12 GMT
Thu, 11 Mar 2021 05:13:03 GMT
Wed, 10 Mar 2021 23:46:07 GMT
Wed, 10 Mar 2021 17:56:03 GMT
Thu, 11 Mar 2021 05:50:56 GMT
Thu, 11 Mar 2021 00:37:19 GMT
Thu, 11 Mar 2021 04:44:57 GMT
Wed, 10 Mar 2021 03:46:09 GMT
Wed, 10 Mar 2021 13:24:02 GMT
Thu, 11 Mar 2021 05:37:44 GMT
Thu, 11 Mar 2021 01:48:41 GMT
Wed, 10 Mar 2021 17:13:52 GMT
Thu, 11 Mar 2021 03:43:19 GMT
Thu, 11 Mar 2021 05:11:13 GMT
...
PS - This is implemented on Python 3.9
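If some entries in a feed really do lack a published field, attribute access fails for those entries only. A defensive sketch, assuming each entry behaves like a dictionary (feedparser entries are dict subclasses, so .get is available). The helper name and the placeholder text are hypothetical, and plain dicts stand in for parsed entries so the example runs without network access.

```python
def published_or_default(entry, default="(no publication date)"):
    """Return the entry's 'published' value, or a placeholder if it is missing."""
    # .get avoids the exception raised when the key is absent.
    return entry.get("published", default)


# Plain dicts standing in for feed entries (stand-in data, not a real feed).
entries = [
    {"title": "First", "published": "Thu, 11 Mar 2021 04:53:36 GMT"},
    {"title": "Second"},  # no 'published' key
]
for news in entries:
    print(news["title"], "->", published_or_default(news))
```

Printing news.keys() for the entry that fails would show which date-like field, if any, that entry provides instead.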

Python imaplib module - How to extract date in (GMT-06:00) Central Time (US & Canada ) format

Using imaplib in Python, I want to extract the datetime of the email in (GMT-06:00) Central Time (US & Canada) format.
Below are sample dates as returned by msg["date"]:
Tue, 14 Jul 2020 08:30:01 +0530
Tue, 14 Jul 2020 02:59:49 +0000
Mon, 13 Jul 2020 21:53:30 -0500 (CDT)
Mon, 13 Jul 2020 18:14:36 -0600
Mon, 13 Jul 2020 16:44:06 -0700
Wed, 8 Jul 2020 20:11:26 +0530
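One possible approach, sketched with the standard library only: parse each header with email.utils.parsedate_to_datetime, which tolerates a trailing comment such as (CDT), and convert the result to the target zone. The name to_zone and the fixed GMT-06:00 offset are assumptions made for illustration, and the fixed offset deliberately ignores daylight saving time.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Fixed GMT-06:00 offset (assumption: daylight saving time is ignored).
CENTRAL_FIXED = timezone(timedelta(hours=-6), "GMT-06:00")


def to_zone(date_header, tz=CENTRAL_FIXED):
    """Parse an RFC 2822 date header and express it in the given time zone."""
    # parsedate_to_datetime returns an aware datetime when an offset is present.
    return parsedate_to_datetime(date_header).astimezone(tz)


for header in [
    "Tue, 14 Jul 2020 08:30:01 +0530",
    "Mon, 13 Jul 2020 21:53:30 -0500 (CDT)",
]:
    print(to_zone(header).strftime("%a, %d %b %Y %H:%M:%S %z"))
```

To follow daylight saving time, zoneinfo.ZoneInfo("America/Chicago") (Python 3.9+) could be passed as tz instead, provided time zone data is installed on the machine.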

How to create a dynamic date generator in groovy

I have a simple date generator code where it selects a date from now to 120 days.
However, I want my dates to be a little more clever: they should be selected only from May, June, July and August.
So if the current date is before May, select a date at random within one of those four months of this year.
Otherwise, if the current date is past June, select a date from any of the four months of next year.
How can this be written in groovy?
Below is the current date generator.
//Defines Start and End Dates for holiday departure date search
Random random = new Random();
date = new Date()
def startDate = date + random.nextInt(100 + 120)
sDate = startDate.format("yyyy-MM-dd")
testRunner.testCase.testSuite.setPropertyValue('arrivaldate', sDate);
Here is a Groovy script that does what you intended.
Note that the script throws an error if the current date falls in the range the question does not cover, i.e. between 1st May and 30th June.
Please follow the inline comments.
import groovy.time.TimeCategory
def dateFormat = 'yyyy-MM-dd'
def getNumberInRange = { min, max -> new Random().nextInt(max + 1 - min) + min }
//Calendar.MONTH is only a field index, so read today's month and compare it
//with the zero-based month constants (Calendar.MAY == 4, Calendar.JUNE == 5)
def isTodayBeforeMay = { new Date()[Calendar.MONTH] < Calendar.MAY }
def isTodayAfterJune = { new Date()[Calendar.MONTH] > Calendar.JUNE }
//Get the number of days between today and given date
def getDifferenceDays = { targetDate, closure ->
    def strDate = closure(targetDate)
    def futureDate = Date.parse(dateFormat, strDate)
    TimeCategory.minus(futureDate, new Date()).days
}
//Get the offset between today and max date i.e.,31 august
def getOffSetDays = { date ->
    //Need to change the date range if needed.
    //As per OP, May to August is mentioned below
    def max = getDifferenceDays(date) { "${it[Calendar.YEAR]}-08-31" }
    def min = getDifferenceDays(date) { "${it[Calendar.YEAR]}-05-01" }
    getNumberInRange(min, max)
}
def now = new Date()
def nextYearNow = now.updated(year: now[Calendar.YEAR] + 1)
def selected
def finalDate
println "Today : $now"
println "Next year same date : $nextYearNow"
if (isTodayBeforeMay()) {
    selected = now
} else if (isTodayAfterJune()) {
    selected = nextYearNow
} else {
    //It is not mentioned what should happen for the mentioned period by OP
    throw new Error("Not implemented for the days between 1st May to 30th June")
}
def offset = getOffSetDays(selected)
//Add the offset days to now
use(TimeCategory) {
    finalDate = now + offset.days
}
println "Final future date is : $finalDate"
println "Final future date is(formatted) : ${finalDate.format(dateFormat)}"
//set the date at test case level property
context.testCase.setPropertyValue('NEXT_DATE', finalDate.format(dateFormat))
To get the date evaluated:
Use ${#TestCase#NEXT_DATE} in requests
Use context.expand('${#TestCase#NEXT_DATE}') in a Groovy script (a string is returned)
NOTE: println can be replaced with log.info in the above script if you want to see the output in the SoapUI tool. Otherwise, the print statements are shown in the log files.
(edited after clarification by op)
def getRandomDate(Date now, Random random) {
    def year = now.year + (now.month > 5 ? 1 : 0)
    def allDates = (new Date(year, 4, 1)..new Date(year, 7, 31))
    def allowedDates = allDates.findAll { now < it }
    allowedDates[random.nextInt(allowedDates.size())]
}
// example usage
def now = new Date()
def random = new Random()
100.times {
    def today = now + it*3
    def randomDate = getRandomDate(today, random)
    println "today: ${today.format('yyyy MMM dd')} ==> random date: ${randomDate.format('yyyy MMM dd')}"
}
An execution of the above prints something along the lines of:
today: 2017 Feb 25 ==> random date: 2017 Jul 21
today: 2017 Feb 28 ==> random date: 2017 Aug 03
today: 2017 Mar 03 ==> random date: 2017 Jun 21
today: 2017 Mar 06 ==> random date: 2017 May 21
today: 2017 Mar 09 ==> random date: 2017 Jun 05
today: 2017 Mar 12 ==> random date: 2017 May 03
today: 2017 Mar 15 ==> random date: 2017 Aug 04
today: 2017 Mar 18 ==> random date: 2017 Aug 17
today: 2017 Mar 21 ==> random date: 2017 Jul 09
today: 2017 Mar 24 ==> random date: 2017 Jul 11
today: 2017 Mar 27 ==> random date: 2017 Jul 16
today: 2017 Mar 30 ==> random date: 2017 Aug 09
today: 2017 Apr 02 ==> random date: 2017 Jul 09
today: 2017 Apr 05 ==> random date: 2017 Aug 05
today: 2017 Apr 08 ==> random date: 2017 Jul 19
today: 2017 Apr 11 ==> random date: 2017 Aug 10
today: 2017 Apr 14 ==> random date: 2017 Jun 21
today: 2017 Apr 17 ==> random date: 2017 Aug 03
today: 2017 Apr 20 ==> random date: 2017 May 02
today: 2017 Apr 23 ==> random date: 2017 Jul 04
today: 2017 Apr 26 ==> random date: 2017 Jun 13
today: 2017 Apr 29 ==> random date: 2017 May 02
today: 2017 May 02 ==> random date: 2017 Aug 01
today: 2017 May 05 ==> random date: 2017 Aug 07
today: 2017 May 08 ==> random date: 2017 Aug 20
today: 2017 May 11 ==> random date: 2017 Jun 29
today: 2017 May 14 ==> random date: 2017 May 18
today: 2017 May 17 ==> random date: 2017 Jun 11
today: 2017 May 20 ==> random date: 2017 May 26
today: 2017 May 23 ==> random date: 2017 Jul 06
today: 2017 May 26 ==> random date: 2017 Aug 29
today: 2017 May 29 ==> random date: 2017 Jun 02
today: 2017 Jun 01 ==> random date: 2017 Jun 09
today: 2017 Jun 04 ==> random date: 2017 Jun 07
today: 2017 Jun 07 ==> random date: 2017 Aug 09
today: 2017 Jun 10 ==> random date: 2017 Aug 02
today: 2017 Jun 13 ==> random date: 2017 Aug 04
today: 2017 Jun 16 ==> random date: 2017 Aug 09
today: 2017 Jun 19 ==> random date: 2017 Jun 22
today: 2017 Jun 22 ==> random date: 2017 Aug 16
today: 2017 Jun 25 ==> random date: 2017 Aug 13
today: 2017 Jun 28 ==> random date: 2017 Jul 05
today: 2017 Jul 01 ==> random date: 2018 Jun 18
today: 2017 Jul 04 ==> random date: 2018 Jul 29
today: 2017 Jul 07 ==> random date: 2018 Jul 13
today: 2017 Jul 10 ==> random date: 2018 Jul 26
today: 2017 Jul 13 ==> random date: 2018 Aug 23
today: 2017 Jul 16 ==> random date: 2018 Aug 12
today: 2017 Jul 19 ==> random date: 2018 Aug 24
today: 2017 Jul 22 ==> random date: 2018 Aug 20
today: 2017 Jul 25 ==> random date: 2018 Jul 28
today: 2017 Jul 28 ==> random date: 2018 Jul 29
today: 2017 Jul 31 ==> random date: 2018 May 02
today: 2017 Aug 03 ==> random date: 2018 Jul 19
today: 2017 Aug 06 ==> random date: 2018 Aug 05
today: 2017 Aug 09 ==> random date: 2018 Aug 28
today: 2017 Aug 12 ==> random date: 2018 Jul 16
today: 2017 Aug 15 ==> random date: 2018 Aug 04
today: 2017 Aug 18 ==> random date: 2018 May 30
today: 2017 Aug 21 ==> random date: 2018 May 02
today: 2017 Aug 24 ==> random date: 2018 May 01
today: 2017 Aug 27 ==> random date: 2018 May 10
today: 2017 Aug 30 ==> random date: 2018 May 04
today: 2017 Sep 02 ==> random date: 2018 Jun 30
today: 2017 Sep 05 ==> random date: 2018 May 05
today: 2017 Sep 08 ==> random date: 2018 Jul 27
today: 2017 Sep 11 ==> random date: 2018 Aug 14
today: 2017 Sep 14 ==> random date: 2018 Jul 15
today: 2017 Sep 17 ==> random date: 2018 Jul 12
today: 2017 Sep 20 ==> random date: 2018 Jul 24
today: 2017 Sep 23 ==> random date: 2018 Aug 28
today: 2017 Sep 26 ==> random date: 2018 Jul 26
today: 2017 Sep 29 ==> random date: 2018 Jun 27
today: 2017 Oct 02 ==> random date: 2018 Aug 15
today: 2017 Oct 05 ==> random date: 2018 Jun 27
today: 2017 Oct 08 ==> random date: 2018 Jun 01
today: 2017 Oct 11 ==> random date: 2018 Jun 12
today: 2017 Oct 14 ==> random date: 2018 Jun 06
today: 2017 Oct 17 ==> random date: 2018 Aug 02
today: 2017 Oct 20 ==> random date: 2018 May 05
today: 2017 Oct 23 ==> random date: 2018 Jun 15
today: 2017 Oct 26 ==> random date: 2018 Jun 05
today: 2017 Oct 29 ==> random date: 2018 Aug 12
today: 2017 Nov 01 ==> random date: 2018 Aug 29
today: 2017 Nov 04 ==> random date: 2018 May 15
today: 2017 Nov 07 ==> random date: 2018 Jul 03
today: 2017 Nov 10 ==> random date: 2018 Aug 16
today: 2017 Nov 13 ==> random date: 2018 Aug 21
today: 2017 Nov 16 ==> random date: 2018 May 06
today: 2017 Nov 19 ==> random date: 2018 Jul 15
today: 2017 Nov 22 ==> random date: 2018 Jul 03
today: 2017 Nov 25 ==> random date: 2018 Aug 15
today: 2017 Nov 28 ==> random date: 2018 Jul 29
today: 2017 Dec 01 ==> random date: 2018 May 06
today: 2017 Dec 04 ==> random date: 2018 Aug 31
today: 2017 Dec 07 ==> random date: 2018 May 12
today: 2017 Dec 10 ==> random date: 2018 May 01
today: 2017 Dec 13 ==> random date: 2018 Aug 31
today: 2017 Dec 16 ==> random date: 2018 Jul 17
today: 2017 Dec 19 ==> random date: 2018 Jul 24
Essentially we pick a random date out of a list of allowed dates.
Performance could be improved by not creating the allowedDates list on every call.
Some of the date methods are deprecated. If deprecations are important, you might want to use Calendar instead.
The Groovy range operator .. on dates returns a date range containing all the dates between the two given dates, inclusive.
I have left now external to the method to make testing easier. I have also left random external, as I believe re-creating a new Random instance on every call is detrimental to the quality of the random distribution generated.
Note 1: For dates between May 1st and June 30th, the problem definition is unclear. For these dates the code returns a random date from the current year, excluding past dates. I.e. on June 15th, the algorithm would return a random date from this year between June 16th and August 31st. On July 1st it would return a random date from next year, between May 1st and August 31st.
Note 2: Alternate version without deprecated date calls and using calendar:
import static java.util.Calendar.*
import java.util.GregorianCalendar as Cal
def getRandomDate(Date now, Random random) {
    def cal = new Cal(time: now)
    // use current year if current date is before july, otherwise next year
    def year = cal.get(YEAR) + (cal.get(MONTH) > 5 ? 1 : 0)
    def allDates = (new Cal(year, 4, 1).time..new Cal(year, 7, 31).time)
    def allowedDates = allDates.findAll { now < it }
    allowedDates[random.nextInt(allowedDates.size())]
}

Grails get day of current week and last three weeks

I have a domain class, work, with id and day.
day holds values from March to the present.
I need to find the list of days for the current week and the previous two weeks.
Ex: today is Monday (04/22) then what I need is:
Week1: 06-12 April
Week2: 13-19 April
Current week: 20-26 April.
Please help, thanks.
Posted here for posterity:
def current = new Date().clearTime()
int currentDay = Calendar.instance.with {
    time = current
    get( Calendar.DAY_OF_WEEK )
}
def listOfDays = (current - 13 - currentDay)..(current + 7 - currentDay)
listOfDays.each {
    println it
}
Prints:
Sun Apr 06 00:00:00 BST 2014
Mon Apr 07 00:00:00 BST 2014
Tue Apr 08 00:00:00 BST 2014
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
Wed Apr 23 00:00:00 BST 2014
Thu Apr 24 00:00:00 BST 2014
Fri Apr 25 00:00:00 BST 2014
Sat Apr 26 00:00:00 BST 2014

Groovy, get list of current week and last 2 weeks

I have a domain class, work, with id and day; day lists the days from January until now.
I get the current time with:
def current = new Date()
I'd like to get the list of days for the last 2 weeks, including this week. I used the following code, but it doesn't work:
def getWeek = current.Time - 13 (13 is 2 week + today)
Please help me solve it.
Not 100% sure I understand, but you should be able to use a Range:
def current = new Date().clearTime()
def listOfDays = (current - 13)..current
listOfDays.each { println it }
That prints:
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
If you mean you want the entire 2 weeks before the current week AND the current week, you could do:
def current = new Date().clearTime()
int currentDay = Calendar.instance.with {
    time = current
    get( Calendar.DAY_OF_WEEK )
}
def listOfDays = (current - 13 - currentDay)..(current + 7 - currentDay)
listOfDays.each {
    println it
}
Which prints:
Sun Apr 06 00:00:00 BST 2014
Mon Apr 07 00:00:00 BST 2014
Tue Apr 08 00:00:00 BST 2014
Wed Apr 09 00:00:00 BST 2014
Thu Apr 10 00:00:00 BST 2014
Fri Apr 11 00:00:00 BST 2014
Sat Apr 12 00:00:00 BST 2014
Sun Apr 13 00:00:00 BST 2014
Mon Apr 14 00:00:00 BST 2014
Tue Apr 15 00:00:00 BST 2014
Wed Apr 16 00:00:00 BST 2014
Thu Apr 17 00:00:00 BST 2014
Fri Apr 18 00:00:00 BST 2014
Sat Apr 19 00:00:00 BST 2014
Sun Apr 20 00:00:00 BST 2014
Mon Apr 21 00:00:00 BST 2014
Tue Apr 22 00:00:00 BST 2014
Wed Apr 23 00:00:00 BST 2014
Thu Apr 24 00:00:00 BST 2014
Fri Apr 25 00:00:00 BST 2014
Sat Apr 26 00:00:00 BST 2014
