how to search for latest content on google? - search

In the Google search box, when we type something like " 'java code' + inurl:javalobby ", we get search results where the website link contains the string javalobby and the page contains the string java code.
Similarly, is there a way to search for the most recently updated content on the internet that contains the keyword entered in the search box?
Thanks.

There are two tricks in Google to narrow your search by date: using the keyword daterange:startdate-enddate, or searching by content creation date.
1. Using the syntax daterange:startdate-enddate : The catch is that the date must be expressed as a Julian date, a continuous count of days since noon UTC on January 1, 4713 BC. So, for example, July 8, 2002 is Julian date 2452463.5 and May 22, 1968 is 2439998.5. Furthermore, Google isn't fond of decimals in its daterange: queries; use only integers: 2452463 or 2452464. You can convert Julian dates online here.
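If you would rather compute the Julian dates yourself than use an online converter, here is a minimal Python sketch. The function names are my own, and the offset constant assumes Python's proleptic Gregorian ordinals (it is anchored on January 1, 2000 being Julian date 2451544.5 at midnight UTC).

```python
from datetime import date

# Offset between Python's date ordinal and the Julian date at midnight UTC:
# date(2000, 1, 1).toordinal() == 730120, and that midnight is JD 2451544.5.
ORDINAL_TO_JD = 1721424.5


def julian_date(day: date) -> float:
    """Julian date at 00:00 UTC on the given Gregorian calendar day."""
    return day.toordinal() + ORDINAL_TO_JD


def daterange_term(start: date, end: date) -> str:
    """Build a daterange: term, truncating the Julian dates to integers."""
    return f"daterange:{int(julian_date(start))}-{int(julian_date(end))}"


print(julian_date(date(2002, 7, 8)))    # 2452463.5
print(julian_date(date(1968, 5, 22)))   # 2439998.5
print(daterange_term(date(2002, 7, 8), date(2002, 7, 18)))
```

Truncating with int() gives the first of the two integers that bracket each midnight value, which is what the daterange: keyword wants.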
Example: Geri Halliwell left the Spice Girls around May 27, 1998. If you wanted to get a lot of information about the breakup, you could try doing a date search in a ten-day window, say May 25 to June 4. That query would look like this:
"Geri Halliwell" "Spice Girls" daterange:2450958-2450968
2. Searching by content creation date : Try adding a string of common date formats to your query. If you wanted something from May 2003, for example, you could try appending:
("May * 2003" | "May 2003" | 05/03 | 05/*/03)
A query like that uses up most of your ten-word query limit, however, so it's best to be judicious, perhaps by cycling through these formats one at a time. If any one of these gives you too many results, try restricting your search to the title tag of the page.
If you're feeling really lucky you can search for a full date, like May 9, 2003. Your decision then is if you want to search for the date in the format above or as one of many variations: 9 May 2003, 9/5/2003, 9 May 03, and so forth. Exact-date searching will severely limit your results and shouldn't be used except as a last-ditch option.
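Typing these variations by hand gets tedious, so here is a rough Python sketch that generates them. The function names and the exact set of formats are my own choice, modelled on the examples above; the month names are hard-coded in English so the output does not depend on the system locale.

```python
from datetime import date

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def month_variants(year: int, month: int) -> str:
    """OR-group of common ways a page might mention a given month."""
    name = MONTHS[month - 1]
    mm, yy = f"{month:02d}", f"{year % 100:02d}"
    terms = [f'"{name} * {year}"', f'"{name} {year}"', f"{mm}/{yy}", f"{mm}/*/{yy}"]
    return "(" + " | ".join(terms) + ")"


def full_date_variants(day: date) -> list:
    """A few spellings of one exact date, to try one at a time."""
    name = MONTHS[day.month - 1]
    return [
        f'"{name} {day.day}, {day.year}"',
        f'"{day.day} {name} {day.year}"',
        f"{day.day}/{day.month}/{day.year}",
        f'"{day.day} {name} {day.year % 100:02d}"',
    ]


print(month_variants(2003, 5))
for variant in full_date_variants(date(2003, 5, 9)):
    print(variant)
```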
When using date-range searching, you'll have to be flexible in your thinking, more general in your search than you otherwise would be (because the date-range search will narrow your results down a lot), and persistent in your queries because different dates and date ranges will yield very different results. But you'll be rewarded with smaller result sets that are focused on very specific events and topics.

Related

Date Entity Parsing Incorrect Year for Incomplete Dates

I have a dataset (df_test) consisting of several news articles (Text_4). Using spaCy, I've extracted the 'DATE' entities. For those, I want to see whether they are in the future or in the past (to identify news articles that reference future events such as product launches) compared to the article's publication date (RP_DateFormatted).
My current code is
for index, row in df_test.iterrows():
    doc = nlp(row.Text_4)
    entities = {key: list(g) for key, g in groupby(sorted(doc.ents, key=lambda x: x.label_), lambda x: x.label_)}
    ... some other steps ... then:
    ListDATE3 = [dateparser.parse(replace_all((i.text), od), languages=['en'],
                                  settings={'RELATIVE_BASE': datetime.strptime(row.RP_DateFormatted, '%Y-%m-%d'),
                                            'PREFER_DAY_OF_MONTH': 'last',
                                            'PREFER_DATES_FROM': 'future'}) for i in entities['DATE']]
    df_test.PY_Entities_DatesParsed[index] = ListDATE3
I have trouble with the line 'PREFER_DATES_FROM': 'future', for example:
Article was written on August 15th 2005 but no year is given in the text. SpaCy extracts "Aug 15" as Date. The dateparser sets the year to 2006 (because it is in the future). Consequently, I would then believe that the news article talks about the future - which it does not.
Setting 'PREFER_DATES_FROM': 'past' would also not help me in a case when an event is described that happens in February (without a year given in the text). This is likely to be next February but the dateparser would set it to this year's February.
Is there a way to add an if statement to the settings or to create a new function based on the dateparser? Please note that each news article can have multiple dates (entities['DATE'] is a list for each row in my dataframe).
I am using Python 3.8
I don't think you're going to be able to solve this just with options to DateParser. That interprets dates mechanically given a string, but in order to tell whether these dates are in the past or future you're using knowledge of the surrounding words and context of the article ("at next February's festival...").
This is a pretty hard thing to get right in an automated system. In NLP research this is referred to as "grounding", and includes related problems, like telling who "President of the United States" refers to (what year was it?), or what color "red" is (is it red like a stop sign, or red like red hair?).
What I would do is start by using rule-based techniques to identify whether dates are in the past or future before passing them to date parser. So take some words from around date entities, and if "last" is there then it's in the past, if "next" is there then it's in the future, that sort of thing. See how well it does. (You might think you could just take words before the date entity, but you can also have "February last year was really cold" or something.)
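To make the rule-based idea concrete, here is a minimal sketch. The cue word lists, the window size, and the function name are all illustrative assumptions, not a tested rule set. It takes character offsets for the date span, which with spaCy you could get from an entity's start_char and end_char.

```python
import re

# Illustrative cue words only; a real system would need a longer, tested list.
PAST_CUES = {"last", "ago", "previous"}
FUTURE_CUES = {"next", "upcoming", "coming"}


def tense_hint(text: str, start: int, end: int, window: int = 3) -> str:
    """Guess 'past', 'future' or 'unknown' from words around text[start:end]."""
    words_before = re.findall(r"[a-z]+", text[:start].lower())[-window:]
    words_after = re.findall(r"[a-z]+", text[end:].lower())[:window]
    context = set(words_before) | set(words_after)
    has_past = bool(context & PAST_CUES)
    has_future = bool(context & FUTURE_CUES)
    if has_past and not has_future:
        return "past"
    if has_future and not has_past:
        return "future"
    return "unknown"  # no cue, or conflicting cues


sentence = "February last year was really cold"
print(tense_hint(sentence, 0, len("February")))  # cue comes after the date
```

The hint could then decide which value to pass as 'PREFER_DATES_FROM' for that one entity, falling back to the publication year when the result is "unknown".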
If you want to try a statistical system after that, you could look at using the spancat in spaCy with different kinds of context windows to classify dates as "future" or "past".

Excel bug: TEXT function doesn't work to extract day name depending on region

How to compute day name from date in Excel?
Please don't say it is TEXT(...,"ddd"), because it doesn't work.
More screenshots for non-believers:
The complete formula doesn't work either:
This is some problem with locale processing. Although my Windows is in English, my region is set to Russia, and Excel uses that in some strange places:
TEXT(...,"ddd") (or TEXT(...;"ddd"), depending on the list separator your locale requires)
does work, provided you first either SUBSTITUTE the dots (.) in your data with recognisable date separators (e.g. /) or apply Find/Replace for that purpose. Having done either (perhaps working on a copy), though, no formula is necessary, since a Custom format of:
dddd
(long form, or ddd short) should be sufficient.
Note that without an indication of the century Excel will guess which one is meant, and will not give you the right answer for a date such as 11.11.1911 (a Saturday) represented in text as 11.11.11.
With string parsing you would need to be careful whether 10.08 represents October 8 on your system, or August 10.
We can always manually build one : =if(WEEKDAY(A1,2)=7,"Sunday",if(WEEKDAY(A1,2)=6,"Saturday",if(WEEKDAY(A1,2)=5,"Friday",if(WEEKDAY(A1,2)=4,"Thursday",if(WEEKDAY(A1,2)=3,"Wednesday",if(WEEKDAY(A1,2)=2,"Tuesday",if(WEEKDAY(A1,2)=1,"Monday","")))))))
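For comparison, outside Excel the same manual mapping is short. This is a sketch in Python, assuming the input is text in dd.mm.yyyy form with a four-digit year (so no century guessing is involved); the function name and the hard-coded English day names are my own choice. isoweekday() numbers the days 1 (Monday) to 7 (Sunday), the same way WEEKDAY(A1,2) does.

```python
from datetime import datetime

# Hard-coded names, so the result does not depend on regional settings.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")


def day_name(text: str, fmt: str = "%d.%m.%Y") -> str:
    """Parse a date string with an explicit format and return its day name."""
    parsed = datetime.strptime(text, fmt).date()
    return DAY_NAMES[parsed.isoweekday() - 1]


print(day_name("11.11.1911"))              # Saturday
print(day_name("10.08.2015"))              # read as 10 August
print(day_name("10.08.2015", "%m.%d.%Y"))  # read as October 8
```

Passing the format explicitly is what removes the "is 10.08 August or October" ambiguity.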
[^_^]

Importing Excel to Access: Date be field name for Access table

I was trying to import an Excel worksheet into an Access table, and the worksheet had specific dates (e.g. 12/4/2017) as column headers.
When I tried to import it, Access did not allow me to import that worksheet into a table, because "12/4/2017 isn't a valid field name".
Are there other ways to import the worksheet, or to work around this?
Thanks
Names of fields, controls & objects in Access:
Can be up to 64 characters long.
Can include any combination of letters, numbers, spaces, and special characters except a period (.), an exclamation point (!), an accent grave (`), or brackets ([ ]).
Can't begin with leading spaces.
Can't include control characters (ASCII values 0 through 31).
Can't include a double quotation mark (") in table, view, or stored procedure names in a Microsoft Access project.
(Source)
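If you want to check a batch of headers before importing, the rules above are easy to encode. This is only a sketch of the rules as listed here (the function name is made up, and Access may apply further checks such as reserved words that it does not cover):

```python
FORBIDDEN_CHARS = set(".!`[]")


def field_name_problems(name: str) -> list:
    """Return the naming rules listed above that `name` breaks (empty if none)."""
    problems = []
    if len(name) > 64:
        problems.append("longer than 64 characters")
    if name.startswith(" "):
        problems.append("begins with a space")
    found = sorted(set(name) & FORBIDDEN_CHARS)
    if found:
        problems.append("contains forbidden characters: " + " ".join(found))
    if any(ord(ch) < 32 for ch in name):
        problems.append("contains control characters")
    return problems


for header in ("12/4/2017", "43073.25"):
    print(repr(header), field_name_problems(header))
```

By these rules a header that is really the text 12/4/2017 passes, while a raw number with a decimal point does not.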
Date and time values in Excel are stored internally as a 64-bit floating point number. The value to the left of the decimal represents the number of days since December 30, 1899. The value to the right of the decimal represents the fraction of a day since midnight.
For example:
12:00 Noon is stored as 0.5.
1.0 represents midnight on January 1, 1900.
2.25 represents 6:00 AM on January 2, 1900.
Your example date 12/4/2017 would be stored as 43073.
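To see the arithmetic, here is a small Python sketch of the serial-number conversion. The function names are my own. It uses December 30, 1899 as day zero, which matches Excel for serials from 61 (March 1, 1900) onward; earlier serials are off by one because Excel counts a February 29, 1900 that never existed.

```python
from datetime import datetime, timedelta

EXCEL_DAY_ZERO = datetime(1899, 12, 30)


def from_excel_serial(serial: float) -> datetime:
    """Convert an Excel serial (days, with fraction of a day) to a datetime."""
    return EXCEL_DAY_ZERO + timedelta(days=serial)


def to_excel_serial(moment: datetime) -> float:
    """Convert a datetime to an Excel serial number."""
    return (moment - EXCEL_DAY_ZERO) / timedelta(days=1)


print(to_excel_serial(datetime(2017, 12, 4)))   # 43073.0
print(from_excel_serial(43073.25))              # 2017-12-04 06:00:00
print(int(43073.25))                            # 43073, the date without the time
```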
How datetimes are interpreted depends on your regional settings as customized in Windows (not necessarily your country's official standard date format). For example, I live in North America, so by default, Excel would interpret 12/4/2017 as a date.
However, for various reasons, I prefer a date format of YYYY-MM-DD (technically named "ISO 8601"), so I changed the format in my Windows Settings. Therefore, when I enter 12/4/2017, Excel does not recognize it as a date, so it is stored as text, yet when I enter 2017-12-4, Excel knows to store it as a date.
Regional settings aside, I suspect that your field names may have times attached to them (even if they aren't formatted to display as such).
If the cell you'd like to use as a field name actually contains:
December 4, 2017 6:00 AM
which, if formatted as M/D/YYYY, "hides" the time, to display as:
12/4/2017
even though it is actually stored internally as:
43073.25
Given that Access field names can't contain a period (see above), Access becomes "confused" by the fraction of a day (.25).
Make sure your dates to be used as field names don't contain times.
You could:
Format the row that has the field names as text.
Right-click the row number and choose Format Cells.
Under the Number tab, choose Text
Use a function to remove the times:
If B1 contains a datetime you want to use as a field name in A1, you could use the Int function in cell A1 (to round the value down to a whole number):
=Int(B1)
The fraction (time) is removed but the value is still stored as a number/date.
Use a function to convert the datetime to text:
If B1 contains a date you want to use as a field name in A1, you could use the Text function in cell A1:
=Text(B1, "M/D/YYYY HH:MM")
As you can see in the image, Access allows me to use the dates as field names if they are properly formatted:
Related Further Reading:
TechRepublic: Techniques for successfully importing Excel data into Access
Office.com: Guidelines for naming fields, controls, and objects
ExcelTactics: The Definitive Guide to Using Dates and Times in Excel
Microsoft: How to use dates and times in Excel
Stack Overflow: MS Access - Date as Table Field Name
A note about Database Normalization:
Just because you can use dates as field names, that doesn't mean that you should. It is generally considered poor database design to have a field name so specific.
Perhaps your intention is to import the poorly-structured data into Access to fix this issue, but if not, you should consider storing the data in a more organized way that is conducive to database expansion and normalization.
If your data has date-specific field names:
...then the date should be added as part of the record, not as a field name:
...although this is still not normalized. Normalization is about optimizing efficiency and allowing for expansion, so perhaps the database could be set up more like:
With this method, database expansion and data analysis would be more logical (perhaps making it easier to find trends in Jane's troubling eating habits).
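As a rough illustration of moving the date out of the field name and into the record, here is a Python sketch. The sample row and the column names (Name, Date, Value) are invented for the example and are not the data from the original screenshots.

```python
def unpivot(rows, key="Name"):
    """Turn one column per date into one record per (key, date) pair."""
    records = []
    for row in rows:
        for column, value in row.items():
            if column == key:
                continue  # the key column identifies the record, it is not a date
            records.append({key: row[key], "Date": column, "Value": value})
    return records


# Hypothetical wide-format data with dates as field names.
wide = [{"Name": "Jane", "12/4/2017": 3, "12/5/2017": 5}]

for record in unpivot(wide):
    print(record)
```

Each new date then becomes a new row rather than a new field, so the table design never has to change.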
Alas, I digress. There is plenty of information available online about database normalization, to suit any experience level.
Further Reading about Normalization:
Wikipedia: Database Normalization
Microsoft: Description of Database Normalization Basics
ThoughtCo: Database Normalization Basics
Stack Overflow: Database normalization - who's right?
EDIT: (the result)
You didn't mention which method you're using to import the data from Excel to Access, which may be relevant (as there are several possible combinations). Access might handle the source data differently if your Excel data is saved in an XLSM vs XLS vs CSV, etc. Data could be imported using the New Source Data…from File interface, vs programmatically with VBA, or even other languages. Therefore, if you can't get one method working (with the dates formatted a specific way), try one of the other combinations.
For simplicity's sake, I used the built-in interface with an XLSM into an ACCDB. The result is demonstrated below:
Note that it worked even though I included times in the headers (and would work without times), since they are properly formatted as text, and First Column Contains Column Headers is selected.

How can I get the members of a PivotTable column field to display in natural order, rather than alphabetically?

I was taught how to convert a date-as-string value that was being converted from what I wanted ("Sep 2015", "Oct 2015" etc.) to what Excel thought it should be ("15-Sep", "15-Oct" etc.) here.
When it was displaying "badly," the columns at least displayed in the right order ("2015-Sep" followed by "2015-Oct"). Now that they are "Sep 15" and "Oct 15", though, they are displaying out of "natural" order and in alphabetical order ("Oct 15" followed by "Sep 15").
This is the too-typical scenario (especially egregiously evident in software development) of the solving of one problem causing another one to rear its ugly rear.
This is how I create the "month" part of the PivotTable:
var monthField = pvt.PivotFields("MonthYr");
monthField.Orientation = XlPivotFieldOrientation.xlColumnField;
Before fixing the display format problem:
After fixing the display format problem ("15-Sep" is now "Sep 15", etc., but the months are now out of order):
Can I "have my cake and eat it, too" so to speak? If so, how?
From reading the comments, it sounds like you need to convert your C# date to an excel date. In the second comment you mention that you were able to get values "201509" and "201510" onto an excel sheet.
I suggest you separate the year and month using the LEFT() and RIGHT() functions, then use the DATE() function to get the Excel serial number for 9/1/15 and 10/1/15.
Here's a screenshot of the steps I'm thinking of (Happy Halloween!):
Finally, you can now format the Serial number using your formula monthField.NumberFormat = "MMM yy". Excel will realize this is a date and sort it chronologically.
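The reason this works is that sorting happens on the underlying date value, not on the label. Here is a small Python sketch of the same idea, assuming the raw values are "YYYYMM" strings like the ones mentioned above; the function names are my own.

```python
from datetime import date

MONTH_ABBR = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def parse_yyyymm(value: str) -> date:
    """Split 'YYYYMM' into year and month; use the first day of the month."""
    return date(int(value[:4]), int(value[4:6]), 1)


def month_label(day: date) -> str:
    """Format like the 'MMM yy' number format, e.g. 'Sep 15'."""
    return f"{MONTH_ABBR[day.month - 1]} {day.year % 100:02d}"


raw = ["201510", "201509"]
by_label = sorted(month_label(parse_yyyymm(v)) for v in raw)
by_date = [month_label(d) for d in sorted(parse_yyyymm(v) for v in raw)]
print(by_label)  # ['Oct 15', 'Sep 15'], alphabetical
print(by_date)   # ['Sep 15', 'Oct 15'], chronological
```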

categorizing documents depending on their date fields

I've been stuck with an annoying problem for a while that I can't fix. All of the documents have a field that represents time: a date in the format dd.mm.yyyy.
What I'm trying to do is categorise them: show the documents that are due today, the ones that are due within the next 7 days, etc.
Here's the code (formula for the categorized field) that I have:
@If(@Today > pi_due_date; "Late docs"; @Today=pi_due_dat; "Todays docs";((pi_due_date - @Now)/86400)>0 &((pi_due_date - @Now)/86400)<7;"This weeks docs";"Future docs")
Everything was fine until today (after 12:00 PM), when I noticed that this part: @Today=pi_due_dat; "Todays docs"; does not work: it does not put the document in the "Todays docs" category. Much the same thing is happening with all the other categories, and I don't understand what is causing the problem.
pi_due_dat is missing the 'e' at the end.
Assuming it is more than that, though, you'll want to make sure that you are comparing only the dates and not a date/time.
Try @Date(pi_due_date) = @Today instead.
I would like to point out that using @Today or @Now in a view (selection criteria or column value) will create serious performance issues, as the view will be constantly re-indexed. It will affect all applications on that server as well.
You may want to rethink the design, perhaps with a scheduled nightly agent that sets a flag on the documents to indicate how they are being categorized.
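Whatever ends up setting the flag, the logic is the same: compare dates only, never date/times. Here is a sketch of that logic in Python rather than formula language, just to show the comparisons. The function name is made up, the category strings are copied from the question, and the due date is assumed to arrive as dd.mm.yyyy text.

```python
from datetime import date, datetime


def categorize(due_text: str, today: date) -> str:
    """Pick a category by comparing whole days, ignoring the time of day."""
    due = datetime.strptime(due_text, "%d.%m.%Y").date()
    days_left = (due - today).days
    if days_left < 0:
        return "Late docs"
    if days_left == 0:
        return "Todays docs"
    if days_left < 7:
        return "This weeks docs"
    return "Future docs"


today = date(2015, 6, 10)
for due in ("09.06.2015", "10.06.2015", "16.06.2015", "17.06.2015"):
    print(due, categorize(due, today))
```

Because both sides are plain dates, the result for a document due today does not change after noon.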

Resources