COUNTIFS is returning the wrong count in my sorted data:
POOJA YADAV
PRAKASH SINHA
PRATIBHA
PRATIBHA
PRATIBHA
PRATIBHA
PRATIBHA
PREETI PRAJAPATI
PREETI PRAJAPATI
PREETI PRAJAPATI
PREETI PRAJAPATI
PREETI PRAJAPATI
PREETI PRAJAPATI
PREETI PRAJAPATI
RAJENDRA SAHU
In the example above, the formula should return 5 when searching for 'PRATIBHA', but it shows 4.
If I change the name 'PRATIBHA' to 'PRATIBHA TANDI', it shows 5.
Why is this happening? The formula I am using is:
=COUNTIFS($N$2:$N$211,N166)
You could have an extra space in one cell: if you tell Excel to count "PRATIBHA" but one cell actually contains "PRATIBHA " (with a trailing space), COUNTIFS will return the 4 instead of the 5 you know are there. I'd check the data for inconsistencies like stray spaces.
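One quick way to test that theory is an array-style count that trims each cell before comparing; this is a sketch, not part of the original question:
=SUMPRODUCT(--(TRIM($N$2:$N$211)=TRIM(N166)))
If this returns 5 while the COUNTIFS returns 4, a stray space is almost certainly the cause.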
According to the API docs at https://github.com/GeneralMills/pytrends, interest_by_region has a city resolution. What I want is to get California and Texas with the keyword 'Corona Virus' from 15-Jan-2020 to 15-Feb-2020:
searchkey = ['Corona Virus']
city =['California','Texas']
region = pytrend.interest_by_region(resolution='CITY', inc_low_vol=True, inc_geo_code=True)
Then I receive the error below:
KeyError: "['geoCode'] not in index"
Any help, please?
Please follow the link below and make the changes to the lib\site-packages\pytrends\request.py file in your directory.
Link: KeyError: "['geoCode'] not in index" #316
It worked for me.
pytrend = TrendReq(hl='en-US', geo='US', tz=360)
data = pytrend.interest_by_region(resolution='CITY', inc_low_vol=True, inc_geo_code=False)
Note: once you change the code, keep inc_geo_code set to False.
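As an aside, if what you ultimately want is state-level figures for California and Texas, a sketch along these lines might work (this assumes the frame pytrends returns is indexed by region name, which is its usual behavior):
from pytrends.request import TrendReq

pytrend = TrendReq(hl='en-US', tz=360)
# restrict the payload to the keyword, timeframe and country of interest
pytrend.build_payload(['Corona Virus'], timeframe='2020-01-15 2020-02-15', geo='US')
states = pytrend.interest_by_region(resolution='REGION', inc_low_vol=True, inc_geo_code=False)
print(states.loc[['California', 'Texas']])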
I got an error in PySpark:
AnalysisException: u'Resolved attribute(s) week#5230 missing from
longitude#4976,address#4982,minute#4986,azimuth#4977,province#4979,
action_type#4972,user_id#4969,week#2548,month#4989,postcode#4983,location#4981
in operator !Aggregate [user_id#4969, week#5230], [user_id#4969,
week#5230, count(distinct day#4987) AS days_per_week#3605L].
Attribute(s) with the same name appear in the operation: week.
Please check if the right attribute(s) are used
This seems to come from a snippet of code where the agg function is used:
from pyspark.sql.functions import count, countDistinct

df_rs = (df_n.groupBy('user_id', 'week')
         .agg(countDistinct('day').alias('days_per_week'))
         .where('days_per_week >= 1')
         .groupBy('user_id')
         .agg(count('week').alias('weeks_per_user'))
         .where('weeks_per_user >= 5').cache())
However, I do not see the issue here, and I have previously used this code on the same data many times.
EDIT: I have been looking through the code, and this type of error seems to come from joins of this sort:
df = df1.join(df2, 'user_id', 'inner')
df3 = df4.join(df1, 'user_id', 'left_anti')
but I still have not solved the problem.
EDIT2: Unfortunately the suggested question is not similar to mine, as this is not a question of column-name ambiguity but of a missing attribute, and that attribute does not appear to be missing when I inspect the actual dataframes.
I faced the same problem and solved it by renaming the columns named in the "Resolved attribute(s) missing" message to some temporary name before the join. It's a workaround for me; I hope it helps you too. I don't know the real reason behind this issue, but it has been around since Spark 1.6: SPARK-10925.
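A minimal sketch of that workaround (df1, df2 and the week column are illustrative names, not taken from the original code):
df2_tmp = df2.withColumnRenamed('week', 'week_tmp')  # rename the clashing column on one side
joined = df1.join(df2_tmp, 'user_id', 'inner').withColumnRenamed('week_tmp', 'week')
Renaming forces Spark to create a fresh attribute, so the joined plan no longer refers to two different week attributes with the same name.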
I also faced this issue multiple times and came across this; it is mentioned there that this is a Spark-related bug.
Based on that article I came up with the code below, which resolved my issue.
The code can handle LEFT, RIGHT, INNER and OUTER joins, though an OUTER join works as FULL OUTER here.
def join_spark_dfs_sqlbased(sparkSession, left_table_sdf, right_table_sdf, common_join_cols_list=[], join_type="LEFT"):
    # Give every column a temporary suffix so the joined plan never
    # carries two attributes with the same name.
    temp_join_afix = "_tempjoincolrenames"
    join_type = join_type.upper()
    left = left_table_sdf.select(left_table_sdf.columns)
    right = right_table_sdf.select(right_table_sdf.columns)
    if len(common_join_cols_list) > 0:
        common_join_cols_list = [col + temp_join_afix for col in common_join_cols_list]
    else:
        # no join columns given: join on every column the two frames share
        common_join_cols_list = list(set(left.columns).intersection(right.columns))
        common_join_cols_list = [col + temp_join_afix for col in common_join_cols_list]
    for col in left.columns:
        left = left.withColumnRenamed(col, col + temp_join_afix)
    left.createOrReplaceTempView('left')
    for col in right.columns:
        right = right.withColumnRenamed(col, col + temp_join_afix)
    right.createOrReplaceTempView('right')
    non_common_cols_left_list = list(set(left.columns) - set(common_join_cols_list))
    non_common_cols_right_list = list(set(right.columns) - set(common_join_cols_list))
    # columns present in both frames but not joined on; keep them from
    # one side only so the SELECT list has no duplicate output names
    unidentified_common_cols = list(set(non_common_cols_left_list).intersection(non_common_cols_right_list))
    if join_type in ['LEFT', 'INNER', 'OUTER']:
        non_common_cols_right_list = list(set(non_common_cols_right_list) - set(unidentified_common_cols))
        common_join_cols_list_with_table = ['a.' + col + ' as ' + col for col in common_join_cols_list]
    else:
        non_common_cols_left_list = list(set(non_common_cols_left_list) - set(unidentified_common_cols))
        common_join_cols_list_with_table = ['b.' + col + ' as ' + col for col in common_join_cols_list]
    non_common_cols_left_list_with_table = ['a.' + col + ' as ' + col for col in non_common_cols_left_list]
    non_common_cols_right_list_with_table = ['b.' + col + ' as ' + col for col in non_common_cols_right_list]
    non_common_cols_list_with_table = non_common_cols_left_list_with_table + non_common_cols_right_list_with_table
    if join_type == "OUTER":
        join_type = "FULL OUTER"
    join_type = join_type + " JOIN"
    select_cols = common_join_cols_list_with_table + non_common_cols_list_with_table
    common_join_cols_list_with_table_join_query = ['a.' + col + ' = b.' + col for col in common_join_cols_list]
    query = ("SELECT " + ", ".join(select_cols) + " FROM left a " + join_type
             + " right b ON " + " AND ".join(common_join_cols_list_with_table_join_query))
    print("query:", query)
    joined_sdf = sparkSession.sql(query)
    # strip the temporary suffix from the result's column names
    for col in joined_sdf.columns:
        if temp_join_afix in col:
            joined_sdf = joined_sdf.withColumnRenamed(col, col.replace(temp_join_afix, ''))
    return joined_sdf
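For reference, a hypothetical call could look like this (spark, df1, df2 and user_id are placeholders):
joined = join_spark_dfs_sqlbased(spark, df1, df2, common_join_cols_list=['user_id'], join_type='LEFT')
Because the query runs against fresh temp views whose columns all carry a unique suffix, the resulting plan never contains two attributes with the same name.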
This is my code for a particular question on CodeChef:
n = int(input())
for _ in range(n):
    a, b = input().split()
    a = int(a)
    b = int(b)
    x = [int(q) for q in input().split()]
    x.sort(reverse=True)
    new_x = [item for pos, item in enumerate(x) if x.index(item) == pos]
    last = new_x[b]
    i = x.index(last)
    ans = i - 1 + x.count(last)
    print(ans)
It passes all the test cases when I run it in my own environment, but shows a runtime error (NZEC) when I submit the solution.
I searched a lot on the internet but was not able to figure out the problem.
Please help me out.
I have a problem with a function that iterates over an array. Here is my function:
def create_new_product():
    tree = ET.parse('products.xml')
    root = tree.getroot()
    array = []
    appointments = root.getchildren()
    for appointment in appointments:
        appt_children = appointment.getchildren()
        array.clear()
        for appt_child in appt_children:
            temp = appt_child.text
            array.append(temp)
        new_product = Product(
            product_name=array[0],
            product_desc=array[1]
        )
        new_product.save()
    return new_product
When I call the function, it saves 2 products into the database but gives an error on the third one. This is the error:
product_name = array[0],
IndexError: list index out of range
Here is the XML file as well; I only copied the first 3 products. There are almost 2,700 products in the full file.
<?xml version="1.0" encoding="UTF-8"?>
<Products>
    <Product>
        <product_name>Example 1</product_name>
        <product_desc>EX101</product_desc>
    </Product>
    <Product>
        <product_name>Example 2</product_name>
        <product_desc>EX102</product_desc>
    </Product>
    <Product>
        <product_name>Example 3</product_name>
    </Product>
</Products>
I don't understand why I am getting this error because it already works for the first two products in the xml file.
I have run a minimal version of your code on Python 3 (I assume it's 3, since you use array.clear()):
import xml.etree.ElementTree as ET

def create_new_product():
    tree = ET.parse('./products.xml')
    root = tree.getroot()
    array = []
    appointments = root.getchildren()  # deprecated; list(root) does the same
    for appointment in appointments:
        appt_children = appointment.getchildren()
        array.clear()
        # skip this element and log a warning
        if len(appt_children) != 2:
            print('Warning: skipping element that does not have exactly 2 children')
            continue
        for appt_child in appt_children:
            temp = appt_child.text
            array.append(temp)
        _arg = {
            'product_name': array[0],
            'product_desc': array[1]
        }
        print(_arg)

create_new_product()
Output:
{'product_name': 'Example 1', 'product_desc': 'EX101'}
{'product_name': 'Example 2', 'product_desc': 'EX102'}
Warning: skipping element that does not have exactly 2 children
Edit: OP has found that some products contain fewer children than expected, so I added a check on the number of child elements.
IndexError: list index out of range is only thrown when an index into a list doesn't exist, so it is array[0] that doesn't actually exist for that element. Maybe try posting your XML file and we'll see if there's an error there.
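As a variant, if you would rather keep products that are missing a <product_desc> instead of skipping them, a sketch using findtext with a default could look like this (Product is assumed to be the OP's model class):
import xml.etree.ElementTree as ET

tree = ET.parse('products.xml')
for product in tree.getroot():  # iterating an Element yields its children
    name = product.findtext('product_name')
    desc = product.findtext('product_desc', default='')  # '' when the tag is absent
    if name is not None:
        Product(product_name=name, product_desc=desc).save()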
I would like to write Groovy script results into a file on my Mac machine.
How do I do that in Groovy?
Here is my attempt:
log.info "Number of nodes:" + numElements
log.info ("Matching codes:"+matches)
log.info ("Fails:"+fails)
// Exporting results to a file
today = new Date()
sdf = new java.text.SimpleDateFormat("dd-MM-yyyy-hh-mm")
todayStr = sdf.format(today)
new File( "User/documents" + todayStr + "report.txt" ).write(numElements, "UTF-8" )
new File( "User/documents" + todayStr + "report.txt" ).write(matches, "UTF-8" )
new File( "User/documents" + todayStr + "report.txt" ).write(fails, "UTF-8" )
Can anyone help with the exporting-results part?
Thanks,
Regards,
A
OK, I've managed to create a file with
file = new File(dir + "${todayStr}_report.txt").createNewFile()
How do I add the numElements, matches and fails? Like this:
File.append(file, matches)?
I get the following error:
groovy.lang.MissingMethodException: No signature of method: static java.io.File.append() is applicable for argument types: (java.lang.Boolean, java.util.ArrayList) values: [true, [EU 4G FLAT L(CORPORATE) SP, EU 4G FLAT M(CORPORATE), ...]] Possible solutions: append(java.lang.Object, java.lang.String), append(java.io.InputStream), append(java.lang.Object), append([B), canRead(), find() error at line: 33
You have the wrong filepath. I don't have a Mac so I'm not 100% sure, but from my point of view you should have:
new File("/User/documents/${todayStr}report.txt").write(numElements, "UTF-8")
You are missing at least two slashes in your path, one before User and one after documents. With the approach you have now, it tries to save a file named documents<DATE>report.txt inside a relative User directory, which pretty surely does not exist.
Above I showed you the way with an absolute path. You can also write simply:
new File("${todayStr}report.txt").write(numElements, "UTF-8")
and if that file is created, then you'll be 100% sure the problem is with your filepath :)
A few more things: since this is Groovy, try to use the advantages the language has over Java; there are several ways of working with files. I've also rewritten your logs to show how simple it is to work with strings in Groovy:
log.info "Number of nodes: ${numElements}"
log.info "Matching codes: ${matches}"
log.info "Fails: ${fails}"
I hope it helps