I'm trying to write a subquery in Laravel 4.2 that extracts the data (the age) in the subquery and then builds table1 with the necessary age bands, but I can't get it right.
The query is this:
$var1= DB::table('table1')
->select(DB::raw("COUNT(*), CASE
WHEN age >=0 AND age <=20 THEN '0-20'
WHEN age >=21 AND age <=24 THEN '21-24'
WHEN age >=25 AND age <=29 THEN '25-29'
WHEN age >=30 AND age <=34 THEN '30-34'
WHEN age >=35 AND age <=39 THEN '35-39'
WHEN age >=40 AND age <=44 THEN '40-44'
WHEN age >=45 AND age <=49 THEN '45-49'
WHEN age >=50 AND age <=54 THEN '50-54'
WHEN age >=55 AND age <=59 THEN '55-59'
WHEN age >=60 AND age <=64 THEN '60-64'
WHEN age >=65 THEN '65+'
END AS age"),
function ($query){
$query->select(DB::raw("( SELECT DATE_FORMAT(NOW(),'%Y') - DATE_FORMAT(birthday,'%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') <
DATE_FORMAT(birthday, '00-%m-%d')) AS age FROM ...
) AS table1"));
})
->groupBy('age')
->get();
OK, I solved this using this link http://laravel.io/forum/12-23-2015-subquery-in-laravel-eloquent-or-query-builder and built my query accordingly. Here table1 is not a real table in the DB but a derived table created by the subquery. Here is the code:
$ageband = DB::table('table1')
->select(DB::raw("COUNT(*),
CASE
WHEN age >=0 AND age <=20 THEN '0-20'
WHEN age >=21 AND age <=24 THEN '21-24'
WHEN age >=25 AND age <=29 THEN '25-29'
WHEN age >=30 AND age <=34 THEN '30-34'
WHEN age >=35 AND age <=39 THEN '35-39'
WHEN age >=40 AND age <=44 THEN '40-44'
WHEN age >=45 AND age <=49 THEN '45-49'
WHEN age >=50 AND age <=54 THEN '50-54'
WHEN age >=55 AND age <=59 THEN '55-59'
WHEN age >=60 AND age <=64 THEN '60-64'
WHEN age >=65 THEN '65+'
END AS ageband"))
->from(DB::raw("( SELECT DATE_FORMAT(NOW(),'%Y') - DATE_FORMAT(birthday,'%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') <
DATE_FORMAT(birthday, '00-%m-%d')) AS age FROM health_clients
) as table1"))
->groupBy('ageband')
->get();
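To sanity-check the numbers the query returns, the same arithmetic can be sketched outside the database. This is only a Python illustration of the logic, not Laravel code: the birthdays and the fixed "today" are made-up values, and the band edges are copied from the CASE expression above.

```python
from collections import Counter
from datetime import date

# Band edges copied from the CASE expression above: (low, high, label).
BANDS = [(0, 20, "0-20"), (21, 24, "21-24"), (25, 29, "25-29"), (30, 34, "30-34"),
         (35, 39, "35-39"), (40, 44, "40-44"), (45, 49, "45-49"), (50, 54, "50-54"),
         (55, 59, "55-59"), (60, 64, "60-64")]

def age_on(birthday, today):
    """Year difference, minus 1 if this year's birthday has not happened yet."""
    not_yet = (today.month, today.day) < (birthday.month, birthday.day)
    return today.year - birthday.year - int(not_yet)

def age_band(age):
    """Return the band label, or None when no WHEN branch matches (SQL NULL)."""
    if age >= 65:
        return "65+"
    for low, high, label in BANDS:
        if low <= age <= high:
            return label
    return None

# Hypothetical birthdays standing in for health_clients.birthday.
today = date(2016, 1, 15)
birthdays = [date(1990, 1, 15), date(1990, 1, 16), date(1950, 6, 1), date(2000, 12, 31)]

# Equivalent of COUNT(*) ... GROUP BY ageband.
counts = Counter(age_band(age_on(b, today)) for b in birthdays)
print(counts)
```

The month/day tuple comparison plays the role of the DATE_FORMAT(..., '00-%m-%d') string comparison in the subquery.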
In this data set there are null values in the 'History' column, and the 'Age' column has three categories ('Old', 'Young', 'Middle').
I want to replace the null values in the History column with the mode of History for the matching Age category.
For example:
If Age is 'Old', I want to replace the null values of History with 'Low', because 'Low' is the mode of History when Age is 'Old'.
Likewise, when Age is 'Middle', I want to replace the null values of History with 'High', because the mode of History is 'High' when Age is 'Middle'.
And when Age is 'Young', I want to replace the null values of History with 'Low', because the mode of History is 'Low' when Age is 'Young'.
How can I do that?
df.head()
      Age  Gender OwnHome  Married Location  Salary  Children History  Catalogs  AmountSpent
0     Old  Female     Own   Single      Far   47500         0    High         6          755
1  Middle    Male    Rent   Single    Close   63600         0    High         6         1318
2   Young  Female    Rent   Single    Close   13500         0     Low        18          296
3  Middle    Male     Own  Married    Close   85600         1    High        18         2436
4  Middle  Female     Own   Single    Close   68400         0    High        12         1304
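One possible approach, sketched with pandas: group by Age and fill the missing History values of each group with that group's mode. The small frame below is made-up data holding only the two relevant columns, and fill_with_group_mode is a hypothetical helper name, not a pandas function.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the data set: only the two relevant columns.
df = pd.DataFrame({
    "Age": ["Old", "Old", "Old", "Middle", "Middle", "Middle", "Young", "Young", "Young"],
    "History": ["Low", "Low", np.nan, "High", np.nan, "High", "Low", np.nan, "Low"],
})

def fill_with_group_mode(series):
    """Fill missing values with the most frequent value of this group."""
    modes = series.mode()  # mode() ignores NaN by default
    if modes.empty:        # group with no known values: leave it untouched
        return series
    return series.fillna(modes.iloc[0])

# transform keeps the original row order and index.
df["History"] = df.groupby("Age")["History"].transform(fill_with_group_mode)
print(df)
```

If a group has several equally frequent values, mode() returns all of them in sorted order and this sketch simply takes the first one.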
I have a DataFrame with 4 attributes; it can be seen below.
What I want to do is take the name and age of each person and count the number of friends he has. Then, if two people have the same age but different names, take the average number of friends for that age. Finally, divide the age range into age groups and take the average per group. This is how I tried:
from collections import defaultdict
import datetime

# timing start (assumed to be set before this point in the original script)
start_time = datetime.datetime.now()

#loc the attribute or features of interest
friends = df.iloc[:,3]
ages = df.iloc[:,2]
# default of dictionary with age as key and value as a list of friends
dictionary_age_friends = defaultdict(list)
# populating the dictionary with key age and values friend
for i,j in zip(ages,friends):
    dictionary_age_friends[i].append(j)
print("first dict")
print(dictionary_age_friends)
#second dictionary, the same age is collected and the number of friends is added
set_dict ={}
for x in dictionary_age_friends:
    list_friends =[]
    for y in dictionary_age_friends[x]:
        list_friends.append(y)
    set_list_len = len(list_friends) # assign a friend with a number 1
    set_dict[x] = set_list_len
print(set_dict)
# set_dict ={}
# for x in dictionary_age_friends:
#     print("inside the loop")
#     lis_1 =[]
#     for y in dictionary_age_friends[x]:
#         lis_1.append(y)
#     set_list = lis_1
#     set_list = [1 for x in set_list] # assign a friend with a number 1
#     set_dict[x] = sum(set_list)
# a dictionary that assign the age range into age-groups
second_dict = defaultdict(list)
for i,j in set_dict.items():
    if i in range(16,20):
        i = 'teens_youthAdult'
        second_dict[i].append(j)
    elif i in range(20,40):
        i ="Adult"
        second_dict[i].append(j)
    elif i in range(40,60):
        i ="MiddleAge"
        second_dict[i].append(j)
    elif i in range(60,72):
        i = "old"
        second_dict[i].append(j)
print(second_dict)
print("final dict started")
new_dic ={}
for key,value in second_dict.items():
    if key == 'teens_youthAdult':
        new_dic[key] = round((sum(value)/len(value)),2)
    elif key =='Adult':
        new_dic[key] = round((sum(value)/len(value)),2)
    elif key =='MiddleAge' :
        new_dic[key] = round((sum(value)/len(value)),2)
    else:
        new_dic[key] = round((sum(value)/len(value)),2)
new_dic
end_time = datetime.datetime.now()
print(end_time-start_time)
print(new_dic)
Some of the feedback I got:
1. There is no need to build a list if you just want to count the number of friends.
2. Take two people with the same age, 18: one has 4 friends, the other 3. The current code concludes that the average is 7 friends.
3. The code is neither correct nor optimal.
Any suggestions or help? Thanks in advance.
I don't fully understand the attribute names, and you haven't mentioned which age groups you need to split your data into. In my answer I'll treat the data as if the attributes were:
index, name, age, friend
To find the number of friends per name, I would suggest using groupby.
input:
groups = df.groupby([df.iloc[:,0],df.iloc[:,1]]) # grouping by name(0), age(1)
amount_of_friends_df = groups.size() # gathering amount of friends for a person
print(amount_of_friends_df)
output:
name age
EUNK 25 1
FBFM 26 1
MYYD 30 1
OBBF 28 2
RJCW 25 1
RQTI 21 1
VLIP 16 1
ZCWQ 18 1
ZMQE 27 1
To find the number of friends by age, you can also use groupby:
input:
groups = df.groupby([df.iloc[:,1]]) # groups by age(1)
age_friends = groups.size()
age_friends=age_friends.reset_index()
age_friends.columns=(['age','amount_of_friends'])
print(age_friends)
output:
age amount_of_friends
0 16 1
1 18 1
2 21 1
3 25 2
4 26 1
5 27 1
6 28 2
7 30 1
To calculate the average number of friends per age group, you can use pd.cut categories together with groupby.
input:
mean_by_age_group_df = age_friends.groupby(pd.cut(age_friends.age,[20,40,60,72]))\
.agg({'amount_of_friends':'mean'})
print(mean_by_age_group_df)
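pd.cut returns a categorical Series, which we use to group the data; afterwards the agg function aggregates each group. Note that pd.cut intervals are right-closed by default and ages outside the bins (16 and 18 here) are dropped, so add a lower bin edge if you need them.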
output:
amount_of_friends
age
(20, 40] 1.333333
(40, 60] NaN
(60, 72] NaN
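For comparison, here is a plain-Python sketch of one reading of the question: count friends per person, average over the people who share an age, then average those per-age means inside each age group. The rows are made-up data in the (name, age, friend) layout assumed above, and the age ranges are the ones from the question's code.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows in the layout the answer assumes: (name, age, friend).
rows = [
    ("ANNA", 18, "f1"), ("ANNA", 18, "f2"), ("ANNA", 18, "f3"), ("ANNA", 18, "f4"),
    ("BEN", 18, "f1"), ("BEN", 18, "f2"), ("BEN", 18, "f3"),
    ("CARL", 25, "f1"), ("CARL", 25, "f2"),
    ("DORA", 45, "f1"),
]

# Age groups taken from the ranges in the question: [low, high) -> label.
AGE_GROUPS = [(16, 20, "teens_youthAdult"), (20, 40, "Adult"),
              (40, 60, "MiddleAge"), (60, 72, "old")]

def group_of(age):
    """Return the age-group label, or None if the age is outside every range."""
    for low, high, label in AGE_GROUPS:
        if low <= age < high:
            return label
    return None

# Step 1: friends per person; counting rows needs no intermediate list.
friends_per_person = Counter((name, age) for name, age, _ in rows)

# Step 2: average over the people who share an age.
per_age = defaultdict(list)
for (name, age), n in friends_per_person.items():
    per_age[age].append(n)
mean_per_age = {age: mean(ns) for age, ns in per_age.items()}

# Step 3: average the per-age means inside each age group.
per_group = defaultdict(list)
for age, m in mean_per_age.items():
    label = group_of(age)
    if label is not None:
        per_group[label].append(m)
mean_per_group = {label: round(mean(ms), 2) for label, ms in per_group.items()}
print(mean_per_group)
```

With this data the two 18-year-olds (4 and 3 friends) average to 3.5 rather than summing to 7, which is the case raised in the feedback.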
# ABC Inc., Gross Pay Calculator!
# Enter employee's name or 0 to quit : Nathan
# Enter hours worked : 35
# Enter employee's pay rate : 10.00
# Employee Name : Nathan
# Gross Pay: 350.0
# Enter next employee's name or 0 to quit : Toby
# Enter hours worked : 45
# Enter employee's pay rate : 10
# Employee Name : Toby
# Gross Pay : 475.0
# (overtime pay : 75.0 )
# Enter next employee's name or 0 to quit : 0
# Exiting program...
How do I make an input of 0 print "Exiting program..." and then exit?
print('ABC inc., Gross Pay Calculator!')
name = input("Enter employee's name or 0 to quit:")
hours = float(input("Enter hours worked:"))
payrate = float(input("Enter employee's pay rate:"))
print("Employee Name:", name)
grosspay = hours * payrate
print("Gross pay:", grosspay)
if hours > 40:
    print("(Overtime pay:", (hours - 40) * payrate * 1.5)
while name!=0:
    name = input("Enter next employee's name or 0 to quit:")
    hours = float(input("Enter hours worked:"))
    payrate = float(input("Enter employee's pay rate:"))
    print("Employee Name:", name)
    grosspay = hours * payrate
    print("Gross pay:", grosspay)
    if hours > 40:
        print("(Overtime pay:", (hours - 40) * payrate*1.5)
else:
    print ('Exiting program...')
Do not repeat code like that. Instead, use while True: with a conditional break at the appropriate place. This is the standard 'loop-and-a-half' idiom in Python.
print('ABC inc., Gross Pay Calculator!')
while True:
    name = input("Enter employee's name or 0 to quit:")
    if name == '0':
        print ('Exiting program...')
        break
    hours = float(input("Enter hours worked:"))
    payrate = float(input("Enter employee's pay rate:"))
    print("Employee Name:", name)
    grosspay = hours * payrate
    print("Gross pay:", grosspay)
    if hours > 40:
        print("(Overtime pay:", (hours - 40) * payrate * 1.5)
I would replace '0' in the prompt with 'nothing' and make the test if not name:, but this is a minor issue.
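A side note on the numbers: the sample session shows a gross pay of 475.0 for 45 hours at 10, which suggests the gross figure is meant to include the overtime premium, while hours * payrate gives 450.0. Below is a minimal sketch under that assumption; gross_pay is a hypothetical helper, and the 40-hour threshold and 1.5 multiplier are taken from the code above.

```python
def gross_pay(hours, rate, threshold=40.0, multiplier=1.5):
    """Return (gross, overtime); hours above threshold earn rate * multiplier."""
    overtime_hours = max(hours - threshold, 0.0)
    overtime = overtime_hours * rate * multiplier
    regular = (hours - overtime_hours) * rate
    return regular + overtime, overtime

# The two employees from the sample session in the question.
for name, hours, rate in [("Nathan", 35, 10.00), ("Toby", 45, 10)]:
    gross, overtime = gross_pay(hours, rate)
    print("Employee Name:", name)
    print("Gross Pay:", gross)
    if overtime:
        print("(overtime pay:", overtime, ")")
```

Inside the loop above, the grosspay line and the overtime print could call such a helper instead of computing hours * payrate directly.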
I have two text files, say file1 and file2, each with 4 columns:
Table name, column name, data type, length
File1 contains
Emp name varchar 30
Emp ID int 20
Emp age int 3
Emp no int 10
File2 contains
Emp name varchar 30
Emp ID varchar 20
Emp age int 5
Emp DOB. Date 10
I want to compare file1 with file2 based on columns 1 and 2 (table name, column name).
For rows where those two columns match, check whether the data type and length columns differ.
If there is a mismatch in data type or length,
display the corresponding rows from both files on the same line, like:
Emp age int 3 Emp age int 5
Also output the rows that are unique to file1, like
Emp emailid varchar 50
as the output of another command.
When I use the comm command to get the unique records of file1,
it gives output like
Emp ID int 20
Emp age int 3
Emp no int 10
But I want the output to be only
Emp no int 10
because the other rows' key columns (table name, column name) are also present in file2.
Based on your long description, I came up with this one-liner:
awk '{k=$1FS$2}NR==FNR{a[k]=$3FS$4;next}
k in a{if($3FS$4!=a[k])print $0,k,a[k];next}7' file2 file1
With your given input files:
kent$ head f1 f2
==> f1 <==
Emp name varchar 30
Emp ID int 20
Emp age int 3
Emp no int 10
==> f2 <==
Emp name varchar 30
Emp ID varchar 20
Emp age int 5
Emp DOB. Date 10
kent$ awk '{k=$1FS$2}NR==FNR{a[k]=$3FS$4;next}k in a{if($3FS$4!=a[k])print $0,k,a[k];next}7' f2 f1
Emp ID int 20 Emp ID varchar 20
Emp age int 3 Emp age int 5
Emp no int 10
Note that the comparison is case-sensitive, e.g. id and ID are different.
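If the awk is hard to read, the same logic can be sketched in Python. This is only an illustration, not a replacement for the one-liner: compare_schemas is a hypothetical name, and it assumes every line has exactly four whitespace-separated fields.

```python
def compare_schemas(file1_lines, file2_lines):
    """Mimic the awk one-liner: key = first two fields, value = last two."""
    seen = {}
    for line in file2_lines:
        table, column, dtype, length = line.split()
        seen[(table, column)] = (dtype, length)

    out = []
    for line in file1_lines:
        table, column, dtype, length = line.split()
        key = (table, column)
        if key not in seen:
            # Row exists only in file1: print it as is.
            out.append(line)
        elif seen[key] != (dtype, length):
            # Same key, different type or length: show both versions.
            out.append(" ".join([line, table, column, *seen[key]]))
    return out

# The sample files from the question.
file1 = ["Emp name varchar 30", "Emp ID int 20", "Emp age int 3", "Emp no int 10"]
file2 = ["Emp name varchar 30", "Emp ID varchar 20", "Emp age int 5", "Emp DOB. Date 10"]

for row in compare_schemas(file1, file2):
    print(row)
```

The dictionary plays the role of the awk array a, and the key tuple corresponds to k=$1FS$2.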
I have a column for age and a column for gender. How can I get the number of people with ages between 0-18 and with a gender of "male"?
For example, with the following data, the formula should return "1":
Name    Age  Gender
Rick    70   male
Morty   14   male
Summer  17   female
Beth    34   female
Jerry   35   male
Assuming your age data is in column A:A and gender in column B:B:
=COUNTIFS(A:A,"<=18",B:B,"male")
(this assumes all age values are >=0). Otherwise:
=COUNTIFS(A:A,"<=18",A:A,">=0",B:B,"male")
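To double-check what the formula should return, the same two conditions can be written out in a few lines of Python over the sample data from the question. This is just a cross-check of the logic, not part of the spreadsheet solution.

```python
# Sample rows from the question: (name, age, gender).
people = [("Rick", 70, "male"), ("Morty", 14, "male"), ("Summer", 17, "female"),
          ("Beth", 34, "female"), ("Jerry", 35, "male")]

# Both conditions must hold, as with the paired criteria in COUNTIFS.
# lower() mirrors the fact that COUNTIFS text criteria are case-insensitive.
count = sum(1 for _, age, gender in people
            if 0 <= age <= 18 and gender.lower() == "male")
print(count)  # 1 (only Morty matches)
```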