How to sum map values based on two keys? - groovy

I have a map with two keys (customer and price) like the one shown below:
[
customer: ['Clinton', 'Clinton', 'Mark', 'Antony', 'Clinton', 'Mark'],
price: [15000.0, 27000.0, 28000.0, 56000.0, 21000.0, 61000.0]
]
customer and price values are mapped by their index position, i.e. the first name in the customer list maps to the first price, and so on.
Example
Clinton price 15000.0
Clinton price 27000.0
Mark price 28000.0
Antony price 56000.0
Clinton price 21000.0
Mark price 61000.0
I would like to sum up price grouped by names. Expected output:
Clinton price 63000
Mark price 89000
Antony price 56000
Are there any built-in functions to achieve this in Groovy, or do I need to iterate over the map and sum values by writing my own functions?

You can start with a transpose on both lists to get tuples of customer and price. From there it's basically like the other answers (group by customer, then build a map of customer to summed prices). E.g.:
def data = [
customer:['Clinton', 'Clinton', 'Mark', 'Antony', 'Clinton', 'Mark'],
price:[15000.0, 27000.0, 28000.0, 56000.0, 21000.0, 61000.0]
]
println(
[data.customer, data.price].transpose().groupBy{ c, p -> c }.collectEntries{ c, ps -> [c, ps*.last().sum() ] }
)
// => [Clinton:63000.0, Mark:89000.0, Antony:56000.0]

For problems like this, it helps to model every entry as a separate object in a list; that makes the data more intuitive and easier to manipulate later.
With that structure, the same result follows naturally:
def list = [
[customer: 'Clinton', price: 15000.0],
[customer: 'Clinton', price: 27000.0],
[customer: 'Mark', price: 28000.0],
[customer: 'Antony', price: 56000.0],
[customer: 'Clinton', price: 21000.0],
[customer: 'Mark', price: 61000.0]
]
def map = list.groupBy({it.customer}).collectEntries {k, v -> [k, v.price.sum()]}
map.each {println it}

The following creates a map of prices aggregated by customer:
def vals = (0..(-1+map.customer.size()))
.collect{['name':map.customer[it], 'price': map.price[it]]}
.groupBy{it['name']}
.collectEntries{[(it.key): it.value.collect{it['price']}.sum()]}
That results in:
[Clinton:63000.0, Mark:89000.0, Antony:56000.0]
It's essentially an iteration using a range of numbers from 0 to map.customer.size() - 1, followed by a group-by with sum of values.

This version is derived from @cfrick's answer, with an alternative to the summation. Using a map with default values isn't as "functional"/declarative, but IMHO the code is arguably easier to grok later (i.e. during maintenance):
Given:
def map = [
customer: ['Clinton', 'Clinton', 'Mark', 'Antony', 'Clinton', 'Mark'],
price: [15000.0, 27000.0, 28000.0, 56000.0, 21000.0, 61000.0]
]
Approach:
def getSumMap = { data ->
def result = [:].withDefault { c -> 0.0 }
[data.customer, data.price].transpose().each { c, p -> result[c] += p }
result
}
assert 63000.0 == getSumMap(map)['Clinton']
assert 89000.0 == getSumMap(map)['Mark']
assert 56000.0 == getSumMap(map)['Antony']

Related

Create a string with an offset

I would like to build a String in which each value starts at a fixed offset.
Example:
ID(0) Name(10) Lastname(20) City(30)
For example:
1         Chris     Smith     Paris
I have found
StringBuffer.putAt(IntRange range, Object value)
or similar, but I don't want to specify a range, just the index at which to start.
StringBufferWriter.write(String text, int offset, int length)
I have also found StringBufferWriter, but I'm not sure whether the codehaus package is an official package I can use.
Any suggestions what to use here?
You can use String.padRight to achieve this effect:
def users = [
[id: 1, name: 'Chris', lastname: 'Smith', city:'Paris'],
[id: 2, name: 'Tim', lastname: 'Yates', city:'Manchester'],
]
users.each { user ->
println "${user.id.toString().padRight(10)}${user.name.padRight(10)}${user.lastname.padRight(20)}$user.city"
}
Which prints:
1         Chris     Smith               Paris
2         Tim       Yates               Manchester

Python Iterating through List of List

Here's my code:
stockList = [
['AMD', '57.00', '56.23', '58.40', '56.51'],
['AMZN', '3,138.29', '3,111.03', '3242.56689', '3,126.58'],
['ATVI', '80.76', '79.16', '81.86', '79.55'],
['BA', '178.63', '168.86', '176.96', '169.70'],
['BAC', '24.42', '23.43', '23.95', '23.54'],
['DAL', '26.43', '25.53', '26.87', '25.66'],
['FB', '241.75', '240.00', '248.06', '241.20'],
['GE', '7.04', '6.76', '6.95', '6.79'],
['GOOGL', '1,555.92', '1,536.36', '1,576.03', '1,544.04'],
['GPS', '12.77', '12.04', '12.72', '12.10'],
['GRUB', '70.96', '69.71', '70.65', '70.06'],
['HD', '262.42', '258.72', '261.81', '260.01'],
['LUV', '33.62', '32.45', '33.53', '32.61'],
['MSFT', '208.75', '206.72', '213.58', '207.76'],
['MU', '51.52', '50.49', '52.31', '50.74'],
['NFLX', '490.10', '492.26', '511.52', '494.72', 'SUCCESS'],
['PCG', '9.49', '8.96', '9.52', '9.01'],
['PFE', '36.69', '35.87', '37.02', '36.05'],
['QQQ', '264.00', '263.27', '267.11', '264.58', 'SUCCESS'],
['ROKU', '153.36', '148.37', '153.70', '149.11'],
['SHOP', '952.83', '976.45', '1,036.25', '981.33', 'SUCCESS'],
['SPY', '325.01', '323.64', '325.47', '325.25', 'SUCCESS'],
['SQ', '126.99', '125.13', '130.80', '125.76'],
['T', '30.25', '29.58', '30.07', '29.73'],
['TSLA', '1,568.36', '1,646.56', '1,712.58', '1,654.79', 'SUCCESS'],
['TTWO', '153.06', '152.45', '154.47', '153.22', 'SUCCESS'],
['TWTR', '37.01', '36.03246', '36.7210083', '36.21'],
['WFC', '26.20', '24.45272', '25.0438213', '24.57'],
['WMT', '132.33', '130.8515', '132.522049', '131.51']
]
keyword = 'SUCCESS'
secondList = []
for item in stockList:
    if item[4] == keyword:
        secondList.append(stockList[0])
print(secondList)
My use case is to go through this list of lists, find which inner lists contain the keyword, and collect the first item of each of those lists. I can get it working with a single list, but not with a list of lists.
On top of that, how would I go through a dictionary containing lists?
{
'majorDimension': 'ROWS',
'range': 'Sheet1!A2:F30',
'values': [
['AMD', '57.00', '56.23', '58.40', '56.51'],
['AMZN', '3,138.29', '3,111.03', '3242.56689', '3,126.58'],
['ATVI', '80.76', '79.16', '81.86', '79.55'],
['BA', '178.63', '168.86', '176.96', '169.70'],
['BAC', '24.42', '23.43', '23.95', '23.54'],
['DAL', '26.43', '25.53', '26.87', '25.66'],
['FB', '241.75', '240.00', '248.06', '241.20'],
['GE', '7.04', '6.76', '6.95', '6.79'],
['GOOGL', '1,555.92', '1,536.36', '1,576.03', '1,544.04'],
['GPS', '12.77', '12.04', '12.72', '12.10'],
['GRUB', '70.96', '69.71', '70.65', '70.06'],
['HD', '262.42', '258.72', '261.81', '260.01'],
['LUV', '33.62', '32.45', '33.53', '32.61'],
['MSFT', '208.75', '206.72', '213.58', '207.76'],
['MU', '51.52', '50.49', '52.31', '50.74'],
['NFLX', '490.10', '492.26', '511.52', '494.72', 'SUCCESS'],
['PCG', '9.49', '8.96', '9.52', '9.01'],
['PFE', '36.69', '35.87', '37.02', '36.05'],
['QQQ', '264.00', '263.27', '267.11', '264.58', 'SUCCESS'],
['ROKU', '153.36', '148.37', '153.70', '149.11'],
['SHOP', '952.83', '976.45', '1,036.25', '981.33', 'SUCCESS'],
['SPY', '325.01', '323.64', '325.47', '325.25', 'SUCCESS'],
['SQ', '126.99', '125.13', '130.80', '125.76'],
['T', '30.25', '29.58', '30.07', '29.73'],
['TSLA', '1,568.36', '1,646.56', '1,712.58', '1,654.79', 'SUCCESS'],
['TTWO', '153.06', '152.45', '154.47', '153.22', 'SUCCESS'],
['TWTR', '37.01', '36.03246', '36.7210083', '36.21'],
['WFC', '26.20', '24.45272', '25.0438213', '24.57'],
['WMT', '132.33', '130.8515', '132.522049', '131.51'],
]
}
List comprehension makes this pretty simple. Try the following:
keyword = "SUCCESS"
# PEP8 calls for lower_underscore_case here
second_list = [i[0] for i in stockList if keyword in i]
print(second_list)
For the proposed dictionary structure, you'd just access the key containing the list, since not every value in that dict is a list:
second_list = [i[0] for i in stockList["values"] if keyword in i]
As I understand it, your question has two parts:
How to iterate over a list of lists, take the first item of each matching nested list, and store it in another list
How to iterate over a dictionary item to perform the same operation
If my understanding is right, then you might want to check this out.
Please note: I have not used the variable keyword but the literal "SUCCESS"; just replace "SUCCESS" with keyword in the code, and you are good to go.
1. FIRST SOLUTION
secondList = []
# to get nested list
for item in stockList:
    # this checks whether SUCCESS is present inside a list
    # python way of doing it
    if "SUCCESS" in item:
        secondList.append(item[0])
print(secondList)
# OUTPUT
# >>> ['NFLX', 'QQQ', 'SHOP', 'SPY', 'TSLA', 'TTWO']
OR
You can do this in a more Pythonic way by using a list comprehension:
# single line approach, getting the same result
secondList = [item[0] for item in stockList if "SUCCESS" in item]
print(secondList)
# OUTPUT
# >>> ['NFLX', 'QQQ', 'SHOP', 'SPY', 'TSLA', 'TTWO']
2. SECOND SOLUTION
To get the result, first assign the dictionary to a variable; in my case, I have assigned it to a variable called stockListDictionary.
secondList = []
# to get a value from a key specifically,
# like any dictionary key: dictionary["key_name"]
for item in stockListDictionary["values"]:
    if "SUCCESS" in item:
        secondList.append(item[0])
print(secondList)
# OUTPUT
# >>> ['NFLX', 'QQQ', 'SHOP', 'SPY', 'TSLA', 'TTWO']
OR
Using List Comprehension
secondList = [item[0] for item in stockListDictionary["values"] if "SUCCESS" in item]
print(secondList)
# OUTPUT
# >>> ['NFLX', 'QQQ', 'SHOP', 'SPY', 'TSLA', 'TTWO']
What about something like this?
keywords = {"SUCCESS"}
d = stockListDictionary  # the dictionary from the question, under the name used in the answer above
second_list = []
for stock_info in d["values"]:
    stock_ticker = stock_info[0]
    # the intersection is non-empty when the row contains at least one keyword
    if set(stock_info[1:]) & keywords:
        second_list.append(stock_ticker)
Is this better? It should allow you to have more than one keyword.
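For what it's worth, here is a small self-contained sketch of the multi-keyword idea, using a shortened copy of the data. The names rows and tickers_with_keywords are made up for this illustration, and 'FAILED' is a hypothetical second keyword that does not appear in the question.

```python
# Shortened sample in the same shape as the "values" list from the question.
rows = [
    ['AMD', '57.00', '56.23', '58.40', '56.51'],
    ['NFLX', '490.10', '492.26', '511.52', '494.72', 'SUCCESS'],
    ['QQQ', '264.00', '263.27', '267.11', '264.58', 'SUCCESS'],
    ['WMT', '132.33', '130.8515', '132.522049', '131.51'],
]


def tickers_with_keywords(rows, keywords):
    """Return the first item of every row that contains any of the keywords."""
    keywords = set(keywords)
    # isdisjoint() is False as soon as the row shares one element with keywords
    return [row[0] for row in rows if not keywords.isdisjoint(row[1:])]


# 'FAILED' is a hypothetical extra keyword, only here to show several at once
print(tickers_with_keywords(rows, {'SUCCESS', 'FAILED'}))
```

Skipping the ticker itself with row[1:] avoids a false match if a ticker symbol ever equals one of the keywords.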

Getting IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

I'm getting this IndexError when I run the function below, which finds the average rent price for the zipcodes of a city. I have a dictionary called city with the zipcode as the key and the city name as the value; some cities have multiple zipcodes. arrRent is an array of lists of house rents for the city, and I want to find the average rent price.
def meanPrice(self, city):
    total = 0
    cityLoc = 0
    for keys, cities in self.city.items():
        if self.city[keys] == city:
            for i in self.arrRent[cityLoc]:
                total += int(self.arrRent[cityLoc][i])
            mean = total / i
            print(total / i)
        else:
            cityLoc += 1
Here's a snippet of the dictionary:
{'95129': 'San Jose'}
{'95128': 'San Jose'}
{'95054': 'Santa Clara'}
{'95051': 'Santa Clara'}
{'95050': 'Santa Clara'}
and here's a snippet of the array:
[['2659' '2623.5' '2749.5' '2826.5' '2775' '2795' '2810' '2845' '2827'
'2847' '2854' '2897.5' '2905' '2925' '2902.5' '2869.5']
['3342.5' '3386' '3385' '3353' '3300' '3190' '3087.5' '3092' '3170'
'3225' '3340' '3315' '3396' '3470' '3480' '3380']
['2996' '2989' '2953' '2950' '2884.5' '2829' '2785' '2908' '2850' '2761'
'2997.5' '3020' '2952' '2997.5' '2952' '2923.5']
['2804.5' '2850.5' '2850' '2850' '2867' '2940' '2905' '2945' '2938'
'2860' '2884' '2946' '2938' '2986.5' '2931.5' '3032.5']
['2800' '3074' '2950' '2850' '2850' '2875' '2757' '2716' '2738.5' '2696'
'2809' '2891' '3000' '2960' '2950' '2831']]
I see 2 issues:
Issue in your list 'arrRent': It is supposed to be a list of lists, containing rents. But the different rents are not separated by comma in your list:
[['2659' '2623.5' '2749.5' '2826.5' '2775' '2795' '2810' '2845' '2827'
'2847' '2854' '2897.5' '2905' '2925' '2902.5' '2869.5']
['3342.5' '3386' '3385' '3353' '3300' .......
The issue in your code seems to be in this block. Here i is the actual rent, not the index:
for i in self.arrRent[cityLoc]:
    total += int(self.arrRent[cityLoc][i])
Change it to this:
for i in self.arrRent[cityLoc]:
    total += float(i)  # float rather than int, because int('2623.5') raises ValueError
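To put the fix in context, here is a minimal standalone sketch of the whole calculation. It is not the asker's class: mean_price, city_by_zip and rent_rows are names invented for the example, plain lists stand in for the array, and it assumes (as the original loop appears to) that the i-th zipcode in the dictionary corresponds to the i-th row of rents.

```python
def mean_price(city_by_zip, rent_rows, city):
    """Average all rents in the rows whose zipcode belongs to `city`."""
    rents = []
    # ASSUMPTION: the i-th zipcode in the dict corresponds to the i-th rent row
    for (zipcode, name), row in zip(city_by_zip.items(), rent_rows):
        if name == city:
            # iterate over the values themselves, not indices;
            # float() is used because values like '2623.5' are not valid ints
            rents.extend(float(value) for value in row)
    # divide by the number of rents collected; None if the city is unknown
    return sum(rents) / len(rents) if rents else None


# Shortened sample data in the shape shown in the question
city_by_zip = {'95129': 'San Jose', '95128': 'San Jose', '95054': 'Santa Clara'}
rent_rows = [
    ['2659', '2623.5', '2749.5'],
    ['3342.5', '3386', '3385'],
    ['2996', '2989', '2953'],
]
print(mean_price(city_by_zip, rent_rows, 'San Jose'))
```

Note that the mean is computed by dividing by the count of values, whereas the original code divides by i, which is the last rent string.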

Text file to CSV conversion

I have a text file with content like this:
Name: Aar saa
Last Name: sh
DOB: 1997-03-22
Phone: 1212222
Graduation: B.Tech
Specialization: CSE
Graduation Pass Out: 2019
Graduation Percentage: 60
Higher Secondary Percentage: 65
Higher Secondary School Name: Guru Nanak Dev University,amritsar
City: hyd
Venue Details: CMR College of Engineering & Technology (CMRCET) Medchal Road, TS - 501401
Name: bfdg df
Last Name: df
DOB: 2005-12-16
Phone: 2222222
Graduation: B.Tech
Specialization: EEE
Graduation Pass Out: 2018
Graduation Percentage: 45
Higher Secondary Percentage: 45
Higher Secondary School Name: asddasd
City: vjd
Venue Details: Prasad V. Potluri Siddhartha Institute Of Technology, Kanuru, AP - 520007
Name: cc dd ee
Last Name: ee
DOB: 1995-07-28
Phone: 444444444
Graduation: B.Tech
Specialization: ECE
Graduation Pass Out: 2019
Graduation Percentage: 75
Higher Secondary Percentage: 93
Higher Secondary School Name: Sasi institute of technology and engineering
City: hyd
Venue Details: CMR College of Engineering & Technology (CMRCET) Medchal Road, TS - 501401
I want to convert it to a CSV file with these headers:
['Name', 'Last Name','DOB', 'Phone', 'Graduation','Specialization','Graduation Pass Out','Higher Secondary School Name','City','Venue Details']
and, as values, everything after the ':' on each line.
I have done something like this:
import csv

def parselines(lines, writer):
    data = []
    for line in lines.split('\n'):
        # keep everything after the first ": "
        Name = line.split(": ", 1)[1]
        data.append(Name)
    writer.writerow(data)

writer = csv.writer(open('result.csv', 'a'))
writer.writerow(['Name', 'Last Name','DOB', 'Phone', 'Graduation','Specialization','Graduation Pass Out','Graduation Percentage','Higher Secondary Percentage','Higher Secondary School Name','City','Venue Details'])
with open('Name2.txt') as f:
    text = f.read()
    myarray = text.split("\n\n")
    for text1 in myarray:
        parselines(text1, writer)
It worked, but a more efficient way would be much appreciated.
This algorithm works (a kind of state machine):
If the line is blank, start a new row
Otherwise, add to the current row, collecting all headers and fields
def parselines(lines):
    header = []
    csvrows = [{}]
    for line in lines:
        line = line.strip()
        if not line:
            csvrows.append({})  # new row, in dict form
        else:
            field, data = line.split(":", 1)
            csvrows[-1][field] = data.strip()  # drop the space after the colon
            if field not in header:
                header.append(field)
    # format CSV
    print(",".join(header))
    for row in csvrows:
        print(",".join(row.get(h, "") for h in header))
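One thing to watch with joining on "," by hand: some values in the sample contain commas themselves (e.g. "Guru Nanak Dev University,amritsar"), which would shift the columns. As a hedged variation on the same state-machine idea, the sketch below lets csv.DictWriter handle the quoting. The function name records_to_csv and the shortened sample text are made up for illustration, and it assumes records are separated by blank lines.

```python
import csv
import io


def records_to_csv(text, out):
    """Write blank-line-separated 'Field: value' records to `out` as CSV."""
    header, rows = [], [{}]
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if rows[-1]:  # start a new record only after a non-empty one
                rows.append({})
            continue
        # split on the first colon only, so values may contain ':' themselves
        field, _, value = line.partition(":")
        field = field.strip()
        rows[-1][field] = value.strip()
        if field not in header:
            header.append(field)
    # DictWriter quotes values containing commas and fills missing fields
    writer = csv.DictWriter(out, fieldnames=header, restval="")
    writer.writeheader()
    writer.writerows(row for row in rows if row)


# Shortened, hypothetical sample in the same format as the question's file
sample = """Name: Aar saa
City: hyd
Higher Secondary School Name: Guru Nanak Dev University,amritsar

Name: bfdg df
City: vjd
"""
buffer = io.StringIO()
records_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to a real file instead of the in-memory buffer, pass a file opened with open('result.csv', 'w', newline='').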

How could I make a list of lists from Neo4j records

I am having trouble building a list inside a loop. I am trying to populate a list from Neo4j records:
myquery="""MATCH (c :Customer {walletId:$item})-[:MR|:SENDS_MONEY]-(d)-[:PAYS]->(m)
WHERE NOT (c)-[]-(m)
RETURN c.walletId, m.walletId, m.name, COUNT(m.name) ORDER BY COUNT(m.name) DESC LIMIT 30"""
result=graphdbsessionwallet.run(myquery,item=item)
#print(result)
for record in result:
    print(list(record))
My current result is:
['01302268120', '01685676658', 'Shojon Medical Hall', 6]
['01302268216', '01733243988', 'APEXFOOTWEAR LIMITED', 1]
and so on.
Desired:
[['01302268120', '01685676658', 'Shojon Medical Hall', 6],['01302268216', '01733243988', 'APEXFOOTWEAR LIMITED', 1]]
I want to put these lists into one list. Kindly help me solve this.
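You can modify your query to return the list with the help of the COLLECT aggregating function. Note that the LIMIT has to be applied before collecting; after COLLECT there is only a single row left, so a trailing LIMIT would no longer cap the list:
MATCH (c :Customer {walletId:$item})-[:MR|:SENDS_MONEY]-(d)-[:PAYS]->(m)
WHERE NOT (c)-[]-(m)
WITH c, m, COUNT(m.name) AS cnt
ORDER BY cnt DESC
LIMIT 30
RETURN COLLECT([c.walletId, m.walletId, m.name, cnt])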
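If you would rather leave the query unchanged, the rows can also be gathered on the Python side. The sketch below uses plain tuples as stand-ins for the driver's records, so it runs without a database; it assumes only that list(record) works on each record, as the loop in the question already shows.

```python
# Stand-in for the driver result: any iterable of records works here.
# ASSUMPTION: each record can be turned into a list with list(record),
# as the question's own loop already does.
result = [
    ('01302268120', '01685676658', 'Shojon Medical Hall', 6),
    ('01302268216', '01733243988', 'APEXFOOTWEAR LIMITED', 1),
]

# Build the outer list in one pass instead of printing each record
rows = [list(record) for record in result]
print(rows)
```

With the real result object, the comprehension line stays the same and only the stand-in assignment is dropped.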
