Replacing Null Values with the Mean Value of the Column in GridDB - Node.js

I am working with the GridDB Node.js connector. I know the query that returns the records/rows containing null values:
SELECT * FROM employees where employee_salary = NaN;
But I want to replace the null values in the column with the mean value of that column, in order to maintain data consistency for data analysis. How do I do that in GridDB?
The Employee table looks like the following:
employee_id | employee_salary | first_name | department
------------+-----------------+------------+-------------
0           |                 | John       | Sales
1           | 60000           | Lisa       | Development
2           | 45000           | Richard    | Sales
3           | 50000           | Lina       | Marketing
4           | 55000           | Anderson   | Development
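The replacement logic itself can be sketched independently of the database. The following is a minimal client-side illustration in Python, not GridDB connector code: the rows are copied from the table above, the function name fill_missing_with_mean is hypothetical, and it assumes the missing salary arrives as NaN (or None) once the rows have been fetched. Fetching the rows and writing the updated rows back through the connector is left out.

```python
import math

# Rows copied from the table in the question; NaN marks the missing salary.
employees = [
    {"employee_id": 0, "employee_salary": math.nan, "first_name": "John", "department": "Sales"},
    {"employee_id": 1, "employee_salary": 60000, "first_name": "Lisa", "department": "Development"},
    {"employee_id": 2, "employee_salary": 45000, "first_name": "Richard", "department": "Sales"},
    {"employee_id": 3, "employee_salary": 50000, "first_name": "Lina", "department": "Marketing"},
    {"employee_id": 4, "employee_salary": 55000, "first_name": "Anderson", "department": "Development"},
]


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_missing_with_mean(rows, column):
    """Return copies of rows with missing values in `column` replaced by the column mean.

    Hypothetical helper for illustration; the mean is taken over non-missing values only.
    """
    present = [row[column] for row in rows if not is_missing(row[column])]
    if not present:
        # Nothing to average, so leave the rows unchanged.
        return [dict(row) for row in rows]
    mean = sum(present) / len(present)
    return [
        {**row, column: mean} if is_missing(row[column]) else dict(row)
        for row in rows
    ]


filled = fill_missing_with_mean(employees, "employee_salary")
for row in filled:
    print(row["employee_id"], row["employee_salary"], row["first_name"])
```

The same three steps (collect the non-missing values, average them, overwrite the missing ones) should carry over to the Node.js connector, where each changed row would then be written back to the container.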

Related

Joining multiple Table using sequelize

I have 4 tables, t1, t2, t3 and t4, in a PostgreSQL database. Each table has 3 fields in common: "currency" (e.g. INR, DOLLAR), "debt" (e.g. 12006) and "credit" (e.g. 1000). Credit and debt are integers.
Each table contains multiple entries for every possible currency.
I want the sum of debt and credit for each currency across all 4 tables:
currency  debt   credit
INR       12006  1000
DOLLAR    50002  3012
yen       1234   12546
I'm using Sequelize, and there is no relation between any of the 4 tables. I'm only able to add up the credit and debt of 1 table at a time, using this:
var a = await asset.findAll({
  attributes: [
    'currency',
    [sequelize.fn('SUM', sequelize.col('credit')), 'credit'],
    [sequelize.fn('SUM', sequelize.col('debt')), 'debt']
  ],
  group: ['currency'],
});
Can someone please guide me on how to make a full outer join? A Sequelize solution is preferred, but a raw query would also work.
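Since the tables are unrelated and only need to be summed per currency, one possible raw-query approach is to stack them with UNION ALL and group the result, rather than joining. The sketch below demonstrates that query shape in Python using an in-memory SQLite database so it can be run anywhere; the table and column names come from the question, while the inserted rows are hypothetical sample data. The SQL is standard enough that it should also run on PostgreSQL, but that is an assumption, not something tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Four unrelated tables sharing the columns named in the question.
for name in ("t1", "t2", "t3", "t4"):
    cur.execute(f"CREATE TABLE {name} (currency TEXT, debt INTEGER, credit INTEGER)")

# Hypothetical sample rows, only there to make the query runnable.
cur.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [("INR", 100, 10), ("DOLLAR", 200, 20)])
cur.executemany("INSERT INTO t2 VALUES (?, ?, ?)", [("INR", 300, 30)])
cur.executemany("INSERT INTO t3 VALUES (?, ?, ?)", [("yen", 400, 40), ("INR", 5, 1)])
cur.executemany("INSERT INTO t4 VALUES (?, ?, ?)", [("DOLLAR", 50, 5)])

# Stack all rows with UNION ALL (which keeps duplicates), then sum per currency.
QUERY = """
SELECT currency, SUM(debt) AS debt, SUM(credit) AS credit
FROM (
    SELECT currency, debt, credit FROM t1
    UNION ALL
    SELECT currency, debt, credit FROM t2
    UNION ALL
    SELECT currency, debt, credit FROM t3
    UNION ALL
    SELECT currency, debt, credit FROM t4
) AS combined
GROUP BY currency
ORDER BY currency
"""

totals = cur.execute(QUERY).fetchall()
for currency, debt, credit in totals:
    print(currency, debt, credit)
```

UNION ALL is used instead of UNION because UNION would drop identical rows and undercount. From Sequelize, a query like this could be sent as a raw query through sequelize.query.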

Create a database from 2 distinct databases according to customer name in Python

Table A: Product Attributes
This table contains two columns; the first one is a unique product ID represented by an integer, the second is a string containing a collection of attributes assigned to that product.
product tags
100 chocolate, sprinkles
101 chocolate, sprinkles
102 glazed
Table B: Customer Attributes
The second table contains two columns as well; the first one is a string that contains a customer name, the second is an integer that contains a product number. The product IDs from column two are the same as the product IDs from column one of Table A.
customer product
A 100
A 101
B 101
C 100
C 102
B 101
A 100
C 102
Generated Table
I want to create a table matching this format, where the contents of the cells represent the count of occurrences of product attribute by customer.
customer chocolate sprinkles glazed
A ? ? ?
B ? ? ?
C ? ? ?
I want the counts in place of each ?, and I want to do this in Python.
One more question: If the two starting tables were in a relational database or Hadoop cluster and each had 100 million rows, how might my approach change?
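One way to produce the generated table is to split each product's tag string, then tally the tags of every purchased product per customer. The sketch below uses only the Python standard library and the data from Tables A and B above; the function name count_tags_by_customer and the variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Table A: product ID -> comma-separated attribute string.
product_tags = {
    100: "chocolate, sprinkles",
    101: "chocolate, sprinkles",
    102: "glazed",
}

# Table B: (customer, product ID) rows, in the order given in the question.
purchases = [
    ("A", 100), ("A", 101), ("B", 101), ("C", 100),
    ("C", 102), ("B", 101), ("A", 100), ("C", 102),
]


def count_tags_by_customer(product_tags, purchases):
    """Count how often each product attribute occurs among each customer's rows."""
    # Split the tag strings once so each purchase is a cheap lookup.
    tags_by_product = {
        product: [tag.strip() for tag in tags.split(",")]
        for product, tags in product_tags.items()
    }
    counts = defaultdict(Counter)
    for customer, product in purchases:
        counts[customer].update(tags_by_product[product])
    return counts


counts = count_tags_by_customer(product_tags, purchases)
columns = ["chocolate", "sprinkles", "glazed"]
table = {customer: [counts[customer][tag] for tag in columns] for customer in sorted(counts)}

print("customer", *columns)
for customer, row in table.items():
    print(customer, *row)
```

For tables with 100 million rows, holding everything in Python memory is unlikely to be practical; the same split, join and count steps would more plausibly be pushed into the database or the cluster's own query engine.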

Join two files in Pyspark without using sparksql/dataframes

I have two files, customer and sales, like below:
Customer :
cu_id name region city state
1 Rahul ME Vizag AP
2 Raghu SE HYD TS
3 Rohith ME BNLR KA
Sales:
sa_id sales country
2 100000 IND
3 230000 USA
4 240000 UK
Both files are \t delimited.
I want to join the two files on cu_id from customer and sa_id from sales using PySpark, without using Spark SQL/DataFrames.
Your help is very much appreciated.
You can definitely use the join methods that Spark offers for working with RDDs.
You can do something like:
customerRDD = sc.textFile("customers.tsv").map(lambda row: (row.split('\t')[0], "\t".join(row.split('\t')[1:])))
salesRDD = sc.textFile("sales.tsv").map(lambda row: (row.split('\t')[0], "\t".join(row.split('\t')[1:])))
joinedRDD = customerRDD.join(salesRDD)
You will get a new RDD that contains only the joined records, i.e. the rows whose key appears in both the customer and sales files. Each element has the form (key, (customer fields, sales fields)).
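To make the shape of the joined records concrete, here is a small plain-Python emulation of that key/value inner join. It does not use Spark: the helpers to_pair and inner_join are hypothetical stand-ins for the map and join steps. It also assumes the files hold only the data rows shown in the question, without header lines.

```python
# Data rows from the question, tab-delimited (header lines assumed absent).
customer_lines = [
    "1\tRahul\tME\tVizag\tAP",
    "2\tRaghu\tSE\tHYD\tTS",
    "3\tRohith\tME\tBNLR\tKA",
]
sales_lines = [
    "2\t100000\tIND",
    "3\t230000\tUSA",
    "4\t240000\tUK",
]


def to_pair(row):
    """Split a row into (first field, remaining fields), like the map step above."""
    key, _, rest = row.partition("\t")
    return key, rest


def inner_join(left, right):
    """Emulate a key/value inner join: keep only keys present on both sides."""
    right_by_key = {}
    for key, value in right:
        right_by_key.setdefault(key, []).append(value)
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_by_key.get(key, [])
    ]


customers = [to_pair(line) for line in customer_lines]
sales = [to_pair(line) for line in sales_lines]
joined = inner_join(customers, sales)

for key, (customer_fields, sales_fields) in joined:
    print(key, customer_fields.split("\t"), sales_fields.split("\t"))
```

Customer 1 and sale 4 have no match on the other side, so only keys 2 and 3 survive the join.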

Programmatically Get Count of Pivot Table Data

Suppose I have some sample data like so:
and I create a Pivot Table and order by Product > Sales Rep > Sales
If I want the # of sales John had for Product1 I would do =GETPIVOTDATA("Sales",$A$3,"Product","Product1","Sales Rep","John")
But how would I get a count of the entries? For example, for Product1, John had 4 and Kevin had 2.
Is it possible to get this count using GETPIVOTDATA()?
To get this, you need to add another Sales field to the values section and change the "Summarize By" setting to Count. Then the formula =GETPIVOTDATA("Count of Sales",$A$3,"Product","Product1") will get you the count for the product overall. Adding the "Sales Rep","John" pair, as in your own formula, should narrow the count to John.

PivotTable allows Skill filter and shows distribution of each Skill level

I have an Excel sheet set up with the first two columns holding a person's name and their ID. The rest of the columns are each titled with a skill, and the values in the table are the skill levels (0-4). So it looks like:
| Name | ID | Skill 1 | Skill 2|
| Jane | 01 | 3 | 4 |
I was wondering how I can use pivot tables so that I can select a "Skill" from a dropdown, with one column listing the levels 0, 1, 2, 3, 4 and the column next to it showing the count of how many people have each level for that skill.
Right now I have it working like that, but only for one skill. If I want to change to a different skill, I have to manually change the pivot table row label. I was hoping to just change it within the pivot table itself.
I could rearrange the data to make this work but I'm having trouble conceptualizing how the data should be organized for this.
Is this doable in Excel?
Normally for pivot tables you want the data in a format more like this:
Name  ID  Skill#  Skill Value
Jane  01  1       3
Jane  01  2       4
Then you would be able to show what you want in the pivot table. You could then use report filters or column labels (with filters) to only show skill# 1 or skill#2.
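The rearrangement described above (one row per person per skill) can be sketched in Python to make the target layout concrete. This is an illustration outside Excel: the function names unpivot and level_distribution are hypothetical, and the only data row is Jane's from the question.

```python
from collections import Counter

# The wide layout from the question: one column per skill.
wide_rows = [
    {"Name": "Jane", "ID": "01", "Skill 1": 3, "Skill 2": 4},
]


def unpivot(rows, id_columns=("Name", "ID")):
    """Turn one row per person into one row per person per skill."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_columns:
                continue
            long_row = {name: row[name] for name in id_columns}
            # "Skill 1" -> "1", matching the Skill# column in the answer.
            long_row["Skill#"] = column.split()[-1]
            long_row["Skill Value"] = value
            long_rows.append(long_row)
    return long_rows


def level_distribution(long_rows, skill):
    """Count how many people have each level (0-4) for the chosen skill."""
    counts = Counter(
        row["Skill Value"] for row in long_rows if row["Skill#"] == skill
    )
    return {level: counts[level] for level in range(5)}


long_rows = unpivot(wide_rows)
for row in long_rows:
    print(row["Name"], row["ID"], row["Skill#"], row["Skill Value"])
print(level_distribution(long_rows, "1"))
```

With the data in this long form, the skill number becomes an ordinary field that can be filtered, which is what lets the pivot table switch skills without changing its row label.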

Resources