How to update one grid with information from another grid - python-3.x

I have one grid called Grid 1, and I would like to pass its information to another grid called Grid M.
Grid M may or may not contain previous data, so I want to overwrite any previous values and leave only the new data. Please note that both sheets have the same structure in terms of column names and formats.
This is my code:
# Grid IDs
grid1 = 6975487445624708
grid2 = 7306936514307972
grid3 = 1060505730213764
gridM = 4175140851345284
# Read Sheets
readSheet_Grid1 = smart.Sheets.get_sheet(grid1)
readColumn_Grid1 = readSheet_Grid1.get_columns().data
readSheet_GridM = smart.Sheets.get_sheet(gridM)
readColumn_GridM = readSheet_GridM.get_columns().data
# Get Column ID from Grid M
columntoRead = []
for column in readColumn_Grid1:
    columntoRead.append(column.id)
print("Column IDs from Grid M: ", columntoRead)
# Get row id from Grid M
rowtoRead_GridM = []
for MyRow_GridM in readSheet_GridM.rows:
    rowtoRead_GridM.append(MyRow_GridM.id)
print("Row IDs from Grid M: ",rowtoRead_GridM)
# Get values from Grid 1
celltoRead_Grid1 = []
celltoRead_GridM = []
for MyRow_Grid1 in readSheet_Grid1.rows:
    for MyCell_Grid1 in MyRow_Grid1.cells:
        celltoRead_Grid1.append(MyCell_Grid1.value)
print("Values from Grid 1: ",celltoRead_Grid1)
# Build new cell value
new_cell = smartsheet.models.Cell()
new_cell.column_id = columntoRead
new_cell.value = celltoRead_Grid1
new_cell.strict = False
# Build the row to update
new_row = smartsheet.models.Row()
new_row.cells.append(new_cell)
print(new_cell)
print(new_row)
This is the output:
Column IDs from Grid M: [7236841595791236, 1607342061578116, 6110941688948612, 8503502613309316, 3999902985938820, 3859141875263364, 8362741502633860, 1044392108156804]
Row IDs from Grid M: [7323028036380548, 1693528502167428, 6197128129537924, 3945328315852676, 8448927943223172]
Values from Grid 1: [3240099.0, 'James', 'Hamilton', 'Male', 197556.0, 18.0, 'Bachelor', 'Medic', 9615534.0, 'Miranda', 'Montgomery', 'Female', 158585.0, 20.0, 'Primary', 'Historian', 9119102.0, 'Vincent', 'Wells', 'Male', 182392.0, 29.0, 'Lower secondary', 'Agronomist', 4533161.0, 'Alen', 'Murray', 'Male', 140853.0, 30.0, 'Doctoral', 'Carpenter', 1010718.0, 'Frederick', 'Farrell', 'Male', 140403.0, 29.0, 'Primary', 'Jeweller']
This is where I start to get lost. The error is shown below.
ValueError Traceback (most recent call last)
Input In [5], in <cell line: 42>()
40 # Build new cell value
41 new_cell = smartsheet.models.Cell()
---> 42 new_cell.column_id = columntoRead
43 new_cell.value = celltoRead_Grid1
44 new_cell.strict = False
File ~\anaconda3\lib\site-packages\smartsheet\models\cell.py:70, in Cell.__setattr__(self, key, value)
68 self.format_ = value
69 else:
---> 70 super(Cell, self).__setattr__(key, value)
File ~\anaconda3\lib\site-packages\smartsheet\models\cell.py:78, in Cell.column_id(self, value)
76 #column_id.setter
77 def column_id(self, value):
---> 78 self._column_id.value = value
File ~\anaconda3\lib\site-packages\smartsheet\types.py:165, in Number.value(self, value)
163 self._value = value
164 else:
--> 165 raise ValueError("`{0}` invalid type for Number value".format(value))
ValueError: `[7236841595791236, 1607342061578116, 6110941688948612, 8503502613309316, 3999902985938820, 3859141875263364, 8362741502633860, 1044392108156804]` invalid type for Number value
It looks like new_cell.column_id accepts only an integer, not a list. That makes me wonder: how do I let Smartsheet know that I wish to update multiple rows in Grid M using the values from Grid 1?
If I replace the list with a specific column ID, as in new_cell.column_id = 7236841595791236, this is the output:
{"columnId": 7236841595791236, "strict": false}
{"cells": [{"columnId": 7236841595791236, "strict": false}]}
This is the desired output in Grid M:
ID      | Name      | Last Name  | Gender | Salary | Age | Education       | Occupation
3240099 | James     | Hamilton   | Male   | 197556 | 18  | Bachelor        | Medic
9615534 | Miranda   | Montgomery | Female | 158585 | 20  | Primary         | Historian
9119102 | Vincent   | Wells      | Male   | 182392 | 29  | Lower secondary | Agronomist
4533161 | Alen      | Murray     | Male   | 140853 | 30  | Doctoral        | Carpenter
1010718 | Frederick | Farrell    | Male   | 140403 | 29  | Primary         | Jeweller

If I'm understanding your scenario correctly, the following things are true:
The structure of your source sheet and your destination sheet (number of columns, column types, column sequence) is identical.
Your objective is to delete ALL rows from the destination sheet and then copy all rows from the source sheet into the destination sheet.
You want the copied data to remain in the source sheet (i.e., you're copying rows from the source sheet into the destination sheet, not moving rows from the source sheet to the destination sheet).
The following code achieves the objective described above.
# specify source info
source_sheet_id = 5169244485773188
# specify destination info
destination_sheet_id = 2486208480733060
'''
STEP 1:
Get all rows from the source sheet and build list of Row IDs.
'''
sheet = smartsheet_client.Sheets.get_sheet(source_sheet_id)
# iterate through the rows array and build a list of row IDs
source_sheet_row_ids = []
for row in sheet.rows:
    source_sheet_row_ids.append(row.id)
'''
STEP 2:
Get all rows from the destination sheet and build list of Row IDs.
'''
sheet = smartsheet_client.Sheets.get_sheet(destination_sheet_id)
# iterate through the rows array and build a list of row IDs
destination_sheet_row_ids = []
for row in sheet.rows:
    destination_sheet_row_ids.append(row.id)
'''
STEP 3:
Delete ALL rows from the destination sheet (using Row IDs from STEP 2).
'''
response = smartsheet_client.Sheets.delete_rows(destination_sheet_id, destination_sheet_row_ids)
'''
STEP 4:
Copy all rows from the source sheet (using Row IDs from STEP 1) to the destination sheet.
'''
# copy rows from source sheet to (bottom of) destination sheet
# (include everything -- i.e., attachments, children, and discussions)
response = smartsheet_client.Sheets.copy_rows(
    source_sheet_id,
    smartsheet.models.CopyOrMoveRowDirective({
        'row_ids': source_sheet_row_ids,
        'to': smartsheet.models.CopyOrMoveRowDestination({
            'sheet_id': destination_sheet_id
        })
    }),
    'all'
)
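For a destination sheet with many rows, passing every row ID to a single delete call may run into request-size limits. A minimal sketch of splitting the ID list into batches is shown below. The helper name `chunked` and the batch size of 100 are assumptions made for illustration, not values taken from the Smartsheet documentation, and the row IDs are stand-ins. Each batch would then be passed to delete_rows in place of the full list in STEP 3.

```python
def chunked(ids, size=100):
    """Yield successive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Stand-in row IDs; in the code above this would be destination_sheet_row_ids.
destination_sheet_row_ids = list(range(1, 251))

# Hypothetical batch size of 100; adjust to whatever the API accepts in practice.
batches = list(chunked(destination_sheet_row_ids, size=100))
print([len(batch) for batch in batches])
```

An empty ID list yields no batches at all, so a loop over the batches also skips the delete call when the destination sheet has no rows.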
It's important to note that this code will delete ALL rows from the destination sheet each time it runs (immediately before it copies all rows from the source sheet into the destination sheet). If you intend for the destination sheet to be the home of data from multiple sheets at some point in the future, then you'll want to modify the code so that it only deletes rows that originated from the specified source sheet. One way to do this would be to:
Add a column to the beginning of the source sheet AND the destination sheet called Source Sheet ID.
In the first row of source sheet, populate this column (cell) with the value of that sheet's ID. In each subsequent row of the source sheet, populate this column (cell) with a formula that pulls the value from that cell in the first row (i.e., =[Source Sheet ID]$1). This will make it so that this cell within any new rows that are added later will automatically be populated with that same value.
You might consider locking this column by using the Smartsheet UI, so it won't be editable (by non-admin users).
Then in the section of code that builds up the list of destination_sheet_row_ids, add some conditional logic to only append the current row ID if the value of the Source Sheet ID column for that row matches your source sheet ID. That way only rows that originated from the specified source sheet will be deleted from the destination sheet -- any rows there that originated from another sheet will remain untouched.
If you choose to implement this approach -- adding the Source Sheet ID column (containing the ID of the source sheet) as the first column in both the source sheet and the destination sheet -- replace STEP 2 from the code sample above with the following code instead.
'''
STEP 2:
Get all rows from the destination sheet and build list of Row IDs.
'''
destination_sheet = smartsheet_client.Sheets.get_sheet(destination_sheet_id)
# iterate through the rows array and build a list of row IDs
destination_sheet_row_ids = []
for row in destination_sheet.rows:
    # only include Row IDs for rows that originated from the specified Source sheet
    if row.cells[0].value == source_sheet_id:
        destination_sheet_row_ids.append(row.id)
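Separately from the copy approach, the ValueError in the question arises because a cell takes exactly one column ID and one value, so the flat list of values has to be split into rows and each value paired with its column ID. The sketch below shows that pairing in plain Python, with no SDK calls. The helper name `rows_from_flat_values` and the dictionary layout (mirroring the columnId key seen in the question's printed output) are illustrative assumptions, and the data is shortened to two columns.

```python
def rows_from_flat_values(flat_values, column_ids):
    """Split a flat list of cell values into rows of {columnId, value} pairs."""
    width = len(column_ids)
    if width == 0:
        raise ValueError("column_ids must not be empty")
    if len(flat_values) % width != 0:
        raise ValueError("value count is not a multiple of the column count")
    rows = []
    for start in range(0, len(flat_values), width):
        row_values = flat_values[start:start + width]
        # One entry per cell: a single column ID paired with a single value.
        rows.append([
            {"columnId": column_id, "value": value}
            for column_id, value in zip(column_ids, row_values)
        ])
    return rows


# Shortened stand-in data based on the question's output (two columns only).
column_ids = [7236841595791236, 1607342061578116]
flat_values = [3240099.0, "James", 9615534.0, "Miranda"]

rows = rows_from_flat_values(flat_values, column_ids)
for row in rows:
    print(row)
```

With the SDK, each pair would presumably become one smartsheet.models.Cell and each inner list one smartsheet.models.Row, as in the question's own code.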

Related

I need to pull a complete row from one Excel sheet to another Excel sheet, based on a cell value

I have 2 Excel worksheets. In the first I have a table with columns named "Sales Order" and "SO Item" for each row (product), plus some other columns. In this table I concatenate "Sales Order" and "SO Item" so that I have Sales Order parents (xxxxxxx00) and also Sales Order children (xxxxxxx01, xxxxxxx02,...,xxxxxxx09). However, in the second worksheet I also have the "concatenation" column, but it only contains Sales Order parents. How can I pull the whole rows containing the children of each parent from worksheet 1 to worksheet 2?
I've tried to do it using VLOOKUP, but this only returns a single child value (xxxxxxx001) and it's not returning the whole row where this code is located.
Table 1 is:
Sales Order | SO Item | Concatenation | Material Description | Feas Plan Date
2503319449  | 100     | 2503319449100 | SYS-7210 SAS-Mxp     | Bundle Header
2503319449  | 101     | 2503319449101 | PS-7210 SAS-T/Mxp    | 1/31/2023
2503319449  | 102     | 2503319449102 | SYS-7210 SAS-Mxp2VDC | Global Allocation
2503319449  | 200     | 2503319449200 | OS-7210 SAS-Mxp      | 1/31/2023
Table 2 is:
Sales Order | SO Item | Concatenation | Material Description | Feas Plan Date
2503319449  | 100     | 2503319449100 | SYS-7210 SAS-Mxp     | Bundle Header
2503319449  | 200     | 2503319449200 | OS-7210 SAS-Mxp      | 1/31/2023
I want Table 2 to extract the missing "Concatenation" items from Table 1.
It is not clear from the question how the output should be presented. I assume Table 2 is your lookup table. With the sample data the result is the whole of Table 1, but I assume your real Table 1 has more data and you want to extract only the rows that match the lookup table. Given how the concatenation is built, only the SO Item column values are needed for the lookup. Put the following formula in G2:
=LET(tbA, A3:E4, tbB, A9:E12, soA, 1*INDEX(tbA,,2), soB, 1*INDEX(tbB,,2),
DROP(REDUCE("", soA, LAMBDA(ac,x, LET(f,
FILTER(tbB, (soB >= x) * (soB < x+100),""), IF(@f="", ac, VSTACK(ac,f))))),1))
Here is the output:
The condition:
IF(@f="", ac, VSTACK(ac,f))
It only prevents an empty FILTER result (f) from being stacked. It is not strictly necessary if you want to include the parent (condition soB >= x, as in the formula), but you need it if you want to exclude the parent (soB > x). See my answer to the question "how to transform a table in Excel from vertical to horizontal but with different length" for how to use the DROP/REDUCE/VSTACK pattern. I convert the SO Item column values to numbers (multiplying INDEX by 1) in case the input data is in text format; otherwise it is not necessary.
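For readers who find the REDUCE/VSTACK pattern hard to follow, the same selection rule can be sketched in Python: for each parent SO Item x in the lookup table, keep the rows of the full table whose SO Item satisfies x <= item < x + 100. The sample rows come from the question; the function name `rows_for_parents` and the tuple layout are assumptions made for illustration. Like the formula, the sketch matches on SO Item only and ignores the Sales Order column.

```python
def rows_for_parents(lookup_rows, all_rows, span=100):
    """Return rows of all_rows whose SO Item falls in [parent, parent + span)."""
    result = []
    for parent in lookup_rows:
        low = int(parent[1])  # SO Item of the parent, converted from text
        for row in all_rows:
            if low <= int(row[1]) < low + span:
                result.append(row)
    return result


# Columns: Sales Order, SO Item, Concatenation, Material Description, Feas Plan Date
table1 = [
    ("2503319449", "100", "2503319449100", "SYS-7210 SAS-Mxp", "Bundle Header"),
    ("2503319449", "101", "2503319449101", "PS-7210 SAS-T/Mxp", "1/31/2023"),
    ("2503319449", "102", "2503319449102", "SYS-7210 SAS-Mxp2VDC", "Global Allocation"),
    ("2503319449", "200", "2503319449200", "OS-7210 SAS-Mxp", "1/31/2023"),
]
table2 = [table1[0], table1[3]]  # the lookup table holds only the parents

selected = rows_for_parents(table2, table1)
for row in selected:
    print(row[2], row[3])
```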

python xlsxwriter extract value from cell

Is it possible to extract data that I've written to a xlsxwriter.worksheet?
import xlsxwriter
output = "test.xlsx"
workbook = xlsxwriter.Workbook(output)
worksheet = workbook.add_worksheet()
worksheet.write(0, 0, 'top left')
if conditional:
    worksheet.write(1, 1, 'bottom right')
for row in range(2):
    for col in range(2):
        # Now how can I check if a value was written at this coordinate?
        # something like worksheet.get_value_at_row_col(row, col)
        pass
workbook.close()
Is it possible to extract data that I've written to a xlsxwriter.worksheet?
Yes. Even though XlsxWriter is write only, it stores the table values in an internal structure and only writes them to file when workbook.close() is executed.
Every Worksheet has a table attribute. It is a dictionary, containing entries for all populated rows (row numbers starting at 0 are the keys). These entries are again dictionaries, containing entries for all populated cells within the row (column numbers starting at 0 are the keys).
Therefore, table[row][col] will give you the entry at the desired position (but only in case there is an entry, it will fail otherwise).
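Because a direct lookup fails for an unpopulated position, a helper that returns None instead can be convenient. The sketch below uses a plain nested dictionary as a stand-in for worksheet.table, so the sample entries and the helper name `get_entry` are assumptions for illustration only.

```python
def get_entry(table, row, col):
    """Return the entry at (row, col), or None if nothing was written there."""
    return table.get(row, {}).get(col)


# Stand-in for worksheet.table: row number -> {column number -> entry}
table = {
    0: {0: "entry at A1", 1: "entry at B1"},
    2: {1: "entry at B3"},
}

print(get_entry(table, 0, 1))  # populated position
print(get_entry(table, 1, 0))  # row 1 was never written
print(get_entry(table, 2, 0))  # row 2 exists, but column 0 is empty
```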
Note that these entries are still not the text, number or formula you are looking for, but named tuples, which also contain the cell format. You can type check the entries and extract the contents depending on their nature. Here are the possible outcomes of type(entry) and the fields of the named tuples that are accessible:
xlsxwriter.worksheet.cell_string_tuple: string, format
xlsxwriter.worksheet.cell_number_tuple: number, format
xlsxwriter.worksheet.cell_blank_tuple: format
xlsxwriter.worksheet.cell_boolean_tuple: boolean, format
xlsxwriter.worksheet.cell_formula_tuple: formula, format, value
xlsxwriter.worksheet.cell_arformula_tuple: formula, format, value, range
For numbers, booleans, and formulae, the contents can be accessed by reading the respective field of the named tuple.
For array formulae, the contents are only present in the upper left cell of the output range, while the rest of the cells are represented by number entries with 0 value.
For strings, the situation is more complicated, since Excel's storage concept has a shared string table, while the individual cell entries only point to an index of this table. The shared string table can be accessed as the str_table.string_table attribute of the worksheet. It is a dictionary, where the keys are strings and the values are the associated indices. In order to access the strings by index, you can generate a sorted list from the dictionary as follows:
shared_strings = sorted(worksheet.str_table.string_table, key=worksheet.str_table.string_table.get)
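To see what the sorted() call does, here is the same inversion applied to a small hand-written dictionary standing in for str_table.string_table. The sample strings and indices are made up for illustration.

```python
# Stand-in for worksheet.str_table.string_table: string -> index
string_table = {"more text": 1, "even more text": 2, "top left": 0}

# Sorting the keys by their index gives a list where position == index.
shared_strings = sorted(string_table, key=string_table.get)

print(shared_strings)
print(shared_strings[1])
```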
I expanded your example from above to include all the explained features. It now looks like this:
import xlsxwriter
output = "test.xlsx"
workbook = xlsxwriter.Workbook(output)
worksheet = workbook.add_worksheet()
worksheet.write(0, 0, 'top left')
worksheet.write(0, 1, 42)
worksheet.write(0, 2, None)
worksheet.write(2, 1, True)
worksheet.write(2, 2, '=SUM(X5:Y7)')
worksheet.write_array_formula(2,3,3,4, '{=TREND(X5:X7,Y5:Y7)}')
worksheet.write(4,0, 'more text')
worksheet.write(4,1, 'even more text')
worksheet.write(4,2, 'more text')
worksheet.write(4,3, 'more text')
for row in range(5):
    row_dict = worksheet.table.get(row, None)
    for col in range(5):
        if row_dict != None:
            col_entry = row_dict.get(col, None)
        else:
            col_entry = None
        print(row,col,col_entry)
shared_strings = sorted(worksheet.str_table.string_table, key=worksheet.str_table.string_table.get)
print()
if type(worksheet.table[0][0]) == xlsxwriter.worksheet.cell_string_tuple:
    print(shared_strings[worksheet.table[0][0].string])
# type checking omitted for the rest...
print(worksheet.table[0][1].number)
print(bool(worksheet.table[2][1].boolean))
print('='+worksheet.table[2][2].formula)
print('{='+worksheet.table[2][3].formula+'}')
workbook.close()
Is it possible to extract data that I've written to a xlsxwriter.worksheet?
No. XlsxWriter is write only. If you need to keep track of your data you will need to do it in your own code, outside of XlsxWriter.
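One way to keep track of the data in your own code is to record every value in a dictionary at the moment it is written. The sketch below is one possible arrangement, not part of XlsxWriter: the class name `TrackedSheet` is hypothetical, and the `FakeWorksheet` stand-in exists only so that the example runs without XlsxWriter installed. In real use, the object returned by workbook.add_worksheet() would be passed in instead.

```python
class TrackedSheet:
    """Forward writes to a worksheet-like object and remember each value."""

    def __init__(self, worksheet):
        self._worksheet = worksheet
        self._values = {}  # (row, col) -> last value written

    def write(self, row, col, value):
        self._values[(row, col)] = value
        return self._worksheet.write(row, col, value)

    def get(self, row, col, default=None):
        return self._values.get((row, col), default)


class FakeWorksheet:
    """Stand-in with the same write(row, col, value) shape; it stores nothing."""

    def write(self, row, col, value):
        return 0


sheet = TrackedSheet(FakeWorksheet())
sheet.write(0, 0, "top left")
sheet.write(1, 1, "bottom right")

for row in range(2):
    for col in range(2):
        print(row, col, sheet.get(row, col))
```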

Python - Error populating values to spreadsheet (using xlsxwriter)

I have pulled some data from a xml file that looks as below:
Parent
Child
Action
new
ID
54467
Type
None
Group
Name
None
ID
ab
COMMENTS
HTRER
REMARKS
LKO
CUSTOMER
HELLO
In the above sample, the first row represents the header and the row below it represents the corresponding value. I am trying to write these to a spreadsheet so that each header is written on the first row with the corresponding value in the row below. I am trying to do that using the code below, but some of the values do not get written.
Below is the code I am using to write it to a spreadsheet using xlsxwriter:
row = 0
col = 0
row1 = 1
col1 = 0
for elem in tree.iter():
    worksheet.write(row, col, elem.tag)
    for subelem in elem:
        worksheet.write(row1, col1, subelem.text)
    col +=1
Could anyone advise why the values are not written correctly, or whether the code above needs some changes? Thanks
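The loop above never advances col1, so every value is written to the same cell, which may explain the missing values. A minimal sketch of one possible fix is shown below, using only the standard library: it records one header and one value per element and advances the column once per element. The XML snippet is a made-up stand-in for the real file, and the `grid` dictionary stands in for the worksheet; with xlsxwriter, each grid assignment would be a worksheet.write(row, col, value) call.

```python
import xml.etree.ElementTree as ET

# Made-up XML standing in for the real file, which is not shown in the question.
xml_text = """
<Parent>
  <Child>
    <Action>new</Action>
    <ID>54467</ID>
  </Child>
  <COMMENTS>HTRER</COMMENTS>
</Parent>
"""

root = ET.fromstring(xml_text.strip())

grid = {}  # (row, col) -> value, standing in for the worksheet
col = 0
for elem in root.iter():
    # Container elements only hold whitespace, so their value becomes "".
    text = (elem.text or "").strip()
    grid[(0, col)] = elem.tag  # header row
    grid[(1, col)] = text      # value row, same column as its header
    col += 1

headers = [grid[(0, c)] for c in range(col)]
values = [grid[(1, c)] for c in range(col)]
print(headers)
print(values)
```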

Using xlrd to iterate through worksheets and workbooks

I am a total noob. I need to grab the same cell value from every other sheet (starting at the third) in a workbook and place them into another. I continue to get an IndexError: list index out of range. There are 20 sheets in the workbook. I have imported xlrd and xlwt.
Code:
sheet_id = 3
output = 0
cellval = enso.sheet_by_index(sheet_id).cell(20,2).value
sheet_cp = book_cp.get_sheet(output)
sheet_cp.write(1, 1, cellval)
book_cp.save(path)
for sheet_id in range(0,20):
    sheet_enso = enso_cp.get_sheet(sheet)
    sheet_cp = book_cp.get_sheet(output)
    sheet_cp.write(1, 1, cellval)
    sheet_id = sheet_id + 2
    output = output + 1
Your problem most probably lies here:
sheet_id = 3
cellval = enso.sheet_by_index(sheet_id).cell(20,2).value # row index 20, column index 2
Check the following:
1- Make sure that sheet_id=3 is what you want. Sheet indexes start at 0, so the 3rd sheet has index 2; index 3 is the 4th sheet.
2- Check that cell(20,2) exists in the selected sheet (cell(0,0) is the first cell).
Also, you don't need to define sheet_id before the loop.
Instead, let range() generate the indexes. With 20 sheets the valid indexes are 0 to 19, so range(2,20,2) starts at the 3rd sheet and visits every other sheet, where:
range([start], stop[, step])
start: Starting number of the sequence.
stop: Generate numbers up to, but not including this number.
step: Difference between each number in the sequence.
Reference: Python's range() Parameters
To get cellval from every sheet, read cellval inside the loop.
The final code could be:
output = 0
for sheet_id in range(2,20,2): # start at the 3rd sheet (index 2), take every other sheet, stop before index 20
    cellval = enso.sheet_by_index(sheet_id).cell(20,2).value # row index 20, column index 2
    #sheet_enso = enso_cp.get_sheet(sheet) # I don't know if you need that for something else
    sheet_cp = book_cp.get_sheet(output)
    sheet_cp.write(1, 1, cellval)
    output = output + 1
book_cp.save(path)
Again, check that cell(20,2) exists in all source sheets to avoid errors.
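To double-check which sheets a given range() visits before touching the workbook, the indexes can be printed on their own. This sketch assumes a workbook with 20 sheets, as stated in the question, and reads "every other sheet, starting at the third" as a step of 2 from index 2; the variable names are illustrative.

```python
sheet_count = 20  # number of sheets stated in the question
first_index = 2   # the 3rd sheet, because indexes start at 0

# Every other sheet from the 3rd onwards; stop is exclusive, so index 20 is never used.
indexes = list(range(first_index, sheet_count, 2))
print(indexes)

# Every index must be a valid sheet index, otherwise sheet_by_index would fail.
assert all(0 <= i < sheet_count for i in indexes)
```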

Generating test data in Excel for an EAV table

This is a pretty complicated question so be prepared! I want to generate some test data in excel for my EAV table. The columns I have are:
user_id, attribute, value
Each user_id will repeat for a random number of times between 1-4, and for each entry I want to pick a random attribute from a list, and then a random value which this can take on. Lastly I want the attributes for each id entry to be unique i.e. I do not want more than one entry with the same id and attribute. Below is an example of what I mean:
user_id attribute value
100001 gender male
100001 religion jewish
100001 university imperial
100002 gender female
100002 course physics
Possible values:
attribute   | value
gender      | male
            | female
course      | maths
            | physics
            | chemistry
university  | imperial
            | cambridge
            | oxford
            | ucl
religion    | jewish
            | hindu
            | christian
            | muslim
Sorry that the table above messed up. I don't know how to paste into here while retaining the structure! Hopefully you can see what I'm talking about otherwise I can get a screenshot.
How can I do this? In the past I have generated random data using a random number generator and a VLOOKUP but this is a bit out of my league.
My approach is to create a table with all four attributes for each ID and then filter that table randomly to get between one and four filtered rows per ID. I assigned a random value to each attribute. The basic setup looks like this:
To the left is the randomized EAV table and to the right is the lookup table used for the randomized values. Here are the formulas. Enter them and copy down:
Column A - Establishes one random number per block of four rows (i.e. per ID). This determines the attribute that must be selected:
=IF(COUNTIF(C$2:C2,C2)=1,RANDBETWEEN(1,4),A1)
Column B - Uses the value in A to determine whether the row is included:
=IF(COUNTIF(C$2:C2,C2)=A2,TRUE,RANDBETWEEN(0,1)=1)
Column C - Creates the IDs, starting with 100,001:
=(INT((ROW()-2)/4)+100000)+1
Column D - Repeats the four attributes:
=CHOOSE(MOD(ROW()-2,4)+1,"gender","course","university","religion")
Column E - Finds the first occurrence of the Column D attribute in the lookup table and selects a randomly offset value:
=INDEX($H$2:$H$14,(MATCH(D2,$G$2:$G$14,0))+RANDBETWEEN(0,COUNTIF($G$2:$G$14,D2)-1))
When you filter on the TRUEs in Column B you'll get your list of one to four Attributes per ID. Disconcertingly, the filtering forces a recalculation, so the filtered list will no longer say TRUE for every cell in column B.
If this were mine I'd automate it a little more, perhaps by putting the "magic number" 4 (the count of attributes) in its own cell.
There are a number of ways to do this. You could use either perl or python. Both have modules for working with spreadsheets. In this case, I used python and the openpyxl module.
# File: datagen.py
# Usage: datagen.py <excel (.xlsx) filename to store data>
# Example: datagen.py myfile.xlsx
import sys
import random
from openpyxl import Workbook
# verify that user specified an argument
if len(sys.argv) < 2:
    print("Specify an excel filename to save the data, e.g myfile.xlsx")
    sys.exit(1)
# get the excel workbook and worksheet objects
wb = Workbook()
ws = wb.active
# Modify this line to specify the range of user ids
ids = range(100001, 100100)
# data structure for the attributes and values
data = { 'gender': ['male', 'female'],
         'course': ['maths', 'physics', 'chemistry'],
         'university': ['imperial', 'cambridge', 'oxford', 'ucl'],
         'religion': ['jewish', 'hindu', 'christian', 'muslim']}
# Write column headers in the spreadsheet
ws.cell(row=1, column=1, value='user_id')
ws.cell(row=1, column=2, value='attribute')
ws.cell(row=1, column=3, value='value')
row = 1
# Loop through each user id
for user_id in ids:
    # randomly select how many attributes to use
    attr_cnt = random.randint(1, 4)
    # copy the keys into a list so that selected attributes can be removed
    attributes = list(data.keys())
    for idx in range(attr_cnt):
        # randomly select attribute
        attr = random.choice(attributes)
        # remove the selected attribute from further selection for this user id
        attributes.remove(attr)
        # randomly select a value for the attribute
        value = random.choice(data[attr])
        row = row + 1
        # write the values for the current row in the spreadsheet
        ws.cell(row=row, column=1, value=user_id)
        ws.cell(row=row, column=2, value=attr)
        ws.cell(row=row, column=3, value=value)
# save the spreadsheet using the filename specified on the cmd line
wb.save(filename=sys.argv[1])
print("Done!")
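A more compact variant of the selection logic is sketched here without openpyxl: random.sample draws the attributes without replacement, which guarantees that no attribute repeats for a user. The function name `generate_rows` and the optional `seed` parameter are assumptions added for illustration; the resulting tuples could then be written to a sheet row by row.

```python
import random

DATA = {
    "gender": ["male", "female"],
    "course": ["maths", "physics", "chemistry"],
    "university": ["imperial", "cambridge", "oxford", "ucl"],
    "religion": ["jewish", "hindu", "christian", "muslim"],
}


def generate_rows(user_ids, data, seed=None):
    """Return (user_id, attribute, value) tuples with unique attributes per user."""
    rng = random.Random(seed)  # a seed makes the output repeatable for testing
    rows = []
    for user_id in user_ids:
        count = rng.randint(1, len(data))
        # sample() picks `count` distinct attributes, so no attribute repeats.
        for attr in rng.sample(sorted(data), count):
            rows.append((user_id, attr, rng.choice(data[attr])))
    return rows


rows = generate_rows(range(100001, 100004), DATA, seed=1)
for row in rows:
    print(row)
```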
