I have 2 Excel worksheets. The first contains a table with the columns "Sales Order" and "SO Item" for each row (product), plus some other columns. In this table I concatenate "Sales Order" and "SO Item", so that I have Sales Order parents (xxxxxxx00) and also Sales Order children (xxxxxxx01, xxxxxxx02,...,xxxxxxx09). The second worksheet also has the "Concatenation" column, but it contains only Sales Order parents. How can I pull the whole rows containing the children of each parent from worksheet 1 into worksheet 2?
I've tried to do it using VLOOKUP, but this only returns a single child value (xxxxxxx001), and it does not return the whole row where this code is located.
Table 1 is:
Sales Order  SO Item  Concatenation  Material Description  Feas Plan Date
2503319449   100      2503319449100  SYS-7210 SAS-Mxp      Bundle Header
2503319449   101      2503319449101  PS-7210 SAS-T/Mxp     1/31/2023
2503319449   102      2503319449102  SYS-7210 SAS-Mxp2VDC  Global Allocation
2503319449   200      2503319449200  OS-7210 SAS-Mxp       1/31/2023
Table 2 is:
Sales Order  SO Item  Concatenation  Material Description  Feas Plan Date
2503319449   100      2503319449100  SYS-7210 SAS-Mxp      Bundle Header
2503319449   200      2503319449200  OS-7210 SAS-Mxp       1/31/2023
I want Table 2 to extract the missing "Concatenation" items from Table 1.
It is not clear from the question how the output should be presented. I assume Table2 is your lookup table. With the sample data the result is the entire Table1, but I assume your real Table1 has more data and you want to extract only the rows that match the lookup table. Given the way you construct the concatenation, only the SO Item column values are needed for the lookup. Put the following formula in G2:
=LET(tbA, A3:E4, tbB, A9:E12, soA, 1*INDEX(tbA,,2), soB, 1*INDEX(tbB,,2),
DROP(REDUCE("", soA, LAMBDA(ac,x, LET(f,
FILTER(tbB, (soB >= x) * (soB < x+100),""), IF(@f="", ac, VSTACK(ac,f))))),1))
Here is the output:
The condition:
IF(@f="", ac, VSTACK(ac,f))
only guards against an empty FILTER result (f). It is not strictly necessary if you want to include the parent (condition soB >= x, as in the formula), but you need it if you want to exclude the parent (soB > x). See my answer to the question "how to transform a table in Excel from vertical to horizontal but with different length" for how to use the DROP/REDUCE/VSTACK pattern. I multiply INDEX by 1 to convert the SO Item values to numbers in case the input data is stored as text; otherwise it is not necessary.
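For readers who want to check the selection logic outside Excel, here is a minimal Python sketch of the same range test: for each parent SO Item x, keep the rows whose SO Item falls in [x, x+100). The function name rows_for_parents and its parameters item_index and block are hypothetical names used only for this illustration; the sample rows are copied from Table 1 in the question. Like the formula, it compares only SO Item and ignores the Sales Order number.

```python
# Sample rows copied from Table 1 in the question:
# (Sales Order, SO Item, Concatenation, Material Description, Feas Plan Date)
table1 = [
    ("2503319449", "100", "2503319449100", "SYS-7210 SAS-Mxp", "Bundle Header"),
    ("2503319449", "101", "2503319449101", "PS-7210 SAS-T/Mxp", "1/31/2023"),
    ("2503319449", "102", "2503319449102", "SYS-7210 SAS-Mxp2VDC", "Global Allocation"),
    ("2503319449", "200", "2503319449200", "OS-7210 SAS-Mxp", "1/31/2023"),
]


def rows_for_parents(parent_items, rows, item_index=1, block=100):
    """Return every row whose SO Item lies in [parent, parent + block).

    Hypothetical helper: mirrors (soB >= x) * (soB < x+100) from the formula.
    """
    selected = []
    for parent in parent_items:
        low = int(parent)  # same role as multiplying INDEX by 1 in the formula
        for row in rows:
            if low <= int(row[item_index]) < low + block:
                selected.append(row)
    return selected


# Parents taken from the SO Item column of Table 2
result = rows_for_parents(["100", "200"], table1)
for row in result:
    print(row)
```

Changing the lower bound to a strict comparison would exclude the parent rows, which corresponds to the soB > x variant mentioned above.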
I have one grid called Grid 1, I would like to pass the information to another grid called Grid M.
This Grid M may or may not contain previous data, so what I want is to overwrite these previous values and just leave the new data. Please note that both sheets have the same structure when it comes to column name and their formats.
This is my code:
# Grid IDs
grid1 = 6975487445624708
grid2 = 7306936514307972
grid3 = 1060505730213764
gridM = 4175140851345284
# Read Sheets
readSheet_Grid1 = smart.Sheets.get_sheet(grid1)
readColumn_Grid1 = readSheet_Grid1.get_columns().data
readSheet_GridM = smart.Sheets.get_sheet(gridM)
readColumn_GridM = readSheet_GridM.get_columns().data
# Get Column ID from Grid M
columntoRead = []
for column in readColumn_Grid1:
    columntoRead.append(column.id)
print("Column IDs from Grid M: ", columntoRead)
# Get row id from Grid M
rowtoRead_GridM = []
for MyRow_GridM in readSheet_GridM.rows:
    rowtoRead_GridM.append(MyRow_GridM.id)
print("Row IDs from Grid M: ",rowtoRead_GridM)
# Get values from Grid 1
celltoRead_Grid1 = []
celltoRead_GridM = []
for MyRow_Grid1 in readSheet_Grid1.rows:
    for MyCell_Grid1 in MyRow_Grid1.cells:
        celltoRead_Grid1.append(MyCell_Grid1.value)
print("Values from Grid 1: ",celltoRead_Grid1)
# Build new cell value
new_cell = smartsheet.models.Cell()
new_cell.column_id = columntoRead
new_cell.value = celltoRead_Grid1
new_cell.strict = False
# Build the row to update
new_row = smartsheet.models.Row()
new_row.cells.append(new_cell)
print(new_cell)
print(new_row)
This is the output:
Column IDs from Grid M: [7236841595791236, 1607342061578116, 6110941688948612, 8503502613309316, 3999902985938820, 3859141875263364, 8362741502633860, 1044392108156804]
Row IDs from Grid M: [7323028036380548, 1693528502167428, 6197128129537924, 3945328315852676, 8448927943223172]
Values from Grid 1: [3240099.0, 'James', 'Hamilton', 'Male', 197556.0, 18.0, 'Bachelor', 'Medic', 9615534.0, 'Miranda', 'Montgomery', 'Female', 158585.0, 20.0, 'Primary', 'Historian', 9119102.0, 'Vincent', 'Wells', 'Male', 182392.0, 29.0, 'Lower secondary', 'Agronomist', 4533161.0, 'Alen', 'Murray', 'Male', 140853.0, 30.0, 'Doctoral', 'Carpenter', 1010718.0, 'Frederick', 'Farrell', 'Male', 140403.0, 29.0, 'Primary', 'Jeweller']
This is where I start to get lost. The error is shown below.
ValueError Traceback (most recent call last)
Input In [5], in <cell line: 42>()
40 # Build new cell value
41 new_cell = smartsheet.models.Cell()
---> 42 new_cell.column_id = columntoRead
43 new_cell.value = celltoRead_Grid1
44 new_cell.strict = False
File ~\anaconda3\lib\site-packages\smartsheet\models\cell.py:70, in Cell.__setattr__(self, key, value)
68 self.format_ = value
69 else:
---> 70 super(Cell, self).__setattr__(key, value)
File ~\anaconda3\lib\site-packages\smartsheet\models\cell.py:78, in Cell.column_id(self, value)
76 @column_id.setter
77 def column_id(self, value):
---> 78 self._column_id.value = value
File ~\anaconda3\lib\site-packages\smartsheet\types.py:165, in Number.value(self, value)
163 self._value = value
164 else:
--> 165 raise ValueError("`{0}` invalid type for Number value".format(value))
ValueError: `[7236841595791236, 1607342061578116, 6110941688948612, 8503502613309316, 3999902985938820, 3859141875263364, 8362741502633860, 1044392108156804]` invalid type for Number value
It looks like new_cell.column_id only accepts an integer, not a list. That makes me wonder: how do I tell Smartsheet that I want to update multiple rows in Grid M using the values from Grid 1?
If I replace the list with a specific Column ID, like in this code, new_cell.column_id = 7236841595791236 this is the output:
{"columnId": 7236841595791236, "strict": false}
{"cells": [{"columnId": 7236841595791236, "strict": false}]}
This is the desired output in Grid M:
ID       Name       Last Name   Gender  Salary  Age  Education        Occupation
3240099  James      Hamilton    Male    197556  18   Bachelor         Medic
9615534  Miranda    Montgomery  Female  158585  20   Primary          Historian
9119102  Vincent    Wells       Male    182392  29   Lower secondary  Agronomist
4533161  Alen       Murray      Male    140853  30   Doctoral         Carpenter
1010718  Frederick  Farrell     Male    140403  29   Primary          Jeweller
If I'm understanding your scenario correctly, the following things are true:
The structure of your source sheet and your destination sheet (number of columns, column types, column sequence) is identical.
Your objective is to delete ALL rows from the destination sheet and then copy all rows from the source sheet into the destination sheet.
You want the copied data to remain in the source sheet (i.e., you're copying rows from the source sheet into the destination sheet, not moving rows from the source sheet to the destination sheet).
The following code achieves the objective described above.
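import smartsheet

# assumption: the API access token is available in the
# SMARTSHEET_ACCESS_TOKEN environment variable
smartsheet_client = smartsheet.Smartsheet()

# specify source info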
source_sheet_id = 5169244485773188
# specify destination info
destination_sheet_id = 2486208480733060
'''
STEP 1:
Get all rows from the source sheet and build list of Row IDs.
'''
sheet = smartsheet_client.Sheets.get_sheet(source_sheet_id)
# iterate through the rows array and build a list of row IDs
source_sheet_row_ids = []
for row in sheet.rows:
    source_sheet_row_ids.append(row.id)
'''
STEP 2:
Get all rows from the destination sheet and build list of Row IDs.
'''
sheet = smartsheet_client.Sheets.get_sheet(destination_sheet_id)
# iterate through the rows array and build a list of row IDs
destination_sheet_row_ids = []
for row in sheet.rows:
    destination_sheet_row_ids.append(row.id)
'''
STEP 3:
Delete ALL rows from the destination sheet (using Row IDs from STEP 2).
'''
response = smartsheet_client.Sheets.delete_rows(destination_sheet_id, destination_sheet_row_ids)
'''
STEP 4:
Copy all rows from the source sheet (using Row IDs from STEP 1) to the destination sheet.
'''
# copy rows from source sheet to (bottom of) destination sheet
# (include everything -- i.e., attachments, children, and discussions)
response = smartsheet_client.Sheets.copy_rows(
    source_sheet_id,
    smartsheet.models.CopyOrMoveRowDirective({
        'row_ids': source_sheet_row_ids,
        'to': smartsheet.models.CopyOrMoveRowDestination({
            'sheet_id': destination_sheet_id
        })
    }),
    'all'
)
It's important to note that this code deletes ALL rows from the destination sheet each time it runs (immediately before it copies all rows from the source sheet into the destination sheet). If you intend the destination sheet to hold data from multiple sheets at some point in the future, you'll want to modify the code so that it only deletes rows that originated from the specified source sheet. One way to do this would be to:
Add a column to the beginning of the source sheet AND the destination sheet called Source Sheet ID.
In the first row of source sheet, populate this column (cell) with the value of that sheet's ID. In each subsequent row of the source sheet, populate this column (cell) with a formula that pulls the value from that cell in the first row (i.e., =[Source Sheet ID]$1). This will make it so that this cell within any new rows that are added later will automatically be populated with that same value.
You might consider locking this column by using the Smartsheet UI, so it won't be editable (by non-admin users).
Then, in the section of code that builds up the list destination_sheet_row_ids, add some conditional logic to only append the current row ID if the value of the Source Sheet ID column for that row matches your source sheet ID. That way only rows that originated from the specified source sheet will be deleted from the destination sheet -- any rows there that originated from another sheet will remain untouched.
If you choose to implement this approach -- adding the Source Sheet ID column (containing the ID of the source sheet) as the first column in both the source sheet and the destination sheet -- replace STEP 2 from the code sample above with the following code instead.
'''
STEP 2:
Get all rows from the destination sheet and build list of Row IDs.
'''
destination_sheet = smartsheet_client.Sheets.get_sheet(destination_sheet_id)
# iterate through the rows array and build a list of row IDs
destination_sheet_row_ids = []
for row in destination_sheet.rows:
    # only include Row IDs for rows that originated from the specified Source sheet
    if row.cells[0].value == source_sheet_id:
        destination_sheet_row_ids.append(row.id)
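As a side note on the ValueError in the question: a cell's column_id takes a single column ID, so a flat list of values has to be split into one cell per column, row by row. The sketch below shows only that regrouping step in plain Python, with no Smartsheet calls. The helper name cells_by_row is hypothetical, the dictionaries stand in for Cell objects (each would supply one cell's column_id and value), and it assumes the flat list holds the values row by row in column order, as in the question's printed output.

```python
def cells_by_row(flat_values, column_ids):
    """Split a flat value list into rows of (column_id, value) pairs.

    Hypothetical helper: each inner dict stands in for one cell,
    so every cell carries exactly one column ID.
    """
    width = len(column_ids)
    if width == 0 or len(flat_values) % width != 0:
        raise ValueError("value count must be a multiple of the column count")
    rows = []
    for start in range(0, len(flat_values), width):
        chunk = flat_values[start:start + width]
        rows.append(
            [{"column_id": cid, "value": value} for cid, value in zip(column_ids, chunk)]
        )
    return rows


# Column IDs and the first two rows of values, copied from the question's output
column_ids = [7236841595791236, 1607342061578116, 6110941688948612, 8503502613309316,
              3999902985938820, 3859141875263364, 8362741502633860, 1044392108156804]
values = [3240099.0, 'James', 'Hamilton', 'Male', 197556.0, 18.0, 'Bachelor', 'Medic',
          9615534.0, 'Miranda', 'Montgomery', 'Female', 158585.0, 20.0, 'Primary', 'Historian']

rows = cells_by_row(values, column_ids)
print(len(rows), "rows,", len(rows[0]), "cells each")
```

If you only need a full copy of the sheet, the copy_rows approach above avoids this cell-by-cell work entirely.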
I have an Excel file in which I want to convert the number formatting from 'General' to 'Date'. I know how to do so for one column when referring to the column letter:
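import openpyxl
# raw string, so that \f in the placeholder path is not read as an escape sequence
workbook = openpyxl.load_workbook(r'path\filename.xlsx')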
worksheet = workbook['Sheet1']
for row in range(2, worksheet.max_row+1):
    worksheet["{}{}".format("D", row)].number_format='yyyy-mm-dd;@'
As you can see, I now use the column letter "D" to point out the column that I want to be formatted differently. Now, I would like to use the header in row 1 called "Start_Date" to refer to this column. I tried a method from the following post to achieve this: select a column by its name - openpyxl. However, that resulted in a KeyError: "Start_Date":
# Create a dictionary of column names
ColNames = {}
Current = 0
for COL in worksheet.iter_cols(1, worksheet.max_column):
    ColNames[COL[0].value] = Current
    Current += 1
for row in range(2, worksheet.max_row+1):
    worksheet["{}{}".format(ColNames['Start_Date'], row)].number_format='yyyy-mm-dd;@'
EDIT
This method results in the following error:
AttributeError: 'tuple' object has no attribute 'number_format'
Additionally, I have more columns from which the number formatting needs to be changed. I have a list with the names of those columns:
DateColumns = ['Start_Date', 'End_Date', 'Birthday']
Is there a way that I can use the list DateColumns so that I can save some lines of code?
Thanks in advance.
Please note that I posted a similar question earlier. The following post was referred to as an answer Python: Simulating CSV.DictReader with OpenPyXL. However, I don't see how the answers in that post can be adjusted to my needs.
You need to know which columns to change the number format on, and you have conveniently put them into a list already, so why not just use that list?
Get the headers in your sheet and check whether each header is in the DateColumns list. If it is, update all the entries in that column, from row 2 to the last row, with the date format you want...
...
DateColumns = ['Start_Date', 'End_Date', 'Birthday']
for COL in worksheet.iter_cols(min_row=1,max_row=1):
    header = COL[0]
    if header.value in DateColumns:
        for row in range(2, worksheet.max_row+1):
            worksheet.cell(row, COL[0].column).number_format='yyyy-mm-dd;@'
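The attempt in the question probably failed because ColNames stored 0-based integers, so the formatted reference became something like "32" rather than "D2", and a reference without a column letter is not a single cell. The loop above sidesteps this by using worksheet.cell with a numeric column. If you prefer letter-based references, the index has to be converted to a letter first. openpyxl provides get_column_letter in openpyxl.utils for this; the standalone function below is only a sketch of the same conversion so it can run without the library. The names column_letter, headers and col_names are hypothetical, and the header row is made up for the example.

```python
def column_letter(index):
    """Convert a 1-based column index to its Excel letter (1 -> 'A', 27 -> 'AA')."""
    if index < 1:
        raise ValueError("column index must be 1 or greater")
    letters = ""
    while index > 0:
        # shift to 0-based before dividing, because there is no zero digit in A..Z
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


# Hypothetical header row; in the real sheet these come from row 1
headers = ["ID", "Name", "Report_date", "Start_Date"]

# Map each header to its column letter, counting columns from 1
col_names = {name: column_letter(pos) for pos, name in enumerate(headers, start=1)}

# Build a cell reference the way the question intended
print("{}{}".format(col_names["Start_Date"], 2))
```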
Is there any way in python to fill in the fields of a form (first name, last name) by taking them randomly from 2 pre-filled lists (name list and surname list)? And instead automatically randomise, without taking data from a list, the selection of the date of birth by keeping it over a certain range (e.g. 19 to 32 years)?
Thanks in advance for the help.
Try using the choice function from the random module.
from random import choice
names = ['a','b','c']
random_name = choice(names) # use this for getting random names and surnames
For selecting a random date from a range, you can look at this post:
Generate a random date between two other dates
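For the date of birth, one possible approach is to turn the age range into an earliest and a latest allowed birth date and then pick a random day between them. The sketch below uses only the standard library. The function names random_birth_date, age_on and _years_before are hypothetical, and the 19 to 32 range is the example from the question.

```python
import random
from datetime import date, timedelta


def _years_before(day, years):
    """Same calendar day `years` earlier; Feb 29 falls back to Feb 28."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def age_on(birth, today):
    """Age in whole years on `today`."""
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - before_birthday


def random_birth_date(min_age, max_age, today=None, rng=random):
    """Pick a birth date so that the age on `today` is within [min_age, max_age]."""
    today = today or date.today()
    latest = _years_before(today, min_age)  # turns min_age today
    earliest = _years_before(today, max_age + 1) + timedelta(days=1)  # still max_age
    span_days = (latest - earliest).days
    return earliest + timedelta(days=rng.randint(0, span_days))


names = ['a', 'b', 'c']
surnames = ['x', 'y', 'z']  # placeholder list, like names above
form = {
    "first_name": random.choice(names),
    "last_name": random.choice(surnames),
    "birth_date": random_birth_date(19, 32),
}
print(form)
```

Passing a seeded random.Random instance as rng makes the result reproducible, which can help when testing form submissions.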
I have an excel as shown below:
Input File
Now I want to first filter fruits from the "Items" column and then check which entries in the "list" column are not present in the "Name" column. For example, here "grapes" is not present in the "Name" column, so I want grapes as output in the next column, as shown below.
Expected Output Shown
The same has to be done for each item, one at a time, as I have many items.
Please suggest an approach or give some hints so that I can start on this code.
I am naming the Excel file Book1.
import pandas as pd
frame = pd.read_excel("Book1.xlsx")
frame_list_as_String = frame.list.tolist()
frame_list = [x.split(',') for x in frame_list_as_String]
frame_Name = frame.Name.tolist()
frame_col3=[]
for item in frame_list:
    # entries of this row's list that do not appear anywhere in the Name column
    frame_col3.append(list(set(item) - set(frame_Name)))
frame["col3"]=frame_col3
frame.to_excel("df.xlsx", index = False)
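The comparison itself can be tried without pandas or an Excel file. The sketch below is a plain-Python version of the per-row check, with two small differences from the set-based code above: it strips spaces around each entry and keeps the original order. The function name missing_items is hypothetical, and apart from "grapes" the fruit names are made-up sample data, since the input file is only shown as an image.

```python
def missing_items(list_cell, names):
    """Return entries of a comma-separated cell that are absent from `names`.

    Hypothetical helper: entries are stripped of surrounding spaces and
    returned in their original order.
    """
    known = {name.strip() for name in names}
    missing = []
    for part in list_cell.split(","):
        entry = part.strip()
        if entry and entry not in known:
            missing.append(entry)
    return missing


# Made-up sample: the "Name" column values and one "list" cell
name_column = ["apple", "banana", "mango"]
list_cell = "apple, grapes,banana"

print(missing_items(list_cell, name_column))
```

To apply this per item, as the question asks, one option is to collect the Name values for each "Items" group first and pass only that group's names to the function.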