Hello, I am trying to scrape multiple tables from this site, https://hs.e-to-china.com, and I want to loop through the tables and extract the information I need.
The problem is that it only scrapes the first table and repeats it as many times as there are tables on the page. My question is: how can I move from one table to the next?
Here is the code I am using:
tables = response.xpath('//*[@class="tax-table"]').extract()
for table in tables:
    hs_code = response.xpath('//*[@class="hs-code"]//code/text()').extract_first()
    Unit = response.xpath('//*[@class="tax-table"]//tr[1]//td[1]/text()').extract_first()
    Gen_General_Tariff_Rate = response.xpath('//*[@class="tax-table"]//tr[1]//td[2]/text()').extract_first()
    MFN_Most_favored_Nation = response.xpath('//*[@class="tax-table"]//tr[1]//td[3]/text()').extract_first()
    TaxVAT_Value_added_Tax = response.xpath('//*[@class="tax-table"]//tr[2]//td[1]/text()').extract_first()
    Additional_Tariff_on_US_Imports = response.xpath('//*[@class="tax-table"]//tr[2]//td[2]/text()').extract_first()
    Export_Tax_Rebate = response.xpath('//*[@class="tax-table"]//tr[2]//td[3]/text()').extract_first()
    Regulations_and_Restrictions = response.xpath('//*[@class="tax-table"]//tr[3]//td[1]/text()').extract_first()
    Inspection_and_Quarantine = response.xpath('//*[@class="tax-table"]//tr[3]//td[2]/text()').extract_first()
    Consumption_Tax = response.xpath('//*[@class="tax-table"]//tr[3]//td[3]/text()').extract_first()
    FTA_Free_Trade_Agreement_Tax = response.xpath('//*[@class="tax-table"]//tr[4]//td[1]/text()').extract_first()
    CCC_Certificate = response.xpath('//*[@class="tax-table"]//tr[4]//td[2]/text()').extract_first()
    In_Quota_on_Imported_Goods = response.xpath('//*[@class="tax-table"]//tr[4]//td[3]/text()').extract_first()
    IT_Origin_Country_Tariff = response.xpath('//*[@class="tax-table"]//tr[5]//td[1]/text()').extract_first()
    Anti_Dumping_Anti_Subsidy = response.xpath('//*[@class="tax-table"]//tr[5]//td[2]/text()').extract_first()
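Be careful when using an XPath that starts with //.
This tells the engine to search from the document root, so inside the loop every query finds the first matching table again.
If you are in a loop, start the expression with .// and call it on the loop variable, so that the current table is the context. For that to work, tables must hold selectors rather than strings, so drop .extract() from the line that builds the list.
So instead of
hs_code = response.xpath('//*[@class="hs-code"]//code/text()').extract_first()
use (assuming the hs-code element sits inside each table; the tr/td queries change in the same way):
hs_code = table.xpath('.//*[@class="hs-code"]//code/text()').extract_first()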
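To make the difference concrete, here is a small sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for Scrapy's selectors (my assumption, so that it runs without Scrapy), and the markup and function names are made up for illustration. The first function searches from the root on every iteration, the second searches relative to each table.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: two tables holding different values.
PAGE = """
<html><body>
  <table class="tax-table"><tr><td>kg</td><td>80%</td></tr></table>
  <table class="tax-table"><tr><td>litre</td><td>30%</td></tr></table>
</body></html>
"""

root = ET.fromstring(PAGE)


def first_cell_from_root(root):
    """Query the whole document on each iteration: always hits the first table."""
    tables = root.findall(".//table[@class='tax-table']")
    return [
        root.find(".//table[@class='tax-table']/tr[1]/td[1]").text
        for _ in tables
    ]


def first_cell_per_table(root):
    """Query relative to each table element: one value per table."""
    tables = root.findall(".//table[@class='tax-table']")
    return [table.find(".//tr[1]/td[1]").text for table in tables]


print(first_cell_from_root(root))   # the same value repeated
print(first_cell_per_table(root))   # one value per table
```

In Scrapy the same idea would be table.xpath('.//tr[1]//td[1]/text()') on each selector from response.xpath('//*[@class="tax-table"]').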
I'm using Excel and VBA to download data from SAP through RFC using INST_EXECUTE_REPORT.
It works like a charm when I have specific input parameters. I just build up .Tables("PARA") with the screen name of the parameter and the desired value. I can even use this method for date ranges.
The challenge is when I don't know exactly the input parameters. For example, I wanted to identify all internal orders with a specific text in the description, e.g. CODE40.
Is there any way to use wildcards with INST_EXECUTE_REPORT? When the program passed into INST_EXECUTE_REPORT is executed normally as a transaction on screen, I can set the parameter to *CODE40* and SAP automatically applies a wildcard search. But I can't get that to work with VBA.
I can simulate using wildcards when accessing individual tables with BBP_RFC_READ_TABLE by using LIKE statements in the selection option, but I need a similar functionality for whole reports, not individual tables.
Can anyone help?
Best regards,
The code I'm using is as follows:
Set ObjR3_EXECUTE_REPORT = ObjR3.Add("INST_EXECUTE_REPORT")
With ObjR3_EXECUTE_REPORT
    Set ObjR3_EXECUTE_REPORT_Name = .Exports("PROGRAM")
    Set ObjR3_EXECUTE_REPORT_Para = .Tables("PARA")
    Set ObjR3_EXECUTE_REPORT_Result = .Tables("RESULT_TAB")
    Set ObjR3_EXECUTE_REPORT_Output = .Tables("OUTPUT_TAB")
End With
ObjR3_EXECUTE_REPORT_Name.Value = ReportName
'Build up the table with the fields to be selected
f = 1
For a = LBound(aParameters) To UBound(aParameters)
    aParameterPair = aParameters(a)
    aParameterInput = aParameterPair(UBound(aParameterPair))
    sParameterName = aParameterPair(LBound(aParameterPair))
    For c = LBound(aParameterInput) To UBound(aParameterInput)
        sParameterInput = aParameterInput(c)
        ObjR3_EXECUTE_REPORT_Para.AppendRow
        ObjR3_EXECUTE_REPORT_Para(f, "PARA_NAME") = sParameterName
        ObjR3_EXECUTE_REPORT_Para(f, "PARA_VALUE") = sParameterInput
        Debug.Print sParameterName & " " & sParameterInput
        f = f + 1
    Next c
Next a
I have been trying to get Excel to apply a formula over a set of columns and then extend the pattern across the entire set of rows.
This has led to the following code:
For i = 0 To avgsheetNames.Count - 1
    If Contains(CStr(avgsheetNames(i)), "Scores") = True Then
        With mainWorkBook.Worksheets(avgsheetNames(i))
            strFormulas(1) = "=SUM(Aggregated_Internal_Scores!I2:I7)/6"
            strFormulas(2) = "=SUM(Aggregated_Internal_Scores!J2:J7)/6"
            strFormulas(3) = "=SUM(Aggregated_Internal_Scores!K2:K7)/6"
            strFormulas(4) = "=SUM(Aggregated_Internal_Scores!L2:L7)/6"
            strFormulas(5) = "=SUM(Aggregated_Internal_Scores!M2:M7)/6"
            strFormulas(6) = "=SUM(Aggregated_Internal_Scores!N2:N7)/6"
            strFormulas2(1) = "=SUM(Aggregated_Internal_Scores!I8:I13)/6"
            strFormulas2(2) = "=SUM(Aggregated_Internal_Scores!J8:J13)/6"
            strFormulas2(3) = "=SUM(Aggregated_Internal_Scores!K8:K13)/6"
            strFormulas2(4) = "=SUM(Aggregated_Internal_Scores!L8:L13)/6"
            strFormulas2(5) = "=SUM(Aggregated_Internal_Scores!M8:M13)/6"
            strFormulas2(6) = "=SUM(Aggregated_Internal_Scores!N8:N13)/6"
            mainWorkBook.Worksheets(avgsheetNames(i)).Range("C2:H2").Formula = strFormulas
            mainWorkBook.Worksheets(avgsheetNames(i)).Range("C3:H3").Formula = strFormulas2
            mainWorkBook.Worksheets(avgsheetNames(i)).Range("C2:H3").AutoFill Destination:=mainWorkBook.Worksheets(avgsheetNames(i)).Range("C2:H32")
        End With
    End If
Next i
As you can see, I have tried to provide the pattern I am going for: the values extracted from the "Aggregated_Internal_Scores" sheet should follow the pattern I2:I7 > I8:I13 > I14:I19 and so on.
However, after the macro has run, what I get is I2:I7 > I8:I13 > I4:I9 > I10:I15.
It seems Excel is taking the block C2:H3 as the pattern and just incrementing by 2 at the start of every block.
Can anyone explain where I have gone wrong and how I can specify that the extraction of sheet values should follow a certain pattern?
Thank you in advance!
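Use:
mainWorkBook.Worksheets(avgsheetNames(i)).Range("C2:H32").Formula = "=SUM(INDEX(Aggregated_Internal_Scores!I:I,(ROW($ZZ1)-1)*6+2):INDEX(Aggregated_Internal_Scores!I:I,(ROW($ZZ1)-1)*6+7))/6"
Replace everything inside the If with that. ROW($ZZ1) evaluates to 1 in the first output row and increases by 1 for each row further down, so the summed range starts at row 2, then 8, then 14, and so on. The I:I reference is relative, so it shifts to J:J, K:K and so on across columns C to H.
If you have Office 365 with dynamic array formulas, use:
mainWorkBook.Worksheets(avgsheetNames(i)).Range("C2:H32").Formula2 = "=SUM(INDEX(Aggregated_Internal_Scores!I:I,SEQUENCE(6,,(ROW($ZZ1)-1)*6+2)))/6"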
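The row arithmetic behind the formula can be checked outside Excel. The sketch below is only an illustration in Python: block_range mirrors the (ROW($ZZ1)-1)*6+2 and +7 terms, and autofill_start is my rough model of what AutoFill appears to do with a two-row seed, based on the output reported in the question. Both function names are made up.

```python
def block_range(output_row, block_size=6, first_row=2):
    """Return (start, end) source rows for the n-th output row (1-based).

    Mirrors (ROW($ZZ1)-1)*6+2 and (ROW($ZZ1)-1)*6+7 from the formula.
    """
    start = (output_row - 1) * block_size + first_row
    return start, start + block_size - 1


def autofill_start(output_row, seeds=(2, 8)):
    """Rough model of AutoFill with a two-row seed block.

    Each filled row reuses one of the seed formulas and shifts its row
    references by the distance it sits below that seed row.
    """
    index = output_row - 1
    return seeds[index % len(seeds)] + (index // len(seeds)) * len(seeds)


for n in range(1, 5):
    print(n, block_range(n), autofill_start(n))
```

The first function gives the wanted blocks (2-7, 8-13, 14-19, ...), while the second gives start rows 2, 8, 4, 10, which matches the I4:I9 and I10:I15 ranges seen after the AutoFill.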
UPDATE!
My goal is to modify an existing workbook (for example, master_v2.xlsm) and produce a new workbook (NewClient4) based on the updates made to master_v2.
I'm using a single sheet within master_v2 to collect all the data that determines what the new workbook will contain.
I'm currently using multiple if statements to check the values of cells in this "repository" sheet. Based on specific cells, I create copies of an existing sheet called "PANDAS" and add values to them.
My goal right now is to create a dict based on two columns, then loop through the keys so that every time I get a hit on a cell, I gather values from specific keys.
The current code is listed below:
from openpyxl import load_workbook

# Start by opening the spreadsheet and selecting the main sheet
workbook = load_workbook(filename="master_v2.xlsm", read_only=False, keep_vba=True)
DATASOURCE = workbook['repository']
DATASOURCE["A:H"]

cell100 = DATASOURCE["F6"].value
CREATION = cell100
cell101 = DATASOURCE["F135"].value
CREATION2 = cell101
cell107 = DATASOURCE["F780"].value
CREATION7 = cell107

if CREATION.isnumeric():
    source = workbook['PANDAS']
    target = workbook.copy_worksheet(source)
    ss_sheet = target
    ss_sheet.title = DATASOURCE['H4'].value[0:12]+' PANDAS'
if CREATION2.isnumeric():
    source = workbook['PANDAS']
    target = workbook.copy_worksheet(source)
    ss_sheet = target
    ss_sheet.title = DATASOURCE['H133'].value[0:12]+' PANDAS'
# Note: CREATION3 is not defined anywhere in the snippet as posted
if CREATION3.isnumeric():
    source = workbook['PANDAS']
    target = workbook.copy_worksheet(source)
    ss_sheet = target
    ss_sheet.title = DATASOURCE['H262'].value[0:12]+' PANDAS'
else:
    print("no")

workbook.save(filename="NewClient4.xlsm")
Instead of the many if statements, I was hoping to loop through the column as explained above.
Once I find my value, I want to gather data and copy it into a copy of a sheet, which is then filled out from other cells. Each time the loop completes, I want to repeat this on the next match of the string, but I've only got this far and it's not quite working.
Does anyone have a way to get this working?
(I'm trying to replace the many one-to-one mappings and if statements.)
for i in range(1, 3000):
    if DATASOURCE.cell(row=i, column=5).value == "Customer:":
        source = workbook['Design details']
        target = workbook.copy_worksheet(source)
        ss_sheet = target
        ss_sheet.title = DATASOURCE['H4'].value[0:12]+' Design details'
    else:
        print("no")
Thank you in advance.
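One thing that stands out in the loop above is that the title is always read from H4, whatever row matched. A possible direction, sketched here under assumptions, is to separate "find the matching rows" from "copy the sheet". The helper names find_marker_rows and sheet_title are hypothetical, and so is the guess that the name to use sits in column H of the same row as the "Customer:" marker; the real offset in the repository sheet may differ. The sketch works on plain row tuples so it runs without a workbook.

```python
def find_marker_rows(rows, marker="Customer:", marker_col=5, name_col=8):
    """Yield (row_number, name) for every row whose marker column matches.

    rows is any iterable of row tuples; columns are 1-based like in Excel.
    Assumption: the name belongs to the same row as the marker.
    """
    for row_number, row in enumerate(rows, start=1):
        if len(row) >= marker_col and row[marker_col - 1] == marker:
            name = row[name_col - 1] if len(row) >= name_col else None
            yield row_number, name


def sheet_title(name, suffix=" Design details", max_name_length=12):
    """Build a sheet title the same way the original code does."""
    return name[:max_name_length] + suffix


# Made-up sample data standing in for the 'repository' sheet.
sample_rows = [
    (None,) * 8,
    (None, None, None, None, "Customer:", None, None, "Acme Corporation Ltd"),
    ("x",),
    (None, None, None, None, "Customer:", None, None, "Beta"),
]

for row_number, name in find_marker_rows(sample_rows):
    print(row_number, sheet_title(name))
```

With openpyxl, the rows could come from DATASOURCE.iter_rows(min_row=1, max_row=3000, values_only=True), and each hit could then drive workbook.copy_worksheet(source) with the title built from that row's name instead of the fixed H4 cell.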
How can I send data to a dialog box dynamically?
In a previous project I used edit boxes (e.g. for 3 conductors) and entered the data separately for each conductor. Now I have to enter them dynamically: the number of conductors is not fixed, so I can't use edit boxes again.
Could you please give me an idea, or a good link describing step by step, how to create a table in a dialog box dynamically?
I have created a dialog box in which I enter data about conductors (resistivity, permeability, diameter, etc., for electric power systems) in edit boxes, but I have only done it for 3 conductors. I need to enter the number of conductors and then edit their characteristics, and I can't reuse the edit boxes because they are static. I want something like a dynamic table with one row per conductor and one column per characteristic (resistivity, permeability, diameter), editable in the dialog box. I don't know how to upload my executable to make clear what I have done, but here is part of my code for the static case of three conductors. I would like a dynamic way to edit the data.
void CInputView::OnLinefeaturesFeatures()
{
    // TODO: Add your command handler code here
    CInputDoc* pDoc = GetDocument();
    CFeaturesDialog DialogWindow;
    DialogWindow.m_DialogCon = m_NumCond;
    DialogWindow.m_DialogLayers = m_Layers;
    DialogWindow.m_DialogPermeability = m_AirPermeability;
    DialogWindow.m_DialogAirConductivity = m_AirConductivity;
    DialogWindow.m_DialogAirPermittivity = m_AirPermittivity;
    DialogWindow.m_DialogEarthPermeability1 = m_EarthPermeability1;
    DialogWindow.m_DialogEarthConductivity1 = m_EarthConductivity1;
    DialogWindow.m_DialogEarthPermittivity1 = m_EarthPermittivity;
    DialogWindow.m_DialogDepth = m_depth;
    DialogWindow.m_DialogEarthPermeability2 = m_EarthPermeability2;
    DialogWindow.m_DialogEarthConductivity2 = m_EarthConductivity2;
    DialogWindow.m_DialogEarthPermittivity2 = m_EarthPermittivity2;
    DialogWindow.m_Dialogfrequency = m_frequency;
    if (DialogWindow.DoModal() == IDOK)
    {
        m_NumCond = DialogWindow.m_DialogCon;
        m_Layers = DialogWindow.m_DialogLayers;
        m_AirPermeability = DialogWindow.m_DialogPermeability;
        m_AirConductivity = DialogWindow.m_DialogAirConductivity;
        m_AirPermittivity = DialogWindow.m_DialogAirPermittivity;
        m_EarthPermeability1 = DialogWindow.m_DialogEarthPermeability1;
        m_EarthConductivity1 = DialogWindow.m_DialogEarthConductivity1;
        m_EarthPermittivity = DialogWindow.m_DialogEarthPermittivity1;
        m_depth = DialogWindow.m_DialogDepth;
        m_EarthPermeability2 = DialogWindow.m_DialogEarthPermeability2;
        m_EarthConductivity2 = DialogWindow.m_DialogEarthConductivity2;
        m_EarthPermittivity2 = DialogWindow.m_DialogEarthPermittivity2;
        m_frequency = DialogWindow.m_Dialogfrequency;
    }
}
I'm quite new to Matlab and I'm struggling trying to figure out how to properly preprocess my data in order to make some calculations with it.
I have an Excel table with financial log returns of many companies such that every row is a day and every column is a company:
I imported everything correctly into Matlab like this:
Now I have to create what are called "rolling windows". To do this I use the following code:
function [ROLLING_WINDOWS] = setup_returns(RETURNS)
    bandwidth = 262;
    [rows, columns] = size(RETURNS);
    limit_rows = rows - bandwidth;
    for i = 1:limit_rows
        ROLLING_WINDOWS(i).SYS = RETURNS(i:bandwidth+i-1,1);
    end
end
If I run this code for the first column of returns, everything works fine, but my aim is to produce the same thing for every column of log returns. So basically I have to add a second for loop, but what I don't get is which syntax to use to make that ".SYS" dynamic, based on my cell array of strings containing company names, so that...
ROLLING_WINDOWS(i)."S&P 500" = RETURNS(i:bandwidth+i-1,1);
ROLLING_WINDOWS(i)."AIG" = RETURNS(i:bandwidth+i-1,2);
and so on...
Thanks for your help guys!
EDIT: working function
function [ROLLING_WINDOWS] = setup_returns(COMPANIES, RETURNS)
    bandwidth = 262;
    [rows, columns] = size(RETURNS);
    limit_rows = rows - bandwidth;
    for i = 1:limit_rows
        offset = bandwidth + i - 1;
        for j = 1:columns
            ROLLING_WINDOWS(i).(COMPANIES{j}) = RETURNS(i:offset, j);
        end
    end
end
OK, everything works. Just one question: the MATLAB editor tells me "ROLLING_WINDOWS appears to change size on every loop iteration ... consider preallocating". How can I do this?
You're almost there. Use dynamic field names by building strings for fields. Your fields are in a cell array called COMPANIES and so:
function [ROLLING_WINDOWS] = setup_returns(COMPANIES, RETURNS)
    bandwidth = 262;
    [rows, columns] = size(RETURNS);
    limit_rows = rows - bandwidth;
    %// Preallocate to remove warnings
    ROLLING_WINDOWS = repmat(struct(), limit_rows, 1);
    for i = 1:limit_rows
        offset = bandwidth + i - 1;
        for j = 1:columns
            %// Dynamic field name referencing
            ROLLING_WINDOWS(i).(COMPANIES{j}) = RETURNS(i:offset, j);
        end
    end
end
Here's a great article by Loren Shure from MathWorks if you want to learn more: http://blogs.mathworks.com/loren/2005/12/13/use-dynamic-field-references/ ... but basically, if you have a string and you want to use this string to create a field, you would do:
str = '...';
s.(str) = ...;
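s is your structure and str is the string you want to use as the field name. Note that str has to be a valid MATLAB identifier, so a name such as 'S&P 500' would need to be converted first, for example with matlab.lang.makeValidName.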
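For readers who think more easily in another language, the same windowing logic can be sketched in Python, where a dict keyed by company name plays the role of the dynamic field. This is only an illustration of the structure, not a translation anyone needs to use; the function name rolling_windows and the toy data are made up, and it keeps the same rows - bandwidth window count as the MATLAB function.

```python
def rolling_windows(companies, returns, bandwidth):
    """Build one dict per window, keyed by company name.

    returns is a list of rows (one row per day, one column per company).
    The number of windows is rows - bandwidth, as in the MATLAB function.
    """
    limit_rows = len(returns) - bandwidth
    windows = []
    for i in range(limit_rows):
        block = returns[i:i + bandwidth]
        windows.append(
            {name: [row[j] for row in block] for j, name in enumerate(companies)}
        )
    return windows


# Toy data: 4 days, 2 companies, window length 2.
companies = ["S&P 500", "AIG"]
returns = [[1, 10], [2, 20], [3, 30], [4, 40]]
windows = rolling_windows(companies, returns, bandwidth=2)
print(windows[0])
print(windows[1])
```

Dict keys can be any string, which is why "S&P 500" works here without conversion, unlike a MATLAB field name.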