Java Apache POI reading empty column after row data column - apache-poi

I have 100 rows and 10 columns. I am reading each row one by one, and each cell within the row. After the 10th column the cursor does not move on to the next row; it keeps going to columns 11, 12, 13, and so on. Could anyone tell me how to move to the next row once the 10th column has been read, and how to stop reading the empty 11th column?
Here is some code:
while (rowIterator.hasNext()) {
    row = rowIterator.next();
    Iterator<Cell> cellIterator = row.cellIterator(); // fresh cell iterator for the current row
    while (cellIterator.hasNext()) {
        cell = cellIterator.next();
        if (cell.getColumnIndex() == 0) { }
        .....
        if (cell.getColumnIndex() == 10) { }
    }
}

First (though this will not necessarily fix your problem), you should be using the for-each syntax to iterate rows and cells; once you get past column 10 you can simply break out of the inner loop, like so:
for (Row row : sheet) {
    for (Cell cell : row) {
        ...
        if (cell.getColumnIndex() >= 10) break;
    }
}
This is documented in the POI Quick Guide here: https://poi.apache.org/spreadsheet/quick-guide.html#Iterator
NOTE: I break when the column index is 10 or greater (that would be the 11th column, since the indexes are 0-based). I mention this only because your code example uses column indexes 0 - 10, but your text says that there are only 10 valid columns.

If you want to skip columns you can use:
while (cellsInRow.hasNext()) {
    Cell currentCell = cellsInRow.next();
    if (currentCell.getColumnIndex() >= 3 && currentCell.getColumnIndex() <= 6) {
        continue;
    }
}
to skip columns 4, 5, 6 and 7 of the Excel file (indexes start at 0).

Related

In Power Query: How to create a conditional column that removes numbers and keeps text

Col1 contains rows that hold either just numbers or just text. Example:
row 1 = 70
row 2 = RS
row 3 = abcddkss
row 5 = 5
row 6 = 88
and so on
What I want to do is add a column using logic like this: if Col1 is not a number then Col1 else null.
What I have so far:
= let
    mylist = List.RemoveItems(List.Transform({1..126}, each Character.FromNumber(_)), {"0".."9"})
  in
    if List.Contains(mylist, Text.From([Column1])) then [Column1] else null
However, this only works for rows that contain a single character; it will not work for rows that have more than one letter.
You can use this:
if Value.Is(Value.FromText([dat]), type number) then null else [dat]
You could also check if the string is purely digit characters.
if [Column1] = Text.Select([Column1], {"0".."9"}) then null else [Column1]
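Either check can be used directly in an Add Custom Column step. A minimal sketch, assuming the previous query step is named Source and using "TextOnly" as a placeholder name for the new column:
= Table.AddColumn(Source, "TextOnly", each if [Column1] = Text.Select([Column1], {"0".."9"}) then null else [Column1])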

How to use Logstash filters to add a new column based on the rows of another column?

Suppose I have a csv file looking like this:
cummulated_values
0
2
5
10
17
How can I use logstash filters to add a new "values" column, which rows are defined as values[n] := cummulated_values[n] - cummulated_values[n-1], where 0 < n <= total number of rows and values[0] := cummulated_values[0], where cummulated_values[n] means n-th row of "cummulated_values" column?
So the output table will look like this:
cummulated_values, values
0, 0
2, 2
5, 3
10, 5
17, 7
I would implement that using a ruby filter.
csv { autodetect_column_names => true }
ruby {
    code => '
        c = event.get("cummulated_values").to_i
        @values ||= c                     # first event: remember the first value
        event.set("values", c - @values)  # difference from the previous row
        @values = c                       # carry the current value over to the next event
    '
}
You need the order of events to be preserved and you need all the events to go through the same instance of the ruby filter. Therefore you must set pipeline.workers to 1, and verify that if pipeline.ordered is set then it is set to either auto or true (the default value is auto, so if you have not set it you will be OK).
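For reference, a minimal sketch of those two settings in logstash.yml (the same keys also work per pipeline in pipelines.yml):
pipeline.workers: 1      # one worker thread, so every event passes through the same ruby filter instance
pipeline.ordered: auto   # "auto" keeps events in order automatically when pipeline.workers is 1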

Compare a row with all previous strings in one column and change the value of another column in Python

I have a CSV file named namelist.csv; it includes:
Index String Size Name
1 AAA123000DDD 10 One
2 AAA123DDDQQQ 20 One
3 AAA123000DDD 25 One
4 AAA123D 20 One
5 ABA 15 One
6 FFFrrrSSSBBB 60 Two
7 FFFrrrSSSBBB 30 Two
8 FFFrrrSS 50 Two
9 AAA12 70 Two
I want to compare the rows in the String column within each Name group: if a row's string exactly matches, or is a substring of, the string in any of the rows above it, remove those previous rows and add their Size values to the Size of the substring row.
Example: take row 3, AAA123000DDD, and compare it to rows 1 and 2. It matches row 1, so row 1 is removed and its Size value is added to the Size of row 3.
The table will then look like:
Index String Size Name
2 AAA123DDDQQQ 20 One
3 AAA123000DDD 35 One
4 AAA123D 20 One
...
The final result will be:
Index String Size Name
3 AAA123000DDD 35 One
4 AAA123D 40 One
5 ABA 15 One
8 FFFrrrSS 140 Two
9 AAA12 70 Two
I am thinking of using pandas groupby on the Name column, but I don't know how to apply the comparison on the String column and the summing of the Size column.
I am new to Python, so any help would be greatly appreciated.
Assuming each String always has the same Name, here's how you would do the aggregation. I kept Name in the groupby so that it also shows in the final DataFrame.
df_group = df.groupby(['String', 'Name'])['Size'].sum().reset_index()
Edit:
To match the substrings (and using the example above that it appears that a substring will not match with multiple strings), you can make a mapping of substrings to full strings and then group by the full string column as before:
all_strings = set(df['String'])
substring_dict = dict()
# Map every string to a string from the column that contains it (a string always contains itself)
for row in df.itertuples():
    for item in all_strings:
        if row.String in item:
            substring_dict[row.String] = item

def match_substring(x):
    return substring_dict[x]

df['full_strings'] = df.String.apply(match_substring)
df_group = df.groupby(['full_strings', 'Name'])['Size'].sum().reset_index()
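For example, the sample data from the question could be loaded and pushed through that snippet like this (a sketch; reading namelist.csv with pd.read_csv would work the same way):
import pandas as pd

# Sample rows from the question
df = pd.DataFrame({
    'Index': [1, 2, 3, 4, 5, 6, 7, 8, 9],
    'String': ['AAA123000DDD', 'AAA123DDDQQQ', 'AAA123000DDD', 'AAA123D', 'ABA',
               'FFFrrrSSSBBB', 'FFFrrrSSSBBB', 'FFFrrrSS', 'AAA12'],
    'Size': [10, 20, 25, 20, 15, 60, 30, 50, 70],
    'Name': ['One', 'One', 'One', 'One', 'One', 'Two', 'Two', 'Two', 'Two'],
})

# ... then run the substring mapping and groupby from the answer above ...
print(df_group)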

How do you search through a row (Row 1) of a CSV file, but search through the next row (Row 2) at the same time?

Imagine a dataframe with THREE columns and a certain number of rows. The first column holds random values, the second column Names, the third column Ages.
I want to search through every row of this dataframe and find where value 1 appears in the first column. Then, whenever value 1 does appear, I want to know whether value 2 appears in the SAME column but in the next row.
If this is the case, copy that row's Value, Name and Age into an empty dataframe. Every time this condition is met, copy the row into the empty dataframe.
EmptyDataframe = pd.DataFrame(columns=['Name', 'Age'])
csvfile = pd.DataFrame(columns=['Value', 'Name', 'Age'])

for index, row_for_csv_dataframe in csv.iterrows():
    if row_for_csv_dataframe['Value'] == '1':
        # How to code this:
        # if the NEXT row after row_for_csv_dataframe finds the 'Value' == 2
        # then copy 'Age' and 'Name' from row_for_csv_dataframe into the empty DataFrame.
Assuming you have a dataframe data like this:
Value Name Age
0 1 Anne 10
1 2 Bert 20
2 3 Caro 30
3 2 Dora 40
4 1 Emil 50
5 1 Flip 60
6 2 Gabi 70
You could do something like this, although this is probably not the most efficient:
iterator1 = data.iterrows()
iterator2 = data.iterrows()
iterator2.__next__()  # advance the second iterator by one row

# zip the two iterators so each step sees the current row and the next row
for current, next in zip(iterator1, iterator2):
    if (current[1].Value == 1 and next[1].Value == 2):
        print(current[1].Value, current[1].Name, current[1].Age)
And would get this result:
1 Anne 10
1 Flip 60
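As a side note, the same check can be written without explicit iterators by comparing each Value with the next row's Value via shift(). This is a sketch of an alternative, not part of the original answer:
# Vectorized variant: rows whose Value is 1 and whose following row has Value 2
mask = (data['Value'] == 1) & (data['Value'].shift(-1) == 2)
result = data.loc[mask, ['Value', 'Name', 'Age']]
print(result)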

Assign values at large scale into a database. Formula (LibreOffice Calc)

I am trying to create a database in the LibreOffice spreadsheet application. What I need is for the first column to hold IDs, but each ID has to fill 100 cells. So I would like to have 2000 IDs, and with each ID taking up 100 cells that makes 200,000 cells. (ID values = range(1, 2000))
row#1 : row#100 = Id#1 // row#101 : row#200 = Id#2 ....// row#199900 : row#200000 = Id#2000
What I simply want is to assign the value 1 to the first 100 cells in the first column, the value 2 to the next 100 cells in the same column, and so on, until I have all 2000 IDs in the first column.
So I would like a formula that achieves this without having to select and scroll through the sheet manually 2000 times.
Thanks
If the ID is in column A:
=QUOTIENT(ROW(A1)-1;100)+1
The formula adds 1 to the integer part of (row number - 1) divided by 100, so rows 1-100 give 1, rows 101-200 give 2, and so on.
Apply with a loop?
Option VBASupport 1  ' enables the Excel-style Range/Offset calls below when run from LibreOffice Basic

Public Sub test()
    Dim i As Long
    For i = 1 To 2000
        ' Fill the next block of 100 cells in column A with the current ID
        Range("A1:A100").Offset((i - 1) * 100, 0) = i
    Next
End Sub
