Getting ConcurrentModificationException from createRow() with Apache POI in a multi-threaded method, even though I have already synchronized the code - apache-poi

I use multiple threads to create rows in an Excel workbook with the Apache POI package.
Each thread creates one row. To avoid concurrency issues, I put the code inside a synchronized block, so that only one thread can create a row and its cells at a time. However, I still get a ConcurrentModificationException, and it is reproducible at the same place in the code on every run.
The exception is:
java.util.ConcurrentModificationException
at java.util.TreeMap$NavigableSubMap$SubMapIterator.nextEntry(TreeMap.java:1703)
at java.util.TreeMap$NavigableSubMap$SubMapEntryIterator.next(TreeMap.java:1751)
at java.util.TreeMap$NavigableSubMap$SubMapEntryIterator.next(TreeMap.java:1745)
at java.util.TreeMap$NavigableSubMap$EntrySetView.size(TreeMap.java:1637)
at java.util.TreeMap$NavigableSubMap.size(TreeMap.java:1507)
at org.apache.poi.xssf.usermodel.XSSFSheet.createRow(XSSFSheet.java:777)
at org.apache.poi.xssf.usermodel.XSSFSheet.createRow(XSSFSheet.java:149)
at com.citi.mrr.rootquery.RootQueryThread.writeQueryResultToRow(RootQueryThread.java:157)
at com.citi.mrr.rootquery.RootQueryThread.run(RootQueryThread.java:42)
at java.lang.Thread.run(Thread.java:748)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
My code is:
@Override
public void run() {
    ...
    // add a row to the Excel sheet
    synchronized (this) {
        Row row = sheet.createRow(querySequence + 1); // here I get the exception
        row.createCell(0).setCellValue(rqResult.getRQName());
        // ...
    }
}
Thanks for any help.
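A detail worth checking here: if every worker is its own RootQueryThread instance (as the stack trace suggests), then synchronized(this) locks a different monitor in each thread and provides no mutual exclusion at all. A minimal sketch of one possible fix, assuming the threads can share a single lock object (the SHEET_LOCK field name is hypothetical):

private static final Object SHEET_LOCK = new Object(); // one monitor shared by all writer threads

@Override
public void run() {
    // ...
    synchronized (SHEET_LOCK) { // every thread now contends on the same monitor
        Row row = sheet.createRow(querySequence + 1);
        row.createCell(0).setCellValue(rqResult.getRQName());
        // ...
    }
}

Note that XSSF workbooks are not thread-safe in general, so any other access to the same workbook would also need to go through the same lock.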

Related

What code instrument should be added to register each HTTP event in MeterRegistry with a specific tag & minute value, when event requests are in the millions?

I need to analyse an HTTP event value which should not be greater than 30 minutes, and 95% of events should fall into this bucket. If that fails, send an alert.
My first concern is to get the right metrics into /actuator/prometheus.
Steps I took:
In every HTTP request event, I get one integer value called eventMinute.
Using the Micrometer MeterRegistry, I tried the code below:
// MeterRegistry meterRegistry ...
meterRegistry.summary("MINUTES_ANALYSIS", tags);
where the tag is EVENT_MINUTE, which receives some integer value in each HTTP event.
But this way it floods the metrics, because each distinct EVENT_MINUTE tag value creates a new time series, and there are millions of events.
Please guide me, I am a beginner at this. Thanks!!
The simplest solution (which I would recommend you start with) would be to just create 2 counters:
int theThing = getTheThing(); // however you obtain the value
if (theThing > 30) {
    meterRegistry.counter("my.request.counter.abovethreshold").increment();
}
meterRegistry.counter("my.request.counter.total").increment();
You would increment the counter that matches your threshold and another that tracks all requests (or reuse another meter that does that for you).
Then it is simple to set up a chart or alarm:
my_request_counter_abovethreshold / my_request_counter_total < .05
(the above-threshold fraction must stay below 5% for 95% of events to be within 30 minutes)
(I didn't test the code. It might need a tiny bit of tweaking)
You'll be able to do a similar thing with DistributionSummary by setting various SLOs (I'm not familiar enough with them to offer one), but start with something simple first; if it is sufficient, you won't need the extra complexity.
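For reference, a minimal sketch of that DistributionSummary idea (untested like the above, and assuming Micrometer 1.5+ where serviceLevelObjectives(double...) is available; the metric name is made up):

DistributionSummary summary = DistributionSummary.builder("my.request.minutes")
        .serviceLevelObjectives(30.0) // adds a histogram bucket at le="30.0"
        .register(meterRegistry);
summary.record(eventMinute); // record each event's minute value

The le="30.0" and le="+Inf" bucket counters then let you compute the same below-threshold ratio without a per-minute tag.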
There are a few ways to solve this problem.
1. Here is a function which receives a metric name, tags, and a value:
public void createOrUpdateHistogram(String metricName, Map<String, String> stringTags, double numericValue)
{
    // convert the string map into Micrometer tags
    List<Tag> tags = stringTags.entrySet().stream()
            .map(e -> Tag.of(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    DistributionSummary.builder(metricName)
            .tags(tags)
            // can enforce an SLO here if required
            .publishPercentileHistogram()
            .minimumExpectedValue(1.0D) // choose based on your expected distribution
            .maximumExpectedValue(30.0D)
            .register(this.meterRegistry)
            .record(numericValue);
}
Then it produces metrics like:
delta_bucket{mode="CURRENT",le="30.0",} 11.0
delta_bucket{mode="CURRENT",le="+Inf",} 11.0
Since the +Inf bucket also contains the values at or below the threshold, subtract the le="30.0" count from the le="+Inf" count to get the number of events above 30.
Another way could be:
public void createOrUpdateHistogram(String metricName, Map<String, String> stringTags, double numericValue)
{
    List<Tag> tags = stringTags.entrySet().stream()
            .map(e -> Tag.of(e.getKey(), e.getValue()))
            .collect(Collectors.toList());
    Timer.builder(metricName)
            .tags(tags)
            .publishPercentiles(0.5D, 0.95D)
            .publishPercentileHistogram()
            .serviceLevelObjectives(Duration.ofMinutes(30L))
            .minimumExpectedValue(Duration.ofMinutes(30L))
            .maximumExpectedValue(Duration.ofMinutes(30L))
            .register(this.meterRegistry)
            .record((long) numericValue, TimeUnit.MINUTES);
}
It will only have two le buckets, the given time and +Inf.
This can be changed based on your requirements, and it also gives you quantiles.
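For illustration, a call site might look like this (the tag map and value are assumptions, and Map.of requires Java 9+):

// record one event's minute value with a fixed, low-cardinality tag set
createOrUpdateHistogram("MINUTES_ANALYSIS", Map.of("mode", "CURRENT"), eventMinute);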

How to save data using multiple threads in a Grails 2.4.4 application using a thread pool

I have a multithreaded program running some logic to come up with rows of data that I need to save in my Grails (2.4.4) application. I am using a fixed thread pool with 30 threads. The skeleton of my program is below. My expectation is that each thread calculates all the attributes and saves a row in the table. However, the end result I am seeing is that some random rows are not saved. Upon repeating the exercise, a different set of rows fails to be saved. So overall, each time this is attempted, a certain set of rows is NOT saved in the table at all. GORMInstance.errors did not reveal any errors, so I have no clue what is incorrect in this program.
ExecutorService exeSvc = Executors.newFixedThreadPool(30)
for (obj in list) {
    exeSvc.execute({ -> finRunnable(obj) } as Runnable)
}
Also, here's the runnable program that the above snippet invokes.
def finRunnable = { obj ->
    for (item in LIST-1) {
        for (it in LIST-2) {
            for (i in LIST-3) {
                rowdata = calculateValues(item, it, i);
                GORMInstance instance = new GORMInstance();
                instance.withTransaction {
                    instance.attribute1 = rowdata[0];
                    instance.attribute2 = rowdata[1];
                    // ...and so on...
                    instance.save(flush: true) /* without flush:true, I am
                        running into a HeuristicCompletion exception,
                        so I need it here */
                } // end transaction
            } // for loop 3
        } // for loop 2
    } // for loop 1
} // runnable closure
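One thing worth ruling out (an assumption, not a confirmed diagnosis): worker threads in Grails do not get a Hibernate session bound automatically, and saves from a session-less thread can fail without surfacing errors. A sketch of binding a session per task with withNewSession:

def finRunnable = { obj ->
    GORMInstance.withNewSession { session -> // bind a Hibernate session to this worker thread
        // ... the nested loops and withTransaction block from above ...
    }
}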

How do I read and edit huge Excel files using POI?

I have a requirement to do the following:
1) Copy a huge Excel file (1400*1400) to make a working copy.
2) Read the copied file, add new columns and rows, and also edit existing cells at the same time.
3) This is going to be a standalone program, not on a server. I have the constraints of a low memory footprint and fast performance.
I have done some reading and have found the following:
1) There is no API to copy such a huge file.
2) SXSSF can be used for writing but not for reading.
3) XSSF and SAX (the Event API) can be used for reading but not for editing. If I tried to read everything and store it as objects, I would again have a memory issue.
Please can you help on how I can do this?
Assuming your memory size is large enough to use XSSF/SAX to read and SXSSF to write, let me suggest the following solution.
1) Read the file using XSSF/SAX. For each row, create an object with the row data and immediately write it out into a file using ObjectOutputStream or any other output format you find convenient. You will create a separate file for each row. And there will only be 1 row object in memory, because you can keep modifying the same object with each row's data.
2) Make whatever modifications you need to. For rows that need to be modified, read the corresponding file back into your row object, modify as needed, and write it back out. For new rows, simply set the data in your row object and write it out to a new file.
3) Use SXSSF to reassemble your spreadsheet by reading 1 row object file at a time and storing it in your output spreadsheet.
That way, you will only have 1 row in memory at a time.
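A minimal sketch of that row-spill idea (the RowData class, its fields, and the file naming are made up for illustration):

import java.io.*;

// one serializable holder per spreadsheet row, written to its own file
class RowData implements Serializable {
    int rowNum;
    String[] cellValues;
}

// write: reuse one RowData instance while streaming rows with XSSF/SAX
void spill(RowData rowData) throws IOException {
    try (ObjectOutputStream out = new ObjectOutputStream(
            new FileOutputStream("row-" + rowData.rowNum + ".bin"))) {
        out.writeObject(rowData);
    }
}

// read back a row that needs editing
RowData load(int rowNum) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(
            new FileInputStream("row-" + rowNum + ".bin"))) {
        return (RowData) in.readObject(); // modify, spill again, then feed to SXSSF in step 3
    }
}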
If there is so much data that 'Out of Memory' or 'GC overhead limit exceeded' errors occur, and memory is a problem, the data can initially be parsed to an XML file. The sheet's XML inside the Excel file can then be replaced with this file so that memory usage stays minimal.
In an .xlsx file the sheets are represented as XML. Using java.util.zip.ZipFile, each entry can be identified. The XML for the sheet can be replaced with the parsed XML so that we get the expected data in the Excel sheet.
The following class can be used to create the sheet XML:
import java.io.IOException;
import java.io.Writer;

import org.apache.poi.ss.util.CellReference;

public class XmlSpreadsheetWriter {
    private final Writer _out;
    private int _rownum;

    public XmlSpreadsheetWriter(Writer out) {
        _out = out;
    }

    public void beginSheet() throws IOException {
        _out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>" +
                "<worksheet xmlns=\"http://schemas.openxmlformats.org/spreadsheetml/2006/main\">");
        _out.write("<sheetData>\n");
    }

    public void endSheet() throws IOException {
        _out.write("</sheetData>");
        _out.write("</worksheet>");
    }

    public void insertRow(int rownum) throws IOException {
        _out.write("<row r=\"" + (rownum + 1) + "\">\n");
        this._rownum = rownum;
    }

    public void endRow() throws IOException {
        _out.write("</row>\n");
    }

    // inline-string cell; note: value should be XML-escaped by the caller
    public void createCell(int columnIndex, String value, int styleIndex) throws IOException {
        String ref = new CellReference(_rownum, columnIndex).formatAsString();
        _out.write("<c r=\"" + ref + "\" t=\"inlineStr\"");
        _out.write(" s=\"" + styleIndex + "\"");
        _out.write(">");
        _out.write("<is><t>" + value + "</t></is>");
        _out.write("</c>");
    }

    // numeric cell
    public void createCell(int columnIndex, double value, int styleIndex) throws IOException {
        String ref = new CellReference(_rownum, columnIndex).formatAsString();
        _out.write("<c r=\"" + ref + "\" t=\"n\"");
        _out.write(" s=\"" + styleIndex + "\"");
        _out.write(">");
        _out.write("<v>" + value + "</v>");
        _out.write("</c>");
    }

    public void createEmptyCell(int columnIndex, int styleIndex) throws IOException {
        String ref = new CellReference(_rownum, columnIndex).formatAsString();
        _out.write("<c r=\"" + ref + "\" t=\"n\"");
        _out.write(" s=\"" + styleIndex + "\"");
        _out.write(">");
        _out.write("<v></v>");
        _out.write("</c>");
    }
}
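For example, the writer above could be driven like this (the file name is arbitrary; imports from java.io assumed):

try (Writer out = new FileWriter("sheet-data.xml")) {
    XmlSpreadsheetWriter writer = new XmlSpreadsheetWriter(out);
    writer.beginSheet();
    writer.insertRow(0);               // becomes <row r="1"> in the XML
    writer.createCell(0, "Header", 0); // inline-string cell
    writer.createCell(1, 42.0, 0);     // numeric cell
    writer.endRow();
    writer.endSheet();
}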
If memory is the problem with processing the number of records you pointed out (i.e. 1400*1400), then getting at the underlying XML data and processing it yourself might be a solution for you. It may not be the best solution, but it will certainly address the low-memory requirement that you have. Even the POI site points to this solution:
"If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself. This is intended for intermediate developers who are willing to learn a little bit of low level structure of .xlsx files, and who are happy processing XML in java. Its relatively simple to use, but requires a basic understanding of the file structure. The advantage provided is that you can read a XLSX file with a relatively small memory footprint."
Source: http://poi.apache.org/spreadsheet/how-to.html

How to deserialize DynamicComposite column value?

I am trying to implement a data model where row keys are Strings, column names are Longs, and column values are DynamicComposites. Using Hector, an example of the store operation looks like this:
// create the value
DynamicComposite colVal = new DynamicComposite();
colVal.add(0, "someString");
colVal.setComparatorByPosition(0, "org.apache.cassandra.db.marshal.UTF8Type");
colVal.setSerializerByPosition(0, StringSerializer.get());
// create the column
HColumnImpl<Long, DynamicComposite> newCol =
        new HColumnImpl<Long, DynamicComposite>(longSerializer, dynamicCompositeSerializer);
newCol.setName(longValue);
newCol.setValue(colVal);
newCol.setClock(keySpace.createClock());
// insert the new column
Mutator<String> mutator = HFactory.createMutator(keySpace,stringSerializer);
mutator.addInsertion("rowKey","columnFamilyName",newCol);
mutator.execute();
Now, when I try to retrieve the data:
// create the query
SliceQuery<String, Long, DynamicComposite> sq =
        HFactory.createSliceQuery(keySpace, stringSerializer, longSerializer, dynamicCompositeSerializer);
// set the query
sq.setColumnFamily("columnFamilyName");
sq.setKey("rowKey");
sq.setColumnNames(longValue);
// execute the query
QueryResult<ColumnSlice<Long, DynamicComposite>> qr = sq.execute();
// get the data
qr.get().getColumnByName(longValue).getValue();
or when I just try to get the plain bytes:
// get the data
dynamicSerializer.fromByteBuffer(qr.get().getColumnByName(longValue).getValueBytes());
I run into an exception:
Exception in thread "main" java.lang.NullPointerException
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
at com.google.common.collect.ImmutableClassToInstanceMap.getInstance(ImmutableClassToInstanceMap.java:147)
at me.prettyprint.hector.api.beans.AbstractComposite.serializerForComparator(AbstractComposite.java:321)
at me.prettyprint.hector.api.beans.AbstractComposite.getSerializer(AbstractComposite.java:344)
at me.prettyprint.hector.api.beans.AbstractComposite.deserialize(AbstractComposite.java:713)
at me.prettyprint.hector.api.beans.DynamicComposite.fromByteBuffer(DynamicComposite.java:25)
at me.prettyprint.cassandra.serializers.DynamicCompositeSerializer.fromByteBuffer(DynamicCompositeSerializer.java:35)
As far as I have understood from all the tutorials I have read, it should be possible to use a DynamicComposite as a column value. Therefore I want to ask: what am I doing wrong? From the exception, it seems I am just forgetting to set something somewhere.
Radovan,
It's probably due to compatibility issues between the Guava library and the Hector version used.
See also : https://github.com/hector-client/hector/pull/591
I am on hector-core-1.1-1.jar; in combination with guava-14.0.jar I get the same error as you. When I use it with guava-12.0.1.jar, however, it works fine for me.
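If you're on Maven, pinning the version that worked above would look something like this:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>12.0.1</version>
</dependency>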

Getting an error when calling the method "addEqualsExpression" in Hector, Cassandra

Hello fellow developers, I am getting an error when running the code below:
public void getConditioningQuery(String columnName, String value) {
    QueryResult<OrderedRows<String, String, String>> result =
            (QueryResult<OrderedRows<String, String, String>>)
                    new IndexedSlicesQuery<String, String, String>(keyspace, serializer, serializer, serializer)
                            .addEqualsExpression("state", "TI")
                            .setReturnKeysOnly()
                            .setColumnFamily(CF_NAME)
                            .setStartKey("")
                            .execute();
    System.out.println("Result=" + result.get().getList());
}
This method is meant to find rows where state=TI.
I have added an index in my column family, and if I query manually in cassandra-cli the data shows up, but when I run the code with Hector I get this error:
In my IDE:
345 [main] INFO me.prettyprint.cassandra.service.JmxMonitor - Registering JMX me.prettyprint.cassandra.service_MyCluster:ServiceType=hector,MonitorType=hector
867 [main] INFO me.prettyprint.cassandra.hector.TimingLogger - start[1306754734185] time[91] tag[WRITE.success_]
10926 [main] INFO me.prettyprint.cassandra.hector.TimingLogger - start[1306754734314] time[10021] tag[READ.fail_]
me.prettyprint.hector.api.exceptions.HTimedOutException: TimedOutException()
at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:32)
And in the Cassandra log:
ERROR 18:25:34,326 Fatal exception in thread Thread[ReadStage:102,5,main]
java.lang.AssertionError: No data found for NamesQueryFilter(columns=) in DecoratedKey(165611378069681836494944905825187619237, 73616e6a6f7578):QueryPath(columnFamilyName='user', superColumnName='null', columnName='null') (original filter NamesQueryFilter(columns=)) from expression 'user.state EQ TI'
at org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1603)
at org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:680)
I'm confused because the error says the data in the column family 'user' is not found, but when I use cassandra-cli the data does show up.
I'm still stuck here. Maybe my method is wrong? Can somebody tell me what is wrong? I'm still googling to solve this problem. Thanks for your attention, and sorry for my bad English :D
It looks like you're hitting CASSANDRA-2653 if you're using 0.8-rc1 or similar. This should be fixed in the current 0.8 branch.
You can't use setReturnKeysOnly with index queries yet. This will be fixed in a future release (see the ticket Tyler linked); in the meantime, simply let the query return one or more columns as a workaround.
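As a sketch of that workaround (untested; the setRange arguments here are assumptions, mirroring the question's code with setReturnKeysOnly dropped):

QueryResult<OrderedRows<String, String, String>> result =
        (QueryResult<OrderedRows<String, String, String>>)
                new IndexedSlicesQuery<String, String, String>(keyspace, serializer, serializer, serializer)
                        .addEqualsExpression("state", "TI")
                        .setColumnFamily(CF_NAME)
                        .setStartKey("")
                        .setRange("", "", false, 1) // return at least one column instead of keys only
                        .execute();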
