We have Cassandra 3.4.4.
In the system.log we see a lot of messages like this:
INFO [CompactionExecutor:2] 2020-09-16 13:42:52,916 PerSSTableIndexWriter.java:211 - Rejecting value (size 1.938KiB, maximum 1.000KiB) for column payload (analyzed false) at /opt/cassandra/data/cfd/request-c8399780225a11eaac403f5be58182da/md-609-big SSTable.
What is the significance of these messages?
These entries appear several hundred times per second, and the log rotates every minute.
The symptoms you describe tell me that you have added a SASI index on the payload column of the cfd.request table which wasn't there before.
Those messages are logged because, while going through the data to index it, Cassandra found that the payload column value is too large. The maximum term size for SASI is 1024 bytes, but in the example you posted the term size was 1.9 KB.
If the column only contains ASCII characters, the maximum term length is 1024 characters, since each ASCII character is 1 byte. If the column contains extended Unicode such as Chinese or Japanese characters, the maximum term length is shorter, since each of those takes up 3 bytes.
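The 1024-byte limit applies to the encoded byte length, not the character count. A small Python sketch illustrating the encoding math (the sample values are hypothetical, just to show why a 512-character CJK string already blows the limit):

```python
MAX_TERM_BYTES = 1024  # SASI's default maximum term size

def fits_sasi_limit(value: str) -> bool:
    """Return True if the value's UTF-8 encoding is within the term limit."""
    return len(value.encode("utf-8")) <= MAX_TERM_BYTES

ascii_term = "a" * 1024        # 1024 chars -> 1024 bytes
cjk_term = "\u65e5" * 512      # 512 CJK chars -> 1536 bytes (3 bytes each)

print(fits_sasi_limit(ascii_term))  # True
print(fits_sasi_limit(cjk_term))    # False
```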
You don't have a SASI analyzer configured on the index (analyzed false), so the whole column value is taken as a single term. If you use the standard SASI analyzer, the column value gets tokenised into multiple, shorter terms, and you won't see those indexing failures logged.
If you're interested in the detailed fix steps, see https://community.datastax.com/questions/8370/. Cheers!
We are facing an issue when uploading long texts (longer than 255 characters) from an Excel file using Data Services in SAP BODS.
The Data Services ODBC driver truncates all texts in this column to 255 characters, even though the field length is defined as varchar(2500) in the Excel file format in Data Services and the column contains longer texts in later rows.
- I tried setting the parameter TypeGuessRows = 0, but it doesn't work.
- I also tried moving a long record to the first row of the source Excel, but that doesn't work either.
Does anyone know how to load the full-length data using SAP BODS?
This is a known issue, described in SAP note 1675110. It is the default (faulty) behavior of SAP DS, which sets the field width according to the first 100 rows of the Excel workbook. Subsequent rows, even longer ones, will not be treated as longer than 255 characters.
SOLUTION: move the longer rows into the top 100, or add a fake first row whose value in each column is as long as the longest value in that column of your workbook.
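The fake-first-row workaround can be scripted. A minimal sketch, assuming the sheet has been exported to CSV (working on the .xlsx directly would need openpyxl or similar); it prepends a dummy row padded to each column's maximum width, so a width-guessing reader sees the true width up front:

```python
import csv
import io

def add_padding_row(csv_text):
    """Prepend a dummy row padded to each column's maximum width, so a
    reader that guesses widths from the first rows sees the true width.
    Assumes all rows have the same number of columns as the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    widths = [max(len(row[i]) for row in data) for i in range(len(header))]
    pad_row = ["X" * w for w in widths]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows([header, pad_row] + data)
    return out.getvalue()
```

The dummy row can be deleted again after the load, or filtered out in the job.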
I am trying to add a table with specific dimensions to a presentation using python-pptx.
I created a slide layout in PowerPoint which contains a table placeholder in the area I want, and loaded it using python-pptx.
But regardless of the placeholder dimensions, the table after creation exceeds the placeholder area.
Mainly it depends on the number of rows, as per the documentation: "The table's height is determined by the number of rows."
shape_id, name, height = self.shape_id, self.name, Emu(rows * 370840)
I tried editing the placeholder.py file manually to change the row height, but the output is the same:
shape_id, name, height = self.shape_id, self.name, Emu(rows * 18429)
The table still insists on exceeding the placeholder area, as in the attached output image.
Below is my code. Any clues?
from pptx import Presentation
# the presentation including the table placeholder
prs = Presentation('Presentation2.pptx')
slide1 = prs.slides.add_slide(prs.slide_layouts[11])
table_placeholder = slide1.shapes[0]
shape = table_placeholder.insert_table(rows=22, cols=2)
prs.save('test.pptx')
Tables in PowerPoint expand vertically to accommodate new rows; that's just the way they work. You will need to do any resizing yourself, which you may find is a challenging problem. This isn't the same kind of problem when a user is creating a table in the application because they will just make adjustments for fit until it looks best for their purposes.
You'll need to adjust font size and row height, and perhaps other attributes like column width, based on your application and whatever heuristics you can devise, perhaps related to the row count and the length of text in certain cells and so on.
A table placeholder really just gives a starting position and width.
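As a starting point for that resizing, one sketch is to divide the placeholder's height evenly across the rows and set each row height explicitly (python-pptx exposes a writable `row.height`; whether the result actually fits still depends on font size, since a row will not render smaller than its text). The helper below works on raw EMU integers; the commented usage shows how it would plug into the question's code:

```python
def even_row_height(placeholder_height_emu, rows):
    """Split a placeholder's height (in EMU) evenly across `rows` rows."""
    return placeholder_height_emu // rows

# Usage with python-pptx (assumes `shape` is the GraphicFrame returned by
# insert_table() and `table_placeholder` is the placeholder it went into):
#
#     table = shape.table
#     h = even_row_height(table_placeholder.height, len(table.rows))
#     for row in table.rows:
#         row.height = h
```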
I am currently working on reading a PDF file and extracting its contents.
However, three particular fields (invoice value, tax, total amount payable) come out as one concatenated field.
So if the PDF has an invoice value of 1000, tax of 118 and amount payable of 1118, I get 1,0001181,118 as a single field.
Is there any way to create a special delimiter, where I check the number of digits after a comma as a rule?
I found a rather roundabout way of doing this. The best solution I could come up with was a lookup table on the number of digits that a particular combination of numbers would produce (for instance, 14 digits means the invoice value had 5 digits, the tax 4 digits and the final amount 5 digits), and so on.
Based on the digit count, I could determine the final amount, then back-calculate the tax and invoice value.
A crude method, but it worked.
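An alternative sketch, under the assumption (true in the example) that amount payable = invoice value + tax: brute-force every split of the digit string and keep the splits where the arithmetic holds. This avoids the lookup table, though ambiguous inputs could in principle yield more than one candidate split, which is why it returns a list:

```python
def split_concatenated(field):
    """Split a concatenated 'invoice|tax|total' digit string, keeping only
    splits where invoice + tax == total. Assumes total = invoice + tax."""
    s = field.replace(",", "")
    results = []
    for i in range(1, len(s) - 1):
        for j in range(i + 1, len(s)):
            a, b, c = s[:i], s[i:j], s[j:]
            # skip segments with leading zeros ("0118" is not a real amount)
            if any(len(p) > 1 and p[0] == "0" for p in (a, b, c)):
                continue
            if int(a) + int(b) == int(c):
                results.append((int(a), int(b), int(c)))
    return results
```

For the example in the question, `split_concatenated("1,0001181,118")` yields the single candidate `(1000, 118, 1118)`.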
I am developing an SSIS package, trying to update an existing SQL table from a CSV flat file. All of the columns update successfully except one. If I set that column to ignore truncation, my package completes successfully, so I know this is a truncation problem and not a data error.
This column is empty for almost every row. However, there are a few rows where this field is 200-300 characters. My data conversion task identified this field as DT_WSTR, but from what I've read elsewhere maybe this should be DT_NTEXT. I've tried both, and I even set the DT_WSTR length to 500, but none of this fixed the problem. How can I fix this? What data type should this column be in my SQL table?
Error: 0xC02020A1 at Data Flow Task 1, Source - Berkeley812_csv [1]: Data conversion failed. The data conversion for column "Reason for Delay in Transition" returned status value 4 and status text "Text was truncated or one or more characters had no match in the target code page.".
Error: 0xC020902A at Data Flow Task 1, Source - Berkeley812_csv [1]: The "output column "Reason for Delay in Transition" (110)" failed because truncation occurred, and the truncation row disposition on "output column "Reason for Delay in Transition" (110)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.
Error: 0xC0202092 at Data Flow Task 1, Source - Berkeley812_csv [1]: An error occurred while processing file "D:\ftproot\LocalUser\RyanDaulton\Documents\Berkeley Demographics\Berkeley812.csv" on data row 758.
One possible reason for this error is that your delimiter character (comma, semi-colon, pipe, whatever) actually appears in the data in one column. This can give very misleading error messages, often with the name of a totally different column.
One way to check this is to redirect the 'bad' rows to a separate file and then inspect them manually. Here's a brief explanation of how to do that:
http://redmondmag.com/articles/2010/04/12/log-error-rows-ssis.aspx
If that is indeed your problem, then the best solution is to fix the files at the source to quote the data values and/or use a different delimiter that isn't in the data.
I've had this issue before; it is likely that the default column size for the file is incorrect. SSIS assigns a default size of 50 characters, but the data you are working with is larger. In the advanced settings for your data file, increase the column size from 50 to match the table's column size.
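To pick a width that is actually large enough, it can help to scan the file first. A small Python pre-check (run outside SSIS, just to measure the data) that reports the longest value seen in each column:

```python
import csv
import io

def max_column_widths(csv_text, delimiter=","):
    """Return the longest value length seen in each column of a CSV."""
    widths = []
    for row in csv.reader(io.StringIO(csv_text), delimiter=delimiter):
        while len(widths) < len(row):
            widths.append(0)
        for i, value in enumerate(row):
            widths[i] = max(widths[i], len(value))
    return widths
```

Set each column's width in the connection manager to at least the reported value.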
I suspect the "or one or more characters had no match in the target code page" part of the error.
If you remove the rows with values in that column, does it load?
Can you identify, in other words, the rows which cause the package to fail?
It could be that the data is too long, or it could be that there's some funky character in there that SQL Server doesn't like.
If this is coming from the SQL Server Import Wizard, try editing the column definition on the Data Source; it is 50 characters by default, but it can be longer.
Data Source -> Advanced -> select the column that errors -> change OutputColumnWidth to 200 and try again.
I've had this problem before. You can go to the "Advanced" tab of the "Choose a Data Source" page, click the "Suggest Types" button, and set the "Number of rows" as high as you want. After that, the types and the text-qualified flags are set to the true values.
I applied the above solution and could then convert my data to SQL.
In my case, some rows didn't have the same number of columns as the header. For example, the header has 10 columns, but one of the rows has only 8 or 9. (Columns = count of delimiter characters in each line, plus one.)
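A quick way to find such rows before the import runs is to count fields per line. A hedged sketch using Python's csv module, which, unlike a raw delimiter count, handles quoted values that themselves contain the delimiter:

```python
import csv
import io

def rows_with_wrong_field_count(csv_text, delimiter=","):
    """Return (line_number, field_count) for rows whose field count
    differs from the header's."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    expected = len(next(reader))  # header defines the expected count
    bad = []
    for lineno, row in enumerate(reader, start=2):
        if len(row) != expected:
            bad.append((lineno, len(row)))
    return bad
```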
If all other options have failed, try recreating the data import task and/or the connection manager. If you've made any changes since the task was originally created, this can sometimes do the trick. I know it's the equivalent of rebooting, but hey, if it works, it works.
I had the same problem, and it was due to a column with very long data.
When I mapped it, I changed it from DT_STR to text stream, and it worked.
In the destination, under Advanced, check that the length of the column equals the source's.
The OutputColumnWidth of the column must be increased.
Path: Source Connection Manager -> Advanced -> OutputColumnWidth