Using ADODB SQL in VBA, why are Strings truncated [to 255] only when I use grouping? - excel

I'm using ADODB to query Sheet1. If I fetch the data with a SQL query on the sheet without grouping, as below, I get all the characters from the Comment column.
However, if I use grouping, the comment text is truncated to 255 characters.
Note: my first row contains a comment 800 characters long, so the driver has identified the data type correctly.
Here is my query without grouping:
Select Product, Value, Comment, len(comment) from [sheet1$A1:T10000]
With grouping:
Select Product, sum(value), Comment, len(comment) from [sheet1$A1:T10000] group by Product, Comment

Thanks for posting this! During my 20+ years of database development using ADO recordsets I had never faced this issue until this week. Once I traced the truncation to the recordset I was really scratching my head. I couldn't figure out how or why it was happening until I found your post and you got me focused on the GROUP BY. Sure enough, that was the cause (some kind of ADO bug, I guess). I was able to work around it by putting correlated scalar subqueries in the SELECT list instead of using JOIN and GROUP BY.
To elaborate...
At least 9 times out of 10 (in my experience) JOIN/GROUP BY syntax can be replaced with correlated scalar subquery syntax, with no appreciable loss of performance. That's fortunate in this case since there is apparently a bug with ADO recordset objects whereby GROUP BY syntax results in the truncation of text when the string length is greater than 255 characters.
The first example below uses JOIN/GROUP BY. The second uses a correlated scalar subquery. Both would/should provide the same results. However, if any comment is greater than 255 characters these 2 queries will NOT return the same results if an ADODB recordset is involved.
Note that in the second example the last column in the SELECT list is itself a full select statement. It's called a scalar subquery because it will only return 1 row / 1 column. If it returned multiple rows or columns an error would be thrown. It's also known as a correlated subquery because it references something that is immediately outside its scope (e.emp_number in this case).
SELECT e.emp_number, e.emp_name, e.supv_comments, SUM(i.invoice_amt) As total_sales
FROM employees e INNER JOIN invoices i ON e.emp_number = i.emp_number
GROUP BY e.emp_number, e.emp_name, e.supv_comments
SELECT e.emp_number, e.emp_name, e.supv_comments,
(SELECT SUM(i.invoice_amt) FROM invoices i WHERE i.emp_number = e.emp_number) As total_sales
FROM employees e
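To check that the two forms return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here: it does not go through ADO, so it will not reproduce the 255-character truncation. The sample rows are made up for illustration. One difference worth knowing: the INNER JOIN form drops employees with no invoices, while the subquery form keeps them with a NULL total.

```python
import sqlite3

# In-memory stand-in database; table and column names follow the example above,
# the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_number INTEGER, emp_name TEXT, supv_comments TEXT);
    CREATE TABLE invoices (emp_number INTEGER, invoice_amt REAL);
""")
long_comment = "x" * 800  # longer than 255 characters on purpose
conn.execute("INSERT INTO employees VALUES (1, 'Ann', ?)", (long_comment,))
conn.execute("INSERT INTO employees VALUES (2, 'Bob', 'short')")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 25.0)],
)

# Form 1: JOIN / GROUP BY
grouped = conn.execute("""
    SELECT e.emp_number, e.emp_name, e.supv_comments, SUM(i.invoice_amt) AS total_sales
    FROM employees e INNER JOIN invoices i ON e.emp_number = i.emp_number
    GROUP BY e.emp_number, e.emp_name, e.supv_comments
    ORDER BY e.emp_number
""").fetchall()

# Form 2: correlated scalar subquery in the SELECT list
correlated = conn.execute("""
    SELECT e.emp_number, e.emp_name, e.supv_comments,
           (SELECT SUM(i.invoice_amt) FROM invoices i
            WHERE i.emp_number = e.emp_number) AS total_sales
    FROM employees e
    ORDER BY e.emp_number
""").fetchall()

print(grouped == correlated)
print(len(correlated[0][2]))
```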

Related

Is there a workaround for the maximum length of an ODBCConnection.CommandText string in VBA?

I have a VBA script that generates a query string for a SAP HANA ODBC Connection in Excel. The query is determined by user inputs and can vary greatly in length. The query itself uses many versions of a similar query appended to one another using UNION ALL syntax.
The script sometimes throws a runtime error when trying to refresh. From my research, it has become clear that the reason for this is that the CommandText string exceeds a maximum allowed length of 32,767 (https://ask.sqlservercentral.com/questions/50819/too-long-sql-in-excel-vba.html).
I wondered whether there is a workaround for this, other than using a stored procedure (I am not against this if there is a way to create a stored procedure at runtime then execute it, but I cannot use a predefined stored procedure as my query is always different hence the need for VBA to create it)
Some more info about the dynamic query in VBA:
Column names, as well as parameters, are created dynamically and can be different every time
The query uses groups of lists of product numbers to generate an IN statement for each product group, then sums the sales for those products under the name of the group. These are then all UNION'd together to create one table with grouped records
Example of user input:
Example of resulting query:
WITH SOME_CTE (SOME_FIELDS) AS
(SELECT SOME_STUFF
FROM SOME_TABLE
WHERE SOME_STUFF_IS_GOING_ON)
SELECT GEND "Gender", 'Attribute 1' "Attribute", SUM(UNITS) "Units", SUM(VAL) "Value", SUM(MARGIN) "Margin"
FROM SOME_CTE
WHERE PRODUCT IN ('12345', '23456', '34567', '45678')
GROUP BY GEND
UNION ALL
SELECT GEND, 'Attribute 2' ATTR_NAME, SUM(UNITS), SUM(VAL), SUM(MARGIN)
FROM SOME_CTE
WHERE PRODUCT IN ('01234', '02345', '03456', '03567')
GROUP BY GEND
ORDER BY "Gender", "Attribute"
...and so on.
As you can see, with 2 attribute groups containing 4 products each there is no problem, but when we get to about 30 with several hundred each, it could be too long.
Note: I have tried things like shortening field references in the repeated parts of the query string to 1 character etc. which helps but does not solve the problem.
Any help would be greatly appreciated.
One workaround is to send multiple queries. Since you are using UNION ALL, you can execute each SELECT statement separately, as follows.
Create a table in (for example) the master database. Don't use a temporary table, as it will be dropped after every query. Before creating it, drop the old table if it exists, so that you always start with a fresh one. Then turn every single SELECT statement into an INSERT statement that inserts its records into this so-called temporary table.
This way you avoid lengthy queries and just send a series of short INSERT INTO ... SELECT statements.
At the end, a simple SELECT query returns all the results. After retrieving the data, drop the table, as it's no longer needed.
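A rough sketch of that staging-table pattern, using Python's built-in sqlite3 module as a stand-in for the real connection. The table names `sales` and `staged_results`, the `groups` dictionary and the sample data are all hypothetical, and the HANA-specific CTE is left out. The point is only that each statement sent to the database stays short, however many attribute groups there are.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source data standing in for SOME_CTE.
conn.execute("CREATE TABLE sales (gend TEXT, product TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("F", "12345", 3), ("M", "12345", 2), ("F", "01234", 5), ("M", "23456", 4)],
)

# Hypothetical user input: attribute group name -> list of product numbers.
groups = {
    "Attribute 1": ["12345", "23456"],
    "Attribute 2": ["01234"],
}

# Start from a fresh staging table every run.
conn.execute("DROP TABLE IF EXISTS staged_results")
conn.execute("CREATE TABLE staged_results (gender TEXT, attribute TEXT, units INTEGER)")

# One short INSERT ... SELECT per group instead of one huge UNION ALL.
for name, products in groups.items():
    placeholders = ", ".join("?" for _ in products)
    conn.execute(
        "INSERT INTO staged_results "
        "SELECT gend, ?, SUM(units) FROM sales "
        f"WHERE product IN ({placeholders}) GROUP BY gend",
        [name, *products],
    )

# A single simple query returns the combined result.
rows = conn.execute(
    "SELECT gender, attribute, units FROM staged_results ORDER BY gender, attribute"
).fetchall()

# Clean up once the data has been read.
conn.execute("DROP TABLE staged_results")
print(rows)
```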

Expressing percentage

I need to be able to divide the values of each record in a certain field contained in a query result by the sum of all the values in the same field. This will enable me to calculate the percentage that each record is of the whole. The results must be expressed in the same domain in a new field.
That is: Each record in ColumnA must be divided by the sum of ColumnA, then expressed as a fraction/percentage in a new, neighbouring ColumnB.
I have tried this in Query1:
Total: (select sum(ColumnA) from Query1) - then in a second subquery Percentage:([CountOfColumnA/Total])
This worked once, but Access stopped me with a circular error because of the Query1 from Query1.
Now I am trying to rewrite it and reach the answer in only one subquery to avoid the circular problem. I can easily do this in Excel, but don't know enough about expressions and coding to manage it in Access.
I found a question on this forum from someone with a similar question, but I do not understand the answer - I do not know enough about SQL (nothing actually) to adapt it to my problem. My adaptation was something like this:
SELECT Query1, ColumnA / ((SELECT SUM(ColumnA) FROM Records) AS Percentage FROM Records
Assuming Query1 is a strange field name, try with:
Select
Query1,
ColumnA,
ColumnA / (Select Sum(T.ColumnA) From Records As T) As Percentage
From
Records
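To see what that query returns, here is a small sketch run against SQLite through Python's built-in sqlite3 module rather than Access. The three sample rows are invented, and Access SQL may behave slightly differently, so treat it as an illustration of the scalar subquery only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Records (Query1 TEXT, ColumnA REAL)")
conn.executemany(
    "INSERT INTO Records VALUES (?, ?)",
    [("a", 10.0), ("b", 30.0), ("c", 60.0)],  # hypothetical sample values
)

# The subquery computes the grand total once; each row is divided by it.
rows = conn.execute("""
    SELECT Query1,
           ColumnA,
           ColumnA / (SELECT SUM(T.ColumnA) FROM Records AS T) AS Percentage
    FROM Records
    ORDER BY Query1
""").fetchall()

for name, value, fraction in rows:
    print(name, value, f"{fraction:.1%}")
```

The fractions add up to 1, so multiplying by 100 (or formatting the field as a percentage) gives each record's share of the whole.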

Select all from column family not returning all

I'm having inconsistent results when I attempt to select * from a column family. For fun, I ran the same query in a loop and then counted the result set returned. No matter what I do, it seems to vary (sometimes as much as +/- 150 rows for every ~4,000).
rSet = session.execute(SimpleStatement('SELECT * FROM colFam', consistency_level=ConsistencyLevel.QUORUM))
Using this query results in different row counts returned. The table in question isn't going to hold millions of rows - being able to accurately select * from it is important. Running the query directly in CQLShell yields similar weirdness.
Is there something inherent in CQL/Cassandra that I haven't yet learned about that prevents Cassandra from being able to return an accurate representation of *? Or is my Google fu just failing me?

How to use Cognos report studio to change default Total() calculation

I have a crosstab report that calculates failure rates for my products; it has two measures (PASSCOUNT, FAILCOUNT) and a calculation FAILRATE (FAILCOUNT/(PASSCOUNT+FAILCOUNT))
The report layout looks thusly:
OEM
MODEL
TESTYEAR  TESTMONTH  PASSCOUNT  FAILCOUNT  FAILRATE
2012      OCT             7547        697      0.08
          NOV             9570        373      0.04
          DEC             1879        107      0.05
---------------------------------------------------
Total                    18996       1177      0.17
My user however wants TOTAL FAILRATE to be
TOTAL FAILCOUNT/(TOTAL PASSCOUNT+TOTAL FAILCOUNT)
which translates to
1177 / (18996+1177) = 0.058
How can I create this custom total in the report? I am reading about creating a Query calculation but I am not clear this is the right approach.
Cognos Report Studio 8.4 IBM DB2 UDB
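For reference, the difference between the two totals can be reproduced outside Cognos with a few lines of Python, using the figures from the table above. This is plain arithmetic, not Cognos syntax; `fail_rate` is just a helper name chosen for this sketch.

```python
# (month, PASSCOUNT, FAILCOUNT) taken from the report above
rows = [("OCT", 7547, 697), ("NOV", 9570, 373), ("DEC", 1879, 107)]


def fail_rate(passed, failed):
    """FAILCOUNT / (PASSCOUNT + FAILCOUNT)."""
    return failed / (passed + failed)


# What the default Total does: add up the per-month rates.
sum_of_rates = sum(fail_rate(p, f) for _, p, f in rows)

# What the user wants: apply the formula to the column totals.
total_pass = sum(p for _, p, _ in rows)
total_fail = sum(f for _, _, f in rows)
rate_of_totals = fail_rate(total_pass, total_fail)

print(total_pass, total_fail)
print(round(rate_of_totals, 3))
```

Summing the monthly rates gives a figure close to the 0.17 shown in the Total row, while the rate computed from the totals is the 0.058 the user is asking for.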
You can indeed. There are a couple of methods.
In your report query, the individual query items that make up your desired calculation (PASSCOUNT, FAILCOUNT, etc.) are already working fine. You can create a new data item/expression in that same query item list and edit its definition. On the left-hand side you can choose columns from your original database data source, but there is another pane you can select to reuse other query items/calculations you have defined in that same query (PASSCOUNT, FAILCOUNT, FAILRATE). When you create an expression using other columns in the same query, Cognos knows to resolve those query items first in order to resolve the calculation that depends on them. You want to make sure your calculated/derived column is listed after the query items it depends on (this may not actually matter, but it makes sense when looking at it). Also, I believe you will have to set the aggregate method of the failure rate query item/expression to "Calculated".
The summary line might or might not be a bit trickier. Not having Report Studio in front of me, I can't say for sure: when you add the totals section to the list report and use your new failure rate expression, it might be smart enough to extend the correct calculation, or you may have to use another method for the summary, which is a report expression.
You create a report expression not in the query but on the GUI page of the report (it's in the toolbox alongside tables, singletons, etc.). It has an expression builder just like the one in the query, but you will notice the function set is different, because the expression is evaluated after the query runs, as the results come back. Simple things like what you are doing, which is just math, are fine, but other database functions are not available in report expressions, simply because they are evaluated in the HTML output and not during the query run against the database.
Hope this helps. In summary, you are creating a calculated column based on the summarized calculations of other columns in the same query/result set. In theory this is the same as the following SQL statement, which will not work because SQL does not allow this directly, but I hope it helps explain what Cognos is doing.
Select 1+2 As FAILCOUNT, 2+3 AS PASSCOUNT, (FAILCOUNT/(PASSCOUNT+FAILCOUNT)) AS FAILRATE From SOMETABLE
-- Cognos is able to use the results of other aggregate calculated columns in the same query, and if you generate the SQL you can see how it arranges the SQL to do this.
Thanks,
Tim
It's important to pick the right total in Crosstab.
When you add the total row, make sure you pick Automatic Summary.
This option will make sure the aggregation is determined individually by each query data item.
If it still does not give you the expected result, then in the query explorer, pick the query that is used by the crosstab, and on the FAILRATE data item, pick the Calculated option (on the properties panel).
In crosstab reports the columnar data is rendered first and then the row-level data, and this is the cause of your problem. To resolve it, click on the List Column Title 'FAILRATE' and assign it a Solve Order of 1, as long as the solve order of the total is NULL. The trick is that the solve order makes FAILRATE get calculated after all the others are done.

Order SharePoint search results by more columns

I'm using a FullTextSqlQuery in SharePoint 2007 (MOSS) and need to order the results by two columns:
SELECT WorkId FROM SCOPE() ORDER BY Author ASC, Rank DESC
However it seems that only the first column from ORDER BY is taken into account when returning results. In this case the results are ordered correctly by Author, but not by Rank. If I change the order the results will be ordered by Rank, but not by Author.
I had to resort to my own sorting of the results, which I don't like very much. Has anybody a solution to this?
Edit: Unfortunately it also doesn't accept expressions in the ORDER BY clause (SharePoint throws an exception). My guess is that even if the query looks like legitimate SQL it is parsed somehow before being served to the SQL server.
I tried to catch the query with SQL Profiler, but to no avail.
Edit 2: In the end I used ordering by a single column (Author in my case, since it's the most important) and did the second ordering in code on the TOP N of the results. It works well enough for the project, but leaves a bad feeling of kludgy code.
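For anyone wondering what the in-code ordering amounts to, here is a minimal sketch in Python (the original project would have used the SharePoint object model, not Python). The result rows and their values are hypothetical. It orders by Author ascending, then by Rank descending within each author.

```python
# Hypothetical search results, as they might come back ordered by one column only.
results = [
    {"WorkId": 1, "Author": "Baker", "Rank": 700},
    {"WorkId": 2, "Author": "Adams", "Rank": 500},
    {"WorkId": 3, "Author": "Baker", "Rank": 900},
    {"WorkId": 4, "Author": "Adams", "Rank": 800},
]

# Sort key: Author ascending, then Rank descending (negated so larger ranks come first).
ordered = sorted(results, key=lambda r: (r["Author"], -r["Rank"]))

work_ids = [r["WorkId"] for r in ordered]
print(work_ids)
```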
Microsoft finally posted a knowledge base article about this issue.
"When using RANK in the ORDER BY clause of a SharePoint Search query, no other properties should be used"
http://support.microsoft.com/kb/970830
Symptom: When using RANK in the ORDER BY clause of a SharePoint Search query only the first ORDER BY column is used in the results.
Cause: RANK is a special property that is ranked in the full text index and hence cannot be used with other managed properties.
Resolution: Do not use multiple properties in conjunction with the RANK property.
Rank is a special column in MOSS FullTextSqlQuery that gives a numeric value to the rank of each result. That value will be different for each query, and is relative to the other results for that particular query. Because of this, Rank should have a unique value for each result, and sorting by Rank and then Author would be the same as just sorting by Rank. I would try sorting on another column instead of Rank to see if the results come back as you expect. If so, your trouble could be related to the way MOSS is ranking the results, which will vary for each unique query.
Also you are right, the query looks like SQL, but it is not the query actually passed to the SQL server, it is special Microsoft Enterprise Search SQL Query syntax.
I, too, am experiencing the same problem with FullTextSqlQuery and MOSS 2007 where only the first column in a multi-column "ORDER BY" is respected.
I entered this topic in the MSDN Forums for SharePoint Search, but have not received any replies:
http://social.msdn.microsoft.com/Forums/en-US/sharepointsearch/thread/489b4f29-4155-4c3b-b493-b2fad687ee56
I have no experience in SharePoint, but if it is the case where only one ORDER BY clause is being honored I would change it to an expression rather than a column. Assuming "Rank" is a numeric column with a maximum value of 10 the following may work:
SELECT WorkId FROM SCOPE() ORDER BY AUTHOR + (10 - Rank) ASC
