I am using the ColdFusion gateways to fire and forget a large number of actions. To do this, I have a loop that goes through a query with a SendGatewayMessage() at the end. However, the query that I loop through can get extremely large (100,000+ records).
To prevent actions from being lost, I increased the queue size and the number of threads.
Because actions still got lost, I included a loop before the SendGatewayMessage() like so:
<cfloop condition="#gatewayService.getQueueSize()# GTE #gatewayService.getMaxQueueSize()#">
<cfset guardianCount = guardianCount+1>
</cfloop>
<cflog file="gatewayGuardian" text="#i# waited for #guardianCount# iterations. Queuesize:#gatewayService.getQueueSize()#">
<cfset SendGatewayMessage("EventGateway",eventData)>
(More info on the gatewayService class here)
This is more or less acceptable, since I can increase the request timeout to a few hours(!), but I am still looking for a more effective way to slow down the sending of messages to the queue in the hope that the overall process will go faster with less pressure on the resources of the server.
Any suggestions?
Any thoughts on the consequences of increasing the queue size even further?
Right now, I use application variables to keep track of the records in the whole job, the number of batches already processed and the number of records processed.
At the start of the job, I have a piece of code that initiates all these variables like so:
<cfif not structKeyExists(application,"batchNumber") or application.batchNumber
eq 0 or application.batchNumber eq "">
<cfset application.batchNumber = 0>
<cfset application.recordsToDo = 0>
<cfset application.recordsDone = 0>
<cfset application.recordsDoneErrors = 0>
</cfif>
After that, I set all the records in a query and determine which records in that query we need to process in the current batch.
The number of records in the batch is determined by the total number of records and the maximum queue size. This way, each batch always stays below the maximum queue size (about half of it when the whole job is no larger than the queue). This leaves room in the queue for other operations or jobs and keeps the initial request from timing out.
<cfset application.recordsToDo = qryRecords.recordcount>
<cfif not structKeyExists(application,"recordsPerBatch") or application.recordsPerBatch eq "" or application.recordsPerBatch eq 0>
<cfset application.recordsPerBatch = ceiling(application.recordsToDo/(ceiling(application.recordsToDo/gatewayService.getMaxQueueSize())+1))>
</cfif>
<cfset startRow = (application.recordsPerBatch*application.batchNumber)+1>
<cfset endRow = startRow + application.recordsPerBatch-1>
<cfif endRow gt application.recordsToDo>
<cfset endRow = application.recordsToDo>
</cfif>
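To make the batch-size arithmetic above concrete, here is a small sketch of the same formula with made-up numbers. The record count and queue size are assumptions chosen for illustration, not values from my server.

```cfml
<cfscript>
// Hypothetical inputs, chosen only to illustrate the formula.
recordsToDo  = 100000; // total records in the job
maxQueueSize = 25000;  // what gatewayService.getMaxQueueSize() might return

// Number of batches: enough to fit the queue, plus one extra for headroom.
numberOfBatches = ceiling(recordsToDo / maxQueueSize) + 1;

// Records per batch, rounded up so no record is left over.
recordsPerBatch = ceiling(recordsToDo / numberOfBatches);

writeOutput("batches: #numberOfBatches#, records per batch: #recordsPerBatch#");
</cfscript>
```

With these numbers the job is split into 5 batches of 20,000 records, which stays below the queue maximum of 25,000. Note that for jobs several times larger than the queue, a batch ends up closer to the full queue size than to half of it.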
Then I loop through the query with a from/to loop to fire off the gateway events. I kept the guardian so there will never be a record lost because the queue is full.
<cfloop from="#startRow#" to="#endRow#" index="i">
<cfset guardianCount = 0>
<!--- load all values from the record into a struct --->
<cfset stRecordData = structNew()>
<cfloop list="#qryRecords.columnlist#" index="columnlabel">
<cfset stRecordData[columnlabel] = trim(qryRecords[columnlabel][i])>
</cfloop>
<cfset eventData = structNew()>
<cfset eventData.stData = stRecordData>
<cfset eventData.action = "bigJob">
<cfloop condition="#gatewayService.getQueueSize()# GTE #gatewayService.getMaxQueueSize()#">
<cfset guardianCount = guardianCount + 1>
</cfloop>
<cfset SendGatewayMessage("eventGateway",eventData)>
</cfloop>
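The guardian loop above spins as fast as it can while the queue is full. One possible variation, sketched here under assumptions, is to pause briefly between checks so the waiting request does not compete with the gateway threads for CPU. The way gatewayService is obtained and the pauseMs value are assumptions for illustration; eventData mirrors the struct built in the loop above.

```cfml
<cfscript>
// Assumption: this is how gatewayService was obtained in the original code.
gatewayService = createObject("java", "coldfusion.eventgateway.GatewayServices").getGatewayServices();

// Hypothetical tuning value: how long to pause between queue checks.
pauseMs = 50;

eventData = structNew();
eventData.action = "bigJob";

guardianCount = 0;
// Wait until there is room in the queue, sleeping instead of spinning.
while (gatewayService.getQueueSize() GTE gatewayService.getMaxQueueSize()) {
    sleep(pauseMs);
    guardianCount++;
}
SendGatewayMessage("eventGateway", eventData);
</cfscript>
```

Whether this helps overall throughput would need measuring; the only certain effect is that the waiting request uses less CPU while the queue drains.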
Whenever a record is done, I have a function that checks the number of done vs the number of records to do. When they are the same, I'm done. Otherwise we may need to start a new batch.
Notice that the check to see if we're done is in a cflock, but the actual event post is not. This is because otherwise you might get a deadlock when the event you posted can't read the variables you use inside the lock.
I hope this is of use to someone or someone else has a better idea still.
<cfset doEventAnnounce = false>
<cflock timeout="30" name="jobResult">
<cfset application.recordsDone++>
<cfif application.recordsDone eq application.recordsToDo>
<!--- We are done. Set all the application variables we used back to zero, so they do not get in the way when we start the job again --->
<cfset application.batchNumber = 0>
<cfset application.recordsToDo = 0>
<cfset application.recordsDone = 0>
<cfset application.recordsPerBatch = 0>
<cfset application.recordsDoneErrors = 0>
<cfset application.JobStarted = 0>
<!--- If the number of records we have done is the same as the number of records in a batch times the current batchnumber plus one, we are done with the batch. --->
<cfelseif application.recordsDone eq application.recordsPerBatch*(application.batchNumber+1)
and application.recordsDone neq application.recordsToDo>
<cfset application.batchNumber++>
<cfset doEventAnnounce = true>
</cfif>
</cflock>
<cfif doEventAnnounce>
<!--- Fire off the event that starts the job. All the info it needs is in the applicationscope. --->
<cfhttp url="#URURLHERE#/index.cfm" method="post">
<cfhttpparam type="url" name="event" value="startBigJob">
</cfhttp>
</cfif>
I am trying to cache a block of code that instantiates two objects (the primary object extends the generic abstract one).
Without the cache-specific code everything works fine. But when I run the below code I only get a blank page. I'm unsure if this is expected behavior but I doubt it.
I'm calling this like so:
test.cfm
<cfset foobar = CreateObject("foo") />
<cfset foobar.pushLeads(
a = 1,
b = 2
) />
foo.cfc
<cffunction name="pushLeads" access="public" returntype="void">
<cfargument name="a" required="true" />
<cfargument name="b" required="true" />
<cfset local.cachedVendorData = cacheGet("vendorExport") />
<cfif IsNull(local.cachedVendorData)>
<cfsavecontent variable="local.vendorCFC">
<cfset local.leadsObj = createobject("baz").init() />
<!--- Take leads and pass into cfc for pushing to remote server --->
<cfset test = local.leadsObj.pushLeadData(
a = arguments.a,
b = arguments.b
) />
<cfdump var="#test#">
</cfsavecontent>
<cfoutput>#local.vendorCFC#</cfoutput>
<cfset cachePut("vendorExport", local.vendorCFC, CreateTimeSpan(0,0,1,0))>
</cfif>
</cffunction>
Edit: I forgot to mention that before I added caching, I had a CFDUMP that showed all the results returned. Now that I have added caching, the dump results no longer appear.
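One thing that stands out in the function above is that the cfoutput sits inside the cfif, so nothing is written when the cache already holds a value. A possible rearrangement, offered as a sketch and not a confirmed fix, is to fill the variable on a cache miss and output it on every call. The component name baz and its methods come from the code above; local.result is a name introduced here.

```cfml
<cffunction name="pushLeads" access="public" returntype="void" output="true">
    <cfargument name="a" required="true" />
    <cfargument name="b" required="true" />

    <cfset local.vendorCFC = cacheGet("vendorExport") />

    <!--- Cache miss: build the content and store it for one minute --->
    <cfif isNull(local.vendorCFC)>
        <cfsavecontent variable="local.vendorCFC">
            <cfset local.leadsObj = createObject("component", "baz").init() />
            <cfset local.result = local.leadsObj.pushLeadData(a = arguments.a, b = arguments.b) />
            <cfdump var="#local.result#">
        </cfsavecontent>
        <cfset cachePut("vendorExport", local.vendorCFC, createTimeSpan(0,0,1,0)) />
    </cfif>

    <!--- Output on every call, whether the content was cached or freshly built --->
    <cfoutput>#local.vendorCFC#</cfoutput>
</cffunction>
```

Keep in mind that on a cache hit pushLeadData is not called at all, so this only makes sense if replaying the saved output for up to a minute is acceptable.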
First, I am very new to ColdFusion, but am learning pretty quickly. I am trying to build a page on a large database that initially displays all results, 25 lines per page, with next/prev links to navigate through the pages.
That all works fine, but when I perform a search and the new results span a couple of pages, the pagination links don't work. When I click on the "next" link, it goes back to the original display of all records. How can I fix this, or what do I need to do to make it work?
Sorry I'm new at posting and this is my first one. Hope this is better.
My pagination code...
<cfset Next = StartRow + DisplayRows>
<cfset Previous = StartRow - DisplayRows>
<cfoutput>
<cfif Previous GTE 1>
<b>Previous #DisplayRows# Records</b>
<cfelse>
Previous Records
</cfif>
<b> | </b>
<cfif Next lte records.RecordCount>
<a href="#CGI.Script_Name#?StartRow=#Next#"><b>Next
<cfif (records.RecordCount - Next) lt DisplayRows>
#Evaluate((records.RecordCount - Next)+1)#
<cfelse>
#DisplayRows#
</cfif>Records</b></a>
<cfelse> Next Records
</cfif>
</cfoutput>
My code at the top...
<cfparam name="StartRow" default="1">
<cfparam name="DisplayRows" default="25">
<cfset ToRow = StartRow + (DisplayRows - 1)>
<cfif ToRow gt records.RecordCount>
<cfset ToRow = records.RecordCount>
</cfif>
Let me know if you need to see more...thank you.
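For reference, the "next" link above only passes StartRow, so whatever was typed into the search form is not part of the following request. A minimal sketch of carrying the search value along in the link is below. The field name searchTerm is hypothetical (the search form is not shown), and the query would still need to filter on it.

```cfml
<!--- searchTerm is a hypothetical name for the search form field --->
<cfparam name="searchTerm" default="">
<cfparam name="StartRow" default="1">
<cfparam name="DisplayRows" default="25">

<cfset Next = StartRow + DisplayRows>

<cfoutput>
<!--- URLEncodedFormat keeps spaces and special characters in the search value intact --->
<a href="#CGI.Script_Name#?StartRow=#Next#&searchTerm=#URLEncodedFormat(searchTerm)#">Next #DisplayRows# Records</a>
</cfoutput>
```

The same parameter would have to be added to the "previous" link so that paging in both directions stays within the search results.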
Here is an example I whipped up (sorry if it is terse), and it covers things you already discussed with Mark. I also like Mark's <cfloop> examples above (or below, wherever this response ends up).
So we have:
query recordcount (max)
starting in your range
ending in your range
output per page
With bonus pageNum querystring for your next grouping of records (which I think is something you would like).
Then it can look like this in your page:
<cfparam name="pageNum" default="1">
<cfquery name="q" datasource="#application.dsn#">
select * from yourTable
</cfquery>
<cfset maxRows = 10>
<cfset startRow = min( ( pageNum-1 ) * maxRows+1, max( q.recordCount,1 ) )>
<cfset endRow = min( startRow + maxRows-1, q.recordCount )>
<cfset totalPages = ceiling( q.recordCount/maxRows )>
<cfset looperCount = totalPages>
<cfoutput>
<cfloop from="1" to="#looperCount#" index="i">
#i#
</cfloop>
</cfoutput>
<br><br>
<cfoutput
query="q"
startrow="#startRow#"
maxrows="#maxRows#">
#id#<br>
</cfoutput>
You need to show how you are actually navigating in your code - that's where the secret sauce is buried. You have everything else you need (maybe more than you need).
You probably have a cfoutput or cfloop in your code somewhere. You would use your startrow and displayrows to output a set number of rows from the records - like so:
<cfoutput query="records" startrow="#startrow#" maxrows="#displayrows#">
... code to output your data goes here
</cfoutput>
If you are using cfloop it is similar.
<cfloop query="records" startrow="#startrow#" endrow="#startrow + displayrows - 1#">
...code to output your data.
</cfloop>
You can also use an index loop like so:
<cfloop from="#startrow#" to="#min(startrow + displayrows - 1, records.recordCount)#" index="x">
.... your outputs will look like this:
<cfoutput>#records[columnName][x]#</cfoutput>
</cfloop>
Hopefully one of those samples will ring a bell. The logic in your code snippets only creates a starting point and defines how many rows to loop over. It's the output that teases out the records to display.
Also note the comment - you almost never need evaluate() in your code.
I worked this out using a cfform tag with BACK - MORE - HOME submit buttons.
The first page had the query for ID 1 to 25 and a MORE submit button.
hidden field was the count 25
The next page had HOME and MORE buttons
Home had a hidden field of 1
More had a hidden field of count + 25 (50)
The next page had BACK HOME and MORE buttons
Back had hidden field of count - 25
HOME had a hidden field of 1
MORE had a hidden field of count + 25 (75)
and so on.
Query used the number of the hidden field depending on the value of the SUBMIT button to create the query WHERE and output the 25 rows
<cfif submit IS "MORE">
<cfset count1 = #count# + 1>
<cfset count2 = #count# + 25>
<cfelseif submit is "BACK">
<cfset count1 = #count# - 26>
<cfset count2 = #count#>
<cfelseif submit is "HOME">
<cfset count1 = 1>
<cfset count2 = 25>
</cfif>
In query
SELECT *
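FROM mytable
WHERE ID BETWEEN <cfqueryparam value="#count1#" cfsqltype="cf_sql_integer"> AND <cfqueryparam value="#count2#" cfsqltype="cf_sql_integer">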
The display
<table>
<cfoutput query="myquery">
<tr>
<td>
#mydata1#
</td>
<td>
#mydata2#
</td>
</tr>
</cfoutput>
</table>
I seem to be having a problem with URL's which contain a percent sign. For example this URL is okay:
http://example.com/json.cfm/json_type/answer_grid/league/268/survey_id/323/requesttimeout/50000/team_view/0/division_id/0/group/0/return_script/1
However, this URL fails.
http://example.com/json.cfm/json_type/answer_grid/league/268/survey_id/323/requesttimeout/50000/team_view/0/division_id/0/group/0/return_script/%
It seems to produce a mysterious Jakarta/ISAPI error:
The requested URL was not found on this server! If you entered the URL
manually please check your spelling and try again. Jakarta/ISAPI/isapi_redirector/1.2.32 ()
The system seems to be rejecting the percent sign. How can I allow this to go through? The full URL I'm trying to pass is:
http://example.com/json.cfm/json_type/answer_grid/league/268/survey_id/323/requesttimeout/50000/team_view/0/division_id/0/group/0/return_script/%2Fmanager%5Fpro%2Ecfm%2Fleague%2F268%2Faction%2Fregistration%2Fcontent%5Faction%2Fmanagesurveys%2Ftabindex%2F1
Notice this one DOES work
http://example.com/json.cfm?json_type=answer_grid&league=268&survey_id=323&requesttimeout=50000&team_view=0&division_id=0&group=0&return_script=%2Fmanager%5Fpro%2Ecfm%2Fleague%2F268%2Faction%2Fregistration%2Fcontent%5Faction%2Fmanagesurveys%2Ftabindex%2F1
I use this code on CF9 to convert the URL's and it works fine:
<cffunction name="set_spider_friendly_urls" access="remote" returntype="string">
<cfset cfmx7_updated_path_info = "#cgi.script_name#/cgi.path_info">
<cfif findnocase("#cgi.script_name#/",cfmx7_updated_path_info) and not len(query_string)>
<cftry>
<CFSET str_path=replacenocase(cgi.path_info,"#cgi.script_name#/","","all")>
<CFSET str_path=replace(str_path,"//","/ /","all")>
<CFSET clear=structclear(url)>
<CFSET int_len=listlen(str_path,"/")>
<CFSET str_delim="/">
<cfloop index="int_cur" from="1" to="#int_len#" step="2">
<cfif int_cur eq int_len>
<CFSET clear=setvariable("url.#listgetat(str_path,int_cur,str_delim)#","")>
<cfelse>
<CFSET tmp_var=rereplace(listgetat(str_path,int_cur+1,str_delim),"["",/\\\*&()$%^#~´?;'']","","all")>
<CFSET clear=setvariable("url.#listgetat(str_path,int_cur,str_delim)#",tmp_var)>
</cfif>
</cfloop>
<CFSET bln_newurl=1>
<CFSET str_currentpage=cgi.path_info>
<cfcatch>
<cffile action="APPEND" file="#ExpandPath( "./" )#cc_gateway_logs\hurl.log" output="#now()#,#remote_addr#,#cgi.path_info#" addnewline="Yes">
</cfcatch>
</cftry>
<cfelse>
<CFSET str_currentpage=replacelist("#cgi.script_name#?#cgi.query_string#","?,&,=","/,/,/")>
<CFSET bln_newurl=0>
</cfif>
<cfreturn cfmx7_updated_path_info>
</cffunction>
A percent sign normally introduces an encoded value (for example %2F for a slash), so most servers and services will try to decode whatever follows it. To send a literal percent sign, you therefore need to encode it first.
% encodes as %25.
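As a rough sketch of what that could look like in ColdFusion, the value can be passed through URLEncodedFormat() a second time before it is placed in the path, so that every % becomes %25. The variable names are made up for illustration, and the decoded return_script value is reconstructed from the URL in the question.

```cfml
<cfscript>
// The value that needs to travel as a single path segment.
returnScript = "/manager_pro.cfm/league/268/action/registration/content_action/managesurveys/tabindex/1";

// First pass encodes "/" and similar characters; second pass encodes
// each "%" produced by the first pass as %25.
once  = urlEncodedFormat(returnScript);
twice = urlEncodedFormat(once);

writeOutput(urlEncodedFormat("%")); // %25
</cfscript>
```

The assumption here is that the web server decodes the path once, leaving the single-encoded value for ColdFusion; the receiving code would then call urlDecode() on it. That behaviour depends on the server and connector configuration, so it is worth testing against your own setup.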
I have 2 columns of select boxes. The first (left) is populated by all columns of an uploaded CSV file.
The second (right) is all of the columns of a "Clients" table that they can import to. The number of pairs is determined by the number of total columns in the uploaded file.
Users can then go through and set what columns of their data will update which columns in our Clients table.
For instance, they would set the first box in the left to "Email" and the first box on the right to "Email" and their emails would be updated to the email column in our DB.
If they have a column called "Organization" and we only have "Company" then they can set it accordingly to update.
Basically mapping their imported clients, so they can use a wider range of column name convention.
I already have the loops setup to populate from some help here.
Now I'm trying to update the query.
Here's the selectboxes after the file is uploaded.
<form class="formContent960" id="csvmap" name="csvmap" method="post" action="custom_upload_update.cfm">
<table class="form960" cellpadding="5">
<tbody>
<!--- Set Uploaded file to Array --->
<cfset arrCSV = CSVToArray(CSVFilePath = #form.UploadedFile#,Delimiter = ",",Qualifier = """") />
<!--- Create Key array from column names --->
<cfloop from="1" to="#ArrayLen(arrCSV[1])#" index="t">
<!--- Variable Headers --->
<cfif Len(form.UploadedFile) GTE 5>
<cfoutput>
<select name="upfield[#t#]" class="search" id="Header">
</cfoutput>
<option selected value="">--- Headers Uploaded ---</option>
<cfoutput>
<cfloop from="1" to="1" index="i">
<cfloop from="1" to="#ArrayLen(arrCSV[i])#" index="j">
<option value="#arrCSV[i][j]#">#arrCSV[i][j]#</option>
</cfloop>
</cfloop>
</cfoutput>
</select> =
</cfif>
<!---Column Constants--->
<cfoutput>
<select name="bofield[#t#]" class="search" id="Column">
</cfoutput>
<option selected value="">--- Headers Clients ---</option>
<cfoutput>
<cfloop query="clientsCols">
<option value="#Column_name#">#Column_name#</option>
</cfloop>
</cfoutput>
</select><br /><br />
</cfloop>
</tbody>
<cfoutput>
<input type="hidden" name="filelength" id="filelength" value="#ArrayLen(arrCSV[1])#">
</cfoutput>
<input type="submit" name="csvmapsubmit" id="csvmapsubmit">
</table>
</form>
So I'm thinking I need to set a variable containing the values of the Clients(Right) columns select string to set which columns to update in the query inside of a loop.
Then set the uploaded fields to update the data in those rows inside a sub loop for the values.
Like:
<cfloop>
<cfset bostring = "#bofields#"/>
</cfloop>
<cfloop>
<cfquery name="addclientubmit" datasource="#request.dsn#">
INSERT INTO Clients
(
#bostring#
)
VALUES
(
<cfloop>
#uploaded Values#
</cfloop>
)
</cfquery>
</cfloop>
Not working with proper syntax, just trying to include my general logic of the issue for discussion purposes.
Any help would be appreciated.
Thank you in Advance,
Steve
Alternate Approach
Before I get to your current form, let me mention another option: using your database's import tools, like OPENROWSET or BULK INSERT. The former is a little more flexible because it can be used from a SELECT statement. So you could do a direct insert from the CSV file, with no looping. (I usually prefer to insert into a temp table first, run a few validation queries, then insert/select the data into the primary table. But it depends on the application.)
Anyway, once you have validated the column names, the insert with OPENROWSET is just a single query:
<!--- see below for how to validate list of column names --->
<cfquery name="insertRawData" datasource="yourDSN">
INSERT INTO YourTable ( #theSelectedColumnNames# )
SELECT *
FROM OPENROWSET( 'Microsoft.Jet.OLEDB.4.0'
,'text;HDR=YES;Database=c:\some\path\'
, 'SELECT * FROM [yourFileName.csv]' )
</cfquery>
Current Approach
Form:
Using your current method you would need to read the CSV file twice: once on the "mapping" page and again on the action page. Technically it could be as simple as giving the db column select lists the same name. So the names would be submitted as a comma delimited list:
<cfset csvHeaders = csvData[1]>
<cfloop array="#csvHeaders#" index="headerName">
<cfoutput>
Map file header: #headerName#
to column:
<select name="targetColumns">
<option value="" selected>--- column name---</option>
<cfloop query="getColumnNames">
<option value="#column_name#">#column_name#</option>
</cfloop>
</select>
</cfoutput>
<br>
</cfloop>
Validate Columns:
Then re-validate the list of column names against your db metadata to prevent SQL injection. Do not skip that step! (You could also use a separate mapping table instead, so as not to expose the db schema. That is my preference.)
<cfquery name="qVerify" datasource="yourDSN">
SELECT COUNT(COLUMN_NAME) AS NumberOfColumns
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourTableName'
AND COLUMN_NAME IN
(
<cfqueryparam value="#form.targetColumns#" cfsqltype="cf_sql_varchar" list="true">
)
</cfquery>
<cfif qVerify.recordCount eq 0 OR qVerify.NumberOfColumns neq listLen(form.targetColumns)>
ERROR. Missing or invalid column name(s) detected
<cfabort>
</cfif>
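If you go the mapping-table route, or simply do not want to query the schema, the same check can be done in CFML against a fixed whitelist. This is only a sketch: the column names in allowedColumns and the function name areColumnsAllowed are hypothetical.

```cfml
<cfscript>
// Hypothetical whitelist; in practice this would come from your mapping table.
allowedColumns = "Email,Company,FirstName,LastName";

// Returns true only if every submitted name appears in the whitelist.
function areColumnsAllowed(submitted, allowed) {
    var i = 0;
    if (listLen(submitted) eq 0) {
        return false;
    }
    for (i = 1; i lte listLen(submitted); i++) {
        if (listFindNoCase(allowed, trim(listGetAt(submitted, i))) eq 0) {
            return false;
        }
    }
    return true;
}

// Passes the check: both names are in the whitelist.
okResult = areColumnsAllowed("Email,Company", allowedColumns);

// Fails the check: the second element is not a known column name.
badResult = areColumnsAllowed("Email,Company; DROP TABLE Clients", allowedColumns);
</cfscript>
```

Because only names found in the whitelist get through, nothing a user types into the form can reach the column list of the INSERT statement unchanged.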
Insert Data:
Finally re-read the CSV file and loop to insert each row. Your actual code should contain a LOT more validation (handling of invalid column names, etcetera) but this is the basic idea:
<cfset csvData = CSVToArray(....)>
<!--- row 1 is the header row, so the data starts at row 2 --->
<cfset numberOfRows = arrayLen(csvData)>
<cfset numberOfColumns = arrayLen(csvData[1])>
<cfif numberOfColumns eq 0 OR numberOfColumns neq listLen(form.targetColumns)>
ERROR. Missing or invalid column name(s) detected
<cfabort>
</cfif>
<cfloop from="2" to="#numberOfRows#" index="rowIndex">
<cfquery ...>
INSERT INTO ClientColumnMappings ( #form.targetColumns# )
VALUES
(
<cfloop from="1" to="#numberOfColumns#" index="colIndex">
<cfif colIndex gt 1>,</cfif>
<cfqueryparam value="#csvData[rowIndex][colIndex]#" cfsqltype="cf_sql_varchar">
</cfloop>
)
</cfquery>
</cfloop>
See if this will assist you. Please note that I have modified your initial code for demonstration purposes, but I have marked the changes, so you should be able to wire it back up to test.
This can be tricky... but should give you a good starting point.
Please note that there are new tools available within ColdFusion for processing CSV files. I wrote my utilities in 2008 for CF 8, but they are still in use today. Compare and contrast what works for you.
Hope this helps.
=== cfm page
<!---import csv utility component (modify for your pathing)--->
<cfset utilcsv = CreateObject("component","webroot.jquery.stackoverflow.csvColumnMap.utils_csv_processing_lib")>
<!---declare the csv file (modify for your pathing)--->
<cfset arrCSV = utilcsv.readinCSV(ExpandPath('./'),'Report-tstFile.csv') />
<!---declare the header row column values--->
<cfset headerRow = listToArray(arrCSV[1],',')>
<!---declare the column names query--->
<cfset q = QueryNew('offer,fname,lname,address,city,state,zip',
'CF_SQL_VARCHAR,CF_SQL_VARCHAR,CF_SQL_VARCHAR,CF_SQL_VARCHAR,CF_SQL_VARCHAR,CF_SQL_VARCHAR,CF_SQL_VARCHAR')>
<cfset colList = q.columnList>
<!---form submission processing--->
<cfif isdefined("form.csvmapsubmit")>
<cfset collection = ArrayNew(1)>
<!---collect the column and column map values : this step could be eliminated by
just assigning the arrays in the next step, however this allows reference for
dump and debug--->
<cfloop collection="#form#" item="key">
<cfif FIND('BOFIELD',key) && trim(StructFind(form,key)) neq "">
<cfset fieldid = ReREPLACE(key,"\D","","all")>
<cfset valueKey = 'UPFIELD[' & fieldid & ']'>
<cfset t = { 'column'=StructFind(form,key),'value'=StructFind(form,valueKey) }>
<cfset arrayappend(collection,t)>
</cfif>
</cfloop>
<!---collect the column and column map values : this ensures that the table column is in the same position as the mapped column for the sql statement--->
<cfset tblColsArr = ArrayNew(1)>
<cfset valColsArr = ArrayNew(1)>
<cfloop index="i" from="1" to="#ArrayLen(collection)#">
<cfset arrayappend(tblColsArr, collection[i]['column'])>
<cfset arrayappend(valColsArr, collection[i]['value'])>
</cfloop>
<!---convert the uploaded data into an array of structures for iteration--->
<cfset uploadData = utilcsv.processToStructArray(arrCSV)>
<!---loop uploaded data--->
<cfloop index="y" from="1" to="#ArrayLen(uploadData)#">
<!---create sql command for each record instance--->
<cfset sqlCmd = "INSERT INTO Clients(" & arraytolist(tblColsArr) & ") Values(">
<cfloop index="v" from="1" to="#ArrayLen(valColsArr)#">
<!---loop over the column maps to pull the appropriate value for the table column--->
<cfif isNumeric(trim(uploadData[y][valColsArr[v]])) eq true>
<cfset sqlCmd &= trim(uploadData[y][valColsArr[v]])>
<cfelse>
<cfset sqlCmd &= "'" & trim(uploadData[y][valColsArr[v]]) & "'">
</cfif>
<cfset sqlCmd &= (v lt ArrayLen(valColsArr)) ? "," : ")" >
</cfloop>
<!---perform insert for record--->
<!---
<cfquery name="insert" datasource="">
#REReplace(sqlCmd,"''","'","ALL")# <!---In the event that the quotation marks are not formatted properly for execution--->
</cfquery>
--->
</cfloop>
</cfif>
<form class="formContent960" id="csvmap" name="csvmap" method="post">
<table class="form960" cellpadding="5">
<tbody>
<cfloop from="1" to="#ArrayLen(headerRow)#" index="t">
<tr>
<td>
<!--- Variable Headers --->
<cfif ArrayLen(headerRow) GTE 5>
<cfoutput>
<select name="upfield[#t#]" class="search" id="Header">
<option selected value="">--- Headers Uploaded ---</option>
<cfloop from="1" to="#ArrayLen(headerRow)#" index="j"><option value="#headerRow[j]#">#headerRow[j]#</option></cfloop>
</select> =
</cfoutput>
</cfif>
</td>
<td>
<!---Column Constants--->
<cfoutput>
<select name="bofield[#t#]" class="search" id="Column">
<option selected value="">--- Headers Clients ---</option>
<cfloop list="#colList#" index="li" delimiters=","><option value="#li#">#li#</option></cfloop>
</select>
</cfoutput>
</td>
</tr>
</cfloop>
<tr>
<td> </td>
<td>
<cfoutput>
<input type="hidden" name="filelength" id="filelength" value="#ArrayLen(headerRow)#">
</cfoutput>
<input type="submit" name="csvmapsubmit" id="csvmapsubmit">
</td>
</tr>
</tbody>
</table>
</form>
== utils_csv_processing_lib.cfc
<!---////////////////////////////////////////////////////////////////////////////////
//// CSV File Processing - Read In File /////
//// Return is array with each array item being a row /////
//// 9.22.08 BP /////
//// /////
/////////////////////////////////////////////////////////////////////////////////--->
<cffunction name="readinCSV" access="public" returntype="array">
<cfargument name="fileDirectory" type="string" required="yes">
<cfargument name="fileName" type="string" required="yes">
<!---/// 1. read in selected file ///--->
<cffile action="read" file="#fileDirectory##fileName#" variable="csvfile">
<!---/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
// 2. set csv file to array ***Note: the original csv file ListToArray only used the carriage return/line feed as delimiters, ///
// so each array value/member is a full record in comma delimited format (i.e.: 01, Fullname, Address1, City, etc) //////////--->
<cfset csvList2Array = ListToArray(csvfile, "#chr(10)##chr(13)#")>
<cfset ret = checkCSVRowLengths(csvList2Array)>
<cfreturn ret>
</cffunction>
<!---////////////////////////////////////////////////////////////////////////////////
//// Create Structured Array of CSV FILE /////
//// Return is a structured array using the column header as the struct element name //
//// 9.22.08 BP /////
//// /////
//// ****UPDATED 1.6.09********** /////
//// Added empty field file processing - takes empty value /////
//// and replaces with "nul" /////
//// /////
/////////////////////////////////////////////////////////////////////////////////--->
<cffunction name="processToStructArray" access="public" returntype="array">
<cfargument name="recordFile" type="array" required="yes">
<!---retrieve the placeholder we are setting for strings containing our default list delimiter (",")--->
<cfinvoke component="utils_csv_processing_lib" method="SetGlobalDelimiter" returnvariable="glblDelimiter">
<!---/// 1. get length of array (number of records) in csv file ///--->
<cfset csvArrayLen = ArrayLen(recordFile)>
<!---/////////////////////////////////////////
//// EMPTY VALUE Processing //
//////////////////////////////////////////--->
<!---// a. create array to hold updated file for processing--->
<cfset updatedRowsFnlArr = ArrayNew(1)>
<!---// b. loop entire csv file to process each row--->
<cfloop index="li2" from="1" to="#csvArrayLen#">
<!---// c. grab each column (delimited by ",") for internal loop. *******The value of each array index/item is a comma delimited list*******--->
<cfset currRecRow = #recordFile[li2]#>
<!---/// d. loop each row in file--->
<cfloop list="#currRecRow#" index="updateRowindex" delimiters="#chr(10)##chr(13)#">
<!---// e. find and replace empty column values in list with a set value for processing--->
<!---consolidated for single list output per array index: regenerates a value of val,val,val for a value of val,,val--->
<!---// process middle positions in list //--->
<cfset currRowListed = updateRowindex>
<cfset updatedRowListed = REreplace(currRowListed,",,",",nul,","ALL")>
<cfset updatedRowListed = REreplace(updatedRowListed,",,",",nul,","ALL")>
<!---// process 1st position in list //--->
<cfset frstpos = REFIND(",",updatedRowListed,1)>
<cfif frstpos EQ 1>
<cfset updatedRowListed = REReplace(updatedRowListed,",","nul,","one")>
</cfif>
<!---// process last position in list //--->
<cfset rowStrngLen = Len(updatedRowListed)>
<cfset lastpos = REFIND(",",updatedRowListed,rowStrngLen)>
<cfif lastpos EQ rowStrngLen>
<cfset updatedRowListed = updatedRowListed & "nul">
</cfif>
<!---// f. append current row with updated value of 'nul' for empty list positions to array--->
<cfset ArrayAppend(updatedRowsFnlArr, updatedRowListed)>
</cfloop>
</cfloop>
<!---/// 2. get number of records in updated array--->
<cfset updatedRowsFnlLen = ArrayLen(updatedRowsFnlArr)>
<!---/// 3. set the first item in the array to a variable (at position 1). This will set the entire first record to the variable, delimited by commas ///--->
<cfset getRecColumns = updatedRowsFnlArr[1]>
<!---/// 4. get length of 1st record row, which will tell us how many columns are in the csv file ///--->
<cfset ColumnCount = ListLen(updatedRowsFnlArr[1],",")>
<!---/// 5. create array to hold value for return and start loop of list *****Loop started at 2 to exclude header row***** ///--->
<cfset recordArr = ArrayNew(1)>
<cfloop index="i" from="2" to="#updatedRowsFnlLen#">
<!---/// 6. grab each column (delimited by ",") internal loop. The value of each array index/item is a comma delimited list ///--->
<cfset currRecRow = #updatedRowsFnlArr[i]#>
<!---/// 7. We now create a structure and assign each row value to the corresponding header within the structure ///--->
<cfset recordStruct = StructNew()>
<cfloop index="internal" from="1" to="#ColumnCount#">
<!---conditional to set the 'nul' value added for empty list position values in order to process back to empty values--->
<cfif listGetAt(currRecRow,internal,",") NEQ 'nul'>
<!---check for global placeholder delimiter and reset to ","--->
<cfif FIND(glblDelimiter,listGetAt(currRecRow,internal,",")) NEQ 0>
<cfset resetDelimiterVal = Replace(listGetAt(currRecRow,internal,","),glblDelimiter,',','All')>
<cfelse>
<cfset resetDelimiterVal = listGetAt(currRecRow,internal,",")>
</cfif>
<cfset recordStruct[listGetAt(getRecColumns,internal,",")] = resetDelimiterVal>
<cfelse>
<cfset recordStruct[listGetAt(getRecColumns,internal,",")] = "">
</cfif>
</cfloop>
<!---/// 8. append the struct to the array ///--->
<cfset ArrayAppend(recordArr,recordStruct)>
</cfloop>
<cfreturn recordArr>
</cffunction>
<!---////////////////////////////////////////////////////////////////////////////////
//// SetGlobalDelimiter /////
//// Sets a placeholder for strings containing the primary delimiter (",") /////
//// 02.6.11 BP /////
/////////////////////////////////////////////////////////////////////////////////--->
<cffunction name="SetGlobalDelimiter" access="public" returntype="string" hint="set a placeholder delimiter for the strings that contain the primary list comma delimiter">
<cfset glblDelimiter = "{_$_}">
<cfreturn glblDelimiter>
</cffunction>
===missing cfc function
<!---////////////////////////////////////////////////////////////////////////////////////////////////////////
//// checkCSVRowLengths /////
//// due to some inconsistencies in excel, some csv files drop the delimiter if list is empty /////
//// 7.20.11 BP /////
/////////////////////////////////////////////////////////////////////////////////////////////////////////--->
<cffunction name="checkCSVRowLengths" access="public" returntype="array">
<cfargument name="readArray" type="array" required="yes">
<cfset column_row = readArray[1]>
<cfset column_row_len = listlen(column_row,',')>
<cfloop index="i" from="2" to="#ArrayLen(readArray)#">
<cfset updateRowindex = readArray[i]>
<cfif listlen(updateRowindex) lt column_row_len>
<!---// process middle positions in list //--->
<cfset currRowListed = updateRowindex>
<cfset updatedRowListed = REreplace(currRowListed,",,",",nul,","ALL")>
<cfset updatedRowListed = REreplace(updatedRowListed,",,",",nul,","ALL")>
<!---// process 1st position in list //--->
<cfset frstpos = REFIND(",",updatedRowListed,1)>
<cfif frstpos EQ 1>
<cfset updatedRowListed = REReplace(updatedRowListed,",","nul,")>
</cfif>
<!---// process last position in list //--->
<cfset rowStrngLen = Len(updatedRowListed)>
<cfset lastpos = REFIND(",",updatedRowListed,rowStrngLen)>
<cfif lastpos EQ rowStrngLen>
<cfset updatedRowListed = updatedRowListed & "nul">
</cfif>
<cfelse>
<cfset updatedRowListed = updateRowindex>
</cfif>
<cfif listlen(updatedRowListed) lt column_row_len>
<cfset lc = column_row_len - listlen(updatedRowListed)>
<cfloop index="x" from="1" to="#lc#">
<cfset updatedRowListed = updatedRowListed & ',nul'>
</cfloop>
</cfif>
<cfset readArray[i] = updatedRowListed>
</cfloop>
<cfreturn readArray>
</cffunction>