Sorting Excel rows alphabetically in F# (Office.Interop)

I am using the Excel interop in Visual Studio 2010 to try to sort all of these rows of data alphabetically. Some are already in alphabetical order.
Accountancy Graduate, Trainees Banking, Insurance, Finance
Accountancy Graduate, Trainees Customer Services
Accountancy Graduate, Trainees Education
Accountancy Graduate, Trainees Health, Nursing
Accountancy Graduate, Trainees Legal
Accountancy Graduate, Trainees Management Consultancy
Accountancy Graduate, Trainees Media, New Media, Creative
Accountancy Graduate, Trainees Oil, Gas, Alternative Energy
Accountancy Graduate, Trainees Public Sector & Services
Accountancy Graduate, Trainees Recruitment Sales
Accountancy Graduate, Trainees Secretarial, PAs, Administration
Accountancy Graduate, Trainees Telecommunications
Accountancy Graduate, Trainees Transport, Logistics
The current version of my code is as follows (I'm getting my code to work in interactive before putting it into an fs file).
#r "office.dll"
#r "Microsoft.Office.Interop.Excel.dll"
open System;;
open System.IO;;
open Microsoft.Office.Interop.Excel;;
let app = new ApplicationClass(Visible = true)
let inputBook = app.Workbooks.Open @"C:\Users\simon.hayward\Dropbox\F# Scripts\TotalJobsSort\SortData.xlsx" //work
//let inputBook = app.Workbooks.Open @"C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\SortData.xlsx" //home
let outputBook = app.Workbooks.Add()
let inSheet = inputBook.Worksheets.[1] :?> _Worksheet
let outSheet = outputBook.Worksheets.[1] :?> _Worksheet
let rows = inSheet.UsedRange.Rows.Count;;
let toSeq (range : Range) =
    seq {
        for r in 1 .. range.Rows.Count do
            for c in 1 .. range.Columns.Count do
                let cell = range.Item(r, c) :?> Range
                yield cell
    }
for i in 1 .. rows do
    let mutable row = inSheet.Cells.Rows.[i] :?> Range
    row |> toSeq |> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort |>
    (outSheet.Cells.Rows.[i] :?> Range).Value2 <- row.Value2;;
app.Quit();;
But there is a problem with types. The final line before the quit command
(outSheet.Cells.Rows.[i] :?> Range).Value2 <- row.Value2;;
is underlined in red by IntelliSense, and the error I get is
"This expression is expected to have type seq<string> -> 'a but here has type unit".
I get what VS is trying to tell me, but I have made several attempts to fix this now and I can't seem to get around the type issue.
Can anyone please advise how I can get the pipeline to the correct type so that the output will write to my output sheet?
EDIT 1: This is the full error message that I get with the sorted variable commented out as follows
let sorted = row |> toSeq //|> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort
The error message is:-
System.Runtime.InteropServices.COMException (0x800A03EC): Exception from HRESULT: 0x800A03EC
at System.RuntimeType.ForwardCallToInvokeMember(String memberName, BindingFlags flags, Object target, Int32[] aWrapperTypes, MessageData& msgData)
at Microsoft.Office.Interop.Excel.Range.get_Item(Object RowIndex, Object ColumnIndex)
at FSI_0122.toSeq@34-47.Invoke(Int32 c) in C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\sortExcelScript.fsx:line 36
at Microsoft.FSharp.Collections.IEnumerator.map@109.DoMoveNext(b& )
at Microsoft.FSharp.Collections.IEnumerator.MapEnumerator`1.System-Collections-IEnumerator-MoveNext()
at Microsoft.FSharp.Core.CompilerServices.RuntimeHelpers.takeOuter@651[T,TResult](ConcatEnumerator`2 x, Unit unitVar0)
at Microsoft.FSharp.Core.CompilerServices.RuntimeHelpers.takeInner@644[T,TResult](ConcatEnumerator`2 x, Unit unitVar0)
at <StartupCode$FSharp-Core>.$Seq.MoveNextImpl@751.GenerateNext(IEnumerable`1& next)
at Microsoft.FSharp.Core.CompilerServices.GeneratedSequenceBase`1.MoveNextImpl()
at Microsoft.FSharp.Core.CompilerServices.GeneratedSequenceBase`1.System-Collections-IEnumerator-MoveNext()
at Microsoft.FSharp.Collections.SeqModule.ToArray[T](IEnumerable`1 source)
at Microsoft.FSharp.Collections.ArrayModule.OfSeq[T](IEnumerable`1 source)
at .$FSI_0122.main@() in C:\Users\Simon Hayward\Dropbox\F# Scripts\TotalJobsSort\sortExcelScript.fsx:line 42
Stopped due to error
EDIT 2: Could this problem be due to the toSeq function being designed to turn a whole sheet into a sequence? Where I apply it I only want it to apply to one row.
I have tried limiting the r variable in toSeq to 1, but this didn't help.
Does the fact that my actual data is a jagged array matter? It does not always have 3 entries in each row, it varies between 1 and 4.
EDIT 3:
Here is the current iteration of my code, based on Tomas' suggestions
#r "office.dll"
#r "Microsoft.Office.Interop.Excel.dll"
open System;;
open System.IO;;
open Microsoft.Office.Interop.Excel;;
let app = new ApplicationClass(Visible = true);;
let inputBook = app.Workbooks.Open @"SortData.xlsx" //workbook
let outputBook = app.Workbooks.Add();;
let inSheet = inputBook.Worksheets.[1] :?> _Worksheet
let outSheet = outputBook.Worksheets.[1] :?> _Worksheet
let rows = inSheet.UsedRange.Rows.Count;;
let columns = inSheet.UsedRange.Columns.Count;;
// Get the row count and calculate the name of the last cell e.g. "A13"
let rangeEnd = sprintf "A%d" columns
// Get values in the range A1:A13 as 2D object array of size 13x1
let values = inSheet.Range("A1", rangeEnd).Value2 :?> obj[,]
// Read values from the first (and only) column into 1D string array
let data = [| for i in 1 .. columns -> values.[1, i] :?> string |]
// Sort the array and get a new sorted 1D array
let sorted1D = data |> Array.sort
// Turn the 1D array into 2D array (13x1), so that we can write it back
let sorted2D = Array2D.init 1 columns (fun i _ -> data.[i])
// Write the data to the output sheet in Excel
outSheet.Range("A1", rangeEnd).Value2 <- sorted2D
But because the actual data has a variable number of entries in each row I am getting the standard range exception error (this is an improvement on the HRESULT exception errors of the last few days at least).
So I need to define columns for each individual row, or just bind the length of the row to a variable in the for loop. (I would guess).

It looks like you have an extra |> operator at the end of the line with Seq.sort. This means that the sequence is sorted and then the compiler tries to pass it to the expression that performs the assignment (which does not take any parameter and has type unit).
Something like this should compile (though there may be some other runtime issues):
for i in 1 .. rows do
    let row = inSheet.Cells.Rows.[i] :?> Range
    let sorted = row |> toSeq |> Seq.map (fun x -> x.Value2.ToString()) |> Seq.sort
    (outSheet.Cells.Rows.[i] :?> Range).Value2 <- Array.ofSeq sorted
Note that you do not need to mark row as mutable, because the code creates a copy (and - in my version - assigns it to a new variable sorted).
I also use Array.ofSeq to convert the sorted sequence to an array, because I think the Excel interop works better with arrays.
When setting the Value2 property on a range, the size of the range should be the same as the size of the array that you're assigning to it. Also, depending on the range you want to set, you might need a 2D array.
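To make the shape requirement concrete, here is a minimal sketch in plain Python (it does not talk to Excel, and the helper names as_row and as_column are hypothetical). It shows the two 2D layouts a flat list of values can take: 1 x N for a single row and N x 1 for a single column.

```python
def as_row(values):
    """Wrap a flat list as a 1 x N grid (one row, N columns)."""
    return [list(values)]


def as_column(values):
    """Wrap a flat list as an N x 1 grid (N rows, one column)."""
    return [[v] for v in values]


sorted_values = sorted(["Legal", "Education", "Health, Nursing"])
print(as_row(sorted_values))     # one row holding three cells
print(as_column(sorted_values))  # three rows holding one cell each
```

Whichever layout is used, the target range would need the same number of rows and columns as the grid.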
EDIT Regarding runtime errors, I'm not entirely sure what is wrong with your code, but here is how I would do the sorting (assuming you have just one column with string values and you want to sort the rows):
// Get the row count and calculate the name of the last cell e.g. "A13"
let rows = inSheet.UsedRange.Rows.Count
let rangeEnd = sprintf "A%d" rows
// Get values in the range A1:A13 as 2D object array of size 13x1
let values = inSheet.Range("A1", rangeEnd).Value2 :?> obj[,]
// Read values from the first (and only) column into 1D string array
let data = [| for i in 1 .. rows -> values.[i, 1] :?> string |]
// Sort the array and get a new sorted 1D array
let sorted1D = data |> Array.sort
// Turn the 1D array into 2D array (13x1), so that we can write it back
let sorted2D = Array2D.init rows 1 (fun i _ -> sorted1D.[i])
// Write the data to the output sheet in Excel
outSheet.Range("A1", rangeEnd).Value2 <- sorted2D
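The question mentions that the real data is jagged, with between 1 and 4 entries per row. The following is a rough sketch in plain Python of one way to handle that, not interop code: sort the non-empty cells of each row, then pad every row to the same width so the result is a rectangular grid. The function name sort_rows and the sample values are assumptions made for illustration.

```python
def sort_rows(rows, pad=None):
    """Sort the non-empty cells of each row and pad rows to equal width."""
    # Drop empty cells (None or "") before sorting each row.
    cleaned = [sorted(c for c in row if c not in (None, "")) for row in rows]
    # The widest row decides how many columns the grid needs.
    width = max((len(r) for r in cleaned), default=0)
    return [r + [pad] * (width - len(r)) for r in cleaned]


rows = [
    ["Legal", "Education", None, None],
    ["Telecommunications", "Accountancy", "Health, Nursing", "Banking"],
    ["Transport, Logistics"],
]
for row in sort_rows(rows):
    print(row)
```

Padding keeps the grid rectangular, which matches the earlier point that the array and the range it is written to should have the same size.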

Related

Finding a special row in a data frame in Rcpp ( filtering in Rcpp corresponding to filter() in R)

I am very new to Rcpp. Assume we have two data frames, edge and ref. edge consists of three columns: time, sender and receiver. ref consists of three columns: sender, receiver and teller, where teller holds the row indices from 1 to nrow(ref). You can see an example below. I want to go through each row of edge and find which row of ref has the same sender and receiver. Say the index of that row in ref is 10. I then create a data frame, say dat, with two columns, time and status, and set the corresponding value of dat$status to one, that is, dat$status[10] <- 1. I wrote the code in R as follows:
cdata <- lapply(2:nrow(edge), function(z) {
  welke <- filter(ref, sender == edge[z, "sender"], receiver == edge[z, "receiver"])$teller
  dat <- matrix(0, nrow = nrow(ref), ncol = 2) %>%
    as.data.frame() %>%
    set_colnames(c("time", "status"))
  dat$status[welke] <- 1
  return(dat)
}) %>%
  dplyr::bind_rows()
I do not know how to translate this into Rcpp.
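The matching step described above can be sketched language-independently as a dictionary lookup. The Python below is only an illustration of that logic, not Rcpp: the function names are hypothetical, ref is reduced to (sender, receiver) pairs, and positions are zero-based here, whereas teller in the R code is one-based.

```python
def build_index(ref_rows):
    """Map each (sender, receiver) pair to its zero-based row position."""
    return {pair: i for i, pair in enumerate(ref_rows)}


def status_for(index, n_rows, sender, receiver):
    """Return a 0/1 list with a 1 at the position of the matching ref row."""
    status = [0] * n_rows
    hit = index.get((sender, receiver))
    if hit is not None:
        status[hit] = 1
    return status


ref = [(1, 2), (1, 3), (2, 1), (2, 3)]
index = build_index(ref)
print(status_for(index, len(ref), 2, 1))
```

Building the index once and reusing it for every row of edge avoids rescanning ref on each lookup.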

Splitting the output obtained by Counter in Python and pushing it to Excel

I am using Counter to count every word in the descriptions of 20,000 products and see how many times each word repeats; for example, 'pipette' repeats 1282 times. To do this I have split column A into several columns: P, Q, R, S, T, U and V.
df["P"] = df["A"].str.split(n=10).str[0]
df["Q"] = df["A"].str.split(n=10).str[1]
df["R"] = df["A"].str.split(n=10).str[2]
df["S"] = df["A"].str.split(n=10).str[3]
df["T"] = df["A"].str.split(n=10).str[4]
df["U"] = df["A"].str.split(n=10).str[5]
df["V"] = df["A"].str.split(n=10).str[6]
This shows the split products.
Then I count each of the columns individually and add the results to get the total number of words.
d = Counter(df['P'])
e = Counter(df['Q'])
f = Counter(df['R'])
g = Counter(df['S'])
h = Counter(df['T'])
i = Counter(df['U'])
j = Counter(df['V'])
m = d+e+f+g+h+i+j
print(m)
This is the image of the output I obtained using Counter.
Now I want to transfer the output into an Excel sheet with the keys in one column and the values in another.
Am I using the right method to do so? If yes, how should I push them into different columns?
Note: the length of each key is different.
Also, I want to convert all the items of column 'A' to lower case so that the counter does not count the same item twice. How should I go about it?
I've been learning Python for just a couple of months, but I'll give it a shot. I'm sure there are better ways to perform the same action. Maybe we can both learn something from this question. Let me know how this turns out. Good luck.
import pandas as pd
# m is the combined Counter built in the question
num = len(m.keys())
df = pd.DataFrame(columns=['Key', 'Value'])
# Add one row per (key, value) pair
for i, j, k in zip(range(num), m.keys(), m.values()):
    df.loc[i] = [j, k]
df.to_csv('Your_Project.csv')
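For the lower-casing part of the question, a small sketch using only the standard library may help. It counts every word of every description after converting it to lower case, which differs from the question's approach of splitting into a fixed set of columns first. The function name count_words and the sample descriptions are made up for illustration.

```python
from collections import Counter


def count_words(descriptions):
    """Count words across all descriptions, ignoring case."""
    counts = Counter()
    for text in descriptions:
        # Lower-case first so "Pipette" and "PIPETTE" share one key.
        counts.update(text.lower().split())
    return counts


descriptions = ["Pipette tip box", "pipette Stand", "Glass PIPETTE"]
counts = count_words(descriptions)
rows = counts.most_common()  # list of (key, value) pairs, most frequent first
for key, value in rows:
    print(key, value)
```

The rows list already has the two-column shape asked for, so it could be handed to csv.writer(...).writerows(rows) or to pandas.DataFrame(rows, columns=['Key', 'Value']).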

Get value from query string in Python 3 without the [' '] showing up in the value

I have the following code in a Python 3 http server parse out a URL and then parse out a query string:
parsedURL = urlparse(self.path)
parsed = parse_qs(parsedURL.query)
Say that parsedURL.query in this case turns out to be x=7&y=3. I want to get the 7 and the 3 out and set them equal to variables x and y. I've tried both
x = parsed['x']
y = parsed['y']
and
x = parsed.get('x')
y = parsed.get('y')
Both of these solutions come up with x = ['7'] and y = ['3'], but I don't want the brackets and single quotes. I want just the values 7 and 3, and I want them to be integers. How do I get the values out and get rid of the brackets and quotes?
Would simply:
x = int(parsed['x'][0])
y = int(parsed['y'][0])
or
x = int(parsed.get('x')[0])
y = int(parsed.get('y')[0])
serve your purpose? You should of course have suitable validation checks, but all you want to do is convert the first element of the returned list to an int, so this code will do the business.
This is because get() returns a list of values (I presume!), so if you try parsing url?x=1&x=2&x=foo you would get back a list like ['1', '2', 'foo']. Normally there is only one (or zero, of course) instance of each variable in a query string, so we just grab the first entry with [0].
Note the documentation for parse_qs() says:
Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.
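As one possible shape for the validation checks mentioned above, here is a minimal sketch. The helper int_param is hypothetical; it returns a default when the parameter is missing or is not a valid integer, and otherwise converts the first value in the list.

```python
from urllib.parse import urlparse, parse_qs


def int_param(query, name, default=None):
    """Return the first value of a query parameter as an int, or default."""
    values = parse_qs(query).get(name)
    if not values:
        return default  # parameter missing (or blank, which parse_qs drops)
    try:
        return int(values[0])
    except ValueError:
        return default  # present but not an integer


query = urlparse("/path?x=7&y=3").query
x = int_param(query, "x")
y = int_param(query, "y")
print(x, y)
```

Returning a default rather than raising is a design choice for this sketch; a server might prefer to respond with an error when a required parameter is invalid.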

MATLAB: Write Dynamic matrix to Excel

I'm using MATLAB R2009a and following this example:
http://uk.mathworks.com/help/matlab/matlab_external/using-a-matlab-application-as-an-automation-client.html
I'd like to edit it so that I can write a matrix of unknown size into a column in an excel sheet, therefore not explicitly stating the range. I've attempted it this way:
%Put MATLAB data into the worksheet
Hop = [47; 53; 93; 10]; %Pretend I don't know what size this matrix is.
p = length(Hop);
p = strcat('A',num2str(p));
eActivesheetRange = e.Activesheet.get('Range','A1:p');
eActivesheetRange.Value = Hop;
However, this errors out. I've tried several variations of this to no avail. For example, using 'A:B' puts this array in columns A and B in excel and a NAN into every cell beyond my array. As I only want column A filled, using simple ('Range','A') errors out also.
Thanks in advance for any advice you can offer.
You're having issues because you're trying to use your variable p in a string directly
range = 'A1:p';
'A1:p'
This isn't going to work, you want to include the value of p. There are a number of ways you can do this.
In the code you have provided, you have already set p = 'A10' so if you wanted to append that to your range, you'd perform string concatenation
p = 'A10';
range = strcat('A1:', p);
I personally prefer to use sprintf to place the number directly into my strings rather than concatenating a bunch of strings.
p = 10;
range = sprintf('A1:A%d', p)
'A1:A10'
So if we adapt your code to use this we should get
range = sprintf('A1:A%d', numel(Hop));
eActivesheetRange = e.Activesheet.get('Range', range);
eActivesheetRange.Value = Hop;
Also just to be a little explicit, I would use numel rather than length as length is ambiguous. Also, I would flatten Hop into a column vector just to make sure that it's the proper dimension to be written to the spreadsheet.
eActivesheetRange.Value = Hop(:);
Essentially, the idea is to replace xx in 'B1:Bxx' with the number of elements in your matrix.
I tried this:
e = actxserver('Excel.Application');
eWorkbook = e.Workbooks.Add;
e.Visible = 1;
eSheets = e.ActiveWorkbook.Sheets;
eSheet1 = eSheets.get('Item',1);
eSheet1.Activate;
A = [1 2 3 4];
eActivesheetRange = e.Activesheet.get('Range','A1:A4');
eActivesheetRange.Value = A;
The above is directly from the link you shared. What you are trying to do fails because the p inside 'A1:p' is part of the string literal, so the range contains the letter p rather than the value of your variable. To avoid this, build the range string from the value instead:
B = randi([0 10],10,1)
eActivesheetRange = e.Activesheet.get('Range',['B1:B' num2str(numel(B))]);
eActivesheetRange.Value = B;
Here, num2str(numel(B)) will pass in a string, which is the number of elements in B. This is variable in the sense that it depends on the number of elements in B.
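The same idea, building the range address from the element count instead of hard-coding it, can be sketched outside MATLAB as well. The Python helper below is purely illustrative; the name column_range and its first_row parameter are assumptions, not part of any Excel API.

```python
def column_range(column, count, first_row=1):
    """Build an A1-style address covering `count` cells in one column."""
    if count < 1:
        raise ValueError("count must be at least 1")
    last_row = first_row + count - 1
    return f"{column}{first_row}:{column}{last_row}"


hop = [47, 53, 93, 10]  # pretend the length is not known in advance
print(column_range("A", len(hop)))
```

Computing the last row from the data keeps the range exactly as long as the values written to it.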

F# insert row into Excel dynamically

I need to insert rows in Excel dynamically. I am generating an invoice which can have a varying number of transactions and therefore requires a row for each transaction.
I want to start with one row (row 17) and duplicate it for n number of transactions.
Can anyone shed any light on doing this in F#?
I wouldn't necessarily recommend this approach, but if you're OK with member access being resolved dynamically at run time by means of F#'s dynamic operator (an implementation of which can be found here), then you can do Interop without any reference to a specific library.
let xlApp = System.Runtime.InteropServices.Marshal.GetActiveObject "Excel.Application"
let rng : obj = xlApp?ActiveWorkbook?ActiveSheet?Rows(17)
rng?Copy()
rng?Insert()
That is, copy a row to the clipboard, and insert it at the same position.
Edit
It turns out that many Reflection-based definitions of the dynamic operator aren't suited to interop, including the one linked above. Supplying my own:
open Microsoft.FSharp.Reflection

let (?) (o: obj) name : 'R =
    let bindingFlags = System.Reflection.BindingFlags.GetProperty
    let invocation args =
        o.GetType().InvokeMember(name, bindingFlags, null, o, args)
    let implementation args =
        let argType, resType = FSharpType.GetFunctionElements typeof<'R>
        if argType = typeof<unit> then [| |]
        elif FSharpType.IsTuple argType then FSharpValue.GetTupleFields args
        else [| args |]
        |> invocation
        |> fun res -> if resType = typeof<unit> then null else res
    if FSharpType.IsFunction typeof<'R> then
        FSharpValue.MakeFunction(typeof<'R>, implementation)
    else invocation null
    |> unbox<'R>
We can now define a function in order to insert a variable number of copied rows.
let copyAndInsertRow17 n =
    if n > 0 then
        let xlApp =
            System.Runtime.InteropServices.Marshal.GetActiveObject "Excel.Application"
        let sheet = xlApp?ActiveWorkbook?ActiveSheet
        sheet?Rows(17)?Copy()
        sheet?Range(sheet?Rows(17 + 1), sheet?Rows(17 + n))?Insert()
