What formula to use for this task? (Excel)

So for a recent interview I was given a set of tasks to carry out on a workbook. I'm now going through the tasks that I couldn't understand or complete. I was wondering if anyone could help me through it or know a site maybe better suited for this sort of thing?
(Screenshots of the question and the dataset were attached here.)

So here's the original formula:
=IF(E2="national","10",IF(E2="local","25",""))
And they want you to reprice local calls so they're 5p per minute with a price cap of 25.
Easiest way to do that is to nest another IF. Note that the original returns the prices as text ("10", "25"); keeping them numeric lets the capped price stay a number:
=IF(E2="national",10,IF(E2="local",IF(K2>5,25,K2*5),""))
Equivalently, MIN expresses the 25p cap without the extra IF: =IF(E2="national",10,IF(E2="local",MIN(K2*5,25),""))
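For clarity, here is the same capped-pricing rule sketched in Python. The column meanings (call type in E, minutes in K) are an assumption read off the formula above:

```python
def call_price(call_type, minutes):
    """Price a call: national is a flat 10p; local is 5p per minute,
    capped at 25p; anything else prices as blank."""
    if call_type == "national":
        return 10
    if call_type == "local":
        return min(minutes * 5, 25)  # the cap kicks in past 5 minutes
    return ""
```

So a 3-minute local call prices at 15p, and an 8-minute one is capped at 25p.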
In the future, please write a question title that others will be able to easily search for when they have the same issue.

Related

Spotfire In Statements, Ranges, Etc

I come from an SQL background, so trying to limit data coming into the Spotfire project using the expression builder has become a challenge. For instance, I can't seem to find an example of the equivalent of the following:
SELECT *
FROM TABLE
WHERE COLUMN IN ('X','Y','Z')
It's probably staring me in the face, but the only examples I can find are for IF and CASE statements, which are not what I want.
Another example is trying to find an equivalent of a BETWEEN statement (i.e., WHERE COLUMN BETWEEN 1 AND 20). I found the RANGE function, but I have not found any good examples of how to use it.
Does anyone have suggestions for sites that provide examples of what should be easy expressions to create? The online Spotfire guide is not that helpful, and my Google-fu has failed to find anything.
If you could give me some pointers on the two statements above, I would really appreciate it!
Thanks!
I managed to get some pointers from friends. Gaia, thank you: YouTube was one suggestion I received. I also heard about community.tibco.com, which I will be checking out.
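For later readers: both SQL constructs being asked about are just boolean tests, whatever the expression language happens to be. A language-neutral sketch of the filtering logic, in Python (the rows and column names here are made up):

```python
rows = [
    {"COLUMN": "X", "VALUE": 5},
    {"COLUMN": "Q", "VALUE": 15},
    {"COLUMN": "Z", "VALUE": 30},
]

# WHERE COLUMN IN ('X','Y','Z')  ->  a membership test
in_match = [r for r in rows if r["COLUMN"] in ("X", "Y", "Z")]

# WHERE VALUE BETWEEN 1 AND 20   ->  an inclusive range test
between_match = [r for r in rows if 1 <= r["VALUE"] <= 20]
```

In a limiting expression you would express the same things as chained OR-equalities and paired inequality comparisons.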

On batch inserts make sure checkid/promise resolves before next call

Throwing this out there but I know vitaly is such a hawk that he'll probably give me the answer or at least a solid hint ;)
I am basically uploading a batch of records and checking if the Company has already been added to a Company table. If not, I'll add the Company and then add all the records linking with the new CompanyID.
This isn't a pg-promise-specific issue, but some of the dialogue at https://github.com/vitaly-t/pg-promise/blob/master/examples/select-insert.md shows it's a real design concern, and people were still trying to come up with an elegant solution as of last October.
I'm still new to asynchronous code, but my gut says that while I could insert a delay, or chain the promises and resolve them in sequence (https://daveceddia.com/waiting-for-promises-in-a-loop/ or "Resolve promises one after another (i.e. in sequence)?"), the single-query alternatives vitaly mentions may be the real way to do this without locking down the event loop.
I'll hack on this today, but I also want to do it in an elegant way, since it's such a common pattern.
Thanks to vitaly in advance ;)
PS you rock vitaly
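Not pg-promise, but the get-or-create pattern itself can be sketched synchronously; here in Python with sqlite3 (the table and column names are invented for the example). The in-memory cache is what guarantees a company is only resolved once per batch; in Postgres, the single-query alternative discussed in the select-insert example would be along the lines of INSERT ... ON CONFLICT ... RETURNING.

```python
import sqlite3

def get_or_create_company(conn, cache, name):
    """Resolve a company name to its id, inserting it once if missing.
    The cache ensures a batch never issues two inserts for one company."""
    if name in cache:
        return cache[name]
    row = conn.execute(
        "SELECT id FROM company WHERE name = ?", (name,)).fetchone()
    if row is None:
        cur = conn.execute("INSERT INTO company (name) VALUES (?)", (name,))
        cache[name] = cur.lastrowid
    else:
        cache[name] = row[0]
    return cache[name]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.execute("CREATE TABLE record (company_id INTEGER, payload TEXT)")

cache = {}
batch = [("Acme", "r1"), ("Acme", "r2"), ("Globex", "r3")]
for company, payload in batch:
    cid = get_or_create_company(conn, cache, company)
    conn.execute("INSERT INTO record (company_id, payload) VALUES (?, ?)",
                 (cid, payload))
```

The async version is the same shape: resolve the company id (awaiting the check/insert) before inserting the dependent records, rather than firing everything in parallel.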

Data and structure cleansing of Excel sheets

I have over 6,000 Excel sheets. While all the sheets describe the same thing, each is formatted independently. They all have between 9 and 13 columns, but the columns are out of order, the column names are independently misspelled, and a sheet may or may not have a second, or even third, header row.
I am currently trying, in Python, to read cells in a left-down-right-up motion to locate the same data, but there are simply too many differences in structure names, column ordering, and data definitions to lock them down one at a time. Is there a tool that can read these documents and conform them to a single format via a rapid mapping function?
Thanks much.
Wow, it's the Ultimate Data Horror Story.
I want to ask how you ever let it get this way... but I actually don't want to know; I'm already going to have nightmares about this.
It's like that Hoarding show on TV, but with data.
No, I'm afraid that if you can't even identify a pattern then there's no magic function that will be able to either.
But that doesn't mean it's a lost cause. It's just going to need some human interaction, and there are ways to minimize the pain.
What you need is a custom interface that will load the documents one by one, and will walk a human through clicking each relevant column or area, and then automatically load the next document.
There would also need to be buttons for sorting out things like obvious garbage sheets (blanks?), "unknowns" (which get put in a folder for deeper research later), and other "unpredictables" that may come up during the process.
Also, perhaps once you get into it, you'll notice a pattern you're not thinking of, like maybe *"the person who handled the files from 2002 to 2004 set them up this way"*, or, "when Budget is misspelled, it's always either Bugdet or Budteg".
In this scope, little patterns like that can make a big difference.
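Little misspelling patterns like those are exactly what fuzzy matching handles well, so the custom interface could pre-suggest a mapping before the human confirms it. A minimal sketch with Python's difflib (the canonical column names here are invented):

```python
import difflib

# The target schema you want every sheet mapped onto (example names).
CANONICAL = ["Budget", "Department", "Quarter"]

def normalize_header(raw):
    """Map a possibly misspelled column header onto the canonical
    name, or None if nothing is close enough to trust."""
    match = difflib.get_close_matches(raw.strip().title(), CANONICAL,
                                      n=1, cutoff=0.6)
    return match[0] if match else None
```

Both "Bugdet" and "Budteg" land on "Budget"; anything unrecognizable returns None and goes to the human, so the tool only automates the easy 90%.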
Depending on your coding skills, you may or may not need outside assistance with this. I assume this is not data that can just get thrown out, or you wouldn't be asking...
If each document took an average of 20 seconds to process, that would be about 33 hours in total. An hour a day and it's done in a month; someone full-time, and it's done in a week.
Do you have a budget you can throw at this? Data archaeology is an actual thing! Hell, I'll do it for you for the right price... (wouldn't break the bank, depending on how urgent it is, of course!)
Either way, this ain't going to be fun for "someone"...

Generating multiple optimal solutions using Excel solver

Is there a way to get all optimal solutions when you are solving some problem with Excel Solver (Simplex LP method)?
If not, what is the best way, or add-in for Excel, to solve it, and how would I convert existing VBA code to use this new approach?
Actually, I have found a way to do this with Excel Solver. It is not optimal in terms of time consumption, but that is not an issue for me.
If you can assign a unique id to each possible solution in some way (which is true in my case), then for each solution you find, you can check whether there is another solution with the same value but a different id, as follows:
1. Find the first optimal solution and save its id and result; call them origID and origRes.
2. Check whether there is a solution with id < origID and result = origRes.
3. If yes, treat that newId as the initial id and repeat step 2 until no solution satisfies the criteria.
4. Do the same with the condition id > origID and result = origRes.
5. Once you are sure you have found all solutions with the optimal result origRes, look for the best solution that is not optimal: add the condition that the new result must be <= origRes - 0.01 (I know all my solutions have two decimal places).
6. Go back to step 2.
I know this is not the best way, but I usually do not need more than 100 solutions, and currently I can get them in about 2 minutes, which is acceptable for me.
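The id-sweeping steps above can be sketched in Python against a toy brute-force "solver", where each solve() call stands in for one Excel Solver run with the extra constraints added (the candidate data is invented):

```python
def solve(candidates, pred):
    """Stand-in for one Solver run: return the feasible candidate with
    the highest result, or None. Candidates are (id, result) pairs."""
    feasible = [c for c in candidates if pred(c)]
    return max(feasible, key=lambda c: c[1], default=None)

def all_optima(candidates):
    """Collect every solution tied at the optimal result by sweeping
    ids below, then above, the first optimum found."""
    first = solve(candidates, lambda c: True)
    orig_id, orig_res = first
    found = [first]
    for direction in (-1, 1):
        cur = orig_id
        while True:
            if direction < 0:
                nxt = solve(candidates,
                            lambda c, i=cur: c[0] < i and c[1] == orig_res)
            else:
                nxt = solve(candidates,
                            lambda c, i=cur: c[0] > i and c[1] == orig_res)
            if nxt is None:
                break
            found.append(nxt)
            cur = nxt[0]
    return sorted(found)
```

With real Solver, each inner call is a re-solve with the id bound and the result-equality constraint added, which is why the whole sweep is slow but simple.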
Although this looks easy, it is actually not such an easy question. Even the definition of "all possible optimal solutions" is not clear; there may be infinitely many of them. Asking for "all basic feasible solutions" (i.e. corner points) is better posed. To my knowledge there are no solvers providing this, and I do not know of a really simple technique to enumerate all optimal bases either.
One interesting approach is to use a MIP formulation to enumerate all optimal bases:
Sangbum Lee, Chan Phalakornkule, Michael M. Domach, Ignacio E. Grossmann, "Recursive MILP model for finding all the alternate optima in LP models for metabolic networks," Computers and Chemical Engineering 24 (2000) 711-716.

What is the best way to search multiple sources simultaneously?

I'm writing a phonebook search that will query multiple remote sources, but I'm wondering how best to approach the task.
The easiest way is to take the query, start a thread per remote source (limiting each to, say, 10 results), wait for the results from all the threads, aggregate the lists into a total of 10 entries, and return them.
BUT... which remote source is more important if every source returns at least 10 results? Then I would have to run a search over the search results themselves. While that would yield accurate information, it seems inefficient and unlikely to scale well.
Is there a solution commercial or open source that I could use and extend, or is there a clever algorithm I can use that I've missed?
Thanks
John, I believe what you want is federated search. I suggest you check out Solr as a framework for this. I agree with Nick that you will have to evaluate the relative quality of the different sources yourself, and build a merge function. Solr has some infrastructure for this, as this email thread shows.
To be honest, I haven't seen a ready-made solution, but this is why we programmers exist: to create a solution when one is not readily available :-)
The way I would do it is similar to what you describe: using threads. If this is a web application, then AJAX is your friend for speed and usability; for a desktop app, the GUI representation is not even an issue.
It sounds like you can't determine or guess up front which source is best in terms of reliability, speed, and number of results, so you need to set up your program to determine the best results on the fly. Let's say you have 10 data sources, and therefore 10 threads. When you fire up your threads, wait for the first one to return with results > 0; this becomes your "master" result set. As the other threads return, compare them to the "master" set and add any new results. There is really no way to avoid this if you want to provide unique results.
You can start displaying results as soon as you have your first thread. You don't have to update the screen with every new result as it comes in, but if it takes some time, the user may become agitated; you can just show some sort of indicator that more results are available, if you have more than 10, for instance.
If you only have a few sources, like 10, and you limit the number of results you wait for per source to, say, 10, it really shouldn't take much time to sort through them in any programming language. Also make sure you can recover if a remote source is unavailable: if you wait for all 10 sources to come back before displaying data, you may be in for a long wait when one of them is down.
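The fan-out-and-merge described above is a small amount of code in most languages; a minimal sketch in Python with a thread pool (the source functions and names are stand-ins for your remote queries):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def federated_search(query, sources, per_source=10, total=10):
    """Fan the query out to every source on its own thread, merge
    results in arrival order, dedupe, and cap at `total` entries.
    A source that raises is skipped rather than failing the search."""
    merged, seen = [], set()
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(src, query, per_source) for src in sources]
        for fut in as_completed(futures):
            try:
                results = fut.result()
            except Exception:
                continue  # a dead source must not sink the whole search
            for entry in results:
                if entry not in seen:
                    seen.add(entry)
                    merged.append(entry)
    return merged[:total]
```

In real use you would add a per-source timeout and replace the arrival-order merge with a proper ranking function, which is the hard part the answers above are pointing at.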
The other approach is to fool the user a little, the way airfare search sites do: they make you wait a few seconds while they collect and sort results. I really like Kayak.com's implementation, as it makes me feel like it's actually doing something, unlike some other sites.
Hope that helps.
