I'm doing some work across two visualisation packages, primarily Gephi but also a bespoke package called Linkoder.
They can both use .CSV formats, but data must be laid out differently, and I'm trying to find a way to do this quickly in Excel. Transpose, Offset, Direct commands all seem to come close, but not quite.
Based on my transcript, I need to convert this matrix:
Statement | links
No1 | 5 4 3 2 1
No2 | 3 1
No3 | 6 4 2
No4 | 5 2 1
...to this target-link format:
Target | Link
No1 | 5
No1 | 4
No1 | 3
No1 | 2
No1 | 1
No2 | 3
No2 | 1
No3 | 6
I am struggling to find a simple way to do this, but that's likely because I'm less adept at Excel formulas than I would like to be.
Can anyone refer me to a command (or set of commands) that can quickly convert between these formats? I'm looking at thousands of lines of links to convert...
Thanks in advance!
hey_arno
There are ways to do this in Excel, but AFAIK they are only semi-automatic. I would recommend having a look at OpenRefine, a tool for manipulating and tidying datasets. It can read from many sources, including Excel. What you need to do is split the Link column on spaces and then transpose the resulting columns.
Check this tutorial and scroll down to "Transposing columns" for instructions on how to do it.
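If a script is an option, the same split-then-transpose idea can be sketched in a few lines of Python. This is not the OpenRefine procedure itself, just an illustration of the reshaping; the function names `explode_links` and `to_csv` are made up for this example, and the sample rows are the ones from the question.

```python
import csv
import io


def explode_links(rows):
    """Turn (statement, "5 4 3") pairs into one (statement, link) pair per link."""
    pairs = []
    for statement, links in rows:
        # split() with no argument splits on any run of whitespace
        for link in links.split():
            pairs.append((statement, link))
    return pairs


def to_csv(pairs):
    """Render the pairs as CSV text with a Target,Link header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Target", "Link"])
    writer.writerows(pairs)
    return buffer.getvalue()


# Sample data copied from the question
source = [
    ("No1", "5 4 3 2 1"),
    ("No2", "3 1"),
    ("No3", "6 4 2"),
    ("No4", "5 2 1"),
]

print(to_csv(explode_links(source)))
```

For thousands of rows you would read the source pairs from the exported .CSV rather than typing them in, but the reshaping step stays the same.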
When using the summary function "Group" in Saved Searches (SS), is it possible to show the total count inline or even use it inline? For example, I have a SS that counts the number of cases closed in a certain date range and it groups by the assigned employee and the total is listed at the bottom, as per usual. However, when trying to calculate the percent of the total each employee closed, they all show as 100%.
Here is a picture of the results and I have also added the formulas I am currently using. Here is what it looks like when I'm editing the search. The right most columns were my attempt at getting the total inline.
I'm fairly certain this is because I am grouping by the employees (or else there would be almost 3k lines in the report), but I don't think there is a better way to solve that problem other than by grouping by the employee.
We have tried doing an actual report in NetSuite (as opposed to saved search), however, the report times out quickly and we are hoping for a quicker solution. We also considered a KPI scorecard, but the issue would be that we would need to make a SS for each employee which isn't a good long-term solution due to team changes.
Is there a way of calculating the percent of the total when using grouping? Sorry for the long post, I was trying to be as descriptive as possible. The goal is to see how much (percent wise) each employee contributed to the total cases closed.
Take your Sum column that counts the closed cases and duplicate it, but add the function % of Total, like so:
| Field | Summary Type | Function | Formula |
| ----- | ------------ | -------- | ---------------------------------------------------- |
|Formula Numeric | Sum | | Case When {status} like 'Closed%' then 1 else 0 End |
|Formula Numeric | Sum | % of Total | Case When {status} like 'Closed%' then 1 else 0 End |
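To make the arithmetic behind the second column explicit, here is a rough Python sketch of what "count per employee divided by the grand total" works out to. It is not NetSuite code; the function name `percent_of_total` and the sample cases are invented for illustration, and `startswith("Closed")` stands in for the `like 'Closed%'` test in the formula.

```python
def percent_of_total(cases):
    """Map each employee to their share (in percent) of all closed cases."""
    closed = {}
    for employee, status in cases:
        # Same idea as: Case When {status} like 'Closed%' then 1 else 0 End
        closed[employee] = closed.get(employee, 0) + (1 if status.startswith("Closed") else 0)
    total = sum(closed.values())
    # Guard against an empty result set to avoid dividing by zero
    return {e: (100.0 * n / total if total else 0.0) for e, n in closed.items()}


# Hypothetical sample data: (assigned employee, case status)
cases = [
    ("Ann", "Closed"),
    ("Ann", "Closed - Resolved"),
    ("Bob", "Closed"),
    ("Bob", "Open"),
]

for employee, share in percent_of_total(cases).items():
    print(f"{employee}: {share:.1f}%")
```

The point is that the denominator is the total over all groups, not the group's own sum, which is why dividing a grouped column by itself shows 100% for everyone.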
I would greatly appreciate your help, as I'm having quite a struggle with this. What I need is described below. To begin, I have a data set with many individuals and information about them. There are a few columns that I'm interested in.
Table 1:
So each Individual is either labeled as Free, Arc 1, Arc 2 or Arc 3. Each individual also has a number of people associated with it and lastly a cost.
Individual | # of people | Cost | Type  | Compliant with Costs?
A          | 3           | 45   | Free  | Yes/No?
B          | 2           | 57   | Arc 2 |
Table 2:
I then have a reference table, broken out by type. For example, Free can have 1, 2 or 3 people, and the cost can be between 20-30 dollars for 1, 30-40 for 2 and 40-50 for 3.
            | Free  | Arc 1 | Arc 2 | Arc 3
# of people | Cost  | Cost  | Cost  | Cost
1           | 20-30 | 30-40 | 60-70 | 90-100
2           | 30-40 | 40-50 | 70-80 | 100-110
3           | 40-50 | 60-70 | 80-90 | 110-120
So I want to take each individual from the first table and check: if Individual A is Free and has 3 people, is their cost between 40 and 50 dollars? If so, "Yes"; if not, "No".
I know this will probably use some IF formulas, possibly many. I tried INDEX(MATCH()) and such too, but couldn't figure it out. If you could help, that would be greatly appreciated.
Below is a sample Excel file. It looks easy with just two individuals, but there are hundreds, so I'm hoping there is an easy formula. Again, any help is greatly appreciated.
Excel Screenshot Link (Same as example above)
If you change your reference table as follows, you can easily use SUMIFS to pull the min and the max and check whether your cost is in between:
# | Type  | Min | Max
1 | Free  | 20  | 30
1 | Arc 1 | 60  | 70
2 | Free  | 30  | 40
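As a rough illustration of why the long layout is easier to work with, here is the same lookup sketched in Python, with one (number of people, type) key per min/max pair. The names `COST_RANGES` and `is_compliant` are invented for this example, and the bounds are copied from the table in the question, assuming both ends of each range are inclusive.

```python
# (number of people, type) -> (min cost, max cost), taken from the question's table
COST_RANGES = {
    (1, "Free"): (20, 30), (2, "Free"): (30, 40), (3, "Free"): (40, 50),
    (1, "Arc 1"): (30, 40), (2, "Arc 1"): (40, 50), (3, "Arc 1"): (60, 70),
    (1, "Arc 2"): (60, 70), (2, "Arc 2"): (70, 80), (3, "Arc 2"): (80, 90),
    (1, "Arc 3"): (90, 100), (2, "Arc 3"): (100, 110), (3, "Arc 3"): (110, 120),
}


def is_compliant(people, kind, cost):
    """Return "Yes" if cost lies within the range for this people/type pair."""
    bounds = COST_RANGES.get((people, kind))
    if bounds is None:
        # No matching row in the reference table
        return "No"
    low, high = bounds
    # Assumption: both bounds are inclusive
    return "Yes" if low <= cost <= high else "No"


print(is_compliant(3, "Free", 45))   # Individual A from the question
print(is_compliant(2, "Arc 2", 57))  # Individual B from the question
```

In the spreadsheet, two SUMIFS calls keyed on the same pair of columns play the role of the dictionary lookup here.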
I agree that the best way would be to restructure your reference table, but if you'd like to know how to get your answer with the table as is, you can use a combination of =INDEX(MATCH(),MATCH()), =LEFT() and =MID(), as in the following example:
The formula in E2 is:
=IF(AND(C2>= (LEFT(INDEX($I$1:$M$5,MATCH(B2,$I$1:$I$5,0),MATCH(D2,$I$1:$M$1,0)),FIND("-",INDEX($I$1:$M$5,MATCH(B2,$I$1:$I$5,0),MATCH(D2,$I$1:$M$1,0)))-1)*1), C2<=MID(INDEX($I$1:$M$5,MATCH(B2,$I$1:$I$5,0),MATCH(D2,$I$1:$M$1,0)),FIND("-",INDEX($I$1:$M$5,MATCH(B2,$I$1:$I$5,0),MATCH(D2,$I$1:$M$1,0)))+1,256)*1),"Yes","No")
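The formula is long mostly because it repeats the same INDEX/MATCH lookup four times. The part that does the real work is splitting a string like "40-50" at the dash and comparing the cost to both halves. A minimal Python sketch of just that step (the names `parse_range` and `in_range` are invented for illustration):

```python
def parse_range(text):
    """Split a string such as "40-50" into its numeric lower and upper bounds."""
    # Counterpart of LEFT(..., FIND("-", ...) - 1) and MID(..., FIND("-", ...) + 1, 256)
    low, high = text.split("-", 1)
    return float(low), float(high)


def in_range(cost, text):
    """Return "Yes" if cost falls inside the range described by text."""
    low, high = parse_range(text)
    return "Yes" if low <= cost <= high else "No"


print(in_range(45, "40-50"))
print(in_range(57, "70-80"))
```

If you can add helper columns to the sheet, doing the lookup once and splitting it into separate min and max cells keeps the final IF much shorter.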
Background
I am building a simple dashboard in the Power BI plugins for Excel (Power Query, Power Pivot & Power View) to get some experience with Power BI. The dashboard is for presenting simple time reports made by a consultant (i.e. myself). The format I want to use for inputting data is an Excel table, as follows:
InputData:
Date | Timecode | Duration[hrs] | Tags
-----------|-----------|---------------|----------------------
2016-02-01 | CustomerA | 1.2 | Support;ProductA
2016-02-01 | CustomerB | 0.3 | Support;ProductB
2016-02-02 | Internal | 4.2 | Development;ProductA
The Date field is simple: it is the date that the time report is for. The Timecode is the "name" of the hours reported. In common software this is usually a project code or similar, but I want to keep it on a customer basis. The Duration is a float representing the number of hours spent on that Timecode that day. The Tags column is the interesting part: to simplify input, I want it to be a (semicolon) delimited string, but that won't do when creating a data model for Power View.
What I am trying to make is a separate table with all the tags, and a link table to connect the tags to the corresponding rows from the time report. In the Power View report, I want to be able to filter my time reports on the tags, such as analyzing the time spent on ProductA or Support.
Question
How do you take a non-normalized field such as Tags above and replace it with a dimension table and a link table, using the Power BI plugins for Excel? How do I end up with the following three tables:
TimeReport:
Date | Timecode | Duration[hrs] | TimeReportID
-----------|-----------|---------------|----------------------
2016-02-01 | CustomerA | 1.2 | 1
2016-02-01 | CustomerB | 0.3 | 2
2016-02-02 | Internal | 4.2 | 3
LinkTable:
TimeReportID | TagID
-------------|--------
1 | 1
1 | 2
2 | 1
2 | 3
3 | 4
3 | 2
TagsTable:
TagID | TagName
-------|----------
1 | Support
2 | ProductA
3 | ProductB
4 | Development
Attempt
By picking out only the Tags column and then splitting, pivoting and removing duplicates (inspired by this link), I have managed to create the list of all tags, as in:
Tags:
TagName
----------
Support
ProductA
ProductB
Development
But I can't work out how to link the tables to each other. Please help me with this.
I think you have 2 options:
Using Power Query, add Merge & Expand Column steps to join TimeReport to LinkTable and then to TagsTable.
Using Power Pivot, load all 3 tables, then go to the Diagram view and establish relationships between them. Use "Hide from Client Tools" to hide the columns that are meaningless to the user, e.g. TagID.
I prefer Power Query as the functionality is more flexible and it is much easier to debug.
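Neither option spells out how the IDs in the link table get assigned, so here is a rough Python sketch of the normalisation logic itself. It is not Power Query code, just the same split, de-duplicate and link steps written out; the function name `normalise` is invented for this example, and the rows are the InputData from the question.

```python
def normalise(rows):
    """Split rows with a ';'-delimited tag field into three related tables."""
    time_report, link_table, tag_ids = [], [], {}
    for report_id, (date, timecode, hours, tags) in enumerate(rows, start=1):
        # TimeReport keeps the original columns plus a surrogate key
        time_report.append((date, timecode, hours, report_id))
        for name in tags.split(";"):
            name = name.strip()
            if not name:
                continue
            # A tag gets the next free ID the first time it is seen
            tag_id = tag_ids.setdefault(name, len(tag_ids) + 1)
            link_table.append((report_id, tag_id))
    tags_table = [(tag_id, name) for name, tag_id in tag_ids.items()]
    return time_report, link_table, tags_table


# InputData rows from the question
rows = [
    ("2016-02-01", "CustomerA", 1.2, "Support;ProductA"),
    ("2016-02-01", "CustomerB", 0.3, "Support;ProductB"),
    ("2016-02-02", "Internal", 4.2, "Development;ProductA"),
]

time_report, link_table, tags_table = normalise(rows)
print(link_table)
print(tags_table)
```

In Power Query terms this roughly corresponds to adding an index column to the time report, splitting Tags by delimiter into rows, and merging against the de-duplicated tag list to pick up TagID.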
I have a sheet that looks something like this.
A | B | C
1 Age | how often | occupation
2 21 | I don't | student
3 22 |x times a week| photographer
4 23 | etc | student
5 22 | etc | builder
6 21 | etc | car mechanic
7 20 | I don't | student
I want to track various things, such as the number of times a student said "I don't".
I'm using Google Spreadsheets at the moment.
How can I calculate this in Google Spreadsheets?
At the moment this is the query I'm using to try to calculate this.
=ARRAYFORMULA(sum((B2:B7="I don't") * (C2:C7="student")))
All results are coming up as zero - cannot seem to get a result.
If anyone could help it would be much appreciated.
Have also tried below with no luck.
=SUM(IF(B2:B7="I don't",IF(C2:C7="student",1,0)))
Any help would be greatly appreciated. Thank you very much.
In Google Spreadsheets, as well as in Excel, you can use the COUNTIFS() (Google, Excel) function:
=COUNTIFS(B2:B7, "I don't", C2:C7, "student")
In a Google spreadsheet you can also use SUMPRODUCT():
=sumproduct(B2:B7="I don't", C2:C7="student")
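For reference, both formulas count the rows where both conditions hold at once. A minimal Python sketch of that logic, using the sample sheet from the question (the function name `count_matching` is made up for this example):

```python
# Rows 2-7 of the sample sheet: (age, how often, occupation)
rows = [
    (21, "I don't", "student"),
    (22, "x times a week", "photographer"),
    (23, "etc", "student"),
    (22, "etc", "builder"),
    (21, "etc", "car mechanic"),
    (20, "I don't", "student"),
]


def count_matching(rows, how_often, occupation):
    """Count rows where both the 'how often' and 'occupation' columns match exactly."""
    return sum(1 for _, h, o in rows if h == how_often and o == occupation)


print(count_matching(rows, "I don't", "student"))
```

Note that the comparison here is exact, so a cell with stray spaces or a different apostrophe character would not be counted.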
You have a few options for presenting this information. I'll cover COUNTIFS and pivot tables.
To count the number of times a text appears in a range of cells, use a formula like =COUNTIFS(B2:B7, "I don't", C2:C7, "student"). However, if you want to count how many times the string "I don't" appears inside longer text, e.g. "I don't know" or "I don't care for your formula", enter the formula as an array formula by pressing Ctrl + Shift + Enter. See: How to Count the Occurrences of a Text String
In your case, though, I'm guessing you want to interpret the data and draw comparisons, such as counting students vs builders or age vs occurrences, so you're better off using a pivot table. This pulls together all the information and provides an automatic sum or count for the criteria you set, which makes comparisons quick. Google has a great how-to guide you can refer to for Google Sheets: Google Pivot Table
Hi all – not sure where to post this, but I am trying to figure out how to implement this idea. Basically I am implementing a timeline, but it differs slightly from examples I have seen.
I have 3 items - Projects, events, and a timeline. Projects are associated with multiple events, each of which occurs on a date. I want my timeline along x axis, and the projects listed down the y axis.
Each project is a “lane” and has the same number of events, but they occur at different times for each project. The intent is to visualize how many of the events have occurred and when, per project.
I imagine the data would look like:
          | Event 1  | Event 2  | Event 3
Project 1 | 1/1/2015 | 5/1/2015 | 7/1/2015
Project 2 | 4/1/2014 | 9/1/2014 | --
Project 3 | 3/1/2015 | --       | --
Where blanks were not reported.
I tried an Excel scatter plot, which gets me close, but I can’t figure out how to get the timeline on the x axis, and the “project” axis does not represent the project names, but instead displays numbers.
Any direction appreciated.
I think the basic problem is that there are no y-coordinates as such and it's not obvious what they should be.
Having played around with it a bit, I came up with this data layout:
So basically
Create a scatter with columns A & B
Add columns C & D
Add columns E & F
Add data labels.
You can use the dates for the data labels if you prefer.
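The layout itself is not reproduced above, so the following is only a sketch of the general idea and not necessarily the same arrangement: give each project a numeric lane as its y-coordinate and build one series of (date, lane) points per event. The function name `event_series` is invented for this example, the dates are read as month/day/year (an assumption), and `None` marks events that were not reported.

```python
from datetime import date

# Data from the question; dates assumed to be month/day/year, None = not reported
projects = {
    "Project 1": [date(2015, 1, 1), date(2015, 5, 1), date(2015, 7, 1)],
    "Project 2": [date(2014, 4, 1), date(2014, 9, 1), None],
    "Project 3": [date(2015, 3, 1), None, None],
}


def event_series(projects):
    """Return one list of (date, lane) points per event column."""
    n_events = max(len(dates) for dates in projects.values())
    series = [[] for _ in range(n_events)]
    # Each project gets a numeric lane (1, 2, 3, ...) to use as its y-coordinate
    for lane, dates in enumerate(projects.values(), start=1):
        for i, d in enumerate(dates):
            if d is not None:
                series[i].append((d, lane))
    return series


for number, points in enumerate(event_series(projects), start=1):
    print(f"Event {number}:", points)
```

Each series then corresponds to one pair of x/y columns in the scatter chart, and the project names can be attached as labels for the lane numbers.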