How can I reshape without a unique "j" variable in Stata?

So here's my trouble: I'd like to reshape a long-format data file to wide. However, I don't have a unique "j" variable; each record in the long-format file has several key variables.
For example, I'd like to take this:
| caseid | gender | age | relationship to respondent|
|---------------------------------------------------|
| 1234 | F | 89 | mother |
| 1234 | F | 10 | daughter |
| 1235 | M | 15 | cousin |
etc
and turn it into this:
|caseid | gender1 | age1 | rel1 | gender2 | age2 | rel2 |
|--------------------------------------------------------------|
| 1234 | F | 89 | mother| F | 10 | daughter|
| 1235 | M | 15 | cousin| . | . | . |
etc
However, the data lack the suffix variable necessary for the reshape command. Is there a way that Stata will automatically generate this suffix?

Sample data:
+----------------------------------+
| caseid gender age relati~p |
|----------------------------------|
1. | 1234 F 89 mother |
2. | 1234 F 10 daughter |
3. | 1235 M 15 cousin |
4. | 1235 F 14 sister |
5. | 1235 F 55 mother |
|----------------------------------|
6. | 1236 M 32 brother |
7. | 1236 M 68 father |
+----------------------------------+
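Generate a new id (this assumes the data are already sorted by caseid; if they are not, use bysort caseid: instead of by caseid:):
. by caseid: gen newid = _n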
Gives you this:
+------------------------------------------+
| caseid gender age relati~p newid |
|------------------------------------------|
1. | 1234 F 89 mother 1 |
2. | 1234 F 10 daughter 2 |
3. | 1235 M 15 cousin 1 |
4. | 1235 F 14 sister 2 |
5. | 1235 F 55 mother 3 |
|------------------------------------------|
6. | 1236 M 32 brother 1 |
7. | 1236 M 68 father 2 |
+------------------------------------------+
Which you can now reshape with this:
. reshape wide gender age relationship, i(caseid) j(newid)
To get this:
+--------------------------------------------------------------------------------------------+
| caseid gender1 age1 relati~1 gender2 age2 relati~2 gender3 age3 relati~3 |
|--------------------------------------------------------------------------------------------|
1. | 1234 F 89 mother F 10 daughter . |
2. | 1235 M 15 cousin F 14 sister F 55 mother |
3. | 1236 M 32 brother M 68 father . |
+--------------------------------------------------------------------------------------------+
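For readers who want to check the logic outside Stata, the same two steps (number the rows within each caseid, then spread them into columns) can be sketched in plain Python. This is only an illustration: the function name `reshape_wide` and the short `rel` prefix are assumptions, and missing columns are simply absent instead of being filled with `.`.

```python
from collections import defaultdict

# Long-format rows: (caseid, gender, age, relationship), as in the sample data.
long_rows = [
    (1234, "F", 89, "mother"),
    (1234, "F", 10, "daughter"),
    (1235, "M", 15, "cousin"),
    (1235, "F", 14, "sister"),
    (1235, "F", 55, "mother"),
    (1236, "M", 32, "brother"),
    (1236, "M", 68, "father"),
]


def reshape_wide(rows):
    """Number the rows within each caseid (like `by caseid: gen newid = _n`)
    and spread them into one wide record per caseid."""
    counters = defaultdict(int)  # running row count per caseid
    wide = {}                    # caseid -> wide record
    for caseid, gender, age, rel in rows:
        counters[caseid] += 1
        j = counters[caseid]     # the generated suffix
        record = wide.setdefault(caseid, {"caseid": caseid})
        record[f"gender{j}"] = gender
        record[f"age{j}"] = age
        record[f"rel{j}"] = rel
    return list(wide.values())


wide_rows = reshape_wide(long_rows)
for record in wide_rows:
    print(record)
```

As in the Stata version, the suffix depends on the order of the rows within each caseid, so the input order decides who becomes person 1, 2, or 3.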

Related

How to create calculated column to assign a row tag to the Max Operation Sequence within a range of a range?

| Manufacturing Router Template ID | Operation Sequence | Gate | Calculated Column |
|------------------------------------|--------------------|------|---------------------|
| XXX1 | 10 | A | |
| XXX1 | 20 | A | |
| XXX1 | 30 | A | Last Gate Operation |
| XXX1 | 40 | B | |
| XXX1 | 50 | B | Last Gate Operation |
| XXX1 | 60 | C | |
| XXX1 | 70 | C | |
| XXX1 | 80 | C | |
| XXX1 | 90 | C | |
| XXX1 | 100 | C | Last Gate Operation |
| XXX2 | 10 | A | |
| XXX2 | 20 | A | Last Gate Operation |
| XXX2 | 30 | B | |
| XXX2 | 40 | B | |
| XXX2 | 50 | B | |
| XXX2 | 60 | B | Last Gate Operation |
| XXX2 | 70 | C | |
| XXX2 | 80 | C | Last Gate Operation |
The sample data shows two Router Template examples, XXX1 and XXX2, with different numbers of operation sequences (XXX1 has 10 operations numbered 10-100, while XXX2 has 8 operations numbered 10-80). On top of that, each operation is assigned a Gate (A, B, or C). The calculated column shows the result I'm trying to generate: a formula that tags the row with the max operation sequence within each Gate (A, B, C) within each router template (XXX1, XXX2) as 'Last Gate Operation'.
A formula I've tried is an IF - MAX - INDEX combination, something like this:
=IF(MAX(INDEX((A2=$A$2:$A$15000) * (B2=$B$2:$B$15000) * (C2=$C$2:$C$15000),)),"Last Gate Operation","")
Which is most definitely incorrect. Any help would be appreciated. Thanks.
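The rule described in the question (tag a row when its operation sequence is the maximum for its template and gate) can be sketched in Python to make the intended logic explicit. This is a sketch, not a tested Excel answer: the function name `tag_last_gate_operation` is an assumption, and only part of the sample table is reproduced.

```python
# (template, operation sequence, gate) rows; a subset of the sample table.
rows = [
    ("XXX1", 10, "A"), ("XXX1", 20, "A"), ("XXX1", 30, "A"),
    ("XXX1", 40, "B"), ("XXX1", 50, "B"),
    ("XXX2", 10, "A"), ("XXX2", 20, "A"),
    ("XXX2", 30, "B"), ("XXX2", 40, "B"), ("XXX2", 50, "B"), ("XXX2", 60, "B"),
]


def tag_last_gate_operation(rows):
    # First pass: highest operation sequence per (template, gate) pair.
    max_seq = {}
    for template, seq, gate in rows:
        key = (template, gate)
        max_seq[key] = max(seq, max_seq.get(key, seq))
    # Second pass: tag a row only when it holds that maximum.
    return [
        "Last Gate Operation" if seq == max_seq[(template, gate)] else ""
        for template, seq, gate in rows
    ]


tags = tag_last_gate_operation(rows)
for row, tag in zip(rows, tags):
    print(row, tag)
```

In Excel 2019 or Microsoft 365, the per-group maximum in the first pass is what MAXIFS computes, so comparing B2 against a MAXIFS over the same template and gate may be worth trying.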

Excel In cell formula Number of surveys administered

Good Afternoon,
I have an Excel sheet that records encounters with community residents by name and date. During each encounter a brief survey is also administered. I want to track changes in the survey answers by name over time. Is there any way to do this with in-cell formulas? Here's an example of the table I have in mind:
| Name | Date | Q1 | Q2 | Stress | Survey Number |
| | | | | | (calculated) |
|--------------|------------------|----|----|--------|---------------|
| Steve Rogers | 5/1/2018 | y | y | 5 | 1 |
| Steve Rogers | 5/2/2018 | y | y | 6 | 2 |
| Tony Stark | 5/1/2018 | n | n | 10 | 1 |
| Nick Fury | 5/1/2018 | n | y | 8 | 1 |
| Nick Fury | 5/2/2018 | y | y | 5 | 2 |
| Tony Stark | 5/2/2018 | y | n | 8 | 2 |
| Tony Stark | 5/3/2018 | n | n | 4 | 3 |
I want to calculate the survey number by referencing the name and the date. I have no idea where to start, honestly. Is this even possible using an in-cell formula?
Use COUNTIFS()
=COUNTIFS(A:A,A2,B:B,"<=" & B2)
Put that in F2 and copy/drag down.
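To see why this works: for each row, the formula counts the rows with the same name and a date on or before that row's date. A small Python sketch of the same count, using the sample table (the function name `survey_numbers` is an assumption):

```python
from datetime import date

# (name, date) pairs in the same order as the sample table.
encounters = [
    ("Steve Rogers", date(2018, 5, 1)),
    ("Steve Rogers", date(2018, 5, 2)),
    ("Tony Stark", date(2018, 5, 1)),
    ("Nick Fury", date(2018, 5, 1)),
    ("Nick Fury", date(2018, 5, 2)),
    ("Tony Stark", date(2018, 5, 2)),
    ("Tony Stark", date(2018, 5, 3)),
]


def survey_numbers(rows):
    # For each row, count rows with the same name and a date on or before it,
    # mirroring COUNTIFS(A:A, A2, B:B, "<=" & B2).
    return [
        sum(1 for other_name, other_day in rows
            if other_name == name and other_day <= day)
        for name, day in rows
    ]


numbers = survey_numbers(encounters)
print(numbers)  # [1, 2, 1, 1, 2, 2, 3]
```

Note that two encounters with the same person on the same date would receive the same survey number under this rule.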

How to compose sales table for collections of items that are sold separately?

I want to compose a sales table of purchased and sold items to see the total profit. That is easy when items are purchased and sold individually or as a lot. But how do I handle the situation where one buys a collection of items and sells them one by one? For example, I buy a collection (C) of a hammer and a screwdriver and sell the tools separately. If I entered the data into a simple table as in the image, I would get the wrong profit.
When there are only two items, I could split their purchase price arbitrarily, but when there are many items and not all of them are sold yet, I can't easily see whether the collection has already made a profit.
I expect the correct profit as output. In this case the collection cost 10 and the items of the collection sold for 13 in total, so it should show a profit of 3, not a loss of 7. I was thinking of adding two new columns, like IsCollection and CollectionID, and then deriving a formula that either uses simple subtraction or looks up the price of the whole collection and subtracts it from the sum of the items that belong to that collection. Deriving such a formula is another question... But maybe there is an easier way of accomplishing the same thing.
I added a COLLECTION column to identify the items that belong to a collection.
Then I used SUMIF to sum the sell prices of the items that belong to the same collection.
Then I used IF in the Profit column to choose between the summed sell price and the single sell price.
The formulas need a fixed cell range (here rows 1 to 22, see below).
Problem: you can't add up the Profit values to obtain the total profit, because a collection's profit is repeated on each of its rows.
I used OpenOffice Calc (but it should be almost the same in Excel, apart from the argument separator, which may be , instead of ; depending on locale).
Content of
SUM_COLL (row2):
=SUMIF($A$1:$A$22;"="&A2;$D$1:$D$22)
SUM_COLL (row3):
=SUMIF($A$1:$A$22;"="&A3;$D$1:$D$22)
and so on.
Profit (row2):
=IF(A2<>"";E2-C2;D2-C2)
Profit (row3):
=IF(A3<>"";E3-C3;D3-C3)
+------------+-----------+-------------+------------+----------+--------+
| COLLECTION | Item name | Purch Price | Sell Price | SUM_COLL | Profit |
+------------+-----------+-------------+------------+----------+--------+
| | A | 1 | 1.5 | 0 | 0.5 |
+------------+-----------+-------------+------------+----------+--------+
| | B | 2 | 2.1 | 0 | 0.1 |
+------------+-----------+-------------+------------+----------+--------+
| C | C1 | 10 | 7 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| C | C2 | 10 | 6 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| D | D1 | 7 | 15 | 23 | 16 |
+------------+-----------+-------------+------------+----------+--------+
| | E | 8 | 12 | 0 | 4 |
+------------+-----------+-------------+------------+----------+--------+
| C | C3 | 10 | 14 | 27 | 17 |
+------------+-----------+-------------+------------+----------+--------+
| D | D2 | 7 | 8 | 23 | 16 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
| | | | | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+
Update:
I added two more columns to make Profit summable:
COUNT_COLL (row2):
=COUNTIF($A$1:$A$22;"="&A2)
COUNT_COLL (row3):
=COUNTIF($A$1:$A$22;"="&A3)
Profit_SUMMABLE (row2)
=IF(A2<>"";(E2-C2)/G2;D2-C2)
Profit_SUMMABLE (row3)
=IF(A3<>"";(E3-C3)/G3;D3-C3)
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| COLLECTION | Item name | Purch Price | Sell Price | SUM_COLL | Profit | COUNT_COLL | Profit_SUMMABLE |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | A | 1 | 1.5 | 0 | 0.5 | 0 | 0.5 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | B | 2 | 2.1 | 0 | 0.1 | 0 | 0.1 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C1 | 10 | 7 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C2 | 10 | 6 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| D | D1 | 7 | 15 | 23 | 16 | 2 | 8 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | E | 8 | 12 | 0 | 4 | 0 | 4 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| C | C3 | 10 | 14 | 27 | 17 | 3 | 5.6666666667 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| D | D2 | 7 | 8 | 23 | 16 | 2 | 8 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
| | | | | 0 | 0 | 0 | 0 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
...
...
| TOTAL | | | | | 87.6 | | 37.6 |
+------------+-----------+-------------+------------+----------+--------+------------+-----------------+
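As a cross-check on the 37.6 total, the same bookkeeping can be sketched in Python: every sell price is added, single items subtract their own purchase price, and each collection's purchase price is subtracted exactly once. The function name `total_profit` is an assumption, and the sketch relies on the table's convention that a collection's purchase price is repeated on each of its rows.

```python
# (collection, item, purchase price, sell price), copied from the table above.
items = [
    ("", "A", 1, 1.5),
    ("", "B", 2, 2.1),
    ("C", "C1", 10, 7),
    ("C", "C2", 10, 6),
    ("D", "D1", 7, 15),
    ("", "E", 8, 12),
    ("C", "C3", 10, 14),
    ("D", "D2", 7, 8),
]


def total_profit(rows):
    total = 0.0
    collection_cost = {}  # collection -> purchase price, counted once
    for collection, _item, purchase, sell in rows:
        total += sell
        if collection:
            # The same price is repeated on every row of the collection.
            collection_cost[collection] = purchase
        else:
            total -= purchase
    return total - sum(collection_cost.values())


profit = total_profit(items)
print(round(profit, 2))  # 37.6
```

This avoids the division by COUNT_COLL altogether, at the cost of not showing a per-row share of the profit.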

Excel: Give scores based on range, where max = 1 and min = 10

I have the following problem:
I want to give scores from 1-10 to a range of numbers, for example:
| | A | B |
|---|------|----|
| 1 | 1209 | 1 |
| 2 | 401 | 7 |
| 3 | 123 | 9 |
| 4 | 49 | 10 |
| 5 | 30 | 10 |
(Not sure if B is 100% correct but roughly)
I got the B values with
=ABS(CEILING(A1;MAX($A$1:$A$32)/10)*10/MAX($A$1:$A$32)-11)
It seems to work, but if I take numbers like these, for example, I get:
| | A | B |
|---|------|----|
| 1 | 100 | 1 |
| 2 | 90 | 2 |
| 3 | 80 | 3 |
| 4 | 70 | 4 |
| 5 | 50 | 6 |
Here 50 gets a 6, but I want 50 to be 10.
I would like to have it scalable so I can do it with a 1-10 or 1-100 or 5-27 or whatever scale and with however many numbers in the list and whatever numbers to score from.
Thanks!
Use this formula:
=$E$1 + ROUND((MIN($A:$A)-A1)/((MAX($A:$A)-MIN($A:$A))/($E$1-$E$2)),0)
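It is scalable. Put the two ends of the score scale in E1 and E2: E1 is the score given to the smallest value and E2 is the score given to the largest value (10 and 1 in your example).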
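The formula can be checked against both lists from the question with a short Python sketch. The names `excel_round`, `scores`, `score_for_min` (E1) and `score_for_max` (E2) are assumptions made for illustration. Python's built-in `round` rounds halves to even, so a small helper imitates Excel's ROUND, which rounds halves away from zero.

```python
import math


def excel_round(x):
    # Excel's ROUND(x, 0) rounds halves away from zero.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def scores(values, score_for_min, score_for_max):
    """Mirror of =$E$1 + ROUND((MIN(A)-A1)/((MAX(A)-MIN(A))/($E$1-$E$2)),0),
    with score_for_min playing the role of E1 and score_for_max of E2."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / (score_for_min - score_for_max)
    return [score_for_min + excel_round((lo - v) / step) for v in values]


print(scores([1209, 401, 123, 49, 30], 10, 1))  # [1, 7, 9, 10, 10]
print(scores([100, 90, 80, 70, 50], 10, 1))     # [1, 3, 5, 6, 10]
```

The sketch divides by zero when all values are equal or when the two scale ends are the same, and the spreadsheet formula fails with #DIV/0! in those cases too.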

excel I need formula in column name "FEBRUARY"

I have a set of data as below.
SHEET 1
+------+-------+
| JANUARY |
+------+-------+
+----+----------+------+-------+
| ID | NAME |COUNT | PRICE |
+----+----------+------+-------+
| 1 | ALFRED | 11 | 150 |
| 2 | ARIS | 22 | 120 |
| 3 | JOHN | 33 | 170 |
| 4 | CHRIS | 22 | 190 |
| 5 | JOE | 55 | 120 |
| 6 | ACE | 11 | 200 |
+----+----------+------+-------+
SHEET2
+----+----------+------+-------+
| ID | NAME |COUNT | PRICE |
+----+----------+------+-------+
| 1 | CHRIS | 13 | 123 |
| 2 | ACE | 26 | 165 |
| 3 | JOE | 39 | 178 |
| 4 | ALFRED | 21 | 198 |
| 5 | JOHN | 58 | 112 |
| 6 | ARIS | 11 | 200 |
+----+----------+------+-------+
The RESULT should look like this in sheet1 :
+------+-------++------+-------+
| JANUARY | FEBRUARY |
+------+-------++------+-------+
+----+----------+------+-------++-------+-------+
| ID | NAME |COUNT | PRICE || COUNT | PRICE |
+----+----------+------+-------++-------+-------+
| 1 | ALFRED | 11 | 150 || 21 | 198 |
| 2 | ARIS | 22 | 120 || 11 | 200 |
| 3 | JOHN | 33 | 170 || 58 | 112 |
| 4 | CHRIS | 22 | 190 || 13 | 123 |
| 5 | JOE | 55 | 120 || 39 | 178 |
| 6 | ACE | 11 | 200 || 26 | 165 |
+----+----------+------+-------++-------+-------+
I need a formula for the "FEBRUARY" columns that finds the matching name in Sheet2.
Assuming the first Count value should go in cell E3 of Sheet1, the following formula would be the usual way of doing it:
=INDEX(Sheet2!C:C,MATCH($B3,Sheet2!$B:$B,0))
Then the Price (in F3) would be given by
=INDEX(Sheet2!D:D,MATCH($B3,Sheet2!$B:$B,0))
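What the two formulas do can be illustrated in Python: MATCH finds the first Sheet2 row whose name equals the Sheet1 name, and INDEX returns the Count or Price from that row. The helper name `index_match` is an assumption, and the data is copied from the two sheets in the question.

```python
# Sheet2 rows: (id, name, count, price).
sheet2 = [
    (1, "CHRIS", 13, 123),
    (2, "ACE", 26, 165),
    (3, "JOE", 39, 178),
    (4, "ALFRED", 21, 198),
    (5, "JOHN", 58, 112),
    (6, "ARIS", 11, 200),
]

# Names in Sheet1 order (column B).
sheet1_names = ["ALFRED", "ARIS", "JOHN", "CHRIS", "JOE", "ACE"]


def index_match(rows, name, column):
    # MATCH(name, Sheet2!B:B, 0): first row whose name matches exactly.
    for row in rows:
        if row[1] == name:
            # INDEX(Sheet2!C:C, ...) for column 2, INDEX(Sheet2!D:D, ...) for 3.
            return row[column]
    return None  # Excel would show #N/A when there is no match


february = [
    (index_match(sheet2, name, 2), index_match(sheet2, name, 3))
    for name in sheet1_names
]
print(february)
```

The printed pairs line up with the FEBRUARY columns of the expected result, starting with (21, 198) for ALFRED.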
I think this query will work fine for your requirement:
SELECT `Sheet1$`.ID,`Sheet1$`.NAME, `Sheet1$`.COUNT AS 'Jan-COUNT',`Sheet1$`.PRICE AS 'Jan-PRICE', `Sheet2$`.COUNT AS 'Feb-COUNT',`Sheet2$`.PRICE AS 'Feb-PRICE'
FROM `C:\Users\Nagendra\Desktop\aaaaa.xlsx`.`Sheet1$` `Sheet1$`, `C:\Users\Nagendra\Desktop\aaaaa.xlsx`.`Sheet2$` `Sheet2$`
WHERE (`Sheet1$`.NAME=`Sheet2$`.NAME)
Provide the actual path to your workbook instead of
C:\Users\Nagendra\Desktop\aaaaa.xlsx
First you need to know how to set up the connection; see http://smallbusiness.chron.com/use-sql-statements-ms-excel-41193.html
