Calculate decile limits in Stata
Here is my problem: I have monthly income data and have used the "xtile" command to split it into 5% quantile bins:
xtile income_decile=bbh5101, nq(20)
How can I find out which boundaries Stata used to assign each observation to a quantile bin, e.g. first bin from 0 to 800€, second bin from 801 to 1600€, and so on?
I believe you just want the percentiles. Use the corresponding _pctile command. For example:
clear all
set more off
sysuse auto
xtile q = weight, nq(10)
_pctile weight, nq(10)
sort weight
list weight q
return list
Comparing those two lists should be useful. See also the Methods and formulas section in [D] pctile.
The result:
. list weight q
+-------------+
| weight q |
|-------------|
1. | 1,760 1 |
2. | 1,800 1 |
3. | 1,800 1 |
4. | 1,830 1 |
5. | 1,930 1 |
|-------------|
6. | 1,980 1 |
7. | 1,990 1 |
8. | 2,020 1 |
9. | 2,040 2 |
10. | 2,050 2 |
|-------------|
11. | 2,070 2 |
12. | 2,110 2 |
13. | 2,120 2 |
14. | 2,130 2 |
15. | 2,160 2 |
|-------------|
16. | 2,200 3 |
17. | 2,200 3 |
18. | 2,230 3 |
19. | 2,240 3 |
20. | 2,280 3 |
|-------------|
21. | 2,370 3 |
22. | 2,410 3 |
23. | 2,520 3 |
24. | 2,580 4 |
25. | 2,640 4 |
|-------------|
26. | 2,650 4 |
27. | 2,650 4 |
28. | 2,670 4 |
29. | 2,690 4 |
30. | 2,730 4 |
|-------------|
31. | 2,750 5 |
32. | 2,750 5 |
33. | 2,830 5 |
34. | 2,830 5 |
35. | 2,930 5 |
|-------------|
36. | 3,170 5 |
37. | 3,180 5 |
38. | 3,200 6 |
39. | 3,210 6 |
40. | 3,220 6 |
|-------------|
41. | 3,250 6 |
42. | 3,260 6 |
43. | 3,280 6 |
44. | 3,300 6 |
45. | 3,310 6 |
|-------------|
46. | 3,330 7 |
47. | 3,350 7 |
48. | 3,370 7 |
49. | 3,370 7 |
50. | 3,400 7 |
|-------------|
51. | 3,420 7 |
52. | 3,420 7 |
53. | 3,430 8 |
54. | 3,470 8 |
55. | 3,600 8 |
|-------------|
56. | 3,600 8 |
57. | 3,670 8 |
58. | 3,690 8 |
59. | 3,690 8 |
60. | 3,700 8 |
|-------------|
61. | 3,720 9 |
62. | 3,740 9 |
63. | 3,830 9 |
64. | 3,880 9 |
65. | 3,900 9 |
|-------------|
66. | 4,030 9 |
67. | 4,060 9 |
68. | 4,060 9 |
69. | 4,080 10 |
70. | 4,130 10 |
|-------------|
71. | 4,290 10 |
72. | 4,330 10 |
73. | 4,720 10 |
74. | 4,840 10 |
+-------------+
.
. return list
scalars:
r(r1) = 2020
r(r2) = 2160
r(r3) = 2520
r(r4) = 2730
r(r5) = 3190
r(r6) = 3310
r(r7) = 3420
r(r8) = 3700
r(r9) = 4060
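Reading the two lists side by side, each stored value looks like the upper limit of its bin: 2,020 is still in bin 1, while 2,040 is the first value in bin 2. A minimal sketch that prints the cutoffs as bin ranges, assuming that reading is right (the local macro names c1 to c9 and j are only illustrative):

```stata
sysuse auto, clear
_pctile weight, nq(10)

* copy the nine cutoffs into locals before another command overwrites r()
forvalues i = 1/9 {
    local c`i' = r(r`i')
}

* bin i holds values above cutoff i-1, up to and including cutoff i
display "bin 1: weight <= `c1'"
forvalues i = 2/9 {
    local j = `i' - 1
    display "bin `i': `c`j'' < weight <= `c`i''"
}
display "bin 10: weight > `c9'"
```

For the income data in the question, the same loop would run over 1/19 after _pctile bbh5101, nq(20).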
You can also store the percentiles in a new variable with pctile (no leading underscore):
pctile p = weight, nq(10)
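If what you need is the smallest and largest value that actually landed in each bin, rather than the theoretical cutoffs, one option is to tabulate them by the xtile variable. This is only a sketch on the auto data; q is the same illustrative variable name used above:

```stata
sysuse auto, clear
xtile q = weight, nq(10)

* observed minimum, maximum and number of observations in each bin
tabstat weight, by(q) statistics(min max n)
```

With the income data from the question, this would be tabstat bbh5101, by(income_decile) statistics(min max n).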