I need to convert one long column of numbers in Excel 2016 (Windows) into multiple columns of 10 rows each.
I am currently doing this manually. Please help me, Stack Overflow! :)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
The result should look like this:
1 11
2 12
3 13
4 14
5 15
6 16
7 17
8 18
9 19
10 20
and so on....
Try this formula in C1:
=INDEX($A:$A,ROW()+((COLUMN(A1)-1)*10))
Drag it right across as many columns as you need, then down 10 rows.
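To see why the formula works, it can help to spell out the index arithmetic outside Excel. The following is only an illustrative Python sketch, not something Excel runs; the function name column_to_grid and the None padding for a short last column are assumptions made for the example. Counting from 0, row r and column c of the result read the source item at position r + c * 10, which is what ROW()+((COLUMN(A1)-1)*10) computes with 1-based counting.

```python
def column_to_grid(values, rows=10):
    """Lay a single column out as a grid, filling each output column top to bottom."""
    n_cols = -(-len(values) // rows)  # ceiling division
    grid = []
    for r in range(rows):
        row = []
        for c in range(n_cols):
            i = r + c * rows  # same offset as ROW() + (COLUMN(A1) - 1) * 10, but 0-based
            row.append(values[i] if i < len(values) else None)  # None pads a short last column
        grid.append(row)
    return grid


grid = column_to_grid(list(range(1, 101)))
print(grid[0])  # first output row: 1, 11, 21, ...
```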
I have a list of integers from which I would like to get the unique numbers, ordered first by how often they occur (most frequent first); numbers with equal counts should then be ordered in descending order.
Example 1:
input1 = [1,2,2,1,6,2,1,7]
expected output = [2,1,7,6]
Explanation: both 2 and 1 appear three times, while 6 and 7 appear once. So the numbers occurring three times are placed first, in descending order, and the same goes for the numbers that appear once.
Another example:
input_2 = list(map(int, '40 29 2 44 30 79 46 85 118 66 113 52 55 63 48 99 123 51 110 66 40 115 107 46 6 114 36 99 13 108 85 39 14 121 42 37 56 11 104 28 24 123 63 51 118 52 120 28 64 43 44 86 42 71 101 78 93 1 6 14 42 33 88 107 35 70 74 30 54 76 27 91 115 71 63 103 94 109 39 4 16 108 97 83 29 57 86 121 53 94 28 7 5 31 123 21 2 17 112 104 75 124 88 30 108 14 65 118 28 81 80 14 14 107 21 60 47 97 50 53 19 112 43 46'.split()))
output_2 = list(map(int, '14 28 123 118 108 107 63 46 42 30 121 115 112 104 99 97 94 88 86 85 71 66 53 52 51 44 43 40 39 29 21 6 2 124 120 114 113 110 109 103 101 93 91 83 81 80 79 78 76 75 74 70 65 64 60 57 56 55 54 50 48 47 37 36 35 33 31 27 24 19 17 16 13 11 7 5 4 1'.split()))
This was from a coding test I took. It must be solved without functions from imports such as collections or itertools; anything already available in Python's namespace, like dict and sorted, is allowed. How do I do this as efficiently as possible?
def sort_sort(input1):
    a = {i: input1.count(i) for i in set(input1)}
    b = {i: [] for i in set(a.values())}
    for k, v in a.items():
        b[v].append(k)
    for v in b.values():
        v.sort(reverse=True)
    output = []
    quays = list(b.keys())
    quays.sort(reverse=True)
    for q in quays:
        output += b[q]
    print(output)
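One way to avoid calling input1.count(i) once per distinct value (quadratic in the worst case) is to count in a single pass with a plain dict and then sort the distinct values once with a compound key. This is a minimal sketch using only built-ins; the function name sort_by_count_then_value is made up for illustration.

```python
def sort_by_count_then_value(numbers):
    """Distinct values, most frequent first; ties broken by larger value first."""
    counts = {}
    for n in numbers:
        counts[n] = counts.get(n, 0) + 1  # one pass over the input
    # Negating both parts of the key makes an ascending sort behave as
    # "higher count first, then higher value first".
    return sorted(counts, key=lambda n: (-counts[n], -n))


print(sort_by_count_then_value([1, 2, 2, 1, 6, 2, 1, 7]))  # [2, 1, 7, 6]
```

Counting takes linear time and the sort runs over the distinct values only, so the overall cost is dominated by the sort.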
I've got two ways to amend a subarray in J but I don't like either of them.
(Imagine selecting a rectangle in a paint program and applying some arbitrary
operation to that rectangle in place.)
t =. i. 10 10 NB. table to modify
xy=. 2 3 [ wh =. 3 2 NB. region I want to modify
u =. -@|. NB. operation to perform on region
I can fetch the subarray and apply the
operation in one step with cut (;.0):
st =. ((,./xy),:(|,./wh)) u;.0 t
Putting it back is easy enough, but seems to require
building a large boxed array of indices:
(,st) (xy&+each,{;&:i./wh) } t
I also tried recursively splitting and glueing
the table into four "window panes" at a time:
split =: {. ; }. NB. split y into 2 subarrays at index x
panes =: {{ 2 2$ ; |:L:0 X split&|:&.> Y split y [ 'Y X'=.x }}
glue =: [: ,&>/ ,.&.>/"1 NB. reassemble
xy panes t
┌────────┬────────────────────┐
│ 0 1 2│ 3 4 5 6 7 8 9│
│10 11 12│13 14 15 16 17 18 19│
├────────┼────────────────────┤
│20 21 22│23 24 25 26 27 28 29│
│30 31 32│33 34 35 36 37 38 39│
│40 41 42│43 44 45 46 47 48 49│
│50 51 52│53 54 55 56 57 58 59│
│60 61 62│63 64 65 66 67 68 69│
│70 71 72│73 74 75 76 77 78 79│
│80 81 82│83 84 85 86 87 88 89│
│90 91 92│93 94 95 96 97 98 99│
└────────┴────────────────────┘
NB. then split the lower right pane again,
NB. extract *its* upper left pane...
s0 =. 1 1 {:: p0 =. xy panes t
s1 =. 0 0 {:: p1 =. wh panes s0
NB. apply the operation and reassemble:
p1a =. (<u s1) (<0 0) } p1
glue (<glue p1a) (<1 1) } p0
The first approach seems to be the quicker and
easier option, but it feels like there ought
to be a more primitive way to apply a verb at
a sub-array without extracting it, or to paste
in a subarray at some coordinates without manually
creating the array of indices for each element.
Have I missed a better option?
I would begin by creating the set of indices that I want to amend:
[ ind =. < xy + each i. each wh
┌───────────┐
│┌─────┬───┐│
││2 3 4│3 4││
│└─────┴───┘│
└───────────┘
I can use those to select the atoms I want from t
ind { t
23 24
33 34
43 44
And if I can select with them then I can use the same indices to amend t
_ ind } t
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 _ _ 25 26 27 28 29
30 31 32 _ _ 35 36 37 38 39
40 41 42 _ _ 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
Finally, I can use a hook whose left tine is ind}~ and whose right tine (ind u@{ ]) preprocesses t by selecting the region and applying u to it, which gives my result:
(ind}~ ind u@{ ]) t
0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19
20 21 22 _43 _44 25 26 27 28 29
30 31 32 _33 _34 35 36 37 38 39
40 41 42 _23 _24 45 46 47 48 49
50 51 52 53 54 55 56 57 58 59
60 61 62 63 64 65 66 67 68 69
70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89
90 91 92 93 94 95 96 97 98 99
You actually gave me the solution when you asked how you can 'amend' your array in place.
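For readers more familiar with Python, the same select, transform, write-back idea can be sketched with plain nested lists. This is only an analogy to the J amend above, not a J technique; amend_region and negate_reversed are hypothetical names chosen for the example.

```python
def amend_region(table, xy, wh, op):
    """Return a copy of table with op applied to the wh-sized block whose top-left corner is xy."""
    (r0, c0), (h, w) = xy, wh
    rows = range(r0, r0 + h)
    cols = range(c0, c0 + w)
    block = [[table[r][c] for c in cols] for r in rows]  # select
    new_block = op(block)                                # transform
    result = [row[:] for row in table]                   # leave the input untouched
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            result[r][c] = new_block[i][j]               # write back at the same indices
    return result


def negate_reversed(block):
    """Reverse the rows of the block and negate every item."""
    return [[-v for v in row] for row in reversed(block)]


t = [[10 * r + c for c in range(10)] for r in range(10)]
out = amend_region(t, (2, 3), (3, 2), negate_reversed)
print(out[2])
```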
I'm currently stuck trying to get the Hodrick-Prescott trend from different groups within a monthly dataset. Here's a replica of the dataset:
import pandas as pd
import numpy as np
import statsmodels.api as sm
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)),
                  columns=list('abcd'))
df['date'] = pd.date_range(start='2018-01-01',
                           periods=100, freq='M')
df['id'] = ['Group 1', 'Group 2', 'Group 3', 'Group 4'] * 25
df.rename({'a': 'target'}, axis=1, inplace=True)
final_df = df.groupby('id', group_keys=False).apply(
    lambda x: x.sort_values('date'))
The dataset looks like this:
target b c d date id
0 28 45 17 46 2018-01-31 Group 1
4 58 23 34 76 2018-05-31 Group 1
8 30 98 91 79 2018-09-30 Group 1
12 15 23 25 96 2019-01-31 Group 1
16 67 45 41 38 2019-05-31 Group 1
20 28 40 36 38 2019-09-30 Group 1
24 8 95 28 86 2020-01-31 Group 1
28 14 53 58 75 2020-05-31 Group 1
32 46 3 26 61 2020-09-30 Group 1
36 50 71 80 34 2021-01-31 Group 1
40 78 38 97 75 2021-05-31 Group 1
44 15 74 83 25 2021-09-30 Group 1
48 27 43 18 84 2022-01-31 Group 1
52 84 38 11 24 2022-05-31 Group 1
56 23 29 81 22 2022-09-30 Group 1
60 87 56 92 65 2023-01-31 Group 1
64 24 99 55 86 2023-05-31 Group 1
68 16 68 36 63 2023-09-30 Group 1
72 43 29 80 44 2024-01-31 Group 1
76 0 48 35 49 2024-05-31 Group 1
80 17 50 51 51 2024-09-30 Group 1
84 17 16 40 87 2025-01-31 Group 1
88 98 13 70 27 2025-05-31 Group 1
92 21 30 96 87 2025-09-30 Group 1
96 19 35 32 47 2026-01-31 Group 1
1 21 45 34 61 2018-02-28 Group 2
5 35 15 95 11 2018-06-30 Group 2
9 3 31 94 25 2018-10-31 Group 2
13 65 89 1 7 2019-02-28 Group 2
17 77 41 12 58 2019-06-30 Group 2
... ... ... ... ... ... ...
82 32 99 54 27 2024-11-30 Group 3
86 67 5 71 44 2025-03-31 Group 3
90 79 94 34 53 2025-07-31 Group 3
94 4 60 37 85 2025-11-30 Group 3
98 20 16 32 97 2026-03-31 Group 3
3 70 63 94 98 2018-04-30 Group 4
7 2 13 14 5 2018-08-31 Group 4
11 49 44 20 27 2018-12-31 Group 4
15 11 60 39 10 2019-04-30 Group 4
19 22 96 48 5 2019-08-31 Group 4
23 23 22 30 8 2019-12-31 Group 4
27 39 11 58 89 2020-04-30 Group 4
31 61 72 68 78 2020-08-31 Group 4
35 29 20 7 30 2020-12-31 Group 4
39 53 20 32 98 2021-04-30 Group 4
43 97 31 60 74 2021-08-31 Group 4
47 46 65 15 93 2021-12-31 Group 4
51 31 24 5 75 2022-04-30 Group 4
55 42 59 87 68 2022-08-31 Group 4
59 75 50 62 60 2022-12-31 Group 4
63 5 24 15 83 2023-04-30 Group 4
67 77 12 81 44 2023-08-31 Group 4
71 74 15 11 90 2023-12-31 Group 4
75 34 0 19 81 2024-04-30 Group 4
79 2 26 36 98 2024-08-31 Group 4
83 45 66 9 23 2024-12-31 Group 4
87 74 67 35 98 2025-04-30 Group 4
91 69 78 46 7 2025-08-31 Group 4
95 66 77 91 41 2025-12-31 Group 4
99 66 11 96 91 2026-04-30 Group 4
Here's my current approach:
groups = final_df.groupby('id')
group_keys = list(groups.groups.keys())
bs = pd.DataFrame()
for key in group_keys:
    g = groups.get_group(key)
    target = g['target']
    cycle, trend = sm.tsa.filters.hpfilter(target, lamb=129600)
    g['hp_trend'] = trend
    bs.append(g)
My goal is to simply generate the trend from HP-Filter for each group and append it to that group as a column such that each group will have its own trend based on the target field specified.
Currently, the bs dataframe only returns the empty dataframe that it started with. How can I get the result that I need?
Thanks for reading.
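groups = final_df.groupby('id')
group_keys = list(groups.groups.keys())
bs = pd.DataFrame()
for key in group_keys:
    # Copy the group so the new column is not assigned to a slice of final_df.
    g = groups.get_group(key).copy()
    target = g['target']
    cycle, trend = sm.tsa.filters.hpfilter(target, lamb=129600)
    g['hp_trend'] = trend
    # append() returns a new DataFrame, so assign the result back to bs.
    # (DataFrame.append was removed in pandas 2.0; there, use bs = pd.concat([bs, g]).)
    bs = bs.append(g)
bs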
I have written code to print a snakes and ladders grid.
I want the numbers aligned so that each column forms a straight vertical line.
My code is:
for i in range(100,0,-1):
    if i%20 == 0:
        for i in range(i,i-10,-1):
            print(i, end = " ")
        print()
    elif i%10 == 0:
        for i in range(i-9,i+1):
            print(i, end = " ")
        print()
The present output is:
100 99 98 97 96 95 94 93 92 91
81 82 83 84 85 86 87 88 89 90
80 79 78 77 76 75 74 73 72 71
61 62 63 64 65 66 67 68 69 70
60 59 58 57 56 55 54 53 52 51
41 42 43 44 45 46 47 48 49 50
40 39 38 37 36 35 34 33 32 31
21 22 23 24 25 26 27 28 29 30
20 19 18 17 16 15 14 13 12 11
1 2 3 4 5 6 7 8 9 10
If you're using Python 3, try replacing your:
print(i, end = " ")
lines with:
print(format(i, '6d'), end='')
If you must have the numbers left-justified, try this instead:
print('{:<6d}'.format(i), end='')
Both versions pad every number to a fixed width of six characters, so each number takes up the same amount of space even though the numbers do not all have the same number of digits.
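Put together, a fixed-width version of the whole board might look like the sketch below. It is one possible arrangement rather than a drop-in replacement for the loop in the question; board_rows and its size and width parameters are names invented for this example.

```python
def board_rows(size=10, width=6):
    """Build the board as a list of strings, one per row, each number right-aligned in `width` characters."""
    rows = []
    for top in range(size * size, 0, -size):
        numbers = range(top, top - size, -1)  # e.g. 100, 99, ..., 91
        if (top // size) % 2 == 1:
            numbers = reversed(numbers)       # every other row runs left to right
        rows.append(''.join(format(n, f'{width}d') for n in numbers))
    return rows


for row in board_rows():
    print(row)
```

Because every cell has the same width, all rows come out the same length and the columns line up.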
How can I count only the lines that have words in them? In the example below, I have four lines with words in them:
100314:Status name one: 15
24 1 7 5 43 13 24 64 10 47 31 100 22 20 38 63 49 24 18 82 66 22 21 77 52 8 6 11 50 20 5 1 0
101245:Status name two: 14
2 10 2 2 25 53 3 31 30 1 21 41 9 14 18 40 6 10 18 72 20 16 33 29 19 18 12 60 48 12 8 50 43 13
103765:Yet another name here: 29
45 29 29 475 63 69 47 94 65 65 69 55 53 905 117 57 42 92 90 59 91 52 79 101 192 87 144 74 115 82 78 109 12 96 64 78 111 106 84 19 0 7
102983:Blah blah yada yada: 82
41 37 40 60 82 72 17 41 17 19 43 3
I've tried using different pipe combinations of wc -l and grep/uniq. I also tried counting only the odd lines (which works in the MWE above), but I'm looking for something more general-purpose for a large unstructured dataset.
It depends on how you define a word. If, for example, it's any two consecutive letters, you can just use something like:
grep -E '[a-zA-Z]{2}' fileName | wc -l
You can adjust the regular expression depending on how you define a word (the one I've provided won't pick up "A" or "I" or "I'm", for example), but the concept remains the same.
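If the counting ever needs to happen inside a script rather than a shell pipeline, the same idea carries over directly. The following is a small sketch, assuming the same "two consecutive ASCII letters" definition of a word; count_lines_with_words is a made-up helper name, and the sample lines are taken from the question.

```python
import re

# Same definition as the grep pattern: any two consecutive ASCII letters.
WORD = re.compile(r'[A-Za-z]{2}')


def count_lines_with_words(lines):
    """Count the lines that contain at least one match of WORD."""
    return sum(1 for line in lines if WORD.search(line))


sample = [
    "100314:Status name one: 15",
    "41 37 40 60 82 72 17 41 17 19 43 3",
    "101245:Status name two: 14",
    "103765:Yet another name here: 29",
    "102983:Blah blah yada yada: 82",
]
print(count_lines_with_words(sample))  # 4
```

For a file on disk, passing the open file object works as well, since iterating over it yields one line at a time.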