I want 04 (the previous month, with a leading zero) as the output in premonth. Can someone help with this? I tried different formats with no luck.
premonth = str(int(time.strftime('%m'))-1)
I tried the approach from the question "python date of the previous month", but because of the strftime restriction I was not able to proceed.
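Not the best way, but this should work (January wraps around to 12):
import time
a = str(int(time.strftime('%m')) - 1 or 12)  # previous month; 0 (January - 1) becomes 12
a = '0' + a if len(a) == 1 else a  # zero-pad single-digit months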
The following f-string will give you what you need. It also handles January correctly, using modular arithmetic to ensure 1 becomes 12:
f'{(int(time.strftime("%m")) + 10) % 12 + 1:02}'
Breaking that down, an f-string is a modern way to build strings from arbitrary expressions, in a way that keeps formatting and data together (unlike the old "string".format(item, item) and even older "string" % (item, item)).
Inside that f-string is a rather complex looking expression which is formatted with :02, meaning two places, zero-padded on left.
The expression is what correctly decrements your month with proper wrapping, as you can see from the following table:
+-------+-----+-----+----++-------+-----+-----+----+
| Value | +10 | %12 | +1 || Value | +10 | %12 | +1 |
+-------+-----+-----+----++-------+-----+-----+----+
| 1 | 11 | 11 | 12 || 7 | 17 | 5 | 6 |
| 2 | 12 | 0 | 1 || 8 | 18 | 6 | 7 |
| 3 | 13 | 1 | 2 || 9 | 19 | 7 | 8 |
| 4 | 14 | 2 | 3 || 10 | 20 | 8 | 9 |
| 5 | 15 | 3 | 4 || 11 | 21 | 9 | 10 |
| 6 | 16 | 4 | 5 || 12 | 22 | 10 | 11 |
+-------+-----+-----+----++-------+-----+-----+----+
and the following statement:
print(", ".join([f'{mm}->{(mm + 10) % 12 + 1:02}' for mm in range(1, 13)]))
which outputs:
1->12, 2->01, 3->02, 4->03, 5->04, 6->05, 7->06, 8->07, 9->08, 10->09, 11->10, 12->11
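As an alternative to the modular arithmetic, here is a minimal sketch that lets the standard datetime module do the wrapping. It is not taken from the answers above: the function name previous_month and its optional today parameter are illustrative assumptions. It steps back one day from the first of the current month and formats that date with %m.

```python
import datetime


def previous_month(today=None):
    """Return the previous month as a zero-padded two-digit string."""
    # 'today' is a hypothetical parameter, added only to make testing easy.
    today = today or datetime.date.today()
    first_of_month = today.replace(day=1)
    # The day before the first of this month always falls in the previous month.
    last_of_previous = first_of_month - datetime.timedelta(days=1)
    return last_of_previous.strftime("%m")


print(previous_month())
```

Because the date arithmetic crosses the year boundary on its own, January needs no special case.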
I need help splitting a big file (1.6 million records) into multiple files based on a maximum number of lines per sub-file, with the caveat that an order must not spill across files and appear in more than one file.
Quick overview of the file:
The file has order information about transactions at a retail store. Each order can have multiple items. Below is a small sample file.
sample_file:
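| order_nu | item_nu | Sale |
| -------- | --------|-------|
| 1 | 1 | 10 |
| 1 | 2 | 20 |
| 1 | 3 | 30 |
| 2 | 1 | 10 |
| 2 | 2 | 20 |
| 3 | 1 | 10 |
| 3 | 2 | 10 |
| 4 | 1 | 20 |
| 4 | 2 | 24 |
| 4 | 3 | 34 |
| 4 | 4 | 10 |
| 4 | 5 | 20 |
| 5 | 1 | 30 |
| 5 | 2 | 20 |
| 5 | 3 | 40 |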
Is it possible to write a Linux script that splits a file based on the number of lines, with the caveat that an order must not spill across files and appear in more than one file?
For example, I need the file above to be split so that no sub-file has more than 5 records and no order appears in more than one file (the assumption is that an order will not have more than 5 items). Below is the expected output:
sub_file1 :
| order_nu | item_nu | Sale |
| -------- | --------|-------|
| 1 | 1 | 10 |
| 1 | 2 | 20 |
| 1 | 3 | 30 |
| 2 | 1 | 10 |
| 2 | 2 | 20 |
sub_file2:
| order_nu | item_nu | Sale |
| -------- | --------|-------|
| 3 | 1 | 10 |
| 3 | 2 | 10 |
sub_file3:
| order_nu | item_nu | Sale |
| -------- | --------|-------|
| 4 | 1 | 20 |
| 4 | 2 | 24 |
| 4 | 3 | 34 |
| 4 | 4 | 10 |
| 4 | 5 | 20 |
sub_file4:
| order_nu | item_nu | Sale |
| -------- | --------|-------|
| 5 | 1 | 30 |
| 5 | 2 | 20 |
| 5 | 3 | 40 |
Please let me know if there are any questions
Thank you!
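Try something like this. Note that it splits purely by line count, so it will not keep all lines of an order in the same file:
max_lines=5   # maximum number of lines per sub file
counter=1
while IFS= read -r line
do
    echo "$line" >> "sub_file$counter.txt"
    # start a new sub file once the current one is full
    if [ $(wc -l < "sub_file$counter.txt") -ge "$max_lines" ]
    then
        counter=$((counter+1))
    fi
done < sample_file.txt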
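To also respect the "an order must not span files" requirement, the rows of each order have to be kept together before deciding where to cut. Below is a minimal sketch of that idea in Python rather than shell. The function name split_by_order is hypothetical, and the sketch assumes that rows of the same order are adjacent in the input and that no order has more than max_lines rows (an oversized order would simply end up in a chunk of its own that exceeds the limit).

```python
from itertools import groupby


def split_by_order(rows, max_lines):
    """Pack rows into chunks of at most max_lines without splitting an order."""
    chunks, current = [], []
    # groupby yields runs of consecutive rows that share the same order number.
    for _, group in groupby(rows, key=lambda row: row[0]):
        order_rows = list(group)
        # Close the current chunk if this whole order would not fit into it.
        if current and len(current) + len(order_rows) > max_lines:
            chunks.append(current)
            current = []
        current.extend(order_rows)
    if current:
        chunks.append(current)
    return chunks


# (order_nu, item_nu, Sale) rows from the sample file in the question
rows = [
    (1, 1, 10), (1, 2, 20), (1, 3, 30), (2, 1, 10), (2, 2, 20),
    (3, 1, 10), (3, 2, 10),
    (4, 1, 20), (4, 2, 24), (4, 3, 34), (4, 4, 10), (4, 5, 20),
    (5, 1, 30), (5, 2, 20), (5, 3, 40),
]
chunks = split_by_order(rows, max_lines=5)
for number, chunk in enumerate(chunks, start=1):
    print(f"sub_file{number}: {chunk}")
```

On the sample data this yields four chunks containing orders 1 and 2, order 3, order 4, and order 5, which matches the expected sub-files in the question. Writing each chunk to its own file is left out to keep the sketch short.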
I have a question on the Mathematics Stack Exchange site where I ask about generating an exponential regression equation.
One of the answers provides a mathematical solution to my problem. The solution is written in mathematical notation:
Unfortunately, I'm not a math wiz, and I'm having trouble translating the mathematical notation to Microsoft Excel syntax.
What would the math look like in Excel?
+--------------+---------------+
| X (AGE) | Y (CONDITION) |
+--------------+---------------+
| 0 | 20 |
| 1 | 20 |
| 2 | 20 |
| 3 | 20 |
| 4 | 20 |
| 5 | 20 |
| 6 | 18 |
| 7 | 18 |
| 8 | 18 |
| 9 | 18 |
| 10 | 16 |
| 11 | 16 |
| 12 | 14 |
| 13 | 14 |
| 14 | 12 |
| 15 | 12 |
| 16 | 10 |
| 17 | 8 |
| 18 | 6 |
| 19 | 4 |
| 20 | 2 |
+--------------+---------------+
I can verify that your formula for a translates as follows into Excel:
=SUMPRODUCT(E2:E22,F2:F22)/SUMSQ(E2:E22)
where my E2:E22 is just your x and my F2:F22 is ln(21-y). It gives the same answer, 0.147233112, as doing an exponential fit and forcing the intercept to be zero (which corresponds to setting b=1 in
21-y=b*exp(ax)
as you can verify by taking logs).
The formula quoted is the same as the one mentioned here under Simple linear regression without the intercept term (single regressor).
This raises the question of whether b should, in fact, be equal to 1, but that is outside the scope of the question.
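The SUMPRODUCT/SUMSQ formula can be cross-checked outside Excel. The following is a small sketch in plain Python that recomputes a as the sum of x*ln(21-y) divided by the sum of x squared, using the table from the question. The variable names are illustrative, not from the original posts.

```python
import math

# X (AGE) and Y (CONDITION) columns from the table in the question
ages = list(range(21))
conditions = [20] * 6 + [18] * 4 + [16] * 2 + [14] * 2 + [12] * 2 + [10, 8, 6, 4, 2]

# Equivalent of column F in the answer: ln(21 - y)
log_values = [math.log(21 - y) for y in conditions]

# Equivalent of =SUMPRODUCT(E2:E22,F2:F22)/SUMSQ(E2:E22)
a = sum(x * f for x, f in zip(ages, log_values)) / sum(x * x for x in ages)
print(a)
```

The printed value should agree with the 0.147233112 quoted above to within rounding.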
I've looked at some other posts, but they didn't quite help. I'm not new to Python, but I'm relatively new to pandas, and I'm stumped as to how to accomplish this in any way that isn't horribly inefficient. My data sets are fairly large and have some extraneous columns that I don't need. I've loaded them as dataframes, and they basically look like this:
+---------+---------+--------+-------+
| Subject | Week | Test | Value |
+---------+---------+--------+-------+
| 1 | Week 4 | Test 1 | 4 |
| 1 | Week 8 | Test 1 | 7 |
| 1 | Week 12 | Test 1 | 3 |
| 1 | Week 4 | Test 2 | 6 |
| 1 | Week 8 | Test 2 | 3 |
| 1 | Week 12 | Test 2 | 9 |
| 2 | Week 4 | Test 1 | 1 |
| 2 | Week 8 | Test 1 | 4 |
| 2 | Week 12 | Test 1 | 2 |
| 2 | Week 4 | Test 2 | 8 |
| 2 | Week 8 | Test 2 | 1 |
| 2 | Week 12 | Test 2 | 3 |
+---------+---------+--------+-------+
I want to rearrange the dataframes so that they look like this:
+---------+---------+--------+--------+
| Subject | Week | Test 1 | Test 2 |
+---------+---------+--------+--------+
| 1 | Week 4 | 4 | 6 |
| 1 | Week 8 | 7 | 3 |
| 1 | Week 12 | 3 | 9 |
| 2 | Week 4 | 1 | 8 |
| 2 | Week 8 | 4 | 1 |
| 2 | Week 12 | 2 | 3 |
+---------+---------+--------+--------+
If anyone has any ideas on how I can make this happen, I'd greatly appreciate it, and thank you in advance for your time!
Edit: After trying the solution provided by @HarvIpan, this is the output I'm getting:
+-----------------------------------------------+
| Subject Week Test_Test 1 Test_Test 2 |
+-----------------------------------------------+
| 0 1 Week 12 5 0 |
| 1 1 Week 4 5 0 |
| 2 1 Week 8 11 0 |
| 3 2 Week 12 0 12 |
| 4 2 Week 4 0 14 |
| 5 2 Week 8 0 4 |
+-----------------------------------------------+
Try using df.pivot_table.
You should be able to get the desired outcome with:
df.pivot_table(index=['Subject','Week'], columns='Test', values='Value')
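For reference, here is a self-contained sketch of the pivot_table approach on the sample data from the question. The aggfunc="first" argument and the reset_index step are additions not present in the one-liner above: the former assumes each Subject/Week/Test combination occurs exactly once (the default aggregation is the mean), and the latter turns Subject and Week back into ordinary columns.

```python
import pandas as pd

# Long-format sample data from the question
df = pd.DataFrame({
    "Subject": [1] * 6 + [2] * 6,
    "Week": ["Week 4", "Week 8", "Week 12"] * 4,
    "Test": (["Test 1"] * 3 + ["Test 2"] * 3) * 2,
    "Value": [4, 7, 3, 6, 3, 9, 1, 4, 2, 8, 1, 3],
})

# One row per (Subject, Week), one column per Test
wide = df.pivot_table(index=["Subject", "Week"], columns="Test",
                      values="Value", aggfunc="first").reset_index()
wide.columns.name = None  # drop the leftover "Test" label on the column axis
print(wide)
```

Note that the rows come out sorted by the index values, so the weeks are ordered as strings ("Week 12" before "Week 4") rather than numerically.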
You need to get dummy variables for the Test column with pd.get_dummies(df[['Test']], prefix='Test') and multiply them by Value with .mul(df['Value'], axis=0) before concatenating them back onto your original df. Then group by Subject and Week and sum.
pd.concat([df.drop(['Test', 'Value'], axis=1), pd.get_dummies(df[['Test']], prefix='Test').mul(df['Value'], axis=0)], axis=1).groupby(['Subject', 'Week']).sum().reset_index()
Output:
Subject Week Test_ Test 1 Test_ Test 2
0 1 Week 12 3 9
1 1 Week 4 4 6
2 1 Week 8 7 3
3 2 Week 12 2 3
4 2 Week 4 1 8
5 2 Week 8 4 1
I am trying to build a dataset from an online questionnaire. In this questionnaire, participants were asked to name 6 items. These items are represented with numbers from 1 to 6 (order of mention does not matter). Afterwards, participants were asked to rank those items from most important to least important (order matters here). Right now I have three columns: "Named items", "Items ranked" and "Rank". The last column holds the position at which each item was ranked. The idea is to take the number in the first column, "Named items", find its position in the second column, "Items ranked", and write that position into the corresponding row of the third column.
Since the numbers go from 1 to 6, the process has to start again every six rows (on the 7th row, the 13th row, and so on). I have a total of 186 participants, which means there is a total of 1116 items. What would be the most efficient way of doing this while preventing human error?
Here is an example of what the sheet looks like when done manually:
+----------------------+-----------------------------+------+
| Order of named items | Items ranked (# = Identity) | Rank |
+----------------------+-----------------------------+------+
| 1 | 2 | 4 |
| 2 | 5 | 1 |
| 3 | 6 | 6 |
| 4 | 1 | 5 |
| 5 | 4 | 2 |
| 6 | 3 | 3 |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 3 | 3 |
| 4 | 4 | 4 |
| 5 | 5 | 5 |
| 6 | 6 | 6 |
| 1 | 5 | 3 |
| 2 | 6 | 4 |
| 3 | 1 | 5 |
| 4 | 2 | 6 |
| 5 | 3 | 1 |
| 6 | 4 | 2 |
| 1 | 2 | 2 |
| 2 | 1 | 1 |
| 3 | 6 | 4 |
| 4 | 3 | 5 |
| 5 | 4 | 6 |
| 6 | 5 | 3 |
+----------------------+-----------------------------+------+
You can use this non-volatile formula:
=MATCH(A2,INDEX(B:B,INT((ROW(1:1)-1)/6)*6+2):INDEX(B:B,INT((ROW(1:1)-1)/6)*6+7),0)
Assuming the first column starts at A2 and the second column at B2, use this formula in C2 and copy it down:
=MATCH(A2,OFFSET(B$2,6*INT((ROWS(C$2:C2)-1)/6),0,6),0)
OFFSET returns the required 6-cell range, and MATCH finds the position of the relevant item within it.
See screenshot below
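The logic behind both formulas (take the block of six rows a cell belongs to, then find the item's position inside that block) can be sketched in Python to check results outside the spreadsheet. The function name ranks_per_block is hypothetical, and the sketch assumes the two columns have equal length and that every named item appears in its own block.

```python
def ranks_per_block(named, ranked, block_size=6):
    """For each named item, return its 1-based position within its block of 'ranked'."""
    ranks = []
    for start in range(0, len(named), block_size):
        block = ranked[start:start + block_size]
        for item in named[start:start + block_size]:
            # list.index is 0-based, MATCH is 1-based
            ranks.append(block.index(item) + 1)
    return ranks


# First and fourth participant from the example sheet
named = [1, 2, 3, 4, 5, 6] * 2
ranked = [2, 5, 6, 1, 4, 3, 5, 6, 1, 2, 3, 4]
print(ranks_per_block(named, ranked))
```

For these two participants the result is 4, 1, 6, 5, 2, 3 followed by 3, 4, 5, 6, 1, 2, the same values as in the Rank column of the example.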
I have the following problem:
I want to give a range of numbers scores on a scale from 1 to 10, for example:
| | A | B |
|---|------|----|
| 1 | 1209 | 1 |
| 2 | 401 | 7 |
| 3 | 123 | 9 |
| 4 | 49 | 10 |
| 5 | 30 | 10 |
(I'm not sure B is 100% correct, but it is roughly right.)
I got the B values with
=ABS(CEILING(A1;MAX($A$1:$A$32)/10)*10/MAX($A$1:$A$32)-11)
It seems to work, but not if I take numbers like these, for example:
| | A | B |
|---|------|----|
| 1 | 100 | 1 |
| 2 | 90 | 2 |
| 3 | 80 | 3 |
| 4 | 70 | 4 |
| 5 | 50 | 6 |
Here 50 gets a score of 6, but I want 50 to be 10.
I would like it to be scalable, so that it works with a 1-10, 1-100, 5-27 or any other scale, with any number of values in the list and any values to score.
Thanks!
Use this formula:
=$E$1 + ROUND((MIN($A:$A)-A1)/((MAX($A:$A)-MIN($A:$A))/($E$1-$E$2)),0)
It is scalable. Put the maximum and minimum of the scale in E1 and E2.
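The same linear rescaling can be sketched in Python to see what the formula does to a list of values. The names scale_scores, top and bottom are illustrative (top and bottom play the role of E1 and E2). Two assumptions are worth stating: the values are not all equal (otherwise the division fails), and the helper rounds halves away from zero, as Excel's ROUND does, because Python's built-in round uses banker's rounding.

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with halves away from zero (like Excel's ROUND)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def scale_scores(values, top, bottom):
    """Map the smallest value to 'top' and the largest value to 'bottom', linearly."""
    lowest, highest = min(values), max(values)
    # Algebraically the same as (MIN - A1) / ((MAX - MIN) / (E1 - E2))
    return [
        top + round_half_away((lowest - v) * (top - bottom) / (highest - lowest))
        for v in values
    ]


print(scale_scores([100, 90, 80, 70, 50], top=10, bottom=1))
```

With the second example from the question, 100 maps to 1 and 50 maps to 10, with the values in between spread linearly across the scale.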