How to automatically adjust spaces to make columns in Notepad++ [duplicate] - text

This question already has answers here:
Notepad++ - Aligning text vertically in multiple columns
(2 answers)
Closed 1 year ago.
I have two columns separated by :.
'col2': 'str',
'col3': 'float',
'col4': 'float'
'DBN': 'float',
'School Name': 'float',
'Category': 'float',
'Year': 'float',
'Total Enrollment': 'float',
'#Grade K': 'float',
'#Grade 1': 'float',
'#Grade 2': 'float',
'#Grade 3': 'float',
'#Grade 4': 'float',
I want to automatically add spaces so resulting text is:
'col2':             'str',
'col3':             'int',
'col4':             'float'
'DBN':              'str',
'School Name':      'str',
'Category':         'str',
'Year':             'int',
'Total Enrollment': 'int',
'#Grade K':         'float',
'#Grade 1':         'float',
'#Grade 2':         'float',
'#Grade 3':         'float',
'#Grade 4':         'float',
I am looking for a solution in Notepad++, but if you know of other tools for this task, please share them as well.
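Since other tools are welcome, here is a minimal Python sketch of the same alignment. The function name `align_columns` and the sample text are made up for illustration, and it assumes the first `:` on each line is the column separator, so a key that itself contains a colon would be split in the wrong place.

```python
def align_columns(text, sep=":"):
    """Pad the text before the first `sep` on each line so the second column lines up."""
    rows = [line.split(sep, 1) for line in text.splitlines()]
    # Width of the longest first column among lines that contain the separator
    width = max((len(r[0].rstrip()) for r in rows if len(r) == 2), default=0)
    out = []
    for r in rows:
        if len(r) == 2:
            key, value = r[0].rstrip(), r[1].strip()
            # Separator stays attached to the key; one space after the longest key
            out.append(f"{key}{sep}".ljust(width + len(sep) + 1) + value)
        else:
            out.append(r[0])  # lines without a separator are left untouched
    return "\n".join(out)


# Hypothetical sample taken from the lines in the question
sample = """'col2': 'str',
'Total Enrollment': 'int',
'#Grade K': 'float',"""
aligned = align_columns(sample)
print(aligned)
```

Paste the text into `sample` (or read it from a file), run the script, and copy the printed result back into the editor.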

Update:
TextFX is outdated and doesn't work on the current version of Notepad++.
There has been some effort to make it work again. The source code probably works, but binaries are not available:
https://github.com/PhilR73/NPPTextFX
Thanks to a suggestion by Jeff Holt, I easily found a solution: the TextFX plugin.
Screenshots from https://www.youtube.com/watch?v=5cQ-1JnsW_g
Before:
After:

Related

Modify a column in Python such that the numbering is continuous

I have a dataset given as such:
#Load the required libraries
import pandas as pd
#Create dataset
data = {'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
'Run_time': [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4],
'Married': ['No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'No'],
'Self_Employed': ['No', 'No', 'Yes', 'No', 'No', 'No', 'Yes', 'No', 'No', 'Yes', 'No', 'No'],
'LoanAmount': [123, 128, 66, 120, 141, 52,96,15,85,36,58,89],
}
#Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)
The dataset looks as such:
Here, in the 'Run_time' column, the numbering starts at different index values.
I wish to ensure that the 'Run_time' column starts from 1 only.
The dataset needs to look as such:
Can somebody please let me know how to modify this column in Python such that the numbering is continuous?
import pandas as pd
#Create dataset
data = {'team': ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A'],
'Run_time': [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4],
'Married': ['No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'No'],
'Self_Employed': ['No', 'No', 'Yes', 'No', 'No', 'No', 'Yes', 'No', 'No', 'Yes', 'No', 'No'],
'LoanAmount': [123, 128, 66, 120, 141, 52,96,15,85,36,58,89],
}
#Convert to dataframe
df = pd.DataFrame(data)
# Overwrite Run_time with a continuous 1-based count (relies on the default 0..n-1 index)
df['Run_time'] = df.index + 1
print(df)
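Note that `df.index + 1` only gives 1, 2, 3, ... when the dataframe still has its default integer index. As a hedged sketch, assuming a dataframe whose index is something else (the index values below are made up), you can number the rows by position instead:

```python
import pandas as pd

# Hypothetical dataframe with a non-default index
df = pd.DataFrame(
    {"team": ["A"] * 6, "Run_time": [1, 2, 3, 1, 2, 3]},
    index=[10, 11, 12, 20, 21, 22],
)

# Number the rows by position, independent of the index labels
df["Run_time"] = range(1, len(df) + 1)
print(df)
```

Calling `df.reset_index(drop=True)` first and then using the `df.index + 1` approach would work as well, at the cost of discarding the old index.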

Create an additional column in a dataframe based on a specific condition

I have a dataset given as such:
#Load the required libraries
import pandas as pd
#Create dataset
data = {'team': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
'Run_time': [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4],
'Married': ['No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes', 'No'],
'Self_Employed': ['No', 'No', 'Yes', 'No', 'No', 'No', 'Yes', 'No', 'No', 'Yes', 'No', 'No'],
'LoanAmount': [123, 128, 66, 120, 141, 52,96,15,85,36,58,89],
}
#Convert to dataframe
df = pd.DataFrame(data)
print("df = \n", df)
Here, I wish to add an additional column 'Last_entry' which will contain 0's and 1's.
This column appears such that, for team-A, the last run-time is 5. So that row has Last_entry=1 and all other run-times for team-A should be 0.
For team-B, the last run-time is 3. So that row has Last_entry=1 and all other run-times for team-B should be 0.
For team-C, the last run-time is 4. So that row has Last_entry=1 and all other run-times for team-C should be 0.
The net result needs to look as such:
New dataframe by adding additional column
Can somebody please let me know how to achieve this task in python?
I wish to add an additional column in an existing dataset by using python
You can use groupby and tail to get the last entry for each team. Then make a new column of zeroes, and set the resulting rows to one:
# Determine indices for last entries
last_entry_idx = df.groupby('team').tail(1).index
# Create new column
df['last_entry'] = 0
df.loc[last_entry_idx, 'last_entry'] = 1
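As an alternative sketch (a different technique from groupby/tail, not a replacement for it), the last row of each team can also be flagged with `duplicated(keep="last")`. Like the approach above, this assumes the rows are already ordered so that the last row of a team is its last run. The shortened dataset is made up for illustration.

```python
import pandas as pd

# Shortened, hypothetical version of the dataset in the question
df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "B", "C", "C", "C"],
    "Run_time": [1, 2, 3, 1, 2, 1, 2, 3],
})

# duplicated(keep="last") is True for every row except the last one of each team,
# so negating it marks exactly the last entries
df["Last_entry"] = (~df["team"].duplicated(keep="last")).astype(int)
print(df)
```

This avoids creating the column in two steps, but it only looks at the 'team' column; if the grouping depends on several columns, pass them as a list, e.g. `df.duplicated(subset=[...], keep="last")`.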

Can't run matplotlib library, error in (The DTypes <class 'numpy.dtype[float64]'> and <class 'numpy.dtype[datetime64]'> do not have a common DType)

I can't get this code to run and produce a result. It fails with the error message [TypeError: The DTypes <class 'numpy.dtype[float64]'> and <class 'numpy.dtype[datetime64]'> do not have a common DType. For example they cannot be stored in a single array unless the dtype is object.]. [data set][1]
import pandas
import matplotlib.pyplot as plt
headers = ['Day', 'Hour', 'time', 'T', 'P0', 'P', 'Pa', 'U', 'Ff', 'R', 'Q']
dtypes = {'Day': 'str', 'Hour': 'str', 'time': 'str', 'T': 'float', 'P0': 'float', 'Pa': 'float',
'U': 'float', 'Ff': 'float', 'R': 'float', 'Q': 'float'}
parse_dates = ['Day', 'Hour', 'time']
points = pandas.read_excel('last.xlsx', sheet_name=1,
names=headers, dtype=str, parse_dates=True)
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
points.time = pandas.to_datetime(points.time)
points.iloc[:,3:] = points.iloc[:,3:].astype(float)
x = points['time'].values
y = points['T'].values
z = points['Q'].values
ax.scatter(x, y, z, c='r', marker='o')
plt.show()
What am I doing wrong?
[1]: https://disk.yandex.ru/i/JxQBErHf7hptJw
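A likely cause, stated as an assumption because it was not checked against the linked data set: the 3D scatter needs to put x, y and z into one numeric array, and a datetime64 column cannot be combined with float columns. A minimal sketch of one workaround is to convert the timestamps to plain numbers (here, hours since the first sample) before plotting. The small dataframe below is a made-up stand-in for the spreadsheet.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the spreadsheet, read as strings like in the question
points = pd.DataFrame({
    "time": ["2021-01-01 00:00", "2021-01-01 03:00", "2021-01-01 06:00"],
    "T": ["1.5", "2.0", "2.5"],
    "Q": ["10", "20", "30"],
})

points["time"] = pd.to_datetime(points["time"])

# Hours elapsed since the first sample, as plain floats instead of datetime64
elapsed = points["time"] - points["time"].min()
x = (elapsed.dt.total_seconds() / 3600).to_numpy()
y = points["T"].astype(float).to_numpy()
z = points["Q"].astype(float).to_numpy()

# All three coordinates now share one dtype and can be stacked together
xyz = np.column_stack([x, y, z])
print(xyz.dtype)
```

The arrays `x`, `y` and `z` can then be passed to `ax.scatter(x, y, z, ...)` as in the question. The x axis will show elapsed hours rather than dates, so the tick labels need to be formatted separately if real dates are wanted.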

Pandas error TypeError: bad operand type for unary ~: 'float'

I have inherited this piece of code
import numpy as np
import pandas as pd
dummy_data1 = {
'id': ['1', '2', '3', '4', '5'],
'Feature1': ['A', 'C', 'E', 'G', 'I'],
'Feature2': ['Mouse', 'dog', 'house and parrot', '23', np.nan],
'dates': ['12/12/2020','12/12/2020','12/12/2020','12/12/2020','12/12/2020']}
df1 = pd.DataFrame(dummy_data1, columns = ['id', 'Feature1', 'Feature2', 'dates'])
df1 = df1.assign(
Feature2=lambda df: df.Feature2.where(
~df.Feature2.str.isnumeric(),
pd.to_numeric(df.Feature2, errors="coerce").astype("Int64"),
)
)
print(df1)
I know the error is caused by the NaN value. What does the code do? My understanding is that it tries to convert a string to an integer if the string is numeric. Please also tell me how to fix this.
You can try via pd.to_numeric() and then fill NaN's:
df['Feature2']=pd.to_numeric(df['Feature2'], errors="coerce").fillna(df['Feature2'])
Or keep the where() approach and use fillna() to fill the NaN values in your condition ~df.Feature2.str.isnumeric():
df['Feature2']=df['Feature2'].where(~df.Feature2.str.isnumeric().fillna(True),
pd.to_numeric(df.Feature2, errors="coerce").astype("Int64")
)
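To show where the error comes from, here is a small self-contained sketch. On an object column, `.str.isnumeric()` returns NaN (a float) for the missing value, so the mask is not purely boolean and `~` fails on it. The example fills the mask and casts it to bool explicitly, which is a slightly more defensive variant of the `fillna()` idea above. The Series is a made-up copy of the 'Feature2' column, created with `dtype=object` so that it behaves the same way across pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical copy of the 'Feature2' column from the question
s = pd.Series(["Mouse", "dog", "house and parrot", "23", np.nan], dtype=object)

# Treat missing values as "not numeric" and force a real boolean mask
is_numeric = s.str.isnumeric().fillna(False).astype(bool)

# Keep the original value where it is not numeric, otherwise use the parsed number
result = s.where(~is_numeric, pd.to_numeric(s, errors="coerce"))
print(result.tolist())
```

Here the numeric string comes back as a float. If nullable integers are preferred, `.astype("Int64")` can be applied to the `pd.to_numeric(...)` result as in the original code.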

Creating a dictionary inside another dictionary

Given the following data, how can I create a dictionary where the keys are the names of the students and the values are dictionaries mapping each test to the grade the student got in it?
grades = [
['Students', 'Test 1', 'Test 2', 'Test 3'],
['Tomas', '100', '90', '80'],
['Marcos', '88', '99', '111'],
['Flavia', '45', '56', '67'],
['Ramon', '59', '61', '67'],
['Ursula', '73', '79', '83'],
['Federico', '89', '97', '101']
]
I tried doing this, but I don't know why it's not showing the grades correctly.
notas_dict={}
def dic(etiquets, notas):
    for i in range(len(etiquets)):
        notas_dict[etiquets[i]]=int(notas[i])
    return notas_dict
dic(['Test 1','Test 2', 'Test 3'], ['100','80','90'] )
dic_final={}
for line in grades[1:]:
    line_grades=[int(element) for element in line[1:]]
    dic_final[line[0]]=dic(['Test 1','Test 2', 'Test 3'], line_grades)
print(dic_final)
The output should be:
{'Tomas': {'Test 1': 100, 'Test 2': 90, 'Test 3': 80}, 'Marcos': {'Test 1': 88, 'Test 2': 99, 'Test 3': 111}, 'Flavia': {'Test 1': 45, 'Test 2': 56, 'Test 3': 67}, 'Ramon': {'Test 1': 59, 'Test 2': 61, 'Test 3': 67}, 'Ursula': {'Test 1': 73, 'Test 2': 79, 'Test 3': 83}, 'Federico': {'Test 1': 89, 'Test 2': 97, 'Test 3': 101}}
You can use:
{i[0]:dict(zip(grades[0][1:],i[1:])) for i in grades[1:]}
results in:
{'Tomas': {'Test 1': '100', 'Test 2': '90', 'Test 3': '80'},
'Marcos': {'Test 1': '88', 'Test 2': '99', 'Test 3': '111'},
'Flavia': {'Test 1': '45', 'Test 2': '56', 'Test 3': '67'},
'Ramon': {'Test 1': '59', 'Test 2': '61', 'Test 3': '67'},
'Ursula': {'Test 1': '73', 'Test 2': '79', 'Test 3': '83'},
'Federico': {'Test 1': '89', 'Test 2': '97', 'Test 3': '101'}}
If you want to get grades as int:
{i[0]:dict(zip(grades[0][1:],list(map(int,i[1:])))) for i in grades[1:]}
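As for why the original attempt shows the wrong grades: the function writes into the single module-level `notas_dict` and returns that same object every time, so every student ends up pointing at one shared dictionary holding the last student's grades. A minimal sketch of a fix, keeping the function-based approach (the name `build_grades` and the shortened data are made up for illustration), is to create a new dictionary inside each call:

```python
# Shortened, hypothetical version of the data in the question
grades = [
    ['Students', 'Test 1', 'Test 2', 'Test 3'],
    ['Tomas', '100', '90', '80'],
    ['Marcos', '88', '99', '111'],
]


def build_grades(labels, scores):
    """Return a new {test: grade} dictionary for one student."""
    result = {}  # fresh dictionary on every call, nothing is shared
    for label, score in zip(labels, scores):
        result[label] = int(score)
    return result


# Test names come from the header row, student names from the first column
final = {row[0]: build_grades(grades[0][1:], row[1:]) for row in grades[1:]}
print(final)
```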
Create a dataframe, then use to_records to get a record array in which each record is a row. You can then access the fields of each record by index.
grades = [
['Students', 'Test 1', 'Test 2', 'Test 3'],
['Tomas', '100', '90', '80'],
['Marcos', '88', '99', '111'],
['Flavia', '45', '56', '67'],
['Ramon', '59', '61', '67'],
['Ursula', '73', '79', '83'],
['Federico', '89', '97', '101']
]
import pandas as pd
columns = grades[0]
df = pd.DataFrame(columns=columns)
# Append each data row at the next integer index
for i in range(1, len(grades)):
    df.loc[len(df)] = grades[i]
print(df.to_records())
output:
[(0, 'Tomas', '100', '90', '80') (1, 'Marcos', '88', '99', '111')
(2, 'Flavia', '45', '56', '67') (3, 'Ramon', '59', '61', '67')
(4, 'Ursula', '73', '79', '83') (5, 'Federico', '89', '97', '101')]
or
# Avoid naming the variable "dict", which would shadow the built-in type
records = df.T.to_dict()
for k, v in records.items():
    print(k, v['Students'], v['Test 1'], v['Test 2'], v['Test 3'])
