**I have some text files and I need to convert them to Excel format.**
Txt_1 ,
Txt_2
*I'm using Python pandas to do it, but this is the output I get:*
Converted_excel_file
After conversion, the data is not laid out properly.
*Code:*
import pandas as pd
my_cols = [str(i) for i in range(20)]  # create some column names
df = pd.read_table('TEST_1.txt',
                   sep=r'\s+[\s|\r]',  # raw string so the backslashes reach the regex engine
                   skip_blank_lines=False,
                   skipinitialspace=True,
                   names=my_cols,
                   engine="python")
df.to_excel('Converted_excel_file.xlsx')
How can I convert a text file properly using pandas, xlwt, or openpyxl?
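One possible approach is sketched below. It assumes that columns in the text file are separated by two or more whitespace characters while single spaces occur only inside values; the real layout of TEST_1.txt is not shown, so this is a guess. The names `text_to_frame` and `text_to_excel` are made up for illustration, and writing `.xlsx` files with pandas needs the openpyxl package installed.

```python
import io
import pandas as pd

def text_to_frame(source):
    """Parse text whose columns are separated by 2+ whitespace characters."""
    return pd.read_csv(
        source,
        sep=r"\s{2,}",    # assumption: at least two spaces between columns
        engine="python",  # regex separators need the python engine
        header=None,      # assumption: the file has no header row
    )

def text_to_excel(txt_path, xlsx_path):
    """Convert one text file to an Excel workbook (needs openpyxl)."""
    text_to_frame(txt_path).to_excel(xlsx_path, index=False, header=False)

# Small in-memory demonstration of the parsing step only.
sample = io.StringIO("Lot ID  A 123  7\nWafer  B 456  9\n")
demo = text_to_frame(sample)
print(demo)
```

If the file is truly fixed-width rather than space-delimited, `pd.read_fwf` may be a better fit than a regex separator.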
Related
Any idea how I can access the boxed data (see image) under the TI_Binning tab of an Excel file using Python? What module or similar code can you recommend? I just need that specific data, which I want to append to another file such as a .txt file.
Getting the data you circled:
import pandas as pd
df = pd.read_excel('yourfilepath', 'TI_Binning', skiprows=2)
df = df[['Number', 'Name']]
To append to an existing text file:
import numpy as np
with open("filetoappenddata.txt", "ab") as f:
    # fmt='%s' lets savetxt write non-numeric values such as names
    np.savetxt(f, df.values, fmt='%s')
See the np.savetxt documentation for format options that fit your output needs.
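As an alternative to np.savetxt (a different technique from the one in the answer), pandas can append the rows itself through `DataFrame.to_csv` with `mode="a"`. The sketch below uses a small made-up DataFrame in place of the real `Number` and `Name` columns, and `append_frame` is a hypothetical helper name.

```python
import os
import tempfile
import pandas as pd

def append_frame(df, path):
    """Append the rows of df to a text file, one space-separated row per line."""
    df.to_csv(path, mode="a", sep=" ", header=False, index=False)

# Hypothetical stand-in for the 'Number' and 'Name' columns read from Excel.
df = pd.DataFrame({"Number": [1, 2], "Name": ["BinA", "BinB"]})

# Create a file that already has content, then append to it.
path = os.path.join(tempfile.mkdtemp(), "filetoappenddata.txt")
with open(path, "w") as f:
    f.write("existing line\n")

append_frame(df, path)

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```

This keeps mixed numeric and string columns without needing a format string, at the cost of less control over number formatting than np.savetxt offers.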
I have downloaded a zip file containing .dbf and .shp files from a website using Python.
from zipfile import ZipFile
with ZipFile(r'.\ZipDataset\ABC.zip', 'r') as zip_object:  # raw string keeps the backslashes literal
    print(zip_object.namelist())
    zip_object.extract('A.dbf')
    zip_object.extract('B.shp')
My question is: how do I convert the files with the above extensions to an Excel spreadsheet using Python?
You need additional libraries such as pandas, pyshp, or geopandas.
According to this link, the easiest way is:
You need to install two libraries:
pip install pandas
pip install pyshp
and then run this code:
import shapefile
import pandas as pd
#read file, parse out the records and shapes
sf = shapefile.Reader(r'.\ZipDataset\ABC.zip')
fields = [x[0] for x in sf.fields][1:]
records = sf.records()
shps = [s.points for s in sf.shapes()]
#write into a dataframe
df = pd.DataFrame(columns=fields, data=records)
df = df.assign(coords=shps)
df.to_csv('output.csv', index=False)
I am pretty new to Python (using Python 3) and am using pandas to import a dataset.
I need to import the dataset from this URL: https://newonlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/leukemia_remission/index.txt
and convert it to a CSV file, but I am getting some special characters in the converted CSV: ��
I am downloading the txt file and converting it to CSV. Is this the right approach?
Also, the converted CSV puts the entire text into one column.
from urllib.request import urlretrieve
import pandas as pd
from pandas import DataFrame
url = 'https://newonlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/leukemia_remission/index.txt'
urlretrieve(url, 'index.txt')
df = pd.read_csv('index.txt', sep='/t', engine='python', lineterminator='\r\n')
csv_file = df.to_csv('index.csv', sep='\t', index=False, header=True)
print(csv_file)
After a successful import, I have to extract X as all columns except the first column, and Y as the first column.
I'd appreciate any help.
from urllib.request import urlretrieve
import pandas as pd
url = 'https://newonlinecourses.science.psu.edu/stat501/sites/onlinecourses.science.psu.edu.stat501/files/data/leukemia_remission/index.txt'
urlretrieve(url, 'index.txt')
df = pd.read_csv('index.txt', sep='\t',encoding='utf-16')
Y = df[['REMISS']]
X = df.drop(['REMISS'],axis=1)
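The �� characters in the question are what a UTF-16 byte order mark typically looks like when decoded with the wrong codec, which is presumably why `encoding='utf-16'` helps here. As a rough sketch, the first bytes of a downloaded file can be checked against the known byte order marks before choosing an encoding. `guess_encoding` and `guess_file_encoding` are hypothetical helpers, and a file without a byte order mark simply falls back to the default.

```python
import codecs

# Byte order marks, with UTF-32 checked before UTF-16 because the
# UTF-32 LE mark begins with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(raw, default="utf-8"):
    """Return a codec name based on the byte order mark at the start of raw."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default

def guess_file_encoding(path):
    """Read the first four bytes of a file and guess its encoding."""
    with open(path, "rb") as f:
        return guess_encoding(f.read(4))

detected = guess_encoding("REMISS\tCELL".encode("utf-16"))
print(detected)  # utf-16
```

The result could then be passed along as `pd.read_csv('index.txt', sep='\t', encoding=guess_file_encoding('index.txt'))`.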
I use the following code to read a CSV file and save it as a pandas DataFrame, but it always returns the iris dataset, not my data. What is the problem?
import pandas as pd
a = pd.read_csv(r"D:\data.csv")
print(a)
I have an Excel file with a string stored in each cell:
rtypl srtyn OCVXZ srtyn
KPLNV KLNWZ bdfgh KLNWZ
xcvwh mvwhd WQKXM mvwhd
GYTR xvnm YTZN YTZN
ngws jklp PLNM jklp
I want to read the Excel file and write it to a CSV file, as you can see below:
import pandas as pd
import csv
df = pd.read_excel(file, encoding='utf-16')
words= open("words.csv",'wb')
wr = csv.writer(words, dialect='excel')
for item in df:
    wr.writerow(item)
But it splits each string into separate letters instead of writing it as one string:
r,t,y,p,l
I have to write the file as CSV because I am going to use the result in a library that has a lot of facilities for CSV files. Any advice on how to keep each cell as a whole string is appreciated.
You can try the easiest solution:
# -*- coding: utf-8 -*-
import pandas as pd
# recent pandas versions reject an encoding argument in read_excel, so it is omitted here
df = pd.read_excel(file)
df.to_csv('words.csv', encoding='utf-16')
Adding to zipa's answer: if the Excel file has multiple sheets, you can also try:
import pandas as pd
df = pd.read_excel(file, 'Sheet1')
df.to_csv('words.csv')
Reference:
http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/
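For context on why the loop in the question produced `r,t,y,p,l`: iterating over a DataFrame yields its column labels rather than its rows, and `csv.writer.writerow` treats a plain string as a sequence of characters. The sketch below reproduces that behaviour with a small made-up DataFrame (the real Excel file is not available) and shows one way to write whole rows by hand when `to_csv` is not an option.

```python
import csv
import io
import pandas as pd

# Made-up data standing in for the first rows of the Excel sheet.
df = pd.DataFrame(
    [["KPLNV", "KLNWZ"], ["xcvwh", "mvwhd"]],
    columns=["rtypl", "srtyn"],
)

# Iterating a DataFrame yields its column labels, not its rows.
labels = list(df)

# writerow() treats a plain string as a sequence of characters.
broken = io.StringIO()
csv.writer(broken).writerow(labels[0])

# Passing whole rows (sequences of cells) keeps each string intact.
fixed = io.StringIO()
writer = csv.writer(fixed)
writer.writerow(df.columns)
for row in df.itertuples(index=False):
    writer.writerow(row)

print(broken.getvalue())
print(fixed.getvalue())
```

In practice `df.to_csv('words.csv', index=False)` does the same job in one call; the manual loop is only useful when rows need processing before they are written.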