I'm struggling to open a text file using pandas.
I have this dataset from an experiment and it should be an 87x249 table.
However, when I use the pandas pd.read_csv() command I always get a table that is 87x1. I tried changing the delimiter, but I always get different 87x1 tables.
I tried opening the ASCII file in Excel and then saving it as a CSV. Then it worked and I got a nice table. But the point for me is to use the txt file directly.
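An 87x1 result usually means the separator passed to read_csv() never occurs in the file, so each line ends up as a single cell. Since the actual delimiter of the file is not stated, here is a minimal sketch that tries a few common candidates and keeps the one that yields the most columns. The function name and the candidate list are assumptions made for illustration, and header=None assumes the file has no header row.

```python
import io

import pandas as pd


def read_with_best_separator(text, candidates=(",", ";", "\t", r"\s+")):
    """Parse `text` with each candidate separator and keep the widest result.

    Hypothetical helper: the candidate list is a guess at common delimiters,
    not something known about the original file.
    """
    best = None
    for sep in candidates:
        try:
            # A fresh StringIO per attempt, because each parse consumes the buffer.
            frame = pd.read_csv(io.StringIO(text), sep=sep, header=None)
        except pd.errors.ParserError:
            # Rows with inconsistent field counts for this separator: skip it.
            continue
        if best is None or frame.shape[1] > best.shape[1]:
            best = frame
    return best


sample = "1 2 3\n4 5 6\n"
table = read_with_best_separator(sample)
print(table.shape)
```

For a file on disk, the text can be obtained with open(path).read(). Once the right separator is known, a single pd.read_csv(path, sep=..., header=None) call is enough.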
Related
How can I use JupyterLab to read and extract tables from PDF files?
A typical PDF file has text, titles, subtitles, and tables in between. I need code to extract the table under a specific title and to clean out unwanted text such as page numbers.
What code could do that?
Tabula-py: you can parse a PDF and convert it into a CSV, TSV, JSON, or a pandas DataFrame.
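A rough sketch of how that might look, assuming tabula-py and a Java runtime are installed. The helper names are hypothetical, and the page-number rule (a row whose only non-empty cell is a bare integer) is an assumption that would need adjusting for a real document. Selecting the table under a specific title is not handled here.

```python
import pandas as pd


def read_pdf_tables(pdf_path, pages="all"):
    """Return the tables tabula-py finds in the PDF as a list of DataFrames."""
    # Imported lazily because tabula-py needs a Java runtime to be present.
    import tabula

    return tabula.read_pdf(pdf_path, pages=pages, multiple_tables=True)


def drop_page_number_rows(table):
    """Drop rows whose only non-empty cell is a bare integer.

    Assumption: such rows are stray page numbers picked up with the table.
    """

    def looks_like_page_number(row):
        cells = [str(v).strip() for v in row if pd.notna(v) and str(v).strip()]
        return len(cells) == 1 and cells[0].isdigit()

    mask = table.apply(looks_like_page_number, axis=1)
    return table[~mask].reset_index(drop=True)
```

In a notebook, something like [drop_page_number_rows(t) for t in read_pdf_tables("report.pdf")] would clean every extracted table; "report.pdf" is a placeholder file name.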
Pandas DataFrames - how do I export list 'X' to a CSV so it appears as a string? The problem is when I open the CSV using Excel it appears in date format.
X=['1-4', '1-5', '2-3', '4-8']
i.e., when list 'X' is exported to a CSV and opened with Excel, it appears as a date:
I would like list 'X' to appear in Excel as is, that is, not converted to date format.
Desired output for Excel is:
I have tried the following code - but it throws an error:
import pandas as pd
X=['1-4', '1-5', '2-3', '4-8']
Y=[1,4,3,5]
df=pd.DataFrame(list(zip(X,Y)))
column_names=['A','B']
df.columns=[column_names]
df.A.to_string()
df.to_csv('yyy.csv', mode='a', header=True)
Thank you
Worked fine for me...
Maybe Excel, or whatever program you use to open the file, is casting it... try opening it as a text file...
Even if Excel reads in date format, when you open in pandas it will come in original format (at least in my case). If someone only wants to save data in csv and work in pandas again, it should be fine.
I also tried the 2nd option here (https://www.winhelponline.com/blog/stop-excel-convert-text-to-number-date-format-csv-file/), which imports the data as text, and then saved the file again. It worked for me.
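The pandas round trip mentioned above can be checked with a short sketch. It uses an in-memory buffer instead of 'yyy.csv' so it runs without touching the disk, and it names the columns with a flat list of labels (through a dict) rather than a nested list. The dtype argument on read is a precaution added here, not something the answers above used.

```python
import io

import pandas as pd

X = ['1-4', '1-5', '2-3', '4-8']
Y = [1, 4, 3, 5]

# Building the frame from a dict names the columns 'A' and 'B' directly.
df = pd.DataFrame({'A': X, 'B': Y})

# Write the CSV text; index=False leaves out the row-number column.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read it back, asking pandas to keep column 'A' as plain strings.
buffer.seek(0)
restored = pd.read_csv(buffer, dtype={'A': str})
print(restored['A'].tolist())
```

The CSV text itself contains the values exactly as written ('1-4', '1-5', ...); the date display only appears once Excel opens the file and guesses a type.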
I have a dictionary that I want to write to a CSV file. When the string value is written, it shows up as a float, but I need the same string value in the CSV file, not a float. Any idea?
mydict={'date':int(20200729),'number':int(123),'code':int(707),'cipher':str('54545417e92')}
print mydict.values()
with open('formatting.csv','ab') as f:
    w=csv.writer(f)
    w.writerow(mydict.keys())
    w.writerow(mydict.values())
Manually increase the width of the column and you'll see the format changes.
Since you are writing a csv file (which, unlike xlsx, does not contain its own styling and formatting), it's not related to Python and there's nothing Python can do to make Excel use a specific format.
Like DeepSpace said, this comes from what Excel is doing, not from what Python is doing. But from my experience, once Excel has opened the file and assumed your data to be a float, you cannot get that precision back. I suggest viewing your raw data by opening the .CSV file with a text editor instead of Excel.
If you must open the file in Excel, then there is a different way to do it. Open a blank Excel document, go to the Data tab, and click "From Text/CSV". Then follow the prompts and use the wizard to import your data. This way you can make sure the data type does not change from string to float.
EDIT - as a side note, I see that you tagged your question with "python 3.x", but in your example you use the old Python 2 syntax of print "a string". Starting in Python 3.0, you must use print("a string").
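To confirm that Python itself writes the string unchanged, here is a small Python 3 sketch of the same export. It writes to an in-memory buffer so the raw CSV text can be inspected directly; the helper name is made up for illustration.

```python
import csv
import io

mydict = {'date': 20200729, 'number': 123, 'code': 707, 'cipher': '54545417e92'}


def dict_to_csv_text(row):
    """Return a two-line CSV (keys, then values) for a single dictionary."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerow(row.keys())
    writer.writerow(row.values())
    return buffer.getvalue()


csv_text = dict_to_csv_text(mydict)
print(csv_text)
```

The second line of the output ends with 54545417e92 exactly as typed. When writing to a real file in Python 3, open it in text mode, for example open('formatting.csv', 'a', newline=''), rather than in binary mode.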
A third-party program, 'Eclipse Orchestrator', saves its config file in 'csv' format. Among other things, it includes camera exposure times like '1/2000' to indicate a 1/2000 sec exposure. Here are some sample lines from the csv file:
FOR,(VAR),0.000,5.000,49.000
TAKEPIC,MAGPRE (VAR),-,00:01:10.0,EOS450D,1/2000,9.0,100,0.000,RAW+FL,,N,Partial 450D
ENDFOR
When the csv file is loaded into Excel the screen display reads 'Jan-00'. So Excel interprets the string 1/2000 as a date. When the file is saved again as csv and inspected in an ascii editor it reads:
FOR,(VAR),0,5,49,,,,,,,,
TAKEPIC,MAGPRE (VAR),-,01:10.0,EOS450D,Jan-00,9,100,0,RAW+FL,,N,Partial 450D
ENDFOR,,,,,,,,,,,,
I had hoped to use Excel to parameterize the data and make it easier to change. But the conversion to fake dates is not helping here.
The conversion at load time affects the saved data format, making it unreadable for the 'Eclipse Orchestrator' program.
Is there any way to save the day in Excel, or should I just move on and write a program to patch the csv file?
Thanks,
Gert
If you import the CSV file instead of opening it, you can use the import wizard (Data ribbon > From Text) to define the data type of each column. Select Text for the exposure time and Excel will not attempt to convert it.
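If the patching ends up being done by a program instead, Python's csv module treats every field as a plain string, so values like 1/2000 and 00:01:10.0 pass through untouched. Below is a minimal sketch, assuming the exposure time is the sixth field of each TAKEPIC row, as in the sample lines above. The function name and the field position constant are illustrative assumptions, not part of Eclipse Orchestrator.

```python
import csv
import io

EXPOSURE_FIELD = 5  # assumption: zero-based position of the exposure in TAKEPIC rows


def patch_exposure(csv_text, new_exposure):
    """Return csv_text with the exposure of every TAKEPIC row replaced."""
    reader = csv.reader(io.StringIO(csv_text, newline=''))
    out = io.StringIO(newline='')
    writer = csv.writer(out, lineterminator='\n')
    for row in reader:
        if row and row[0] == 'TAKEPIC' and len(row) > EXPOSURE_FIELD:
            row[EXPOSURE_FIELD] = new_exposure
        # All other fields are written back exactly as they were read.
        writer.writerow(row)
    return out.getvalue()


sample = (
    "FOR,(VAR),0.000,5.000,49.000\n"
    "TAKEPIC,MAGPRE (VAR),-,00:01:10.0,EOS450D,1/2000,9.0,100,0.000,RAW+FL,,N,Partial 450D\n"
    "ENDFOR\n"
)
print(patch_exposure(sample, "1/4000"))
```

Unlike the Excel round trip, the FOR and ENDFOR lines keep their original field counts and number formatting.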
In Excel I can open a csv file using external data sources and then choose to get data from text. This takes me through a set of steps to import the file. This works great, but I need to automate this process, as many of these documents will need to be converted over time.
Is there a way to run a similar process as a script? I'm a complete newbie in this space.
You can run this command in a script:
csv2odf yourdata.csv yourtemplate.xlsx output.xlsx
You would need to get csv2odf and Python and create a template like this:
Insert column titles with the same number of columns as the csv.
Add one sample row of data. You can add formatting if you want.
Save the template as xlsx.
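As an alternative to csv2odf, the same conversion could be scripted with pandas. This is a different technique from the one described above, shown only as a hedged sketch: it reads every column as text so nothing is reinterpreted, then writes an xlsx file. Writing xlsx with pandas needs an Excel writer engine such as openpyxl to be installed, and both function names are hypothetical.

```python
import io

import pandas as pd


def load_csv_as_text(source):
    """Read a CSV with every column kept as a string and empty cells left empty."""
    return pd.read_csv(source, dtype=str, keep_default_na=False)


def convert_csv_to_xlsx(csv_path, xlsx_path):
    """Convert one CSV file to xlsx, storing every value as text."""
    # Requires an Excel writer engine (for example openpyxl) to be installed.
    load_csv_as_text(csv_path).to_excel(xlsx_path, index=False)


preview = load_csv_as_text(io.StringIO("exposure,id\n1/2000,007\n"))
print(preview.loc[0, "exposure"], preview.loc[0, "id"])
```

Looping convert_csv_to_xlsx over a folder of files would then cover the "many documents over time" case.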