Append multiple CSVs Python - python-3.x

I have multiple CSVs, all with the same data structure.
All I want is to append them into one, or to create a separate master CSV that holds all the records from all the files.
Note: I don't want to use any dataframe library such as pandas or dask.
Can somebody help me out?
Thanks

# append option
for csv_file in csv_files:
    with open(csv_file, 'r') as f:
        data = f.readlines()[1:]   # skip each file's header row
    with open(master_file, 'a') as f:
        f.writelines(data)         # data is a list of lines, so use writelines
Note: you will need to write the header line to master_file once, before the loop.
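If you prefer, a slightly more robust variant of the same idea uses the standard-library csv module instead of raw line handling; the file names below (part1.csv, part2.csv, master.csv) are stand-ins for your real files:

```python
import csv

# Create two small sample files to demonstrate (stand-ins for your real CSVs).
for name, rows in [("part1.csv", [["id", "name"], ["1", "a"]]),
                   ("part2.csv", [["id", "name"], ["2", "b"]])]:
    with open(name, "w", newline="") as f:
        csv.writer(f).writerows(rows)

csv_files = ["part1.csv", "part2.csv"]
master_file = "master.csv"

with open(master_file, "w", newline="") as out:
    writer = csv.writer(out)
    for i, path in enumerate(csv_files):
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)        # every file starts with a header row
            if i == 0:
                writer.writerow(header)  # write the header only once
            writer.writerows(reader)     # append the remaining data rows
```

This keeps exactly one header in the master file and handles quoting correctly, which raw `readlines` does not.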

Related

For data driven test using robotframework, is it possible to have more than one data driver (using more than one datasheet)?

I'm currently doing a data-driven test. I created a test template that uses data from one Excel file (a specific sheet) as the input data. Is it possible to import the data from more than one Excel sheet in a single robot file?
Yes, it is possible. Use a for loop to go through several files one by one.
You can use something like the snippet below for f1, f2, f3, etc.:
with open(filepath, 'r') as f1:
    lines = f1.readlines()
    for line in lines:
        # do something
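A minimal sketch of the loop the answer describes, assuming the data sheets have been exported as plain text files in a testdata/ folder (the folder and file names here are invented for illustration):

```python
import glob
import os

# Set up two small stand-in data files (in practice these would be your
# exported data sheets).
os.makedirs("testdata", exist_ok=True)
for name, text in [("sheet1.txt", "alpha\nbeta\n"), ("sheet2.txt", "gamma\n")]:
    with open(os.path.join("testdata", name), "w") as f:
        f.write(text)

# One for loop visits every file; the same handle name is reused per iteration.
all_lines = []
for filepath in sorted(glob.glob("testdata/*.txt")):
    with open(filepath) as f:
        for line in f:
            all_lines.append(line.strip())
```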

Switching between columns in a csv file

I have a folder with a lot of CSV files. I want to move column A to C, leave column A empty, and push all the other columns to the right.
I tried looking for something similar, but all the examples I found refer to a specific CSV file rather than an iteration over a folder.
Thank you.
Here is the code to iterate over the CSV files in your folder. Fill the ellipsis with your own code:
import pathlib
import shutil

import pandas as pd

root_dir = pathlib.Path('your_directory_path_here')
for csvfile in root_dir.glob('*.csv'):
    df = pd.read_csv(csvfile, ...)  # read your csv
    # modify the order of your columns here
    ...
    shutil.copyfile(csvfile, f"{csvfile}.bak")  # back up your csv before overwriting
    df.to_csv(csvfile, ...)  # write back your dataframe
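For the column move itself, here is one possible reading of the requirement ("move A to C, leave A empty, push the rest right"), sketched on a toy frame; the column names A, B, C and the helper name A_moved are illustrative only:

```python
import pandas as pd

# Toy frame standing in for one of the CSVs.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Take the first column's data out, blank the first column,
# and reinsert the data two positions to the right.
data = df["A"]
df = df.drop(columns="A")
df.insert(0, "A", "")           # first column, now empty
df.insert(2, "A_moved", data)   # original A data, now the third column
```

After this the column order is A (empty), B, A_moved, C; adapt the insert positions to whatever layout you actually need.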

how to read text data and Convert to pandas dataframe

I am reading from the URL
import pandas as pd
url = 'https://ticdata.treasury.gov/Publish/mfh.txt'
Data = pd.read_csv(url, delimiter= '\t')
But when I export the DataFrame, I see that all the columns are combined into a single column.
I tried different separators, but that didn't work. How can I get a proper DataFrame?
Please help.
I just opened this file in a text editor. It is not a tab-delimited file, or a delimited file of any kind. It is a fixed-width-fields file from lines 9 to 48. You should use pd.read_fwf instead, skipping the surrounding lines.
This would work:
import pandas as pd
url = 'https://ticdata.treasury.gov/Publish/mfh.txt'
Data = pd.read_fwf(url, widths=(31,8,8,8,8,8,8,8,8,8,8,8,8,8), skiprows=8, skipfooter=16, header=(0,1))
Data
The file you're linking to is not in CSV/TSV format. You would need to transform the data into a delimited layout like that before loading it this way.
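To see how pd.read_fwf behaves, here is a tiny self-contained fixed-width sample (invented data, not the Treasury figures) parsed with explicit column widths:

```python
import io

import pandas as pd

# Fixed-width layout: a 13-character label column followed by two
# 6-character numeric columns.
text = (
    "Country      2020  2021\n"
    "Japan        1251  1304\n"
    "China        1072  1068\n"
)
df = pd.read_fwf(io.StringIO(text), widths=(13, 6, 6))
```

The `widths` tuple plays the same role as the long tuple in the answer above: it tells pandas exactly where each column starts and ends, since there is no delimiter to split on.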

How to read the dataset (.data and .names) directly into Python DataFrame from UCI Machine Learning Repository

I am looking for a way to read a dataset directly from the UCI Machine Learning repository, but I am only able to get the dataset itself, not its description.
Here are the links to the data I want to import: https://archive.ics.uci.edu/ml/datasets/Car+Evaluation and https://archive.ics.uci.edu/ml/machine-learning-databases/car/
The files are .data and .names.
How do you import them into Python as a DataFrame?
I have tried the code below, where I have to write the features (column names) manually. Is there a way to read the .names file and set the features from there?
Creating the feature names manually might be fine for a dataset with a handful of features, but as the number of features grows it becomes hard to do by hand.
# Without Column Names
df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data', header=None)
# Generating column names manually.
names = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'class']
df2 = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data', names = names)
Any help will be appreciated.
Thanks.
.names files are unstructured; unfortunately, for this reason you have to open the file and extract the column names manually. Once you do so, you can add these names to a list. Given that you have multiple .data files and that they share the same column order, you can use a for loop to label the columns and read the data files in one pass:
import pandas as pd

column_names = ["example1", "example2", "example3"]
data_list = []
data = ["link to the sourcefile/file.data",
        "link to the sourcefile/file.data",
        "link to the sourcefile/file.data"]
for file in data:
    df = pd.read_csv(file, names=column_names)
    data_list.append(df)
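Once every file is loaded with the same column names, the frames in data_list can be stacked into a single one. A small self-contained sketch with made-up frames standing in for the loaded files:

```python
import pandas as pd

# Stand-ins for the frames read from the .data files, all sharing
# the same column names.
column_names = ["buying", "maint", "doors"]
data_list = [
    pd.DataFrame([["vhigh", "vhigh", "2"]], columns=column_names),
    pd.DataFrame([["low", "low", "4"]], columns=column_names),
]

# Stack all the frames into one; ignore_index renumbers the rows 0..n-1.
combined = pd.concat(data_list, ignore_index=True)
```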

Combining multiple CSV files to a single xls file

I have multiple CSV files, say 5 of them. I want to consolidate all these CSVs into a single Excel workbook, with each CSV on a separate sheet of that combined Excel file.
Please help me.
You did not specify a language or technology, so I understand you have no preference. I have an article on exactly this using R: http://analystcave.com/r-staring-you-journey-with-r-and-machine-learning/
See an elegant solution in R below:
csv1 <- read.csv(file="testcsv1.csv", sep=";")
csv2 <- read.csv(file="testcsv2.csv", sep=";")
csvRes <- rbind(csv1,csv2)
write.table(csvRes, file="resCsv.csv", sep=";",row.names=FALSE)
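The R snippet above stacks the files into one CSV rather than separate sheets. If the goal really is one workbook with each CSV on its own sheet, pandas' ExcelWriter can do that; this is a sketch assuming pandas with an xlsx engine such as openpyxl is installed, and the file names are stand-ins for your five real files:

```python
import pandas as pd

# Two tiny stand-in CSVs (replace with your real files).
for name in ("sales1.csv", "sales2.csv"):
    pd.DataFrame({"x": [1, 2]}).to_csv(name, index=False)

csv_files = ["sales1.csv", "sales2.csv"]

# One sheet per CSV; the sheet name is derived from the file name.
with pd.ExcelWriter("combined.xlsx") as writer:
    for path in csv_files:
        sheet = path.rsplit(".", 1)[0]
        pd.read_csv(path).to_excel(writer, sheet_name=sheet, index=False)
```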
