np.savetxt tuple converted into array - python-3.x

I am trying to dump the samples of a .wav file into a txt file for analysis, but I end up with a single value in Audio.txt instead of all the data points. I am unable to figure out where I am going wrong in the piece of code below:
import wave, struct
import numpy as np

waveFile = wave.open('test1.wav', 'r')
length = waveFile.getnframes()
for i in range(0, length):
    waveData = waveFile.readframes(1)
    data = struct.unpack("<h", waveData)
    data_x = np.array(int(data[0]))  # saving all values in a single array
    # print(int(data[0]))
np.savetxt("Audio.txt", data_x.reshape(1,), delimiter=",")
Any help? Thank you.
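For what it's worth, each pass through the loop above rebinds data_x to a fresh zero-dimensional array, so only the last sample survives to np.savetxt. A minimal sketch of one way to collect every sample first (assuming the file is 16-bit mono PCM, which is what the "<h" unpack format implies):

import wave, struct
import numpy as np

waveFile = wave.open('test1.wav', 'r')
length = waveFile.getnframes()

samples = []  # accumulate every frame instead of overwriting one value
for i in range(length):
    waveData = waveFile.readframes(1)
    data = struct.unpack("<h", waveData)  # one 16-bit little-endian sample
    samples.append(data[0])
waveFile.close()

np.savetxt("Audio.txt", np.array(samples), delimiter=",")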

Related

Passing Key,Value into a Function

I want to check a YouTube video's views and keep track of them over time. I wrote a script that works great:
import requests
import re
import pandas as pd
from datetime import datetime
import time

def check_views(link):
    todays_date = datetime.now().strftime('%d-%m')
    now_time = datetime.now().strftime('%H:%M')
    # get the site
    r = requests.get(link)
    text = r.text
    tag = re.compile(r'\d+ views')
    views = re.findall(tag, text)[0]
    # get the digit number of views; it's returned in a list, so pull that item out
    cleaned_views = re.findall(r'\d+', views)[0]
    print(cleaned_views)
    # append to the df
    df.loc[len(df)] = [todays_date, now_time, int(cleaned_views)]
    # df = df.append([todays_date, now_time, int(cleaned_views)], axis=0)
    df.to_csv('views.csv')
    return df

df = pd.DataFrame(columns=['Date', 'Time', 'Views'])

while True:
    df = check_views('https://www.youtube.com/watch?v=gPHgRp70H8o&t=3s')
    time.sleep(1800)
But now I want to use this function for multiple links. I want a different CSV file for each link. So I made a dictionary:
link_dict = {'link1': 'https://www.youtube.com/watch?v=gPHgRp70H8o&t=3s',
             'link2': 'https://www.youtube.com/watch?v=ZPrAKuOBWzw'}
#this makes it easy for each csv file to be named for the corresponding link
The loop then becomes:
for key, value in link_dict.items():
    df = check_views(value)
That seems to work, passing the value of the dict (the link) into the function. Inside the function, I just made sure to load the correct csv file at the beginning:
# Existing csv files
df = pd.read_csv(k + '.csv')
But then I get an error when I go to append a new row to the df ("cannot set a row with mismatched columns"). I don't understand that, since it works just fine in the code written above. This is the part giving the error:
df.loc[len(df)] = [todays_date, now_time, int(cleaned_views)]
What am I missing here? This dictionary method seems pretty messy (I only have 2 links I want to check, but rather than just duplicating the function I wanted to experiment). Any tips? Thanks!
Figured it out! The problem was that I was saving the df as a csv and then reading that csv back later. When I saved it, I didn't pass index=False to df.to_csv(), so the file gained an extra index column! In my earlier testing with the dictionary, the script kept reusing the in-memory df to do the actual adding of rows, so even though I was saving to a csv, the extra column never mattered.
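A minimal sketch of the fix described above, assuming the dictionary key is passed in to name the per-link file (the append_views helper and its parameters are illustrative, not the original code):

import pandas as pd
from datetime import datetime

def append_views(key, views):
    todays_date = datetime.now().strftime('%d-%m')
    now_time = datetime.now().strftime('%H:%M')
    # read back this link's history; the file must contain exactly the
    # three expected columns for the positional row assignment to work
    df = pd.read_csv(key + '.csv')
    df.loc[len(df)] = [todays_date, now_time, int(views)]
    # index=False stops pandas writing the index as an extra column,
    # which would otherwise break the next read_csv round trip
    df.to_csv(key + '.csv', index=False)
    return df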

How to read from a CSV file in Python while still maintaining single responsibility principle?

When reading from a file, I save the data read in as a dictionary, like this:
import csv

with open('StudentsPerformance.csv', 'r') as file:
    read_csv = csv.DictReader(file)
    # Find the minimum math score
    min_math_score = float('inf')  # start above any real score, not at 0
    for row in read_csv:
        math_score = int(row['math score'])
        if math_score < min_math_score:
            min_math_score = math_score
But I would like to iterate through the read_csv variable outside of the "with open" block. How can I do this without getting:
ValueError: I/O operation on closed file.
Thank you.
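One common approach (a sketch, assuming the same file and column name) is to materialize the rows into a plain list while the file is still open; the list outlives the file handle, so later iteration never touches the closed file:

import csv

def read_rows(path):
    # list() consumes the DictReader while the file is open, copying
    # every row dict into memory before the with-block closes the file
    with open(path, 'r', newline='') as file:
        return list(csv.DictReader(file))

rows = read_rows('StudentsPerformance.csv')
min_math_score = min(int(row['math score']) for row in rows)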

Converting multiple .pdf files with multiple pages into 1 single .csv file

I am trying to convert .pdf data to a spreadsheet. Based on some research, it was recommended to transform it into csv first in order to avoid errors.
So I wrote the code below, which gives me:
"TypeError: cannot concatenate object of type ''; only Series and DataFrame objs are valid"
The error appears at the pd.concat call.
import tabula
import pandas as pd
import glob

path = r'C:\Users\REC.AC'
all_files = glob.glob(path + "/*.pdf")
print(all_files)
df = pd.concat(tabula.read_pdf(f1) for f1 in all_files)
df.to_csv("output.csv", index=False)
Since this might be a common issue, I am posting the solution I found.
"""
df = []
for f1 in all_files:
df = pd.concat(tabula.read_pdf(f1))
"""
The root cause is that tabula.read_pdf returns a list of DataFrames rather than a single DataFrame, so the generator in the original pd.concat call yields lists, which pandas refuses to concatenate. Collecting each file's tables into one flat list and concatenating once at the end generates the dataframe it needed.
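A side note for the "multiple pages" part of the title: tabula.read_pdf reads only the first page by default, so multi-page PDFs need the pages argument. A sketch combining that with the fix above (assuming a tabula-py version where read_pdf returns a list of tables):

import glob
import pandas as pd
import tabula

all_files = glob.glob(r'C:\Users\REC.AC' + "/*.pdf")

df_list = []
for f1 in all_files:
    # pages='all' extracts tables from every page, not just page 1
    df_list.extend(tabula.read_pdf(f1, pages='all'))

pd.concat(df_list).to_csv("output.csv", index=False)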

How to get unique "point source ID's" for LIDAR block?

I'm trying to get information about the unique flightlines appearing in a block of LIDAR data using laspy.
I have already tried running lasInfo for the whole block, but all I get is the min and max point_source_id values, as opposed to the list of individual flightlines that I need.
This is what I've tried so far:
import laspy
import glob

las_files_list = glob.glob(r'PATH\*.las')
print(las_files_list)

las_source_id_set = set()
for f in las_files_list:
    las_file = laspy.file.File(f, mode='r')
    las_source_id_list = las_file.pt_src_id
    for i in las_source_id_list:
        las_source_id_set.add(i)
    las_file.close()
    print(las_source_id_set, ' ', f)

print(las_source_id_set)
with open('point_source_id.txt', 'w') as f:
    f.write(las_source_id_set)
Unfortunately, the whole process is rather slow, and with a larger dataset I get a stack overflow error and never even reach the 'write a file' part.
The process is slower than it could be because you are looping over every point in Python.
There is a numpy function you can use to make it faster: numpy.unique.
Your script would become:
import laspy
import glob
import numpy as np

las_files_list = glob.glob(r'PATH\*.las')
print(las_files_list)

las_source_id_set = set()
for path in las_files_list:
    with laspy.file.File(path, mode='r') as las:
        # np.unique deduplicates the per-point array in compiled code
        # instead of adding points to the set one by one
        las_source_id_set.update(np.unique(las.pt_src_id))
    print(las_source_id_set, ' ', path)

print(las_source_id_set)
with open('point_source_id.txt', 'w') as out:
    # a set cannot be written directly; serialize it to text first
    out.write('\n'.join(str(i) for i in sorted(las_source_id_set)))

CSV to Pythonic List

I'm trying to convert a CSV file into a Python list. I have strings organized in columns and need an automated way to turn them into a list.
My code works with pandas, but I only see the values printed back as plain text.
import pandas as pd
data = pd.read_csv("Random.csv", low_memory=False)
dicts = data.to_dict().values()
print(data)
The final result should be something like this: ('Dan', 'Zac', 'David')
You can simply do this by using the csv module in Python:
import csv

with open('random.csv', 'r') as f:
    reader = csv.reader(f)
    # list() forces evaluation inside the with-block; in Python 3,
    # map() is lazy and the file would already be closed by the time
    # the map was finally consumed
    your_list = list(map(list, reader))

print(your_list)
If you really want a list, try this:
import pandas as pd
data = pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist()
This produces
['Dan', 'Zac', 'David']
If you want a tuple instead, just cast the list:
data = tuple(pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist())
And this produces
('Dan', 'Zac', 'David')
I assumed that your csv uses commas as separators and has no header row. If this is not the case, just change the params of read_csv accordingly.
