I am trying to pull out a .wav file into a txt file for analysis but I am getting a single value in Audio.txt in the end instead of the all the data points(value). I unable to figure out where I am going wrong in the piece of code given below:
import wave, struct
import numpy as np
waveFile = wave.open('test1.wav', 'r')
length = waveFile.getnframes()
for i in range(0,length):
waveData = waveFile.readframes(1)
data = struct.unpack("<h", waveData)
data_x = np.array(int(data[0])) #saving all value in a single array
#print(int(data[0]))
np.savetxt("Audio.txt", data_x.reshape(1,),delimiter=",")
Any help ??
Thank You.
Related
I want to check a YouTube video's views and keep track of them over time. I wrote a script that works great:
import requests
import re
import pandas as pd
from datetime import datetime
import time
def check_views(link):
todays_date = datetime.now().strftime('%d-%m')
now_time = datetime.now().strftime('%H:%M')
#get the site
r = requests.get(link)
text = r.text
tag = re.compile('\d+ views')
views = re.findall(tag,text)[0]
#get the digit number of views. It's returned in a list so I need to get that item out
cleaned_views=re.findall('\d+',views)[0]
print(cleaned_views)
#append to the df
df.loc[len(df)] = [todays_date, now_time, int(cleaned_views)]
#df = df.append([todays_date, now_time, int(cleaned_views)],axis=0)
df.to_csv('views.csv')
return df
df = pd.DataFrame(columns=['Date','Time','Views'])
while True:
df = check_views('https://www.youtube.com/watch?v=gPHgRp70H8o&t=3s')
time.sleep(1800)
But now I want to use this function for multiple links. I want a different CSV file for each link. So I made a dictionary:
link_dict = {'link1':'https://www.youtube.com/watch?v=gPHgRp70H8o&t=3s',
'link2':'https://www.youtube.com/watch?v=ZPrAKuOBWzw'}
#this makes it easy for each csv file to be named for the corresponding link
The loop then becomes:
for key, value in link_dict.items():
df = check_views(value)
That seems to work passing the value of the dict (link) into the function. Inside the function, I just made sure to load the correct csv file at the beginning:
#Existing csv files
df=pd.read_csv(k+'.csv')
But then I'm getting an error when I go to append a new row to the df (“cannot set a row with mismatched columns”). I don't get that since it works just fine as the code written above. This is the part giving me an error:
df.loc[len(df)] = [todays_date, now_time, int(cleaned_views)]
What am I missing here? It seems like a super messy way using this dictionary method (I only have 2 links I want to check but rather than just duplicate a function I wanted to experiment more). Any tips? Thanks!
Figured it out! The problem was that I was saving the df as a csv and then trying to read back that csv later. When I saved the csv, I didn't use index=False with df.to_csv() so there was an extra column! When I was just testing with the dictionary, I was just reusing the df and even though I was saving it to a csv, the script kept using the df to do the actual adding of rows.
When reading from a file, I save the read in as a dictionary like this:
with open('StudentsPerformance.csv', 'r') as file:
read_csv = csv.DictReader(file)
# Find the minimum math score
min_math_score = 0
for row in read_csv:
math_score = int(row['math score'])
if math_score < min_math_score:
min_math_score = math_score
But I would like to iterate through the read_csv variable outside of the "with open" function. How can I do this without getting a:
ValueError: I/O operation on closed file.
Thank you.
I am trying to convert .pdf data to a spreadsheet. Based on some research, some guys recommended transforming it into csv first in order to avoid errors.
So, I made the below coding which is giving me:
"TypeError: cannot concatenate object of type ''; only Series and DataFrame objs are valid"
Error appears at 'pd.concat' command.
'''
import tabula
import pandas as pd
import glob
path = r'C:\Users\REC.AC'
all_files = glob.glob(path + "/*.pdf")
print (all_files)
df = pd.concat(tabula.read_pdf(f1) for f1 in all_files)
df.to_csv("output.csv", index = False)
'''
Since this might be a common issue, I am posting the solution I found.
"""
df = []
for f1 in all_files:
df = pd.concat(tabula.read_pdf(f1))
"""
I believe that breaking the item iteration in two parts would generate the dataframe it needed and therefore would work.
I'm trying to get information about unique flightlines appearing in a block of LIDAR data using a laspy.
I have already tried running a lasInfo module for the whole block, but what I get is just a min and max point_source_ID values opposed to list of individual flightlines, which I need.
This is what I've tried so far:
import laspy
import glob
las_files_list = glob.glob(r'PATH\*.las')
print(las_files_list)
las_source_id_set = set()
for f in las_files_list:
las_file = laspy.file.File(f, mode='r')
las_source_id_list = las_file.pt_src_id
for i in las_source_id_list:
las_source_id_set.add(i)
las_file.close()
print(las_source_id_set,' ', f)
print(las_source_id_set)
with open('point_source_id.txt', 'w') as f:
f.write(las_source_id_set)
Unfortuanetelly the whole process is rather slow, and with a larger dataset I get a stack overflow error and eventually never get to the 'write a file' part.
The process is slower than it could be, because you are doing a loop in Python.
There is a numpy function that you can use to make the process faster : numpy.unique
Your script would become:
import laspy
import glob
import numpy as np
las_files_list = glob.glob(r'PATH\*.las')
print(las_files_list)
las_source_id_set = set()
for f in las_files_list:
with laspy.file.File(p) as las:
las_source_id_set.update(np.unique(las.pt_src_id))
print(las_source_id_set,' ', f)
print(las_source_id_set)
with open('point_source_id.txt', 'w') as f:
f.write(las_source_id_set)
I'm trying to convert a CSV file into Python list I have strings organize in columns. I need an Automation to turn them into a list.
my code works with Pandas, but I only see them again as simple text.
import pandas as pd
data = pd.read_csv("Random.csv", low_memory=False)
dicts = data.to_dict().values()
print(data)
so the final results should be something like that : ('Dan', 'Zac', 'David')
You can simply do this by using csv module in python
import csv
with open('random.csv', 'r') as f:
reader = csv.reader(f)
your_list = map(list, reader)
print your_list
You can also refer here
If you really want a list, try this:
import pandas as pd
data = pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist()
This produces
['Dan', 'Zac', 'David']
If you want a tuple instead, just cast the list:
data = tuple(pd.read_csv('Random.csv', low_memory=False, header=None).iloc[:,0].tolist())
And this produces
('Dan', 'Zac', 'David')
I assumed that you use commas as separators in your csv and your file has no header. If this is not the case, just change the params of read_csv accordingly.